On Wed, 2009-04-29 at 00:14 +0200, Thomas Glanzmann wrote:
> Hello Chris,
> 
> > They are, but only the crc32c are stored today.
> 
> Maybe crc32c is good enough to identify duplicated blocks. I mean, we
> only need a hint; the dedup ioctl does the double checking. I will write
> a Perl script tomorrow, compare the results to the one that uses md5,
> and report back.

It's a start at least.
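
For illustration only, a minimal userspace sketch of the "checksum as a
hint, then double check" idea. The 4096-byte block size, the function
names, and the pure-Python CRC32C are assumptions made for this example;
it does not read the checksums btrfs stores, and the real verification
would still have to happen inside the kernel.

```python
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def _make_table():
    # Lookup table for CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data):
    """Plain table-driven CRC32C over a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def find_duplicate_blocks(data, block_size=BLOCK_SIZE):
    """Return (first_offset, duplicate_offset) pairs of identical blocks."""
    # Pass 1: the checksum is only a hint, so just bucket offsets by CRC.
    by_crc = defaultdict(list)
    for off in range(0, len(data), block_size):
        by_crc[crc32c(data[off:off + block_size])].append(off)

    # Pass 2: inside each bucket, confirm with the full block contents.
    pairs = []
    for offsets in by_crc.values():
        seen = {}  # block bytes -> first offset holding that content
        for off in offsets:
            block = data[off:off + block_size]
            if block in seen:
                pairs.append((seen[block], off))
            else:
                seen[block] = off
    return pairs
```

Two blocks that merely collide on the checksum end up in the same bucket
but are not reported, because the second pass keys on the block bytes.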

> 
> > Yes, that's the idea.  An ioctl to walk the tree and report on
> > changes, but this doesn't have to be done with version 1 of the dedup
> > code, you can just scan the file based on mtime/ctime.
> 
> Good point.
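
A rough sketch of what such a first-version scan could look like from
userspace, assuming the scanner remembers the time of its previous run.
The function name and the `last_scan` parameter are hypothetical; nothing
here is btrfs-specific.

```python
import os
import stat


def files_changed_since(root, last_scan):
    """Yield regular files under root modified after last_scan.

    last_scan is a Unix timestamp in seconds (hypothetical parameter).
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices, sockets, ...
            # mtime catches data changes, ctime catches metadata changes.
            if max(st.st_mtime, st.st_ctime) > last_scan:
                yield path
```

Only the files this yields would need to be re-read and re-checksummed
on the next pass.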
> 
> > > > But, the ioctl to actually do the dedup needs to be able to verify a
> > > > given block has the contents you expect it to.  The only place you can
> > > > lock down the pages in the file and prevent new changes is inside the
> > > > kernel.
> 
> > > I totally agree with that. How much time would it take to implement
> > > such a system call?
> 
> > It is probably a 3 week to one month effort.
> 
> I'm taking the challenge. Is there a document that I can read that
> introduces me to the structures used in btrfs or can someone walk me
> through on the phone to get a quick start?
> 

Great to hear.  It's an ambitious project, but I'll definitely help
explain things.

You can start with the code documentation section on
http://btrfs.wiki.kernel.org

I'll write up my ideas on how userspace-controlled dedup should work.

> I would also like to retrieve the checksums and identify the potential
> duplicate blocks first. Once that work is done, even in a very
> preliminary state that someone can work with, I would like to move on
> to the dedup ioctl.

Sounds fair, I'll forward the original patch.

-chris

