On 15 August 2013 at 01:08, "Adam Borowski" <[email protected]> wrote:

First, some context: I'm trying to efficiently store a huge NFS-root farm
(target: 500+ roots), so the performance impact has to be very limited.

I don't care about dedup inside files; I'm only aiming at deduplicating
whole files.

> There are two ways:

I'm exploring a third way, entirely in userspace (I'll sketch a mixed one
at the end).

For that I'm using the "hardlink" package.

It operates at the file level, in userspace. It has some serious
drawbacks:
* it is subject to a race condition when files change while it runs;
* once a file is hardlinked, you can no longer write to it in place: it
  has to be written out as a new file and then swapped in with rename(2).

Despite these, it has big benefits:
* Zero runtime performance impact.
* Deduplication is asynchronous (offline) and can be scheduled off-peak.
* It works on old-stable kernels.
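The whole-file pass this relies on can be sketched in a few lines. This is
only a minimal illustration of the idea (hash, then hardlink duplicates with
a link-and-rename swap), not the actual "hardlink" implementation — real
tools also compare size/owner/mode first and byte-compare before linking;
the `.dedup-tmp` suffix is a hypothetical temp name:

```python
import hashlib
import os

def dedupe_tree(root):
    """Hardlink identical regular files under `root` (whole-file dedup only).

    A sketch of what an offline dedup pass does; returns how many files
    were replaced by hardlinks to an identical canonical copy.
    """
    seen = {}    # content digest -> canonical path
    linked = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            canonical = seen.setdefault(digest, path)
            if canonical != path and not os.path.samefile(canonical, path):
                tmp = path + ".dedup-tmp"  # hypothetical temp name
                os.link(canonical, tmp)    # make the new link first...
                os.replace(tmp, path)      # ...then swap it in atomically
                linked += 1
    return linked
```

Note the link-then-rename swap: at no point does the path disappear, which
matters when the tree is a live NFS root.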

> * a nice and clean way.  The kernel interface would need to be "hey
>   kernel, I think the block in fd 1 offset 0 might be same as a block
>   in fd 2 offset 4096, care to compare and perhaps combine them?".

So all the cleverness about *what* to merge would happen purely in
userspace? And what would the impact on runtime reads be?
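For reference, an interface of exactly this shape later landed in Linux
(4.5+) as the FIDEDUPERANGE ioctl: userspace names two file ranges, and the
kernel byte-compares them and shares the extents only if they match. A
hedged sketch of building its argument struct from Python (layouts as in
linux/fs.h; actually issuing the ioctl requires a reflink-capable
filesystem such as btrfs or XFS, so only the packing is shown here):

```python
import struct

# _IOWR(0x94, 54, struct file_dedupe_range) on Linux
FIDEDUPERANGE = 0xC0189436

def pack_dedupe_range(src_offset, length, dest_fd, dest_offset):
    """Pack a FIDEDUPERANGE argument for a single destination range.

    struct file_dedupe_range:      u64 src_offset, u64 src_length,
                                   u16 dest_count, u16 + u32 reserved
    struct file_dedupe_range_info: s64 dest_fd, u64 dest_offset,
                                   u64 bytes_deduped (out),
                                   s32 status (out), u32 reserved
    """
    head = struct.pack("=QQHHI", src_offset, length, 1, 0, 0)
    info = struct.pack("=qQQiI", dest_fd, dest_offset, 0, 0, 0)
    return head + info

# Usage on a supporting filesystem would look like:
#   fcntl.ioctl(src_fd, FIDEDUPERANGE,
#               pack_dedupe_range(0, 4096, dst_fd, 4096))
```

Because the kernel re-verifies the ranges under its own locks, the
userspace "cleverness" can be wrong without corrupting anything — a
mismatched candidate is simply rejected.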

> Offline (a confusing name, it's a mounted filesystem but at a later time)

It can even be done asynchronously, by registering an inotify watch on the
filesystem and queuing the dedup jobs in a userspace daemon.
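The watch side of such a daemon can be sketched with a minimal ctypes
binding to inotify (Linux/glibc only; constants from sys/inotify.h). This
is a bare-bones illustration, not a complete daemon — a real one would
also handle IN_MOVED_TO, queue overflow, and recursive watches:

```python
import ctypes
import os
import struct

IN_CLOSE_WRITE = 0x00000008  # file opened for writing was closed

libc = ctypes.CDLL("libc.so.6", use_errno=True)

def inotify_watch(directory):
    """Watch `directory` for files closed after writing; returns the fd."""
    fd = libc.inotify_init1(0)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init1 failed")
    if libc.inotify_add_watch(fd, directory.encode(), IN_CLOSE_WRITE) < 0:
        raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
    return fd

def read_dedup_candidates(fd, directory):
    """Block for events and return file paths to queue for a dedup pass."""
    buf = os.read(fd, 4096)  # blocks until at least one event arrives
    paths, offset = [], 0
    while offset < len(buf):
        # struct inotify_event header: wd, mask, cookie, len (16 bytes)
        wd, mask, cookie, name_len = struct.unpack_from("iIII", buf, offset)
        name = buf[offset + 16: offset + 16 + name_len].rstrip(b"\0")
        if name:
            paths.append(os.path.join(directory, name.decode()))
        offset += 16 + name_len
    return paths
```

The daemon's main loop would just feed these paths into a work queue and
let a low-priority worker do the hash-and-hardlink step off the hot path.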

> See how much fun can we have with data structures?

> And the best of all, the kernel needs just a single syscall, with all
> the complexity done in userspace.

That's the whole beauty of it: a whole ecosystem of software can then grow
to address that complexity, each tool in its own way :)

Now, the mixed approach I promised earlier:

As the pure-userspace approach is not ideal, I was thinking of adding an
in-place FUSE layer that would synchronously copy deduplicated hardlinks
on write. It could be triggered by an open() for writing or by an actual
write().

Other than that, it would just pass all other file operations straight
through to the underlying filesystem.
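The copy-up step such a FUSE layer would perform can be sketched without
any FUSE machinery: on open-for-write, if the inode is shared (st_nlink >
1), replace this path's link with a private copy first. Function and temp
names here are hypothetical, and real code would use mkstemp in the same
directory:

```python
import os
import shutil

def break_hardlink_for_write(path):
    """Give `path` a private inode before writing, if it is deduplicated.

    Returns True if a copy-up happened; the other hardlinked names keep
    the old (shared) inode untouched, so a write here cannot leak into
    the other deduplicated roots.
    """
    if os.stat(path).st_nlink <= 1:
        return False               # not shared; safe to write in place
    tmp = path + ".cow-tmp"        # hypothetical temp name
    shutil.copy2(path, tmp)        # private copy, metadata preserved
    os.replace(tmp, path)          # atomic swap; siblings are unaffected
    return True
```

Doing this synchronously in the write path is exactly the performance cost
the offline pass avoids, but it only hits files that actually get written.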

Steve.
_______________________________________________
Debconf-discuss mailing list
[email protected]
http://lists.debconf.org/mailman/listinfo/debconf-discuss