On Jan 28, 2008, at 4:46 PM, Murali Vilayannur wrote:

Sam,
Isn't it possible that creates can now fail after exhausting 32 retries?

Yes. The odds of that happening, though, are extremely small. Even with 16 million files in the filesystem, the odds of hitting collisions 32 times in a row are 1 in 250 billion. I could just increase the retries to 1000, which would make the odds of a create failing before the handle space runs out vanishingly small.
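
The retry arithmetic can be sketched as below. This is a minimal illustration, not code from the patch: it assumes each retry picks uniformly and independently from a server's handle range, and the function name and parameters are hypothetical. The thread does not state the size of the per-server range, so the caller has to supply it.

```c
#include <assert.h>

/* Probability that `retries` independent, uniform picks from a range of
 * `range_size` handles ALL land on one of the `allocated` handles, i.e.
 * the chance that a create gives up after exhausting its retries.
 * Assumption: picks are uniform and independent. */
double all_retries_collide(double allocated, double range_size, int retries)
{
    double p_one, p_all = 1.0;

    assert(range_size > 0.0 && allocated >= 0.0 && allocated <= range_size);
    assert(retries >= 0);

    p_one = allocated / range_size;   /* chance that a single pick collides */
    for (int i = 0; i < retries; i++)
        p_all *= p_one;               /* every retry must collide as well */
    return p_all;
}
```

Raising the retry count only adds factors of the same per-pick probability, which is why a larger limit shrinks the failure odds so quickly as long as the range is far from full.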

There's a general problem here that as the filesystem gets extremely large, the accuracy of an ENOSPC error is directly proportional to the amount of time the create operation will take.


Can we not load the ledger in the background while the server sets itself up,
and let the server continue processing requests normally?
That way we can still look up some handles from the ledger for
create's handle allocations.

In these cases where collisions are more likely (huge filesystems), I'm not sure we can assume that the entire set of handles can be loaded into an in-memory ledger. It is extent based, so it would have a fairly small memory footprint if files are only being added and never removed. But failing that assumption, we would have to be able to store and query the ledger out of core.
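
The footprint argument can be illustrated with a toy extent table. This is only a sketch of the general extent idea, not PVFS2's actual ledger code: the struct layout, the fixed capacity, and the function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEDGER_MAX_EXTENTS 64   /* arbitrary cap for this sketch */

struct extent { uint64_t first, last; };   /* inclusive run of allocated handles */

struct ledger {
    struct extent ext[LEDGER_MAX_EXTENTS]; /* sorted, disjoint, never adjacent */
    size_t count;
};

void ledger_init(struct ledger *l) { memset(l, 0, sizeof *l); }

/* Mark handle h as allocated. Returns 0, or -1 if the table is full. */
int ledger_add(struct ledger *l, uint64_t h)
{
    size_t i = 0;

    /* Skip extents that end before h and do not touch it. */
    while (i < l->count && l->ext[i].last < h && h - l->ext[i].last > 1)
        i++;

    if (i < l->count) {
        struct extent *e = &l->ext[i];
        if (e->last < h) {                 /* h directly follows e: grow it */
            e->last = h;
            if (i + 1 < l->count && l->ext[i + 1].first - h == 1) {
                e->last = l->ext[i + 1].last;      /* gap closed: merge */
                memmove(&l->ext[i + 1], &l->ext[i + 2],
                        (l->count - i - 2) * sizeof *e);
                l->count--;
            }
            return 0;
        }
        if (e->first <= h)                 /* already inside e */
            return 0;
        if (e->first - h == 1) {           /* h directly precedes e */
            e->first = h;
            return 0;
        }
    }

    /* No neighbour to extend: insert a new one-handle extent at index i. */
    if (l->count == LEDGER_MAX_EXTENTS)
        return -1;
    memmove(&l->ext[i + 1], &l->ext[i], (l->count - i) * sizeof l->ext[0]);
    l->ext[i].first = l->ext[i].last = h;
    l->count++;
    return 0;
}
```

Adding handles 1, 2, 3 in order leaves a single extent, while scattered handles each cost a table entry until the gaps between them fill in. A workload with removals fragments the table in the same way, which is the memory concern raised above.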

RobR has mentioned before that the ledger was initially intended to be serializable so that it could be written to disk. This would at least save recreating it on server startup, but as the handle space became more fragmented, we'd still have to worry about its being a memory hog. But you're right, if we're targeting workloads where files are created and never removed, keeping the ledger is probably the right move.

-sam


thanks,
Murali


On 1/28/08, Sam Lang <[EMAIL PROTECTED]> wrote:

Attached patch disables the handle ledger.  For those not familiar,
the handle ledger is an in-memory structure that maintains allocated
handles for a given server.  I'm disabling it because reading the
entire database each time the server loads is extremely expensive for
large filesystems. Instead of choosing a handle from the ledger, the
patch picks one randomly.  This means we have to deal with collisions
now, but because of our large handle space, they occur only about once
in every 100 billion attempts.

I didn't blow away the handle allocation code entirely...I just
disabled the calls that we had been using to invoke the handle ledger,
and added some functionality that picks a random handle from a given
range. In the dspace code, I modified the create function to continue
up to 32 times if a collision with an already existing handle occurs.
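
The retry scheme described above might look roughly like the following. This is a hedged sketch, not the attached patch: the function names are hypothetical, the in-use check is a linear scan standing in for the real storage lookup, and rand() stands in for whatever random source the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CREATE_RETRIES 32  /* retry limit mentioned in the message */

/* Pseudo-random handle in [lo, hi]. rand() guarantees only 15 random bits,
 * so five calls are combined; the slight modulo bias is ignored here. */
static uint64_t random_handle(uint64_t lo, uint64_t hi)
{
    uint64_t r = 0;

    assert(lo <= hi);
    for (int i = 0; i < 5; i++)
        r = (r << 15) ^ (uint64_t)(rand() & 0x7fff);
    if (hi - lo == UINT64_MAX)
        return r;                      /* full 64-bit range */
    return lo + r % (hi - lo + 1);
}

/* Stand-in for the storage lookup: linear scan of an in-use list. */
static int handle_in_use(const uint64_t *used, size_t n_used, uint64_t h)
{
    for (size_t i = 0; i < n_used; i++)
        if (used[i] == h)
            return 1;
    return 0;
}

/* Returns 0 and stores the handle in *out, or -1 if every try collided. */
int pick_unused_handle(uint64_t lo, uint64_t hi,
                       const uint64_t *used, size_t n_used, uint64_t *out)
{
    for (int attempt = 0; attempt < MAX_CREATE_RETRIES; attempt++) {
        uint64_t h = random_handle(lo, hi);
        if (!handle_in_use(used, n_used, h)) {
            *out = h;
            return 0;
        }
    }
    return -1;   /* caller decides how to report the failed create */
}
```

With no ledger, the collision check is the only place that knows whether a handle is taken, so a failure here means only that the retries were exhausted, not that the range is actually full.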

-sam




_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers




