On Jan 28, 2008, at 4:46 PM, Murali Vilayannur wrote:
Sam,
Isn't it possible that creates can now fail after exhausting 32
retries?
Yes. The odds of that happening, though, are extremely small. Even
with 16 million files in the filesystem, the odds of hitting
collisions 32 times in a row are 1 in 250 billion. I could just
increase the retries to 1000, which makes the odds of a create
failing before the handle space runs out ridiculously small.
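
As a rough sketch of the arithmetic behind odds like these: if each
attempt draws a handle uniformly and independently from a range of H
handles, n of which are already taken, one draw collides with
probability n/H, and k draws in a row all collide with probability
(n/H)^k. The function name and the range size below are assumptions for
illustration only; the thread does not state the per-server handle
range. The result is kept in log2 form because the raw probability
underflows a float for large k.

```python
import math

def log2_prob_all_collide(n_allocated, handle_range, attempts):
    """log2 of the probability that `attempts` independent uniform draws
    from `handle_range` handles all land on one of `n_allocated` taken
    handles (the set of taken handles is assumed not to change)."""
    if not 0 < n_allocated <= handle_range:
        raise ValueError("need 0 < n_allocated <= handle_range")
    return attempts * (math.log2(n_allocated) - math.log2(handle_range))

# Illustrative values only: 16 million files (from the thread) in a
# hypothetical range of 2**48 handles, for 1, 32 and 1000 attempts.
for attempts in (1, 32, 1000):
    print(attempts, log2_prob_all_collide(16_000_000, 2**48, attempts))
```

Each extra attempt adds the same negative amount to the log, so raising
the retry limit from 32 to 1000 shrinks the failure odds exponentially.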
There's a general problem here that as the filesystem gets extremely
large, the accuracy of an ENOSPC error is directly proportional to the
amount of time the create operation will take.
Can we not load the ledger in the background while the server sets
itself up and let it continue processing requests normally?
That way we can still look up some handles from the ledger for
create's handle allocations.
In these cases where collisions are more likely (huge filesystems),
I'm not sure we can assume that the entire set of handles can be
loaded into an in-memory ledger. It is extent based, so it would have
a fairly small memory footprint if files are only being added and
never removed. But failing that assumption, we would have to be able
to store and query the ledger out of core.
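
To illustrate why fragmentation matters for an extent-based structure,
here is a minimal sketch that collapses a set of handles into contiguous
extents. It is not the PVFS2 ledger code; the function name and the
(first, last) representation are assumptions for illustration.

```python
def to_extents(handles):
    """Collapse an iterable of integer handles into a sorted list of
    inclusive (first, last) extents."""
    extents = []
    for h in sorted(set(handles)):
        if extents and h == extents[-1][1] + 1:
            # h directly follows the last extent, so grow that extent.
            extents[-1] = (extents[-1][0], h)
        else:
            # There is a gap before h, so start a new extent.
            extents.append((h, h))
    return extents

dense = to_extents(range(1000))             # handles 0..999, no gaps
fragmented = to_extents(range(0, 1000, 2))  # every other handle removed
print(len(dense), len(fragmented))
```

The dense set needs a single extent, while the fragmented set holds half
as many handles but needs 500 extents, which is the memory growth the
paragraph above is worried about.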
RobR has mentioned before that the ledger was initially intended to be
serializable so that it could be written to disk. This would at least
save recreating it on server startup, but as the handle space became
more fragmented, we'd still have to worry about its being a memory
hog. But you're right, if we're targeting workloads where files are
created and never removed, keeping the ledger is probably the right
move.
-sam
thanks,
Murali
On 1/28/08, Sam Lang <[EMAIL PROTECTED]> wrote:
Attached patch disables the handle ledger. For those not familiar,
the handle ledger is an in-memory structure that maintains allocated
handles for a given server. I'm disabling it because reading the
entire database each time the server loads is extremely expensive for
large filesystems. Instead of choosing a handle from the ledger, the
patch picks one randomly. This means we have to deal with collisions
now, but because of our large handle space, they only occur about once
in every 100 billion allocations.
I didn't blow away the handle allocation code entirely...I just
disabled the calls that we had been using to invoke the handle ledger,
and added some functionality that picks a random handle from a given
range. In the dspace code, I modified the create function to continue
up to 32 times if a collision with an already existing handle occurs.
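
A minimal sketch of that approach, not the patch itself: pick a random
handle from a range and try again on collision, giving up after a fixed
number of attempts. The names, the in-memory set standing in for the
database lookup, and the reading of "32 times" as 32 total attempts are
all assumptions for illustration.

```python
import random

MAX_ATTEMPTS = 32  # limit mentioned in this thread

class HandleAllocationError(Exception):
    """Raised when every attempt collided with an existing handle."""

def create_with_random_handle(existing, lo, hi,
                              max_attempts=MAX_ATTEMPTS, rng=random):
    """Pick a random handle in [lo, hi) that is not in `existing`,
    record it, and return it. `existing` is a set standing in for the
    on-disk lookup a real server would do."""
    for _ in range(max_attempts):
        handle = rng.randrange(lo, hi)
        if handle not in existing:
            existing.add(handle)
            return handle
    raise HandleAllocationError(
        "no free handle found in %d attempts" % max_attempts)

allocated = set()
print(create_with_random_handle(allocated, 0, 2**32))
```

Note that the error path does not prove the range is full, only that
every attempt happened to collide.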
-sam
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers