We've talked about having pvfs2-client (or the kernel module) filter out duplicates in the cases where one of them chooses to break a readdir into multiple operations, but we haven't spent much time investigating where the duplication actually comes from, which we'd need to understand in order to accomplish this.

Other solutions could include:

- locking the directory (not going to happen);
- restarting the ls entirely if the directory changes during the read (would cause starvation);
- improving ls to remove duplicates on its own (probably realistic for pvfs2-ls, but unlikely to be accepted by the GNU tools group for stock ls); and
- reordering the directory entries returned so that the most recently changed entries come last (high sorting overhead on the server, and probably a lot of coding).

Any other ideas?

Yeah, I guess my proposed changes wouldn't help in this case. Berkeley DB has the notion of a secondary (read-only) database based on the primary, where keys in the secondary are derived from the primary's data by a function you provide. So it might be possible to create a secondary database for iterating by update time instead of alphabetically by component name. I'm not sure how efficient that would be, though... we might just be pushing the sorting problem down to the db layer. Also, we would have to start storing update times in the keyval dirent entries.
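
Roughly what I have in mind, as a sketch against the Berkeley DB C API (the dirent value layout and the mtime field are hypothetical, since we don't store update times today):

    #include <db.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical dirent value layout -- we do not store an update
     * time today, so the mtime field here would be new. */
    struct dirent_val {
        uint64_t handle;   /* entry's handle */
        uint64_t mtime;    /* update time (not currently stored) */
    };

    /* Key-extraction callback for DB->associate(): the secondary's
     * key is the update time pulled out of the primary's data, so a
     * cursor on the secondary iterates in mtime order. */
    static int mtime_key_cb(DB *secondary, const DBT *pkey,
                            const DBT *pdata, DBT *skey)
    {
        const struct dirent_val *val = pdata->data;

        memset(skey, 0, sizeof(*skey));
        skey->data = (void *)&val->mtime;  /* points into pdata */
        skey->size = sizeof(val->mtime);
        return 0;
    }

    /* ... after opening both databases: */
    /* ret = primary->associate(primary, NULL, secondary,
                                mtime_key_cb, 0); */

Berkeley DB would then maintain the secondary automatically on every insert into the primary, which is exactly the per-update index maintenance cost I meant by "pushing the sorting problem to the db layer".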

I started thinking about some more possible ideas, but after looking closer at the code I realized that I don't actually see why duplicates would occur in the first place with the algorithm being used :) I apologize if this has been discussed a few times already, but could we walk through it one more time?

I know that the request protocol uses an integer-based token to keep track of position. However, the pcache converts this into a particular key based on where the last iteration left off. This key contains the directory handle as well as the alphanumeric name of the entry.
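
So the resume position is effectively a composite key, roughly shaped like this (the names and sizes here are illustrative, not the actual structs):

    /* Illustrative shape of the position key the pcache rebuilds
     * from the token: the directory's handle plus the component
     * name of the last entry returned. */
    struct readdir_pos_key {
        uint64_t dir_handle;     /* directory's handle */
        char     last_name[256]; /* name of last entry returned */
    };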

Trove then does a c_get on that key with the DB_SET flag, which should put the cursor at the proper position. If the entry has been deleted (which is not happening in my case; I am only creating files), then it retries the c_get with the DB_SET_RANGE flag, which should set the cursor at the next position. "Next" in this case is defined by the comparison function, PINT_trove_dbpf_keyval_compare().
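
As I read it, the positioning amounts to this pattern (error handling trimmed; a sketch of the pattern using the illustrative pos key above, not the exact Trove code):

    DBT key, data;
    int ret;
    struct readdir_pos_key pos_key; /* filled in from the pcache */

    memset(&key, 0, sizeof(key));
    memset(&data, 0, sizeof(data));
    key.data = &pos_key;            /* handle + name of last entry */
    key.size = sizeof(pos_key);

    /* Land exactly on the entry we stopped at last time. */
    ret = cursor->c_get(cursor, &key, &data, DB_SET);
    if (ret == DB_NOTFOUND)
    {
        /* That entry was deleted in the meantime: fall forward to
         * the smallest key >= pos_key, where ">=" is defined by the
         * btree comparison function. */
        ret = cursor->c_get(cursor, &key, &data, DB_SET_RANGE);
    }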

The keyval_compare() function sorts the keys by handle value, then key length, then strncmp() of the key name.
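
A sketch of that ordering (the real function is PINT_trove_dbpf_keyval_compare(); this just restates the three-step comparison, with an illustrative key layout):

    /* Order keys by handle, then by total key length, then by
     * strncmp() on the name portion. */
    struct keyval_key {
        uint64_t handle;
        char     name[1];  /* variable length, illustrative */
    };

    static int keyval_compare_sketch(DB *db, const DBT *a,
                                     const DBT *b)
    {
        const struct keyval_key *ka = a->data, *kb = b->data;

        if (ka->handle != kb->handle)
            return (ka->handle < kb->handle) ? -1 : 1;
        if (a->size != b->size)
            return (a->size < b->size) ? -1 : 1;
        return strncmp(ka->name, kb->name,
                       a->size - sizeof(ka->handle));
    }

Note that comparing length before name means "b" sorts before "aa", so the iteration order is not plain alphabetical -- but it is still a total order, which is all that DB_SET_RANGE needs.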

This means that we are essentially indexing by the name of the entry rather than by a position in the database.

So how could inserting a new entry between readdir requests cause a duplicate? The old entry stored in the pcache should still be valid. If the newly inserted entry comes after it (according to the keyval_compare() sort order), then we should see it as we continue iterating. If the new entry comes before it, then it should not show up (we don't back up in the directory listing). It doesn't seem like any combination should cause an entry to show up twice.
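
A concrete case, using same-length names so only the strncmp() step matters:

    Directory after first readdir:  aa, cc   (returned through "aa")
    Resume key in the pcache:       (dir_handle, "aa")

    Insert "bb":  order becomes aa, bb, cc.  DB_SET lands on "aa",
                  DB_NEXT yields "bb": the new entry is seen once.
    Insert "a0":  order becomes a0, aa, cc.  DB_SET still lands on
                  "aa", DB_NEXT yields "cc": "a0" is missed for this
                  listing, but nothing is returned twice.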

Is c_get() not traversing the db in the order defined by the keyval_compare() function?

The only other danger I see is that if the pcache_lookup() fails, the code falls back to stepping linearly through the db to the token position, which I could imagine might have ordering implications. However, I am only talking to the server from a single client, so I don't see why it would ever miss in the pcache.
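
For reference, the fallback I mean looks something like this (a sketch of the pattern, not the exact code):

    /* pcache miss: no saved name to seek to, so count forward
     * token entries from the start of this directory's keys.
     * key is pre-loaded with (dir_handle, "") so DB_SET_RANGE
     * finds the directory's first entry. */
    int i;
    ret = cursor->c_get(cursor, &key, &data, DB_SET_RANGE);
    for (i = 0; i < token && ret == 0; i++)
    {
        ret = cursor->c_get(cursor, &key, &data, DB_NEXT);
    }
    /* If entries were inserted or removed in the meantime, step
     * count i no longer maps to the same entry, so this path
     * plausibly could skip or repeat entries. */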

I just want to confirm that there is actually an algorithm problem here rather than just a bug in the code somewhere.

-Phil
