Re: [Pvfs2-developers] crdirent

Sam Lang Tue, 13 Jun 2006 06:26:19 -0700


On Jun 12, 2006, at 10:07 PM, Rob Ross wrote:

hey,
i know we're trying to keep the # of DBs down, but would it really hurt that much to just use a separate DB for this data rather than having to play funny games with the key strings?

I don't have much preference either way. I don't find the null string to be that much of a hack, but I can see the advantages of having a separate db for stuff like this. One disadvantage of separate dbs is that we can't just do one sync at the end of a crdirent or rmdirent.

also, it seems a little wacky that we have to pass a flag to tell trove when to count and when not to count. is there a clean way to avoid that?

The problem is that dbpf doesn't know anything about the common keys. We could copy the common keys into the dbpf layer, but that's kind of an ugly hack. Also, the crdirent and rmdirent calls just give a handle and the component name, so we really can only tell the difference between common keys and everything else (!is_this_a_common_key(key)). In this case "everything else" will be either a component name or an xattr, so the best we could do is count both xattrs and directory entries.
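To make that limitation concrete, here is a minimal sketch of what such a check could look like if the common keys were copied into the dbpf layer. The key names in the table are hypothetical placeholders, not the actual PVFS2 common keys, and would_be_counted() is an illustrative helper that does not exist in the tree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical common-key names, for illustration only. */
static const char *const common_keys[] = {
    "dir_ent", "datafile_handles", "metafile_dist"
};

/* Returns 1 if key matches one of the (assumed) common keys, else 0. */
int is_this_a_common_key(const char *key)
{
    size_t i;
    for (i = 0; i < sizeof common_keys / sizeof common_keys[0]; i++) {
        if (strcmp(key, common_keys[i]) == 0)
            return 1;
    }
    return 0;
}

/* With only a handle and a key to go on, every non-common key looks the
 * same, so directory entries and xattrs would both be counted. */
int would_be_counted(const char *key)
{
    return !is_this_a_common_key(key);
}
```

Under these assumptions a component name such as "file.txt" and an xattr such as "user.comment" are indistinguishable to the check, which is why the count would cover both.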

We talked about just adding the count to every handle in the keyval db. That adds a bunch of unnecessary keyval entries (one for each file and directory). I was trying to avoid that, but maybe the cost isn't worth the hassle.


how do you read the count?


There's an additional trove_keyval_get_handle_info function.

otherwise i think it's great that we're moving the count increment/decrement into trove, that this will allow for concurrent modification, and that we can simplify the state machines.
thanks!

rob

Sam Lang wrote:
Hi all,
The new keyval code currently stores the size of a directory as a separate common keyval. The server state machines update this value with get/set state actions as needed (in crdirent, rmdirent, etc.). This get and set actually prevents us from allowing the create and delete operations of different files in the same directory to take place concurrently, since the crdirent and rmdirent ops (on the parent dirdata handle) get serialized.

I'd like to fix all this by providing a keyval per handle that contains a null string as part of the key (I call it keyval-handle-info). The advantage of making it the null string is that it will appear first in the lexical ordering of directory entries, so I can skip over it in readdir easily. This null keyval would only be created on handles as necessary (right now only for counting dirents). The TROVE_KEYVAL_HANDLE_COUNT ds flag can be passed to trove operations. For example, in the case of crdirent, the TROVE_KEYVAL_HANDLE_COUNT and TROVE_NOOVERWITE flags would be passed to the trove_keyval_write call to specify that the count should be incremented (or created and set to 0 if it doesn't exist). rmdirent would do something similar in trove_keyval_remove.

Also, at present the crdirent and rmdirent state machines first do a read of the keyval to check for existence. This seems unnecessary. Instead, the crdirent sm can just pass TROVE_NOOVERWITE to the keyval_write call, and fail if that call fails. rmdirent already fails if the keyval_remove fails, so the extra keyval_read to check for existence seems redundant. Are there any good reasons for those extra state actions that I'm missing?

I've attached a patch of the changes I've described. I would like to have this go into the trunk before the upcoming release, since it requires (yet another) storage format change. Let me know if there are any questions or concerns.
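The proposal above can be sketched with a small in-memory stand-in for one handle's keyvals. This is not the attached patch and not the trove API: the struct, the kv_* functions, and the KV_* flag values are all hypothetical, and a sorted array stands in for the db. The sketch assumes the count entry is created on the first counted write, so the first directory entry leaves it at 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 16
#define KEY_LEN 32

/* Hypothetical stand-ins for the flags named in the message. */
enum { KV_NOOVERWRITE = 1, KV_HANDLE_COUNT = 2 };

struct entry { char key[KEY_LEN]; uint64_t val; };

/* All keyvals of one handle, kept sorted in strcmp() order. */
struct handle_kv { struct entry e[MAX_ENTRIES]; int n; };

/* Return the index where key is or would be; set *found accordingly. */
static int kv_find(const struct handle_kv *h, const char *key, int *found)
{
    int i = 0;
    while (i < h->n && strcmp(h->e[i].key, key) < 0)
        i++;
    *found = (i < h->n && strcmp(h->e[i].key, key) == 0);
    return i;
}

/* Insert or overwrite one entry; caller has already checked capacity. */
static void kv_put(struct handle_kv *h, const char *key, uint64_t val)
{
    int found, i = kv_find(h, key, &found);
    if (!found) {
        memmove(&h->e[i + 1], &h->e[i], (size_t)(h->n - i) * sizeof h->e[0]);
        strcpy(h->e[i].key, key);
        h->n++;
    }
    h->e[i].val = val;
}

/* The "" key sorts before every non-empty key, so it can only be at index 0. */
static int kv_has_info(const struct handle_kv *h)
{
    return h->n > 0 && h->e[0].key[0] == '\0';
}

uint64_t kv_count(const struct handle_kv *h)
{
    return kv_has_info(h) ? h->e[0].val : 0;
}

/* Write an entry; with KV_HANDLE_COUNT a newly created key bumps the count. */
int kv_write(struct handle_kv *h, const char *key, uint64_t val, int flags)
{
    int found, counted, needed;
    assert(key[0] != '\0');              /* "" is reserved for the count */
    if (strlen(key) >= KEY_LEN) return -1;
    kv_find(h, key, &found);
    if (found && (flags & KV_NOOVERWRITE)) return -1;  /* already exists */
    counted = !found && (flags & KV_HANDLE_COUNT);
    needed = !found + (counted && !kv_has_info(h));
    if (h->n + needed > MAX_ENTRIES) return -1;
    kv_put(h, key, val);
    if (counted)
        kv_put(h, "", kv_count(h) + 1);
    return 0;
}

/* Remove an entry; fails if it is missing, so no prior read is needed. */
int kv_remove(struct handle_kv *h, const char *key, int flags)
{
    int found, i;
    assert(key[0] != '\0');
    i = kv_find(h, key, &found);
    if (!found) return -1;
    memmove(&h->e[i], &h->e[i + 1], (size_t)(h->n - i - 1) * sizeof h->e[0]);
    h->n--;
    if ((flags & KV_HANDLE_COUNT) && kv_count(h) > 0)
        h->e[0].val--;
    return 0;
}

/* List directory entries, skipping the leading "" info entry if present. */
int kv_readdir(const struct handle_kv *h, const char **names, int max)
{
    int i = kv_has_info(h) ? 1 : 0, out = 0;
    for (; i < h->n && out < max; i++)
        names[out++] = h->e[i].key;
    return out;
}
```

In this sketch the existence check and the count update both happen inside the single write or remove call, which is the property that would let the state machines drop their separate read and get/set steps. Real concurrency control would still have to come from the storage layer; nothing here is thread-safe.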


_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
