hey,
i know we're trying to keep the # of DBs down, but would it really hurt
that much to just use a separate DB for this data rather than having to
play funny games with the key strings?
also, it seems a little wacky that we have to pass a flag to tell trove
when to count and when not to count. is there a clean way to avoid that?
how do you read the count?
otherwise i think it's great that we're moving the count
increment/decrement into trove, that this will allow for concurrent
modification, and that we can simplify the state machines.
thanks!
rob
Sam Lang wrote:
Hi all,
The new keyval code currently stores the size of a directory as a
separate common keyval. The server state machines update this value
with get/set state actions as needed (in crdirent, rmdirent, etc.). This
get and set actually prevents us from allowing the create and delete
operations of different files in the same directory to take place
concurrently, since the crdirent and rmdirent ops (on the parent dirdata
handle) get serialized.
I'd like to fix all this by providing a keyval per handle that contains
a null string as part of the key (I call it keyval-handle-info). The
advantage of making it the null string is that it will appear first in
the lexical ordering of directory entries, so I can skip over it in
readdir easily. This null keyval would only be created on handles as
necessary (right now only for counting dirents). The
TROVE_KEYVAL_HANDLE_COUNT ds flag can be passed to trove operations to
request the count update. For example, crdirent would pass both
TROVE_KEYVAL_HANDLE_COUNT and TROVE_NOOVERWRITE to the
trove_keyval_write call, specifying that the count should be
incremented (or created and set to 0 if it doesn't exist). rmdirent
would do something similar in trove_keyval_remove.
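
To make the idea concrete, here is a rough, self-contained sketch of the
counting scheme. It is not the trove code or the attached patch: the
sketch_* names, the flag values, and the in-memory array are all made up
for illustration. It also assumes a missing count record is created at 0
and then incremented, which is only one reading of the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values for this sketch; not the real trove definitions. */
enum { SKETCH_NOOVERWRITE = 1 << 0, SKETCH_HANDLE_COUNT = 1 << 1 };

#define SKETCH_MAX_KEYS 16
#define SKETCH_KEY_LEN  32

/* Toy per-handle keyval space, kept sorted by strcmp. */
struct sketch_space {
    char    keys[SKETCH_MAX_KEYS][SKETCH_KEY_LEN];
    int64_t vals[SKETCH_MAX_KEYS];
    int     n;
};

/* Return the index of key, or -1 if it is absent. */
static int sketch_find(const struct sketch_space *s, const char *key)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->keys[i], key) == 0)
            return i;
    return -1;
}

/* Insert in sorted position; caller guarantees key is absent, fits, and has room. */
static int sketch_insert(struct sketch_space *s, const char *key, int64_t val)
{
    int pos = 0;
    while (pos < s->n && strcmp(s->keys[pos], key) < 0)
        pos++;
    for (int i = s->n; i > pos; i--) {
        strcpy(s->keys[i], s->keys[i - 1]);
        s->vals[i] = s->vals[i - 1];
    }
    strcpy(s->keys[pos], key);
    s->vals[pos] = val;
    s->n++;
    return pos;
}

/* Adjust the count stored under the empty key, creating it at 0 if missing. */
static void sketch_adjust_count(struct sketch_space *s, int delta)
{
    int i = sketch_find(s, "");
    if (i < 0)
        i = sketch_insert(s, "", 0);
    s->vals[i] += delta;
}

/* Write an entry; with SKETCH_HANDLE_COUNT a newly created entry bumps the count. */
static int sketch_write(struct sketch_space *s, const char *key,
                        int64_t val, int flags)
{
    if (key[0] == '\0' || strlen(key) >= SKETCH_KEY_LEN)
        return -1;                      /* empty key is reserved for the count */
    int i = sketch_find(s, key);
    if (i >= 0) {
        if (flags & SKETCH_NOOVERWRITE)
            return -1;                  /* entry exists: refuse to replace it */
        s->vals[i] = val;               /* plain overwrite, count unchanged */
        return 0;
    }
    if (s->n + 2 > SKETCH_MAX_KEYS)
        return -1;                      /* keep room for entry and count record */
    sketch_insert(s, key, val);
    if (flags & SKETCH_HANDLE_COUNT)
        sketch_adjust_count(s, +1);
    return 0;
}

/* Remove an entry; with SKETCH_HANDLE_COUNT a successful remove drops the count. */
static int sketch_remove(struct sketch_space *s, const char *key, int flags)
{
    int i = (key[0] == '\0') ? -1 : sketch_find(s, key);
    if (i < 0)
        return -1;
    for (; i < s->n - 1; i++) {
        strcpy(s->keys[i], s->keys[i + 1]);
        s->vals[i] = s->vals[i + 1];
    }
    s->n--;
    if (flags & SKETCH_HANDLE_COUNT)
        sketch_adjust_count(s, -1);
    return 0;
}

/* Current count, or 0 if no count record has been created yet. */
static int64_t sketch_count(const struct sketch_space *s)
{
    int i = sketch_find(s, "");
    return i < 0 ? 0 : s->vals[i];
}

/* Index where readdir should start: skip the count record if it is present. */
static int sketch_first_dirent(const struct sketch_space *s)
{
    return (s->n > 0 && s->keys[0][0] == '\0') ? 1 : 0;
}
```

Because the empty string compares less than any non-empty key under
strcmp, the count record always lands at index 0, so the readdir side
only has to skip one leading entry. In the real store the increment
would have to happen atomically with the write for concurrent crdirent
calls to be safe; this toy does not model that.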
Also, at present the crdirent and rmdirent state machines first do a
read of the keyval to check for existence. This seems unnecessary.
Instead, the crdirent sm can just pass TROVE_NOOVERWRITE to the
keyval_write call, and fail if that call fails. rmdirent already fails
if the keyval_remove fails so the extra keyval_read to check for
existence seems redundant. Are there any good reasons for those extra
state actions that I'm missing?
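
As a small illustration of why the extra read looks redundant, here is a
toy comparison of the two shapes. Everything in it (the toy_* names, the
error value, the array-backed directory) is hypothetical and only meant
to show that a no-overwrite write already reports the "exists" case on
its own.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX    8
#define TOY_EEXIST (-17)   /* hypothetical error code for this sketch */

/* Toy directory: an unsorted list of entry names. */
struct toy_dir {
    const char *names[TOY_MAX];
    int n;
};

/* Stand-in for the existence read: 1 if name is present, 0 otherwise. */
static int toy_exists(const struct toy_dir *d, const char *name)
{
    for (int i = 0; i < d->n; i++)
        if (strcmp(d->names[i], name) == 0)
            return 1;
    return 0;
}

/* Write that refuses to replace an existing entry (no-overwrite semantics). */
static int toy_write_nooverwrite(struct toy_dir *d, const char *name)
{
    if (toy_exists(d, name))
        return TOY_EEXIST;
    if (d->n == TOY_MAX)
        return -1;
    d->names[d->n++] = name;
    return 0;
}

/* Current shape: a separate existence read, then the write. */
static int toy_crdirent_two_step(struct toy_dir *d, const char *name)
{
    if (toy_exists(d, name))
        return TOY_EEXIST;
    return toy_write_nooverwrite(d, name);
}

/* Proposed shape: a single write whose failure is the existence check. */
static int toy_crdirent_one_step(struct toy_dir *d, const char *name)
{
    return toy_write_nooverwrite(d, name);
}
```

Both functions return the same result for every input here; the one-step
version just gets there with one fewer operation, and leaves no window
between the check and the write.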
I've attached a patch of the changes I've described. I would like to
have this go into the trunk before the upcoming release, since it
requires (yet another) storage format change. Let me know if there are
any questions or concerns.
_______________________________________________
Pvfs2-developers mailing list
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers