I just stumbled onto something while trying to clean up after my previous tests. If I have the pcache disabled, then rm -rf of a large directory never works right, even if all of the other clients are idle. Is it possible that we have the opposite problem? That the pcache actually works fine, and it is the pcache miss case that is broken?

I'll switch back and forth a couple more times to confirm, but I'm pretty sure this is consistent.


Yeah, possibly. It would make sense that running ls during rm -rf would cause that too. The pcache is FIFO, so the pos->name entries might be getting FIFO-ed out of the cache by the entries that ls is adding. When the pcache misses, it uses the step_to function based on the value of the position (it walks through all the entries). It could just be a bug in that code, but honestly, readdir with removes always confuses me when the index is meant to keep track of the position. How did it work in the previous implementation, when we had a db per directory?

-sam

I confirmed that it seems to be a problem with the "pcache miss" path. If I disable the pcache, then rm -rf of a directory with just 50 entries fails. Everything works fine if the pcache is enabled. Here is an example:

# mkdir /mnt/pvfs2/testdir6

# for i in `seq 1 50`; do cp /etc/hosts /mnt/pvfs2/testdir6/$i; done

# rm -rf /mnt/pvfs2/testdir6
rm: cannot remove directory `/mnt/pvfs2/testdir6': Directory not empty

# ls /mnt/pvfs2/testdir6
33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50

# ls /mnt/pvfs2/testdir6 |wc
     18      18      54
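
For what it's worth, here is a toy model of one way the miss path could go wrong. It is only a sketch, not the PVFS2 code: it assumes the position is a plain ordinal count, that step_to re-walks that many entries from the start of whatever is still in the directory, and that readdir hands back 32 entries per call. The batch size and every function body here are assumptions made for illustration.

```python
# Toy model only: this is not PVFS2 code. The function bodies and the
# batch size of 32 are assumptions made for illustration.

def step_to(entries, position):
    """Walk `position` entries from the start; return the index reached."""
    index = 0
    while index < position and index < len(entries):
        index += 1
    return index


def readdir(entries, position, count):
    """Return up to `count` names starting at ordinal `position`,
    together with the position to pass to the next call."""
    start = step_to(entries, position)
    batch = entries[start:start + count]
    return batch, position + len(batch)


def rm_rf(entries, count):
    """Mimic rm -rf: read a batch, unlink everything in it, and repeat
    until readdir returns nothing. Returns whatever is left behind."""
    position = 0
    while True:
        batch, position = readdir(entries, position, count)
        if not batch:
            break
        for name in batch:
            entries.remove(name)  # the removal shifts every later entry down
    return entries


directory = [str(i) for i in range(1, 51)]  # 50 files named 1..50
leftover = rm_rf(directory, count=32)
print(len(leftover), leftover[0], leftover[-1])
```

Under those assumptions, the second readdir steps past everything that is left, so the loop ends early with entries still in the directory. If the real step_to behaves like this, the position would have to be something that survives removals (for example the name of the last entry returned) rather than a count.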

-Phil
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
