This has got to be caused by the way I did the caching of positions on
the server. I think it might make sense to replace that with code
that uses the component name as the position, instead of trying to
debug this problem. I feel like the caching is the inherent problem
that causes these bugs, and using the component name should solve them.
We talked about using the component name as the position back in July
during the futures meeting, and IIRC we left it at the problem of the
kernel module not always having enough space for all the entries
returned, and so the position would have to be modified on the client
somehow. Then we sort of went off and discussed operators on
positions...
I talked with Murali about this again recently, and it sounds like we
can grab the previous component name and use that as the position, so
maybe that's the way to go. I'll look into doing that and see how
much work it is.
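
Here is a rough sketch of the idea as a toy Python model, not PVFS2 code. It assumes the server can iterate entries in a stable order keyed by component name; the Directory class, list_by_index, list_by_name, and the client_room parameter are all made-up names for illustration. client_room stands in for the kernel module only having space for some of the entries returned: the client resumes from the last name it actually consumed, so nothing is skipped.

```python
import bisect


class Directory:
    """Toy in-memory directory; names are kept sorted (an assumption)."""

    def __init__(self):
        self.names = []

    def create(self, name):
        bisect.insort(self.names, name)

    def read_by_index(self, pos, count):
        """Position is an offset into the current sorted list."""
        chunk = self.names[pos:pos + count]
        return chunk, pos + len(chunk)

    def read_by_name(self, last_name, count):
        """Position is the last name the client consumed (None to start)."""
        if last_name is None:
            start = 0
        else:
            start = bisect.bisect_right(self.names, last_name)
        return self.names[start:start + count]


def list_by_index(d, batch, on_batch=None):
    """List the directory using a numeric offset as the position."""
    out, pos = [], 0
    while True:
        chunk, pos = d.read_by_index(pos, batch)
        if not chunk:
            return out
        out.extend(chunk)
        if on_batch:
            on_batch()  # simulates another client creating files


def list_by_name(d, batch, client_room, on_batch=None):
    """List the directory using the last consumed name as the position.

    client_room (>= 1) is how many entries the client can accept per reply.
    """
    out, last = [], None
    while True:
        chunk = d.read_by_name(last, batch)
        if not chunk:
            return out
        taken = chunk[:client_room]  # client may not have room for all
        out.extend(taken)
        last = taken[-1]             # resume after what was really consumed
        if on_batch:
            on_batch()


def create_once(d, name):
    """Return a hook that creates `name` the first time it is called."""
    pending = [name]

    def hook():
        if pending:
            d.create(pending.pop())

    return hook


def fresh_directory():
    d = Directory()
    for name in ["b", "d", "f", "h"]:
        d.create(name)
    return d


if __name__ == "__main__":
    d = fresh_directory()
    print(list_by_index(d, 2, create_once(d, "a")))
    d = fresh_directory()
    print(list_by_name(d, 2, 1, create_once(d, "a")))
```

In this model, creating "a" in the middle of a listing shifts every offset by one, so the offset-based listing returns "d" twice, while the name-based listing returns each original entry exactly once. The new file is simply not seen by a listing that has already passed its slot. Resuming with bisect_right also keeps working if the last consumed name has since been removed.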
-sam
On Oct 9, 2006, at 10:14 AM, Phil Carns wrote:
We are seeing a strange bug where if we list the contents of a directory
while files are being created in it, we sometimes get duplicates and/or
missing files in the output.
I can reproduce it on a single machine by running these two scripts at
the same time:
tester.sh:
-----------------------------------
#!/bin/tcsh
foreach file ( `seq 1 10000` )
touch /mnt/pvfs2/testdir/${file}
end
watcher.sh:
-----------------------------------
#!/bin/tcsh
there:
set foo=`ls /mnt/pvfs2/testdir | wc -l`
set bar=`ls /mnt/pvfs2/testdir | uniq -d | wc -l`
echo listing count: $foo, duplicates: $bar
sleep 1
goto there
The test machine that I am using is pretty slow. On faster machines you
may need to create more than 10,000 files, or maybe slow it down by
actually writing a little bit of data into each file.
At any rate, the output looks normal for a while, but then we start
seeing results like this from watcher.sh:
...
listing count: 6310, duplicates: 0
listing count: 6320, duplicates: 0
listing count: 6334, duplicates: 0
listing count: 6371, duplicates: 0
listing count: 6382, duplicates: 0
listing count: 6396, duplicates: 5024
listing count: 6406, duplicates: 0
listing count: 10896, duplicates: 5344
listing count: 6430, duplicates: 5120
listing count: 6434, duplicates: 0
listing count: 11574, duplicates: 6048
listing count: 6472, duplicates: 0
...
The listing count is supposed to steadily increase, and the duplicates
field should always be zero. The problem only occurs while files are
being created. Once tester.sh is done, the listing looks perfectly
normal.
Anyone have any ideas? I think this problem has been hanging around for
a little while but we just now figured out how to reliably trigger
it. It is at least in current CVS head and was in a snapshot from
August 21.
-Phil
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers