It doesn't actually exist right now, but the counter example I can
come up with has to do with file open, where the O_CREAT flag is set
but O_EXCL is not. If a parallel job tries to create a file (call
PVFS_sys_create), and it already exists, it gets back an EEXISTS
error, and then has to do a lookup to get the actual PVFS_handle of
that file. This means if a bunch of jobs all do a create of the same
file (stupid sure, but possible), all the ones that got EEXISTS back
have to do a lookup. We could potentially optimize this by sticking
the handle into the error response.
Maybe not worth it for this case, but it shows that some error
responses might be treated as success (or at least non-fatal) by the
caller.
-sam
On Oct 19, 2006, at 2:26 PM, Rob Ross wrote:
Sounds like formalizing this would be a good idea. Is there even
one counter-example that we can think of or that exists today?
Rob
Pete Wyckoff wrote:
[EMAIL PROTECTED] wrote on Thu, 19 Oct 2006 16:34 +0200:
We've plugged some similar bugs before by just fixing the
specific case (being more careful about cleaning up masks,
pointers, sizes, etc. after an error), and the same could easily
be done here. Should there be a more general solution, though?
What I am wondering is if the response encoder should even bother
to encode the whole message if the status is non-zero. It seems
like if there is an error code it should just stick to encoding
the basic PVFS_server_resp struct. The rest of the fields can't
really be trusted since we hit an error condition. Likewise on
the decode side of things.
The lebf_encode_resp() function already sort of catches one case
of this:
/* we stand a good chance of segfaulting if we try to encode
the response
* after something bad happened reading data from disk. */
if (resp->status != -PVFS_EIO)
{
... but that only handles one specific error code.
I'm sort of in agreement with your analysis. We had a similar issue
on the decoding side on the client: an EIO came back from the
server but the client proceeded to try to decode particular fields.
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-
June/002202.html
Maybe both the lebf_{en,de}code_resp should encode or decode the
rest of the message if status == 0. The comparison for EIO alone is
not sufficient.
Are there any reasons the client should be looking at more fields
once it sees a negative status? Maybe we should just formalize
this. And fixup lebf_decode_rel so it only frees things if status
was zero.
-- Pete
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers