Hi Walt,
> apparently, this code never considered that a posted op might complete,
> and so never included a specific case for it. There is now a specific
> return code for that, but I don't know where to put it since I don't
> really understand the "reposting" thing.
>
> Is one of these actions correct for a "posted" operation that has
> already completed, or do we need to add another one? This is from
> src/apps/kernel/linux/pvfs2-client-core.c about line 2812.
You are right. The code definitely does not handle the case where a
posted operation finishes immediately. It only handles cases where the
operation was serviced synchronously.
> repost_op:
>     /*
>       check if we need to repost the operation (in case of failure or
>       inlined handling/completion)
>     */
>     switch(ret)
>     {
>         case 0:
>         {
>             /*
>               if we've already completed the operation, just repost
>               the unexp request
>             */
>             if (vfs_request->was_handled_inline)
>             {
>                 ret = repost_unexp_vfs_request(
>                     vfs_request, "inlined completion");
>             }
>             else
>             {
To retain this part of the code, we could set the ->was_handled_inline
member variable in all the post_*_request() functions whenever the
PVFS_isys_*() function indicates that the posted operation finished
immediately.
If we do that, this code can stay as is, without adding any more cases.
Great catch!
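A minimal sketch of what that change might look like in one of the
post_*_request() functions. Everything here is a stand-in for illustration:
POSTED_OP_COMPLETED is a hypothetical name for the new return code (its real
name and value are not given in this thread), vfs_request_sketch_t keeps only
the two fields discussed here, and stub_isys_getattr() merely imitates a
PVFS_isys_*() call that may finish right away on a cache hit.

```c
/* Hypothetical name for the "posted op already completed" return code. */
#define POSTED_OP_COMPLETED 1

/* Trimmed-down stand-in for the real request structure. */
typedef struct
{
    int was_handled_inline; /* set when the op finished before returning */
    int is_dev_unexp;       /* still an unexpected device message? */
} vfs_request_sketch_t;

/* Stub imitating a PVFS_isys_*() call: a cache hit completes at once. */
static int stub_isys_getattr(int cache_hit)
{
    return cache_hit ? POSTED_OP_COMPLETED : 0;
}

/* Shape of a post_*_request() function with the proposed change. */
static int post_getattr_request_sketch(vfs_request_sketch_t *req,
                                       int cache_hit)
{
    int ret = stub_isys_getattr(cache_hit);

    if (ret == POSTED_OP_COMPLETED)
    {
        /*
          the posted op finished immediately; flag it so the existing
          "case 0" branch reposts the unexp request instead of adding
          the op to the in-progress table
        */
        req->was_handled_inline = 1;
        ret = 0;
    }
    return ret;
}
```

With this shape, the switch(ret) at repost_op sees ret == 0 in both
situations and relies on was_handled_inline alone to pick between reposting
and tracking the op as in progress.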
thanks,
Murali
>                 /*
>                   otherwise, we've just properly posted a non-blocking
>                   op; mark it as no longer a dev unexp msg and add it
>                   to the ops in progress table
>                 */
>                 vfs_request->is_dev_unexp = 0;
>                 ret = add_op_to_op_in_progress_table(vfs_request);
> #if 0
>                 assert(is_op_in_progress(vfs_request));
> #endif
>             }
>         }
>         break;
>         case REMOUNT_PENDING:
>             ret = repost_unexp_vfs_request(
>                 vfs_request, "mount pending");
>             break;
>         case OP_IN_PROGRESS:
>             ret = repost_unexp_vfs_request(
>                 vfs_request, "op already in progress");
>             break;
>         default:
>             PVFS_perror_gossip("Operation failed", ret);
>             ret = repost_unexp_vfs_request(
>                 vfs_request, "failure");
>             break;
>     }
>     return ret;
> }
>
>
> Walter B. Ligon III wrote:
>> OK, I think I figured out that the request completed without ever
>> deferring due to a cache hit, so it returned SM_ACTION_TERMINATE and the
>> client took that as an error. I'm going to have to dig some more and
>> figure that out, but I'm not sure why the state machine status and not
>> the actual error code was returned.
>>
>> Walt
>>
>> Walter B. Ligon III wrote:
>>
>>> Hello all - I'm debugging away in the pvfs-client. I start everything
>>> up, do a mount, which seems to work, then do an ls, and it hangs. In
>>> the log I can see a series of GETATTR requests going through the
>>> pvfs-client, until I get to the stuff I've copied here. It's nearly
>>> done with a GETATTR, nothing unusual. It finishes the "cleanup"
>>> state, and then the "set_sys_response" state (which is the last one)
>>> which terminates the state machine. Then something unusual happens,
>>> it blurts out "Posted PVFS_SYS_GETATTR" at a place it had never done
>>> before, then "Operation failed: Device initialization failed" then
>>> appears to try to restart the request with equally bad results.
>>> Finally gives up and goes back to timer pings. See the marked lines
>>> in the posting. I'm trying to figure this out, but with my limited
>>> exposure to the pvfs-client I'm not clear WHAT exactly failed.
>>>
>>> Hoping someone has some ideas???
>>>
>>> Walt
>
>
> --
> Dr. Walter B. Ligon III
> Associate Professor
> ECE Department
> Clemson University
> _______________________________________________
> Pvfs2-developers mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>
>