OK, I think I figured out that the request completed without ever deferring due to a cache hit, so it returned SM_ACTION_TERMINATE and the client took that as an error. I'm going to have to dig some more and figure that out, but I'm not sure why the state machine status and not the actual error code was returned.

Walt

Walter B. Ligon III wrote:
Hello all - I'm debugging away in the pvfs-client. I start everything up, do a mount, which seems to work, then do an ls, and it hangs. In the log I can see a series of GETATTR requests going through the pvfs-client, until I get to the stuff I've copied here. Its nearly done with a GETATTR, nothing unusual. It finishes the "cleanup" state, and then the "set_sys_response" state (which is the last one) which terminates the state machine. Then something unusual happens, it blurts out "Posted PVFS_SYS_GETATTR" at a place it had never done before, then "Operation failed: Device initialization failed" then appears to try to restart the request with equally bad results. Finally gives up and goes back to timer pings. See the marked lines in the posting. I'm trying to figure this out, but with my limited exposure to the pvfs-client I'm not clear WHAT exactly failed.

Hoping someone has some ideas???

Walt

[D 16:07:28.492434] SM next smcb 0x9b31280 op 5
[D 16:07:28.492450] SM invoke smcb 0x9b31280 op 5
[D 16:07:28.492466] [SM Entering]: (0x9b31280) client_getattr_sm:cleanup (status: 0)
[D 16:07:28.492482] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.492499] (0x9b3b8e0) getattr state: getattr_cleanup
[D 16:07:28.492515] [SM Exiting]: (0x9b31280) client_getattr_sm:cleanup (error code: 0)
[D 16:07:28.492556] SM Returns Complete (0x9b31280)
[D 16:07:28.492572] SM invoke smcb 0x9b31280 op 5
[D 16:07:28.492588] [SM Entering]: (0x9b31280) client_sysint_getattr_sm:set_sys_response (status: 0)
[D 16:07:28.492604] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.492621] dfile_count: 0
[D 16:07:28.492636] dist_name_len = 0, dist_params_len = 0
[D 16:07:28.492652] [SM Exiting]: (0x9b31280) client_sysint_getattr_sm:set_sys_response (error code: 0)
[D 16:07:28.492669] SM Terminates (0x9b31280)
[D 16:07:28.492684] client_state_machine_terminate smcb 0x9b31280
[D 16:07:28.492700] add smcb 0x9b31280 to completion list
[D 16:07:28.492717] Posted PVFS_SYS_GETATTR (170) (ran to termination) <<<<<<<<<<<<<<
[E 16:07:28.492736] Operation failed: Device initialization failed
<<<<<<<<<<<<<<
[D 16:07:28.492752] PVFS_sys_release id 170
[D 16:07:28.492769] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.492787] SM free smcb 0x9b31280 op 5
[D 16:07:28.492805] PINT_sys_dev_unexp
[D 16:07:28.492821] SM allocate smcb 0x9b31280 op 400
[D 16:07:28.492843] PUSH smcb 0x9b31280 base 0 stack 0
[D 16:07:28.492860] SM locate smcb 0x9b31280 op 400
[D 16:07:28.492876] client_op_state_get_machine 400
[D 16:07:28.492891] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.492907] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.492923] PINT_client_state_machine_post smcb 0x9b31280
[D 16:07:28.492940] SM invoke smcb 0x9b31280 op 400
[D 16:07:28.492956] [SM Entering]: (0x9b31280) sysdev_unexp_sm:post (status: 0)
[D 16:07:28.492973] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.492991] [SM Exiting]: (0x9b31280) sysdev_unexp_sm:post (error code:
0)
[D 16:07:28.493008] SM Returns Deferred (0x9b31280)
[D 16:07:28.493024] Posted PVFS_DEV_UNEXPECTED (171) (waiting for test)
[D 16:07:28.493041] [-] reposted unexp req [0x9b5a030] due to failure
[D 16:07:28.493059] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.493075] PVFS_sys_release id 171
[D 16:07:28.493092] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.493109] SM free smcb 0x9b31280 op 400
[D 16:07:28.493126] PINT_sys_testsome returned unexp vfs_request 0x9b5a030
[D 16:07:28.493143] [+] dev req msg: sz: 0,tag: 0,data: (nil),type: 0
[E 16:07:28.493161] Error! Short read from device; aborting!
<<<<<<<<<<<<<<
[E 16:07:28.493199] Operation failed: Input/output error
<<<<<<<<<<<<<<
[D 16:07:28.493216] PVFS_sys_release id 171
[D 16:07:28.493234] PINT_sys_dev_unexp
[D 16:07:28.493250] SM allocate smcb 0x9b31280 op 400
[D 16:07:28.493273] PUSH smcb 0x9b31280 base 0 stack 0
[D 16:07:28.493290] SM locate smcb 0x9b31280 op 400
[D 16:07:28.493305] client_op_state_get_machine 400
[D 16:07:28.493321] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.493338] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.493354] PINT_client_state_machine_post smcb 0x9b31280
[D 16:07:28.493371] SM invoke smcb 0x9b31280 op 400
[D 16:07:28.493391] [SM Entering]: (0x9b31280) sysdev_unexp_sm:post (status: 0)
[D 16:07:28.493408] FRAME smcb 0x9b31280 base 0 stack 1
[D 16:07:28.493426] [SM Exiting]: (0x9b31280) sysdev_unexp_sm:post (error code:
0)
[D 16:07:28.493443] SM Returns Deferred (0x9b31280)
[D 16:07:28.493461] Posted PVFS_DEV_UNEXPECTED (173) (waiting for test)
[D 16:07:28.493479] [-] reposted unexp req [0x9b5a030] due to failure


--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Reply via email to