Hi everybody,

I have recently run into problems using mpich2 with pvfs2. My programs used to run fine, but now, on a reconfigured setup (8 I/O servers, 8 clients, and 4 metadata servers), I get the error messages below. I have browsed the forums and found similar reports, but I could not tell whether anybody found a solution. The errors generally appear when I scale the number of running instances to 16 or 32.
6: [E 06:13:46.025685] job_time_mgr_expire: job time out: cancelling flow operation, job_id: 67.
6: [E 06:13:46.025976] fp_multiqueue_cancel: flow proto cancel called on 0x8cebcac
6: [E 06:13:46.026004] handle_io_error: flow proto error cleanup started on 0x8cebcac, error_code: -1610613121
6: [E 06:13:46.026099] handle_io_error: flow proto 0x8cebcac canceled 1 operations, will clean up.
6: [E 06:13:46.026138] handle_io_error: flow proto 0x8cebcac error cleanup finished, error_code: -1610613121
11: [E 06:13:46.075671] job_time_mgr_expire: job time out: cancelling flow operation, job_id: 71.
11: [E 06:13:46.075994] fp_multiqueue_cancel: flow proto cancel called on 0x96f3aac
11: [E 06:13:46.076022] handle_io_error: flow proto error cleanup started on 0x96f3aac, error_code: -1610613121
11: [E 06:13:46.076117] handle_io_error: flow proto 0x96f3aac canceled 1 operations, will clean up.
11: [E 06:13:46.076152] handle_io_error: flow proto 0x96f3aac error cleanup finished, error_code: -1610613121
14: [E 06:19:45.563289] handle_io_error: flow proto error cleanup started on 0x9c6349c, error_code: -1073741973

I would appreciate any help or suggestions you can offer.

Regards,
Christina

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
