On Tue, Oct 14, 2008 at 11:22 AM, <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I am trying to do some "load tests" with pvfs2, but find the following
> in the logs (I produced them with 'pvfs2-set-debugmask -m /mnt/test
> "network,server,client"'):
>
> Client:
>
> [D 11:34:10.421223] [INFO]: Mapping pointer 0x2b875cb28000 for I/O.
> [D 11:34:10.433532] [INFO]: Mapping pointer 0x6a9000 for I/O.
> [E 11:40:02.941501] job_time_mgr_expire: job time out: cancelling bmi
> operation, job_id: 31963.
>
> Server01:
>
> [D 10/08 11:40] BMI_tcp_post_send_generic: Sent: 24 bytes of data.
> [D 10/08 11:40] [BMI CONTROL]: BMI_set_info: set_info: 7570864 option: 6
> [D 10/08 11:40] [BMI CONTROL]: BMI_set_info: searching for ref 7570864
> [D 10/08 11:40] [BMI CONTROL]: BMI_set_info: decremented ref 7570864 to: 0
> [D 10/08 11:40] server_state_machine_complete 0x2aaab4022030
> [D 10/08 11:40] server_state_machine_terminate 0x2aaab4022030
> [D 10/08 11:40] Error: bmi_tcp: Connection reset by peer
> [D 10/08 11:40] BMI_testcontext completing: 46912585631680
> [E 10/08 11:40] handle_io_error: flow proto error cleanup started on
> 0x2aaab0008690: Connection reset by peer
> [E 10/08 11:40] handle_io_error: flow proto 0x2aaab0008690 canceled 0
> operations, will clean up.
> [E 10/08 11:40] handle_io_error: flow proto 0x2aaab0008690 error cleanup
> finished: Connection reset by peer
> [D 10/08 11:40] [BMI CONTROL]: BMI_set_info: set_info: 7811296 option: 6
> [D 10/08 11:40] [BMI CONTROL]: BMI_set_info: searching for ref 7811296
> [D 10/08 11:40] [BMI CONTROL]: BMI_set_info: decremented ref 7811296 to: 0
> [D 10/08 11:40] [BMI CONTROL]: bmi_addr_drop: bmi discarding address:
> 7811296
> [D 10/08 11:40] server_state_machine_complete 0x2aaab40381d0
>
> The cluster configuration is as follows:
> - three hosts with ~400Gb ext3 slice each mounted from a SAN via FC
> acting as metadata servers, I/O servers and clients;
> - two hosts acting as clients only.
> - Debian 4.0, kernel 2.6.24, pvfs2 module 2.7.1
>
> The hosts are connected to each other by gigabit Ethernet. I am
> mounting the filesystem on each client-only host from a different
> server: is this correct? What is the difference between mounting from
> different servers and using one server for all clients?
There shouldn't be a difference. Generally people use the same server, though: it is only a mount point, and data doesn't get explicitly routed through that server node for specific clients.

>
> Each server/client host instead uses itself as server. Again, would it
> be better to use other hosts as servers?

Again, this ideally shouldn't matter.

>
> Last, but not least: have you got any clues on the possible cause of
> the error? I checked all the other logs, and are perfectly clean. Also,
> pvfs2-ping doesn't report anything wrong.

I would start by checking that your Ethernet is working without errors. Also check that all of the server processes are still up; I've seen this error when server processes die. Did you change the timeout values in the config file?

>
> Please forgive me if the above questions have already been answered: I
> tried searching the mailing list archives but without success...
>
>
> Thank you very much for your kind attention!
>
> _______________________________________________
> Pvfs2-developers mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
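
As a follow-up to the suggestion above to check the Ethernet for errors, here is a minimal Python sketch that reads the per-interface error counters the Linux kernel exposes under /sys/class/net/<iface>/statistics/. It is not part of PVFS2. The interface name "eth0" and the helper names read_error_counters and nonzero_counters are assumptions made for illustration only.

```python
from pathlib import Path

# Counter files the Linux kernel exposes under
# /sys/class/net/<iface>/statistics/
ERROR_COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped")


def read_error_counters(iface, sysfs_root="/sys/class/net"):
    """Return {counter_name: value} for one interface.

    A counter that cannot be read (missing interface, unexpected
    content) is reported as None instead of raising.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    counters = {}
    for name in ERROR_COUNTERS:
        try:
            counters[name] = int((stats_dir / name).read_text().strip())
        except (OSError, ValueError):
            counters[name] = None
    return counters


def nonzero_counters(counters):
    """Keep only the counters that were read and are greater than zero."""
    return {name: value for name, value in counters.items() if value}


if __name__ == "__main__":
    import sys

    # "eth0" is only a placeholder; pass the real interface name as argv[1].
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth0"
    counters = read_error_counters(iface)
    problems = nonzero_counters(counters)
    if problems:
        print(f"{iface}: non-zero error counters: {problems}")
    else:
        print(f"{iface}: no non-zero error counters among {counters}")
```

Running it on each server and client before and after a load test, and comparing the values, would show whether the counters grow during the test. Counters that stay at zero do not rule out a network problem; they only make a faulty link less likely.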
