I ran into this same error recently. I committed a small change into the stable branch (what will become 2.8.6) that addresses the problem. Vlad, are you using 2.8.5? The 2.8.6 version has not been released yet.
Thanks,
Randy

From: Kyle Schochenmaier <[email protected]>
To: vlad <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [Pvfs2-users] Timeouts while reading from our pvfs2-system client collapses

> Hi vlad, this is a new one for me; in my experience, similar issues rarely
> occur under relatively low loads like 1 GB/s. Are you able to reproduce by
> using pvfs2-cp /input/file /dev/null and specifying -b to set block sizes?
> If this is what I think it is, you shouldn't have any associated timeouts on
> the server side; can you verify?
> More info to come once I get into the office.
>
> On Jun 12, 2012 7:52 AM, "vlad" <[email protected]> wrote:
>> Hi!
>>
>> We are evaluating OrangeFS 2.8.6 with QDR InfiniBand on Rocks Cluster
>> Suite 6.0 (based on CentOS 6.x), and I have set up 8 nodes (doppler14-20
>> and doppler22). Each node is a metadata server, storage server, and client.
>>
>> The connection is made via ib://doppler18:3335/pvfs2-fs. The file system
>> is mounted on /scratchfs via the kernel interface (pvfs2.ko). Our kernel
>> version is "2.6.32-220.13.1.el6.x86_64".
>>
>> We get very impressive transfer rates (600-800 MB/s) when we dump very
>> big files (1 TB) onto the file system (dd if=/dev/zero
>> of=/scratchfs/testfile.dump bs=8192K), but when reading the dump back to
>> /dev/zero, the client-core collapses and our /scratchfs becomes
>> inaccessible.
>>
>> Using pvfs2fuse does not improve the situation, since we get a socket
>> error (usually after dumping 1 GB of data, sometimes earlier, sometimes
>> later). The pvfs2fuse mount point also becomes inaccessible.
>>
>> I've found this in one of our client log files:
>> "..
>> [E 14:22:23.279365] Error: encourage_recv_incoming: mop_id 7f6ce4000950 in
>> RTS_DONE message not found.
>> [E 14:22:23.292947] [bt] pvfs2-client-core(error+0xca) [0x46f91a]
>> [E 14:22:23.292978] [bt] pvfs2-client-core() [0x46ccc4]
>> [E 14:22:23.292999] [bt] pvfs2-client-core() [0x46ea65]
>> [E 14:22:23.293018] [bt] pvfs2-client-core(BMI_testcontext+0xf3) [0x45aa83]
>> [E 14:22:23.293037] [bt] pvfs2-client-core(PINT_thread_mgr_bmi_push+0x159) [0x4608a9]
>> [E 14:22:23.293056] [bt] pvfs2-client-core() [0x45c9aa]
>> [E 14:22:23.293074] [bt] pvfs2-client-core(job_testcontext+0x12a) [0x45d19a]
>> [E 14:22:23.293092] [bt] pvfs2-client-core(PINT_client_state_machine_testsome+0xee) [0x41757e]
>> [E 14:22:23.293111] [bt] pvfs2-client-core() [0x412ecd]
>> [E 14:22:23.293130] [bt] pvfs2-client-core(main+0x703) [0x413fb3]
>> [E 14:22:23.293165] [bt] /lib64/libc.so.6(__libc_start_main+0xfd) [0x392b41ecdd]
>> [E 14:22:23.303725] pvfs2-client-core with pid 29108 exited with value 1
>> .."
>>
>> I have not found any evidence of this error in the server log files,
>> though.
>>
>> This is the content of our /etc/pvfs2tab:
>>
>> "ib://doppler18:3335/pvfs2-fs /scratchfs pvfs2 defaults,noauto 0 0"
>>
>> Can you please help me stabilize read access to our files?
>>
>> Greetings from Salzburg/Austria/Europe
>>
>> Vlad Popa
>>
>> University of Salzburg
>> Dept. of Computer Science - HPC Computing
>> Jakob-Harringer-Str. 2
>> 5020 Salzburg
>> Tel 0043-662-80446313
>> mail: [email protected]
>> _______________________________________________
>> Pvfs2-users mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
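Kyle's suggested reproduction above can be sketched as the following pair of commands. This is only a sketch under assumptions: the paths match vlad's setup, dd is run with a reduced count so the test file is 8 GiB rather than the full 1 TB, and -b is assumed to take the buffer size in bytes (Kyle's mail only says "-b to set block sizes"; check `pvfs2-cp -h` on your installation for the exact syntax).

```shell
# Write a test file through the kernel interface, as in vlad's original
# report (bs=8192K, count=1024 gives an 8 GiB file instead of 1 TB).
dd if=/dev/zero of=/scratchfs/testfile.dump bs=8192K count=1024

# Read it back with the userspace tool, which bypasses the kernel client
# (pvfs2-client-core). -b sets the transfer block size; 8388608 bytes
# (8 MiB) matches the dd block size above. If only the kernel client is
# at fault, this read should complete without server-side timeouts.
pvfs2-cp -b 8388608 /scratchfs/testfile.dump /dev/null
```

The point of the comparison is isolation: the dd read path goes through pvfs2.ko and pvfs2-client-core (where the backtrace originates), while pvfs2-cp talks to the servers directly, so a clean pvfs2-cp run would point the blame at the kernel client rather than the servers.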
