I ran into this same error recently. I committed a small change to the
stable branch (what will become 2.8.6) that addresses the problem. Vlad,
are you using 2.8.5? The 2.8.6 version has not been released yet.

Thanks,
Randy

From:  Kyle Schochenmaier <[email protected]>
To:  vlad <[email protected]>
Cc:  "[email protected]"
<[email protected]>
Subject:  Re: [Pvfs2-users] Timeouts while reading from our pvfs2-system
client collapses

> Hi Vlad, this is a new one for me; in my experience, similar issues rarely
> occur under relatively low loads like 1 GB/s. Are you able to reproduce it
> using pvfs2-cp /input/file /dev/null, specifying -b to set the block size?
> If this is what I think it is, you shouldn't see any associated timeouts on
> the server side; can you verify?
> More info to come once I get into the office.
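[The block-size sweep Kyle suggests could be sketched as below. The pvfs2-cp
invocation is an assumption based on his note (-b taken to set the transfer
block size); a plain dd sweep over a local temp file is included as a
runnable stand-in.]

```shell
# Block-size sweep per Kyle's suggestion. The pvfs2-cp call is an assumption
# from his note (-b taken to set the transfer block size):
#   for b in 65536 1048576 4194304; do
#       pvfs2-cp -b "$b" /scratchfs/testfile.dump /dev/null
#   done
# Runnable stand-in: the same sweep with dd against a local temp file.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=1M count=4 status=none
swept=""
for bs in 64K 1M 4M; do
    dd if="$tmp" of=/dev/null bs="$bs" status=none && swept="$swept $bs"
done
rm -f "$tmp"
echo "swept:$swept"
```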
> 
> On Jun 12, 2012 7:52 AM, "vlad" <[email protected]> wrote:
>> Hi!
>> 
>> We are evaluating OrangeFS 2.8.6 with QDR InfiniBand on Rocks Cluster
>> Suite 6.0 (based on CentOS 6.x). I have set up 8 nodes (doppler14-20 and
>> doppler22); each node acts as metadata server, storage server, and client.
>> 
>> The connection is made via ib://doppler18:3335/pvfs2-fs. The file system is
>> mounted at /scratchfs via the kernel interface (pvfs2.ko). Our kernel
>> version is "2.6.32-220.13.1.el6.x86_64".
>> 
>> We see very impressive transfer rates (600-800 MB/s) when we dump
>> very big files (1 TB) onto the file system (dd if=/dev/zero
>> of=/scratchfs/testfile.dump bs=8192K), but when reading the dump back to
>> /dev/zero,
>> the client-core collapses and our /scratchfs becomes inaccessible.
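[The read-back test described above would, with generic dd, look like the
sketch below; the /scratchfs paths are taken from the report, and a local
temp file is used so the sketch is self-contained.]

```shell
# Write phase, as in the report (against the PVFS2 mount):
#   dd if=/dev/zero of=/scratchfs/testfile.dump bs=8192K
# Read-back phase, discarding the data; this is the step reported to crash
# pvfs2-client-core:
#   dd if=/scratchfs/testfile.dump of=/dev/null bs=8192K
# Self-contained stand-in against a local temp file:
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=1M count=8 status=none
dd if="$tmp" of=/dev/null bs=1M status=none && readback=ok
rm -f "$tmp"
echo "read-back: $readback"
```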
>> 
>> Using pvfs2fuse does not improve the situation: we get a socket error
>> (usually after about 1 GB of data has been dumped, sometimes earlier,
>> sometimes later), and the pvfs2fuse mount point also becomes inaccessible.
>> 
>> 
>> I've found this in one of our client log files:
>> "..
>> [E 14:22:23.279365] Error: encourage_recv_incoming: mop_id 7f6ce4000950 in
>> RTS_DONE message not found.
>> [E 14:22:23.292947]     [bt] pvfs2-client-core(error+0xca) [0x46f91a]
>> [E 14:22:23.292978]     [bt] pvfs2-client-core() [0x46ccc4]
>> [E 14:22:23.292999]     [bt] pvfs2-client-core() [0x46ea65]
>> [E 14:22:23.293018]     [bt] pvfs2-client-core(BMI_testcontext+0xf3)
>> [0x45aa83]
>> [E 14:22:23.293037]     [bt]
>> pvfs2-client-core(PINT_thread_mgr_bmi_push+0x159) [0x4608a9]
>> [E 14:22:23.293056]     [bt] pvfs2-client-core() [0x45c9aa]
>> [E 14:22:23.293074]     [bt] pvfs2-client-core(job_testcontext+0x12a)
>> [0x45d19a]
>> [E 14:22:23.293092]     [bt]
>> pvfs2-client-core(PINT_client_state_machine_testsome+0xee) [0x41757e]
>> [E 14:22:23.293111]     [bt] pvfs2-client-core() [0x412ecd]
>> [E 14:22:23.293130]     [bt] pvfs2-client-core(main+0x703) [0x413fb3]
>> [E 14:22:23.293165]     [bt] /lib64/libc.so.6(__libc_start_main+0xfd)
>> [0x392b41ecdd]
>> [E 14:22:23.303725] pvfs2-client-core with pid 29108 exited with value 1
>> .."
>> 
>> I have not found any evidence of this error in the server log files,
>> though.
>> 
>> This is the content of our /etc/pvfs2tab:
>> 
>> "ib://doppler18:3335/pvfs2-fs /scratchfs pvfs2 defaults,noauto 0 0"
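[For reference, the pvfs2tab entry above corresponds to a manual kernel mount
along the lines of the sketch below; the mount syntax is an assumption based
on the stock pvfs2.ko interface, not taken from the report.]

```
# /etc/pvfs2tab entry, as in the report:
ib://doppler18:3335/pvfs2-fs /scratchfs pvfs2 defaults,noauto 0 0
# Assumed equivalent manual mount via the kernel module:
#   mount -t pvfs2 ib://doppler18:3335/pvfs2-fs /scratchfs
```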
>> 
>> Can you please help me stabilize read access to our files?
>> 
>> 
>> Greetings from Salzburg/Austria/Europe
>> 
>> 
>> Vlad Popa
>> 
>> University of Salzburg
>> Dept Of Computer Science-HPC Computing
>> Jakob-Harringer-Str2
>> 5020 Salzburg
>> Tel 0043-662-80446313
>> mal:[email protected] <mailto:mal%[email protected]>
>> _______________________________________________
>> Pvfs2-users mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users

