Hello,

During some testing I found that OrangeFS behaves badly under intense parallel I/O on a single directory. For testing I used a parallel make: untar a relatively large tarball and run make -j10. I used torque-3.0.5, but the choice of tarball should not matter.

My current setup is: orangefs-2.8.5, 15 servers serving both data and metadata, and 16 clients. 15 of the clients are on the same nodes as the servers; this testing was conducted on a separate node with no server on it. The kernel is linux-3.2.14. ACL support is disabled due to previously found bugs:
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2012-April/004974.html
TroveSync is disabled.

During a parallel make, random files (rarely directories) become inaccessible; any attempt to use them results in EIO (system error 5, input/output error). However, these files can be accessed normally from other nodes, or even from the same node using pvfs2-cp, which to my knowledge does not go through the kernel VFS.

I ran a series of tests to find what may affect this behaviour and found that:

1) The error rate depends on the parallelism level: make -j2 is often fine, -j5 produces more problems, -j10 "generates" broken files very often, and so on.

2) With client-side caching disabled (defaults are -a5 -n5):
     pvfs2-client -a 0 -n 0 ...
   things became worse: the frequency of errors rose significantly. A somewhat larger cache (-a10 -n10) seems to work better, but does not eliminate the problem completely.

3) During such tests the kernel sometimes produces backtraces and complains about a NULL pointer dereference. See the attached kernel.log for details. pvfs2-client logs the same message many times:
     [E 09:49:22.580278] Completed upcall of unknown type ff00000d!
   These messages are not strictly in sync with the kernel backtraces, though.

4) When I tried to increase the client cache significantly (-a500 -b500) and ran make -j10, I got a kernel crash: the whole disk subsystem (not only pvfs2) became unresponsive and only the hardware watchdog saved the situation. This was a general protection fault. I managed to save the kernel trace, see kernel.crash.log.

5) There are no errors logged on the pvfs2 servers.

6) TroveSyncMeta yes has no noticeable effect on this issue.

7) TroveSyncData yes makes it somewhat better in some cases and worse in others.

8) I tried to increase the AttrCacheSize and AttrCacheMaxNumElems values, with no effect. Nevertheless I plan to keep the larger values: they shouldn't hurt and we have plenty of RAM available.

My current pvfs config is attached for reference.

For now I can somewhat mitigate this issue with a cron script that either runs mount -o remount on the affected nodes (though a remount with live applications may cause problems of its own) or uses the following sequence:
     pvfs2-cp badfile tempfile
     pvfs2-rm badfile
     cp tempfile badfile
But either way this will not help applications that have already been confused by the error...

I'm aware that support for 3.1 and 3.2 kernels is still experimental, but I can't downgrade this system because other applications require some new kernel features.

I also found two interesting options, DBCacheSizeBytes and DBCacheType, though as far as I understand they only affect TroveMethod dbpf and are useless for alt-aio, which is used in my setup. Please correct me if I'm wrong.

Best regards,
Andrew Savchenko
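P.S. The cron-driven workaround could be scripted roughly as follows. This is only a minimal Python sketch, not the script I actually run: the function names find_eio_files and repair_commands are made up for illustration, the scan only covers regular files (not the rarer broken directories), and the repair step merely builds the three commands from the workaround sequence without executing them.

```python
import errno
import os


def find_eio_files(root):
    """Return the sorted paths under root whose open/read fails with EIO."""
    broken = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Reading a single byte is enough to go through the kernel VFS.
                with open(path, "rb") as fh:
                    fh.read(1)
            except OSError as exc:
                # Only EIO is treated as "broken"; other errors are ignored.
                if exc.errno == errno.EIO:
                    broken.append(path)
    return sorted(broken)


def repair_commands(badfile, temp_path):
    """Build the three-step workaround as argv lists (nothing is executed)."""
    return [
        ["pvfs2-cp", badfile, temp_path],
        ["pvfs2-rm", badfile],
        ["cp", temp_path, badfile],
    ]
```

The same scan could also be run after each make -jN to count broken files per parallelism level. The argv lists can be passed to subprocess.run one by one once the temporary path has been chosen.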
PVFS: kernel debug mask has been modified to "none" (0x00000000)
PVFS: client debug mask has been modified to "none" (0x00000000)
general protection fault: 0000 [#1] SMP
CPU 4
Modules linked in: pvfs2(O) knem(O) md5 nfsd 8021q garp stp llc xt_NOTRACK iptable_raw iptable_nat nf_nat iptable_mangle ipt_REJECT ipt_LOG xt_pkttype xt_limit xt_tcpudp xt_recent nf_conntrack_ipv4 nf_defrag_ipv4 xt_hashlimit xt_conntrack iptable_filter ip_tables x_tables nf_conntrack_ftp nf_conntrack [last unloaded: pvfs2]

Pid: 3871, comm: pvfs2-client-co Tainted: G O 3.2.14-unicluster #2 HP ProLiant BL2x220c G5
RIP: 0010:[<ffffffffa00ac705>] [<ffffffffa00ac705>] PVFS_proc_mask_to_eventlog+0x995/0x1110 [pvfs2]
RSP: 0018:ffff8807dadc7ca8 EFLAGS: 00010246
RAX: 0000000000000350 RBX: 0000000000002030 RCX: 0000000000009d48
RDX: 00000000000000c4 RSI: ffff8807f37f41c8 RDI: ffff8807dadc7d00
RBP: ffff8807f37f41c8 R08: ffff8807dadc7f58 R09: ffffffffa00ac5a0
R10: 0000000000002030 R11: ffff8807dadc7fd8 R12: ffff8807da98a038
R13: ffff8807dadc7e78 R14: dead000000100100 R15: ffff8807dafaed40
FS: 00007f4666f78720(0000) GS:ffff88081fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2f1aa85070 CR3: 00000007f8b15000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process pvfs2-client-co (pid: 3871, threadinfo ffff8807dadc6000, task ffff8807faf10000)
Stack:
 ffff8807faf10000 0000000000000350 0000000000000004 ffff8807da98a040
 ffffffff00000001 0000000000000000 0000000000000064 ffff8807dadc7ce0
 ffff8807dadc7ce0 ffff8807dadc7cf0 ffff8807dadc7cf0 0000000000009d48
Call Trace:
 [<ffffffffa00ac5a0>] ? PVFS_proc_mask_to_eventlog+0x830/0x1110 [pvfs2]
 [<ffffffff810c6929>] ? do_sync_readv_writev+0xa9/0xf0
 [<ffffffff810529c8>] ? thread_group_cputime+0x78/0xb0
 [<ffffffff810c6aba>] ? rw_copy_check_uvector+0x9a/0x140
 [<ffffffff810c6c46>] ? do_readv_writev+0xe6/0x210
 [<ffffffff81033260>] ? get_task_mm+0x10/0x40
 [<ffffffff810c6f0e>] ? sys_writev+0x4e/0x90
 [<ffffffff8140d57b>] ? system_call_fastpath+0x16/0x1b
Code: 08 48 c1 64 24 08 04 48 8b 44 24 08 49 03 07 48 8b 28 4c 8b 75 00 48 39 e8 75 25 e9 76 01 00 00 66 0f 1f 44 00 00 48 8b 44 24 08 <49> 8b 16 49 03 07 49 39 c6 0f 84 5c 01 00 00 4c 89 f5 49 89 d6
RIP [<ffffffffa00ac705>] PVFS_proc_mask_to_eventlog+0x995/0x1110 [pvfs2]
 RSP <ffff8807dadc7ca8>
---[ end trace 6109a457ca33b74a ]---
kernel.log.xz
Description: Binary data
<Defaults>
UnexpectedRequests 50
EventLogging none
EnableTracing no
LogStamp datetime
BMIModules bmi_tcp
FlowModules flowproto_multiqueue
PerfUpdateInterval 1000
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
PrecreateBatchSize 0,32,512,32,32,32,0
PrecreateLowThreshold 0,16,256,16,16,16,0
DataStorageSpace /mnt/pvfs2
MetadataStorageSpace /mnt/pvfs2
LogFile /var/log/pvfs2/server.log
</Defaults>
<Aliases>
Alias n01 tcp://n01:3334
Alias n02 tcp://n02:3334
Alias n03 tcp://n03:3334
Alias n04 tcp://n04:3334
Alias n05 tcp://n05:3334
Alias n06 tcp://n06:3334
Alias n07 tcp://n07:3334
Alias n08 tcp://n08:3334
Alias n09 tcp://n09:3334
Alias n10 tcp://n10:3334
Alias n11 tcp://n11:3334
Alias n12 tcp://n12:3334
Alias n13 tcp://n13:3334
Alias n14 tcp://n14:3334
Alias n15 tcp://n15:3334
</Aliases>
<Filesystem>
Name pvfs2-fs
ID 158402586
RootHandle 1048576
FileStuffing yes
<MetaHandleRanges>
Range n01 3-307445734561825862
Range n02 307445734561825863-614891469123651722
Range n03 614891469123651723-922337203685477582
Range n04 922337203685477583-1229782938247303442
Range n05 1229782938247303443-1537228672809129302
Range n06 1537228672809129303-1844674407370955162
Range n07 1844674407370955163-2152120141932781022
Range n08 2152120141932781023-2459565876494606882
Range n09 2459565876494606883-2767011611056432742
Range n10 2767011611056432743-3074457345618258602
Range n11 3074457345618258603-3381903080180084462
Range n12 3381903080180084463-3689348814741910322
Range n13 3689348814741910323-3996794549303736182
Range n14 3996794549303736183-4304240283865562042
Range n15 4304240283865562043-4611686018427387902
</MetaHandleRanges>
<DataHandleRanges>
Range n01 4611686018427387903-4919131752989213762
Range n02 4919131752989213763-5226577487551039622
Range n03 5226577487551039623-5534023222112865482
Range n04 5534023222112865483-5841468956674691342
Range n05 5841468956674691343-6148914691236517202
Range n06 6148914691236517203-6456360425798343062
Range n07 6456360425798343063-6763806160360168922
Range n08 6763806160360168923-7071251894921994782
Range n09 7071251894921994783-7378697629483820642
Range n10 7378697629483820643-7686143364045646502
Range n11 7686143364045646503-7993589098607472362
Range n12 7993589098607472363-8301034833169298222
Range n13 8301034833169298223-8608480567731124082
Range n14 8608480567731124083-8915926302292949942
Range n15 8915926302292949943-9223372036854775802
</DataHandleRanges>
<StorageHints>
TroveSyncMeta no
TroveSyncData no
CoalescingHighWatermark 128
CoalescingLowWatermark 1
TroveMethod alt-aio
AttrCacheSize 8191
AttrCacheMaxNumElems 65536
</StorageHints>
<Distribution>
Name simple_stripe
Param strip_size
Value 1048576
</Distribution>
</Filesystem>
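As a sanity check on the MetaHandleRanges and DataHandleRanges sections of the config, a short Python sketch like the one below could confirm that consecutive ranges are contiguous and equally wide. It is not part of OrangeFS; the helper names parse_range, check_contiguous and widths are hypothetical, and the sample holds only the first three meta ranges copied from the config.

```python
def parse_range(line):
    """Parse a line such as 'Range n01 3-307445734561825862'."""
    _keyword, alias, span = line.split()
    first, last = span.split("-")
    return alias, int(first), int(last)


def check_contiguous(lines):
    """True if every range starts exactly one past the end of the previous one."""
    ranges = [parse_range(line) for line in lines]
    return all(prev[2] + 1 == cur[1] for prev, cur in zip(ranges, ranges[1:]))


def widths(lines):
    """Map each server alias to the number of handles in its range."""
    return {alias: last - first + 1
            for alias, first, last in map(parse_range, lines)}


sample = [
    "Range n01 3-307445734561825862",
    "Range n02 307445734561825863-614891469123651722",
    "Range n03 614891469123651723-922337203685477582",
]
print(check_contiguous(sample))
print(widths(sample))
```

Feeding it all 15 Range lines of one section checks that section as a whole.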
