Hey guys,
I am running into a problem with the system cp command segfaulting on 2.4
kernels. Specifically, I am seeing this on RHEL3 machines running a
patched version of PVFS 2.6. Machines running Linux 2.6 kernels do not
experience this problem. I believe we mentioned this recently, but we had
hoped it would be fixed by some updates pulled into dcache. Apparently,
that is not the case.
The segfault is extremely consistent; it happens every time a cp is executed
with a PVFS2 file system as the target. The target file is always created
with a size of zero, so at least part of the command is completing. 'dd'
commands execute normally.
The setup is simple: 1 server node (RHEL4, 2.6 kernel) with the default
interactive genconfig output, and 1 client with a 2.4 kernel. Mount the
file system on the client, then copy a file onto it.
Here are the conf file contents:
<Defaults>
UnexpectedRequests 50
EventLogging none
LogStamp datetime
BMIModules bmi_tcp
FlowModules flowproto_multiqueue
PerfUpdateInterval 1000
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
TCPBindSpecific yes
</Defaults>
<Aliases>
Alias node1 tcp://node1:3334
</Aliases>
<Filesystem>
Name pvfs2-fs
ID 1227216139
RootHandle 1048576
<MetaHandleRanges>
Range node1 4-2147483650
</MetaHandleRanges>
<DataHandleRanges>
Range node1 2147483651-4294967297
</DataHandleRanges>
<StorageHints>
TroveSyncMeta no
TroveSyncData no
CoalescingHighWatermark infinity
CoalescingLowWatermark 0
TroveSyncMetaTimerSecs 5
DBCacheSizeBytes 1073741824
</StorageHints>
</Filesystem>
And here is the last bit of an strace on a copy command:
[r...@node1 root]# strace cp test.file /mnt/pvfs2/
.....
brk(0) = 0x95ce000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=32148976, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb73f4000
close(3) = 0
geteuid32() = 0
lstat64("/mnt/pvfs2/", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
stat64("/mnt/pvfs2/", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
stat64("test.file", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
stat64("/mnt/pvfs2/test.file", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("test.file", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
open("/mnt/pvfs2/test.file", O_WRONLY|O_TRUNC|O_LARGEFILE) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
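In case it helps narrow this down, here is a minimal sketch that replays the same open/fstat/read/write sequence outside of cp. If it succeeds where cp segfaults, that would suggest the fault is in cp's userspace path and not in the kernel calls themselves. It also prints st_blksize for the target, which is one of the fields strace elides with "..." above. The names replay_cp and bufsize are hypothetical, and this assumes a Python 3 interpreter is available on the client, which may not be true on RHEL3; the same calls translate directly to C.

```python
import os
import sys


def replay_cp(src, dst, bufsize=4096):
    """Replay the call sequence from the strace: open, fstat, read, write.

    Returns a dict with the sizes and block size reported by fstat.
    bufsize is a fixed, small buffer chosen for this sketch; it is not
    the value cp itself would pick.
    """
    info = {}
    src_fd = os.open(src, os.O_RDONLY)
    try:
        info["src_size"] = os.fstat(src_fd).st_size
        # The strace shows O_WRONLY|O_TRUNC only (the target already
        # existed). O_CREAT is added here as an assumption so the sketch
        # also works when the target is missing.
        dst_fd = os.open(dst, os.O_WRONLY | os.O_TRUNC | os.O_CREAT, 0o644)
        try:
            dst_st = os.fstat(dst_fd)
            info["dst_size_after_open"] = dst_st.st_size
            info["dst_blksize"] = dst_st.st_blksize
            copied = 0
            while True:
                chunk = os.read(src_fd, bufsize)
                if not chunk:
                    break
                # os.write may write fewer bytes than requested.
                while chunk:
                    written = os.write(dst_fd, chunk)
                    chunk = chunk[written:]
                    copied += written
            info["bytes_copied"] = copied
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return info


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python replay_cp.py test.file /mnt/pvfs2/test.file
    print(replay_cp(sys.argv[1], sys.argv[2]))
```

Running it against a local file system first gives a baseline to compare with the values reported on the PVFS2 mount.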
With the default logging level, nothing appears in the client or server
logs.
Are there any suggestions on what might be causing this? Can I provide any
additional information that will be helpful for debugging?
Bart.