Hi!
File stage-in with GT4 fails on an IPv6-only host, although globus-url-copy
works fine. When transferring from localhost to localhost, globusrun-ws
returns:
globusrun-ws: Job failed: Staging error for RSL element fileStageIn.
Setting destination anne-vz102.inf-ra.uni-jena.de to striped passive
failed [Caused by: java.io.EOFException]
Setting destination anne-vz102.inf-ra.uni-jena.de to striped passive
failed [Caused by: java.io.EOFException]
Running the GridFTP server from the command line under strace reveals a
segfault in a child process:
[pid 32277] socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 10
[pid 32277] connect(10, {sa_family=AF_INET6, sin6_port=htons(53),
inet_pton(AF_INET6, "2001:638:906:1::1", &sin6_addr), sin6_flowinfo=0,
sin6_scope_id=0}, 28) = 0
[pid 32277] fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
[pid 32277] fcntl64(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 32277] gettimeofday({1207224144, 722785}, NULL) = 0
[pid 32277] poll([{fd=10, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
[pid 32277] send(10,
"\242O\1\0\0\1\0\0\0\0\0\0\nanne-vz102\6inf-ra\10u"..., 47, 0) = 47
[pid 32277] poll([{fd=10, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
[pid 32277] ioctl(10, FIONREAD, [97]) = 0
[pid 32277] recvfrom(10,
"\242O\201\200\0\1\0\0\0\1\0\0\nanne-vz102\6inf-ra\10u"..., 1024, 0,
{sa_family=AF_INET6, sin6_port=htons(53), inet_pton(AF_INET6,
"2001:638:906:1::1", &sin6_addr), sin6_flowinfo=0,
sin6_scope_id=0}, [28]) = 97
[pid 32277] close(10) = 0
[pid 32277] write(3, "0MUmIfiwM=\r\n632-FwMAACDg2ktvvxJ6"..., 4096) =
4096
[pid 32277] gettimeofday({1207224144, 724158}, NULL) = 0
[pid 32277] select(5, [0 4], [], NULL, {0, 0}) = 0 (Timeout)
[pid 32277] gettimeofday({1207224144, 724307}, NULL) = 0
[pid 32277] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
[pid 32277] time(NULL) = 1207224144
[pid 32277] write(3, "edglobus_libc_gethostaddr failed"..., 189) = 189
[pid 32277] munmap(0xb7c4a000, 4096) = 0
[pid 32277] exit_group(11) = ?
Process 32277 detached
[pid 32256] <... select resumed> ) = ? ERESTARTNOHAND (To be
restarted)
The main difference between globus-url-copy and globusrun-ws is that
0.0.0.0 shows up in the latter case:
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
connect(8, {sa_family=AF_INET, sin_port=htons(2811),
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
close(8) = 0
socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 8
fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
setsockopt(8, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(8, {sa_family=AF_INET6, sin6_port=htons(2811), inet_pton(AF_INET6,
"::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(8, 128) = 0
fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
and also for the forked child:
[pid 32276] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
[pid 32276] connect(8, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
[pid 32276] close(8) = 0
[pid 32276] socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDP) = 8
[pid 32276] fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
[pid 32276] bind(8, {sa_family=AF_INET6, sin6_port=htons(0),
inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
Summarized:
[EMAIL PROTECTED]:/usr/local/gt4.0.6-all-source-installer/source-trees/gridftp#
grep "\"0.0.0.0" /tmp/segfault
connect(7, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
connect(8, {sa_family=AF_INET, sin_port=htons(2811),
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
[pid 32276] connect(8, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
[pid 32277] connect(8, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
[pid 32277] bind(9, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 32277] getsockname(9, {sa_family=AF_INET, sin_port=htons(45581),
sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
When running gridftp-server without forking (-nf), the segfault can be easily
located:
Program received signal SIGSEGV, Segmentation fault.
0xb7de2a3c in globus_libc_strdup (string=0x4 <Address 0x4 out of bounds>) at
globus_libc.c:2278
2278 l = strlen(string);
(gdb) bt
#0 0xb7de2a3c in globus_libc_strdup (string=0x4 <Address 0x4 out of bounds>)
at globus_libc.c:2278
#1 0xb7f18d66 in globus_gridftp_server_control_finished_passive_connect
(op=0x808dc50,
user_data_handle=0x0, data_dir=GLOBUS_GRIDFTP_SERVER_CONTROL_DATA_DIR_BI,
cs=0x80924b8,
cs_count=1,
response_code=GLOBUS_GRIDFTP_SERVER_CONTROL_RESPONSE_ACTION_FAILED,
msg=0x8093200 "globus_libc_addr_to_contact_string failed.\nglobus_common:
globus_libc_gethostaddr failed\n") at globus_gridftp_server_control.c:4843
#2 0xb7fa94e1 in globus_l_gfs_data_passive_data_cb (reply=0xbfaebe54,
user_arg=0x807eb88)
at globus_i_gfs_control.c:1539
#3 0xb7f86f60 in globus_l_gfs_data_passive_kickout (user_arg=0x80a3aa0) at
globus_i_gfs_data.c:2434
#4 0xb7dd8ca2 in globus_callback_space_poll (timestop=0x804d450, space=-2)
at globus_callback_nothreads.c:1430
#5 0x0804c21f in main (argc=4, argv=0xbfaec004) at globus_gridftp_server.c:1436
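Frame #1 shows a contact string being duplicated even though response_code is
GLOBUS_GRIDFTP_SERVER_CONTROL_RESPONSE_ACTION_FAILED, so one plausible reading
is that the cs array was never filled in. Below is a minimal sketch of the
defensive pattern in plain C, assuming that reading is right.
copy_contact_strings and free_contact_strings are hypothetical helpers, not
Globus functions:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: release the first `count` entries and the array. */
static void free_contact_strings(char **cs, int count)
{
    int i;

    assert(count >= 0);
    if (cs == NULL) {
        return;
    }
    for (i = 0; i < count; i++) {
        free(cs[i]);
    }
    free(cs);
}

/* Hypothetical helper: duplicate `count` contact strings, but only when the
 * operation that produced them succeeded. On failure the array may hold
 * garbage (such as the 0x4 pointer in the backtrace), so it is never read. */
static char **copy_contact_strings(const char *const *cs, int count,
                                   int succeeded)
{
    char **copy;
    int i;

    if (!succeeded || cs == NULL || count <= 0) {
        return NULL;
    }
    copy = calloc((size_t)count, sizeof(*copy));
    if (copy == NULL) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        if (cs[i] == NULL) {
            continue;               /* entry stays NULL from calloc() */
        }
        copy[i] = strdup(cs[i]);
        if (copy[i] == NULL) {      /* out of memory: undo partial copy */
            free_contact_strings(copy, i);
            return NULL;
        }
    }
    return copy;
}
```

With succeeded set to 0 the function returns NULL without touching cs, which
is the behaviour the failed passive request would need.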
The segfault follows a failed call to globus_libc_gethostaddr_by_family:
Breakpoint 2, globus_libc_gethostaddr_by_family (addr=0xbff68fa0, family=2) at
globus_libc.c:1048
1048 rc = globus_libc_gethostname(hostname, sizeof(hostname));
(gdb) list
1055 hints.ai_flags = 0;
1056 hints.ai_family = family;
1057 hints.ai_socktype = SOCK_STREAM;
1058 hints.ai_protocol = 0;
1059
1060 result = globus_libc_getaddrinfo(
1061 hostname, GLOBUS_NULL, &hints, &save_addrinfo);
1062 if(result != GLOBUS_SUCCESS)
1063 {
1064 return -1;
(gdb) p hostname
$1 =
"anne-vz102.inf-ra.uni-jena.de\000Ù·ôÿú·h$À\000`,Ú·8\217ö¿ë\027ú·ijÚ·ZÑô·\030\027Ý·"
(gdb) p hints.ai_family
$2 = 2
(gdb) n
(gdb) p result
$4 = 16
(gdb) n
1064 return -1;
(gdb)
1080 }
(gdb)
globus_libc_addr_to_contact_string (addr=0xbff69090, opts_mask=11,
contact_string=0xbff6908c)
at globus_libc.c:3552
3552 result = globus_error_put(
Afterwards, the error string is constructed. The segfault then occurs in
strlen; see the backtrace above.
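The failing step is the conversion of a socket address into a contact string,
which here goes through a hostname lookup. As a sketch of an alternative that
needs no resolver at all, the following plain-POSIX helper formats an address
numerically with getnameinfo(). numeric_contact_string is a hypothetical name,
and the bracketed [addr]:port form for IPv6 is an assumption about the desired
format, not what Globus produces:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: write "host:port" (IPv4) or "[host]:port" (IPv6)
 * into `out` using only numeric conversion, so no DNS lookup happens.
 * Returns 0 on success, -1 on failure or if `out` is too small. */
static int numeric_contact_string(const struct sockaddr *sa, socklen_t salen,
                                  char *out, size_t outlen)
{
    char host[64];   /* enough for a numeric IPv6 address plus scope id */
    char port[8];    /* enough for "65535" */
    int n;

    assert(sa != NULL && out != NULL);

    if (getnameinfo(sa, salen, host, sizeof(host), port, sizeof(port),
                    NI_NUMERICHOST | NI_NUMERICSERV) != 0) {
        return -1;
    }
    if (sa->sa_family == AF_INET6) {
        n = snprintf(out, outlen, "[%s]:%s", host, port);
    } else {
        n = snprintf(out, outlen, "%s:%s", host, port);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For the IPv6 loopback address and port 2811 this yields "[::1]:2811".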
I wonder why I see family=2, which is AF_INET. The corresponding address
is even worse:
(gdb) p ((struct sockaddr_in*) addr).sin_addr
$18 = {s_addr = 33}
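For comparison, the failing lookup can be reproduced outside Globus with plain
getaddrinfo(). The sketch below is an assumption about the mechanism, not the
Globus code: first_family and first_family_with_fallback are hypothetical
helpers showing that a lookup restricted to AF_INET fails for a name that only
has IPv6 addresses, while AF_UNSPEC succeeds. An IPv6 literal with
AI_NUMERICHOST stands in for the IPv6-only hostname so that no DNS is needed:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve `host` restricted to `family` and return the
 * address family of the first result, or -1 if getaddrinfo() fails.
 * `flags` is passed through as hints.ai_flags. */
static int first_family(const char *host, int family, int flags)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int found;

    assert(host != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = flags;
    hints.ai_family = family;        /* AF_INET is the family=2 seen in gdb */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0) {
        return -1;
    }
    found = res->ai_family;
    freeaddrinfo(res);
    return found;
}

/* Try the preferred family first, then retry with AF_UNSPEC so that an
 * IPv6-only name still resolves. */
static int first_family_with_fallback(const char *host, int preferred,
                                      int flags)
{
    int family = first_family(host, preferred, flags);

    if (family < 0 && preferred != AF_UNSPEC) {
        family = first_family(host, AF_UNSPEC, flags);
    }
    return family;
}
```

Calling first_family("::1", AF_INET, AI_NUMERICHOST) fails, whereas
first_family_with_fallback with the same arguments returns AF_INET6.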
Any ideas, especially on how to get file stage-in working? ;)
If interested, here are the container logs:
2008-04-03 14:02:24,726 ERROR service.TransferWork [WorkThread-21,run:494] Transient transfer error
Setting destination anne-vz102.inf-ra.uni-jena.de to striped passive failed [Caused by: java.io.EOFException]
Setting destination anne-vz102.inf-ra.uni-jena.de to striped passive failed. Caused by java.io.EOFException
    at org.globus.ftp.extended.GridFTPInputStream.readMsg(GridFTPInputStream.java:100)
    at org.globus.gsi.gssapi.net.GssInputStream.hasData(GssInputStream.java:81)
    at org.globus.gsi.gssapi.net.GssInputStream.read(GssInputStream.java:55)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)
    at org.globus.ftp.vanilla.Reply.<init>(Reply.java:66)
    at org.globus.ftp.vanilla.FTPControlChannel.read(FTPControlChannel.java:257)
    at org.globus.ftp.vanilla.FTPControlChannel.exchange(FTPControlChannel.java:300)
    at org.globus.ftp.vanilla.FTPControlChannel.execute(FTPControlChannel.java:325)
    at org.globus.ftp.GridFTPClient.setStripedPassive(GridFTPClient.java:311)
    at org.globus.transfer.reliable.service.cache.ThirdPartyConnectionImpl.<init>(ThirdPartyConnectionImpl.java:98)
    at org.globus.transfer.reliable.service.cache.ConnectionManager.createNewConnection(ConnectionManager.java:376)
    at org.globus.transfer.reliable.service.cache.ConnectionManager.getConnection(ConnectionManager.java:259)
    at org.globus.transfer.reliable.service.client.ThirdPartyTransferClient.<init>(ThirdPartyTransferClient.java:65)
    at org.globus.transfer.reliable.service.client.ClientFactory.createThirdPartyTransferClient(ClientFactory.java:151)
    at org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:441)
    at org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:355)
    at java.lang.Thread.run(Thread.java:619)
--
mail: [EMAIL PROTECTED] http://adi.thur.de PGP/GPG: key via
keyserver
Reincarnation - the torture never ends.