Adam,
I tried it and I can't reproduce it.
As a sanity check, please try the job description below. It does a
dummy staging from headnode to headnode (gram-host == rft-host ==
gridftp-host).
Does that work?
Does myEcho exist in $GLOBUS_USER_HOME afterwards?
Can you execute it?
Martin
job description (replace host and port values and path to "echo"
as needed):
<job>
<executable>${GLOBUS_USER_HOME}/myEcho</executable>
<argument>whatTheHeck</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>
gsiftp://gridftp-host:port/bin/echo
</sourceUrl>
<destinationUrl>
gsiftp://gridftp-host:port/${GLOBUS_USER_HOME}/myEcho
</destinationUrl>
</transfer>
</fileStageIn>
</job>
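
If myEcho turns out to be broken as well, it would help to know where
the staged copy starts to differ from the original, not only that it
differs. Below is a minimal sketch in Python for that. It is not part of
the Globus toolkit; the function name compare_files and its chunk_size
parameter are made up for illustration. It reports both file sizes, both
SHA-256 digests, and the offset of the first differing byte.

```python
import hashlib
from pathlib import Path


def compare_files(good_path, bad_path, chunk_size=65536):
    """Compare two files byte by byte.

    Returns (sizes, digests, first_diff), where first_diff is the offset
    of the first differing byte, or None if the files are identical.
    Hypothetical helper for illustration only.
    """
    good = Path(good_path)
    bad = Path(bad_path)

    # File sizes: a truncated or padded transfer shows up here first.
    sizes = (good.stat().st_size, bad.stat().st_size)

    # SHA-256 digest of each file, read in chunks to keep memory use low.
    digests = []
    for path in (good, bad):
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        digests.append(h.hexdigest())

    # Walk both files in parallel and stop at the first mismatch.
    first_diff = None
    offset = 0
    with good.open("rb") as fg, bad.open("rb") as fb:
        while True:
            a = fg.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                n = min(len(a), len(b))
                # Index of the first differing byte inside this chunk;
                # if the common prefix matches, one file simply ends at n.
                i = next((k for k in range(n) if a[k] != b[k]), n)
                first_diff = offset + i
                break
            if not a:
                break  # both files ended with no difference found
            offset += len(a)

    return sizes, tuple(digests), first_diff
```

For example, compare_files("/bin/echo", "myEcho") run on the headnode
(paths adjusted as needed) would show whether the sizes match and at
which offset the two files first diverge.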
> Hi,
>
> Just wanted to report a possible bug. We recently upgraded two Globus
> installations (4.0.3->4.0.6, and 4.0.4->4.0.6), and I noticed that when
> GRAM jobs were submitted, the jobs would fail. I checked and all the files
> seem to get staged in properly (executable & input files), but the
> executable segfaults when you try to run it by hand. I have tried this with
> multiple different executables, and I get the same result each time. If I
> manually SCP or even globus-url-copy the executable over, it runs just fine.
> However, if I use rft... -h <service_host> to move the executable over, as
> would happen in a GRAM job submission, I end up with a corrupted binary.
> As a sanity check, we rolled back to 4.0.4 and everything works just fine.
> So, as far as the corruption... diff says the binaries are different:
>
> seil:whatever$ diff garli garli_broken
> Binary files garli and garli_broken differ
>
> if i run gdb on the broken one, I get tons of these messages:
>
> BFD:
> /a/storage.seil.umd.edu/export/home/seil/globus/.globus/scratch/whatever/garli_broken:
> invalid string offset 1811940244 >= 315067 for section ` .strtab'
> BFD:
> /a/storage.seil.umd.edu/export/home/seil/globus/.globus/scratch/whatever/garli_broken:
> invalid string offset 2684355476 >= 315067 for section ` .strtab'
> BFD:
> /a/storage.seil.umd.edu/export/home/seil/globus/.globus/scratch/whatever/garli_broken:
> invalid string offset 2818573205 >= 315067 for section ` .strtab'
>
> followed by:
>
> Dwarf Error: wrong version in compilation unit header (is 0, should be 2)
> [in module /a/storage.seil.umd.edu/export/home/seil/globus/.globus/scratch/whatever/garli_broken]
> Using host libthread_db library "/lib/tls/libthread_db.so.1".
>
> running it, of course, yields a segfault:
>
> (gdb) run
> Starting program:
> /a/storage.seil.umd.edu/export/home/seil/globus/.globus/scratch/whatever/garli_broken
> warning: shared library handler failed to enable breakpoint
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xf6081c1a in ?? ()
>
> So... any ideas? As you can see I've isolated the problem to an RFT
> transfer using the 4.0.6 container as the RFT service. The primary reason
> we upgraded in the first place was to avoid a memory leak with the PBS job
> manager, so... if anyone has an idea for a workaround, that would be
> helpful too. We are happy to provide additional information about our
> setup & config if that would be helpful.
>
> Thanks,
> Adam
>