We don't yet know why or under what circumstances this
error occurs, because it happens rather infrequently,
but there is a fix for it in 4.0.6 and the upcoming 4.2.
Please try the following: Replace

    my $count = File::Path::rmtree($job_path);

in $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager.pm
in the subroutine cache_cleanup (at about line 757)
with

    chdir("/");
    my $count = File::Path::rmtree($job_path);

You don't have to restart the container for this
change to take effect; just submit another job.
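
The pattern in the fix above can be tried in isolation with a small
standalone script. This is only a sketch: the scratch directory made with
File::Temp and the dummy file are hypothetical stand-ins for the real job
directory, and the step that moves the process into that directory is an
assumption made for the demonstration, not a confirmed cause of the failure.

```perl
use strict;
use warnings;
use File::Path ();
use File::Temp ();
use File::Spec ();

# Hypothetical stand-in for the job directory; not the real $job_path
# used by JobManager.pm.
my $job_path = File::Temp::tempdir(CLEANUP => 0);

# Put one file inside so there is something to remove.
open(my $fh, '>', File::Spec->catfile($job_path, 'dummy'))
    or die "cannot create dummy file: $!";
close($fh);

# Assumption for the demo: the process is sitting inside the directory
# that is about to be deleted.
chdir($job_path) or die "cannot chdir to $job_path: $!";

# The suggested change: step out of the tree first, then remove it.
chdir("/") or warn "cannot chdir to /: $!";
my $count = File::Path::rmtree($job_path);

print "rmtree reported $count removed entries\n";
print "WARNING: $job_path still exists\n" if -e $job_path;
```

If the warning line is printed, the directory survived the cleanup and the
rmtree error messages are worth a closer look.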
You would also do me a favor if you tried, as an
alternative, replacing the original rmtree line with

    system("rm -rf $job_path");

and see whether that solves the problem too, because I
can't reproduce it here.
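
If you try the rm variant, a slightly more defensive form is sketched
below. It is my own variation, not part of the shipped fix: it passes the
arguments as a list so that no shell interprets the path, and it checks the
exit status. The scratch directory is again a hypothetical stand-in for the
job directory.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical stand-in for the job directory.
my $job_path = File::Temp::tempdir(CLEANUP => 0);

# List form of system(): rm is executed directly, so spaces or shell
# metacharacters in the path are not interpreted by a shell.
my $status = system("rm", "-rf", $job_path);

if ($status == -1) {
    warn "could not run rm: $!\n";
} elsif ($status != 0) {
    warn "rm -rf $job_path failed with exit status " . ($status >> 8) . "\n";
}
```

For the test itself, the single-line version above is enough; the status
check just makes a failure visible on stderr.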

Please let me know how it works.

Note: This seems to be a different issue from the one
you have with your Java client, but I would like to
see it working with our clients first to eliminate
possible causes of errors.

Martin


> Martin,
>           I tried the test, and it failed:
> [EMAIL PROTECTED] ~]$ globusrun-ws \
>> -submit \
>> -F https://g3.grid.cn:8443/wsrf/services/ManagedJobFactoryService \
>> -Ft Fork \
>> -S \
>> -f job.xml
> Delegating user credentials...Done.
> Submitting job...Done.
> Job ID: uuid:e5796f12-bb58-11dc-b14a-0019b9f6a7f4
> Termination time: 01/06/2008 06:38 GMT
> Current job state: StageIn
> Current job state: Active
> Current job state: CleanUp
> Current job state: Failed
> Destroying job...Done.
> Cleaning up any delegated credentials...Done.
>
> and the job.xml:
> <job>
>   <executable>/bin/date</executable>
>   <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
>   <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
>   <fileStageIn>
>     <transfer>
>       <sourceUrl>gsiftp://g3.grid.cn:2811/bin/echo</sourceUrl>
>       <destinationUrl>gsiftp://g3.grid.cn:2811/tmp/TESTFILE</destinationUrl>
>     </transfer>
>   </fileStageIn>
> </job>
>
> I did some checking and found the following:
> 1. The file transfer completed (I used the "diff" command to compare the
> two files).
> 2. The stdout file contains the output of the "date" command
> (Sat Jan  5 14:39:05 CST 2008), and the stderr file is empty.
>
> So I'm even more confused about why the job failed. Could you please give
> me some hints?
>
> The attachment is the complete container logfile of the g3 node.
>
>
> Regards,
> ---------------------
> Lin Wan
>

