Martin Feller wrote:
> Hi,
> 
> The problem you describe, which is summarized in the bug you mention,
> is an architectural problem in WS-GRAM in 4.0.
> We fixed it in the 4.2 branch. The fix required an interface change,
> which is why we can't backport it to the 4.0 branch.
> If you can upgrade to the 4.2 series, I'd recommend that.
> 
> With 4.0.x there is currently no other way than:
> 1. Stop the container
> 2. Delete the problematic job from the persistence directory (by default
>    ~/.globus of the user who runs the container).
>    In your case: remove the file
>    
> ~containeruser/.globus/<hostname>-<port>/ManagedExecutableJobResourceStateType/1748b3d0-8c4b-11de-8543-b8f655c16264.xml

I'm sorry, the path I gave wasn't correct. The persistence directory is by
default ~containeruser/.globus/persisted, so the file to remove should be
~containeruser/.globus/persisted/<hostname>-<port>/ManagedExecutableJobResourceStateType/1748b3d0-8c4b-11de-8543-b8f655c16264.xml

-Martin

> 3. Restart the container.
> 
> -Martin
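
Step 2 of the workaround above could be sketched in Python roughly as
follows. This is only an illustration, not part of the Globus Toolkit:
the function names are made up, the directory layout is the default one
given in the corrected path above, and <hostname> and <port> stay
parameters because their actual values depend on the installation. It
assumes the container has already been stopped (step 1) and leaves the
restart (step 3) to the administrator.

```python
from pathlib import Path

# Resource-type directory name as given in the message above.
RESOURCE_DIR = "ManagedExecutableJobResourceStateType"


def persisted_job_file(container_home, hostname, port, job_id):
    """Build the path of the persisted state file for one job.

    Assumes the default persistence directory (~/.globus/persisted of
    the user who runs the container); hypothetical helper.
    """
    return (Path(container_home) / ".globus" / "persisted"
            / f"{hostname}-{port}" / RESOURCE_DIR / f"{job_id}.xml")


def remove_persisted_job(path, dry_run=True):
    """Remove the state file. Returns True if the file was found.

    With dry_run=True (the default) nothing is deleted, so the path
    can be checked first.
    """
    path = Path(path)
    if not path.is_file():
        return False
    if not dry_run:
        path.unlink()
    return True
```

Calling remove_persisted_job() with the default dry_run=True first only
confirms that the file is where it is expected to be; pass
dry_run=False to actually delete it.
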
> 
> Hazlewood, Victor Gene wrote:
>> Hey GTers,
>>
>> Running WSRF v 4.0.8-r2 on a Cray XT5.  Have a user job that looks like
>> it has gone into an unresolvable state and the log file is filling up
>> with messages about not being able to resolve the FailureFileCleanUp
>> state.   Anyone have any suggestions how to get rid of this?   Have
>> looked at the documentation (nothing I found covers this), looked at
>> bugzilla (http://bugzilla.mcs.anl.gov/globus/show_bug.cgi?id=5247 is
>> close but says it will be fixed in a future release, but gives no
>> instructions how to resolve it currently). I'm running out of ideas.
>>
>>  
>>
>> The recurring messages are
>>
>>  
>>
>> 2009-08-29 12:40:02,267 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,getInternalState:1666] getting resource datum internalState
>>
>> 2009-08-29 12:40:02,267 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,remove:285] Waiting to be Done or Failed. Current state:
>> FailureFileCleanUp
>>
>>  
>>
>> Any help on how to resolve this would be appreciated (besides the "it is
>> fixed in the next release" type of resolution).
>>
>>  
>>
>> Below are the complete job entries for the job.
>>
>>  
>>
>> -Victor
>>
>>  
>>
>>  
>>
>> Victor Hazlewood, CISSP
>>
>> Senior HPC Systems Analyst
>>
>> National Institute for Computational Science
>>
>> University of Tennessee
>>
>> http://www.nics.tennessee.edu/ <http://www.nics.utk.edu/> 
>>
>>  
>>
>>  
>>
>> Complete log file entry:
>>
>>  
>>
>> 2009-08-28 20:13:32,174 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initialize:142] Entering initialize()
>>
>> 2009-08-28 20:13:32,175 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initialize:147] at super.initialize()
>>
>> 2009-08-28 20:13:32,180 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initialize:153] at initSecurity()
>>
>> 2009-08-28 20:13:32,180 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initSecurity:316] Entering initSecurity()
>>
>> 2009-08-28 20:13:32,182 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initSecurity:338] resource credential subject:
>>
>> 2009-08-28 20:13:32,183 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initSecurity:346] setting resource securty grid map...
>>
>> 2009-08-28 20:13:32,183 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initSecurity:356] Leaving initSecurity()
>>
>> 2009-08-28 20:13:32,186 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initVariableMap:704]
>> GLOBUS_SCRATCH_DIR:${GLOBUS_USER_HOME}/.globus/scratch
>>
>> 2009-08-28 20:13:32,370 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1290] resolving variables in attribute
>> environment
>>
>> 2009-08-28 20:13:32,370 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1295] looking at string
>> ${GLOBUS_USER_HOME}
>>
>> 2009-08-28 20:13:32,370 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1296] found $ at index 0
>>
>> 2009-08-28 20:13:32,371 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1302] found '{'---looks like a
>> reference
>>
>> 2009-08-28 20:13:32,371 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1348] looking up GLOBUS_USER_HOME in
>> {GLOBUS_SCRATCH_DIR=${GLOBUS_USER_HOME}/.globus/scratch,
>> GLOBUS_LOCATION=/usr/local/globus-wsrf-4.0.8-r2,
>> GLOBUS_JOB_ID=1748b3d0-8c4b-11de-8543-b8f655c16264,
>> GLOBUS_USER_HOME=/nics/c/home/turuncu, GLOBUS_USER_NAME=turuncu}
>>
>> 2009-08-28 20:13:32,371 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1353] mapped GLOBUS_USER_HOME to value
>> /nics/c/home/turuncu
>>
>> 2009-08-28 20:13:32,371 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1392] Final string is
>> /nics/c/home/turuncu
>>
>> 2009-08-28 20:13:32,372 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1290] resolving variables in attribute
>> environment
>>
>> 2009-08-28 20:13:32,372 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1295] looking at string
>> ${GLOBUS_USER_NAME}
>>
>> 2009-08-28 20:13:32,372 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1296] found $ at index 0
>>
>> 2009-08-28 20:13:32,372 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1302] found '{'---looks like a
>> reference
>>
>> 2009-08-28 20:13:32,373 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1348] looking up GLOBUS_USER_NAME in
>> {GLOBUS_SCRATCH_DIR=${GLOBUS_USER_HOME}/.globus/scratch,
>> GLOBUS_LOCATION=/usr/local/globus-wsrf-4.0.8-r2,
>> GLOBUS_JOB_ID=1748b3d0-8c4b-11de-8543-b8f655c16264,
>> GLOBUS_USER_HOME=/nics/c/home/turuncu, GLOBUS_USER_NAME=turuncu}
>>
>> 2009-08-28 20:13:32,373 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1353] mapped GLOBUS_USER_NAME to value
>> turuncu
>>
>> 2009-08-28 20:13:32,373 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1392] Final string is turuncu
>>
>> 2009-08-28 20:13:32,373 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1290] resolving variables in attribute
>> environment
>>
>> 2009-08-28 20:13:32,374 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1295] looking at string
>> ${GLOBUS_SCRATCH_DIR}
>>
>> 2009-08-28 20:13:32,374 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1296] found $ at index 0
>>
>> 2009-08-28 20:13:32,374 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1302] found '{'---looks like a
>> reference
>>
>> 2009-08-28 20:13:32,374 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1348] looking up GLOBUS_SCRATCH_DIR in
>> {GLOBUS_SCRATCH_DIR=${GLOBUS_USER_HOME}/.globus/scratch,
>> GLOBUS_LOCATION=/usr/local/globus-wsrf-4.0.8-r2,
>> GLOBUS_JOB_ID=1748b3d0-8c4b-11de-8543-b8f655c16264,
>> GLOBUS_USER_HOME=/nics/c/home/turuncu, GLOBUS_USER_NAME=turuncu}
>>
>> 2009-08-28 20:13:32,375 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1353] mapped GLOBUS_SCRATCH_DIR to
>> value ${GLOBUS_USER_HOME}/.globus/scratch
>>
>> 2009-08-28 20:13:32,375 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1295] looking at string
>> ${GLOBUS_USER_HOME}/.globus/scratch
>>
>> 2009-08-28 20:13:32,375 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1296] found $ at index 0
>>
>> 2009-08-28 20:13:32,375 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1302] found '{'---looks like a
>> reference
>>
>> 2009-08-28 20:13:32,376 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1348] looking up GLOBUS_USER_HOME in
>> {GLOBUS_SCRATCH_DIR=${GLOBUS_USER_HOME}/.globus/scratch,
>> GLOBUS_LOCATION=/usr/local/globus-wsrf-4.0.8-r2,
>> GLOBUS_JOB_ID=1748b3d0-8c4b-11de-8543-b8f655c16264,
>> GLOBUS_USER_HOME=/nics/c/home/turuncu, GLOBUS_USER_NAME=turuncu}
>>
>> 2009-08-28 20:13:32,376 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1353] mapped GLOBUS_USER_HOME to value
>> /nics/c/home/turuncu
>>
>> 2009-08-28 20:13:32,376 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,resolveVariableInString:1392] Final string is
>> /nics/c/home/turuncu/.globus/scratch
>>
>> 2009-08-28 20:13:32,377 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initExtraPerlAttributes:588] Adding extra attributes to the
>> Perl job attribute map
>>
>> 2009-08-28 20:13:32,377 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initExtraPerlAttributes:615] checking for condorness of PBS
>>
>> 2009-08-28 20:13:32,421 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initialize:171] Perl Job Description: $description = {
>>
>>     jobdir => [
>> '/nics/c/home/turuncu/.globus/1748b3d0-8c4b-11de-8543-b8f655c16264' ],
>>
>>     environment => [ [ 'GLOBUS_LOCATION',
>> '/usr/local/globus-wsrf-4.0.8-r2' ], [ 'X509_CERT_DIR',
>> '/etc/grid-security/certificates' ], [ 'X509_USER_PROXY', '' ], [
>> 'X509_USER_CERT', '' ], [ 'X509_USER_KEY', '' ], [ 'HOME',
>> '/nics/c/home/turuncu' ], [ 'LOGNAME', 'turuncu' ], [
>> 'SCRATCH_DIRECTORY', '/nics/c/home/turuncu/.globus/scratch' ], [
>> 'JAVA_HOME', '/opt/java/jdk1.6.0_05/jre' ], [ 'GLOBUS_GRAM_JOB_HANDLE',
>> 'https://grid.nics.utk.edu:4321/wsrf/services/ManagedExecutableJobService?1748b3d0-8c4b-11de-8543-b8f655c16264' ],  ],
>>
>> 2009-08-28 20:13:32,421 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,initialize:178] Leaving initialize()
>>
>> 2009-08-28 20:13:32,429 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,getInternalState:1666] getting resource datum internalState
>>
>> 2009-08-28 20:13:32,429 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,remove:275] Remove called with external state Done and
>> internal state FailureFileCleanUp
>>
>> 2009-08-28 20:13:32,429 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,remove:285] Waiting to be Done or Failed. Current state:
>> FailureFileCleanUp
>>
>> 2009-08-28 20:13:34,432 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,getInternalState:1666] getting resource datum internalState
>>
>> 2009-08-28 20:13:34,432 DEBUG
>> ManagedExecutableJobResource.1748b3d0-8c4b-11de-8543-b8f655c16264
>> [Thread-7,remove:285] Waiting to be Done or Failed. Current state:
>> FailureFileCleanUp
>>
>>  
>>
>> (last two messages repeated 29536 times)
>>
>>  
>>
>>  
>>
>>
> 
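
To find out which jobs are stuck in the way the log above shows, the
container log could be scanned for the repeating "Waiting to be Done or
Failed" message and counted per job ID and state. The sketch below is
only an illustration: the function name is made up, the pattern is
derived from the log lines quoted in this thread, and it assumes the
raw log text without the ">>" quote markers (whitespace, including line
wraps, is tolerated).

```python
import re
from collections import Counter

# Pattern derived from the log entries quoted above; \s+ lets it match
# whether an entry is on one line or wrapped across several.
WAITING = re.compile(
    r"ManagedExecutableJobResource\.([0-9a-f-]+)\s+"
    r"\[[^\]]*\]\s+"
    r"Waiting to be Done or Failed\.\s+Current state:\s+(\w+)"
)


def count_waiting(log_text):
    """Count 'Waiting to be Done or Failed' entries per (job ID, state)."""
    return Counter(WAITING.findall(log_text))
```

A job ID with a very large count and an unchanging state such as
FailureFileCleanUp is a candidate for the cleanup described above.
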
