Hi Ravi,

Just the default Derby database, which has worked fine on all of our other
installations. Again, I'm leaning towards something weird about what I was
doing messing with the Condor log file, and maybe something gets out of
sync, I don't know; the other possibility, of course, is that something
with RFT isn't configured properly and the other problem just happened to
expose it. If you know of anything I can do to debug, please let me know.

thanks,
Adam



On Tue, Sep 11, 2012 at 6:19 PM, Ravi K Madduri <[email protected]> wrote:

> Adam
> What database do you use for RFT? Is it Postgresql or derby?
>
> On Sep 11, 2012, at 5:08 PM, Adam Bazinet wrote:
>
> Sorry, forgot to attach the log:
>
> ----------------------------------------
> PROCESSING INTERNAL STATE:  -- StageOut --
> ----------------------------------------
> 2012-09-11T17:49:48.478-04:00 DEBUG processing.StateMachine
> [pool-1-thread-1,processInternalState:115] Processing resource
> d69fb7c0-fc45-11e1-93da-f931d11c3d83 in internal state StageOut
> 2012-09-11T17:49:48.478-04:00 DEBUG handler.InternalStateHandler
> [pool-1-thread-1,processInternalState:44]
> [resourceKey:d69fb7c0-fc45-11e1-93da-f931d11c3d83] Start processing
> internal state StageOut
> 2012-09-11T17:49:48.484-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,setState:846] receiving request for state change to
> StageOut
> 2012-09-11T17:49:48.484-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,setState:525] Setting new state RP value StageOut
> 2012-09-11T17:49:48.484-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,setState:854] State of job {
> http://www.globus.org/namespaces/2008/03/gram/job}ResourceID=d69fb7c0-fc45-11e1-93da-f931d11c3d83changed
>  to: StageOut
> 2012-09-11T17:49:48.484-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,setState:862] Holding: false
> 2012-09-11T17:49:48.484-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,setState:896] exitCode is null
> 2012-09-11T17:49:48.485-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,setState:905] Notifying of job state change to topic
> listeners
> 2012-09-11T17:49:48.549-04:00 DEBUG utils.LocalStagingHelper
> [pool-1-thread-1,submitStagingRequest:72]
> [resourceKey:d69fb7c0-fc45-11e1-93da-f931d11c3d83] Entering
> sumbitStagingRequest()
> 2012-09-11T17:49:48.552-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,getStagingCredential:434] getStagingCredential() called
> 2012-09-11T17:49:48.552-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,getResourceDatum:209] getting resource datum gridMapFile
> 2012-09-11T17:49:48.554-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,getStagingCredential:453] staging credential endpoint:
> Address: https://128.8.91.70:8443/wsrf/services/DelegationService
> Reference property[0]:
> <ns1:DelegationKey xmlns:ns1="
> http://www.globus.org/08/2004/delegationService
> ">d61f6430-fc45-11e1-93da-f931d11c3d83</ns1:DelegationKey>
>
> 2012-09-11T17:49:48.554-04:00 DEBUG utils.DelegatedCredential
> [pool-1-thread-1,getDelegatedCredential:98] Entering
> getDelegatedCredential()
> 2012-09-11T17:49:48.554-04:00 DEBUG utils.DelegatedCredential
> [pool-1-thread-1,getDelegatedCredential:112] checking for existing
> credential listener
> 2012-09-11T17:49:48.555-04:00 DEBUG utils.DelegatedCredential
> [pool-1-thread-1,getDelegationKey:605] Pulled out DelegationKey:
> d61f6430-fc45-11e1-93da-f931d11c3d83
> 2012-09-11T17:49:48.555-04:00 DEBUG utils.DelegatedCredential
> [pool-1-thread-1,getDelegatedCredential:124] reusing DelegatedCredential
> object
> 2012-09-11T17:49:48.555-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,getStagingCredential:476] Delegated Staging Credential
> Object: org.globus.exec.service.utils.DelegatedCredential@26b98a06
> 2012-09-11T17:49:48.555-04:00 DEBUG utils.DelegatedCredential
> [pool-1-thread-1,getCredential:333] waiting to receive initial credential...
> 2012-09-11T17:49:48.556-04:00 DEBUG utils.DelegatedCredential
> [pool-1-thread-1,getCredential:341] done waiting...
> 2012-09-11T17:49:48.559-04:00 DEBUG
> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83
> [pool-1-thread-1,getStagingCredential:479] Using proxy with subject
> /O=Grid/OU=GlobusTest/OU=
> simpleCA-leucine.umiacs.umd.edu/OU=umiacs.umd.edu/CN=GT4 Adminto contact
> the staging service
> 2012-09-11T17:49:48.580-04:00 DEBUG utils.LocalStagingHelper
> [pool-1-thread-1,submitStagingRequest:91] creating RFT resource
> 2012-09-11T17:49:48.711-04:00 DEBUG
> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:168]
> Registration to MDS enabled
> 2012-09-11T17:49:50.981-04:00 DEBUG utils.LocalInvocationHelper
> [pool-1-thread-1,getServiceObject:403]
> className=org.globus.transfer.reliable.service.factory.ReliableFileTransferFactoryService
> handlerClass=org.globus.axis.providers.RPCProvider
> 2012-09-11T17:49:50.981-04:00 DEBUG utils.LocalInvocationHelper
> [pool-1-thread-1,getServiceObject:416] getting the service object
> 2012-09-11T17:49:50.982-04:00 DEBUG utils.LocalInvocationHelper
> [pool-1-thread-1,getServiceObject:422] caching service object for service
> ReliableFileTransferFactoryService
> 2012-09-11T17:49:50.983-04:00 DEBUG utils.LocalInvocationHelper
> [pool-1-thread-1,getMethod:457] caching method
> ReliableFileTransferFactoryService.createReliableFileTransfer
> 2012-09-11T17:49:50.984-04:00 DEBUG
> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:168]
> PerformanceLog createReliableFileTransfer() enter
> 2012-09-11T17:49:50.984-04:00 DEBUG
> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:168] Got
> a transfer request
> 2012-09-11T17:49:51.049-04:00 DEBUG service.ReliableFileTransferHome
> [pool-1-thread-1,oldLog:168] Loading the RFT home's list of resource keys
> 2012-09-11T17:49:51.051-04:00 DEBUG database.RFTDatabaseSetup
> [pool-1-thread-1,oldLog:168] getDBConnection() enter
> 2012-09-11T17:49:51.051-04:00 WARN  service.ReliableFileTransferHome
> [pool-1-thread-1,oldLog:190] All RFT requests will fail and all GRAM jobs
> that require file staging will fail.Database driver is not initialized,
> Need to setup database
> 2012-09-11T17:49:51.052-04:00 DEBUG service.ReliableFileTransferHome
> [pool-1-thread-1,oldLog:164] All RFT requests will fail and all GRAM jobs
> that require file staging will fail.Database driver is not initialized,
> Need to setup database
> org.globus.transfer.reliable.service.database.RftDBException: Database
> driver is not initialized, Need to setup database
> at
> org.globus.transfer.reliable.service.database.RFTDatabaseSetup.getDBConnection(RFTDatabaseSetup.java:249)
>         at
> org.globus.transfer.reliable.service.database.ReliableFileTransferDbAdapter.getActiveRequestIds(ReliableFileTransferDbAdapter.java:917)
>         at
> org.globus.transfer.reliable.service.ReliableFileTransferResource.getStoredResourceKeyValues(ReliableFileTransferResource.java:544)
>         at
> org.globus.transfer.reliable.service.ReliableFileTransferHome.initialize(ReliableFileTransferHome.java:69)
>         at
> org.globus.wsrf.jndi.BasicBeanFactory.getObjectInstance(BasicBeanFactory.java:54)
>         at
> org.globus.wsrf.jndi.BeanFactory.getInstance(BeanFactory.java:118)
>         at org.globus.wsrf.jndi.BeanFactory.access$100(BeanFactory.java:40)
>         at
> org.globus.wsrf.jndi.BeanFactory$GetInstanceAction.run(BeanFactory.java:142)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.globus.gsi.jaas.GlobusSubject.runAs(GlobusSubject.java:60)
>         at org.globus.gsi.jaas.JaasSubject.doAs(JaasSubject.java:100)
>         at
> org.globus.wsrf.jndi.BeanFactory.getObjectInstance(BeanFactory.java:90)
>         at
> org.apache.naming.factory.ResourceFactory.getObjectInstance(ResourceFactory.java:135)
>         at
> javax.naming.spi.NamingManager.getObjectInstance(NamingManager.java:321)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:827)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:155)
>         at
> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:815)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:155)
>         at
> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:815)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:155)
>         at
> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:815)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:155)
>         at
> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:815)
>         at org.apache.naming.NamingContext.lookup(NamingContext.java:168)
>         at
> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:81)
>         at javax.naming.InitialContext.lookup(InitialContext.java:409)
>         at
> org.apache.naming.SelectorContext.lookup(SelectorContext.java:145)
>         at javax.naming.InitialContext.lookup(InitialContext.java:409)
>         at
> org.globus.transfer.reliable.service.factory.ReliableFileTransferFactoryService.createReliableFileTransfer(ReliableFileTransferFactoryService.java:247)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:616)
>         at
> org.globus.exec.service.exec.utils.LocalInvocationHelper.callService(LocalInvocationHelper.java:140)
>         at
> org.globus.exec.service.exec.utils.LocalStagingHelper.submitStagingRequest(LocalStagingHelper.java:95)
>         at
> org.globus.exec.service.exec.processing.handler.StageOutStateHandler.process(StageOutStateHandler.java:79)
>         at
> org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49)
>         at
> org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:121)
>         at
> org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:82)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> 2012-09-11T17:49:51.062-04:00 DEBUG database.RFTDatabaseSetup
> [pool-1-thread-1,oldLog:168] getDBConnection() enter
> 2012-09-11T17:49:51.062-04:00 ERROR
> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:175]
> Unable to create RFT resource
> org.globus.transfer.reliable.service.database.RftDBException: Database
> driver is not initialized, Need to setup database
>         at
> org.globus.transfer.reliable.service.database.RFTDatabaseSetup.getDBConnection(RFTDatabaseSetup.java:249)
>         at
> org.globus.transfer.reliable.service.database.ReliableFileTransferDbAdapter.storeTransferRequest(ReliableFileTransferDbAdapter.java:204)
>         at
> org.globus.transfer.reliable.service.ReliableFileTransferResource.<init>(ReliableFileTransferResource.java:191)
>         at
> org.globus.transfer.reliable.service.ReliableFileTransferHome.create(ReliableFileTransferHome.java:126)
>         at
> org.globus.transfer.reliable.service.factory.ReliableFileTransferFactoryService.createReliableFileTransfer(ReliableFileTransferFactoryService.java:249)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:616)
>         at
> org.globus.exec.service.exec.utils.LocalInvocationHelper.callService(LocalInvocationHelper.java:140)
>         at
> org.globus.exec.service.exec.utils.LocalStagingHelper.submitStagingRequest(LocalStagingHelper.java:95)
>         at
> org.globus.exec.service.exec.processing.handler.StageOutStateHandler.process(StageOutStateHandler.java:79)
>         at
> org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49)
>         at
> org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:121)
>         at
> org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:82)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils
> [pool-1-thread-1,makeFault:541] Fault Class: class
> org.globus.exec.generated.StagingFaultType
> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils
> [pool-1-thread-1,makeFault:542] Resource Key: {
> http://www.globus.org/namespaces/2008/03/gram/job}ResourceID=d69fb7c0-fc45-11e1-93da-f931d11c3d83
> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils
> [pool-1-thread-1,makeFault:543] Description: Staging error for RSL element
> fileStageOut.
> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils
> [pool-1-thread-1,makeFault:545] Cause: java.rmi.RemoteException
> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils
> [pool-1-thread-1,makeFault:549] State when failure occurred: StageOut
> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils
> [pool-1-thread-1,makeFault:551] Script Command: StageOut
>
>
> On Tue, Sep 11, 2012 at 6:05 PM, Adam Bazinet <[email protected]>wrote:
>
>> Dear list and particularly GT 4.x and Condor GRAM users,
>>
>> To those of you who may be interested, I have confirmed that the Condor
>> SEG is sensitive to the order and formatting of the elements in
>> globus-condor.log. If I shift them around manually so that they conform to
>> what a <c> block looks like with an older version of Condor, Globus
>> properly picks up the fact that my jobs are complete. However, as it goes
>> to stage out the files, I get an RFT failure saying the "Database driver is
>> not initialized, Need to setup database". My question is: do I have
>> something wrong with RFT in this Globus installation? This seems unlikely
>> since I have several others configured the same way without any issue, and
>> there are no problems staging files *in*. Does it have to do with my
>> manually mucking around with the globus-condor.log file? A relevant bit of
>> the container.log is below. I'd appreciate your input because regarding
>> this and also a possible workaround for the SEG parsing problem (Condor
>> version downgrade, modifications to SEG parsing code, or otherwise).
>>
>> thanks,
>> Adam
>>
>>
>>
>> On Thu, Aug 30, 2012 at 4:09 PM, Adam Bazinet <[email protected]>wrote:
>>
>>> Hi all,
>>>
>>> I just installed a new GT 4.2.1 on a Condor resource, and while jobs run
>>> and complete in Condor fine, the finished status is not being picked up by
>>> Globus appropriately.
>>>
>>> Is the GT 4.2.1 sensitive to the order/formatting of the elements in
>>> globus-condor.log?
>>>
>>> For example, here is a JobTerminatedEvent from globus-condor.log from an
>>> older Condor/Globus installation that works properly:
>>>
>>> <c>
>>>     <a n="MyType"><s>JobTerminatedEvent</s></a>
>>>     <a n="EventTypeNumber"><i>5</i></a>
>>>     <a n="MyType"><s>JobTerminatedEvent</s></a>
>>>     <a n="EventTime"><s>2012-08-29T00:32:04</s></a>
>>>     <a n="Cluster"><i>18720</i></a>
>>>     <a n="Proc"><i>36</i></a>
>>>     <a n="Subproc"><i>0</i></a>
>>>     <a n="TerminatedNormally"><b v="t"/></a>
>>>     <a n="ReturnValue"><i>0</i></a>
>>>     <a n="RunLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>>>     <a n="RunRemoteUsage"><s>Usr 0 07:01:33, Sys 0 00:00:13</s></a>
>>>     <a n="TotalLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>>>     <a n="TotalRemoteUsage"><s>Usr 0 07:01:33, Sys 0 00:00:13</s></a>
>>>     <a n="SentBytes"><r>3.244000000000000E+04</r></a>
>>>     <a n="ReceivedBytes"><r>4.281464000000000E+06</r></a>
>>>     <a n="TotalSentBytes"><r>3.244000000000000E+04</r></a>
>>>     <a n="TotalReceivedBytes"><r>5.565903200000000E+07</r></a>
>>> </c>
>>>
>>> and here is the one from the new resource that is NOT working properly:
>>>
>>> <c>
>>>     <a n="MyType"><s>JobTerminatedEvent</s></a>
>>>     <a n="TotalLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>>>     <a n="Proc"><i>0</i></a>
>>>     <a n="EventTime"><s>2012-08-28T16:07:17</s></a>
>>>     <a n="TotalRemoteUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>>>     <a n="TotalReceivedBytes"><r>2.231476000000000E+06</r></a>
>>>     <a n="ReturnValue"><i>0</i></a>
>>>     <a n="RunRemoteUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>>>     <a n="RunLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a>
>>>     <a n="SentBytes"><r>3.979500000000000E+04</r></a>
>>>     <a n="Cluster"><i>20</i></a>
>>>     <a n="TotalSentBytes"><r>3.979500000000000E+04</r></a>
>>>     <a n="Subproc"><i>0</i></a>
>>>     <a n="CurrentTime"><e>time()</e></a>
>>>     <a n="EventTypeNumber"><i>5</i></a>
>>>     <a n="ReceivedBytes"><r>2.231476000000000E+06</r></a>
>>>     <a n="TerminatedNormally"><b v="t"/></a>
>>> </c>
>>>
>>> Basically, the only variable that's different about this resource is
>>> that it's running a newer version of Condor. My hunch is that broke
>>> compatibility somewhere along the way. Can someone confirm this, or provide
>>> another mechanism to debug? I'm attaching the container.log from the
>>> resource in question, which has GRAM debugging enabled as some jobs came
>>> in. It didn't really show me anything, though.
>>>
>>> thanks,
>>> Adam
>>>
>>>
>>
>
> --Ravi
>
>

Reply via email to