Adam I can't think of any reason why messing with condor log file will affect RFT. Can you, may be, manually create the derby database and restart the container? Or alternatively, you can create a mySQL database (instructions are here: http://www.globus.org/toolkit/docs/4.2/4.2.0/data/rft/admin/rft-admin-mysql.html) On Sep 11, 2012, at 5:23 PM, Adam Bazinet wrote:
> Hi Ravi, > > Just the default Derby database, which has worked fine on all of our other > installations. Again, I'm leaning towards something weird about what I was > doing messing with the Condor log file, and maybe something gets out of sync, > I don't know; the other possibility, of course, is that something with RFT > isn't configured properly and the other problem just happened to expose it. > If you know of anything I can do to debug, please let me know. > > thanks, > Adam > > > > On Tue, Sep 11, 2012 at 6:19 PM, Ravi K Madduri <[email protected]> wrote: > Adam > What database do you use for RFT? Is it Postgresql or derby? > > On Sep 11, 2012, at 5:08 PM, Adam Bazinet wrote: > >> Sorry, forgot to attach the log: >> >> ---------------------------------------- >> PROCESSING INTERNAL STATE: -- StageOut -- >> ---------------------------------------- >> 2012-09-11T17:49:48.478-04:00 DEBUG processing.StateMachine >> [pool-1-thread-1,processInternalState:115] Processing resource >> d69fb7c0-fc45-11e1-93da-f931d11c3d83 in internal state StageOut >> 2012-09-11T17:49:48.478-04:00 DEBUG handler.InternalStateHandler >> [pool-1-thread-1,processInternalState:44] >> [resourceKey:d69fb7c0-fc45-11e1-93da-f931d11c3d83] Start processing internal >> state StageOut >> 2012-09-11T17:49:48.484-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,setState:846] receiving request for state change to StageOut >> 2012-09-11T17:49:48.484-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,setState:525] Setting new state RP value StageOut >> 2012-09-11T17:49:48.484-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,setState:854] State of job >> {http://www.globus.org/namespaces/2008/03/gram/job}ResourceID=d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> changed to: StageOut >> 2012-09-11T17:49:48.484-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,setState:862] Holding: false >> 2012-09-11T17:49:48.484-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,setState:896] exitCode is null >> 2012-09-11T17:49:48.485-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,setState:905] Notifying of job state change to topic >> listeners >> 2012-09-11T17:49:48.549-04:00 DEBUG utils.LocalStagingHelper >> [pool-1-thread-1,submitStagingRequest:72] >> [resourceKey:d69fb7c0-fc45-11e1-93da-f931d11c3d83] Entering >> sumbitStagingRequest() >> 2012-09-11T17:49:48.552-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,getStagingCredential:434] getStagingCredential() called >> 2012-09-11T17:49:48.552-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,getResourceDatum:209] getting resource datum gridMapFile >> 2012-09-11T17:49:48.554-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,getStagingCredential:453] staging credential endpoint: >> Address: https://128.8.91.70:8443/wsrf/services/DelegationService >> Reference property[0]: >> <ns1:DelegationKey >> xmlns:ns1="http://www.globus.org/08/2004/delegationService">d61f6430-fc45-11e1-93da-f931d11c3d83</ns1:DelegationKey> >> >> 2012-09-11T17:49:48.554-04:00 DEBUG utils.DelegatedCredential >> [pool-1-thread-1,getDelegatedCredential:98] Entering getDelegatedCredential() >> 2012-09-11T17:49:48.554-04:00 DEBUG utils.DelegatedCredential >> [pool-1-thread-1,getDelegatedCredential:112] checking for existing >> credential listener >> 2012-09-11T17:49:48.555-04:00 DEBUG utils.DelegatedCredential >> [pool-1-thread-1,getDelegationKey:605] Pulled out DelegationKey: >> d61f6430-fc45-11e1-93da-f931d11c3d83 >> 2012-09-11T17:49:48.555-04:00 DEBUG utils.DelegatedCredential >> [pool-1-thread-1,getDelegatedCredential:124] reusing DelegatedCredential >> object >> 2012-09-11T17:49:48.555-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,getStagingCredential:476] Delegated Staging Credential >> Object: org.globus.exec.service.utils.DelegatedCredential@26b98a06 >> 2012-09-11T17:49:48.555-04:00 DEBUG utils.DelegatedCredential >> [pool-1-thread-1,getCredential:333] waiting to receive initial credential... >> 2012-09-11T17:49:48.556-04:00 DEBUG utils.DelegatedCredential >> [pool-1-thread-1,getCredential:341] done waiting... >> 2012-09-11T17:49:48.559-04:00 DEBUG >> PersistentManagedExecutableJobResource.d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> [pool-1-thread-1,getStagingCredential:479] Using proxy with subject >> /O=Grid/OU=GlobusTest/OU=simpleCA-leucine.umiacs.umd.edu/OU=umiacs.umd.edu/CN=GT4 >> Adminto contact the staging service >> 2012-09-11T17:49:48.580-04:00 DEBUG utils.LocalStagingHelper >> [pool-1-thread-1,submitStagingRequest:91] creating RFT resource >> 2012-09-11T17:49:48.711-04:00 DEBUG >> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:168] >> Registration to MDS enabled >> 2012-09-11T17:49:50.981-04:00 DEBUG utils.LocalInvocationHelper >> [pool-1-thread-1,getServiceObject:403] >> className=org.globus.transfer.reliable.service.factory.ReliableFileTransferFactoryService >> handlerClass=org.globus.axis.providers.RPCProvider >> 2012-09-11T17:49:50.981-04:00 DEBUG utils.LocalInvocationHelper >> [pool-1-thread-1,getServiceObject:416] getting the service object >> 2012-09-11T17:49:50.982-04:00 DEBUG utils.LocalInvocationHelper >> [pool-1-thread-1,getServiceObject:422] caching service object for service >> ReliableFileTransferFactoryService >> 2012-09-11T17:49:50.983-04:00 DEBUG utils.LocalInvocationHelper >> [pool-1-thread-1,getMethod:457] caching method >> ReliableFileTransferFactoryService.createReliableFileTransfer >> 2012-09-11T17:49:50.984-04:00 DEBUG >> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:168] >> PerformanceLog createReliableFileTransfer() enter >> 2012-09-11T17:49:50.984-04:00 DEBUG >> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:168] Got >> a transfer request >> 2012-09-11T17:49:51.049-04:00 DEBUG service.ReliableFileTransferHome >> [pool-1-thread-1,oldLog:168] Loading the RFT home's list of resource keys >> 2012-09-11T17:49:51.051-04:00 DEBUG database.RFTDatabaseSetup >> [pool-1-thread-1,oldLog:168] getDBConnection() enter >> 2012-09-11T17:49:51.051-04:00 WARN service.ReliableFileTransferHome >> [pool-1-thread-1,oldLog:190] All RFT requests will fail and all GRAM jobs >> that require file staging will fail.Database driver is not initialized, Need >> to setup database >> 2012-09-11T17:49:51.052-04:00 DEBUG service.ReliableFileTransferHome >> [pool-1-thread-1,oldLog:164] All RFT requests will fail and all GRAM jobs >> that require file staging will fail.Database driver is not initialized, Need >> to setup database >> org.globus.transfer.reliable.service.database.RftDBException: Database >> driver is not initialized, Need to setup database >> at >> org.globus.transfer.reliable.service.database.RFTDatabaseSetup.getDBConnection(RFTDatabaseSetup.java:249) >> at >> org.globus.transfer.reliable.service.database.ReliableFileTransferDbAdapter.getActiveRequestIds(ReliableFileTransferDbAdapter.java:917) >> at >> org.globus.transfer.reliable.service.ReliableFileTransferResource.getStoredResourceKeyValues(ReliableFileTransferResource.java:544) >> at >> org.globus.transfer.reliable.service.ReliableFileTransferHome.initialize(ReliableFileTransferHome.java:69) >> at >> org.globus.wsrf.jndi.BasicBeanFactory.getObjectInstance(BasicBeanFactory.java:54) >> at org.globus.wsrf.jndi.BeanFactory.getInstance(BeanFactory.java:118) >> at org.globus.wsrf.jndi.BeanFactory.access$100(BeanFactory.java:40) >> at >> org.globus.wsrf.jndi.BeanFactory$GetInstanceAction.run(BeanFactory.java:142) >> at java.security.AccessController.doPrivileged(Native Method) >> at javax.security.auth.Subject.doAs(Subject.java:416) >> at org.globus.gsi.jaas.GlobusSubject.runAs(GlobusSubject.java:60) >> at org.globus.gsi.jaas.JaasSubject.doAs(JaasSubject.java:100) >> at >> org.globus.wsrf.jndi.BeanFactory.getObjectInstance(BeanFactory.java:90) >> at >> org.apache.naming.factory.ResourceFactory.getObjectInstance(ResourceFactory.java:135) >> at >> javax.naming.spi.NamingManager.getObjectInstance(NamingManager.java:321) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:827) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:155) >> at >> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:815) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:155) >> at >> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:815) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:155) >> at >> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:815) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:155) >> at >> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:69) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:815) >> at org.apache.naming.NamingContext.lookup(NamingContext.java:168) >> at >> org.apache.naming.SynchronizedContext.lookup(SynchronizedContext.java:81) >> at javax.naming.InitialContext.lookup(InitialContext.java:409) >> at org.apache.naming.SelectorContext.lookup(SelectorContext.java:145) >> at javax.naming.InitialContext.lookup(InitialContext.java:409) >> at >> org.globus.transfer.reliable.service.factory.ReliableFileTransferFactoryService.createReliableFileTransfer(ReliableFileTransferFactoryService.java:247) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:616) >> at >> org.globus.exec.service.exec.utils.LocalInvocationHelper.callService(LocalInvocationHelper.java:140) >> at >> org.globus.exec.service.exec.utils.LocalStagingHelper.submitStagingRequest(LocalStagingHelper.java:95) >> at >> org.globus.exec.service.exec.processing.handler.StageOutStateHandler.process(StageOutStateHandler.java:79) >> at >> org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49) >> at >> org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:121) >> at >> org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:82) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) >> at java.lang.Thread.run(Thread.java:679) >> 2012-09-11T17:49:51.062-04:00 DEBUG database.RFTDatabaseSetup >> [pool-1-thread-1,oldLog:168] getDBConnection() enter >> 2012-09-11T17:49:51.062-04:00 ERROR >> factory.ReliableFileTransferFactoryService [pool-1-thread-1,oldLog:175] >> Unable to create RFT resource >> org.globus.transfer.reliable.service.database.RftDBException: Database >> driver is not initialized, Need to setup database >> at >> org.globus.transfer.reliable.service.database.RFTDatabaseSetup.getDBConnection(RFTDatabaseSetup.java:249) >> at >> org.globus.transfer.reliable.service.database.ReliableFileTransferDbAdapter.storeTransferRequest(ReliableFileTransferDbAdapter.java:204) >> at >> org.globus.transfer.reliable.service.ReliableFileTransferResource.<init>(ReliableFileTransferResource.java:191) >> at >> org.globus.transfer.reliable.service.ReliableFileTransferHome.create(ReliableFileTransferHome.java:126) >> at >> org.globus.transfer.reliable.service.factory.ReliableFileTransferFactoryService.createReliableFileTransfer(ReliableFileTransferFactoryService.java:249) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:616) >> at >> org.globus.exec.service.exec.utils.LocalInvocationHelper.callService(LocalInvocationHelper.java:140) >> at >> org.globus.exec.service.exec.utils.LocalStagingHelper.submitStagingRequest(LocalStagingHelper.java:95) >> at >> org.globus.exec.service.exec.processing.handler.StageOutStateHandler.process(StageOutStateHandler.java:79) >> at >> org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49) >> at >> org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:121) >> at >> org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:82) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) >> at java.lang.Thread.run(Thread.java:679) >> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils >> [pool-1-thread-1,makeFault:541] Fault Class: class >> org.globus.exec.generated.StagingFaultType >> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils >> [pool-1-thread-1,makeFault:542] Resource Key: >> {http://www.globus.org/namespaces/2008/03/gram/job}ResourceID=d69fb7c0-fc45-11e1-93da-f931d11c3d83 >> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils >> [pool-1-thread-1,makeFault:543] Description: Staging error for RSL element >> fileStageOut. >> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils >> [pool-1-thread-1,makeFault:545] Cause: java.rmi.RemoteException >> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils >> [pool-1-thread-1,makeFault:549] State when failure occurred: StageOut >> 2012-09-11T17:49:51.070-04:00 DEBUG utils.FaultUtils >> [pool-1-thread-1,makeFault:551] Script Command: StageOut >> >> >> On Tue, Sep 11, 2012 at 6:05 PM, Adam Bazinet <[email protected]> >> wrote: >> Dear list and particularly GT 4.x and Condor GRAM users, >> >> To those of you who may be interested, I have confirmed that the Condor SEG >> is sensitive to the order and formatting of the elements in >> globus-condor.log. If I shift them around manually so that they conform to >> what a <c> block looks like with an older version of Condor, Globus properly >> picks up the fact that my jobs are complete. However, as it goes to stage >> out the files, I get an RFT failure saying the "Database driver is not >> initialized, Need to setup database". My question is: do I have something >> wrong with RFT in this Globus installation? This seems unlikely since I have >> several others configured the same way without any issue, and there are no >> problems staging files in. Does it have to do with my manually mucking >> around with the globus-condor.log file? A relevant bit of the container.log >> is below. I'd appreciate your input because regarding this and also a >> possible workaround for the SEG parsing problem (Condor version downgrade, >> modifications to SEG parsing code, or otherwise). >> >> thanks, >> Adam >> >> >> >> On Thu, Aug 30, 2012 at 4:09 PM, Adam Bazinet <[email protected]> >> wrote: >> Hi all, >> >> I just installed a new GT 4.2.1 on a Condor resource, and while jobs run and >> complete in Condor fine, the finished status is not being picked up by >> Globus appropriately. >> >> Is the GT 4.2.1 sensitive to the order/formatting of the elements in >> globus-condor.log? >> >> For example, here is a JobTerminatedEvent from globus-condor.log from an >> older Condor/Globus installation that works properly: >> >> <c> >> <a n="MyType"><s>JobTerminatedEvent</s></a> >> <a n="EventTypeNumber"><i>5</i></a> >> <a n="MyType"><s>JobTerminatedEvent</s></a> >> <a n="EventTime"><s>2012-08-29T00:32:04</s></a> >> <a n="Cluster"><i>18720</i></a> >> <a n="Proc"><i>36</i></a> >> <a n="Subproc"><i>0</i></a> >> <a n="TerminatedNormally"><b v="t"/></a> >> <a n="ReturnValue"><i>0</i></a> >> <a n="RunLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a> >> <a n="RunRemoteUsage"><s>Usr 0 07:01:33, Sys 0 00:00:13</s></a> >> <a n="TotalLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a> >> <a n="TotalRemoteUsage"><s>Usr 0 07:01:33, Sys 0 00:00:13</s></a> >> <a n="SentBytes"><r>3.244000000000000E+04</r></a> >> <a n="ReceivedBytes"><r>4.281464000000000E+06</r></a> >> <a n="TotalSentBytes"><r>3.244000000000000E+04</r></a> >> <a n="TotalReceivedBytes"><r>5.565903200000000E+07</r></a> >> </c> >> >> and here is the one from the new resource that is NOT working properly: >> >> <c> >> <a n="MyType"><s>JobTerminatedEvent</s></a> >> <a n="TotalLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a> >> <a n="Proc"><i>0</i></a> >> <a n="EventTime"><s>2012-08-28T16:07:17</s></a> >> <a n="TotalRemoteUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a> >> <a n="TotalReceivedBytes"><r>2.231476000000000E+06</r></a> >> <a n="ReturnValue"><i>0</i></a> >> <a n="RunRemoteUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a> >> <a n="RunLocalUsage"><s>Usr 0 00:00:00, Sys 0 00:00:00</s></a> >> <a n="SentBytes"><r>3.979500000000000E+04</r></a> >> <a n="Cluster"><i>20</i></a> >> <a n="TotalSentBytes"><r>3.979500000000000E+04</r></a> >> <a n="Subproc"><i>0</i></a> >> <a n="CurrentTime"><e>time()</e></a> >> <a n="EventTypeNumber"><i>5</i></a> >> <a n="ReceivedBytes"><r>2.231476000000000E+06</r></a> >> <a n="TerminatedNormally"><b v="t"/></a> >> </c> >> >> Basically, the only variable that's different about this resource is that >> it's running a newer version of Condor. My hunch is that broke compatibility >> somewhere along the way. Can someone confirm this, or provide another >> mechanism to debug? I'm attaching the container.log from the resource in >> question, which has GRAM debugging enabled as some jobs came in. It didn't >> really show me anything, though. >> >> thanks, >> Adam >> >> >> > > --Ravi > > --Ravi
smime.p7s
Description: S/MIME cryptographic signature
