2011/7/8 Timo Henne <[email protected]>

> Hi,
>
> after fiddling around with netstat (actually trying to produce a log to
> help in analysis, and stumbling upon tcp6 entries) I found a workaround:
> disabling ipv6 on the VM makes the remote jobs get done.
>
Great!

> I could probably also add entries for ipv6 in my hosts file, I will try that
> later. I still have no idea why ipv6 was apparently only used for the
> notification part. Do you have any idea why?
>

No idea.

> Also, the error message with globusrun-ws remains.
>

I think this is because the container is using 127.0.1.1 (localhost) instead
of its public IP.

Regards,

Dr. Eduardo Huedo Cuesta
Associate Professor (Profesor Titular), Universidad Complutense de Madrid
http://dsa-research.org/ehuedo

> Bests,
> Timo
>
> On 07.07.2011 15:38, Eduardo Huedo wrote:
>
>> Hi,
>>
>> I don't think it is related. Could you send me the full output?
>>
>> Thanks,
>>
>> Dr. Eduardo Huedo Cuesta
>> Associate Professor (Profesor Titular), Universidad Complutense de Madrid
>> http://dsa-research.org/ehuedo
>>
>> 2011/7/7 Timo Henne <[email protected]>
>>
>> Dear Eduardo,
>>
>> with globusrun-ws -submit -F <remote_machine> -dbg -c /bin/ls
>>
>> I get this msg:
>> ...
>> Current job state: Done
>> Destroying job...
>> === REQUEST MESSAGE (length 427) (time 1310039859.775479000) ===
>> <ns00:Envelope xmlns:ns00="http://schemas.xmlsoap.org/soap/envelope/">
>> <ns00:Header></ns00:Header><ns00:Body>
>> <ns01:terminate xmlns:ns01="http://www.globus.org/namespaces/2008/03/gram/job/terminate">
>> <ns01:destroyAfterCleanup>true</ns01:destroyAfterCleanup>
>> <ns01:continueNotifying>false</ns01:continueNotifying>
>> <ns01:destroyDelegatedCredentials>false</ns01:destroyDelegatedCredentials>
>> </ns01:terminate></ns00:Body></ns00:Envelope>
>> ----------------------------------------------
>> Failed.
>> globusrun-ws: Unable to destroy job: Error: invalid or unknown job
>> reference. Unable to destroy job. It may have expired or already
>> been destroyed.
>>
>> and this error in container.log:
>> 2011-07-07T13:57:40.684+02:00 ERROR providers.TerminateManagedJobProvider [ServiceThread-58,logError:184] Job resource 54869ce0-a890-11e0-ae4b-edf1d7e307e9 not found.
>>
>> ...which doesn't really help me. Does it mean anything concerning my
>> problem?
>>
>> Timo
>>
>> On 07.07.2011 13:26, Eduardo Huedo wrote:
>>
>> Dear Timo,
>>
>> Since only remote notifications fail, I am quite sure that the problem
>> is about networking.
>> But maybe you can get more information with a simple job submission with
>> globusrun-ws using -dbg.
>>
>> Regarding IGE RT, you can use a guest account
>> (http://www.ige-project.eu/hub/rt/rtguest) or request your own
>> (http://www.ige-project.eu/hub/rt).
>>
>> Regards,
>>
>> Dr. Eduardo Huedo Cuesta
>> Associate Professor (Profesor Titular), Universidad Complutense de Madrid
>> http://dsa-research.org/ehuedo
>>
>> 2011/7/7 Timo Henne <[email protected]>
>>
>> Dear Eduardo,
>>
>> thanks for your answer. However, I have now ensured that the variable is
>> set; following a page I found, I also set tcp.port.range in
>> ~/.globus/cog.properties, and I even disabled the firewall(s), yet to no
>> avail: the same warning message appears. Do you have any more ideas on
>> what to try? Can the VM be a problem? What could I try to test this?
>>
>> I signed up for egcf, but for the ticketing system I need an account
>> which I don't have (yet) and don't know where to get it from.
>>
>> Thanks,
>> Timo
>>
>> On 07.07.2011 12:14, Eduardo Huedo wrote:
>>
>> Dear Timo,
>>
>> Since it says "Connection refused", first of all, ensure that you don't
>> have any firewall problem.
>> For example, check that GLOBUS_TCP_PORT_RANGE is appropriately set in
>> the client. As you probably know, the client starts a small container to
>> receive notifications on a user-space dynamic port. This variable limits
>> the range of these dynamic ports, so only that range should be open in
>> the firewall.
>>
>> For your information, the IGE project (http://www.ige-project.eu) is now
>> providing support (and much more) for Globus in Europe. For example, we
>> have a request tracking system (http://rt.ige-project.eu) where you can
>> open a ticket to request support, suggest improvements or report bugs.
>>
>> Regards,
>>
>> Dr. Eduardo Huedo Cuesta
>> Associate Professor (Profesor Titular), Universidad Complutense de Madrid
>> http://dsa-research.org/ehuedo
>>
>> 2011/7/7 Timo Henne <[email protected]>
>>
>> Hi,
>>
>> my previous two mails somehow didn't make it to the list, so here is
>> another attempt:
>>
>> I am trying to use gridway (5.6.1) to schedule simple test jobs (/bin/ls)
>> across two machines, both with gt4.2.1 installed and running. One of the
>> machines is running Debian, the other is a VM running Ubuntu.
>> Communication and authentication apparently work fine, the machines
>> see and trust each other, and the jobs get scheduled. However, in *both
>> directions*, only those jobs running on the local machine (from where
>> they are started using gwsubmit) actually get "done" - the others remain
>> in "wrap pend" state. Apparently they are executed correctly on the
>> remote machine, since the result output is there, but somehow the
>> notification to the originating machine fails.
>> Searching the list, enabling debugging and digging in the logs I found
>> this warning/exception at the end of the globus container.log on the
>> remote machine:
>>
>> <...snip...>
>> 2011-07-01T15:40:41.231+02:00 INFO impl.DefaultIndexService [ServiceThread-58,performDefaultRegistrations:261] guid=b646c8e0-a3e7-11e0-b059-b76342becd29 event=org.globus.mds.index.performDefaultRegistrations.end status=0
>> 2011-07-01T15:41:21.751+02:00 INFO PersistentManagedExecutableJobResource.ce605ef0-a3e7-11e0-b059-b76342becd29 [ServiceThread-57,start:761] Job ce605ef0-a3e7-11e0-b059-b76342becd29 with client submission-id null accepted for local user 'the'
>> 2011-07-01T15:41:22.032+02:00 INFO handler.SubmitStateHandler [pool-1-thread-7,process:172] Job ce605ef0-a3e7-11e0-b059-b76342becd29 submitted with local job ID 'ce9b08e8-a3e7-11e0-bcc8-b7ebd4913b23:17697'
>> 2011-07-01T15:41:23.327+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-1,setPort:287] Security properties not null: not secure conv
>> 2011-07-01T15:41:23.327+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-1,setPort:314] set port with false
>> 2011-07-01T15:41:23.366+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-1,setPort:290] Setting security properties
>> 2011-07-01T15:41:23.482+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,setPort:287] Security properties not null: not secure conv
>> 2011-07-01T15:41:23.483+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,setPort:314] set port with false
>> 2011-07-01T15:41:23.483+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,setPort:290] Setting security properties
>> 2011-07-01T15:41:23.505+02:00 WARN impl.SimpleSubscriptionTopicListener [pool-1-thread-3,topicChanged:129] [JWSCORE-169] Failed to send notification for subscription with key 'B26B14DD21498C52B1E38CC2F042B0AF0E65BAE6+ce647da0-a3e7-11e0-b059-b76342becd29': java.net.ConnectException: Connection refused
>> 2011-07-01T15:41:23.506+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,topicChanged:132]
>> javax.xml.rpc.JAXRPCException: java.net.ConnectException: Connection refused
>>     at org.apache.axis.client.Call.invokeOneWay(Call.java:1871)
>>     at org.oasis.wsn.NotificationConsumerSOAPBindingStub.notify(NotificationConsumerSOAPBindingStub.java:701)
>>     at org.globus.wsrf.impl.SimpleSubscriptionTopicListener.notify(SimpleSubscriptionTopicListener.java:256)
>>     at org.globus.wsrf.impl.SimpleSubscriptionTopicListener.topicChanged(SimpleSubscriptionTopicListener.java:123)
>>     at org.globus.wsrf.impl.SimpleTopic.topicChanged(SimpleTopic.java:205)
>>     at org.globus.wsrf.impl.SimpleTopic.notify(SimpleTopic.java:112)
>>     at org.globus.exec.service.exec.ManagedExecutableJobResource.setState(ManagedExecutableJobResource.java:909)
>>     at org.globus.exec.service.exec.processing.handler.CleanUpStateHandler.process(CleanUpStateHandler.java:56)
>>     at org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49)
>>     at org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:121)
>>     at org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:82)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>> <...snip...>
>>
>> The VM has a static IP. Have you got any clue for me on what could be
>> the problem? Anything else I should provide for analysis?
>>
>> Thanks,
>> Timo
>>
>> --
>> Timo Henne
>> Research and Development Department (RDD)
>> State and University Library
>> Georg-August-Universitaet Goettingen
>> 37073 Goettingen
>> Germany
>>
>> Phone: +49 551 39 3883
>> http://www.sub.uni-goettingen.de/
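
Both observations in this thread (tcp6 entries showing up in netstat, and the container apparently using 127.0.1.1) come down to how a machine resolves its own hostname. The following is a minimal Python sketch for checking that on each host. It is not part of Globus or GridWay, the function names `resolve` and `classify_addresses` are made up for illustration, and it only reports what the system resolver returns; it does not prove which address the container actually binds to.

```python
import ipaddress
import socket


def classify_addresses(addresses):
    """Split address strings into (loopback, ipv6, ipv4) lists.

    Loopback covers 127.0.0.0/8 (so 127.0.1.1 as well) and ::1.
    """
    loopback, ipv6, ipv4 = [], [], []
    for text in addresses:
        # Drop an IPv6 zone id such as "%eth0" before parsing.
        addr = ipaddress.ip_address(text.split("%")[0])
        if addr.is_loopback:
            loopback.append(text)
        elif addr.version == 6:
            ipv6.append(text)
        else:
            ipv4.append(text)
    return loopback, ipv6, ipv4


def resolve(hostname):
    """Return the distinct addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = socket.gethostname()
    try:
        addresses = resolve(host)
    except OSError as exc:
        print(f"Could not resolve {host!r}: {exc}")
    else:
        loopback, ipv6, ipv4 = classify_addresses(addresses)
        print(f"hostname: {host}")
        print(f"loopback: {loopback}")
        print(f"IPv6:     {ipv6}")
        print(f"IPv4:     {ipv4}")
        if loopback:
            print("Hostname resolves to a loopback address; "
                  "remote peers cannot reach it.")
```

If the hostname shows up under "loopback", the hosts file is a likely place to look: Debian and Ubuntu installations commonly map the machine's own name to 127.0.1.1 there. Addresses under "IPv6" would be consistent with the tcp6 entries seen in netstat.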
