2011/7/8 Timo Henne <[email protected]>

> Hi,
>
> after fiddling around with netstat (actually trying to produce a log to
> help in analysis, and stumbling upon tcp6 entries) I found a workaround:
> disabling ipv6 on the VM makes the remote jobs get done.
>

Great!
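
For anyone who hits the same symptom, here is a rough sketch (not part of Globus or GridWay) for checking which address families a host name resolves to. It assumes only the Python standard library; the helper names `group_by_family` and `resolve` are made up for this illustration. If a name resolves to an IPv6 address that the other machine cannot reach, that may be worth a closer look, though it is only a guess at the cause here.

```python
import socket
from collections import defaultdict

# Map socket address families to readable labels.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def group_by_family(addrinfo):
    """Group getaddrinfo()-style tuples into {"IPv4": [...], "IPv6": [...]}."""
    grouped = defaultdict(list)
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        name = FAMILY_NAMES.get(family)
        # sockaddr[0] is the textual address for both IPv4 and IPv6 entries.
        if name and sockaddr[0] not in grouped[name]:
            grouped[name].append(sockaddr[0])
    return dict(grouped)


def resolve(host):
    """Resolve host and group the addresses by family ({} if it fails)."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return {}
    return group_by_family(infos)


if __name__ == "__main__":
    host = socket.gethostname()
    print(host, resolve(host))
```

Running it on both machines, once with the local host name and once with the remote one, shows whether IPv6 addresses are in play at all.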


> I could probably also add entries for IPv6 to my hosts file; I will try that
> later. I still have no idea why IPv6 was apparently only used for the
> notification part. Do you have any idea why?
>

No idea.


> Also, the error message with globusrun-ws remains.
>

I think this is because the container is using 127.0.1.1 (localhost) instead
of its public IP.
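
A quick way to test that guess is to see whether the machine's own host name resolves to a loopback address. The sketch below uses only the Python standard library; `is_loopback` and `loopback_addresses` are hypothetical helper names, and it says nothing about how the container itself picks its address.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the textual IP address lies in a loopback range."""
    # Drop an IPv6 zone index such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%", 1)[0]).is_loopback


def loopback_addresses(host):
    """Return the sorted loopback addresses that host resolves to."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    addresses = {info[4][0] for info in infos}
    return sorted(a for a in addresses if is_loopback(a))


if __name__ == "__main__":
    host = socket.gethostname()
    found = loopback_addresses(host)
    if found:
        print(f"{host} resolves to loopback address(es): {found}")
    else:
        print(f"{host} does not resolve to a loopback address")
```

Note that 127.0.1.1 counts as loopback because the whole 127.0.0.0/8 block is reserved for it.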

Regards,

Dr. Eduardo Huedo Cuesta
Associate Professor (Profesor Titular), Universidad Complutense de Madrid
http://dsa-research.org/ehuedo

> Best,
> Timo
>
>
>
> On 07.07.2011 15:38, Eduardo Huedo wrote:
>
>> Hi,
>>
>>
>> I don't think it is related. Could you send me the full output?
>>
>> Thanks,
>>
>> Dr. Eduardo Huedo Cuesta
>> Associate Professor (Profesor Titular), Universidad Complutense de Madrid
>> http://dsa-research.org/ehuedo
>>
>>
>>
>> 2011/7/7 Timo Henne <[email protected]>
>>
>>    Dear Eduardo,
>>
>>    with globusrun-ws -submit -F <remote_machine> -dbg -c /bin/ls
>>
>>    I get this msg:
>>    ...
>>    Current job state: Done
>>    Destroying job...
>>    === REQUEST MESSAGE (length 427) (time 1310039859.775479000) ===
>>    <ns00:Envelope xmlns:ns00="http://schemas.xmlsoap.org/soap/envelope/"><ns00:Header></ns00:Header><ns00:Body><ns01:terminate xmlns:ns01="http://www.globus.org/namespaces/2008/03/gram/job/terminate"><ns01:destroyAfterCleanup>true</ns01:destroyAfterCleanup><ns01:continueNotifying>false</ns01:continueNotifying><ns01:destroyDelegatedCredentials>false</ns01:destroyDelegatedCredentials></ns01:terminate></ns00:Body></ns00:Envelope>
>>    ----------------------------------------------
>>    Failed.
>>    globusrun-ws: Unable to destroy job: Error: invalid or unknown job
>>    reference. Unable to destroy job. It may have expired or already
>>    been destroyed.
>>
>>    and this error in container.log:
>>    2011-07-07T13:57:40.684+02:00 ERROR
>>    providers.TerminateManagedJobProvider
>>    [ServiceThread-58,logError:184] Job resource
>>    54869ce0-a890-11e0-ae4b-edf1d7e307e9 not found.
>>
>>    ...which doesn't really help me. Does it mean anything concerning my
>>    problem?
>>
>>    Timo
>>
>>
>>    On 07.07.2011 13:26, Eduardo Huedo wrote:
>>
>>        Dear Timo,
>>
>>        Since only remote notifications fail, I am quite sure that the
>>        problem is about networking.
>>        But maybe you can get more information with a simple job
>>        submission with globusrun-ws using -dbg.
>>
>>        Regarding IGE RT, you can use a guest account
>>        (http://www.ige-project.eu/hub/rt/rtguest) or request your own
>>        (http://www.ige-project.eu/hub/rt).
>>
>>        Regards,
>>
>>        Dr. Eduardo Huedo Cuesta
>>        Associate Professor (Profesor Titular), Universidad Complutense de Madrid
>>        http://dsa-research.org/ehuedo
>>
>>
>>
>>        2011/7/7 Timo Henne <[email protected]>
>>
>>            Dear Eduardo,
>>
>>            thanks for your answer. I have now made sure that the variable
>>            is set. Following a page I found, I also set tcp.port.range in
>>            ~/.globus/cog.properties, and I even disabled the firewall(s),
>>            yet to no avail: the same warning message appears. Do you have
>>            any more ideas on what to try? Can the VM be a problem? What
>>            could I try to test this?
>>
>>            I signed up for egcf, but for the ticketing system I need an
>>            account, which I don't have (yet) and don't know where to get.
>>
>>            Thanks,
>>            Timo
>>
>>
>>            On 07.07.2011 12:14, Eduardo Huedo wrote:
>>
>>                Dear Timo,
>>
>>                Since it says "Connection refused", first of all, ensure that
>>                you don't have any firewall problem.
>>                For example, check that GLOBUS_TCP_PORT_RANGE is appropriately
>>                set in the client. As you probably know, the client starts a
>>                small container to receive notifications in a user-space
>>                dynamic port. This variable limits the range of these dynamic
>>                ports, so only that range should be open in the firewall.
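
The check described above can be sketched roughly as follows. This is an illustration only, not a Globus tool: it assumes the variable holds two port numbers in the usual "min,max" form (a space-separated pair is also accepted here for convenience), and the helper names `parse_port_range` and `first_bindable_port` are invented for the example. It only confirms that the variable is set and that a port in the range can be bound locally; it cannot tell whether a firewall lets remote connections through.

```python
import os
import re
import socket


def parse_port_range(value):
    """Parse a 'min,max' (or 'min max') string into a (low, high) tuple."""
    parts = [p for p in re.split(r"[,\s]+", value.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected two port numbers, got {value!r}")
    low, high = (int(p) for p in parts)  # int() raises ValueError on junk
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range {low}-{high}")
    return low, high


def first_bindable_port(low, high):
    """Return the first port in [low, high] that can be bound, else None."""
    for port in range(low, high + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("", port))
            except OSError:
                continue  # port in use or not permitted, try the next one
            return port
    return None


if __name__ == "__main__":
    raw = os.environ.get("GLOBUS_TCP_PORT_RANGE")
    if raw is None:
        print("GLOBUS_TCP_PORT_RANGE is not set")
    else:
        try:
            low, high = parse_port_range(raw)
        except ValueError as exc:
            print(f"GLOBUS_TCP_PORT_RANGE looks malformed: {exc}")
        else:
            print(f"range {low}-{high}, first free port:",
                  first_bindable_port(low, high))
```

Run it in the same shell from which jobs are submitted, so it sees the same environment as the client.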
>>
>>                For your information, the IGE project
>>                (http://www.ige-project.eu) is now providing support (and much
>>                more) for Globus in Europe. For example, we have a request
>>                tracking system (http://rt.ige-project.eu) where you can open
>>                a ticket to request support, suggest improvements or report
>>                bugs.
>>
>>                Regards,
>>
>>                Dr. Eduardo Huedo Cuesta
>>                Associate Professor (Profesor Titular), Universidad Complutense de Madrid
>>                http://dsa-research.org/ehuedo
>>
>>
>>
>>                2011/7/7 Timo Henne <[email protected]>
>>
>>
>>                    Hi,
>>
>>                    my previous two mails somehow didn't make it to the list,
>>                    so here is another attempt:
>>
>>                    I am trying to use GridWay (5.6.1) to schedule simple test
>>                    jobs (/bin/ls) across two machines, both with GT 4.2.1
>>                    installed and running. One of the machines is running
>>                    Debian, the other is a VM running Ubuntu. Communication
>>                    and authentication apparently work fine, the machines see
>>                    and trust each other, and the jobs get scheduled. However,
>>                    in *both directions*, only those jobs running on the local
>>                    machine (from where they are started using gwsubmit)
>>                    actually get "done" - the others remain in "wrap pend"
>>                    state. Apparently they are executed correctly on the
>>                    remote machine, since the result output is there, but
>>                    somehow the notification to the originating machine fails.
>>                    Searching the list, enabling debugging and digging in the
>>                    logs, I found this warning/exception at the end of the
>>                    Globus container.log on the remote machine:
>>
>>                    <...snip...>
>>                    2011-07-01T15:40:41.231+02:00 INFO impl.DefaultIndexService [ServiceThread-58,performDefaultRegistrations:261] guid=b646c8e0-a3e7-11e0-b059-b76342becd29 event=org.globus.mds.index.performDefaultRegistrations.end status=0
>>                    2011-07-01T15:41:21.751+02:00 INFO PersistentManagedExecutableJobResource.ce605ef0-a3e7-11e0-b059-b76342becd29 [ServiceThread-57,start:761] Job ce605ef0-a3e7-11e0-b059-b76342becd29 with client submission-id null accepted for local user 'the'
>>                    2011-07-01T15:41:22.032+02:00 INFO handler.SubmitStateHandler [pool-1-thread-7,process:172] Job ce605ef0-a3e7-11e0-b059-b76342becd29 submitted with local job ID 'ce9b08e8-a3e7-11e0-bcc8-b7ebd4913b23:17697'
>>                    2011-07-01T15:41:23.327+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-1,setPort:287] Security properties not null: not secure conv
>>                    2011-07-01T15:41:23.327+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-1,setPort:314] set port with false
>>                    2011-07-01T15:41:23.366+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-1,setPort:290] Setting security properties
>>                    2011-07-01T15:41:23.482+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,setPort:287] Security properties not null: not secure conv
>>                    2011-07-01T15:41:23.483+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,setPort:314] set port with false
>>                    2011-07-01T15:41:23.483+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,setPort:290] Setting security properties
>>                    2011-07-01T15:41:23.505+02:00 WARN impl.SimpleSubscriptionTopicListener [pool-1-thread-3,topicChanged:129] [JWSCORE-169] Failed to send notification for subscription with key 'B26B14DD21498C52B1E38CC2F042B0AF0E65BAE6+ce647da0-a3e7-11e0-b059-b76342becd29': java.net.ConnectException: Connection refused
>>                    2011-07-01T15:41:23.506+02:00 DEBUG impl.SimpleSubscriptionTopicListener [pool-1-thread-3,topicChanged:132] javax.xml.rpc.JAXRPCException: java.net.ConnectException: Connection refused
>>                            at org.apache.axis.client.Call.invokeOneWay(Call.java:1871)
>>                            at org.oasis.wsn.NotificationConsumerSOAPBindingStub.notify(NotificationConsumerSOAPBindingStub.java:701)
>>                            at org.globus.wsrf.impl.SimpleSubscriptionTopicListener.notify(SimpleSubscriptionTopicListener.java:256)
>>                            at org.globus.wsrf.impl.SimpleSubscriptionTopicListener.topicChanged(SimpleSubscriptionTopicListener.java:123)
>>                            at org.globus.wsrf.impl.SimpleTopic.topicChanged(SimpleTopic.java:205)
>>                            at org.globus.wsrf.impl.SimpleTopic.notify(SimpleTopic.java:112)
>>                            at org.globus.exec.service.exec.ManagedExecutableJobResource.setState(ManagedExecutableJobResource.java:909)
>>                            at org.globus.exec.service.exec.processing.handler.CleanUpStateHandler.process(CleanUpStateHandler.java:56)
>>                            at org.globus.exec.service.exec.processing.handler.InternalStateHandler.processInternalState(InternalStateHandler.java:49)
>>                            at org.globus.exec.service.exec.processing.StateMachine.processInternalState(StateMachine.java:121)
>>                            at org.globus.exec.service.exec.processing.StateProcessingTask.run(StateProcessingTask.java:82)
>>                            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>                            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>                            at java.lang.Thread.run(Thread.java:662)
>>                    <...snip...>
>>
>>                    The VM has a static IP. Have you got any clue for me on
>>                    what could be the problem? Anything else I should provide
>>                    for analysis?
>>
>>                    Thanks,
>>                    Timo
>>
>>
>>
>>            --
>>            Timo Henne
>>            Research and Development Department (RDD)
>>            State and University Library
>>            Georg-August-Universitaet Goettingen
>>            37073 Goettingen
>>            Germany
>>
>>            Phone: +49 551 39 3883
>>            http://www.sub.uni-goettingen.de/
>>
>>
>>
>>    --
>>    Timo Henne
>>    Research and Development Department (RDD)
>>    State and University Library
>>    Georg-August-Universitaet Goettingen
>>    37073 Goettingen
>>    Germany
>>
>>    Phone: +49 551 39 3883
>>    http://www.sub.uni-goettingen.de/
>>
>>
>>
> --
> Timo Henne
> Research and Development Department (RDD)
> State and University Library
> Georg-August-Universitaet Goettingen
> 37073 Goettingen
> Germany
>
> Phone: +49 551 39 3883
> http://www.sub.uni-goettingen.de/
>
