yeah, that's a typo in the bug report. Both last night and this morning I have been unable to access mantis or the pv web site. The page never loads. I'll try later.

pat marion wrote:
Hey Burlen, on the bug report page for 10283, I think you need to fix the command line you are testing with:

$ ssh remote cmd1 && cmd2

will execute cmd1 on remote and cmd2 locally.  It should be:

$ ssh remote "cmd1 && cmd2"

Pat
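Pat's point can be demonstrated without a remote host: the local shell splits an unquoted command line at && before ssh ever sees it. A minimal sketch, using sh -c as a stand-in for ssh remote (the assumption being that quoting behaves the same for any program that takes a command string):

```shell
# The local shell parses the unquoted '&&' itself, so only the first
# command reaches the "remote" side; the second runs locally afterwards.
sh -c 'echo ran-remotely' && echo ran-locally

# Quoting the whole string sends both commands to the "remote" side.
sh -c 'echo ran-remotely && echo also-ran-remotely'
```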

On Fri, Apr 30, 2010 at 9:12 AM, pat marion <[email protected]> wrote:

    I have applied your patch.  I agree that paraview should explicitly
    close the child process.  But... what I am pointing out is that
    calling QProcess::close() does not help in this situation.  What I
    am saying is that, even when paraview does kill the process, any
    commands run by ssh on the other side of the netpipe will be
    orphaned by sshd.  Are you sure you can't reproduce it?


    $ ssh localhost sleep 1d
    $ < press control-c >
    $ pidof sleep
    $ # sleep is still running

    Pat


    On Fri, Apr 30, 2010 at 2:08 AM, burlen <[email protected]> wrote:

        Hi Pat,

        From my point of view the issue is philosophical, because
        practically speaking I couldn't reproduce the orphans without
        doing something a little odd, namely: ssh ... && sleep 1d.
        Although the fact that a user reported it suggests that it may
        occur in the real world as well. The question is this: should
        an application explicitly clean up resources it allocates? Or
        should an application rely on the user not only knowing that
        there is the potential for a resource leak but also knowing
        enough to do the right thing to avoid it (e.g. ssh -tt ...)? In
        my opinion, as a matter of principle, if PV spawns a process
        it should explicitly clean it up and there should be no way it
        can become an orphan. In this case the fact that the orphan
        can hold ports open is particularly insidious, because further
        connection attempts on that port fail with no helpful error
        information. Also it is not very difficult to clean up a
        spawned process. What it comes down to is a little bookkeeping
        to hang on to the QProcess handle and a few lines of code
        called from the pqCommandServerStartup destructor to make
        certain it's cleaned up. This is from the patch I submitted
        when I filed the bug report.

        +    // close running process
        +    if (this->Process->state()==QProcess::Running)
        +      {
        +      this->Process->close();
        +      }
        +    // free the object
        +    delete this->Process;
        +    this->Process=NULL;

        I think if the cluster admins out there knew which ssh options
        (GatewayPorts etc.) are important for ParaView to work
        seamlessly, then they might be willing to open them up. It's
        my impression that the folks that build clusters want tools
        like PV to be easy to use, but they don't necessarily know all
        the ins and outs of configuring and running PV.

        Thanks for looking at this again! The -tt option to ssh is
        indeed a good find.

        Burlen

        pat marion wrote:

            Hi all!

            I'm bringing this thread back- I have learned a couple new
            things...

            -----------------------
            No more orphans:

            Here is an easy way to create an orphan:

              $ ssh localhost sleep 1d
              $ <press control c>

            The ssh process is cleaned up, but sshd orphans the sleep
            process.  You can avoid this by adding '-t' to ssh:

             $ ssh -t localhost sleep 1d

            Works like a charm!  But then there is another problem...
            try this command from paraview (using QProcess) and it
            still leaves an orphan, doh!  Go back and re-read ssh's
            man page and you have the solution: use '-t' twice: ssh -tt

            -------------------------
            GatewayPorts and portfwd workaround:

            In this scenario we have 3 machines: workstation,
            service-node, and compute-node.  I want to ssh from
            workstation to service-node and submit a job that will run
            pvserver on compute-node.  When pvserver starts on
            compute-node I want it to reverse connect to service-node
            and I want service-node to forward the connection to
            workstation.  So here I go:

              $ ssh -R11111:localhost:11111 service-node qsub start_pvserver.sh

            Oops, the qsub command returns immediately and closes my
            ssh tunnel.  Let's pretend that the scheduler doesn't
            provide an easy way to keep the command alive, so I have
            resorted to using 'sleep 1d'.  So here I go, using -tt to
            prevent orphans:

             $ ssh -tt -R11111:localhost:11111 service-node "qsub start_pvserver.sh && sleep 1d"

            Well, this will only work if GatewayPorts is enabled in
            sshd_config on service-node.  If GatewayPorts is not
            enabled, the ssh tunnel will only accept connections from
            localhost; it will not accept a connection from
            compute-node.  We can ask the sysadmin to enable
            GatewayPorts, or we could use portfwd.  You can run
            portfwd on service-node to forward port 22222 to port
            11111, then have compute-node connect to
            service-node:22222.  So your job script would launch
            pvserver like this:

             pvserver -rc -ch=service-node -sp=22222

            Problem solved!  Also convenient, we can use portfwd to
            replace 'sleep 1d'.  So the final command, executed by
            paraview client:

             ssh -tt -R 11111:localhost:11111 service-node "qsub start_pvserver.sh && portfwd -g -c fwd.cfg"

            Where fwd.cfg contains:

             tcp { 22222 { => localhost:11111 } }


            Hope this helps!

            Pat

            On Fri, Feb 12, 2010 at 7:06 PM, burlen <[email protected]> wrote:


                   Incidentally, this brings up an interesting point
                   about ParaView with client/server.  It doesn't try to
                   clean up its child processes, AFAIK.  For example, if
                   you set up this ssh tunnel inside the ParaView GUI
                   (e.g., using a command instead of a manual
                   connection), and you cancel the connection, it will
                   leave the ssh running.  You have to track down the
                   ssh process and kill it yourself.  It's a minor
                   thing, but it can also prevent future connections if
                   you don't realize there's a zombie ssh that kept your
                   ports open.

               I attempted to reproduce on my kubuntu 9.10, qt 4.5.2
               system, with slightly different results, which may be
               qt/distro/os specific.

               On my system, as long as the process ParaView spawns
               finishes on its own there is no problem. That's usually
               how one would expect things to work out, since when the
               client disconnects the server closes, followed by ssh.
               But, you are right that PV never explicitly kills or
               otherwise cleans up after the process it starts. So if
               the spawned process for some reason doesn't finish,
               orphan processes are introduced.

               I was able to produce orphan ssh processes by giving the
               PV client a server start up command that doesn't finish,
               e.g.

                 ssh ... pvserver ... && sleep 100d

               I get the situation you described, which prevents further
               connection on the same ports. Once PV tries and fails to
               connect on the open ports, there is a crash soon after.

               I filed a bug report with a patch:
               http://www.paraview.org/Bug/view.php?id=10283



               Sean Ziegeler wrote:

                   Most batch systems have an option to wait until the
                   job is finished before the submit command returns.  I
                   know PBS uses "-W block=true" and that SGE and LSF
                   have similar options (but I don't recall the precise
                   flags).

                   If your batch system doesn't provide that, I'd
                   recommend adding some shell scripting to loop through
                   checking the queue for job completion and not return
                   until it's done.  The sleep thing would work, but
                   wouldn't exit when the server finishes, leaving the
                   ssh tunnels (and other things like portfwd if you put
                   them in your scripts) lying around.
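Sean's polling suggestion can be sketched as a small shell loop. The qstat-style check is passed in as an argument so the sketch is self-contained; the marker file below is a hypothetical local stand-in for "qstat $JOBID" (the function name and polling interval are assumptions, not from the thread):

```shell
# wait_for_job CHECK_CMD...: poll until CHECK_CMD fails, i.e. until the
# scheduler no longer lists the job.  With PBS this might be invoked as
# 'wait_for_job qstat "$JOBID"' (hypothetical; adapt to your scheduler).
wait_for_job() {
  while "$@" >/dev/null 2>&1; do
    sleep 1   # a real script would poll much less aggressively
  done
}

# Local stand-in for a queued job: a marker file that a background
# "job" removes after two seconds.
touch job.marker
( sleep 2; rm -f job.marker ) &
wait_for_job test -f job.marker
echo "job finished; safe to tear down the tunnel"
```

Unlike 'sleep 1d', the ssh session (and any tunnels it carries) goes away as soon as the job does.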

                   Incidentally, this brings up an interesting point
                   about ParaView with client/server.  It doesn't try to
                   clean up its child processes, AFAIK.  For example, if
                   you set up this ssh tunnel inside the ParaView GUI
                   (e.g., using a command instead of a manual
                   connection), and you cancel the connection, it will
                   leave the ssh running.  You have to track down the
                   ssh process and kill it yourself.  It's a minor
                   thing, but it can also prevent future connections if
                   you don't realize there's a zombie ssh that kept your
                   ports open.


                   On 02/08/10 21:03, burlen wrote:

                       I am curious to hear what Sean has to say.

                       But, say the batch system returns right away
                       after the job is submitted; I think we can doctor
                       the command so that it will live for a while
                       longer. What about something like this:

                       ssh -R XXXX:localhost:YYYY remote_machine
                       "submit_my_job.sh && sleep 100d"


                       pat marion wrote:

                           Hey just checked out the wiki page, nice!
                           One question, wouldn't this command hang up
                           and close the tunnel after submitting the
                           job?

                           ssh -R XXXX:localhost:YYYY remote_machine
                           submit_my_job.sh

                           Pat

                           On Mon, Feb 8, 2010 at 8:12 PM, pat marion
                           <[email protected]> wrote:

                           Actually I didn't write the notes at the
                           hpc.mil <http://hpc.mil> link.

                           Here is something- and maybe this is the
                           problem that Sean refers to- in some cases,
                           when I have set up a reverse ssh tunnel from
                           login node to workstation (command executed
                           from workstation), the forward does not work
                           when the compute node connects to the login
                           node. However, if I have the compute node
                           connect to the login node on port 33333, then
                           use portfwd to forward that to
                           localhost:11111, where the ssh tunnel is
                           listening on port 11111, it works like a
                           charm. The portfwd tricks it into thinking
                           the connection is coming from localhost and
                           allows the ssh tunnel to work. Hope that made
                           a little sense...

                           Pat


                           On Mon, Feb 8, 2010 at 6:29 PM, burlen
                           <[email protected]> wrote:

                           Nice, thanks for the clarification. I am
                           guessing that your example should probably be
                           the recommended approach rather than the
                           portfwd method suggested on the PV wiki. :) I
                           took the initiative to add it to the Wiki.
                           KW let me know if this is not the case!

http://paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_connection_over_an_ssh_tunnel

                           Would you mind taking a look to be sure I
                           didn't miss anything or bollix it up?

                           The sshd config options you mentioned may be
                           why your method doesn't work on the Pleiades
                           system; either that or there is a firewall
                           between the front ends and compute nodes. In
                           either case I doubt the NAS sys admins are
                           going to reconfigure for me :) So at least
                           for now I'm stuck with the two hop ssh
                           tunnels and interactive batch jobs. If there
                           were some way to script the ssh tunnel in my
                           batch script I would be golden...

                           By the way, I put the details of the two hop
                           ssh tunnel on the wiki as well, and a link to
                           Pat's hpc.mil <http://hpc.mil> notes. I don't
                           dare try to summarize them since I've never
                           used portfwd and it refuses to compile on
                           both my workstation and the cluster.

                           Hopefully putting these notes on the Wiki
                           will save future ParaView users some time and
                           headaches.


                           Sean Ziegeler wrote:

                           Not quite- the pvsc calls ssh with both the
                           tunnel options and the commands to submit the
                           batch job. You don't even need a pvsc; it
                           just makes the interface fancier. As long as
                           you or PV executes something like this from
                           your machine:

                           ssh -R XXXX:localhost:YYYY remote_machine
                           submit_my_job.sh

                           This means that port XXXX on remote_machine
                           will be the port to which the server must
                           connect. Port YYYY (e.g., 11111) on your
                           client machine is the one on which PV
                           listens. You'd have to tell the server (in
                           the batch submission script, for example) the
                           name of the node and port XXXX to which to
                           connect.

                           One caveat that might be causing you
                           problems: port forwarding (and "gateway
                           ports" if the server is running on a
                           different node than the login node) must be
                           enabled in the remote_machine's sshd_config.
                           If not, no ssh tunnels will work at all (see:
                           man ssh and man sshd_config). That's
                           something that an administrator would need to
                           set up for you.
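For reference, the sshd_config directives Sean is describing look roughly like this on the remote machine (a sketch assuming OpenSSH; exact defaults vary by distribution, see man sshd_config):

```
# /etc/ssh/sshd_config on remote_machine (requires admin rights and an
# sshd restart to take effect)
AllowTcpForwarding yes   # required for -L/-R tunnels at all
GatewayPorts yes         # lets hosts other than localhost (e.g. a
                         # compute node) connect to a remote forward
```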

                           On 02/08/10 12:26, burlen wrote:

                           So, to be sure about what you're saying: your
                           .pvsc script ssh's to the front end and
                           submits a batch job; when it's scheduled,
                           your batch script creates a -R style tunnel
                           and starts pvserver using PV reverse
                           connection? Or are you using portfwd or a
                           second ssh session to establish the tunnel?

                           If you're doing this all from your .pvsc
                           script without a second ssh session and/or
                           portfwd, that's awesome! I haven't been able
                           to script this; something about the batch
                           system prevents the tunnel created within the
                           batch job's ssh session from working. I don't
                           know if that's particular to this system or a
                           general fact of life about batch systems.

                           Question: How are you creating the tunnel in
                           your batch script?

                           Sean Ziegeler wrote:

                           Both ways will work for me in most cases,
                           i.e. a "forward" connection with ssh -L or a
                           reverse connection with ssh -R.

                           However, I find that the reverse method is
                           more scriptable. You can set up a .pvsc file
                           that the client can load, which will call ssh
                           with the appropriate options and commands for
                           the remote host, all from the GUI. The client
                           will simply wait for the reverse connection
                           from the server, whether it takes 5 seconds
                           or 5 hours for the server to get through the
                           batch queue.

                           Using the forward connection method, if the
                           server isn't started soon enough, the client
                           will attempt to connect and then fail. I've
                           always had to log in separately, wait for the
                           server to start running, then tell my client
                           to connect.

                           -Sean

                           On 02/06/10 12:58, burlen wrote:

                           Hi Pat,

                           My bad. I was looking at the PV wiki, and
                           thought you were talking about doing this
                           without an ssh tunnel, using only port
                           forwarding and paraview's
                           --reverse-connection option. Now that I am
                           reading your hpc.mil <http://hpc.mil> post I
                           see what you mean :)

                           Burlen


                           pat marion wrote:

                           Maybe I'm misunderstanding what you mean by
                           local firewall, but usually as long as you
                           can ssh from your workstation to the login
                           node you can use a reverse ssh tunnel.


                           _______________________________________________
                           Powered by www.kitware.com

                           Visit other Kitware open-source projects at
                           http://www.kitware.com/opensource/opensource.html

                           Please keep messages on-topic and check the
                           ParaView Wiki at:
                           http://paraview.org/Wiki/ParaView

                           Follow this link to subscribe/unsubscribe:
                           http://www.paraview.org/mailman/listinfo/paraview









