Ok great, sounds like a plan!

> On 04 Feb 2015, at 2:53 , Ralph Castain <r...@open-mpi.org> wrote:
> 
> Appreciate your patience! I'm somewhat limited this week by being on travel 
> to our HQ, so I don't have access to my usual test cluster. I'll be better 
> situated to complete the implementation once I get home.
> 
> For now, some quick thoughts:
> 
> 1. stdout/stderr: yes, I just need to "register" orte-submit as the one to 
> receive those from the submitted job.
> 
> 2. That one is going to be a tad trickier, but is resolvable. May take me a 
> little longer to fix.
> 
> 3. dang - I thought I had it doing so. I'll look to find the issue. I suspect 
> it's just a case of correctly setting the return code of orte-submit.
> 
> I'd welcome the help! Let me ponder the best way to point you to the areas 
> needing work, and we can kick around off-list about who does what.
> 
> Great to hear this is working with your tool so quickly!!
> Ralph
> 
> 
> On Tue, Feb 3, 2015 at 3:49 PM, Mark Santcroos <mark.santcr...@rutgers.edu> 
> wrote:
> Hi Ralph,
> 
> Besides the items in the other mail, I have three more items that would need 
> resolving at some point.
> 
> 1. STDOUT/STDERR currently go to the orte-dvm console.
>    I'm sure this is not a fundamental limitation.
>    Even if getting the information to the orte-submit instance would be
>    problematic, the orte-dvm writing this to a file per session would be good
>    enough too.
> 
> 2. Failing applications currently tear down the dvm.
>    Ideally that would not be the case, and this would be handled in relation
>    to item (3).
>    Possibly this needs to be configurable, if others would like to see
>    different behaviour.
> 
> 3. orte-submit doesn't return the exit code of the application.
> 
> To be clear, I realise the current implementation is a proof of concept, so
> these are not complaints, just wishes for where I hope to see this going!
> 
> FWIW: these items might require less intricate knowledge of OMPI in general, 
> so with some pointers/guidance I can probably work on these myself if needed.
> 
> Cheers,
> 
> Mark
> 
> ps. I did a quick-and-dirty integration with our own tool and the ORTE
> abstraction maps like a charm!
> (https://github.com/radical-cybertools/radical.pilot/commit/2d36e886081bf8531097edfc95ada1826257e460)
> 
> > On 03 Feb 2015, at 20:38 , Mark Santcroos <mark.santcr...@rutgers.edu> 
> > wrote:
> >
> > Hi Ralph,
> >
> >> On 03 Feb 2015, at 16:28 , Ralph Castain <r...@open-mpi.org> wrote:
> >> I think I fixed some of the handshake issues - please give it another try.
> >> You should see orte-submit properly shutdown upon completion,
> >
> > Indeed, it works on my laptop now! Great!
> > It feels quite fast too, for short tasks :-)
> >
> >> and orte-dvm properly shutdown when sent the terminate cmd.
> >
> > ACK. This also works as expected.
> >
> >> I was able to cleanly run MPI jobs on my laptop.
> >
> > Do you also see the following errors/warnings on the dvm side?
> >
> > [netbook:28324] [[20896,0],0] Releasing job data for [INVALID]
> > Hello, world, I am 0 of 1, (Open MPI v1.9a1, package: Open MPI mark@netbook 
> > Distribution, ident: 1.9.0a1, repo rev: dev-811-g7299cc3, Unreleased 
> > developer copy, 132)
> > [netbook:28324] sess_dir_finalize: proc session dir does not exist
> > [netbook:28324] [[20896,0],0] dvm: job [20896,20] has completed
> > [netbook:28324] [[20896,0],0] Releasing job data for [20896,20]
> >
> > The "INVALID" message appears for every "submit"; the sess_dir_finalize
> > message appears once per instance/core.
> > Is that something to worry about that needs fixing, or is it a
> > configuration issue?
> >
> > I haven't been able to test on Edison because of maintenance 
> > (today+tomorrow), so I will report on that later.
> >
> > Thanks again!
> >
> > Mark
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/02/26282.php
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/02/26284.php
