Re: [OMPI users] Using OpenMPI / ORTE as cluster aware GNU Parallel

2017-02-28 Thread Mark Santcroos
Hi Brock, Angel, Reuti, You might want to look at a tool we developed: http://radical-cybertools.github.io/radical-pilot/index.html This was actually one of the drivers for isolating the persistent ORTE DVM that's being discussed in this thread. With RADICAL-Pilot you can use a Python API to

Re: [OMPI users] resolution of MPI_Wtime

2016-04-05 Thread Mark Santcroos
> On 05 Apr 2016, at 16:46 , Aurélien Bouteiller wrote: > Open MPI uses clock_gettime when it is available, and defaults to > gettimeofday only when this better option can't be found. Check that your > system has clock_gettime and the resolution of this timer. Depending
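The advice above (check that clock_gettime is present and what resolution it reports) can be tried without writing any MPI code. The following is a minimal sketch, assuming a POSIX system and Python 3.7 or later. The choice of CLOCK_MONOTONIC is an assumption made for illustration; the message does not say which clock id Open MPI passes to clock_gettime. The helper names are hypothetical.

```python
import time

def timer_resolution(clock_id=None):
    """Return the resolution (in seconds) that the OS advertises for a clock."""
    if clock_id is None:
        # Assumption: CLOCK_MONOTONIC stands in for whatever clock the MPI library uses.
        clock_id = time.CLOCK_MONOTONIC
    return time.clock_getres(clock_id)

def smallest_observed_tick(samples=100000, clock_id=None):
    """Estimate the smallest non-zero step (in seconds) between successive readings."""
    if clock_id is None:
        clock_id = time.CLOCK_MONOTONIC
    smallest = float("inf")
    prev = time.clock_gettime_ns(clock_id)
    for _ in range(samples):
        now = time.clock_gettime_ns(clock_id)
        delta = now - prev
        if 0 < delta < smallest:
            smallest = delta
        prev = now
    # Convert nanoseconds to seconds; stays infinite if the clock never advanced.
    return smallest / 1e9

print("advertised resolution:", timer_resolution(), "s")
print("smallest observed tick:", smallest_observed_tick(), "s")
```

The observed tick includes interpreter overhead, so it is an upper bound on the timer granularity rather than a measurement of it. Comparing both numbers with the value returned by MPI_Wtick() indicates whether the MPI library is using the better timer.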

Re: [OMPI users] Orted path with module manager on cluster

2016-03-03 Thread Mark Santcroos
> On 03 Mar 2016, at 23:22 , Davide Vanzo wrote: > I have built OpenMPI 1.10.2 with RoCE network support on our test cluster. On > the cluster we use lmod to manage paths to different versions of software. > The problem I have is that I receive the "orted: command

Re: [OMPI users] how to benchmark a server with openmpi?

2016-01-25 Thread Mark Santcroos
Another canonical benchmarking suite can be found at http://www.nas.nasa.gov/publications/npb.html > On 24 Jan 2016, at 20:51 , Ibrahim Ikhlawi wrote: > > Thanks for the reply. > > But I want to get an idea of the behaviour of my server. Therefore > I need an

Re: [OMPI users] OpenMPI 1.8.5 build question

2015-09-23 Thread Mark Santcroos
> On 23 Sep 2015, at 13:49 , Kumar, Sudhir wrote: > I have a version of OpenMPI 1.8.5 installed. Is there any way of knowing > which version of gcc it was compiled with? ompi_info | grep -i compiler
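For scripted checks, the same filter can be applied from Python. This is a minimal sketch, assuming ompi_info is on the PATH and is run without arguments; the helper names grep_i and compiler_info are hypothetical and not part of Open MPI.

```python
import shutil
import subprocess

def grep_i(text, needle):
    """Return the lines of text that contain needle, ignoring case (like grep -i)."""
    needle = needle.lower()
    return [line for line in text.splitlines() if needle in line.lower()]

def compiler_info(ompi_info="ompi_info"):
    """Run ompi_info and keep only the lines that mention a compiler."""
    exe = shutil.which(ompi_info)
    if exe is None:
        # No Open MPI installation found on PATH; nothing to report.
        return []
    result = subprocess.run([exe], capture_output=True, text=True, check=True)
    return grep_i(result.stdout, "compiler")
```

Calling compiler_info() and printing each returned line gives the same lines as the shell pipeline. With an environment-module setup, it reports on whichever Open MPI installation is currently first on the PATH.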

Re: [OMPI users] Building OpenMPI 1.8.7 on XC30

2015-07-29 Thread Mark Santcroos
Hi Erik, > On 29 Jul 2015, at 3:35 , Erik Schnetter wrote: > I was able to build openmpi-v2.x-dev-96-g918650a without problems on Edison, > and also on other systems. And does it also work as expected after you have built it? :-) Thanks Mark

Re: [OMPI users] Building OpenMPI 1.8.7 on XC30

2015-07-26 Thread Mark Santcroos
on OS X; I > see messages such as > > [warn] select: Bad file descriptor > > Are these important? If not, how can I suppress them? > > -erik > > > On Sat, Jul 25, 2015 at 7:49 AM, Mark Santcroos <mark.santcr...@rutgers.edu> > wrote: > Hi Erik, > > Do you

Re: [OMPI users] Building OpenMPI 1.8.7 on XC30

2015-07-25 Thread Mark Santcroos
Hi Erik, Do you really want 1.8.7? Otherwise you might want to give the latest master a try. Others, including myself, had more luck with that on Crays, including Edison. Mark > On 25 Jul 2015, at 1:35 , Erik Schnetter wrote: > > I want to build OpenMPI 1.8.7 on a Cray XC30

Re: [OMPI users] open mpi on blue waters

2015-03-26 Thread Mark Santcroos
> On 26 Mar 2015, at 16:01 , Ralph Castain <r...@open-mpi.org> wrote: > >> >> On Mar 26, 2015, at 1:33 AM, Mark Santcroos <mark.santcr...@rutgers.edu> >> wrote: >> >> Hi guys, >> >> Thanks for the follow-up. >> >> It a

Re: [OMPI users] open mpi on blue waters

2015-03-26 Thread Mark Santcroos
Hi Ralph, > On 25 Mar 2015, at 21:59 , Mark Santcroos <mark.santcr...@rutgers.edu> wrote: >> Anyway, see if this fixes the problem. >> >> https://github.com/open-mpi/ompi/pull/497 Can confirm the fallback works now without setting explicitly to basic (with the merged changes). Thanks! Mark

Re: [OMPI users] open mpi on blue waters

2015-03-26 Thread Mark Santcroos
Hi guys, Thanks for the follow-up. It appears that you are ruling out that Munge is required because the system runs TORQUE, but as far as I can see Munge is/can be used by both SLURM and TORQUE.

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
Hi Ralph, > On 25 Mar 2015, at 21:25 , Ralph Castain wrote: > I think I have this resolved, > though I still suspect there is something wrong on that system. You > shouldn’t have some nodes running munge and others not running it. For completeness, it's not "some"

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
> On 25 Mar 2015, at 17:39 , Ralph Castain wrote: > Not surprising - I’m surprised to find munge on the mom’s node anyway given > that you are using Torque. > > I have to finish something else first, and it sounds like you aren’t blocked > at the moment. I’ll provide a

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
don’t see > this kind of mismatch for the very reason you are hitting - it becomes > difficult to resolve authentications. > > Let me ponder a bit. We can resolve it easily enough, but I want to ensure we > don’t do it by creating a security hole. > >> On Mar 25, 2015, at 9:25

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
> On 25 Mar 2015, at 17:06 , Ralph Castain wrote: > > OHO! You have munge running on the head node, but not on the backends! Ok, so I now know that munge is ... :) It's running on the MOM node (not on the head node): daemon 18800 0.0 0.0 118476 3212 ?Sl

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
> On 25 Mar 2015, at 17:06 , Ralph Castain wrote: > OHO! You have munge running on the head node, but not on the backends! I'm all for munching, but what does that mean? ;-) Is that something actively running or do you mean library available or such? > Okay, all you have to

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
> On 25 Mar 2015, at 16:52 , Ralph Castain wrote: > > Hmmm…okay, sorry to keep drilling down here, but let’s try adding “-mca > sec_base_verbose 100” now > /u/sciteam/marksant/openmpi/installation/bin/mpirun -mca oob_base_verbose 100 > -mca sec_base_verbose 100 ./a.out
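When several verbosity levels are switched on and off while debugging, it can help to assemble the command line programmatically. The sketch below builds the same kind of invocation as in the message above. The -mca flag and the two parameter names are taken from that message; the helper mpirun_command is hypothetical, and plain "mpirun" stands in for the full installation path.

```python
import shlex

def mpirun_command(mpirun, program, mca=None):
    """Build an argv list: mpirun, one '-mca name value' triple per parameter, then the program."""
    argv = [mpirun]
    for name, value in (mca or {}).items():
        argv += ["-mca", name, str(value)]
    argv.append(program)
    return argv

cmd = mpirun_command(
    "mpirun",
    "./a.out",
    {"oob_base_verbose": 100, "sec_base_verbose": 100},
)
# Print a copy-pasteable shell command (shlex.join needs Python 3.8+).
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting issues when the program takes arguments of its own.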

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
25 Mar 2015, at 16:49 , Ralph Castain <r...@open-mpi.org> wrote: > > Hmmm…well, it will generate some output, so keep the system down to two nodes > if you can just to minimize the chatter. Add “-mca oob_base_verbose 100” to > your cmd line > >> On Mar 25, 2015, at 8:4

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
TH > >> On Mar 25, 2015, at 7:46 AM, Howard Pritchard <hpprit...@gmail.com> wrote: >> >> turn off the disable getpwuid. >> >> On Mar 25, 2015 8:14 AM, "Mark Santcroos" <mark.santcr...@rutgers.edu> wrote: >> Hi Howard, >> >

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
> On 25 Mar 2015, at 15:46 , Howard Pritchard wrote: > turn off the disable getpwuid. That doesn't seem to make a difference. Have there been changes in this area? Last time I checked this a couple of months ago on Edison I needed this flag not to get spammed.

Re: [OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
I want to use orte-submit and friends, so I "explicitly" don't want to use aprun. > you definitely don't need to use ccm. > and shouldn't. Depends on the use-case, but happy to leave that out of scope for now :-) Thanks! Mark > > On Mar 25, 2015 6:00 AM, "Mark Santcro

[OMPI users] open mpi on blue waters

2015-03-25 Thread Mark Santcroos
Hi, Any users of Open MPI on Blue Waters here? And then I specifically mean in "native" mode, not inside CCM. After configuring and building as I do on other Crays, mpirun gives me the following: [nid25263:31700] [[23896,0],0] ORTE_ERROR_LOG: Authentication failed in file

Re: [OMPI users] independent startup of orted and orterun

2015-02-04 Thread Mark Santcroos
ing work, and we can kick around off-list about who does what. > > Great to hear this is working with your tool so quickly!! > Ralph > > > On Tue, Feb 3, 2015 at 3:49 PM, Mark Santcroos <mark.santcr...@rutgers.edu> > wrote: > Hi Ralph, > > Besides the items

Re: [OMPI users] independent startup of orted and orterun

2015-02-03 Thread Mark Santcroos
/2d36e886081bf8531097edfc95ada1826257e460) > On 03 Feb 2015, at 20:38 , Mark Santcroos <mark.santcr...@rutgers.edu> wrote: > > Hi Ralph, > >> On 03 Feb 2015, at 16:28 , Ralph Castain <r...@open-mpi.org> wrote: >> I think I fixed some of the handshake issues - please give it anothe

Re: [OMPI users] independent startup of orted and orterun

2015-02-03 Thread Mark Santcroos
Hi Ralph, > On 03 Feb 2015, at 16:28 , Ralph Castain wrote: > I think I fixed some of the handshake issues - please give it another try. > You should see orte-submit properly shutdown upon completion, Indeed, it works on my laptop now! Great! It feels quite fast too, for sort

Re: [OMPI users] independent startup of orted and orterun

2015-02-03 Thread Mark Santcroos
On 03 Feb 2015, at 0:20 , Ralph Castain wrote: > Okay, thanks - I'll get on it tonight. Looks like a fairly simple bug, so > hopefully I'll have it ironed out tonight. Sorry, I was not completely accurate. Let me be more specific: * The orte-submit does not return though, so

Re: [OMPI users] independent startup of orted and orterun

2015-02-02 Thread Mark Santcroos
FWIW: I see similar behaviour on my laptop (OS X Yosemite 10.10.2). > On 02 Feb 2015, at 21:26 , Mark Santcroos <mark.santcr...@rutgers.edu> wrote: > > Ok, let me check on some other systems too though, it might be Cray specific. > > >> On 02 Feb 2015, at 19:07

Re: [OMPI users] independent startup of orted and orterun

2015-02-02 Thread Mark Santcroos
happened here. I'm on travel this week, > but I'll try to dig into this a bit and spot the issue. > > Thanks! > Ralph > > > On Mon, Feb 2, 2015 at 3:50 AM, Mark Santcroos <mark.santcr...@rutgers.edu> > wrote: > Hi Ralph, > > Great, the semantics look exactly a

Re: [OMPI users] independent startup of orted and orterun

2015-02-02 Thread Mark Santcroos
ping/ranking/binding options supported just yet as I > first wanted to see if this meets your basic needs before worrying about the > detail. > > Let me know what you think > Ralph > > >> On Jan 21, 2015, at 4:07 PM, Mark Santcroos <mark.santcr...@rutgers.edu>

Re: [OMPI users] independent startup of orted and orterun

2015-01-21 Thread Mark Santcroos
g the daemons. This will allow > you to reuse the existing DVM, making each independent job start a great deal > faster. You’ll need to either manually terminate the DVM, or the RM will do > so when the allocation expires. > > HTH > Ralph > > >> On Jan 21, 2015, a

Re: [OMPI users] independent startup of orted and orterun

2015-01-21 Thread Mark Santcroos
Hi Ralph, > On 21 Jan 2015, at 21:20 , Ralph Castain <r...@open-mpi.org> wrote: > > Hi Mark > >> On Jan 21, 2015, at 11:21 AM, Mark Santcroos <mark.santcr...@rutgers.edu> >> wrote: >> >> Hi Ralph, all, >> >> To give some ba

Re: [OMPI users] independent startup of orted and orterun

2015-01-21 Thread Mark Santcroos
Hi Ralph, all, To give some background, I'm part of the RADICAL-Pilot [1] development team. RADICAL-Pilot is a Pilot System, an implementation of the Pilot (job) concept, which in its most minimal form takes care of the decoupling of resource acquisition and workload management. So instead

Re: [OMPI users] independent startup of orted and orterun

2015-01-21 Thread Mark Santcroos
you >> ask. The launch system in there isn’t fully implemented yet, but the >> fundamental idea is valid and supports some range of capability. >> >> We used to have a cmd line option in ORTE for what you propose - it wouldn’t >> be too hard to restore. Is th

[OMPI users] independent startup of orted and orterun

2015-01-21 Thread Mark Santcroos
Hi, Would it be possible to initially run "idle" orteds on my resources and then use orterun to launch my applications on these already running orteds? Thanks! Mark