[OMPI devel] FlowChecker: Detecting Bugs in MPI Libraries via Message Flow Checking

2010-11-20 Thread Christopher Samuel
lowchecker.pdf I've emailed them to see if the code is going to be available as it could be quite a handy tool to have when trying to track down issues like the one Sébastien posted about. cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-24 Thread Christopher Samuel
ock. Hmm, we've had a report from someone trying to use Ray on our BG/P that they've seen it lock up - is it likely to be the same issue ? cheers, Chris

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-24 Thread Christopher Samuel
rful, thank you! :-) - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/

Re: [OMPI devel] Help needed to run OMPI jobs under internal resource manager

2011-03-09 Thread Christopher Samuel
spect, from which it is derived): [root@bruce-m openmpi-1.4.2]# find . -name tm ./orte/mca/plm/tm ./orte/mca/ras/tm cheers! Chris

Re: [OMPI devel] 1.5.4 and 1.4.4 NEWS items

2011-08-18 Thread Christopher Samuel
On 18/08/11 23:11, Jeff Squyres wrote: > 1.4.4 Haven't been keeping up I'm afraid - is 1.4.4 backwards compatible with 1.4.2 ? cheers! Chris

Re: [OMPI devel] "Open MPI"-based MPI library used by K computer

2011-11-14 Thread Christopher Samuel
98] [bruce002:04048] [11] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e12a1d994] [bruce002:04048] [12] ./tp_lb_ub_ng [0x400af9] [bruce002:04048] *** End of error message ***

Re: [OMPI devel] OMPI 1.4.5rc1 posted

2011-12-13 Thread Christopher Samuel
ging about (for example): Local host: bruce001 File Name: /vlsci/tmp/979325.bruce-m.vlsci.unimelb.edu.au/openmpi-sessions-samuel@bruce001_0/14488/1/shared_mem_pool.bruce001 Is there a way to tell it to use /tmp without changing what $TMPDIR is set to ? cheers, Chris

Re: [OMPI devel] OMPI 1.4.5rc1 posted

2011-12-13 Thread Christopher Samuel
On 14/12/11 09:21, Jeff Squyres wrote: > On Dec 13, 2011, at 1:10 AM, Christopher Samuel wrote: > > > I think you want s/settings/setting/ there. > > Fixed! Thanks. Not a problem. > > Also I can not seem to make it accept

Re: [OMPI devel] OMPI 1.4.5rc1 posted

2011-12-13 Thread Christopher Samuel
MPI usage"), here are sm pingpong latencies > (using 1.4.3) for session dirs on Lustre, an SSD and tmpfs: Very interesting, no measurable difference (maybe even tmpfs being a touch slower).. Is that benchmark public ? cheers! Chris

Re: [OMPI devel] OMPI 1.4.5rc1 posted

2011-12-13 Thread Christopher Samuel
memory backed filesystems with the same benchmark. +1 to be able to disable this warning via an MCA parameter. cheers, Chris

Re: [OMPI devel] OMPI 1.4.5rc1 posted

2011-12-13 Thread Christopher Samuel
mpi/ticket/2937 That's great - much appreciated! cheers, Chris

Re: [OMPI devel] OMPI 1.4.5rc1 posted

2011-12-14 Thread Christopher Samuel
until 1.4.6 appears. cheers! Chris

Re: [OMPI devel] RFC: Support Cross Memory Attach in sm btl

2012-01-12 Thread Christopher Samuel
you have any figures comparing some code with and without CMA ? cheers, Chris

Re: [OMPI devel] RFC: Support Cross Memory Attach in sm btl

2012-01-12 Thread Christopher Samuel
know does a lot of comms and is latency sensitive), but my Copious Free Time(tm) appears to have run out for the moment. :-( But certainly very interesting..

Re: [OMPI devel] 1.4.5rc2 now released

2012-01-19 Thread Christopher Samuel
On 20/01/12 04:55, Jeff Squyres wrote: > Please test: Great - we can now silence that warning for NFS, thanks!

Re: [OMPI devel] Compile-time MPI_Datatype checking

2012-02-02 Thread Christopher Samuel
GCC with its plugin architecture ? cheers! Chris

Re: [OMPI devel] 1.5 supported systems

2012-02-23 Thread Christopher Samuel
com/ResLibSearchResult.aspx?keywords=openpbs Does anyone test against it? cheers, Chris

Re: [OMPI devel] 1.5.5rc2

2012-02-23 Thread Christopher Samuel
1.5.5rc2/ompi/contrib/vt' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/tmp/chris/openmpi-1.5.5rc2/ompi' make: *** [all-recursive] Error 1

Re: [OMPI devel] 1.5.5rc2

2012-02-23 Thread Christopher Samuel
On 24/02/12 15:12, Christopher Samuel wrote: > I suspect this is irrelevant, but I got a build failure trying to > compile it on our BG/P front end node (login node) with the IBM XL > compilers. Oops, forgot how I built it.. export

Re: [OMPI devel] poor btl sm latency

2012-02-28 Thread Christopher Samuel
On 13/02/12 22:11, Matthias Jurenz wrote: > Do you have any idea? Please help! Do you see the same bad latency in the old branch (1.4.5) ? cheers, Chris

Re: [OMPI devel] Open MPI nightly tarballs suspended / 1.5.5rc3

2012-02-28 Thread Christopher Samuel
On 29/02/12 07:44, Jeffrey Squyres wrote: > - BlueGene fixes rc3 fixes the builds on our front end node, thanks!

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Christopher Samuel
r > 1.6.0). What symptoms would an affected job show? Does it fail with an OMPI error or does it just hang using 0% CPU? cheers, Chris

Re: [OMPI devel] RFC: ob1: fallback on put/send on rget failure

2012-03-18 Thread Christopher Samuel
On 16/03/12 08:14, Shamis, Pavel wrote: > I did not get any patch. It arrived OK here, you can get it from the archive: http://www.open-mpi.org/community/lists/devel/2012/03/10717.php

Re: [OMPI devel] 1.6.1rc3 - 3 of 5 tests failed on OSX 10.8

2012-08-23 Thread Christopher Samuel
FOR A PARTICULAR PURPOSE. Hope this is of use! cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci

[OMPI devel] CRIU checkpoint support in Open-MPI?

2012-12-05 Thread Christopher Samuel
alk?day=thursday Is there interest from OMPI in supporting this, given it looks like it's quite likely to make it into the mainline kernel? Or is it better to wait for it to be merged, and then take a look? All the best, Chris

[OMPI devel] Choosing an Open-MPI release for a new cluster

2013-05-01 Thread Christopher Samuel
plan or not! Thoughts please? All the best, Chris

Re: [OMPI devel] Choosing an Open-MPI release for a new cluster

2013-05-02 Thread Christopher Samuel
at 1.6. Thanks so much to you all! All the best, Chris

[OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?

2013-05-02 Thread Christopher Samuel
l the best, Chris

Re: [OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?

2013-05-02 Thread Christopher Samuel
yet. > Brice: do the Phis appear in the hwloc topology object? They appear in lstopo as mic0 and mic1. > Chris: can you run lstopo on one of the nodes and send me the > output (off-list)? One of the hosts? Not a problem, will do. All the best! Chris

Re: [OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?

2013-05-03 Thread Christopher Samuel
ther > nodes. Gotcha. > Solving the first two is relatively straightforward. In my mind, > the primary issue is the last one - does anyone know if a process > on the Phi's can "see" interconnects like a TCP NIC or an > Infiniband adaptor? I'm not sure, but I

Re: [OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?

2013-05-03 Thread Christopher Samuel
On 03/05/13 14:30, Ralph Castain wrote: > On May 2, 2013, at 9:18 PM, Christopher Samuel > wrote: > >> We're using Slurm, and it supports them already apparently, so I'm >> not sure if that helps? > > It doe

[OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-07-23 Thread Christopher Samuel
ng like this, or got any ideas? cheers, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-07-23 Thread Christopher Samuel
run as they will be used to from our other Intel systems and to only use srun if the code requires it (one or two commercial apps that use Intel MPI). Can I ask, if the PMI2 ideas work out is that likely to get backported to OMPI 1.6.x ? All the best, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-07-24 Thread Christopher Samuel
to see if it resolves the difference. When I've got the current rush out of the way I'll try a private build of 1.7 and see how that goes with NAMD. cheers! Chris

[OMPI devel] Memory accounting issues with mpirun (was Re: [slurm-dev] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun)

2013-08-07 Thread Christopher Samuel
On 23/07/13 17:06, Christopher Samuel wrote: > Bringing up a new IBM SandyBridge cluster I'm running a NAMD test > case and noticed that if I run it with srun rather than mpirun it > goes over 20% slower. Following on from this issu

Re: [OMPI devel] Memory accounting issues with mpirun (was Re: [slurm-dev] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun)

2013-08-07 Thread Christopher Samuel
On 07/08/13 16:18, Christopher Samuel wrote: > Anyone seen anything similar, or any ideas on what could be going > on? Apologies, forgot to mention that Slurm is set up with: # ACCOUNTING JobAcctGatherType=jobacct_gather

Re: [OMPI devel] [slurm-dev] slurm-dev Memory accounting issues with mpirun (was Re: Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun)

2013-08-07 Thread Christopher Samuel
On 07/08/13 16:19, Christopher Samuel wrote: > Anyone seen anything similar, or any ideas on what could be going > on? Sorry, this was with: # ACCOUNTING JobAcctGatherType=jobacct_gather/linux JobAcctGatherFrequency=30 Since those initial

Re: [OMPI devel] [slurm-dev] Re: slurm-dev Memory accounting issues with mpirun (was Re: Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun)

2013-08-07 Thread Christopher Samuel
is using more than its allowed memory per task, but I'm not sure I understand how that could lead to Slurm thinking the job is using vastly more memory than it actually is though. cheers, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-08-08 Thread Christopher Samuel
love to be able to test this out if we can as I currently see a 60% penalty with srun with my test NAMD job from our tame MM person. thanks! Chris

Re: [OMPI devel] [slurm-dev] slurm-dev Memory accounting issues with mpirun (was Re: Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun)

2013-08-19 Thread Christopher Samuel
issues with it so far. In the long term I suspect the jobacct_gather/cgroup plugin will give better numbers once it's had more work. All the best, Chris

[OMPI devel] How to deal with F90 mpi.mod with single stack and multiple compiler suites?

2013-08-22 Thread Christopher Samuel
rush to try and get this done before I need to leave for the day). :-( cheers, Chris

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-08-29 Thread Christopher Samuel
are: (gdb) print remainder $1 = (struct malloc_chunk *) 0x2008e5700 (gdb) print remainder_size $2 = 0 Any ideas? cheers, Chris

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-08-29 Thread Christopher Samuel
o you get the same behavior if you disable ptmalloc in OMPI? > (your IB large message bandwidth will suffer a bit, though) Not tried that, but I'll take a look at it if it doesn't seem possible to fix it with a change to the default memory limits (that'll be the least intrusive).

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-08-30 Thread Christopher Samuel
_disable=1 I don't get the crash at all, or the spin with the Intel compiler build. Nice! Thanks for this, I'll take a look further next week.. Very much obliged, Chris

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-09-02 Thread Christopher Samuel
On 30/08/13 16:01, Christopher Samuel wrote: > Thanks for this, I'll take a look further next week.. The code where it's SEGV'ing is here: /* check that one of the above allocation paths succeeded */ if ((unsigned long)(size)

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-09-02 Thread Christopher Samuel
On 02/09/13 15:40, Christopher Samuel wrote: > It dies when it does: > > set_head(remainder, remainder_size | PREV_INUSE); > > where remainder_size=0. Ignore that, I've shown it to someone who is actually a programmer and w

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-09-02 Thread Christopher Samuel
On 02/09/13 16:32, Christopher Samuel wrote: > I cannot duplicate this under valgrind or gdb and given that this > doesn't happen every time I run it and gdb indicates there are at > least 2 threads running then we're wonderin

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-02 Thread Christopher Samuel
flat out! All the best, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-02 Thread Christopher Samuel
y release > it soon. Stupid question, but never having played with PMI before is it just the case of appending the --with-pmi option to our current configure? thanks, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-03 Thread Christopher Samuel
r29103 with mpirun - 8341 seconds Open-MPI 1.7.3a1r29103 with srun - 7476 seconds So that's about 11% faster, and the mpirun speed has decreased though of course that's built using PMI so perhaps that's the cause? cheers, Chris

Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-09-03 Thread Christopher Samuel
sr/local/${BASE} --with-slurm --with-openib --enable-static --enable-shared make -j

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-03 Thread Christopher Samuel
t's doing a lot more than that and has a reputation for being a *very* chatty MPI code. For comparison whilst users see GROMACS also suffer with srun under 1.6.5 they don't see anything like the slow down that NAMD gets. All the best, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-04 Thread Christopher Samuel
mory in use. Hope this is useful! All the best, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-05 Thread Christopher Samuel
irun has: envar 64 whereas srun has: envar NULL Are these differences significant? I'm intrigued that the problem child (srun 1.6.5) is the only one where number is 1. All the best, Chris

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-06 Thread Christopher Samuel
samuel@barcoo ~]$ module show openmpi 2>&1 | grep binding setenv OMPI_MCA_orte_process_binding core However, modifying the test program confirms that variable is getting propagated as expected with both mpirun and srun for 1.6.5 and the 1.7 snapshot. :-( cheers, Chris
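For anyone wanting to repeat that sort of check, here is a minimal sketch of how a test program might report whether an MCA environment variable reached a process. This is not the test program from the thread; the helper name `mca_env_or_default` is made up for illustration, and only the variable name `OMPI_MCA_orte_process_binding` comes from the snippet above.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: return the value of an environment variable,
 * or `fallback` if the variable is not set in this process.
 */
const char *mca_env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL) ? value : fallback;
}
```

Each rank could then print `mca_env_or_default("OMPI_MCA_orte_process_binding", "(unset)")` so that the output under mpirun and under srun can be compared side by side.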

Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slowed than with mpirun

2013-09-06 Thread Christopher Samuel
On 06/09/13 14:14, Christopher Samuel wrote: > However, modifying the test program confirms that variable is getting > propagated as expected with both mpirun and srun for 1.6.5 and the 1.7 > snapshot. :-( Investigating further by setting

Re: [OMPI devel] Openmpi 1.6.5 is freezing under GNU/Linux ia64

2013-09-20 Thread Christopher Samuel
s! Chris

Re: [OMPI devel] Openmpi 1.6.5 is freezing under GNU/Linux ia64

2013-09-21 Thread Christopher Samuel
the debug info hadn't shown it getting to the point of launching the executable. Mea culpa. I blame jet-lag. ;-) cheers, Chris (about to get a second dose)

[OMPI devel] 1.6.5 large matrix test doesn't pass (decode) ?

2013-10-04 Thread Christopher Samuel
All the best, Chris

Re: [OMPI devel] 1.6.5 large matrix test doesn't pass (decode) ?

2013-10-16 Thread Christopher Samuel
ilure occurs with plain v1.6.5 and it doesn't > occur with patched v1.6.5. Perfect, thanks! Sorry for the delay, been away on holiday. All the best, Chris

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r29615 - in trunk: . contrib contrib/dist/linux debian debian/source

2013-11-06 Thread Christopher Samuel
aps a better way to assist would be to help out Sylvestre and the other Debian maintainers? This might be a handy place to start: http://qa.debian.org/developer.php?login=pkg-openmpi-maintainers%40lists.alioth.debian.org All the best, Chris

[OMPI devel] Happy Open-MPI day everyone!

2013-11-22 Thread Christopher Samuel
I Day!". All the best, Chris

Re: [OMPI devel] RFC: usnic BTL MPI_T pvar scheme

2013-11-22 Thread Christopher Samuel
I operations until you've read a value back from the OS. But then if you want to read multiple values from the OS you're going to be out of luck there too. Unless I'm missing something? So perhaps the best thing is to just document this prominently. All the best, Chris

Re: [OMPI devel] Openmpi 1.6.5 is freezing under GNU/Linux ia64

2013-12-03 Thread Christopher Samuel
provement occurs, ia64 will be removed from testing on # Friday 24th January 2014.

Re: [OMPI devel] SC13 birds of a feather

2013-12-03 Thread Christopher Samuel
s with a grab bag of jobs it's not likely useful, but if you had a system dedicated to running an in house code then you could conceive of situations where you might want to react to over-temperature cores, nodes, etc. cheers, Chris

Re: [OMPI devel] SC13 birds of a feather

2013-12-05 Thread Christopher Samuel
's what there. > (Thanks for the idea, Samuel!) My pleasure! All the best, Chris

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-06 Thread Christopher Samuel
with OMPI 1.6.x but Slurm then gets all its memory stats wrong and if you run with CR_Core_Memory in Slurm you have a very high risk your job will get killed incorrectly. All the best, Chris

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-06 Thread Christopher Samuel
6.072693 CPUTime: 586.072693 > > Average of 563 seconds. > > So that's about 23% slower. > > Everything is identical (they're all symlinks to the same golden > master) *except* for the srun / mpirun which is modified by > copying the batch script and substituting

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-07 Thread Christopher Samuel
e: 64 CPUs 0.280824 s/step 1.62514 days/ns 904.91 MB memory WallClock: 7522.677246 CPUTime: 7522.677246 Memory: 969.433594 MB So to me it looks like (for NAMD on our system at least) that PMI2 does seem to give better scalability. All the best! Chris

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-07 Thread Christopher Samuel
ad an option to enable PMI2 by default so that only those who requested it got it then I'd be more than happy - we'd just add it to our script to build it. All the best! Chris

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-07 Thread Christopher Samuel
the cluster that those tests were run on has 70 nodes, each with 16 cores, so I suspect we're a long long way away from that pain point. All the best! Chris

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-07 Thread Christopher Samuel
unfortunately the cluster we tested on then is flat out at the moment but I'll try and sneak a 64-core job using identical configs and compare mpirun, srun on its own and srun with PMI2. All the best, Chris

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-08 Thread Christopher Samuel
On 08/05/14 23:45, Ralph Castain wrote: > Artem and I are working on a new PMIx plugin that will resolve it > for non-Mellanox cases. Ah yes of course, sorry my bad!

Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested

2014-05-08 Thread Christopher Samuel
re is work on an alternative solution that we will be able to use. Thanks! Chris

Re: [OMPI devel] 1.4.2rc1 available for test

2010-04-18 Thread Christopher Samuel
On 14/04/10 04:43, Ralph Castain wrote: > I rolled the 1.4.2 release candidate 1 tarball today. The NEWS file only covers changes up to 1.4.1, does that usually get updated prior to, or after, the RC's ? cheers, Chris

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-02 Thread Christopher Samuel
it master) the function that handles this - do_shm_rmid() in ipc/shm.c - only destroys the segment if nobody is attached to it, otherwise it marks the segment as IPC_PRIVATE to stop others finding it and with SHM_DEST so that it is automatically destroyed on the last detach. cheers, Chris
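The user-space side of that behaviour can be sketched as below. This is only an illustration under stated assumptions: the helper name `use_segment_after_rmid` and the 4096-byte size are invented here, and the code only demonstrates that a mapping stays usable after IPC_RMID while a process is still attached; it does not inspect the kernel's IPC_PRIVATE / SHM_DEST marking directly.

```c
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: create a private System V segment, attach to it,
 * mark it for removal while still attached, then keep using the mapping.
 * Copies what was read back into `out`.
 * Returns 0 on success, -1 on any failure.
 */
int use_segment_after_rmid(char *out, size_t outlen)
{
    static const char msg[] = "still mapped";

    if (out == NULL || outlen < sizeof msg) {
        return -1;
    }

    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id == -1) {
        return -1;
    }

    char *p = shmat(id, NULL, 0);
    if (p == (char *)-1) {
        shmctl(id, IPC_RMID, NULL);   /* nobody attached: destroyed now */
        return -1;
    }

    /* We are attached, so this only marks the segment for destruction. */
    if (shmctl(id, IPC_RMID, NULL) == -1) {
        shmdt(p);
        return -1;
    }

    memcpy(p, msg, sizeof msg);       /* the mapping is still valid */
    memcpy(out, p, sizeof msg);

    /* Last detach: this is the point at which the segment goes away. */
    return (shmdt(p) == 0) ? 0 : -1;
}
```

Calling IPC_RMID straight after attaching like this means the segment is cleaned up even if the process later dies without detaching explicitly, since the kernel detaches on exit.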

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-02 Thread Christopher Samuel
rrent master of the kernel, IPC_PRIVATE is set on the segment with the comment: /* Do not find it any more */ That flag means that ipcget() - used by sys_shmget() - takes a different code path and now calls ipcget_new() rather than ipcget_public(). cheers, Chris

[OMPI devel] Unchecked malloc()'s in OMPI 1.4.x

2010-05-02 Thread Christopher Samuel
pal_hash_table.c line 431 - opal_hash_table_set_value_ptr() node->hn_key = malloc(key_size); orte/mca/ras/alps/ras_alps_module.c line 243 - orte_ras_alps_read_appinfo_file() cpBuf=malloc(szLen+1); /* Allocate buffer */ All the best, Chris
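As a rough illustration of what a checked allocation looks like for a call of the `node->hn_key = malloc(key_size);` kind, here is a small sketch. It is not the Open MPI code or a proposed patch; `struct example_node` and `example_node_set_key` are hypothetical stand-ins, and how a real caller should report the failure is left open.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a hash-table node; not the Open MPI struct. */
struct example_node {
    void  *key;
    size_t key_size;
};

/*
 * Copy `key` into the node, checking the malloc() result before use.
 * Returns 0 on success, -1 on bad arguments or allocation failure;
 * on failure the node is left unchanged.
 */
int example_node_set_key(struct example_node *node,
                         const void *key, size_t key_size)
{
    if (node == NULL || key == NULL || key_size == 0) {
        return -1;
    }

    void *copy = malloc(key_size);
    if (copy == NULL) {
        return -1;                 /* report it rather than dereference NULL */
    }

    memcpy(copy, key, key_size);
    free(node->key);               /* free(NULL) is a no-op */
    node->key = copy;
    node->key_size = key_size;
    return 0;
}
```

Allocating into a temporary first means the node is never left pointing at a half-initialised or NULL key if the allocation fails.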

Re: [OMPI devel] System V Shared Memory for OpenMPI:Request forCommunity Input and Testing

2010-05-05 Thread Christopher Samuel
nd that was on a Solaris system rather than Linux. SunOS burl-ct-v440-2 5.10 Generic_118833-33 sun4u sparc SUNW,Sun-Fire-V440 cheers, Chris

Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed

2010-05-13 Thread Christopher Samuel
a /tmp which was NFS mounted; changing the location where their files were kept to another directory with the orte_tmpdir_base MCA parameter fixed that issue for them. Could it be similar for yourself ? cheers, Chris

Re: [OMPI devel] RFC: Remove all other paffinity components

2010-05-13 Thread Christopher Samuel
heers! Chris

Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed

2010-05-16 Thread Christopher Samuel
On 15/05/10 07:49, Ralph Castain wrote: > We have had a FAQ on this for a long time...problem is, > nobody reads it :-/ I even grabbed the URL to paste into my reply, but forgot to actually paste it before hitting send! :-/

Re: [OMPI devel] Very poor performance with btl sm on twin nehalem servers with Mellanox ConnectX installed

2010-05-17 Thread Christopher Samuel
More info here: http://wiki.linuxquestions.org/wiki/Tmpfs cheers! Chris

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-11 Thread Christopher Samuel
but lower values are meant to make the kernel less likely to swap out applications and instead concentrate on reclaiming pages from the page cache. cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unime

Re: [OMPI devel] v1.5: thumbs up or down?

2010-07-01 Thread Christopher Samuel
On 30/06/10 01:38, Jeff Squyres wrote: > Can we get a thumbs up / down from each organization > about where you think we are with v1.5? I'm unlikely to get a chance to test it I'm afraid, flat out here.. :-( -- Christopher

Re: [OMPI devel] OMPI 1.5 twitter notification plugin probably broken by switch to OAUTH

2010-09-08 Thread Christopher Samuel
cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Christopher Samuel
sers (who happened to be our director) tried to use it, it failed because the mpi.mod module created during the build is compiler dependent. :-( So ever since we've done separate builds for GCC and for Intel. All the best! Chris -- Christopher Samuel - Senior Systems Administrator VLS

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Christopher Samuel
ebug why their code wouldn't compile. :-) Apologies for the noise. All the best, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci

Re: [OMPI devel] [2.0.0rc2] xlc-13.1.0 ICE (hwloc)

2016-05-05 Thread Christopher Samuel
On 03/05/16 18:11, Paul Hargrove wrote: > xlc-13.1.0 on Linux dies compiling the embedded hwloc in this rc > (details below). In case it's useful xlc 12.1.0.9-140729 (yay for BGQ living in the past) doesn't ICE on RHEL6 on Power7. All the best, Chris -- Christopher Samu

Re: [OMPI devel] Github pricing plan changes announced today

2016-05-17 Thread Christopher Samuel
tool and found MTT but commented: # OpenMPI has the MPI Testing Tool which looks like it would work, # but most of there tests seem private. and so moved on to look at other options instead. All the best, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life

Re: [OMPI devel] Github pricing plan changes announced today

2016-05-17 Thread Christopher Samuel
On 18/05/16 09:59, Gilles Gouaillardet wrote: > the (main) reason is none of us are lawyers and none of us know whether > all test suites can be redistributed for general public use or not. Thanks Gilles, All the best, Chris -- Christopher Samuel - Senior Systems Administrator

Re: [OMPI devel] RFC: Public Test Repo

2016-05-19 Thread Christopher Samuel
ink it sends the right message about openness and hopefully allows a community to build around MPI testing in general. Certainly happy to try it out! All the best, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Emai

Re: [OMPI devel] [1.10.3rc4] testing results

2016-06-06 Thread Christopher Samuel
olved with this. -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci

Re: [OMPI devel] Migration of mailman mailing lists

2016-07-18 Thread Christopher Samuel
On 19/07/16 02:05, Brice Goglin wrote: > Yes, kill all netloc lists. Will the archives be preserved somewhere for historical reference? All the best, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email:

Re: [OMPI devel] Off-topic re: supporting old systems

2016-08-30 Thread Christopher Samuel
bian GNU/Linux 7 \n \l cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci

Re: [OMPI devel] Off-topic re: supporting old systems

2016-08-30 Thread Christopher Samuel
==>] 8,192,091 1.75M/s in 7.3s 2016-08-31 12:12:08 (1.07 MB/s) - `openmpi-2.0.1rc2.tar.bz2' saved [8192091/8192091] All the best, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative E

Re: [OMPI devel] Off-topic re: supporting old systems

2016-08-31 Thread Christopher Samuel
On 31/08/16 14:01, Paul Hargrove wrote: > So, the sparc platform is a bit more orphaned that it already was when > support stopped at Wheezy. Ah sorry, I didn't realise you were on a non-LTS Wheezy architecture. -- Christopher Samuel - Senior Systems Administrator VLSCI - Vic

Re: [OMPI devel] RFC: Rename nightly snapshot tarballs

2016-10-17 Thread Christopher Samuel
s been happened before then I'd suggest allowing for it to happen again by adding HHMM. Otherwise looks sensible to me (YMMV). -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 555

Re: [OMPI devel] Segfault on MPI init

2017-02-21 Thread Christopher Samuel
t a backtrace from all threads at once with: thread apply all bt It's not just limited to 'bt' either: (gdb) help thread apply Apply a command to a list of threads. List of thread apply subcommands: thread apply all -- Apply a command to all threads -- Christopher Samuel

[OMPI devel] Open-MPI killing nodes with mlx5 drivers?

2017-10-29 Thread Christopher Samuel
est, Chris -- Christopher Samuel - Senior Systems Administrator Melbourne Bioinformatics - The University of Melbourne Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
