The stable 1.8.7 release is out - please give it a try and let me know whether it resolves the problem.
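
For anyone retesting, a minimal sketch of how the upgrade and re-run might look is below. It assumes the same install layout and test binary referenced later in this thread; the 1.8.7 prefix path simply mirrors the existing 1.8.6 one and the tarball should be taken from the v1.8 download page linked further down, so treat the exact paths as illustrative rather than prescriptive.

# Build and install 1.8.7 next to the existing 1.8.6 tree (prefix path is an
# assumption mirroring /hpc/apps/mpi/openmpi/1.8.6 used elsewhere in this thread;
# the tarball comes from the v1.8 download page referenced below).
tar xjf openmpi-1.8.7.tar.bz2 && cd openmpi-1.8.7
./configure --prefix=/hpc/apps/mpi/openmpi/1.8.7
make -j4 && make install

# Re-run the failing 160-rank case against the new install, mirroring the
# options used in the runs quoted below.
/hpc/apps/mpi/openmpi/1.8.7/bin/mpirun -np 160 -display-devel-map \
    --prefix /hpc/apps/mpi/openmpi/1.8.7 --hostfile hostfile-noslots \
    --hetero-nodes --bind-to core \
    /hpc/home/lanew/mpi/openmpi/ProcessColors3 > out.txt 2>&1
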
> On Jul 15, 2015, at 9:53 PM, Lane, William <william.l...@cshs.org> wrote: > > Ralph, > > I'd rather wait for the stable release of 1.8.7, but I'm willing to give > it a try if my supervisor is. > > -Bill L. > > From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain > [r...@open-mpi.org] > Sent: Tuesday, July 14, 2015 12:47 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash > > Can you give it a try? I’m skeptical, but it might work. The rc is out on the > web site: > > http://www.open-mpi.org/software/ompi/v1.8/ > <http://www.open-mpi.org/software/ompi/v1.8/> > > >> On Jul 14, 2015, at 11:17 AM, Lane, William <william.l...@cshs.org >> <mailto:william.l...@cshs.org>> wrote: >> >> Ralph, >> >> Do you think the 1.8.7 release will solve the problems w/our >> heterogeneous cluster? >> >> Bill L. >> >> From: users [users-boun...@open-mpi.org <mailto:users-boun...@open-mpi.org>] >> on behalf of Ralph Castain [r...@open-mpi.org <mailto:r...@open-mpi.org>] >> Sent: Tuesday, July 07, 2015 8:59 PM >> To: Open MPI Users >> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash >> >> No need for the lstopo data anymore, Bill - I was able to recreate the >> situation using some very nice hwloc functions plus your prior descriptions. >> I'm not totally confident that this fix will resolve the problem but it will >> clear out at least one problem. >> >> We'll just have to see what happens and attack it next. >> Ralph >> >> >> On Tue, Jul 7, 2015 at 8:07 PM, Lane, William <william.l...@cshs.org >> <mailto:william.l...@cshs.org>> wrote: >> I'm sorry I haven't been able to get the lstopo information for >> all the nodes, but I had to get the latest version of hwloc installed >> first. They've even added in some more modern blades that also >> support hyperthreading, ugh. They've also been doing some memory >> upgrades as well. >> >> I'm trying to get a Bash script running on the cluster via qsub >> that will run lstopo and output the host information to a file located >> in my $HOME directory but it hasn't been working (there are 60 nodes >> in the heterogeneous cluster that needs to have OpenMPI running). >> >> I will try to get the lstopo information by the end of the week. >> >> I'd be willing to do most anything to get these OpenMPI issues >> resolved. I'd even wash your cars for you! >> >> -Bill L. >> ________________________________________ >> From: users [users-boun...@open-mpi.org <mailto:users-boun...@open-mpi.org>] >> on behalf of Ralph Castain [r...@open-mpi.org <mailto:r...@open-mpi.org>] >> Sent: Tuesday, July 07, 2015 1:36 PM >> To: Open MPI Users >> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = crash >> >> I may have finally tracked this down. At least, I can now get the correct >> devel map to come out, and found a memory corruption issue that only >> impacted hetero operations. I can’t know if this is the root cause of the >> problem Bill is seeing, however, as I have no way of actually running the >> job. >> >> I pushed this into the master and will bring it back to 1.8.7 as well as >> 1.10. >> >> Bill - would you be able/willing to give it a try there? It would be nice to >> confirm this actually fixed the problem. 
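
As an aside on the qsub/lstopo collection script Bill mentions above (the one that "hasn't been working"), here is a minimal sketch of one way it could be written. It assumes an SGE-style scheduler (hence the `#$` directives and the `-l hostname=` resource request) and that hwloc's lstopo is in PATH on every compute node; the script name and output location are illustrative.

#!/bin/bash
# Hypothetical job script: dump the topology of whichever node runs it.
# Submit once per host, e.g.:
#   for h in $(cat hostfile-noslots); do qsub -l hostname="$h" collect-topo.sh; done
#$ -S /bin/bash
#$ -N collect-topo
#$ -cwd

out="$HOME/lstopo-$(hostname -s).txt"
{
  echo "### $(hostname) ###"
  lstopo --of console            # text rendering of the hwloc topology
} > "$out" 2>&1
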
>> >> >> > On Jun 29, 2015, at 1:58 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com >> > <mailto:jsquy...@cisco.com>> wrote: >> > >> > lstopo will tell you -- if there is more than one "PU" (hwloc terminology >> > for "processing unit") per core, then hyper threading is enabled. If >> > there's only one PU per core, then hyper threading is disabled. >> > >> > >> >> On Jun 29, 2015, at 4:42 PM, Lane, William <william.l...@cshs.org >> >> <mailto:william.l...@cshs.org>> wrote: >> >> >> >> Would the output of dmidecode -t processor and/or lstopo tell me >> >> conclusively >> >> if hyperthreading is enabled or not? Hyperthreading is supposed to be >> >> enabled >> >> for all the IBM x3550 M3 and M4 nodes, but I'm not sure if it actually is >> >> and I >> >> don't have access to the BIOS settings. >> >> >> >> -Bill L. >> >> >> >> From: users [users-boun...@open-mpi.org >> >> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain >> >> [r...@open-mpi.org <mailto:r...@open-mpi.org>] >> >> Sent: Saturday, June 27, 2015 7:21 PM >> >> To: Open MPI Users >> >> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = >> >> crash >> >> >> >> Bill - this is such a jumbled collection of machines that I’m having >> >> trouble figuring out what I should replicate. I can create something >> >> artificial here so I can try to debug this, but I need to know exactly >> >> what I’m up against - can you tell me: >> >> >> >> * the architecture of each type - how many sockets, how many >> >> cores/socket, HT on or off. If two nodes have the same physical setup but >> >> one has HT on and the other off, then please consider those two different >> >> types >> >> >> >> * how many nodes of each type >> >> >> >> Looking at your map output, it looks like the map is being done >> >> correctly, but somehow the binding locale isn’t getting set in some >> >> cases. You latest error output would seem out-of-step with your prior >> >> reports, so something else may be going on there. As I said earlier, this >> >> is the most hetero environment we’ve seen, and so there may be some code >> >> paths your hitting that haven’t been well exercised before. >> >> >> >> >> >> >> >> >> >>> On Jun 26, 2015, at 5:22 PM, Lane, William <william.l...@cshs.org >> >>> <mailto:william.l...@cshs.org>> wrote: >> >>> >> >>> Well, I managed to get a successful mpirun @ a slot count of 132 using >> >>> --mca btl ^sm, >> >>> however when I increased the slot count to 160, mpirun crashed without >> >>> any error >> >>> output: >> >>> >> >>> mpirun -np 160 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ >> >>> --hostfile hostfile-noslots --mca btl ^sm --hetero-nodes --bind-to core >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1 >> >>> >> >>> -------------------------------------------------------------------------- >> >>> WARNING: a request was made to bind a process. While the system >> >>> supports binding the process itself, at least one node does NOT >> >>> support binding memory to the process location. >> >>> >> >>> Node: csclprd3-6-1 >> >>> >> >>> This usually is due to not having the required NUMA support installed >> >>> on the node. In some Linux distributions, the required support is >> >>> contained in the libnumactl and libnumactl-devel packages. >> >>> This is a warning only; your job will continue, though performance may >> >>> be degraded. 
>> >>> -------------------------------------------------------------------------- >> >>> -------------------------------------------------------------------------- >> >>> A request was made to bind to that would result in binding more >> >>> processes than cpus on a resource: >> >>> >> >>> Bind to: CORE >> >>> Node: csclprd3-6-1 >> >>> #processes: 2 >> >>> #cpus: 1 >> >>> >> >>> You can override this protection by adding the "overload-allowed" >> >>> option to your binding directive. >> >>> -------------------------------------------------------------------------- >> >>> >> >>> But csclprd3-6-1 (a blade) does have 2 CPU's on 2 separate sockets w/2 >> >>> cores apiece as shown in my dmidecode output: >> >>> >> >>> csclprd3-6-1 ~]# dmidecode -t processor >> >>> # dmidecode 2.11 >> >>> SMBIOS 2.4 present. >> >>> >> >>> Handle 0x0008, DMI type 4, 32 bytes >> >>> Processor Information >> >>> Socket Designation: Socket 1 CPU 1 >> >>> Type: Central Processor >> >>> Family: Xeon >> >>> Manufacturer: GenuineIntel >> >>> ID: F6 06 00 00 01 03 00 00 >> >>> Signature: Type 0, Family 6, Model 15, Stepping 6 >> >>> Flags: >> >>> FPU (Floating-point unit on-chip) >> >>> CX8 (CMPXCHG8 instruction supported) >> >>> APIC (On-chip APIC hardware supported) >> >>> Version: Intel Xeon >> >>> Voltage: 2.9 V >> >>> External Clock: 333 MHz >> >>> Max Speed: 4000 MHz >> >>> Current Speed: 3000 MHz >> >>> Status: Populated, Enabled >> >>> Upgrade: ZIF Socket >> >>> L1 Cache Handle: 0x0004 >> >>> L2 Cache Handle: 0x0005 >> >>> L3 Cache Handle: Not Provided >> >>> >> >>> Handle 0x0009, DMI type 4, 32 bytes >> >>> Processor Information >> >>> Socket Designation: Socket 2 CPU 2 >> >>> Type: Central Processor >> >>> Family: Xeon >> >>> Manufacturer: GenuineIntel >> >>> ID: F6 06 00 00 01 03 00 00 >> >>> Signature: Type 0, Family 6, Model 15, Stepping 6 >> >>> Flags: >> >>> FPU (Floating-point unit on-chip) >> >>> CX8 (CMPXCHG8 instruction supported) >> >>> APIC (On-chip APIC hardware supported) >> >>> Version: Intel Xeon >> >>> Voltage: 2.9 V >> >>> External Clock: 333 MHz >> >>> Max Speed: 4000 MHz >> >>> Current Speed: 3000 MHz >> >>> Status: Populated, Enabled >> >>> Upgrade: ZIF Socket >> >>> L1 Cache Handle: 0x0006 >> >>> L2 Cache Handle: 0x0007 >> >>> L3 Cache Handle: Not Provided >> >>> >> >>> csclprd3-6-1 ~]# lstopo >> >>> Machine (16GB) >> >>> Socket L#0 + L2 L#0 (4096KB) >> >>> L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0) >> >>> L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#2) >> >>> Socket L#1 + L2 L#1 (4096KB) >> >>> L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#1) >> >>> L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3) >> >>> >> >>> csclprd3-0-1 information (which looks correct as this particular x3550 >> >>> should >> >>> have one socket populated (of two) with a 6 core Xeon (or 12 cores >> >>> w/hyperthreading >> >>> turned on): >> >>> >> >>> csclprd3-0-1 ~]# lstopo >> >>> Machine (71GB) >> >>> Socket L#0 + L3 L#0 (12MB) >> >>> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU >> >>> L#0 (P#0) >> >>> L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU >> >>> L#1 (P#1) >> >>> L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU >> >>> L#2 (P#2) >> >>> L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU >> >>> L#3 (P#3) >> >>> L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU >> >>> L#4 (P#4) >> >>> L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU >> >>> L#5 (P#5) >> >>> >> >>> 
csclprd3-0-1 ~]# dmidecode -t processor >> >>> # dmidecode 2.11 >> >>> # SMBIOS entry point at 0x7f6be000 >> >>> SMBIOS 2.5 present. >> >>> >> >>> Handle 0x0001, DMI type 4, 40 bytes >> >>> Processor Information >> >>> Socket Designation: Node 1 Socket 1 >> >>> Type: Central Processor >> >>> Family: Xeon MP >> >>> Manufacturer: Intel(R) Corporation >> >>> ID: C2 06 02 00 FF FB EB BF >> >>> Signature: Type 0, Family 6, Model 44, Stepping 2 >> >>> Flags: >> >>> FPU (Floating-point unit on-chip) >> >>> VME (Virtual mode extension) >> >>> DE (Debugging extension) >> >>> PSE (Page size extension) >> >>> TSC (Time stamp counter) >> >>> MSR (Model specific registers) >> >>> PAE (Physical address extension) >> >>> MCE (Machine check exception) >> >>> CX8 (CMPXCHG8 instruction supported) >> >>> APIC (On-chip APIC hardware supported) >> >>> SEP (Fast system call) >> >>> MTRR (Memory type range registers) >> >>> PGE (Page global enable) >> >>> MCA (Machine check architecture) >> >>> CMOV (Conditional move instruction supported) >> >>> PAT (Page attribute table) >> >>> PSE-36 (36-bit page size extension) >> >>> CLFSH (CLFLUSH instruction supported) >> >>> DS (Debug store) >> >>> ACPI (ACPI supported) >> >>> MMX (MMX technology supported) >> >>> FXSR (FXSAVE and FXSTOR instructions supported) >> >>> SSE (Streaming SIMD extensions) >> >>> SSE2 (Streaming SIMD extensions 2) >> >>> SS (Self-snoop) >> >>> HTT (Multi-threading) >> >>> TM (Thermal monitor supported) >> >>> PBE (Pending break enabled) >> >>> Version: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz >> >>> Voltage: 1.2 V >> >>> External Clock: 5866 MHz >> >>> Max Speed: 4400 MHz >> >>> Current Speed: 2400 MHz >> >>> Status: Populated, Enabled >> >>> Upgrade: ZIF Socket >> >>> L1 Cache Handle: 0x0002 >> >>> L2 Cache Handle: 0x0003 >> >>> L3 Cache Handle: 0x0004 >> >>> Serial Number: Not Specified >> >>> Asset Tag: Not Specified >> >>> Part Number: Not Specified >> >>> Core Count: 6 >> >>> Core Enabled: 6 >> >>> Thread Count: 6 >> >>> Characteristics: >> >>> 64-bit capable >> >>> >> >>> Handle 0x005A, DMI type 4, 40 bytes >> >>> Processor Information >> >>> Socket Designation: Node 1 Socket 2 >> >>> Type: Central Processor >> >>> Family: Xeon MP >> >>> Manufacturer: Not Specified >> >>> ID: 00 00 00 00 00 00 00 00 >> >>> Signature: Type 0, Family 0, Model 0, Stepping 0 >> >>> Flags: None >> >>> Version: Not Specified >> >>> Voltage: 1.2 V >> >>> External Clock: 5866 MHz >> >>> Max Speed: 4400 MHz >> >>> Current Speed: Unknown >> >>> Status: Unpopulated >> >>> Upgrade: ZIF Socket >> >>> L1 Cache Handle: Not Provided >> >>> L2 Cache Handle: Not Provided >> >>> L3 Cache Handle: Not Provided >> >>> Serial Number: Not Specified >> >>> Asset Tag: Not Specified >> >>> Part Number: Not Specified >> >>> Characteristics: None >> >>> >> >>> >> >>> From: users [users-boun...@open-mpi.org >> >>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain >> >>> [r...@open-mpi.org <mailto:r...@open-mpi.org>] >> >>> Sent: Wednesday, June 24, 2015 6:06 AM >> >>> To: Open MPI Users >> >>> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = >> >>> crash >> >>> >> >>> I think trying with --mca btl ^sm makes a lot of sense and may solve the >> >>> problem. I also noted that we are having trouble with the topology of >> >>> several of the nodes - seeing only one socket, non-HT where you say we >> >>> should see two sockets and HT-enabled. 
In those cases, the locality is >> >>> "unknown" - given that those procs are on remote nodes from the one >> >>> being impacted, I don't think it should cause a problem. However, it >> >>> isn't correct, and that raises flags. >> >>> >> >>> My best guess of the root cause of that error is either we are getting >> >>> bad topology info on those nodes, or we have a bug that is mishandling >> >>> this scenario. It would probably be good to get this error fixed to >> >>> ensure it isn't the source of the eventual crash, even though I'm not >> >>> sure they are related. >> >>> >> >>> Bill: Can we examine one of the problem nodes? Let's pick csclprd3-0-1 >> >>> (or take another one from your list - just look for one where "locality" >> >>> is reported as "unknown" for the procs in the output map). Can you run >> >>> lstopo on that node and send us the output? In the above map, it is >> >>> reporting a single socket with 6 cores, non-HT. Is that what lstopo >> >>> shows when run on the node? Is it what you expected? >> >>> >> >>> >> >>> On Wed, Jun 24, 2015 at 4:07 AM, Gilles Gouaillardet >> >>> <gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> >> >>> wrote: >> >>> Bill, >> >>> >> >>> were you able to get a core file and analyze the stack with gdb ? >> >>> >> >>> I suspect the error occurs in mca_btl_sm_add_procs but this is just my >> >>> best guess. >> >>> if this is correct, can you check the value of >> >>> mca_btl_sm_component.num_smp_procs ? >> >>> >> >>> as a workaround, can you try >> >>> mpirun --mca btl ^sm ... >> >>> >> >>> I do not see how I can tackle the root cause without being able to >> >>> reproduce the issue :-( >> >>> >> >>> can you try to reproduce the issue with the smallest hostfile, and then >> >>> run lstopo on all the nodes ? >> >>> btw, you are not mixing 32 bits and 64 bits OS, are you ? >> >>> >> >>> Cheers, >> >>> >> >>> Gilles >> >>> >> >>> >> >>> >> >>> >> >>> mca_btl_sm_add_procs( >> >>> >> >>> >> >>> >> >>> int >> >>> >> >>> mca_btl_sm_add_procs >> >>> ( >> >>> On Wednesday, June 24, 2015, Lane, William <william.l...@cshs.org >> >>> <mailto:william.l...@cshs.org>> wrote: >> >>> Gilles, >> >>> >> >>> All the blades only have two core Xeons (without hyperthreading) >> >>> populating both their sockets. All >> >>> the x3550 nodes have hyperthreading capable Xeons and Sandybridge server >> >>> CPU's. It's possible >> >>> hyperthreading has been disabled on some of these nodes though. The >> >>> 3-0-n nodes are all IBM x3550 >> >>> nodes while the 3-6-n nodes are all blade nodes. >> >>> >> >>> I have run this exact same test code successfully in the past on another >> >>> cluster (~200 nodes of Sunfire X2100 >> >>> 2x dual-core Opterons) w/no issues on upwards of 390 slots. I even >> >>> tested it recently on OpenMPI 1.8.5 >> >>> on another smaller R&D cluster consisting of 10 Sunfire X2100 nodes (w/2 >> >>> dual core Opterons apiece). >> >>> On this particular cluster I've had success running this code on < 132 >> >>> slots. >> >>> >> >>> Anyway, here's the results of the following mpirun: >> >>> >> >>> mpirun -np 132 -display-devel-map --prefix /hpc/apps/mpi/openmpi/1.8.6/ >> >>> --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes >> >>> --bind-to core /hpc/home/lanew/mpi/openmpi/ProcessColors3 >> out.txt 2>&1 >> >>> >> >>> -------------------------------------------------------------------------- >> >>> WARNING: a request was made to bind a process. 
While the system >> >>> supports binding the process itself, at least one node does NOT >> >>> support binding memory to the process location. >> >>> >> >>> Node: csclprd3-6-1 >> >>> >> >>> This usually is due to not having the required NUMA support installed >> >>> on the node. In some Linux distributions, the required support is >> >>> contained in the libnumactl and libnumactl-devel packages. >> >>> This is a warning only; your job will continue, though performance may >> >>> be degraded. >> >>> -------------------------------------------------------------------------- >> >>> Data for JOB [51718,1] offset 0 >> >>> >> >>> Mapper requested: NULL Last mapper: round_robin Mapping policy: >> >>> BYSOCKET Ranking policy: SLOT >> >>> Binding policy: CORE Cpu set: NULL PPR: NULL Cpus-per-rank: 1 >> >>> Num new daemons: 0 New daemon starting vpid INVALID >> >>> Num nodes: 15 >> >>> >> >>> Data for node: csclprd3-6-1 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],1] Daemon launched: True >> >>> Num slots: 4 Slots in use: 4 Oversubscribed: FALSE >> >>> Num slots allocated: 4 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 4 Next node_rank: 4 >> >>> Data for proc: [[51718,1],0] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 0 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B][./.] >> >>> Binding: [B/.][./.] >> >>> Data for proc: [[51718,1],1] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 1 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./.][B/B] >> >>> Binding: [./.][B/.] >> >>> Data for proc: [[51718,1],2] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 2 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B][./.] >> >>> Binding: [./B][./.] >> >>> Data for proc: [[51718,1],3] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 3 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./.][B/B] >> >>> Binding: [./.][./B] >> >>> >> >>> Data for node: csclprd3-6-5 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],2] Daemon launched: True >> >>> Num slots: 4 Slots in use: 4 Oversubscribed: FALSE >> >>> Num slots allocated: 4 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 4 Next node_rank: 4 >> >>> Data for proc: [[51718,1],4] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 4 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B][./.] >> >>> Binding: [B/.][./.] >> >>> Data for proc: [[51718,1],5] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 5 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./.][B/B] >> >>> Binding: [./.][B/.] >> >>> Data for proc: [[51718,1],6] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 6 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B][./.] >> >>> Binding: [./B][./.] >> >>> Data for proc: [[51718,1],7] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 7 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./.][B/B] >> >>> Binding: [./.][./B] >> >>> >> >>> Data for node: csclprd3-0-0 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],3] Daemon launched: True >> >>> Num slots: 12 Slots in use: 12 Oversubscribed: FALSE >> >>> Num slots allocated: 12 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 12 Next node_rank: 12 >> >>> Data for proc: [[51718,1],8] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 8 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B/B/B/B/B][./././././.] >> >>> Binding: [B/././././.][./././././.] 
>> >>> Data for proc: [[51718,1],9] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 9 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./././././.][B/B/B/B/B/B] >> >>> Binding: [./././././.][B/././././.] >> >>> Data for proc: [[51718,1],10] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 10 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B/B/B/B/B][./././././.] >> >>> Binding: [./B/./././.][./././././.] >> >>> Data for proc: [[51718,1],11] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 11 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./././././.][B/B/B/B/B/B] >> >>> Binding: [./././././.][./B/./././.] >> >>> Data for proc: [[51718,1],12] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 12 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B/B/B/B/B][./././././.] >> >>> Binding: [././B/././.][./././././.] >> >>> Data for proc: [[51718,1],13] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 13 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./././././.][B/B/B/B/B/B] >> >>> Binding: [./././././.][././B/././.] >> >>> Data for proc: [[51718,1],14] >> >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 14 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B/B/B/B/B][./././././.] >> >>> Binding: [./././B/./.][./././././.] >> >>> Data for proc: [[51718,1],15] >> >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 15 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./././././.][B/B/B/B/B/B] >> >>> Binding: [./././././.][./././B/./.] >> >>> Data for proc: [[51718,1],16] >> >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 16 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B/B/B/B/B][./././././.] >> >>> Binding: [././././B/.][./././././.] >> >>> Data for proc: [[51718,1],17] >> >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 17 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./././././.][B/B/B/B/B/B] >> >>> Binding: [./././././.][././././B/.] >> >>> Data for proc: [[51718,1],18] >> >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 18 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [B/B/B/B/B/B][./././././.] >> >>> Binding: [./././././B][./././././.] >> >>> Data for proc: [[51718,1],19] >> >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 19 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [./././././.][B/B/B/B/B/B] >> >>> Binding: [./././././.][./././././B] >> >>> >> >>> Data for node: csclprd3-0-1 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],4] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],20] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 20 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [B/././././.] >> >>> Data for proc: [[51718,1],21] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 21 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./B/./././.] >> >>> Data for proc: [[51718,1],22] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 22 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././B/././.] >> >>> Data for proc: [[51718,1],23] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 23 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././B/./.] 
>> >>> Data for proc: [[51718,1],24] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 24 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././././B/.] >> >>> Data for proc: [[51718,1],25] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 25 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././././B] >> >>> >> >>> Data for node: csclprd3-0-2 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],5] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],26] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 26 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [B/././././.] >> >>> Data for proc: [[51718,1],27] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 27 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./B/./././.] >> >>> Data for proc: [[51718,1],28] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 28 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././B/././.] >> >>> Data for proc: [[51718,1],29] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 29 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././B/./.] >> >>> Data for proc: [[51718,1],30] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 30 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././././B/.] >> >>> Data for proc: [[51718,1],31] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 31 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././././B] >> >>> >> >>> Data for node: csclprd3-0-3 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],6] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],32] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 32 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [B/././././.] >> >>> Data for proc: [[51718,1],33] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 33 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./B/./././.] >> >>> Data for proc: [[51718,1],34] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 34 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././B/././.] >> >>> Data for proc: [[51718,1],35] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 35 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././B/./.] >> >>> Data for proc: [[51718,1],36] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 36 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././././B/.] 
>> >>> Data for proc: [[51718,1],37] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 37 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././././B] >> >>> >> >>> Data for node: csclprd3-0-4 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],7] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],38] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 38 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [B/././././.] >> >>> Data for proc: [[51718,1],39] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 39 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./B/./././.] >> >>> Data for proc: [[51718,1],40] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 40 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././B/././.] >> >>> Data for proc: [[51718,1],41] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 41 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././B/./.] >> >>> Data for proc: [[51718,1],42] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 42 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././././B/.] >> >>> Data for proc: [[51718,1],43] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 43 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././././B] >> >>> >> >>> Data for node: csclprd3-0-5 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],8] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],44] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 44 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [B/././././.] >> >>> Data for proc: [[51718,1],45] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 45 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./B/./././.] >> >>> Data for proc: [[51718,1],46] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 46 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././B/././.] >> >>> Data for proc: [[51718,1],47] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 47 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././B/./.] >> >>> Data for proc: [[51718,1],48] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 48 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././././B/.] >> >>> Data for proc: [[51718,1],49] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 49 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././././B] >> >>> >> >>> Data for node: csclprd3-0-6 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],9] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],50] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 50 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [B/././././.] 
>> >>> Data for proc: [[51718,1],51] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 51 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./B/./././.] >> >>> Data for proc: [[51718,1],52] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 52 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././B/././.] >> >>> Data for proc: [[51718,1],53] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 53 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././B/./.] >> >>> Data for proc: [[51718,1],54] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 54 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [././././B/.] >> >>> Data for proc: [[51718,1],55] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 55 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [./././././B] >> >>> >> >>> Data for node: csclprd3-0-7 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],10] Daemon launched: True >> >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE >> >>> Num slots allocated: 16 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 16 Next node_rank: 16 >> >>> Data for proc: [[51718,1],56] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 56 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [BB/../../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],57] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 57 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][BB/../../../../../../..] >> >>> Data for proc: [[51718,1],58] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 58 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../BB/../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],59] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 59 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../BB/../../../../../..] >> >>> Data for proc: [[51718,1],60] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 60 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../BB/../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],61] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 61 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../BB/../../../../..] >> >>> Data for proc: [[51718,1],62] >> >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 62 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../BB/../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],63] >> >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 63 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../BB/../../../..] >> >>> Data for proc: [[51718,1],64] >> >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 64 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
>> >>> Binding: [../../../../BB/../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],65] >> >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 65 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../BB/../../..] >> >>> Data for proc: [[51718,1],66] >> >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 66 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../BB/../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],67] >> >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 67 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../BB/../..] >> >>> Data for proc: [[51718,1],68] >> >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 68 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../BB/..][../../../../../../../..] >> >>> Data for proc: [[51718,1],69] >> >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 69 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../BB/..] >> >>> Data for proc: [[51718,1],70] >> >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 70 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../../BB][../../../../../../../..] >> >>> Data for proc: [[51718,1],71] >> >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 71 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../../BB] >> >>> >> >>> Data for node: csclprd3-0-8 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],11] Daemon launched: True >> >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE >> >>> Num slots allocated: 16 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 16 Next node_rank: 16 >> >>> Data for proc: [[51718,1],72] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 72 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [BB/../../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],73] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 73 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][BB/../../../../../../..] >> >>> Data for proc: [[51718,1],74] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 74 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../BB/../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],75] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 75 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../BB/../../../../../..] >> >>> Data for proc: [[51718,1],76] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 76 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../BB/../../../../..][../../../../../../../..] 
>> >>> Data for proc: [[51718,1],77] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 77 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../BB/../../../../..] >> >>> Data for proc: [[51718,1],78] >> >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 78 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../BB/../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],79] >> >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 79 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../BB/../../../..] >> >>> Data for proc: [[51718,1],80] >> >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 80 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../BB/../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],81] >> >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 81 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../BB/../../..] >> >>> Data for proc: [[51718,1],82] >> >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 82 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../BB/../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],83] >> >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 83 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../BB/../..] >> >>> Data for proc: [[51718,1],84] >> >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 84 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../BB/..][../../../../../../../..] >> >>> Data for proc: [[51718,1],85] >> >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 85 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../BB/..] >> >>> Data for proc: [[51718,1],86] >> >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 86 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../../BB][../../../../../../../..] >> >>> Data for proc: [[51718,1],87] >> >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 87 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../../BB] >> >>> >> >>> Data for node: csclprd3-0-10 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],12] Daemon launched: True >> >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE >> >>> Num slots allocated: 16 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 16 Next node_rank: 16 >> >>> Data for proc: [[51718,1],88] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 88 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [BB/../../../../../../..][../../../../../../../..] 
>> >>> Data for proc: [[51718,1],89] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 89 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][BB/../../../../../../..] >> >>> Data for proc: [[51718,1],90] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 90 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../BB/../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],91] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 91 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../BB/../../../../../..] >> >>> Data for proc: [[51718,1],92] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 92 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../BB/../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],93] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 93 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../BB/../../../../..] >> >>> Data for proc: [[51718,1],94] >> >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 94 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../BB/../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],95] >> >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 95 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../BB/../../../..] >> >>> Data for proc: [[51718,1],96] >> >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 96 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../BB/../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],97] >> >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 97 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../BB/../../..] >> >>> Data for proc: [[51718,1],98] >> >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 98 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../BB/../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],99] >> >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 99 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../BB/../..] >> >>> Data for proc: [[51718,1],100] >> >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 100 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../BB/..][../../../../../../../..] >> >>> Data for proc: [[51718,1],101] >> >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 101 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../BB/..] 
>> >>> Data for proc: [[51718,1],102] >> >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 102 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../../BB][../../../../../../../..] >> >>> Data for proc: [[51718,1],103] >> >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 103 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../../BB] >> >>> >> >>> Data for node: csclprd3-0-11 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],13] Daemon launched: True >> >>> Num slots: 16 Slots in use: 16 Oversubscribed: FALSE >> >>> Num slots allocated: 16 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 16 Next node_rank: 16 >> >>> Data for proc: [[51718,1],104] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 104 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [BB/../../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],105] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 105 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][BB/../../../../../../..] >> >>> Data for proc: [[51718,1],106] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 106 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../BB/../../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],107] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 107 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../BB/../../../../../..] >> >>> Data for proc: [[51718,1],108] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 108 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../BB/../../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],109] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 109 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../BB/../../../../..] >> >>> Data for proc: [[51718,1],110] >> >>> Pid: 0 Local rank: 6 Node rank: 6 App rank: 110 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../BB/../../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],111] >> >>> Pid: 0 Local rank: 7 Node rank: 7 App rank: 111 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../BB/../../../..] >> >>> Data for proc: [[51718,1],112] >> >>> Pid: 0 Local rank: 8 Node rank: 8 App rank: 112 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../BB/../../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],113] >> >>> Pid: 0 Local rank: 9 Node rank: 9 App rank: 113 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../BB/../../..] 
>> >>> Data for proc: [[51718,1],114] >> >>> Pid: 0 Local rank: 10 Node rank: 10 App rank: 114 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../BB/../..][../../../../../../../..] >> >>> Data for proc: [[51718,1],115] >> >>> Pid: 0 Local rank: 11 Node rank: 11 App rank: 115 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../BB/../..] >> >>> Data for proc: [[51718,1],116] >> >>> Pid: 0 Local rank: 12 Node rank: 12 App rank: 116 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../BB/..][../../../../../../../..] >> >>> Data for proc: [[51718,1],117] >> >>> Pid: 0 Local rank: 13 Node rank: 13 App rank: 117 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../BB/..] >> >>> Data for proc: [[51718,1],118] >> >>> Pid: 0 Local rank: 14 Node rank: 14 App rank: 118 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> Binding: [../../../../../../../BB][../../../../../../../..] >> >>> Data for proc: [[51718,1],119] >> >>> Pid: 0 Local rank: 15 Node rank: 15 App rank: 119 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../../../..][../../../../../../../BB] >> >>> >> >>> Data for node: csclprd3-0-12 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],14] Daemon launched: True >> >>> Num slots: 6 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 6 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],120] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 120 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [BB/../../../../..] >> >>> Data for proc: [[51718,1],121] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 121 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [../BB/../../../..] >> >>> Data for proc: [[51718,1],122] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 122 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [../../BB/../../..] >> >>> Data for proc: [[51718,1],123] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 123 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [../../../BB/../..] >> >>> Data for proc: [[51718,1],124] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 124 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [../../../../BB/..] >> >>> Data for proc: [[51718,1],125] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 125 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: UNKNOWN >> >>> Binding: [../../../../../BB] >> >>> >> >>> Data for node: csclprd3-0-13 Launch id: -1 State: 0 >> >>> Daemon: [[51718,0],15] Daemon launched: True >> >>> Num slots: 12 Slots in use: 6 Oversubscribed: FALSE >> >>> Num slots allocated: 12 Max slots: 0 >> >>> Username on node: NULL >> >>> Num procs: 6 Next node_rank: 6 >> >>> Data for proc: [[51718,1],126] >> >>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 126 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] 
>> >>> Binding: [BB/../../../../..][../../../../../..] >> >>> Data for proc: [[51718,1],127] >> >>> Pid: 0 Local rank: 1 Node rank: 1 App rank: 127 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../..][BB/../../../../..] >> >>> Data for proc: [[51718,1],128] >> >>> Pid: 0 Local rank: 2 Node rank: 2 App rank: 128 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] >> >>> Binding: [../BB/../../../..][../../../../../..] >> >>> Data for proc: [[51718,1],129] >> >>> Pid: 0 Local rank: 3 Node rank: 3 App rank: 129 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../..][../BB/../../../..] >> >>> Data for proc: [[51718,1],130] >> >>> Pid: 0 Local rank: 4 Node rank: 4 App rank: 130 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [BB/BB/BB/BB/BB/BB][../../../../../..] >> >>> Binding: [../../BB/../../..][../../../../../..] >> >>> Data for proc: [[51718,1],131] >> >>> Pid: 0 Local rank: 5 Node rank: 5 App rank: 131 >> >>> State: INITIALIZED App_context: 0 >> >>> Locale: [../../../../../..][BB/BB/BB/BB/BB/BB] >> >>> Binding: [../../../../../..][../../BB/../../..] >> >>> [csclprd3-0-13:31619] *** Process received signal *** >> >>> [csclprd3-0-13:31619] Signal: Bus error (7) >> >>> [csclprd3-0-13:31619] Signal code: Non-existant physical address (2) >> >>> [csclprd3-0-13:31619] Failing at address: 0x7f1374267a00 >> >>> [csclprd3-0-13:31620] *** Process received signal *** >> >>> [csclprd3-0-13:31620] Signal: Bus error (7) >> >>> [csclprd3-0-13:31620] Signal code: Non-existant physical address (2) >> >>> [csclprd3-0-13:31620] Failing at address: 0x7fcc702a7980 >> >>> [csclprd3-0-13:31615] *** Process received signal *** >> >>> [csclprd3-0-13:31615] Signal: Bus error (7) >> >>> [csclprd3-0-13:31615] Signal code: Non-existant physical address (2) >> >>> [csclprd3-0-13:31615] Failing at address: 0x7f8128367880 >> >>> [csclprd3-0-13:31616] *** Process received signal *** >> >>> [csclprd3-0-13:31616] Signal: Bus error (7) >> >>> [csclprd3-0-13:31616] Signal code: Non-existant physical address (2) >> >>> [csclprd3-0-13:31616] Failing at address: 0x7fe674227a00 >> >>> [csclprd3-0-13:31617] *** Process received signal *** >> >>> [csclprd3-0-13:31617] Signal: Bus error (7) >> >>> [csclprd3-0-13:31617] Signal code: Non-existant physical address (2) >> >>> [csclprd3-0-13:31617] Failing at address: 0x7f061c32db80 >> >>> [csclprd3-0-13:31618] *** Process received signal *** >> >>> [csclprd3-0-13:31618] Signal: Bus error (7) >> >>> [csclprd3-0-13:31618] Signal code: Non-existant physical address (2) >> >>> [csclprd3-0-13:31618] Failing at address: 0x7fb8402eaa80 >> >>> [csclprd3-0-13:31618] [ 0] >> >>> /lib64/libpthread.so.0(+0xf500)[0x7fb851851500] >> >>> [csclprd3-0-13:31618] [ 1] [csclprd3-0-13:31616] [ 0] >> >>> /lib64/libpthread.so.0(+0xf500)[0x7fe6843a4500] >> >>> [csclprd3-0-13:31616] [ 1] [csclprd3-0-13:31620] [ 0] >> >>> /lib64/libpthread.so.0(+0xf500)[0x7fcc80c54500] >> >>> [csclprd3-0-13:31620] [ 1] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fcc80fc9f61] >> >>> [csclprd3-0-13:31620] [ 2] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fcc80fca047] >> >>> [csclprd3-0-13:31620] [ 3] [csclprd3-0-13:31615] [ 0] >> >>> /lib64/libpthread.so.0(+0xf500)[0x7f81385ca500] >> >>> [csclprd3-0-13:31615] [ 1] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f813893ff61] 
>> >>> [csclprd3-0-13:31615] [ 2] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f8138940047] >> >>> [csclprd3-0-13:31615] [ 3] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fb851bc6f61] >> >>> [csclprd3-0-13:31618] [ 2] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fb851bc7047] >> >>> [csclprd3-0-13:31618] [ 3] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fb851ab4670] >> >>> [csclprd3-0-13:31618] [ 4] [csclprd3-0-13:31617] [ 0] >> >>> /lib64/libpthread.so.0(+0xf500)[0x7f062cfe5500] >> >>> [csclprd3-0-13:31617] [ 1] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f062d35af61] >> >>> [csclprd3-0-13:31617] [ 2] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f062d35b047] >> >>> [csclprd3-0-13:31617] [ 3] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f062d248670] >> >>> [csclprd3-0-13:31617] [ 4] [csclprd3-0-13:31619] [ 0] >> >>> /lib64/libpthread.so.0(+0xf500)[0x7f1384fde500] >> >>> [csclprd3-0-13:31619] [ 1] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7f1385353f61] >> >>> [csclprd3-0-13:31619] [ 2] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x167f61)[0x7fe684719f61] >> >>> [csclprd3-0-13:31616] [ 2] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7fe68471a047] >> >>> [csclprd3-0-13:31616] [ 3] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fe684607670] >> >>> [csclprd3-0-13:31616] [ 4] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x168047)[0x7f1385354047] >> >>> [csclprd3-0-13:31619] [ 3] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f1385241670] >> >>> [csclprd3-0-13:31619] [ 4] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f13852425ab] >> >>> [csclprd3-0-13:31619] [ 5] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f1385242751] >> >>> [csclprd3-0-13:31619] [ 6] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f13853501c9] >> >>> [csclprd3-0-13:31619] [ 7] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f1385336628] >> >>> [csclprd3-0-13:31619] [ 8] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7fcc80eb7670] >> >>> [csclprd3-0-13:31620] [ 4] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fcc80eb85ab] >> >>> [csclprd3-0-13:31620] [ 5] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fcc80eb8751] >> >>> [csclprd3-0-13:31620] [ 6] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fcc80fc61c9] >> >>> [csclprd3-0-13:31620] [ 7] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fcc80fac628] >> >>> [csclprd3-0-13:31620] [ 8] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fcc8111fd61] >> >>> [csclprd3-0-13:31620] [ 9] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x55670)[0x7f813882d670] >> >>> [csclprd3-0-13:31615] [ 4] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f813882e5ab] >> >>> [csclprd3-0-13:31615] [ 5] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f813882e751] >> >>> [csclprd3-0-13:31615] [ 6] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f813893c1c9] >> >>> [csclprd3-0-13:31615] [ 7] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f8138922628] >> >>> 
[csclprd3-0-13:31615] [ 8] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f8138a95d61] >> >>> [csclprd3-0-13:31615] [ 9] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f813885d747] >> >>> [csclprd3-0-13:31615] [10] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fb851ab55ab] >> >>> [csclprd3-0-13:31618] [ 5] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fb851ab5751] >> >>> [csclprd3-0-13:31618] [ 6] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fb851bc31c9] >> >>> [csclprd3-0-13:31618] [ 7] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fb851ba9628] >> >>> [csclprd3-0-13:31618] [ 8] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fb851d1cd61] >> >>> [csclprd3-0-13:31618] [ 9] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fb851ae4747] >> >>> [csclprd3-0-13:31618] [10] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7f062d2495ab] >> >>> [csclprd3-0-13:31617] [ 5] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7f062d249751] >> >>> [csclprd3-0-13:31617] [ 6] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7f062d3571c9] >> >>> [csclprd3-0-13:31617] [ 7] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7f062d33d628] >> >>> [csclprd3-0-13:31617] [ 8] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f062d4b0d61] >> >>> [csclprd3-0-13:31617] [ 9] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f062d278747] >> >>> [csclprd3-0-13:31617] [10] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_grow+0x3b9)[0x7fe6846085ab] >> >>> [csclprd3-0-13:31616] [ 5] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_free_list_resize_mt+0xfb)[0x7fe684608751] >> >>> [csclprd3-0-13:31616] [ 6] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_btl_sm_add_procs+0x671)[0x7fe6847161c9] >> >>> [csclprd3-0-13:31616] [ 7] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(+0x14a628)[0x7fe6846fc628] >> >>> [csclprd3-0-13:31616] [ 8] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7fe68486fd61] >> >>> [csclprd3-0-13:31616] [ 9] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fe684637747] >> >>> [csclprd3-0-13:31616] [10] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fe68467750b] >> >>> [csclprd3-0-13:31616] [11] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] >> >>> [csclprd3-0-13:31616] [12] >> >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe684021cdd] >> >>> [csclprd3-0-13:31616] [13] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] >> >>> [csclprd3-0-13:31616] *** End of error message *** >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f062d2b850b] >> >>> [csclprd3-0-13:31617] [11] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] >> >>> [csclprd3-0-13:31617] [12] >> >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f062cc62cdd] >> >>> [csclprd3-0-13:31617] [13] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] >> >>> [csclprd3-0-13:31617] *** End of error message *** >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(mca_pml_ob1_add_procs+0xff)[0x7f13854a9d61] >> >>> [csclprd3-0-13:31619] [ 9] >> >>> 
/hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7f1385271747] >> >>> [csclprd3-0-13:31619] [10] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f13852b150b] >> >>> [csclprd3-0-13:31619] [11] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] >> >>> [csclprd3-0-13:31619] [12] >> >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1384c5bcdd] >> >>> [csclprd3-0-13:31619] [13] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] >> >>> [csclprd3-0-13:31619] *** End of error message *** >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(ompi_mpi_init+0xbda)[0x7fcc80ee7747] >> >>> [csclprd3-0-13:31620] [10] >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fcc80f2750b] >> >>> [csclprd3-0-13:31620] [11] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] >> >>> [csclprd3-0-13:31620] [12] >> >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fcc808d1cdd] >> >>> [csclprd3-0-13:31620] [13] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] >> >>> [csclprd3-0-13:31620] *** End of error message *** >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7f813889d50b] >> >>> [csclprd3-0-13:31615] [11] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] >> >>> [csclprd3-0-13:31615] [12] >> >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f8138247cdd] >> >>> [csclprd3-0-13:31615] [13] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] >> >>> [csclprd3-0-13:31615] *** End of error message *** >> >>> /hpc/apps/mpi/openmpi/1.8.6/lib/libmpi.so.1(MPI_Init+0x185)[0x7fb851b2450b] >> >>> [csclprd3-0-13:31618] [11] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400ad0] >> >>> [csclprd3-0-13:31618] [12] >> >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fb8514cecdd] >> >>> [csclprd3-0-13:31618] [13] >> >>> /hpc/home/lanew/mpi/openmpi/ProcessColors3[0x400999] >> >>> [csclprd3-0-13:31618] *** End of error message *** >> >>> -------------------------------------------------------------------------- >> >>> mpirun noticed that process rank 126 with PID 0 on node csclprd3-0-13 >> >>> exited on signal 7 (Bus error). >> >>> -------------------------------------------------------------------------- >> >>> >> >>> From: users [users-boun...@open-mpi.org >> >>> <mailto:users-boun...@open-mpi.org>] on behalf of Ralph Castain >> >>> [r...@open-mpi.org <mailto:r...@open-mpi.org>] >> >>> Sent: Tuesday, June 23, 2015 6:20 PM >> >>> To: Open MPI Users >> >>> Subject: Re: [OMPI users] OpenMPI 1.8.6, CentOS 6.3, too many slots = >> >>> crash >> >>> >> >>> Wow - that is one sick puppy! I see that some nodes are reporting >> >>> not-bound for their procs, and the rest are binding to socket (as they >> >>> should). Some of your nodes clearly do not have hyper threads enabled >> >>> (or only have single-thread cores on them), and have 2 cores/socket. >> >>> Other nodes have 8 cores/socket with hyper threads enabled, while still >> >>> others have 6 cores/socket and HT enabled. >> >>> >> >>> I don't see anyone binding to a single HT if multiple HTs/core are >> >>> available. I think you are being fooled by those nodes that either don't >> >>> have HT enabled, or have only 1 HT/core. >> >>> >> >>> In both cases, it is node 13 that is the node that fails. I also note >> >>> that you said everything works okay with < 132 ranks, and node 13 hosts >> >>> ranks 127-131. So node 13 would host ranks even if you reduced the >> >>> number in the job to 131. 
This would imply that it probably isn't >> >>> something wrong with the node itself.
>> >>>
>> >>> Is there any way you could run a job of this size on a homogeneous cluster? The procs all show bindings that look right, but I'm wondering if the heterogeneity is the source of the trouble here. We may be communicating the binding pattern incorrectly and giving bad info to the backend daemon.
>> >>>
>> >>> Also, rather than --report-bindings, use "--display-devel-map" on the command line and let's see what the mapper thinks it did. If there is a problem with placement, that is where it would exist.
>> >>>
>> >>> On Tue, Jun 23, 2015 at 5:12 PM, Lane, William <william.l...@cshs.org <mailto:william.l...@cshs.org>> wrote:
>> >>> Ralph,
>> >>>
>> >>> There is something funny going on: the traces from the runs w/the debug build aren't showing any differences from what I got earlier. However, I did do a run w/the --bind-to core switch and was surprised to see that hyperthreading cores were sometimes being used.
>> >>>
>> >>> Here are the traces that I have:
>> >>>
>> >>> mpirun -np 132 -report-bindings --prefix /hpc/apps/mpi/openmpi/1.8.6/ --hostfile hostfile-noslots --mca btl_tcp_if_include eth0 --hetero-nodes /hpc/home/lanew/mpi/openmpi/ProcessColors3
>> >>> [csclprd3-0-5:16802] MCW rank 44 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-5:16802] MCW rank 45 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-5:16802] MCW rank 46 is not bound (or bound to all available processors)
>> >>> [csclprd3-6-5:12480] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.]
>> >>> [csclprd3-6-5:12480] MCW rank 5 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B]
>> >>> [csclprd3-6-5:12480] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.]
>> >>> [csclprd3-6-5:12480] MCW rank 7 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B]
>> >>> [csclprd3-0-5:16802] MCW rank 47 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-5:16802] MCW rank 48 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-5:16802] MCW rank 49 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-1:14318] MCW rank 22 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-1:14318] MCW rank 23 is not bound (or bound to all available processors)
>> >>> [csclprd3-0-1:14318] MCW rank 24 is not bound (or bound to all available processors)
>> >>> [csclprd3-6-1:24682] MCW rank 3 bound to socket 1[core 2[hwt 0]], socket 1[core 3[hwt 0]]: [./.][B/B]
>> >>> [csclprd3-6-1:24682] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.] 
>> >>> [csclprd3-0-1:14318] MCW rank 25 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-1:14318] MCW rank 20 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-3:13827] MCW rank 34 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-1:14318] MCW rank 21 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-3:13827] MCW rank 35 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-6-1:24682] MCW rank 1 bound to socket 1[core 2[hwt 0]], socket >> >>> 1[core 3[hwt 0]]: [./.][B/B] >> >>> [csclprd3-0-3:13827] MCW rank 36 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-6-1:24682] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket >> >>> 0[core 1[hwt 0]]: [B/B][./.] >> >>> [csclprd3-0-6:30371] MCW rank 51 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-6:30371] MCW rank 52 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-6:30371] MCW rank 53 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-2:05825] MCW rank 30 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-6:30371] MCW rank 54 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-3:13827] MCW rank 37 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-2:05825] MCW rank 31 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-3:13827] MCW rank 32 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-6:30371] MCW rank 55 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-3:13827] MCW rank 33 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-6:30371] MCW rank 50 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-2:05825] MCW rank 26 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-2:05825] MCW rank 27 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-2:05825] MCW rank 28 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-2:05825] MCW rank 29 is not bound (or bound to all available >> >>> processors) >> >>> [csclprd3-0-12:12383] MCW rank 121 is not bound (or bound to all >> >>> available processors) >> >>> [csclprd3-0-12:12383] MCW rank 122 is not bound (or bound to all >> >>> available processors) >> >>> [csclprd3-0-12:12383] MCW rank 123 is not bound (or bound to all >> >>> available processors) >> >>> [csclprd3-0-12:12383] MCW rank 124 is not bound (or bound to all >> >>> available processors) >> >>> [csclprd3-0-12:12383] MCW rank 125 is not bound (or bound to all >> >>> available processors) >> >>> [csclprd3-0-12:12383] MCW rank 120 is not bound (or bound to all >> >>> available processors) >> >>> [csclprd3-0-0:31079] MCW rank 13 bound to socket 1[core 6[hwt 0]], >> >>> socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt >> >>> 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: >> >>> [./././././.][B/B/B/B/B/B] >> >>> [csclprd3-0-0:31079] MCW rank 14 bound to socket 0[core 0[hwt 0]], >> >>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt >> >>> 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: >> >>> [B/B/B/B/B/B][./././././.] 
>> >>> [csclprd3-0-0:31079] MCW rank 15 bound to socket 1[core 6[hwt 0]], >> >>> socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt >> >>> 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: >> >>> [./././././.][B/B/B/B/B/B] >> >>> [csclprd3-0-0:31079] MCW rank 16 bound to socket 0[core 0[hwt 0]], >> >>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt >> >>> 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: >> >>> [B/B/B/B/B/B][./././././.] >> >>> [csclprd3-0-7:20515] MCW rank 68 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-10:19096] MCW rank 100 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-7:20515] MCW rank 69 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-10:19096] MCW rank 101 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-0:31079] MCW rank 17 bound to socket 1[core 6[hwt 0]], >> >>> socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt >> >>> 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: >> >>> [./././././.][B/B/B/B/B/B] >> >>> [csclprd3-0-7:20515] MCW rank 70 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-10:19096] MCW rank 102 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-11:31636] MCW rank 116 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
>> >>> [csclprd3-0-11:31636] MCW rank 117 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-0:31079] MCW rank 18 bound to socket 0[core 0[hwt 0]], >> >>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt >> >>> 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: >> >>> [B/B/B/B/B/B][./././././.] >> >>> [csclprd3-0-11:31636] MCW rank 118 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-0:31079] MCW rank 19 bound to socket 1[core 6[hwt 0]], >> >>> socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt >> >>> 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: >> >>> [./././././.][B/B/B/B/B/B] >> >>> [csclprd3-0-7:20515] MCW rank 71 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-10:19096] MCW rank 103 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-0:31079] MCW rank 8 bound to socket 0[core 0[hwt 0]], socket >> >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], >> >>> socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: >> >>> [B/B/B/B/B/B][./././././.] >> >>> [csclprd3-0-0:31079] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket >> >>> 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], >> >>> socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: >> >>> [./././././.][B/B/B/B/B/B] >> >>> [csclprd3-0-10:19096] MCW rank 88 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-11:31636] MCW rank 119 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-7:20515] MCW rank 56 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] 
>> >>> [csclprd3-0-0:31079] MCW rank 10 bound to socket 0[core 0[hwt 0]], >> >>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt >> >>> 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: >> >>> [B/B/B/B/B/B][./././././.] >> >>> [csclprd3-0-7:20515] MCW rank 57 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-10:19096] MCW rank 89 bound to socket 1[core 8[hwt 0-1]], >> >>> socket 1[core 9[hwt 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core >> >>> 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], >> >>> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: >> >>> [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB] >> >>> [csclprd3-0-11:31636] MCW rank 104 bound to socket 0[core 0[hwt 0-1]], >> >>> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core >> >>> 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], >> >>> socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: >> >>> [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..] >> >>> [csclprd3-0-0:31079] MCW rank 11 bound to socket 1[core 6[hwt 0]], >> >>> socket 1[core 7[hwt 0]], socket 1[core 8[hwt 0]], socket 1[core 9[hwt >> >>> 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: >> >>> [./././././.][B/B/B/B/B/B] >> >>> [csclprd3-0-0:31079] MCW rank 12 bound to socket 0[core 0[hwt 0]], >> >>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt >> >>> 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: >> >>> [B/B/B/B/B/B][./././././.] >> >>> [csclprd3-0-4:30348] MCW rank 42 is not bound (or bound to all
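
The mix of nodes with and without hyperthreading described above is easy to misread from binding masks alone. One way to see at a glance which nodes expose more than one hardware thread per core is a small hwloc probe along the lines of the sketch below. This is only an illustration, not part of the original thread: the file name ht_probe.c is invented, and it assumes the hwloc headers and library are installed on the compute nodes.

/* ht_probe.c -- hypothetical helper (not from the thread).
 * Counts cores vs. processing units (PUs) with hwloc on the local node;
 * more PUs than cores means hyperthreading is active.
 * Build sketch (assuming hwloc is installed):
 *   gcc ht_probe.c -o ht_probe -lhwloc
 */
#include <stdio.h>
#include <unistd.h>
#include <hwloc.h>

int main(void)
{
    char host[256];
    hwloc_topology_t topo;
    int nsock, ncores, npus;

    gethostname(host, sizeof(host));

    /* Discover the local hardware topology. */
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    nsock  = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET);
    ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    npus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);

    printf("%s: %d socket(s), %d core(s), %d PU(s) -> HT %s\n",
           host, nsock, ncores, npus,
           (npus > ncores) ? "enabled" : "disabled or absent");

    hwloc_topology_destroy(topo);
    return 0;
}

Running one copy per node (for example with mpirun --map-by node, or through the batch scheduler) and comparing the core and PU counts would show which of the node types have hyperthreading turned on and which do not.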
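Independently of what mpirun itself reports, the binding each rank actually received can be checked from inside the job. The following is a minimal, Linux-only sketch; it is not the ProcessColors3 test program from the thread, and the file name affinity_check.c is invented. Each rank asks the kernel for its CPU affinity mask via sched_getaffinity and prints the OS processor numbers it may run on, which can be compared against the --report-bindings and --display-devel-map output quoted above.

/* affinity_check.c -- hypothetical cross-check (not from the thread).
 * Each MPI rank prints the set of OS CPUs it is confined to, so the
 * effect of --bind-to core can be verified from inside the job.
 * Build sketch:
 *   mpicc affinity_check.c -o affinity_check
 */
#define _GNU_SOURCE
#include <mpi.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, len, cpu;
    char host[MPI_MAX_PROCESSOR_NAME];
    cpu_set_t mask;
    char cpus[8192] = "";

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    /* Ask the kernel which CPUs this process is allowed to run on. */
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
            if (CPU_ISSET(cpu, &mask)) {
                char buf[16];
                snprintf(buf, sizeof(buf), "%d ", cpu);
                strcat(cpus, buf);
            }
        }
        printf("rank %d on %s bound to OS cpus: %s\n", rank, host, cpus);
    }

    MPI_Finalize();
    return 0;
}

Launched with the same switches used in the thread (for example, mpirun -np 132 --hostfile hostfile-noslots --hetero-nodes --bind-to core ./affinity_check), this would show directly whether any rank ends up sharing a core's hyperthread siblings on the nodes where HT is enabled.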