Added this info to the ticket, and added you to it as well.

Thanks again
Ralph


On Dec 25, 2013, at 3:42 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi Ralph,
> 
> Thank you for your reply. After that, I found a more reasonable fix,
> I guess. I moved the OBJ_CONSTRUCT for opal_tree_item_t out of the debug
> part in opal_tree_construct, as shown below:
> 
> static void opal_tree_construct(opal_tree_t *tree)
> {
>    OBJ_CONSTRUCT( &(tree->opal_tree_sentinel), opal_tree_item_t ); /* tmishima */
> #if OPAL_ENABLE_DEBUG
>    /* These refcounts should never be used in assertions because they
>       should never be removed from this list, added to another list,
>       etc.  So set them to sentinel values. */
> 
>    tree->opal_tree_sentinel.opal_tree_item_refcount  = 1;
>    tree->opal_tree_sentinel.opal_tree_item_belong_to = tree;
> #endif
>    tree->opal_tree_sentinel.opal_tree_container = tree;
>    tree->opal_tree_sentinel.opal_tree_parent = &tree->opal_tree_sentinel;
>    tree->opal_tree_sentinel.opal_tree_num_ancestors = -1;
> 
>    tree->opal_tree_sentinel.opal_tree_next_sibling =
>        &tree->opal_tree_sentinel;
>    tree->opal_tree_sentinel.opal_tree_prev_sibling =
>        &tree->opal_tree_sentinel;
> 
>    tree->opal_tree_sentinel.opal_tree_first_child = &tree->opal_tree_sentinel;
>    tree->opal_tree_sentinel.opal_tree_last_child = &tree->opal_tree_sentinel;
> 
>    tree->opal_tree_num_items = 0;
>    tree->comp = NULL;
>    tree->serialize = NULL;
>    tree->deserialize = NULL;
>    tree->get_key = NULL;
> }
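> 
> To illustrate why the placement matters, here is a minimal standalone
> sketch of the failure mode (my own illustration, not the actual Open MPI
> source; the struct, macro and field names are invented): if the sentinel
> item is only constructed under a debug macro, non-debug builds read
> indeterminate memory later on.
> 
> #include <stdio.h>
> 
> struct item { int num_children; };
> 
> static void item_construct(struct item *it)
> {
>     it->num_children = 0;            /* roughly what the item constructor does */
> }
> 
> static void tree_construct(struct item *sentinel)
> {
> #if ENABLE_DEBUG
>     item_construct(sentinel);        /* only runs in debug builds */
> #endif
>     /* other sentinel fields are set here ... */
> }
> 
> int main(void)
> {
>     struct item sentinel;            /* automatic storage: contents indeterminate */
>     tree_construct(&sentinel);
>     if (sentinel.num_children == 0)  /* valgrind: depends on uninitialised value */
>         puts("empty");
>     return 0;
> }
> 
> Moving the construction out of the #if block, as in the code above, makes
> the initialization unconditional regardless of compiler or build flags.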
> 
> In addition, I checked how lama handled the hierarchy inversion.
> It did not work on node04, which has the inversion, and it worked on
> node09, which has a normal one. Please forward this information to the
> lama developers.
> 
> Regards,
> Tetsuya Mishima
> 
> qsub: job 8380.manage.cluster completed
> [mishima@manage openmpi-1.7.4rc2r30069]$ qsub -I -l nodes=4:ppn=8
> qsub: waiting for job 8381.manage.cluster to start
> qsub: job 8381.manage.cluster ready
> 
> [mishima@node09 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> [mishima@node09 demos]$ mpirun -np 2 -report-bindings -mca rmaps lama -mca
> rmaps_lama_bind 1N -mca rmaps_lama_map Ncsbnh
> myprog
> [node09.cluster:20144] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
> [B/B/B/B][./././.]
> [node09.cluster:20144] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket
> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]:
> [./././.][B/B/B/B]
> Hello world from process 1 of 2
> Hello world from process 0 of 2
> [mishima@node09 demos]$
> 
> 
> qsub: job 8383.manage.cluster completed
> [mishima@manage openmpi-1.7.4rc2r30069]$ qsub -I -l nodes=1:ppn=32
> qsub: waiting for job 8384.manage.cluster to start
> qsub: job 8384.manage.cluster ready
> 
> [mishima@node04 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> [mishima@node04 demos]$ mpirun -np 2 -report-bindings -mca rmaps lama -mca
> rmaps_lama_bind 1N -mca rmaps_lama_map Ncsbnh
> myprog
> --------------------------------------------------------------------------
> RMaps LAMA detected that there are not enough resources to map the
> remainder of the job. Check the command line options, and the number of
> nodes allocated to this job.
> Application Context : 0
> # of Processes Successfully Mapped: 0
> # of Processes Requested          : 2
> Mapping  : Ncsbnh
> Binding  : 1N
> MPPR     : [Not Provided]
> Ordering : s
> --------------------------------------------------------------------------
> [node04.cluster:20298] [[21003,0],0] ORTE_ERROR_LOG: Error in file
> rmaps_lama_module.c at line 309
> 
> [node04.cluster:20298] [[21003,0],0] ORTE_ERROR_LOG: Error in file
> base/rmaps_base_map_job.c at line 217
> 
>> Deeply appreciate all your help! Your fix looks reasonable to me and is
>> the kind of difference we frequently see between compilers and
>> environments, which is why initializing variables is so
>> important. This one apparently slipped by the lama developers.
>> 
>> I'll apply to trunk and cmr it across to 1.7.4.
>> 
>> Thanks again
>> Ralph
>> 
>> On Dec 25, 2013, at 3:39 AM, tmish...@jcity.maeda.co.jp wrote:
>> 
>>> 
>>> 
>>> Hi Ralph,
>>> 
>>> I ran valgrind and found uninitialised-value errors. All of them
>>> occurred in opal_tree_add_child, as shown at the bottom. As a quick
>>> fix, I put one line in "opal_tree.c", although it's not elegant:
>>> 
>>> void opal_tree_init(opal_tree_t *tree, opal_tree_comp_fn_t comp,
>>>                   opal_tree_item_serialize_fn_t serialize,
>>>                   opal_tree_item_deserialize_fn_t deserialize,
>>>                   opal_tree_get_key_fn_t get_key)
>>> {
>>>   tree->comp = comp;
>>>   tree->serialize = serialize;
>>>   tree->deserialize = deserialize;
>>>   tree->get_key = get_key;
>>>   opal_tree_get_root(tree)->opal_tree_num_children = 0; /* added by tmishima */
>>> }
>>> 
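>>> For context, here is a hypothetical sketch (simplified types; this is not
>>> the actual opal_tree.c code) of the kind of check in the add-child path
>>> that would trip over the uninitialized count:
>>> 
>>> typedef struct titem {
>>>     struct titem *first_child;
>>>     struct titem *last_child;
>>>     struct titem *next_sibling;
>>>     int           num_children;
>>> } titem_t;
>>> 
>>> static void tree_add_child(titem_t *parent, titem_t *child)
>>> {
>>>     if (0 == parent->num_children) {   /* reads the never-initialized count:
>>>                                           the conditional valgrind flags */
>>>         parent->first_child = child;
>>>     } else {
>>>         parent->last_child->next_sibling = child;
>>>     }
>>>     parent->last_child = child;
>>>     parent->num_children++;
>>> }
>>> 
>>> With the root's count explicitly zeroed in opal_tree_init, that branch no
>>> longer depends on indeterminate memory.
>>> 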
>>> Then, these errors all disappeared and openmpi with lama worked fine.
>>> As I told you before, I built openmpi with PGI 13.10. As far as I
>>> checked, no error was detected by valgrind with openmpi built by the
>>> GNU compiler. Therefore, it might depend on the compiler...
>>> Anyway, I would like to ask you (or the openmpi team) to continue
>>> the investigation.
>>> 
>>> Regards,
>>> Tetsuya Mishima
>>> 
>>> valgrind -v --error-limit=no --leak-check=yes --show-reachable=no mpirun
>>> -np 1 -mca rmaps lama -report-bindings -mca rmaps_base_verbose 100
>>> --display-map ~/Desktop/openmpi-1.7/demos/myprog 2>&1 | tee valgrind.log
>>> 
>>> ....
>>> ==27313== Conditional jump or move depends on uninitialised value(s)
>>> ==27313==    at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
>>> ==27313==    by 0x81E3314: rmaps_lama_convert_hwloc_subtree
>>> (rmaps_lama_max_tree.c:320)
>>> ==27313==    by 0x81E321D: rmaps_lama_convert_hwloc_tree_to_opal_tree
>>> (rmaps_lama_max_tree.c:267)
>>> ==27313==    by 0x81E2EE8: rmaps_lama_build_max_tree
>>> (rmaps_lama_max_tree.c:154)
>>> ==27313==    by 0x81E0E58: orte_rmaps_lama_map_core
>>> (rmaps_lama_module.c:664)
>>> ==27313==    by 0x81E02D7: orte_rmaps_lama_map (rmaps_lama_module.c:303)
>>> ==27313==    by 0x4C6468B: orte_rmaps_base_map_job (rmaps_base_map_job.c:204)
>>> ==27313==    by 0x4F094CC: event_process_active_single_queue (event.c:1366)
>>> ==27313==    by 0x4F090D8: event_process_active (event.c:1434)
>>> ==27313==    by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
>>> ==27313==    by 0x4079A6: orterun (orterun.c:1049)
>>> ==27313==    by 0x40694A: main (main.c:13)
>>> .....
>>> ==27313== Conditional jump or move depends on uninitialised value(s)
>>> ==27313==    at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
>>> ==27313==    by 0x4EC5D0E: deserialize_add_tree_item (opal_tree.c:496)
>>> ==27313==    by 0x4EC5578: opal_tree_deserialize (opal_tree.c:524)
>>> ==27313==    by 0x4EC5609: opal_tree_dup (opal_tree.c:544)
>>> ==27313==    by 0x81E2FF6: rmaps_lama_build_max_tree
>>> (rmaps_lama_max_tree.c:202)
>>> ==27313==    by 0x81E0E58: orte_rmaps_lama_map_core
>>> (rmaps_lama_module.c:664)
>>> ==27313==    by 0x81E02D7: orte_rmaps_lama_map (rmaps_lama_module.c:303)
>>> ==27313==    by 0x4C6468B: orte_rmaps_base_map_job (rmaps_base_map_job.c:204)
>>> ==27313==    by 0x4F094CC: event_process_active_single_queue (event.c:1366)
>>> ==27313==    by 0x4F090D8: event_process_active (event.c:1434)
>>> ==27313==    by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
>>> ==27313==    by 0x4079A6: orterun (orterun.c:1049)
>>> ....
>>> ==27313== Conditional jump or move depends on uninitialised value(s)
>>> ==27313==    at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
>>> ==27313==    by 0x4EC5D0E: deserialize_add_tree_item (opal_tree.c:496)
>>> ==27313==    by 0x4EC5578: opal_tree_deserialize (opal_tree.c:524)
>>> ==27313==    by 0x4EC5609: opal_tree_dup (opal_tree.c:544)
>>> ==27313==    by 0x81E2FF6: ???
>>> ==27313==    by 0x81E0E58: ???
>>> ==27313==    by 0x81E02D7: ???
>>> ==27313==    by 0x4C6468B: orte_rmaps_base_map_job
>>> (rmaps_base_map_job.c:204)
>>> ==27313==    by 0x4F094CC: event_process_active_single_queue (event.c:1366)
>>> ==27313==    by 0x4F090D8: event_process_active (event.c:1434)
>>> ==27313==    by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
>>> ==27313==    by 0x4079A6: orterun (orterun.c:1049)
>>> .....
>>> ==27313== Conditional jump or move depends on uninitialised value(s)
>>> ==27313==    at 0x4EC52A4: opal_tree_add_child (opal_tree.c:191)
>>> ==27313==    by 0x81E3314: ???
>>> ==27313==    by 0x81E321D: ???
>>> ==27313==    by 0x81E2EE8: ???
>>> ==27313==    by 0x81E0E58: ???
>>> ==27313==    by 0x81E02D7: ???
>>> ==27313==    by 0x4C6468B: orte_rmaps_base_map_job
>>> (rmaps_base_map_job.c:204)
>>> ==27313==    by 0x4F094CC: event_process_active_single_queue (event.c:1366)
>>> ==27313==    by 0x4F090D8: event_process_active (event.c:1434)
>>> ==27313==    by 0x4F050FF: opal_libevent2021_event_base_loop (event.c:1645)
>>> ==27313==    by 0x4079A6: orterun (orterun.c:1049)
>>> ==27313==    by 0x40694A: main (main.c:13)
>>> 
>>> 
>>> 
>>>> Hi Ralph,
>>>> 
>>>> Here is the output when I put "-mca rmaps_base_verbose 10 --display-map"
>>>> and where it stopped (by gdb), which shows it stopped in a function of
>>>> lama.
>>>> 
>>>> I usually use PGI 13.10, so I tried to change it to the GNU compiler.
>>>> Then, it works. Therefore, this problem depends on the compiler.
>>>> 
>>>> That's all I could find today.
>>>> 
>>>> Regards,
>>>> Tetsuya Mishima
>>>> 
>>>> [mishima@manage ~]$ gdb
>>>> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos.1)
>>>> ....
>>>> (gdb) attach 14666
>>>> ....
>>>> 0x00002aaaab4c5c33 in rmaps_lama_prune_max_tree ()
>>>> at ./rmaps_lama_max_tree.c:814
>>>> 
>>>> [mishima@manage demos]$ mpirun -np 2 -mca rmaps lama -report-bindings
>>>> -mca rmaps_base_verbose 10 --display-map myprog
>>>> [manage.cluster:21503] mca: base: components_register: registering rmaps components
>>>> [manage.cluster:21503] mca: base: components_register: found loaded component lama
>>>> [manage.cluster:21503] mca:rmaps:lama: Priority   0
>>>> [manage.cluster:21503] mca:rmaps:lama: Map   : NULL
>>>> [manage.cluster:21503] mca:rmaps:lama: Bind  : NULL
>>>> [manage.cluster:21503] mca:rmaps:lama: MPPR  : NULL
>>>> [manage.cluster:21503] mca:rmaps:lama: Order : NULL
>>>> [manage.cluster:21503] mca: base: components_register: component lama register function successful
>>>> [manage.cluster:21503] mca: base: components_open: opening rmaps components
>>>> [manage.cluster:21503] mca: base: components_open: found loaded component lama
>>>> [manage.cluster:21503] mca:rmaps:select: checking available component lama
>>>> [manage.cluster:21503] mca:rmaps:select: Querying component [lama]
>>>> [manage.cluster:21503] [[23940,0],0]: Final mapper priorities
>>>> [manage.cluster:21503]  Mapper: lama Priority: 0
>>>> [manage.cluster:21503] mca:rmaps: mapping job [23940,1]
>>>> [manage.cluster:21503] mca:rmaps: creating new map for job [23940,1]
>>>> [manage.cluster:21503] mca:rmaps: nprocs 2
>>>> [manage.cluster:21503] mca:rmaps:lama: Mapping job [23940,1]
>>>> [manage.cluster:21503] mca:rmaps:lama: Revised Parameters -----
>>>> [manage.cluster:21503] mca:rmaps:lama: Map   : csbnh
>>>> [manage.cluster:21503] mca:rmaps:lama: Bind  : 1c
>>>> [manage.cluster:21503] mca:rmaps:lama: MPPR  : (null)
>>>> [manage.cluster:21503] mca:rmaps:lama: Order : s
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Binding  : [1c]
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Binding  :    1 x Core
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping  : [csbnh]
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping  : (0)       Core (7 vs 0)
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping  : (1)     Socket (3 vs 1)
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping  : (2)      Board (1 vs 3)
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping  : (3)    Machine (0 vs 7)
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Mapping  : (4) Hw. Thread (8 vs 8)
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- MPPR     : [(null)]
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Ordering : [s]
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Ordering : Sequential
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] AVAILABLE NODES FOR MAPPING:
>>>> [manage.cluster:21503]     node: manage daemon: 0
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Building the Max Tree...
>>>> [manage.cluster:21503] mca:rmaps:lama: ---------------------------------
>>>> [manage.cluster:21503] mca:rmaps:lama: ----- Converting Remote Tree: manage
>>>> 
>>>> [mishima@manage demos]$ ompi_info | grep "C compiler family"
>>>> C compiler family name: GNU
>>>> [mishima@manage demos]$ mpirun -np 2 -mca rmaps lama myprog
>>>> Hello world from process 0 of 2
>>>> Hello world from process 1 of 2
>>>> 
>>>> 
>>>> 
>>>>> On Dec 21, 2013, at 8:16 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Ralph, thanks. I'll try it on Tuesday.
>>>>>> 
>>>>>> Let me confirm one thing. I don't put "-with-libevent" when I build
>>>>>> openmpi. Is there any possibility of building with an external libevent
>>>>>> automatically?
>>>>> 
>>>>> No - only happens if you direct it
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Tetsuya Mishima
>>>>>> 
>>>>>> 
>>>>>>> Not entirely sure - add "-mca rmaps_base_verbose 10 --display-map" to
>>>>>>> your cmd line and let's see if it finishes the mapping.
>>>>>>> 
>>>>>>> Unless you specifically built with an external libevent (which I
>>>>>>> doubt), there is no conflict. The connection issue is unlikely to be a
>>>>>>> factor here as it works when not using the lama mapper.
>>>>>>> 
>>>>>>> 
>>>>>>> On Dec 21, 2013, at 3:43 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thank you, Ralph.
>>>>>>>> 
>>>>>>>> Then, this problem should depend on our environment.
>>>>>>>> But, at least, the inversion problem is not the cause because
>>>>>>>> node05 has a normal hier order.
>>>>>>>> 
>>>>>>>> I cannot connect to our cluster now. On Tuesday, when I'm back
>>>>>>>> in my office, I'll send you a further report.
>>>>>>>> 
>>>>>>>> Before that, please let me know your configuration. I will
>>>>>>>> follow your configuration as much as possible. Our configuration
>>>>>>>> is very simple, only -with-tm -with-ibverbs -disable-ipv6
>>>>>>>> (on CentOS 5.8).
>>>>>>>> 
>>>>>>>> The 1.7 series is still a little bit unstable on our cluster.
>>>>>>>> 
>>>>>>>> Similar freezing (hang-up) was observed with 1.7.3. At that
>>>>>>>> time, lama worked well, but putting "-rank-by something" caused
>>>>>>>> the same freezing (curiously, rank-by works with 1.7.4rc1).
>>>>>>>> I checked where it stopped using gdb, and found that it
>>>>>>>> stopped waiting for an event in a function of libevent (I cannot
>>>>>>>> recall the name).
>>>>>>>> 
>>>>>>>> Is this related to your "connection issue in the OOB
>>>>>>>> subsystem"? Or a libevent version conflict? I guess these two
>>>>>>>> problems are related to each other. They stopped at a very early
>>>>>>>> stage, before reaching the mapping function, because no message
>>>>>>>> appeared before freezing; this is just my rough guess.
>>>>>>>> 
>>>>>>>> Could you give me any hint or comment?
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Tetsuya Mishima
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> It seems to be working fine for me:
>>>>>>>>> 
>>>>>>>>> [rhc@bend001 tcp]$ mpirun -np 2 -host bend001 -report-bindings -mca
>>>>>>>>> rmaps_lama_bind 1c -mca rmaps lama hostname
>>>>>>>>> bend001
>>>>>>>>> [bend001:17005] MCW rank 1 bound to socket 0[core 1[hwt 0-1]]:
>>>>>>>>> [../BB/../../../..][../../../../../..]
>>>>>>>>> [bend001:17005] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]:
>>>>>>>>> [BB/../../../../..][../../../../../..]
>>>>>>>>> bend001
>>>>>>>>> [rhc@bend001 tcp]$
>>>>>>>>> 
>>>>>>>>> (I also checked the internals using "-mca rmaps_base_verbose 10") so
>>>>>>>>> it could be your hier inversion causing problems again. Or it could
>>>>>>>>> be that you are hitting a connection issue we are seeing in some
>>>>>>>>> scenarios in the OOB subsystem - though if you are able to run using
>>>>>>>>> a non-lama mapper, that would seem unlikely.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Dec 20, 2013, at 8:09 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Ralph,
>>>>>>>>> 
>>>>>>>>> Thank you very much. I tried many things such as:
>>>>>>>>> 
>>>>>>>>> mpirun -np 2 -host node05 -report-bindings -mca rmaps lama -mca
>>>>>>>>> rmaps_lama_bind 1c myprog
>>>>>>>>> 
>>>>>>>>> But every try failed. At least they were accepted by openmpi-1.7.3,
>>>>>>>>> as far as I remember.
>>>>>>>>> Anyway, please check it when you have time, because using lama comes
>>>>>>>>> from my curiosity.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I'll try to take a look at it - my expectation is that lama might get
>>>>>>>>> stuck because you didn't tell it a pattern to map, and I doubt that
>>>>>>>>> code path has seen much testing.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Dec 20, 2013, at 5:52 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Ralph, I'm glad to hear that, thanks.
>>>>>>>>> 
>>>>>>>>> By the way, yesterday I tried to check how lama in 1.7.4rc treats
>>>>>>>>> numa nodes.
>>>>>>>>> 
>>>>>>>>> Then, even with this simple command line, it froze without any
>>>>>>>>> message:
>>>>>>>>> 
>>>>>>>>> mpirun -np 2 -host node05 -mca rmaps lama myprog
>>>>>>>>> 
>>>>>>>>> Could you check what happened?
>>>>>>>>> 
>>>>>>>>> Is it better to open a new thread or continue this thread?
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I'll make it work so that NUMA can be either above or below socket
>>>>>>>>> 
>>>>>>>>> On Dec 20, 2013, at 2:57 AM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Brice,
>>>>>>>>> 
>>>>>>>>> Thank you for your comment. I understand what you mean.
>>>>>>>>> 
>>>>>>>>> My opinion was made just by considering an easy way to adjust the
>>>>>>>>> code for the inversion of hierarchy in the object tree.
>>>>>>>>> 
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I don't think there's any such difference.
>>>>>>>>> Also, all these NUMA architectures are reported the same by hwloc,
>>>>>>>>> and therefore used the same in Open MPI.
>>>>>>>>> 
>>>>>>>>> And yes, L3 and NUMA are topologically-identical on AMD Magny-Cours
>>>>>>>>> (and most recent AMD and Intel platforms).
>>>>>>>>> 
>>>>>>>>> Brice
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 20/12/2013 11:33, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> Hi Ralph,
>>>>>>>>> 
>>>>>>>>> The numa-node in AMD Magny-Cours/Interlagos is so-called ccNUMA
>>>>>>>>> (cache-coherent NUMA), which seems to be a little bit different from
>>>>>>>>> the traditional numa defined in openmpi.
>>>>>>>>> 
>>>>>>>>> I notice that the ccNUMA object is almost the same as the L3cache
>>>>>>>>> object. So "-bind-to l3cache" or "-map-by l3cache" is valid for what
>>>>>>>>> I want to do.
>>>>>>>>> Therefore, "do not touch it" is one of the solutions, I think ...
>>>>>>>>> 
>>>>>>>>> Anyway, mixing up these two types of numa is the problem.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> I can wait for it to be fixed in 1.7.5 or later, because putting
>>>>>>>>> "-bind-to numa" and "-map-by numa" at the same time works as a
>>>>>>>>> workaround.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> Yeah, it will impact everything that uses hwloc topology maps, I
>>>>>>>>> fear.
>>>>>>>>> 
>>>>>>>>> One side note: you'll need to add --hetero-nodes to your cmd line.
>>>>>>>>> If we don't see that, we assume that all the node topologies are
>>>>>>>>> identical - which clearly isn't true here.
>>>>>>>>> I'll try to resolve the hier inversion over the holiday - won't be
>>>>>>>>> for 1.7.4, but hopefully for 1.7.5
>>>>>>>>> Thanks
>>>>>>>>> Ralph
>>>>>>>>> 
>>>>>>>>> On Dec 18, 2013, at 9:44 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I think it's normal for AMD Opterons having 8/16 cores, such as
>>>>>>>>> Magny-Cours or Interlagos. Because they usually have 2 numa nodes
>>>>>>>>> in a cpu (socket), a numa-node cannot include a socket. This type
>>>>>>>>> of hierarchy would be natural.
>>>>>>>>> 
>>>>>>>>> (node03 is a Dell PowerEdge R815 and maybe quite common, I guess)
>>>>>>>>> 
>>>>>>>>> By the way, I think this inversion should affect rmaps_lama
>>>>>>>>> mapping.
>>>>>>>>> 
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> Ick - yeah, that would be a problem. I haven't seen that type of
>>>>>>>>> hierarchical inversion before - is node03 a different type of chip?
>>>>>>>>> Might take a while for me to adjust the code to handle hier
>>>>>>>>> inversion... :-(
>>>>>>>>> On Dec 18, 2013, at 9:05 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Ralph,
>>>>>>>>> 
>>>>>>>>> I found the reason. I attached the main part of the output for the
>>>>>>>>> 32-core node (node03) and the 8-core node (node05) at the bottom.
>>>>>>>>> 
>>>>>>>>> From this information, on node03 a socket includes numa-nodes.
>>>>>>>>> On the other hand, on node05 a numa-node includes a socket.
>>>>>>>>> The direction of the object tree is opposite.
>>>>>>>>> 
>>>>>>>>> Since "-map-by socket" may be assumed as the default,
>>>>>>>>> for node05 "-bind-to numa and -map-by socket" means an
>>>>>>>>> upward search. For node03, this should be downward.
>>>>>>>>> 
>>>>>>>>> I guess that openmpi-1.7.4rc1 will always assume that a numa-node
>>>>>>>>> includes a socket. Is that right? Then, an upward search is assumed
>>>>>>>>> in orte_rmaps_base_compute_bindings even for node03 when I
>>>>>>>>> put the "-bind-to numa and -map-by socket" option.
>>>>>>>>> 
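>>>>>>>>> As an aside, whether a machine's topology puts Socket above NUMANode
>>>>>>>>> or the other way around can be checked directly by comparing hwloc
>>>>>>>>> depths. Here is a small sketch (my own test code, assuming hwloc 1.x,
>>>>>>>>> where HWLOC_OBJ_NODE is the NUMA level; a smaller depth means closer
>>>>>>>>> to the Machine root):
>>>>>>>>> 
>>>>>>>>> #include <stdio.h>
>>>>>>>>> #include <hwloc.h>
>>>>>>>>> 
>>>>>>>>> int main(void)
>>>>>>>>> {
>>>>>>>>>     hwloc_topology_t topo;
>>>>>>>>>     int sock, numa;
>>>>>>>>> 
>>>>>>>>>     hwloc_topology_init(&topo);
>>>>>>>>>     hwloc_topology_load(topo);
>>>>>>>>> 
>>>>>>>>>     sock = hwloc_get_type_depth(topo, HWLOC_OBJ_SOCKET);
>>>>>>>>>     numa = hwloc_get_type_depth(topo, HWLOC_OBJ_NODE);
>>>>>>>>> 
>>>>>>>>>     if (sock >= 0 && numa >= 0)  /* skip the unknown/multiple cases */
>>>>>>>>>         printf("%s contains %s\n",
>>>>>>>>>                (sock < numa) ? "Socket" : "NUMANode",
>>>>>>>>>                (sock < numa) ? "NUMANode" : "Socket");
>>>>>>>>> 
>>>>>>>>>     hwloc_topology_destroy(topo);
>>>>>>>>>     return 0;
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> On node03 this should report the Socket level above the NUMANode
>>>>>>>>> level, and the opposite on node05.
>>>>>>>>> 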
>>>>>>>>> [node03.cluster:15508] [[38286,0],0] rmaps:base:compute_usage
>>>>>>>>> [node03.cluster:15508] mca:rmaps: compute bindings for job [38286,1]
>>>>>>>>> with policy NUMA
>>>>>>>>> [node03.cluster:15508] mca:rmaps: bind upwards for job [38286,1]
>>>>>>>>> with bindings NUMA
>>>>>>>>> [node03.cluster:15508] [[38286,0],0] bind:upward target NUMANode
>>>>>>>>> type Machine
>>>>>>>>> 
>>>>>>>>> That's the reason for this trouble. Therefore, adding "-map-by core"
>>>>>>>>> works.
>>>>>>>>> (The mapping pattern seems to be strange ...)
>>>>>>>>> 
>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -bind-to numa -map-by core
>>>>>>>>> -report-bindings myprog
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type Cache
>>>>>>>>> [node03.cluster:15885] [[38679,0],0] bind:upward target NUMANode type NUMANode
>>>>>>>>> [node03.cluster:15885] MCW rank 2 bound to socket 0[core 0[hwt 0]],
>>>>>>>>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
>>>>>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 3 bound to socket 0[core 0[hwt 0]],
>>>>>>>>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
>>>>>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 4 bound to socket 0[core 4[hwt 0]],
>>>>>>>>> socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
>>>>>>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 5 bound to socket 0[core 4[hwt 0]],
>>>>>>>>> socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
>>>>>>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 6 bound to socket 0[core 4[hwt 0]],
>>>>>>>>> socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
>>>>>>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 7 bound to socket 0[core 4[hwt 0]],
>>>>>>>>> socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
>>>>>>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 0 bound to socket 0[core 0[hwt 0]],
>>>>>>>>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
>>>>>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15885] MCW rank 1 bound to socket 0[core 0[hwt 0]],
>>>>>>>>> socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
>>>>>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> [node03.cluster:15508] Type: Machine Number of child objects: 4
>>>>>>>>> Name=NULL
>>>>>>>>> total=132358820KB
>>>>>>>>> Backend=Linux
>>>>>>>>> OSName=Linux
>>>>>>>>> OSRelease=2.6.18-308.16.1.el5
>>>>>>>>> OSVersion="#1 SMP Tue Oct 2 22:01:43 EDT 2012"
>>>>>>>>> Architecture=x86_64
>>>>>>>>> Cpuset:  0xffffffff
>>>>>>>>> Online:  0xffffffff
>>>>>>>>> Allowed: 0xffffffff
>>>>>>>>> Bind CPU proc:   TRUE
>>>>>>>>> Bind CPU thread: TRUE
>>>>>>>>> Bind MEM proc:   FALSE
>>>>>>>>> Bind MEM thread: TRUE
>>>>>>>>> Type: Socket Number of child objects: 2
>>>>>>>>>         Name=NULL
>>>>>>>>>         total=33071780KB
>>>>>>>>>         CPUModel="AMD Opteron(tm) Processor 6136"
>>>>>>>>>         Cpuset:  0x000000ff
>>>>>>>>>         Online:  0x000000ff
>>>>>>>>>         Allowed: 0x000000ff
>>>>>>>>>         Type: NUMANode Number of child objects: 1
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [node05.cluster:21750] Type: Machine Number of child objects: 2
>>>>>>>>> Name=NULL
>>>>>>>>> total=33080072KB
>>>>>>>>> Backend=Linux
>>>>>>>>> OSName=Linux
>>>>>>>>> OSRelease=2.6.18-308.16.1.el5
>>>>>>>>> OSVersion="#1 SMP Tue Oct 2 22:01:43 EDT 2012"
>>>>>>>>> Architecture=x86_64
>>>>>>>>> Cpuset:  0x000000ff
>>>>>>>>> Online:  0x000000ff
>>>>>>>>> Allowed: 0x000000ff
>>>>>>>>> Bind CPU proc:   TRUE
>>>>>>>>> Bind CPU thread: TRUE
>>>>>>>>> Bind MEM proc:   FALSE
>>>>>>>>> Bind MEM thread: TRUE
>>>>>>>>> Type: NUMANode Number of child objects: 1
>>>>>>>>>         Name=NULL
>>>>>>>>>         local=16532232KB
>>>>>>>>>         total=16532232KB
>>>>>>>>>         Cpuset:  0x0000000f
>>>>>>>>>         Online:  0x0000000f
>>>>>>>>>         Allowed: 0x0000000f
>>>>>>>>>         Type: Socket Number of child objects: 1
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hmm...try adding "-mca rmaps_base_verbose 10 -mca ess_base_verbose 5"
>>>>>>>>> to your cmd line and let's see what it thinks it found.
>>>>>>>>> 
>>>>>>>>> On Dec 18, 2013, at 6:55 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi, I report one more problem with openmpi-1.7.4rc1,
>>>>>>>>> which is more serious.
>>>>>>>>> 
>>>>>>>>> For our 32-core nodes (AMD Magny-Cours based), which have
>>>>>>>>> 8 numa-nodes, "-bind-to numa" does not work. Without
>>>>>>>>> this option, it works. For your information, at the
>>>>>>>>> bottom of this mail, I added the lstopo information
>>>>>>>>> of the node.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> [mishima@manage ~]$ qsub -I -l nodes=1:ppn=32
>>>>>>>>> qsub: waiting for job 8352.manage.cluster to start
>>>>>>>>> qsub: job 8352.manage.cluster ready
>>>>>>>>> 
>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings -bind-to numa
>>>>>>>>> myprog
>>>>>>>>> [node03.cluster:15316] [[37582,0],0] bind:upward target NUMANode
>>>>>>>>> type Machine
>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>> A request was made to bind to NUMA, but an appropriate target could
>>>>>>>>> not be found on node node03.
>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings myprog
>>>>>>>>> [node03.cluster:15282] MCW rank 2 bound to socket 1[core 8[hwt 0]]:
>>>>>>>>> [./././././././.][B/././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 3 bound to socket 1[core 9[hwt 0]]:
>>>>>>>>> [./././././././.][./B/./././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 4 bound to socket 2[core 16[hwt 0]]:
>>>>>>>>> [./././././././.][./././././././.][B/././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 5 bound to socket 2[core 17[hwt 0]]:
>>>>>>>>> [./././././././.][./././././././.][./B/./././././.][./././././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 6 bound to socket 3[core 24[hwt 0]]:
>>>>>>>>> [./././././././.][./././././././.][./././././././.][B/././././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 7 bound to socket 3[core 25[hwt 0]]:
>>>>>>>>> [./././././././.][./././././././.][./././././././.][./B/./././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>>>>>>>>> [B/././././././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> [node03.cluster:15282] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
>>>>>>>>> [./B/./././././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>> [mishima@node03 demos]$ ~/opt/hwloc/bin/lstopo-no-graphics
>>>>>>>>> Machine (126GB)
>>>>>>>>> Socket L#0 (32GB)
>>>>>>>>> NUMANode L#0 (P#0 16GB) + L3 L#0 (5118KB)
>>>>>>>>> L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#0
>>>>>>>>> (P#0)
>>>>>>>>> L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#1
>>>>>>>>> (P#1)
>>>>>>>>> L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#2
>>>>>>>>> (P#2)
>>>>>>>>> L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#3
>>>>>>>>> (P#3)
>>>>>>>>> NUMANode L#1 (P#1 16GB) + L3 L#1 (5118KB)
>>>>>>>>> L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#4
>>>>>>>>> (P#4)
>>>>>>>>> L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#5
>>>>>>>>> (P#5)
>>>>>>>>> L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#6
>>>>>>>>> (P#6)
>>>>>>>>> L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + PU
>>>>>>>>> L#7 (P#7)
>>>>>>>>> Socket L#1 (32GB)
>>>>>>>>> NUMANode L#2 (P#6 16GB) + L3 L#2 (5118KB)
>>>>>>>>> L2 L#8 (512KB) + L1d L#8 (64KB) + L1i L#8 (64KB) + Core L#8
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#8
>>>>>>>>> (P#8)
>>>>>>>>> L2 L#9 (512KB) + L1d L#9 (64KB) + L1i L#9 (64KB) + Core L#9
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#9
>>>>>>>>> (P#9)
>>>>>>>>> L2 L#10 (512KB) + L1d L#10 (64KB) + L1i L#10 (64KB) + Core
>>>>>>>>> L#10
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#10 (P#10)
>>>>>>>>> L2 L#11 (512KB) + L1d L#11 (64KB) + L1i L#11 (64KB) + Core
>>>>>>>>> L#11
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#11 (P#11)
>>>>>>>>> NUMANode L#3 (P#7 16GB) + L3 L#3 (5118KB)
>>>>>>>>> L2 L#12 (512KB) + L1d L#12 (64KB) + L1i L#12 (64KB) + Core
>>>>>>>>> L#12
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#12 (P#12)
>>>>>>>>> L2 L#13 (512KB) + L1d L#13 (64KB) + L1i L#13 (64KB) + Core
>>>>>>>>> L#13
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#13 (P#13)
>>>>>>>>> L2 L#14 (512KB) + L1d L#14 (64KB) + L1i L#14 (64KB) + Core
>>>>>>>>> L#14
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#14 (P#14)
>>>>>>>>> L2 L#15 (512KB) + L1d L#15 (64KB) + L1i L#15 (64KB) + Core
>>>>>>>>> L#15
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#15 (P#15)
>>>>>>>>> Socket L#2 (32GB)
>>>>>>>>> NUMANode L#4 (P#4 16GB) + L3 L#4 (5118KB)
>>>>>>>>> L2 L#16 (512KB) + L1d L#16 (64KB) + L1i L#16 (64KB) + Core
>>>>>>>>> L#16
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#16 (P#16)
>>>>>>>>> L2 L#17 (512KB) + L1d L#17 (64KB) + L1i L#17 (64KB) + Core
>>>>>>>>> L#17
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#17 (P#17)
>>>>>>>>> L2 L#18 (512KB) + L1d L#18 (64KB) + L1i L#18 (64KB) + Core L#18
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#18 (P#18)
>>>>>>>>> L2 L#19 (512KB) + L1d L#19 (64KB) + L1i L#19 (64KB) + Core
>>>>>>>>> L#19
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#19 (P#19)
>>>>>>>>> NUMANode L#5 (P#5 16GB) + L3 L#5 (5118KB)
>>>>>>>>> L2 L#20 (512KB) + L1d L#20 (64KB) + L1i L#20 (64KB) + Core
>>>>>>>>> L#20
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#20 (P#20)
>>>>>>>>> L2 L#21 (512KB) + L1d L#21 (64KB) + L1i L#21 (64KB) + Core
>>>>>>>>> L#21
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#21 (P#21)
>>>>>>>>> L2 L#22 (512KB) + L1d L#22 (64KB) + L1i L#22 (64KB) + Core
>>>>>>>>> L#22
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#22 (P#22)
>>>>>>>>> L2 L#23 (512KB) + L1d L#23 (64KB) + L1i L#23 (64KB) + Core
>>>>>>>>> L#23
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#23 (P#23)
>>>>>>>>> Socket L#3 (32GB)
>>>>>>>>> NUMANode L#6 (P#2 16GB) + L3 L#6 (5118KB)
>>>>>>>>> L2 L#24 (512KB) + L1d L#24 (64KB) + L1i L#24 (64KB) + Core
>>>>>>>>> L#24
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#24 (P#24)
>>>>>>>>> L2 L#25 (512KB) + L1d L#25 (64KB) + L1i L#25 (64KB) + Core L#25 +
>>>>>>>>> PU
>>>>>>>>> L#25 (P#25)
>>>>>>>>> L2 L#26 (512KB) + L1d L#26 (64KB) + L1i L#26 (64KB) + Core
>>>>>>>>> L#26
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#26 (P#26)
>>>>>>>>> L2 L#27 (512KB) + L1d L#27 (64KB) + L1i L#27 (64KB) + Core
>>>>>>>>> L#27
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#27 (P#27)
>>>>>>>>> NUMANode L#7 (P#3 16GB) + L3 L#7 (5118KB)
>>>>>>>>> L2 L#28 (512KB) + L1d L#28 (64KB) + L1i L#28 (64KB) + Core
>>>>>>>>> L#28
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#28 (P#28)
>>>>>>>>> L2 L#29 (512KB) + L1d L#29 (64KB) + L1i L#29 (64KB) + Core
>>>>>>>>> L#29
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#29 (P#29)
>>>>>>>>> L2 L#30 (512KB) + L1d L#30 (64KB) + L1i L#30 (64KB) + Core
>>>>>>>>> L#30
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#30 (P#30)
>>>>>>>>> L2 L#31 (512KB) + L1d L#31 (64KB) + L1i L#31 (64KB) + Core
>>>>>>>>> L#31
>>>>>>>>> +
>>>>>>>>> PU
>>>>>>>>> L#31 (P#31)
>>>>>>>>> HostBridge L#0
>>>>>>>>> PCIBridge
>>>>>>>>> PCI 14e4:1639
>>>>>>>>> Net L#0 "eth0"
>>>>>>>>> PCI 14e4:1639
>>>>>>>>> Net L#1 "eth1"
>>>>>>>>> PCIBridge
>>>>>>>>> PCI 14e4:1639
>>>>>>>>> Net L#2 "eth2"
>>>>>>>>> PCI 14e4:1639
>>>>>>>>> Net L#3 "eth3"
>>>>>>>>> PCIBridge
>>>>>>>>> PCIBridge
>>>>>>>>> PCIBridge
>>>>>>>>>  PCI 1000:0072
>>>>>>>>>    Block L#4 "sdb"
>>>>>>>>>    Block L#5 "sda"
>>>>>>>>> PCI 1002:4390
>>>>>>>>> Block L#6 "sr0"
>>>>>>>>> PCIBridge
>>>>>>>>> PCI 102b:0532
>>>>>>>>> HostBridge L#7
>>>>>>>>> PCIBridge
>>>>>>>>> PCI 15b3:6274
>>>>>>>>> Net L#7 "ib0"
>>>>>>>>> OpenFabrics L#8 "mthca0"
>>>>>>>>> 