Any pointers?
On Tue, Jan 25, 2022 at 12:55 PM Ralph Castain via users <
users@lists.open-mpi.org> wrote:
> Short answer is yes, but it is a bit complicated to do.
>
> On Jan 25, 2022, at 12:28 PM, Saliya Ekanayake via users <
> users@lists.open-mpi.org> wrote:
>
>
Hi,
I am trying to run an MPI program on a platform that launches the processes
using a custom launcher (not mpiexec). This will end up spawning N
processes of the program, but I am not sure whether MPI_Init() will work
in this case.
Is it possible to have a group of processes launched by some other
launcher still come up as a single MPI job?
>> that suggested to disable psm as a solution.
>>
>> It worked, but I would like to know what this module is and whether there
>> is a performance disadvantage to disabling it?
>>
>> Thank you,
>> Saliya
>>
>> --
>> Saliya Ekanayake
--
Saliya Ekanayake, Ph.D
Applied Computer Scientist
Network Dynamics and Simulation Science Laboratory (NDSSL)
Virginia Tech, Blacksburg
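For context: psm is Open MPI's MTL component for the PSM (Performance Scaled
Messaging) interface of QLogic/Intel TrueScale InfiniPath HCAs. Disabling it
pushes traffic onto another transport (openib or tcp), which on an InfiniPath
fabric is usually slower, so there can be a real performance cost. A sketch of
how it is typically disabled (component names assume an Open MPI 1.x-era
build; ./myapp is a placeholder):

    mpirun --mca mtl ^psm ./myapp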
I tested and the number of ranks in world comm is correct. I couldn't find
the bug that causes the program to produce erroneous answers when this
scheme is used, though.
On Fri, Jul 29, 2016 at 3:38 PM, Saliya Ekanayake wrote:
> Thank you, that's good to know.
>
> Yes, tes
> Open MPI exports environment variables (i.e. $OMPI_*)
> to its ranks. So I believe you may use $OMPI_COMM_WORLD_LOCAL_RANK to
> specifically filter out parameters within the script.
>
> Regards
> Udayanga Wickramasinghe
> Research Assistant
> School of Informatics and Computing | CREST
> Indiana University, Bloomington
into different communicators and this pattern breaks that
logic.
Is there an alternative approach to doing this?
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
for mxm, you need to
>
> - force pml/ob1 (so mtl/mxm cannot be used by pml/cm)
>
> and
>
> - blacklist btl/openib
>
> your mpirun command line would look like this:
>
> mpirun --mca pml ob1 --mca btl ^openib ...
>
>
> Cheers,
>
>
> Gilles
> On 7/
Thank you, but what's mxm?
On Tue, Jul 19, 2016 at 12:52 AM, Nathan Hjelm wrote:
> You probably will also want to run with -mca pml ob1 to make sure mxm is
> not in use. The combination should be sufficient to force tcp usage.
>
> -Nathan
>
> > On Jul 18, 2016, at
Is it possible for OpenMPI to use Infiniband and not TCP?
Is there a way to guarantee that a test is using TCP, but not IB?
Thank you,
saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
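For reference, mxm is Mellanox's MXM messaging library, which Open MPI can use
through mtl/mxm instead of the ob1 pml over the openib btl. Putting Nathan's
and Gilles's suggestions together, a command line along these lines should
guarantee TCP (a sketch; the component names assume an Open MPI 1.x-era build
and ./myapp is a placeholder):

    mpirun --mca pml ob1 --mca btl tcp,self,sm ./myapp

Forcing pml/ob1 keeps mtl/mxm out of the picture, and listing only tcp, self
and sm leaves no InfiniBand-capable btl to select.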
>>>> [titan01:01173] *** Process received signal ***
>>>> [titan01:01173] Signal: Aborted (6)
>>>> [titan01:01173] Signal code: (-6)
>>>> [titan01:01172] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b7e9596a100]
>>>> [titan01:01172] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b7e95fc75f7]
>>>> [titan01:01172] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b7e95fc8ce8]
>>>> [titan01:01172] [ 3]
>>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2b7e96a95ac5]
>>>> [titan01:01172] [ 4]
>>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2b7e96bf5137]
>>>> [titan01:01172] [ 5]
>>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2b7e96a995e0]
>>>> [titan01:01172] [ 6] [titan01:01173] [ 0]
>>>> /usr/lib64/libpthread.so.0(+0xf100)[0x2af694ded100]
>>>> [titan01:01173] [ 1] /usr/lib64/libc.so.6(+0x35670)[0x2b7e95fc7670]
>>>> [titan01:01172] [ 7] [0x2b7e9c86e3a1]
>>>> [titan01:01172] *** End of error message ***
>>>> /usr/lib64/libc.so.6(gsignal+0x37)[0x2af69544a5f7]
>>>> [titan01:01173] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2af69544bce8]
>>>> [titan01:01173] [ 3]
>>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2af695f18ac5]
>>>> [titan01:01173] [ 4]
>>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2af696078137]
>>>> [titan01:01173] [ 5]
>>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2af695f1c5e0]
>>>> [titan01:01173] [ 6] /usr/lib64/libc.so.6(+0x35670)[0x2af69544a670]
>>>> [titan01:01173] [ 7] [0x2af69c0693a1]
>>>> [titan01:01173] *** End of error message ***
>>>> ---
>>>> Primary job terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> ---
>>>>
>>>> --
>>>> mpirun noticed that process rank 1 with PID 0 on node titan01 exited on
>>>> signal 6 (Aborted).
>>>>
>>>>
>>>> CONFIGURATION:
>>>> I used the ompi master sources from github:
>>>> commit 267821f0dd405b5f4370017a287d9a49f92e734a
>>>> Author: Gilles Gouaillardet
>>>> Date: Tue Jul 5 13:47:50 2016 +0900
>>>>
>>>> ./configure --enable-mpi-java
>>>> --with-jdk-dir=/home/gl069/bin/jdk1.7.0_25 --disable-dlopen
>>>> --disable-mca-dso
>>>>
>>>> Thanks a lot for your help!
>>>> Gundram
>>>>
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
try
> mpirun --mca coll_ml_priority 100 ...
>
> Cheers,
>
> Gilles
>
> On Thursday, June 30, 2016, Saliya Ekanayake wrote:
>
>> Thank you, Gilles. The reason for digging into intra-node optimizations
>> is that we've implemented several machine learning app
> (and libfabric, but I do not know the details...)
>
> Cheers,
>
> Gilles
>
> On Thursday, June 30, 2016, Saliya Ekanayake wrote:
>
>> OK, I am beginning to see how it works now. One question I still have is,
>> in the case of a multi-node communicator it seems coll/tuned (o
Cheers,
Gilles
On Thursday, June 30, 2016, Saliya Ekanayake wrote:
> Thank you, Gilles.
>
> What is the bcast I should look for? In general, how do I know which
> module was used for which communication - can I print this info?
> On Jun 30, 2016 3:19 AM, "Gilles Gouaillardet
> the sm module is not used if the communicator is an inter
> communicator or the communicator spans several nodes.
>
> you can have a look at the source code, and you will note that bcast does
> not use send/recv. Instead, it uses shared memory, so hopefully it is
> faster than other modules
>
>
> Cheers,
>
>
> Gilles
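Rather than reading the source, one way to see which module wins is to raise
the verbosity of the coll framework, which logs component selection per
communicator (a sketch; the exact output format varies between versions):

    mpirun --mca coll_base_verbose 10 ./myapp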
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Hi,
I see in *mca_coll_sm_comm_query()* of *ompi/mca/coll/sm/coll_sm_module.c*
that allreduce and bcast have shared memory implementations.
Is there a way to know if this implementation is being used when running my
program that calls these collectives?
Thank you,
Saliya
--
Saliya
's version
seem to support that, though.
On Wed, Jun 29, 2016 at 1:20 AM, Saliya Ekanayake wrote:
> Thank you, Ralph and Gilles.
>
> I didn't know about the OMPI_COMM_WORLD_LOCAL_RANK variable. Essentially,
> this means I should be able to wrap my application call in a
> you can wrap your application in a script that reads the OMPI_COMM_WORLD_LOCAL_RANK
> envar, and then use that to calculate the offset location for your threads
> (i.e., local rank 0 is on socket 0, local rank 1 is on socket 1, etc.). You
> can then putenv the correct value of the GOMP envar
>
>
> On Jun 28, 2016, at 8:40 PM, Saliya Ekanayake wrote:
>
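A minimal sketch of the wrapper Ralph describes, assuming a two-socket node
with cores 0-3 on socket 0 and 4-7 on socket 1 (core ranges and the wrapper
name are hypothetical):

    #!/bin/bash
    # wrap.sh: pin each local rank's OpenMP threads to its own socket
    if [ "$OMPI_COMM_WORLD_LOCAL_RANK" -eq 0 ]; then
        export GOMP_CPU_AFFINITY=0-3
    else
        export GOMP_CPU_AFFINITY=4-7
    fi
    exec "$@"

launched as: mpirun -np 2 ./wrap.sh ./myapp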
Core 4?
P.S. I can manually achieve this within the program using
*sched_setaffinity()*, but that's not portable.
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
process, i guess case 1 and case 2
> will become pretty close.
>
> i also suggest that for cases 2 and 3, you bind processes to a socket
> instead of no binding at all
>
> Cheers,
>
> Gilles
>
> On 6/23/2016 2:41 PM, Saliya Ekanayake wrote:
>
> Thank you, Gilles for
> if the task is bound to a single core, the task and the helper compete due
> to time sharing, but if the task is bound to more than one core, then the
> task and the helper run in parallel.
>
>
> Cheers,
>
> Gilles
>
> On 6/23/2016 1:21 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> I am trying to understand this peculiar behavior where
Hi,
I am trying to understand this peculiar behavior where the communication
time in OpenMPI changes depending on the number of processing elements
(cores) the process is bound to.
Is this expected?
Thank you,
saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics
Jeff Hammond
> jeff.scie...@gmail.com
> http://jeffhammond.github.io/
>
>
> Cheers,
>
> Gilles
>
>
> On Monday, May 30, 2016, Saliya Ekanayake wrote:
>
>> Hi,
>>
>> I ran Ohio micro benchmarks for openmpi and noticed broadcast with a
>> smaller number of bytes is faster than a barrier
e times?
>
> Thanks,
>
> Matthieu
>
> --
> *From:* users [users-boun...@open-mpi.org] on behalf of Saliya Ekanayake [
> esal...@gmail.com]
> *Sent:* Monday, May 30, 2016 7:53 AM
> *To:* Open MPI Users
> *Subject:* [OMPI users] Broadcast faster
Hi,
I ran Ohio micro benchmarks for openmpi and noticed broadcast with a smaller
number of bytes is faster than a barrier - 2us vs 120us.
I'm trying to understand how this could happen.
Thank you
Saliya
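One plausible explanation: MPI_Bcast is not a synchronizing operation, so for
a small message the root may return as soon as the payload is handed to the
transport, while MPI_Barrier by definition cannot complete until every rank
has arrived. Back-to-back bcasts can also pipeline, deflating the per-call
average. A rough timing sketch with the Java bindings (iteration counts and
message size are arbitrary):

    import mpi.MPI;
    import mpi.MPIException;

    public class BcastVsBarrier {
        public static void main(String[] args) throws MPIException {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.getRank();
            double[] buf = new double[1];      // tiny message
            int iters = 10000;

            for (int i = 0; i < 1000; i++)     // warm-up
                MPI.COMM_WORLD.bcast(buf, 1, MPI.DOUBLE, 0);

            long t0 = System.nanoTime();
            for (int i = 0; i < iters; i++)
                MPI.COMM_WORLD.bcast(buf, 1, MPI.DOUBLE, 0);
            long bcastNs = (System.nanoTime() - t0) / iters;

            t0 = System.nanoTime();
            for (int i = 0; i < iters; i++)
                MPI.COMM_WORLD.barrier();
            long barrierNs = (System.nanoTime() - t0) / iters;

            if (rank == 0)
                System.out.println("bcast " + bcastNs
                        + " ns/call, barrier " + barrierNs + " ns/call");
            MPI.Finalize();
        }
    }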
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
> It is true that we generally configure our schedulers to set the max
> #slots on each node to equal the #cores on the node - but that is purely a
> configuration choice.
>
>
> On May 19, 2016, at 4:29 PM, Saliya Ekanayake wrote:
>
> Thank you, Tetsuya. So is a slot = core?
like to pin them to each core the process has been bound to.
> >
> > On Thu, May 19, 2016 at 3:46 PM, Ralph Castain wrote:
> > Perhaps we should error out, but at the moment, PE=4 forces bind-to-core
> and so the bind-to socket is being ignored
> >
> > On May 19, 2016, a
> That is what we will do - as I said, the
> —bind-to socket directive will be ignored.
>
> On May 19, 2016, at 1:03 PM, Saliya Ekanayake wrote:
>
> So if bind-to-core is in effect, does that mean it'll run only on 1 core
> even though I'd like it to be able to utilize 4 cores?
3:46 PM, Ralph Castain wrote:
> Perhaps we should error out, but at the moment, PE=4 forces bind-to-core
> and so the bind-to socket is being ignored
>
> On May 19, 2016, at 12:06 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> I understand --map-by will determine the process
--map-by socket:PE=4 --bind-to socket
My understanding is that this will give each process 4 cores. Now, with
bind to socket, does that mean it's possible that within a socket the
assigned 4 cores for a process may change? Or will they stay in the same 4
cores always?
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
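A direct way to check is --report-bindings, which prints each rank's cpu mask
when the job launches; the binding is fixed at that point and does not move
afterwards. A sketch (process count and application are placeholders):

    mpirun -np 16 --map-by socket:PE=4 --bind-to core --report-bindings ./myapp

Per Ralph's comment in this thread, PE=4 forces bind-to core anyway, so each
rank keeps the same 4 cores for its lifetime.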
.com/open-mpi/ompi-java-test
>
> Howard
>
>
> 2016-02-27 23:01 GMT-07:00 Saliya Ekanayake :
>
>> Hi,
>>
>> I see this paper from Oscar refers to a Java implementation of NAS
>> benchmarks. Is this work publicly available (the code?)
>>
>> I'
ation/291695433_SPIDAL_Java_High_Performance_Data_Analytics_with_Java_and_MPI_on_Large_Multicore_HPC_Clusters)
and would like to test out the work in the above paper as well.
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
> you can pass flags to the java command line so the JVM can
> allocate more memory.
> java -Xmx=...
> or something like that (and that could be JVM dependent)
>
> Cheers,
>
> Gilles
>
>
> On Thursday, January 21, 2016, Saliya Ekanayake wrote:
>
>> Hi Ibrahim,
>>
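For the record, HotSpot JVMs take the maximum-heap flag with no equals sign,
and under mpirun it simply goes before the class name (heap size, classpath
and class are hypothetical):

    mpirun -np 4 java -Xmx2g -cp app.jar MyApp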
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
> this means the coll sm module does *not* implement allgatherv, so openmpi
> will use the next module
> (which is very likely the default module, that is why there is no
> performance improvement in your specific benchmark)
>
> Cheers,
>
> Gilles
>
>
>
> On 12/10/2015 2:53
Aurelien Bouteiller, Ph.D. ~~ https://icl.cs.utk.edu/~bouteill/
>
> On Dec 9, 2015, at 09:53, Saliya Ekanayake wrote:
>
> Hi,
>
> In a previous email, I wanted to know how to enable shared memory
> collectives and I was told setting the coll_sm_priority to anything over 30
I tested for
different numbers of processes per node on 48 nodes. The total message size
is kept constant at 240 bytes (or 2.28MB).
Am I doing something wrong here?
Thank you,
saliya
[image: Inline image 1]
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Comp
Thanks Ralph. It's 1.8.1. I'll try this.
On Sun, Sep 27, 2015 at 8:25 PM, Ralph Castain wrote:
> What version? If it’s 1.10 or I think even 1.8, you should have the
> “--bind-to hwthread" option
>
> On Sep 27, 2015, at 3:02 PM, Saliya Ekanayake wrote:
>
> H
Hi,
I couldn't find any option in OpenMPI to bind a process to a hardware
thread. I am assuming this is not yet supported through binding options.
Could specifying a rank file be used as a workaround for this?
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
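Given Ralph's answer, a rankfile should not be needed on 1.8 or later; a
sketch of the direct route (flag spellings as in the 1.8-era mpirun man page;
./myapp is a placeholder):

    mpirun -np 8 --use-hwthread-cpus --bind-to hwthread --report-bindings ./myapp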
fix it.
I could get it working by manually generating a rankfile for all the ranks and
not using any --map-by options.
I'll try the --map-by core as well
On Sun, Sep 13, 2015 at 3:59 AM, Tobias Kloeffel
wrote:
> Hi,
> use: --map-by core
>
> regards,
> Tobias
>
>
> On 09/13/20
machine numbers them, and I
> can’t guarantee it will work - but it’s worth a shot. If it doesn’t, then I
> may have to add an option for such purposes
>
> Ralph
>
> On Sep 12, 2015, at 7:39 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> We've a machine as in the follow
will make a process
bind to 2 cores, which is not what I want.
--map-by ppr:12:node:PE=1,SPAN
Thank you,
Saliya
[image: Inline image 1]
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
>> the sm coll mca is optimized for shared memory, but supports intra node
>> communicators only.
>> the ml and hierarch coll have some optimizations for intra node
>> communications.
>> as far as i know, none of these are used in production.
>>
>> Cheers,
>>
>> Gilles
> you can run
> ompi_info --all | grep vader
> to check the btl parameters,
> of course, reading the source code is the best way to understand what the
> vader btl can do and how
>
> Cheers,
>
> Gilles
>
>
>
> On 9/1/2015 1:28 PM, Saliya Ekanayake wrote:
>
one per pair.
> Typically, the openib or tcp btl is used for inter node communication, and
> the sm or vader btl for
> intra node.
> note the vader btl uses the knem kernel module when available for even
> more optimized configurations.
>
> Cheers,
>
> Gilles
>
>
> On
Hi,
Just trying to see if there are any optimizations (or options) in OpenMPI
to improve communication between intra node processes. For example do they
use something like shared memory?
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and
Thank you. This is very nice!
On Sun, Jul 19, 2015 at 2:25 PM, Ralph Castain wrote:
> Yes
>
> On Jul 19, 2015, at 10:47 AM, Saliya Ekanayake wrote:
>
> So does this mean I can have different options for each process by
> separating them with colons? That'll be ideal fo
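For reference, the colon syntax Ralph is confirming is mpirun's MPMD form:
each colon-separated block carries its own options. A sketch for the
two-process profiling scenario discussed below (port numbers and class name
are hypothetical, and a real jmxremote setup needs the usual auth/ssl flags):

    mpirun -np 1 java -Dcom.sun.management.jmxremote.port=9010 MyApp : \
           -np 1 java -Dcom.sun.management.jmxremote.port=9011 MyApp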
> Wrap the call in a bash script or the like; there are several examples on
> this mailing list.
>
> I am sorry I am not at my computer so cannot find them.
> On 19 Jul 2015 06:34, "Saliya Ekanayake" wrote:
>
>> Hi,
>>
>> I am trying to profile one
port is passed as an option to the java command and not to the
program. Now the port has to be different for the 2 MPI procs and I am not
sure how this could be done.
Any thoughts?
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing
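A minimal sketch of the wrapper-script alternative, deriving a distinct port
from the rank via the environment variable Open MPI exports (base port and
flags are hypothetical):

    #!/bin/bash
    # wrap.sh: give each rank its own profiler port
    port=$((9010 + OMPI_COMM_WORLD_RANK))
    exec java -Dcom.sun.management.jmxremote.port=$port "$@"

launched as: mpirun -np 2 ./wrap.sh MyApp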
> that means they should be optimized for multi
> node / multi tasks per node.
> that being said, ml is not production ready, and i am not sure whether
> hierarch is actively maintained)
>
> i hope this helps
>
> Gilles
>
>
> On 7/9/2015 5:37 AM, Saliya Ekanayake wrote:
>
advantage of shared memory?
[1] https://www.open-mpi.org/faq/?category=sm
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
Just checking if anyone has experienced a similar situation or has any
pointers to understand this.
Thank you
Saliya
On Jul 1, 2015 9:27 PM, "Saliya Ekanayake" wrote:
> Hi,
>
> I am getting strange performance results for allgatherv operation for the
> same number of pr
Is this expected with binding width? I
am a bit puzzled and would appreciate any help to understand this.
[image: Inline image 1]
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 8
for "numa").
On Wed, Jul 1, 2015 at 4:04 PM, Saliya Ekanayake wrote:
> Thank you Ralph
>
> Saliya
>
> On Wed, Jul 1, 2015 at 4:01 PM, Ralph Castain wrote:
>
>> Scenario 2: --map-by ppr:12:node,span --bind-to core
>>
>> will put 12 procs on each node
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
to just 1 core. This is what I don't know
how to do, because if I do --map-by socket:PE=1 then mpirun will put more
than 12 procs per node as it can do so.
I'd appreciate any help on this.
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
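Per Ralph's "Scenario 2" above, ppr with an explicit per-node count plus core
binding should achieve this: at most 12 procs per node, each bound to a single
core (a sketch; ./myapp is a placeholder):

    mpirun --map-by ppr:12:node --bind-to core ./myapp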
Thank you George. This is very informative.
Is it possible to pass the option at runtime rather than setting it up in the
config file?
Thank you
Saliya
On Tue, Jun 30, 2015 at 7:20 PM, George Bosilca wrote:
> Saliya,
>
> On Tue, Jun 30, 2015 at 10:50 AM, Saliya Ekanayake
> wrote:
>
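For reference, any MCA parameter can be set per-run on the mpirun command
line or through an OMPI_MCA_* environment variable, with no config-file edit
(the parameter here is chosen arbitrarily):

    mpirun --mca btl_tcp_if_include eth0 ./myapp
    # or equivalently:
    export OMPI_MCA_btl_tcp_if_include=eth0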
://www.researchgate.net/profile/William_Gropp/publication/221597354_A_Simple_Pipelined_Algorithm_for_Large_Irregular_All-gather_Problems/links/00b49525d291830c6700.pdf
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital
Thank you. It worked!
On Fri, Mar 13, 2015 at 10:37 AM, Ralph Castain wrote:
> You shouldn’t have to do so
>
> On Mar 13, 2015, at 7:14 AM, Saliya Ekanayake wrote:
>
> Thanks Ralph. Do I need to specify where to find numactl-devel when
> compiling OpenMPI?
>
> On Thu,
> As the warning indicates, it can impact performance but won't stop you from
> running
>
>
> On Mar 12, 2015, at 12:51 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> I am getting the following binding warning and wonde
[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core
4[hwt 0]], socket 0[core 5[hwt 0]]:
[B/B/B/B/B/B][./././././.][./././././.][./././././.]
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
> Which OFED version are you running? If not the latest, is it possible to
> upgrade to the latest OFED? Otherwise, can you try the latest OMPI release
> (>= v1.8.4), where this warning is ignored on older OFEDs
>
> -Devendar
>
> On Sun, Feb 8, 2015 at 12:37 PM, Saliya Ekanayake
> wrote:
>
>> Hi,
>>
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
> configure Open MPI with
> --enable-mpi-ext=affinity or --enable-mpi-ext=all). See:
>
> http://www.open-mpi.org/doc/v1.8/man3/OMPI_Affinity_str.3.php
>
>
>
> On Dec 21, 2014, at 1:57 AM, Saliya Ekanayake wrote:
>
> > Hi,
> >
> > Is it possible to get info
> Are you saying the test worked, but you are still encountering an error
> when executing an MPI job? Or are you saying things now work?
>
>
> On Dec 28, 2014, at 5:58 PM, Saliya Ekanayake wrote:
>
> Thank you Ralph. This produced the warning on memory limits similar to [1]
> and setting ul
the ibv_ud_pingpong test - that will exercise
> the portion of the system under discussion.
>
>
> On Dec 28, 2014, at 2:31 PM, Saliya Ekanayake wrote:
>
> What I heard from the administrator is that,
>
> "The tests that work are the simple utilities ib_read_lat and ib_read_bw
>
What I heard from the administrator is that,
"The tests that work are the simple utilities ib_read_lat and ib_read_bw
that measures latency and bandwith between two nodes. They are part of
the "perftest" repo package."
On Dec 28, 2014 10:20 AM, "Saliya Ekanayake" wrote:
btl_openib_connect_udcm.c:736: udcm_module_finalize:
> Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *)
> (&m->cm_recv_msg_queue))->obj_magic_id' failed.
>
> Thank you,
> Saliya
>
> On Mon, Nov 10, 2014 at 10:01 AM, Saliya Ekanayake
> w
_object_t *)
(&m->cm_recv_msg_queue))->obj_magic_id' failed.
Thank you,
Saliya
On Mon, Nov 10, 2014 at 10:01 AM, Saliya Ekanayake
wrote:
> Thank you Jeff, I'll try this and let you know.
>
> Saliya
> On Nov 10, 2014 6:42 AM, "Jeff Squyres (jsquyres)"
> wr
Thank you and one last question. Is it possible to avoid a core and
instruct OMPI to use only the other cores?
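One possibility, assuming the --cpu-set option available in mpirun of this
era (the core list is hypothetical; it excludes core 0 on a 16-core node):

    mpirun -np 8 --cpu-set 1-15 --bind-to core ./myapp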
On Mon, Dec 22, 2014 at 2:08 PM, Ralph Castain wrote:
>
> On Dec 22, 2014, at 10:45 AM, Saliya Ekanayake wrote:
>
> Hi Ralph,
>
> Yes the report bindings show the
> you specified - I believe by default
> we bind to socket when mapping by socket. If you want them bound to core,
> you might need to add —bind-to core.
>
> I can take a look at it - I *thought* we had reset that to bind-to core
> when PE=N was specified, but maybe that got lost.
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
Hi,
Is it possible to get information on the process affinity that's set in
mpirun command within the MPI program? For example I'd like to know the
number of cores that a given rank is bound to.
Thank you
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
additional information to give a clue as to what is happening. :-(
>
>
>
> On Nov 9, 2014, at 11:43 AM, Saliya Ekanayake wrote:
>
> > Hi Jeff,
> >
> > You are probably busy, but just checking if you had a chance to look at
> this.
> >
> > Thanks,
>
Hi Jeff,
You are probably busy, but just checking if you had a chance to look at
this.
Thanks,
Saliya
On Thu, Nov 6, 2014 at 9:19 AM, Saliya Ekanayake wrote:
> Hi Jeff,
>
> I've attached a tar file with information.
>
> Thank you,
> Saliya
>
> On Tue, Nov 4,
>
>
>
> On Nov 4, 2014, at 1:10 PM, Saliya Ekanayake wrote:
>
> > Hi,
> >
> > I am using OpenMPI 1.8.1 in a Linux cluster that we recently setup. It
> builds fine, but when I try to run even the simplest hello.c program it'll
> cause a segfault. Any
Saliya,
>
> Would you mind trying to reproduce the problem using the latest 1.8
> release - 1.8.3?
>
> Thanks,
>
> Howard
>
>
> 2014-11-04 11:10 GMT-07:00 Saliya Ekanayake :
>
>> Hi,
>>
>> I am using OpenMPI 1.8.1 in a Linux cluster that we recently se
1. The ompi_info output is attached.
2. cd to examples directory and mpicc hello_c.c
3. mpirun -np 2 ./a.out
4. Error text is attached.
Please let me know if you need more info.
Thank you,
Saliya
--
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org
Please find inline comments.
On Fri, Aug 22, 2014 at 3:45 PM, Rob Latham wrote:
>
>
> On 08/22/2014 02:40 PM, Saliya Ekanayake wrote:
>
>> Yes, these are all MPI_DOUBLE
>>
>
> well, yeah, but since you are talking about copying into a "direct buffer"
Yes, these are all MPI_DOUBLE
On Fri, Aug 22, 2014 at 3:38 PM, Rob Latham wrote:
>
>
> On 08/22/2014 10:10 AM, Saliya Ekanayake wrote:
>
>> Hi,
>>
>> I've a quick question about the usage of Java binding.
>>
>> Say there's a 2 dimensional double array
s for copying?
Thank you,
Saliya
On Fri, Aug 22, 2014 at 3:24 PM, Oscar Vega-Gisbert
wrote:
> On 22/08/14 20:44, Saliya Ekanayake wrote:
>
> Thank you Oscar for the detailed information, but I'm still wondering how
>> would the copying in 2 would be different t
Thank you Oscar for the detailed information, but I'm still wondering how
the copying in 2 would be different from what's done here with copying
to a buffer.
On Fri, Aug 22, 2014 at 2:17 PM, Oscar Vega-Gisbert
wrote:
> On 22/08/14 17:10, Saliya Ekanayake wrote:
>
>
n and send it
I guess 2 would internally do the copying to a buffer and use it, so
suggesting 1. is the best option. Is this the case or is there a better way
to do this?
Thank you,
Saliya
--
Saliya Ekanayake esal...@gmail.com
http://saliya.org
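A minimal sketch of option 1, copying the 2-D array into a reusable direct
buffer obtained from MPI.newDoubleBuffer, which the bindings can pass to the
native layer without a further JNI copy (sizes, ranks and the tag are
hypothetical; assumes at least 2 ranks):

    import java.nio.DoubleBuffer;
    import mpi.MPI;
    import mpi.MPIException;

    public class FlattenSend {
        public static void main(String[] args) throws MPIException {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.getRank();
            int rows = 4, cols = 8;

            // allocate the direct buffer once and reuse it every iteration
            DoubleBuffer buf = MPI.newDoubleBuffer(rows * cols);

            if (rank == 0) {
                double[][] data = new double[rows][cols];  // the 2-D array
                buf.clear();
                for (double[] row : data)
                    buf.put(row);       // flatten row by row
                buf.clear();            // reset position before handing to MPI
                MPI.COMM_WORLD.send(buf, rows * cols, MPI.DOUBLE, 1, 0);
            } else if (rank == 1) {
                MPI.COMM_WORLD.recv(buf, rows * cols, MPI.DOUBLE, 0, 0);
            }
            MPI.Finalize();
        }
    }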
--
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org
> cluster. Unfortunately, due to a Cray bug, case 80503, that has
> not yet worked.
> Ray
>
>
> On 4/16/2014 4:44 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> We have a Cray XE6/XK7 supercomputer (BigRed II) and I was trying to get
> OpenMPI Java b
It would be great if you
could give some suggestions on how to build OpenMPI with Gemini support.
[1]
https://www.open-mpi.org/papers/cug-2012/cug_2012_open_mpi_for_cray_xe_xk.pdf
Thank you,
Saliya
--
Saliya Ekanayake esal...@gmail.com
http://saliya.org
Just an update. Yes, binding to all is the same as binding to none. I was
misremembering :)
On Fri, Apr 11, 2014 at 1:22 AM, Saliya Ekanayake wrote:
> Thank you Ralph for the details and it's a good point you mentioned on
> mapping by node vs socket. We have another program
ry, and so messaging will run slower -
> and you want the ranks that share a node to be the ones that most
> frequently communicate to each other, if you can identify them.
>
> HTH
> Ralph
>
> On Apr 10, 2014, at 5:59 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> I am eval
to speed up these Tx*1*xN
cases? Also, I expected B to perform better than A as threads could utilize
all 8 cores, but it wasn't the case.
Thank you,
Saliya
[image: Inline image 1]
--
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org
faq/?category=java).
>
>
> On Apr 3, 2014, at 7:09 PM, Saliya Ekanayake wrote:
>
> > Great. I will clean up and send you a tarball.
> >
> > Thank you
> > Saliya
> >
> > On Apr 3, 2014 5:51 PM, "Ralph Castain" wrote:
> > We'd be happ
> Thanks!
> Ralph
>
> On Apr 3, 2014, at 1:44 PM, Saliya Ekanayake wrote:
>
> Hi,
>
> I've been working on some applications in our group where I've been using
> OpenMPI Java binding. Over the course of this work, I've accumulated
> several samples that I w