Re: [OMPI users] Windows support for OpenMPI

2012-12-07 Thread Durga Choudhury
All

Let me reiterate my (minimal, compared to other developers') support for the
OpenMPI project. If all it takes to keep a platform supported is building and
running regression tests on it, I am willing to do that.

The low interest in the Windows platform does not surprise me; most HPC
infrastructures use a Unix-like setup, and the few that do use Windows
likely use Microsoft's own HPC Server and MPI stack rather than OpenMPI.

That said, I believe we should continue supporting Windows for this
one reason, if nothing else: since Windows comes preinstalled on all PCs,
new entrants to the field of computer programming start on a Windows-based
machine. By providing Windows support for OpenMPI, we make the project
accessible to the younger generation and improve the chances that they keep
using it when they enter the work force. For the same reason, the fact that
few people have asked for Windows support explicitly does not mean that few
people are actually using it; the newbie types usually do not make explicit requests.

Thanks
Durga

On Fri, Dec 7, 2012 at 10:28 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Sorry for my late reply; I've been in the MPI Forum and Open MPI
> engineering meetings all week.  Some points:
>
> 1. Yes, it would be a shame to lose all the Windows support that Shiqing
> did.
>
> 2. Microsoft has told me that they're of the mindset "the more, the
> merrier" for their platform (i.e., they'd love to have more than one MPI on
> Windows, but probably can't help develop/support Open MPI on windows).
>  Makes perfect sense to me.
>
> 3. I see that we have 2 volunteers to keep the build support going for the
> v1.6 series, and another volunteer to do continued development for v1.7 and
> beyond.  But all of these would need good reasons to go forward (active
> Open MPI Windows users, financial support, etc.).  It doesn't look like
> there is much support.
>
> 4. I'm bummed to hear that Windows building is broken in 1.6.x.  $%#$%#@!!
>  If anyone wants to take a gander at fixing it, I'd love to see your
> patches, for nothing other than just maintaining Windows support for the
> remainder of the 1.6.x series.  But per #3, it may not be worth it.
>
> 5. Based on this feedback, it seems like we should remove the Windows
> support from the OMPI SVN trunk and all future versions.  It can always be
> resurrected from SVN history if someone wants to pick up this effort again
> in the future.
>
>
> On Dec 6, 2012, at 11:07 AM, Damien wrote:
>
> > So far, I count three people interested in OpenMPI on Windows.  That's
> not a case for ongoing support.
> >
> > Damien
> >
> > On 04/12/2012 11:32 AM, Durga Choudhury wrote:
> >> All
> >>
> >> Since I did not see any Microsoft/other 'official' folks pick up the
> ball, let me step up. I have been lurking in this list for quite a while
> and I am a generic scientific programmer (i.e. I use many frameworks such
> as OpenCL/OpenMP etc, not just MPI)
> >> Although I am primarily a Linux user, I do own multiple versions of
> Visual Studio licenses and have a small cluster that dual boots to
> Windows/Linux (and more nodes can be added on demand). I cannot do any
> large scale testing on this, but I can build and run regression tests etc.
> >>
> >> If the community needs the Windows support to continue, I can take up
> that responsibility, until a more capable person/group is found at least.
> >>
> >> Thanks
> >> Durga
> >>
> >>
> >> On Mon, Dec 3, 2012 at 12:32 PM, Damien <dam...@khubla.com> wrote:
> >> All,
> >>
> >> I completely missed the message about Shiqing departing as the OpenMPI
> Windows maintainer.  I'll try and keep Windows builds going for 1.6 at
> least, I have 2011 and 2013 Intel licenses and VS2008 and 2012, but not
> 2010.  I see that the 1.6.3 code base already doesn't build on Windows in
> VS2012  :-(.
> >>
> >> While I can try and keep builds going, I don't have access to a Windows
> cluster right now, and I'm flat out on two other projects. I can test on my
> workstation, but that will only go so far. Longer-term, there needs to be a
> decision made on whether Windows gets to be a first-class citizen in
> OpenMPI or not.  Jeff's already told me that 1.7 is lagging behind on
> Windows.  It would be a shame to have all the work Shiqing put in gradually
> decay because it can't be supported enough.  If there's any
> Microsoft/HPC/Azure folks observing this list, or any other vendors who run
> on Windows with OpenMPI, maybe we can see what can be done if you're
> interested.
> >>
> >> Damien

Re: [OMPI users] RDMA GPUDirect CUDA...

2012-08-14 Thread Durga Choudhury
Dear OpenMPI developers

I'd like to add my 2 cents that this would be a very desirable feature
enhancement for me as well (and perhaps others).

Best regards
Durga


On Tue, Aug 14, 2012 at 4:29 PM, Zbigniew Koza  wrote:

> Hi,
>
> I've just found this information on  nVidia's plans regarding enhanced
> support for MPI in their CUDA toolkit:
> http://developer.nvidia.com/cuda/nvidia-gpudirect
>
> The idea that two GPUs can talk to each other via network cards without
> CPU as a middleman looks very promising.
> This technology is supposed to be revealed and released in September.
>
> My questions:
>
> 1. Will OpenMPI include   RDMA support in its CUDA interface?
> 2. Any idea how much can this technology reduce the CUDA Send/Recv latency?
> 3. Any idea whether this technology will be available for Fermi-class
> Tesla devices or only for Keplers?
>
> Regards,
>
> Z Koza
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] system() call corrupts MPI processes

2012-01-19 Thread Durga Choudhury
This is just a thought:

according to the system() man page, 'SIGCHLD' is blocked during the
execution of the program. Since you are executing your command as a
daemon in the background, it will be permanently blocked.

Does the OpenMPI daemon depend on SIGCHLD in any way? That is about the
only difference that I can think of between running the command
stand-alone (which works) and running it via a system() API call (which
does not work).
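
For what it is worth, here is the kind of alternative I would try: replace
the system() call with an explicit fork()/exec(), so that no shell is
involved and the parent's signal handling (SIGCHLD in particular) is left
alone. This is only an untested sketch; the script path and port are
placeholders standing in for whatever the original sprintf() built, and
whether fork() inside an MPI process is safe at all is a separate question.

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Hypothetical replacement for system("script -p port &"): fork a
     * child and exec the daemon directly, so the parent never goes
     * through system()'s SIGCHLD blocking. */
    static pid_t spawn_daemon(const char *script, int port)
    {
        pid_t pid = fork();
        if (pid == 0) {                      /* child */
            char portbuf[16];
            snprintf(portbuf, sizeof(portbuf), "%d", port);
            execlp(script, script, "-p", portbuf, (char *)NULL);
            _exit(127);                      /* exec failed */
        }
        return pid;                          /* parent: child pid, or -1 */
    }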

Best
Durga


On Thu, Jan 19, 2012 at 9:52 AM, Jeff Squyres  wrote:
> Which network transport are you using, and what version of Open MPI are you 
> using?  Do you have OpenFabrics support compiled into your Open MPI 
> installation?
>
> If you're just using TCP and/or shared memory, I can't think of a reason 
> immediately as to why this wouldn't work, but there may be a subtle 
> interaction in there somewhere that causes badness (e.g., memory corruption).
>
>
> On Jan 19, 2012, at 1:57 AM, Randolph Pullen wrote:
>
>>
>> I have a section in my code running in rank 0 that must start a perl program 
>> that it then connects to via a tcp socket.
>> The initialisation section is shown here:
>>
>>     sprintf(buf, "%s/session_server.pl -p %d &", PATH,port);
>>     int i = system(buf);
>>     printf("system returned %d\n", i);
>>
>>
>> Some time after I run this code, while waiting for the data from the perl 
>> program, the error below occurs:
>>
>> qplan connection
>> DCsession_fetch: waiting for Mcode data...
>> [dc1:05387] [[40050,1],0] ORTE_ERROR_LOG: A message is attempting to be sent 
>> to a process whose contact information is unknown in file rml_oob_send.c at 
>> line 105
>> [dc1:05387] [[40050,1],0] could not get route to [[INVALID],INVALID]
>> [dc1:05387] [[40050,1],0] ORTE_ERROR_LOG: A message is attempting to be sent 
>> to a process whose contact information is unknown in file 
>> base/plm_base_proxy.c at line 86
>>
>>
>> It seems that the linux system() call is breaking OpenMPI internal 
>> connections.  Removing the system() call and executing the perl code 
>> externaly fixes the problem but I can't go into production like that as its 
>> a security issue.
>>
>> Any ideas ?
>>
>> (environment: OpenMPI 1.4.1 on kernel Linux dc1 
>> 2.6.18-274.3.1.el5.028stab094.3  using TCP and mpirun)
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] [Beowulf] How to justify the use MPI codes on multicore systems/PCs?

2011-12-12 Thread Durga Choudhury
I think this is a *great* topic for discussion, so let me throw some
fuel to the fire: the mechanism described in the blog (that makes
perfect sense) is fine for (N)UMA shared memory architectures. But
will it work for asymmetric architectures such as the Cell BE or
discrete GPUs, where the data between the compute nodes has to be
explicitly DMA'd in? Is there a middleware layer that makes it
transparent to the upper layer software?

Best regards
Durga

On Mon, Dec 12, 2011 at 11:00 AM, Rayson Ho  wrote:
> On Sat, Dec 10, 2011 at 3:21 PM, amjad ali  wrote:
>> (2) The latest MPI implementations are intelligent enough that they use some
>> efficient mechanism while executing MPI based codes on shared memory
>> (multicore) machines.  (please tell me any reference to quote this fact).
>
> Not an academic paper, but from a real MPI library developer/architect:
>
> http://blogs.cisco.com/performance/shared-memory-as-an-mpi-transport/
> http://blogs.cisco.com/performance/shared-memory-as-an-mpi-transport-part-2/
>
> Open MPI is used by Japan's K computer (current #1 TOP 500 computer)
> and LANL's RoadRunner (#1 Jun 08 – Nov 09), and "10^16 Flops Can't Be
> Wrong" and "10^15 Flops Can't Be Wrong":
>
> http://www.open-mpi.org/papers/sc-2008/jsquyres-cisco-booth-talk-2up.pdf
>
> Rayson
>
> =
> Grid Engine / Open Grid Scheduler
> http://gridscheduler.sourceforge.net/
>
> Scalable Grid Engine Support Program
> http://www.scalablelogic.com/
>
>
>>
>>
>> Please help me in formally justifying this and comment/modify above two
>> justifications. Better if I you can suggent me to quote some reference of
>> any suitable publication in this regard.
>>
>> best regards,
>> Amjad Ali
>>
>> ___
>> Beowulf mailing list, beow...@beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
>
>
> --
> Rayson
>
> ==
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Shared-memory problems

2011-11-03 Thread Durga Choudhury
Since /tmp is mounted across a network and /dev/shm is (always) local,
/dev/shm seems to be the right place for shared memory transactions.
If you create temporary files using mktemp, are they being created in
/dev/shm or /tmp?


On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu  wrote:
> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L  wrote:
>> -    /dev/shm is 12 GB and has 755 permissions
>> ...
>> % ls –l output:
>>
>> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm
>
> This is your problem: it should be something like drwxrwxrwt. It might
> depend on the distribution, f.e. the following show this to be a bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=533897
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>
> and surely you can find some more on the subject with your favorite
> search engine. Another source could be a paranoid sysadmin who has
> changed the default (most likely correct) setting the distribution
> came with - not only OpenMPI but any application using shmem would be
> affected..
>
> Cheers,
> Bogdan
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] problème with MPI_FINALIZE

2011-11-02 Thread Durga Choudhury
Any particular reason these calls don't nest? In some other HPC-like
paradigms (e.g. VSIPL) such calls are allowed to nest (i.e. only the
finalize() that matches the first init() will destroy allocated
resources.)

Just a curiosity question, doesn't really concern me in any particular way.
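
For the original poster, though, the usual guard I have seen (and what I
would try) is to ask MPI for its state instead of calling MPI_Init()
unconditionally inside the function. A minimal, untested sketch:

    #include <mpi.h>

    /* A function that may be called before, between, or after the
     * caller's own MPI_Init/MPI_Finalize.  It only initializes MPI if
     * that has not happened yet, and never re-initializes after
     * MPI_Finalize (which the standard forbids). */
    void use_mpi_if_possible(void)
    {
        int initialized = 0, finalized = 0;
        MPI_Initialized(&initialized);
        MPI_Finalized(&finalized);

        if (!initialized && !finalized)
            MPI_Init(NULL, NULL);

        if (!finalized) {
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            /* ... do the MPI work here ... */
        }
    }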

Best regards
Durga


2011/11/2 Jeff Squyres (jsquyres) :
> Did you call MPI-INIT after you called MPI-finalize?  If so, you're not 
> allowed to do that. Call. MPI-INIT once and call MPI-finalize once.
>
> Sent from my phone. No type good.
>
> On Nov 1, 2011, at 2:45 PM, "amine mrabet"  wrote:
>
>> hey
>>
>> i'm new in mpi , i try tu use  mpi inside of function and i have this error 
>> messag
>>
>> An error occurred in MPI_Init
>> *** after MPI was finalized
>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>> [dellam:16806] Abort before MPI_INIT completed successfully; not able to 
>> guarantee that all other processes were killed!
>>
>> maybe i cant use mpi inside of function ?
>>
>> --
>> amine mrabet
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] configure with cuda

2011-10-27 Thread Durga Choudhury
Is there any provision/future plans to add OpenCL support as well?
CUDA is an Nvidia-only technology, so it might be a bit limiting in
some cases.

Best regards
Durga


On Thu, Oct 27, 2011 at 2:45 PM, Rolf vandeVaart  wrote:
> Actually, that is not quite right.  From the FAQ:
>
>
>
> “This feature currently only exists in the trunk version of the Open MPI
> library.”
>
>
>
> You need to download and use the trunk version for this to work.
>
>
>
> http://www.open-mpi.org/nightly/trunk/
>
>
>
> Rolf
>
>
>
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Ralph Castain
> Sent: Thursday, October 27, 2011 11:43 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] configure with cuda
>
>
>
>
>
> I'm pretty sure cuda support was never moved to the 1.4 series. You will,
> however, find it in the 1.5 series. I suggest you get the latest tarball
> from there.
>
>
>
>
>
> On Oct 27, 2011, at 12:38 PM, Peter Wells wrote:
>
>
>
> I am attempting to configure OpenMPI 1.4.3 with cuda support on a Redhat 5
> box. When I try to run configure with the following command:
>
>
>
>  ./configure
> --prefix=/opt/crc/sandbox/pwells2/openmpi/1.4.3/intel-12.0-cuda/ FC=ifort
> F77=ifort CXX=icpc CC=icc --with-sge --disable-dlopen --enable-static
> --enable-shared --disable-openib-connectx-xrc --disable-openib-rdmacm
> --without-openib --with-cuda=/opt/crc/cuda/4.0/cuda
> --with-cuda-libdir=/opt/crc/cuda/4.0/cuda/lib64
>
>
>
> I receive the warning that '--with-cuda' and '--with-cuda-libdir' are
> unrecognized options. According to the FAQ these options are supported in
> this version of OpenMPI. I attempted the same thing with v.1.4.4 downloaded
> directly from open-mpi.org with similar results. Attached are the results of
> configure and make on v.1.4.3. Any help would be greatly appreciated.
>
>
>
> Peter Wells
> HPC Intern
> Center for Research Computing
> University of Notre Dame
> pwel...@nd.edu
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> 
> This email message is for the sole use of the intended recipient(s) and may
> contain confidential information.  Any unauthorized review, use, disclosure
> or distribution is prohibited.  If you are not the intended recipient,
> please contact the sender by reply email and destroy all copies of the
> original message.
> 
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Memory mapped memory

2011-10-17 Thread Durga Choudhury
If the mmap() pages are created with MAP_SHARED, then they should be
sharable with other processes on the same node, shouldn't they? MPI
processes are just like any other process, aren't they? Will one of
the MPI gurus please comment?
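
To make the question concrete, this is roughly what I have in mind, as an
untested sketch. Since unrelated processes cannot share an anonymous
mapping, it uses a named POSIX shared-memory object; the object name is a
placeholder, and a real code would first have to figure out which ranks
are actually on the same node.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <mpi.h>

    /* One rank per node creates the object, the others open it, and
     * everyone maps it MAP_SHARED.  Assumes every rank in the
     * communicator lives on the same node. */
    void *map_shared_region(size_t len, int rank)
    {
        const char *name = "/my_shared_region";    /* hypothetical name */
        int fd;

        if (rank == 0) {
            fd = shm_open(name, O_CREAT | O_RDWR, 0600);
            ftruncate(fd, len);
        }
        MPI_Barrier(MPI_COMM_WORLD);               /* wait until it exists */
        if (rank != 0)
            fd = shm_open(name, O_RDWR, 0600);

        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        close(fd);                                 /* the mapping stays valid */
        return p;
    }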

Regards
Durga


On Mon, Oct 17, 2011 at 9:45 AM, Gabriele Fatigati  wrote:
> More in detail,
> is it possible use mmap() function from MPI process and sharing these memory
> between others processes?
>
> 2011/10/13 Gabriele Fatigati 
>>
>> Dear OpenMPI users and developers,
>> is there some limitation or issues to use memory mapped memory into MPI
> >> processes? I would like to share some memory in a node without using OpenMP.
>> Thanks a lot.
>>
>> --
>> Ing. Gabriele Fatigati
>>
>> HPC specialist
>>
>> SuperComputing Applications and Innovation Department
>>
>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>
>> www.cineca.it                    Tel:   +39 051 6171722
>>
>> g.fatigati [AT] cineca.it
>
>
>
> --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it                    Tel:   +39 051 6171722
>
> g.fatigati [AT] cineca.it
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Related to project ideas in OpenMPI

2011-08-25 Thread Durga Choudhury
Is anything done at the kernel level portable (e.g. to Windows)? It
*can* be, in principle at least (by putting appropriate #ifdef's in
the code), but I am wondering if it is in reality.

Also, in 2005 there was an attempt to implement SSI (Single System
Image) functionality to the then-current 2.6.10 kernel. The proposal
was very detailed and covered most of the bases of task creation, PID
allocation etc across a loosely tied cluster (without using fancy
hardware such as RDMA fabric). Anybody knows if it was ever
implemented? Any pointers in this direction?

Thanks and regards
Durga


On Thu, Aug 25, 2011 at 11:08 AM, Rayson Ho  wrote:
> Srinivas,
>
> There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
> if you can checkpoint an MPI task and restart it on a new node, then
> this is also "process migration".
>
> Of course, doing a checkpoint & restart can be slower than pure
> in-kernel process migration, but the advantage is that you don't need
> any kernel support, and can in fact do all of it in user-space.
>
> Rayson
>
>
> On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain  wrote:
>> It also depends on what part of migration interests you - are you wanting to 
>> look at the MPI part of the problem (reconnecting MPI transports, ensuring 
>> messages are not lost, etc.) or the RTE part of the problem (where to 
>> restart processes, detecting failures, etc.)?
>>
>>
>> On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:
>>
>>> Be aware that process migration is a pretty complex issue.
>>>
>>> Josh is probably the best one to answer your question directly, but he's 
>>> out today.
>>>
>>>
>>> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>>>
 I am final year grad student looking for my final year project in 
 OpenMPI.We are group of 4 students.
 I wanted to know about the "Process Migration" process of MPI processes in 
 OpenMPI.
 Can anyone suggest me any ideas for project related to process migration 
 in OenMPI or other topics in Systems.



 regards,
 Srinivas Kundaram
 srinu1...@gmail.com
 +91-8149399160
 ___
 users mailing list
 us...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Rayson
>
> ==
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] multi-threaded programming

2011-03-08 Thread Durga Choudhury
A follow-up question (and pardon me if this sounds stupid) is this:

If I want to make my process multithreaded, BUT only one thread has
anything to do with MPI (for example, using OpenMP inside MPI), then
the results will be correct EVEN IF #1 or #2 of Eugene holds true. Is
this correct?
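
In code, the structure I have in mind is roughly the untested sketch
below: request MPI_THREAD_FUNNELED, let the OpenMP threads do pure
computation, and let only the main thread (outside the parallel region)
touch MPI.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int provided, rank;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double local = 0.0, global = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < 1000; i++)
            local += i * 0.5;        /* compute threads never call MPI */

        /* only the main thread, outside the parallel region, calls MPI */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum = %f\n", global);

        MPI_Finalize();
        return 0;
    }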

Thanks
Durga

On Tue, Mar 8, 2011 at 12:34 PM, Eugene Loh  wrote:
> Let's say you have multi-threaded MPI processes, you request
> MPI_THREAD_MULTIPLE and get MPI_THREAD_MULTIPLE, and you use the self,sm,tcp
> BTLs (which have some degree of threading support).  Is it okay to have an
> [MPI_Isend|MPI_Irecv] on one thread be completed by an MPI_Wait on another
> thread?  I'm assuming some sort of synchronization and memory barrier/flush
> in between to protect against funny race conditions.
>
> If it makes things any easier on you, we can do this multiple-choice style:
>
> 1)  Forbidden by the MPI standard.
> 2)  Not forbidden by the MPI standard, but will not work with OMPI (not even
> with the BTLs that claim to be multi-threaded).
> 3)  Works well with OMPI (provided you use a BTL that's multi-threaded).
>
> It's looking like #2 to me, but I'm not sure.
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Pros and cons of --enable-heterogeneous

2010-10-07 Thread Durga Choudhury
I'd like to add to this question the following:

If I compile with the --enable-heterogeneous flag for different
*architectures* (I have a mix of old 32-bit x86, newer x86_64, and some
Cell BE based boxes (PS3)), would I be able to form an MPD ring between
all these different machines?

Best regards
Durga

On Thu, Oct 7, 2010 at 3:44 PM, David Ronis  wrote:
> I have various boxes that run openmpi and I can't seem to use all of
> them at once because they have different CPU's (e.g., pentiums, athlons
> (both 32 bit) vs Intel I7 (64 bit)).   I'm about the build 1.4.3 and was
> wondering if I should add --enable-heterogenous to the configure flags.
> Any advice as to why or why not would be appreciated.
>
> David
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Shared memory

2010-09-24 Thread Durga Choudhury
I think the 'middle ground' approach can be simplified even further if
the data file lives on a shared device (e.g. an NFS/Samba mount) that is
mounted at the same location in the file system tree on all nodes. I
have never tried it, though, and mmap()'ing a non-POSIX-compliant file
system such as Samba might have issues I am unaware of.

However, I do not see why you should not be able to do this even if
the file is being written to, as long as you call msync() before using
the mapped pages.
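
Roughly what I mean, as an untested sketch (the file name is a placeholder
for the real data file on the shared mount): every process simply maps the
same file read-only, and within a node the kernel shares the pages between
the processes mapping it.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map the common input file read-only; pages are demand-loaded and,
     * within a node, shared by every process that maps the same file. */
    const void *map_input_file(size_t *len_out)
    {
        int fd = open("input.dat", O_RDONLY);      /* placeholder path */
        struct stat st;
        fstat(fd, &st);

        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);

        /* If the file may still be changing, my (unverified) understanding
         * is that something like msync(p, st.st_size, MS_INVALIDATE)
         * discards stale cached pages before you read them. */
        *len_out = st.st_size;
        return p;
    }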

Durga


On Fri, Sep 24, 2010 at 12:31 PM, Eugene Loh  wrote:
> It seems to me there are two extremes.
>
> One is that you replicate the data for each process.  This has the
> disadvantage of consuming lots of memory "unnecessarily."
>
> Another extreme is that shared data is distributed over all processes.  This
> has the disadvantage of making at least some of the data less accessible,
> whether in programming complexity and/or run-time performance.
>
> I'm not familiar with Global Arrays.  I was somewhat familiar with HPF.  I
> think the natural thing to do with those programming models is to distribute
> data over all processes, which may relieve the excessive memory consumption
> you're trying to address but which may also just put you at a different
> "extreme" of this spectrum.
>
> The middle ground I think might make most sense would be to share data only
> within a node, but to replicate the data for each node.  There are probably
> multiple ways of doing this -- possibly even GA, I don't know.  One way
> might be to use one MPI process per node, with OMP multithreading within
> each process|node.  Or (and I thought this was the solution you were looking
> for), have some idea which processes are collocal.  Have one process per
> node create and initialize some shared memory -- mmap, perhaps, or SysV
> shared memory.  Then, have its peers map the same shared memory into their
> address spaces.
>
> You asked what source code changes would be required.  It depends.  If
> you're going to mmap shared memory in on each node, you need to know which
> processes are collocal.  If you're willing to constrain how processes are
> mapped to nodes, this could be easy.  (E.g., "every 4 processes are
> collocal".)  If you want to discover dynamically at run time which are
> collocal, it would be harder.  The mmap stuff could be in a stand-alone
> function of about a dozen lines.  If the shared area is allocated as one
> piece, substituting the single malloc() call with a call to your mmap
> function should be simple.  If you have many malloc()s you're trying to
> replace, it's harder.
>
> Andrei Fokau wrote:
>
> The data are read from a file and processed before calculations begin, so I
> think that mapping will not work in our case.
> Global Arrays look promising indeed. As I said, we need to put just a part
> of data to the shared section. John, do you (or may be other users) have an
> experience of working with GA?
> http://www.emsl.pnl.gov/docs/global/um/build.html
> When GA runs with MPI:
> MPI_Init(..)      ! start MPI
> GA_Initialize()   ! start global arrays
> MA_Init(..)       ! start memory allocator
>     do work
> GA_Terminate()    ! tidy up global arrays
> MPI_Finalize()    ! tidy up MPI
>                   ! exit program
> On Fri, Sep 24, 2010 at 13:44, Reuti  wrote:
>>
>> Am 24.09.2010 um 13:26 schrieb John Hearns:
>>
>> > On 24 September 2010 08:46, Andrei Fokau 
>> > wrote:
>> >> We use a C-program which consumes a lot of memory per process (up to
>> >> few
>> >> GB), 99% of the data being the same for each process. So for us it
>> >> would be
>> >> quite reasonable to put that part of data in a shared memory.
>> >
>> > http://www.emsl.pnl.gov/docs/global/
>> >
>> > Is this eny help? Apologies if I'm talking through my hat.
>>
>> I was also thinking of this when I read "data in a shared memory" (besides
>> approaches like http://www.kerrighed.org/wiki/index.php/Main_Page). Wasn't
>> this also one idea behind "High Performance Fortran" - running in parallel
>> across nodes even without knowing that it's across nodes at all while
>> programming and access all data like it's being local.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Shared memory

2010-09-24 Thread Durga Choudhury
Is the data coming from a read-only file? In that case, a better way
might be to memory-map that file in the root process and share the
mapping with all the slave threads. This, like shared memory, will work
only for processes within a node, of course.


On Fri, Sep 24, 2010 at 3:46 AM, Andrei Fokau
 wrote:
> We use a C-program which consumes a lot of memory per process (up to few
> GB), 99% of the data being the same for each process. So for us it would be
> quite reasonable to put that part of data in a shared memory.
> In the source code, the memory is allocated via malloc() function. What
> would it require for us to change in the source code to be able to put that
> repeating data in a shared memory?
> The code is normally run on several nodes.
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Number of Sockets used by OpenMPI

2009-11-15 Thread Durga Choudhury
Thanks, George

This would be an invaluable reference to me.

Best regards
Durga

On Sun, Nov 15, 2009 at 6:53 PM, George Bosilca <bosi...@eecs.utk.edu> wrote:
> Durga,
>
> You can find the answer to your questions in
> http://www.netlib.org/netlib/utk/people/JackDongarra/PAPERS/scop3.pdf.
>
>  george.
>
>
> On Nov 15, 2009, at 14:39 , Durga Choudhury wrote:
>
>> I apologize for dragging this conversation in a different
>> direction, but I'd be very interested to know why the behavior with
>> the Playstation is different from other architectures. The PS3 box has
>> a single gigabit ethernet port and no expansion ports, so I'd assume its
>> behavior would be no different from, e.g., a regular PC using the TCP
>> BTL. Perhaps it has something to do with the Cell BE architecture,
>> then. What was the reasoning behind this decision?
>>
>> I am keen to know about such 'hybrid' parallel programming paradigm,
>> e.g. using Cell BE or NUMA or CUDA on top of an MPI (or even a grid
>> topology). I'd appreciate any pointers to any material in this
>> regards.
>>
>> Durga
>>
>> On Sun, Nov 15, 2009 at 4:48 PM, George Bosilca <bosi...@eecs.utk.edu>
>> wrote:
>>>
>>> By default only one socket per peer per physical network is opened.
>>> However,
>>> Open MPI has the possibility to open multiple socket per peer per
>>> network,
>>> based on some experiments with the Playstation (where having multiple
>>> socket
>>> allow for more bandwidth). The MCA parameter that allows such behavior is
>>> btl_tcp_links.
>>>
>>>  george.
>>>
>>> On Nov 13, 2009, at 17:59 , Charles Salvia wrote:
>>>
>>>> When using TCP, how many sockets does each process open per
>>>> peer-process?
>>>>  Does each process open a single socket to connect to each peer-process,
>>>> or
>>>> does it use TWO sockets, one for sending, one for receiving?
>>>>
>>>> Thanks,
>>>>
>>>> -Charles Salvia
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Number of Sockets used by OpenMPI

2009-11-15 Thread Durga Choudhury
I apologize for dragging this conversation in a different
direction, but I'd be very interested to know why the behavior with
the Playstation is different from other architectures. The PS3 box has
a single gigabit ethernet port and no expansion ports, so I'd assume its
behavior would be no different from, e.g., a regular PC using the TCP
BTL. Perhaps it has something to do with the Cell BE architecture,
then. What was the reasoning behind this decision?

I am keen to know about such 'hybrid' parallel programming paradigm,
e.g. using Cell BE or NUMA or CUDA on top of an MPI (or even a grid
topology). I'd appreciate any pointers to any material in this
regards.

Durga

On Sun, Nov 15, 2009 at 4:48 PM, George Bosilca  wrote:
> By default only one socket per peer per physical network is opened. However,
> Open MPI has the possibility to open multiple socket per peer per network,
> based on some experiments with the Playstation (where having multiple socket
> allow for more bandwidth). The MCA parameter that allows such behavior is
> btl_tcp_links.
>
>  george.
>
> On Nov 13, 2009, at 17:59 , Charles Salvia wrote:
>
>> When using TCP, how many sockets does each process open per peer-process?
>>  Does each process open a single socket to connect to each peer-process, or
>> does it use TWO sockets, one for sending, one for receiving?
>>
>> Thanks,
>>
>> -Charles Salvia
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] fault tolerance in open mpi

2009-08-03 Thread Durga Choudhury
Is that kind of approach possible within an MPI framework? Perhaps a
grid approach would be better. More experienced people, speak up,
please?
(The reason I say that is that I too am interested in the solution of
that kind of problem, where an individual blade of a blade server
fails and correcting for that failure on the fly is better than taking
checkpoints and restarting the whole process excluding the failed
blade.)

Durga

On Mon, Aug 3, 2009 at 9:21 AM, jody<jody@gmail.com> wrote:
> Hi
>
> I guess "task-farming" could give you a certain amount of the kind of
> fault-tolerance you want.
> (i.e. a master process distributes tasks to idle slave processors -
> however, this will only work
> if the slave processes don't need to communicate with each other)
>
> Jody
>
>
> On Mon, Aug 3, 2009 at 1:24 PM, vipin kumar<vipinkuma...@gmail.com> wrote:
>> Hi all,
>>
>> Thanks Durga for your reply.
>>
>> Jeff, once you wrote code for Mandelbrot set to demonstrate fault tolerance
>> in LAM-MPI. i. e. killing any slave process doesn't
>> affect others. Exact behaviour I am looking for in Open MPI. I attempted,
>> but no luck. Can you please tell how to write such programs in Open MPI.
>>
>> Thanks in advance.
>>
>> Regards,
>> On Thu, Jul 9, 2009 at 8:30 PM, Durga Choudhury <dpcho...@gmail.com> wrote:
>>>
>>> Although I have perhaps the least experience on the topic in this
>>> list, I will take a shot; more experienced people, please correct me:
>>>
>>> MPI standards specify communication mechanism, not fault tolerance at
>>> any level. You may achieve network tolerance at the IP level by
>>> implementing 'equal cost multipath' routes (which means two equally
>>> capable NIC cards connecting to the same destination and modifying the
>>> kernel routing table to use both cards; the kernel will dynamically
>>> load balance.). At the MAC level, you can achieve the same effect by
>>> trunking multiple network cards.
>>>
>>> You can achieve process level fault tolerance by a checkpointing
>>> scheme such as BLCR, which has been tested to work with OpenMPI (and
>>> other processes as well)
>>>
>>> Durga
>>>
>>> On Thu, Jul 9, 2009 at 4:57 AM, vipin kumar<vipinkuma...@gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I want to know whether open mpi supports Network and process fault
>>> > tolerance
>>> > or not? If there is any example demonstrating these features that will
>>> > be
>>> > best.
>>> >
>>> > Regards,
>>> > --
>>> > Vipin K.
>>> > Research Engineer,
>>> > C-DOTB, India
>>> >
>>> > ___
>>> > users mailing list
>>> > us...@open-mpi.org
>>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> >
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> --
>> Vipin K.
>> Research Engineer,
>> C-DOTB, India
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Network connection check

2009-07-23 Thread Durga Choudhury
The 'system' command will fork a separate process to run the command. If I
remember correctly, forking within MPI can lead to undefined behavior.
Can someone on the OpenMPI development team clarify?

What I don't understand is: why is your TCP network so unstable that
you are worried about reachability? For MPI to run, the nodes should be
connected on a local switch with a high-bandwidth interconnect, not
dispersed across the internet. Perhaps you should look at the
underlying cause of the network instability. If your network is actually
stable, then your problem is only theoretical.

Also, keep in mind that TCP itself offers a keepalive mechanism. Three
parameters may be specified: the amount of inactivity after which the
first probe is sent, the number of unanswered probes after which the
connection is dropped, and the interval between the probes. Typing
'sysctl -a' will print the entire IP MIB that contains these parameters
(I don't remember their names off the top of my head). However, you say
that you *don't* want to drop the connection, you simply want to know
about connectivity. What you can do, without causing 'undefined' MPI
behaviour, is implement a similar mechanism in your MPI application.
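
Something along these lines is what I mean by an application-level
keepalive (untested sketch; the tag and the timeout are arbitrary, and it
assumes the peer periodically sends a small "I am alive" message):

    #include <mpi.h>

    /* Poll for a heartbeat from 'peer' with MPI_Test and give up after a
     * wall-clock timeout.  Returns 1 if the heartbeat arrived in time. */
    int heard_from(int peer, double timeout_sec)
    {
        int beat = 0, done = 0;
        MPI_Request req;

        MPI_Irecv(&beat, 1, MPI_INT, peer, 999, MPI_COMM_WORLD, &req);

        double t0 = MPI_Wtime();
        while (!done && (MPI_Wtime() - t0) < timeout_sec)
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);

        if (!done) {                            /* nothing arrived in time */
            MPI_Cancel(&req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);  /* complete the cancelled request */
        }
        return done;
    }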

Durga


On Thu, Jul 23, 2009 at 10:25 AM, vipin kumar wrote:
> Thank you all Jeff, Jody, Prentice and Bogdan for your invaluable
> clarification, solution and suggestion,
>
>> Open MPI should return a failure if TCP connectivity is lost, even with a
>> non-blocking point-to-point operation.  The failure should be returned in
>> the call to MPI_TEST (and friends).
>
> even if MPI_TEST is a local operation?
>
>>
>>  So I'm not sure your timeout has meaning here -- if you reach the
>> timeout, I think it simply means that the MPI communication has not
>> completed yet.  It does not necessarily mean that the MPI communication has
>> failed.
>
> you are absolutely correct., but the job should be done before it expires.
> that's the reason I am using TIMEOUT.
>
> So the conclusion is :
>>
>> MPI doesn't provide any standard way to check reachability and/or health
>> of a peer process.
>
> That's what I wanted to confirm. And to find out the solution, if any, or
> any alternative.
>
> So now I think, I should go for Jody's approach
>
>>
>> How about you start your MPI program from a shell script that does the
>> following:
>>
>> 1. Reads a text file containing the names of all the possible candidates
>>  for MPI nodes
>>
>> 2. Loops through the list of names from (1) and pings each machine to
>> see if it's alive. If the host is pingable, then write it's name to a
>> different text file which will be host as the machine file for the
>> mpirun command
>
>
>>
>> 3. Call mpirun using the machine file generated in (2).
>
> I am assuming processes have been launched successfully.
>
>
>
> --
> Vipin K.
> Research Engineer,
> C-DOTB, India
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] fault tolerance in open mpi

2009-07-09 Thread Durga Choudhury
Although I have perhaps the least experience on the topic in this
list, I will take a shot; more experienced people, please correct me:

MPI standards specify communication mechanism, not fault tolerance at
any level. You may achieve network tolerance at the IP level by
implementing 'equal cost multipath' routes (which means two equally
capable NIC cards connecting to the same destination and modifying the
kernel routing table to use both cards; the kernel will dynamically
load balance.). At the MAC level, you can achieve the same effect by
trunking multiple network cards.

You can achieve process level fault tolerance by a checkpointing
scheme such as BLCR, which has been tested to work with OpenMPI (and
other processes as well)

Durga

On Thu, Jul 9, 2009 at 4:57 AM, vipin kumar wrote:
>
> Hi all,
>
> I want to know whether open mpi supports Network and process fault tolerance
> or not? If there is any example demonstrating these features that will be
> best.
>
> Regards,
> --
> Vipin K.
> Research Engineer,
> C-DOTB, India
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Apllication level checkpointing tools.

2009-06-30 Thread Durga Choudhury
Josh

This actually is a concern addressed to all the authors/OpenMPI
contributors. The links to IEEE Xplore or the ACM require a subscription
which, unfortunately, not all the list subscribers have.

Would it be a copyright violation to post the actual paper/article to
the list instead of just a link? That would be much appreciated by
many of the readers.

Best regards
Durga

On Tue, Jun 30, 2009 at 10:08 AM, Josh Hursey wrote:
> Checkpoint/restart in Open MPI supports TCP, Shared Memory, Infiniband, and
> Myrinet interconnects (possibly others, but they have not been tested) [1].
> Is this what you are looking for?
>
> -- Josh
>
> [1] Hursey, J., Mattox, T. I., and Lumsdaine, A. 2009. "Interconnect
> agnostic checkpoint/restart in Open MPI"
> http://doi.acm.org/10.1145/1551609.1551619
>
> On Jun 30, 2009, at 9:00 AM, nee...@crlindia.com wrote:
>
>>
>> Dear Mohamed,
>>
>>        Is there some checkpointing software for interconnect other than
>> tcp say IB or Myrinet?
>>
>> Regards
>>
>> Neeraj Chourasia (MTS)
>> Computational Research Laboratories Ltd.
>> (A wholly Owned Subsidiary of TATA SONS Ltd)
>> B-101, ICC Trade Towers, Senapati Bapat Road
>> Pune 411016 (Mah) INDIA
>> (O) +91-20-6620 9863  (Fax) +91-20-6620 9862
>> M: +91.9225520634
>>
>>
>>
>> Mohamed Slim bouguerra 
>> Sent by: users-boun...@open-mpi.org
>> 06/30/2009 05:42 PM
>> Please respond to
>> Open MPI Users 
>>
>>
>> To
>> Open MPI Users 
>> cc
>> Subject
>> Re: [OMPI users] Apllication level checkpointing tools.
>>
>>
>>
>>
>>
>> Dear Kritiraj,
>> You can use DMTCP  http://sourceforge.net/projects/dmtcp
>>
>> Le 30 juin 09 à 13:59, Kritiraj Sajadah a écrit :
>>
>> >
>> > Daer All,
>> >          I have successfully comfigure OPENMPI with BLCR and id some
>> > test. hover, i now want to do some testing with an Application Level
>> > checkpointng tools.  I tried using libckpt but could not install it.
>> >
>> > Do anyone of you know any open source application level
>> > checkpointing tools available that i can install and test with
>> > openmpi?
>> >
>> > Thank you
>> >
>> > Regards,
>> >
>> > Raj
>> >
>> >
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] How to override MPI functions such as MPI_Init, MPI_Recv...

2009-05-13 Thread Durga Choudhury
You could use a separate namespace (if you are using C++) and define
your functions there...

Durga

On Wed, May 13, 2009 at 1:20 PM, Le Duy Khanh  wrote:
> Dear,
>
>  I intend to override some MPI functions such as MPI_Init, MPI_Recv... but I
> don't want to dig into OpenMPI source code.Therefore, I am thinking of a way
> to create a lib called "mympi.h" in which I will #include "mpi.h" to
> override those functions. I will create a new interface with exactly the
> same signatures like MPI_Init (because users are familiar with those
> functions). However, the problem is that I don't know how to override those
> functions because as I know, C/C++ doesn't allow us to override functions
> (only overload them).
>
>  Could you please show me how to override OMPI functions but still keep the
> same function names and signatures?
>
>  Thank you so much for your time and consideration
>
> Le , Duy Khanh
> Cellphone: (+84)958521704
> Faculty of Computer Science and Engineering
> Ho Chi Minh city University of Technology , Viet Nam
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] OpenMP + OpenMPI

2007-12-06 Thread Durga Choudhury
Automatically striping large messages across multiple NICs is certainly a
very nice feature; I was not aware that OpenMPI does this transparently. (I
wonder whether other MPI implementations do this or not.) However, I have the
following concern: since the communication over an ethernet NIC is most
likely over IP, does the striping take the route cost into account? For
example, hosts A and B in the MPD ring might be connected via two NICs, one
direct and one via an intermediate router, or one with a large bandwidth and
another with a small bandwidth. Does OpenMPI send a smaller chunk of data
over the route with the higher cost?

Because of this concern, I think the channel bonding approach someone else
suggested is preferable; all these details would be taken care of at the
hardware level instead of at the IP level.
Thanks
Durga

On Dec 6, 2007 9:42 AM, Jeff Squyres  wrote:

> Wow, that's quite a .sig.  :-)
>
> Open MPI will automatically stripe large messages across however many
> NICs you have.  So you shouldn't need to use multiple threads.
>
> The threading support in the OMPI v1.2 series is broken; it's not
> worth using. There's a big warning in configure when you enable it.  :-)
>
>
> On Dec 5, 2007, at 9:57 PM, Tee Wen Kai wrote:
>
> > Hi everyone,
> >
> > I have installed openmpi-1.2.3. My system has two ethernet ports.
> > Thus, I am trying to make use of both ports to speed up the
> > communication process by using openmp to split into two threads.
> > However, this implementation always cause error. Then I realized
> > that I need to build openmpi using --enable-mpi-threads and use
> > MPI_Init_thread to initialize. But, the initialization always return
> > MPI_THREAD_SINGLE no matter what value I set. Using ompi_info|grep
> > Thread, it shows that thread support has already been activated.
> > Thus, I seek your help to teach me what other configurations I need
> > to set in order to use multi-threads and what are the parameters to
> > include in mpirun in order to use the two ethernet ports.
> >
> > Thank you very much.
> >
> > Regards,
> > Tee
> >
> >
> >
> > _
> >
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> Cisco Systems
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Its a battle between humans and communists;
Which side are you in?
. 


Re: [OMPI users] libmpi.so.0 problem

2007-08-14 Thread Durga Choudhury
Did you export your variables? Otherwise the child shell that forks the MPI
process will not inherit them.



On 8/14/07, Rodrigo Faccioli  wrote:
>
> Thanks, Tim Prins for your email.
>
> However It did't resolve my problem.
>
> I set the enviroment variable on my Kubuntu Linux:
>
> faccioli@faccioli-desktop:/usr/local/lib$
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/bin
>
> faccioli@faccioli-desktop:/usr/local/lib$ LD_LIBRARY_PATH=/usr/local/lib/
>
>
> Therefore, set command will display:
>
> BASH=/bin/bash
> BASH_ARGC=()
> BASH_ARGV=()
> BASH_COMPLETION=/etc/bash_completion
> BASH_COMPLETION_DIR=/etc/bash_completion.d
> BASH_LINENO=()
> BASH_SOURCE=()
> BASH_VERSINFO=([0]="3" [1]="2" [2]="13" [3]="1" [4]="release"
> [5]="x86_64-pc-linux-gnu")
> BASH_VERSION='3.2.13(1)-release'
> COLORTERM=
> COLUMNS=83
>
> DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-C83Ve0QbQz,guid=e07c2bd483a99b50932d080046c199e9
> DESKTOP_SESSION=default
> DIRSTACK=()
> DISPLAY=: 0.0
> DM_CONTROL=/var/run/xdmctl
> EUID=1000
> GROUPS=()
> GS_LIB=/home/faccioli/.fonts
> GTK2_RC_FILES=/home/faccioli/.gtkrc-
> 2.0-kde:/home/faccioli/.kde/share/config/gtkrc-2.0
> GTK_RC_FILES=/etc/gtk/gtkrc:/home/faccioli/.gtkrc:/home/faccioli/.kde/share/config/gtkrc
>
> HISTCONTROL=ignoreboth
> HISTFILE=/home/faccioli/.bash_history
> HISTFILESIZE=500
> HISTSIZE=500
> HOME=/home/faccioli
> HOSTNAME=faccioli-desktop
> HOSTTYPE=x86_64
> IFS=$' \t\n'
> KDE_FULL_SESSION=true
> KDE_MULTIHEAD=false
> KONSOLE_DCOP='DCOPRef(konsole-5587,konsole)'
> KONSOLE_DCOP_SESSION='DCOPRef(konsole-5587,session-2)'
> LANG=en_US.UTF-8
> LD_LIBRARY_PATH=/usr/local/lib/
> LESSCLOSE='/usr/bin/lesspipe %s %s'
> LESSOPEN='| /usr/bin/lesspipe %s'
> LINES=33
> LOGNAME=faccioli
> LS_COLORS='no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.flac=01;35:*.mp3=01;35:*.mpc=01;35:*.ogg=01;35:*.wav=01;35:'
>
> MACHTYPE=x86_64-pc-linux-gnu
> MAILCHECK=60
> OLDPWD=/home/faccioli
> OPTERR=1
> OPTIND=1
> OSTYPE=linux-gnu
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/bin
>
> PIPESTATUS=([0]="0")
> PPID=5587
>
> Unfortunately,  when I execute mpirun a.out, the message I received is:
> a.out:  error while loading shared libraries: libmpi.so.0 : cannot open
> shared object file: No such file or directory
>
> Thanks,
>
>
> On 8/14/07, Tim Prins < tpr...@open-mpi.org> wrote:
> >
> > You need to set your LD_LIBRARY_PATH. See these FAQ entries:
> > http://www.open-mpi.org/faq/?category=running#run-prereqs
> > http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
> >
> > Tim
> >
> > Rodrigo Faccioli wrote:
> > > Hi,
> > >
> > > I need to know what I can resolve my problem. I'm starting my study on
> > > mpi, more specificaly open-mpi.
> > >
> > > But, when I execute mpirun a.out, the message I received is: a.out:
> > > error while loading shared libraries: libmpi.so.0: cannot open shared
> > > object file: No such file or directory
> > >
> > > The a.out file was obtained through mpicc hello.c
> > >
> > > Thanks.
> > >
> > >
> > >
> > >
> > 
> > >
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Its a battle between humans and communists;
Which side are you in?
. 


Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-11-29 Thread Durga Choudhury

Brian

But does it matter which core the process gets bound to? They are all
identical, and as long as the task is parallelized in equal chunks (that's
the key part), it should not matter. The last time I had to do this, the
problem had to do with real-time processing of a very large radar image. My
approach was to spawn *ONE* MPI process per blade and 12 threads (to utilize
the 12 processors). Inside the task entry point of each pthread, I called
sched_setaffinity(). Then I set the scheduling algorithm to real time with a
very high task priority to avoid preemption. It turns out that the last two
steps did not buy me much because ours was a lean, embedded architecture
anyway, designed to run real-time applications, but I definitely got a speed
up from the task distribution.
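
In case it is useful, the thread entry point looked roughly like this
(Linux-specific, reconstructed from memory, so treat it as an untested
sketch):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    struct worker_arg { int core; };

    /* Each worker pins itself to the core matching its index; with a pid
     * of 0, sched_setaffinity() applies to the calling thread on Linux. */
    static void *worker(void *argp)
    {
        struct worker_arg *arg = argp;
        cpu_set_t mask;

        CPU_ZERO(&mask);
        CPU_SET(arg->core, &mask);
        sched_setaffinity(0, sizeof(mask), &mask);

        /* ... this thread's chunk of the radar image goes here ... */
        return NULL;
    }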

It sure would be very nice for openMPI to have this feature; no questions
about that. All I am saying is: if a user wants it today, a reasonable
workaround is available so he/she does not need to wait.

This is my $0.01's worth, since I am probably a lot less experienced.

Durga


On 11/29/06, Brian W. Barrett <bbarr...@lanl.gov> wrote:


It would be difficult to do well without some MPI help, in my
opinion.  You certainly could use the Linux processor affinity API
directly in the MPI application.  But how would the process know
which core to bind to?  It could wait until after MPI_INIT and call
MPI_COMM_RANK, but MPI implementations allocate many of their
resources during MPI_INIT, so there's high potential of the resources
(ie, memory) ending up associated with a different processor than the
one the process gets pinned to.  That isn't a big deal on Intel
machines, but is a major issue for AMD processors.

Just my $0.02, anyway.

Brian

On Nov 28, 2006, at 6:09 PM, Durga Choudhury wrote:

> Jeff (and everybody else)
>
> First of all, pardon me if this is a stupid comment; I am learning
> the nuts-and-bolts of parallel programming; but my comment is as
> follows:
>
> Why can't this be done *outside* openMPI, by calling Linux's
> processor affinity APIs directly? I work with a blade server kind
> of architecture, where each blade has 12 CPUs. I use pthread within
> each blade and MPI to talk across blades. I use the Linux system
> calls to attach a thread to a specific CPU and it seems to work
> fine. The only drawback is: it makes the code unportable to a
> different OS. But even if you implemented paffinity within openMPI,
> the code will still be unportable to a different implementation of
> MPI, which, as is, it is not.
>
> Hope this helps to the original poster.
>
> Durga
>
>
> On 11/28/06, Jeff Squyres <jsquy...@cisco.com> wrote: There is not,
> right now.  However, this is mainly because back when I
> implemented the processor affinity stuff in OMPI (well over a year
> ago), no one had any opinions on exactly what interface to expose to
> the use.  :-)
>
> So right now there's only this lame control:
>
>  http://www.open-mpi.org/faq/?category=tuning#using-paffinity
>
> I am not opposed to implementing more flexible processor affinity
> controls, but the Big Discussion over the past few months is exactly
> how to expose it to the end user.  There have been several formats
> proposed (e.g., mpirun command line parameters, magic MPI attributes,
> MCA parameters, etc.), but nothing that has been "good" and "right".
> So here's the time to chime in -- anyone have any opinions on this?
>
>
>
> On Nov 25, 2006, at 9:31 AM, shap...@isp.nsc.ru wrote:
>
> > Hello,
> > i cant figure out, is there a way with open-mpi to bind all
> > threads on a given node to a specified subset of CPUs.
> > For example, on a multi-socket multi-core machine, i want to use
> > only a single core on each CPU.
> > Thank You.
> >
> > Best Regards,
> > Alexander Shaposhnikov
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> --
> Devil wanted omnipresence;
> He therefore created communists.
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
  Brian Barrett
  Open MPI Team, CCS-1
  Los Alamos National Laboratory


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-11-28 Thread Durga Choudhury

Jeff (and everybody else)

First of all, pardon me if this is a stupid comment; I am learning the
nuts-and-bolts of parallel programming; but my comment is as follows:

Why can't this be done *outside* openMPI, by calling Linux's processor
affinity APIs directly? I work with a blade server kind of architecture,
where each blade has 12 CPUs. I use pthread within each blade and MPI to
talk across blades. I use the Linux system calls to attach a thread to a
specific CPU and it seems to work fine. The only drawback is: it makes the
code unportable to a different OS. But even if you implemented paffinity
within openMPI, the code will still be unportable to a different
implementation of MPI, which, as is, it is not.

Hope this helps to the original poster.

Durga


On 11/28/06, Jeff Squyres  wrote:


There is not, right now.  However, this is mainly because back when I
implemented the processor affinity stuff in OMPI (well over a year
ago), no one had any opinions on exactly what interface to expose to
the use.  :-)

So right now there's only this lame control:

http://www.open-mpi.org/faq/?category=tuning#using-paffinity

I am not opposed to implementing more flexible processor affinity
controls, but the Big Discussion over the past few months is exactly
how to expose it to the end user.  There have been several formats
proposed (e.g., mpirun command line parameters, magic MPI attributes,
MCA parameters, etc.), but nothing that has been "good" and "right".
So here's the time to chime in -- anyone have any opinions on this?



On Nov 25, 2006, at 9:31 AM, shap...@isp.nsc.ru wrote:

> Hello,
> I can't figure out: is there a way with Open MPI to bind all
> threads on a given node to a specified subset of CPUs?
> For example, on a multi-socket, multi-core machine, I want to use
> only a single core on each CPU.
> Thank you.
>
> Best Regards,
> Alexander Shaposhnikov
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] efficient memory to memory transfer

2006-11-07 Thread Durga Choudhury

Chev

Interesting question; I too would like to hear about it from the experts on
this forum. However, off the top of my head, I have the following advice for
you.

Yes, you could share the memory between processes using the shm_xxx system
calls of Unix. However, it would be a lot easier if you used a thread
programming paradigm like pthreads: a lot of this overhead would be handled
for you by the library itself.
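
If you do want to go the raw shared-memory route, a bare-bones sketch looks
something like this (the segment name and size are made up; link with -lrt
on Linux, and a second process would shm_open() and mmap() the same name to
see the data):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *name = "/demo_region";        /* illustrative name    */
        const size_t len = 1 << 20;               /* 1 MB region          */

        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        if (fd < 0) return 1;
        if (ftruncate(fd, len) != 0) return 1;

        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (buf == MAP_FAILED) return 1;

        strcpy(buf, "hello from process A");      /* visible to process B */

        munmap(buf, len);
        close(fd);
        /* shm_unlink(name) once both processes are done with it */
        return 0;
    }

With pthreads, of course, all threads already share one address space and
none of this bookkeeping is needed.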

In general, there is not a lot of performance gain from oversubscribing your
processors (i.e., number of processes > number of CPUs) unless your
processes are I/O-bound and are blocked for a significant amount of time. I
don't know what your application is, but in the HPC world such problems are
rare.

In general, processes on a shared-memory node (i.e., an SMP machine) see
significantly higher memory bandwidth and lower latency than processes
communicating across nodes, even when the interconnect is RDMA-capable
(such as InfiniBand).

Durga

On 11/7/06, Chevchenkovic Chevchenkovic  wrote:


Hi,
  I have the following setup:
 a rank 0 process on node 1 wants to send an array of a particular
size to a rank 1 process on the same node.
1. What optimisations can be invoked while running mpirun to perform this
memory-to-memory transfer efficiently?
2. Is there any performance gain if the 2 processes exchanging data arrays
are kept on the same node rather than on different nodes connected by
InfiniBand?
 Awaiting a reply,
-Chev







___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] openmpi problem

2006-11-03 Thread Durga Choudhury

Calin

Your questions don't belong in this forum. You either need to be computer
literate (your questions are basic OS related) or delegate this task to
someone more experienced.

Good luck
Durga


On 11/3/06, calin pal  wrote:


/* please read the mail and answer my query */
Sir,
   On four machines in our college I have installed it in the way that
I am sending you.
I started the four machines as root,
then I installed openmpi-1.1.1.tar.gz using these commands:
>>tar -xvzf openmpi-1.1.1
>>cd openmpi-1.1.1
>>./configure --prefix=/usr/local
>>make
>>make all install
>>ompi_info
That I did as root.

Then, following your suggestion, I went to the user account (where I have
my program jacobi.c),
gave the password,
and then I ran:
>>cd .bashrc
>>export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
>>source .bashrc
>>mpicc mpihello.c -o mpihello
>>mpirun -np 4 mpihello

After doing all this I am getting the libmpi.so problem: "mpihello" is not
working.

What am I supposed to do?

Do I have to install it again?

Is anything wrong in the installation? Sir, I cannot understand from the
FAQ what you have suggested that I look at; that is why I am asking again.
Please tell me whether what I have done on our computers is okay, or
whether I have to change anything in the commands I have written above.
Please check it out, sir, and tell me what is wrong. Please also read the
commands I used as root and as the user for installing and running
openmpi-1.1.1.tar.gz.

Calin Pal
 M.Sc.Tech (Maths and Computer Science)
Pune, India

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] Fault Tolerance & Behavior

2006-10-26 Thread Durga Choudhury

As an alternate suggestion (although George's is better, since this will
affect your entire network connectivity), you could override the default TCP
timeout values with the "sysctl -w" command.
The following three OIDs affect TCP timeout behavior under Linux:

net.ipv4.tcp_keepalive_intvl = 75 <- How often (in seconds) to send
keepalive probes
net.ipv4.tcp_keepalive_probes = 9 <- How many probes to send before
declaring the connection dead.
net.ipv4.tcp_keepalive_time = 7200 <- How long the connection may be
idle before the first keepalive is sent.
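
For example (the values below are purely illustrative; the defaults above
mean a dead peer can take over two hours to be noticed):

    sysctl -w net.ipv4.tcp_keepalive_time=120
    sysctl -w net.ipv4.tcp_keepalive_intvl=15
    sysctl -w net.ipv4.tcp_keepalive_probes=4

would start probing an idle connection after 2 minutes and declare it dead
roughly a minute later.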

Again, use them with caution and not on a live internet server.

Durga


On 10/26/06, George Bosilca  wrote:


The Open MPI behavior is the same regardless of the network used for
the job, at least the behavior dictated by our internal message
passing layer. But for this to happen we need to get a warning from
the network that something is wrong (such as a timeout). In the case of
TCP (and Myrinet) the timeout is so high that Open MPI was never
informed that something went wrong (we print out some warnings when
this happens). It was happily waiting for a message to complete ...
Once the network cable was reconnected, the network device itself
recovered and resumed the communication, leading to a correct send
operation (and this without involving Open MPI at all). There is
nothing (with a reasonable cost) we can do about this.

For IB, it looks like the network timeout is smaller. Open MPI knew that
something was wrong (the output proves it), and tried to continue
using the other available devices. If none are available, then Open
MPI is supposed to abort the job. For your particular run, did you have
Ethernet between the nodes? If yes, I'm quite sure the MPI run
wasn't stopped ... it continued using the TCP device (if not disabled
by hand at mpirun time).

Aborting is not what is supposed to happen right now. If there are other
devices (such as TCP) the MPI job will print out some warnings and
will continue over the remaining networks (communication will continue
over the other networks; only the peer whose network went down gets
affected). If the network timeout is too high, Open MPI will never
notice that something went wrong, at least not in the default message
layer (PML).

If you want the job to abort when your main network goes down,
disable the use of the other available networks; more specifically,
disable TCP. A simple way to do it is to add the following
argument to your mpirun command:

--mca btl ^tcp (or --mca btl openib,sm,self).

  Thanks,
george.

PS: There are several internal message passing modules available for
Open MPI. The default one leans more toward performance than
reliability. If reliability is what you need, you should use the DR
PML. For this, you can specify --mca pml dr at mpirun time. The DR
PML has data reliability and timeouts (internal Open MPI timeouts that
are configurable), allowing it to recover faster from a network failure.


On Oct 26, 2006, at 3:52 PM, Troy Telford wrote:

> I've recently had the chance to see how Open MPI (as well as other
> MPIs)
> behave in the case of network failure.
>
> I've looked at what happens when a node has its network connection
> disconnected in the middle of a job, with Ethernet, Myrinet (GM), and
> InfiniBand (OpenIB).
>
> With Ethernet and Myrinet, the job more or less pauses until the
> cable is
> re-connected.  (I imagine timeouts still apply, but I wasn't patient
> enough to wait for them)
>
> With InfiniBand, the job pauses and Open MPI throws a few error
> messages.
> After the cable is plugged back in (and the SM catches up), the job
> remains where it was when it was paused.  I'd guess that part of
> this is
> that the timeout is much shorter with IB than with Myri or
> Ethernet, and
> that when I unplug the IB cable, it times out fairly quickly (and then
> Open MPI throws its error messages).
>
> At any rate, the thought occurs (and it may just be my ignorance of
> MPI):
> After a network connection times out (as was apparently the case
> with IB),
> is the job salvageable?  If the jobs are not salvageable, why
> didn't Open
> MPI abort the job (and clean up the running processes on the nodes)?
> --
> Troy Telford
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] Dual Gigabit ethernet support

2006-10-24 Thread Durga Choudhury

Very interesting, indeed! Message passing running over raw Ethernet using
cheap COTS PCs is exactly the need of the hour for people like me who have
very shallow pockets. Great work! What would make this effort *really* cool
is a one-to-one mapping of APIs from the MPI domain to the GAMMA domain so
that, for example, existing MPI code can be ported with a trivial amount of
work. Professor Ladd, how did you do this porting, e.g. for VASP? How much
of an effort was it? (Or did the VASP guys already have a version running
over GAMMA?)

Thanks
Durga


On 10/24/06, Tony Ladd  wrote:


Lisandro

I use my own network testing program; I wrote it some time ago because
Netpipe only tested one-way rates at that point. I haven't tried IMB, but I
looked at the source and it's very similar to what I do: 1) set up buffers
with data, 2) start the clock, 3) call MPI_xxx N times, 4) stop the clock,
5) calculate the rate (a skeleton of such a loop is sketched at the end of
this message). IMB tests more things than I do; I just focused on the calls
I use (send, recv, allreduce). I have done a lot of testing of hardware and
software, and I will have some web pages posted soon; I will put a note
here when I do. But a couple of things:
A) I have found the switch is the biggest discriminant if you want to run
HPC over Gigabit Ethernet. Most GigE switches choke when all the ports are
being used at once. This is the usual HPC traffic pattern, but not that of
a typical network, which is what these switches are geared towards. The one
exception I have found is the Extreme Networks x450a-48t. In some test
patterns I found it to be 500 times faster (not a typo) than the Summit
400-48t, which is its predecessor. I have tested several GigE switches
(Extreme, Force10, HP, Asante) and the x450 is the only one that copes with
high traffic loads in all port configurations. It's expensive for a GigE
switch (~$6500) but worth it in my opinion if you want to do HPC. It's
still much cheaper than InfiniBand.
B) You have to test the switch in different port configurations; a random
ring of SendRecv is good for this. I don't think IMB has it in its test
suite, but it's easy to program (see the sketch below). Or you can change
the order of nodes in the machinefile to force unfavorable port
assignments. A step of 12 is a good test, since many GigE switches use
12-port ASICs and this forces all the traffic onto the backplane. On the
Summit 400 this causes it to more or less stop working: rates drop to a few
Kbytes/sec along each wire, but the x450 has no problem with the same test.
You need to know how your nodes are wired to the switch to do this test.
C) GAMMA is an extraordinary accomplishment in my view; in a number of
tests with codes like DLPOLY, GROMACS, and VASP it can be 2-3 times the
speed of TCP-based runs with 64 CPUs. In many instances I get comparable
(and occasionally better) scaling than with the university HPC system,
which has an InfiniBand interconnect. Note I am not saying GigE is
comparable to IB, but that a typical HPC setup with nodes scattered all
over a fat-tree topology (including oversubscription of the links and
switches) is enough of a minus that an optimized GigE setup can compete, at
least up to 48 nodes (96 CPUs in our case). I have worked with Giuseppe
Ciaccio for the past 9 months eradicating some obscure bugs in GAMMA: I
find them; he fixes them. We have GAMMA running on 48 nodes quite reliably,
but there are still many issues to address. GAMMA is very much a research
tool; there are a number of features(?) which would hinder its use in an
HPC environment. Basically Giuseppe needs help with development. Any
volunteers?
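
For anyone who wants to try this at home, here is a rough skeleton of the
kind of loop I described at the top, using the ring SendRecv pattern from
point B (the message size, repeat count, and absence of warm-up iterations
are all illustrative; permute the host order in your machinefile to change
which switch ports end up talking to each other):

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        const int reps = 1000;
        const int len  = 1 << 17;             /* 128 KB messages             */
        int rank, size, right, left, i;
        double t0, t1;
        char *sbuf = malloc(len), *rbuf = malloc(len);

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        memset(sbuf, rank, len);               /* 1) set up buffers with data */
        right = (rank + 1) % size;             /* neighbour we send to        */
        left  = (rank + size - 1) % size;      /* neighbour we receive from   */

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();                      /* 2) start clock              */
        for (i = 0; i < reps; i++)             /* 3) call MPI_xxx N times     */
            MPI_Sendrecv(sbuf, len, MPI_CHAR, right, 0,
                         rbuf, len, MPI_CHAR, left,  0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        t1 = MPI_Wtime();                      /* 4) stop clock               */

        if (rank == 0)                         /* 5) calculate rate           */
            printf("%.1f MB/s per link, each direction\n",
                   (double)reps * len / (t1 - t0) / 1.0e6);

        free(sbuf); free(rbuf);
        MPI_Finalize();
        return 0;
    }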

Tony
---
Tony Ladd
Professor, Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005

Tel: 352-392-6509
FAX: 352-392-9513
Email: tl...@che.ufl.edu
Web: http://ladd.che.ufl.edu


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] dual Gigabit ethernet support

2006-10-23 Thread Durga Choudhury

What I think is happening is this:

The initial transfer rate you are seeing is the burst rate; averaged over a
long time, your sustained transfer rate emerges. Like George said, you
should use a proven tool to measure your bandwidth. We use netperf, a free
tool from HP.

That said, Ethernet technology is not a good candidate for HPC (one reason
people don't use it in backplanes, despite the low cost). Do the math
yourself: there is a 54-byte overhead (14 B Ethernet + 20 B IP + 20 B TCP)
on every packet sent over a socket. That is why protocols like uDAPL over
InfiniBand are gaining in popularity.
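
A rough worked example, using the 54-byte figure above and the standard
1500-byte MTU:

    TCP payload per frame = 1500 - 20 (IP) - 20 (TCP) = 1460 bytes
    bytes on the wire     = 1460 + 54                 = 1514 bytes
    best-case efficiency  = 1460 / 1514               = about 96%

and that is before counting the Ethernet preamble and inter-frame gap, or
the per-packet interrupt and copy costs, which are what really hurt for
small messages.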

Durga


On 10/23/06, Jayanta Roy <j...@ncra.tifr.res.in> wrote:


Hi,

I have tried lamboot with a host file where odd and even nodes talk among
themselves using eth0 and talk across the two groups using eth1. My
transfer runs at ~230 MB/s at the start, but after a few transfers the rate
falls to ~130 MB/s, and after a long run it finally settles at ~54 MB/s.
Why is the network slowing down over time like this?

Regards,
Jayanta

On Mon, 23 Oct 2006, Durga Choudhury wrote:

> Did you try channel bonding? If your OS is Linux, there are plenty of
> "howto" on the internet which will tell you how to do it.
>
> However, your CPU might be the bottleneck in this case. How much of CPU
> horsepower is available at 140MB/s?
>
> If the CPU *is* the bottleneck, changing your network driver (e.g. from
> interrupt-based to poll-based packet transfer) might help. If you are
> unfamiliar with writing network drivers for your OS, this may not be a
> trivial task, though.
>
> Oh, and like I pointed out last time, if all of the above seem OK, try
> connecting your second link to a separate PC and see if you can get twice
the
> throughput. If so, then the ECMP implementation of your IP stack is what
is
> causing the problem. This is the hardest one to fix. You could rewrite a
few
> routines in ipv4 processing and recompile the Kernel, if you are
familiar
> with Kernel building and your OS is Linux.
>
>
> On 10/23/06, Jayanta Roy <j...@ncra.tifr.res.in> wrote:
>>
>> Hi,
>>
>> Sometime before I have posted doubts about using dual gigabit support
>> fully. See I get ~140MB/s full duplex transfer rate in each of
following
>> runs.
>>
>> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>>
>> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>>
>> How to combine these two port or use a proper routing table in place
host
>> file? I am using openmpi-1.1 version.
>>
>> -Jayanta
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Devil wanted omnipresence;
> He therefore created communists.
>


Jayanta Roy
National Centre for Radio Astrophysics  |  Phone  : +91-20-25697107
Tata Institute of Fundamental Research  |  Fax    : +91-20-25692149
Pune University Campus, Pune 411 007    |  e-mail : j...@ncra.tifr.res.in
India

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] dual Gigabit ethernet support

2006-10-23 Thread Durga Choudhury

Did you try channel bonding? If your OS is Linux, there are plenty of
howtos on the internet that will tell you how to do it.

However, your CPU might be the bottleneck in this case. How much CPU
horsepower is left available at 140 MB/s?

If the CPU *is* the bottleneck, changing your network driver (e.g. from
interrupt-based to poll-based packet transfer) might help. If you are
unfamiliar with writing network drivers for your OS, this may not be a
trivial task, though.

Oh, and as I pointed out last time, if all of the above seem OK, try
connecting your second link to a separate PC and see if you can get twice
the throughput. If so, then the ECMP implementation of your IP stack is
what is causing the problem. This is the hardest one to fix. You could
rewrite a few routines in the IPv4 processing code and recompile the
kernel, if you are familiar with kernel building and your OS is Linux.
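
Coming back to the bonding suggestion: for what it's worth, the classic
recipe from the Linux bonding HOWTO looks roughly like this (the interface
names and address are illustrative, and the exact module options vary
between kernel versions):

    modprobe bonding mode=0 miimon=100        # mode 0 = round-robin across slaves
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1

After that you would point Open MPI at bond0 (e.g. --mca btl_tcp_if_include
bond0) instead of at the individual interfaces.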


On 10/23/06, Jayanta Roy  wrote:


Hi,

Some time ago I posted doubts about fully using the dual gigabit support.
I get a ~140 MB/s full-duplex transfer rate in each of the following
runs:

mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out

mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out

How can I combine these two ports, or use a proper routing table in place
of the host file? I am using openmpi-1.1.

-Jayanta
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] problem abut openmpi running

2006-10-19 Thread Durga Choudhury

George

I knew that was the answer to Calin's question, but I still would like to
understand the issue:

By default, the Open MPI installer puts the libraries in /usr/local/lib,
which is a standard location for the compiler to look for libraries. So
*why* do I need to explicitly specify this with LD_LIBRARY_PATH? For
example, when I compile with pthread calls and pass -lpthread to gcc, I
need not specify the location of libpthread.so with LD_LIBRARY_PATH. I had
the same problem as Calin, so I am curious. This is assuming he has not
redirected the installation path to some non-standard location.
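
My working theory, which I would love for someone to confirm: the link-time
search and the run-time search are separate. The linker finds libmpi in
/usr/local/lib when the program is built, but the run-time loader only
searches the directories listed in /etc/ld.so.conf plus /lib and /usr/lib,
and on many distributions /usr/local/lib is not in that list; libpthread.so
lives in /lib, which is why -lpthread never shows the problem. If that is
right, then something like

    ldd ./mpihello          # shows "libmpi.so.0 => not found" if the loader can't see it
    echo /usr/local/lib >> /etc/ld.so.conf && ldconfig    # permanent fix, as root

should make the LD_LIBRARY_PATH setting unnecessary.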

Thanks
Durga


On 10/19/06, George Bosilca  wrote:


Calin,

Looks like you're missing a proper value for LD_LIBRARY_PATH.
Please read the Open MPI FAQ at
http://www.open-mpi.org/faq/?category=running

  Thanks,
george.

On Oct 19, 2006, at 6:41 AM, calin pal wrote:

>
>   Hi,
>   I am Calin from India. I am working on Open MPI. I have
> installed openmpi-1.1.1.tar.gz on four machines in our college
> lab. On one system Open MPI is working properly. I have written a
> "hello world" program on all the machines, but it works properly on
> only one machine; the other machines give:
>
> (hello: error while loading shared libraries: libmpi.so.0: cannot
> open shared object file: no such file or directory.)
>
>
> What is the problem? Please tell me, and how to solve it. Please
> tell me.
>
> calin pal
> india
> fergusson college
> msc.tech(maths and computer sc.)
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.


[OMPI users] OpenMPI error in the simplest configuration...

2006-08-27 Thread Durga Choudhury

Hi all

I am getting an error (details follow) in the simplest possible test
scenario:

Two identical, ordinary Dell PCs connected through an Ethernet switch on
their 10/100 Ethernet ports. Both run Fedora Core 4. The identical version
(1.1) of Open MPI is compiled and installed on both of them *without* a
--prefix option (i.e. installed in the default location of /usr/local).

The hostfile on both machines is the same:

cat ~/hostfile

192.168.22.29
192.168.22.103

I can run Open MPI on either of these two machines by forking two
processes:

mpirun -np 2 osu_acc_latency  <-- This runs fine on either of the two
machines.

However, when I try to launch the same program across the two machines, I
get an error:

mpirun --hostfile ~/hostfile -np 2
/home/durga/openmpi-1.1/osu_benchmarks/osu_acc_latency

durga@192.168.22.29's password: foobar
/home/durga/openmpi-1.1/osu_benchmarks/osu_acc_latency: error while loading
shared libraries: libmpi.so.0: cannot open shared object file: No such file
or directory.

However, the file *does exist* in /usr/local/lib:

ls -l /usr/local/lib/libmpi.so.0
libmpi.so.0 -> libmpi.so.0.0.0

I have also tried adding /usr/local/lib to my LD_LIBRARY_PATH on *both*
machines, to no avail.
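
One thing I plan to try next, in case it is relevant: since LD_LIBRARY_PATH
may simply not be picked up by the non-interactive shell that ssh starts on
the remote node, I understand that

    mpirun --prefix /usr/local --hostfile ~/hostfile -np 2 ./osu_acc_latency

tells Open MPI itself to set PATH and LD_LIBRARY_PATH on the remote side. I
would appreciate confirmation that this is the intended workaround.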

Any help is greatly appreciated.

Thanks

Durga


[OMPI users] Proprietary transport layer for openMPI...

2006-08-07 Thread Durga Choudhury

Hi All

We have been using Argonne's MPICH (over TCP/IP) on our in-house designed
embedded multicomputer for the last several months with satisfactory
results. Our network technology is custom built and is *not* based on
InfiniBand (or any published standard, such as Myrinet). This is due to the
nature of our application. We are currently running TCP/IP over our
backplane network and using that as the transport layer of MPICH.

For the next generation of our software release, we are planning to write a
low-level transport layer to leverage our switch architecture, and we are
considering changing the entire MPI protocol stack to Open MPI. From what I
have found so far, I'd have to write routines to provide services similar
to the ones found under ompi/mca/btl/{tcp,mx,...}. I'd like to get some
guidance as to how to do this. Is there a document about this? Has anybody
on this list done something similar before, and if so, what was the
difficulty level involved?

Thanks a lot in advance.

Durga

--
Devil wanted omnipresence;
He therefore created communists.


Re: [OMPI users] Open MPI on Dual Core Laptop?

2006-08-01 Thread Durga Choudhury

Do you want to use MPI to chain a bunch of such laptops together (e.g. via
Ethernet), or just for the cores to talk to each other? If the latter, you
do not need MPI. Your SMP operating system (e.g. Linux) will automatically
utilize both cores. The Linux 2.6 kernel also supports processor affinity,
which lets you keep a given process or thread on a fixed core, avoiding
cache invalidation and the like.

Thanks

Durga


On 8/1/06, Wen Long at UMCES/HPL  wrote:


 Hi,

   Has anyone installed Open MPI on a dual-core desktop or laptop,
such as an Intel Centrino Duo? Or is it totally impossible?

   Thanks,

   Wen

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Devil wanted omnipresence;
He therefore created communists.