Sorry, I should pay more attention when I edit the subject of the daily digest.

Dear Eric, Aurelien and Eugene,

thanks a lot for helping. What Eugene said summarizes the situation
exactly. I agree it's an issue with the full code, since the problem
doesn't arise in simple examples like the one I posted. I was just
hoping I was doing something trivially wrong and that someone would
shout at me :-). I could post the full code, but it's quite long. At
the moment I am still going through it looking for the problem, so
I'll wait a bit before spamming the other users.

cheers

Enrico

>
> On Mon, Sep 15, 2008 at 6:00 PM,  <users-requ...@open-mpi.org> wrote:
>> Send users mailing list submissions to
>>        us...@open-mpi.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>        http://www.open-mpi.org/mailman/listinfo.cgi/users
>> or, via email, send a message with subject or body 'help' to
>>        users-requ...@open-mpi.org
>>
>> You can reach the person managing the list at
>>        users-ow...@open-mpi.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of users digest..."
>>
>>
>> Today's Topics:
>>
>>   1. Re: Problem using VampirTrace (Thomas Ropars)
>>   2. Re: Why compiling in global paths (only) for configuration files? (Paul Kapinos)
>>   3. Re: MPI_sendrecv = MPI_Send+ MPI_RECV ? (Eugene Loh)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 15 Sep 2008 15:04:07 +0200
>> From: Thomas Ropars <trop...@irisa.fr>
>> Subject: Re: [OMPI users] Problem using VampirTrace
>> To: Andreas Knüpfer <andreas.knuep...@tu-dresden.de>
>> Cc: us...@open-mpi.org
>> Message-ID: <48ce5d47.50...@irisa.fr>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Hello,
>>
>> I don't have a common file system for all cluster nodes.
>>
>> I've tried to run the application again with VT_UNIFY=no and to call
>> vtunify manually. It works well. I managed to get the .otf file.
>>
>> Thank you.
>>
>> Thomas Ropars
>>
>>
>> Andreas Knüpfer wrote:
>>> Hello Thomas,
>>>
>>> sorry for the delay. My first assumption about the cause of your problem is
>>> the so-called "unify" process. This is a post-processing step which is
>>> performed automatically after the trace run. This step needs read access to
>>> all files, though. So, do you have a common file system for all cluster nodes?
>>>
>>> If yes, set the env variable VT_PFORM_GDIR to point there. The traces will
>>> then be copied there from the location VT_PFORM_LDIR, which can still be a
>>> node-local directory, and everything will be handled automatically.
>>>
>>> If not, please set VT_UNIFY=no in order to disable automatic unification.
>>> Then you need to call vtunify manually. Please copy all files from the run
>>> directory that start with your OTF file prefix to a common directory and call
>>>
>>> %> vtunify <number of processes> <file prefix>
>>>
>>> there. This should give you the <prefix>.otf file.
>>>
>>> Please give this a try. If it is not working, please give me an 'ls -alh'
>>> from your trace directory/directories.
>>>
>>> Best regards, Andreas
>>>
>>>
>>> P.S.: Please have my email on CC, I'm not on the us...@open-mpi.org list.
>>>
>>>
>>>
>>>
>>>>> From: Thomas Ropars <trop...@irisa.fr>
>>>>> Date: August 11, 2008 3:47:54 PM IST
>>>>> To: us...@open-mpi.org
>>>>> Subject: [OMPI users] Problem using VampirTrace
>>>>> Reply-To: Open MPI Users <us...@open-mpi.org>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm trying to use VampirTrace.
>>>>> I'm working with r19234 of svn trunk.
>>>>>
>>>>> When I try to run a simple application with 4 processes on the same
>>>>> computer, it works well.
>>>>> But if I try to run the same application with the 4 processes executed
>>>>> on 4 different computers, I never get the .otf file.
>>>>>
>>>>> I've tried to run with VT_VERBOSE=yes, and I get the following trace:
>>>>>
>>>>> VampirTrace: Thread object #0 created, total number is 1
>>>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe8349ca.3294 id 1] for generation [buffer 32000000 bytes]
>>>>> VampirTrace: Thread object #0 created, total number is 1
>>>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834bca.3020 id 1] for generation [buffer 32000000 bytes]
>>>>> VampirTrace: Thread object #0 created, total number is 1
>>>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834aca.3040 id 1] for generation [buffer 32000000 bytes]
>>>>> VampirTrace: Thread object #0 created, total number is 1
>>>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834fca.3011 id 1] for generation [buffer 32000000 bytes]
>>>>> Ring : Start
>>>>> Ring : End
>>>>> [1]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834aca.3040 id 1]
>>>>> [2]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834bca.3020 id 1]
>>>>> [1]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834aca.3040 id 1]
>>>>> [3]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834fca.3011 id 1]
>>>>> [2]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834bca.3020 id 1]
>>>>> [0]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe8349ca.3294 id 1]
>>>>> [1]VampirTrace: Wrote unify control file ./ring-vt.2.uctl
>>>>> [2]VampirTrace: Wrote unify control file ./ring-vt.3.uctl
>>>>> [3]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe834fca.3011 id 1]
>>>>> [0]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>>>> vt.fffffffffe8349ca.3294 id 1]
>>>>> [0]VampirTrace: Wrote unify control file ./ring-vt.1.uctl
>>>>> [0]VampirTrace: Checking for ./ring-vt.1.uctl ...
>>>>> [0]VampirTrace: Checking for ./ring-vt.2.uctl ...
>>>>> [1]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834aca.
>>>>> 3040.1.def
>>>>> [2]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834bca.
>>>>> 3020.1.def
>>>>> [3]VampirTrace: Wrote unify control file ./ring-vt.4.uctl
>>>>> [1]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834aca.
>>>>> 3040.1.events
>>>>> [2]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834bca.
>>>>> 3020.1.events
>>>>> [3]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834fca.
>>>>> 3011.1.def
>>>>> [1]VampirTrace: Thread object #0 deleted, leaving 0
>>>>> [2]VampirTrace: Thread object #0 deleted, leaving 0
>>>>> [3]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834fca.
>>>>> 3011.1.events
>>>>> [3]VampirTrace: Thread object #0 deleted, leaving 0
>>>>>
>>>>>
>>>>> Regards
>>>>>
>>>>> Thomas
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Mon, 15 Sep 2008 17:22:03 +0200
>> From: Paul Kapinos <kapi...@rz.rwth-aachen.de>
>> Subject: Re: [OMPI users] Why compiling in global paths (only) for
>>        configuration files?
>> To: Open MPI Users <us...@open-mpi.org>,        Samuel Sarholz
>>        <sarh...@rz.rwth-aachen.de>
>> Message-ID: <48ce7d9b.8070...@rz.rwth-aachen.de>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>> Hi Jeff, hi all!
>>
>> Jeff Squyres wrote:
>>> Short answer: yes, we do compile in the prefix path into OMPI.  Check
>>> out this FAQ entry; I think it'll solve your problem:
>>>
>>>     http://www.open-mpi.org/faq/?category=building#installdirs
>>
>>
>> Yes, reading man pages helps!
>> Thank you for providing useful help.
>>
>> But setting the environment variable OPAL_PREFIX to an appropriate value
>> (assuming PATH and LD_LIBRARY_PATH are set too) is not enough to let
>> OpenMPI rock & roll from the new location.
>>
>> This is because all the files containing settings for opal_wrapper, which
>> are located in share/openmpi/ and named e.g. mpif77-wrapper-data.txt, also
>> contain hard-coded paths (defined at installation by --prefix).
>>
>> I have fixed the problem by parsing all the files share/openmpi/*.txt
>> and replacing the old path with the new path. This nasty solution seems
>> to work.
>>
>> But is there an elegant way to do this correctly, maybe by re-generating
>> the config files in share/openmpi/?
>>
>> And last but not least, the FAQ on the web site you provided (see link
>> above) does not contain any info on the need to modify the wrapper
>> configuration files. Maybe this section should be updated?
>>
>> Best regards Paul Kapinos
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>>
>>>
>>> On Sep 8, 2008, at 5:33 AM, Paul Kapinos wrote:
>>>
>>>> Hi all!
>>>>
>>>> We are using OpenMPI on a variety of machines (running Linux,
>>>> Solaris/SPARC and Solaris/Opteron) with a couple of compilers (GCC, Sun
>>>> Studio, Intel, PGI, 32 and 64 bit...), so we have at least 15 versions
>>>> of each release of OpenMPI (Sun Cluster Tools not included).
>>>>
>>>> This means that we have to support a complete petting zoo of OpenMPI
>>>> installations. Sometimes we may need to move things around.
>>>>
>>>>
>>>> When OpenMPI is configured, the install path may be provided using the
>>>> --prefix keyword, like so:
>>>>
>>>> ./configure --prefix=/my/love/path/for/openmpi/tmp1
>>>>
>>>> After "gmake all install" in ...tmp1 an installation of OpenMPI may be
>>>> found.
>>>>
>>>> Then, say, we need to *move* this version to another path, say
>>>> /my/love/path/for/openmpi/blupp
>>>>
>>>> Of course we have to set $PATH and $LD_LIBRARY_PATH accordingly (we
>>>> can do that ;-)
>>>>
>>>> And when we tried to use OpenMPI from the new location, we got an error
>>>> message like
>>>>
>>>> $ ./mpicc
>>>> Cannot open configuration file
>>>> /my/love/path/for/openmpi/tmp1/share/openmpi/mpicc-wrapper-data.txt
>>>> Error parsing data file mpicc: Not found
>>>>
>>>> (note the old installation path used)
>>>>
>>>> It looks to me as if the install path provided with --prefix at the
>>>> configuration step is compiled into the opal_wrapper executable, and
>>>> opal_wrapper works only if the set of configuration files is in this path.
>>>> But after moving the OpenMPI installation directory, the configuration
>>>> files aren't there...
>>>>
>>>> A side effect of this behaviour is that binary distributions of OpenMPI
>>>> (RPMs) are not relocatable. That's uncomfortable. (Actually, this mail was
>>>> prompted by the fact that the Sun ClusterTools RPMs are not relocatable.)
>>>>
>>>>
>>>> So, does this behavior have a deeper sense that I cannot recognise, or is
>>>> the compiling-in of global paths maybe not needed?
>>>>
>>>> What I mean is that the paths for the configuration files, which
>>>> opal_wrapper needs, could be set locally, like ../share/openmpi/***,
>>>> without affecting the integrity of OpenMPI. Maybe there are more places
>>>> where the use of local paths would be needed to allow a movable
>>>> (relocatable) OpenMPI.
>>>>
>>>> What do you think about this?
>>>>
>>>> Best regards
>>>> Paul Kapinos
>>>>
>>>>
>>>>
>>>> <kapinos.vcf>_______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: verwurschel_pfade_openmpi.sh
>> Type: application/x-sh
>> Size: 369 bytes
>> Desc: not available
>> URL: 
>> <http://www.open-mpi.org/MailArchives/users/attachments/20080915/434c3679/attachment.sh>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: kapinos.vcf
>> Type: text/x-vcard
>> Size: 330 bytes
>> Desc: not available
>> URL: 
>> <http://www.open-mpi.org/MailArchives/users/attachments/20080915/434c3679/attachment.vcf>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: smime.p7s
>> Type: application/x-pkcs7-signature
>> Size: 4230 bytes
>> Desc: S/MIME Cryptographic Signature
>> URL: 
>> <http://www.open-mpi.org/MailArchives/users/attachments/20080915/434c3679/attachment.bin>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Mon, 15 Sep 2008 08:46:11 -0700
>> From: Eugene Loh <eugene....@sun.com>
>> Subject: Re: [OMPI users] MPI_sendrecv = MPI_Send+ MPI_RECV ?
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <48ce8343.7060...@sun.com>
>> Content-Type: text/plain; format=flowed; charset=ISO-8859-1
>>
>> Aurélien Bouteiller wrote:
>>
>>> You can't assume that MPI_Send does buffering.
>>
>> Yes, but I think this is what Eric meant by misinterpreting Enrico's
>> problem.  The communication pattern is to send a message, which is
>> received remotely.  There is remote computation, and then data is sent
>> back.  No buffering is needed for such a pattern.  The code is
>> "apparently" legal.  There is apparently something else going on in the
>> "real" code that is not captured in the example Enrico sent.
>>
>> Further, if I understand correctly, the remote process actually receives
>> the data!  If  this is true, the example is as simple as:
>>
>> process 1:
>>    MPI_Send()    // this call blocks
>>
>> process 0:
>>    MPI_Recv()    // this call actually receives the data sent by MPI_Send!!!
>>
>> Enrico originally explained that process 0 actually receives the data.
>> So, MPI's internal buffering is presumably not a problem at all!  An
>> MPI_Send effectively sends data to a remote process, but simply never
>> returns control to the user program.
>>
>>> Without buffering, you are in a possible deadlock situation. This
>>> pathological case is the exact motivation for the existence of
>>> MPI_Sendrecv. You can also consider Isend/Recv/Wait, so that the Send
>>> never blocks even if the destination is not ready to receive, or
>>> MPI_Bsend, which adds explicit buffering and therefore returns
>>> control to you before the message transmission has actually begun.
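>>>
>>> As an illustration, here is a minimal, self-contained Fortran sketch of
>>> the Isend/Recv/Wait variant (the two-rank layout, buffer size and tag are
>>> made up for the example, not taken from Enrico's code):
>>>
>>>       program isend_recv_wait
>>>       implicit none
>>>       include 'mpif.h'
>>>       integer :: id, numprocs, ierr, req, k
>>>       integer :: status(MPI_STATUS_SIZE)
>>>       integer :: sendbuf(5), recvbuf(5)
>>>
>>>       call MPI_INIT(ierr)
>>>       call MPI_COMM_RANK(MPI_COMM_WORLD, id, ierr)
>>>       call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
>>>       if (numprocs /= 2) stop
>>>
>>>       sendbuf = id
>>>       ! Post the send without blocking, then receive, then complete the
>>>       ! send: neither rank can block the other, whatever MPI buffers.
>>>       call MPI_ISEND(sendbuf, 5, MPI_INTEGER, 1-id, 0, &
>>>                      MPI_COMM_WORLD, req, ierr)
>>>       call MPI_RECV(recvbuf, 5, MPI_INTEGER, 1-id, 0, &
>>>                     MPI_COMM_WORLD, status, ierr)
>>>       call MPI_WAIT(req, status, ierr)
>>>
>>>       call MPI_FINALIZE(ierr)
>>>       end program isend_recv_wait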
>>>
>>> Aurelien
>>>
>>>
>>> On Sep 15, 2008, at 01:08, Eric Thibodeau wrote:
>>>
>>>> Sorry about that, I had misinterpreted your original post as being
>>>> the pair of send-receive. The example you give below does indeed seem
>>>> correct, which means you might have to show us the code that
>>>> doesn't work. Note that I am in no way a Fortran expert, I'm more
>>>> versed in C. The only hint I'd give a C programmer in this case is
>>>> "make sure your receiving structures are indeed large enough (i.e.:
>>>> you send 3d but eventually receive 4d... did you allocate for 3d or
>>>> 4d for receiving the converted array?)."
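>>>>
>>>> For what it's worth, a minimal Fortran sketch of that hint (the
>>>> declarations and the root-side "3d to 4d" step are invented here; only
>>>> the toroot/tonode names and the 3-out/4-back counts come from Enrico's
>>>> message below). The point is simply that the receive buffer is
>>>> dimensioned for the 4 values that actually arrive:
>>>>
>>>>       program buffer_sizes
>>>>       implicit none
>>>>       include 'mpif.h'
>>>>       integer, parameter :: root = 0
>>>>       integer :: id, numprocs, ierr, n
>>>>       integer :: status(MPI_STATUS_SIZE)
>>>>       double precision :: toroot(3), work(3), back(4), tonode(4)
>>>>
>>>>       call MPI_INIT(ierr)
>>>>       call MPI_COMM_RANK(MPI_COMM_WORLD, id, ierr)
>>>>       call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
>>>>       if (numprocs /= 2) stop
>>>>       n = 1
>>>>
>>>>       if (id == 1) then
>>>>          toroot = (/1.d0, 2.d0, 3.d0/)
>>>>          call MPI_SEND(toroot, 3, MPI_DOUBLE_PRECISION, root, n, &
>>>>                        MPI_COMM_WORLD, ierr)
>>>>          ! tonode must hold 4 elements: 4 values are coming back
>>>>          call MPI_RECV(tonode, 4, MPI_DOUBLE_PRECISION, root, n, &
>>>>                        MPI_COMM_WORLD, status, ierr)
>>>>       else
>>>>          call MPI_RECV(work, 3, MPI_DOUBLE_PRECISION, 1, n, &
>>>>                        MPI_COMM_WORLD, status, ierr)
>>>>          back(1:3) = work
>>>>          back(4) = sum(work)
>>>>          call MPI_SEND(back, 4, MPI_DOUBLE_PRECISION, 1, n, &
>>>>                        MPI_COMM_WORLD, ierr)
>>>>       end if
>>>>
>>>>       call MPI_FINALIZE(ierr)
>>>>       end program buffer_sizes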
>>>>
>>>> Eric
>>>>
>>>> Enrico Barausse wrote:
>>>>
>>>>> sorry, I hadn't changed the subject. I'm reposting:
>>>>>
>>>>> Hi
>>>>>
>>>>> I think it's correct. What I want to do is to send a 3d array from
>>>>> process 1 to process 0 (=root):
>>>>> call MPI_Send(toroot,3,MPI_DOUBLE_PRECISION,root,n,MPI_COMM_WORLD,ierr)
>>>>>
>>>>> in some other part of the code process 0 acts on the 3d array and
>>>>> turns it into a 4d one and sends it back to process 1, which receives
>>>>> it with
>>>>>
>>>>> call MPI_RECV(tonode,4,MPI_DOUBLE_PRECISION,root,n,MPI_COMM_WORLD,status,ierr)
>>>>>
>>>>> in practice, what I do is basically given by this simple code (which,
>>>>> unfortunately, doesn't give the segmentation fault):
>>>>>
>>>>>
>>>>>
>>>>>       ! note: program unit, declarations and MPI_FINALIZE are inferred
>>>>>       ! from the calls below so that this snippet compiles as-is
>>>>>       program example
>>>>>       implicit none
>>>>>       include 'mpif.h'
>>>>>       integer :: a(5), b(4), id, numprocs, ierr, k
>>>>>       integer :: status(MPI_STATUS_SIZE)
>>>>>
>>>>>       a=(/1,2,3,4,5/)
>>>>>
>>>>>       call MPI_INIT(ierr)
>>>>>       call MPI_COMM_RANK(MPI_COMM_WORLD, id, ierr)
>>>>>       call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
>>>>>
>>>>>       if(numprocs/=2) stop
>>>>>
>>>>>       if(id==0) then
>>>>>               do k=1,5
>>>>>                       a=a+1
>>>>>                       call MPI_SEND(a,5,MPI_INTEGER,1,k,MPI_COMM_WORLD,ierr)
>>>>>                       call MPI_RECV(b,4,MPI_INTEGER,1,k,MPI_COMM_WORLD,status,ierr)
>>>>>               end do
>>>>>       else
>>>>>               do k=1,5
>>>>>                       call MPI_RECV(a,5,MPI_INTEGER,0,k,MPI_COMM_WORLD,status,ierr)
>>>>>                       b=a(1:4)
>>>>>                       call MPI_SEND(b,4,MPI_INTEGER,0,k,MPI_COMM_WORLD,ierr)
>>>>>               end do
>>>>>       end if
>>>>>
>>>>>       call MPI_FINALIZE(ierr)
>>>>>       end program example
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> * Dr. Aurélien Bouteiller
>>> * Sr. Research Associate at Innovative Computing Laboratory
>>> * University of Tennessee
>>> * 1122 Volunteer Boulevard, suite 350
>>> * Knoxville, TN 37996
>>> * 865 974 6321
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>>
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> End of users Digest, Vol 1006, Issue 2
>> **************************************
>>
>
