> > As other people said, Fujitsu MPI used in K is based on old
> > Open MPI (v1.6.3 with bug fixes).
>
> I guess the obvious question is will the vanilla Open-MPI work on K?
Unfortunately no. Support for the Tofu interconnect and the Fujitsu
resource manager is not included in vanilla Open MPI.
Takahiro Kawashima,
MPI development team,
Fujitsu
Samuel,
I am a developer of Fujitsu MPI. Thanks for using the K computer.
For official support, please consult the K helpdesk, as Gilles
said. The helpdesk may have information based on past inquiries.
If not, your inquiry will be forwarded to our team.
As other people said, Fujitsu MPI
Debian/Sid system w/ glibc-2.24.
>
> The patch you pointed me at does appear to fix the problem!
> I will note this in your PRs.
>
> -Paul
>
> On Mon, Aug 21, 2017 at 9:17 PM, Kawashima, Takahiro <
> t-kawash...@jp.fujitsu.com> wrote:
>
> > Paul,
> >
> >
Paul,
Did you upgrade glibc or something? I suspect newer glibc
supports process_vm_readv and process_vm_writev, so the output
of the configure script changed. My Linux/SPARC64 system with an
old glibc can compile Open MPI 2.1.2rc2 (CMA is disabled there).
To fix this, we need to cherry-pick d984b4b. Could you test the
It might be related to https://github.com/open-mpi/ompi/issues/3697 .
I added a comment to the issue.
Takahiro Kawashima,
Fujitsu
> On a PPC64LE w/ gcc-7.1.0 I see opal_fifo hang instead of failing.
>
> -Paul
>
> On Mon, Jul 3, 2017 at 4:39 PM, Paul Hargrove wrote:
>
> >
ing now, nor in
> 2.0.x. I've checked the master, and it also does not work there. Is
> there any timeline for this?
>
> Thanks a lot!
>
> Marcin
>
>
>
> On 04/04/2017 11:03 AM, Kawashima, Takahiro wrote:
> > Hi,
> >
> > I encountered a simi
Hi,
I encountered a similar problem using MPI_COMM_SPAWN last month.
Your problem may be the same.
The problem was fixed by commit 0951a34 in Open MPI master and
backported to v2.1.x and v2.0.x, but not to v1.8.x or
v1.10.x.
https://github.com/open-mpi/ompi/commit/0951a34
Please try the
Hi,
I created a pull request to add the persistent collective
communication request feature to Open MPI. Though it's
incomplete and will not be merged into Open MPI soon,
you can experiment with your collective algorithms on top of my work.
https://github.com/open-mpi/ompi/pull/2758
Takahiro Kawashima,
Gilles, Jeff,
In the Open MPI 1.6 days, MPI_ARGVS_NULL and MPI_STATUSES_IGNORE
were defined as double precision, and MPI_Comm_spawn_multiple,
MPI_Waitall, etc. had two Fortran interface subroutines each.
https://github.com/open-mpi/ompi-release/blob/v1.6/ompi/include/mpif-common.h#L148
ram” for OpenMPI code base that shows
> existing classes and dependencies/associations. Are there any available tools
> to extract and visualize this information?
Thanks,
KAWASHIMA Takahiro
> I just checked MPICH 3.2, and they *do* include MPI_SIZEOF interfaces for
> CHARACTER and LOGICAL, but they are missing many of the other MPI_SIZEOF
> interfaces that we have in OMPI. Meaning: OMPI and MPICH already diverge
> wildly on MPI_SIZEOF. :-\
And OMPI 1.6 also had MPI_SIZEOF
ble check about using PID. if a broadcast is needed, i would
> rather use the process name of rank 0 in order to avoid a broadcast.
>
> Cheers,
>
> Gilles
>
> On 2/3/2016 8:40 AM, Kawashima, Takahiro wrote:
> > Nathan,
> >
> > Is it sufficient?
> > Mul
Nathan,
Is it sufficient?
Multiple windows can be created on a communicator,
so I think PID + CID is not sufficient.
Possible fixes:
- The root process creates a filename with a random number
  and broadcasts it in the communicator.
- Use a per-communicator counter and include it in the filename.
`configure && make && make install && make check` and
running some sample MPI programs succeeded with 1.10.1rc3
on my SPARC-V9/Linux/GCC machine (Fujitsu PRIMEHPC FX10).
No @SET_MAKE@ appears in any Makefiles, of course.
> > For the first time I was also able to (attempt to) test SPARC64 via
Brice,
I'm a developer of Fujitsu MPI for K computer and Fujitsu
PRIMEHPC FX10/FX100 (SPARC-based CPU).
Though I'm not familiar with the hwloc code and didn't know
the issue reported by Gilles, I would also be able to help
you fix it.
Takahiro Kawashima,
MPI development team,
Fujitsu
Oh, I also noticed it yesterday and was about to report it.
And one more: the base parameter of MPI_Win_detach.
Regards,
Takahiro Kawashima
> Dear OpenMPI developers,
>
> I noticed a bug in the definition of the 3 MPI-3 RMA functions
> MPI_Compare_and_swap, MPI_Fetch_and_op and
Hi folks,
`configure && make && make install && make test` and
running some sample MPI programs succeeded with 1.10.0rc1
on my SPARC-V9/Linux/GCC machine (Fujitsu PRIMEHPC FX10).
Takahiro Kawashima,
MPI development team,
Fujitsu
> Hi folks
>
> Now that 1.8.7 is out the door, we need to switch
formation that may be useful for users and developers.
Not so verbose. Output only on initialization or
object creation etc.
DEBUG:
Information that is useful only for developers.
Not so verbose. Output once per MPI routine call.
TRACE:
Information that is useful only for developers.
V
e epochs.
>
>
> and the test case calls MPI_Win_fence with MPI_MODE_NOPRECEDE.
>
> are you saying the Open MPI implementation of MPI_Win_fence should
> perform a barrier in this case (i.e. with MPI_MODE_NOPRECEDE)?
>
> Cheers,
>
> Gilles
>
> On 4/21/2015 11:08 AM
manually adding a MPI_Barrier.
>
> Cheers,
>
> Gilles
>
> On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:
> > Hi Gilles, Nathan,
> >
> > I read the MPI standard but I think the standard doesn't
> > require a barrier in the test program.
> >
>
Hi Gilles, Nathan,
I read the MPI standard but I think the standard doesn't
require a barrier in the test program.
From the standard (11.5.1 Fence):
A fence call usually entails a barrier synchronization:
a process completes a call to MPI_WIN_FENCE only after all
other processes in
Yes, Fujitsu MPI is running on sparcv9-compatible CPU.
Though we currently use only the stable series (v1.6, v1.8),
they work fine.
Takahiro Kawashima,
MPI development team,
Fujitsu
> Nathan,
>
> Fujitsu MPI is Open MPI-based and runs on their sparcv9-like processor.
>
> Cheers,
>
> Gilles
>
>
Thanks!
> Takahiro,
>
> Sorry for the delay in answering. Thanks for the bug report and the patch.
> I applied your patch, and added some tougher tests to make sure we catch
> similar issues in the future.
>
> Thanks,
> George.
>
>
> On Mon, Sep 29, 2014 at 8
Hi George,
Thank you for attending the meeting in Kyoto. As we discussed
at the meeting, my colleague is suffering from a datatype problem.
See attached create_resized.c. It creates a datatype with an
LB marker using MPI_Type_create_struct and MPI_Type_create_resized.
Expected contents of the output
just FYI:
configure && make && make install && make test
succeeded on my SPARC64/Linux/GCC (both enable-debug=yes and no).
Takahiro Kawashima,
MPI development team,
Fujitsu
> Usual place:
>
> http://www.open-mpi.org/software/ompi/v1.8/
>
> Please beat it up as we want to release on Fri,
Hi Siegmar, Ralph,
I forgot to follow the previous report, sorry.
The patch I suggested is not included in Open MPI 1.8.2.
The backtrace Siegmar reported points to the problem that I
fixed in the patch.
http://www.open-mpi.org/community/lists/users/2014/08/24968.php
Siegmar:
Could you try my
Hi Ralph,
Your commit r32459 fixed the bus error by correcting
opal/dss/dss_copy.c. It's OK for trunk because mca_dstore_hash
calls dss to copy data. But it's insufficient for v1.8 because
mca_db_hash doesn't call dss and copies data itself.
The attached patch is the minimum patch to fix it in
Siegmar, Ralph,
I'm sorry for responding so late since last week.
Ralph fixed the problem in r32459 and it was merged to v1.8
in r32474. But in v1.8 an additional custom patch is needed
because the db/dstore source code differs between trunk
and v1.8.
I'm preparing and testing the custom
supported by all compilers */
>
> as far as i am concerned, the same issue is also in the trunk,
> and if you do not hit it, it just means you are lucky :-)
>
> the same issue might also be in other parts of the code :-(
>
> Cheers,
>
> Gilles
>
> On 2014/08/08 13:4
> as a workaround, you can declare an opal_process_name_t (for alignment),
> and cast it to an orte_process_name_t
>
> i will write a patch (i will not be able to test on sparc ...)
> please note this issue might be present in other places
>
> Cheers,
>
> Gilles
>
>
Hi,
> > >>> I have installed openmpi-1.8.2rc2 with gcc-4.9.0 on Solaris
> > >>> 10 Sparc and I receive a bus error, if I run a small program.
I've finally reproduced the bus error in my SPARC environment.
#0 0x00db4740 (__waitpid_nocancel + 0x44)
George,
I compiled trunk with your patch for SPARCV9/Linux/GCC.
I see the following warnings/errors.
In file included from opal/include/opal/sys/atomic.h:175,
from opal/asm/asm.c:21:
with_wrapper_cxxflags=-g
with_wrapper_fflags=-g
with_wrapper_fcflags=-g
Regards,
KAWASHIMA Takahiro
> The problem is the code in question does not check the return code of
> MPI_T_cvar_handle_alloc . We are returning an error and they still try
> to use the handle (which is stale).
nel, and abnormal values are printed
if not yet.
So this SEGV doesn't occur if I configure Open MPI with
--disable-dlopen option. I think it's the reason why Nathan
doesn't see this error.
Regards,
KAWASHIMA Takahiro
Hi,
The attached patch corrects trivial typos in man files and
FUNC_NAME variables in ompi/mpi/c/*.c files.
One note which may not be trivial:
Before MPI-2.1, the MPI standard said MPI_PACKED should be used for
MPI_{Pack,Unpack}_external. But in MPI-2.1, it was changed to
use MPI_BYTE. See 'B.3
Hi,
Open MPI's signal handler (show_stackframe function defined in
opal/util/stacktrace.c) calls non-async-signal-safe functions
and it causes a problem.
See the attached mpisigabrt.c. Passing corrupted memory to realloc(3)
causes SIGABRT, and the show_stackframe function is invoked.
But invoked
It is a bug in the test program, test/datatype/ddt_raw.c, and it was
fixed at r24328 in trunk.
https://svn.open-mpi.org/trac/ompi/changeset/24328
I've confirmed the failure occurs with plain v1.6.5 and it doesn't
occur with patched v1.6.5.
Thanks,
KAWASHIMA Takahiro
> Not s
Thanks!
Takahiro Kawashima,
MPI development team,
Fujitsu
> Pushed in r29187.
>
> George.
>
>
> On Sep 17, 2013, at 12:03 , "Kawashima, Takahiro"
> <t-kawash...@jp.fujitsu.com> wrote:
>
> > George,
> >
> > Copyright-added patch is
o
> long before being discovered (especially the extent issue in the
> MPI_ALLTOALL). Please feel free to apply your patch and add the correct
> copyright at the beginning of all altered files.
>
> Thanks,
> George.
>
>
>
> On Sep 17, 2013, at 07:
Hi,
My colleague tested MPI_IN_PLACE for MPI_ALLTOALL, MPI_ALLTOALLV,
and MPI_ALLTOALLW, which was implemented two months ago in Open MPI
trunk. He found three bugs and created a patch.
The bugs found are:
(A) Missing MPI_IN_PLACE support in self COLL component
The attached
George,
Thanks. I've confirmed your patch.
I wrote a simple program to test your patch and no problems are found.
The test program is attached to this mail.
Regards,
KAWASHIMA Takahiro
> Takahiro,
>
> Please find below another patch, this time hopefully fixing all issues. The
George,
An improved patch is attached. The latter half is the same as your patch.
But again, I'm not sure this is a correct solution.
It works correctly for my attached put_dup_type_3.c.
Run as "mpiexec -n 1 ./put_dup_type_3".
It will print seven OKs on success.
Regards,
KAWASHIMA Takahiro
No. My patch doesn't work for a simpler case,
just a duplicate of MPI_INT.
The datatype code is too complex for me ...
Regards,
KAWASHIMA Takahiro
> George,
>
> Thanks. But no, your patch does not work correctly.
>
> The assertion failure disappeared by your patch but the value
t;total_pack_size = 0;
break;
case MPI_COMBINER_CONTIGUOUS:
This patch in addition to your patch works correctly for my program.
But I'm not sure this is a correct solution.
Regards,
KAWASHIMA Takahiro
> Takahiro,
>
> Nice catch. That particular code was an over-optimiza
George,
My colleague was working on your ompi-topo bitbucket repository
but the work was not completed. However, he found bugs in the patch
attached to your previous mail and created a patch to fix them. See
the attached patch, which applies against Open MPI trunk + your patch.
His test programs are also
nagement of the common desc_t into a nightmare
> … with the effect you noticed few days ago. Too bad for the optimization
> part. I now duplicate the desc_t between the two layers, and all OMPI
> datatypes have now their own desc_t.
>
> Thanks for finding and analyzing so deeply this
and OMPI
datatypes is allowed?
Regards,
KAWASHIMA Takahiro
new patch. Basically I went over all
> the possible cases, both in test and wait, and ensure the behavior is always
> consistent. Please give it a try, and let us know of the outcome.
>
> Thanks,
> George.
>
>
>
> On Jan 25, 2013, at 00:53 , "Kawashima, Takahi
and another for bug fixes, as described in
my previous mail.
Regards,
KAWASHIMA Takahiro
> Jeff, George,
>
> I've implemented George's idea for ticket #3123 "MPI-2.2: Ordering of
> attribution deletion callbacks on MPI_COMM_SELF". See attached
> delete-attr-order.patc
I have no preference on the macro names. Either one is OK for me.
Thanks,
KAWASHIMA Takahiro
> Hmm, maybe something like:
>
> OPAL_LIST_FOREACH, OPAL_LISTFOREACH_REV, OPAL_LIST_FOREACH_SAFE,
> OPAL_LIST_FOREACH_REV_SAFE?
>
> -Nathan
>
> On Thu, Jan 31, 2013 at 12:36:29AM +
Hi,
Agreed.
But how about backward traversal in addition to forward traversal?
e.g. OPAL_LIST_FOREACH_FW, OPAL_LIST_FOREACH_FW_SAFE,
OPAL_LIST_FOREACH_BW, OPAL_LIST_FOREACH_BW_SAFE
We sometimes search for an item from the end of a list.
Thanks,
KAWASHIMA Takahiro
> What: Add two new mac
of other bugs.
>
> Sent from my phone. No type good.
>
> On Jan 22, 2013, at 8:57 PM, "Kawashima, Takahiro"
> <t-kawash...@jp.fujitsu.com> wrote:
>
> > George,
> >
> > I reported the bug three months ago.
> > Your commit r27880 resolved on
and other 7 latest changesets are for bug/typo-fixes.
Regards,
KAWASHIMA Takahiro
> Jeff,
>
> OK. I'll try implementing George's idea and then you can compare which
> one is simpler.
>
> Regards,
> KAWASHIMA Takahiro
>
> > Not that I'm aware of; that would be gre
George,
I reported the bug three months ago.
Your commit r27880 resolved one of the bugs I reported,
using another approach.
http://www.open-mpi.org/community/lists/devel/2012/10/11555.php
But other bugs are still open.
"(1) MPI_SOURCE of MPI_Status for a null request must be
>>>
> >>>
> >>> On Jan 18, 2013, at 5:47 AM, George Bosilca <bosi...@icl.utk.edu>
> >>> wrote:
> >>>
> >>>> Takahiro,
> >>>>
> >>>> The MPI_Dist_graph effort is happening in
> >>
I've confirmed. Thanks.
Takahiro Kawashima,
MPI development team,
Fujitsu
> Done -- thank you!
>
> On Jan 11, 2013, at 3:52 AM, "Kawashima, Takahiro"
> <t-kawash...@jp.fujitsu.com> wrote:
>
> > Hi Open MPI core members and Rayson,
> >
> > I'v
Hi,
Fujitsu is interested in completing MPI-2.2 support in Open MPI and
the Open MPI-based Fujitsu MPI.
We've read the wiki and tickets. These two tickets seem to be almost
done but need testing and bug fixing.
https://svn.open-mpi.org/trac/ompi/ticket/2223
MPI-2.2: MPI_Dist_graph_* functions missing
Jeff,
OK. I'll try implementing George's idea and then you can compare which
one is simpler.
Regards,
KAWASHIMA Takahiro
> Not that I'm aware of; that would be great.
>
> Unlike George, however, I'm not concerned about converting to linear
> operations for attributes.
>
George,
Your idea makes sense.
Is anyone working on it? If not, I'll try.
Regards,
KAWASHIMA Takahiro
> Takahiro,
>
> Thanks for the patch. I deplore the loss of the hash table in the attribute
> management, as the potential of transforming all attributes operation to a
> li
e it, take in
this patch.
Though I'm an employee of a company, this is my independent and private
work done at home. No intellectual property from my company is involved.
If needed, I'll sign the Individual Contributor License Agreement.
Regards,
KAWASHIMA Takahiro
Hi Open MPI core members and Rayson,
I've confirmed with the authors and created the BibTeX reference.
Could you make an entry in the "Open MPI Publications" page that
links to Fujitsu's PDF file? The attached file contains the title,
authors, abstract, link URL, and BibTeX reference.
Hi,
Sorry for not replying sooner.
I'm talking with the authors (they are not on this list) and
will request linking the PDF soon if they allow it.
Takahiro Kawashima,
MPI development team,
Fujitsu
> Our policy so far was that adding a paper to the list of publication on the
> Open MPI website
ng this idea.
>
> Thanks,
> george.
>
>
> On Oct 18, 2012, at 03:06 , "Kawashima, Takahiro"
> <t-kawash...@jp.fujitsu.com> wrote:
>
> > Hi Open MPI developers,
> >
> > I found another issue in Open MPI.
> >
> > In MCA_PML
Hi Open MPI developers,
I found another issue in Open MPI.
In MCA_PML_OB1_RECV_FRAG_INIT macro in ompi/mca/pml/ob1/pml_ob1_recvfrag.h
file, we copy a PML header from an arrived message to another buffer,
as follows:
frag->hdr = *(mca_pml_ob1_hdr_t*)hdr;
On this copy, we cast hdr to
> files. Basically, it doesn't matter that we leave the last returned error
> code on an inactive request, as we always return MPI_STATUS_EMPTY in the
> status for such requests.
>
> Thanks,
> george.
>
>
> On Oct 15, 2012, at 07:02 , "Kawashima, Takahiro"
Hi Open MPI developers,
> > > The bugs are:
> > >
> > > (1) MPI_SOURCE of MPI_Status for a null request must be MPI_ANY_SOURCE.
> > >
> > > (2) MPI_Status for an inactive request must be an empty status.
> > >
> > > (3) Possible BUS errors on sparc64 processors.
> > >
> > > r23554 fixed
Hi Eugene,
> > I found some bugs in Open MPI and attach a patch to fix them.
> >
> > The bugs are:
> >
> > (1) MPI_SOURCE of MPI_Status for a null request must be MPI_ANY_SOURCE.
> >
> > (2) MPI_Status for an inactive request must be an empty status.
> >
> > (3) Possible BUS errors on sparc64
Hi Open MPI developers,
I found some bugs in Open MPI and attach a patch to fix them.
The bugs are:
(1) MPI_SOURCE of MPI_Status for a null request must be MPI_ANY_SOURCE.
3.7.3 Communication Completion in MPI-3.0 (and also MPI-2.2)
says an MPI_Status object returned by
_complete
> + * to true. Otherwise, the request will never be freed.
> + */
> +request->req_recv.req_base.req_pml_complete = true;
> OPAL_THREAD_UNLOCK(>matching_lock);
>
> OPAL_THREAD_LOCK(_request_lock);
> @@ -138,7 +129,7 @@
> MCA_PML_OB1_RECV_REQUEST_MPI
Hi Open MPI developers,
I found a small bug in Open MPI.
See attached program cancelled.c.
In this program, rank 1 tries to cancel an MPI_Irecv and calls MPI_Recv
instead if the cancellation succeeds. This program should terminate
whether the cancellation succeeds or not, but it leads to a