Re: [OMPI devel] "Open MPI"-based MPI library used by K computer

2011-11-14 Thread Christopher Samuel

On 14/11/11 21:27, Y.MATSUMOTO wrote:

> I'm a member of the MPI library development team at Fujitsu;
> Takahiro Kawashima, who sent mail to this list before, is my colleague.
> We are starting to feed back our changes.

First of all, I'd like to say congratulations on breaking
10PF, and also a big thank you for working on contributing
changes back to Open MPI!

Whilst I can't comment on the fix itself, I can confirm that
I also see segfaults with Open MPI 1.4.2 and 1.4.4 with your
example program.

Intel compilers 11.1:

--------------------------------------------------------------------------
[bruce002:03973] *** Process received signal ***
[bruce002:03973] Signal: Segmentation fault (11)
[bruce002:03973] Signal code: Address not mapped (1)
[bruce002:03973] Failing at address: 0x10009
[bruce002:03973] [ 0] /lib64/libpthread.so.0 [0x3e1320eb10]
[bruce002:03973] [ 1] /usr/local/openmpi/1.4.4-intel/lib/libmpi.so.0 [0x2ab5d79d]
[bruce002:03973] [ 2] /usr/local/openmpi/1.4.4-intel/lib/libopen-pal.so.0(opal_progress+0x87) [0x2b1fdc27]
[bruce002:03973] [ 3] /usr/local/openmpi/1.4.4-intel/lib/libmpi.so.0 [0x2abce252]
[bruce002:03973] [ 4] /usr/local/openmpi/1.4.4-intel/lib/libmpi.so.0(PMPI_Recv+0x213) [0x2ab1e0f3]
[bruce002:03973] [ 5] ./tp_lb_ub_ng(main+0x29b) [0x4021ab]
[bruce002:03973] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e12a1d994]
[bruce002:03973] [ 7] ./tp_lb_ub_ng [0x401e59]
[bruce002:03973] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 3973 on node bruce002 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[bruce002:03972] *** Process received signal ***
[bruce002:03972] Signal: Segmentation fault (11)
[bruce002:03972] Signal code: Address not mapped (1)
[bruce002:03972] Failing at address: 0xff84bad0
[bruce002:03972] [ 0] /lib64/libpthread.so.0 [0x3e1320eb10]
[bruce002:03972] [ 1] ./tp_lb_ub_ng(__intel_new_memcpy+0x2c) [0x403c9c]
[bruce002:03972] *** End of error message ***


GCC 4.4.4:

--------------------------------------------------------------------------
[bruce002:04049] *** Process received signal ***
[bruce002:04049] Signal: Segmentation fault (11)
[bruce002:04049] Signal code: Address not mapped (1)
[bruce002:04049] Failing at address: 0x10009
[bruce002:04049] [ 0] /lib64/libpthread.so.0 [0x3e1320eb10]
[bruce002:04049] [ 1] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2ab51f27]
[bruce002:04049] [ 2] /usr/local/openmpi/1.4.4-gcc/lib/libopen-pal.so.0(opal_progress+0x5a) [0x2b14bb3a]
[bruce002:04049] [ 3] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2abb9985]
[bruce002:04049] [ 4] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0(PMPI_Recv+0x12f) [0x2ab1913f]
[bruce002:04049] [ 5] ./tp_lb_ub_ng(main+0x21c) [0x400dd0]
[bruce002:04049] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e12a1d994]
[bruce002:04049] [ 7] ./tp_lb_ub_ng [0x400af9]
[bruce002:04049] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 4049 on node bruce002 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[bruce002:04048] *** Process received signal ***
[bruce002:04048] Signal: Segmentation fault (11)
[bruce002:04048] Signal code: Address not mapped (1)
[bruce002:04048] Failing at address: 0x2aaab0833000
[bruce002:04048] [ 0] /lib64/libpthread.so.0 [0x3e1320eb10]
[bruce002:04048] [ 1] /lib64/libc.so.6(memcpy+0x3ff) [0x3e12a7c63f]
[bruce002:04048] [ 2] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2aafef7b]
[bruce002:04048] [ 3] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2ab4fcdd]
[bruce002:04048] [ 4] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2abc1563]
[bruce002:04048] [ 5] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2abbce78]
[bruce002:04048] [ 6] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2ab52036]
[bruce002:04048] [ 7] /usr/local/openmpi/1.4.4-gcc/lib/libopen-pal.so.0(opal_progress+0x5a) [0x2b14bb3a]
[bruce002:04048] [ 8] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0 [0x2abba5f5]
[bruce002:04048] [ 9] /usr/local/openmpi/1.4.4-gcc/lib/libmpi.so.0(MPI_Send+0x177) [0x2ab1b1d7]
[bruce002:04048] [10] ./tp_lb_ub_ng(main+0x1e4) [0x400d98]
[bruce002:04048] [11] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e12a1d994]
[bruce002:04048] [12] ./tp_lb_ub_ng [0x400af9]
[bruce002:04048] *** End of error message ***


-- 
Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.unimelb.edu.au/


Re: [OMPI devel] VT issue

2011-11-14 Thread George Bosilca
This is supposed to be an intrinsic that GCC automatically replaces during
compilation. If I do the same configure as you (on the same machine), I have
the following in my opal_config.h:

/* Whether C compiler supports __builtin_expect */
#define OPAL_C_HAVE_BUILTIN_EXPECT 1
/* Whether C++ compiler supports __builtin_expect */
#define OMPI_CXX_HAVE_BUILTIN_EXPECT 0

This means that the C compiler supports __builtin_expect while the C++ compiler
doesn't. I guess the VT folks should fix their usage of the BUILTIN_EXPECT
macro …
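
As a minimal sketch of what such a guard usually looks like (modelled on the
OPAL_LIKELY/OPAL_UNLIKELY pattern in Open MPI's opal/prefetch.h; treat the
exact macro names here as an assumption, not verbatim Open MPI code), usage
of the intrinsic can be made conditional on the configure-time check, so a
compiler without support falls back to a plain expression instead of leaving
an unresolved ___builtin_expect symbol for the linker:

#if OPAL_C_HAVE_BUILTIN_EXPECT
/* The compiler replaces the call with a branch hint at compile time. */
#define OPAL_LIKELY(expr)   __builtin_expect(!!(expr), 1)
#define OPAL_UNLIKELY(expr) __builtin_expect(!!(expr), 0)
#else
/* No intrinsic available: degrade to the bare expression, so nothing
   is left for the linker to resolve. */
#define OPAL_LIKELY(expr)   (expr)
#define OPAL_UNLIKELY(expr) (expr)
#endif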

  george.

On Nov 14, 2011, at 12:22 , Ralph Castain wrote:

> Hi VT-folks
> 
> I'm building the devel trunk on a Mac, using a vanilla configure line: 
> ./configure --prefix=foo. When I try to compile, I get this error:
> 
> undefined symbols for architecture x86_64:
>  "___builtin_expect", referenced from:
>  _main.omp_fn.0 in otfprofile-otfprofile.o
> ld: symbol(s) not found for architecture x86_64
> collect2: ld returned 1 exit status
> 
> 
> I believe this comes from your VT code. Can you take a look?
> 
> Thanks
> Ralph
> 
> 




[OMPI devel] VT issue

2011-11-14 Thread Ralph Castain
Hi VT-folks

I'm building the devel trunk on a Mac, using a vanilla configure line: 
./configure --prefix=foo. When I try to compile, I get this error:

undefined symbols for architecture x86_64:
  "___builtin_expect", referenced from:
  _main.omp_fn.0 in otfprofile-otfprofile.o
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status


I believe this comes from your VT code. Can you take a look?

Thanks
Ralph




Re: [OMPI devel] "Open MPI"-based MPI library used by K computer

2011-11-14 Thread Y.MATSUMOTO

Dear Open MPI community,

I'm a member of the MPI library development team at Fujitsu;
Takahiro Kawashima, who sent mail to this list before, is my colleague.
We are starting to feed back our changes.

First, we fixed a problem with MPI_LB/MPI_UB and data packing.

The program crashes when all of the following conditions are met:
a: The send datatype is a contiguous derived type.
b: MPI_LB and/or MPI_UB is used in the datatype.
c: The size of the send data is smaller than its extent (the datatype has a gap).
d: The send count is greater than 1.
e: The total data size is larger than the "eager limit".

This problem occurs with the attached C program; a minimal reproducer sketch follows below.
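
Such a reproducer, written here purely from conditions a-e above (the actual
attachment is not reproduced in this archive, so names and sizes are
illustrative assumptions), could look like:

#include <mpi.h>
#include <string.h>

/* Derived type: 4 contiguous ints (size 16 bytes) padded with MPI_LB and
 * MPI_UB to an extent of 64 bytes, so the type has a 48-byte gap
 * (conditions a, b, c). */
int main(int argc, char **argv)
{
    int rank;
    int blens[3] = { 1, 4, 1 };
    MPI_Aint disps[3] = { 0, 0, 64 };
    MPI_Datatype types[3] = { MPI_LB, MPI_INT, MPI_UB };
    MPI_Datatype dtype;
    static char buf[64 * 10000];   /* count * extent bytes */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Type_struct(3, blens, disps, types, &dtype);
    MPI_Type_commit(&dtype);
    memset(buf, 0, sizeof(buf));

    /* Send count > 1 (condition d); 10000 * 16 bytes of payload is well
     * above typical eager limits (condition e). */
    if (rank == 0) {
        MPI_Send(buf, 10000, dtype, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 10000, dtype, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Type_free(&dtype);
    MPI_Finalize();
    return 0;
}

Run with two ranks (e.g. "mpirun -np 2 ./a.out"); the eager limit of each
BTL can be inspected with "ompi_info --param btl all".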

An invalid memory access occurs
because "done" takes an unintended value and
the value of "max_allowed" becomes negative
at the following place in "ompi/datatype/datatype_pack.c" (in version 1.4.3).


(ompi/datatype/datatype_pack.c)
188         packed_buffer = (unsigned char *) iov[iov_count].iov_base;
189         done = pConv->bConverted - i * pData->size;  /* partial data from last pack */
190         if( done != 0 ) {  /* still some data to copy from the last time */
191             done = pData->size - done;
192             OMPI_DDT_SAFEGUARD_POINTER( user_memory, done, pConv->pBaseBuf, pData, pConv->count );
193             MEMCPY_CSUM( packed_buffer, user_memory, done, pConv );
194             packed_buffer += done;
195             max_allowed -= done;
196             total_bytes_converted += done;
197             user_memory += (extent - pData->size + done);
198         }

This code assumes that "done" is the size of the partial data left over from
the last pack. However, when the program crashes, "done" equals the total size
of all data transmitted so far, which makes "max_allowed" negative.

We modified the code as follows and it passes our test suite,
but we are not sure this fix is correct. Can anyone review it?
A patch (against the Open MPI 1.4 branch) is attached to this mail.

-if( done != 0 ) {  /* still some data to copy from the last time */
+if( (done + max_allowed) >= pData->size ) {  /* still some data to copy from the last time */

Best regards,

Yuki MATSUMOTO
MPI development team,
Fujitsu

(2011/06/28 10:58), Takahiro Kawashima wrote:

Dear Open MPI community,

I'm a member of the MPI library development team at Fujitsu. Shinji
Sumimoto, whose name appears in Jeff's blog, is one of our bosses.

As Rayson and Jeff noted, the K computer, the world's most powerful HPC
system, developed by RIKEN and Fujitsu, uses Open MPI as the base of its MPI
library. We at Fujitsu are pleased to announce this, and we owe special
thanks to the Open MPI community.
We are sorry for the late announcement!

Our MPI library is based on the Open MPI 1.4 series and has a new point-
to-point component (BTL) and new topology-aware collective communication
algorithms (COLL). It is also adapted to our runtime environment (ESS,
PLM, GRPCOMM, etc.).

The K computer connects 68,544 nodes with our custom interconnect.
Its runtime environment is our proprietary one, so we do not use orted.
We cannot give start-up times yet because of disclosure restrictions, sorry.

We are impressed by the extensibility of Open MPI, and we have proved that
Open MPI scales to the 68,000-process level! It is a pleasure to use such
great open-source software.

We cannot disclose details of our technology yet because of our contract
with RIKEN AICS; however, we plan to feed back our improvements and bug
fixes. We can contribute some bug fixes soon, but contributing our
improvements will have to wait until next year, subject to Open MPI's
agreement.

Best regards,

MPI development team,
Fujitsu



I got more information:

http://blogs.cisco.com/performance/open-mpi-powers-8-petaflops/

Short version: yes, Open MPI is used on K and was used to power the 8PF runs.

w00t!



On Jun 24, 2011, at 7:16 PM, Jeff Squyres wrote:


w00t!

OMPI powers 8 petaflops!
(at least I'm guessing that -- does anyone know if that's true?)


On Jun 24, 2011, at 7:03 PM, Rayson Ho wrote:


Interesting... page 11:

http://www.fujitsu.com/downloads/TC/sc10/programming-on-k-computer.pdf

Open MPI based:

* Open Standard, Open Source, Multi-Platform including PC Cluster.
* Adding extension to Open MPI for "Tofu" interconnect

Rayson





Index: ompi/datatype/datatype_pack.c
===================================================================
--- ompi/datatype/datatype_pack.c   (revision 25474)
+++ ompi/datatype/datatype_pack.c   (working copy)
@@ -187,7 +187,7 @@
 
         packed_buffer = (unsigned char *) iov[iov_count].iov_base;
         done = pConv->bConverted - i * pData->size;  /* partial data from last pack */
-        if( done != 0 ) {  /* still some data to copy from the last time */
+        if( (done + max_allowed) >= pData->size ) {  /* still some data to copy from the last time */