Hi Akihiro --

Thanks for looking into this so quickly.

My recollection is that --enable-debug should be sufficient to turn on GASNet execution time assertions and checking. From the GASNet README:

    --enable-debug - build GASNet in a debugging mode. This turns on C-level
      debugger options and also enables extensive error and sanity checking
      system-wide, which is highly recommended for developing and debugging
      GASNet clients (but should not be used for performance testing).

I haven't used this recently enough to recall whether anything is printed to indicate that it's running in debug mode. My guess would be that it would be a silent execution unless something inappropriate happened.

GASNET_BACKTRACE=1 ought to have an effect whether or not --enable-debug is set, I believe (though --enable-debug should increase the amount of information produced).

So, assuming we're not missing anything obvious, would it be possible to write a short C+GASNet program that simply tries to transfer a large buffer to reproduce this issue? (That might be more a question for the Malaga team, who have dug deeper into this, if I understand correctly.) If that program were similarly silent with debugging enabled, it sounds like the program we should send to the GASNet team.

Before we invest that effort, though, let me send them a quick message to see if there are any known issues with larger messages on ibv to avoid duplicated effort.
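
As a rough starting point for the kind of standalone program discussed above, here is a minimal C sketch of the transfer-and-verify part only. It is not a GASNet program: the names transfer_fn, local_transfer, fill_pattern, first_mismatch and check_transfer are all hypothetical, and local_transfer is a memcpy stand-in so the harness can be checked on one machine. A real reproducer would, presumably, replace that hook with a wrapper around GASNet's bulk get or put and run on two nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook: copies nbytes from src to dst, returns 0 on success.
 * Assumption: a real reproducer would wrap a GASNet bulk get/put here. */
typedef int (*transfer_fn)(void *dst, const void *src, size_t nbytes);

/* Local stand-in (plain memcpy) so the harness runs without a network. */
static int local_transfer(void *dst, const void *src, size_t nbytes) {
    memcpy(dst, src, nbytes);
    return 0;
}

/* Fill buf with a position-dependent pattern so that a shifted or
 * truncated transfer shows up as a mismatch. */
static void fill_pattern(uint8_t *buf, size_t nbytes) {
    for (size_t i = 0; i < nbytes; i++)
        buf[i] = (uint8_t)((i * 31u + 7u) & 0xffu);
}

/* Offset of the first differing byte, or nbytes if the buffers match. */
static size_t first_mismatch(const uint8_t *a, const uint8_t *b,
                             size_t nbytes) {
    for (size_t i = 0; i < nbytes; i++)
        if (a[i] != b[i])
            return i;
    return nbytes;
}

/* Send nbytes through xfer and verify the result.
 * Returns 0 if intact, 1 if corrupted, -1 on allocation/transfer failure. */
static int check_transfer(transfer_fn xfer, size_t nbytes) {
    uint8_t *src = malloc(nbytes);
    uint8_t *dst = calloc(nbytes, 1);
    int rc = -1;

    assert(nbytes > 0);
    if (src && dst) {
        fill_pattern(src, nbytes);
        if (xfer(dst, src, nbytes) == 0)
            rc = (first_mismatch(src, dst, nbytes) == nbytes) ? 0 : 1;
    }
    free(src);
    free(dst);
    return rc;
}
```

Calling check_transfer(local_transfer, n) for a range of sizes n exercises the harness itself; the offset reported by first_mismatch would be the interesting detail to pass along if a real transfer ever came back corrupted.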

-Brad


On Wed, 29 Jan 2014, Akihiro Hayashi wrote:

Hi, Brad,

Thank you for your suggestions.

I'm always glad when we can pass the blame to something other than Chapel.  :)
Exactly :)

I tried the --enable-debug flag at GASNet configuration time and rebuilt my
Chapel compiler and runtime.
Then I compiled and ran the program with GASNET_BACKTRACE=1, but I didn't get
any message.
Are debug messages supposed to be shown during program execution if we
configure GASNet with the --enable-debug flag?

Please let me know if you have any suggestions.

Best,

Akihiro

On Jan 29, 2014, at 2:40 PM, Brad Chamberlain wrote:


Hi guys --

While I'm sorry about the time spent on this issue, I'm always glad when we can 
pass the blame to something other than Chapel.  :)

Something I'm wondering is whether, if GASNet is built with runtime checks on 
(I think this is done using --enable-debug at GASNet configuration time), will 
the ibv-conduit issue show up as a runtime assertion or something other than a 
bug?  I think we may want to pass along a report of this issue to the GASNet 
team, in which case this is undoubtedly the first question they'll ask.

It would also be great if we had a standalone C+GASNet program that exhibited 
the issue (as they're not deeply invested in Chapel), but if that isn't a 
15-minute exercise, we can tell them how to reproduce in Chapel.

Thanks,
-Brad


On Wed, 29 Jan 2014, Akihiro Hayashi wrote:

Hi, Rafael,

Thanks for your reply. I inlined my comments below:

I’ve been talking to Rafael Larrosa regarding the issue you are reporting. He
has conducted some experiments with your code on Titan (Cray XK7 at ORNL, which
features the Cray Gemini interconnect) and there is no communication problem
there. However, we have here in Malaga a cluster based on InfiniBand
(ibv-conduit) and the execution fails on that platform. I’ve also confirmed
that udp-conduit does not pose any problem.
I really appreciate your and Rafael Larrosa's experiments. I'm glad to hear
that this is not a problem in the Chapel compiler.

Rafael Larrosa told me that he faced the same issue last month and that, after
spending more than two weeks tackling the problem, he is almost sure that there
is a bug (or a maximum buffer size limitation) in the ibv-conduit
implementation of GASNet. For his code, he found a turning point when the
transferred buffer was 128 MBytes: smaller communications work fine, but larger
ones always fail. He says it was tricky, because when you try to isolate the
problem (i.e., isolate the particular transfer that fails by executing just
this single communication), the problem vanishes. So it will be challenging to
chase this bug.

Sure, now I understand that there is a bug in the ibv-conduit implementation of
GASNet, and yes, it seems that fixing the bug is very difficult. Actually, the
original benchmark I want to run spawns many tasks with begin statements, and
each task does a bulk transfer. I can imagine that the benchmark exceeds some
limit, as Rafael's code does. I'm also wondering why the simplified code has
this problem. There might be another problem.

You may want to circumvent the bug by:

1.- Not using the bulkComms optimization (-suseBulkTransferStride=false
-suseBulkTransfer=false). --> Slower comms.
2.- Implementing a version of bulkComms that splits big messages into smaller
ones. --> Wearisome tinkering.
3.- Avoiding ibv-conduit. --> 207 out of the 500 supercomputers in the latest
Top500 list are based on ibv.
4.- Diving into the ibv-conduit implementation. --> Probably not your main
research goal.

For the time being we are conducting all our experiments on Cray machines, so 
we do not plan (and do not have time) to tackle 2 or 4, so we are getting by 
with 3.
Exactly, 4 is not my research goal. I'd choose 3 if a benchmark I would like
to run uses bulk transfer. Thanks for your suggestions.

If Rafael wants to chime in, he can probably give you more details and advice,
should you want to debug your code at a lower level.

I would appreciate it if he could give me more details. I think I should
mention the bug in my paper or something.

Best,

Akihiro

On Jan 29, 2014, at 4:46 AM, Rafael Asenjo Plaza wrote:

Hi Akihiro,

I’ve been talking to Rafael Larrosa regarding the issue you are reporting. He
has conducted some experiments with your code on Titan (Cray XK7 at ORNL, which
features the Cray Gemini interconnect) and there is no communication problem
there. However, we have here in Malaga a cluster based on InfiniBand
(ibv-conduit) and the execution fails on that platform. I’ve also confirmed
that udp-conduit does not pose any problem.

Rafael Larrosa told me that he faced the same issue last month and that, after
spending more than two weeks tackling the problem, he is almost sure that there
is a bug (or a maximum buffer size limitation) in the ibv-conduit
implementation of GASNet. For his code, he found a turning point when the
transferred buffer was 128 MBytes: smaller communications work fine, but larger
ones always fail. He says it was tricky, because when you try to isolate the
problem (i.e., isolate the particular transfer that fails by executing just
this single communication), the problem vanishes. So it will be challenging to
chase this bug.

You may want to circumvent the bug by:

1.- Not using the bulkComms optimization (-suseBulkTransferStride=false
-suseBulkTransfer=false). --> Slower comms.
2.- Implementing a version of bulkComms that splits big messages into smaller
ones. --> Wearisome tinkering.
3.- Avoiding ibv-conduit. --> 207 out of the 500 supercomputers in the latest
Top500 list are based on ibv.
4.- Diving into the ibv-conduit implementation. --> Probably not your main
research goal.

For the time being we are conducting all our experiments on Cray machines, so 
we do not plan (and do not have time) to tackle 2 or 4, so we are getting by 
with 3.
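
To make option 2 concrete, the idea of splitting one big message into smaller ones can be sketched as follows. This is only an illustration in C, not the bulkComms code: copy_fn, memcpy_piece and chunked_transfer are hypothetical names, memcpy stands in for the real communication call, and the chunk size that would stay clear of the problem is not known from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook for one transfer; returns 0 on success.
 * Assumption: the real code would issue a remote get/put here. */
typedef int (*copy_fn)(void *dst, const void *src, size_t nbytes);

/* Local stand-in so the splitting logic can be checked on one machine. */
static int memcpy_piece(void *dst, const void *src, size_t nbytes) {
    memcpy(dst, src, nbytes);
    return 0;
}

/* Move nbytes from src to dst in pieces of at most max_chunk bytes.
 * Returns the number of pieces issued, or -1 if any piece fails. */
static long chunked_transfer(copy_fn xfer, void *dst, const void *src,
                             size_t nbytes, size_t max_chunk) {
    char *d = dst;
    const char *s = src;
    size_t done = 0;
    long pieces = 0;

    assert(max_chunk > 0);
    while (done < nbytes) {
        size_t n = nbytes - done;   /* bytes still to move */
        if (n > max_chunk)
            n = max_chunk;          /* cap this piece at the chunk size */
        if (xfer(d + done, s + done, n) != 0)
            return -1;
        done += n;
        pieces++;
    }
    return pieces;
}
```

For example, moving 10 bytes with max_chunk set to 4 issues three pieces of 4, 4 and 2 bytes. The cost of this approach is one communication call per piece, which is the trade-off against the single large transfer that bulkComms normally generates.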

If Rafael wants to chime in, he can probably give you more details and advice,
should you want to debug your code at a lower level.

Regards,

Rafa.

On Jan 28, 2014, at 7:31 PM, Akihiro Hayashi <[email protected]> wrote:

Hi, Rafael,

Sorry for the delayed reply.
Let me share the program that reproduces the problem (attached below).

As you can see, the program prints "INVALID? : true" if we get a bulk copy
transfer error; otherwise it prints "INVALID? : false".
I get the error when I run the program on 2 locales with ibv-conduit
(mpi-spawner). The input data size is matrixSize = 2000 and tileSize = 200.
Please let me know if you want the input file.
Note that I don't get the error when I run the program on 1 locale. In
addition, I don't get the error with a smaller data size, even on 2 or more
locales (e.g., a 10x10 matrix and a 2x2 tile size).
I'm guessing that using ibv-conduit and transferring a certain amount of data
triggers this problem.
FYI, using udp-conduit (amudprun) does not show the error.

Please let me know if you have any comments or questions.

Best,

Akihiro

--

use BlockDist;

config const matrixSize: int(32) = -1;
config const   tileSize: int(32) = -1;
config const     inFile: string = "m_2000.in";
const zero: int(32) = 0;
var tile_array_indices = {zero..tileSize-1, zero..tileSize-1};

class Tile {
  var tile_array: [tile_array_indices] real;
}

// Read a matrixSize x matrixSize array of reals from fileName.
proc read_2D_array(fileName: string, matrixSize: int(32)) {
  var input_stream = open(fileName, iomode.r);
  var reader = input_stream.reader();
  var matrix_index_2D = {0..matrixSize-1, 0..matrixSize-1};
  var array: [matrix_index_2D] real;

  for ij in matrix_index_2D do {
    reader.read(array(ij));
  }
  // Close the reader before the file it reads from.
  reader.close();
  input_stream.close();
  // if (debug) { writeln("whole array: ",array); }
  return array;
}

proc main(): void {
  writeln("numLocales : ", numLocales);

  var numTiles: int(32) = matrixSize/tileSize;
  var numTiles_2: int(64) = matrixSize/tileSize;

  var whole_array = read_2D_array(inFile, matrixSize);

  var proto_ijk_space = {zero..numTiles_2-1, zero..numTiles_2, zero..numTiles_2};
  var ijk_space = proto_ijk_space dmapped Block(boundingBox=proto_ijk_space);
  var lkji_tiles: [ijk_space] Tile;

  // Fill the lower-triangular tiles, each on the locale that owns it.
  for i in zero..numTiles-1 do {
    for j in zero..i do {
      on lkji_tiles(i,j,zero).locale do {
        var curr_tile: Tile = new Tile();
        for (ii,jj) in tile_array_indices do {
          curr_tile.tile_array(ii,jj) = whole_array(i*tileSize+ii, j*tileSize+jj);
        }
        lkji_tiles(i,j,zero) = curr_tile;
      }
    }
  }

  // Copy each tile with a whole-array copy, then compare the copy
  // against the original element by element.
  var invalid: bool = false;
  for i in zero..numTiles-1 do {
    for iB in zero..tileSize-1 do {
      for j in zero..i do {
        var temp = lkji_tiles(i,j,zero).tile_array;
        if (i != j) {
          for jB in zero..tileSize-1 do {
            if (temp(iB,jB) != lkji_tiles(i,j,zero).tile_array(iB,jB)) {
              invalid = true;
            }
          }
        } else {
          // Diagonal tiles: only the lower triangle is compared.
          for jB in zero..iB do {
            if (temp(iB,jB) != lkji_tiles(i,j,zero).tile_array(iB,jB)) {
              invalid = true;
            }
          }
        }
      }
    }
  }
  writeln("INVALID? : ", invalid);
}

On Jan 22, 2014, at 1:46 PM, Akihiro Hayashi wrote:

Hi Rafael,

Thanks for your reply.

I inlined my comments below:

May we have a simplified copy of your code (kinda the snippet provided below 
but with initial values for tileSize, numTiles_2, k, etc. i.e. something that 
compiles) so that we can also give it a go?
Yes, it would be better if we had simplified code.
Actually, I have been trying for several weeks to write simple code that
reproduces this problem, and I finally managed to do it this morning.
Let me ask my advisor if we can show you the code.

Would you like to try also with these flags?:

-suseBulkTransferStride=true -suseBulkTransfer=false
I tried these flags, but I still get the error.

I'll keep you updated.

Best,

Akihiro

On Jan 22, 2014, at 5:23 AM, Rafael Asenjo Plaza wrote:

Hi Akihiro,

May we have a simplified copy of your code (kinda the snippet provided below 
but with initial values for tileSize, numTiles_2, k, etc. i.e. something that 
compiles) so that we can also give it a go?

Would you like to try also with these flags?:

-suseBulkTransferStride=true -suseBulkTransfer=false

Thank you,

Rafa.

On Jan 21, 2014, at 6:33 PM, Akihiro Hayashi <[email protected]> wrote:

Dear Chapel developers,

This is Akihiro Hayashi, a postdoc at Rice University.
I'm writing to ask about an array copy failure in Chapel.

I'm trying to evaluate some Chapel benchmarks across multiple nodes, but I get
a strange error.
Please note that I'm using an old version of the Chapel compiler (r21945) with
qthread-1.10 and GASNet-1.20.2 (infiniband-conduit, mpi-spawner) because the
latest version does not work.
With the latest version of the Chapel compiler (r22568) with qthread-1.10 and
GASNet-1.22.0 (infiniband-conduit, mpi-spawner), I get a SEGV when running a
simple program (coforall loc in Locales do on loc { writeln(loc); }) across
multiple nodes with the mpi spawner.
This is a separate problem that I have not investigated yet. I'll work on it
later.

The following problem might be fixed in the latest version, but any comments
and suggestions are appreciated.
Here is part of my code.
The main data structure is a 3-dimensional distributed array, each element of
which refers to a 2-dimensional array.
You can see the array copy statement (liBlock = lkji_tiles(k,k,k+1).tile_array;)
in Line 11. I want to use this copy statement because the Chapel compiler
generates bulk transfer code for it, which accelerates program execution.

// Code
1: const zero: int(32) = 0;
2: var tile_array_indices = {zero..tileSize-1,zero..tileSize-1};
3: class Tile {
4:    var tile_array: [tile_array_indices] real;
5: }
6: var proto_ijk_space = {zero..numTiles_2-1, zero..numTiles_2, zero..numTiles_2};
7: var ijk_space = proto_ijk_space dmapped Block(boundingBox=proto_ijk_space);
8: var lkji_tiles: [ijk_space] Tile;
...
9: begin {
...
10:     var liBlock: [tile_array_indices] real;
11:     liBlock = lkji_tiles(k,k,k+1).tile_array;
12:     for (m,n) in tile_array_indices {
13:       if (liBlock(m,n) != lkji_tiles(k,k,k+1).tile_array(m,n)) {
14:         invalid = true;
15:       }
16:     }
17:     if (invalid) { writeln("Copy Failed"); }
18:     ...
19: }
...

In my experiment, when running the program on 2 or more locales, the program
prints "Copy Failed", which means that
"liBlock = lkji_tiles(k,k,k+1).tile_array;" in Line 11 failed.
This happens sometimes (not always), and I confirmed that the copy succeeds
if I replace the array copy in Line 11 with a copy loop.
Additionally, I see the same behavior when I replace the array copy in
Line 11 with liBlock._value.doiBulkTransfer(lkji_tiles(k,k,k+1).tile_array);.

Here is the runtime output log when I compile the program with -s
debugBulkTransfer (tileSize=200):

-- Log starts here
In DefaultRectangularArr.doiBulkTransfer(): Alo=(0, 0), Blo=(0, 0), len=40000, 
elemSize=8;
-- End of Log

In both cases, the runtime internally calls the chpl_comm_get API (*), which
takes the above parameters.
They look correct to me.
(*) Please take a look at the doiBulkTransfer function in
CHPL_HOME/modules/internal/DefaultRectangular.chpl
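
For a sense of scale, the logged parameters can be turned into a byte count and compared with the 128 MByte turning point reported elsewhere in this thread. The C sketch below is only arithmetic on the numbers in the log; the helper names are hypothetical, and reading "128MBytes" as 128 * 2^20 bytes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: "128MBytes" is read as 128 * 2^20 bytes. */
#define TURNING_POINT_BYTES (UINT64_C(128) << 20)

/* Bytes moved by one bulk transfer of len elements, elem_size bytes each. */
static uint64_t transfer_bytes(uint64_t len, uint64_t elem_size) {
    return len * elem_size;
}

/* How many whole transfers of this size fit under the turning point. */
static uint64_t transfers_below_turning_point(uint64_t bytes_per_transfer) {
    assert(bytes_per_transfer > 0);
    return TURNING_POINT_BYTES / bytes_per_transfer;
}
```

With len=40000 (a 200x200 tile) and elemSize=8 from the log, transfer_bytes gives 320000 bytes per tile copy, and transfers_below_turning_point gives 419. A single tile copy is therefore far smaller than that figure, so if the same limit is involved it would have to be reached by something other than the size of one transfer.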

Any comments and suggestions are appreciated.

Best regards,

Akihiro
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers

__
Rafael Asenjo Plaza
Dept. Arquitectura de Computadores
Complejo Tecnologico Campus de Teatinos
E-29071 MALAGA (SPAIN)
Tel: +34 95 213 27 91
Fax: +34 95 213 27 90
http://www.ac.uma.es/~asenjo




