Hi,

Yes, I submitted a fix for exactly that at the end of 2014:
https://github.com/SchedMD/slurm/commit/7fff5eed6b8fe97347a832149966ed11f5805f99

You need to check whether it is included in your version.

2015-08-21 17:31 GMT-04:00 Aaron Knister <[email protected]>:

> Hi Artem,
>
> Do you know if a fix for this was ever committed? We ran into this with a
> code base that builds non-MPI apps with mpicc and then attempts to run them
> multiple times from within a single SLURM task.
>
> -Aaron
>
> On Wed, May 21, 2014 at 9:12 AM, Artem Polyakov <[email protected]>
> wrote:
>
>> 2014-05-21 19:28 GMT+07:00 Hongjia Cao <[email protected]>:
>>
>>>
>>> Your debugging and analysis are correct.
>>>
>>> PMI2_Init() initializes PMI in two steps. First a PMI 1.1 init command is
>>> sent to the server and the version is negotiated with the server. After
>>> that a PMI 2.0 fullinit command is sent. Everything goes well so far.
>>> But once the version number is decided, the server no longer expects
>>> another PMI 1.1 init command, which is in a different format (see
>>> http://wiki.mpich.org/mpich/index.php/PMI_v2_Wire_Protocol).
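
To make the format difference concrete, here is a minimal C sketch of the
two framings. It assumes the PMI 2.0 length field is six characters of
space-padded decimal (the thread itself only shows that the server reads
six bytes and passes them to atoi). frame_pmi1_init and frame_pmi2_cmd are
hypothetical helper names, not functions from any PMI library.

```c
#include <stdio.h>
#include <string.h>

/* PMI 1.1 style: one newline-terminated line of space-separated key=value
 * pairs. Returns the number of bytes written, or -1 on error/truncation. */
static int frame_pmi1_init(char *out, size_t outlen, int version, int subversion)
{
    int n = snprintf(out, outlen, "cmd=init pmi_version=%d pmi_subversion=%d\n",
                     version, subversion);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/* PMI 2.0 style (assumed layout): a 6-character decimal length field
 * followed by the payload. Returns bytes written, or -1 on error. */
static int frame_pmi2_cmd(char *out, size_t outlen, const char *payload)
{
    size_t plen = strlen(payload);
    if (plen > 999999)          /* would not fit in six decimal digits */
        return -1;
    int n = snprintf(out, outlen, "%6u%s", (unsigned)plen, payload);
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

With an illustrative, shortened payload, frame_pmi2_cmd(buf, sizeof buf,
"cmd=fullinit;") yields "    13cmd=fullinit;", while frame_pmi1_init
yields a line with no length field at all. A reader expecting one format
cannot parse the other.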
>>>
>>> The mpi/pmi2 plugin does not implement all functions of the PMI2
>>> protocol (http://wiki.mpich.org/mpich/index.php/PMI_v2_API) yet. I just
>>> tested it with MPICH programs. It's not clearly specified whether a
>>> program may call PMI2_Init() twice. I think this could be handled more
>>> easily on the client side: just return the old values in the second
>>> call.
>>>
>>
>> I agree: check PMI2_initialized and return immediately if set to
>> something other than PMI2_UNINITIALIZED.
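
The client-side guard discussed here could look roughly like the sketch
below. It is not the code from the SLURM or MPICH sources: only the names
PMI2_initialized and PMI2_UNINITIALIZED come from this thread, while
sketch_PMI2_Init, do_handshake, handshake_count and the cached_* variables
are hypothetical stand-ins, and the values returned by the placeholder
handshake are made up.

```c
/* Hypothetical state; the real client library defines its own. */
enum pmi2_state { PMI2_UNINITIALIZED = 0, PMI2_INITIALIZED };

static enum pmi2_state PMI2_initialized = PMI2_UNINITIALIZED;
static int cached_spawned, cached_size, cached_rank, cached_appnum;
static int handshake_count;   /* how many times we talked to the server */

/* Placeholder for the real init + fullinit exchange with the PMI server. */
static int do_handshake(int *spawned, int *size, int *rank, int *appnum)
{
    handshake_count++;
    *spawned = 0;
    *size = 4;      /* made-up values for illustration */
    *rank = 1;
    *appnum = 0;
    return 0;
}

int sketch_PMI2_Init(int *spawned, int *size, int *rank, int *appnum)
{
    if (PMI2_initialized == PMI2_UNINITIALIZED) {
        /* First call: perform the handshake and remember the results. */
        int rc = do_handshake(&cached_spawned, &cached_size,
                              &cached_rank, &cached_appnum);
        if (rc != 0)
            return rc;
        PMI2_initialized = PMI2_INITIALIZED;
    }
    /* First and repeated calls alike: hand back the cached values.
     * A repeated call sends nothing to the server. */
    *spawned = cached_spawned;
    *size = cached_size;
    *rank = cached_rank;
    *appnum = cached_appnum;
    return 0;
}
```

Calling sketch_PMI2_Init twice performs one handshake and returns the same
values both times, so the server never sees a second PMI 1.1 init line.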
>>
>>
>>>
>>>
>>> On Tue, 2014-05-20 at 20:52 -0700, Artem Polyakov wrote:
>>> > 2. "Double init hang" problem: program pmi_double_init.c (attached) is
>>> > launched with script pmi_double_init.job (attached) and it just hangs.
>>> > Here is what GDB shows on one of the processes:
>>> >
>>> > (gdb) bt
>>> > #0  0x0000003b722db730 in __read_nocancel () from /lib64/libc.so.6
>>> > #1  0x00007f201cbd5ee4 in PMI2U_readline (fd=12,
>>> >     buf=0x7fffa4f80ba0 "cmd=init pmi_version=2 pmi_subversion=0\n",
>>> >     maxlen=1024) at pmi2_util.c:72
>>> > #2  0x00007f201cbcf74c in PMI2_Init (spawned=0x7fffa4f81404,
>>> >     size=0x7fffa4f81400, rank=0x7fffa4f813fc, appnum=0x7fffa4f813f8)
>>> >     at pmi2_api.c:221
>>> > #3  0x0000000000400626 in main () at pmi_double_init.c:17
>>> >
>>> > (gdb) frame 3
>>> > #3  0x0000000000400626 in main () at pmi_double_init.c:17
>>> > 17          rc = PMI2_Init(&spawned, &size, &rank, &appnum);
>>> >
>>> > (gdb) frame 1
>>> > #1  0x00007f201cbd5ee4 in PMI2U_readline (fd=12,
>>> >     buf=0x7fffa4f80ba0 "cmd=init pmi_version=2 pmi_subversion=0\n",
>>> >     maxlen=1024) at pmi2_util.c:72
>>> > 72                      n = read(fd, readbuf, sizeof(readbuf) - 1);
>>> > (gdb) l
>>> > 67          p = buf;
>>> > 68          curlen = 1; /* Make room for the null */
>>> > 69          while (curlen < maxlen) {
>>> > 70              if (nextChar == lastChar) {
>>> > 71                  do {
>>> > 72                      n = read(fd, readbuf, sizeof(readbuf) - 1);
>>> > 73                  } while (n == -1 && errno == EINTR);
>>> > 74                  if (n == 0) {
>>> > 75                      /* EOF */
>>> > 76                      break;
>>> >
>>> > (gdb) frame 2
>>> > #2  0x00007f201cbcf74c in PMI2_Init (spawned=0x7fffa4f81404,
>>> >     size=0x7fffa4f81400, rank=0x7fffa4f813fc, appnum=0x7fffa4f813f8)
>>> >     at pmi2_api.c:221
>>> > 221         ret = PMI2U_readline(PMI2_fd, buf, PMI2_MAXLINE);
>>> > (gdb) l
>>> > 216         PMI2U_ERR_CHKANDJUMP(ret < 0, pmi2_errno, PMI2_ERR_OTHER, "**intern %s", "failed to generate init line");
>>> > 217
>>> > 218         ret = PMI2U_writeline(PMI2_fd, buf);
>>> > 219         PMI2U_ERR_CHKANDJUMP(ret < 0, pmi2_errno, PMI2_ERR_OTHER, "**pmi2_init_send");
>>> > 220
>>> > 221         ret = PMI2U_readline(PMI2_fd, buf, PMI2_MAXLINE);
>>> > 222         PMI2U_ERR_CHKANDJUMP(ret < 0, pmi2_errno, PMI2_ERR_OTHER, "**pmi2_initack %s", strerror(pmi2_errno));
>>> > 223
>>> > 224         PMI2U_parse_keyvals(buf);
>>> > 225         cmdline[0] = 0;
>>> >
>>> > So the apps hang waiting for a response from the PMI server while doing
>>> > the non-full "init".
>>> >
>>> > And in the error output I see the following messages:
>>> > ------------ 8< ------------------------------------------------
>>> > slurmd[cn01]: mpi/pmi2: request not begin with 'cmd='
>>> > slurmd[cn01]: mpi/pmi2: full request is:
>>> > slurmd[cn01]: mpi/pmi2: invalid client request
>>> > ------------ 8< ------------------------------------------------
>>> >
>>> >
>>> >
>>> > If I attach before the second PMI2_Init call I can see that buf is not
>>> > empty:
>>> > ... [ GDB attach right before PMI2_Init] ....
>>> > (gdb) n
>>> > 21          rc = PMI2_Init(&spawned, &size, &rank, &appnum);
>>> > ------------------------ 8< -------------------------------------
>>> > (gdb)
>>> > 203         if (PMI2_fd == -1) {
>>> > (gdb) p PMI2_fd
>>> > $1 = 12
>>> > (gdb) n
>>> > 215         ret = snprintf(buf, PMI2_MAXLINE, "cmd=init pmi_version=%d pmi_subversion=%d\n", PMI_VERSION, PMI_SUBVERSION);
>>> > (gdb)
>>> > 216         PMI2U_ERR_CHKANDJUMP(ret < 0, pmi2_errno, PMI2_ERR_OTHER, "**intern %s", "failed to generate init line");
>>> > (gdb) p buf
>>> > $2 = "cmd=init pmi_version=2 pmi_subversion=0\n\000mi_subversion\000 ... "...
>>> >
>>> > According to _handle_task_request, SLURM uses the following logic:
>>> >
>>> > _handle_task_request(int fd, int lrank)
>>> >     if (initialized[lrank] == 0) {
>>> >         rc = _handle_pmi1_init(fd, lrank);
>>> >         initialized[lrank] = 1;
>>> >     } else if (is_pmi11()) {
>>> >         rc = handle_pmi1_cmd(fd, lrank);
>>> >     } else if (is_pmi20()) {
>>> >         rc = handle_pmi2_cmd(fd, lrank);
>>> >     }
>>> >
>>> > So once we have called PMI2_Init the first time, the next (duplicate)
>>> > request is routed to handle_pmi2_cmd, since this is what we set up in
>>> > the first call. Finally, handle_pmi2_cmd uses safe_read (!!) in two
>>> > steps:
>>> >
>>> >     safe_read(fd, len_buf, 6);
>>> >     len_buf[6] = '\0';
>>> >     len = atoi(len_buf);
>>> >     buf = xmalloc(len + 1);
>>> >     safe_read(fd, buf, len);
>>> >     buf[len] = '\0';
>>> >
>>> > and having
>>> > "cmd=init pmi_version=2 pmi_subversion=0\n\000mi_subversion\000"
>>> > we cut the first 6 characters from it and get:
>>> >
>>> >     len_buf = "cmd=in\000"
>>> >     fd remains: "it pmi_version=2 pmi_subversion=0\n\000mi_subversion\000"
>>> >     len = atoi("cmd=in\000") = 0;
>>> >
>>> > We then read a 0-length buffer and return (as I can see in stderr).
>>> > This is repeated until we exhaust the buffer. However, it doesn't
>>> > explain why we hang, but it is probably a good start for further
>>> > debugging.
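
The misparse described above can be reproduced without a server. The
sketch below applies the same two-step read (six bytes of length, then the
payload) to an in-memory byte string. read_pmi2_frame is a hypothetical
helper, and memcpy over a buffer stands in for safe_read on a socket, so
this illustrates only the parsing, not the blocking behaviour or the hang.

```c
#include <stdlib.h>
#include <string.h>

/* Consume one length-prefixed frame from stream[*pos .. total).
 * Returns the parsed payload length and advances *pos past the header
 * and payload, or returns -1 if not enough bytes remain. */
static int read_pmi2_frame(const char *stream, size_t total, size_t *pos)
{
    char len_buf[7];

    if (total - *pos < 6)
        return -1;                       /* a real read would wait here */
    memcpy(len_buf, stream + *pos, 6);   /* step 1: six-byte length field */
    len_buf[6] = '\0';
    *pos += 6;

    int len = atoi(len_buf);             /* non-numeric text parses as 0 */
    if (len < 0 || (size_t)len > total - *pos)
        return -1;
    *pos += (size_t)len;                 /* step 2: skip the payload */
    return len;
}
```

Fed the 40-byte line "cmd=init pmi_version=2 pmi_subversion=0\n", each
call takes six bytes ("cmd=in", "it pmi", "_versi", ...), parses them as
length 0 and returns an empty request. This matches the repeated "invalid
client request" messages quoted earlier.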
>>> >
>>> > I think an additional check in PMI2_Init for the "already-initialized"
>>> > case will solve the problem.
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> С Уважением, Поляков Артем Юрьевич
>> Best regards, Artem Y. Polyakov
>>
>
>


-- 
С Уважением, Поляков Артем Юрьевич
Best regards, Artem Y. Polyakov
