Hello Ralph

Thanks for your input. The routine that does the send is this:

static int btl_lf_modex_send(lfgroup lfgroup)
{
    char *grp_name = lf_get_group_name(lfgroup, NULL, 0);
    btl_lf_modex_t lf_modex;
    int rc;

    /* Zero the struct first so compiler padding and unused trailing
     * bytes are well-defined on the wire, and leave room for the NUL
     * terminator (strncpy does not guarantee one). */
    memset(&lf_modex, 0, sizeof(lf_modex));
    strncpy(lf_modex.grp_name, grp_name, GRP_NAME_MAX_LEN - 1);
    OPAL_MODEX_SEND(rc, OPAL_PMIX_GLOBAL,
                    &mca_btl_lf_component.super.btl_version,
                    (char *)&lf_modex, sizeof(lf_modex));
    return rc;
}

This routine is called from the component init routine
(mca_btl_lf_component_init()). I have verified that the values in the modex
(lf_modex) are correct.

The receive happens in proc_create, and I call it like this:
OPAL_MODEX_RECV(rc, &mca_btl_lf_component.super.btl_version,
                &opal_proc->proc_name,
                (uint8_t **)&module_proc->proc_modex, &size);

Here, proc_modex comes back filled with junk. If I pass a malloc()'ed
buffer in place of module_proc->proc_modex, I still get bad data.


Thanks again for your help.

Durga

We learn from history that we never learn from history.

On Sat, May 21, 2016 at 8:38 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Please provide the exact code used for both send/recv - you likely have an
> error in the syntax
>
>
> On May 20, 2016, at 9:36 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>
> Hello all
>
> I have a naive question:
>
> My 'cluster' consists of two nodes, connected back to back with a
> proprietary link as well as GbE (over a switch).
> I am calling OPAL_MODEX_SEND() and the modex consists of just this:
>
> struct modex
> {
>     char name[20];
>     unsigned mtu;
> };
>
> The mtu field is not currently being used. I bzero() the struct and have
> verified that the value being written to the 'name' field (this is similar
> to a PKEY for infiniband; the driver will translate this to a unique
> integer) is correct at the sending end.
>
> When I do an OPAL_MODEX_RECV(), the value is completely corrupted. However,
> the size of the modex message is still correct (24 bytes).
> What could I be doing wrong? (Both nodes are little endian x86_64 machines)
>
> Thanks in advance
> Durga
>
> We learn from history that we never learn from history.
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/05/19012.php
>
>
>
>
