Rui Machado wrote:
2008/9/17 Rui Machado <[EMAIL PROTECTED]>:
From: Rui Machado <[EMAIL PROTECTED]>
Date: 2008/9/17
Subject: Re: [ofa-general] atomic operations on ppc64
To: Dotan Barak <[EMAIL PROTECTED]>


2008/9/17 Dotan Barak <[EMAIL PROTECTED]>:
On Wed, Sep 17, 2008 at 5:54 PM, Rui Machado <[EMAIL PROTECTED]> wrote:
2008/9/17 Dotan Barak <[EMAIL PROTECTED]>:
On Wed, Sep 17, 2008 at 5:44 PM, Rui Machado <[EMAIL PROTECTED]> wrote:
2008/9/17 Dotan Barak <[EMAIL PROTECTED]>:
On Wed, Sep 17, 2008 at 5:28 PM, Rui Machado <[EMAIL PROTECTED]> wrote:
Hey Dotan,

2008/9/17 Dotan Barak <[EMAIL PROTECTED]>:
On Wed, Sep 17, 2008 at 5:12 PM, Rui Machado <[EMAIL PROTECTED]> wrote:
Hi list,

Has anyone experienced problems using IB atomic operations
(fetch and add) on a ppc64 platform?
I tried a small example (using fetch and add) on x86 and ppc64: it
worked fine on x86 but not on ppc64.
Do you handle the ntoh/hton or do you let the driver/HCA deal with it by itself?
Nope, I don't use those. I guess I'm letting the driver/HCA deal with it, then...
Do you see endianness issues or completely corrupted data?

Just to make it clear (to me :) ): I'm talking about ppc64<-->ppc64
communication.
Should I still be concerned with converting data because of endianness?
What happens is that I request a fetch and add and it doesn't happen.
The value on the server doesn't get modified.
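
On the ntoh/hton question above, here is a minimal, portable sketch of reading and writing a 64-bit value (the width used by fetch and add) in big-endian, i.e. network, byte order. The helper names load_be64, store_be64 and load_host64 are hypothetical and not part of libibverbs. Whether the buffer needs converting at all depends on the HCA and driver, which this thread does not establish. On a big-endian host such as ppc64 the two readings agree.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes stored in big-endian (network) order. */
static uint64_t load_be64(const void *p)
{
    const unsigned char *b = p;
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Hypothetical helper: write v as 8 bytes in big-endian (network) order. */
static void store_be64(void *p, uint64_t v)
{
    unsigned char *b = p;
    for (int i = 7; i >= 0; i--) {
        b[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Read the same 8 bytes in host order, for comparison with load_be64(). */
static uint64_t load_host64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Printing both load_host64(buffer) and load_be64(buffer) after a completion would show whether a value arrived byte-swapped or whether the buffer is genuinely unchanged.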
This is a weird behaviour indeed ..

Can you post the code in your program that fills the SR?

Dotan

Not sure what you mean by SR.
Here is the function inc(), which I call to increment by 1 on the
remote machine. The remote machine has its buffer full of zeroes.
That's what the client gets every time, although I increment 3 times
in a row (with a sleep in between).

Is this enough?
Thanks for the help

void inc()
{

       struct ibv_qp_attr check_attr;
       struct ibv_qp_init_attr check_init_attr;

       void *ev_ctx;

       struct ibv_send_wr *bad_wr;
       struct ibv_wc wc;
       struct ibv_sge slist;
       struct ibv_send_wr swr3;


       slist.addr = (uintptr_t)buffer;
       slist.length = 8;
       slist.lkey =mr->lkey;

       swr3.wr.atomic.remote_addr = remote_node->mi.bufAddr;
       swr3.wr.atomic.rkey = remote_node->mi.buf_rkey;
       swr3.wr.atomic.compare_add = 1;

       swr3.wr_id      = 1;
       swr3.sg_list    = &slist;
       swr3.num_sge    = 1;
       swr3.opcode     = IBV_WR_ATOMIC_FETCH_AND_ADD;
       swr3.send_flags = IBV_SEND_SIGNALED;
       swr3.next       = NULL;


       if(ibv_post_send(qp,&swr3,&bad_wr)){
               printf("Couldn't post send...\n");
               return; /* inc() is void, so it cannot return a value */
       }


       int ne=0;
       do{
               ne = ibv_poll_cq(cq,1,&wc);
       }while(ne==0);

       if((ne < 0) || (wc.status != IBV_WC_SUCCESS)){

               //check qp status
               if(!ibv_query_qp(qp,&check_attr,IBV_QP_STATE,&check_init_attr))
                       printf("The qp state is: %d\n ",check_attr.qp_state);

       }
}

The code looks good and it should work...
(I would have memset every structure before using it ..)
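
The memset suggestion can be sketched as follows. To keep the example compilable without libibverbs, struct demo_wr is a hypothetical stand-in for struct ibv_send_wr and does not reproduce its real layout; only the zero-then-fill pattern is the point. Without the memset, any field the function does not assign explicitly holds an indeterminate stack value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct ibv_send_wr; not the real layout. */
struct demo_wr {
    uint64_t wr_id;
    struct demo_wr *next;
    int opcode;
    unsigned int send_flags;
    uint64_t remote_addr;
    uint64_t compare_add;
    uint64_t swap;      /* never assigned below: zero only thanks to memset */
    uint32_t rkey;
};

/* Zero the whole request first, then fill in only the fields that matter. */
static void prepare_fetch_add(struct demo_wr *wr, uint64_t remote_addr,
                              uint32_t rkey, uint64_t add)
{
    memset(wr, 0, sizeof *wr);
    wr->wr_id       = 1;
    wr->remote_addr = remote_addr;
    wr->rkey        = rkey;
    wr->compare_add = add;
}
```

In inc() the same pattern would mean calling memset on slist, swr3 and wc before assigning their fields.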


Did you check the memory on the sender side or on the receiver side?

As I mentioned it does work on x86.

Actually on both:

server:
Initial counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0
counter at buffer is 0


client:
initial IB atomic counter 0
IB atomic counter 0
IB atomic counter 0
IB atomic counter 0

What could this be related to? Driver, HW?


Does anyone have some insight on this?
How could I debug this further?
Bugs can be anywhere: application / Driver / HW ...

Can you try using the server on x86 and the client on PPC64, and then the server on PPC64 and the client on x86?

Which OFED version do you use?
Can you send the output of ibv_devinfo?

Dotan
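
One application-side check that is cheap to add while debugging: InfiniBand atomic operations act on a 64-bit quantity, and the remote address has to be 8-byte aligned. A minimal sketch, assuming a hypothetical helper atomic_target_ok that is not part of libibverbs, which could be applied to remote_node->mi.bufAddr and slist.length before posting:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if addr/len are usable for a 64-bit atomic,
 * i.e. the address is a multiple of 8 and the buffer is exactly 8 bytes. */
static int atomic_target_ok(uint64_t addr, size_t len)
{
    return (addr % 8 == 0) && (len == 8);
}
```

This only rules out one application-level cause; it says nothing about the driver or the HW.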
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

