On Thu, 2007-01-18 at 18:14 +0200, Michael S. Tsirkin wrote:
> > Quoting John W. Marland <[EMAIL PROTECTED]>:
> > Subject: Re: [PATCH] IB/core - ib_umad can cause address alignment fault on 
> > ia64
> > 
> > Michael S. Tsirkin wrote:
> > 
> > >>Quoting Ralph Campbell <[EMAIL PROTECTED]>:
> > >>Subject: [PATCH] IB/core - ib_umad can cause address alignment fault on 
> > >>ia64
> > >>
> > >>IB/core - ib_umad can cause address alignment fault
> > >>
> > >>In user_mad.c, the definition for struct ib_umad_packet includes
> > >>struct ib_user_mad at an odd 32-bit offset.  When ib_umad_write()
> > >>tries to assign rmpp_mad->mad_hdr.tid, there is an alignment fault on
> > >>architectures which have strict alignment for load/stores.
> > >>This patch fixes the problem by changing the offset on which
> > >>struct ib_user_mad is defined within struct ib_umad_packet.
> > >>
> > >>Thanks go to John W. Marland <[EMAIL PROTECTED]> for finding this.
> > >>
> > >>Signed-off-by: Ralph Campbell <[EMAIL PROTECTED]>
> > >>
> > >>diff -r b1128b48dc99 drivers/infiniband/core/user_mad.c
> > >>--- a/drivers/infiniband/core/user_mad.c  Fri Jan 12 20:00:03 2007 +0000
> > >>+++ b/drivers/infiniband/core/user_mad.c  Wed Jan 17 14:09:37 2007 -0800
> > >>@@ -125,7 +125,7 @@ struct ib_umad_packet {
> > >>  struct ib_mad_send_buf *msg;
> > >>  struct ib_mad_recv_wc  *recv_wc;
> > >>  struct list_head   list;
> > >>- int                length;
> > >>+ long               length;
> > >>  struct ib_user_mad mad;
> > >> };
> > >>    
> > >>
> > >
> > >This does not make sense to me - do we have to replace all int fields with 
> > >long
> > >now? Looks like a compiler or makefile bug in your setup - struct fields 
> > >should
> > >be naturally aligned.
> > >
> > >  
> > >
> >     We should probably have given a more complete explanation. The 
> > unaligned access hits in two places, that I've tracked down so far.
> >     The one where it's easiest to see what's happening is in ib_umad_write.
> > ______________________________________________________________________________________
> >        if (!ib_response_mad(packet->msg->mad)) {
> >                 tid = &((struct ib_mad_hdr *) packet->msg->mad)->tid;
> >                 *tid = cpu_to_be64(((u64) agent->hi_tid) << 32 |
> >                                    (be64_to_cpup(tid) & 0xffffffff));
> > 
> > ---> this line causes the access problem               
> > rmpp_mad->mad_hdr.tid = *tid;
> >         }
> > ________________________________________________________________________________________
> >     The rmpp_mad variable is an ib_rmpp_mad pointer that is initialized 
> >     from the packet->mad.data early in the function.
> >     Because the ib_umad_packet structure has a as it's last element an 
> >     ib_user_mad structure, not a pointer to one, but the structure.
> >     This means that the Data[0] declaration at the end of the ib_umad 
> >     structure is forced onto a 4 byte boundary.
> 
> So the issue is that we are casting char *data which has no alignment 
> guarantees
> to 64 bit number. We really must find a way to force 64 bit alignment for
> struct ib_user_mad all over. Would not something like the following simple 
> trick work?
> 
> struct ib_user_mad_hdr {
>       .............
> } __attribute__((aligned (8)));

This would work but the 8 byte alignment isn't needed everywhere.
The int -> long change is needed because struct ib_umad_packet
includes struct ib_user_mad (which has 4 byte alignment) but
is then cast to struct ib_mad_hdr which has 8 byte alignment.
It is not the fault of the compiler.



_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to