[ewg] change in diags in OFED 1.3? (2 ports; only 1 supported currently)

2007-12-05 Thread Scott Weitzenkamp (sweitzen)
This seems new in OFED 1.3:
 
[EMAIL PROTECTED] ~]# ibcheckerrors
perfquery: iberror: failed: smp query nodeinfo: 2 ports; only 1
supported currently
Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port all: FAILED
perfquery: iberror: failed: perfquery
Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port 1: FAILED
# Checked Ca: nodeguid 0x0005ad200860 with failure
...
 
I see these errors with all two-port HCAs.
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems


___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

[ewg] Application layer support for RMPP using OFED stack.

2007-12-05 Thread Anton Bodner
Hello -

 

I'm from QLogic Corp, and I'm trying to port some older applications to
OFED, using the OFED stack and the OFED umad API (specifically, opening
/dev/infiniband/umadX and using read/write).  I am using OFED
1.2.5.

 

My old application queries the SA and also implements RMPP in
the application layer.  In porting this application to OFED, I realized
that the OFED stack can do RMPP on my behalf, but
that has an adverse effect on my application, since the InfiniBand
interface in my app transfers 256-byte packets ONLY.

 

Hence - I want my app to do the RMPP, not OFED.

 

So I disabled RMPP support in the OFED stack by registering with an RMPP
version of 0.  This seems effective in stopping the OFED stack from doing
RMPP for me.  However, when I try to send the RMPP ACK, it fails
at the write().

 

I investigated the OFED stack code, and it's failing in
ib_create_send_mad (drivers/infiniband/core/mad.c).  Investigation shows
that if one registered with RMPP version 0, then the send path never allows
RMPP to be active (code snippet below):

if ((!mad_agent->rmpp_version &&
     (rmpp_active || message_size > sizeof(struct ib_mad))) ||
    (!rmpp_active && message_size > sizeof(struct ib_mad)))
{
        return ERR_PTR(-EINVAL);
}
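
To make the effect of that check concrete, here is a minimal user-space sketch of the same predicate. It is not kernel code: send_mad_rejected and MAD_SIZE are names made up for illustration, and 256 is assumed for sizeof(struct ib_mad), matching the 256-byte packets mentioned above.

```c
#include <stddef.h>

/* Assumption: sizeof(struct ib_mad) is 256 bytes (one MAD packet). */
#define MAD_SIZE 256

/*
 * Hypothetical helper mirroring the condition quoted from
 * ib_create_send_mad(): returns 1 when the send would be refused
 * with -EINVAL, 0 otherwise.
 */
int send_mad_rejected(int rmpp_version, int rmpp_active, size_t message_size)
{
    return (!rmpp_version &&
            (rmpp_active || message_size > MAD_SIZE)) ||
           (!rmpp_active && message_size > MAD_SIZE);
}
```

Under these assumptions, send_mad_rejected(0, 1, 256) is 1: an agent registered with RMPP version 0 is refused as soon as the send is marked RMPP-active, even for a single 256-byte packet. The same packet without the active flag, send_mad_rejected(0, 0, 256), passes.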

 

My questions are these:

1. Has anyone tried application-layer RMPP?  If so, how
did you get past this logic (possible OFED bug?)

 

2. Is the intent of the OFED implementation to disable RMPP support COMPLETELY
when registering with an RMPP version of 0?  If so, then how does one
implement user-level RMPP using the OFED stack?  Perhaps no one is
doing this at all, since the stack already does it for you?

 

As I said, the only reason I'm doing this is to port an old, existing
app to OFED, and it would be considerably difficult to rip out our
app-level RMPP support.  But perhaps that is what I must
do...

 

Thanks

 

Anton Bodner

 

 Anton Bodner Jr.

QLogic Corporation

(610)233-4856

[EMAIL PROTECTED]

http://www.qlogic.com

 


[ewg] RE: OFED-1.3-beta sdp issue

2007-12-05 Thread Jim Mott
Hi,
  This looks very much like bug 793 
(https://bugs.openfabrics.org/show_bug.cgi?id=793).  There was a change in the 
sk_buff definition in 2.6.22+ kernels.  

  Could you verify that the fix posted in the bug is in your sdp_bcopy.c (or
just send me your drivers/infiniband/ulp/sdp/sdp_bcopy.c file)?  This fix got
picked up as a patch that gets applied by the build process instead of a change
to base code.  Perhaps it is not being picked up for PPC.  I'll check it out.

Thanks,
Jim

Jim Mott
Mellanox Technologies Ltd.
mail: [EMAIL PROTECTED]
Phone: 512-294-5481

From: Stefan Roscher [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 4:32 AM
To: Jim Mott
Cc: Hoang-Nam Nguyen; Christoph Raisch; ewg@lists.openfabrics.org
Subject: OFED-1.3-beta sdp issue


Hi Jim, 

during the OFED-1.3-beta2 test on ppc64 systems with SLES10-SP1 I saw the 
following issue. 

I booted Linux kernels 2.6.22 and 2.6.23 on SLES10-SP1, and netpipe sdp fails
with the following oops:


REGS: c8ccf930 TRAP: 0700   Not tainted  (2.6.23-ppc64)                 
MSR: 80029032 EE,ME,IR,DR  CR: 2444  XER: 0005                
 
TASK = c8ccb6a0[25] 'events/6' THREAD: c8ccc000 CPU: 6          
 
GPR00: c0322b98 c8ccfbb0 c0680048 0087      
 
GPR04:    0024a7d8      
 
GPR08: 001bac9151b0 c05c8108 c001daa87b58 c05c8110      
 
GPR12:  c059a300        
 
GPR16:    4210      
 
GPR20: c054de98 c001a0bc4b00 0001       
 
GPR24:  c001beb7d000 beb7d014 0006      
 
GPR28: c001aae86100 c001d433c080 c062ef28 c001db841880      
 
NIP [c0322b9c] .skb_over_panic+0x50/0x58                                
 
LR [c0322b98] .skb_over_panic+0x4c/0x58                                 
Call Trace:                                                                     
[c8ccfbb0] [c0322b98] .skb_over_panic+0x4c/0x58 (unreliable)    
 
[c8ccfc40] [d0559df0] .sdp_poll_cq+0x380/0xa68 [ib_sdp]         
[c8ccfd10] [d055a8fc] .sdp_work+0xe8/0x10c [ib_sdp]             
[c8ccfda0] [c0076fac] .run_workqueue+0x118/0x208                
 
[c8ccfe40] [c0077f70] .worker_thread+0xcc/0xf0                  
 
[c8ccff00] [c007caa4] .kthread+0x78/0xc4                        
 
[c8ccff90] [c0026be4] .kernel_thread+0x4c/0x68                  
 
Instruction dump:                                                               
80a30068 e8e300b8 e90300c0 812300ac 814300b0 2fa0 409e0008 e81e8028         
e87e8038 f8010070 4bd3e4d1 6000 0fe0 4800 7c0802a6 faa1ffa8   

This issue occurs only on the two kernels mentioned above. 

My question is: is this the bug you described here:
https://bugs.openfabrics.org/show_bug.cgi?id=807 

or should I open a new one?     
                                                                                
Kind Regards

Stefan Roscher 


RE: [ewg] change in diags in OFED 1.3? (2 ports; only 1 supported currently)

2007-12-05 Thread Scott Weitzenkamp (sweitzen)

  Did not see this problem with OFED 1.2 or 1.2.5.
 
 Are you 100% sure ? Same HCAs and firmware ?

100% sure.


RE: [ewg] change in diags in OFED 1.3? (2 ports; only 1 supported currently)

2007-12-05 Thread Hal Rosenstock
On Wed, 2007-12-05 at 08:18 -0800, Scott Weitzenkamp (sweitzen) wrote:
 I'll open a bug.
 
  All two port HCAs ? Are all of them the same 2 port PCIe model or are
  there others ?
 
 All type of 2 port PCIe HCAs (LionCub, LionMini, and Eagle).
 
  Can you provide:
  
  smpquery nodedesc
  smpquery nodeinfo
  and most importantly
  perfquery -de
  
  for a failed node/port ?
 
 
 [EMAIL PROTECTED] ~]# perfquery -de
 ibwarn: [18050] smp_query: attr 0x11 mod 0x0 route DR path 0
 ibwarn: [18050] mad_rpc: data offs 64 sz 64
 mad data
 0101 0102 0005 ad00 0100 d050 0005 ad00
 0020 0848 0005 ad00 0020 0849 0040 6282
  00a0 0100 05ad    
        
 ibwarn: [18050] smp_query: attr 0x15 mod 0x0 route DR path 0
 ibwarn: [18050] mad_rpc: data offs 64 sz 64
 mad data
     fe80   
 0005 0002 0251 0a68  000f 0103 0302
 1452 0011 4040 0008 0804 f240  
  2008 10f0     
 ibwarn: [18050] pma_query: lid 5 port 1
 ibwarn: [18050] mad_rpc: data offs 64 sz 192
 mad data
 0101   0014    
        
        
        
        
        
        
        
        
        
        
        
 ibwarn: [18050] main: PerfMgt ClassPortInfo 0x0 extended counters not
 indicated

That's what I was afraid of and asked about on the list a while ago
(about whether there was such a use case and there is) :-(

Not sure exactly why this wasn't a problem prior to this change.
When was the last time you tried this on a subnet which included one or
more of these HCAs ?

So...

How important is fixing this (for OFED 1.3) ?

-- Hal

 ibwarn: [18050] pma_query: lid 5 port 1
 ibwarn: [18050] mad_rpc: MAD completed with error status 0xc
 perfquery: iberror: [pid 18050] main: failed: perfextquery


RE: [ewg] change in diags in OFED 1.3? (2 ports; only 1 supported currently)

2007-12-05 Thread Scott Weitzenkamp (sweitzen)
I'll open a bug.

 All two port HCAs ? Are all of them the same 2 port PCIe model or are
 there others ?

All type of 2 port PCIe HCAs (LionCub, LionMini, and Eagle).

 Can you provide:
 
 smpquery nodedesc
 smpquery nodeinfo
 and most importantly
 perfquery -de
 
 for a failed node/port ?


[EMAIL PROTECTED] ~]# perfquery -de
ibwarn: [18050] smp_query: attr 0x11 mod 0x0 route DR path 0
ibwarn: [18050] mad_rpc: data offs 64 sz 64
mad data
0101 0102 0005 ad00 0100 d050 0005 ad00
0020 0848 0005 ad00 0020 0849 0040 6282
 00a0 0100 05ad    
       
ibwarn: [18050] smp_query: attr 0x15 mod 0x0 route DR path 0
ibwarn: [18050] mad_rpc: data offs 64 sz 64
mad data
    fe80   
0005 0002 0251 0a68  000f 0103 0302
1452 0011 4040 0008 0804 f240  
 2008 10f0     
ibwarn: [18050] pma_query: lid 5 port 1
ibwarn: [18050] mad_rpc: data offs 64 sz 192
mad data
0101   0014    
       
       
       
       
       
       
       
       
       
       
       
ibwarn: [18050] main: PerfMgt ClassPortInfo 0x0 extended counters not
indicated

ibwarn: [18050] pma_query: lid 5 port 1
ibwarn: [18050] mad_rpc: MAD completed with error status 0xc
perfquery: iberror: [pid 18050] main: failed: perfextquery


RE: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Tang, Changqing

There are some other input structure changes, such as in ibv_qp_init_attr.
If the qp_type is not IBV_QPT_XRC, the field xrc_domain is not touched, right ?

Similarly for the xrc_remote_srq_num field of struct ibv_send_wr.


--CQ Tang


 -Original Message-
 From: Jack Morgenstein [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, December 05, 2007 12:34 PM
 To: ewg@lists.openfabrics.org
 Cc: Roland Dreier; Tang, Changqing;
 [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Subject: Re: [ewg] Re: [ofa-general] OFED 1.3 Beta release is
 available

 On Wednesday 05 December 2007 07:24, Roland Dreier wrote:
 
  I think the only alternative we have to preserve backwards
  compatibility is to leave struct ibv_context_ops alone and change the
  structure to:
 
  struct ibv_context {
  struct ibv_device  *device;
  struct ibv_context_ops  ops;
  int cmd_fd;
  int async_fd;
  int num_comp_vectors;
  pthread_mutex_t mutex;
  void   *abi_compat;
  struct ibv_xrc_op  *xrc_ops;
  };
 
  with xrc_ops added at the end.  It's my fault for not making the ops
  member a pointer I guess.
 

 We don't need to have this as a pointer, really (I'd like to
 save the extra malloc and associated bookkeeping). If we have
 the ibv_xrc_op struct at the end of ibv_context, this is
 sufficient for backwards binary compatibility (libmlx4 itself
 allocates the ibv_context structure for libibverbs.  If the
 actual structure is a bit bigger, who cares -- we just need to
 preserve the current offsets of the structure fields for binary
 compatibility).

 If you want to be a bit more generic, we could do this as an
 extra_ops structure and add new ops as needed.
 (If future changes are messier than just adding a new op, we
 can then increment the API version):

 struct ibv_context_extra_ops {
         struct ibv_srq *(*create_xrc_srq)(struct ibv_pd *pd,
                 struct ibv_xrc_domain *xrc_domain,
                 struct ibv_cq *xrc_cq,
                 struct ibv_srq_init_attr *srq_init_attr);
         struct ibv_xrc_domain *(*open_xrc_domain)(struct ibv_context *context,
                 int fd, int oflag);
         int (*close_xrc_domain)(struct ibv_xrc_domain *d);
 };

  struct ibv_context {
  struct ibv_device  *device;
  struct ibv_context_ops  ops;
  int cmd_fd;
  int async_fd;
  int num_comp_vectors;
  pthread_mutex_t mutex;
  void   *abi_compat;
  struct ibv_context_extra_ops  extra_ops;
  };




[ewg] [GIT PULL ofed-1.3] - RDMA/cxgb3 - fixes and 5.0 firmware support

2007-12-05 Thread Steve Wise

Vlad, please pull cxgb3 fixes for ofed-1.3 from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.3.

The patches are all accepted upstream and were posted here:

http://www.spinics.net/lists/netdev/msg47492.html

and here:

http://www.spinics.net/lists/netdev/msg48240.html

Also, please pull version 1.1.0 of libcxgb3 from:

git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3

The library and drivers need to be included together as they are both 
needed to support the chelsio 5.0 firmware.



Thanks,

Steve.


RE: [ewg] change in diags in OFED 1.3? (2 ports; only 1 supported currently)

2007-12-05 Thread Scott Weitzenkamp (sweitzen)
 That's what I was afraid of and asked about on the list a while ago
 (about whether there was such a use case and there is) :-(
 
 Not sure exactly why this wasn't a problem prior to this change.
 When was the last time you tried this on a subnet which 
 included one or
 more of these HCAs ?

Did not see this problem with OFED 1.2 or 1.2.5.

 So...
 
 How important is fixing this (for OFED 1.3) ?

It's a regression, so I think it must be fixed.

Scott


Re: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Jack Morgenstein
On Wednesday 05 December 2007 07:24, Roland Dreier wrote:
 
 I think the only alternative we have to preserve backwards
 compatibility is to leave struct ibv_context_ops alone and change the
 structure to:
 
 struct ibv_context {
 struct ibv_device  *device;
 struct ibv_context_ops  ops;
 int cmd_fd;
 int async_fd;
 int num_comp_vectors;
 pthread_mutex_t mutex;
 void   *abi_compat;
 struct ibv_xrc_op  *xrc_ops;
 };
 
 with xrc_ops added at the end.  It's my fault for not making the ops
 member a pointer I guess.
 

We don't need to have this as a pointer, really (I'd like to save the
extra malloc and associated bookkeeping). If we have the ibv_xrc_op struct
at the end of ibv_context, this is sufficient for backwards binary
compatibility (libmlx4 itself allocates the ibv_context structure for
libibverbs.  If the actual structure is a bit bigger, who cares --
we just need to preserve the current offsets of the structure
fields for binary compatibility).
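
The offset argument above can be checked with a small sketch. The types below are simplified stand-ins, not the real libibverbs definitions (the mutex member is left out and the ops tables are just arrays of pointers); the point is only that appending a member leaves every earlier offset unchanged.

```c
#include <stddef.h>

/* Stand-in types: these are NOT the real libibverbs structures. */
struct ops_v1  { void *fn[4]; };
struct xrc_ops { void *fn[3]; };

/* Layout that an already-built application was compiled against. */
struct context_old {
    void *device;
    struct ops_v1 ops;
    int cmd_fd;
    int async_fd;
    int num_comp_vectors;
    void *abi_compat;
};

/* New layout: the extra ops are embedded after every existing member. */
struct context_new {
    void *device;
    struct ops_v1 ops;
    int cmd_fd;
    int async_fd;
    int num_comp_vectors;
    void *abi_compat;
    struct xrc_ops xrc_ops;
};

/* Returns 1 when every pre-existing member keeps its offset. */
int old_offsets_preserved(void)
{
    return offsetof(struct context_old, device) ==
               offsetof(struct context_new, device) &&
           offsetof(struct context_old, ops) ==
               offsetof(struct context_new, ops) &&
           offsetof(struct context_old, cmd_fd) ==
               offsetof(struct context_new, cmd_fd) &&
           offsetof(struct context_old, async_fd) ==
               offsetof(struct context_new, async_fd) &&
           offsetof(struct context_old, num_comp_vectors) ==
               offsetof(struct context_new, num_comp_vectors) &&
           offsetof(struct context_old, abi_compat) ==
               offsetof(struct context_new, abi_compat);
}
```

The structure grows, but since the driver allocates it, an old application that only reads the original members at their original offsets is unaffected.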

If you want to be a bit more generic, we could do this as an extra_ops
structure and add new ops as needed.
(If future changes are messier than just adding a new op, we can then
increment the API version):

struct ibv_context_extra_ops {
        struct ibv_srq *(*create_xrc_srq)(struct ibv_pd *pd,
                struct ibv_xrc_domain *xrc_domain,
                struct ibv_cq *xrc_cq,
                struct ibv_srq_init_attr *srq_init_attr);
        struct ibv_xrc_domain *(*open_xrc_domain)(struct ibv_context *context,
                int fd, int oflag);
        int (*close_xrc_domain)(struct ibv_xrc_domain *d);
};

 struct ibv_context {
 struct ibv_device  *device;
 struct ibv_context_ops  ops;
 int cmd_fd;
 int async_fd;
 int num_comp_vectors;
 pthread_mutex_t mutex;
 void   *abi_compat;
 struct ibv_context_extra_ops  extra_ops;
 };
 


Re: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Roland Dreier
   struct ibv_context {
   struct ibv_device  *device;
   struct ibv_context_ops  ops;
   int cmd_fd;
   int async_fd;
   int num_comp_vectors;
   pthread_mutex_t mutex;
   void   *abi_compat;
   struct ibv_xrc_op  *xrc_ops;
   };
   
  
  We don't need to have this as a pointer, really (I'd like to save the
  extra malloc and associated bookkeeping).

I think we could actually have libmlx4 keep one copy of xrc_ops and
set the pointer to point at its copy.  And then the test in each of
the xrc operations becomes just 'if (!context->xrc_ops) return ENOSYS;'.
But it's not a big deal really.
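
A rough sketch of the pointer variant described here, using made-up stand-in types rather than the real libibverbs or libmlx4 code (struct context, demo_close and demo_driver_init_context are all hypothetical names): the driver owns one static ops table, each context points at it, and a NULL pointer means the driver has no XRC support.

```c
#include <errno.h>
#include <stddef.h>

struct xrc_domain;  /* opaque stand-in for the real domain type */

struct xrc_ops {
    int (*close_xrc_domain)(struct xrc_domain *d);
};

struct context {
    /* ... the existing members would come first ... */
    struct xrc_ops *xrc_ops;  /* NULL when the driver lacks XRC support */
};

/* Library-side wrapper: one NULL test guards every XRC operation. */
int close_xrc_domain(struct context *ctx, struct xrc_domain *d)
{
    if (!ctx->xrc_ops)
        return ENOSYS;
    return ctx->xrc_ops->close_xrc_domain(d);
}

/* Hypothetical driver side: a single shared table, no per-context malloc. */
static int demo_close(struct xrc_domain *d)
{
    (void)d;
    return 0;
}

static struct xrc_ops demo_driver_xrc_ops = {
    .close_xrc_domain = demo_close,
};

void demo_driver_init_context(struct context *ctx)
{
    ctx->xrc_ops = &demo_driver_xrc_ops;
}
```

Because the table is static and shared, the pointer form does not need the extra allocation and bookkeeping mentioned in the quoted text.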

  If we have the ibv_xrc_op struct at the end of ibv_context, this is
  sufficient for backwards binary compatibility (libmlx4 itself
  allocates the ibv_context structure for libibverbs.  If the actual
  structure is a bit bigger, who cares -- we just need to preserve
  the current offsets of the structure fields for binary
  compatibility).

Yes, that's true.  I don't have much objection to adding a struct
ibv_xrc_ops inside the structure (rather than the pointer as I
suggested).

  If you want to be a bit more generic, we could do this as an extra_ops
  structure and add new ops as needed.

Actually I'd prefer to add xrc_ops and then if we need to extend
further with more new ops, add another structure after it.  That way
we avoid having to put any define in libibverbs to tell drivers like
libmlx4 that xrc support is present; libmlx4 et al can just use
AC_CHECK_MEMBER(struct ibv_context.xrc_ops) to test with autoconf.

 - R.


[ewg] Re: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Roland Dreier
  I think in future we will have more such changes, why don't we take
  the pain now to make ops as a pointer and mark it as verbs 1.2 ?

The problem is that undoubtedly the changes that require changing the
ABI will require something more than just additional ops, so we'll end
up needing yet another new ABI anyway.  So I don't see any real
benefit to breaking the ABI now except to make the change a little
cleaner, and that doesn't seem worth it to me.

 - R.


Re: [ewg] change in diags in OFED 1.3? (2 ports; only 1 supported currently)

2007-12-05 Thread Hal Rosenstock
On Wed, 2007-12-05 at 00:25 -0800, Scott Weitzenkamp (sweitzen) wrote:
 This seems new in OFED 1.3:
  
 [EMAIL PROTECTED] ~]# ibcheckerrors
 perfquery: iberror: failed: smp query nodeinfo: 2 ports; only 1
 supported currently

There was a thread on this starting on Oct 12 titled
ibcheckerrrors/perfquery failure
http://lists.openfabrics.org/pipermail/general/2007-October/041901.html

There was some code which went in to support PMAs which don't support
the AllPortSelect option. Patches were sent on the list.
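
As an illustration only (this is not the patch that went in), the fallback for a PMA without AllPortSelect might look like the sketch below. The helper name ports_to_query is invented, and the capability bit value 0x100 is an assumption about where AllPortSelect sits in the PerfMgt ClassPortInfo capability mask; 0xFF is the PortSelect value that means "all ports".

```c
#include <stdint.h>

#define ALL_PORTS            0xFF   /* PortSelect value for "all ports" */
#define CAP_ALL_PORT_SELECT  0x100  /* ASSUMED bit for AllPortSelect support */

/*
 * Hypothetical helper: fill ports[] with the PortSelect values to query
 * and return how many were written. If the PMA advertises AllPortSelect,
 * a single query with 0xFF is enough; otherwise fall back to querying
 * each port 1..num_ports individually.
 */
int ports_to_query(uint16_t cap_mask, int num_ports, uint8_t *ports, int max)
{
    int n = 0;

    if (cap_mask & CAP_ALL_PORT_SELECT) {
        if (max > 0)
            ports[n++] = ALL_PORTS;
        return n;
    }
    for (int p = 1; p <= num_ports && n < max; p++)
        ports[n++] = (uint8_t)p;
    return n;
}
```

For a two-port HCA whose capability mask is 0, this yields two separate queries (ports 1 and 2) instead of one all-ports query.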

 Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port all: FAILED
 perfquery: iberror: failed: perfquery
 Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port 1: FAILED
 # Checked Ca: nodeguid 0x0005ad200860 with failure
 ...
  
 I see these errors with all two-port HCAs.

All two port HCAs ? Are all of them the same 2 port PCIe model or are
there others ?

Can you provide:

smpquery nodedesc
smpquery nodeinfo
and most importantly
perfquery -de

for a failed node/port ?

-- Hal

 Scott Weitzenkamp
 SQA and Release Manager
 Server Virtualization Business Unit
 Cisco Systems
 
 
 


[ewg] Please pull latest libehca

2007-12-05 Thread Stefan Roscher
 Please pull for OFED 1.3 the following branch for libehca.
 
 git://git.openfabrics.org/~/scm/libehca.git
 
 branch: ofed_1_3
 
 Thanks Stefan,
 




Re: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Jack Morgenstein
On Wednesday 05 December 2007 02:40, Roland Dreier wrote:
 BTW, sifting through the OFED 1.3 libibverbs tree, I do see that the
 commit to add max_xrc_domains to struct ibv_device_attr did break
 things by adding the member in the middle of the structure (so that an
 app compiled against the old header will see bogus values for
 local_ca_ack_delay and phys_port_count).
 
 Actually looking at the commit again, it's worse than that... anything
 compiled against the old header that calls ibv_query_device() may get
 memory corrupted, because the new ibv_query_device() writes to a
 bigger structure than the app passes in.
 
 The perils of not reviewing properly I guess...

This commit was subsequently reversed for exactly that reason.
Unfortunately, though, when I reviewed things regarding backwards binary
compatibility at the time I reversed the above commit, I also missed the
problem of the ibv_context_ops structure.
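
For contrast with appending at the end, the breakage Roland describes can be reproduced with a toy structure. These are simplified stand-ins, not the real struct ibv_device_attr; only the member names are taken from the discussion.

```c
#include <stddef.h>

/* Layout the old application was compiled against (simplified). */
struct attr_old {
    int max_pkeys;
    unsigned char local_ca_ack_delay;
    unsigned char phys_port_count;
};

/* Same structure with a new member inserted in the middle. */
struct attr_mid {
    int max_pkeys;
    int max_xrc_domains;              /* new member */
    unsigned char local_ca_ack_delay; /* now at a different offset */
    unsigned char phys_port_count;
};

/* Returns 1 if the old application would read the wrong bytes. */
int ack_delay_offset_moved(void)
{
    return offsetof(struct attr_old, local_ca_ack_delay) !=
           offsetof(struct attr_mid, local_ca_ack_delay);
}

/* Returns 1 if the library would write past a caller's old-sized buffer. */
int struct_grew(void)
{
    return sizeof(struct attr_mid) > sizeof(struct attr_old);
}
```

Both conditions hold here: the old binary reads local_ca_ack_delay from the wrong place, and because the caller (not the library) allocates this structure, a library that fills in the larger layout overruns the caller's buffer.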

- Jack


[ewg] RE: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Tang, Changqing

Roland:
I think in future we will have more such changes, why don't we take the
pain now to make ops as a pointer and mark it as verbs 1.2 ?


--CQ


 -Original Message-
 From: Roland Dreier [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, December 04, 2007 11:25 PM
 To: Tang, Changqing
 Cc: Tziporet Koren; ewg@lists.openfabrics.org;
 [EMAIL PROTECTED]
 Subject: Re: [ofa-general] OFED 1.3 Beta release is available

   I think the problem is that sizeof struct ibv_context_ops has
   changed, so the new driver returns a big struct ibv_context, app
   compiled with older header file has a smaller struct ibv_context
   and use the old offset to find fields after ops.

 Oh crud, you're obviously right.  For some reason I kept
 missing that when I looked over the code.

 I think the only alternative we have to preserve backwards
 compatibility is to leave struct ibv_context_ops alone and
 change the structure to:

 struct ibv_context {
 struct ibv_device  *device;
 struct ibv_context_ops  ops;
 int cmd_fd;
 int async_fd;
 int num_comp_vectors;
 pthread_mutex_t mutex;
 void   *abi_compat;
 struct ibv_xrc_op  *xrc_ops;
 };

 with xrc_ops added at the end.  It's my fault for not making
 the ops member a pointer I guess.

 Tziporet/Jack/whoever -- please fix up the libibverbs you
 ship for OFED 1.3 to resolve this.

 We can clean this up for libibverbs 1.2 when the ABI can
 change, if/when we have something worth breaking the ABI for.

  - R.



[ewg] OFED-1.3-beta sdp issue

2007-12-05 Thread Stefan Roscher
Hi Jim,

during the OFED-1.3-beta2 test on ppc64 systems with SLES10-SP1 I saw the 
following issue.

I booted Linux kernels 2.6.22 and 2.6.23 on SLES10-SP1, and netpipe sdp
fails with the following oops:


REGS: c8ccf930 TRAP: 0700   Not tainted  (2.6.23-ppc64)  
MSR: 80029032 EE,ME,IR,DR  CR: 2444  XER: 0005  
TASK = c8ccb6a0[25] 'events/6' THREAD: c8ccc000 CPU: 6   
GPR00: c0322b98 c8ccfbb0 c0680048 0087 
 
GPR04:    0024a7d8 
 
GPR08: 001bac9151b0 c05c8108 c001daa87b58 c05c8110 
 
GPR12:  c059a300   
 
GPR16:    4210 
 
GPR20: c054de98 c001a0bc4b00 0001  
 
GPR24:  c001beb7d000 beb7d014 0006 
 
GPR28: c001aae86100 c001d433c080 c062ef28 c001db841880 
 
NIP [c0322b9c] .skb_over_panic+0x50/0x58  
LR [c0322b98] .skb_over_panic+0x4c/0x58  
Call Trace:  
[c8ccfbb0] [c0322b98] .skb_over_panic+0x4c/0x58 (unreliable)
[c8ccfc40] [d0559df0] .sdp_poll_cq+0x380/0xa68 [ib_sdp]   
[c8ccfd10] [d055a8fc] .sdp_work+0xe8/0x10c [ib_sdp]  
[c8ccfda0] [c0076fac] .run_workqueue+0x118/0x208  
[c8ccfe40] [c0077f70] .worker_thread+0xcc/0xf0  
[c8ccff00] [c007caa4] .kthread+0x78/0xc4  
[c8ccff90] [c0026be4] .kernel_thread+0x4c/0x68  
Instruction dump:  
80a30068 e8e300b8 e90300c0 812300ac 814300b0 2fa0 409e0008 e81e8028   
e87e8038 f8010070 4bd3e4d1 6000 0fe0 4800 7c0802a6 faa1ffa8 

This issue occurs only on the two kernels mentioned above.

My question is: is this the bug you described here:
https://bugs.openfabrics.org/show_bug.cgi?id=807

or should I open a new one? 
  
Kind Regards

Stefan Roscher

Re: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available

2007-12-05 Thread Jack Morgenstein
On Wednesday 05 December 2007 21:59, Tang, Changqing wrote:
 There are some other input structure changes such as ibv_qp_init_attr, if the 
 qp_type is not IBV_QPT_XRC,
 the field xrc_domain is not touched, right ?
 
Right.

 Similar thing for struct ibv_send_wr xrc_remote_srq_num field.
 
Same thing -- the fields are not touched unless the qp type is IBV_QPT_XRC.

- Jack