[ewg] ofa_1_5_kernel 20110216-0200 daily build status

2011-02-16 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters: 

Passed:
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.28
Passed on i686 with linux-2.6.27
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.31
Passed on i686 with linux-2.6.29
Passed on i686 with linux-2.6.33
Passed on i686 with linux-2.6.32
Passed on i686 with linux-2.6.30
Passed on i686 with linux-2.6.35
Passed on i686 with linux-2.6.34
Passed on i686 with linux-2.6.36
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-238.el5
Passed on x86_64 with linux-2.6.18-194.el5
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.28
Passed on x86_64 with linux-2.6.27.19-5-smp
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.28
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] rping/cxgb3 regression

2011-02-16 Thread Vladimir Sokolovsky

On 02/16/2011 04:00 AM, Hefty, Sean wrote:

Not a big deal.

Vlad, can you pull librdmacm 1.0.14.1 into the next OFED 1.5.3 RC?  The only 
change versus 1.0.14 is reverting a patch to the rping sample.

Thanks,
Sean




Done,

Regards,
Vladimir


Re: [ewg] [PATCH]mlx4_ib XRC RCV: Fix mlx4_ib_reg_xrc_rcv_qp() locking

2011-02-16 Thread Jack Morgenstein
You are correct!  Good catch.
We will add this to OFED.

(P.S., I would rather leave irqsave -- it is used everywhere else for this 
spinlock).

-Jack

On Monday 14 February 2011 09:32, sebastien dugue wrote:
 
   Resending to the proper ML (sorry).
 
 
   In mlx4_ib_reg_xrc_rcv_qp(), we need to take the xrc_reg_list_lock spinlock
 when walking the xrc_reg_list.
 
   We've been hit by this on 2 customer sites.
 
   Also, I guess spin_lock_irqsave() could be replaced by spin_lock_irq() in
 that function as we know for sure we're in process context.
 
 Signed-off-by: Sébastien Dugué sebastien.du...@bull.net
 
 --
 
  qp.c |3 +++
  1 file changed, 3 insertions(+)
 
 Index: kernel-ib/drivers/infiniband/hw/mlx4/qp.c
 ===
 --- kernel-ib.orig/drivers/infiniband/hw/mlx4/qp.c	2011-01-31 16:52:11.0 +0100
 +++ kernel-ib/drivers/infiniband/hw/mlx4/qp.c	2011-02-11 15:24:27.0 +0100
 @@ -2549,13 +2549,16 @@
  	}
  
  	mutex_lock(&mibqp->mutex);
 +	spin_lock_irqsave(&mibqp->xrc_reg_list_lock, flags);
  	list_for_each_entry(tmp, &mibqp->xrc_reg_list, list)
  		if (tmp->context == context) {
 +			spin_unlock_irqrestore(&mibqp->xrc_reg_list_lock, flags);
  			mutex_unlock(&mibqp->mutex);
  			kfree(ctx_entry);
  			mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex);
  			return 0;
  		}
 +	spin_unlock_irqrestore(&mibqp->xrc_reg_list_lock, flags);
  
  	ctx_entry->context = context;
  	spin_lock_irqsave(&mibqp->xrc_reg_list_lock, flags);
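
The rule the patch enforces is that every walk of the list holds the list lock, and every exit path from the walk drops it. A userspace sketch of that rule follows. It is only an analogue under stated assumptions: a pthread mutex stands in for the kernel spinlock, and `struct reg_list`, `reg_list_push()` and `reg_list_contains()` are hypothetical names that do not appear in mlx4_ib.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names come from mlx4_ib. */
struct reg_entry {
    void *context;
    struct reg_entry *next;
};

struct reg_list {
    pthread_mutex_t lock;      /* plays the role of xrc_reg_list_lock */
    struct reg_entry *head;
};

/* Add a context at the head of the list.
 * Returns 0 on success, -1 if the allocation fails. */
int reg_list_push(struct reg_list *rl, void *context)
{
    struct reg_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->context = context;

    pthread_mutex_lock(&rl->lock);
    e->next = rl->head;
    rl->head = e;
    pthread_mutex_unlock(&rl->lock);
    return 0;
}

/* Walk the list with the lock held. Both exits release the lock,
 * mirroring the two unlock calls the patch adds. */
int reg_list_contains(struct reg_list *rl, const void *context)
{
    struct reg_entry *e;

    pthread_mutex_lock(&rl->lock);
    for (e = rl->head; e; e = e->next) {
        if (e->context == context) {
            pthread_mutex_unlock(&rl->lock);   /* early exit: entry found */
            return 1;
        }
    }
    pthread_mutex_unlock(&rl->lock);           /* normal exit: not found */
    return 0;
}
```

A list can be set up with `struct reg_list rl = { PTHREAD_MUTEX_INITIALIZER, NULL };`. Forgetting the unlock on the early-exit path is the usual mistake with this pattern, which is why the patch touches both exits.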
 

Re: [ewg] [PATCH]mlx4_ib XRC RCV: Fix mlx4_ib_reg_xrc_rcv_qp() locking

2011-02-16 Thread sebastien dugue
On Wed, 16 Feb 2011 14:50:02 +0200
Jack Morgenstein ja...@dev.mellanox.co.il wrote:

 You are correct!  Good catch.
 We will add this to OFED.

  Thanks,

 
 (P.S., I would rather leave irqsave -- it is used everywhere else for this 
 spinlock).

  Right, but wherever you know for sure which context you are in (process
or interrupt), there's no need to use the save/restore variant. Those are only
needed in places where you don't know which context you are in.
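
The difference between the two variants is that spin_lock_irqsave() records the previous interrupt state and spin_unlock_irqrestore() puts it back, while spin_unlock_irq() re-enables interrupts unconditionally. A rough userspace analogue is sketched below, with a blocked signal standing in for disabled interrupts. This is an illustration only: the function names are hypothetical, and signals are not a faithful model of interrupt handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Returns 1 if SIGUSR1 is currently blocked for the caller, 0 if not. */
int usr1_blocked(void)
{
    sigset_t cur;

    sigprocmask(SIG_BLOCK, NULL, &cur);   /* NULL set: query only */
    return sigismember(&cur, SIGUSR1);
}

/* Analogue of spin_lock_irq()/spin_unlock_irq():
 * block on entry, always unblock on exit. */
void section_unconditional(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, NULL);
    /* ... critical section ... */
    sigprocmask(SIG_UNBLOCK, &set, NULL);
}

/* Analogue of spin_lock_irqsave()/spin_unlock_irqrestore():
 * remember the previous mask and put it back on exit. */
void section_save_restore(void)
{
    sigset_t set, saved;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, &saved);
    /* ... critical section ... */
    sigprocmask(SIG_SETMASK, &saved, NULL);
}
```

If the caller already had SIGUSR1 blocked, section_save_restore() leaves it blocked, whereas section_unconditional() returns with it unblocked. When the caller is known to run with the signal unblocked, both behave the same, which corresponds to the process-context argument made above.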

  Also, one thing I noticed in that same function: why allocate ctx_entry before
knowing if it's going to be of any use? The allocation could be done right before
the first use.

  Sébastien.

 

Re: [ewg] [PATCH]mlx4_ib XRC RCV: Fix mlx4_ib_reg_xrc_rcv_qp() locking

2011-02-16 Thread Jack Morgenstein
On Wednesday 16 February 2011 15:02, sebastien dugue wrote:
   Also, one thing I noticed in that same function: why allocate ctx_entry before
 knowing if it's going to be of any use? The allocation could be done right before
 the first use.
 
I did it just to gather all the error returns at the beginning of the function.
You are correct, though: I could have walked the list before doing the allocation.
I don't see this as critical, though.

-Jack
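
The ordering discussed in this exchange (walk the list first, allocate only when a new entry is really needed) can be sketched as follows. This is a simplified userspace illustration, not the driver function: `struct ctx_node`, `register_context()` and the allocation counter are hypothetical, and locking is left out so that the ordering stays visible.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; not a structure from mlx4_ib. */
struct ctx_node {
    void *context;
    struct ctx_node *next;
};

static int alloc_calls;   /* instrumentation for this example only */

/* Register a context, allocating the node only after the duplicate check.
 * Returns 0 on success or if the context is already registered,
 * -1 if the allocation fails. */
int register_context(struct ctx_node **head, void *context)
{
    struct ctx_node *n;

    for (n = *head; n; n = n->next)
        if (n->context == context)
            return 0;     /* duplicate: nothing allocated, nothing to free */

    alloc_calls++;
    n = malloc(sizeof(*n));
    if (!n)
        return -1;        /* this error return now sits after the walk */
    n->context = context;
    n->next = *head;
    *head = n;
    return 0;
}

/* Number of allocation attempts made so far. */
int register_context_alloc_calls(void)
{
    return alloc_calls;
}
```

Registering the same context twice performs one allocation and needs no free on the duplicate path. The cost is the one mentioned above: the allocation-failure return is no longer grouped with the other error returns at the top of the function. In kernel code, an allocation that may sleep would also have to stay outside the spinlock-protected region.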

[ewg] pull request

2011-02-16 Thread Mike Marciniszyn
Please pull the following fix from
git.openfabrics.org/~mmarciniszyn/scm/linux-2.6.to_ofed.   This fixes an issue 
revealed during OFED 1.5.3 testing.

Mike

commit 0012f25501856cf0f47397fbb57e26bf46b11b99
Author: Mike Marciniszyn mike.marcinis...@qlogic.com
Date:   Wed Feb 16 09:47:41 2011 -0500

IB/qib: Prevent double completions after a timeout or RNR error

From: Mike Marciniszyn mike.marcinis...@qlogic.com

There is a double completion associated with error handling
for RC QP's.

The sequence is:
- The do_rc_ack() routine fields an RNR nack and there are 0 rnr_retries
configured on the QP.
- qib_error_qp() stops the pending timer
- qib_rc_send_complete() is called from sdma_complete()
- qib_rc_send_complete() starts the timer because the msb of the psn just
completed says an ack is needed.
- a bunch of flushes occur as ipoib posts wqe's to an error'ed qp
- rc_timeout() calls qib_restart_rc()
- qib_restart_rc() calls qib_send_complete() with a IB_WC_RETRY_EXC_ERR on a
wqe that has already been completed in the past

The fix avoids starting the timer since another packet will never
arrive.
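
The sequence above can be replayed with a toy state machine. This is a deliberately simplified model, not qib code: every `toy_` name is hypothetical, the first completion is assumed to happen when the RNR NAK is handled, and the "QP is in the error state" check is an assumption about how the timer start might be skipped. The commit message only says that the fix avoids starting the timer.

```c
#include <stdbool.h>

enum toy_qp_state { TOY_QP_RTS, TOY_QP_ERR };

/* Toy model of one RC QP with a single outstanding WQE. */
struct toy_qp {
    enum toy_qp_state state;
    bool timer_armed;
    int completions;      /* completions reported for that one WQE */
};

/* RNR NAK with zero rnr_retries: the WQE completes in error,
 * the QP moves to the error state and the pending timer is stopped. */
void toy_rnr_nak_no_retries(struct toy_qp *qp)
{
    qp->completions++;
    qp->state = TOY_QP_ERR;
    qp->timer_armed = false;
}

/* Send-side completion for a packet that requested an ack. */
void toy_send_complete(struct toy_qp *qp, bool ack_requested, bool apply_fix)
{
    if (!ack_requested)
        return;
    if (apply_fix && qp->state == TOY_QP_ERR)
        return;           /* no response can arrive: leave the timer off */
    qp->timer_armed = true;
}

/* Timer expiry: the retry path gives up and completes the WQE again. */
void toy_timeout(struct toy_qp *qp)
{
    if (!qp->timer_armed)
        return;
    qp->timer_armed = false;
    qp->completions++;
}

/* Replay the sequence and return how many completions the WQE received. */
int toy_run_sequence(bool apply_fix)
{
    struct toy_qp qp = { TOY_QP_RTS, true, 0 };

    toy_rnr_nak_no_retries(&qp);
    toy_send_complete(&qp, true, apply_fix);
    toy_timeout(&qp);
    return qp.completions;
}
```

Without the check the model reports two completions for the same WQE. With it the timer is never re-armed and only the original completion is reported.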

Signed-off-by: Mike Marciniszyn mike.marcinis...@qlogic.com



Re: [ewg] pull request

2011-02-16 Thread Vladimir Sokolovsky

On 02/16/2011 05:55 PM, Mike Marciniszyn wrote:

Please pull the following fix from
git.openfabrics.org/~mmarciniszyn/scm/linux-2.6.to_ofed.   This fixes an issue 
revealed during OFED 1.5.3 testing.

Mike



Done,

Regards,
Vladimir


[ewg] [ANNOUNCE] management tarballs release

2011-02-16 Thread Alex Netes
Hi,

There is a new release of the management (OpenSM and infiniband
diagnostics) tarballs available in:

http://www.openfabrics.org/downloads/management/

(listed in http://www.openfabrics.org/downloads/management/latest.txt)

c0b24a1053ae8b0b3caf5950b3ede6dc  infiniband-diags-1.5.8.tar.gz
c2755aa360d3f29d04865ba4e2454a98  libibmad-1.3.7.tar.gz
c7575b7620615d7dfa1c7fdbbd310ec7  libibumad-1.3.7.tar.gz
df051f5f0192d369b0b904147cb045a8  opensm-3.3.8.tar.gz

All component versions are from recent master branch. Full list of
changes is below.

OpenSM:
===

Alex Netes (1):
  opensm: fixed getline pointer allocation free in osm_console_io

Eli Dorfman (Voltaire) (1):
  Wrong handling of MC create and delete traps

Hal Rosenstock (6):
  opensm/osm_state_mgr.c: Don't signal DISCOVER to SM state machine when already DISCOVERING
  opensm: Fix some typos
  osmtest/osmt_service.c: In osmt_run_service_records_flow, add missing status
  opensm/osm_ucast_ftree: When roots are not connected, update hop count but not lft
  opensm/osm_trap_rcv.c: No need to check for sweep for trap 145
  opensm: Add support for SwitchInfo:MulticastFDBTop

Ira Weiny (1):
  Add node/port/qos information to some error messages

Jason Gunthorpe (1):
  Fix autotools to include the necessary M4 files

Sasha Khapyorsky (3):
  opensm/sa: simplify osm_mcmr_rcv_find_or_create_new_mgrp() function call
  opensm/osm_node_info_rcv.c: move p_physp declaration under code block
  opensm/osm_db_files.c: malloc() return value run-time check

Stan C. Smith (2):
  replace (long*)(long) casting with transportable data type (uintptr_t)
  replace (long*)(long) casting with transportable data type (uintptr_t)

Yevgeny Kliteynik (28):
  opensm/osm_qos_policy.c: change a log message
  opensm/osm_prtn.c: removing TopSpin hack
  libvendor/osm_vendor_ibumad_sa.c: remove useless if statement
  libvendor/osm_vendor_mlx_sa.c: remove useless if statement
  opensm/osm_mtree.c: removing useless 'if' statement
  opensm/osm_sminfo_rcv.c: removing unused variable
  opensm/osm_pkey.c: removing unused function
  opensm/osm_sa_pkey_record.c: removing unused variable
  opensm/osm_sa_vlarb_record.c: removed unused variable
  opensm/osm_node_info_rcv.c: remove useless code line
  osmtest/osmtest.c: handle timeouts in PR stress test
  opensm/osm_helper.c: fix potential overrun of the array
  opensm/osm_helper.c: cosmetics - move define closer to the relevant code
  opensm/osm_mesh.c: fixing a bug in compare_switches()
  opensm/osm_subnet.c: fixing small bug in error path
  opensm/osm_db_files.c: fix small memory leak
  osmtest/osmt_slvl_vl_arb.c: handling fopen() failure
  opensm/osm_helper.c: use ARR_SIZE macro instead of hardcoded values
  osm_vl15intf.c: fixing use-after-free coredump
  opensm/osm_trap_rcv.c: fix possible core dump
  opensm/osm_ucast_ftree.c: fix small memory leak in error path
  opensm/osm_ucast_ftree.c: fixing another memory leak at error path
  opensm/osm_ucast_lash.c: small bug in calculating allocated size
  opensm/osm_pkey_mgr.c: fixing small memory leak
  opensm/osm_ucast_file.c: closing file descriptor in error path
  opensm/osm_qos_parser_y.y: fixing bunch of memory leaks on invalid values
  opensm/osm_console.c: fix memory and file descriptor leaks
  opensm/st.c: fix potential core dumps

libibumad:
==

Jason Gunthorpe (1):
  Fix autotools to include the necessary M4 files

Mike Heinz (1):
  FW: [PATCH] umad_send.3 (man page)

Yevgeny Kliteynik (1):
  umad.{c,h}: moving stdlib.h include from C to H file

libibmad:
=

Ira Weiny (1):
  libibmad/fields.c: Change all PortCounter names to match the Specification

Jason Gunthorpe (1):
  Fix autotools to include the necessary M4 files

infiniband-diags:
=

Albert Chu (4):
  add --diff support to iblinkinfo
  support --diffcheck in iblinkinfo
  Add lid and node description diff options for --diffcheck in iblinkinfo
  support --filterdownports in iblinkinfo

Alex Netes (3):
  Makefile: ChangeLog and version generation script path fix
  infiniband-diags: update shared library versions
  infiniband-diags: package versions update

Eli Dorfman (Voltaire) (2):
  infiniband-diags: Do not exit when unexpected node found
  inifiband-diags: Support Voltaire switch ISR4200

Hal Rosenstock (3):
  infiniband-diags/ibtracert: Eliminate direct route (-D) option
  infiniband-diags/saquery.c: In dump_one_mcmember_record, fix flow label endian
  infiniband-diags/iblinkinfo.c: Limit some queries to switches

Ira Weiny (4):
  libibmad/fields.c: Change all PortCounter names to match the Specification
  infiniband-diags: Verify timeout value specified to diagnostics
  Further timeout paramater verification (Was: [PATCH] infiniband-diags: 
Verify