question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
Sean, I'm debugging a disconnect-related race in iser and wanted to check something with you regarding the CM/RDMA-CM state machine: I see that when a disconnect is initiated by the passive side (iser target) of a connection, such that the active side (iser initiator) gets

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
hca_id: mlx4_0
    transport:       InfiniBand (0)
    fw_ver:          2.6.628
    node_guid:       0003:ba00:0100:b1d8
    sys_image_guid:  0003:ba00:0100:b1db
    vendor_id:

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
I'm debugging a disconnect-related race in iser and wanted to check something with you regarding the CM/RDMA-CM state machine: I see that when a disconnect is initiated by the passive side (iser target) of a connection, such that the active side (iser initiator) gets

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
On Mon, Nov 14, 2011 at 9:16 PM, Hefty, Sean sean.he...@intel.com wrote: After disconnecting, the QP should enter the timewait state for twice the packet lifetime. Does going through timewait always hold, e.g. no matter what the return status of rdma_disconnect and/or the status of the
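As a rough illustration of the "twice the packet lifetime" remark above, the sketch below computes a timewait duration from an encoded packet lifetime. It assumes the InfiniBand convention that CM time fields encode 4.096 microseconds times 2 to the power of the field value. The function names are hypothetical, and the timer an actual CM implementation arms may include additional terms.

```c
#include <stdint.h>

/* Hypothetical helper: decode an IB-style time field (assumed to be
 * 5 bits wide) into nanoseconds, using 4.096 us * 2^value. */
uint64_t packet_lifetime_ns(unsigned int value)
{
    if (value > 31)          /* clamp to the assumed 5-bit range */
        value = 31;
    return UINT64_C(4096) << value;
}

/* Timewait duration as described in the thread: twice the packet
 * lifetime. Any extra terms a real CM adds are not modeled here. */
uint64_t timewait_ns(unsigned int value)
{
    return 2 * packet_lifetime_ns(value);
}
```

Under these assumptions, an encoded value of 14 gives 2 * 67,108,864 ns, roughly 134 ms.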

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
Does going through timewait always hold, e.g. no matter what the return status of rdma_disconnect and/or the status of the rdma_cm disconnected event? It usually holds. It will fail if rdma_disconnect() is called from a bogus state. But otherwise, I believe that it will enter timewait on

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
It usually holds. It will fail if rdma_disconnect() is called from a bogus state. But otherwise, I believe that it will enter timewait on failure to send or receive a disconnect message. Hmm, so can the bogus states from which rdma_disconnect might be called be better defined? Basically, for the

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
hca_id: mlx4_0
    transport:       InfiniBand (0)
    fw_ver:          2.6.628
    node_guid:       0003:ba00:0100:b1d8
    sys_image_guid:  0003:ba00:0100:b1db
    vendor_id:

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
Hmm, so can the bogus states from which rdma_disconnect might be called be better defined? Basically, for the case where the rdma_cm manages the consumer QP, this call is the only way to move an RC QP into the error state when the QP is okay and the consumer wants to flush, etc. By bogus I mean

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
By bogus I mean calling disconnect when the QP has never been connected, or calling disconnect twice. What return value can serve as a bogus indication for the application? Is that -EINVAL? Also, basically a QP could have buffers posted to it before being connected as well (e.g. after RTR or there's
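The two "bogus" cases named above (disconnecting a connection that was never established, and disconnecting twice) can be sketched with a toy state model. This is not librdmacm or kernel code: the types and function below are hypothetical, and returning -EINVAL is only an assumption, since the thread asks about that value without confirming it here.

```c
#include <errno.h>

/* Toy connection states; the names are hypothetical. */
enum conn_state { CONN_IDLE, CONN_CONNECTED, CONN_TIMEWAIT };

struct conn {
    enum conn_state state;
};

/* Model of a disconnect call: valid only from the connected state.
 * Assumption: a bogus call is reported as -EINVAL. */
int model_disconnect(struct conn *c)
{
    if (c->state != CONN_CONNECTED)
        return -EINVAL;      /* never connected, or already disconnected */
    c->state = CONN_TIMEWAIT;
    return 0;
}
```

In this model a second call on the same connection fails because the first call already moved it to the timewait state.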

Re: srp_transport: Fix atttribute registration race

2011-11-14 Thread Or Gerlitz
On Sun, Nov 13, 2011 at 11:55 PM, Dave Dillow dillo...@ornl.gov wrote: SRP uses RDMA, so you cannot use UC mode. per the IB spec, RDMA write is supported for UC Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More

RE: question on the timewait event of the rdma-cm

2011-11-14 Thread Hefty, Sean
What return value can serve as a bogus indication for the application? Is that -EINVAL? Also, basically a QP could have buffers posted to it before being connected as well (e.g. after RTR or there's no point in time an rdma-cm consumer for which the cma manages the QP state is exposed to such QP in

Re: srp_transport: Fix atttribute registration race

2011-11-14 Thread Dave Dillow
On Mon, Nov 14, 2011 at 04:43:32PM -0500, Or Gerlitz wrote: On Sun, Nov 13, 2011 at 11:55 PM, Dave Dillow dillo...@ornl.gov wrote: SRP uses RDMA, so you cannot use UC mode. per the IB spec, RDMA write is supported for UC Yeah, I read that as UD for some reason...

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
Hello again, Vladimir! To set up /dev/mst/mt25418_pci_cr0 manually I've followed what /etc/init.d/mst start does. I've seen that it uses the minit tool, which is part of the arch-dependent mft package. Is there maybe a possibility to get minit for sparc64? Otherwise I can't init

[opensm] [PATCH 2/5] Move no_fallback_routing_engine from osm_subn_opt_t to osm_opensm_t.

2011-11-14 Thread Albert Chu
no_fallback_routing_engine is a convenience flag and not a configurable option, so it should not be in osm_subn_opt_t. Signed-off-by: Albert L. Chu ch...@llnl.gov --- include/opensm/osm_opensm.h |4 include/opensm/osm_subnet.h |1 - opensm/osm_opensm.c |4 +++-

[opensm] [PATCH 3/5] Fix rescan config file parsing spamming and loading bugs

2011-11-14 Thread Albert Chu
This patch fixes several major issues when config files are rescanned. First, config file changes were only noticed if users listed the config file option in the file. If a user commented out an option (or uncommented a previously commented out option) the change may not be noticed. Second,

[opensm] [PATCH 4/5] Fix potential memleak

2011-11-14 Thread Albert Chu
Note that with the new config file parsing code, different osm_subn_opt_t structures do not share pointers to the same strdup'ed memory. Therefore, this memory must be freed before reallocing to avoid a memleak. Signed-off-by: Albert L. Chu ch...@llnl.gov --- opensm/main.c |2 ++ 1 files changed,
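The leak described above arises when a pointer to a privately owned string is overwritten without releasing the old copy. A minimal sketch of the free-before-reassign pattern follows. The helper name is hypothetical and is not taken from the opensm patch.

```c
#include <stdlib.h>
#include <string.h>

/* Replace the string stored in *slot with a private copy of value.
 * Returns 0 on success, -1 if allocation fails (slot is left unchanged). */
int replace_string(char **slot, const char *value)
{
    char *copy = NULL;

    if (value) {
        size_t len = strlen(value) + 1;

        copy = malloc(len);
        if (!copy)
            return -1;
        memcpy(copy, value, len);
    }
    free(*slot);    /* release the old copy first; free(NULL) is a no-op */
    *slot = copy;
    return 0;
}
```

Allocating the new copy before freeing the old one keeps the slot valid if the allocation fails.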

[opensm] [PATCH 5/5] Remove duplicate initialization of scatter_ports

2011-11-14 Thread Albert Chu
Signed-off-by: Albert L. Chu ch...@llnl.gov --- opensm/osm_subnet.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/opensm/osm_subnet.c b/opensm/osm_subnet.c index 6b6bd63..f9327a6 100644 --- a/opensm/osm_subnet.c +++ b/opensm/osm_subnet.c @@ -1009,7 +1009,6 @@ void

Re: [BUG] Bad page map in process ibv_devinfo

2011-11-14 Thread Lukas Razik
Roland Dreier rol...@purestorage.com wrote: On Sun, Nov 13, 2011 at 12:26 AM, Vladimir Sokolovsky v...@dev.mellanox.co.il wrote: Try to update HCA's firmware to the latest version (http://www.mellanox.com/content/pages.php?pg=firmware_download). Independent of a firmware update, have you

Re: question on the timewait event of the rdma-cm

2011-11-14 Thread Or Gerlitz
On Mon, Nov 14, 2011 at 11:55 PM, Hefty, Sean sean.he...@intel.com wrote: [...] calling disconnect is one way that a QP may be transitioned into timewait [...] I was talking about the QP physical state (e.g. an error that causes flushes), not the state w.r.t. the IB CM. Or.

Re: srp_transport: Fix atttribute registration race

2011-11-14 Thread Bart Van Assche
On Mon, Nov 14, 2011 at 10:43 PM, Or Gerlitz or.gerl...@gmail.com wrote: On Sun, Nov 13, 2011 at 11:55 PM, Dave Dillow dillo...@ornl.gov wrote: SRP uses RDMA, so you cannot use UC mode. per the IB spec, RDMA write is supported for UC Agreed. But an SRP target does not only issue RDMA write

Re: [opensm] [PATCH 1/5] Free memory from osm_subn_opt_t when osm_subn_t destroyed

2011-11-14 Thread Bart Van Assche
On Mon, Nov 14, 2011 at 11:49 PM, Albert Chu ch...@llnl.gov wrote: + if (opt->vlarb_high) + free(opt->vlarb_high); Those if-statements are superfluous - invoking free(NULL) is safe. See e.g. http://pubs.opengroup.org/onlinepubs/009695399/functions/free.html. Bart.
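To illustrate the review comment, here is a small sketch of a destroy routine that relies on free(NULL) being a no-op, so no NULL checks are needed. The structure is a hypothetical stand-in, not the real osm_subn_opt_t, and the second field is invented for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an options structure that owns strings. */
struct demo_opt {
    char *vlarb_high;
    char *vlarb_low;    /* invented second field, for illustration only */
};

/* Release all owned strings. No "if (ptr)" guards are needed because
 * free(NULL) is defined to do nothing. */
void demo_opt_destroy(struct demo_opt *opt)
{
    free(opt->vlarb_high);
    free(opt->vlarb_low);
    opt->vlarb_high = NULL;    /* makes a repeated destroy harmless */
    opt->vlarb_low = NULL;
}
```

Resetting the pointers after freeing is a design choice of this sketch: it makes a second call safe rather than a double free.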