Re: AW: AW: AW: IPoIB GRO

2013-11-05 Thread Erez Shitrit



I see. This didn't happen on our setups here since we test with
newer cards (ConnectX2/3/3-pro).
With ConnectX1 (A0) and the firmware that you are using, it smells
like something is going wrong. If possible, I would change to a newer
card.

No problem with that. My journey up to here was hard but very
interesting. Especially when you expect everything in the system
to be consistent and new speedups with every new kernel or
driver version. Encountering a throughput drop of nearly 50%
with the upgrade of our NFS servers I was challenged.

With TSO disabled on our old cards I'm back to LRO speeds and
I'm more than happy with that.

Just a final clarification for the interested reader: Are the TCP Ids
in a TSO setup generated through firmware or in the software
stack? And if in firmware: How does the card know how to
increase them? I would expect that it only works with IB packets
and does not know of the IP encapsulation.

The card (HW) knows how to deal with IP packets. The card is configured
via the FW to increase the ip-id for each IP packet that is part of
the full message.


so, to summarize:
The HW does the work (splits the big IP packet into a series of IP
packets, each of the relevant MTU size, and increments the ip-id for each).

The FW enables that work on the HW.
The FW in the A0 card doesn't enable that option for the HW.
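
As a rough illustration of the summary above, the following user-space C sketch mimics what the hardware is described as doing: it splits one large payload into segments of at most one MSS and gives each segment the next IP ID. The names tso_segment and tso_split are made up for this example and are not part of the mlx4 driver or firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one on-wire segment (illustration only). */
struct tso_segment {
    size_t   offset;  /* offset of this segment's payload in the big packet */
    size_t   len;     /* payload bytes carried by this segment */
    uint16_t ip_id;   /* IP identification value for this segment */
};

/*
 * Split a payload of total_len bytes into segments of at most mss bytes.
 * Each segment gets the previous IP ID plus one (wrapping at 16 bits).
 * Returns the number of segments written, or 0 if there is nothing to
 * send, mss is 0, or out[] is too small.
 */
size_t tso_split(size_t total_len, size_t mss, uint16_t first_ip_id,
                 struct tso_segment *out, size_t out_cap)
{
    size_t n = 0, off = 0;

    if (mss == 0)
        return 0;

    while (off < total_len) {
        size_t len = total_len - off;

        if (len > mss)
            len = mss;
        if (n == out_cap)
            return 0;           /* caller's array is too small */

        out[n].offset = off;
        out[n].len = len;
        out[n].ip_id = (uint16_t)(first_ip_id + n);
        off += len;
        n++;
    }
    return n;
}
```

If the IP ID is not advanced per segment, as reported for the A0 firmware, a receiver that expects consecutive IDs before merging segments has a reason to refuse them, which would fit the GRO behavior discussed in this thread.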



Best regards.

Markus


--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AW: AW: AW: IPoIB GRO

2013-11-05 Thread Or Gerlitz

On 05/11/2013 10:25, Markus Stockhausen wrote:

Are the TCP Ids in a TSO setup generated through firmware or in the software 
stack?


in HW

And if in firmware: How does the card know how to increase them? I would expect 
that it only works with IB packets
and does not know of the IP encapsulation.


Any vendor's networking HW that does TSO gets a hint from the driver
that this is a TSO packet. In your case, see mlx4_ib_post_send and look
for IB_WR_LSO.



Re: [PATCH RFC v2 00/10] Introduce Signature feature

2013-11-05 Thread Sagi Grimberg

On 11/4/2013 8:41 PM, Nicholas A. Bellinger wrote:

On Sat, 2013-11-02 at 14:57 -0700, Bart Van Assche wrote:

On 1/11/2013 18:36, Nicholas A. Bellinger wrote:

On Fri, 2013-11-01 at 08:03 -0700, Bart Van Assche wrote:

On 31/10/2013 5:24, Sagi Grimberg wrote:

In T10-DIF, when a series of 512-byte data blocks is transferred, each
block is followed by an 8-byte guard. The guard consists of a CRC that
protects the integrity of the data in the block, and some other tags
that protect against misdirected I/Os.

Shouldn't that read logical block length divided by 2**(protection
interval exponent) instead of 512 ? From the SPC-4 FORMAT UNIT
section:

Why should the protection interval in FORMAT_UNIT be mentioned when it's
not supported by the hardware, nor by drivers/scsi/sd_dif.c itself..?

Hello Nick,

My understanding is that this patch series is not only intended for
initiator drivers but also for target drivers like ib_srpt and ib_isert.
As you know target drivers do not restrict the initiator operating
system to Linux. Although I do not know whether there are already
operating systems that support the protection interval exponent,

It's my understanding that Linux is still the only stack that supports
DIF, so AFAICT no one is actually supporting this.


  I think it is a good idea to stay as close as possible to the terminology
of the SPC-4 standard.


No, in this context it only adds pointless misdirection because 1) The
hardware in question doesn't support it, and 2) Linux itself doesn't
support it.


I think that Bart is suggesting renaming block_size as pi_interval in 
ib_sig_domain.
I tend to agree since even if support for that does not exist yet, it 
might be in the future.
I think it is not a misdirection because it does represent the 
protection information interval.
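
For readers following the naming discussion, here is a minimal C sketch of the pieces involved, under stated assumptions: the 8-byte field is modeled as a guard tag, an application tag and a reference tag; the guard is computed with the CRC-16/T10-DIF parameters (polynomial 0x8BB7, initial value 0, no reflection); and the interval follows the formula quoted above from SPC-4. The names dif_tuple, crc_t10dif and pi_interval_bytes are hypothetical and do not come from the patch series. Byte order on the wire is not handled here.

```c
#include <stddef.h>
#include <stdint.h>

/* 8-byte protection information appended to each protected interval. */
struct dif_tuple {
    uint16_t guard_tag; /* CRC over the data of the interval */
    uint16_t app_tag;   /* application tag */
    uint32_t ref_tag;   /* reference tag, guards against misdirected I/O */
};

/* Bitwise CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, MSB first. */
uint16_t crc_t10dif(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/*
 * Protection interval in bytes: logical block length divided by
 * 2^(protection interval exponent). With an exponent of 0 the interval
 * equals the logical block length. pi_exponent must be below 32.
 */
uint32_t pi_interval_bytes(uint32_t logical_block_len, unsigned int pi_exponent)
{
    return logical_block_len >> pi_exponent;
}
```

With a 512-byte logical block and an exponent of 0, block size and protection interval are the same number, which is the only case covered by the hardware and by Linux according to this thread.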



--nab





Re: [PATCH for-next V1 8/8] IB/core: extended command: move comp_mask to the extended header

2013-11-05 Thread Yann Droneaud
Hi Matan,

On Wednesday, 30 October 2013 at 11:52 +0200, Matan Barak wrote:
 From: Yann Droneaud ydrone...@opteya.com
 
 The unused field in the extended header is a perfect candidate
 to hold the command comp_mask (eg. bit field used to handle
 compatibility). This was suggested by Roland Dreier in a previous
 review[1].
 
 So this patch moves comp_mask from create_flow/destroy_flow commands
 to the extended command header. Then comp_mask is passed as part
 of function parameters.
 

As I wrote in a previous mail, I think this comp_mask should not be
handled specially, since a comp_mask might also be needed for the
provider.

So I'm now in favor of dropping this patch and adding a note, in the
patch which updates the framework, about the usage of comp_mask in each
part of the command/response.

Regards.

-- 
Yann Droneaud
OPTEYA




Re: [PATCH for-next V1 0/8] uverbs extensions fixes

2013-11-05 Thread Yann Droneaud
Hi,

On Wednesday, 30 October 2013 at 11:52 +0200, Matan Barak wrote:
 This series is a continuous improvement for the uverbs extension mechanism
 that was introduced as an experimental feature for v3.12.
 
 Yann Droneaud suggested and implemented the following improvements:
 - structure renaming to match other uverbs public structs;
 - changes usage of the flow_attr.size to not count the
   extended command header but to describe only the size
   of the flow specs following flow_attr;
 - removed unneeded flow_spec structure that doesn't need to be
   exposed to userspace;
 - ensure 64-bit alignment.
 
 This series is actually Yann's series with a bug fix.
 
 Changes from Yann's series
 (V0 http://marc.info/?l=linux-rdma&m=138151196022025):
 1. Re-enable flow steering verbs and the extension verbs mechanism.
 2. Squashed patches 1 and 2 from the original series
 3. ib_uverbs_write should return the number of bytes including the
header's size (Patch 7).
 

Thanks Matan for carrying on the patchset.

I have much the same patchset, but the other way around, i.e. enabling
the flow steering verbs after the cleanup of the new ABI. I thought it would
make more sense this way. Would you like me to send the patchset this
way, along with my other patches to rename the function, which were dropped
from my latest attempt in order to squeeze the patchset down to the bare
minimum?

Regarding the extensible framework, I haven't found time to design a new
proposal for the interface.

I keep in mind that something built around writev(2) (struct iovec)
and/or cmsg(3) / netlink(3) would be preferable, to ease sending
multipart commands to the uverbs subsystem.
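
To make the writev(2) idea concrete, here is a minimal sketch that sends a header, a core part and a provider part in one call without first copying them into a contiguous buffer. The header layout and the names demo_cmd_hdr and send_multipart are invented for illustration; this is not the uverbs ABI, and the file descriptor can be any writable descriptor.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical command header; not the real uverbs ABI. */
struct demo_cmd_hdr {
    uint32_t command;        /* command identifier */
    uint16_t core_words;     /* size of the core part, in 32-bit words */
    uint16_t provider_words; /* size of the provider part, in 32-bit words */
};

/*
 * Submit the three parts with a single writev(2) call.
 * Returns what writev(2) returns: the total number of bytes written,
 * or -1 with errno set on error.
 */
ssize_t send_multipart(int fd, const struct demo_cmd_hdr *hdr,
                       const void *core, size_t core_len,
                       const void *prov, size_t prov_len)
{
    struct iovec iov[3] = {
        { .iov_base = (void *)hdr,  .iov_len = sizeof(*hdr) },
        { .iov_base = (void *)core, .iov_len = core_len },
        { .iov_base = (void *)prov, .iov_len = prov_len },
    };

    return writev(fd, iov, 3);
}
```

The receiving side sees one contiguous byte stream, so the header still has to describe where each part starts and ends.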

BTW, I think we should drop the patch that adds the comp_mask to the
header. As you wrote in a previous mail, a comp_mask could be present
in the provider part of the command. This makes the handling of the
comp_mask in the header look very different and very specific, when it
is not: there could be several comp_masks, one in the command, one in
the provider part, one in the response and one in the provider response.
So I would prefer not to have the command comp_mask treated differently
from the others.
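
A small sketch of the point being made, assuming the usual compatibility-mask convention (unknown bits are rejected): if every part carries its own comp_mask, one check can be applied uniformly to each of them. The bit names, demo_cmd_part and demo_check_comp_mask are hypothetical and are not taken from the uverbs code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bits understood by this (imaginary) receiver. */
#define DEMO_COMP_FEATURE_A (1u << 0)
#define DEMO_COMP_FEATURE_B (1u << 1)
#define DEMO_COMP_SUPPORTED (DEMO_COMP_FEATURE_A | DEMO_COMP_FEATURE_B)

/* Any part (command, provider, response, provider response) may carry one. */
struct demo_cmd_part {
    uint32_t comp_mask;
    /* optional fields selected by comp_mask would follow here */
};

/* Reject a part that sets a bit the receiver does not know about. */
int demo_check_comp_mask(const struct demo_cmd_part *part)
{
    if (part->comp_mask & ~DEMO_COMP_SUPPORTED)
        return -EINVAL;
    return 0;
}
```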

Regards.

-- 
Yann Droneaud
OPTEYA




Re: [PATCH for-next V1 0/8] uverbs extensions fixes

2013-11-05 Thread Or Gerlitz

On 05/11/2013 11:05, Yann Droneaud wrote:

Thanks Matan for carrying on the patchset.

I have much the same patchset, but the other way around, i.e. enabling
the flow steering verbs after the cleanup of the new ABI. I thought it would
make more sense this way. Would you like me to send the patchset this
way, along with my other patches to rename the function, which were dropped
from my latest attempt in order to squeeze the patchset down to the bare minimum?


Good! Let's do that ASAP. Please make sure to mark the series as V2 and
list the changes from V1 in the cover letter.



Regarding the extensible framework, I haven't found time to design a new
proposal for the interface.

I keep in mind that something built around writev(2) (struct iovec)
and/or cmsg(3) / netlink(3) would be preferable, to ease sending
multipart commands to the uverbs subsystem.


Well, we do want the solution to use the uverbs framework and not
involve another paradigm for user/kernel verbs interaction. With this in
mind, and as Matan and Tzahi wrote to you earlier on the list, we do think
the current proposal is good enough to carry on with.




BTW, I think we should drop the patch that adds the comp_mask to the
header. As you wrote in a previous mail, a comp_mask could be present
in the provider part of the command. This makes the handling of the
comp_mask in the header look very different and very specific, when it
is not: there could be several comp_masks, one in the command, one in
the provider part, one in the response and one in the provider response.
So I would prefer not to have the command comp_mask treated differently
from the others.


makes sense.

Matan and Or.



Re: AW: AW: AW: IPoIB GRO

2013-11-05 Thread Jason Gunthorpe
On Tue, Nov 05, 2013 at 10:48:10AM +0200, Erez Shitrit wrote:

 so, to summarize:
 The HW does the work (splits the big IP packet into a series of IP
 packets, each of the relevant MTU size, and increments the ip-id for
 each).
 The FW enables that work on the HW.
 The FW in the A0 card doesn't enable that option for the HW.

Sounds like this bug causes a performance regression, and it sounds
like it puts incorrect packets on the wire.

This should be patched, have the driver disable TSO for cards that
can't support it...

Jason


[PATCH] librdmacm: Some fixes to man pages

2013-11-05 Thread Guy Shapiro
Fix the man pages of rdma_destroy_ep & rdma_destroy_qp to the correct return 
value (void).
---
 man/rdma_destroy_ep.3 |    5 +----
 man/rdma_destroy_qp.3 |    3 ---
 2 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/man/rdma_destroy_ep.3 b/man/rdma_destroy_ep.3
index b48a1e5..750702a 100644
--- a/man/rdma_destroy_ep.3
+++ b/man/rdma_destroy_ep.3
@@ -5,16 +5,13 @@ rdma_destroy_ep \- Release a communication identifier.
 .SH SYNOPSIS
 .B #include <rdma/rdma_cma.h>
 .P
-.B int rdma_destroy_ep
+.B void rdma_destroy_ep
 .BI (struct rdma_cm_id * id );
 .SH ARGUMENTS
 .IP id 12
 The communication identifier to destroy.
 .SH DESCRIPTION
 Destroys the specified rdma_cm_id and all associated resources
-.SH RETURN VALUE
-Returns 0 on success, or -1 on error.  If an error occurs, errno will be
-set to indicate the failure reason.
 .SH NOTES
 rdma_destroy_ep will automatically destroy any QP and SRQ associated with
 the rdma_cm_id.
diff --git a/man/rdma_destroy_qp.3 b/man/rdma_destroy_qp.3
index aeff667..bb1360e 100644
--- a/man/rdma_destroy_qp.3
+++ b/man/rdma_destroy_qp.3
@@ -11,9 +11,6 @@ rdma_destroy_qp \- Deallocate a QP.
 RDMA identifier.
 .SH DESCRIPTION
 Destroy a QP allocated on the rdma_cm_id.
-.SH RETURN VALUE
-Returns 0 on success, or -1 on error.  If an error occurs, errno will be
-set to indicate the failure reason.
 .SH NOTES
 Users must destroy any QP associated with an rdma_cm_id before
 destroying the ID.
-- 
1.7.1



[PATCH opensm] Fix dropping node after setPkey mad

2013-11-05 Thread Hal Rosenstock

From: Dan Ben Yosef da...@mellanox.com

Need to check (decrement) the pkey received-blocks counter only
after Get MADs, not after Set MADs.

Signed-off-by: Dan Ben Yosef da...@mellanox.com
---
 include/opensm/osm_port.h |    3 ++-
 opensm/osm_pkey_rcv.c     |    3 ++-
 opensm/osm_port.c         |    6 ++++--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/opensm/osm_port.h b/include/opensm/osm_port.h
index a1efc97..f4b7efd 100644
--- a/include/opensm/osm_port.h
+++ b/include/opensm/osm_port.h
@@ -558,7 +558,8 @@ void osm_physp_set_port_info(IN osm_physp_t * p_physp,
 void osm_physp_set_pkey_tbl(IN osm_log_t * p_log, IN const osm_subn_t * p_subn,
IN osm_physp_t * p_physp,
IN ib_pkey_table_t * p_pkey_tbl,
-   IN uint16_t block_num);
+   IN uint16_t block_num,
+   IN boolean_t is_set);
 /*
 * PARAMETERS
 *  p_log
diff --git a/opensm/osm_pkey_rcv.c b/opensm/osm_pkey_rcv.c
index 81da39c..6650766 100644
--- a/opensm/osm_pkey_rcv.c
+++ b/opensm/osm_pkey_rcv.c
@@ -186,7 +186,8 @@ void osm_pkey_rcv_process(IN void *context, IN void *data)
   p_pkey_tbl, FILE_ID, OSM_LOG_DEBUG);
 
 	osm_physp_set_pkey_tbl(sm->p_log, sm->p_subn,
-			       p_physp, p_pkey_tbl, block_num);
+			       p_physp, p_pkey_tbl, block_num,
+			       p_context->set_method);
 
 Exit:
 	cl_plock_release(sm->p_lock);
diff --git a/opensm/osm_port.c b/opensm/osm_port.c
index 9dd7992..46152c5 100644
--- a/opensm/osm_port.c
+++ b/opensm/osm_port.c
@@ -646,7 +646,8 @@ boolean_t osm_link_is_healthy(IN const osm_physp_t * p_physp)
 void osm_physp_set_pkey_tbl(IN osm_log_t * p_log, IN const osm_subn_t * p_subn,
IN osm_physp_t * p_physp,
IN ib_pkey_table_t * p_pkey_tbl,
-   IN uint16_t block_num)
+   IN uint16_t block_num,
+   IN boolean_t is_set)
 {
uint16_t max_blocks;
 
@@ -687,7 +688,8 @@ void osm_physp_set_pkey_tbl(IN osm_log_t * p_log, IN const osm_subn_t * p_subn,
 	}
 
 	/* decrement block received counter */
-	p_physp->pkeys.rcv_blocks_cnt--;
+	if(!is_set)
+		p_physp->pkeys.rcv_blocks_cnt--;
 	osm_pkey_tbl_set(&p_physp->pkeys, block_num, p_pkey_tbl,
 			 p_subn->opt.allow_both_pkeys);
 }
-- 
1.7.8.2
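
The idea of the patch can be restated in a few lines of standalone C. This is only a sketch: demo_pkey_state and demo_pkey_block_received are hypothetical stand-ins, the reading of rcv_blocks_cnt as "Get responses still outstanding" is an assumption, and the underflow guard is an addition of this example that the patch itself does not contain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-port pkey bookkeeping. */
struct demo_pkey_state {
    uint16_t rcv_blocks_cnt; /* assumed: Get responses still outstanding */
};

/*
 * Called for every received P_Key table block. Only a response to a Get
 * is counted against the outstanding blocks; a response to a Set leaves
 * the counter alone.
 */
void demo_pkey_block_received(struct demo_pkey_state *st, bool is_set)
{
    /* The "> 0" guard is an extra safety check, not part of the patch. */
    if (!is_set && st->rcv_blocks_cnt > 0)
        st->rcv_blocks_cnt--;

    /* The table contents would be stored here in either case. */
}
```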



AW: AW: AW: AW: IPoIB GRO

2013-11-05 Thread Markus Stockhausen
  so, to summarize:
  The HW does the work (splits the big IP packet into a series of IP
  packets, each of the relevant MTU size, and increments the ip-id for
  each).
  The FW enables that work on the HW.
  The FW in the A0 card doesn't enable that option for the HW.
 
 Sounds like this bug causes a performance regression, and it sounds
 like it puts incorrect packets on the wire.
 
 This should be patched, have the driver disable TSO for cards that
 can't support it...
 
 Jason

Incredible how a card that does not support TSO can bring big packets
on the wire that somehow get reassembled on the client side :) Maybe
a two-liner in mlx4_ib_query_device() could prevent further discussions.
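
What such a two-liner might boil down to, written as a standalone sketch rather than as a driver patch: advertise the TSO capability only when the firmware actually enables it. DEMO_CAP_UD_TSO, demo_device and demo_query_caps are hypothetical names for this example; how the real driver would detect the A0 firmware limitation is not stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bit and device description (illustration only). */
#define DEMO_CAP_UD_TSO (1u << 0)

struct demo_device {
    uint32_t hw_caps;        /* capabilities the hardware claims */
    bool     fw_enables_tso; /* whether the firmware turns the feature on */
};

/* Report TSO only when both the hardware and the firmware support it. */
uint32_t demo_query_caps(const struct demo_device *dev)
{
    uint32_t caps = dev->hw_caps;

    if (!dev->fw_enables_tso)
        caps &= ~DEMO_CAP_UD_TSO;
    return caps;
}
```

Upper layers that key their offload settings off the reported capabilities would then leave TSO off on such cards without manual configuration.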

Markus

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497




Re: [PATCH RFC v2 00/10] Introduce Signature feature

2013-11-05 Thread Nicholas A. Bellinger
On Tue, 2013-11-05 at 11:13 +0200, Sagi Grimberg wrote:
 On 11/4/2013 8:41 PM, Nicholas A. Bellinger wrote:
  On Sat, 2013-11-02 at 14:57 -0700, Bart Van Assche wrote:
  On 1/11/2013 18:36, Nicholas A. Bellinger wrote:
  On Fri, 2013-11-01 at 08:03 -0700, Bart Van Assche wrote:
  On 31/10/2013 5:24, Sagi Grimberg wrote:
  In T10-DIF, when a series of 512-byte data blocks is transferred, each
  block is followed by an 8-byte guard. The guard consists of a CRC that
  protects the integrity of the data in the block, and some other tags
  that protect against misdirected I/Os.
  Shouldn't that read logical block length divided by 2**(protection
  interval exponent) instead of 512 ? From the SPC-4 FORMAT UNIT
  section:
  Why should the protection interval in FORMAT_UNIT be mentioned when it's
  not supported by the hardware, nor by drivers/scsi/sd_dif.c itself..?
  Hello Nick,
 
  My understanding is that this patch series is not only intended for
  initiator drivers but also for target drivers like ib_srpt and ib_isert.
  As you know target drivers do not restrict the initiator operating
  system to Linux. Although I do not know whether there are already
  operating systems that support the protection interval exponent,
  It's my understanding that Linux is still the only stack that supports
  DIF, so AFAICT no one is actually supporting this.
 
I think it is a good idea to stay as close as possible to the terminology
  of the SPC-4 standard.
 
  No, in this context it only adds pointless misdirection because 1) The
  hardware in question doesn't support it, and 2) Linux itself doesn't
  support it.
 
 I think that Bart is suggesting renaming block_size as pi_interval in 
 ib_sig_domain.
 I tend to agree since even if support for that does not exist yet, it 
 might be in the future.
 I think it is not a misdirection because it does represent the 
 protection information interval.
 

The point is that changing the description from what the patch actually
does, to something it does not do in order to 'stay as close as possible
to the terminology of the SPC-4 standard' is pointlessly confusing.

--nab



Re: [patch v2] mlx5_core: delete some dead code

2013-11-05 Thread David Miller
From: Dan Carpenter dan.carpen...@oracle.com
Date: Tue, 5 Nov 2013 01:20:56 +0300

 The printk() looks like it is left over debug code.  I have removed it.
 
 Signed-off-by: Dan Carpenter dan.carpen...@oracle.com
 ---
 v2:  Remove the printk instead of moving it in front of the return.

This doesn't apply to the current tree, I suspect some recent changes made
this fix no longer applicable.  Please take a look.

Thanks!


Re: IPoIB GRO

2013-11-05 Thread Or Gerlitz

On 05/11/2013 20:08, Markus Stockhausen wrote:

Incredible how a card that does not support TSO can bring big packets
on the wire that somehow get reassembled on the client side

Not sure I follow: you have shown they are **not** reassembled, correct?


AW: IPoIB GRO

2013-11-05 Thread Markus Stockhausen
 From: Or Gerlitz [ogerl...@mellanox.com]
 Sent: Wednesday, 6 November 2013 08:50
 To: Markus Stockhausen; Jason Gunthorpe; Erez Shitrit
 Cc: linux-rdma@vger.kernel.org; Wendy Cheng
 Subject: Re:  IPoIB GRO
 
 On 05/11/2013 20:08, Markus Stockhausen wrote:
  Incredible how a card that does not support TSO can bring big packets
  on the wire that somehow get reassembled on the client side
 Not sure I follow: you have shown they are **not** reassembled, correct?

Sorry for being imprecise. I meant that activating TSO
on that particular card seems to do nothing more than
create fragments. They are reassembled, but not in the
GRO path. From my stupid point of view, that could have
resulted in many more problems than GRO not working
correctly.

Markus

