Re: wake_wide mechanism clarification

2017-08-02 Thread Michael Wang
Hi, Joel

On 07/29/2017 10:13 AM, Joel Fernandes wrote:
> +Michael Wang on his current email address (old one bounced). (my
> reply was to Mike Galbraith but I also meant to CC Michael Wang for
> the discussion). Thanks

Just back from vacation and saw this long long discussion...

I think guys explained well on the idea, wake_wide() just try to filter
out the cases that may congest the waker's domain, as much as possible
without side effect (ideally).

The factor at very beginning is just a static number which picked by
enormous times of practice testing, Peter Zijlstr suggest we add the
domain size and make it a flexible factor, which by practice not that bad.

So the simple answer is we use the llc_size since no better option at
that time :-)

But things changing very fast and new feature can introduce new cases,
I'm pretty sure if we redo the testing the results will be very different,
however, the idea itself still make sense to me, at least on theory.

Recently I'm also thinking about the scheduler issue, cfs try to find out
general solution for all these cases and the best answer is obviously, all
the cases will suffer some damage and scheduler itself bloated to achieve
the goal 'taking care of all'.

So in order to achieve the maximum performance of particular workload, some
user defined scheduler would be an interesting idea :-P

Regards,
Michael Wang



> 
> On Sat, Jul 29, 2017 at 1:01 AM, Joel Fernandes  wrote:
>> Hi Mike,
>>
>> I have take spent some time understanding the email thread and
>> previous discussions. Unfortunately the second condition we are
>> checking for in the wake_wide still didn't make sense to me (mentioned
>> below) :-(
>>
>> On Fri, Jun 30, 2017 at 10:02 AM, Mike Galbraith
>>  wrote:
>>> On Fri, 2017-06-30 at 10:28 -0400, Josef Bacik wrote:
>>>> On Thu, Jun 29, 2017 at 08:04:59PM -0700, Joel Fernandes wrote:
>>>>
>>>>> That makes sense that we multiply slave's flips by a factor because
>>>>> its low, but I still didn't get why the factor is chosen to be
>>>>> llc_size instead of something else for the multiplication with slave
>>>>> (slave * factor).
>>>
>>>> Yeah I don't know why llc_size was chosen...
>>>
>>> static void update_top_cache_domain(int cpu)
>>> {
>>> struct sched_domain_shared *sds = NULL;
>>> struct sched_domain *sd;
>>> int id = cpu;
>>> int size = 1;
>>>
>>> sd = highest_flag_domain(cpu, SD_SHARE_PKG_RESOURCES);
>>> if (sd) {
>>> id = cpumask_first(sched_domain_span(sd));
>>> size = cpumask_weight(sched_domain_span(sd));
>>> sds = sd->shared;
>>> }
>>>
>>> rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
>>> per_cpu(sd_llc_size, cpu) = size;
>>>
>>> The goal of wake wide was to approximate when pulling would be a futile
>>> consolidation effort and counterproductive to scaling.  'course with
>>> ever increasing socket size, any 1:N waker is ever more likely to run
>>> out of CPU for its one and only self (slamming into scaling wall)
>>> before it needing to turn its minions loose to conquer the world.
>>
>> Actually the original question was why do we have the second condition
>> as "master < slave * factor", instead of "master < factor". that's
>> what didn't make sense to me. Why don't we return 0 from wake_wide if
>> master < factor ?
>>
>> Infact, as the factor is set to the llc_size, I think the condition
>> that makes sense to me is:
>>
>> if ((master + slave) < llc_size)
>>   return 0;
>>
>> In other words, if the master flips and the slave flips are totally
>> higher than the llc_size, then we are most likely waking up too many
>> tasks as affine and should then switch to wide to prevent overloading.
>>
>> Digging further into the original patch from Michael Wang (I also CC'd
>> him), this was the code (before you had changed it to master/slave):
>>
>> wakee->nr_wakee_switch > factor &&
>> waker->nr_wakee_switch > (factor * wakee->nr_wakee_switch)
>>
>> To explain the second condition above, Michael Wang said the following in [1]
>>
>> "Furthermore, if waker also has a high 'nr_wakee_switch', imply that multiple
>> tasks rely on it, then waker's higher latency will damage all of them, pull
>> wakee seems to be a bad deal."
>>
>&

[ Linux 4.4 stable ] missing 'printk: set may_schedule for some of console_trylock() callers'

2017-07-25 Thread Michael Wang
Hi, greg k-h

During our testing with 4.4.73 we got soft lockup like:

 NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856]
 ...
 Call Trace:
  [] vprintk_emit+0x319/0x4a0
  [] printk_emit+0x33/0x3b
  [] ? simple_strtoull+0x2c/0x50 
  [] devkmsg_write+0xaa/0x100
  [] ? vprintk+0x30/0x30
  [] do_readv_writev+0x1c2/0x270 
  [] ? kmem_cache_free+0x7d/0x1a0
  [] vfs_writev+0x39/0x50
  [] SyS_writev+0x4a/0xd0
  [] entry_SYSCALL_64_fastpath+0x12/0x6a

Currently in 4.4 the console_unlock() called by vprintk_emit() is with
preemption disabled, so the cond_resched is not working, and soft lockup
appear if it take too much time on writing data into every console.

We found the upstream patch:
  commit 6b97a20d3a79 printk: set may_schedule for some of console_trylock() 
callers

which should have addressed this issue, but not included in the latest 4.4.78 
stable
yet, is there any plan on backport it in future?

Regards,
Michael Wang


Re: [PATCH] md: return 0 instead of error in rdev_attr_show()

2017-04-11 Thread Michael Wang
We found the upstream fix, sorry for the noise...

Regards,
Michael Wang

On 04/11/2017 12:14 PM, Michael Wang wrote:
> 
> sysfs_kf_read() expect the show() callback return the dumped
> length, while rdev_attr_show() can return the error which lead
> into overflow:
> 
>  BUG: unable to handle kernel paging request at 88040b084000
>  IP: [] __memmove+0x24/0x1a0
>  PGD 1edb067 PUD 1ede067 PMD 406b9a063 PTE 80040b084161
>  Oops: 0003 [#1] SMP 
>  [snip]
>  Call Trace:
>   [] ? sysfs_kf_read+0x80/0xb0
>   [] kernfs_fop_read+0xab/0x160
>   [] __vfs_read+0x28/0xd0
>   [] vfs_read+0x86/0x130
>   [] SyS_read+0x46/0xa0
>   [] entry_SYSCALL_64_fastpath+0x12/0x6a
> 
> Simply return 0 in case of error solved the problem.
> 
> Signed-off-by: Michael Wang 
> ---
>  drivers/md/md.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 1db88d7..d46d714 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -3271,10 +3271,9 @@ rdev_attr_show(struct kobject *kobj, struct attribute 
> *attr, char *page)
>   struct rdev_sysfs_entry *entry = container_of(attr, struct 
> rdev_sysfs_entry, attr);
>   struct md_rdev *rdev = container_of(kobj, struct md_rdev, kobj);
>  
> - if (!entry->show)
> - return -EIO;
> - if (!rdev->mddev)
> - return -EBUSY;
> + if (!entry->show || !rdev->mddev)
> + return 0;
> +
>   return entry->show(rdev, page);
>  }
>  
> 


[PATCH] md: return 0 instead of error in rdev_attr_show()

2017-04-11 Thread Michael Wang

sysfs_kf_read() expect the show() callback return the dumped
length, while rdev_attr_show() can return the error which lead
into overflow:

 BUG: unable to handle kernel paging request at 88040b084000
 IP: [] __memmove+0x24/0x1a0
 PGD 1edb067 PUD 1ede067 PMD 406b9a063 PTE 80040b084161
 Oops: 0003 [#1] SMP 
 [snip]
 Call Trace:
  [] ? sysfs_kf_read+0x80/0xb0
  [] kernfs_fop_read+0xab/0x160
  [] __vfs_read+0x28/0xd0
  [] vfs_read+0x86/0x130
  [] SyS_read+0x46/0xa0
  [] entry_SYSCALL_64_fastpath+0x12/0x6a

Simply return 0 in case of error solved the problem.

Signed-off-by: Michael Wang 
---
 drivers/md/md.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1db88d7..d46d714 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3271,10 +3271,9 @@ rdev_attr_show(struct kobject *kobj, struct attribute 
*attr, char *page)
struct rdev_sysfs_entry *entry = container_of(attr, struct 
rdev_sysfs_entry, attr);
struct md_rdev *rdev = container_of(kobj, struct md_rdev, kobj);
 
-   if (!entry->show)
-   return -EIO;
-   if (!rdev->mddev)
-   return -EBUSY;
+   if (!entry->show || !rdev->mddev)
+   return 0;
+
return entry->show(rdev, page);
 }
 
-- 
2.5.0


Re: [RFC PATCH] raid1: reset 'bi_next' before reuse the bio

2017-04-05 Thread Michael Wang


On 04/05/2017 12:17 AM, NeilBrown wrote:
[snip]
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index 7d67235..0554110 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -1986,11 +1986,13 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
>>  /* Don't try recovering from here - just fail it
>>   * ... unless it is the last working device of course */
>>  md_error(mddev, rdev);
>> -if (test_bit(Faulty, &rdev->flags))
>> +if (test_bit(Faulty, &rdev->flags)) {
>>  /* Don't try to read from here, but make sure
>>   * put_buf does it's thing
>>   */
>>  bio->bi_end_io = end_sync_write;
>> +bio->bi_next = NULL;
>> +}
>>  }
>>  
>>  while(sectors) {
> 
> 
> Ah - I see what is happening now.  I was looking at the vanilla 4.4
> code, which doesn't have the failfast changes.

My bad to forgot mention... yes our md stuff is very much close to the
upstream.

> 
> I don't think your patch is correct though.  We really shouldn't be
> re-using that bio, and setting bi_next to NULL just hides the bug.  It
> doesn't fix it.
> As the rdev is now Faulty, it doesn't make sense for
> sync_request_write() to submit a write request to it.

Make sense, while still have concerns regarding the design:
  * in this case since the read_disk already abandoned, is it fine to
keep r1_bio->read_disk recording the faulty device index?
  * we assign the 'end_sync_write' to the original read bio in this
case, but when is this supposed to be called?

> 
> Can you confirm that this works please.

Yes, it works.

Tested-by: Michael Wang 

Regards,
Michael Wang

> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index d2d8b8a5bd56..219f1e1f1d1d 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2180,6 +2180,8 @@ static void sync_request_write(struct mddev *mddev, 
> struct r1bio *r1_bio)
>(i == r1_bio->read_disk ||
> !test_bit(MD_RECOVERY_SYNC, &mddev->recovery
>   continue;
> + if (test_bit(Faulty, &conf->mirrors[i].rdev->flags))
> + continue;
>  
>   bio_set_op_attrs(wbio, REQ_OP_WRITE, 0);
>   if (test_bit(FailFast, &conf->mirrors[i].rdev->flags))
> 
> 
> Thanks,
> NeilBrown
> 


[RFC PATCH] raid1: reset 'bi_next' before reuse the bio

2017-04-04 Thread Michael Wang

During the testing we found the sync read bio can go through
path:

  md_do_sync()
sync_request()
  generic_make_request()
blk_queue_bio()
  blk_attempt_plug_merge()
bio->bi_next CHAINED HERE

  ...

  raid1d()
sync_request_write()
  fix_sync_read_error()
if FailFast && Faulty
  bio->bi_end_io = end_sync_write
  generic_make_request()
BUG_ON(bio->bi_next)

This need to meet the conditions:
  * bio once merged
  * read disk have FailFast enabled
  * read disk is Faulty

And since the block layer won't reset the 'bi_next' after bio
is done inside request, we hit the BUG like that.

This patch simply reset the bi_next before we reuse it.

Signed-off-by: Michael Wang 
---
 drivers/md/raid1.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7d67235..0554110 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1986,11 +1986,13 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
/* Don't try recovering from here - just fail it
 * ... unless it is the last working device of course */
md_error(mddev, rdev);
-   if (test_bit(Faulty, &rdev->flags))
+   if (test_bit(Faulty, &rdev->flags)) {
/* Don't try to read from here, but make sure
 * put_buf does it's thing
 */
bio->bi_end_io = end_sync_write;
+   bio->bi_next = NULL;
+   }
}
 
while(sectors) {
-- 
2.5.0


Re: [RFC PATCH] blk: reset 'bi_next' when bio is done inside request

2017-04-04 Thread Michael Wang


On 04/04/2017 02:24 PM, Michael Wang wrote:
> On 04/04/2017 12:23 PM, Michael Wang wrote:
> [snip]
>>> add something like
>>>   if (wbio->bi_next)
>>>  printk("bi_next!= NULL i=%d read_disk=%d bi_end_io=%pf\n",
>>>   i, r1_bio->read_disk, wbio->bi_end_io);
>>>
>>> that might help narrow down what is happening.
>>
>> Just triggered again in 4.4, dmesg like:
>>
>> [  399.240230] md: super_written gets error=-5
>> [  399.240286] md: super_written gets error=-5
>> [  399.240286] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240300] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240312] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240323] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240334] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240341] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240349] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
>> 204160
>> [  399.240352] bi_next!= NULL i=0 read_disk=0 bi_end_io=end_sync_write 
>> [raid1]
> 
> Is it possible that the fail fast who changed the 'bi_end_io' inside
> fix_sync_read_error() help the used bio pass the check?

Hi, NeilBrown, below patch fixed the issue in our testing, I'll post a md
RFC patch so we can continue the discussion there.

Regards,
Michael Wang

> 
> I'm not sure but if the read bio was supposed to be reused as write
> for fail fast, maybe we should reset it like this?
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 7d67235..0554110 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1986,11 +1986,13 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
> /* Don't try recovering from here - just fail it
>  * ... unless it is the last working device of course */
> md_error(mddev, rdev);
> -   if (test_bit(Faulty, &rdev->flags))
> +   if (test_bit(Faulty, &rdev->flags)) {
> /* Don't try to read from here, but make sure
>  * put_buf does it's thing
>  */
> bio->bi_end_io = end_sync_write;
> +   bio->bi_next = NULL;
> +   }
> }
>  
> while(sectors) {
> 
> Regards,
> Michael Wang
> 
> 
>> [  399.240363] [ cut here ]
>> [  399.240364] kernel BUG at block/blk-core.c:2147!
>> [  399.240365] invalid opcode:  [#1] SMP 
>> [  399.240378] Modules linked in: ib_srp scsi_transport_srp raid1 md_mod 
>> ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core vxlan ip6_udp_tunnel 
>> udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr ib_netlink iTCO_wdt 
>> iTCO_vendor_support dcdbas dell_smm_hwmon acpi_cpufreq x86_pkg_temp_thermal 
>> tpm_tis coretemp evdev tpm i2c_i801 crct10dif_pclmul serio_raw crc32_pclmul 
>> battery processor acpi_pad button kvm_intel kvm dm_round_robin irqbypass 
>> dm_multipath autofs4 sg sd_mod crc32c_intel ahci libahci psmouse libata 
>> mlx4_core scsi_mod xhci_pci xhci_hcd mlx_compat fan thermal [last unloaded: 
>> scsi_transport_srp]
>> [  399.240380] CPU: 1 PID: 2052 Comm: md0_raid1 Not tainted 
>> 4.4.50-1-pserver+ #26
>> [  399.240381] Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 
>> 1.3.6 05/26/2016
>> [  399.240381] task: 8804031b6200 ti: 8800d72b4000 task.ti: 
>> 8800d72b4000
>> [  399.240385] RIP: 0010:[]  [] 
>> generic_make_request+0x29e/0x2a0
>> [  399.240385] RSP: 0018:8800d72b7d10  EFLAGS: 00010286
>> [  399.240386] RAX: 8804031b6200 RBX: 8800d2577e00 RCX: 
>> 3fff
>> [  399.240387] RDX: c001 RSI: 0001 RDI: 
>> 8800d5e8c1e0
>> [  399.240387] RBP: 8800d72b7d50 R08:  R09: 
>> 003f
>> [  399.240388] R10: 0004 R11: 001db9ac R12: 
>> 
>> [  399.240388] R13: 8800d2748e00 R14: 88040a016400 R15: 
>> 8800d2748e40
>> [  399.240389] FS:  () GS:88041dc4() 
>> knlGS:
>> [  399.240390] CS:  0010 DS:  ES:  CR0: 80050033
>> [  399.240390] CR2: 7fb49246a000 CR3: 00040215c000 CR4: 
>> 003406e0
>> [  399.240391] DR0:  DR1: 000

Re: [RFC PATCH] blk: reset 'bi_next' when bio is done inside request

2017-04-04 Thread Michael Wang
On 04/04/2017 12:23 PM, Michael Wang wrote:
[snip]
>> add something like
>>   if (wbio->bi_next)
>>  printk("bi_next!= NULL i=%d read_disk=%d bi_end_io=%pf\n",
>>   i, r1_bio->read_disk, wbio->bi_end_io);
>>
>> that might help narrow down what is happening.
> 
> Just triggered again in 4.4, dmesg like:
> 
> [  399.240230] md: super_written gets error=-5
> [  399.240286] md: super_written gets error=-5
> [  399.240286] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240300] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240312] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240323] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240334] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240341] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240349] md/raid1:md0: dm-0: unrecoverable I/O read error for block 
> 204160
> [  399.240352] bi_next!= NULL i=0 read_disk=0 bi_end_io=end_sync_write [raid1]

Is it possible that the fail fast who changed the 'bi_end_io' inside
fix_sync_read_error() help the used bio pass the check?

I'm not sure but if the read bio was supposed to be reused as write
for fail fast, maybe we should reset it like this?

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7d67235..0554110 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1986,11 +1986,13 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
/* Don't try recovering from here - just fail it
 * ... unless it is the last working device of course */
md_error(mddev, rdev);
-   if (test_bit(Faulty, &rdev->flags))
+   if (test_bit(Faulty, &rdev->flags)) {
/* Don't try to read from here, but make sure
 * put_buf does it's thing
 */
    bio->bi_end_io = end_sync_write;
+   bio->bi_next = NULL;
+   }
}
 
while(sectors) {

Regards,
Michael Wang


> [  399.240363] [ cut here ]
> [  399.240364] kernel BUG at block/blk-core.c:2147!
> [  399.240365] invalid opcode:  [#1] SMP 
> [  399.240378] Modules linked in: ib_srp scsi_transport_srp raid1 md_mod 
> ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core vxlan ip6_udp_tunnel 
> udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr ib_netlink iTCO_wdt 
> iTCO_vendor_support dcdbas dell_smm_hwmon acpi_cpufreq x86_pkg_temp_thermal 
> tpm_tis coretemp evdev tpm i2c_i801 crct10dif_pclmul serio_raw crc32_pclmul 
> battery processor acpi_pad button kvm_intel kvm dm_round_robin irqbypass 
> dm_multipath autofs4 sg sd_mod crc32c_intel ahci libahci psmouse libata 
> mlx4_core scsi_mod xhci_pci xhci_hcd mlx_compat fan thermal [last unloaded: 
> scsi_transport_srp]
> [  399.240380] CPU: 1 PID: 2052 Comm: md0_raid1 Not tainted 4.4.50-1-pserver+ 
> #26
> [  399.240381] Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 
> 1.3.6 05/26/2016
> [  399.240381] task: 8804031b6200 ti: 8800d72b4000 task.ti: 
> 8800d72b4000
> [  399.240385] RIP: 0010:[]  [] 
> generic_make_request+0x29e/0x2a0
> [  399.240385] RSP: 0018:8800d72b7d10  EFLAGS: 00010286
> [  399.240386] RAX: 8804031b6200 RBX: 8800d2577e00 RCX: 
> 3fff
> [  399.240387] RDX: c001 RSI: 0001 RDI: 
> 8800d5e8c1e0
> [  399.240387] RBP: 8800d72b7d50 R08:  R09: 
> 003f
> [  399.240388] R10: 0004 R11: 001db9ac R12: 
> 
> [  399.240388] R13: 8800d2748e00 R14: 88040a016400 R15: 
> 8800d2748e40
> [  399.240389] FS:  () GS:88041dc4() 
> knlGS:
> [  399.240390] CS:  0010 DS:  ES:  CR0: 80050033
> [  399.240390] CR2: 7fb49246a000 CR3: 00040215c000 CR4: 
> 003406e0
> [  399.240391] DR0:  DR1:  DR2: 
> 
> [  399.240391] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [  399.240392] Stack:
> [  399.240393]  8800d72b7d18 8800d72b7d30  
> 
> [  399.240394]  a079c290 8800d2577e00  
> 8800d2748e00
> [  399.240395]  8800d72b7e58 a079e74c 88040b661c00 
> 8800d2577e00
> [  399.240396] Call Trace:
> [  399.240398]  [] ? sync_request+0xb20/0xb20 [raid1]
> [  399.240400]  [] raid1d+0x65c/0x1060 [raid1]
> [  399.240403]  [] ? 
&g

Re: [RFC PATCH] blk: reset 'bi_next' when bio is done inside request

2017-04-04 Thread Michael Wang

On 04/04/2017 11:37 AM, NeilBrown wrote:
> On Tue, Apr 04 2017, Michael Wang wrote:
[snip]
>>>
>>> If sync_request_write() is using a bio that has already been used, it
>>> should call bio_reset() and fill in the details again.
>>> However I don't see how that would happen.
>>> Can you give specific details on the situation that triggers the bug?
>>
>> We have storage side mapping lv through scst to server, on server side
>> we assemble them into multipath device, and then assemble these dm into
>> two raid1.
>>
>> The test is firstly do mkfs.ext4 on raid1 then start fio on it, on storage
>> side we unmap all the lv (could during mkfs or fio), then on server side
>> we hit the BUG (reproducible).
> 
> So I assume the initial resync is still happening at this point?
> And you unmap *all* the lv's so you expect IO to fail?
> I can see that the code would behave strangely if you have a
> bad-block-list configured (which is the default).
> Do you have a bbl?  If you create the array without the bbl, does it
> still crash?

The resync is at least happen concurrently in this case, we try
to simulate the case that all the connections dropped, the IO do
failed, also bunch of kernel log like:

md: super_written gets error=-5
blk_update_request: I/O error, dev dm-3, sector 64184
md/raid1:md1: dm-2: unrecoverable I/O read error for block 46848

we expect that to happen, but server should not crash on BUG.

And we haven't done any thing special regarding bbl, the bad_blocks
in sysfs are all empty.

> 
>>
>> The path of bio was confirmed by add tracing, it is reused in 
>> sync_request_write()
>> with 'bi_next' once chained inside blk_attempt_plug_merge().
> 
> I still don't see why it is re-used.
> I assume you didn't explicitly ask for a check/repair (i.e. didn't write
> to .../md/sync_action at all?).  In that case MD_RECOVERY_REQUESTED is
> not set.

Just unmap lv on storage side, no operation on server side.

> So sync_request() sends only one bio to generic_make_request():
>r1_bio->bios[r1_bio->read_disk];
> 
> then sync_request_write() *doesn't* send that bio again, but does send
> all the others.
> 
> So where does it reuse a bio?

If that's the design then it would be strange... the log do showing the path
of that bio go through sync_request(), will do more investigation.

> 
>>
>> We also tried to reset the bi_next inside sync_request_write() before
>> generic_make_request() which also works.
>>
>> The testing was done with 4.4, but we found upstream also left bi_next
>> chained after done in request, thus we post this RFC.
>>
>> Regarding raid1, we haven't found the place on path where the bio was
>> reset... where does it supposed to be?
> 
> I'm not sure what you mean.
> We only reset bios when they are being reused.
> One place is in process_checks() where bio_reset() is called before
> filling in all the details.
> 
> 
> Maybe, in sync_request_write(), before
> 
>   wbio->bi_rw = WRITE;
> 
> add something like
>   if (wbio->bi_next)
>  printk("bi_next!= NULL i=%d read_disk=%d bi_end_io=%pf\n",
>   i, r1_bio->read_disk, wbio->bi_end_io);
> 
> that might help narrow down what is happening.

Just triggered again in 4.4, dmesg like:

[  399.240230] md: super_written gets error=-5
[  399.240286] md: super_written gets error=-5
[  399.240286] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240300] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240312] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240323] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240334] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240341] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240349] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
[  399.240352] bi_next!= NULL i=0 read_disk=0 bi_end_io=end_sync_write [raid1]
[  399.240363] [ cut here ]
[  399.240364] kernel BUG at block/blk-core.c:2147!
[  399.240365] invalid opcode:  [#1] SMP 
[  399.240378] Modules linked in: ib_srp scsi_transport_srp raid1 md_mod 
ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core vxlan ip6_udp_tunnel 
udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr ib_netlink iTCO_wdt 
iTCO_vendor_support dcdbas dell_smm_hwmon acpi_cpufreq x86_pkg_temp_thermal 
tpm_tis coretemp evdev tpm i2c_i801 crct10dif_pclmul serio_raw crc32_pclmul 
battery processor acpi_pad button kvm_intel kvm dm_round_robin irqbypass 
dm_multipath autofs4 sg sd_mod crc32c_intel ahci libahci psmouse libata 

Re: [RFC PATCH] blk: reset 'bi_next' when bio is done inside request

2017-04-04 Thread Michael Wang
Hi, Neil

On 04/03/2017 11:25 PM, NeilBrown wrote:
> On Mon, Apr 03 2017, Michael Wang wrote:
> 
>> blk_attempt_plug_merge() try to merge bio into request and chain them
>> by 'bi_next', while after the bio is done inside request, we forgot to
>> reset the 'bi_next'.
>>
>> This lead into BUG while removing all the underlying devices from md-raid1,
>> the bio once go through:
>>
>>   md_do_sync()
>> sync_request()
>>   generic_make_request()
> 
> This is a read request from the "first" device.
> 
>> blk_queue_bio()
>>   blk_attempt_plug_merge()
>> CHAINED HERE
>>
>> will keep chained and reused by:
>>
>>   raid1d()
>> sync_request_write()
>>   generic_make_request()
> 
> This is a write request to some other device, isn't it?
> 
> If sync_request_write() is using a bio that has already been used, it
> should call bio_reset() and fill in the details again.
> However I don't see how that would happen.
> Can you give specific details on the situation that triggers the bug?

We have storage side mapping lv through scst to server, on server side
we assemble them into multipath device, and then assemble these dm into
two raid1.

The test is firstly do mkfs.ext4 on raid1 then start fio on it, on storage
side we unmap all the lv (could during mkfs or fio), then on server side
we hit the BUG (reproducible).

The path of bio was confirmed by add tracing, it is reused in 
sync_request_write()
with 'bi_next' once chained inside blk_attempt_plug_merge().

We also tried to reset the bi_next inside sync_request_write() before
generic_make_request() which also works.

The testing was done with 4.4, but we found upstream also left bi_next
chained after done in request, thus we post this RFC.

Regarding raid1, we haven't found the place on path where the bio was
reset... where does it supposed to be?

BTW the fix_sync_read_error() also invoked and succeed before trigger
the BUG.

Regards,
Michael Wang

> 
> Thanks,
> NeilBrown
> 
> 
>> BUG_ON(bio->bi_next)
>>
>> After reset the 'bi_next' this can no longer happen.
>>
>> Signed-off-by: Michael Wang 
>> ---
>>  block/blk-core.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 43b7d06..91223b2 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -2619,8 +2619,10 @@ bool blk_update_request(struct request *req, int 
>> error, unsigned int nr_bytes)
>> struct bio *bio = req->bio;
>> unsigned bio_bytes = min(bio->bi_iter.bi_size, nr_bytes);
>>
>> -   if (bio_bytes == bio->bi_iter.bi_size)
>> +   if (bio_bytes == bio->bi_iter.bi_size) {
>> req->bio = bio->bi_next;
>> +   bio->bi_next = NULL;
>> +   }
>>
>> req_bio_endio(req, bio, bio_bytes, error);
>>
>> -- 
>> 2.5.0


[RFC PATCH] blk: reset 'bi_next' when bio is done inside request

2017-04-03 Thread Michael Wang

blk_attempt_plug_merge() try to merge bio into request and chain them
by 'bi_next', while after the bio is done inside request, we forgot to
reset the 'bi_next'.

This lead into BUG while removing all the underlying devices from md-raid1,
the bio once go through:

  md_do_sync()
sync_request()
  generic_make_request()
blk_queue_bio()
  blk_attempt_plug_merge()
CHAINED HERE

will keep chained and reused by:

  raid1d()
sync_request_write()
  generic_make_request()
BUG_ON(bio->bi_next)

After reset the 'bi_next' this can no longer happen.

Signed-off-by: Michael Wang 
---
 block/blk-core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 43b7d06..91223b2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2619,8 +2619,10 @@ bool blk_update_request(struct request *req, int error, 
unsigned int nr_bytes)
struct bio *bio = req->bio;
unsigned bio_bytes = min(bio->bi_iter.bi_size, nr_bytes);

-   if (bio_bytes == bio->bi_iter.bi_size)
+   if (bio_bytes == bio->bi_iter.bi_size) {
req->bio = bio->bi_next;
+   bio->bi_next = NULL;
+   }

req_bio_endio(req, bio, bio_bytes, error);

-- 
2.5.0


Re: [PATCH v2 1/1] block: fix blk_queue_split() resource exhaustion

2016-12-23 Thread Michael Wang
Dear Maintainers

I'd like to ask for the status of this patch since we hit the
issue too during our testing on md raid1.

Split remainder bio_A was queued ahead, following by bio_B for
lower device, at this moment raid start freezing, the loop take
out bio_A firstly and deliver it, which will hung since raid is
freezing, while the freezing never end since it waiting for
bio_B to finish, and bio_B is still on the queue, waiting for
bio_A to finish...

We're looking for a good solution and we found this patch
already progressed a lot, but we can't find it on linux-next,
so we'd like to ask are we still planning to have this fix
in upstream?

Regards,
Michael Wang


On 07/11/2016 04:10 PM, Lars Ellenberg wrote:
> For a long time, generic_make_request() converts recursion into
> iteration by queuing recursive arguments on current->bio_list.
> 
> This is convenient for stacking drivers,
> the top-most driver would take the originally submitted bio,
> and re-submit a re-mapped version of it, or one or more clones,
> or one or more new allocated bios to its backend(s). Which
> are then simply processed in turn, and each can again queue
> more "backend-bios" until we reach the bottom of the driver stack,
> and actually dispatch to the real backend device.
> 
> Any stacking driver ->make_request_fn() could expect that,
> once it returns, any backend-bios it submitted via recursive calls
> to generic_make_request() would now be processed and dispatched, before
> the current task would call into this driver again.
> 
> This is changed by commit
>   54efd50 block: make generic_make_request handle arbitrarily sized bios
> 
> Drivers may call blk_queue_split() inside their ->make_request_fn(),
> which may split the current bio into a front-part to be dealt with
> immediately, and a remainder-part, which may need to be split even
> further. That remainder-part will simply also be pushed to
> current->bio_list, and would end up being head-of-queue, in front
> of any backend-bios the current make_request_fn() might submit during
> processing of the fron-part.
> 
> Which means the current task would immediately end up back in the same
> make_request_fn() of the same driver again, before any of its backend
> bios have even been processed.
> 
> This can lead to resource starvation deadlock.
> Drivers could avoid this by learning to not need blk_queue_split(),
> or by submitting their backend bios in a different context (dedicated
> kernel thread, work_queue context, ...). Or by playing funny re-ordering
> games with entries on current->bio_list.
> 
> Instead, I suggest to distinguish between recursive calls to
> generic_make_request(), and pushing back the remainder part in
> blk_queue_split(), by pointing current->bio_lists to a
>   struct recursion_to_iteration_bio_lists {
>   struct bio_list recursion;
>   struct bio_list queue;
>   }
> 
> By providing each q->make_request_fn() with an empty "recursion"
> bio_list, then merging any recursively submitted bios to the
> head of the "queue" list, we can make the recursion-to-iteration
> logic in generic_make_request() process deepest level bios first,
> and "sibling" bios of the same level in "natural" order.
> 
> Signed-off-by: Lars Ellenberg 
> Signed-off-by: Roland Kammerer 
> ---
>  block/bio.c   | 20 +++
>  block/blk-core.c  | 49 
> +--
>  block/blk-merge.c |  5 -
>  drivers/md/bcache/btree.c | 12 ++--
>  drivers/md/dm-bufio.c |  2 +-
>  drivers/md/raid1.c|  5 ++---
>  drivers/md/raid10.c   |  5 ++---
>  include/linux/bio.h   | 25 
>  include/linux/sched.h |  4 ++--
>  9 files changed, 80 insertions(+), 47 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 848cd35..c2606fd 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -366,12 +366,16 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
>*/
>  
>   bio_list_init(&punt);
> - bio_list_init(&nopunt);
>  
> - while ((bio = bio_list_pop(current->bio_list)))
> + bio_list_init(&nopunt);
> + while ((bio = bio_list_pop(¤t->bio_lists->recursion)))
>   bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
> + current->bio_lists->recursion = nopunt;
>  
> - *current->bio_list = nopunt;
> + bio_list_init(&nopunt);
> + while ((bio = bio_list_pop(¤t->bio_lists->queue)))
> + bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
> + current->bio_lists-

[BUG] block: bdi_register_owner() failure cause NULL pointer dereference

2016-09-29 Thread Michael Wang
Hi, Folks

We observed the hard lockup while trying raid assemble with sas3ircu,
it was start with the failure inside bdi_register_owner() with duplicated
kobj path, and later comeup the NULL pointer dereference, after that system
hang and we saw hard lockup on screen.

The duplicated issue could be with the scsi controller driver and we are
going to upgrade it anyway, but my question is why we don't do some error
handling like:

diff --git a/block/genhd.c b/block/genhd.c
index a178c8e..318bc63 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -614,7 +614,15 @@ void device_add_disk(struct device *parent, struct gendisk 
*disk)
 
/* Register BDI before referencing it from bdev */
bdi = &disk->queue->backing_dev_info;
-   bdi_register_owner(bdi, disk_to_dev(disk));
+   if (bdi_register_owner(bdi, disk_to_dev(disk))) {
+   disk_release_events(disk);
+   blk_free_devt(devt);
+   disk->ev = NULL;
+   disk->first_minor = 0;
+   disk->major = 0;
+   WARN_ON(1);
+   return;
+   }
 
blk_register_region(disk_devt(disk), disk->minors, NULL,
exact_match, exact_lock, disk);

to prevent the following NULL pointer dereference and hard lockup?

Regards,
Michael Wang
Sep 29 09:53:28 st401b-3 systemd[1]: Starting Update UTMP about System Runlevel 
Changes...
Sep 29 09:53:28 st401b-3 ntpd[4970]: Listen and drop on 1 v6wildcard :: UDP 123
Sep 29 09:53:28 st401b-3 ntpd[4970]: Listen normally on 2 lo 127.0.0.1 UDP 123
Sep 29 09:53:28 st401b-3 ntpd[4970]: Listen normally on 3 eth0 10.41.12.3 UDP 
123
Sep 29 09:53:28 st401b-3 ntpd[4970]: Listen normally on 4 lo ::1 UDP 123
Sep 29 09:53:28 st401b-3 ntpd[4970]: Listen normally on 5 eth0 
fe80::ec4:7aff:feab:6b0 UDP 123
Sep 29 09:53:28 st401b-3 ntpd[4970]: peers refreshed
Sep 29 09:53:28 st401b-3 ntpd[4970]: Listening on routing socket on fd #22 for 
interface updates
Sep 29 09:53:28 st401b-3 systemd[1]: Started Update UTMP about System Runlevel 
Changes.
Sep 29 09:53:28 st401b-3 systemd[1]: Startup finished in 18.720s (kernel) + 
39.513s (userspace) = 58.233s.
Sep 29 09:55:01 st401b-3 CRON[5433]: (root) CMD (command -v debian-sa1 > 
/dev/null && debian-sa1 1 1)
Sep 29 09:55:01 st401b-3 CRON[5434]: (root) CMD (test -x 
/opt/profitbricks/bin/check_memory && /opt/profitbricks/bin/check_memory)
Sep 29 09:55:17 st401b-3 kernel: [  167.693658] scsi 0:1:0:0: Direct-Access 
LSI  Logical Volume   3000 PQ: 0 ANSI: 6
Sep 29 09:55:17 st401b-3 kernel: [  167.693795] scsi 0:1:0:0: RAID1: 
handle(0x0143), wwid(0x02f7ec7949091b05), pd_count(2), type(SSP)
Sep 29 09:55:17 st401b-3 kernel: [  167.694042] sd 0:1:0:0: [sdam] 5859373056 
512-byte logical blocks: (3.00 TB/2.73 TiB)
Sep 29 09:55:17 st401b-3 kernel: [  167.694044] sd 0:1:0:0: [sdam] 4096-byte 
physical blocks
Sep 29 09:55:17 st401b-3 kernel: [  167.694057] sd 0:1:0:0: Attached scsi 
generic sg40 type 0
Sep 29 09:55:17 st401b-3 kernel: [  167.694129] sd 0:1:0:0: [sdam] Write 
Protect is off
Sep 29 09:55:17 st401b-3 kernel: [  167.694131] sd 0:1:0:0: [sdam] Mode Sense: 
03 00 00 08
Sep 29 09:55:17 st401b-3 kernel: [  167.694166] sd 0:1:0:0: [sdam] No Caching 
mode page found
Sep 29 09:55:17 st401b-3 kernel: [  167.694282] sd 0:0:4:0: hidding raid 
component
Sep 29 09:55:17 st401b-3 kernel: [  167.694589] sd 0:1:0:0: [sdam] Assuming 
drive cache: write through
Sep 29 09:55:17 st401b-3 kernel: [  167.703346] sd 0:1:0:0: [sdam] Attached 
SCSI disk
Sep 29 09:55:17 st401b-3 kernel: [  167.703653] sd 0:0:5:0: hidding raid 
component
Sep 29 09:55:39 st401b-3 check_backup_lvm_push: critical: local git command 
rev-parse HEAD failed, retval: 0, fatal: Not a git repository: 
'/var/lib/backup-lvm/.git/'
Sep 29 09:56:03 st401b-3 kernel: [  213.684812] scsi 0:1:1:0: Direct-Access 
LSI  Logical Volume   3000 PQ: 0 ANSI: 6
Sep 29 09:56:03 st401b-3 kernel: [  213.684946] scsi 0:1:1:0: RAID1: 
handle(0x0142), wwid(0x0ab3eca651cd1b58), pd_count(2), type(SSP)
Sep 29 09:56:03 st401b-3 kernel: [  213.685189] sd 0:1:1:0: [sde] 5859373056 
512-byte logical blocks: (3.00 TB/2.73 TiB)
Sep 29 09:56:03 st401b-3 kernel: [  213.685192] sd 0:1:1:0: [sde] 4096-byte 
physical blocks
Sep 29 09:56:03 st401b-3 kernel: [  213.685204] sd 0:1:1:0: Attached scsi 
generic sg41 type 0
Sep 29 09:56:03 st401b-3 kernel: [  213.685275] sd 0:1:1:0: [sde] Write Protect 
is off
Sep 29 09:56:03 st401b-3 kernel: [  213.685277] sd 0:1:1:0: [sde] Mode Sense: 
03 00 00 08
Sep 29 09:56:03 st401b-3 kernel: [  213.685307] sd 0:1:1:0: [sde] No Caching 
mode page found
Sep 29 09:56:03 st401b-3 kernel: [  213.685423] sd 0:0:6:0: hidding raid 
component
Sep 29 09:56:03 st401b-3 kernel: [  213.685698] sd 0:1:1:0: [sde] Assuming 
drive cache: write through
Sep 29 09:56:03 st401b-3 kernel: [  213.686226] [ cut here 
]
Sep 29 09:56:03 st401b-3 kernel: [  213

Re: [PATCH RESEND] infiniband:core:Add needed error path in cm_init_av_by_path

2015-12-17 Thread Michael Wang


On 12/16/2015 07:16 PM, Jason Gunthorpe wrote:
> On Wed, Dec 16, 2015 at 11:26:39AM +0100, Michael Wang wrote:
[snip]
>>
>> I've rechecked the ib_init_ah_from_path() again, and found it
>> still set IB_AH_GRH when the GID cache missing, but with:
> 
> How do you mean?
> 
> ah_attr->ah_flags = IB_AH_GRH;
> ah_attr->grh.dgid = rec->dgid;
> 
> ret = ib_find_cached_gid(device, &rec->sgid, ndev, &port_num,
>  &gid_index);
> if (ret) {
> if (ndev)
> dev_put(ndev);
> return ret;
> }
> 
> If find_cached_gid fails then ib_init_ah_from_path also fails.
> 
> Is there a case where ib_find_cached_gid can succeed but not return
> good data?

Just for the GRH header, ib_find_cached_gid() will failed but the flag
and dgid will be correct, and others are all 0 including hop limit, but
may be just coincidence...

As long as hop_limit > 1 means the pkg do have to pass through a router
to other subnet, the fix make sense :-)

Regards,
Michael Wang

> 
> I agree it would read nicer if the ah_flags and gr.dgid was moved
> after the ib_find_cached_gid
> 
>> BTW, cma_sidr_rep_handler() also call ib_init_ah_from_path() with out
>> a check on return.
> 
> That sounds like a problem.
> 
> Jason
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] infiniband:core:Add needed error path in cm_init_av_by_path

2015-12-16 Thread Michael Wang

On 12/15/2015 06:30 PM, Jason Gunthorpe wrote:
> On Tue, Dec 15, 2015 at 05:38:34PM +0100, Michael Wang wrote:
>> The hop_limit is only suggest that the package allowed to be
>> routed, not have to, correct?
> 
> If the hop limit is >= 2 (?) then the GRH is mandatory. The
> SM will return this information in the PathRecord if it determines a
> GRH is required. The whole stack follows this protocol.
> 
> The GRH is optional for in-subnet communications.

Thanks for the explain :-)

I've rechecked the ib_init_ah_from_path() again, and found it
still set IB_AH_GRH when the GID cache missing, but with:

  grh.sgid_index = 0
  grh.flow_label = 0
  grh.hop_limit  = 0
  grh.traffic_class = 0

Not sure if it's just coincidence, hop_limit is 0, so router
will discard the pkg and GRH won't be used, the transaction in
subnet still works.

Could this by designed as an optimization for the case like when
SM reassigning the GID?

BTW, cma_sidr_rep_handler() also call ib_init_ah_from_path() with out
a check on return.

Regards,
Michael Wang

> 
> Jason
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] infiniband:core:Add needed error path in cm_init_av_by_path

2015-12-15 Thread Michael Wang


On 12/15/2015 04:52 PM, Nicholas Krause wrote:
> This adds a needed error path in the function, cm_init_av_by_path
> after the call to ib_init_ah_from_path in order to avoid incorrectly
> accessing the path pointer of structure type ib_sa_path_rec if this
> function call fails to complete its intended work successfully by
> returning a error code.
> 
> Signed-off-by: Nicholas Krause 
> ---
>  drivers/infiniband/core/cm.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index 0a26dd6..e9b36ea 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -383,8 +383,11 @@ static int cm_init_av_by_path(struct ib_sa_path_rec 
> *path, struct cm_av *av)
>   return ret;
>  
>   av->port = port;
> - ib_init_ah_from_path(cm_dev->ib_device, port->port_num, path,
> -  &av->ah_attr);
> + ret = ib_init_ah_from_path(cm_dev->ib_device, port->port_num, path,
> +&av->ah_attr);

..Just wondering what if the transport don't require GRH?
eg inside the same subnet?

The hop_limit is only suggest that the package allowed to be
routed, not have to, correct?

Regards,
Michael Wang

> + if (ret)
> + return ret;
> +
>   av->timeout = path->packet_life_time + 1;
>  
>   return 0;
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] infiniband:core:Add needed error path in cm_init_av_by_path

2015-12-07 Thread Michael Wang


On 12/07/2015 08:57 AM, Haggai Eran wrote:
> On Friday, December 4, 2015 8:02 PM, Nicholas Krause  
> wrote:
>> To: dledf...@redhat.com
>> Cc: sean.he...@intel.com; hal.rosenst...@gmail.com; Haggai Eran; 
>> jguntho...@obsidianresearch.com; Matan Barak; yun.w...@profitbricks.com; 
>> ted.h@oracle.com; Doron Tsur; Erez Shitrit; david.ah...@oracle.com; 
>> linux-r...@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: [PATCH] infiniband:core:Add needed error path in cm_init_av_by_path
>>
>> This adds a needed error path in the function, cm_init_av_by_path
>> after the call to ib_init_ah_from_path in order to avoid incorrectly
>> accessing the path pointer of structure type ib_sa_path_rec if this
>> function call fails to complete its intended work successfully by
>> returning a error code.
>>
>> Signed-off-by: Nicholas Krause 
> 
> The subject doesn't seem to match the convention but apart from that,
> 
> Reviewed-by: Haggai Eran 
> 
> I wonder if this should go to stable. If I understand correctly, this will 
> fail only when the SGID isn't found in the GID table, but such connections 
> would fail later on when creating a QP, right?

Me too think this need a reconsider, to me the current logical don't
really care the missing gid in cache when initializing AV, I'm not
sure if it's necessary to fail all the following path for such cache
missing...

Regards,
Michael Wang

> 
> Haggai
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-03 Thread Michael Wang


On 12/02/2015 06:36 PM, Catalin Marinas wrote:
> On 2 December 2015 at 13:59, Borislav Petkov  wrote:
[snip]
> 
> 1. The sl?b allocators themselves use page allocations, so kmemleak
> could end up detecting the same pointer twice, hiding a potential leak
> 
> 2. Most page allocations do not contain data/pointers relevant to
> kmemleak (e.g. page cache pages), however the randomness of such data
> greatly diminishes kmemleak's ability to detect real leaks
> 
> Arguably, kmemleak could be made to detect both cases above by a
> combination of page flags, additional annotations or specific page
> alloc API. However, this has its own drawbacks in terms of code
> complexity (potentially outside mm/kmemleak.c) and overhead.

Thanks for the very nice explain :-) I used to thought overhead is
the only concern, missing the point regarding allocator it self.

Regards,
Michael Wang

> 
> Regarding a kmemleak_alloc() annotation like in the patch I suggested,
> that's the second one I've seen needed outside alloc APIs (the first
> one is commit f75782e4e067 - "block: kmemleak: Track the page
> allocations for struct request"). If the number of such explicit
> annotations stays small, it's better to keep it this way.
> 
> There are other explicit annotations like kmemleak_not_leak() or
> kmemleak_ignore() but these are for objects kmemleak knows about and
> incorrectly reports them as leaks. Most of the time is because the
> pointers to such objects are stored in a different form (e.g. physical
> address).
> 
> Anyway, kmemleak is not the only tool requiring annotations (take
> spin_lock_nested() for example). If needed, we could do with an
> additional page alloc/free API which informs kmemleak in the process
> but I don't think it's worth it.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang


On 12/02/2015 03:13 PM, Borislav Petkov wrote:
> On Wed, Dec 02, 2015 at 03:09:18PM +0100, Michael Wang wrote:
>> This tool will help improve the kernel, AFAIK it's already made it's
>> best, if you got any idea on how to make it even better that would be
>> great, but at this moment, it still need few of care :-P
> 
> I think you're replying to my emails without even reading what I said.
> So I'm going to do the same and stop wasting my time.

First of all, I respect all your reply, and reply regarding your point.

Frankly speaking, I think you know all these already, it's not a big
deal but you refuse to obey the rules setup by others, although it do
help make things better, and benefit yourself too.

If you refuse to make things better I'm totally fine, time is valuable
for all of us :-)

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang


On 12/02/2015 02:59 PM, Borislav Petkov wrote:
> On Wed, Dec 02, 2015 at 02:48:47PM +0100, Michael Wang wrote:
>> I'm not sure why amd-iommu use get_page not kmalloc to initialize the
>> pointer table, but if it's necessary then the conflict will be there,
>> it's not the fault of driver or kmemleak, but the design require them
>> to cooperate with each other.
> 
> So, according to you, we should go and "fix" all callers of
> __get_free_pages() to make kmemleak happy. Then when the next new tool
> comes along, we should "fix" another kernel API just so that the tools
> are happy.

That's the way we have to detect leak, no driver could get rid of
the possibility of memory leaking, so it should respect the rule to
help others locating the problem, if a driver full of false report then
most likely folks will gradually lost interests on help fix leaking
problem for it.

> 
> Bzzt. Wrong!
> 
> The tools should work without sprinkling their code everywhere. Driver
> etc developers don't need to care about what tool they make happy or
> not. Tools' hooks should be hidden in macro magic so that developers
> don't care.

This tool will help improve the kernel, AFAIK it's already made it's
best, if you got any idea on how to make it even better that would be
great, but at this moment, it still need few of care :-P

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang


On 12/02/2015 02:40 PM, Borislav Petkov wrote:
> On Wed, Dec 02, 2015 at 02:18:44PM +0100, Michael Wang wrote:
[snip]
> 
>> Yeah, but it would be better to solve it, otherwise whoever saw this
>> report will need to go into the amd-iommu, make sure it's not a real
>> leak, then change their testing script...
> 
> No, you don't need to go into the iommu - you need to fix kmemleak.
> 
> And frankly, I'm getting sick and tired of all those tools needing
> special handling and us adding code just so that they're happy. If the
> tools can't figure out something, they shouldn't warn just in case but
> shut up instead.

If you mean the design of kmemleak, IMHO it's not that bad.

The problem is regarding performance, think about if kmemleak go into
every page to find out pointers, I guess the whole system will stuck.

I'm not sure why amd-iommu use get_page not kmalloc to initialize the pointer
table, but if it's necessary then the conflict will be there, it's not the fault
of driver or kmemleak, but the design require them to cooperate with each
other.

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang
Hi, Borislav

On 12/02/2015 02:13 PM, Borislav Petkov wrote:
> On Wed, Dec 02, 2015 at 02:01:55PM +0100, Michael Wang wrote:
>> Yeah.. it's a little complicated since we have our own kernel tree and this
>> won't be a problem for us, but we really prefer to help fix it in mainline
>> too, as long as this is really a defect, so others could save time on 
>> research
>> in future.
> 
> Well, to keep it realistic and if it were me, I wouldn't even take such
> a fix as it is apparently kmemleak's problem.

Do you mean this could be a real kmemleak? Could you please provide more 
details?

> 
> So you could fix your testing instead to ignore that error message now
> that you know it is a false-positive. That should be easiest.
> 

Yeah, but it would be better to solve it, otherwise whoever saw this report
will need to go into the amd-iommu, make sure it's not a real leak, then
change their testing script...

Regards,
Michael Wang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang

On 12/02/2015 01:56 PM, Joerg Roedel wrote:
> On Wed, Dec 02, 2015 at 01:31:38PM +0100, Michael Wang wrote:
>> It's not my work or your work... it's a defect in the module and maintainer
>> should take responsibility on fixing it, correct?
> 
> No, its a false positive from an in-kernel checking tool, the iommu
> driver is correct. You just sent a patch to silence the false positive
> report.

Yeah, but caused by the driver :-P and have to be fixed in there too.

> 
>> We're very willing to help, but as I mentioned we are out of resource for
>> testing at this moment, but we can send you a new patch without testing,
>> will that works for you?
> 
> This should be testable on any AMD IOMMU system with working interrupt
> remapping. I will probably have no time to test this, if you really
> can't test yourself, try to get a Tested-by from someone else.

Good point, anyone willing to help test the fix? Or better provide the new
patch to be the author.

Regards,
Michael Wang

> 
> 
> 
>   Joerg
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang


On 12/02/2015 01:53 PM, Borislav Petkov wrote:
> On Wed, Dec 02, 2015 at 01:31:38PM +0100, Michael Wang wrote:
>> It's not my work or your work... it's a defect in the module and maintainer
>> should take responsibility on fixing it, correct?
> 
> Well, you said "actually we just want to get rid of this annoying report
> on obj won't leak..."
> 
> It sounds to me like you want to have something fixed. So you do the
> patch properly, add to the commit message why exactly you're doing it
> and test it. Like everyone else.

Yeah.. it's a little complicated since we have our own kernel tree and this
won't be a problem for us, but we really prefer to help fix it in mainline
too, as long as this is really a defect, so others could save time on research
in future.

But seems like we can only wait for another chance to confirm the another
solution, frankly speaking I think we both will forgot this soon... fortunately
it's not that critical :-P

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang


On 12/02/2015 12:51 PM, Joerg Roedel wrote:
> On Wed, Dec 02, 2015 at 12:38:03PM +0100, Michael Wang wrote:
>> Joerg, this is really a tiny fix, would you mind to merge it into some
>> of your cleanup patch and testing them together? we are not in hurry,
>> just want to make sure the issue will get solved.
> 
> I am not doing your work. You sent a patch, received feedback, and now
> you can send a new patch based on it. Thats the process. If it addresses
> the feedback I will merge it, but I will not scissor your patch
> together.

It's not my work or your work... it's a defect in the module and maintainer
should take responsibility on fixing it, correct?

We're very willing to help, but as I mentioned we are out of resource for
testing at this moment, but we can send you a new patch without testing,
will that works for you?

Regards,
Michael Wang

> 
> 
>   Joerg
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang

On 12/02/2015 12:31 PM, Catalin Marinas wrote:
> On 2 December 2015 at 10:56, Michael Wang  wrote:
[snip]
> 
> I could copy your description but I don't currently have a way (nor
> time) to test the patch. If you plan to test it anyway, please feel
> free to include my diff (which I guess was badly re-formatted by
> gmail), I don't really mind which author it is (I found it easier to
> show a diff than explain in plain English ;)).

Unfortunately that's the same on my side... we already close the ticket
and I don't have resources to testing it again.

Joerg, this is really a tiny fix, would you mind to merge it into some
of your cleanup patch and testing them together? we are not in hurry,
just want to make sure the issue will get solved.

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang

On 12/02/2015 11:52 AM, Catalin Marinas wrote:
[snip]
>>
>> Is there any more concern? actually we just want to get rid of this
>> annoying report on obj won't leak, if you're going to create obj for
>> 'irq_lookup_table' that's also fine for us, or will you pick this patch?
> 
> My preference (from a kmemleak perspective) is to tell kmemleak about
> the irq_lookup_table. Untested:

I'm fine with both solution, will leave the decision to maintainer :-)

BTW, could you please send a formal patch with descriptions?

Regards,
Michael Wang

> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 013bdfff2d4d..c41609f71cbe 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1692,6 +1693,7 @@ static struct syscore_ops amd_iommu_syscore_ops = {
> 
>  static void __init free_on_init_error(void)
>  {
> + kmemleak_free(irq_lookup_table);
>   free_pages((unsigned long)irq_lookup_table,
> get_order(rlookup_table_size));
> 
> @@ -1906,6 +1908,7 @@ static int __init early_amd_iommu_init(void)
>   irq_lookup_table = (void *)__get_free_pages(
>   GFP_KERNEL | __GFP_ZERO,
>   get_order(rlookup_table_size));
> + kmemleak_alloc(irq_lookup_table, rlookup_table_size, 1, GFP_KERNEL);
>   if (!irq_lookup_table)
>   goto out;
>   }
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-12-02 Thread Michael Wang
Hi, Joerg

On 11/25/2015 04:14 PM, Michael Wang wrote:
> On 11/25/2015 04:08 PM, Joerg Roedel wrote:
[snip]
>>> This is caused by the 'irq_lookup_table' was allocated with
>>> __get_free_pages() which won't create kmemleak object, thus it's
>>> pointers won't be count as referencing 'irq_remap_table' in
>>> kmemleak scan.
>>
>> Isn't it better to allocate the kmemleak object manually instead of
>> ignoring all irq-table pointers? With this patch we might not notice any
>> real leak of irq-tables.
> 
> We've considered that too, but found that the irq-tables is not
> dynamically alloc/free, they won't be freed once initialized, so there
> is no leaking for such object :-)

Is there any more concern? actually we just want to get rid of this
annoying report on obj won't leak, if you're going to create obj for
'irq_lookup_table' that's also fine for us, or will you pick this patch?

Regards,
Michael Wang

> 
> Regards,
> Michael Wang
> 
>>
>>
>>
>>  Joerg
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-11-25 Thread Michael Wang
On 11/25/2015 04:08 PM, Joerg Roedel wrote:
> On Fri, Nov 20, 2015 at 12:33:50PM +0100, Michael Wang wrote:
>> The kmemleak testing on 3.18.24 show:
>>
>> unreferenced object 0x880233ff9010 (size 16):
>>   comm "swapper/0", pid 1, jiffies 4294937440 (age 2010.490s)
>>   hex dump (first 16 bytes):
>> 0a 0a 00 00 20 00 00 00 00 44 fb 33 02 88 ff ff   D.3
>>   backtrace:
>> [] create_object+0x10d/0x2d0
>> [] kmemleak_alloc+0x5b/0xc0
>> [] kmem_cache_alloc_trace+0xb9/0x160
>> [] get_irq_table+0x151/0x380
>>
>> This is caused by the 'irq_lookup_table' was allocated with
>> __get_free_pages() which won't create kmemleak object, thus it's
>> pointers won't be count as referencing 'irq_remap_table' in
>> kmemleak scan.
> 
> Isn't it better to allocate the kmemleak object manually instead of
> ignoring all irq-table pointers? With this patch we might not notice any
> real leak of irq-tables.

We've considered that too, but found that the irq-tables is not
dynamically alloc/free, they won't be freed once initialized, so there
is no leaking for such object :-)

Regards,
Michael Wang

> 
> 
> 
>   Joerg
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-11-25 Thread Michael Wang
Hi, Joery

On 11/20/2015 12:33 PM, Michael Wang wrote:
> The kmemleak testing on 3.18.24 show:
> 
> unreferenced object 0x880233ff9010 (size 16):
>   comm "swapper/0", pid 1, jiffies 4294937440 (age 2010.490s)
>   hex dump (first 16 bytes):
> 0a 0a 00 00 20 00 00 00 00 44 fb 33 02 88 ff ff   D.3
>   backtrace:
> [] create_object+0x10d/0x2d0
> [] kmemleak_alloc+0x5b/0xc0
> [] kmem_cache_alloc_trace+0xb9/0x160
> [] get_irq_table+0x151/0x380
> 
> This is caused by the 'irq_lookup_table' was allocated with
> __get_free_pages() which won't create kmemleak object, thus it's
> pointers won't be count as referencing 'irq_remap_table' in
> kmemleak scan.
> 
> The 'irq_remap_table' won't be freed after initialized, doesn't
> make sense to check it's leaking.
> 
> This patch mark the 'irq_remap_table' object as 'gray' to stop
> the 'false positives' report.

Any comments on this one?

Regards,
Michael Wang

> 
> Signed-off-by: Michael Wang 
> ---
> v2:
>   Use kmemleak_not_leak() instead of kmemleak_ignore() since
>   the 'irq_remap_table' itself also contain pointer.
> 
>  drivers/iommu/amd_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 8b2be1e..87a1a88 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -3603,6 +3603,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
> bool ioapic)
>   }
>  
>   irq_lookup_table[devid] = table;
> + kmemleak_not_leak(table);
>   set_dte_irq_entry(devid, table);
>   iommu_flush_dte(iommu, devid);
>   if (devid != alias) {
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nfnetlink warnings

2015-11-23 Thread Michael Wang


On 11/23/2015 11:32 AM, Borislav Petkov wrote:
> On Mon, Nov 23, 2015 at 11:20:18AM +0100, Michael Wang wrote:
>> Who want to do that would take responsibility to make an else branch at
>> that time, but reserve the branch at this moment sounds unnecessary, and
>> not that pretty frankly speaking.
> 
> Actually, I was looking for the better idea which doesn't uglify the
> code. And here it is:
> 
> https://lkml.kernel.org/r/5585663.OcpAQiytKY@wuerfel

Looks even better :-)

Regards,
Michael Wang

> 
> :-)
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nfnetlink warnings

2015-11-23 Thread Michael Wang


On 11/23/2015 10:54 AM, Borislav Petkov wrote:
> Hi Michael,
> 
> On Mon, Nov 23, 2015 at 10:49:34AM +0100, Michael Wang wrote:
>> Why not just initialized it as NULL, or mark it as uninitialized_var()?
> 
> because I'd like us to save us the redundant NULL initialization in the
> if-case.

Well, I would vote initialized with NULL, rather than use another else
branch to do the same thing.

> 
> I'm not saying any of the approaches are good visually, though. Who
> knows, someone might have a better idea like, maybe "Oh, I wanted to
> rewrite that code and this handlong is going to be different anyway ..."
> or so. Or something to that effect.

Who want to do that would take responsibility to make an else branch at
that time, but reserve the branch at this moment sounds unnecessary, and
not that pretty frankly speaking.

> 
> Btw, please do not top-post.

Enjoy ;-)

Regards,
Michael Wang

> 
> Thanks.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nfnetlink warnings

2015-11-23 Thread Michael Wang
Hi, Borislav

Why not just initialized it as NULL, or mark it as uninitialized_var()?

Regards,
Michael Wang

On 11/23/2015 10:36 AM, Borislav Petkov wrote:
> Hey,
> 
> so I keep getting those since recently:
> 
> net/netfilter/nfnetlink_queue.c:519:19: warning: ‘nfnl_ct’ may be used 
> uninitialized in this function [-Wmaybe-uninitialized]
>   if (ct && nfnl_ct->build(skb, ct, ctinfo, NFQA_CT, NFQA_CT_INFO) < 0)
>^
> net/netfilter/nfnetlink_queue.c:316:23: note: ‘nfnl_ct’ was declared here
>   struct nfnl_ct_hook *nfnl_ct;
>^
> net/netfilter/nfnetlink_queue.c: In function ‘nfqnl_recv_verdict’:
> net/netfilter/nfnetlink_queue.c:1083:11: warning: ‘nfnl_ct’ may be used 
> uninitialized in this function [-Wmaybe-uninitialized]
> nfnl_ct->seq_adjust(entry->skb, ct, ctinfo, diff);
>^
> 
> and was thinking can we shut them up like this? I know, it is ugly :-\
> 
> I mean, it is obvious in both cases that nfnl_ct won't be used if ct is
> not set but apparently gcc can't see that far...
> 
> ---
> diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> index 7d81d280cb4f..cd61b0b5c413 100644
> --- a/net/netfilter/nfnetlink_queue.c
> +++ b/net/netfilter/nfnetlink_queue.c
> @@ -372,6 +372,8 @@ nfqnl_build_packet_message(struct net *net, struct 
> nfqnl_instance *queue,
>   if (ct != NULL)
>   size += nfnl_ct->build_size(ct);
>   }
> + } else {
> + nfnl_ct = NULL;
>   }
>  
>   if (queue->flags & NFQA_CFG_F_UID_GID) {
> @@ -1069,6 +1071,8 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff 
> *skb,
>   nfnl_ct = rcu_dereference(nfnl_ct_hook);
>   if (nfnl_ct != NULL)
>   ct = nfqnl_ct_parse(nfnl_ct, nlh, nfqa, entry, &ctinfo);
> + } else {
> + nfnl_ct = NULL;
>   }
>  
>   if (nfqa[NFQA_PAYLOAD]) {
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-11-20 Thread Michael Wang


On 11/20/2015 12:33 PM, Michael Wang wrote:
> The kmemleak testing on 3.18.24 show:
> 
> unreferenced object 0x880233ff9010 (size 16):
>   comm "swapper/0", pid 1, jiffies 4294937440 (age 2010.490s)
>   hex dump (first 16 bytes):
> 0a 0a 00 00 20 00 00 00 00 44 fb 33 02 88 ff ff   D.3
>   backtrace:
> [] create_object+0x10d/0x2d0
> [] kmemleak_alloc+0x5b/0xc0
> [] kmem_cache_alloc_trace+0xb9/0x160
> [] get_irq_table+0x151/0x380
> 
> This is caused by the 'irq_lookup_table' was allocated with
> __get_free_pages() which won't create kmemleak object, thus it's
> pointers won't be count as referencing 'irq_remap_table' in
> kmemleak scan.
> 
> The 'irq_remap_table' won't be freed after initialized, doesn't
> make sense to check it's leaking.
> 
> This patch mark the 'irq_remap_table' object as 'gray' to stop
> the 'false positives' report.
> 
> Signed-off-by: Michael Wang 

Reported-by: Danil Kipnis 

Regards,
Michael Wang

> ---
> v2:
>   Use kmemleak_not_leak() instead of kmemleak_ignore() since
>   the 'irq_remap_table' itself also contain pointer.
> 
>  drivers/iommu/amd_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 8b2be1e..87a1a88 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -3603,6 +3603,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
> bool ioapic)
>   }
>  
>   irq_lookup_table[devid] = table;
> + kmemleak_not_leak(table);
>   set_dte_irq_entry(devid, table);
>   iommu_flush_dte(iommu, devid);
>   if (devid != alias) {
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC PATCH v2] iommu/amd: gray the 'irq_remap_table' object for kmemleak

2015-11-20 Thread Michael Wang
The kmemleak testing on 3.18.24 show:

unreferenced object 0x880233ff9010 (size 16):
  comm "swapper/0", pid 1, jiffies 4294937440 (age 2010.490s)
  hex dump (first 16 bytes):
0a 0a 00 00 20 00 00 00 00 44 fb 33 02 88 ff ff   D.3
  backtrace:
[] create_object+0x10d/0x2d0
[] kmemleak_alloc+0x5b/0xc0
[] kmem_cache_alloc_trace+0xb9/0x160
[] get_irq_table+0x151/0x380

This is caused by the 'irq_lookup_table' was allocated with
__get_free_pages() which won't create kmemleak object, thus it's
pointers won't be count as referencing 'irq_remap_table' in
kmemleak scan.

The 'irq_remap_table' won't be freed after initialized, doesn't
make sense to check it's leaking.

This patch mark the 'irq_remap_table' object as 'gray' to stop
the 'false positives' report.

Signed-off-by: Michael Wang 
---
v2:
  Use kmemleak_not_leak() instead of kmemleak_ignore() since
  the 'irq_remap_table' itself also contain pointer.

 drivers/iommu/amd_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8b2be1e..87a1a88 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3603,6 +3603,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
bool ioapic)
}
 
irq_lookup_table[devid] = table;
+   kmemleak_not_leak(table);
set_dte_irq_entry(devid, table);
iommu_flush_dte(iommu, devid);
if (devid != alias) {
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC PATCH] iommu/amd: make kmemleak ignore the 'irq_remap_table' object

2015-11-20 Thread Michael Wang
The kmemleak testing on 3.18.24 show:

unreferenced object 0x880233ff9010 (size 16):
  comm "swapper/0", pid 1, jiffies 4294937440 (age 2010.490s)
  hex dump (first 16 bytes):
0a 0a 00 00 20 00 00 00 00 44 fb 33 02 88 ff ff   D.3
  backtrace:
[] create_object+0x10d/0x2d0
[] kmemleak_alloc+0x5b/0xc0
[] kmem_cache_alloc_trace+0xb9/0x160
[] get_irq_table+0x151/0x380

This is caused by the 'irq_lookup_table' was allocated with
__get_free_pages() which won't create kmemleak object, thus it's
pointers won't be count as referencing in kmemleak scanning.

The 'irq_remap_table' allocated won't be freed after initialized,
doesn't make sense to let kmemleak scan it.

This patch mark the 'irq_remap_table' object as 'ignored' to
stop the 'false positives' report.

Signed-off-by: Michael Wang 
---
 drivers/iommu/amd_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8b2be1e..87a1a88 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3603,6 +3603,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
bool ioapic)
}
 
irq_lookup_table[devid] = table;
+   kmemleak_ignore(table);
set_dte_irq_entry(devid, table);
iommu_flush_dte(iommu, devid);
if (devid != alias) {
-- 
2.1.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v2] Documentation/infiniband: Add docs for rdma-helpers

2015-05-19 Thread Michael Wang


On 05/18/2015 06:58 PM, Doug Ledford wrote:
[snip]
>> I see :-) I've not work with the kdoc yet, not sure if there is any
>> guidelines on how to write the header of inline func for kdoc?
> 
> It's an automated tool thing.  Any comment section that starts with /**
> is automatically included as a kdoc.  Then there is an expected format
> after that.  See Documentation/kernel-doc-nano-HOWTO.txt.

Got it :-)

> 
>>>
>>> Just because I want to move this along versus waiting for another
>>> respin, I'm going to copy and paste these into those locations and clean
>>> up the changelog when I integrate this patch.
>>
>> Got it, if there is anything I could help, please let me know ;-)
> 
> I'm sending the patch for review, please let me know if you are OK with
> how I handled the attribution.

The definition is far more detailed and accurate, it's already good enough
according to my understanding, should benefit the developer a lot ;-)

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v2] Documentation/infiniband: Add docs for rdma-helpers

2015-05-18 Thread Michael Wang


On 05/18/2015 05:21 PM, Doug Ledford wrote:
[snip]
>>
>> I'll put the highlights and changelog under '---' in next version, is it
>> looks like this?
> 
> We're still missing Jason's feedback request though.  Specifically, he
> pointed out that kdocs are usually not done in Documentation/*, they are
> done in the .c files where the function is (or the .h file if the
> function is an inline, which these all are).  So, you included some
> limited documentation for each of these items in your original patches
> that added them.  His request was that you put this expanded information
> not in Documentation/infiniband where someone has to go looking for it,
> but as part of the kdoc header for each of the various helpers in
> ib_verbs.h itself.

I see :-) I've not work with the kdoc yet, not sure if there is any
guidelines on how to write the header of inline func for kdoc?

> 
> Just because I want to move this along versus waiting for another
> respin, I'm going to copy and paste these into those locations and clean
> up the changelog when I integrate this patch.

Got it, if there is anything I could help, please let me know ;-)

Regards,
Michael Wang

> 
>>
>> Subject: [PATCH RFC v3] Documentation/infiniband: Add docs for rdma-helpers
>>
>> This is the following patch for:
>>   https://lkml.org/lkml/2015/5/5/417
>> which try to document the settled rdma_cap_XX().
>>
>> Signed-off-by: Michael Wang 
>> ---
>> Highlights:
>>   There could be many missing/mistakes/misunderstanding, please don't
>>   be hesitate to point out the issues, any suggestions to improve or
>>   complete the description are very welcomed ;-)
>>
>> v2:
>>   * Merge the descriptions from Doug:
>> http://www.spinics.net/lists/linux-rdma/msg25172.html
>>
>> v3:
>>   ...
>>
>>  Documentation/infiniband/rdma_helpers.txt | 79 
>> +++
>>  1 file changed, 79 insertions(+)
>>  create mode 100644 Documentation/infiniband/rdma_helpers.txt
>>
>> diff --git a/Documentation/infiniband/rdma_helpers.txt 
>> b/Documentation/infiniband/rdma_helpers.txt
>> new file mode 100644
>> index 000..be9416d
>> --- /dev/null
>> +++ b/Documentation/infiniband/rdma_helpers.txt
>>
>> Regards,
>> Michael Wang
>>
>>>
>>>>
>>>> Signed-off-by: Michael Wang 
>>>> ---
>>>>  Documentation/infiniband/rdma_helpers.txt | 79 
>>>> +++
>>>>  1 file changed, 79 insertions(+)
>>>>  create mode 100644 Documentation/infiniband/rdma_helpers.txt
>>>>
>>>> diff --git a/Documentation/infiniband/rdma_helpers.txt 
>>>> b/Documentation/infiniband/rdma_helpers.txt
>>>> new file mode 100644
>>>> index 000..be9416d
>>>> --- /dev/null
>>>> +++ b/Documentation/infiniband/rdma_helpers.txt
>>>> @@ -0,0 +1,79 @@
>>>> +RDMA HELPERS
>>>> +
>>>> +  The following helpers are used to check the specific capabilities of a
>>>> +  particular port before utilizing those capabilities.
>>>> +
>>>> +rdma_cap_ib_mad- Infiniband Management Datagrams.
>>>> +rdma_cap_ib_smi- Infiniband Subnet Management Interface.
>>>> +rdma_cap_ib_cm - Infiniband Communication Manager.
>>>> +rdma_cap_iw_cm - IWARP Communication Manager.
>>>> +rdma_cap_ib_sa - Infiniband Subnet Administration.
>>>> +rdma_cap_ib_mcast  - Infiniband Multicast Join/Leave Protocol.
>>>> +rdma_cap_read_multi_sge- RDMA Read Work Request Support Multiple 
>>>> SGE.
>>>> +rdma_cap_af_ib - Native Infiniband Address.
>>>> +rdma_cap_eth_ah- InfiniBand Transport With Ethernet 
>>>> Address.
>>>> +
>>>> +USAGE
>>>> +
>>>> +  if (rdma_cap_XX(device, i)) {
>>>> +   /* The port i of device support XX */
>>>> +   ...
>>>> +  } else {
>>>> +   /* The port i of device don't support XX */
>>>> +   ...
>>>> +  }
>>>> +
>>>> +  rdma_cap_ib_mad
>>>> +  ---
>>>> +Management Datagrams (MAD) are a required part of the InfiniBand
>>>> +specification and are supported on all InfiniBand devices.  A slightly
>>>> +extended version are also supported on OPA i

Re: [PATCH RFC v2] Documentation/infiniband: Add docs for rdma-helpers

2015-05-18 Thread Michael Wang
Hi, Or

On 05/18/2015 11:47 AM, Or Gerlitz wrote:
[snip]
>> Highlights:
>>   There could be many missing/mistakes/misunderstanding, please don't
>>   be hesitate to point out the issues, any suggestions to improve or
>>   complete the description are very welcomed ;-)
> 
> Michael, none of what you wrote above belongs to the change-log which
> is going to stay for-ever in the upstream git repo. You should put it
> all belong the --- line after your S.O.B and add proper change log
> telling what this patch is about

Thanks for point out this for me :-)

I'll put the highlights and changelog under '---' in next version, is it
looks like this?


Subject: [PATCH RFC v3] Documentation/infiniband: Add docs for rdma-helpers

This is the following patch for:
  https://lkml.org/lkml/2015/5/5/417
which try to document the settled rdma_cap_XX().

Signed-off-by: Michael Wang 
---
Highlights:
  There could be many missing/mistakes/misunderstanding, please don't
  be hesitate to point out the issues, any suggestions to improve or
  complete the description are very welcomed ;-)

v2:
  * Merge the descriptions from Doug:
http://www.spinics.net/lists/linux-rdma/msg25172.html

v3:
  ...

 Documentation/infiniband/rdma_helpers.txt | 79 +++
 1 file changed, 79 insertions(+)
 create mode 100644 Documentation/infiniband/rdma_helpers.txt

diff --git a/Documentation/infiniband/rdma_helpers.txt 
b/Documentation/infiniband/rdma_helpers.txt
new file mode 100644
index 000..be9416d
--- /dev/null
+++ b/Documentation/infiniband/rdma_helpers.txt

Regards,
Michael Wang

> 
>>
>> Signed-off-by: Michael Wang 
>> ---
>>  Documentation/infiniband/rdma_helpers.txt | 79 
>> +++
>>  1 file changed, 79 insertions(+)
>>  create mode 100644 Documentation/infiniband/rdma_helpers.txt
>>
>> diff --git a/Documentation/infiniband/rdma_helpers.txt 
>> b/Documentation/infiniband/rdma_helpers.txt
>> new file mode 100644
>> index 000..be9416d
>> --- /dev/null
>> +++ b/Documentation/infiniband/rdma_helpers.txt
>> @@ -0,0 +1,79 @@
>> +RDMA HELPERS
>> +
>> +  The following helpers are used to check the specific capabilities of a
>> +  particular port before utilizing those capabilities.
>> +
>> +rdma_cap_ib_mad- Infiniband Management Datagrams.
>> +rdma_cap_ib_smi- Infiniband Subnet Management Interface.
>> +rdma_cap_ib_cm - Infiniband Communication Manager.
>> +rdma_cap_iw_cm - IWARP Communication Manager.
>> +rdma_cap_ib_sa - Infiniband Subnet Administration.
>> +rdma_cap_ib_mcast  - Infiniband Multicast Join/Leave Protocol.
>> +rdma_cap_read_multi_sge- RDMA Read Work Request Support Multiple 
>> SGE.
>> +rdma_cap_af_ib - Native Infiniband Address.
>> +rdma_cap_eth_ah- InfiniBand Transport With Ethernet Address.
>> +
>> +USAGE
>> +
>> +  if (rdma_cap_XX(device, i)) {
>> +   /* The port i of device support XX */
>> +   ...
>> +  } else {
>> +   /* The port i of device don't support XX */
>> +   ...
>> +  }
>> +
>> +  rdma_cap_ib_mad
>> +  ---
>> +Management Datagrams (MAD) are a required part of the InfiniBand
>> +specification and are supported on all InfiniBand devices.  A slightly
>> +extended version are also supported on OPA interfaces.
>> +
>> +  rdma_cap_ib_smi
>> +  ---
>> +Subnet Management Interface (SMI) will handle SMP packet from SM
>> +in an infiniband fabric.
>> +
>> +  rdma_cap_ib_cm
>> +  ---
>> +Communication Manager (CM) service, used to ease the process of 
>> connecting
>> +to a remote host.  The IB-CM can be used to connect to remote hosts 
>> using
>> +either InfiniBand or RoCE connections, iWARP has its own CM.
>> +
>> +  rdma_cap_iw_cm
>> +  ---
>> +iWARP Communication Manager (CM), Similar to the IB-CM, but only used on
>> +iWARP devices.
>> +
>> +  rdma_cap_ib_sa
>> +  ---
>> +Subnet Administration (SA) is the database built by SM in an
>> +infiniband fabric.
>> +
>> +  rdma_cap_ib_mcast
>> +  ---
>> +InfiniBand (and OPA) use a different multicast mechanism rather than
>> +traditional IP multicast found on Ethernet devices.  If this is true, 
>> then
>> +traditional IPv4/IPv6 multicast is handled by the IPoIB layer and direct
>> +multicast joins and leaves

[PATCH RFC v2] Documentation/infiniband: Add docs for rdma-helpers

2015-05-18 Thread Michael Wang
Since v1:
  * Merge the descriptions from Doug:
http://www.spinics.net/lists/linux-rdma/msg25172.html

This is the following patch for:
  https://lkml.org/lkml/2015/5/5/417
which try to document the settled rdma_cap_XX().

Highlights:
  There could be many missing/mistakes/misunderstanding, please don't
  be hesitate to point out the issues, any suggestions to improve or
  complete the description are very welcomed ;-)

Signed-off-by: Michael Wang 
---
 Documentation/infiniband/rdma_helpers.txt | 79 +++
 1 file changed, 79 insertions(+)
 create mode 100644 Documentation/infiniband/rdma_helpers.txt

diff --git a/Documentation/infiniband/rdma_helpers.txt 
b/Documentation/infiniband/rdma_helpers.txt
new file mode 100644
index 000..be9416d
--- /dev/null
+++ b/Documentation/infiniband/rdma_helpers.txt
@@ -0,0 +1,79 @@
+RDMA HELPERS
+
+  The following helpers are used to check the specific capabilities of a
+  particular port before utilizing those capabilities.
+
+rdma_cap_ib_mad- Infiniband Management Datagrams.
+rdma_cap_ib_smi- Infiniband Subnet Management Interface.
+rdma_cap_ib_cm - Infiniband Communication Manager.
+rdma_cap_iw_cm - IWARP Communication Manager.
+rdma_cap_ib_sa - Infiniband Subnet Administration.
+rdma_cap_ib_mcast  - Infiniband Multicast Join/Leave Protocol.
+rdma_cap_read_multi_sge- RDMA Read Work Request Support Multiple SGE.
+rdma_cap_af_ib - Native Infiniband Address.
+rdma_cap_eth_ah- InfiniBand Transport With Ethernet Address.
+
+USAGE
+
+  if (rdma_cap_XX(device, i)) {
+   /* The port i of device support XX */
+   ...
+  } else {
+   /* The port i of device don't support XX */
+   ...
+  }
+
+  rdma_cap_ib_mad
+  ---
+Management Datagrams (MAD) are a required part of the InfiniBand
+specification and are supported on all InfiniBand devices.  A slightly
+extended version are also supported on OPA interfaces.
+
+  rdma_cap_ib_smi
+  ---
+Subnet Management Interface (SMI) will handle SMP packet from SM
+in an infiniband fabric.
+
+  rdma_cap_ib_cm
+  ---
+Communication Manager (CM) service, used to ease the process of connecting
+to a remote host.  The IB-CM can be used to connect to remote hosts using
+either InfiniBand or RoCE connections, iWARP has its own CM.
+
+  rdma_cap_iw_cm
+  ---
+iWARP Communication Manager (CM), Similar to the IB-CM, but only used on
+iWARP devices.
+
+  rdma_cap_ib_sa
+  ---
+Subnet Administration (SA) is the database built by SM in an
+infiniband fabric.
+
+  rdma_cap_ib_mcast
+  ---
+InfiniBand (and OPA) use a different multicast mechanism rather than
+traditional IP multicast found on Ethernet devices.  If this is true, then
+traditional IPv4/IPv6 multicast is handled by the IPoIB layer and direct
+multicast joins and leaves are handled per the InfiniBand specifications.
+
+  rdma_cap_read_multi_sge
+  ---
+Certain devices (iWARP in particular) have restrictions on the number of
+scatter gather elements that can be present in an RDMA READ work request,
+this is true if the device does not have that restriction.
+
+  rdma_cap_af_ib
+  ---
+Many code paths for traditional InfiniBand and RoCE links are the same,
+but need minor differences to accommodate the different addresses on the
+two types of connections.  This helper is true when the address of the
+specific connection is of the InfiniBand native variety.
+
+  rdma_cap_eth_ah
+  ---
+Queue Pair is InfiniBand transport, but uses Ethernet address instead
+of native InfiniBand address (aka, this is a RoCE QP, and that means
+ethertype 0x8915 + GRH for RoCEv1 and IP/UDP to well known UDP port for
+RoCEv2), this is true when the address family of the specific queue pair
+is of the Ethernet (RoCE) variety.
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC] Documentation/infiniband: Add docs for rdma-helpers

2015-05-15 Thread Michael Wang


On 05/15/2015 04:40 PM, Doug Ledford wrote:
[snip]
> 
> The test itself doesn't mean that.  It means we need a RoCE address
> (it's true when transport is IB and link layer is Ethernet).  That we
> *use* it during connectionless communication because we have to generate
> our own address vector for the packet while during connected queue pair
> use the address vector is created by the card using the queue pair
> information is just the circumstance of its use.  And even though a
> disconnected queue pair isn't solidly connected to a remote endpoint, it
> is solidly bound to an adapter that requires either an IB or Ethernet
> address family.  Maybe this to resolve your issue with the wording:

Thanks for the explain :-) The term 'connectionless' still sounds a little
strange to me when it's just means no HW support on creating address vector,
but I can understand the concept.

> 
> This helper is true when the address family of this queue pair is of the
> Ethernet (RoCE) variety.

Sounds good, will be merged in next version :-)

Regards,
Michael Wang

> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC] Documentation/infiniband: Add docs for rdma-helpers

2015-05-15 Thread Michael Wang


On 05/15/2015 04:27 PM, Doug Ledford wrote:
[snip]
>>
>> Me too used to think it's 'connection', while I found some docs explain
>> this as 'communication'... but anyway, 'connection' sounds
>> more close to what it did in kernel :-)
> 
> That's kind of what I thought.  Anyway, it's communication management
> (which to me is a gross abuse of the english language for which the IBTA
> should be appropriately chastised), but that doesn't mean that lower
> down in the more descriptive area of text that we can't call out that
> this is really for establishing a connection and that once your
> connection is established and you *truly* want to communicate, this does
> nothing.

I see :-) we can reserve the communication management as the definition of CM,
to obey the standard, meanwhile give some description related to connection
below in the long description.

> 
[snip]
>> Shall we put this long description into USAGE? Here maybe list
>> all the helpers to give some quick overview with a brief
>> description, what's your opinion?
> 
> Given how we have a more complete description of this below, it need not
> have such a lengthy description here.

Got it :-)

Regards,
Michael Wang

>>
>>>> +
>>>> +USAGE
>>>> +
>>>> +  if (rdma_cap_XX(device, i)) {
>>>> +  /* The port i of device support XX */
>>>> +  ...
>>>> +  } else {
>>>> +  /* The port i of device don't support XX */
>>>> +  ...
>>>> +  }
>>>> +
>>>> +  rdma_cap_ib_mad
>>>> +  ---
>>>> +Management Datagrams (MAD) is the prototype of management packet
>>>> +to be used by all the kinds of infiniband managers, use the helper
>>>> +to verify the port before utilize related features.
>>> Management Datagrams (MAD) are a required part of the InfiniBand
>>> specification and are supported on all InfiniBand devices.  A slightly
>>> extended version are also supported on OPA interfaces.
>>>
>>> I would drop all instances of "use the helper to verify..." as that's
>>> redundant.  This whole doc is about using the helpers to verify things.
>>
>> Agree, will be dropped in next version.
>>
>> And all the comments below make sense, will be merged ;-)
>>
>> Regards,
>> Michael Wang
>>
>>>
>>>> +
>>>> +  rdma_cap_ib_smi
>>>> +  ---
>>>> +Subnet Management Interface (SMI) will handle SMP packet from SM
>>>> +in an infiniband fabric, use the helper to verify the port before
>>>> +utilize related features.
>>>> +
>>>> +  rdma_cap_ib_cm
>>>> +  ---
>>>> +Communication Manager (CM) will handle the connections between
>>>^Connection Manager (CM) service, used to ease the process of
>>> connecting to a remote host.  The IB CM can be used to connect to remote
>>> hosts using either InfiniBand or RoCE connections.  iWARP has its own
>>> connection manager, see below.
>>>> +adaptors, currently there are two different implementation,
>>>> +IB or IWARP, use the helper to verify whether the port using
>>>> +IB-CM or not
>>>> +
>>>> +  rdma_cap_iw_cm
>>>> +  ---
>>>> +IWARP has it's own implemented CM which is different from infiniband,
>>>   iWARP connection manager.  Similar to the IB Connection Manager,
>>> but only used on iWARP devices.
>>>> +use the helper to check whether the port using IWARP-CM or not.
>>>> +
>>>> +  rdma_cap_ib_sa
>>>> +  ---
>>>> +Subnet Administration (SA) is the database built by SM in an
>>>> +infiniband fabric, use the helper to verify the port before
>>>> +utilize related features.
>>>> +
>>>> +  rdma_cap_ib_mcast
>>>> +  ---
>>>> +Multicast is the feature for one QP to send messages to multiple
>>>> +QP in an infiniband fabric, use the helper to verify the port before
>>>> +utilize related features.
>>>
>>> InfiniBand (and OPA) use a different multicast mechanism than
>>> traditional IP multicast found on Ethernet devices.  If this capability
>>> is true, then traditional IPv4/IPv6 multicast is handled by the IPoIB
>>> layer and direct multicast joins and leaves are handled per the
>

Re: [PATCH RFC] Documentation/infiniband: Add docs for rdma-helpers

2015-05-15 Thread Michael Wang


On 05/13/2015 06:42 PM, Hefty, Sean wrote:
>>> +  rdma_cap_ib_cm
>>> +  ---
>>> +Communication Manager (CM) will handle the connections between
>>^Connection Manager (CM) service, used to ease the process of
> 
> In IB terms, this is communication manager.  It also handles transport level 
> address resolution for UD QPs.

I could find both 'connection' and 'communication' in different docs,
while 'connection' is more related to verbs, 'communication' is more
close to specification.

IMHO 'connection' make more sense, after all, all the transport between
adaptors could named as communication, while connection management is exactly
what CM did in kernel.

Doug, what's your opinion?

> 
>>> +  rdma_cap_eth_ah
>>> +  ---
>>> +Infiniband address handler format is special in ethernet fabric,
>> use
>>> +the helper to verify whether the port is using ethernet format or
>> not.
>>
>> This helper is true when the address of the specific connection is of
>> the Ethernet (RoCE) variety.
> 
> This is used for connectionless communication.

Could you please give more details on this?

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC] Documentation/infiniband: Add docs for rdma-helpers

2015-05-15 Thread Michael Wang


On 05/13/2015 05:11 PM, Doug Ledford wrote:
[snip]
>> +
>> +  For core layer, below helpers are used to check if a paticular capability
>> +  is supported by the port.
> 
> The following helpers are used to check the specific capabilities of a
> particular port before utilizing those capabilities.

Will be in next version :-)

> 
>> +
>> +rdma_cap_ib_mad - Infiniband Management Datagrams.
>> +rdma_cap_ib_smi - Infiniband Subnet Management Interface.
>> +rdma_cap_ib_cm  - Infiniband Communication Manager.
> InfiniBand Connection Management

Me too used to think it's 'connection', while I found some docs explain
this as 'communication'... but anyway, 'connection' sounds
more close to what it did in kernel :-)

>> +rdma_cap_iw_cm  - IWARP Communication Manager.
> iWARP Connection Management
>> +rdma_cap_ib_sa  - Infiniband Subnet Administration.
>> +rdma_cap_ib_mcast   - Infiniband Multicast.
> InfiniBand Multicast join/leave protocol
>> +rdma_cap_read_multi_sge - RDMA Read Multiple Scatter-Gather Entries.
> RDMA Read verb supports more than 1 sge in the work request

Will be in next version :-)

>> +rdma_cap_af_ib  - Native Infiniband Address.
>> +rdma_cap_eth_ah - Ethernet Address Handler.
> Queue Pair is InfiniBand transport, but uses Ethernet address instead of
> native InfiniBand address (aka, this is a RoCE QP, and that means
> ethertype 0x8915 + GRH for RoCEv1 and IP/UDP to well known UDP port for
> RoCEv2)

Shall we put this long description into USAGE? Here maybe list
all the helpers to give some quick overview with a brief
description, what's your opinion?

>> +
>> +USAGE
>> +
>> +  if (rdma_cap_XX(device, i)) {
>> +/* The port i of device support XX */
>> +...
>> +  } else {
>> +/* The port i of device don't support XX */
>> +...
>> +  }
>> +
>> +  rdma_cap_ib_mad
>> +  ---
>> +Management Datagrams (MAD) is the prototype of management packet
>> +to be used by all the kinds of infiniband managers, use the helper
>> +to verify the port before utilize related features.
> Management Datagrams (MAD) are a required part of the InfiniBand
> specification and are supported on all InfiniBand devices.  A slightly
> extended version are also supported on OPA interfaces.
> 
> I would drop all instances of "use the helper to verify..." as that's
> redundant.  This whole doc is about using the helpers to verify things.

Agree, will be dropped in next version.

And all the comments below make sense, will be merged ;-)

Regards,
Michael Wang

> 
>> +
>> +  rdma_cap_ib_smi
>> +  ---
>> +Subnet Management Interface (SMI) will handle SMP packet from SM
>> +in an infiniband fabric, use the helper to verify the port before
>> +utilize related features.
>> +
>> +  rdma_cap_ib_cm
>> +  ---
>> +Communication Manager (CM) will handle the connections between
>^Connection Manager (CM) service, used to ease the process of
> connecting to a remote host.  The IB CM can be used to connect to remote
> hosts using either InfiniBand or RoCE connections.  iWARP has its own
> connection manager, see below.
>> +adaptors, currently there are two different implementation,
>> +IB or IWARP, use the helper to verify whether the port using
>> +IB-CM or not
>> +
>> +  rdma_cap_iw_cm
>> +  ---
>> +IWARP has it's own implemented CM which is different from infiniband,
>   iWARP connection manager.  Similar to the IB Connection Manager,
> but only used on iWARP devices.
>> +use the helper to check whether the port using IWARP-CM or not.
>> +
>> +  rdma_cap_ib_sa
>> +  ---
>> +Subnet Administration (SA) is the database built by SM in an
>> +infiniband fabric, use the helper to verify the port before
>> +utilize related features.
>> +
>> +  rdma_cap_ib_mcast
>> +  ---
>> +Multicast is the feature for one QP to send messages to multiple
>> +QP in an infiniband fabric, use the helper to verify the port before
>> +utilize related features.
> 
> InfiniBand (and OPA) use a different multicast mechanism than
> traditional IP multicast found on Ethernet devices.  If this capability
> is true, then traditional IPv4/IPv6 multicast is handled by the IPoIB
> layer and direct multicast joins and leaves are handled per the
> InfiniBand specifications.
> 
>> +
>> +  rdm

Re: [PATCH RFC] Documentation/infiniband: Add docs for rdma-helpers

2015-05-15 Thread Michael Wang


On 05/13/2015 05:59 PM, Jason Gunthorpe wrote:
> On Wed, May 13, 2015 at 03:24:32PM +0200, Michael Wang wrote:
>> This is the following patch for:
>>   https://lkml.org/lkml/2015/5/5/417
>> which try to document the settled rdma_cap_XX().
>>
>> Highlights:
>>   There could be many missing/mistakes/misunderstanding, please don't
>>   be hesitate to point out the issues, any suggestions to improve or
>>   complete the description are very welcomed ;-)
> 
> I'd rather see this in the kdoc for each function.

I used to thought you mean the kernel documentation like
this... my misunderstanding but this is the usual way to
document kernel stuff, isn't it?

BTW, could you give more details on the kdoc?

Regards,
Michael Wang

> 
> Thanks,
> Jason
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH RFC] Documentation/infiniband: Add docs for rdma-helpers

2015-05-13 Thread Michael Wang
This is the following patch for:
  https://lkml.org/lkml/2015/5/5/417
which try to document the settled rdma_cap_XX().

Highlights:
  There could be many missing/mistakes/misunderstanding, please don't
  be hesitate to point out the issues, any suggestions to improve or
  complete the description are very welcomed ;-)

Signed-off-by: Michael Wang 
---
 Documentation/infiniband/rdma_helpers.txt | 76 +++
 1 file changed, 76 insertions(+)
 create mode 100644 Documentation/infiniband/rdma_helpers.txt

diff --git a/Documentation/infiniband/rdma_helpers.txt 
b/Documentation/infiniband/rdma_helpers.txt
new file mode 100644
index 000..abc75ec
--- /dev/null
+++ b/Documentation/infiniband/rdma_helpers.txt
@@ -0,0 +1,76 @@
+RDMA HELPERS
+
+  For core layer, below helpers are used to check if a paticular capability
+  is supported by the port.
+
+rdma_cap_ib_mad- Infiniband Management Datagrams.
+rdma_cap_ib_smi- Infiniband Subnet Management Interface.
+rdma_cap_ib_cm - Infiniband Communication Manager.
+rdma_cap_iw_cm - IWARP Communication Manager.
+rdma_cap_ib_sa - Infiniband Subnet Administration.
+rdma_cap_ib_mcast  - Infiniband Multicast.
+rdma_cap_read_multi_sge- RDMA Read Multiple Scatter-Gather Entries.
+rdma_cap_af_ib - Native Infiniband Address.
+rdma_cap_eth_ah- Ethernet Address Handler.
+
+USAGE
+
+  if (rdma_cap_XX(device, i)) {
+   /* The port i of device support XX */
+   ...
+  } else {
+   /* The port i of device don't support XX */
+   ...
+  }
+
+  rdma_cap_ib_mad
+  ---
+Management Datagrams (MAD) is the prototype of management packet
+to be used by all the kinds of infiniband managers, use the helper
+to verify the port before utilize related features.
+
+  rdma_cap_ib_smi
+  ---
+Subnet Management Interface (SMI) will handle SMP packet from SM
+in an infiniband fabric, use the helper to verify the port before
+utilize related features.
+
+  rdma_cap_ib_cm
+  ---
+Communication Manager (CM) will handle the connections between
+adaptors, currently there are two different implementation,
+IB or IWARP, use the helper to verify whether the port using
+IB-CM or not
+
+  rdma_cap_iw_cm
+  ---
+IWARP has it's own implemented CM which is different from infiniband,
+use the helper to check whether the port using IWARP-CM or not.
+
+  rdma_cap_ib_sa
+  ---
+Subnet Administration (SA) is the database built by SM in an
+infiniband fabric, use the helper to verify the port before
+utilize related features.
+
+  rdma_cap_ib_mcast
+  ---
+Multicast is the feature for one QP to send messages to multiple
+QP in an infiniband fabric, use the helper to verify the port before
+utilize related features.
+
+  rdma_cap_read_multi_sge
+  ---
+RDMA read operation could support multiple scatter-gather entries,
+use the helper to verify wthether the port support this feature
+or not.
+
+  rdma_cap_af_ib
+  ---
+RDMA address format could be ethernet or infiniband, use the helper
+to verify whether the port support infiniband format or not.
+
+  rdma_cap_eth_ah
+  ---
+Infiniband address handler format is special in ethernet fabric, use
+the helper to verify whether the port is using ethernet format or not.
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/23] IB/Verbs: IB Management Helpers

2015-05-13 Thread Michael Wang


On 05/12/2015 10:09 PM, Doug Ledford wrote:
[snip]
>>
>> I had asked for better kdocs for the new helpers so new people can
>> understand when and where to use them.
>>
>> I've not looked at the series at all for the past few postings.
> 
> Michael, please work up an incremental patch to address the kdocs issue.
> I've picked up the v8 patchset, and there is no need to respin it, but I
> would like to have that kdoc patch before the 4.2 merge window opens.

Sure, now these helpers are settled down, it's time for document,
I'll send out the RFC ASAP :-)

Regards,
Michael Wang

> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v8 00/23] IB/Verbs: IB Management Helpers

2015-05-12 Thread Michael Wang


On 05/12/2015 04:24 PM, Doug Ledford wrote:
[snip]
>>
>> AFAIK Or was asking to merge the #15~23, and want to reserve the changelog
>> meanwhile reply the cover of prev version (I'm still confused on that...),
>> I've replied but get no respond yet.
>>
>> I can make a v9 to merge the #15~#23 if that could benefit the 
>> maintainability,
>> please let me know your opinion :-)
> 
> I don't think it would make a significant difference.  I've pulled the
> v8 patchset out of patchworks and I'll throw a new branch with it
> included up to my github repo sometime today.

Got it :-)

Regards,
Michael Wang

> 
>> About the Bug, if it was not introduced in this series, maybe including the
>> fix in next series would be better?
>>
>> Regards,
>> Michael Wang
>>
>>>
>>>>   Frankly
>>>> I vote for the former because as it stands this series does not break 
>>>> directly.
>>>> It was only after I changed the implementation of rdma_cap_ib_mad that it
>>>> broke.
>>>>
>>>>
>>>> For the rest of the series.
>>>>
>>>> Reviewed-by: Ira Weiny 
>>>> Tested-by: Ira Weiny 
>>>>-- Limited to mlx4, qib, and OPA (with additional patches.)
>>>>
>>>>
>>>> On Tue, May 05, 2015 at 02:50:17PM +0200, Michael Wang wrote:
>>>>> Since v7:
>>>>>   * Thanks to Doug, Ira, Devesh for the testing :-)
>>>>>   * Thanks for the comments from or, Doug, Ira, Jason :-)
>>>>> Please remind me if anything missed :-P
>>>>>   * Use rdma_cap_XX() instead of cap_XX() for readability
>>>>>   * Remove CC list in git log for maintainability
>>>>>   * Use bool as return value
>>>>>   * Updated github repository to v8
>>>>>
>>>>> There are plenty of lengthy code to check the transport type of IB device,
>>>>> or the link layer type of it's port, but actually we are just speculating
>>>>> whether a particular management/feature is supported by the device/port.
>>>>>
>>>>> Thus instead of inferring, we should have our own mechanism for IB 
>>>>> management
>>>>> capability/protocol/feature checking, several proposals below.
>>>>>
>>>>> This patch set will introduce query_protocol() to check management 
>>>>> requirement
>>>>> instead of inferring from transport and link layer respectively, along 
>>>>> with
>>>>> the new enum on protocol type.
>>>>>
>>>>> Mapping List:
>>>>>   node-type   link-layer  transport   protocol
>>>>> nes   RNICETH IWARP   IWARP
>>>>> amso1100  RNICETH IWARP   IWARP
>>>>> cxgb3 RNICETH IWARP   IWARP
>>>>> cxgb4 RNICETH IWARP   IWARP
>>>>> usnic USNIC_UDP   ETH USNIC_UDP   USNIC_UDP
>>>>> ocrdmaIB_CA   ETH IB  IBOE
>>>>> mlx4  IB_CA   IB/ETH  IB  IB/IBOE
>>>>> mlx5  IB_CA   IB  IB  IB
>>>>> ehca  IB_CA   IB  IB  IB
>>>>> ipath IB_CA   IB  IB  IB
>>>>> mthca IB_CA   IB  IB  IB
>>>>> qib   IB_CA   IB  IB  IB
>>>>>
>>>>> For example:
>>>>>   if (transport == IB) && (link-layer == ETH)
>>>>> will now become:
>>>>>   if (query_protocol() == IBOE)
>>>>>
>>>>> Thus we will be able to get rid of the respective transport and link-layer
>>>>> checking, and it will help us to add new protocol/Technology (like OPA) 
>>>>> more
>>>>> easier, also with the introduced management helpers, IB management logical
>>>>> will be more clear and easier for extending.
>>>>>
>>>>> Highlights:
>>>>> The long CC list in each patches was complained consider about the
>>>>> maintainability, it was suggested folks to provide their reviewed-by 
>>>>> or
>>>>> Acked-by instead, so for those who used

Re: [PATCH v8 00/23] IB/Verbs: IB Management Helpers

2015-05-12 Thread Michael Wang
On 05/12/2015 01:49 AM, ira.weiny wrote:
> I have run with this series and the only issue I have found is not with this
> patch set directly.
> 
> This patch:
> 
>>   IB/Verbs: Use management helper rdma_cap_ib_mad()
> 
> causes an error when you actually use the port passed from the ib_umad module.
> I have a patch to fix that which I found while trying to build on this series
> for the use of a bit mask.
> 
> Doug, I don't know what you would like to do for this fix.  I am submitting it
> shortly with a new version of the core capability bit patches.  If you want to
> just add it after this series or force Michael to respin with the fix?  
> Frankly
> I vote for the former because as it stands this series does not break 
> directly.
> It was only after I changed the implementation of rdma_cap_ib_mad that it
> broke.

Agree, it sounds more reasonable to include the fix in the series introduced
it :-P

> 
> 
> For the rest of the series.
> 
> Reviewed-by: Ira Weiny 
> Tested-by: Ira Weiny 
>   -- Limited to mlx4, qib, and OPA (with additional patches.)

Thanks for the review and testing :-)

Regards,
Michael Wang

> 
> 
> On Tue, May 05, 2015 at 02:50:17PM +0200, Michael Wang wrote:
>> Since v7:
>>   * Thanks to Doug, Ira, Devesh for the testing :-)
>>   * Thanks for the comments from or, Doug, Ira, Jason :-)
>> Please remind me if anything missed :-P
>>   * Use rdma_cap_XX() instead of cap_XX() for readability
>>   * Remove CC list in git log for maintainability
>>   * Use bool as return value
>>   * Updated github repository to v8
>>
>> There are plenty of lengthy code to check the transport type of IB device,
>> or the link layer type of it's port, but actually we are just speculating
>> whether a particular management/feature is supported by the device/port.
>>
>> Thus instead of inferring, we should have our own mechanism for IB management
>> capability/protocol/feature checking, several proposals below.
>>
>> This patch set will introduce query_protocol() to check management 
>> requirement
>> instead of inferring from transport and link layer respectively, along with
>> the new enum on protocol type.
>>
>> Mapping List:
>>  node-type   link-layer  transport   protocol
>> nes  RNICETH IWARP   IWARP
>> amso1100 RNICETH IWARP   IWARP
>> cxgb3RNICETH IWARP   IWARP
>> cxgb4RNICETH IWARP   IWARP
>> usnicUSNIC_UDP   ETH USNIC_UDP   USNIC_UDP
>> ocrdma   IB_CA   ETH IB  IBOE
>> mlx4 IB_CA   IB/ETH  IB  IB/IBOE
>> mlx5 IB_CA   IB  IB  IB
>> ehca IB_CA   IB  IB  IB
>> ipathIB_CA   IB  IB  IB
>> mthcaIB_CA   IB  IB  IB
>> qib  IB_CA   IB  IB  IB
>>
>> For example:
>>  if (transport == IB) && (link-layer == ETH)
>> will now become:
>>  if (query_protocol() == IBOE)
>>
>> Thus we will be able to get rid of the respective transport and link-layer
>> checking, and it will help us to add new protocol/Technology (like OPA) more
>> easier, also with the introduced management helpers, IB management logical
>> will be more clear and easier for extending.
>>
>> Highlights:
>> The long CC list in each patches was complained consider about the
>> maintainability, it was suggested folks to provide their reviewed-by or
>> Acked-by instead, so for those who used to be on the CC list, please
>> provide your signature voluntarily :-)
>>
>> The 'mgmt-helpers' branch of 'g...@github.com:ywang-pb/infiniband-wy.git'
>> contain this series based on the latest 'infiniband/for-next'
>>
>> Patch 1#~14# included all the logical reform, 15#~23# introduced the
>> management helpers.
>>
>> Doug suggested the bitmask mechanism:
>>  https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
>> which could be the plan for future reforming, we prefer that to be 
>> another
>> series which focus on semantic and performance.
>>
>> This patch-set is somewhat 'bloated' now and it may be a good timing for
>> staging, I'd like to suggest we focus on impr

Re: [PATCH v8 00/23] IB/Verbs: IB Management Helpers

2015-05-12 Thread Michael Wang


On 05/12/2015 02:27 AM, Doug Ledford wrote:
> On Mon, 2015-05-11 at 19:49 -0400, ira.weiny wrote:
>> I have run with this series and the only issue I have found is not with this
>> patch set directly.
>>
>> This patch:
>>
>>>   IB/Verbs: Use management helper rdma_cap_ib_mad()
>>
>> causes an error when you actually use the port passed from the ib_umad 
>> module.
>> I have a patch to fix that which I found while trying to build on this series
>> for the use of a bit mask.
>>
>> Doug, I don't know what you would like to do for this fix.  I am submitting 
>> it
>> shortly with a new version of the core capability bit patches.  If you want 
>> to
>> just add it after this series or force Michael to respin with the fix?
> 
> As I recall, there was a comment from Or requesting to squash some of
> the individual patches down, but I no longer have that email in my Inbox
> to double check.  And it seemed like there was one other review comment
> not yet addressed.  Do I have that right Michael?  And if so, are you
> working on a v9?

AFAIK Or was asking to merge the #15~23, and want to reserve the changelog
meanwhile reply the cover of prev version (I'm still confused on that...),
I've replied but get no respond yet.

I can make a v9 to merge the #15~#23 if that could benefit the maintainability,
please let me know your opinion :-)

About the Bug, if it was not introduced in this series, maybe including the
fix in next series would be better?

Regards,
Michael Wang

> 
>>   Frankly
>> I vote for the former because as it stands this series does not break 
>> directly.
>> It was only after I changed the implementation of rdma_cap_ib_mad that it
>> broke.
>>
>>
>> For the rest of the series.
>>
>> Reviewed-by: Ira Weiny 
>> Tested-by: Ira Weiny 
>>  -- Limited to mlx4, qib, and OPA (with additional patches.)
>>
>>
>> On Tue, May 05, 2015 at 02:50:17PM +0200, Michael Wang wrote:
>>> Since v7:
>>>   * Thanks to Doug, Ira, Devesh for the testing :-)
>>>   * Thanks for the comments from or, Doug, Ira, Jason :-)
>>> Please remind me if anything missed :-P
>>>   * Use rdma_cap_XX() instead of cap_XX() for readability
>>>   * Remove CC list in git log for maintainability
>>>   * Use bool as return value
>>>   * Updated github repository to v8
>>>
>>> There are plenty of lengthy code to check the transport type of IB device,
>>> or the link layer type of it's port, but actually we are just speculating
>>> whether a particular management/feature is supported by the device/port.
>>>
>>> Thus instead of inferring, we should have our own mechanism for IB 
>>> management
>>> capability/protocol/feature checking, several proposals below.
>>>
>>> This patch set will introduce query_protocol() to check management 
>>> requirement
>>> instead of inferring from transport and link layer respectively, along with
>>> the new enum on protocol type.
>>>
>>> Mapping List:
>>> node-type   link-layer  transport   protocol
>>> nes RNICETH IWARP   IWARP
>>> amso1100RNICETH IWARP   IWARP
>>> cxgb3   RNICETH IWARP   IWARP
>>> cxgb4   RNICETH IWARP   IWARP
>>> usnic   USNIC_UDP   ETH USNIC_UDP   USNIC_UDP
>>> ocrdma  IB_CA   ETH IB  IBOE
>>> mlx4IB_CA   IB/ETH  IB  IB/IBOE
>>> mlx5IB_CA   IB  IB  IB
>>> ehcaIB_CA   IB  IB  IB
>>> ipath   IB_CA   IB  IB  IB
>>> mthca   IB_CA   IB  IB  IB
>>> qib IB_CA   IB  IB  IB
>>>
>>> For example:
>>> if (transport == IB) && (link-layer == ETH)
>>> will now become:
>>> if (query_protocol() == IBOE)
>>>
>>> Thus we will be able to get rid of the respective transport and link-layer
>>> checking, and it will help us to add new protocol/Technology (like OPA) more
>>> easier, also with the introduced management helpers, IB management logical
>>> will be more clear and easier for extending.
>>>
>>> Highlights:
>>> The long CC list in each patches was complained cons

Re: [PATCH v8 00/23] IB/Verbs: IB Management Helpers

2015-05-05 Thread Michael Wang


On 05/05/2015 04:16 PM, Or Gerlitz wrote:
> On 5/5/2015 3:50 PM, Michael Wang wrote:
>> Since v7:
>>* Thanks to Doug, Ira, Devesh for the testing:-)
>>* Thanks for the comments from or, Doug, Ira, Jason:-)
>>  Please remind me if anything missed:-P
>>* Use rdma_cap_XX() instead of cap_XX() for readability
>>* Remove CC list in git log for maintainability
>>* Use bool as return value
>>* Updated github repository to v8
> 
> Didn't you see that patches 15~23 will be squashed into one?

I missed the conversation on that, you mentioned to remove the
CC list but don't want to merge the patches, correct?

> 
> 
> Also, when you post version N you need not only to list the changes since 
> version N-1 but rather also to keep the full changes since Vx for x=1...N-2, 
> reviewers needs not chase your previous cover letters and see if/whatwent 
> wrong, specifically with a sensitive series like this one.

That maybe too long and with some outdated info, I can reserve
the log since this version if it was preferred by folks.

> 
> So if/when there's V9, use that practice, and for the time being, reply on 
> the V8 cover letter with the full listing of changes

I'm using git send-mail which reply on the cover of each version,
not sure how to reply on a prev cover? and that won't start a new
thread for the new version, isn't it?

Regards,
Michael Wang

> 
> Or.
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 00/23] IB/Verbs: IB Management Helpers

2015-05-05 Thread Michael Wang
Since v7:
  * Thanks to Doug, Ira, Devesh for the testing :-)
  * Thanks for the comments from or, Doug, Ira, Jason :-)
Please remind me if anything missed :-P
  * Use rdma_cap_XX() instead of cap_XX() for readability
  * Remove CC list in git log for maintainability
  * Use bool as return value
  * Updated github repository to v8

There are plenty of lengthy code to check the transport type of IB device,
or the link layer type of it's port, but actually we are just speculating
whether a particular management/feature is supported by the device/port.

Thus instead of inferring, we should have our own mechanism for IB management
capability/protocol/feature checking, several proposals below.

This patch set will introduce query_protocol() to check management requirement
instead of inferring from transport and link layer respectively, along with
the new enum on protocol type.

Mapping List:
node-type   link-layer  transport   protocol
nes RNICETH IWARP   IWARP
amso1100RNICETH IWARP   IWARP
cxgb3   RNICETH IWARP   IWARP
cxgb4   RNICETH IWARP   IWARP
usnic   USNIC_UDP   ETH USNIC_UDP   USNIC_UDP
ocrdma  IB_CA   ETH IB  IBOE
mlx4IB_CA   IB/ETH  IB  IB/IBOE
mlx5IB_CA   IB  IB  IB
ehcaIB_CA   IB  IB  IB
ipath   IB_CA   IB  IB  IB
mthca   IB_CA   IB  IB  IB
qib IB_CA   IB  IB  IB

For example:
if (transport == IB) && (link-layer == ETH)
will now become:
if (query_protocol() == IBOE)

Thus we will be able to get rid of the respective transport and link-layer
checking, and it will help us to add new protocol/Technology (like OPA) more
easier, also with the introduced management helpers, IB management logical
will be more clear and easier for extending.

Highlights:
The long CC list in each patches was complained consider about the
maintainability, it was suggested folks to provide their reviewed-by or
Acked-by instead, so for those who used to be on the CC list, please
provide your signature voluntarily :-)

The 'mgmt-helpers' branch of 'g...@github.com:ywang-pb/infiniband-wy.git'
contain this series based on the latest 'infiniband/for-next'

Patch 1#~14# included all the logical reform, 15#~23# introduced the
management helpers.

Doug suggested the bitmask mechanism:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
which could be the plan for future reforming, we prefer that to be another
series which focus on semantic and performance.

This patch-set is somewhat 'bloated' now and it may be a good timing for
staging, I'd like to suggest we focus on improving existed helpers and push
all the further reforms into next series ;-)

Proposals:
Sean:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23339.html
Doug:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23418.html
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
Jason:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23425.html

Michael Wang (23):
  IB/Verbs: Implement new callback query_protocol()
  IB/Verbs: Implement raw management helpers
  IB/Verbs: Reform IB-core mad/agent/user_mad
  IB/Verbs: Reform IB-core cm
  IB/Verbs: Reform IB-core sa_query
  IB/Verbs: Reform IB-core multicast
  IB/Verbs: Reform IB-ulp ipoib
  IB/Verbs: Reform IB-ulp xprtrdma
  IB/Verbs: Reform IB-core verbs
  IB/Verbs: Reform cm related part in IB-core cma/ucm
  IB/Verbs: Reform route related part in IB-core cma
  IB/Verbs: Reform mcast related part in IB-core cma
  IB/Verbs: Reform cma_acquire_dev()
  IB/Verbs: Reform rest part in IB-core cma
  IB/Verbs: Use management helper rdma_cap_ib_mad()
  IB/Verbs: Use management helper rdma_cap_ib_smi()
  IB/Verbs: Use management helper rdma_cap_ib_cm()
  IB/Verbs: Use management helper rdma_cap_iw_cm()
  IB/Verbs: Use management helper rdma_cap_ib_sa()
  IB/Verbs: Use management helper rdma_cap_ib_mcast()
  IB/Verbs: Use management helper rdma_cap_read_multi_sge()
  IB/Verbs: Use management helper rdma_cap_af_ib()
  IB/Verbs: Use management helper rdma_cap_eth_ah()

 drivers/infiniband/core/agent.c  |   2 +-
 drivers/infiniband/core/cm.c |  20 ++-
 drivers/infiniband/core/cma.c| 257 +++
 drivers/infiniband/core/device.c |   1 +
 drivers/infiniband/core/mad.c|  43 +++--
 drivers/infiniband/c

[PATCH v8 02/23] IB/Verbs: Implement raw management helpers

2015-05-05 Thread Michael Wang
Add raw helpers:
rdma_protocol_ib
rdma_protocol_iboe
rdma_protocol_iwarp
rdma_ib_or_iboe (transition, clean up later)
To help us detect which technology the port supported.

Signed-off-by: Michael Wang 
---
 include/rdma/ib_verbs.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 080f204..e6dd984 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1752,6 +1752,28 @@ int ib_query_port(struct ib_device *device,
 enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
   u8 port_num);
 
+static inline bool rdma_protocol_ib(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IB;
+}
+
+static inline bool rdma_protocol_iboe(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IBOE;
+}
+
+static inline bool rdma_protocol_iwarp(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IWARP;
+}
+
+static inline bool rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
+{
+   enum rdma_protocol_type pt = device->query_protocol(device, port_num);
+
+   return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 11/23] IB/Verbs: Reform route related part in IB-core cma

2015-05-05 Thread Michael Wang
Use raw management helpers to reform route related part in IB-core cma.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c  | 31 ---
 drivers/infiniband/core/ucma.c | 25 ++---
 2 files changed, 14 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8a07e89..36c5f8a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -923,13 +923,9 @@ static inline int cma_user_data_offset(struct 
rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-   switch (rdma_port_get_link_layer(id_priv->id.device, 
id_priv->id.port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
+   if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
-   break;
-   default:
-   break;
}
 }
 
@@ -1957,26 +1953,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int 
timeout_ms)
return -EINVAL;
 
atomic_inc(&id_priv->refcount);
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ret = cma_resolve_ib_route(id_priv, timeout_ms);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   ret = cma_resolve_iboe_route(id_priv);
-   break;
-   default:
-   ret = -ENOSYS;
-   }
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   if (rdma_protocol_ib(id->device, id->port_num))
+   ret = cma_resolve_ib_route(id_priv, timeout_ms);
+   else if (rdma_protocol_iboe(id->device, id->port_num))
+   ret = cma_resolve_iboe_route(id_priv);
+   else if (rdma_protocol_iwarp(id->device, id->port_num))
ret = cma_resolve_iw_route(id_priv, timeout_ms);
-   break;
-   default:
+   else
ret = -ENOSYS;
-   break;
-   }
+
if (ret)
goto err;
 
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 45d67e9..dae7620 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,
 
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
-   switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(ctx->cm_id->device,
-   ctx->cm_id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ucma_copy_ib_route(&resp, &ctx->cm_id->route);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
-   break;
-   default:
-   break;
-   }
-   break;
-   case RDMA_TRANSPORT_IWARP:
+
+   if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+   ucma_copy_ib_route(&resp, &ctx->cm_id->route);
+   else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
+   ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
+   else if (rdma_protocol_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iw_route(&resp, &ctx->cm_id->route);
-   break;
-   default:
-   break;
-   }
 
 out:
if (copy_to_user((void __user *)(unsigned long)cmd.response,
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 03/23] IB/Verbs: Reform IB-core mad/agent/user_mad

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-core mad/agent/user_mad.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/agent.c|  2 +-
 drivers/infiniband/core/mad.c  | 43 +++---
 drivers/infiniband/core/user_mad.c | 26 ---
 3 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..89d4fbc 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int 
port_num)
goto error1;
}
 
-   if (rdma_port_get_link_layer(device, port_num) == 
IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..507eb67 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
cq_size = mad_sendq_size + mad_recvq_size;
-   has_smi = rdma_port_get_link_layer(device, port_num) == 
IB_LINK_LAYER_INFINIBAND;
+   has_smi = rdma_protocol_ib(device, port_num);
if (has_smi)
cq_size *= 2;
 
@@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
 {
int start, end, i;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
if (device->node_type == RDMA_NODE_IB_SWITCH) {
start = 0;
end   = 0;
@@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
+
if (ib_mad_port_open(device, i)) {
dev_err(&device->dev, "Couldn't open port %d\n", i);
goto error;
@@ -3086,40 +3086,39 @@ error_agent:
dev_err(&device->dev, "Couldn't close port %d\n", i);
 
 error:
-   i--;
+   while (--i >= start) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
 
-   while (i >= start) {
if (ib_agent_port_close(device, i))
dev_err(&device->dev,
"Couldn't close port %d for agents\n", i);
if (ib_mad_port_close(device, i))
dev_err(&device->dev, "Couldn't close port %d\n", i);
-   i--;
}
 }
 
 static void ib_mad_remove_device(struct ib_device *device)
 {
-   int i, num_ports, cur_port;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int start, end, i;
 
if (device->node_type == RDMA_NODE_IB_SWITCH) {
-   num_ports = 1;
-   cur_port = 0;
+   start = 0;
+   end   = 0;
} else {
-   num_ports = device->phys_port_cnt;
-   cur_port = 1;
+   start = 1;
+   end   = device->phys_port_cnt;
}
-   for (i = 0; i < num_ports; i++, cur_port++) {
-   if (ib_agent_port_close(device, cur_port))
+
+   for (i = start; i <= end; i++) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
+
+   if (ib_agent_port_close(device, i))
dev_err(&device->dev,
-   "Couldn't close port %d for agents\n",
-   cur_port);
-   if (ib_mad_port_close(device, cur_port))
-   dev_err(&device->dev, "Couldn't close port %d\n",
-   cur_port);
+   "Couldn't close port %d for agents\n", i);
+   if (ib_mad_port_close(device, i))
+   dev_err(&device->dev, "Couldn't close port %d\n", i);
}
 }
 
diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index 928cdd2..aa8b334 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
 {
struct ib_umad_device *umad_dev;
int s, e, i;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
 

[PATCH v8 05/23] IB/Verbs: Reform IB-core sa_query

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-core sa_query.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/sa_query.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index c38f030..b115c28 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, 
struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - 
sa_dev->start_port];
 
-   if (rdma_port_get_link_layer(handler->device, port->port_num) 
!= IB_LINK_LAYER_INFINIBAND)
+   if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
return;
 
spin_lock_irqsave(&port->ah_lock, flags);
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 
port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;
 
-   force_grh = rdma_port_get_link_layer(device, port_num) == 
IB_LINK_LAYER_ETHERNET;
+   force_grh = rdma_protocol_iboe(device, port_num);
 
if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
@@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
 {
struct ib_sa_device *sa_dev;
int s, e, i;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
-   if (rdma_port_get_link_layer(device, i + 1) != 
IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, i + 1))
continue;
 
sa_dev->port[i].sm_ah= NULL;
@@ -1189,8 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;
 
INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
+
+   count++;
}
 
+   if (!count)
+   goto free;
+
ib_set_client_data(device, &sa_client, sa_dev);
 
/*
@@ -1204,19 +1207,20 @@ static void ib_sa_add_one(struct ib_device *device)
if (ib_register_event_handler(&sa_dev->event_handler))
goto err;
 
-   for (i = 0; i <= e - s; ++i)
-   if (rdma_port_get_link_layer(device, i + 1) == 
IB_LINK_LAYER_INFINIBAND)
+   for (i = 0; i <= e - s; ++i) {
+   if (rdma_protocol_ib(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
+   }
 
return;
 
 err:
-   while (--i >= 0)
-   if (rdma_port_get_link_layer(device, i + 1) == 
IB_LINK_LAYER_INFINIBAND)
+   while (--i >= 0) {
+   if (rdma_protocol_ib(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
-
+   }
+free:
kfree(sa_dev);
-
return;
 }
 
@@ -1233,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);
 
for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-   if (rdma_port_get_link_layer(device, i + 1) == 
IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, 
free_sm_ah);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 10/23] IB/Verbs: Reform cm related part in IB-core cma/ucm

2015-05-05 Thread Michael Wang
Use raw management helpers to reform cm related part in IB-core cma/ucm.

Few checks focus on the device cm type rather than the port capability,
directly pass port 1 works currently, but can't support mixing cm type
device in future.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 81 +--
 drivers/infiniband/core/ucm.c |  3 +-
 2 files changed, 26 insertions(+), 58 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..8a07e89 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -735,8 +735,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
int ret = 0;
 
id_priv = container_of(id, struct rdma_id_private, id);
-   switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
-   case RDMA_TRANSPORT_IB:
+   if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, 
qp_attr_mask);
else
@@ -745,19 +744,15 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
 
if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
} else
ret = iw_cm_init_qp_attr(id_priv->cm_id.iw, qp_attr,
 qp_attr_mask);
-   break;
-   default:
+   } else
ret = -ENOSYS;
-   break;
-   }
 
return ret;
 }
@@ -1037,17 +1032,12 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);
 
if (id_priv->cma_dev) {
-   switch (rdma_node_get_transport(id_priv->id.device->node_type)) 
{
-   case RDMA_TRANSPORT_IB:
+   if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   } else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
-   break;
-   default:
-   break;
}
cma_leave_mc_groups(id_priv);
cma_release_dev(id_priv);
@@ -1626,7 +1616,7 @@ static void cma_listen_on_dev(struct rdma_id_private 
*id_priv,
int ret;
 
if (cma_family(id_priv) == AF_IB &&
-   rdma_node_get_transport(cma_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
+   !rdma_ib_or_iboe(cma_dev->device, 1))
return;
 
id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2028,7 +2018,7 @@ static int cma_bind_loopback(struct rdma_id_private 
*id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
-   rdma_node_get_transport(cur_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
+   !rdma_ib_or_iboe(cur_dev->device, 1))
continue;
 
if (!cma_dev)
@@ -2060,7 +2050,7 @@ port_found:
goto out;
 
id_priv->id.route.addr.dev_addr.dev_type =
-   (rdma_port_get_link_layer(cma_dev->device, p) == 
IB_LINK_LAYER_INFINIBAND) ?
+   (rdma_protocol_ib(cma_dev->device, p)) ?
ARPHRD_INFINIBAND : ARPHRD_ETHER;
 
rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
@@ -2537,18 +2527,15 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
 
id_priv->backlog = backlog;
if (id->device) {
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
+   if (rdma_ib_or_iboe(id->device, 1)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   } else if (rdma_protocol_iwarp(id->device, 1)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
-   break;
-   

[PATCH v8 09/23] IB/Verbs: Reform IB-core verbs

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-core verbs

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/verbs.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f93eb8d..7dd2f51 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -198,11 +198,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 
port_num, struct ib_wc *wc,
u32 flow_class;
u16 gid_index;
int ret;
-   int is_eth = (rdma_port_get_link_layer(device, port_num) ==
-   IB_LINK_LAYER_ETHERNET);
 
memset(ah_attr, 0, sizeof *ah_attr);
-   if (is_eth) {
+   if (rdma_protocol_iboe(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;
 
@@ -871,7 +869,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
union ib_gid  sgid;
 
if ((*qp_attr_mask & IB_QP_AV)  &&
-   (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == 
IB_LINK_LAYER_ETHERNET)) {
+   (rdma_protocol_iboe(qp->device, qp_attr->ah_attr.port_num))) {
ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
   qp_attr->ah_attr.grh.sgid_index, &sgid);
if (ret)
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 08/23] IB/Verbs: Reform IB-ulp xprtrdma

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-ulp xprtrdma.

Signed-off-by: Michael Wang 
---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |  4 +--
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +---
 2 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f9f13a3..2cc625d 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-   if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
-RDMA_TRANSPORT_IWARP)
+   if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+   xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c 
b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..3df8320 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
struct ib_qp_init_attr qp_attr;
struct ib_device_attr devattr;
int uninitialized_var(dma_mr_acc);
-   int need_dma_mr;
+   int need_dma_mr = 0;
int ret;
int i;
 
@@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
/*
 * Determine if a DMA MR is required and if so, what privs are required
 */
-   switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
-   case RDMA_TRANSPORT_IWARP:
-   newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
-   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-   need_dma_mr = 1;
-   dma_mr_acc =
-   (IB_ACCESS_LOCAL_WRITE |
-IB_ACCESS_REMOTE_WRITE);
-   } else if (!(devattr.device_cap_flags & 
IB_DEVICE_LOCAL_DMA_LKEY)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else
-   need_dma_mr = 0;
-   break;
-   case RDMA_TRANSPORT_IB:
-   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else if (!(devattr.device_cap_flags &
-IB_DEVICE_LOCAL_DMA_LKEY)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else
-   need_dma_mr = 0;
-   break;
-   default:
+   if (!rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+newxprt->sc_cm_id->port_num) &&
+   !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
+newxprt->sc_cm_id->port_num))
goto errout;
+
+   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
+   !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
+   need_dma_mr = 1;
+   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
+   if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+   newxprt->sc_cm_id->port_num) &&
+   !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
+   dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
}
 
+   if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+   newxprt->sc_cm_id->port_num))
+   newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
+
/* Create the DMA MR if needed, otherwise, use the DMA LKEY */
if (need_dma_mr) {
/* Register all of physical memory */
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 04/23] IB/Verbs: Reform IB-core cm

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-core cm.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cm.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..add5e484 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3760,11 +3760,9 @@ static void cm_add_one(struct ib_device *ib_device)
};
unsigned long flags;
int ret;
+   int count = 0;
u8 i;
 
-   if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
 ib_device->phys_port_cnt, GFP_KERNEL);
if (!cm_dev)
@@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)
 
set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+   if (!rdma_ib_or_iboe(ib_device, i))
+   continue;
+
port = kzalloc(sizeof *port, GFP_KERNEL);
if (!port)
goto error1;
@@ -3809,7 +3810,13 @@ static void cm_add_one(struct ib_device *ib_device)
ret = ib_modify_port(ib_device, i, 0, &port_modify);
if (ret)
goto error3;
+
+   count++;
}
+
+   if (!count)
+   goto free;
+
ib_set_client_data(ib_device, &cm_client, cm_dev);
 
write_lock_irqsave(&cm.device_lock, flags);
@@ -3825,11 +3832,15 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
+   if (!rdma_ib_or_iboe(ib_device, i))
+   continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
cm_remove_port_fs(port);
}
+free:
device_unregister(cm_dev->device);
kfree(cm_dev);
 }
@@ -3853,6 +3864,9 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);
 
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+   if (!rdma_ib_or_iboe(ib_device, i))
+   continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 12/23] IB/Verbs: Reform mcast related part in IB-core cma

2015-05-05 Thread Michael Wang
Use raw management helpers to reform mcast related part in IB-core cma.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 56 ++-
 1 file changed, 18 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 36c5f8a..34ec13f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -997,17 +997,12 @@ static void cma_leave_mc_groups(struct rdma_id_private 
*id_priv)
mc = container_of(id_priv->mc_list.next,
  struct cma_multicast, list);
list_del(&mc->list);
-   switch (rdma_port_get_link_layer(id_priv->cma_dev->device, 
id_priv->id.port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
+   if (rdma_protocol_ib(id_priv->cma_dev->device,
+ id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
+   } else
kref_put(&mc->mcref, release_mc);
-   break;
-   default:
-   break;
-   }
}
 }
 
@@ -3314,24 +3309,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr,
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);
 
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ret = cma_join_ib_multicast(id_priv, mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   kref_init(&mc->mcref);
-   ret = cma_iboe_join_multicast(id_priv, mc);
-   break;
-   default:
-   ret = -EINVAL;
-   }
-   break;
-   default:
+   if (rdma_protocol_iboe(id->device, id->port_num)) {
+   kref_init(&mc->mcref);
+   ret = cma_iboe_join_multicast(id_priv, mc);
+   } else if (rdma_protocol_ib(id->device, id->port_num))
+   ret = cma_join_ib_multicast(id_priv, mc);
+   else
ret = -ENOSYS;
-   break;
-   }
 
if (ret) {
spin_lock_irq(&id_priv->lock);
@@ -3359,19 +3343,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr)
ib_detach_mcast(id->qp,
&mc->multicast.ib->rec.mgid,

be16_to_cpu(mc->multicast.ib->rec.mlid));
-   if 
(rdma_node_get_transport(id_priv->cma_dev->device->node_type) == 
RDMA_TRANSPORT_IB) {
-   switch (rdma_port_get_link_layer(id->device, 
id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ib_sa_free_multicast(mc->multicast.ib);
-   kfree(mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   kref_put(&mc->mcref, release_mc);
-   break;
-   default:
-   break;
-   }
-   }
+
+   BUG_ON(id_priv->cma_dev->device != id->device);
+
+   if (rdma_protocol_ib(id->device, id->port_num)) {
+   ib_sa_free_multicast(mc->multicast.ib);
+   kfree(mc);
+   } else if (rdma_protocol_iboe(id->device, id->port_num))
+   kref_put(&mc->mcref, release_mc);
+
return;
}
}
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 13/23] IB/Verbs: Reform cma_acquire_dev()

2015-05-05 Thread Michael Wang
Reform cma_acquire_dev() with management helpers, introduce
cma_validate_port() to make the code more clean.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 68 +--
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 34ec13f..3fb3458 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -349,18 +349,35 @@ static int cma_translate_addr(struct sockaddr *addr, 
struct rdma_dev_addr *dev_a
return ret;
 }
 
+static inline int cma_validate_port(struct ib_device *device, u8 port,
+ union ib_gid *gid, int dev_type)
+{
+   u8 found_port;
+   int ret = -ENODEV;
+
+   if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
+   return ret;
+
+   if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
+   return ret;
+
+   ret = ib_find_cached_gid(device, gid, &found_port, NULL);
+   if (port != found_port)
+   return -ENODEV;
+
+   return ret;
+}
+
 static int cma_acquire_dev(struct rdma_id_private *id_priv,
   struct rdma_id_private *listen_id_priv)
 {
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct cma_device *cma_dev;
-   union ib_gid gid, iboe_gid;
+   union ib_gid gid, iboe_gid, *gidp;
int ret = -ENODEV;
-   u8 port, found_port;
-   enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
-   IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
+   u8 port;
 
-   if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
+   if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
id_priv->id.ps == RDMA_PS_IPOIB)
return -EINVAL;
 
@@ -370,41 +387,36 @@ static int cma_acquire_dev(struct rdma_id_private 
*id_priv,
 
memcpy(&gid, dev_addr->src_dev_addr +
   rdma_addr_gid_offset(dev_addr), sizeof gid);
-   if (listen_id_priv &&
-   rdma_port_get_link_layer(listen_id_priv->id.device,
-listen_id_priv->id.port_num) == dev_ll) {
+
+   if (listen_id_priv) {
cma_dev = listen_id_priv->cma_dev;
port = listen_id_priv->id.port_num;
-   if (rdma_node_get_transport(cma_dev->device->node_type) == 
RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(cma_dev->device, port) == 
IB_LINK_LAYER_ETHERNET)
-   ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
-&found_port, NULL);
-   else
-   ret = ib_find_cached_gid(cma_dev->device, &gid,
-&found_port, NULL);
+   gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+  &iboe_gid : &gid;
 
-   if (!ret && (port  == found_port)) {
-   id_priv->id.port_num = found_port;
+   ret = cma_validate_port(cma_dev->device, port, gidp,
+   dev_addr->dev_type);
+   if (!ret) {
+   id_priv->id.port_num = port;
goto out;
}
}
+
list_for_each_entry(cma_dev, &dev_list, list) {
for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
if (listen_id_priv &&
listen_id_priv->cma_dev == cma_dev &&
listen_id_priv->id.port_num == port)
continue;
-   if (rdma_port_get_link_layer(cma_dev->device, port) == 
dev_ll) {
-   if 
(rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(cma_dev->device, 
port) == IB_LINK_LAYER_ETHERNET)
-   ret = 
ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
-   else
-   ret = 
ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
-
-   if (!ret && (port == found_port)) {
-   id_priv->id.port_num = found_port;
-   goto out;
-   }
+
+   gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+  &iboe_gid : &gid;
+
+   ret = cma_v

[PATCH v8 15/23] IB/Verbs: Use management helper rdma_cap_ib_mad()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_ib_mad() to help us check if the port of an
IB device support Infiniband Management Datagrams.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/mad.c  |  6 +++---
 drivers/infiniband/core/user_mad.c |  6 +++---
 include/rdma/ib_verbs.h| 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 507eb67..80777cd 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!rdma_cap_ib_mad(device, i))
continue;
 
if (ib_mad_port_open(device, i)) {
@@ -3087,7 +3087,7 @@ error_agent:
 
 error:
while (--i >= start) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!rdma_cap_ib_mad(device, i))
continue;
 
if (ib_agent_port_close(device, i))
@@ -3111,7 +3111,7 @@ static void ib_mad_remove_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!rdma_cap_ib_mad(device, i))
continue;
 
if (ib_agent_port_close(device, i))
diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index aa8b334..d451717 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port   = e;
 
for (i = s; i <= e; ++i) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!rdma_cap_ib_mad(device, i))
continue;
 
umad_dev->port[i - s].umad_dev = umad_dev;
@@ -1315,7 +1315,7 @@ static void ib_umad_add_one(struct ib_device *device)
 
 err:
while (--i >= s) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!rdma_cap_ib_mad(device, i))
continue;
 
ib_umad_kill_port(&umad_dev->port[i - s]);
@@ -1333,7 +1333,7 @@ static void ib_umad_remove_one(struct ib_device *device)
return;
 
for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
-   if (rdma_ib_or_iboe(device, i))
+   if (rdma_cap_ib_mad(device, i))
ib_umad_kill_port(&umad_dev->port[i]);
}
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e6dd984..23ba66e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1774,6 +1774,21 @@ static inline bool rdma_ib_or_iboe(struct ib_device 
*device, u8 port_num)
return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
 }
 
+/**
+ * rdma_cap_ib_mad - Check if the port of device has the capability Infiniband
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support Infiniband
+ * Management Datagrams.
+ */
+static inline bool rdma_cap_ib_mad(struct ib_device *device, u8 port_num)
+{
+   return rdma_ib_or_iboe(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 14/23] IB/Verbs: Reform rest part in IB-core cma

2015-05-05 Thread Michael Wang
Use raw management helpers to reform rest part in IB-core cma.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 3fb3458..d43f492f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -447,10 +447,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private 
*id_priv)
pkey = ntohs(addr->sib_pkey);
 
list_for_each_entry(cur_dev, &dev_list, list) {
-   if (rdma_node_get_transport(cur_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
-   continue;
-
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+   if (!rdma_ib_or_iboe(cur_dev->device, p))
+   continue;
+
if (ib_find_cached_pkey(cur_dev->device, p, pkey, 
&index))
continue;
 
@@ -645,10 +645,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private 
*id_priv,
if (ret)
goto out;
 
-   if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
-   == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
-   == IB_LINK_LAYER_ETHERNET) {
+   BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
+
+   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num)) {
ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);
 
if (ret)
@@ -712,11 +711,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private 
*id_priv,
int ret;
u16 pkey;
 
-   if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) 
==
-   IB_LINK_LAYER_INFINIBAND)
-   pkey = ib_addr_get_pkey(dev_addr);
-   else
+   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num))
pkey = 0x;
+   else
+   pkey = ib_addr_get_pkey(dev_addr);
 
ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
  pkey, &qp_attr->pkey_index);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 22/23] IB/Verbs: Use management helper rdma_cap_af_ib()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_af_ib() to help us check if the port of an
IB device support Native Infiniband Address.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c |  2 +-
 include/rdma/ib_verbs.h   | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 101e9cc..a6cbf42 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -448,7 +448,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private 
*id_priv)
 
list_for_each_entry(cur_dev, &dev_list, list) {
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
-   if (!rdma_ib_or_iboe(cur_dev->device, p))
+   if (!rdma_cap_af_ib(cur_dev->device, p))
continue;
 
if (ib_find_cached_pkey(cur_dev->device, p, pkey, 
&index))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index ac1e7f1..41f8445 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1865,6 +1865,21 @@ static inline bool rdma_cap_ib_mcast(struct ib_device 
*device, u8 port_num)
 }
 
 /**
+ * rdma_cap_af_ib - Check if the port of device has the capability
+ * Native Infiniband Address.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support
+ * Native Infiniband Address.
+ */
+static inline bool rdma_cap_af_ib(struct ib_device *device, u8 port_num)
+{
+   return rdma_ib_or_iboe(device, port_num);
+}
+
+/**
  * rdma_cap_read_multi_sge - Check if the port of device has the capability
  * RDMA Read Multiple Scatter-Gather Entries.
  *
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 21/23] IB/Verbs: Use management helper rdma_cap_read_multi_sge()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_read_multi_sge() to help us check if the port of an
IB device support RDMA Read Multiple Scatter-Gather Entries.

Signed-off-by: Michael Wang 
---
 include/rdma/ib_verbs.h | 16 
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |  4 ++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6bbbc86..2cf23b1 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1864,6 +1864,22 @@ static inline bool rdma_cap_ib_mcast(struct ib_device 
*device, u8 port_num)
return rdma_cap_ib_sa(device, port_num);
 }
 
+/**
+ * rdma_cap_read_multi_sge - Check if the port of device has the capability
+ * RDMA Read Multiple Scatter-Gather Entries.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support
+ * RDMA Read Multiple Scatter-Gather Entries.
+ */
+static inline bool rdma_cap_read_multi_sge(struct ib_device *device,
+  u8 port_num)
+{
+   return !rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 2cc625d..86b4416 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-   if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
-   xprt->sc_cm_id->port_num))
+   if (!rdma_cap_read_multi_sge(xprt->sc_cm_id->device,
+xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 23/23] IB/Verbs: Use management helper rdma_cap_eth_ah()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_eth_ah() to help us check if the port of an
IB device support Ethernet Address Handler.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c  |  2 +-
 drivers/infiniband/core/sa_query.c |  2 +-
 drivers/infiniband/core/verbs.c|  4 ++--
 include/rdma/ib_verbs.h| 15 +++
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a6cbf42..0a3e859 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -711,7 +711,7 @@ static int cma_ib_init_qp_attr(struct rdma_id_private 
*id_priv,
int ret;
u16 pkey;
 
-   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num))
+   if (rdma_cap_eth_ah(id_priv->id.device, id_priv->id.port_num))
pkey = 0x;
else
pkey = ib_addr_get_pkey(dev_addr);
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index 30aa5e5..7f7c8c9 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 
port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;
 
-   force_grh = rdma_protocol_iboe(device, port_num);
+   force_grh = rdma_cap_eth_ah(device, port_num);
 
if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 7dd2f51..d110a5e 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -200,7 +200,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 
port_num, struct ib_wc *wc,
int ret;
 
memset(ah_attr, 0, sizeof *ah_attr);
-   if (rdma_protocol_iboe(device, port_num)) {
+   if (rdma_cap_eth_ah(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;
 
@@ -869,7 +869,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
union ib_gid  sgid;
 
if ((*qp_attr_mask & IB_QP_AV)  &&
-   (rdma_protocol_iboe(qp->device, qp_attr->ah_attr.port_num))) {
+   (rdma_cap_eth_ah(qp->device, qp_attr->ah_attr.port_num))) {
ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
   qp_attr->ah_attr.grh.sgid_index, &sgid);
if (ret)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 41f8445..721c378 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1880,6 +1880,21 @@ static inline bool rdma_cap_af_ib(struct ib_device 
*device, u8 port_num)
 }
 
 /**
+ * rdma_cap_eth_ah - Check if the port of device has the capability
+ * Ethernet Address Handler.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support
+ * Ethernet Address Handler.
+ */
+static inline bool rdma_cap_eth_ah(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_iboe(device, port_num);
+}
+
+/**
  * rdma_cap_read_multi_sge - Check if the port of device has the capability
  * RDMA Read Multiple Scatter-Gather Entries.
  *
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 20/23] IB/Verbs: Use management helper rdma_cap_ib_mcast()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_ib_mcast() to help us check if the port of an
IB device support Infiniband Multicast.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c   |  6 +++---
 drivers/infiniband/core/multicast.c |  6 +++---
 include/rdma/ib_verbs.h | 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8def2f5..101e9cc 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1007,7 +1007,7 @@ static void cma_leave_mc_groups(struct rdma_id_private 
*id_priv)
mc = container_of(id_priv->mc_list.next,
  struct cma_multicast, list);
list_del(&mc->list);
-   if (rdma_protocol_ib(id_priv->cma_dev->device,
+   if (rdma_cap_ib_mcast(id_priv->cma_dev->device,
  id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
@@ -3321,7 +3321,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr,
if (rdma_protocol_iboe(id->device, id->port_num)) {
kref_init(&mc->mcref);
ret = cma_iboe_join_multicast(id_priv, mc);
-   } else if (rdma_protocol_ib(id->device, id->port_num))
+   } else if (rdma_cap_ib_mcast(id->device, id->port_num))
ret = cma_join_ib_multicast(id_priv, mc);
else
ret = -ENOSYS;
@@ -3355,7 +3355,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr)
 
BUG_ON(id_priv->cma_dev->device != id->device);
 
-   if (rdma_protocol_ib(id->device, id->port_num)) {
+   if (rdma_cap_ib_mcast(id->device, id->port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else if (rdma_protocol_iboe(id->device, id->port_num))
diff --git a/drivers/infiniband/core/multicast.c 
b/drivers/infiniband/core/multicast.c
index b57ed03..605f20a 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler 
*handler,
int index;
 
dev = container_of(handler, struct mcast_device, event_handler);
-   if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
+   if (WARN_ON(!rdma_cap_ib_mcast(dev->device, event->element.port_num)))
return;
 
index = event->element.port_num - dev->start_port;
@@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (!rdma_protocol_ib(device, dev->start_port + i))
+   if (!rdma_cap_ib_mcast(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_protocol_ib(device, dev->start_port + i)) {
+   if (rdma_cap_ib_mcast(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c3a561e8..6bbbc86 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1849,6 +1849,21 @@ static inline bool rdma_cap_ib_sa(struct ib_device 
*device, u8 port_num)
return rdma_protocol_ib(device, port_num);
 }
 
+/**
+ * rdma_cap_ib_mcast - Check if the port of device has the capability 
Infiniband
+ * Multicast.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support Infiniband
+ * Multicast.
+ */
+static inline bool rdma_cap_ib_mcast(struct ib_device *device, u8 port_num)
+{
+   return rdma_cap_ib_sa(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 19/23] IB/Verbs: Use management helper rdma_cap_ib_sa()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_ib_sa() to help us check if the port of an
IB device support Infiniband Subnet Administration.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c  |  4 ++--
 drivers/infiniband/core/sa_query.c | 10 +-
 drivers/infiniband/core/ucma.c |  2 +-
 include/rdma/ib_verbs.h| 15 +++
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0787035..8def2f5 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -933,7 +933,7 @@ static inline int cma_user_data_offset(struct 
rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-   if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
+   if (rdma_cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
}
@@ -1957,7 +1957,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int 
timeout_ms)
return -EINVAL;
 
atomic_inc(&id_priv->refcount);
-   if (rdma_protocol_ib(id->device, id->port_num))
+   if (rdma_cap_ib_sa(id->device, id->port_num))
ret = cma_resolve_ib_route(id_priv, timeout_ms);
else if (rdma_protocol_iboe(id->device, id->port_num))
ret = cma_resolve_iboe_route(id_priv);
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index b115c28..30aa5e5 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, 
struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - 
sa_dev->start_port];
 
-   if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
+   if (WARN_ON(!rdma_cap_ib_sa(handler->device, port->port_num)))
return;
 
spin_lock_irqsave(&port->ah_lock, flags);
@@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
-   if (!rdma_protocol_ib(device, i + 1))
+   if (!rdma_cap_ib_sa(device, i + 1))
continue;
 
sa_dev->port[i].sm_ah= NULL;
@@ -1208,7 +1208,7 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;
 
for (i = 0; i <= e - s; ++i) {
-   if (rdma_protocol_ib(device, i + 1))
+   if (rdma_cap_ib_sa(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
}
 
@@ -1216,7 +1216,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
 err:
while (--i >= 0) {
-   if (rdma_protocol_ib(device, i + 1))
+   if (rdma_cap_ib_sa(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
}
 free:
@@ -1237,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);
 
for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-   if (rdma_protocol_ib(device, i + 1)) {
+   if (rdma_cap_ib_sa(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, 
free_sm_ah);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index dae7620..d42b816 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
 
-   if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+   if (rdma_cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_ib_route(&resp, &ctx->cm_id->route);
else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cc92a64..c3a561e8 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1834,6 +1834,21 @@ static inline bool rdma_cap_iw_cm(struct ib_device 
*device, u8 port_num)
return rdma_protocol_iwarp(device, port_num);
 }
 
+/**
+ * rdma_cap_ib_sa - Check if the port of device has the capability Infiniband
+ * Subnet Admin

[PATCH v8 16/23] IB/Verbs: Use management helper rdma_cap_ib_smi()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_ib_smi() to help us check if the port of an
IB device support Infiniband Subnet Management Interface.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/agent.c |  2 +-
 drivers/infiniband/core/mad.c   |  2 +-
 include/rdma/ib_verbs.h | 15 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index 89d4fbc..a6fc4d6 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int 
port_num)
goto error1;
}
 
-   if (rdma_protocol_ib(device, port_num)) {
+   if (rdma_cap_ib_smi(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 80777cd..e9699c9 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
cq_size = mad_sendq_size + mad_recvq_size;
-   has_smi = rdma_protocol_ib(device, port_num);
+   has_smi = rdma_cap_ib_smi(device, port_num);
if (has_smi)
cq_size *= 2;
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 23ba66e..e983e33 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1789,6 +1789,21 @@ static inline bool rdma_cap_ib_mad(struct ib_device 
*device, u8 port_num)
return rdma_ib_or_iboe(device, port_num);
 }
 
+/**
+ * rdma_cap_ib_smi - Check if the port of device has the capability Infiniband
+ * Subnet Management Interface.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support Infiniband
+ * Subnet Management Interface.
+ */
+static inline bool rdma_cap_ib_smi(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_ib(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 18/23] IB/Verbs: Use management helper rdma_cap_iw_cm()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_iw_cm() to help us check if the port of an
IB device support IWARP Communication Manager.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 14 +++---
 include/rdma/ib_verbs.h   | 15 +++
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 71a3668..0787035 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -754,7 +754,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
 
if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (rdma_cap_iw_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
@@ -1036,7 +1036,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
if (rdma_cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
-   } else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
+   } else if (rdma_cap_iw_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
}
@@ -2520,7 +2520,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
-   } else if (rdma_protocol_iwarp(id->device, 1)) {
+   } else if (rdma_cap_iw_cm(id->device, 1)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
@@ -2865,7 +2865,7 @@ int rdma_connect(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num))
+   } else if (rdma_cap_iw_cm(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -2987,7 +2987,7 @@ int rdma_accept(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
-   } else if (rdma_protocol_iwarp(id->device, id->port_num))
+   } else if (rdma_cap_iw_cm(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3042,7 +3042,7 @@ int rdma_reject(struct rdma_cm_id *id, const void 
*private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
 IB_CM_REJ_CONSUMER_DEFINED, NULL,
 0, private_data, private_data_len);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (rdma_cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
   private_data, private_data_len);
} else
@@ -3068,7 +3068,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (rdma_cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
} else
ret = -EINVAL;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e349596..cc92a64 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1819,6 +1819,21 @@ static inline bool rdma_cap_ib_cm(struct ib_device 
*device, u8 port_num)
return rdma_ib_or_iboe(device, port_num);
 }
 
+/**
+ * rdma_cap_iw_cm - Check if the port of device has the capability IWARP
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return false when port of the device don't support IWARP
+ * Communication Manager.
+ */
+static inline bool rdma_cap_iw_cm(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "

[PATCH v8 17/23] IB/Verbs: Use management helper rdma_cap_ib_cm()

2015-05-05 Thread Michael Wang
Introduce helper rdma_cap_ib_cm() to help us check if the port of an
IB device support Infiniband Communication Manager.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cm.c  |  6 +++---
 drivers/infiniband/core/cma.c | 19 +--
 drivers/infiniband/core/ucm.c |  2 +-
 include/rdma/ib_verbs.h   | 15 +++
 4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index add5e484..7073f98 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3781,7 +3781,7 @@ static void cm_add_one(struct ib_device *ib_device)
 
set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
-   if (!rdma_ib_or_iboe(ib_device, i))
+   if (!rdma_cap_ib_cm(ib_device, i))
continue;
 
port = kzalloc(sizeof *port, GFP_KERNEL);
@@ -3832,7 +3832,7 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
-   if (!rdma_ib_or_iboe(ib_device, i))
+   if (!rdma_cap_ib_cm(ib_device, i))
continue;
 
port = cm_dev->port[i-1];
@@ -3864,7 +3864,7 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);
 
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
-   if (!rdma_ib_or_iboe(ib_device, i))
+   if (!rdma_cap_ib_cm(ib_device, i))
continue;
 
port = cm_dev->port[i-1];
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d43f492f..71a3668 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -745,7 +745,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
int ret = 0;
 
id_priv = container_of(id, struct rdma_id_private, id);
-   if (rdma_ib_or_iboe(id->device, id->port_num)) {
+   if (rdma_cap_ib_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, 
qp_attr_mask);
else
@@ -1033,7 +1033,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);
 
if (id_priv->cma_dev) {
-   if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
+   if (rdma_cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
} else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
@@ -1616,8 +1616,7 @@ static void cma_listen_on_dev(struct rdma_id_private 
*id_priv,
struct rdma_cm_id *id;
int ret;
 
-   if (cma_family(id_priv) == AF_IB &&
-   !rdma_ib_or_iboe(cma_dev->device, 1))
+   if (cma_family(id_priv) == AF_IB && !rdma_cap_ib_cm(cma_dev->device, 1))
return;
 
id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2008,7 +2007,7 @@ static int cma_bind_loopback(struct rdma_id_private 
*id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
-   !rdma_ib_or_iboe(cur_dev->device, 1))
+   !rdma_cap_ib_cm(cur_dev->device, 1))
continue;
 
if (!cma_dev)
@@ -2517,7 +2516,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
 
id_priv->backlog = backlog;
if (id->device) {
-   if (rdma_ib_or_iboe(id->device, 1)) {
+   if (rdma_cap_ib_cm(id->device, 1)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
@@ -2861,7 +2860,7 @@ int rdma_connect(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}
 
-   if (rdma_ib_or_iboe(id->device, id->port_num)) {
+   if (rdma_cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
@@ -2972,7 +2971,7 @@ int rdma_accept(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}
 
-   if (rdma_ib_or_iboe(id->device, id->port_num)) {
+   if (rdma_cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD) {
if (conn_param)
ret = cma_send_sidr_rep(id_priv, 
IB_SIDR_SUCCESS

[PATCH v8 07/23] IB/Verbs: Reform IB-ulp ipoib

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-ulp ipoib.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7cad4dd..468fc2b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1680,9 +1680,7 @@ static void ipoib_add_one(struct ib_device *device)
struct net_device *dev;
struct ipoib_dev_priv *priv;
int s, e, p;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
if (!dev_list)
@@ -1699,15 +1697,21 @@ static void ipoib_add_one(struct ib_device *device)
}
 
for (p = s; p <= e; ++p) {
-   if (rdma_port_get_link_layer(device, p) != 
IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
priv = netdev_priv(dev);
list_add_tail(&priv->list, dev_list);
+   count++;
}
}
 
+   if (!count) {
+   kfree(dev_list);
+   return;
+   }
+
ib_set_client_data(device, &ipoib_client, dev_list);
 }
 
@@ -1716,9 +1720,6 @@ static void ipoib_remove_one(struct ib_device *device)
struct ipoib_dev_priv *priv, *tmp;
struct list_head *dev_list;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
dev_list = ib_get_client_data(device, &ipoib_client);
if (!dev_list)
return;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 06/23] IB/Verbs: Reform IB-core multicast

2015-05-05 Thread Michael Wang
Use raw management helpers to reform IB-core multicast.

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/multicast.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c 
b/drivers/infiniband/core/multicast.c
index fa17b55..b57ed03 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler 
*handler,
int index;
 
dev = container_of(handler, struct mcast_device, event_handler);
-   if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
-   IB_LINK_LAYER_INFINIBAND)
+   if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
return;
 
index = event->element.port_num - dev->start_port;
@@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
int i;
int count = 0;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
  GFP_KERNEL);
if (!dev)
@@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_port_get_link_layer(device, dev->start_port + i) !=
-   IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_port_get_link_layer(device, dev->start_port + i) ==
-   IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v8 01/23] IB/Verbs: Implement new callback query_protocol()

2015-05-05 Thread Michael Wang
Add new callback query_protocol() and implement for each HW.

Mapping List:
node-type   link-layer  transport   protocol
nes RNICETH IWARP   IWARP
amso1100RNICETH IWARP   IWARP
cxgb3   RNICETH IWARP   IWARP
cxgb4   RNICETH IWARP   IWARP
usnic   USNIC_UDP   ETH USNIC_UDP   USNIC_UDP
ocrdma  IB_CA   ETH IB  IBOE
mlx4IB_CA   IB/ETH  IB  IB/IBOE
mlx5IB_CA   IB  IB  IB
ehcaIB_CA   IB  IB  IB
ipath   IB_CA   IB  IB  IB
mthca   IB_CA   IB  IB  IB
qib IB_CA   IB  IB  IB

Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/device.c |  1 +
 drivers/infiniband/hw/amso1100/c2_provider.c |  7 +++
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |  7 +++
 drivers/infiniband/hw/cxgb4/provider.c   |  7 +++
 drivers/infiniband/hw/ehca/ehca_hca.c|  6 ++
 drivers/infiniband/hw/ehca/ehca_iverbs.h |  3 +++
 drivers/infiniband/hw/ehca/ehca_main.c   |  1 +
 drivers/infiniband/hw/ipath/ipath_verbs.c|  7 +++
 drivers/infiniband/hw/mlx4/main.c| 10 ++
 drivers/infiniband/hw/mlx5/main.c|  7 +++
 drivers/infiniband/hw/mthca/mthca_provider.c |  7 +++
 drivers/infiniband/hw/nes/nes_verbs.c|  6 ++
 drivers/infiniband/hw/ocrdma/ocrdma_main.c   |  1 +
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |  6 ++
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h  |  3 +++
 drivers/infiniband/hw/qib/qib_verbs.c|  7 +++
 drivers/infiniband/hw/usnic/usnic_ib_main.c  |  1 +
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c |  6 ++
 drivers/infiniband/hw/usnic/usnic_ib_verbs.h |  2 ++
 include/rdma/ib_verbs.h  |  9 +
 20 files changed, 104 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 18c1ece..b360350 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
} mandatory_table[] = {
IB_MANDATORY_FUNC(query_device),
IB_MANDATORY_FUNC(query_port),
+   IB_MANDATORY_FUNC(query_protocol),
IB_MANDATORY_FUNC(query_pkey),
IB_MANDATORY_FUNC(query_gid),
IB_MANDATORY_FUNC(alloc_pd),
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c 
b/drivers/infiniband/hw/amso1100/c2_provider.c
index bdf3507..6fe329a 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
return 0;
 }
 
+static enum rdma_protocol_type
+c2_query_protocol(struct ib_device *device, u8 port_num)
+{
+   return RDMA_PROTOCOL_IWARP;
+}
+
 static int c2_query_pkey(struct ib_device *ibdev,
 u8 port, u16 index, u16 * pkey)
 {
@@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
dev->ibdev.dma_device = &dev->pcidev->dev;
dev->ibdev.query_device = c2_query_device;
dev->ibdev.query_port = c2_query_port;
+   dev->ibdev.query_protocol = c2_query_protocol;
dev->ibdev.query_pkey = c2_query_pkey;
dev->ibdev.query_gid = c2_query_gid;
dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 811b24a..298d1ca 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device *ibdev,
return 0;
 }
 
+static enum rdma_protocol_type
+iwch_query_protocol(struct ib_device *device, u8 port_num)
+{
+   return RDMA_PROTOCOL_IWARP;
+}
+
 static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
 {
@@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
dev->ibdev.query_device = iwch_query_device;
dev->ibdev.query_port = iwch_query_port;
+   dev->ibdev.query_protocol = iwch_query_protocol;
dev->ibdev.query_pkey = iwch_query_pkey;
dev->ibdev.query_gid = iwch_query_gid;
dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb4/provider.c 
b/drivers/infiniband/hw/cxgb4/provider.c
index 66bd6a2..f52ee63 100644
--- a/driv

Re: [PATCH v7 00/23] IB/Verbs: IB Management Helpers

2015-05-04 Thread Michael Wang


On 05/01/2015 08:34 AM, ira.weiny wrote:
> On Tue, Apr 28, 2015 at 05:10:00PM +0200, Michael Wang wrote:
>> Since v6:
>>   * Thanks to Ira, Devesh for the review and testing :-)
>>   * Thanks for the comments from Sean, Tom, Jason, Doug, Devesh, Ira,
>> Liran :-) Please remind me if anything missed :-P
>>   * Use query_protocol() and enum protocol type in 1#
>>   * Use rdma_protocol_XX() in 2#
>>   * Drop cma_set_legacy_transport()
>>   * Reserve rdma_ib_or_iboe() and rdma_node_get_transport()
>>   * Updated github repository to v7
> 
> I pulled these via Dougs for-4.2 branch and have done light testing with mlx4
> and qib.
> 
> Now we need to look at converting to some bit mask.
> 
> Does anyone have a link to the emails which proposed bitmasks?  I can't find
> them right now.

I've recorded two links related to bitmask proposal:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23418.html
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html

But there are also some scattered pieces of discussion during the review of
prev version, I think the best way is to send out another series dedicated
on bitmask and then gather discussions in there :-)

> 
> 
> For the Series:
> 
> Reviewed-by: Ira Weiny 

Thanks for the review :-)

Regards,
Michael Wang

> 
> 
>>
>> There are plenty of lengthy code to check the transport type of IB device,
>> or the link layer type of it's port, but actually we are just speculating
>> whether a particular management/feature is supported by the device/port.
>>
>> Thus instead of inferring, we should have our own mechanism for IB management
>> capability/protocol/feature checking, several proposals below.
>>
>> This patch set will introduce query_protocol() to check management 
>> requirement
>> instead of inferring from transport and link layer respectively, along with
>> the new enum on protocol type.
>>
>> Mapping List:
>>  node-type   link-layer  transport   protocol
>> nes  RNICETH IWARP   IWARP
>> amso1100 RNICETH IWARP   IWARP
>> cxgb3RNICETH IWARP   IWARP
>> cxgb4RNICETH IWARP   IWARP
>> usnicUSNIC_UDP   ETH USNIC_UDP   USNIC_UDP
>> ocrdma   IB_CA   ETH IB  IBOE
>> mlx4 IB_CA   IB/ETH  IB  IB/IBOE
>> mlx5 IB_CA   IB  IB  IB
>> ehca IB_CA   IB  IB  IB
>> ipathIB_CA   IB  IB  IB
>> mthcaIB_CA   IB  IB  IB
>> qib  IB_CA   IB  IB  IB
>>
>> For example:
>>  if (transport == IB) && (link-layer == ETH)
>> will now become:
>>  if (query_protocol() == IBOE)
>>
>> Thus we will be able to get rid of the respective transport and link-layer
>> checking, and it will help us to add new protocol/Technology (like OPA) more
>> easier, also with the introduced management helpers, IB management logical
>> will be more clear and easier for extending.
>>
>> Highlights:
>> The 'mgmt-helpers' branch of 'g...@github.com:ywang-pb/infiniband-wy.git'
>> contain this series based on the latest 'infiniband/for-next'
>>
>> The patch set covered a wide range of IB stuff, thus for those who are
>> familiar with the particular part, your suggestion would be invaluable 
>> ;-)
>>
>> Patch 1#~14# included all the logical reform, 15#~23# introduced the
>> management helpers.
>>
>> we appreciate for those one who have the HW willing to provide Tested-by 
>> :-)
>>
>> Doug suggested the bitmask mechanism:
>>  https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
>> which could be the plan for future reforming, we prefer that to be 
>> another
>> series which focus on semantic and performance.
>>
>> This patch-set is somewhat 'bloated' now and it may be a good timing for
>>     staging, I'd like to suggest we focus on improving existed helpers and 
>> push
>> all the further reforms into next series ;-)
>>
>>
>> Proposals:
>> Sean:
>>  https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23339.html
>> Doug:
>>  https://

Re: [PATCH v7 00/23] IB/Verbs: IB Management Helpers

2015-04-30 Thread Michael Wang


On 04/29/2015 06:28 PM, Doug Ledford wrote:
> On Tue, 2015-04-28 at 17:10 +0200, Michael Wang wrote:
>> Since v6:
>>   * Thanks to Ira, Devesh for the review and testing :-)
>>   * Thanks for the comments from Sean, Tom, Jason, Doug, Devesh, Ira,
>> Liran :-) Please remind me if anything missed :-P
>>   * Use query_protocol() and enum protocol type in 1#
>>   * Use rdma_protocol_XX() in 2#
>>   * Drop cma_set_legacy_transport()
>>   * Reserve rdma_ib_or_iboe() and rdma_node_get_transport()
>>   * Updated github repository to v7
> 
> I've taken your patchset and threw it into a for-4.2 branch in my repo.
> This will get it 0day testing.  I've also pulled it into my test cluster
> and done minimal testing (bootup, finds devices, gets IP address via
> dhcp).  Here are the hardware types it passed on:
> 
> mthca (yes, I still have these...only just barely, but I do)
> mlx4 (IB/RoCE, in both standalone and SRIOV usage)
> mlx5 (IB only, I don't have the Eth capable mlx5 hardware yet)
> cxgb3
> cxgb4
> qib
> ocrdma

My appreciation :-) Please let me know if there are anything broken.

Regards,
Michael Wang

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v7 04/23] IB/Verbs: Reform IB-core cm

2015-04-30 Thread Michael Wang


On 04/29/2015 05:48 PM, Or Gerlitz wrote:
[snip]
> 
>>
>> I think the CC list is not that big for a patch set covered such a wide
>> range, isn't it :-P
> 
> Maybe it's a matter of taste, but for me it look way way too big. If
> you really want to have such
> a huge listing, do it in the early patches of the series where you
> introduce the new concepts, and later,on downstream patches, when you
> use it, put one person if they happen to be the author or maintainthat
> area (e.g Sean <-- CM/CMA, Doug/Erez <-- IPoIB Ira, Hal <-- MAD, Steve
> <-- IW_CM, etc)

Thanks for the suggestion, I can't callback correctly who participated
on which part of the review accurately... my bad, will take care next
time :-) and will stop add new CC from now on.

BTW, as now folks already familiar with the cap_XX stuff, may be
the last version to be applied could merge all the cap_XX into one,
after all, it's more focus on the description rather than logical,
separate cap_XX won't help easier the review anyway.

Regards,
Michael Wang

> 
> Or.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v7 04/23] IB/Verbs: Reform IB-core cm

2015-04-29 Thread Michael Wang
Hi, Or

On 04/28/2015 09:02 PM, Or Gerlitz wrote:
> On Tue, Apr 28, 2015 at 6:10 PM, Michael Wang  
> wrote:
>> Use raw management helpers to reform IB-core cm.
>>
>> Cc: Hal Rosenstock 
>> Cc: Steve Wise 
>> Cc: Tom Talpey 
>> Cc: Jason Gunthorpe 
>> Cc: Doug Ledford 
>> Cc: Ira Weiny 
>> Cc: Sean Hefty 
>> Signed-off-by: Michael Wang 
>> ---
>>  drivers/infiniband/core/cm.c | 20 +---
>>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> Hi Michael,
> 
> I don't really see the benefit (e.g for someone doing bisection
> 1/2/5/10 years from now and landing here) of listing all the group of
> reviewers for each of the ~30 patches that make this series, any
> special reason that caused you doing so?

Those on the CC list are used to help correct some problems or
contributed to the definition/plan :-) They are familiar with
the whole story of this patch set.

As you mentioned, few years later when someone bisect out the patches
and want to learn why it's like that, he could have enough address to
send his question, although few of them may not work on the same aspect
anymore, but the chance to find someone have the story is higher.

I think the CC list is not that big for a patch set covered such a wide
range, isn't it :-P

Regards,
Michael Wang

> 
> Or.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 08/23] IB/Verbs: Reform IB-ulp xprtrdma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-ulp xprtrdma.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |  4 +--
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +---
 2 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f9f13a3..2cc625d 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-   if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
-RDMA_TRANSPORT_IWARP)
+   if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+   xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c 
b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..3df8320 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
struct ib_qp_init_attr qp_attr;
struct ib_device_attr devattr;
int uninitialized_var(dma_mr_acc);
-   int need_dma_mr;
+   int need_dma_mr = 0;
int ret;
int i;
 
@@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
/*
 * Determine if a DMA MR is required and if so, what privs are required
 */
-   switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
-   case RDMA_TRANSPORT_IWARP:
-   newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
-   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-   need_dma_mr = 1;
-   dma_mr_acc =
-   (IB_ACCESS_LOCAL_WRITE |
-IB_ACCESS_REMOTE_WRITE);
-   } else if (!(devattr.device_cap_flags & 
IB_DEVICE_LOCAL_DMA_LKEY)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else
-   need_dma_mr = 0;
-   break;
-   case RDMA_TRANSPORT_IB:
-   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else if (!(devattr.device_cap_flags &
-IB_DEVICE_LOCAL_DMA_LKEY)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else
-   need_dma_mr = 0;
-   break;
-   default:
+   if (!rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+newxprt->sc_cm_id->port_num) &&
+   !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
+newxprt->sc_cm_id->port_num))
goto errout;
+
+   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
+   !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
+   need_dma_mr = 1;
+   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
+   if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+   newxprt->sc_cm_id->port_num) &&
+   !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
+   dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
}
 
+   if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+   newxprt->sc_cm_id->port_num))
+   newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
+
/* Create the DMA MR if needed, otherwise, use the DMA LKEY */
if (need_dma_mr) {
/* Register all of physical memory */
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 05/23] IB/Verbs: Reform IB-core sa_query

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core sa_query.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/sa_query.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index c38f030..b115c28 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, 
struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - 
sa_dev->start_port];
 
-   if (rdma_port_get_link_layer(handler->device, port->port_num) 
!= IB_LINK_LAYER_INFINIBAND)
+   if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
return;
 
spin_lock_irqsave(&port->ah_lock, flags);
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 
port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;
 
-   force_grh = rdma_port_get_link_layer(device, port_num) == 
IB_LINK_LAYER_ETHERNET;
+   force_grh = rdma_protocol_iboe(device, port_num);
 
if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
@@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
 {
struct ib_sa_device *sa_dev;
int s, e, i;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
-   if (rdma_port_get_link_layer(device, i + 1) != 
IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, i + 1))
continue;
 
sa_dev->port[i].sm_ah= NULL;
@@ -1189,8 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;
 
INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
+
+   count++;
}
 
+   if (!count)
+   goto free;
+
ib_set_client_data(device, &sa_client, sa_dev);
 
/*
@@ -1204,19 +1207,20 @@ static void ib_sa_add_one(struct ib_device *device)
if (ib_register_event_handler(&sa_dev->event_handler))
goto err;
 
-   for (i = 0; i <= e - s; ++i)
-   if (rdma_port_get_link_layer(device, i + 1) == 
IB_LINK_LAYER_INFINIBAND)
+   for (i = 0; i <= e - s; ++i) {
+   if (rdma_protocol_ib(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
+   }
 
return;
 
 err:
-   while (--i >= 0)
-   if (rdma_port_get_link_layer(device, i + 1) == 
IB_LINK_LAYER_INFINIBAND)
+   while (--i >= 0) {
+   if (rdma_protocol_ib(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
-
+   }
+free:
kfree(sa_dev);
-
return;
 }
 
@@ -1233,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);
 
for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-   if (rdma_port_get_link_layer(device, i + 1) == 
IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, 
free_sm_ah);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 06/23] IB/Verbs: Reform IB-core multicast

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core multicast.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/multicast.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c 
b/drivers/infiniband/core/multicast.c
index fa17b55..b57ed03 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler 
*handler,
int index;
 
dev = container_of(handler, struct mcast_device, event_handler);
-   if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
-   IB_LINK_LAYER_INFINIBAND)
+   if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
return;
 
index = event->element.port_num - dev->start_port;
@@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
int i;
int count = 0;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
  GFP_KERNEL);
if (!dev)
@@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_port_get_link_layer(device, dev->start_port + i) !=
-   IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_port_get_link_layer(device, dev->start_port + i) ==
-   IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 02/23] IB/Verbs: Implement raw management helpers

2015-04-28 Thread Michael Wang
Add raw helpers:
rdma_protocol_ib
rdma_protocol_iboe
rdma_protocol_iwarp
rdma_ib_or_iboe
To help us detect which technology the port supported.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 include/rdma/ib_verbs.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 080f204..acdba60 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1752,6 +1752,28 @@ int ib_query_port(struct ib_device *device,
 enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
   u8 port_num);
 
+static inline int rdma_protocol_ib(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IB;
+}
+
+static inline int rdma_protocol_iboe(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IBOE;
+}
+
+static inline int rdma_protocol_iwarp(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IWARP;
+}
+
+static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
+{
+   enum rdma_protocol_type pt = device->query_protocol(device, port_num);
+
+   return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 10/23] IB/Verbs: Reform cm related part in IB-core cma/ucm

2015-04-28 Thread Michael Wang
Use raw management helpers to reform cm related part in IB-core cma/ucm.

Few checks focus on the device cm type rather than the port capability,
directly pass port 1 works currently, but can't support mixing cm type
device in future.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 81 +--
 drivers/infiniband/core/ucm.c |  3 +-
 2 files changed, 26 insertions(+), 58 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..8a07e89 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -735,8 +735,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
int ret = 0;
 
id_priv = container_of(id, struct rdma_id_private, id);
-   switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
-   case RDMA_TRANSPORT_IB:
+   if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, 
qp_attr_mask);
else
@@ -745,19 +744,15 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
 
if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
} else
ret = iw_cm_init_qp_attr(id_priv->cm_id.iw, qp_attr,
 qp_attr_mask);
-   break;
-   default:
+   } else
ret = -ENOSYS;
-   break;
-   }
 
return ret;
 }
@@ -1037,17 +1032,12 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);
 
if (id_priv->cma_dev) {
-   switch (rdma_node_get_transport(id_priv->id.device->node_type)) 
{
-   case RDMA_TRANSPORT_IB:
+   if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   } else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
-   break;
-   default:
-   break;
}
cma_leave_mc_groups(id_priv);
cma_release_dev(id_priv);
@@ -1626,7 +1616,7 @@ static void cma_listen_on_dev(struct rdma_id_private 
*id_priv,
int ret;
 
if (cma_family(id_priv) == AF_IB &&
-   rdma_node_get_transport(cma_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
+   !rdma_ib_or_iboe(cma_dev->device, 1))
return;
 
id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2028,7 +2018,7 @@ static int cma_bind_loopback(struct rdma_id_private 
*id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
-   rdma_node_get_transport(cur_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
+   !rdma_ib_or_iboe(cur_dev->device, 1))
continue;
 
if (!cma_dev)
@@ -2060,7 +2050,7 @@ port_found:
goto out;
 
id_priv->id.route.addr.dev_addr.dev_type =
-   (rdma_port_get_link_layer(cma_dev->device, p) == 
IB_LINK_LAYER_INFINIBAND) ?
+   (rdma_protocol_ib(cma_dev->device, p)) ?
ARPHRD_INFINIBAND : ARPHRD_ETHER;
 
rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
@@ -2537,18 +2527,15 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
 
id_priv->backlog = backlog;
if (id->device) {
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
+   if (rdma_ib_or_iboe(id->device, 1)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   } else if (rdma_protocol_iwarp(id->device, 1)) {
ret = cma_iw_listen(id_priv,

[PATCH v7 03/23] IB/Verbs: Reform IB-core mad/agent/user_mad

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core mad/agent/user_mad.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/agent.c|  2 +-
 drivers/infiniband/core/mad.c  | 43 +++---
 drivers/infiniband/core/user_mad.c | 26 ---
 3 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..89d4fbc 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int 
port_num)
goto error1;
}
 
-   if (rdma_port_get_link_layer(device, port_num) == 
IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..507eb67 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
cq_size = mad_sendq_size + mad_recvq_size;
-   has_smi = rdma_port_get_link_layer(device, port_num) == 
IB_LINK_LAYER_INFINIBAND;
+   has_smi = rdma_protocol_ib(device, port_num);
if (has_smi)
cq_size *= 2;
 
@@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
 {
int start, end, i;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
if (device->node_type == RDMA_NODE_IB_SWITCH) {
start = 0;
end   = 0;
@@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
+
if (ib_mad_port_open(device, i)) {
dev_err(&device->dev, "Couldn't open port %d\n", i);
goto error;
@@ -3086,40 +3086,39 @@ error_agent:
dev_err(&device->dev, "Couldn't close port %d\n", i);
 
 error:
-   i--;
+   while (--i >= start) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
 
-   while (i >= start) {
if (ib_agent_port_close(device, i))
dev_err(&device->dev,
"Couldn't close port %d for agents\n", i);
if (ib_mad_port_close(device, i))
dev_err(&device->dev, "Couldn't close port %d\n", i);
-   i--;
}
 }
 
 static void ib_mad_remove_device(struct ib_device *device)
 {
-   int i, num_ports, cur_port;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int start, end, i;
 
if (device->node_type == RDMA_NODE_IB_SWITCH) {
-   num_ports = 1;
-   cur_port = 0;
+   start = 0;
+   end   = 0;
} else {
-   num_ports = device->phys_port_cnt;
-   cur_port = 1;
+   start = 1;
+   end   = device->phys_port_cnt;
}
-   for (i = 0; i < num_ports; i++, cur_port++) {
-   if (ib_agent_port_close(device, cur_port))
+
+   for (i = start; i <= end; i++) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
+
+   if (ib_agent_port_close(device, i))
dev_err(&device->dev,
-   "Couldn't close port %d for agents\n",
-   cur_port);
-   if (ib_mad_port_close(device, cur_port))
-   dev_err(&device->dev, "Couldn't close port %d\n",
-   cur_port);
+   "Couldn't close port %d for agents\n", i);
+   if (ib_mad_port_close(device, i))
+   dev_err(&device->dev, "Couldn't close port %d\n", i);
}
 }
 
diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index 928cdd2..aa8b334 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
 {
struct ib_umad_device *umad_dev;
int s, e, i;
-
- 

[PATCH v7 11/23] IB/Verbs: Reform route related part in IB-core cma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform route related part in IB-core cma.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c  | 31 ---
 drivers/infiniband/core/ucma.c | 25 ++---
 2 files changed, 14 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8a07e89..36c5f8a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -923,13 +923,9 @@ static inline int cma_user_data_offset(struct 
rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-   switch (rdma_port_get_link_layer(id_priv->id.device, 
id_priv->id.port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
+   if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
-   break;
-   default:
-   break;
}
 }
 
@@ -1957,26 +1953,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int 
timeout_ms)
return -EINVAL;
 
atomic_inc(&id_priv->refcount);
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ret = cma_resolve_ib_route(id_priv, timeout_ms);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   ret = cma_resolve_iboe_route(id_priv);
-   break;
-   default:
-   ret = -ENOSYS;
-   }
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   if (rdma_protocol_ib(id->device, id->port_num))
+   ret = cma_resolve_ib_route(id_priv, timeout_ms);
+   else if (rdma_protocol_iboe(id->device, id->port_num))
+   ret = cma_resolve_iboe_route(id_priv);
+   else if (rdma_protocol_iwarp(id->device, id->port_num))
ret = cma_resolve_iw_route(id_priv, timeout_ms);
-   break;
-   default:
+   else
ret = -ENOSYS;
-   break;
-   }
+
if (ret)
goto err;
 
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 45d67e9..dae7620 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,
 
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
-   switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(ctx->cm_id->device,
-   ctx->cm_id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ucma_copy_ib_route(&resp, &ctx->cm_id->route);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
-   break;
-   default:
-   break;
-   }
-   break;
-   case RDMA_TRANSPORT_IWARP:
+
+   if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+   ucma_copy_ib_route(&resp, &ctx->cm_id->route);
+   else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
+   ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
+   else if (rdma_protocol_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iw_route(&resp, &ctx->cm_id->route);
-   break;
-   default:
-   break;
-   }
 
 out:
if (copy_to_user((void __user *)(unsigned long)cmd.response,
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 13/23] IB/Verbs: Reform cma_acquire_dev()

2015-04-28 Thread Michael Wang
Reform cma_acquire_dev() with management helpers, introduce
cma_validate_port() to make the code more clean.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 68 +--
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 34ec13f..3fb3458 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -349,18 +349,35 @@ static int cma_translate_addr(struct sockaddr *addr, 
struct rdma_dev_addr *dev_a
return ret;
 }
 
+static inline int cma_validate_port(struct ib_device *device, u8 port,
+ union ib_gid *gid, int dev_type)
+{
+   u8 found_port;
+   int ret = -ENODEV;
+
+   if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
+   return ret;
+
+   if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
+   return ret;
+
+   ret = ib_find_cached_gid(device, gid, &found_port, NULL);
+   if (port != found_port)
+   return -ENODEV;
+
+   return ret;
+}
+
 static int cma_acquire_dev(struct rdma_id_private *id_priv,
   struct rdma_id_private *listen_id_priv)
 {
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct cma_device *cma_dev;
-   union ib_gid gid, iboe_gid;
+   union ib_gid gid, iboe_gid, *gidp;
int ret = -ENODEV;
-   u8 port, found_port;
-   enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
-   IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
+   u8 port;
 
-   if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
+   if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
id_priv->id.ps == RDMA_PS_IPOIB)
return -EINVAL;
 
@@ -370,41 +387,36 @@ static int cma_acquire_dev(struct rdma_id_private 
*id_priv,
 
memcpy(&gid, dev_addr->src_dev_addr +
   rdma_addr_gid_offset(dev_addr), sizeof gid);
-   if (listen_id_priv &&
-   rdma_port_get_link_layer(listen_id_priv->id.device,
-listen_id_priv->id.port_num) == dev_ll) {
+
+   if (listen_id_priv) {
cma_dev = listen_id_priv->cma_dev;
port = listen_id_priv->id.port_num;
-   if (rdma_node_get_transport(cma_dev->device->node_type) == 
RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(cma_dev->device, port) == 
IB_LINK_LAYER_ETHERNET)
-   ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
-&found_port, NULL);
-   else
-   ret = ib_find_cached_gid(cma_dev->device, &gid,
-&found_port, NULL);
+   gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+  &iboe_gid : &gid;
 
-   if (!ret && (port  == found_port)) {
-   id_priv->id.port_num = found_port;
+   ret = cma_validate_port(cma_dev->device, port, gidp,
+   dev_addr->dev_type);
+   if (!ret) {
+   id_priv->id.port_num = port;
goto out;
}
}
+
list_for_each_entry(cma_dev, &dev_list, list) {
for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
if (listen_id_priv &&
listen_id_priv->cma_dev == cma_dev &&
listen_id_priv->id.port_num == port)
continue;
-   if (rdma_port_get_link_layer(cma_dev->device, port) == 
dev_ll) {
-   if 
(rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(cma_dev->device, 
port) == IB_LINK_LAYER_ETHERNET)
-   ret = 
ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
-   else
-   ret = 
ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
-
-   if (!ret && (port == found_port)) {
-   id_priv->id.port_num = found_port;
-   goto out;
-   }
+
+   gidp = rdma_protocol_iboe(cma_dev->devic

[PATCH v7 07/23] IB/Verbs: Reform IB-ulp ipoib

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-ulp ipoib.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7cad4dd..468fc2b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1680,9 +1680,7 @@ static void ipoib_add_one(struct ib_device *device)
struct net_device *dev;
struct ipoib_dev_priv *priv;
int s, e, p;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
if (!dev_list)
@@ -1699,15 +1697,21 @@ static void ipoib_add_one(struct ib_device *device)
}
 
for (p = s; p <= e; ++p) {
-   if (rdma_port_get_link_layer(device, p) != 
IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
priv = netdev_priv(dev);
list_add_tail(&priv->list, dev_list);
+   count++;
}
}
 
+   if (!count) {
+   kfree(dev_list);
+   return;
+   }
+
ib_set_client_data(device, &ipoib_client, dev_list);
 }
 
@@ -1716,9 +1720,6 @@ static void ipoib_remove_one(struct ib_device *device)
struct ipoib_dev_priv *priv, *tmp;
struct list_head *dev_list;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
dev_list = ib_get_client_data(device, &ipoib_client);
if (!dev_list)
return;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 14/23] IB/Verbs: Reform rest part in IB-core cma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform rest part in IB-core cma.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 3fb3458..d43f492f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -447,10 +447,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private 
*id_priv)
pkey = ntohs(addr->sib_pkey);
 
list_for_each_entry(cur_dev, &dev_list, list) {
-   if (rdma_node_get_transport(cur_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
-   continue;
-
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+   if (!rdma_ib_or_iboe(cur_dev->device, p))
+   continue;
+
if (ib_find_cached_pkey(cur_dev->device, p, pkey, 
&index))
continue;
 
@@ -645,10 +645,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private 
*id_priv,
if (ret)
goto out;
 
-   if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
-   == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
-   == IB_LINK_LAYER_ETHERNET) {
+   BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
+
+   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num)) {
ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);
 
if (ret)
@@ -712,11 +711,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private 
*id_priv,
int ret;
u16 pkey;
 
-   if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) 
==
-   IB_LINK_LAYER_INFINIBAND)
-   pkey = ib_addr_get_pkey(dev_addr);
-   else
+   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num))
pkey = 0x;
+   else
+   pkey = ib_addr_get_pkey(dev_addr);
 
ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
  pkey, &qp_attr->pkey_index);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 17/23] IB/Verbs: Use management helper cap_ib_cm()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_cm() to help us check if the port of an
IB device support Infiniband Communication Manager.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cm.c  |  6 +++---
 drivers/infiniband/core/cma.c | 19 +--
 drivers/infiniband/core/ucm.c |  2 +-
 include/rdma/ib_verbs.h   | 15 +++
 4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index add5e484..3ffaad3 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3781,7 +3781,7 @@ static void cm_add_one(struct ib_device *ib_device)
 
set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
-   if (!rdma_ib_or_iboe(ib_device, i))
+   if (!cap_ib_cm(ib_device, i))
continue;
 
port = kzalloc(sizeof *port, GFP_KERNEL);
@@ -3832,7 +3832,7 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
-   if (!rdma_ib_or_iboe(ib_device, i))
+   if (!cap_ib_cm(ib_device, i))
continue;
 
port = cm_dev->port[i-1];
@@ -3864,7 +3864,7 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);
 
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
-   if (!rdma_ib_or_iboe(ib_device, i))
+   if (!cap_ib_cm(ib_device, i))
continue;
 
port = cm_dev->port[i-1];
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d43f492f..ecb0484 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -745,7 +745,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
int ret = 0;
 
id_priv = container_of(id, struct rdma_id_private, id);
-   if (rdma_ib_or_iboe(id->device, id->port_num)) {
+   if (cap_ib_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, 
qp_attr_mask);
else
@@ -1033,7 +1033,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);
 
if (id_priv->cma_dev) {
-   if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
+   if (cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
} else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
@@ -1616,8 +1616,7 @@ static void cma_listen_on_dev(struct rdma_id_private 
*id_priv,
struct rdma_cm_id *id;
int ret;
 
-   if (cma_family(id_priv) == AF_IB &&
-   !rdma_ib_or_iboe(cma_dev->device, 1))
+   if (cma_family(id_priv) == AF_IB && !cap_ib_cm(cma_dev->device, 1))
return;
 
id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2008,7 +2007,7 @@ static int cma_bind_loopback(struct rdma_id_private 
*id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
-   !rdma_ib_or_iboe(cur_dev->device, 1))
+   !cap_ib_cm(cur_dev->device, 1))
continue;
 
if (!cma_dev)
@@ -2517,7 +2516,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
 
id_priv->backlog = backlog;
if (id->device) {
-   if (rdma_ib_or_iboe(id->device, 1)) {
+   if (cap_ib_cm(id->device, 1)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
@@ -2861,7 +2860,7 @@ int rdma_connect(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}
 
-   if (rdma_ib_or_iboe(id->device, id->port_num)) {
+   if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
@@ -2972,7 +2971,7 @@ int rdma_accept(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}
 
-   if (rdma_ib_or_iboe(id->device, id->port_num)) {
+   if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD) {
if (conn_param)
 

[PATCH v7 15/23] IB/Verbs: Use management helper cap_ib_mad()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_mad() to help us check if the port of an
IB device support Infiniband Management Datagrams.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/mad.c  |  6 +++---
 drivers/infiniband/core/user_mad.c |  6 +++---
 include/rdma/ib_verbs.h| 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 507eb67..59459e7 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
if (ib_mad_port_open(device, i)) {
@@ -3087,7 +3087,7 @@ error_agent:
 
 error:
while (--i >= start) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
if (ib_agent_port_close(device, i))
@@ -3111,7 +3111,7 @@ static void ib_mad_remove_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
if (ib_agent_port_close(device, i))
diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index aa8b334..e3ccbf2 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port   = e;
 
for (i = s; i <= e; ++i) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
umad_dev->port[i - s].umad_dev = umad_dev;
@@ -1315,7 +1315,7 @@ static void ib_umad_add_one(struct ib_device *device)
 
 err:
while (--i >= s) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
ib_umad_kill_port(&umad_dev->port[i - s]);
@@ -1333,7 +1333,7 @@ static void ib_umad_remove_one(struct ib_device *device)
return;
 
for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
-   if (rdma_ib_or_iboe(device, i))
+   if (cap_ib_mad(device, i))
ib_umad_kill_port(&umad_dev->port[i]);
}
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index acdba60..cb3ba2d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1774,6 +1774,21 @@ static inline int rdma_ib_or_iboe(struct ib_device 
*device, u8 port_num)
return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
 }
 
+/**
+ * cap_ib_mad - Check if the port of device has the capability Infiniband
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Management Datagrams.
+ */
+static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
+{
+   return rdma_ib_or_iboe(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 20/23] IB/Verbs: Use management helper cap_ib_mcast()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_mcast() to help us check if the port of an
IB device support Infiniband Multicast.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c   |  6 +++---
 drivers/infiniband/core/multicast.c |  6 +++---
 include/rdma/ib_verbs.h | 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ec3a901..c06ca60 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1007,7 +1007,7 @@ static void cma_leave_mc_groups(struct rdma_id_private 
*id_priv)
mc = container_of(id_priv->mc_list.next,
  struct cma_multicast, list);
list_del(&mc->list);
-   if (rdma_protocol_ib(id_priv->cma_dev->device,
+   if (cap_ib_mcast(id_priv->cma_dev->device,
  id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
@@ -3321,7 +3321,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr,
if (rdma_protocol_iboe(id->device, id->port_num)) {
kref_init(&mc->mcref);
ret = cma_iboe_join_multicast(id_priv, mc);
-   } else if (rdma_protocol_ib(id->device, id->port_num))
+   } else if (cap_ib_mcast(id->device, id->port_num))
ret = cma_join_ib_multicast(id_priv, mc);
else
ret = -ENOSYS;
@@ -3355,7 +3355,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr)
 
BUG_ON(id_priv->cma_dev->device != id->device);
 
-   if (rdma_protocol_ib(id->device, id->port_num)) {
+   if (cap_ib_mcast(id->device, id->port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else if (rdma_protocol_iboe(id->device, id->port_num))
diff --git a/drivers/infiniband/core/multicast.c 
b/drivers/infiniband/core/multicast.c
index b57ed03..bdc1880 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler 
*handler,
int index;
 
dev = container_of(handler, struct mcast_device, event_handler);
-   if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
+   if (WARN_ON(!cap_ib_mcast(dev->device, event->element.port_num)))
return;
 
index = event->element.port_num - dev->start_port;
@@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (!rdma_protocol_ib(device, dev->start_port + i))
+   if (!cap_ib_mcast(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_protocol_ib(device, dev->start_port + i)) {
+   if (cap_ib_mcast(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index f3d9760..dde2aa9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1849,6 +1849,21 @@ static inline int cap_ib_sa(struct ib_device *device, u8 
port_num)
return rdma_protocol_ib(device, port_num);
 }
 
+/**
+ * cap_ib_mcast - Check if the port of device has the capability Infiniband
+ * Multicast.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Multicast.
+ */
+static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
+{
+   return cap_ib_sa(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 19/23] IB/Verbs: Use management helper cap_ib_sa()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_sa() to help us check if the port of an
IB device support Infiniband Subnet Administration.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c  |  4 ++--
 drivers/infiniband/core/sa_query.c | 10 +-
 drivers/infiniband/core/ucma.c |  2 +-
 include/rdma/ib_verbs.h| 15 +++
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 7d55296..ec3a901 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -933,7 +933,7 @@ static inline int cma_user_data_offset(struct 
rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-   if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
+   if (cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
}
@@ -1957,7 +1957,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int 
timeout_ms)
return -EINVAL;
 
atomic_inc(&id_priv->refcount);
-   if (rdma_protocol_ib(id->device, id->port_num))
+   if (cap_ib_sa(id->device, id->port_num))
ret = cma_resolve_ib_route(id_priv, timeout_ms);
else if (rdma_protocol_iboe(id->device, id->port_num))
ret = cma_resolve_iboe_route(id_priv);
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index b115c28..c82aa48 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, 
struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - 
sa_dev->start_port];
 
-   if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
+   if (WARN_ON(!cap_ib_sa(handler->device, port->port_num)))
return;
 
spin_lock_irqsave(&port->ah_lock, flags);
@@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
-   if (!rdma_protocol_ib(device, i + 1))
+   if (!cap_ib_sa(device, i + 1))
continue;
 
sa_dev->port[i].sm_ah= NULL;
@@ -1208,7 +1208,7 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;
 
for (i = 0; i <= e - s; ++i) {
-   if (rdma_protocol_ib(device, i + 1))
+   if (cap_ib_sa(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
}
 
@@ -1216,7 +1216,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
 err:
while (--i >= 0) {
-   if (rdma_protocol_ib(device, i + 1))
+   if (cap_ib_sa(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
}
 free:
@@ -1237,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);
 
for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-   if (rdma_protocol_ib(device, i + 1)) {
+   if (cap_ib_sa(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, 
free_sm_ah);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index dae7620..6204065 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
 
-   if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+   if (cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_ib_route(&resp, &ctx->cm_id->route);
else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d69e467..f3d9760 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1834,6 +1834,21 @@ static inline int cap_iw_cm(struct ib_device *device, u8 
port_num)
return rdma_protocol_iwarp(device, port_num);
 }
 
+/**
+ * cap_ib_sa - Check if the port 

[PATCH v7 12/23] IB/Verbs: Reform mcast related part in IB-core cma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform mcast related part in IB-core cma.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 56 ++-
 1 file changed, 18 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 36c5f8a..34ec13f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -997,17 +997,12 @@ static void cma_leave_mc_groups(struct rdma_id_private 
*id_priv)
mc = container_of(id_priv->mc_list.next,
  struct cma_multicast, list);
list_del(&mc->list);
-   switch (rdma_port_get_link_layer(id_priv->cma_dev->device, 
id_priv->id.port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
+   if (rdma_protocol_ib(id_priv->cma_dev->device,
+ id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
+   } else
kref_put(&mc->mcref, release_mc);
-   break;
-   default:
-   break;
-   }
}
 }
 
@@ -3314,24 +3309,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr,
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);
 
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ret = cma_join_ib_multicast(id_priv, mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   kref_init(&mc->mcref);
-   ret = cma_iboe_join_multicast(id_priv, mc);
-   break;
-   default:
-   ret = -EINVAL;
-   }
-   break;
-   default:
+   if (rdma_protocol_iboe(id->device, id->port_num)) {
+   kref_init(&mc->mcref);
+   ret = cma_iboe_join_multicast(id_priv, mc);
+   } else if (rdma_protocol_ib(id->device, id->port_num))
+   ret = cma_join_ib_multicast(id_priv, mc);
+   else
ret = -ENOSYS;
-   break;
-   }
 
if (ret) {
spin_lock_irq(&id_priv->lock);
@@ -3359,19 +3343,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr)
ib_detach_mcast(id->qp,
&mc->multicast.ib->rec.mgid,

be16_to_cpu(mc->multicast.ib->rec.mlid));
-   if 
(rdma_node_get_transport(id_priv->cma_dev->device->node_type) == 
RDMA_TRANSPORT_IB) {
-   switch (rdma_port_get_link_layer(id->device, 
id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ib_sa_free_multicast(mc->multicast.ib);
-   kfree(mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   kref_put(&mc->mcref, release_mc);
-   break;
-   default:
-   break;
-   }
-   }
+
+   BUG_ON(id_priv->cma_dev->device != id->device);
+
+   if (rdma_protocol_ib(id->device, id->port_num)) {
+   ib_sa_free_multicast(mc->multicast.ib);
+   kfree(mc);
+   } else if (rdma_protocol_iboe(id->device, id->port_num))
+   kref_put(&mc->mcref, release_mc);
+
return;
}
}
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 18/23] IB/Verbs: Use management helper cap_iw_cm()

2015-04-28 Thread Michael Wang
Introduce helper cap_iw_cm() to help us check if the port of an
IB device support IWARP Communication Manager.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c | 14 +++---
 include/rdma/ib_verbs.h   | 15 +++
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ecb0484..7d55296 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -754,7 +754,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
 
if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (cap_iw_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
@@ -1036,7 +1036,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
if (cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
-   } else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
+   } else if (cap_iw_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
}
@@ -2520,7 +2520,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
-   } else if (rdma_protocol_iwarp(id->device, 1)) {
+   } else if (cap_iw_cm(id->device, 1)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
@@ -2865,7 +2865,7 @@ int rdma_connect(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num))
+   } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -2987,7 +2987,7 @@ int rdma_accept(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
-   } else if (rdma_protocol_iwarp(id->device, id->port_num))
+   } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3042,7 +3042,7 @@ int rdma_reject(struct rdma_cm_id *id, const void 
*private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
 IB_CM_REJ_CONSUMER_DEFINED, NULL,
 0, private_data, private_data_len);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
   private_data, private_data_len);
} else
@@ -3068,7 +3068,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
} else
ret = -EINVAL;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 87b07f2..d69e467 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1819,6 +1819,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 
port_num)
return rdma_ib_or_iboe(device, port_num);
 }
 
+/**
+ * cap_iw_cm - Check if the port of device has the capability IWARP
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support IWARP
+ * Communication Manager.
+ */
+static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);

[PATCH v7 22/23] IB/Verbs: Use management helper cap_af_ib()

2015-04-28 Thread Michael Wang
Introduce helper cap_af_ib() to help us check if the port of an
IB device support Native Infiniband Address.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c |  2 +-
 include/rdma/ib_verbs.h   | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c06ca60..c3dbcdd 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -448,7 +448,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private 
*id_priv)
 
list_for_each_entry(cur_dev, &dev_list, list) {
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
-   if (!rdma_ib_or_iboe(cur_dev->device, p))
+   if (!cap_af_ib(cur_dev->device, p))
continue;
 
if (ib_find_cached_pkey(cur_dev->device, p, pkey, 
&index))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cca0293..c045be1 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1865,6 +1865,21 @@ static inline int cap_ib_mcast(struct ib_device *device, 
u8 port_num)
 }
 
 /**
+ * cap_af_ib - Check if the port of device has the capability
+ * Native Infiniband Address.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * Native Infiniband Address.
+ */
+static inline int cap_af_ib(struct ib_device *device, u8 port_num)
+{
+   return rdma_ib_or_iboe(device, port_num);
+}
+
+/**
  * cap_read_multi_sge - Check if the port of device has the capability
  * RDMA Read Multiple Scatter-Gather Entries.
  *
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 21/23] IB/Verbs: Use management helper cap_read_multi_sge()

2015-04-28 Thread Michael Wang
Introduce helper cap_read_multi_sge() to help us check if the port of an
IB device support RDMA Read Multiple Scatter-Gather Entries.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 include/rdma/ib_verbs.h | 15 +++
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index dde2aa9..cca0293 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1864,6 +1864,21 @@ static inline int cap_ib_mcast(struct ib_device *device, 
u8 port_num)
return cap_ib_sa(device, port_num);
 }
 
+/**
+ * cap_read_multi_sge - Check if the port of device has the capability
+ * RDMA Read Multiple Scatter-Gather Entries.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * RDMA Read Multiple Scatter-Gather Entries.
+ */
+static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
+{
+   return !rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 2cc625d..7711b7a 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,7 +117,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-   if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+   if (!cap_read_multi_sge(xprt->sc_cm_id->device,
xprt->sc_cm_id->port_num))
return 1;
else
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 23/23] IB/Verbs: Use management helper cap_eth_ah()

2015-04-28 Thread Michael Wang
Introduce helper cap_eth_ah() to help us check if the port of an
IB device support Ethernet Address Handler.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/cma.c  |  2 +-
 drivers/infiniband/core/sa_query.c |  2 +-
 drivers/infiniband/core/verbs.c|  4 ++--
 include/rdma/ib_verbs.h| 15 +++
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c3dbcdd..2e96883 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -711,7 +711,7 @@ static int cma_ib_init_qp_attr(struct rdma_id_private 
*id_priv,
int ret;
u16 pkey;
 
-   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num))
+   if (cap_eth_ah(id_priv->id.device, id_priv->id.port_num))
pkey = 0x;
else
pkey = ib_addr_get_pkey(dev_addr);
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index c82aa48..1f0d009 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 
port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;
 
-   force_grh = rdma_protocol_iboe(device, port_num);
+   force_grh = cap_eth_ah(device, port_num);
 
if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 7dd2f51..db1139f 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -200,7 +200,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 
port_num, struct ib_wc *wc,
int ret;
 
memset(ah_attr, 0, sizeof *ah_attr);
-   if (rdma_protocol_iboe(device, port_num)) {
+   if (cap_eth_ah(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;
 
@@ -869,7 +869,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
union ib_gid  sgid;
 
if ((*qp_attr_mask & IB_QP_AV)  &&
-   (rdma_protocol_iboe(qp->device, qp_attr->ah_attr.port_num))) {
+   (cap_eth_ah(qp->device, qp_attr->ah_attr.port_num))) {
ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
   qp_attr->ah_attr.grh.sgid_index, &sgid);
if (ret)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c045be1..c724114 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1880,6 +1880,21 @@ static inline int cap_af_ib(struct ib_device *device, u8 
port_num)
 }
 
 /**
+ * cap_eth_ah - Check if the port of device has the capability
+ * Ethernet Address Handler.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * Ethernet Address Handler.
+ */
+static inline int cap_eth_ah(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_iboe(device, port_num);
+}
+
+/**
  * cap_read_multi_sge - Check if the port of device has the capability
  * RDMA Read Multiple Scatter-Gather Entries.
  *
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 09/23] IB/Verbs: Reform IB-core verbs

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core verbs

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/verbs.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f93eb8d..7dd2f51 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -198,11 +198,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 
port_num, struct ib_wc *wc,
u32 flow_class;
u16 gid_index;
int ret;
-   int is_eth = (rdma_port_get_link_layer(device, port_num) ==
-   IB_LINK_LAYER_ETHERNET);
 
memset(ah_attr, 0, sizeof *ah_attr);
-   if (is_eth) {
+   if (rdma_protocol_iboe(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;
 
@@ -871,7 +869,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
union ib_gid  sgid;
 
if ((*qp_attr_mask & IB_QP_AV)  &&
-   (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == 
IB_LINK_LAYER_ETHERNET)) {
+   (rdma_protocol_iboe(qp->device, qp_attr->ah_attr.port_num))) {
ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
   qp_attr->ah_attr.grh.sgid_index, &sgid);
if (ret)
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 16/23] IB/Verbs: Use management helper cap_ib_smi()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_smi() to help us check if the port of an
IB device support Infiniband Subnet Management Interface.

Cc: Hal Rosenstock 
Cc: Steve Wise 
Cc: Tom Talpey 
Cc: Jason Gunthorpe 
Cc: Doug Ledford 
Cc: Ira Weiny 
Cc: Sean Hefty 
Signed-off-by: Michael Wang 
---
 drivers/infiniband/core/agent.c |  2 +-
 drivers/infiniband/core/mad.c   |  2 +-
 include/rdma/ib_verbs.h | 15 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index 89d4fbc..61471ee 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int 
port_num)
goto error1;
}
 
-   if (rdma_protocol_ib(device, port_num)) {
+   if (cap_ib_smi(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 59459e7..ee3a05e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
cq_size = mad_sendq_size + mad_recvq_size;
-   has_smi = rdma_protocol_ib(device, port_num);
+   has_smi = cap_ib_smi(device, port_num);
if (has_smi)
cq_size *= 2;
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cb3ba2d..b364a82 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1789,6 +1789,21 @@ static inline int cap_ib_mad(struct ib_device *device, 
u8 port_num)
return rdma_ib_or_iboe(device, port_num);
 }
 
+/**
+ * cap_ib_smi - Check if the port of device has the capability Infiniband
+ * Subnet Management Interface.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Subnet Management Interface.
+ */
+static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_ib(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   6   7   8   9   >