Re: [PATCH] rds: send: Fix dead code in rds_sendmsg

2018-07-25 Thread Sowmini Varadhan
On (07/25/18 10:22), Gustavo A. R. Silva wrote:
> Currently, code at label *out* is unreachable. Fix this by updating
> variable *ret* with -EINVAL, so the jump to *out* can be properly
> executed instead of directly returning from function.
> 
> Addresses-Coverity-ID: 1472059 ("Structurally dead code")
> Fixes: 1e2b44e78eea ("rds: Enable RDS IPv6 support")
> Signed-off-by: Gustavo A. R. Silva 

Acked-by: Sowmini Varadhan 


Re: [PATCH] rds: send: Fix dead code in rds_sendmsg

2018-07-25 Thread Sowmini Varadhan
On (07/25/18 10:22), Gustavo A. R. Silva wrote:
> Currently, code at label *out* is unreachable. Fix this by updating
> variable *ret* with -EINVAL, so the jump to *out* can be properly
> executed instead of directly returning from function.
> 
> Addresses-Coverity-ID: 1472059 ("Structurally dead code")
> Fixes: 1e2b44e78eea ("rds: Enable RDS IPv6 support")
> Signed-off-by: Gustavo A. R. Silva 

Acked-by: Sowmini Varadhan 


Re: [rds-devel] KASAN: null-ptr-deref Read in rds_ib_get_mr

2018-05-11 Thread Sowmini Varadhan
On (05/11/18 15:48), Yanjun Zhu wrote:
> diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
> index e678699..2228b50 100644
> --- a/net/rds/ib_rdma.c
> +++ b/net/rds/ib_rdma.c
> @@ -539,11 +539,17 @@ void rds_ib_flush_mrs(void)
>void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
> struct rds_sock *rs, u32 *key_ret)
>{
> - struct rds_ib_device *rds_ibdev;
> + struct rds_ib_device *rds_ibdev = NULL;
> struct rds_ib_mr *ibmr = NULL;
> - struct rds_ib_connection *ic = rs->rs_conn->c_transport_data;
> + struct rds_ib_connection *ic = NULL;
> int ret;
> 
> + if (rs->rs_bound_addr == 0) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> + ic = rs->rs_conn->c_transport_data;
> rds_ibdev = rds_ib_get_device(rs->rs_bound_addr);
> if (!rds_ibdev) {
> ret = -ENODEV;
> 
> I made this raw patch. If you can reproduce this bug, please make tests 
> with it.

I dont think this solves the problem, I think it
just changes the timing under which it can still happen. 

what if the rds_remove_bound() in rds_bind() happens after the check
for if (rs->rs_bound_addr == 0) added above by the patch 

I believe you need some type of synchronization (either
through mutex, or some atomic flag in the rs or similar) to make
sure rds_bind() and rds_ib_get_mr() are mutually exclusive.

--Sowmini
 



Re: [rds-devel] KASAN: null-ptr-deref Read in rds_ib_get_mr

2018-05-11 Thread Sowmini Varadhan
On (05/11/18 15:48), Yanjun Zhu wrote:
> diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
> index e678699..2228b50 100644
> --- a/net/rds/ib_rdma.c
> +++ b/net/rds/ib_rdma.c
> @@ -539,11 +539,17 @@ void rds_ib_flush_mrs(void)
>void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
> struct rds_sock *rs, u32 *key_ret)
>{
> - struct rds_ib_device *rds_ibdev;
> + struct rds_ib_device *rds_ibdev = NULL;
> struct rds_ib_mr *ibmr = NULL;
> - struct rds_ib_connection *ic = rs->rs_conn->c_transport_data;
> + struct rds_ib_connection *ic = NULL;
> int ret;
> 
> + if (rs->rs_bound_addr == 0) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> + ic = rs->rs_conn->c_transport_data;
> rds_ibdev = rds_ib_get_device(rs->rs_bound_addr);
> if (!rds_ibdev) {
> ret = -ENODEV;
> 
> I made this raw patch. If you can reproduce this bug, please make tests 
> with it.

I dont think this solves the problem, I think it
just changes the timing under which it can still happen. 

what if the rds_remove_bound() in rds_bind() happens after the check
for if (rs->rs_bound_addr == 0) added above by the patch 

I believe you need some type of synchronization (either
through mutex, or some atomic flag in the rs or similar) to make
sure rds_bind() and rds_ib_get_mr() are mutually exclusive.

--Sowmini
 



Re: KASAN: use-after-free Read in inet_create

2018-04-08 Thread Sowmini Varadhan

#syz dup: KASAN: use-after-free Read in rds_cong_queue_updates

There  are a number of manifestations of this bug, basically
all suggest that the connect/reconnect etc workqs are somehow
being scheduled after the netns is deleted, despite the
code refactoring in Commit  3db6e0d172c (and looks like
the WARN_ONs in that commit are not even being triggered).
We've not been able to reproduce this issues, and without
a crash dump (or some hint of other threads that were running
at the time of the problem) are working on figuring out
the root-cause by code-inspection.

--Sowmini



Re: KASAN: use-after-free Read in inet_create

2018-04-08 Thread Sowmini Varadhan

#syz dup: KASAN: use-after-free Read in rds_cong_queue_updates

There  are a number of manifestations of this bug, basically
all suggest that the connect/reconnect etc workqs are somehow
being scheduled after the netns is deleted, despite the
code refactoring in Commit  3db6e0d172c (and looks like
the WARN_ONs in that commit are not even being triggered).
We've not been able to reproduce this issues, and without
a crash dump (or some hint of other threads that were running
at the time of the problem) are working on figuring out
the root-cause by code-inspection.

--Sowmini



Re: KASAN: slab-out-of-bounds Read in rds_cong_queue_updates

2018-03-19 Thread Sowmini Varadhan
On (03/19/18 09:29), Dmitry Vyukov wrote:
> 
> This looks the same as:
> 
> #syz dup: KASAN: use-after-free Read in rds_cong_queue_updates

correct, seems like the rds_destroy_pending() fixes did not seal
this race condition. I need to look at this more carefully to see
what race I missed.. no easy answer here, I am afraid.

--Sowmini


Re: KASAN: slab-out-of-bounds Read in rds_cong_queue_updates

2018-03-19 Thread Sowmini Varadhan
On (03/19/18 09:29), Dmitry Vyukov wrote:
> 
> This looks the same as:
> 
> #syz dup: KASAN: use-after-free Read in rds_cong_queue_updates

correct, seems like the rds_destroy_pending() fixes did not seal
this race condition. I need to look at this more carefully to see
what race I missed.. no easy answer here, I am afraid.

--Sowmini


Re: WARNING in __local_bh_enable_ip (2)

2018-03-14 Thread Sowmini Varadhan
On (03/14/18 14:28), Eric Dumazet wrote:
> 
> 
> spin_lock_bh(_tcp_conn_lock);/spin_unlock_bh(_tcp_conn_lock); in
> rds_tcp_conn_free()
> 
> is in conflict with the spin_lock_irqsave(_conn_lock, flags);
> in __rds_conn_create()

yes I was going to look into this and fix it later.

> Hard to understand why RDS is messing with hard irqs really.

some of it comes from the rds_rdma history: some parts of
the common rds and rds_rdma module get called in various
driver contexts for infiniband.

--Sowmini


Re: WARNING in __local_bh_enable_ip (2)

2018-03-14 Thread Sowmini Varadhan
On (03/14/18 14:28), Eric Dumazet wrote:
> 
> 
> spin_lock_bh(_tcp_conn_lock);/spin_unlock_bh(_tcp_conn_lock); in
> rds_tcp_conn_free()
> 
> is in conflict with the spin_lock_irqsave(_conn_lock, flags);
> in __rds_conn_create()

yes I was going to look into this and fix it later.

> Hard to understand why RDS is messing with hard irqs really.

some of it comes from the rds_rdma history: some parts of
the common rds and rds_rdma module get called in various
driver contexts for infiniband.

--Sowmini


Re: [PATCH][rds-next] rds: make functions rds_info_from_znotifier and rds_message_zcopy_from_user static

2018-03-11 Thread Sowmini Varadhan
On (03/11/18 18:03), Colin King wrote:
> From: Colin Ian King 
> 
> Functions rds_info_from_znotifier and rds_message_zcopy_from_user are
> local to the source and do not need to be in global scope, so make them
> static.

the rds_message_zcopy_from_user warning was already flagged by  kbuild-robot
last week.  See
   https://www.spinics.net/lists/netdev/msg488041.html


 



Re: [PATCH][rds-next] rds: make functions rds_info_from_znotifier and rds_message_zcopy_from_user static

2018-03-11 Thread Sowmini Varadhan
On (03/11/18 18:03), Colin King wrote:
> From: Colin Ian King 
> 
> Functions rds_info_from_znotifier and rds_message_zcopy_from_user are
> local to the source and do not need to be in global scope, so make them
> static.

the rds_message_zcopy_from_user warning was already flagged by  kbuild-robot
last week.  See
   https://www.spinics.net/lists/netdev/msg488041.html


 



Re: [rds-devel] [PATCH][rds-next] rds: remove redundant variable 'sg_off'

2018-03-11 Thread Sowmini Varadhan
On (03/11/18 17:27), Colin King wrote:
> Variable sg_off is assigned a value but it is never read, hence it is
> redundant and can be removed.
> 

Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com>




Re: [rds-devel] [PATCH][rds-next] rds: remove redundant variable 'sg_off'

2018-03-11 Thread Sowmini Varadhan
On (03/11/18 17:27), Colin King wrote:
> Variable sg_off is assigned a value but it is never read, hence it is
> redundant and can be removed.
> 

Acked-by: Sowmini Varadhan 




Re: [PATCH] rds: send: mark expected switch fall-through in rds_rm_size

2018-02-19 Thread Sowmini Varadhan
On (02/19/18 12:10), Gustavo A. R. Silva wrote:
> 
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.

Acked-by:  Sowmini Varadhan <sowmini.varad...@oracle.com>



Re: [PATCH] rds: send: mark expected switch fall-through in rds_rm_size

2018-02-19 Thread Sowmini Varadhan
On (02/19/18 12:10), Gustavo A. R. Silva wrote:
> 
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.

Acked-by:  Sowmini Varadhan 



Re: KASAN: use-after-free Read in rds_tcp_tune

2018-02-14 Thread Sowmini Varadhan
On (02/14/18 16:28), Dmitry Vyukov wrote:
> syzbot is probably not seeing this problem. However if you don't add
> the Reported-by tag to commit, nor provide syz fix tag, it will
> consider it as "open". One consequence of this is that it is still on
> our radars. Another consequence is that syzbot will never report bugs
> in rds_tcp_tune ever again as it thinks that it's the same known bug,
> so no point in bothering anybody.

understood, I think I saw this in the original syzbot mail as well,
but I was hesitant to actually add the tag because the fix was
based on code-inspection only, and I would have felt more comfortable
about asserting the Reported-by if I'd done a clear-cut before/after
verification.

btw, checkpatch.pl complains about the syzbot*@syzkaller.appspotmail.com
addresses as "Unrecognized email address", we should fix that
error from checkpatch at some point.

--Sowmini 



Re: KASAN: use-after-free Read in rds_tcp_tune

2018-02-14 Thread Sowmini Varadhan
On (02/14/18 16:28), Dmitry Vyukov wrote:
> syzbot is probably not seeing this problem. However if you don't add
> the Reported-by tag to commit, nor provide syz fix tag, it will
> consider it as "open". One consequence of this is that it is still on
> our radars. Another consequence is that syzbot will never report bugs
> in rds_tcp_tune ever again as it thinks that it's the same known bug,
> so no point in bothering anybody.

understood, I think I saw this in the original syzbot mail as well,
but I was hesitant to actually add the tag because the fix was
based on code-inspection only, and I would have felt more comfortable
about asserting the Reported-by if I'd done a clear-cut before/after
verification.

btw, checkpatch.pl complains about the syzbot*@syzkaller.appspotmail.com
addresses as "Unrecognized email address", we should fix that
error from checkpatch at some point.

--Sowmini 



Re: KASAN: use-after-free Read in rds_tcp_tune

2018-02-14 Thread Sowmini Varadhan
On (02/14/18 16:11), Dmitry Vyukov wrote:
> 
> Hi Sowmini,
> 
> Was this ever fixed? What's the fix? This still hangs as open. Please
> provide "syz fix" tag.

Are you still seeing this problem?

I had expected that the changes around rds_destroy_pending - see commit
ebeeb1ad9b8a - would have taken care of this (note that ebeeb1ad9b8a
refactors/updates 3db6e0d172c9) but those fixes were done by inspection
only. In other words, I was never able to reproduce this, so we may
still have missed some race condition.

--Sowmini






Re: KASAN: use-after-free Read in rds_tcp_tune

2018-02-14 Thread Sowmini Varadhan
On (02/14/18 16:11), Dmitry Vyukov wrote:
> 
> Hi Sowmini,
> 
> Was this ever fixed? What's the fix? This still hangs as open. Please
> provide "syz fix" tag.

Are you still seeing this problem?

I had expected that the changes around rds_destroy_pending - see commit
ebeeb1ad9b8a - would have taken care of this (note that ebeeb1ad9b8a
refactors/updates 3db6e0d172c9) but those fixes were done by inspection
only. In other words, I was never able to reproduce this, so we may
still have missed some race condition.

--Sowmini






Re: WARNING: suspicious RCU usage in rds_tcp_conn_alloc

2018-02-12 Thread Sowmini Varadhan
#syz dup: WARNING: suspicious RCU usage in rds_loop_conn_alloc



Re: WARNING: suspicious RCU usage in rds_tcp_conn_alloc

2018-02-12 Thread Sowmini Varadhan
#syz dup: WARNING: suspicious RCU usage in rds_loop_conn_alloc



Re: BUG: sleeping function called from invalid context at mm/slab.h:LINE (3)

2018-02-12 Thread Sowmini Varadhan
#syz dup: WARNING: suspicious RCU usage in rds_loop_conn_alloc



Re: BUG: sleeping function called from invalid context at mm/slab.h:LINE (3)

2018-02-12 Thread Sowmini Varadhan
#syz dup: WARNING: suspicious RCU usage in rds_loop_conn_alloc



Re: WARNING: suspicious RCU usage in rds_loop_conn_alloc

2018-02-12 Thread Sowmini Varadhan
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by syzkaller563791/4086:
>  #0:  (rcu_read_lock){}, at: []
> __rds_conn_create+0xe46/0x1b50 net/rds/connection.c:218
> 

the rcu_read_lock() was added by ebeeb1ad9b. After we add that,
everything in the rcu read critical section needs to use GFP_ATOMIC
to avoid sleeping. I'll look into this fix.

--Sowmini



Re: WARNING: suspicious RCU usage in rds_loop_conn_alloc

2018-02-12 Thread Sowmini Varadhan
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by syzkaller563791/4086:
>  #0:  (rcu_read_lock){}, at: []
> __rds_conn_create+0xe46/0x1b50 net/rds/connection.c:218
> 

the rcu_read_lock() was added by ebeeb1ad9b. After we add that,
everything in the rcu read critical section needs to use GFP_ATOMIC
to avoid sleeping. I'll look into this fix.

--Sowmini



Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit

2018-01-30 Thread Sowmini Varadhan
On (01/30/18 14:22), Eric Biggers wrote:
>
> I assume you weren't able to reproduce this?  This crash hasn't been
> seen again,
   :
> I am invalidating the bug for syzbot so it will report the same crash
> signature
> again if it occurs, but if you think there is a real bug feel free to keep
> looking into it.

correct I was not able to reproduce this.  However based on code
inspecion, I came up with

commit 3db6e0d172c94bd9953a1347c55ffb64b1d2e74f
rds: use RCU to synchronize work-enqueue with connection teardown

Marking  it invalid sounds good to me.

--Sowmini


Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit

2018-01-30 Thread Sowmini Varadhan
On (01/30/18 14:22), Eric Biggers wrote:
>
> I assume you weren't able to reproduce this?  This crash hasn't been
> seen again,
   :
> I am invalidating the bug for syzbot so it will report the same crash
> signature
> again if it occurs, but if you think there is a real bug feel free to keep
> looking into it.

correct I was not able to reproduce this.  However based on code
inspecion, I came up with

commit 3db6e0d172c94bd9953a1347c55ffb64b1d2e74f
rds: use RCU to synchronize work-enqueue with connection teardown

Marking  it invalid sounds good to me.

--Sowmini


Re: KASAN: use-after-free Read in rds_tcp_tune

2018-01-12 Thread Sowmini Varadhan
On (01/11/18 21:29), syzbot wrote:
> ==
> BUG: KASAN: use-after-free in rds_tcp_tune+0x491/0x520 net/rds/tcp.c:397
> Read of size 4 at addr 8801cd5f6c58 by task kworker/u4:4/4954

Just had an offline discussion with santosh around this, here's a summary
of that discussion for the archives:

Looks like an rds_connect_worker workq got scheduled after the 
netns was deleted. This could happen if an an rds_connection got
added between lines 528 and 529 of 

  506 static void rds_tcp_kill_sock(struct net *net)
  :
  /* code to pull out all the rds_connections that should be destroyed */
  :
  528 spin_unlock_irq(_tcp_conn_lock);
  529 list_for_each_entry_safe(tc, _tc, _list, t_tcp_node)
  530 rds_conn_destroy(tc->t_cpath->cp_conn);

Such an rds_connection would miss out the rds_conn_destroy() 
loop (that cancels all pending work) and (if it was scheduled
after netns deletion) could trigger the use-after-free.

Evaluating various fixes for this (including using _bh instead of _irq
as suggested by santosh), I'll get back with a patch soon.

--Sowmini






Re: KASAN: use-after-free Read in rds_tcp_tune

2018-01-12 Thread Sowmini Varadhan
On (01/11/18 21:29), syzbot wrote:
> ==
> BUG: KASAN: use-after-free in rds_tcp_tune+0x491/0x520 net/rds/tcp.c:397
> Read of size 4 at addr 8801cd5f6c58 by task kworker/u4:4/4954

Just had an offline discussion with santosh around this, here's a summary
of that discussion for the archives:

Looks like an rds_connect_worker workq got scheduled after the 
netns was deleted. This could happen if an an rds_connection got
added between lines 528 and 529 of 

  506 static void rds_tcp_kill_sock(struct net *net)
  :
  /* code to pull out all the rds_connections that should be destroyed */
  :
  528 spin_unlock_irq(_tcp_conn_lock);
  529 list_for_each_entry_safe(tc, _tc, _list, t_tcp_node)
  530 rds_conn_destroy(tc->t_cpath->cp_conn);

Such an rds_connection would miss out the rds_conn_destroy() 
loop (that cancels all pending work) and (if it was scheduled
after netns deletion) could trigger the use-after-free.

Evaluating various fixes for this (including using _bh instead of _irq
as suggested by santosh), I'll get back with a patch soon.

--Sowmini






Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit

2017-12-18 Thread Sowmini Varadhan
> From: Santosh Shilimkar 
> Date: Mon, 18 Dec 2017 08:28:05 -0800
  :
> > Looks like another one tripping on empty transport. Mostly below
> > should
> > address it but we will test it if it does.

that was my first thought, but it cannot be the case here: rds_sendmsg
etc itself would have bombed if that were the case, and the packet
would never have gotten queued.

This is unlike f3069c6d33, where an applications skips the transport
binding (either misses the explicit bind, or gets the wrong transport
due to an implicit bind) before it triggers the setsockopt.

I suspect that the problems is that the conn (and thus c_trans)
have gotten destroyed, but the cp_send_w work got incorrectly 
re-queued. For example, rds_cong_queue_updates() (because the
peer sent a congestion update) can happen in softirq context, 
and would end up requeing work in the middle of rds_conn_destroy, 
after we have assumed that everything is quisced.

On (12/18/17 12:12), David Miller wrote:
> 
> We're seeming to accumulate a lot of checks like this, maybe there
> is a more general way to deal with this problem?

Yeah, I was thinking about this..  let me try to reprodcue this in-house
and get back with a patchset.  

--Sowmini




Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit

2017-12-18 Thread Sowmini Varadhan
> From: Santosh Shilimkar 
> Date: Mon, 18 Dec 2017 08:28:05 -0800
  :
> > Looks like another one tripping on empty transport. Mostly below
> > should
> > address it but we will test it if it does.

that was my first thought, but it cannot be the case here: rds_sendmsg
etc itself would have bombed if that were the case, and the packet
would never have gotten queued.

This is unlike f3069c6d33, where an applications skips the transport
binding (either misses the explicit bind, or gets the wrong transport
due to an implicit bind) before it triggers the setsockopt.

I suspect that the problems is that the conn (and thus c_trans)
have gotten destroyed, but the cp_send_w work got incorrectly 
re-queued. For example, rds_cong_queue_updates() (because the
peer sent a congestion update) can happen in softirq context, 
and would end up requeing work in the middle of rds_conn_destroy, 
after we have assumed that everything is quisced.

On (12/18/17 12:12), David Miller wrote:
> 
> We're seeming to accumulate a lot of checks like this, maybe there
> is a more general way to deal with this problem?

Yeah, I was thinking about this..  let me try to reprodcue this in-house
and get back with a patchset.  

--Sowmini




Re: KASAN: use-after-free Read in rds_tcp_dev_event

2017-11-13 Thread Sowmini Varadhan
On (11/13/17 19:30), Girish Moodalbail wrote:
> (L538-540). However, it leaves behind some of the rds_tcp connections that
> shared the same underlying RDS connection (L534 and 535). These connections
> with pointer to stale network namespace are left behind in the global list.

It leaves behind no such thing. After mprds, you want to collect
only one instance of the conn that is being removed, that's why
lines 534-535 skips over duplicat instances of the same conn
(for multiple paths in the same conn).

> When the 2nd network namespace is deleted, we will hit the above stale
> pointer and hit UAF panic.
> I think we should move away from global list to a per-namespace list. The
> global list are used only in two places (both of which are per-namespace
> operations):

Nice try, but not so. 

Let me look at this tomorrow, I missed this mail in my mbox.

--Sowmini



Re: KASAN: use-after-free Read in rds_tcp_dev_event

2017-11-13 Thread Sowmini Varadhan
On (11/13/17 19:30), Girish Moodalbail wrote:
> (L538-540). However, it leaves behind some of the rds_tcp connections that
> shared the same underlying RDS connection (L534 and 535). These connections
> with pointer to stale network namespace are left behind in the global list.

It leaves behind no such thing. After mprds, you want to collect
only one instance of the conn that is being removed, that's why
lines 534-535 skips over duplicat instances of the same conn
(for multiple paths in the same conn).

> When the 2nd network namespace is deleted, we will hit the above stale
> pointer and hit UAF panic.
> I think we should move away from global list to a per-namespace list. The
> global list are used only in two places (both of which are per-namespace
> operations):

Nice try, but not so. 

Let me look at this tomorrow, I missed this mail in my mbox.

--Sowmini



Re: [PATCH net] rds: Make sure updates to cp_send_gen can be observed

2017-07-20 Thread Sowmini Varadhan
On (07/20/17 12:28), H??kon Bugge wrote:
> cp->cp_send_gen is treated as a normal variable, although it may be
> used by different threads.

I'm confused by that assertion. If you look at the comments right
above the change in your patch, there is a note that 
acquire_in_xmit/release_in_xmit are the synchronization/serialization 
points.

Can you please clarify?

> --- a/net/rds/send.c
> +++ b/net/rds/send.c
> @@ -170,8 +170,8 @@ int rds_send_xmit(struct rds_conn_path *cp)
>* The acquire_in_xmit() check above ensures that only one
>* caller can increment c_send_gen at any time.
>*/
> - cp->cp_send_gen++;
> - send_gen = cp->cp_send_gen;
> + send_gen = READ_ONCE(cp->cp_send_gen) + 1;
> + WRITE_ONCE(cp->cp_send_gen, send_gen);
>  

--Sowmini



Re: [PATCH net] rds: Make sure updates to cp_send_gen can be observed

2017-07-20 Thread Sowmini Varadhan
On (07/20/17 12:28), H??kon Bugge wrote:
> cp->cp_send_gen is treated as a normal variable, although it may be
> used by different threads.

I'm confused by that assertion. If you look at the comments right
above the change in your patch, there is a note that 
acquire_in_xmit/release_in_xmit are the synchronization/serialization 
points.

Can you please clarify?

> --- a/net/rds/send.c
> +++ b/net/rds/send.c
> @@ -170,8 +170,8 @@ int rds_send_xmit(struct rds_conn_path *cp)
>* The acquire_in_xmit() check above ensures that only one
>* caller can increment c_send_gen at any time.
>*/
> - cp->cp_send_gen++;
> - send_gen = cp->cp_send_gen;
> + send_gen = READ_ONCE(cp->cp_send_gen) + 1;
> + WRITE_ONCE(cp->cp_send_gen, send_gen);
>  

--Sowmini



Re: [GIT] Networking

2017-07-10 Thread Sowmini Varadhan
On (07/10/17 18:05), Herbert Xu wrote:
> 
> Hmm, I can't see the problem in af_alg_accept.  The struct socket
> comes directly from sys_accept() which creates it using sock_alloc.
> 
> So the only thing I can think of is that the memory returned by
> sock_alloc is not zeroed and therefore the WARN_ON is just reading
> garbage.

Then it is odd that this WARN_ON is not triggered for other sockets
(e.g., for TCP sockets), though it happens easily with AF_ALG. 

But it's not sock_alloc() - that function is returning a properly
zeroed ->sk.

The reason that the WARN_ON is triggered is that af_alg_accept() calls
sock_init_data() which does 

   2636 if (sock) {
:
   2639 sock->sk=   sk;


So we can do one of the following:

1. drop the WARN_ON(), which makes true leaks hard to detect
2. change the WARN_ON() to WARN_ON(parent->sk && parent->sk != sk)

#2 assumes that all the refcount book-keeping is being done
correctly (there is the danger that we end up taking 2 refs on the sk) 

--Sowmini






Re: [GIT] Networking

2017-07-10 Thread Sowmini Varadhan
On (07/10/17 18:05), Herbert Xu wrote:
> 
> Hmm, I can't see the problem in af_alg_accept.  The struct socket
> comes directly from sys_accept() which creates it using sock_alloc.
> 
> So the only thing I can think of is that the memory returned by
> sock_alloc is not zeroed and therefore the WARN_ON is just reading
> garbage.

Then it is odd that this WARN_ON is not triggered for other sockets
(e.g., for TCP sockets), though it happens easily with AF_ALG. 

But it's not sock_alloc() - that function is returning a properly
zeroed ->sk.

The reason that the WARN_ON is triggered is that af_alg_accept() calls
sock_init_data() which does 

   2636 if (sock) {
:
   2639 sock->sk=   sk;


So we can do one of the following:

1. drop the WARN_ON(), which makes true leaks hard to detect
2. change the WARN_ON() to WARN_ON(parent->sk && parent->sk != sk)

#2 assumes that all the refcount book-keeping is being done
correctly (there is the danger that we end up taking 2 refs on the sk) 

--Sowmini






Re: [GIT] Networking

2017-07-09 Thread Sowmini Varadhan
On (07/09/17 11:49), Linus Torvalds wrote:
> 
> On Sat, Jul 8, 2017 at 3:36 AM, David Miller <da...@davemloft.net> wrote:
> >
> > 8) Fix socket leak on accept() in RDS, from Sowmini Varadhan.  Also
> >add a WARN_ON() to sock_graft() so other protocol stacks don't trip
> >over this as well.
> 
> Hmm. This one triggers for me on both my desktop and laptop at bootup.
> Bog-standard machines, running F25 and F24 respectively.
> 
> The warning doesn't seem particularly useful, although maybe that
> "alg_accept()" gives people who know this code enough of a clue.

My initial question was whether sock_graft() should do a sock_put()
before cutting loose the existing parent->sk and assigning a new
parent->sk (https://www.spinics.net/lists/netdev/msg442191.html)

It look like PF_ALG sets up a ->sk in alg_create() (but this
would get over-written in alg_accept()?) 

Cc'ing Herbert to see if this is expected behavior (and PF_ALG
somehow does the right thing with the refcount for the ->sk
set up in alg_create) in which case I suppose we should drop the 
WARN_ON. 

--Sowmini

> [ cut here ]
> WARNING: CPU: 1 PID: 492 at ./include/net/sock.h:1700 
> af_alg_accept+0x1bf/0x1f0
> CPU: 1 PID: 492 Comm: systemd-cryptse Not tainted 4.12.0-09010-g2b976203417c 
> #1
> Hardware name: System manufacturer System Product Name/Z170-K, BIOS
> 1803 05/06/2016
> RIP: 0010:af_alg_accept+0x1bf/0x1f0
> Call Trace:
>  alg_accept+0x15/0x20
>  SYSC_accept4+0x105/0x210
>  ? getnstimeofday64+0xe/0x20
>  ? __audit_syscall_entry+0xb1/0xf0
>  ? syscall_trace_enter+0x1bd/0x2d0
>  ? __audit_syscall_exit+0x1a5/0x2a0
>  SyS_accept+0x10/0x20
>  do_syscall_64+0x61/0x140
>  entry_SYSCALL64_slow_path+0x25/0x25
> ---[ end trace a35e5baea85df269 ]---


Re: [GIT] Networking

2017-07-09 Thread Sowmini Varadhan
On (07/09/17 11:49), Linus Torvalds wrote:
> 
> On Sat, Jul 8, 2017 at 3:36 AM, David Miller  wrote:
> >
> > 8) Fix socket leak on accept() in RDS, from Sowmini Varadhan.  Also
> >add a WARN_ON() to sock_graft() so other protocol stacks don't trip
> >over this as well.
> 
> Hmm. This one triggers for me on both my desktop and laptop at bootup.
> Bog-standard machines, running F25 and F24 respectively.
> 
> The warning doesn't seem particularly useful, although maybe that
> "alg_accept()" gives people who know this code enough of a clue.

My initial question was whether sock_graft() should do a sock_put()
before cutting loose the existing parent->sk and assigning a new
parent->sk (https://www.spinics.net/lists/netdev/msg442191.html)

It look like PF_ALG sets up a ->sk in alg_create() (but this
would get over-written in alg_accept()?) 

Cc'ing Herbert to see if this is expected behavior (and PF_ALG
somehow does the right thing with the refcount for the ->sk
set up in alg_create) in which case I suppose we should drop the 
WARN_ON. 

--Sowmini

> [ cut here ]
> WARNING: CPU: 1 PID: 492 at ./include/net/sock.h:1700 
> af_alg_accept+0x1bf/0x1f0
> CPU: 1 PID: 492 Comm: systemd-cryptse Not tainted 4.12.0-09010-g2b976203417c 
> #1
> Hardware name: System manufacturer System Product Name/Z170-K, BIOS
> 1803 05/06/2016
> RIP: 0010:af_alg_accept+0x1bf/0x1f0
> Call Trace:
>  alg_accept+0x15/0x20
>  SYSC_accept4+0x105/0x210
>  ? getnstimeofday64+0xe/0x20
>  ? __audit_syscall_entry+0xb1/0xf0
>  ? syscall_trace_enter+0x1bd/0x2d0
>  ? __audit_syscall_exit+0x1a5/0x2a0
>  SyS_accept+0x10/0x20
>  do_syscall_64+0x61/0x140
>  entry_SYSCALL64_slow_path+0x25/0x25
> ---[ end trace a35e5baea85df269 ]---


Re: RDS: TCP: Delete an error message for a failed memory allocation in rds_tcp_init_net()

2017-05-22 Thread Sowmini Varadhan
> Do you find information from a Linux allocation failure report sufficient
> for such an use case?

yes, I suppose that would cover the needed cases. 
Your change looks good to me.

Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com>







Re: RDS: TCP: Delete an error message for a failed memory allocation in rds_tcp_init_net()

2017-05-22 Thread Sowmini Varadhan
> Do you find information from a Linux allocation failure report sufficient
> for such an use case?

yes, I suppose that would cover the needed cases. 
Your change looks good to me.

Acked-by: Sowmini Varadhan 







Re: [PATCH 3/3] RDS: TCP: Delete an error message for a failed memory allocation in rds_tcp_init_net()

2017-05-22 Thread Sowmini Varadhan
On (05/22/17 16:13), SF Markus Elfring wrote:
> 
> Omit an extra message for a memory allocation failure in this function.

The change itself is harmless,  but I'm curious about the "extra"
part: "extra" from what? If this happens, hopefully this will be logged
somewhere? Note that this type of (infrequent) logging noise is useful
in some cases, e.g., with 8ce675ff, when one is trying to do the
post-mortem of where things first went wrong.

--Sowmini



Re: [PATCH 3/3] RDS: TCP: Delete an error message for a failed memory allocation in rds_tcp_init_net()

2017-05-22 Thread Sowmini Varadhan
On (05/22/17 16:13), SF Markus Elfring wrote:
> 
> Omit an extra message for a memory allocation failure in this function.

The change itself is harmless,  but I'm curious about the "extra"
part: "extra" from what? If this happens, hopefully this will be logged
somewhere? Note that this type of (infrequent) logging noise is useful
in some cases, e.g., with 8ce675ff, when one is trying to do the
post-mortem of where things first went wrong.

--Sowmini



Re: Question about SOCK_SEQPACKET

2017-05-05 Thread Sowmini Varadhan
On (05/05/17 10:45), Steven Whitehouse wrote:
> 
> I do wonder if the man page for recvmsg is wrong, or at least a bit
> confusing. SOCK_SEQPACKET is stream based not message based - it just
> happens to have EOR markers in the stream. There is no reason that the whole
> message needs to be returned in a single read, and in fact that would be
> impossible if the sender didn't insert any EOR markers but kept sending data
> beyond the size that the socket could buffer.
> 
> I notice that man 7 socket says SOCK_SEQPACKET is for datagrams of fixed
> maximum length which is definitely wrong, as is the statement that a
> consumer has to read an entire packet with each system call.

Which man page do you think is wrong here? The POSIX definition is here

http://pubs.opengroup.org/onlinepubs/009695399/functions/recvmsg.html

The description in

http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_10.html

says, "It is protocol-specific whether a maximum record size is imposed."
In my machine (Ubuntu 4.4.0-72, and it is in socket(2), not socket(7), btw)
doesnt have any references to max length, but I'm not sure I'd boldly assert
"definitely wrong" about the requirement of having to read entire
packet in a system call (see POSIX man page)

--Sowmini



Re: Question about SOCK_SEQPACKET

2017-05-05 Thread Sowmini Varadhan
On (05/05/17 10:45), Steven Whitehouse wrote:
> 
> I do wonder if the man page for recvmsg is wrong, or at least a bit
> confusing. SOCK_SEQPACKET is stream based not message based - it just
> happens to have EOR markers in the stream. There is no reason that the whole
> message needs to be returned in a single read, and in fact that would be
> impossible if the sender didn't insert any EOR markers but kept sending data
> beyond the size that the socket could buffer.
> 
> I notice that man 7 socket says SOCK_SEQPACKET is for datagrams of fixed
> maximum length which is definitely wrong, as is the statement that a
> consumer has to read an entire packet with each system call.

Which man page do you think is wrong here? The POSIX definition is here

http://pubs.opengroup.org/onlinepubs/009695399/functions/recvmsg.html

The description in

http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_10.html

says, "It is protocol-specific whether a maximum record size is imposed."
In my machine (Ubuntu 4.4.0-72, and it is in socket(2), not socket(7), btw)
doesnt have any references to max length, but I'm not sure I'd boldly assert
"definitely wrong" about the requirement of having to read entire
packet in a system call (see POSIX man page)

--Sowmini



Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-15 Thread Sowmini Varadhan
On (03/15/17 10:08), Dmitry Vyukov wrote:
> After I've applied the patch these reports stopped to happen, and I
> have not seem any other reports that look relevant.
> However, it there was one, but it looks like a different issue and it
> was probably masked by massive amounts of original deadlock reports:

Yes, this looks like a valid deadlock.

I think there may be some ->dumpit callbacks that take the rtnl_lock
and do not unlock it before return, e.g., I see nl80211_dump_interface()
doing this at 

   2612 rtnl_lock();
   2613 if (!cb->args[2]) {
 :
   2619 ret = nl80211_dump_wiphy_parse(skb, cb, );
   2620 if (ret)
   2621 return ret;

afaict, nl80211_dump_wiphy_parse does not itself do rtnl_unlock on error.


If that's the case then we'd run into the circular locking dependancy
flagged by lockdep. 

Disclaimer: I did not check every single ->dumpit, there may be more
than one of these..






Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-15 Thread Sowmini Varadhan
On (03/15/17 10:08), Dmitry Vyukov wrote:
> After I've applied the patch these reports stopped to happen, and I
> have not seem any other reports that look relevant.
> However, it there was one, but it looks like a different issue and it
> was probably masked by massive amounts of original deadlock reports:

Yes, this looks like a valid deadlock.

I think there may be some ->dumpit callbacks that take the rtnl_lock
and do not unlock it before return, e.g., I see nl80211_dump_interface()
doing this at 

   2612 rtnl_lock();
   2613 if (!cb->args[2]) {
 :
   2619 ret = nl80211_dump_wiphy_parse(skb, cb, );
   2620 if (ret)
   2621 return ret;

afaict, nl80211_dump_wiphy_parse does not itself do rtnl_unlock on error.


If that's the case then we'd run into the circular locking dependancy
flagged by lockdep. 

Disclaimer: I did not check every single ->dumpit, there may be more
than one of these..






Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Sowmini Varadhan
On (03/14/17 09:14), Dmitry Vyukov wrote:
> Another one now involving rds_tcp_listen_stop
   :
> kworker/u4:1/19 is trying to acquire lock:
>  (sk_lock-AF_INET){+.+.+.}, at: [] lock_sock
> include/net/sock.h:1460 [inline]
>  (sk_lock-AF_INET){+.+.+.}, at: []
> rds_tcp_listen_stop+0x5c/0x150 net/rds/tcp_listen.c:288
> 
> but task is already holding lock:
>  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:70

Is this also a false positive?

genl_lock_dumpit takes the genl_lock and then waits on the rtnl_lock
(e.g., out of tipc_nl_bearer_dump).

netdev_run_todo takes the rtnl_lock and then wants lock_sock()
for the TCP/IPv4 socket. 

Why is lockdep seeing a circular dependancy here? Same pattern
seems to be happening  for 
  http://www.spinics.net/lists/netdev/msg423368.html
and maybe also http://www.spinics.net/lists/netdev/msg423323.html?

--Sowmini

> Chain exists of:
>   sk_lock-AF_INET --> genl_mutex --> rtnl_mutex
> 
>  Possible unsafe locking scenario:
> 
>CPU0CPU1
>
>   lock(rtnl_mutex);
>lock(genl_mutex);
>lock(rtnl_mutex);
>   lock(sk_lock-AF_INET);
> 
>  *** DEADLOCK ***
> 
> 4 locks held by kworker/u4:1/19:
>  #0:  ("%s""netns"){.+.+.+}, at: []
> __write_once_size include/linux/compiler.h:283 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] atomic64_set
> arch/x86/include/asm/atomic64_64.h:33 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] atomic_long_set
> include/asm-generic/atomic-long.h:56 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] set_work_data
> kernel/workqueue.c:617 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: []
> set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: []
> process_one_work+0xab3/0x1c10 kernel/workqueue.c:2089
>  #1:  (net_cleanup_work){+.+.+.}, at: []
> process_one_work+0xb07/0x1c10 kernel/workqueue.c:2093
>  #2:  (net_mutex){+.+.+.}, at: []
> cleanup_net+0x22b/0xa90 net/core/net_namespace.c:429
>  #3:  (rtnl_mutex){+.+.+.}, at: []
> rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
> 


Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Sowmini Varadhan
On (03/14/17 09:14), Dmitry Vyukov wrote:
> Another one now involving rds_tcp_listen_stop
   :
> kworker/u4:1/19 is trying to acquire lock:
>  (sk_lock-AF_INET){+.+.+.}, at: [] lock_sock
> include/net/sock.h:1460 [inline]
>  (sk_lock-AF_INET){+.+.+.}, at: []
> rds_tcp_listen_stop+0x5c/0x150 net/rds/tcp_listen.c:288
> 
> but task is already holding lock:
>  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:70

Is this also a false positive?

genl_lock_dumpit takes the genl_lock and then waits on the rtnl_lock
(e.g., out of tipc_nl_bearer_dump).

netdev_run_todo takes the rtnl_lock and then wants lock_sock()
for the TCP/IPv4 socket. 

Why is lockdep seeing a circular dependancy here? Same pattern
seems to be happening  for 
  http://www.spinics.net/lists/netdev/msg423368.html
and maybe also http://www.spinics.net/lists/netdev/msg423323.html?

--Sowmini

> Chain exists of:
>   sk_lock-AF_INET --> genl_mutex --> rtnl_mutex
> 
>  Possible unsafe locking scenario:
> 
>CPU0CPU1
>
>   lock(rtnl_mutex);
>lock(genl_mutex);
>lock(rtnl_mutex);
>   lock(sk_lock-AF_INET);
> 
>  *** DEADLOCK ***
> 
> 4 locks held by kworker/u4:1/19:
>  #0:  ("%s""netns"){.+.+.+}, at: []
> __write_once_size include/linux/compiler.h:283 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] atomic64_set
> arch/x86/include/asm/atomic64_64.h:33 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] atomic_long_set
> include/asm-generic/atomic-long.h:56 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] set_work_data
> kernel/workqueue.c:617 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: []
> set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: []
> process_one_work+0xab3/0x1c10 kernel/workqueue.c:2089
>  #1:  (net_cleanup_work){+.+.+.}, at: []
> process_one_work+0xb07/0x1c10 kernel/workqueue.c:2093
>  #2:  (net_mutex){+.+.+.}, at: []
> cleanup_net+0x22b/0xa90 net/core/net_namespace.c:429
>  #3:  (rtnl_mutex){+.+.+.}, at: []
> rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
> 


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan

Actually, I'm not sure if I can assert that these are all manifestations
of the same bug- was a netns-delete involved in this one as well?

I see:

> BUG: KASAN: use-after-free in memcmp+0xe3/0x160 lib/string.c:768 at
:
>  memcmp+0xe3/0x160 lib/string.c:768
:
>  rds_find_bound+0x4fe/0x8a0 net/rds/bind.c:63
>  rds_recv_incoming+0x5f3/0x12c0 net/rds/recv.c:349
>  rds_loop_xmit+0x1c5/0x490 net/rds/loop.c:82
:
This appears to be for a looped back packet, and looks like there
are problems with  some rds_sock that got removed from the bind_hash_table..

According to the report, socket was created at
> Allocated:
> PID = 5235
:
>  sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
>  sk_alloc+0x105/0x1010 net/core/sock.c:1396
>  rds_create+0x11c/0x600 net/rds/af_rds.c:504

and closed at some point:
> Freed:
> PID = 5235
 :
>  rds_release+0x3a1/0x4d0 net/rds/af_rds.c:89
>  sock_release+0x8d/0x1e0 net/socket.c:599

This is all uspace created rds sockets, and while there may be an
unrelated bug here, I'm not sure I see  the netns/kernel-socket
connection.. can you please clarify if this was also seen in some netns
context?

--Sowmini







Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan

Actually, I'm not sure if I can assert that these are all manifestations
of the same bug- was a netns-delete involved in this one as well?

I see:

> BUG: KASAN: use-after-free in memcmp+0xe3/0x160 lib/string.c:768 at
:
>  memcmp+0xe3/0x160 lib/string.c:768
:
>  rds_find_bound+0x4fe/0x8a0 net/rds/bind.c:63
>  rds_recv_incoming+0x5f3/0x12c0 net/rds/recv.c:349
>  rds_loop_xmit+0x1c5/0x490 net/rds/loop.c:82
:
This appears to be for a looped back packet, and looks like there
are problems with  some rds_sock that got removed from the bind_hash_table..

According to the report, socket was created at
> Allocated:
> PID = 5235
:
>  sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
>  sk_alloc+0x105/0x1010 net/core/sock.c:1396
>  rds_create+0x11c/0x600 net/rds/af_rds.c:504

and closed at some point:
> Freed:
> PID = 5235
 :
>  rds_release+0x3a1/0x4d0 net/rds/af_rds.c:89
>  sock_release+0x8d/0x1e0 net/socket.c:599

This is all uspace created rds sockets, and while there may be an
unrelated bug here, I'm not sure I see  the netns/kernel-socket
connection.. can you please clarify if this was also seen in some netns
context?

--Sowmini







Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 18:45), Dmitry Vyukov wrote:
> 
> Yes, I can now apply custom patches to the bots. However, it fired
> only 3 times, so it will give weak signal. But at least it will test
> that the patch does not cause other bad things.

Ok, let me do my bit of homework on this one and get back to you
(probably by tomorrow)..




Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 18:45), Dmitry Vyukov wrote:
> 
> Yes, I can now apply custom patches to the bots. However, it fired
> only 3 times, so it will give weak signal. But at least it will test
> that the patch does not cause other bad things.

Ok, let me do my bit of homework on this one and get back to you
(probably by tomorrow)..




Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 17:51), Dmitry Vyukov wrote:
> Searching other crashes for "net/rds" I found 2 more crashes that may
> be related. They suggest that the delayed works are not properly
> stopped when the socket is destroyed. That would explain how
> rds_connect_worker accesses freed net, right?

yes, I think we may want to explicitly cancel this workq.. this
in rds_conn_destroy().

I'm trying to build/sanity-test (if lucky, reproduce the bug)
as I send this out.. let me get back to you..

If I have a patch against net-next, would you be willing/able to 
try it out? given that this does not show up on demand, I'm not
sure how we can check that "the fix worked"..

--Sowmini


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 17:51), Dmitry Vyukov wrote:
> Searching other crashes for "net/rds" I found 2 more crashes that may
> be related. They suggest that the delayed works are not properly
> stopped when the socket is destroyed. That would explain how
> rds_connect_worker accesses freed net, right?

yes, I think we may want to explicitly cancel this workq.. this
in rds_conn_destroy().

I'm trying to build/sanity-test (if lucky, reproduce the bug)
as I send this out.. let me get back to you..

If I have a patch against net-next, would you be willing/able to 
try it out? given that this does not show up on demand, I'm not
sure how we can check that "the fix worked"..

--Sowmini


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 17:32), Dmitry Vyukov wrote:
> Not reproducible so far.
> 
> rds is compiled into kernel (no modules):
> CONFIG_RDS=y
> CONFIG_RDS_TCP=y

I see. So if it never gets unloaded, the rds_connections "should"
be around forever.. let me inspect code and see if I spot some 
race-window.. 

> Also fuzzer actively creates and destroys namespaces.
> Yes, I don't see socket(0x15) in the log. Probably it was truncated.

I see. May be useful if we coudl get a crash dump to see what
other threads were going on (might give a hint about which threads
were racing). I'll try reproducing this at my end too.

--Sowmini


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 17:32), Dmitry Vyukov wrote:
> Not reproducible so far.
> 
> rds is compiled into kernel (no modules):
> CONFIG_RDS=y
> CONFIG_RDS_TCP=y

I see. So if it never gets unloaded, the rds_connections "should"
be around forever.. let me inspect code and see if I spot some 
race-window.. 

> Also fuzzer actively creates and destroys namespaces.
> Yes, I don't see socket(0x15) in the log. Probably it was truncated.

I see. May be useful if we coudl get a crash dump to see what
other threads were going on (might give a hint about which threads
were racing). I'll try reproducing this at my end too.

--Sowmini


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 16:49), Dmitry Vyukov wrote:
> 
> Grepping "socket" there, it was doing lots of things with sockets. Are
> we looking for some particular socket type? If there are few programs
> that create sockets of that type, then we can narrow down the set:

Yes, we are looking for PF_RDS/AF_RDS - this should be 
#define AF_RDS  21  /* RDS sockets  */

I see PF_KCM there (value 41) but no instances of 0x15.. how did 
the rds_connect_worker thread get kicked off at all?

the way this is supposed to work is
1. someone modprobes rds-tcp
2. app tries to do rds_sendmsg to some ip address in a netns - this triggers the
   creation of an rds_connection, and subsequent kernel socket TCP connection
   threads (i.e., rds_connect_worker) for that netns
3. if you unload rds-tcp, the module_unload should do all the cleanup
   needed via rds_tcp_conn_paths_destroy. This is done  
Its not clear to me that the test is doing any of this... 

is this reproducible? let me check if there is some race window where
we can restart a connection attempt when rds_tcp_kill_sock assumes
that the connect worker has been quiesced..

--Sowmini


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 16:49), Dmitry Vyukov wrote:
> 
> Grepping "socket" there, it was doing lots of things with sockets. Are
> we looking for some particular socket type? If there are few programs
> that create sockets of that type, then we can narrow down the set:

Yes, we are looking for PF_RDS/AF_RDS - this should be 
#define AF_RDS  21  /* RDS sockets  */

I see PF_KCM there (value 41) but no instances of 0x15.. how did 
the rds_connect_worker thread get kicked off at all?

the way this is supposed to work is
1. someone modprobes rds-tcp
2. app tries to do rds_sendmsg to some ip address in a netns - this triggers the
   creation of an rds_connection, and subsequent kernel socket TCP connection
   threads (i.e., rds_connect_worker) for that netns
3. if you unload rds-tcp, the module_unload should do all the cleanup
   needed via rds_tcp_conn_paths_destroy. This is done  
Its not clear to me that the test is doing any of this... 

is this reproducible? let me check if there is some race window where
we can restart a connection attempt when rds_tcp_kill_sock assumes
that the connect worker has been quiesced..

--Sowmini


Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 15:22), Dmitry Vyukov wrote:
> 
> Hello,
> 
> I've got the following report while running syzkaller fuzzer on
> linux-next/8d01c069486aca75b8f6018a759215b0ed0c91f0. So far it
> happened only once. net was somehow deleted from underneath
> inet_create. I've noticed that rds uses sock_create_kern which does
> not take net reference. What is that that must keep net alive then?

The rds_connection (which is where the net pointer is being obtained from)
should keep the connection alive. Did you have the rds[_tcp] modules
loaded at the time of failure? Were there kernel tcp sockets to/from
the 16385 port? any hints on what else the test was doing (was it
running a userspace RDS application that triggered the kernel TCP
connection attempt in the first place)?

--Sowmini



Re: net/rds: use-after-free in inet_create

2017-02-28 Thread Sowmini Varadhan
On (02/28/17 15:22), Dmitry Vyukov wrote:
> 
> Hello,
> 
> I've got the following report while running syzkaller fuzzer on
> linux-next/8d01c069486aca75b8f6018a759215b0ed0c91f0. So far it
> happened only once. net was somehow deleted from underneath
> inet_create. I've noticed that rds uses sock_create_kern which does
> not take net reference. What is that that must keep net alive then?

The rds_connection (which is where the net pointer is being obtained from)
should keep the connection alive. Did you have the rds[_tcp] modules
loaded at the time of failure? Were there kernel tcp sockets to/from
the 16385 port? any hints on what else the test was doing (was it
running a userspace RDS application that triggered the kernel TCP
connection attempt in the first place)?

--Sowmini



Re: [net-next][PATCH] RDS: keep data type consistent in the user visible header

2017-02-20 Thread Sowmini Varadhan
On (02/20/17 10:19), David Miller wrote:
> 
> The correct fix it to use "__u8", "__u64", etc.

So the rest of rds.h uses uint8_t, uint32_t etc 
Perhaps (I'm not sure of the origins) this was because
of the shared OpenIB.org BSD license etc using __u8 in
one place and uint8_t in another would seem inconsistent,
at the very least.

--Sowmini



Re: [net-next][PATCH] RDS: keep data type consistent in the user visible header

2017-02-20 Thread Sowmini Varadhan
On (02/20/17 10:19), David Miller wrote:
> 
> The correct fix it to use "__u8", "__u64", etc.

So the rest of rds.h uses uint8_t, uint32_t etc 
Perhaps (I'm not sure of the origins) this was because
of the shared OpenIB.org BSD license etc using __u8 in
one place and uint8_t in another would seem inconsistent,
at the very least.

--Sowmini



Re: net/packet: use-after-free in packet_rcv_fanout

2017-02-10 Thread Sowmini Varadhan
On (02/10/17 10:00), Cong Wang wrote:
> My understanding about the race here is packet_release() doesn't
> wait for flying packets correctly, which leads to a flying packet still
> refers to the struct sock which is being released.
> 
> This could happen because struct packet_fanout is refcn'ted, it is
   :
> At least I believe this explains the crash Dmitry reported.

hmm, the proof of the pudding is in the eating- would be good to 
be able to reliably reproduce this somewhere (thus proving that
root-cause analysis is rock-solid), maybe by introducing artificial
delays to slow down paths..

I'm travelling at the moment but may be able to give this (try
to reproduce it reliably) next week.

--Sowmini


Re: net/packet: use-after-free in packet_rcv_fanout

2017-02-10 Thread Sowmini Varadhan
On (02/10/17 10:00), Cong Wang wrote:
> My understanding about the race here is packet_release() doesn't
> wait for flying packets correctly, which leads to a flying packet still
> refers to the struct sock which is being released.
> 
> This could happen because struct packet_fanout is refcn'ted, it is
   :
> At least I believe this explains the crash Dmitry reported.

hmm, the proof of the pudding is in the eating- would be good to 
be able to reliably reproduce this somewhere (thus proving that
root-cause analysis is rock-solid), maybe by introducing artificial
delays to slow down paths..

I'm travelling at the moment but may be able to give this (try
to reproduce it reliably) next week.

--Sowmini


Re: net/packet: use-after-free in packet_rcv_fanout

2017-02-09 Thread Sowmini Varadhan
On (02/09/17 19:19), Eric Dumazet wrote:
> 
> More likely the bug is in fanout_add(), with a buggy sequence in error
> case, and not correct locking.
> 
> kfree(po->rollover);
> po->rollover = NULL;
> 
> Two cpus entering fanout_add() (using the same af_packet socket,
> syzkaller courtesy...) might both see po->fanout being NULL.
> 
> Then they grab the mutex.  Too late...

I'm not sure I follow- aiui the panic was in acceessing the
sk_receive_queue.lock in a socket that had been closed earlier. I think
the assumption is that rcu_read_lock_bh in __dev_queue_xmit (and
rcu_read_lock in dev_queue_xmit_nit?) should make sure that the nit
packet delivery can be done safely, and the synchronize_net in
packet_release() makes sure that the Tx paths are quiesced before freeing
the socket.  What is the race-hole here? Does it have to do with the
_bh and softirq context, somehow?

--Sowmini




Re: net/packet: use-after-free in packet_rcv_fanout

2017-02-09 Thread Sowmini Varadhan
On (02/09/17 19:19), Eric Dumazet wrote:
> 
> More likely the bug is in fanout_add(), with a buggy sequence in error
> case, and not correct locking.
> 
> kfree(po->rollover);
> po->rollover = NULL;
> 
> Two cpus entering fanout_add() (using the same af_packet socket,
> syzkaller courtesy...) might both see po->fanout being NULL.
> 
> Then they grab the mutex.  Too late...

I'm not sure I follow- aiui the panic was in acceessing the
sk_receive_queue.lock in a socket that had been closed earlier. I think
the assumption is that rcu_read_lock_bh in __dev_queue_xmit (and
rcu_read_lock in dev_queue_xmit_nit?) should make sure that the nit
packet delivery can be done safely, and the synchronize_net in
packet_release() makes sure that the Tx paths are quiesced before freeing
the socket.  What is the race-hole here? Does it have to do with the
_bh and softirq context, somehow?

--Sowmini




Re: net/packet: use-after-free in packet_rcv_fanout

2017-02-09 Thread Sowmini Varadhan
On (02/09/17 14:14), Dmitry Vyukov wrote:
> 
> Call Trace:
   :
>  packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
>  fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
>  packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
>  dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
:
>  tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
:

looks like a race between a NIT socket (tcpdump, maybe?) that is closing,
and a standard tcp socket.. packet_release() takes the po->bind_lock
to remove the socket from the ptype_all NIT queue. but how does
that sync with the Tx path for other af_inet/af_inet6 sockets? 

--Sowmini


Re: net/packet: use-after-free in packet_rcv_fanout

2017-02-09 Thread Sowmini Varadhan
On (02/09/17 14:14), Dmitry Vyukov wrote:
> 
> Call Trace:
   :
>  packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
>  fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
>  packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
>  dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
:
>  tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
:

looks like a race between a NIT socket (tcpdump, maybe?) that is closing,
and a standard tcp socket.. packet_release() takes the po->bind_lock
to remove the socket from the ptype_all NIT queue. but how does
that sync with the Tx path for other af_inet/af_inet6 sockets? 

--Sowmini


Re: [PATCH v2 net-next 6/9] sunvnet: straighten up message event handling logic

2017-02-08 Thread Sowmini Varadhan
On (02/08/17 08:28), Shannon Nelson wrote:
> The existing code does this as well - if it first finds a RESET, it handles
> it then hits the return 0.  Next if it finds the UP, it does the goto back
> to the ldc_ctrl: to process, and hits the same return 0.  Only if neither of
> these bits have been seen does the code move on to process the DATA_READY
> event.
> 
> If we're seeing cases of both UP and DATA_READY, then yes we'll wnat to look
> at changing this logic.  I think that should be a separate patch.
> 

Ok. probably we just make one redundant loop out and back into the function.

Other than that, the patchset looks good to me.

--Sowmini



Re: [PATCH v2 net-next 6/9] sunvnet: straighten up message event handling logic

2017-02-08 Thread Sowmini Varadhan
On (02/08/17 08:28), Shannon Nelson wrote:
> The existing code does this as well - if it first finds a RESET, it handles
> it then hits the return 0.  Next if it finds the UP, it does the goto back
> to the ldc_ctrl: to process, and hits the same return 0.  Only if neither of
> these bits have been seen does the code move on to process the DATA_READY
> event.
> 
> If we're seeing cases of both UP and DATA_READY, then yes we'll wnat to look
> at changing this logic.  I think that should be a separate patch.
> 

Ok. probably we just make one redundant loop out and back into the function.

Other than that, the patchset looks good to me.

--Sowmini



Re: [PATCH v2 net-next 6/9] sunvnet: straighten up message event handling logic

2017-02-08 Thread Sowmini Varadhan
On (02/07/17 14:12), Shannon Nelson wrote:
> +
> + /* we don't expect any other bits */
> + BUG_ON(port->rx_event & ~(LDC_EVENT_DATA_READY |
> +   LDC_EVENT_RESET |
> +   LDC_EVENT_UP));
> +
> + /* RESET takes precedent over any other event */
> + if (port->rx_event & LDC_EVENT_RESET) { 
  :
>   port->rx_event = 0;
>   return 0;
>   }
> + if (port->rx_event & LDC_EVENT_UP) {
> + vio_link_state_change(vio, LDC_EVENT_UP);
> + port->rx_event = 0;
> + return 0;
> + }
>  
>   err = 0;
>   tx_wakeup = 0;

IIRC there were timing-related situations where you can get woken up with
both UP and DATA_READY, and if my reading of your patch is 
correct, we would ignore the DATA_READY, and return, right?

--Sowmini




Re: [PATCH v2 net-next 6/9] sunvnet: straighten up message event handling logic

2017-02-08 Thread Sowmini Varadhan
On (02/07/17 14:12), Shannon Nelson wrote:
> +
> + /* we don't expect any other bits */
> + BUG_ON(port->rx_event & ~(LDC_EVENT_DATA_READY |
> +   LDC_EVENT_RESET |
> +   LDC_EVENT_UP));
> +
> + /* RESET takes precedent over any other event */
> + if (port->rx_event & LDC_EVENT_RESET) { 
  :
>   port->rx_event = 0;
>   return 0;
>   }
> + if (port->rx_event & LDC_EVENT_UP) {
> + vio_link_state_change(vio, LDC_EVENT_UP);
> + port->rx_event = 0;
> + return 0;
> + }
>  
>   err = 0;
>   tx_wakeup = 0;

IIRC there were timing-related situations where you can get woken up with
both UP and DATA_READY, and if my reading of your patch is 
correct, we would ignore the DATA_READY, and return, right?

--Sowmini




Re: [rds-devel] [PATCH 2/2] rds: Remove duplicate prefix from rds_conn_path_error use

2016-10-15 Thread Sowmini Varadhan
On (10/15/16 11:53), Joe Perches wrote:
> 
> rds_conn_path_error already prefixes "RDS:" to the output.
> 
> Signed-off-by: Joe Perches <j...@perches.com>
Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com>


Re: [rds-devel] [PATCH 1/2] rds: Remove unused rds_conn_error

2016-10-15 Thread Sowmini Varadhan
On (10/15/16 11:53), Joe Perches wrote:
> This macro's last use was removed in commit d769ef81d5b59
> ("RDS: Update rds_conn_shutdown to work with rds_conn_path")
> so make the macro and the __rds_conn_error function definition
> and declaration disappear.
> 
> Signed-off-by: Joe Perches <j...@perches.com>

Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com>


Re: [rds-devel] [PATCH 2/2] rds: Remove duplicate prefix from rds_conn_path_error use

2016-10-15 Thread Sowmini Varadhan
On (10/15/16 11:53), Joe Perches wrote:
> 
> rds_conn_path_error already prefixes "RDS:" to the output.
> 
> Signed-off-by: Joe Perches 
Acked-by: Sowmini Varadhan 


Re: [rds-devel] [PATCH 1/2] rds: Remove unused rds_conn_error

2016-10-15 Thread Sowmini Varadhan
On (10/15/16 11:53), Joe Perches wrote:
> This macro's last use was removed in commit d769ef81d5b59
> ("RDS: Update rds_conn_shutdown to work with rds_conn_path")
> so make the macro and the __rds_conn_error function definition
> and declaration disappear.
> 
> Signed-off-by: Joe Perches 

Acked-by: Sowmini Varadhan 


Re: [PATCH v05 08/72] rds.h: use __u8, __u16, __s16, __u32 and __s64 from linux/types.h

2016-08-22 Thread Sowmini Varadhan
On (08/22/16 20:32), Mikko Rapeli wrote:
> 
> Fixes userspace compilation errors like:
> 
> linux/rds.h:96:2: error: unknown type name ‘uint8_t’
> 
> Signed-off-by: Mikko Rapeli <mikko.rap...@iki.fi>

Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com>

I think we discussed this some time before, and I certainly
dont have any religious opinions about it, but I would like
to point out that this means that the exported rds.h will
expose __ to user space applications. 

--Sowmini


Re: [PATCH v05 08/72] rds.h: use __u8, __u16, __s16, __u32 and __s64 from linux/types.h

2016-08-22 Thread Sowmini Varadhan
On (08/22/16 20:32), Mikko Rapeli wrote:
> 
> Fixes userspace compilation errors like:
> 
> linux/rds.h:96:2: error: unknown type name ‘uint8_t’
> 
> Signed-off-by: Mikko Rapeli 

Acked-by: Sowmini Varadhan 

I think we discussed this some time before, and I certainly
dont have any religious opinions about it, but I would like
to point out that this means that the exported rds.h will
expose __ to user space applications. 

--Sowmini


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-05-04 Thread Sowmini Varadhan
On (05/04/16 10:32), Herbert Xu wrote:
> 
> Please base it on cryptodev.
> 

Looks like this got fixed in cryptodev by commit cece762f6f3c 
("lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access")

Thanks,
--Sowmini


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-05-04 Thread Sowmini Varadhan
On (05/04/16 10:32), Herbert Xu wrote:
> 
> Please base it on cryptodev.
> 

Looks like this got fixed in cryptodev by commit cece762f6f3c 
("lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access")

Thanks,
--Sowmini


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-05-03 Thread Sowmini Varadhan
On (05/03/16 16:12), Herbert Xu wrote:
> 
> Sorry, but your patch doesn't apply against the current tree at all.
> Please rebase it if it is still needed.

Hello,

I had based my patch off of net-next, which is where I do my work.

I'd be happy to rebase it on the "current tree", 
but given that mpicoder.c does not have an entry in MAINTAINERS, 
please clarify what you mean by "current tree" in this case.

do you mean

 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
or
 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git
or 
 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

(which are the three possible candidates I can see in MAINTAINERS).

It would be nice to get this bug fixed, since the fix is fairly
obvious, and the nuisance factor from the generated "unaligned
access" messages on the impacted non-intel platforms is quite high, 

thanks,
--Sowmini


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-05-03 Thread Sowmini Varadhan
On (05/03/16 16:12), Herbert Xu wrote:
> 
> Sorry, but your patch doesn't apply against the current tree at all.
> Please rebase it if it is still needed.

Hello,

I had based my patch off of net-next, which is where I do my work.

I'd be happy to rebase it on the "current tree", 
but given that mpicoder.c does not have an entry in MAINTAINERS, 
please clarify what you mean by "current tree" in this case.

do you mean

 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
or
 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git
or 
 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

(which are the three possible candidates I can see in MAINTAINERS).

It would be nice to get this bug fixed, since the fix is fairly
obvious, and the nuisance factor from the generated "unaligned
access" messages on the impacted non-intel platforms is quite high, 

thanks,
--Sowmini


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-04-27 Thread Sowmini Varadhan
On (04/28/16 09:01), Herbert Xu wrote:
> Subject: Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in
>  mpi_write_to_sgl
> 
> Please cc linux-crypto.

Just bounced the message to linux-crypto as well. 
I think get_maintainers.pl might also need to be updated to 
generate this automatically.

Thanks,
--Sowmini


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-04-27 Thread Sowmini Varadhan
On (04/28/16 09:01), Herbert Xu wrote:
> Subject: Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in
>  mpi_write_to_sgl
> 
> Please cc linux-crypto.

Just bounced the message to linux-crypto as well. 
I think get_maintainers.pl might also need to be updated to 
generate this automatically.

Thanks,
--Sowmini


[PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-04-27 Thread Sowmini Varadhan

Commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") added
mpi_write_to_sgl() which generates traps due to unaligned
access on some platforms like sparc. Fix this by using
the get_unaligned* and put_unaligned* functions.

Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
v2: tadeusz.struk comments: Predicate on BYTES_PER_MPI_LIMB.

 lib/mpi/mpicoder.c |   21 +
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eb15e7d..b61eb6b 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include "mpi-internal.h"
+#include 
 
 #define MAX_EXTERN_MPI_BITS 16384
 
@@ -405,10 +406,22 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, 
unsigned *nbytes,
p -= sizeof(alimb);
continue;
} else {
-   mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
-   mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
-   + lzeros;
-   *limb1 = *limb2;
+   mpi_limb_t tmp;
+#if BYTES_PER_MPI_LIMB == 4
+   tmp = get_unaligned_be32((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be32(tmp, (void *)p -
+  sizeof(alimb));
+#elif BYTES_PER_MPI_LIMB == 8
+   tmp = get_unaligned_be64((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be64(tmp, (void *)p -
+  sizeof(alimb));
+#else
+#error please implement for this limb size.
+#endif
p -= lzeros;
y = lzeros;
}
-- 
1.7.1



[PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-04-27 Thread Sowmini Varadhan

Commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") added
mpi_write_to_sgl() which generates traps due to unaligned
access on some platforms like sparc. Fix this by using
the get_unaligned* and put_unaligned* functions.

Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Sowmini Varadhan 
---
v2: tadeusz.struk comments: Predicate on BYTES_PER_MPI_LIMB.

 lib/mpi/mpicoder.c |   21 +
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eb15e7d..b61eb6b 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include "mpi-internal.h"
+#include 
 
 #define MAX_EXTERN_MPI_BITS 16384
 
@@ -405,10 +406,22 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, 
unsigned *nbytes,
p -= sizeof(alimb);
continue;
} else {
-   mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
-   mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
-   + lzeros;
-   *limb1 = *limb2;
+   mpi_limb_t tmp;
+#if BYTES_PER_MPI_LIMB == 4
+   tmp = get_unaligned_be32((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be32(tmp, (void *)p -
+  sizeof(alimb));
+#elif BYTES_PER_MPI_LIMB == 8
+   tmp = get_unaligned_be64((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be64(tmp, (void *)p -
+  sizeof(alimb));
+#else
+#error please implement for this limb size.
+#endif
p -= lzeros;
y = lzeros;
}
-- 
1.7.1



Re: [PATCH] lib/mpi: Fix kernel unaligned acces in mpi_write_to_sgl

2016-04-21 Thread Sowmini Varadhan
On (04/21/16 10:23), Tadeusz Struk wrote:
> 
> What if the mpi_limb_t will happen to be 64 bit? 
> Thanks,

When I checked this with cscope, I found

typedef unsigned long int mpi_limb_t;

thus I used the *32 functions. 

But you obviously know better, since you wrote this code (and bug).
If you anticipate that mpi_limb_t in some environment today,
I can check for sizeof(mpi_limb_t), and predicate it to
use the *32 or *64 functions based on the result. Do you think that
is necessary?

Regards,
--Sowmini



Re: [PATCH] lib/mpi: Fix kernel unaligned acces in mpi_write_to_sgl

2016-04-21 Thread Sowmini Varadhan
On (04/21/16 10:23), Tadeusz Struk wrote:
> 
> What if the mpi_limb_t will happen to be 64 bit? 
> Thanks,

When I checked this with cscope, I found

typedef unsigned long int mpi_limb_t;

thus I used the *32 functions. 

But you obviously know better, since you wrote this code (and bug).
If you anticipate that mpi_limb_t in some environment today,
I can check for sizeof(mpi_limb_t), and predicate it to
use the *32 or *64 functions based on the result. Do you think that
is necessary?

Regards,
--Sowmini



[PATCH] lib/mpi: Fix kernel unaligned acces in mpi_write_to_sgl

2016-04-21 Thread Sowmini Varadhan

Commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") added
mpi_write_to_sgl() which generates traps due to unaligned
access on some platforms like sparc. Fix this by using
the get_unaligned* and put_unaligned* functions.

Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
 lib/mpi/mpicoder.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eb15e7d..6771378 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include "mpi-internal.h"
+#include 
 
 #define MAX_EXTERN_MPI_BITS 16384
 
@@ -405,10 +406,13 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, 
unsigned *nbytes,
p -= sizeof(alimb);
continue;
} else {
-   mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
-   mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
-   + lzeros;
-   *limb1 = *limb2;
+   mpi_limb_t tmp;
+
+   tmp = get_unaligned_be32((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be32(tmp,
+  (void *)p - sizeof(alimb));
p -= lzeros;
y = lzeros;
}
-- 
1.7.1



[PATCH] lib/mpi: Fix kernel unaligned acces in mpi_write_to_sgl

2016-04-21 Thread Sowmini Varadhan

Commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") added
mpi_write_to_sgl() which generates traps due to unaligned
access on some platforms like sparc. Fix this by using
the get_unaligned* and put_unaligned* functions.

Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Sowmini Varadhan 
---
 lib/mpi/mpicoder.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eb15e7d..6771378 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include "mpi-internal.h"
+#include 
 
 #define MAX_EXTERN_MPI_BITS 16384
 
@@ -405,10 +406,13 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, 
unsigned *nbytes,
p -= sizeof(alimb);
continue;
} else {
-   mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
-   mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
-   + lzeros;
-   *limb1 = *limb2;
+   mpi_limb_t tmp;
+
+   tmp = get_unaligned_be32((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be32(tmp,
+  (void *)p - sizeof(alimb));
p -= lzeros;
y = lzeros;
}
-- 
1.7.1



[PATCH v9] i40e: Look up MAC address in Open Firmware or IDPROM

2015-12-07 Thread Sowmini Varadhan

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The I40E_FLAG_PF_MAC flag in the platform-private i40e_pf structure
tracks whether a platform-specific mac address was found, in which
case calls to i40e_aq_mac_address_write() and i40e_aq_add_macvlan()
will be triggered.

Reviewed-by: Martin K. Petersen 
Signed-off-by: Sowmini Varadhan 
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson
v8: use pf->flags instead of bit value to track if we found a platform-specific
macaddr. Other code-style comments from Shannon Nelson
v9: David Miller, Shannon Nelson comments for i40e_get_platform_mac_addr

 drivers/net/ethernet/intel/i40e/i40e.h  |1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   90 +++
 2 files changed, 91 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..b85eaee 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -337,6 +337,7 @@ struct i40e_pf {
 #define I40E_FLAG_LINK_POLLING_ENABLED BIT_ULL(39)
 #define I40E_FLAG_VEB_MODE_ENABLED BIT_ULL(40)
 #define I40E_FLAG_NO_PCI_LINK_CHECKBIT_ULL(42)
+#define I40E_FLAG_PF_MAC   BIT_ULL(50)
 
/* tracks features that get auto disabled by errors */
u64 auto_disable_flags;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..3e2d4e0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw,
+vsi->back->hw.aq.asq_last_status));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,17 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!!(pf->flags & I40E_FLAG_PF_MAC)) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret) {
+   dev_warn(>pdev->dev,

[PATCH v9] i40e: Look up MAC address in Open Firmware or IDPROM

2015-12-07 Thread Sowmini Varadhan

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The I40E_FLAG_PF_MAC flag in the platform-private i40e_pf structure
tracks whether a platform-specific mac address was found, in which
case calls to i40e_aq_mac_address_write() and i40e_aq_add_macvlan()
will be triggered.

Reviewed-by: Martin K. Petersen <martin.peter...@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson
v8: use pf->flags instead of bit value to track if we found a platform-specific
macaddr. Other code-style comments from Shannon Nelson
v9: David Miller, Shannon Nelson comments for i40e_get_platform_mac_addr

 drivers/net/ethernet/intel/i40e/i40e.h  |1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   90 +++
 2 files changed, 91 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..b85eaee 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -337,6 +337,7 @@ struct i40e_pf {
 #define I40E_FLAG_LINK_POLLING_ENABLED BIT_ULL(39)
 #define I40E_FLAG_VEB_MODE_ENABLED BIT_ULL(40)
 #define I40E_FLAG_NO_PCI_LINK_CHECKBIT_ULL(42)
+#define I40E_FLAG_PF_MAC   BIT_ULL(50)
 
/* tracks features that get auto disabled by errors */
u64 auto_disable_flags;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..3e2d4e0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw,
+vsi->back->hw.aq.asq_last_status));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,17 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!!(pf->flags & I40E_FLAG_PF_MAC)) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret) {

[PATCH v8] i40e: Look up MAC address in Open Firmware or IDPROM

2015-12-05 Thread Sowmini Varadhan

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The I40E_FLAG_PF_MAC flag in the platform-private i40e_pf structure
tracks whether a platform-specific mac address was found, in which
case calls to i40e_aq_mac_address_write() and i40e_aq_add_macvlan()
will be triggered.

Reviewed-by: Martin K. Petersen 
Signed-off-by: Sowmini Varadhan 
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson
v8: use pf->flags instead of bit value to track if we found a platform-specific
macaddr. Other code-style comments from Shannon Nelson

 drivers/net/ethernet/intel/i40e/i40e.h  |1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   89 +++
 2 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..b85eaee 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -337,6 +337,7 @@ struct i40e_pf {
 #define I40E_FLAG_LINK_POLLING_ENABLED BIT_ULL(39)
 #define I40E_FLAG_VEB_MODE_ENABLED BIT_ULL(40)
 #define I40E_FLAG_NO_PCI_LINK_CHECKBIT_ULL(42)
+#define I40E_FLAG_PF_MAC   BIT_ULL(50)
 
/* tracks features that get auto disabled by errors */
u64 auto_disable_flags;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..ad7c941 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw,
+vsi->back->hw.aq.asq_last_status));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,17 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!!(pf->flags & I40E_FLAG_PF_MAC)) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret) {
+   dev_warn(>pdev->dev,
+"could not set up macaddr; err %d\n",
+

[PATCH v8] i40e: Look up MAC address in Open Firmware or IDPROM

2015-12-05 Thread Sowmini Varadhan

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The I40E_FLAG_PF_MAC flag in the platform-private i40e_pf structure
tracks whether a platform-specific mac address was found, in which
case calls to i40e_aq_mac_address_write() and i40e_aq_add_macvlan()
will be triggered.

Reviewed-by: Martin K. Petersen <martin.peter...@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson
v8: use pf->flags instead of bit value to track if we found a platform-specific
macaddr. Other code-style comments from Shannon Nelson

 drivers/net/ethernet/intel/i40e/i40e.h  |1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   89 +++
 2 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..b85eaee 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -337,6 +337,7 @@ struct i40e_pf {
 #define I40E_FLAG_LINK_POLLING_ENABLED BIT_ULL(39)
 #define I40E_FLAG_VEB_MODE_ENABLED BIT_ULL(40)
 #define I40E_FLAG_NO_PCI_LINK_CHECKBIT_ULL(42)
+#define I40E_FLAG_PF_MAC   BIT_ULL(50)
 
/* tracks features that get auto disabled by errors */
u64 auto_disable_flags;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..ad7c941 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw,
+vsi->back->hw.aq.asq_last_status));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,17 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!!(pf->flags & I40E_FLAG_PF_MAC)) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret) {
+   dev_warn(>pd

[PATCH RESEND v7] i40e: Look up MAC address in Open Firmware or IDPROM

2015-12-04 Thread Sowmini Varadhan

[Apologies for fat-fingering subject in the other attempt]

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The is_default_mac field in the platform-private i40e_pf structure
tracks whether the mac address was default or not, and in the latter
case, will trigger the calls to i40e_aq_mac_address_write() and
i40e_aq_add_macvlan().

Reviewed-by: Martin K. Petersen 
Signed-off-by: Sowmini Varadhan 
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson

 drivers/net/ethernet/intel/i40e/i40e.h  |2 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   87 +++
 2 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..6d41757 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -420,6 +420,8 @@ struct i40e_pf {
 
u32 ioremap_len;
u32 fd_inv;
+
+   u32 is_default_mac:1;
 };
 
 struct i40e_mac_filter {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..40a5d53 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret, aq_err;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   aq_err = vsi->back->hw.aq.asq_last_status;
+   if (aq_err != I40E_AQ_RC_OK) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw, aq_err));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,14 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!pf->is_default_mac) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret)
+   goto err_netdev;
+   }
case I40E_VSI_VMDQ2:
case I40E_VSI_FCOE:
ret = i40e_config_netdev(vsi);
@@ -10340,6 +10395,36 @@ static void i40e_print_features(struct i40e_pf *pf)
 }
 
 /**
+ * i40e_get_platform_mac_addr - get platform-specific MAC address
+ *
+ * @pdev: PCI device information struct
+ * @pf: board private structure
+ *
+ * Look up the MAC address in Open Firmware  on systems that sup

bar

2015-12-04 Thread Sowmini Varadhan

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The is_default_mac field in the platform-private i40e_pf structure
tracks whether the mac address was default or not, and in the latter
case, will trigger the calls to i40e_aq_mac_address_write() and
i40e_aq_add_macvlan().

Reviewed-by: Martin K. Petersen 
Signed-off-by: Sowmini Varadhan 
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson

 drivers/net/ethernet/intel/i40e/i40e.h  |2 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   87 +++
 2 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..6d41757 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -420,6 +420,8 @@ struct i40e_pf {
 
u32 ioremap_len;
u32 fd_inv;
+
+   u32 is_default_mac:1;
 };
 
 struct i40e_mac_filter {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..40a5d53 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret, aq_err;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   aq_err = vsi->back->hw.aq.asq_last_status;
+   if (aq_err != I40E_AQ_RC_OK) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw, aq_err));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,14 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!pf->is_default_mac) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret)
+   goto err_netdev;
+   }
case I40E_VSI_VMDQ2:
case I40E_VSI_FCOE:
ret = i40e_config_netdev(vsi);
@@ -10340,6 +10395,36 @@ static void i40e_print_features(struct i40e_pf *pf)
 }
 
 /**
+ * i40e_get_platform_mac_addr - get platform-specific MAC address
+ *
+ * @pdev: PCI device information struct
+ * @pf: board private structure
+ *
+ * Look up the MAC address in Open Firmware  on systems that support it,
+ * and use IDPROM on SPARC if no OF address is found.

[PATCH RESEND v7] i40e: Look up MAC address in Open Firmware or IDPROM

2015-12-04 Thread Sowmini Varadhan

[Apologies for fat-fingering subject in the other attempt]

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The is_default_mac field in the platform-private i40e_pf structure
tracks whether the mac address was default or not, and in the latter
case, will trigger the calls to i40e_aq_mac_address_write() and
i40e_aq_add_macvlan().

Reviewed-by: Martin K. Petersen <martin.peter...@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson

 drivers/net/ethernet/intel/i40e/i40e.h  |2 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   87 +++
 2 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..6d41757 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -420,6 +420,8 @@ struct i40e_pf {
 
u32 ioremap_len;
u32 fd_inv;
+
+   u32 is_default_mac:1;
 };
 
 struct i40e_mac_filter {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..40a5d53 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret, aq_err;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   aq_err = vsi->back->hw.aq.asq_last_status;
+   if (aq_err != I40E_AQ_RC_OK) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw, aq_err));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,14 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!pf->is_default_mac) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret)
+   goto err_netdev;
+   }
case I40E_VSI_VMDQ2:
case I40E_VSI_FCOE:
ret = i40e_config_netdev(vsi);
@@ -10340,6 +10395,36 @@ static void i40e_print_features(struct i40e_pf *pf)
 }
 
 /**
+ * i40e_get_platform_mac_addr - get platform-specific MAC address
+ *
+ * @pdev: PCI device information struct
+ * @pf: board private structure

bar

2015-12-04 Thread Sowmini Varadhan

This is the i40e equivalent of commit c762dff24c06 ("ixgbe: Look up MAC
address in Open Firmware or IDPROM").

As with that fix, attempt to look up the MAC address in Open Firmware
on systems that support it, and use IDPROM on SPARC if no OF address
is found.

In the case of the i40e there is an assumption that the default mac
address has already been set up as the primary mac filter on probe,
so if this filter is obtained from the Open Firmware or IDPROM, an
explicit write is needed via i40e_aq_mac_address_write() and
i40e_aq_add_macvlan() invocation.

The is_default_mac field in the platform-private i40e_pf structure
tracks whether the mac address was default or not, and in the latter
case, will trigger the calls to i40e_aq_mac_address_write() and
i40e_aq_add_macvlan().

Reviewed-by: Martin K. Petersen <martin.peter...@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
v2, v3: Andy Shevchenko comments
v4: Shannon Nelson review: explicitly set up mac filters before register_netdev
v5: Shannon Nelson code style comments
v6: Shannon Nelson code style comments
v7: Ensure that i40e_macaddr_init() is called only for VSI_MAIN, and only
if the mac address is not the default. Some additional code-refactoring
based on comments from Shannon Nelson

 drivers/net/ethernet/intel/i40e/i40e.h  |2 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   87 +++
 2 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 0b9537b..6d41757 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -420,6 +420,8 @@ struct i40e_pf {
 
u32 ioremap_len;
u32 fd_inv;
+
+   u32 is_default_mac:1;
 };
 
 struct i40e_mac_filter {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9e6268b..40a5d53 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -24,6 +24,15 @@
  *
  
**/
 
+#include 
+#include 
+#include 
+
+#ifdef CONFIG_SPARC
+#include 
+#include 
+#endif
+
 /* Local includes */
 #include "i40e.h"
 #include "i40e_diag.h"
@@ -9387,6 +9396,44 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct 
i40e_vsi *vsi)
 }
 
 /**
+ * i40e_macaddr_init - explicitly write the mac address filters.
+ *
+ * @vsi: pointer to the vsi.
+ * @macaddr: the MAC address
+ *
+ * This is needed when the macaddr has been obtained by other
+ * means than the default, e.g., from Open Firmware or IDPROM.
+ * Returns 0 on success, negative on failure
+ **/
+static int i40e_macaddr_init(struct i40e_vsi *vsi, u8 *macaddr)
+{
+   int ret, aq_err;
+   struct i40e_aqc_add_macvlan_element_data element;
+
+   ret = i40e_aq_mac_address_write(>back->hw,
+   I40E_AQC_WRITE_TYPE_LAA_WOL,
+   macaddr, NULL);
+   if (ret) {
+   dev_info(>back->pdev->dev,
+"Addr change for VSI failed: %d\n", ret);
+   return -EADDRNOTAVAIL;
+   }
+
+   memset(, 0, sizeof(element));
+   ether_addr_copy(element.mac_addr, macaddr);
+   element.flags = cpu_to_le16(I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+   ret = i40e_aq_add_macvlan(>back->hw, vsi->seid, , 1, NULL);
+   aq_err = vsi->back->hw.aq.asq_last_status;
+   if (aq_err != I40E_AQ_RC_OK) {
+   dev_info(>back->pdev->dev,
+"add filter failed err %s aq_err %s\n",
+i40e_stat_str(>back->hw, ret),
+i40e_aq_str(>back->hw, aq_err));
+   }
+   return ret;
+}
+
+/**
  * i40e_vsi_setup - Set up a VSI by a given type
  * @pf: board private structure
  * @type: VSI type
@@ -9510,6 +9557,14 @@ struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 
type,
switch (vsi->type) {
/* setup the netdev if needed */
case I40E_VSI_MAIN:
+   /* Apply relevant filters if a platform-specific mac
+* address was selected.
+*/
+   if (!pf->is_default_mac) {
+   ret = i40e_macaddr_init(vsi, pf->hw.mac.addr);
+   if (ret)
+   goto err_netdev;
+   }
case I40E_VSI_VMDQ2:
case I40E_VSI_FCOE:
ret = i40e_config_netdev(vsi);
@@ -10340,6 +10395,36 @@ static void i40e_print_features(struct i40e_pf *pf)
 }
 
 /**
+ * i40e_get_platform_mac_addr - get platform-specific MAC address
+ *
+ * @pdev: PCI device information struct
+ * @pf: board private structure
+ *
+ * Look up the MAC address in Open Firmware  on syst

  1   2   3   4   >