Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule

2015-07-23 Thread subashab
 Actually we do increment refcnt, for every socket found in ehash.

 Carefully read __inet_lookup_established() again.

 This code is generic for ESTABLISHED and TIME-WAIT sockets.

 If you found code that performs the lookup without taking the refcnt,
 please point me at it; this would be a serious bug.

From my previous observations, it appears that either
1. this check is bypassed, or
2. the refcount is incremented here but is decremented before the packet
reaches the processing in tcp_timewait_state_process()

I will try to debug this and update.

 Is it some Android kernel?

 Android had private modules that needed an update in 3.18.

Yes, the kernel is based on Android 3.18.

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule

2015-07-21 Thread Eric Dumazet
On Mon, 2015-07-20 at 19:14 +0000, subas...@codeaurora.org wrote:
  //Initialize time wait socket and setup timer
  inet_twsk_alloc() tw_refcnt = 0
  __inet_twsk_hashdance() tw_refcnt = 3
  inet_twsk_schedule() tw_refcnt = 4
  inet_twsk_put() tw_refcnt = 3
 
  //Receive packet 1 in timewait state
  tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no
  change)
 
  This is obviously wrong.
 
  If a timewait socket is found, do we increment its refcnt before
  proceeding.
 We do not increment refcount currently when we find a timewait socket.

Actually we do increment refcnt, for every socket found in ehash.

Carefully read __inet_lookup_established() again.

This code is generic for ESTABLISHED and TIME-WAIT sockets.

If you found code that performs the lookup without taking the refcnt,
please point me at it; this would be a serious bug.
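
As an illustration of the point above, here is a minimal userspace sketch of the take-a-reference-on-lookup pattern. It is not the kernel code: `struct fake_sock`, `ref_inc_not_zero()`, `lookup()` and `sock_put_model()` are hypothetical names, and C11 atomics stand in for the kernel's atomic helpers. It only tries to show that a lookup hands the object back with its refcount already raised, so the caller's later put balances the lookup instead of consuming a reference owned by someone else.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket found in a hash table. */
struct fake_sock {
	atomic_int refcnt;
	int key;
};

/* Take a reference only if the object is still alive (refcnt != 0). */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Model of a lookup: a match is returned with one extra reference held. */
static struct fake_sock *lookup(struct fake_sock *table, size_t n, int key)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].key == key && ref_inc_not_zero(&table[i].refcnt))
			return &table[i];
	}
	return NULL;
}

/* Drop one reference; returns true when the last reference is gone. */
static bool sock_put_model(struct fake_sock *sk)
{
	int old = atomic_fetch_sub(&sk->refcnt, 1);

	assert(old > 0);
	return old == 1;
}
```

In this model, a socket whose steady-state count is 3 reads 4 while a packet is being processed and returns to 3 after the put, however many packets arrive.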

 
  I've received some private mails about tw issues, that turned out to be
  caused by buggy drivers or buggy arch-specific code.
 
  Are your crashes observed on x86?
 
 This is observed on ARM devices. In the current debugging, all time wait
 socket refcount changes happened in the TCP stack only, and no
 platform / driver code was involved.
 
 According to my understanding, we need to increment the time wait
 socket refcount before proceeding with any subsequent operations.
 However, I request your expert opinion on this.

Is it some Android kernel?

Android had private modules that needed an update in 3.18.





Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule

2015-07-20 Thread subashab
 //Initialize time wait socket and setup timer
 inet_twsk_alloc() tw_refcnt = 0
 __inet_twsk_hashdance() tw_refcnt = 3
 inet_twsk_schedule() tw_refcnt = 4
 inet_twsk_put() tw_refcnt = 3

 //Receive packet 1 in timewait state
 tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no
 change)

 This is obviously wrong.

 If a timewait socket is found, do we increment its refcnt before
 proceeding.
We do not increment refcount currently when we find a timewait socket.

 I've received some private mails about tw issues, that turned out to be
 caused by buggy drivers or buggy arch-specific code.

 Are your crashes observed on x86?

This is observed on ARM devices. In the current debugging, all time wait
socket refcount changes happened in the TCP stack only, and no
platform / driver code was involved.

According to my understanding, we need to increment the time wait
socket refcount before proceeding with any subsequent operations.
However, I request your expert opinion on this.




Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule

2015-07-19 Thread Eric Dumazet
On Sun, 2015-07-19 at 03:31 +0000, subas...@codeaurora.org wrote:
 I am seeing an issue with the reference count of time wait sockets which
 leads to an active timer object being freed. This occurs in some data stress
 test setups, so I am unable to determine the exact step at which it occurs.
 However, I logged the refcount and was able to find the code path
 which leads to this problem.
 
 //Initialize time wait socket and setup timer
 inet_twsk_alloc() tw_refcnt = 0
 __inet_twsk_hashdance() tw_refcnt = 3
 inet_twsk_schedule() tw_refcnt = 4
 inet_twsk_put() tw_refcnt = 3
 
 //Receive packet 1 in timewait state
 tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no change)

This is obviously wrong.

If a timewait socket is found, do we increment its refcnt before
proceeding.

 TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 2
 
 //Receive packet 2 in timewait state
 tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 2 (no change)
 TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 1
 
 //Receive packet 3 in timewait state
 tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 1 (no change)
 TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 0
 
 After this step, the time wait socket is destroyed along with the active
 timer object. This leads to a warning being printed which eventually leads
 to a crash.
 
 ODEBUG: free active (active state 0) object type: timer_list hint:
 tw_timer_handler+0x0/0x68
 
 It appears that inet_twsk_schedule needs to increment the reference count
 unconditionally, otherwise the socket will be destroyed since reference
 count will be decremented each time an ack is sent out as a response for
 an incoming packet.
 
 Signed-off-by: Subash Abhinov Kasiviswanathan <subas...@codeaurora.org>
 ---
  net/ipv4/inet_timewait_sock.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
 index cbeb022..99c349a 100644
 --- a/net/ipv4/inet_timewait_sock.c
 +++ b/net/ipv4/inet_timewait_sock.c
 @@ -246,9 +246,9 @@ void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo)
 
  	tw->tw_kill = timeo <= 4*HZ;
  	if (!mod_timer_pinned(&tw->tw_timer, jiffies + timeo)) {
 -		atomic_inc(&tw->tw_refcnt);
  		atomic_inc(&tw->tw_dr->tw_count);
  	}
 +	atomic_inc(&tw->tw_refcnt);
  }
  EXPORT_SYMBOL_GPL(inet_twsk_schedule);


This is wrong. You simply add a memory leak here. It might solve your
crash, but is not the proper way.
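
To make the leak concrete, here is a small userspace model, not kernel code. It assumes, as stated elsewhere in this thread, that the lookup takes a reference on the timewait socket and that the receive path drops it with one put. `struct tw_model`, `schedule_orig()`, `schedule_patched()`, `rx_packet()` and `teardown()` are hypothetical names; `timer_pending` stands in for what mod_timer_pinned() reports, and the steady-state count of 3 is taken from the trace in the report.

```c
#include <assert.h>
#include <stdbool.h>

struct tw_model {
	int refcnt;
	bool timer_pending;
};

/* Original logic: take a reference only when the timer was not pending,
 * so the timer owns exactly one reference however often it is re-armed. */
static void schedule_orig(struct tw_model *tw)
{
	if (!tw->timer_pending) {
		tw->timer_pending = true;
		tw->refcnt++;
	}
}

/* Proposed patch: take a reference on every call. */
static void schedule_patched(struct tw_model *tw)
{
	tw->timer_pending = true;
	tw->refcnt++;
}

/* One packet received in TIME-WAIT: lookup ref, re-arm timer, put. */
static void rx_packet(struct tw_model *tw, void (*schedule)(struct tw_model *))
{
	tw->refcnt++;		/* reference taken by the lookup (assumption) */
	schedule(tw);
	assert(tw->refcnt > 0);
	tw->refcnt--;		/* put after the ACK is sent */
}

/* Teardown drops the 3 steady-state references seen in the trace
 * (assumption) and returns how many references are left over. */
static int teardown(struct tw_model *tw)
{
	tw->timer_pending = false;
	tw->refcnt -= 3;
	return tw->refcnt;
}
```

With the original logic the count returns to 3 after every packet and teardown leaves 0. With the unconditional increment each re-arm adds one reference that nothing ever drops, so teardown leaves a positive count and the object is never freed.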

I've received some private mails about tw issues, that turned out to be
caused by buggy drivers or buggy arch-specific code.

Are your crashes observed on x86?




[PATCH net-next] inet: Always increment refcount in inet_twsk_schedule

2015-07-18 Thread subashab
I am seeing an issue with the reference count of time wait sockets which
leads to an active timer object being freed. This occurs in some data stress
test setups, so I am unable to determine the exact step at which it occurs.
However, I logged the refcount and was able to find the code path
which leads to this problem.

//Initialize time wait socket and setup timer
inet_twsk_alloc() tw_refcnt = 0
__inet_twsk_hashdance() tw_refcnt = 3
inet_twsk_schedule() tw_refcnt = 4
inet_twsk_put() tw_refcnt = 3

//Receive packet 1 in timewait state
tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no change)
TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 2

//Receive packet 2 in timewait state
tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 2 (no change)
TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 1

//Receive packet 3 in timewait state
tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 1 (no change)
TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 0

After this step, the time wait socket is destroyed along with the active
timer object. This leads to a warning being printed which eventually leads
to a crash.

ODEBUG: free active (active state 0) object type: timer_list hint:
tw_timer_handler+0x0/0x68

It appears that inet_twsk_schedule needs to increment the reference count
unconditionally, otherwise the socket will be destroyed since reference
count will be decremented each time an ack is sent out as a response for
an incoming packet.

Signed-off-by: Subash Abhinov Kasiviswanathan <subas...@codeaurora.org>
---
 net/ipv4/inet_timewait_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index cbeb022..99c349a 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -246,9 +246,9 @@ void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo)
 
 	tw->tw_kill = timeo <= 4*HZ;
 	if (!mod_timer_pinned(&tw->tw_timer, jiffies + timeo)) {
-		atomic_inc(&tw->tw_refcnt);
 		atomic_inc(&tw->tw_dr->tw_count);
 	}
+	atomic_inc(&tw->tw_refcnt);
 }
 EXPORT_SYMBOL_GPL(inet_twsk_schedule);
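
The logged sequence above can be replayed in a small userspace model (not kernel code) to see where the count reaches zero. It takes the trace at face value, including the premise that no reference is taken before tcp_timewait_state_process() runs; whether that premise holds on the affected kernel is not established here. `struct tw_trace` and the `model_*` helpers are hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

struct tw_trace {
	int refcnt;
	bool timer_pending;
};

/* Mirrors the unpatched inet_twsk_schedule() logic: the refcount is
 * raised only when the timer was not already pending. */
static void model_schedule(struct tw_trace *tw)
{
	if (!tw->timer_pending) {
		tw->timer_pending = true;
		tw->refcnt++;
	}
}

static void model_put(struct tw_trace *tw)
{
	assert(tw->refcnt > 0);
	tw->refcnt--;
}

/* Setup phase from the trace: 0 -> 3 -> 4 -> 3. */
static void model_setup(struct tw_trace *tw)
{
	tw->refcnt = 0;
	tw->timer_pending = false;
	tw->refcnt += 3;	/* __inet_twsk_hashdance() as logged */
	model_schedule(tw);	/* timer armed: 4 */
	model_put(tw);		/* 3 */
}

/* One packet as logged: re-arm (no change), then put after the ACK. */
static void model_rx_as_logged(struct tw_trace *tw)
{
	model_schedule(tw);
	model_put(tw);
}

/* True when the count hit zero while the timer is still armed,
 * which is the situation the ODEBUG warning describes. */
static bool freed_with_active_timer(const struct tw_trace *tw)
{
	return tw->refcnt == 0 && tw->timer_pending;
}
```

Replaying three packets through `model_rx_as_logged()` after `model_setup()` walks the count 3, 2, 1, 0 with the timer still pending, matching the log.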

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
