Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Stefano Garzarella
Hi,
I found other problems in the oce driver during some experiments with
netmap in emulation mode.

In detail:
- missing locking:
- some functions write to the wq struct (the tx queue descriptor)
without acquiring the lock on the queue, notably oce_wq_handler(), which is
invoked from the interrupt routine. This can lead to race conditions.

- tx cleanup:
- in oce_if_deactivate() the wq queues are drained, but mbufs that are
still pending are not freed.
For this reason, I added oce_tx_clean(), which releases any pending mbufs.
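
To make the locking issue concrete, here is a minimal user-space sketch, not driver code. It assumes a pthread mutex as a stand-in for the kernel lock, and every name in it (model_wq, model_tx, model_completion, model_run) is hypothetical. The transmit path and the completion path both update the queue indices, so both take the same lock; without it, the two read-modify-write sequences could interleave.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MODEL_PKTS 100000u

/* Hypothetical stand-in for the driver's TX queue state. */
struct model_wq {
	pthread_mutex_t tx_lock;	/* plays the role of wq->tx_lock */
	uint32_t pkt_desc_head;		/* advanced by the transmit path */
	uint32_t pkt_desc_tail;		/* advanced by the completion path */
};

/* Transmit path: publish one descriptor at a time under the lock. */
static void *
model_tx(void *arg)
{
	struct model_wq *wq = arg;

	for (uint32_t i = 0; i < MODEL_PKTS; i++) {
		pthread_mutex_lock(&wq->tx_lock);
		wq->pkt_desc_head++;
		pthread_mutex_unlock(&wq->tx_lock);
	}
	return NULL;
}

/*
 * Completion path (the interrupt handler in the real driver): retire
 * descriptors under the same lock until every packet has completed.
 */
static void *
model_completion(void *arg)
{
	struct model_wq *wq = arg;
	uint32_t done = 0;

	while (done < MODEL_PKTS) {
		pthread_mutex_lock(&wq->tx_lock);
		while (wq->pkt_desc_tail != wq->pkt_desc_head) {
			wq->pkt_desc_tail++;
			done++;
		}
		pthread_mutex_unlock(&wq->tx_lock);
	}
	return NULL;
}

/* Run both paths concurrently; returns the descriptors left pending. */
uint32_t
model_run(void)
{
	struct model_wq wq = { .pkt_desc_head = 0, .pkt_desc_tail = 0 };
	pthread_t tx, cq;

	pthread_mutex_init(&wq.tx_lock, NULL);
	pthread_create(&tx, NULL, model_tx, &wq);
	pthread_create(&cq, NULL, model_completion, &wq);
	pthread_join(tx, NULL);
	pthread_join(cq, NULL);
	pthread_mutex_destroy(&wq.tx_lock);
	return wq.pkt_desc_head - wq.pkt_desc_tail;
}
```

Calling model_run() should return 0, meaning the two paths agree on the queue state at the end. On older toolchains this may need to be built with -pthread.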

I also tried experimenting with iperf3 using the same environment as Borja
and I did not get a panic.
Can you try this patch? Do you still get the panic?

Cheers,
Stefano Garzarella


diff --git a/sys/dev/oce/oce_if.c b/sys/dev/oce/oce_if.c
index af57491..33b35b4 100644
--- a/sys/dev/oce/oce_if.c
+++ b/sys/dev/oce/oce_if.c
@@ -142,6 +142,7 @@ static int  oce_tx(POCE_SOFTC sc, struct mbuf **mpp,
int wq_index);
 static void oce_tx_restart(POCE_SOFTC sc, struct oce_wq *wq);
 static void oce_tx_complete(struct oce_wq *wq, uint32_t wqe_idx,
  uint32_t status);
+static void oce_tx_clean(POCE_SOFTC sc);
 static int  oce_multiq_transmit(struct ifnet *ifp, struct mbuf *m,
   struct oce_wq *wq);

@@ -585,8 +586,10 @@ oce_multiq_flush(struct ifnet *ifp)
  int i = 0;

  for (i = 0; i < sc->nwqs; i++) {
+ LOCK(&sc->wq[i]->tx_lock);
  while ((m = buf_ring_dequeue_sc(sc->wq[i]->br)) != NULL)
  m_freem(m);
+ UNLOCK(&sc->wq[i]->tx_lock);
  }
  if_qflush(ifp);
 }
@@ -1052,6 +1055,19 @@ oce_tx_complete(struct oce_wq *wq, uint32_t wqe_idx,
uint32_t status)
  }
 }

+static void
+oce_tx_clean(POCE_SOFTC sc) {
+ int i = 0;
+ struct oce_wq *wq;
+
+ for_all_wq_queues(sc, wq, i) {
+ LOCK(&wq->tx_lock);
+ while (wq->pkt_desc_tail != wq->pkt_desc_head) {
+ oce_tx_complete(wq, 0, 0);
+ }
+ UNLOCK(&wq->tx_lock);
+ }
+}

 static void
 oce_tx_restart(POCE_SOFTC sc, struct oce_wq *wq)
@@ -1213,6 +1229,8 @@ oce_wq_handler(void *arg)
  struct oce_nic_tx_cqe *cqe;
  int num_cqes = 0;

+ LOCK(&wq->tx_lock);
+
  bus_dmamap_sync(cq->ring->dma.tag,
  cq->ring->dma.map, BUS_DMASYNC_POSTWRITE);
  cqe = RING_GET_CONSUMER_ITEM_VA(cq->ring, struct oce_nic_tx_cqe);
@@ -1237,6 +1255,8 @@ oce_wq_handler(void *arg)
  if (num_cqes)
  oce_arm_cq(sc, cq->cq_id, num_cqes, FALSE);

+ UNLOCK(&wq->tx_lock);
+
  return 0;
 }

@@ -2087,6 +2107,9 @@ oce_if_deactivate(POCE_SOFTC sc)
  /* Delete RX queue in card with flush param */
  oce_stop_rx(sc);

+ /* Flush the mbufs that are still in TX queues */
+ oce_tx_clean(sc);
+
  /* Invalidate any pending cq and eq entries*/
  for_all_evnt_queues(sc, eq, i)
  oce_drain_eq(eq);
diff --git a/sys/dev/oce/oce_queue.c b/sys/dev/oce/oce_queue.c
index 308c16d..161011b 100644
--- a/sys/dev/oce/oce_queue.c
+++ b/sys/dev/oce/oce_queue.c
@@ -969,7 +969,9 @@ oce_start_rq(struct oce_rq *rq)
 int
 oce_start_wq(struct oce_wq *wq)
 {
+ LOCK(&wq->tx_lock); /* XXX: maybe not necessary */
  oce_arm_cq(wq->parent, wq->cq->cq_id, 0, TRUE);
+ UNLOCK(&wq->tx_lock);
  return 0;
 }

@@ -1076,6 +1078,8 @@ oce_drain_wq_cq(struct oce_wq *wq)
 struct oce_nic_tx_cqe *cqe;
 int num_cqes = 0;

+ LOCK(&wq->tx_lock); /* XXX: maybe not necessary */
+
  bus_dmamap_sync(cq->ring->dma.tag, cq->ring->dma.map,
   BUS_DMASYNC_POSTWRITE);

@@ -1093,6 +1097,7 @@ oce_drain_wq_cq(struct oce_wq *wq)

  oce_arm_cq(sc, cq->cq_id, num_cqes, FALSE);

+ UNLOCK(&wq->tx_lock);
 }



2014-07-07 13:57 GMT+02:00 Borja Marcos bor...@sarenet.es:


 On Jul 7, 2014, at 1:23 PM, Luigi Rizzo wrote:

  On Mon, Jul 7, 2014 at 1:03 PM, Borja Marcos bor...@sarenet.es wrote:
  we'll try to investigate, can you tell us more about the environment you
 use ?
  (FreeBSD version, card model (PCI id perhaps), iperf3 invocation line,
  interface configuration etc.)
 
  The main differences between 10.0.747.0 and the code in head (after
  our fix) is the use
  of drbr_enqueue/dequeue versus the peek/putback in the transmit routine.
 
 
  Both drivers still have issues when the link flaps because the
  transmit queue is not cleaned
  up properly (unlike what happens in the linux driver and all FreeBSD
  drivers for different
  hardware), so it might well be that you are seeing some side effect of
  that or other
  problem which manifests itself differently depending on the environment.
 
  'instant panic' by itself does not tell us anything about what could
  be the problem you experience (and we do not see it with either driver).

 The environment details are here:

 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391

 The way I produce an instant panic is:

 1) Connect to another machine (cross connect cable)

 2) iperf3 -s on the other machine
 (The other machine is different, it has an  ix card)

 3) iperf3 -t 30 -P 4 -c 10.0.0.1 -N

 In less than 30 seconds, panic.



 mierda dumped core - see /var/crash/vmcore.0

 Mon Jul  7 13:06:44 CEST 2014

 FreeBSD mierda 10.0-STABLE FreeBSD 10.0-STABLE #2: Mon Jul  7 11:41:45
 CEST 2014 

Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Borja Marcos

On Jul 15, 2014, at 10:22 AM, Stefano Garzarella wrote:

 Hi,
 I found other problems in the oce driver during some experiments with
 netmap in emulation mode.

What about driver  version 10.0.747.0? At least in my configuration it works 
perfectly, no crashes despite keeping it running for several days at full 
bandwidth.

I have a server about to go into production. Should this patch work on 
10-STABLE?






Borja.


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Stefano Garzarella
I used the oce driver in CURRENT.
I think that this patch in combination with the previous one should work in
10-STABLE.

I have only tested it with CURRENT so far; I will now try it with
10-STABLE and send you some feedback.

Cheers,
Stefano


2014-07-15 10:28 GMT+02:00 Borja Marcos bor...@sarenet.es:


 On Jul 15, 2014, at 10:22 AM, Stefano Garzarella wrote:

  Hi,
  I found other problems in the oce driver during some experiments with
  netmap in emulation mode.

 What about driver  version 10.0.747.0? At least in my configuration it
 works perfectly, no crashes despite keeping it running for several days at
 full bandwidth.

 I have a server about to go into production. Should this patch work on
 10-STABLE?






 Borja.





-- 
Stefano Garzarella


Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Borja Marcos

On Jul 15, 2014, at 10:43 AM, Stefano Garzarella wrote:

 I used the oce driver in CURRENT. 
 I think that this patch in combination with the previous one should work in 
 10-STABLE.
 
 I have only tested if it works with CURRENT, but now I try if it works with 
 10-STABLE and I'll send you some feedback.

I can still try. Will get back to you soon.


Cheers,




Borja.



Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Borja Marcos

On Jul 15, 2014, at 10:43 AM, Stefano Garzarella wrote:

 I used the oce driver in CURRENT.
 I think that this patch in combination with the previous one should work in
 10-STABLE.
 
 I have only tested if it works with CURRENT, but now I try if it works with
 10-STABLE and I'll send you some feedback.

Hmmm. The patch seems to be broken. I have tried to apply it renaming the 
a/usr/src... to oce_if.c.old and oce_if.c, etc, and patch complains:

Patching file oce_if.c using Plan A...
patch:  malformed patch at line 6: int wq_index);


Was it broken by the email client formatting? Or am I being especially clumsy 
today? ;)




Borja.



Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Stefano Garzarella
I think the email formatting broke the patch.
I am sending you a file with both patches.

Cheers,
Stefano


2014-07-15 11:12 GMT+02:00 Borja Marcos bor...@sarenet.es:


 On Jul 15, 2014, at 10:43 AM, Stefano Garzarella wrote:

  I used the oce driver in CURRENT.
  I think that this patch in combination with the previous one should work
 in
  10-STABLE.
 
  I have only tested if it works with CURRENT, but now I try if it works
 with
  10-STABLE and I'll send you some feedback.

 Hmmm. The patch seems to be broken. I have tried to apply it renaming the
 a/usr/src... to oce_if.c.old and oce_if.c, etc, and patch complains:

 Patching file oce_if.c using Plan A...
 patch:  malformed patch at line 6: int wq_index);


 Was it broken by the email client formatting? Or am I being especially
 clumsy today? ;)




 Borja.




-- 
Stefano Garzarella


oce_fix_STABLE10.patch
Description: Binary data

Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Stefano Garzarella
I just ran iperf3 with this patch on STABLE-10 and it seems to
work.
Do you still get a panic?

Cheers,
Stefano


2014-07-15 11:19 GMT+02:00 Stefano Garzarella stefanogarzare...@gmail.com:

 I think there is some problem with the email formatting.
 I send you a file with both patches.

 Cheers,
 Stefano


 2014-07-15 11:12 GMT+02:00 Borja Marcos bor...@sarenet.es:


 On Jul 15, 2014, at 10:43 AM, Stefano Garzarella wrote:

  I used the oce driver in CURRENT.
  I think that this patch in combination with the previous one should
 work in
  10-STABLE.
 
  I have only tested if it works with CURRENT, but now I try if it works
 with
  10-STABLE and I'll send you some feedback.

 Hmmm. The patch seems to be broken. I have tried to apply it renaming the
 a/usr/src... to oce_if.c.old and oce_if.c, etc, and patch complains:

 Patching file oce_if.c using Plan A...
 patch:  malformed patch at line 6: int wq_index);


 Was it broken by the email client formatting? Or am I being especially
 clumsy today? ;)




 Borja.




 --
 Stefano Garzarella




-- 
Stefano Garzarella


Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Borja Marcos

On Jul 15, 2014, at 11:45 AM, Stefano Garzarella wrote:

 I just tried to run iperf3 with this patch and STABLE-10 and it seems to work.
 Do you have a panic?

Still compiling :) Anyway, you didn't suffer panics before, right?




Borja.



Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Stefano Garzarella
2014-07-15 11:46 GMT+02:00 Borja Marcos bor...@sarenet.es:


 On Jul 15, 2014, at 11:45 AM, Stefano Garzarella wrote:

  I just tried to run iperf3 with this patch and STABLE-10 and it seems to
 work.
  Do you have a panic?

 Still compiling :) Anyway, you didn't suffer panics before, right?


Right, I didn't suffer panics with iperf3, but with netmap in emulation
mode I had a lot of panics before this patch.

Stefano





 Borja.




-- 
Stefano Garzarella


Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Borja Marcos

On Jul 15, 2014, at 11:45 AM, Stefano Garzarella wrote:

 I just tried to run iperf3 with this patch and STABLE-10 and it seems to
 work.
 Do you have a panic?

So far, so good. I've run a couple of iperf3 tests (60 seconds, trying both 
directions) and it doesn't crash.

Without the fixes I obtained a panic quite reliably, in less than 30 seconds.

Still trying. But the bugs you mentioned (lack of locking and deallocating, 
etc) seem to be consistent with the kind of failures I saw and their apparent 
randomness.

So, asking for spiritual counsel now. Would you use this driver  in a 
production environment instead of the 747 version downloaded from Emulex? I 
think the latter is giving slightly better performance but, anyway, I disable 
LRO and TSO because I see a horrible impact on NFS performance.

Cheers,





Borja.



Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Stefano Garzarella
2014-07-15 12:00 GMT+02:00 Borja Marcos bor...@sarenet.es:


 On Jul 15, 2014, at 11:45 AM, Stefano Garzarella wrote:

  I just tried to run iperf3 with this patch and STABLE-10 and it seems to
  work.
  Do you have a panic?

 So far, so good. I've ran a couple of iperf3 tests (60 seconds, trying
 both directions) and it doesn't crash.

 Without the fixes I obtained a panic quite reliably, in less than 30
 seconds.


 Still trying. But the bugs you mentioned (lack of locking and
 deallocating, etc) seem to be consistent with the kind of failures I saw
 and their apparent randomness.


Well.



 So, asking for spiritual counsel now. Would you use this driver  in a
 production environment instead of the 747 version downloaded from Emulex? I
 think the latter is giving slightly better performance but, anyway, I
 disable LRO and TSO because I see a horrible impact on NFS performance.


I made a diff between the two versions (CURRENT and 747) and saw that the
main difference is in how buf_ring is managed through the drbr API.
The CURRENT driver uses the newer drbr_peek() instead of
drbr_dequeue(), and I think this is better.
However, the 747 version also seems to have the problem of missing
locking.
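
As a rough illustration of why peeking can be preferable, here is a small user-space model. It is not the drbr API: model_ring, ring_fill, tx_dequeue_style, tx_peek_style and hw_send are all hypothetical names, and the hardware is reduced to a counter of free slots. With the dequeue style, a packet that the hardware rejects has already left the ring and, in this model, is put back at the tail, so it ends up behind newer packets. With the peek style, the packet only leaves the ring once it has been accepted.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAP 8

/* Hypothetical FIFO standing in for the buf_ring behind the drbr API. */
struct model_ring {
	int pkt[RING_CAP];
	size_t head;	/* index of the oldest packet */
	size_t count;	/* number of queued packets */
};

static int
ring_enqueue(struct model_ring *r, int id)
{
	if (r->count == RING_CAP)
		return -1;
	r->pkt[(r->head + r->count) % RING_CAP] = id;
	r->count++;
	return 0;
}

/* Remove and return the oldest packet, or -1 if the ring is empty. */
static int
ring_dequeue(struct model_ring *r)
{
	int id;

	if (r->count == 0)
		return -1;
	id = r->pkt[r->head];
	r->head = (r->head + 1) % RING_CAP;
	r->count--;
	return id;
}

/* Return the oldest packet without removing it, or -1 if empty. */
static int
ring_peek(const struct model_ring *r)
{
	return r->count == 0 ? -1 : r->pkt[r->head];
}

/* The model hardware accepts a packet only while it has free slots. */
static int
hw_send(int *free_slots)
{
	if (*free_slots == 0)
		return -1;
	(*free_slots)--;
	return 0;
}

/* Reset the ring and queue packets 1..n (n must not exceed RING_CAP). */
void
ring_fill(struct model_ring *r, int n)
{
	r->head = 0;
	r->count = 0;
	for (int i = 1; i <= n; i++)
		(void)ring_enqueue(r, i);
}

/* Dequeue style: a rejected packet is re-enqueued at the tail. */
int
tx_dequeue_style(struct model_ring *r, int free_slots)
{
	int id;

	while ((id = ring_dequeue(r)) != -1) {
		if (hw_send(&free_slots) != 0) {
			(void)ring_enqueue(r, id);
			break;
		}
	}
	return ring_peek(r);	/* packet that will be sent next */
}

/* Peek style: a rejected packet simply stays at the head. */
int
tx_peek_style(struct model_ring *r, int free_slots)
{
	while (ring_peek(r) != -1) {
		if (hw_send(&free_slots) != 0)
			break;			/* never removed */
		(void)ring_dequeue(r);		/* advance past it */
	}
	return ring_peek(r);	/* packet that will be sent next */
}
```

With packets 1..4 queued and a single free hardware slot, tx_dequeue_style() leaves packet 3 at the head (packet 2 has moved to the tail), while tx_peek_style() leaves packet 2 at the head, preserving the order.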

Cheers,
Stefano

Cheers,





 Borja.




-- 
Stefano Garzarella


Re: Fix Emulex oce driver in CURRENT

2014-07-15 Thread Borja Marcos

On Jul 15, 2014, at 1:36 PM, Stefano Garzarella wrote:

 So, asking for spiritual counsel now. Would you use this driver  in a 
 production environment instead of the 747 version downloaded from Emulex? I 
 think the latter is giving slightly better performance but, anyway, I disable 
 LRO and TSO because I see a horrible impact on NFS performance.
 
 
 I made a diff between the two versions (CURRENT and 747) and I saw that the 
 main difference is in the management of buf_ring through drbr API.
 In the CURRENT driver they use a new function drbr_peek() instead of 
 drbr_dequeue() and I think this is better.
 However, even in the 747 version seems to have the problem of the lack of 
 locking.

Well, you definitely saved my cake! So it was still a ticking time bomb.

Thank you very much!




Borja.



Re: Fix Emulex oce driver in CURRENT

2014-07-07 Thread Borja Marcos

On Jul 1, 2014, at 10:24 PM, Luigi Rizzo wrote:

 
 
 
 On Tue, Jul 1, 2014 at 8:58 PM, bor...@sarenet.es wrote:
 On 30.06.2014 18:36, Stefano Garzarella wrote:
 
 Hello,
 I had problems during some experiments with Emulex and oce driver in
 CURRENT.
 I found several bugs in the oce driver and this patch fixes them.
 
 At least with some cards, the driver simply does not work. It causes a panic 
 when there is some traffic.
 
 The relevant bug report is here.
 
 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391
 
 The latest version available from the Emulex website works. But the version 
 bundled with 9.3 and at least -STABLE (which is the same version bundled with 
 -CURRENT) does cause panics on 10- and 9-
 
 i compared the code on the emulex website (10.0.747.0 ?) with the
 one in HEAD and it does not seem much different, but perhaps
 you have some other version in mind ?
 
 The bugs found by stefano exist also in the emulex version above.

Anyway

The fixed version causes an instant panic when generating traffic (just use 
iperf3). Version 10.0.747.0  does _not_ panic.





Borja.


Re: Fix Emulex oce driver in CURRENT

2014-07-07 Thread Luigi Rizzo
On Mon, Jul 7, 2014 at 1:03 PM, Borja Marcos bor...@sarenet.es wrote:

 On Jul 1, 2014, at 10:24 PM, Luigi Rizzo wrote:




 On Tue, Jul 1, 2014 at 8:58 PM, bor...@sarenet.es wrote:
 On 30.06.2014 18:36, Stefano Garzarella wrote:

 Hello,
 I had problems during some experiments with Emulex and oce driver in
 CURRENT.
 I found several bugs in the oce driver and this patch fixes them.

 At least with some cards, the driver simply does not work. It causes a panic 
 when there is some traffic.

 The relevant bug report is here.

 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391

 The latest version available from the Emulex website works. But the version 
 bundled with 9.3 and at least -STABLE (which is the same version bundled 
 with -CURRENT) does cause panics on 10- and 9-

 i compared the code on the emulex website (10.0.747.0 ?) with the
 one in HEAD and it does not seem much different, but perhaps
 you have some other version in mind ?

 The bugs found by stefano exist also in the emulex version above.

 Anyway

 The fixed version is an instant panic when generating traffic (just use 
 iperf3). Version 10.0.747.0  does _not_ panic.

we'll try to investigate, can you tell us more about the environment you use ?
(FreeBSD version, card model (PCI id perhaps), iperf3 invocation line,
interface configuration etc.)

The main difference between 10.0.747.0 and the code in head (after
our fix) is the use
of drbr_enqueue/dequeue versus the peek/putback in the transmit routine.


Both drivers still have issues when the link flaps because the
transmit queue is not cleaned
up properly (unlike what happens in the linux driver and all FreeBSD
drivers for different
hardware), so it might well be that you are seeing some side effect of
that, or of some other
problem which manifests itself differently depending on the environment.
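
To illustrate what cleaning up the transmit queue involves, here is a minimal user-space sketch. It is a model under stated assumptions rather than the driver's code: model_txq, model_tx_post, model_tx_complete and model_tx_clean are hypothetical names, and malloc()/free() stand in for mbuf allocation. The idea is that every slot between tail and head still owns a buffer, so teardown has to walk that range and release each one.

```c
#include <assert.h>
#include <stdlib.h>

#define DESC_COUNT 16

/* Hypothetical TX queue: slots from tail up to head are in flight. */
struct model_txq {
	void *pkt[DESC_COUNT];	/* buffer attached to each in-flight slot */
	unsigned head;		/* next slot the transmit path fills */
	unsigned tail;		/* oldest slot not yet completed */
};

/* Queue one buffer for transmission; returns 0 on success, -1 on failure. */
int
model_tx_post(struct model_txq *q, size_t len)
{
	unsigned next = (q->head + 1) % DESC_COUNT;
	void *buf;

	if (next == q->tail)
		return -1;		/* queue is full */
	buf = malloc(len);
	if (buf == NULL)
		return -1;
	q->pkt[q->head] = buf;
	q->head = next;
	return 0;
}

/* Retire the oldest in-flight slot and free its buffer. */
static void
model_tx_complete(struct model_txq *q)
{
	free(q->pkt[q->tail]);
	q->pkt[q->tail] = NULL;
	q->tail = (q->tail + 1) % DESC_COUNT;
}

/* Teardown: complete everything still pending; returns buffers freed. */
unsigned
model_tx_clean(struct model_txq *q)
{
	unsigned freed = 0;

	while (q->tail != q->head) {
		model_tx_complete(q);
		freed++;
	}
	return freed;
}
```

If five buffers are posted and the queue is then torn down, model_tx_clean() returns 5 and leaves head equal to tail; skipping that step would leak the five buffers.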

'instant panic' by itself does not tell us anything about what could
be the problem you experience (and we do not see it with either driver).

cheers
luigi

Re: Fix Emulex oce driver in CURRENT

2014-07-07 Thread Borja Marcos

On Jul 7, 2014, at 1:23 PM, Luigi Rizzo wrote:

 On Mon, Jul 7, 2014 at 1:03 PM, Borja Marcos bor...@sarenet.es wrote:
 we'll try to investigate, can you tell us more about the environment you use ?
 (FreeBSD version, card model (PCI id perhaps), iperf3 invocation line,
 interface configuration etc.)
 
 The main differences between 10.0.747.0 and the code in head (after
 our fix) is the use
 of drbr_enqueue/dequeue versus the peek/putback in the transmit routine.
 
 
 Both drivers still have issues when the link flaps because the
 transmit queue is not cleaned
 up properly (unlike what happens in the linux driver and all FreeBSD
 drivers for different
 hardware), so it might well be that you are seeing some side effect of
 that or other
 problem which manifests itself differently depending on the environment.
 
 'instant panic' by itself does not tell us anything about what could
 be the problem you experience (and we do not see it with either driver).

The environment details are here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391

The way I produce an instant panic is:

1) Connect to another machine (cross connect cable)

2) iperf3 -s on the other machine 
(The other machine is different, it has an  ix card)

3) iperf3 -t 30 -P 4 -c 10.0.0.1 -N

In less than 30 seconds, panic.



mierda dumped core - see /var/crash/vmcore.0

Mon Jul  7 13:06:44 CEST 2014

FreeBSD mierda 10.0-STABLE FreeBSD 10.0-STABLE #2: Mon Jul  7 11:41:45 CEST 
2014 root@mierda:/usr/obj/usr/src/sys/GENERIC  amd64

panic: sbsndptr: sockbuf 0xf800a70489b0 and mbuf 0xf801a3326e00 clashing

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
panic: sbsndptr: sockbuf 0xf800a70489b0 and mbuf 0xf801a3326e00 clashing
cpuid = 12
KDB: stack backtrace:
#0 0x8092a470 at kdb_backtrace+0x60
#1 0x808ef9c5 at panic+0x155
#2 0x80962710 at sbdroprecord_locked+0
#3 0x80a8ba8c at tcp_output+0xdbc
#4 0x80a8987f at tcp_do_segment+0x30ff
#5 0x80a85b34 at tcp_input+0xd04
#6 0x80a1af57 at ip_input+0x97
#7 0x809ba512 at netisr_dispatch_src+0x62
#8 0x809b1ae6 at ether_demux+0x126
#9 0x809b278e at ether_nh_input+0x35e
#10 0x809ba512 at netisr_dispatch_src+0x62
#11 0x81c19ab9 at oce_rx+0x3c9
#12 0x81c19536 at oce_rq_handler+0xb6
#13 0x81c1bb1c at oce_intr+0xdc
#14 0x80938b35 at taskqueue_run_locked+0xe5
#15 0x809395c8 at taskqueue_thread_loop+0xa8
#16 0x808c057a at fork_exit+0x9a
#17 0x80ccb51e at fork_trampoline+0xe
Uptime: 51m20s













Borja.



Re: Fix Emulex oce driver in CURRENT

2014-07-07 Thread Luigi Rizzo
On Mon, Jul 7, 2014 at 1:57 PM, Borja Marcos bor...@sarenet.es wrote:
...

 The environment details are here:

 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183391

 The way I produce an instant panic is:

 1) Connect to another machine (cross connect cable)

 2) iperf3 -s on the other machine
 (The other machine is different, it has an  ix card)

 3) iperf3 -t 30 -P 4 -c 10.0.0.1 -N

 In less than 30 seconds, panic.



 mierda dumped core - see /var/crash/vmcore.0

 Mon Jul  7 13:06:44 CEST 2014

 FreeBSD mierda 10.0-STABLE FreeBSD 10.0-STABLE #2: Mon Jul  7 11:41:45 CEST 
 2014 root@mierda:/usr/obj/usr/src/sys/GENERIC  amd64

 panic: sbsndptr: sockbuf 0xf800a70489b0 and mbuf 0xf801a3326e00 
 clashing

 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for details.
 This GDB was configured as amd64-marcel-freebsd...

 Unread portion of the kernel message buffer:
 panic: sbsndptr: sockbuf 0xf800a70489b0 and mbuf 0xf801a3326e00 
 clashing
 cpuid = 12
 KDB: stack backtrace:
 #0 0x8092a470 at kdb_backtrace+0x60
 #1 0x808ef9c5 at panic+0x155
 #2 0x80962710 at sbdroprecord_locked+0
 #3 0x80a8ba8c at tcp_output+0xdbc
 #4 0x80a8987f at tcp_do_segment+0x30ff
 #5 0x80a85b34 at tcp_input+0xd04
 #6 0x80a1af57 at ip_input+0x97
 #7 0x809ba512 at netisr_dispatch_src+0x62
 #8 0x809b1ae6 at ether_demux+0x126
 #9 0x809b278e at ether_nh_input+0x35e
 #10 0x809ba512 at netisr_dispatch_src+0x62
 #11 0x81c19ab9 at oce_rx+0x3c9
 #12 0x81c19536 at oce_rq_handler+0xb6
 #13 0x81c1bb1c at oce_intr+0xdc
 #14 0x80938b35 at taskqueue_run_locked+0xe5
 #15 0x809395c8 at taskqueue_thread_loop+0xa8
 #16 0x808c057a at fork_exit+0x9a
 #17 0x80ccb51e at fork_trampoline+0xe
 Uptime: 51m20s

ah, that seems to be a bug on the receive side, we were only looking
at the transmit side so far.

cheers
luigi