Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-27 Thread Yonghyeon PYUN
On Thu, Aug 27, 2015 at 11:29:28AM +0200, Johann Hugo wrote:
 It's working for me so far and I haven't seen any watchdog timeouts.
 With 10.2-RELEASE I got timeouts and lost connectivity in less than a
 minute.
 

Ok, great.  Committed in r287238.
Thanks again.

 Johann
 
 On Wed, Aug 26, 2015 at 10:28 AM, Yonghyeon PYUN pyu...@gmail.com wrote:
  On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
  10.2-RELEASE does not work for me. It works for a very short while and
  then it stops with msk0 watchdog timeout errors
 
 
  Thanks a lot for your report.  This is the first report for
  msk(4) watchdog timeouts on 10.2-RELEASE.
 
  I'm not sure what patch Roosevelt was talking about, but the patch in
  this thread works for me:
  https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html
 
  I've changed MSK_STAT_ALIGN  from 4096 to 8192 in if_mskreg.h and it's
  been running stable for the last week.
 
 
  I see.  I'm under the impression that RX/TX descriptor ring
  alignment shall trigger the same issue so it would be better to
  know how attached patch works on your box.
 
  Thanks.
 
  Johann
 
  On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
   On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
   Hi,
   So, I can confirm with the attached patch. I have a working msk0 that
   hasn't failed for the past month. I considered this problem fix for me.
   Since, I have went a long time without any problems. Thanks!
  
   I'm not sure which patch you used.  Given that users reported
   10.2-RELEASE works, it would be great if you revert local patch
   and try it again on 10.2-RELEASE.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-27 Thread Johann Hugo
It's working for me so far and I haven't seen any watchdog timeouts.
With 10.2-RELEASE I got timeouts and lost connectivity in less than a
minute.

Johann

On Wed, Aug 26, 2015 at 10:28 AM, Yonghyeon PYUN pyu...@gmail.com wrote:
 On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
 10.2-RELEASE does not work for me. It works for a very short while and
 then it stops with msk0 watchdog timeout errors


 Thanks a lot for your report.  This is the first report for
 msk(4) watchdog timeouts on 10.2-RELEASE.

 I'm not sure what patch Roosevelt was talking about, but the patch in
 this thread works for me:
 https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html

 I've changed MSK_STAT_ALIGN  from 4096 to 8192 in if_mskreg.h and it's
 been running stable for the last week.


 I see.  I'm under the impression that RX/TX descriptor ring
 alignment shall trigger the same issue so it would be better to
 know how attached patch works on your box.

 Thanks.

 Johann

 On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
  On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
  Hi,
  So, I can confirm with the attached patch. I have a working msk0 that
  hasn't failed for the past month. I considered this problem fix for me.
  Since, I have went a long time without any problems. Thanks!
 
  I'm not sure which patch you used.  Given that users reported
  10.2-RELEASE works, it would be great if you revert local patch
  and try it again on 10.2-RELEASE.


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-26 Thread Johann Hugo
10.2-RELEASE does not work for me. It works for a very short while and
then it stops with msk0 watchdog timeout errors

I'm not sure what patch Roosevelt was talking about, but the patch in
this thread works for me:
https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html

I've changed MSK_STAT_ALIGN  from 4096 to 8192 in if_mskreg.h and it's
been running stable for the last week.

Johann

On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
 On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
 Hi,
 So, I can confirm with the attached patch. I have a working msk0 that
 hasn't failed for the past month. I considered this problem fix for me.
 Since, I have went a long time without any problems. Thanks!

 I'm not sure which patch you used.  Given that users reported
 10.2-RELEASE works, it would be great if you revert local patch
 and try it again on 10.2-RELEASE.


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-26 Thread Yonghyeon PYUN
On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
 10.2-RELEASE does not work for me. It works for a very short while and
 then it stops with msk0 watchdog timeout errors
 

Thanks a lot for your report.  This is the first report for
msk(4) watchdog timeouts on 10.2-RELEASE.

 I'm not sure what patch Roosevelt was talking about, but the patch in
 this thread works for me:
 https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html
 
 I've changed MSK_STAT_ALIGN  from 4096 to 8192 in if_mskreg.h and it's
 been running stable for the last week.
 

I see.  I'm under the impression that RX/TX descriptor ring
alignment may trigger the same issue, so it would be good to
know how the attached patch works on your box.

Thanks.

 Johann
 
 On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
  On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
  Hi,
  So, I can confirm with the attached patch. I have a working msk0 that
  hasn't failed for the past month. I considered this problem fix for me.
  Since, I have went a long time without any problems. Thanks!
 
  I'm not sure which patch you used.  Given that users reported
  10.2-RELEASE works, it would be great if you revert local patch
  and try it again on 10.2-RELEASE.
Index: sys/dev/msk/if_mskreg.h
===
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define	MSK_STAT_ALIGN	4096
+#define	MSK_RING_ALIGN	32768
+#define	MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {

Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-16 Thread Yonghyeon PYUN
On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
 Hi,
 So, I can confirm with the attached patch. I have a working msk0 that
 hasn't failed for the past month. I considered this problem fix for me.
 Since, I have went a long time without any problems. Thanks!

I'm not sure which patch you used.  Given that users have reported
that 10.2-RELEASE works, it would be great if you could revert your
local patch and try again on 10.2-RELEASE.


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-15 Thread Alnis Morics

On 08/12/2015 04:44 PM, Roosevelt Littleton wrote:

Hi,
So, I can confirm with the attached patch. I have a working msk0 that
hasn't failed for the past month. I considered this problem fix for me.
Since, I have went a long time without any problems. Thanks!

Roosevelt

Since 10.2-RC1 it works for me, too; I'm now on 10.2-RELEASE, and I
still don't use any patches.


-Alnis


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-08-12 Thread Roosevelt Littleton
Hi,
So, I can confirm that with the attached patch I have a working msk0 that
hasn't failed for the past month. I consider this problem fixed for me,
since I have gone a long time without any problems. Thanks!

Roosevelt


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-07-26 Thread Yonghyeon PYUN
On Sat, Jul 25, 2015 at 02:08:10PM +0300, Alnis Morics wrote:

 Just tried 10.2-RC1 amd64 GENERIC, and the problem seems to be gone. I
 was even able to scp a 500 MB file. Could it be related to this fix in
 BETA2, as mentioned in the announcement: "The watchdog(4) device has
 been fixed to print to the correct buffer"?
 

msk(4) reports watchdog timeouts when it detects that the driver's TX
path is stuck, but I believe this has nothing to do with watchdog(4).
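
The countdown pattern behind such a driver-level timeout can be sketched as
follows. This is a minimal illustration only, not the msk(4) source: the
struct, the function names and the 5-tick timeout are all assumptions made
for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical timeout in ticks; not taken from the msk(4) source. */
#define EX_TX_TIMEOUT	5

/* Hypothetical per-interface state for this sketch. */
struct ex_softc {
	int watchdog_timer;	/* 0 means the watchdog is disarmed */
};

/* Arm the watchdog whenever a frame is queued for transmission. */
static void
ex_start_tx(struct ex_softc *sc)
{
	sc->watchdog_timer = EX_TX_TIMEOUT;
}

/* Disarm it once the hardware reports that all queued frames were sent. */
static void
ex_tx_complete(struct ex_softc *sc)
{
	sc->watchdog_timer = 0;
}

/*
 * Called once per tick.  Returns true when the timer expires, i.e. no TX
 * completion arrived within EX_TX_TIMEOUT ticks.  A real driver would log
 * a watchdog timeout and reinitialize the interface at that point.
 */
static bool
ex_tick(struct ex_softc *sc)
{
	assert(sc->watchdog_timer >= 0);
	if (sc->watchdog_timer == 0 || --sc->watchdog_timer > 0)
		return (false);
	return (true);
}
```

The point of the sketch is that the timeout is a symptom: it fires because
TX completions stop arriving, whatever the underlying cause.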

There was no msk(4) code change in 10.2-RC1.  If you happen to see
the watchdog timeouts again, please try the attached patch and let me
know whether it makes any difference for you.  I didn't get much
feedback on the patch, so I'm not sure whether it really fixes the
root cause.

 pciconf -lv
 [..]
 mskc0@pci0:9:0:0:class=0x02 card=0xc072144d chip=0x435411ab 
 rev=0x00 hdr=0x00
 vendor = 'Marvell Technology Group Ltd.'
 device = '88E8040 PCI-E Fast Ethernet Controller'
 class  = network
 subclass   = ethernet
 
 
Index: sys/dev/msk/if_mskreg.h
===
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define	MSK_STAT_ALIGN	4096
+#define	MSK_RING_ALIGN	32768
+#define	MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {

Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-07-26 Thread Alnis Morics

On 07/26/2015 01:40 PM, Yonghyeon PYUN wrote:

On Sat, Jul 25, 2015 at 02:08:10PM +0300, Alnis Morics wrote:


Just tried 10.2-RC1 amd64 GENERIC, and the problem seems to be gone. I
was even able to scp a 500 MB file. Could it be related to this fix in
BETA2, as mentioned in the announcement: "The watchdog(4) device has
been fixed to print to the correct buffer"?


msk(4) reports watchdog timeouts when it detects that the driver's TX
path is stuck, but I believe this has nothing to do with watchdog(4).

There was no msk(4) code change in 10.2-RC1.  If you happen to see
the watchdog timeouts again, please try the attached patch and let me
know whether it makes any difference for you.  I didn't get much
feedback on the patch, so I'm not sure whether it really fixes the
root cause.


pciconf -lv
[..]
mskc0@pci0:9:0:0:class=0x02 card=0xc072144d chip=0x435411ab
rev=0x00 hdr=0x00
 vendor = 'Marvell Technology Group Ltd.'
 device = '88E8040 PCI-E Fast Ethernet Controller'
 class  = network
 subclass   = ethernet




Thanks, Pyun. If the watchdog timeouts reappear, I'll try the patch and 
give notice about the results.


-Alnis


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-07-25 Thread Alnis Morics
=197
mskc0: msk_handle_events: sd=0xfe011e23b620  sd->msk_control=1610612806  control=1610612806
mskc0: msk_handle_events: Break #5  cons=196  csrread=197
mskc0: msk_handle_events: Break #5  cons=197  csrread=198
...
mskc0: msk_handle_events: Break #5  cons=510  csrread=511
mskc0: msk_handle_events: Break #5  cons=511  csrread=512
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
...
mskc0: msk_handle_events: Break #1  cons=512  csrread=519
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=519
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
...etc



From: owner-freebsd-sta...@freebsd.org [owner-freebsd-sta...@freebsd.org] on 
behalf of Yonghyeon PYUN [pyu...@gmail.com]
Sent: 13 April 2015 09:13
To: Gareth Wyn Roberts
Cc: freebsd-stable@freebsd.org
Subject: Re: msk msk0 watchdog timeout freeze hang lock stop problem

On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote:

I've run in to problems using the msk device where initially it works well 
enough to set DHCP etc. but stops/freezes as soon as any appreciable network 
traffic occurs . There are several threads describing similar symptoms over the 
past two years or more.  I've been following several false leads but have 
finally found a solution (at least it solves my problem).

I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as:

mskc0: Marvell Yukon 88E8057 Gigabit Ethernet mem 0xfa00-0xfa003fff irq 
19 at device 0.0 on pci6
msk0: Marvell Technology Group Ltd. Yukon Ultra 2 Id 0xba Rev 0x00 on mskc0
msk0: Ethernet address: 00:13:77:e9:df:eb
miibus0: MII bus on msk0
e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
e1000phy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma
ster, auto, auto-flow

The network worked when using the i386 release, but failed for the amd64 
release (as reported previously) which prompted me to disable 64-bit DMA (the 
patch for this is attached below).  This worked for the first kernel built but 
mysteriously failed when another unrelated part of the kernel was changed (a 
usb driver) and the kernel recompiled.  So identical msk driver code worked in 
one kernel but not the second! This suggested that alignment differences 
between the two kernels were causing the msk driver to fail. Others have 
reported varying behaviour depending on different circumstances.

It transpires that changing just one value in the if_mskreg.h file solved all 
my problems.  Subsequently I have not been able to make it fail under heavy 
network traffic in either 32-bit or 64-bit mode.
I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and 
if_mskreg.h revision 264442.

Thanks for letting me know your findings.  I really appreciate
that.
I recall that the alignment requirement of status LEs(List Elements
in Marvell terms) is 2048 and the maximum size of the status LEs is
4096 bytes(Actual alignment seems to be much lower value like 32 or
64 bytes, but alignment 2048 is chosen to avoid silicon bugs).
Later experiments showed some variants of Yukon II require 4096
bytes alignment and I changed the alignment to 4096 in the past.
It seems your finding indicates msk(4) needs 8192 alignment for
status LEs.

However this does not explain how and why the same code in 8.x/9.x
works well.  In addition, it's not common to require alignment size
greater than PAGE_SIZE on x86 given that the maximum size of DMA
buffer is 4096 bytes.  I have to check whether there was a change
in bus_dma(9) between 8.x/9.x and 10.x but it needs more time due
to lack of spare time.  Probably you can verify the DMA address of
status LEs meets the following requirements both on i386 and amd64.
   - Alignment is 4096.
   - Number of DMA segment is 1.
   - DMA segment base address plus DMA segment size does not cross
 a PAGE_SIZE boundary.


Here's the patch to if_mskreg.h
--- if_mskreg.h-orig2014-11-11 20:02:58.0

Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-04-16 Thread Yonghyeon PYUN
On Wed, Apr 15, 2015 at 09:52:09PM +, Gareth Wyn Roberts wrote:
 I've inserted code to print some values which show the differences between 
 specifying 4096 or 8192 for MSK_STAT_ALIGN.  In both cases the status buffer 
 has length 0x4000 (8x2048=16K) but the alignments are different as expected, 
 respectively start addresses 0x5c3b000 or 0xbdc2c000.
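
The alignment of the two start addresses reported above can be checked with a
few lines of C. This is a small sketch for illustration; the helper names are
invented here and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that divides addr, i.e. its natural alignment. */
static uint64_t
natural_alignment(uint64_t addr)
{
	assert(addr != 0);
	return (addr & (~addr + 1));	/* isolate the lowest set bit */
}

/* Nonzero if addr is a multiple of align; align must be a power of two. */
static int
is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}
```

Applied to the values above, 0x5c3b000 is 4096-aligned but not 8192-aligned,
while 0xbdc2c000 happens to be aligned to 16384.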
 
 The following values were output from functions msk_status_dma_alloc(), 
 msk_dmamap_cb() and msk_handle_events().
 The "Break #n" labels refer to breaks in msk_handle_events(). #1 occurs if 
 ((control & HW_OWNER) == 0), #5 is OP_RXSTAT and #6 is OP_TXINDEXLE.
 
 The first output is for MSK_STAT_ALIGN=8192.  It continues normally.  
 Although not shown here, it reaches cons=2047 then cons=0 as expected.
 
 The second output is for MSK_STAT_ALIGN=4096.  Although there can be isolated 
 occurrences of Break #1 (e.g. cons=196; are these to be expected?), it 
 continues normally until cons=512. At this point it continually takes the 
 #1 branch because the msk_control from msk_stat_ring[512] is always zero, and 
 the network hangs immediately. This suggests the Yukon Ultra 2 88E8057 can't 
 access the next 4096-byte memory block, but why not?
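
The arithmetic behind cons=512 can be sketched as below. The 8-byte element
size is an assumption derived from the figures in this message (2048 elements
filling a 0x4000-byte buffer); the macro and function names are invented for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_STAT_LE_SIZE	8	/* assumed bytes per status element */
#define EX_PAGE_SIZE	4096	/* assumed page size */

/* Byte offset of ring element idx from the start of the status block. */
static size_t
stat_le_offset(size_t idx)
{
	assert(idx <= SIZE_MAX / EX_STAT_LE_SIZE);	/* no overflow */
	return (idx * EX_STAT_LE_SIZE);
}

/* Index of the 4096-byte page, counted from the block start, holding idx. */
static size_t
stat_le_page(size_t idx)
{
	return (stat_le_offset(idx) / EX_PAGE_SIZE);
}
```

Under that assumption, element 511 is the last one in the first 4096 bytes of
the block and element 512 is the first one in the second 4096 bytes, which is
exactly where the stall is observed.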
 

Yes, it seems the status LE block is not updated at all for
MSK_STAT_ALIGN == 4096, and some elements of the status block look
suspicious (the put index increases but the value at that location is 0).
My vague guess is that this indicates DMA alignment and/or DMA
boundary issues.
The maximum number of elements in the status block is 4096, so the
maximum size of the status block is 32KB.  For i386, msk(4) uses an
8KB status block (1024 elements).  For 64-bit architectures, the
block size is increased to 16KB (2048 elements).
The safe alignment value for the status block would probably be
32KB.  This looks excessive to me, but it should avoid having to
guess at DMA boundary issues.
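
The sizes quoted above follow from 8 bytes per element (32KB / 4096
elements). A small sketch of that arithmetic, with helper names invented for
the example:

```c
#include <assert.h>
#include <stddef.h>

#define EX_STAT_LE_SIZE	8	/* bytes per element, from 32KB / 4096 */

/* Size in bytes of a status block holding nelem elements. */
static size_t
stat_block_bytes(size_t nelem)
{
	return (nelem * EX_STAT_LE_SIZE);
}

/*
 * Smallest power of two that is >= n.  A block aligned to this value
 * cannot straddle a boundary of the same size.
 */
static size_t
round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0 && n <= ((size_t)1 << (sizeof(size_t) * 8 - 1)));
	while (p < n)
		p <<= 1;
	return (p);
}
```

For 1024, 2048 and 4096 elements this gives 8192, 16384 and 32768 bytes, so
aligning to 32768 covers the largest possible status block.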

 Please let me know if any further information would be helpful.
 

Thanks a lot. I've attached a diff which sets the alignment of
TX/RX ring and status block to 32KB.  Not sure whether this also
addresses other msk(4) related watchdog timeouts.
Index: sys/dev/msk/if_mskreg.h
===
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define	MSK_STAT_ALIGN	4096
+#define	MSK_RING_ALIGN	32768
+#define	MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {

RE: msk msk0 watchdog timeout freeze hang lock stop problem

2015-04-15 Thread Gareth Wyn Roberts
  sd->msk_control=1610612806  control=1610612806
mskc0: msk_handle_events: Break #5  cons=196  csrread=197
mskc0: msk_handle_events: Break #5  cons=197  csrread=198
...
mskc0: msk_handle_events: Break #5  cons=510  csrread=511
mskc0: msk_handle_events: Break #5  cons=511  csrread=512
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=513
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
...
mskc0: msk_handle_events: Break #1  cons=512  csrread=519
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
mskc0: msk_handle_events: Break #1  cons=512  csrread=519
mskc0: msk_handle_events: sd=0xfe011e23c000  sd->msk_control=0  control=0
...etc



From: owner-freebsd-sta...@freebsd.org [owner-freebsd-sta...@freebsd.org] on 
behalf of Yonghyeon PYUN [pyu...@gmail.com]
Sent: 13 April 2015 09:13
To: Gareth Wyn Roberts
Cc: freebsd-stable@freebsd.org
Subject: Re: msk msk0 watchdog timeout freeze hang lock stop problem

On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote:
 I've run in to problems using the msk device where initially it works well 
 enough to set DHCP etc. but stops/freezes as soon as any appreciable network 
 traffic occurs . There are several threads describing similar symptoms over 
 the past two years or more.  I've been following several false leads but have 
 finally found a solution (at least it solves my problem).

 I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as:

 mskc0: Marvell Yukon 88E8057 Gigabit Ethernet mem 0xfa00-0xfa003fff irq 
 19 at device 0.0 on pci6
 msk0: Marvell Technology Group Ltd. Yukon Ultra 2 Id 0xba Rev 0x00 on mskc0
 msk0: Ethernet address: 00:13:77:e9:df:eb
 miibus0: MII bus on msk0
 e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
 e1000phy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma
 ster, auto, auto-flow

 The network worked when using the i386 release, but failed for the amd64 
 release (as reported previously) which prompted me to disable 64-bit DMA (the 
 patch for this is attached below).  This worked for the first kernel built 
 but mysteriously failed when another unrelated part of the kernel was changed 
 (a usb driver) and the kernel recompiled.  So identical msk driver code 
 worked in one kernel but not the second! This suggested that alignment 
 differences between the two kernels were causing the msk driver to fail. 
 Others have reported varying behaviour depending on different circumstances.

 It transpires that changing just one value in the if_mskreg.h file solved all 
 my problems.  Subsequently I have not been able to make it fail under heavy 
 network traffic in either 32-bit or 64-bit mode.
 I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and 
 if_mskreg.h revision 264442.

Thanks for letting me know your findings.  I really appreciate
that.
I recall that the alignment requirement of status LEs(List Elements
in Marvell terms) is 2048 and the maximum size of the status LEs is
4096 bytes(Actual alignment seems to be much lower value like 32 or
64 bytes, but alignment 2048 is chosen to avoid silicon bugs).
Later experiments showed some variants of Yukon II require 4096
bytes alignment and I changed the alignment to 4096 in the past.
It seems your finding indicates msk(4) needs 8192 alignment for
status LEs.

However this does not explain how and why the same code in 8.x/9.x
works well.  In addition, it's not common to require alignment size
greater than PAGE_SIZE on x86 given that the maximum size of DMA
buffer is 4096 bytes.  I have to check whether there was a change
in bus_dma(9) between 8.x/9.x and 10.x but it needs more time due
to lack of spare time.  Probably you can verify the DMA address of
status LEs meets the following requirements both on i386 and amd64.
  - Alignment is 4096.
  - Number of DMA segment is 1.
  - DMA segment base address plus DMA segment size does not cross
a PAGE_SIZE boundary.


 Here's the patch to if_mskreg.h
 --- if_mskreg.h-orig2014-11-11 20:02:58.0 +
 +++ if_mskreg.h 2015

msk msk0 watchdog timeout freeze hang lock stop problem

2015-04-13 Thread Alnis Morics
Hm... I patched if_msk.c with if_msk.c.rev262524.dma.diff 
(attachment-001.bin) and if_mskreg.h with if_mskreg.h.rev264442.dma.diff 
(attachment-002.bin), and nothing changed: scp'ing 50 MB soon got 
stalled and ended up with broken pipe, as it was before.


I have 10.1-RELEASE-p9 amd64

pciconf -lv:
[..]
mskc0@pci0:9:0:0:class=0x02 card=0xc072144d chip=0x435411ab 
rev=0x00 hdr=0x00

vendor = 'Marvell Technology Group Ltd.'
device = '88E8040 PCI-E Fast Ethernet Controller'
class  = network
subclass   = ethernet

Alnis


Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-04-13 Thread Yonghyeon PYUN
On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote:
 I've run into problems using the msk device where initially it works well 
 enough to set up DHCP etc., but stops/freezes as soon as any appreciable network 
 traffic occurs. There are several threads describing similar symptoms over 
 the past two years or more.  I've been following several false leads but have 
 finally found a solution (at least it solves my problem).
 
 I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as:
 
 mskc0: <Marvell Yukon 88E8057 Gigabit Ethernet> mem 0xfa00-0xfa003fff irq 19 at device 0.0 on pci6
 msk0: <Marvell Technology Group Ltd. Yukon Ultra 2 Id 0xba Rev 0x00> on mskc0
 msk0: Ethernet address: 00:13:77:e9:df:eb
 miibus0: <MII bus> on msk0
 e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
 e1000phy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
 
 The network worked when using the i386 release, but failed for the amd64 
 release (as reported previously) which prompted me to disable 64-bit DMA (the 
 patch for this is attached below).  This worked for the first kernel built 
 but mysteriously failed when another unrelated part of the kernel was changed 
 (a usb driver) and the kernel recompiled.  So identical msk driver code 
 worked in one kernel but not the second! This suggested that alignment 
 differences between the two kernels were causing the msk driver to fail. 
 Others have reported varying behaviour depending on different circumstances.
 
 It transpires that changing just one value in the if_mskreg.h file solved all 
 my problems.  Subsequently I have not been able to make it fail under heavy 
 network traffic in either 32-bit or 64-bit mode.
 I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and 
 if_mskreg.h revision 264442.

Thanks for letting me know your findings.  I really appreciate
that.
I recall that the alignment requirement of status LEs (List Elements,
in Marvell terms) is 2048 and the maximum size of the status LEs is
4096 bytes.  (The actual alignment seems to be a much lower value, like
32 or 64 bytes, but 2048 was chosen to avoid silicon bugs.)
Later experiments showed that some variants of Yukon II require 4096-byte
alignment, and I changed the alignment to 4096 in the past.
Your finding seems to indicate that msk(4) needs 8192-byte alignment
for status LEs.

However, this does not explain how and why the same code works well
in 8.x/9.x.  In addition, it's not common to require an alignment
greater than PAGE_SIZE on x86, given that the maximum size of a DMA
buffer is 4096 bytes.  I have to check whether there was a change
in bus_dma(9) between 8.x/9.x and 10.x, but that will take a while
due to lack of spare time.  You can probably verify that the DMA address
of the status LEs meets the following requirements on both i386 and amd64:
  - Alignment is 4096.
  - Number of DMA segments is 1.
  - DMA segment base address plus DMA segment size does not cross
    a PAGE_SIZE boundary.
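
The three requirements listed above could be checked with something like the
following sketch. The segment struct and all function names are hypothetical
and exist only for this example; in the kernel, the segment address and
length would come from the bus_dma(9) load callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA segment (bus address and length). */
struct ex_dma_seg {
	uint64_t addr;
	uint64_t len;
};

static bool
ex_is_pow2(uint64_t v)
{
	return (v != 0 && (v & (v - 1)) == 0);
}

/* True if addr is a multiple of align (a power of two). */
static bool
ex_is_aligned(uint64_t addr, uint64_t align)
{
	assert(ex_is_pow2(align));
	return ((addr & (align - 1)) == 0);
}

/* True if [addr, addr + len) spans more than one boundary-sized window. */
static bool
ex_crosses_boundary(uint64_t addr, uint64_t len, uint64_t boundary)
{
	assert(len > 0 && ex_is_pow2(boundary));
	return ((addr / boundary) != ((addr + len - 1) / boundary));
}

/* Apply all three checks: one segment, aligned, no boundary crossing. */
static bool
ex_stat_dma_ok(const struct ex_dma_seg *segs, int nsegs, uint64_t align,
    uint64_t boundary)
{
	if (nsegs != 1)
		return (false);
	return (ex_is_aligned(segs[0].addr, align) &&
	    !ex_crosses_boundary(segs[0].addr, segs[0].len, boundary));
}
```

Note that a 0x4000-byte status block, as reported elsewhere in this thread,
necessarily spans several 4096-byte pages, so the third check can only pass
for blocks of at most one page.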

 
 Here's the patch to if_mskreg.h
 --- if_mskreg.h-orig 2014-11-11 20:02:58.0 +
 +++ if_mskreg.h 2015-04-12 18:47:20.0 +0100
 @@ -2179,9 +2179,11 @@
   * At first I guessed 8 bytes, the size of a single descriptor, would be
   * required alignment constraints. But, it seems that Yukon II have 4096
   * bytes boundary alignment constraints.
 + * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057)
 + * requires 8192 byte alignment to prevent locking.
   */
  #define MSK_RING_ALIGN 4096
 -#define MSK_STAT_ALIGN  4096
 +#define MSK_STAT_ALIGN  8192
 
 
 The patches to both files which also implement a MSK_64BIT_DMA_DISABLE flag 
 are attached.  Perhaps the developers would consider committing these as it 
 may be useful for future debugging.
 

If you have more than 4GB of memory installed and disable 64-bit DMA
addressing, msk(4) will use bounce buffers.  Passing packets
through bounce buffers involves a copy operation, which is expensive.
You can check the hw.busdma sysctl node to see whether there are
drivers that use bounce buffers.  And if you want to disable 64-bit
DMA on 64-bit architectures, add '#undef MSK_64BIT_DMA' just below
the BUS_SPACE_MAXADDR check in if_mskreg.h.
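
As a rough sketch of that override: the exact shape of the check in
if_mskreg.h is assumed here rather than copied from the header, the
BUS_SPACE_MAXADDR value is a stand-in so the fragment compiles on its own,
and msk_uses_64bit_dma() is a helper invented to make the result visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in value; in the kernel this comes from the machine bus headers. */
#ifndef BUS_SPACE_MAXADDR
#define	BUS_SPACE_MAXADDR	0xFFFFFFFFFFFFFFFFULL
#endif

/* Assumed shape of the existing check: enable 64-bit DMA on wide buses. */
#if (BUS_SPACE_MAXADDR > 0xFFFFFFFF)
#define	MSK_64BIT_DMA
#endif

/* Local override, placed just below the BUS_SPACE_MAXADDR check. */
#undef MSK_64BIT_DMA

/* Hypothetical helper reporting which DMA mode was compiled in. */
static bool
msk_uses_64bit_dma(void)
{
#ifdef MSK_64BIT_DMA
	return (true);
#else
	return (false);
#endif
}
```

With the override in place the helper returns false even on a 64-bit bus,
which is the situation where bounce buffers come into play on machines with
more than 4GB of memory.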


msk msk0 watchdog timeout freeze hang lock stop problem

2015-04-12 Thread Gareth Wyn Roberts
I've run into problems using the msk device where initially it works well 
enough to set up DHCP etc., but stops/freezes as soon as any appreciable network 
traffic occurs. There are several threads describing similar symptoms over the 
past two years or more.  I've been following several false leads but have 
finally found a solution (at least it solves my problem).

I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as:

mskc0: <Marvell Yukon 88E8057 Gigabit Ethernet> mem 0xfa00-0xfa003fff irq 19 at device 0.0 on pci6
msk0: <Marvell Technology Group Ltd. Yukon Ultra 2 Id 0xba Rev 0x00> on mskc0
msk0: Ethernet address: 00:13:77:e9:df:eb
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
e1000phy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow

The network worked when using the i386 release, but failed with the amd64 
release (as reported previously), which prompted me to disable 64-bit DMA (the 
patch for this is attached below).  This worked for the first kernel I built 
but mysteriously failed after an unrelated part of the kernel (a USB driver) 
was changed and the kernel recompiled.  So identical msk driver code worked in 
one kernel but not in the second! This suggested that alignment differences 
between the two kernels were causing the msk driver to fail. Others have 
reported behaviour that varies with circumstances.
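
The alignment hypothesis can be illustrated with a small userland sketch
(not driver code).  An address that satisfies 4096-byte alignment also
satisfies 8192-byte alignment only when it happens to fall on an even page,
so where a buffer lands could plausibly differ between two kernel builds.
The helper names and the sample addresses are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MSK_STAT_ALIGN_OLD	4096	/* value in the 10.1-RELEASE header */
#define MSK_STAT_ALIGN_NEW	8192	/* value proposed in this thread */

/* Return 1 if addr is a multiple of align; align must be a power of two. */
static int
is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}

/* Round addr up to the next multiple of align; align must be a power of two. */
static uint64_t
align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr + align - 1) & ~(align - 1));
}
```

For example, the made-up address 0x7000 passes is_aligned() for 4096 but not
for 8192, while 0x8000 passes both, and align_up(0x7000, 8192) yields 0x8000.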

It transpires that changing just one value in the if_mskreg.h file solved all 
my problems.  Subsequently I have not been able to make it fail under heavy 
network traffic in either 32-bit or 64-bit mode.
I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and 
if_mskreg.h revision 264442.

Here's the patch to if_mskreg.h
--- if_mskreg.h-orig	2014-11-11 20:02:58.0 +
+++ if_mskreg.h 2015-04-12 18:47:20.0 +0100
@@ -2179,9 +2179,11 @@
  * At first I guessed 8 bytes, the size of a single descriptor, would be
  * required alignment constraints. But, it seems that Yukon II have 4096
  * bytes boundary alignment constraints.
+ * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057)
+ * requires 8192 byte alignment to prevent locking.
  */
 #define MSK_RING_ALIGN 4096
-#define	MSK_STAT_ALIGN	4096
+#define	MSK_STAT_ALIGN	8192


The patches to both files which also implement a MSK_64BIT_DMA_DISABLE flag are 
attached.  Perhaps the developers would consider committing these as it may be 
useful for future debugging.

Gareth.
--- if_mskreg.h-orig	2014-11-11 20:02:58.0 +
+++ if_mskreg.h	2015-04-12 18:47:20.0 +0100
@@ -2179,9 +2179,11 @@
  * At first I guessed 8 bytes, the size of a single descriptor, would be
  * required alignment constraints. But, it seems that Yukon II have 4096
  * bytes boundary alignment constraints.
+ * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057)
+ * requires 8192 byte alignment to prevent locking.
  */
 #define MSK_RING_ALIGN	4096
-#define	MSK_STAT_ALIGN	4096
+#define	MSK_STAT_ALIGN	8192
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {
--- if_msk.c-orig	2014-11-11 20:02:58.0 +
+++ if_msk.c	2015-04-12 02:15:12.551005000 +0100
@@ -2164,8 +2164,8 @@
 	error = bus_dma_tag_create(
 		bus_get_dma_tag(sc->msk_dev),	/* parent */
 		MSK_STAT_ALIGN, 0,		/* alignment, boundary */
-		BUS_SPACE_MAXADDR,		/* lowaddr */
-		BUS_SPACE_MAXADDR,		/* highaddr */
+		BUS_DMA_TAG_LOWADDR,	/* lowaddr */
+		BUS_DMA_TAG_HIGHADDR,	/* highaddr */
 		NULL, NULL,			/* filter, filterarg */
 		stat_sz,			/* maxsize */
 		1,				/* nsegments */
@@ -2235,8 +2235,8 @@
 	error = bus_dma_tag_create(
 		bus_get_dma_tag(sc_if->msk_if_dev),	/* parent */
 		1, 0,			/* alignment, boundary */
-		BUS_SPACE_MAXADDR,		/* lowaddr */
-		BUS_SPACE_MAXADDR,		/* highaddr */
+		BUS_DMA_TAG_LOWADDR,	/* lowaddr */
+		BUS_DMA_TAG_HIGHADDR,	/* highaddr */
 		NULL, NULL,			/* filter, filterarg */
 		BUS_SPACE_MAXSIZE_32BIT,	/* maxsize */
 		0,				/* nsegments */
@@ -2252,8 +2252,8 @@
 	/* Create tag for Tx ring. */
 	error = bus_dma_tag_create(sc_if->msk_cdata.msk_parent_tag,/* parent */
 		MSK_RING_ALIGN, 0,		/* alignment, boundary */
-		BUS_SPACE_MAXADDR,		/* lowaddr */
-		BUS_SPACE_MAXADDR,		/* highaddr */
+		BUS_DMA_TAG_LOWADDR,	/* lowaddr */
+		BUS_DMA_TAG_HIGHADDR,	/* highaddr */
 		NULL, NULL,			/* filter, filterarg */
 		MSK_TX_RING_SZ,		/* maxsize */
 		1,				/* nsegments */
@@ -2270,8 +2270,8 @@
 	/* Create tag for Rx ring. */
 	error = bus_dma_tag_create(sc_if->msk_cdata.msk_parent_tag,/* parent */
 		MSK_RING_ALIGN, 0,		/* alignment, boundary */
-		BUS_SPACE_MAXADDR,		/* lowaddr */
-		BUS_SPACE_MAXADDR,		/* highaddr */
+		BUS_DMA_TAG_LOWADDR,	/* lowaddr */
+		BUS_DMA_TAG_HIGHADDR,	/* highaddr */
 		NULL, 

Re: msk msk0 watchdog timeout freeze hang lock stop problem

2015-04-12 Thread Kurt Jaeger
Hi!

 I've run in to problems using the msk device [...]

 I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and 
 if_mskreg.h revision 264442.
 
 Here's the patch to if_mskreg.h
[...]

Thanks for the suggested fix.

There are five PRs, all describe similar things:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197887
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197002
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=189404
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186872
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=166727

I added pointers to your posting; maybe someone can test it?

-- 
p...@opsec.eu            +49 171 3101372            5 years to go !