Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread Andreas Hartmann
On 01/24/2018 at 06:31 AM Andreas Hartmann wrote:
> On 01/23/2018 at 04:47 PM Oliver Freyermuth wrote:
>> Am 23.01.2018 um 16:28 schrieb David Miller:
>>> Looking at how these DMA counters are handled, there appears to be a
>>> requirement that the memory buffer is 64-byte aligned.
>>>
>>> [...]
>>>
>>> Therefore the driver needs to allocate "size + (64 - 1)" bytes and do
>>> the 64-byte alignment of the CPU pointer and the DMA address by hand.
>>
>> This is also what I wondered about as a non-expert in hardware drivers; 
>> alignment should surely be enforced here. 
>>
>> However, for the memory corruption I observed, I used an x86_64 system
>> (which I believe always has PAGE_SIZE aligned buffers). 
>> So there should be another bug, unless I am mistaken about x86_64. 
>>
>> I checked the deprecated r8168 driver by Realtek (I am not sure if this one 
>> is also affected by the issue, though)
> 
> I'm using since years this driver because r8169 is broken (it is slow
> and it misses packages - which is extremely bad for real time
> applications like asterisk, if they appear 50s later ...).
> 
> r8168-8.045.08 is an actual version which is provided by realtek on
> their homepage and which even compiles fine w/ 4.14.x.

It just *compiles* w/ 4.14 - but doesn't work here at all. It's running
fine here w/ vanilla 4.4. Realtek officially writes about support up to
4.7. at the moment - but code already knows about 4.11.


Regards,
Andreas


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread Andreas Hartmann
On 01/23/2018 at 04:47 PM Oliver Freyermuth wrote:
> Am 23.01.2018 um 16:28 schrieb David Miller:
>> Looking at how these DMA counters are handled, there appears to be a
>> requirement that the memory buffer is 64-byte aligned.
>>
>> [...]
>>
>> Therefore the driver needs to allocate "size + (64 - 1)" bytes and do
>> the 64-byte alignment of the CPU pointer and the DMA address by hand.
> 
> This is also what I wondered about as a non-expert in hardware drivers; 
> alignment should surely be enforced here. 
> 
> However, for the memory corruption I observed, I used an x86_64 system
> (which I believe always has PAGE_SIZE aligned buffers). 
> So there should be another bug, unless I am mistaken about x86_64. 
> 
> I checked the deprecated r8168 driver by Realtek (I am not sure if this one 
> is also affected by the issue, though)

I'm using since years this driver because r8169 is broken (it is slow
and it misses packages - which is extremely bad for real time
applications like asterisk, if they appear 50s later ...).

r8168-8.045.08 is an actual version which is provided by realtek on
their homepage and which even compiles fine w/ 4.14.x.


Regards,
Andreas


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread David Miller
From: Oliver Freyermuth 
Date: Wed, 24 Jan 2018 02:21:55 +0100

> Am 23.01.2018 um 23:13 schrieb Francois Romieu:
>> 
>> It helps. Can you try the snippet below ?
> 
> It seems to fix the issue - I could not reproduce memory corruption
> anymore neither on an Ubuntu 17.10.1 live system (with patched
> kernel module) nor on my Gentoo system (4.14.12 with your patch
> applied) across several reboots and module reloads!

Thanks for helping test this out.

> Sorry for my ignorance, but I'm curious - the R32 is there to avoid
> reordering of the writes?

The R32 make sure the write(s) beforehand have reached the chip, and
indeed another thing it ensures is write ordering.

> Many thanks for the quick help. I don't know the policies, but from
> user point of view, this should be a good candidate for backporting
> to stable kernels, since many systems in the wild should be affected
> by this, and spurious memory corruption leading to e.g. broken
> filesystems is rather nasty.

Hopefully Francois can post a bonafide patch for me to apply with
a full commit log message etc.

I'm really surprised we were clearing those registers after
programming properly, I wonder what that was all about. :-/

Thanks again for your testing and help.


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread Oliver Freyermuth
Am 23.01.2018 um 23:13 schrieb Francois Romieu:
> 
> It helps. Can you try the snippet below ?

It seems to fix the issue - I could not reproduce memory corruption anymore
neither on an Ubuntu 17.10.1 live system (with patched kernel module)
nor on my Gentoo system (4.14.12 with your patch applied) 
across several reboots and module reloads! 

Sorry for my ignorance, but I'm curious - the R32 is there to avoid reordering 
of the writes? 

Also, a small issue (for the final patch, in case you did not notice): 
"bool ret" is now unused and will produce a compiler warning. 

Many thanks for the quick help. I don't know the policies, but from user point 
of view,
this should be a good candidate for backporting to stable kernels, 
since many systems in the wild should be affected by this,
and spurious memory corruption leading to e.g. broken filesystems is rather 
nasty. 

All the best,
Oliver

> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> b/drivers/net/ethernet/realtek/r8169.c
> index 272c5962e4f7..8531b41e3397 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -2238,16 +2238,12 @@ static bool rtl8169_do_counters(struct net_device 
> *dev, u32 counter_cmd)
>   bool ret;
>  
>   RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
> + RTL_R32(CounterAddrHigh);
>   cmd = (u64)paddr & DMA_BIT_MASK(32);
>   RTL_W32(CounterAddrLow, cmd);
>   RTL_W32(CounterAddrLow, cmd | counter_cmd);
>  
> - ret = rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
> -
> - RTL_W32(CounterAddrLow, 0);
> - RTL_W32(CounterAddrHigh, 0);
> -
> - return ret;
> + return rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
>  }
>  
>  static bool rtl8169_reset_counters(struct net_device *dev)
> 



Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread Francois Romieu
Oliver Freyermuth  :
[...]
> This looks like it could very well match the structure found in memory,
> so something would be broken related to rtl8169_do_counters, in the DMA
> transfer. 
> 
> Does this help - can I provide more info? I get the feeling this affects
> many tens of thousands of systems and just has been hidden due to network
> stats being read rarely... 

It helps. Can you try the snippet below ?

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 272c5962e4f7..8531b41e3397 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2238,16 +2238,12 @@ static bool rtl8169_do_counters(struct net_device *dev, 
u32 counter_cmd)
bool ret;
 
RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+   RTL_R32(CounterAddrHigh);
cmd = (u64)paddr & DMA_BIT_MASK(32);
RTL_W32(CounterAddrLow, cmd);
RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-   ret = rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
-
-   RTL_W32(CounterAddrLow, 0);
-   RTL_W32(CounterAddrHigh, 0);
-
-   return ret;
+   return rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
 }
 
 static bool rtl8169_reset_counters(struct net_device *dev)


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread Oliver Freyermuth
Am 23.01.2018 um 16:28 schrieb David Miller:
> Looking at how these DMA counters are handled, there appears to be a
> requirement that the memory buffer is 64-byte aligned.
> 
> [...]
> 
> Therefore the driver needs to allocate "size + (64 - 1)" bytes and do
> the 64-byte alignment of the CPU pointer and the DMA address by hand.

This is also what I wondered about as a non-expert in hardware drivers; 
alignment should surely be enforced here. 

However, for the memory corruption I observed, I used an x86_64 system
(which I believe always has PAGE_SIZE aligned buffers). 
So there should be another bug, unless I am mistaken about x86_64. 

I checked the deprecated r8168 driver by Realtek (I am not sure if this one is 
also affected by the issue, though)
and found two major differences in DMA handling:
1) It wraps the DMA operations (writing of adresses, waiting for cmd bits to be 
pulled down) in spin_lock_irqsave / spin_unlock_irqrestore. 
2) It does not reset CounterAddrLow / CounterAddrHigh to 0 / 0 after finishing. 
   That's not really good, but may have hidden this issue with r8168. 

Again, I have not tried to use r8168 yet (especially since it only supports old 
kernels),
but maybe this helps to trigger some ideas. 

Worst case, this could be a firmware timing bug, i.e. the card writes the 
counters to system memory
shortly before the cmd bytes are pulled high / shortly after they have been 
pulled down (then using the partially zeroed
out memory address) - I don't know. Let me know if I can extract any more info 
from an affected machine,
but I believe these machines should be very abundant. 

HTH and thanks,
Oliver


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-23 Thread David Miller
From: Oliver Freyermuth 
Date: Mon, 22 Jan 2018 23:55:58 +0100

> Checking through the driver sources, I find rtnl_link_stats64 can
> not be the culprit, since it has rx_packets and only after
> tx_packets.  However, struct rtl8169_counters looks like:
>
> struct rtl8169_counters {
>   __le64  tx_packets;
>   __le64  rx_packets;
>   __le64  tx_errors;
>   __le32  rx_errors;
>   __le16  rx_missed;
>   __le16  align_errors;
>   __le32  tx_one_collision;
>   __le32  tx_multi_collision;
>   __le64  rx_unicast;
>   __le64  rx_broadcast;
>   __le32  rx_multicast;
>   __le16  tx_aborted;
>   __le16  tx_underun;
> };
>
> This looks like it could very well match the structure found in
> memory, so something would be broken related to rtl8169_do_counters,
> in the DMA transfer.
> 
> Does this help - can I provide more info? I get the feeling this
> affects many tens of thousands of systems and just has been hidden
> due to network stats being read rarely...

Looking at how these DMA counters are handled, there appears to be a
requirement that the memory buffer is 64-byte aligned.

This is because the low bits in the counter address register are used
for various commands, for example:

/* ResetCounterCommand */
CounterReset= 0x1,

/* DumpCounterCommand */
CounterDump = 0x8,

Looking at the FreeBSD driver, the requirement seems to be 64-bytes of
alignment.  (see RL_DUMP_ALIGN define)

However, nothing is being done in r8169.c to enforce this alignment at
counter allocation time:

tp->counters = dmam_alloc_coherent (&pdev->dev, sizeof(*tp->counters),
&tp->counters_phys_addr,

There is no alignment guaranteed by this allocation interface.  On a
lot of platforms you get PAGE_SIZE aligned buffers, but this is not
a universal thing at all.

Therefore the driver needs to allocate "size + (64 - 1)" bytes and do
the 64-byte alignment of the CPU pointer and the DMA address by hand.


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-22 Thread Oliver Freyermuth
Dear Francois, other r8169 experts, 

Am 22.01.2018 um 01:09 schrieb Francois Romieu:
> Are you able to retrieve the layout ? That is, does it appear to match:
> 
> - r8169 hardware stats DMA buffer ?
>   TxOk, RxOk, TxErr, RxErr, ...
> 
> - rtnl_link_stats ?
>   rx_packets, tx_packets, rx_bytes, tx_bytes, ...
> 
> or something else ?
> 

It took me a while, somehow it seems the bug does not always occur - 
potentially there's also some race involved. 
Reproducing on a Ubuntu 17.10 system I found the following:

Address in virtual memory || value
0x7f87bb4c6000|| 0x0217
0x7f87bb4c6008|| 0x03ab
0x7f87bb4c6018|| 0x
0x7f87bb4c6028|| 0x0279
0x7f87bb4c6030|| 0x00e1
0x7f87bb4c6038|| 0x0051

At almost the same time, I find the following numbers in /proc/self/net/dev for 
the device:

 decimal || hex
RX bytes:870820  || 0x000d49a4
   packets: 945  || 0x03b1
   errs   0  || 
   drop   0  || 
   fifo   0  ||  
   frame  0  || 
   compressed 0  || 
   multicast 83  || 0x0053
TX bytes: 58505  || 0xe489
   packets: 535  || 0x0217
   errs   0  || 
   drop   0  ||
   fifo   0  || 
   frame  0  || 
   compressed 0  || 
   multicast  0  || 

Since there was a small delay in time (reading from /proc/self/net/dev happened 
a few seconds later),
these values are by a few packets off from the memory dump. 

So I deduce the layout:
0x7f87bb4c6000   TX Packets
0x7f87bb4c6008   RX Packets
0x7f87bb4c6010* corruption not seen by memtester for whatever reason *
0x7f87bb4c6018   ???
0x7f87bb4c6020* corruption not seen by memtester for whatever reason *
0x7f87bb4c6028   ???
0x7f87bb4c6030   ???
0x7f87bb4c6038   RX multicast (?)

So the only thing which is fully clear is that there are TX Packets and after 
that RX Packets. 

Checking through the driver sources, I find rtnl_link_stats64 can not be the 
culprit, since it has rx_packets and only after tx_packets. 
However, struct rtl8169_counters looks like:
struct rtl8169_counters {
__le64  tx_packets;
__le64  rx_packets;
__le64  tx_errors;
__le32  rx_errors;
__le16  rx_missed;
__le16  align_errors;
__le32  tx_one_collision;
__le32  tx_multi_collision;
__le64  rx_unicast;
__le64  rx_broadcast;
__le32  rx_multicast;
__le16  tx_aborted;
__le16  tx_underun;
};
This looks like it could very well match the structure found in memory, so 
something would be broken related to rtl8169_do_counters, in the DMA transfer. 

Does this help - can I provide more info? I get the feeling this affects many 
tens of thousands of systems and just has been hidden due to 
network stats being read rarely... 

Cheers,
Oliver


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-21 Thread Oliver Freyermuth
Am 22.01.2018 um 01:09 schrieb Francois Romieu:
> You said:
> 
> Oliver Freyermuth  :
> [...]
>> The values found in overwritten memory match those contained in
>> /proc/self/net/dev for the realtek ethernet device.
> 
> Are you able to retrieve the layout ? That is, does it appear to match:
> 
> - r8169 hardware stats DMA buffer ?
>   TxOk, RxOk, TxErr, RxErr, ...
> 
> - rtnl_link_stats ?
>   rx_packets, tx_packets, rx_bytes, tx_bytes, ...
> 
> or something else ?

Not cleanly. 
Since I'm no expert in kernel module development, I can only deduce from what I 
get in mapped memory,
e.g. with memtester. What I found there I found back in /proc/self/net/dev,
I'm not sure anymore whether it was RX or TX bytes / packets (but it was none 
of the error counters). 
I can try to reproduce to clarify, but it's a somwhat dangerous undertaking. 

Also, from a time when the physical offset was in low memory, I got the 
following in syslog:
Oct 12 10:05:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065b8ea
Oct 12 10:10:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065be39
Oct 12 10:11:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065be8c
Oct 12 10:12:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065bef8
Oct 12 10:13:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065bfbe
Oct 12 10:18:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065c37a
Oct 12 10:19:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065c3db
Oct 12 10:31:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065cc48
Oct 12 10:35:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065d402
Oct 12 10:47:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065dcbb
Oct 12 10:53:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 0065e0a3
Oct 12 11:39:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 006602f2
Oct 12 11:44:02 desktop1 kernel: Corrupted low memory at 88009000 (9000 
phys) = 00661ef0

Also, I'm not sure whether the low memory scanner continues after a single 
corruption was found, potentially it would only see the first corrupted region. 
memtester in userspace stops on the first corruption and then tries another 
pass. At least I only ever saw one corrupted region with the tools I used. 

The same was true for the corrupted btrfs filesystem: As far as I could tell, 
there was a single corrupted region, no series of counters, i.e. not a full 
structure. 

Cheers,
Oliver


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-21 Thread Francois Romieu
Oliver Freyermuth  :
> Am 21.01.2018 um 21:48 schrieb Francois Romieu:
> > Oliver Freyermuth  :
[...]
> > Is it an AMD based system ?
> > 
> 
> No, all the systems on which I have observed this up to now are Intel-based. 
> Two Haswell and one Sandy Bridge system. 

Ok.

You said:

Oliver Freyermuth  :
[...]
> The values found in overwritten memory match those contained in
> /proc/self/net/dev for the realtek ethernet device.

Are you able to retrieve the layout ? That is, does it appear to match:

- r8169 hardware stats DMA buffer ?
  TxOk, RxOk, TxErr, RxErr, ...

- rtnl_link_stats ?
  rx_packets, tx_packets, rx_bytes, tx_bytes, ...

or something else ?

-- 
Ueimor


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-21 Thread Oliver Freyermuth
Hi, 

Am 21.01.2018 um 21:48 schrieb Francois Romieu:
> Oliver Freyermuth  :
> [...]
> 
> Is it an AMD based system ?
> 

No, all the systems on which I have observed this up to now are Intel-based. 
Two Haswell and one Sandy Bridge system. 

Cheers,
Oliver


Re: Memory corruption with r8169 across several device revisions and kernels

2018-01-21 Thread Francois Romieu
Oliver Freyermuth  :
[...]

Is it an AMD based system ?

-- 
Ueimor


Memory corruption with r8169 across several device revisions and kernels

2018-01-20 Thread Oliver Freyermuth
Dear network experts,

please redirect me if this is the wrong place. 

I have reproduced the following issue across three devices with different 
Realtek card revisions
and different Distros (Debian 9, Ubuntu 17.04, Gentoo with kernels 4.9, 4.11.3, 
4.14.12). 

It's safely reproducible with at least:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI 
Express Gigabit Ethernet Controller (rev 06)
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI 
Express Gigabit Ethernet Controller (rev 12)

Memory corruption at physical addresses either in low memory or kernel memory 
or user space memory occurs when reading from:
/proc/self/net/dev
The physical memory addresses which get corrupted change with each boot of the 
system,
and also appear to change with each reload of the kernel module (I have only 
one data point on that). 

To reproduce, execute:
$ while true; do cat /proc/self/net/dev > /dev/null; done
and in parallel, scan memory for corruption, e.g.
$ memtester 15G
Of course, one should try to map all system memory here. 
It usually shows up in the first loop iteration if the "while" loop is executed 
in parallel. 

Depending on the actual memory being corrupted, it may also become visible via
Corrupted low memory at 8800b000 (b000 phys) = 0016e109
in klog, if the low memory corruption scanning is activated. 

The values found in overwritten memory match those contained in 
/proc/self/net/dev for the realtek ethernet device. 

Unloading r8169 or disabling the card in bios "fixes" this issue. 

I have already ended up with two corrupted btrfs filesystems due to this issue, 
and many segfaults in userspace. 

Please include me directly in replies, I may not stay subscribed to the list. 

Cheers,
Oliver