Re: r8152: data corruption in various scenarios

2019-01-07 Thread Mark Lord
On 2019-01-07 1:27 p.m., mario.limoncie...@dell.com wrote:
..
> The xHCI overrun workaround should only be applied on TB15/TB16, correct.
> 
> Can you double check the verbose information from lsusb for the r8153 device
> on your WD15?

Sure, see below for the full output.

> If it's the same information as the TB16 (which it sounds like it is) Kai Heng
> and I will check around internally to find out why they're looking the same.

Thanks.

> My second guess would be maybe newer ethernet NVM in manufacturing.
> My third guess would be a manufacturing issue putting wrong NVM image on your 
> WD15.

It could be one of those two things.
Let us know what you discover.

Thanks

Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp.
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x0bda Realtek Semiconductor Corp.
  idProduct          0x8153
  bcdDevice           30.11
  iManufacturer           1 Realtek
  iProduct                2 USB 10/100/1000 LAN
  iSerial                 6 0200
  bNumConfigurations      2
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           57
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               64mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol      0
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               3
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               3
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0002  1x 2 bytes
        bInterval               8
        bMaxBurst               0
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           98
    bNumInterfaces          2
    bConfigurationValue     2
    iConfiguration          0
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               64mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         2 Communications
      bInterfaceSubClass      6 Ethernet Networking
      bInterfaceProtocol      0
      iInterface              5 CDC Communications Control
      CDC Header:
        bcdCDC               1.10
      CDC Union:
        bMasterInterface        0
        bSlaveInterface         1
      CDC Ethernet:
        iMacAddress                      3 54BF6450FC4F
        bmEthernetStatistics    0x
        wMaxSegmentSize               1514
        wNumberMCFilters            0x
        bNumberPowerFilters              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval               8
        bMaxBurst               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           0
      bInterfaceClass        10 CDC Data
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0
      iInterface              0
    Interface Descriptor:

RE: r8152: data corruption in various scenarios

2019-01-07 Thread Mario.Limonciello
> -----Original Message-----
> From: Mark Lord 
> Sent: Monday, January 7, 2019 12:06 PM
> To: Limonciello, Mario; hayesw...@realtek.com; kai.heng.f...@canonical.com
> Cc: aatt...@nicira.com; da...@davemloft.net; g...@kroah.com;
> rom...@fr.zoreil.com; net...@vger.kernel.org; nic_s...@realtek.com; linux-
> ker...@vger.kernel.org; linux-...@vger.kernel.org; ryan...@realtek.com
> Subject: Re: r8152: data corruption in various scenarios
> 
> 
> [EXTERNAL EMAIL]
> 
> On 2019-01-07 11:01 a.m., mario.limoncie...@dell.com wrote:
> >
> > TB16 contains ASMedia host controller.  It's a Thunderbolt dock and all USB
> > devices are connected to ASMedia host controller in the dock.
> >
> > WD15 does not contain an ASMedia host controller, it connected to system's
> > USB host controller.
> 
> 
> Thank-you, Mario.
> 
> So.. why are we enabling the r8153 (USB-ethernet) workaround on this WD15
> dock?
> The discussion back in 2017 was that only the TB15/TB16 were affected by
> the XHCI overruns it produces?
> 
> --

The xHCI overrun workaround should only be applied on TB15/TB16, correct.

Can you double check the verbose information from lsusb for the r8153 device
on your WD15?

I just double-checked with my on-hand WD15 and an XPS 9380, and it's not
activating the quirk (bcdDevice was different).
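
For reference when comparing dumps: lsusb prints bcdDevice as two
binary-coded-decimal bytes, so a "30.11" in the verbose output corresponds to
a raw descriptor value of 0x3011. A minimal C sketch of that formatting
follows; format_bcd is a hypothetical helper name used only for illustration
and is not taken from the driver or from lsusb.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (illustration only): render a 16-bit BCD USB
 * version field such as bcdUSB or bcdDevice as "major.minor" by
 * printing the high byte and the low byte in hex.
 * Returns the number of characters snprintf() would have written. */
static int format_bcd(uint16_t bcd, char *buf, size_t len)
{
	return snprintf(buf, len, "%x.%02x",
			(unsigned)(bcd >> 8), (unsigned)(bcd & 0xff));
}
```

Calling format_bcd(0x3011, buf, sizeof(buf)) fills buf with "30.11", which is
the form to compare against the bcdDevice line of the lsusb output.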

If it's the same information as the TB16 (which it sounds like it is), Kai Heng
and I will check around internally to find out why they look the same.

I can think of a few possible explanations.
My first guess would be a comparison issue with the logic in 176eb614b.

Looking at that commit, I would ask about the behavior of !strcmp().
Would that match the less-than case as well as the zero case?
If so, it might need to be changed to strcmp() == 0.
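
On the !strcmp() question: in standard C, strcmp() returns zero only when the
two strings are equal, and logical negation turns zero into 1 and any nonzero
value (negative or positive) into 0. So !strcmp(a, b) and strcmp(a, b) == 0
should behave identically, and the less-than case should not match. A small
sketch to illustrate; the function names below are made up for this example
and are not the driver's code.

```c
#include <stdbool.h>
#include <string.h>

/* Two spellings of "the strings are equal". Both are true only when
 * strcmp() returns exactly 0; negative and positive results
 * (less-than and greater-than) both yield false. */
static bool serial_eq_negated(const char *a, const char *b)
{
	return !strcmp(a, b);
}

static bool serial_eq_explicit(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

If that holds, the comparison operator itself is an unlikely cause of a false
match, and the descriptor values being compared would be the next thing to
check.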

My second guess would be maybe newer ethernet NVM in manufacturing.
My third guess would be a manufacturing issue putting wrong NVM image on your 
WD15.


Re: r8152: data corruption in various scenarios

2019-01-07 Thread Mark Lord
On 2019-01-07 11:01 a.m., mario.limoncie...@dell.com wrote:
>
> TB16 contains ASMedia host controller.  It's a Thunderbolt dock and all USB
> devices are connected to ASMedia host controller in the dock.
> 
> WD15 does not contain an ASMedia host controller, it connected to system's
> USB host controller.


Thank-you, Mario.

So.. why are we enabling the r8153 (USB-ethernet) workaround on this WD15 dock?
The discussion back in 2017 was that only the TB15/TB16 were affected by
the XHCI overruns it produces?

-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


RE: r8152: data corruption in various scenarios

2019-01-07 Thread Mario.Limonciello
> -----Original Message-----
> From: Hayes Wang 
> Sent: Sunday, January 6, 2019 9:54 PM
> To: Mark Lord; Kai Heng Feng
> Cc: Ansis Atteka; David Miller; g...@kroah.com; rom...@fr.zoreil.com;
> net...@vger.kernel.org; nic_swsd; linux-kernel@vger.kernel.org; linux-
> u...@vger.kernel.org; Limonciello, Mario; Ryankao
> Subject: RE: r8152: data corruption in various scenarios
> 
> 
> [EXTERNAL EMAIL]
> 
> Monday, January 07, 2019 5:17 AM
> [...]
> >> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
> >> ("xhci: Bad Ethernet performance plugged in ASM1042A host").
> >>
> >>> I just got that exact message above, with the r8152 in my 1-day old WD15
> >>> dock, with the TB16 "workaround" enabled in Linux kernel 4.20.0.
> >>
> >> Is the xHC WD15 connected an ASMedia one?
> >
> > I don't know.  I *think* it identifies as a DSL6340 (see below).
> >
> 
> According to our record, it is relative to the asmedia.
> 

DSL6340 should be referring to the Alpine Ridge controller in the system.

The TB16 contains an ASMedia host controller.  It's a Thunderbolt dock, and all
USB devices are connected to the ASMedia host controller in the dock.

The WD15 does not contain an ASMedia host controller; it is connected to the
system's USB host controller.


Re: r8152: data corruption in various scenarios

2019-01-06 Thread Mark Lord
On 2019-01-07 1:46 a.m., Kai Heng Feng wrote:
>
> Do you happen to use a Dell system? We can do some test here.

Yes.  It is a Dell XPS 13 9360 i7-8550U notebook,
with the Dell WD15 USB-C dock.

-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


Re: r8152: data corruption in various scenarios

2019-01-06 Thread Kai Heng Feng



> On Jan 7, 2019, at 12:13, Mark Lord  wrote:
> 
> On 2019-01-06 11:09 p.m., Kai Heng Feng wrote:
>> 
>> 
>>> On Jan 7, 2019, at 05:16, Mark Lord  wrote:
>>> 
>>> On 2019-01-06 4:13 p.m., Mark Lord wrote:
>>>> On 2019-01-06 2:14 p.m., Kai Heng Feng wrote:
>>>>>> On Jan 5, 2019, at 10:14 PM, Mark Lord wrote:
>>>> ..
>>>>>> There is even now a special hack in the upstream r8152.c to attempt to detect
>>>>>> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such issues.
>>>>>>
>>>>>> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my dock
>>>>>> in its net:
>>>>>>
>>>>>>  [5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation
>>>>>
>>>>> The serial should be unique according to Dell.
>>>>>
>>>>>> So one issue is that the code is not correctly identifying the dock,
>>>>>> and the WD15 is claimed to be immune from the r8152 issues.
>>>>>
>>>>> The WD15 I tested didn't use that serial number though...
>>>>
>>>> What info do you need from me about the WD15 so this can be corrected?
>>>>
>>>>>> xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
>>>>>
>>>>> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
>>>>> ("xhci: Bad Ethernet performance plugged in ASM1042A host").
>>>>>
>>>>>> I just got that exact message above, with the r8152 in my 1-day old WD15
>>>>>> dock, with the TB16 "workaround" enabled in Linux kernel 4.20.0.
>>>>>
>>>>> Is the xHC WD15 connected an ASMedia one?
>>>>
>>>> I don't know.  I *think* it identifies as a DSL6340 (see below).
>>>>
>>>> Here is lspci and lsusb:
>>>>
>>>> $ lspci -vt
>>>> -[:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
>>>>   +-02.0  Intel Corporation UHD Graphics 620
>>>>   +-04.0  Intel Corporation Skylake Processor Thermal Subsystem
>>>>   +-14.0  Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
>>>>   +-14.2  Intel Corporation Sunrise Point-LP Thermal subsystem
>>>>   +-15.0  Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
>>>>   +-15.1  Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1
>>>>   +-16.0  Intel Corporation Sunrise Point-LP CSME HECI #1
>>>>   +-1c.0-[01-39]00.0-[02-39]--+-00.0-[03]--
>>>>   |   +-01.0-[04-38]--
>>>>   |   \-02.0-[39]00.0  Intel Corporation DSL6340 USB 3.1 Controller [Alpine Ridge]
>>>>   +-1c.4-[3a]00.0  Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
>>>>   +-1d.0-[3b]00.0  Samsung Electronics Co Ltd Device a808
>>>>   +-1f.0  Intel Corporation Device 9d4e
>>>>   +-1f.2  Intel Corporation Sunrise Point-LP PMC
>>>>   +-1f.3  Intel Corporation Sunrise Point-LP HD Audio
>>>>   \-1f.4  Intel Corporation Sunrise Point-LP SMBus
>>> 
>>> 
>>> Mmm.. lspci -vt isn't as verbose as I thought, so here is plain lspci to 
>>> fill in the blanks:
>>> 
>>> $ lspci
>>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core 
>>> Processor Host Bridge/DRAM
>>> Registers (rev 08)
>>> 00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 
>>> 07)
>>> 
>>> 00:04.0 Signal processing controller: Intel Corporation Skylake Processor 
>>> Thermal Subsystem (rev 08)
>>> 
>>> 00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI 
>>> Controller (rev 21)
>>> 
>>> 00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP 
>>> Thermal subsystem (rev 21)
>>> 
>>> 00:15.0 Signal processing controller: Intel Corporation Sunrise Point-LP 
>>> Serial IO I2C Controller #0
>>> (rev 21)
>>> 00:15.1 Signal processing controller: Intel Corporation Sunrise Point-LP 
>>> Serial IO I2C Controller #1
>>> (rev 21)
>>> 00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME 
>>> HECI #1 (rev 21)
>>> 
>>> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root 
>>> Port (rev f1)
>>> 
>>> 00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root 
>>> Port #5 (rev f1)
>>> 
>>> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root 
>>> Port #9 (rev f1)
>>> 
>>> 00:1f.0 ISA bridge: Intel Corporation Device 9d4e (rev 21)
>>> 
>>> 00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
>>> 
>>> 00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
>>> 
>>> 00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
>>> 
>>> 01:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
>>> Ridge 2C 2015]
>>> 
>>> 02:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
>>> Ridge 2C 2015]
>>> 
>>> 02:01.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge 

Re: r8152: data corruption in various scenarios

2019-01-06 Thread Mark Lord
On 2019-01-06 11:09 p.m., Kai Heng Feng wrote:
> 
> 
>> On Jan 7, 2019, at 05:16, Mark Lord  wrote:
>>
>> On 2019-01-06 4:13 p.m., Mark Lord wrote:
>>> On 2019-01-06 2:14 p.m., Kai Heng Feng wrote:
>>>>> On Jan 5, 2019, at 10:14 PM, Mark Lord wrote:
>>> ..
>>>>> There is even now a special hack in the upstream r8152.c to attempt to detect
>>>>> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such issues.
>>>>>
>>>>> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my dock
>>>>> in its net:
>>>>>
>>>>>   [5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation
>>>>
>>>> The serial should be unique according to Dell.
>>>>
>>>>> So one issue is that the code is not correctly identifying the dock,
>>>>> and the WD15 is claimed to be immune from the r8152 issues.
>>>>
>>>> The WD15 I tested didn't use that serial number though...
>>>
>>> What info do you need from me about the WD15 so this can be corrected?
>>>
>>>>>  xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
>>>>
>>>> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
>>>> ("xhci: Bad Ethernet performance plugged in ASM1042A host").
>>>>
>>>>> I just got that exact message above, with the r8152 in my 1-day old WD15
>>>>> dock, with the TB16 "workaround" enabled in Linux kernel 4.20.0.
>>>>
>>>> Is the xHC WD15 connected an ASMedia one?
>>>
>>> I don't know.  I *think* it identifies as a DSL6340 (see below).
>>>
>>> Here is lspci and lsusb:
>>>
>>> $ lspci -vt
>>> -[:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor 
>>> Host Bridge/DRAM Registers
>>>   +-02.0  Intel Corporation UHD Graphics 620
>>>   +-04.0  Intel Corporation Skylake Processor Thermal Subsystem
>>>   +-14.0  Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
>>>   +-14.2  Intel Corporation Sunrise Point-LP Thermal subsystem
>>>   +-15.0  Intel Corporation Sunrise Point-LP Serial IO I2C 
>>> Controller #0
>>>   +-15.1  Intel Corporation Sunrise Point-LP Serial IO I2C 
>>> Controller #1
>>>   +-16.0  Intel Corporation Sunrise Point-LP CSME HECI #1
>>>   +-1c.0-[01-39]00.0-[02-39]--+-00.0-[03]--
>>>   |   +-01.0-[04-38]--
>>>   |   \-02.0-[39]00.0  Intel 
>>> Corporation DSL6340 USB 3.1
>>> Controller [Alpine Ridge]
>>>   +-1c.4-[3a]00.0  Qualcomm Atheros QCA6174 802.11ac Wireless 
>>> Network Adapter
>>>   +-1d.0-[3b]00.0  Samsung Electronics Co Ltd Device a808
>>>   +-1f.0  Intel Corporation Device 9d4e
>>>   +-1f.2  Intel Corporation Sunrise Point-LP PMC
>>>   +-1f.3  Intel Corporation Sunrise Point-LP HD Audio
>>>   \-1f.4  Intel Corporation Sunrise Point-LP SMBus
>>
>>
>> Mmm.. lspci -vt isn't as verbose as I thought, so here is plain lspci to 
>> fill in the blanks:
>>
>> $ lspci
>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core 
>> Processor Host Bridge/DRAM
>> Registers (rev 08)
>> 00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 
>> 07)
>>
>> 00:04.0 Signal processing controller: Intel Corporation Skylake Processor 
>> Thermal Subsystem (rev 08)
>>
>> 00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI 
>> Controller (rev 21)
>>
>> 00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP 
>> Thermal subsystem (rev 21)
>>
>> 00:15.0 Signal processing controller: Intel Corporation Sunrise Point-LP 
>> Serial IO I2C Controller #0
>> (rev 21)
>> 00:15.1 Signal processing controller: Intel Corporation Sunrise Point-LP 
>> Serial IO I2C Controller #1
>> (rev 21)
>> 00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME 
>> HECI #1 (rev 21)
>>
>> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
>> (rev f1)
>>
>> 00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
>> #5 (rev f1)
>>
>> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
>> #9 (rev f1)
>>
>> 00:1f.0 ISA bridge: Intel Corporation Device 9d4e (rev 21)
>>
>> 00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
>>
>> 00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
>>
>> 00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
>>
>> 01:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
>> Ridge 2C 2015]
>>
>> 02:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
>> Ridge 2C 2015]
>>
>> 02:01.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
>> Ridge 2C 2015]
>>
>> 02:02.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
>> Ridge 2C 2015]
>>
>> 39:00.0 USB controller: Intel Corporation DSL6340 USB 3.1 

Re: r8152: data corruption in various scenarios

2019-01-06 Thread Kai Heng Feng



> On Jan 7, 2019, at 05:16, Mark Lord  wrote:
> 
> On 2019-01-06 4:13 p.m., Mark Lord wrote:
>> On 2019-01-06 2:14 p.m., Kai Heng Feng wrote:
>>>> On Jan 5, 2019, at 10:14 PM, Mark Lord wrote:
>> ..
>>>> There is even now a special hack in the upstream r8152.c to attempt to detect
>>>> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such issues.
>>>>
>>>> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my dock
>>>> in its net:
>>>>
>>>>   [5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation
>>>
>>> The serial should be unique according to Dell.
>>>
>>>> So one issue is that the code is not correctly identifying the dock,
>>>> and the WD15 is claimed to be immune from the r8152 issues.
>>>
>>> The WD15 I tested didn't use that serial number though...
>>
>> What info do you need from me about the WD15 so this can be corrected?
>>
>>>>  xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
>>>
>>> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
>>> ("xhci: Bad Ethernet performance plugged in ASM1042A host").
>>>
>>>> I just got that exact message above, with the r8152 in my 1-day old WD15
>>>> dock, with the TB16 "workaround" enabled in Linux kernel 4.20.0.
>>> 
>>> Is the xHC WD15 connected an ASMedia one?
>> 
>> I don't know.  I *think* it identifies as a DSL6340 (see below).
>> 
>> Here is lspci and lsusb:
>> 
>> $ lspci -vt
>> -[:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor 
>> Host Bridge/DRAM Registers
>>   +-02.0  Intel Corporation UHD Graphics 620
>>   +-04.0  Intel Corporation Skylake Processor Thermal Subsystem
>>   +-14.0  Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
>>   +-14.2  Intel Corporation Sunrise Point-LP Thermal subsystem
>>   +-15.0  Intel Corporation Sunrise Point-LP Serial IO I2C 
>> Controller #0
>>   +-15.1  Intel Corporation Sunrise Point-LP Serial IO I2C 
>> Controller #1
>>   +-16.0  Intel Corporation Sunrise Point-LP CSME HECI #1
>>   +-1c.0-[01-39]00.0-[02-39]--+-00.0-[03]--
>>   |   +-01.0-[04-38]--
>>   |   \-02.0-[39]00.0  Intel 
>> Corporation DSL6340 USB 3.1
>> Controller [Alpine Ridge]
>>   +-1c.4-[3a]00.0  Qualcomm Atheros QCA6174 802.11ac Wireless 
>> Network Adapter
>>   +-1d.0-[3b]00.0  Samsung Electronics Co Ltd Device a808
>>   +-1f.0  Intel Corporation Device 9d4e
>>   +-1f.2  Intel Corporation Sunrise Point-LP PMC
>>   +-1f.3  Intel Corporation Sunrise Point-LP HD Audio
>>   \-1f.4  Intel Corporation Sunrise Point-LP SMBus
> 
> 
> Mmm.. lspci -vt isn't as verbose as I thought, so here is plain lspci to fill 
> in the blanks:
> 
> $ lspci
> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor 
> Host Bridge/DRAM
> Registers (rev 08)
> 00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 07)
> 
> 00:04.0 Signal processing controller: Intel Corporation Skylake Processor 
> Thermal Subsystem (rev 08)
> 
> 00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI 
> Controller (rev 21)
> 
> 00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP 
> Thermal subsystem (rev 21)
> 
> 00:15.0 Signal processing controller: Intel Corporation Sunrise Point-LP 
> Serial IO I2C Controller #0
> (rev 21)
> 00:15.1 Signal processing controller: Intel Corporation Sunrise Point-LP 
> Serial IO I2C Controller #1
> (rev 21)
> 00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME 
> HECI #1 (rev 21)
> 
> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
> (rev f1)
> 
> 00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
> #5 (rev f1)
> 
> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
> #9 (rev f1)
> 
> 00:1f.0 ISA bridge: Intel Corporation Device 9d4e (rev 21)
> 
> 00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
> 
> 00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
> 
> 00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
> 
> 01:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
> Ridge 2C 2015]
> 
> 02:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
> Ridge 2C 2015]
> 
> 02:01.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
> Ridge 2C 2015]
> 
> 02:02.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine 
> Ridge 2C 2015]
> 
> 39:00.0 USB controller: Intel Corporation DSL6340 USB 3.1 Controller [Alpine 
> Ridge]

So it’s not an ASMedia one.

Before digging further, please make sure the system firmware (BIOS), 
Thunderbolt controller NVM and 

RE: r8152: data corruption in various scenarios

2019-01-06 Thread Hayes Wang
Monday, January 07, 2019 5:17 AM
[...]
>> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
>> ("xhci: Bad Ethernet performance plugged in ASM1042A host”). 
>>
>>> I just got that exact message above, with the r8152 in my 1-day old WD15 
>>> dock,
>>> with the TB16 "workaround" enabled in Linux kernel 4.20.0.
>>
>> Is the xHC WD15 connected an ASMedia one?
> 
> I don't know.  I *think* it identifies as a DSL6340 (see below).
> 

According to our records, it is related to ASMedia.

Best Regards,
Hayes




Re: r8152: data corruption in various scenarios

2019-01-06 Thread Mark Lord
On 2019-01-06 4:13 p.m., Mark Lord wrote:
> On 2019-01-06 2:14 p.m., Kai Heng Feng wrote:
>>> On Jan 5, 2019, at 10:14 PM, Mark Lord wrote:
> ..
>>> There is even now a special hack in the upstream r8152.c to attempt to 
>>> detect
>>> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such 
>>> issues.
>>>
>>> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my 
>>> dock
>>> in its net:
>>>
>>>[5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation
>>
>> The serial should be unique according to Dell.
>>
>>> So one issue is that the code is not correctly identifying the dock,
>>> and the WD15 is claimed to be immune from the r8152 issues.
>>
>> The WD15 I tested didn't use that serial number though...
> 
> What info do you need from me about the WD15 so this can be corrected?
> 
>>>   xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of 
>>> current TD ep_index 13
>>> comp_code 1
>>
>> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
>> ("xhci: Bad Ethernet performance plugged in ASM1042A host”). 
>>
>>> I just got that exact message above, with the r8152 in my 1-day old WD15 
>>> dock,
>>> with the TB16 "workaround" enabled in Linux kernel 4.20.0.
>>
>> Is the xHC WD15 connected an ASMedia one?
> 
> I don't know.  I *think* it identifies as a DSL6340 (see below).
> 
> Here is lspci and lsusb:
> 
> $ lspci -vt
> -[:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor 
> Host Bridge/DRAM Registers
>+-02.0  Intel Corporation UHD Graphics 620
>+-04.0  Intel Corporation Skylake Processor Thermal Subsystem
>+-14.0  Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
>+-14.2  Intel Corporation Sunrise Point-LP Thermal subsystem
>+-15.0  Intel Corporation Sunrise Point-LP Serial IO I2C 
> Controller #0
>+-15.1  Intel Corporation Sunrise Point-LP Serial IO I2C 
> Controller #1
>+-16.0  Intel Corporation Sunrise Point-LP CSME HECI #1
>+-1c.0-[01-39]00.0-[02-39]--+-00.0-[03]--
>|   +-01.0-[04-38]--
>|   \-02.0-[39]00.0  Intel 
> Corporation DSL6340 USB 3.1
> Controller [Alpine Ridge]
>+-1c.4-[3a]00.0  Qualcomm Atheros QCA6174 802.11ac Wireless 
> Network Adapter
>+-1d.0-[3b]00.0  Samsung Electronics Co Ltd Device a808
>+-1f.0  Intel Corporation Device 9d4e
>+-1f.2  Intel Corporation Sunrise Point-LP PMC
>+-1f.3  Intel Corporation Sunrise Point-LP HD Audio
>\-1f.4  Intel Corporation Sunrise Point-LP SMBus


Mmm.. lspci -vt isn't as verbose as I thought, so here is plain lspci to fill 
in the blanks:

$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 08)
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 07)
00:04.0 Signal processing controller: Intel Corporation Skylake Processor Thermal Subsystem (rev 08)
00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller (rev 21)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP Thermal subsystem (rev 21)
00:15.0 Signal processing controller: Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0 (rev 21)
00:15.1 Signal processing controller: Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1 (rev 21)
00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME HECI #1 (rev 21)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #9 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Device 9d4e (rev 21)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
01:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
02:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
02:01.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
02:02.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
39:00.0 USB controller: Intel Corporation DSL6340 USB 3.1 Controller [Alpine Ridge]
3a:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
3b:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a808


-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


Re: r8152: data corruption in various scenarios

2019-01-06 Thread Mark Lord
On 2019-01-06 2:14 p.m., Kai Heng Feng wrote:
>> On Jan 5, 2019, at 10:14 PM, Mark Lord wrote:
..
>> There is even now a special hack in the upstream r8152.c to attempt to detect
>> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such 
>> issues.
>>
>> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my 
>> dock
>> in its net:
>>
>>[5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation
> 
> The serial should be unique according to Dell.
>
>> So one issue is that the code is not correctly identifying the dock,
>> and the WD15 is claimed to be immune from the r8152 issues.
> 
> The WD15 I tested didn't use that serial number though...

What info do you need from me about the WD15 so this can be corrected?

>>   xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
>
> This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
> ("xhci: Bad Ethernet performance plugged in ASM1042A host").
> 
>> I just got that exact message above, with the r8152 in my 1-day old WD15 
>> dock,
>> with the TB16 "workaround" enabled in Linux kernel 4.20.0.
> 
> Is the xHC WD15 connected an ASMedia one?

I don't know.  I *think* it identifies as a DSL6340 (see below).

Here is lspci and lsusb:

$ lspci -vt
-[:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
   +-02.0  Intel Corporation UHD Graphics 620
   +-04.0  Intel Corporation Skylake Processor Thermal Subsystem
   +-14.0  Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
   +-14.2  Intel Corporation Sunrise Point-LP Thermal subsystem
   +-15.0  Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
   +-15.1  Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1
   +-16.0  Intel Corporation Sunrise Point-LP CSME HECI #1
   +-1c.0-[01-39]00.0-[02-39]--+-00.0-[03]--
   |   +-01.0-[04-38]--
   |   \-02.0-[39]00.0  Intel Corporation DSL6340 USB 3.1 Controller [Alpine Ridge]
   +-1c.4-[3a]00.0  Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
   +-1d.0-[3b]00.0  Samsung Electronics Co Ltd Device a808
   +-1f.0  Intel Corporation Device 9d4e
   +-1f.2  Intel Corporation Sunrise Point-LP PMC
   +-1f.3  Intel Corporation Sunrise Point-LP HD Audio
   \-1f.4  Intel Corporation Sunrise Point-LP SMBus
$ lsusb -t
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 1M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 5000M
        |__ Port 2: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 480M
        |__ Port 5: Dev 3, If 1, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 5: Dev 3, If 2, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 5: Dev 3, If 0, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 5: Dev 3, If 3, Class=Audio, Driver=snd-usb-audio, 480M
        |__ Port 6: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 6: Dev 4, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 6: Dev 4, If 2, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 7: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 3: Dev 2, If 0, Class=Wireless, Driver=btusb, 12M
    |__ Port 3: Dev 2, If 1, Class=Wireless, Driver=btusb, 12M

Thanks for having a look.
-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


Re: r8152: data corruption in various scenarios

2019-01-06 Thread Kai Heng Feng



> On Jan 5, 2019, at 10:14 PM, Mark Lord  wrote:
> 
> A couple of years back, I reported data corruption resulting from
> a change in kernel 3.16 which enabled hardware checksums in the r8152 driver.
> This was happening on an embedded system that was using a r8152 USB dongle.
> 
> At the time, it was very difficult to figure out what could possibly be 
> causing it,
> other than that re-enabling software checksums prevented corrupted packets 
> from
> resulting in more serious issues.
> 
> Since that time, more and more reports of similar corruption and issues
> have been trickling in.  Eg.
> 
>   https://lore.kernel.org/patchwork/patch/873920/
> 
> Note that there are reports in the thread above that the issues
> are not limited to only the built-in ethernet chip of the dock.
> 
> There is even now a special hack in the upstream r8152.c to attempt to detect
> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such 
> issues.
> 
> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my dock
> in its net:
> 
>[5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation

The serial should be unique according to Dell.

> 
> So one issue is that the code is not correctly identifying the dock,
> and the WD15 is claimed to be immune from the r8152 issues.

The WD15 I tested didn't use that serial number though...

> 
> One of the symptoms of the r8152 issue, reported by Ansis Atteka,
> were messages like this:
> 
>   xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1

This is probably an xHC bug. A similar issue is fixed by commit 9da5a1092b13
("xhci: Bad Ethernet performance plugged in ASM1042A host").

> 
> I just got that exact message above, with the r8152 in my 1-day old WD15 dock,
> with the TB16 "workaround" enabled in Linux kernel 4.20.0.

Is the xHC that the WD15 is connected to an ASMedia one?

Kai-Heng

> 
> From this I conclude that the workaround is not 100% complete yet.
> -- 
> Mark Lord
> Real-Time Remedies Inc.
> ml...@pobox.com



Re: r8152: data corruption in various scenarios

2019-01-05 Thread Mark Lord
On 2019-01-05 9:14 a.m., Mark Lord wrote:
> A couple of years back, I reported data corruption resulting from
> a change in kernel 3.16 which enabled hardware checksums in the r8152 driver.
> This was happening on an embedded system that was using a r8152 USB dongle.
> 
> At the time, it was very difficult to figure out what could possibly be 
> causing it,
> other than that re-enabling software checksums prevented corrupted packets 
> from
> resulting in more serious issues.
> 
> Since that time, more and more reports of similar corruption and issues
> have been trickling in.  Eg.
> 
>https://lore.kernel.org/patchwork/patch/873920/

Forgot to include this link (below) where people still have the issue
even with the driver workaround.  Switching to software checksums "fixes" it:

https://bugzilla.redhat.com/show_bug.cgi?id=1460789

> 
> Note that there are reports in the thread above that the issues
> are not limited to only the built-in ethernet chip of the dock.
> 
> There is even now a special hack in the upstream r8152.c to attempt to detect
> a Dell TB16 dock and disable RX Aggregation in the driver to prevent such 
> issues.
> 
> Well.. I have a WD15 dock, not a TB16, and that same hack also catches my dock
> in its net:
> 
> [5.794641] usb 4-1.2: Dell TB16 Dock, disable RX aggregation
> 
> So one issue is that the code is not correctly identifying the dock,
> and the WD15 is claimed to be immune from the r8152 issues.
> 
> One of the symptoms of the r8152 issue, reported by Ansis Atteka,
> were messages like this:
> 
>   xhci_hcd :39:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 13 comp_code 1
> 
> I just got that exact message above, with the r8152 in my 1-day old WD15 dock,
> with the TB16 "workaround" enabled in Linux kernel 4.20.0.
> 
> From this I conclude that the workaround is not 100% complete yet.
> 


-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com