Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang

Joe Jin Tue, 18 Dec 2012 22:14:27 -0800

Hi Yijing,

Thanks for your reference, the patch looks good for me, but I have no chance
to test it on customer's env.


Best Regards,
Joe

On 12/19/12 13:52, Yijing Wang wrote:
> On 2012/12/19 11:04, Joe Jin wrote:
>> Hi all,
>>
>> I backported mps commits and ask customer pass "pci=pcie_bus_peer2pee" to 
>> kernel
>> to limited MPS to 128 and issue disappeared, sound like this is a BIOS bug.
>>
> 
> Hi Joe,
>    I found similar problem when I do pci hotplug, discussion is 
> here:http://marc.info/?l=linux-pci&m=134810569924220&w=2.
> We try to improve Linux kernel to debug this problem easily based Bjorn's 
> suggestion. Jon sent out the first version patch 
> http://marc.info/?l=linux-pci&m=135002016005274&w=2.
> I think we can do further here, 
> http://marc.info/?l=linux-pci&m=135115581307869&w=2. I hope this information 
> can help you.
> 
> Thanks!
> Yijing.
> 
>> Thanks all of your help.
>>
>> Best Regards,
>> Joe
>>
>> On 11/29/12 23:52, Fujinaka, Todd wrote:
>>> Someone else pointed this out to me locally. If you have a non-client BIOS, 
>>> you should be able to set the MaxPayloadSize using setpci. You have to make 
>>> sure that you're being consistent throughout all the associated links.
>>>
>>> Todd Fujinaka
>>> Technical Marketing Engineer
>>> LAN Access Division (LAD)
>>> Intel Corporation
>>> [email protected]
>>> (503) 712-4565
>>>
>>>
>>> -----Original Message-----
>>> From: Ethan Zhao [mailto:[email protected]] 
>>> Sent: Wednesday, November 28, 2012 7:10 PM
>>> To: Fujinaka, Todd
>>> Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; [email protected]; 
>>> [email protected]; [email protected]; linux-pci
>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>>
>>> Joe,
>>>     Possibly your customer is running a kernel without source code on a 
>>> platform whose vendor wouldn't like to fix BIOS issue( Is that a HP/Dell 
>>> server ?).
>>>     Anyway, to see if is a payload issue or,  you could change the payload 
>>> size with setpci tool to those devices and set the link retrain bit to 
>>> trigger the link retraining to debug the issue and identity the root cause. 
>>>  I thinks it is much easier than modify the BIOS or  eeprom of NIC.
>>>
>>>     e.g.
>>>    set device control register to 0f 00   (128 bytes payload size)
>>>    #   setpci -v -s 00:02.0 98.w=000f
>>>    set device link control register to 60h (retrain the link)
>>>    #  setpci -v -s 00:02.0 a0.b=60
>>>
>>>   Hope it works,  Just my 2 cents.
>>>
>>> [email protected]
>>>
>>> On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <[email protected]> 
>>> wrote:
>>>> The only EEPROM I know about or can speak to is the one attached to the 
>>>> 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS.
>>>>
>>>> Todd Fujinaka
>>>> Technical Marketing Engineer
>>>> LAN Access Division (LAD)
>>>> Intel Corporation
>>>> [email protected]
>>>> (503) 712-4565
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Joe Jin [mailto:[email protected]]
>>>> Sent: Wednesday, November 28, 2012 12:31 AM
>>>> To: Ben Hutchings
>>>> Cc: Fujinaka, Todd; Mary Mcgrath; [email protected]; 
>>>> [email protected]; [email protected]; linux-pci
>>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>>>
>>>> On 11/28/12 02:10, Ben Hutchings wrote:
>>>>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote:
>>>>>> Forgive me if I'm being too repetitious as I think some of this has 
>>>>>> been mentioned in the past.
>>>>>>
>>>>>> We (and by we I mean the Ethernet part and driver) can only change 
>>>>>> the advertised availability of a larger MaxPayloadSize. The size is 
>>>>>> negotiated by both sides of the link when the link is established.
>>>>>> The driver should not change the size of the link as it would be 
>>>>>> poking at registers outside of its scope and is controlled by the 
>>>>>> upstream bridge (not us).
>>>>> [...]
>>>>>
>>>>> MaxPayloadSize (MPS) is not negotiated between devices but is 
>>>>> programmed by the system firmware (at least for devices present at 
>>>>> boot - the kernel may be responsible in case of hotplug).  You can 
>>>>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several 
>>>>> others) to set a policy that overrides this, but no policy will allow 
>>>>> setting MPS above the device's MaxPayloadSizeSupported (MPSS).
>>>>>
>>>>
>>>> Ben,
>>>>
>>>> Unfortunately I'm using 3.0.x kernel and this is not included in the 
>>>> kernel.
>>>> So I'm trying to use ethtool modify it from eeprom to see if help or no.
>>>>
>>>>
>>>> Todd, I'll review all MaxPayload for all devices, but need to say if it 
>>>> mismatch, customer could not modify it from BIOS for there was not entry 
>>>> at there, to test it, we have to find how to verify if this is the root 
>>>> cause, so still need to find the offset in eeprom.
>>>>
>>>> Thanks in advance,
>>>> Joe
>>>>
>>
>>
> 
> 


-- 
Oracle <http://www.oracle.com>
Joe Jin | Software Development Senior Manager | +8610.6106.5624
ORACLE | Linux and Virtualization
No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang

Reply via email to