(moving the discussion back to the list)

Hi,

I am sorry, I didn't receive the patch we discussed in private,
so I ended up writing one myself, which does essentially the
same thing.

The patch is available at:
https://bugzilla.redhat.com/show_bug.cgi?id=746272#c13

It schedules work on a workqueue to flush the descriptors 500ms after
the first packet is sent. This ensures that there will be a write-back
with enough time to spare before the watchdog detects it as an old entry.

Time:  0 ms   ->  x ms -> y ms  ->...-> 500ms -> 
Pkts: pkt#1   -> pkt#2 -> pkt#3 ->...-> pkt#n ->pkt(n+1)
Event:schedule      ->       ->     ->   flush ->schedule
      workqueue                                  workqueue
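
As a rough illustration of the timeline above, here is a minimal
user-space sketch in C. It is not the patch from the bugzilla entry:
the names tx_sim, sim_xmit and sim_tick are hypothetical, and the clock
is passed in by hand. In the driver, the "arm only if not already
pending" behaviour would presumably come from something like
schedule_delayed_work(), which does nothing when the work is already
queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLUSH_DELAY_MS 500u /* the 500ms delay mentioned in the mail */

/* Hypothetical model of the per-adapter flush state. */
struct tx_sim {
	bool     flush_pending; /* delayed flush already scheduled? */
	uint64_t flush_due_ms;  /* time at which the flush should run */
	unsigned flushes;       /* number of flushes performed so far */
};

/* Called for every transmitted packet. Only the first packet after a
 * flush (or after start) arms the delayed flush; later packets are
 * ignored until that flush has run. */
void sim_xmit(struct tx_sim *s, uint64_t now_ms)
{
	assert(s != NULL);
	if (!s->flush_pending) {
		s->flush_pending = true;
		s->flush_due_ms = now_ms + FLUSH_DELAY_MS;
	}
}

/* Called as simulated time advances. Runs the flush once it is due
 * and returns true if a flush happened on this call. */
bool sim_tick(struct tx_sim *s, uint64_t now_ms)
{
	assert(s != NULL);
	if (s->flush_pending && now_ms >= s->flush_due_ms) {
		s->flush_pending = false; /* next packet re-arms the work */
		s->flushes++;             /* stands in for the FPD write */
		return true;
	}
	return false;
}
```

With packets at 0 ms and 100 ms the flush is due at 500 ms; a packet
at 510 ms then arms the next flush for 1010 ms.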
       
The customer reported that it works, so IMHO the root cause is confirmed:
there are not enough packets to cause the write-back, and writing to FPD
fixes it.

That patch will also flush every 500ms under high traffic, which
isn't good for performance, though each flush would cover at most
4 descriptors as far as I understand.

I like Michael's approach of letting the watchdog detect the hang first
and then trying to flush.  Michael told me that we could flush and use the
interrupt raised when the write-back ends to clean up.  I think that if
there is a real TX hang (i.e. no interrupt event), it will take another
watchdog cycle to detect it. That seems to me like too much time without
taking any action.

Maybe something like this would work:
1) the watchdog detects the hang
2) check for the FLAG2_DMA_BURST flag
3) if it is set, force a flush, set a bit flag in the TX ring and
   schedule the watchdog with a short period
4) if the TXDW interrupt happens, clean up and reset the bit flag
5) if not, the watchdog expires with the bit flag still set, so it
   assumes a real hang has occurred and takes action
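
The steps above can be sketched as a small state machine. This is a
user-space model only, not driver code: tx_ring_sim, wd_hang_detected,
wd_txdw_irq and the wd_action values are hypothetical names, dma_burst
stands in for FLAG2_DMA_BURST, and falling straight through to the
reset path when that flag is clear is an assumption (the steps do not
say what happens in that case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the watchdog should do after it sees a stale descriptor. */
enum wd_action {
	WD_FLUSH_AND_RECHECK, /* step 3: flush, re-run watchdog soon */
	WD_RESET              /* step 5: treat it as a real TX hang */
};

/* Hypothetical model of the relevant TX ring state. */
struct tx_ring_sim {
	bool dma_burst;    /* stands in for FLAG2_DMA_BURST */
	bool flush_forced; /* the "bit flag in the TX ring" of step 3 */
};

/* Steps 1-3 and 5: called when the watchdog detects a hang. */
enum wd_action wd_hang_detected(struct tx_ring_sim *r)
{
	assert(r != NULL);
	if (r->dma_burst && !r->flush_forced) {
		/* First detection: force a flush and remember that we did. */
		r->flush_forced = true;
		return WD_FLUSH_AND_RECHECK;
	}
	/* Either DMA burst is off (assumption: handle the hang as usual)
	 * or a flush was already forced and no TXDW arrived. The reset
	 * is assumed to reinitialise the ring, so the flag is cleared. */
	r->flush_forced = false;
	return WD_RESET;
}

/* Step 4: the TXDW interrupt arrived, so the write-back happened. */
void wd_txdw_irq(struct tx_ring_sim *r)
{
	assert(r != NULL);
	r->flush_forced = false; /* descriptor cleanup would go here */
}
```

In this model a spurious hang costs one short watchdog period, while a
real hang is acted on at the second expiry instead of waiting a full
extra watchdog cycle.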

thanks,
fbl

On Wed, 26 Oct 2011 17:27:04 +0800
Michael Wang <wang...@linux.vnet.ibm.com> wrote:

> Hi, Flavio, Jesse
> 
> I have sent out the patch, which I hope will be of some help.
> 
> Because this is my first time sending a patch, I am sorry if
> I have done anything silly.
> 
> And please tell me if there are any problems with it.
> 
> Thanks & Best regards,
> Michael Wang
> 
> _______________________________________________
> E1000-devel mailing list
> E1000-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel(R) Ethernet, visit
> http://communities.intel.com/community/wired

