On 07/08/2011 02:03 AM, Vladislav Bogdanov wrote:
>>> I checked the archives and found a patch from some time ago that was
>>> never merged. It wasn't verified to resolve the "pause timeout" problem
>>> but it could indeed solve the problem. It wasn't merged because we
>>> lacked verification it resolved the problem.
>>
>> Great, I'll try it in next few days, good news is that problem should be
>> easily reproducible.
>
> Hmm...
> Not so easily...
>
> I applied that patch to all physical hosts, and do not see that message
> any more for two days, independently of number of RX buffers in adapter.
>
> But, I do not see it if I downgrade to previous image (without that
> patch) :( Although I did not test it again for a long time, only several
> hours.
>
> I didn't apply patch to VM, and do not see that message either.
> What I did also:
> * Rescheduled VM to higher CPU priority (actually real-time)
> * Assigned higher blkio priority to that VM
> * Assigned low blkio priority to bulk resources on node where that VM runs.
> So, original problem seems to have different causes for bare-metal and
> VM cases.
>
> For former case patch seems to be helpful.
> It should help for VM case too.
>
> There were lots of '[TOTEM ] Retransmit List:' messages on bare-metal
> hosts until I returned eth RX ring size back to 256 buffers (from 4096).
> After some thinking, this is probably correct, because more buffers add
> some latency, which is bad for corosync. Not sure why that may affect
> NAPI polling rate, though.
>
> I'll try to upgrade igb driver (newer version has tuning param
> InterruptThrottleRate) and play again with ring buffers and that rate.
>
> Again, that driver version I currently have may have some bugs when
> operating with big buffer rings which lead to 500ms blocking under high
> load.
>
> BTW, are those Retransmit List: messages harmful?
>

These are only warning messages. They result in a duplicate message being
retransmitted when it may not have needed to be. We are working to sort
out how to remove these on some hardware environments.

Regards
-steve

>
> Best,
> Vladislav

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
