On 6.12.2015. 15:56, Hrvoje Popovski wrote:
> On 6.12.2015. 5:00, David Gwynne wrote:
>> the current code for serialising if_start calls for mpsafe nics does what it
>> says.
>>
>> however, kettenis realised it doesnt help us much when we're trying
>> to coordinate between the start and txeof side of a driver when
>> setting or clearing oactive. in particular, a start routine can
>> figure out there's no more space, and then set oactive. txeof could
>> be running on another cpu emptying the ring and clearing it. if
>> that clear runs in between the other cpus space check and
>> ifq_set_oactive, then the nic will be marked full and the stack
>> wont ever call start on it again.
>>
>> so it can be argued that start and txeof should be serialised.
>> indeed, other platforms do exactly that.
>>
>> the least worst mechanism we have for doing that is taskqs. however,
>> all my experiments deferring start to a taskq end up significantly
>> hurting performance.
>>
>> dragonfly appears to have some of the semantics we want. according
>> to sephe, start and txeof are serialised, but they can be directly
>> called from anywhere. however, if one cpu is trying to run start
>> while the other is in txeof, it figures it out and makes the other
>> cpu run txeof on the first cpus behalf. the first cpu then simply
>> returns cos it knows the other cpu will end up doing the work.
>>
>> the implementation is tied very much to that specific situation,
>> and its hard for me to grok cos im not familiar with their locking
>> infrastructure.
>>
>> the dfly code has the (slight) caveat that you cant run txeof and
>> start concurrently, it forces them to be serialised.
>>
>> while toying with ideas on how to solve kettenis' oactive problem,
>> i came up with the following.
>>
>> it combines tasks with direct dispatch, and borrows the current
>> ifq_serialiser/pool/scsi serialisation algorithm.
>>
>> the idea is you have a taskctx, which represents a serialising
>> context for tasks. tasks are submitted to the taskctx, and the code
>> will try to run the tasks immediately rather than defer them to a
>> thread. if there is contention on the context, the contending cpu
>> yields after queueing the task because the other cpu is responsible
>> for running all pending tasks to completion.
>>
>> it also simplifies the barrier operations a lot.
>>
>> the diff below implements a generic taskctx framework, and cuts the
>> mpsafe if_start() implementation over to it.
>>
>> myx is also changed to only clr oactive from within the taskctx
>> serialiser, thereby avoiding the race, but keeps the bulk of txeof
>> outside the serialiser so it can run concurrently with start.
>>
>> other nics are free to serialise start and txeof within the
>> ifq_serializer if they want, or not, it is up to them.
>>
>> thoughts? tests? opinions on messy .h files?
>
>
> Hi,
>
> after applying this patches over cvs source from few hours (no
> additional patches for ix and em) it seems that something isn't right...
>
> freshly rebooted box, sending 2Mpps over ix (82599) and i'm getting
> around 50kpps on receiver... over x540 around 100kpps

With the latest patch, the 50kpps and 100kpps problem is gone.
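
For anyone following along without the diff at hand, here is how I read the taskctx idea quoted above, as a userland sketch. This is not the code from the diff: every name here (sk_task, sk_taskctx, sk_task_dispatch, the demo struct) is made up for illustration, and a pthread mutex stands in for whatever the kernel implementation really uses. It only shows the "run it now, or queue it and let the current runner do it" behaviour.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names throughout; not the API from the diff. */
struct sk_task {
	void		(*fn)(void *);
	void		 *arg;
	struct sk_task	 *next;
	int		  queued;	/* already on the list? */
};

struct sk_taskctx {
	pthread_mutex_t	  mtx;		/* stand-in for a kernel mutex */
	struct sk_task	 *head;
	struct sk_task	**tail;
	int		  running;	/* some caller is draining the list */
};

void
sk_taskctx_init(struct sk_taskctx *tc)
{
	pthread_mutex_init(&tc->mtx, NULL);
	tc->head = NULL;
	tc->tail = &tc->head;
	tc->running = 0;
}

void
sk_task_set(struct sk_task *t, void (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	t->fn = fn;
	t->arg = arg;
	t->next = NULL;
	t->queued = 0;
}

/*
 * Queue the task, then either become the runner and drain everything
 * that is pending, or yield because another caller is already the
 * runner.  Returns 1 if this caller did the draining, 0 if it yielded.
 */
int
sk_task_dispatch(struct sk_taskctx *tc, struct sk_task *t)
{
	pthread_mutex_lock(&tc->mtx);
	if (!t->queued) {
		t->queued = 1;
		t->next = NULL;
		*tc->tail = t;
		tc->tail = &t->next;
	}
	if (tc->running) {
		/* contention: the current runner will pick this up */
		pthread_mutex_unlock(&tc->mtx);
		return (0);
	}

	tc->running = 1;
	while ((t = tc->head) != NULL) {
		void (*fn)(void *) = t->fn;
		void *arg = t->arg;

		tc->head = t->next;
		if (tc->head == NULL)
			tc->tail = &tc->head;
		t->queued = 0;

		/* run the task without holding the mutex */
		pthread_mutex_unlock(&tc->mtx);
		fn(arg);
		pthread_mutex_lock(&tc->mtx);
	}
	tc->running = 0;
	pthread_mutex_unlock(&tc->mtx);
	return (1);
}

/* Demo: "restart" dispatches "start" from inside the serialiser. */
struct demo {
	struct sk_taskctx	tc;
	struct sk_task		start, restart;
	int			starts;
	int			nested;	/* result of the nested dispatch */
};

static void
demo_start(void *v)
{
	struct demo *d = v;
	d->starts++;
}

static void
demo_restart(void *v)
{
	struct demo *d = v;
	d->nested = sk_task_dispatch(&d->tc, &d->start);
}
```

Because tasks only ever run one at a time inside the context, a driver could do its "ring is full, set oactive" check in one task and its "clear oactive" in another, and the two could no longer interleave. That is my understanding of what the myx change relies on. The nested dispatch in demo_restart shows the yield path: it queues start and returns immediately, and the outer caller runs it afterwards.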