Re: interface queue transmit mitigation (again)
On 28.3.2018. 11:42, Hrvoje Popovski wrote:
> On 28.3.2018. 3:28, David Gwynne wrote:
>> On Thu, Mar 15, 2018 at 03:25:46PM +0100, Martin Pieuchot wrote:
>>> On 14/03/18(Wed) 13:00, David Gwynne wrote:
>>>> this adds transmit mitigation back to the tree.
>>>>
>>>> it is basically the same diff as last time. the big difference this
>>>> time is that all the tunnel drivers all defer ip_output calls, which
>>>> avoids having to play games with NET_LOCK in the ifq transmit paths.
>>>
>>> Comments inline.
>>>
>>>> +	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {
>>>
>>> Why 4? DragonFly recently bumped `ifsq_stage_cntmax' to 16. Did you
>>> try other values? They also have an XXX comment that this value should
>>> be per-interface. Why?
>>
>> their default was 4, and they'd done some research on it. if they
>> moved to 16 there would be a reason for it.
>
> Hi all,
>
> with this diff i'm getting 1.43Mpps on
> 12 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz, 2400.01 MHz
>
> with plain kernel i'm getting 1.12Mpps and with older diffs i was
> getting 1.31Mpps ...
>
> if i execute ifconfig ix0 down while generating traffic over it,
> everything stops

I've tested this diff with today's tree and i can't repeat the problem
with or without the -fretpoline diff.
Re: interface queue transmit mitigation (again)
On 28.3.2018. 11:42, Hrvoje Popovski wrote:
> Hi all,
>
> with this diff i'm getting 1.43Mpps on
> 12 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz, 2400.01 MHz
>
> with plain kernel i'm getting 1.12Mpps and with older diffs i was
> getting 1.31Mpps ...

On a box with 6 x Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz, 3600.46 MHz,
with or without this diff i'm getting 1.75 Mpps, and i can't trigger the
freeze although i left ifconfig down/up running over the weekend and it's
still going.

Forwarding performance also seems more stable with this diff than without
it, meaning there aren't the oscillations in the forwarding rate seen
before.
Re: interface queue transmit mitigation (again)
On 28.3.2018. 3:28, David Gwynne wrote:
> On Thu, Mar 15, 2018 at 03:25:46PM +0100, Martin Pieuchot wrote:
>> On 14/03/18(Wed) 13:00, David Gwynne wrote:
>>> this adds transmit mitigation back to the tree.
>>>
>>> it is basically the same diff as last time. the big difference this
>>> time is that all the tunnel drivers all defer ip_output calls, which
>>> avoids having to play games with NET_LOCK in the ifq transmit paths.
>>
>> Comments inline.
>>
>>> +	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {
>>
>> Why 4? DragonFly recently bumped `ifsq_stage_cntmax' to 16. Did you
>> try other values? They also have an XXX comment that this value should
>> be per-interface. Why?
>
> their default was 4, and they'd done some research on it. if they
> moved to 16 there would be a reason for it.

Hi all,

with this diff i'm getting 1.43Mpps on
12 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz, 2400.01 MHz

with plain kernel i'm getting 1.12Mpps and with older diffs i was
getting 1.31Mpps ...

if i execute ifconfig ix0 down while generating traffic over it,
everything stops

x3550m4# ifconfig ix0 down
^C

from ddb:

ddb{0}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT      COMMAND
   287   54967   4723      0  3         0x3  tqbar     ifconfig
  4723  369240      1      0  3    0x10008b  pause     ksh
 78652   12963      1      0  3    0x100083  ttyin     getty
 77445  380566      1      0  3    0x100083  ttyin     getty
  7708   16964      1      0  3    0x100083  ttyin     getty
  6466  480278      1      0  3    0x100083  ttyin     getty
 10683  361200      1      0  3    0x100083  ttyin     ksh
33222446              1      0  3    0x100098  poll      cron
81878185              1     99  3    0x100090  poll      sndiod
 38828  292448   1110         3    0x100090  poll      sndiod
 96602  349758  17921     95  3    0x100092  kqread    smtpd
 59159  244097  17921    103  3    0x100092  kqread    smtpd
 72520  357622  17921     95  3    0x100092  kqread    smtpd
 11196  442980  17921     95  3    0x100092  kqread    smtpd
 19374  184986  17921     95  3    0x100092  kqread    smtpd
 99758  239851  17921     95  3    0x100092  kqread    smtpd
 17921  468052      1      0  3    0x100080  kqread    smtpd
 96705  480305      1      0  3        0x80  select    sshd
 10784  513637  74642     83  3    0x100092  poll      ntpd
 74642  349129  19763     83  3    0x100092  poll      ntpd
 19763  118713      1      0  3    0x100080  poll      ntpd
 80820  281787  61744     73  3    0x100090  kqread    syslogd
 61744  358228      1      0  3    0x100082  netio     syslogd
 92438  103007  37416    115  3    0x100092  kqread    slaacd
 27600  284603  37416    115  3    0x100092  kqread    slaacd
 37416  302849      1      0  3        0x80  kqread    slaacd
 31025  461354      0      0  3     0x14200  pgzero    zerothread
 87259  136299      0      0  3     0x14200  aiodoned  aiodoned
 40842   63462      0      0  3     0x14200  syncer    update
        434254      0      0  3     0x14200  cleaner   cleaner
 20219  234861      0      0  3     0x14200  reaper    reaper
 49370  510675      0      0  3     0x14200  pgdaemon  pagedaemon
 34415  210813      0      0  3     0x14200  bored     crynlk
 98085  523911      0      0  3     0x14200  bored     crypto
 88352  390863      0      0  3     0x14200  usbtsk    usbtask
 36743   62252      0      0  3     0x14200  usbatsk   usbatsk
 38389  456819      0      0  3  0x40014200  acpi0     acpi0
 93162   81423      0      0  7  0x40014200            idle11
 58166   30480      0      0  7  0x40014200            idle10
 55398  115831      0      0  7  0x40014200            idle9
    96   42736      0      0  7  0x40014200            idle8
 63465  183206      0      0  7  0x40014200            idle7
 79804  340505      0      0  7  0x40014200            idle6
 42987  392463      0      0  7  0x40014200            idle5
 94478  284414      0      0  7  0x40014200            idle4
 45914   13592      0      0  7  0x40014200            idle3
 14508  513956      0      0  7  0x40014200            idle2
 16424  111283      0      0  7  0x40014200            idle1
 60252  298958      0      0  3     0x14200  bored     sensors
 80523  235128      0      0  3     0x14200  tqbar     softnet
 40231  252516      0      0  3     0x14200  bored     systqmp
 97996  128366      0      0  3     0x14200  bored     systq
 78639  112346      0      0  3  0x40014200  bored     softclock
*60124   95946      0      0  7  0x40014200            idle0
     1  232514      0      0  3        0x82  wait      init
     0       0     -1      0  3     0x10200  scheduler swapper
ddb{0}> tr /p 0t54967
sleep_finish(800023d3c528,81a3863c) at sleep_finish+0x70
Re: interface queue transmit mitigation (again)
On 28/03/18(Wed) 11:28, David Gwynne wrote:
> On Thu, Mar 15, 2018 at 03:25:46PM +0100, Martin Pieuchot wrote:
> > On 14/03/18(Wed) 13:00, David Gwynne wrote:
> > > this adds transmit mitigation back to the tree.
> > >
> > > it is basically the same diff as last time. the big difference this
> > > time is that all the tunnel drivers all defer ip_output calls, which
> > > avoids having to play games with NET_LOCK in the ifq transmit paths.
> >
> > Comments inline.
> >
> > > +	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {
> >
> > Why 4? DragonFly recently bumped `ifsq_stage_cntmax' to 16. Did you
> > try other values? They also have an XXX comment that this value should
> > be per-interface. Why?
>
> their default was 4, and they'd done some research on it. if they
> moved to 16 there would be a reason for it.
>
> > In any case I'd be happier with a define.
>
> i've taken your advice on board and made it per interface, defaulted
> to 16 with a macro. after this i can add ioctls to report it (under
> hwfeatures maybe) and change it with ifconfig.
>
> > >  ifq_barrier(struct ifqueue *ifq)
> > >  {
> > >  	struct cond c = COND_INITIALIZER();
> > >  	struct task t = TASK_INITIALIZER(ifq_barrier_task, &c);
> > >
> > > +	NET_ASSERT_UNLOCKED();
> >
> > Since you assert here...
> >
> > > +
> > > +	if (!task_del(ifq->ifq_softnet, &ifq->ifq_bundle)) {
> > > +		int netlocked = (rw_status(&netlock) == RW_WRITE);
> >                                  ^
> > You can remove this code.
>
> i can't get rid of the assert :(
>
> ifq_barrier is called from lots of places, eg, ifq_destroy and
> ix_down. the former does not hold the net lock, but the latter does.
> putting manual NET_UNLOCK calls around places like the latter is
> too much work imo.

You're right, getting rid of the NET_LOCK() is too much work. That's
why I'm being rude for such a pattern not to spread. I'd prefer an
assert and to move the dance into the only place that needs it.

Anyway `if_sndmit' should be [d] since it's a driver variable, not a
stack one.
Apart from that ok mpi@

> Index: if.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.548
> diff -u -p -r1.548 if.c
> --- if.c	2 Mar 2018 15:52:11 -0000	1.548
> +++ if.c	28 Mar 2018 01:28:03 -0000
> @@ -607,6 +607,7 @@ if_attach_common(struct ifnet *ifp)
>  	ifp->if_snd.ifq_ifqs[0] = &ifp->if_snd;
>  	ifp->if_ifqs = ifp->if_snd.ifq_ifqs;
>  	ifp->if_nifqs = 1;
> +	ifp->if_sndmit = IF_SNDMIT_DEFAULT;
>  
>  	ifiq_init(&ifp->if_rcv, ifp, 0);
>  
> Index: if_var.h
> ===================================================================
> RCS file: /cvs/src/sys/net/if_var.h,v
> retrieving revision 1.89
> diff -u -p -r1.89 if_var.h
> --- if_var.h	10 Jan 2018 23:50:39 -0000	1.89
> +++ if_var.h	28 Mar 2018 01:28:03 -0000
> @@ -170,6 +170,7 @@ struct ifnet {				/* and the entries */
>  	struct ifqueue **if_ifqs;	/* [I] pointer to an array of sndqs */
>  	void	(*if_qstart)(struct ifqueue *);
>  	unsigned int if_nifqs;		/* [I] number of output queues */
> +	unsigned int if_sndmit;		/* [c] tx mitigation amount */
>  
>  	struct ifiqueue if_rcv;		/* rx/input queue */
>  	struct ifiqueue **if_iqs;	/* [I] pointer to the array of iqs */
> @@ -279,6 +280,9 @@ do {						\
>  #define IFQ_LEN(ifq)		ifq_len(ifq)
>  #define IFQ_IS_EMPTY(ifq)	ifq_empty(ifq)
>  #define IFQ_SET_MAXLEN(ifq, len)	ifq_set_maxlen(ifq, len)
> +
> +#define IF_SNDMIT_MIN		1
> +#define IF_SNDMIT_DEFAULT	16
>  
>  /* default interface priorities */
>  #define IF_WIRED_DEFAULT_PRIORITY	0
> Index: ifq.c
> ===================================================================
> RCS file: /cvs/src/sys/net/ifq.c,v
> retrieving revision 1.22
> diff -u -p -r1.22 ifq.c
> --- ifq.c	25 Jan 2018 14:04:36 -0000	1.22
> +++ ifq.c	28 Mar 2018 01:28:03 -0000
> @@ -70,9 +70,16 @@ struct priq {
>  void	ifq_start_task(void *);
>  void	ifq_restart_task(void *);
>  void	ifq_barrier_task(void *);
> +void	ifq_bundle_task(void *);
>  
>  #define TASK_ONQUEUE 0x1
>  
> +static inline void
> +ifq_run_start(struct ifqueue *ifq)
> +{
> +	ifq_serialize(ifq, &ifq->ifq_start);
> +}
> +
>  void
>  ifq_serialize(struct ifqueue *ifq, struct task *t)
>  {
> @@ -114,6 +121,16 @@ ifq_is_serialized(struct ifqueue *ifq)
>  }
>  
>  void
> +ifq_start(struct ifqueue *ifq)
> +{
> +	if (ifq_len(ifq) >= min(ifq->ifq_if->if_sndmit, ifq->ifq_maxlen)) {
> +		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
> +		ifq_run_start(ifq);
> +	} else
> +		task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
> +}
> +
> +void
>  ifq_start_task(void *p)
>  {
>  	struct ifqueue *ifq = p;
> @@ -137,11 +154,31 @@ ifq_restart_task(void *p)
>  }
>  
>  void
> +ifq_bundle_task(void *p)
> +{
> +	struct ifqueue *ifq = p;
> +
> +	ifq_run_start(ifq);
>
Re: interface queue transmit mitigation (again)
On Tue, Mar 27, 2018 at 9:30 PM David Gwynne wrote:
> On Thu, Mar 15, 2018 at 03:25:46PM +0100, Martin Pieuchot wrote:
> > On 14/03/18(Wed) 13:00, David Gwynne wrote:
> > > this adds transmit mitigation back to the tree.
> > >
> > > it is basically the same diff as last time. the big difference this
> > > time is that all the tunnel drivers all defer ip_output calls, which
> > > avoids having to play games with NET_LOCK in the ifq transmit paths.
> >
> > Comments inline.
> >
> > > +	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {
> >
> > Why 4? DragonFly recently bumped `ifsq_stage_cntmax' to 16. Did you
> > try other values? They also have an XXX comment that this value should
> > be per-interface. Why?
>
> their default was 4, and they'd done some research on it. if they
> moved to 16 there would be a reason for it.

Would it be this commit?

https://marc.info/?l=dragonfly-commits&m=151401707632544&w=2

Comments include test data.

Michael
Re: interface queue transmit mitigation (again)
On Thu, Mar 15, 2018 at 03:25:46PM +0100, Martin Pieuchot wrote:
> On 14/03/18(Wed) 13:00, David Gwynne wrote:
> > this adds transmit mitigation back to the tree.
> >
> > it is basically the same diff as last time. the big difference this
> > time is that all the tunnel drivers all defer ip_output calls, which
> > avoids having to play games with NET_LOCK in the ifq transmit paths.
>
> Comments inline.
>
> > +	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {
>
> Why 4? DragonFly recently bumped `ifsq_stage_cntmax' to 16. Did you
> try other values? They also have an XXX comment that this value should
> be per-interface. Why?

their default was 4, and they'd done some research on it. if they
moved to 16 there would be a reason for it.

> In any case I'd be happier with a define.

i've taken your advice on board and made it per interface, defaulted
to 16 with a macro. after this i can add ioctls to report it (under
hwfeatures maybe) and change it with ifconfig.

> >  ifq_barrier(struct ifqueue *ifq)
> >  {
> >  	struct cond c = COND_INITIALIZER();
> >  	struct task t = TASK_INITIALIZER(ifq_barrier_task, &c);
> >
> > +	NET_ASSERT_UNLOCKED();
>
> Since you assert here...
>
> > +
> > +	if (!task_del(ifq->ifq_softnet, &ifq->ifq_bundle)) {
> > +		int netlocked = (rw_status(&netlock) == RW_WRITE);
>                                  ^
> You can remove this code.

i can't get rid of the assert :(

ifq_barrier is called from lots of places, eg, ifq_destroy and
ix_down. the former does not hold the net lock, but the latter does.
putting manual NET_UNLOCK calls around places like the latter is
too much work imo.
Index: if.c
===================================================================
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.548
diff -u -p -r1.548 if.c
--- if.c	2 Mar 2018 15:52:11 -0000	1.548
+++ if.c	28 Mar 2018 01:28:03 -0000
@@ -607,6 +607,7 @@ if_attach_common(struct ifnet *ifp)
 	ifp->if_snd.ifq_ifqs[0] = &ifp->if_snd;
 	ifp->if_ifqs = ifp->if_snd.ifq_ifqs;
 	ifp->if_nifqs = 1;
+	ifp->if_sndmit = IF_SNDMIT_DEFAULT;
 
 	ifiq_init(&ifp->if_rcv, ifp, 0);
 
Index: if_var.h
===================================================================
RCS file: /cvs/src/sys/net/if_var.h,v
retrieving revision 1.89
diff -u -p -r1.89 if_var.h
--- if_var.h	10 Jan 2018 23:50:39 -0000	1.89
+++ if_var.h	28 Mar 2018 01:28:03 -0000
@@ -170,6 +170,7 @@ struct ifnet {				/* and the entries */
 	struct ifqueue **if_ifqs;	/* [I] pointer to an array of sndqs */
 	void	(*if_qstart)(struct ifqueue *);
 	unsigned int if_nifqs;		/* [I] number of output queues */
+	unsigned int if_sndmit;		/* [c] tx mitigation amount */
 
 	struct ifiqueue if_rcv;		/* rx/input queue */
 	struct ifiqueue **if_iqs;	/* [I] pointer to the array of iqs */
@@ -279,6 +280,9 @@ do {						\
 #define IFQ_LEN(ifq)		ifq_len(ifq)
 #define IFQ_IS_EMPTY(ifq)	ifq_empty(ifq)
 #define IFQ_SET_MAXLEN(ifq, len)	ifq_set_maxlen(ifq, len)
+
+#define IF_SNDMIT_MIN		1
+#define IF_SNDMIT_DEFAULT	16
 
 /* default interface priorities */
 #define IF_WIRED_DEFAULT_PRIORITY	0
Index: ifq.c
===================================================================
RCS file: /cvs/src/sys/net/ifq.c,v
retrieving revision 1.22
diff -u -p -r1.22 ifq.c
--- ifq.c	25 Jan 2018 14:04:36 -0000	1.22
+++ ifq.c	28 Mar 2018 01:28:03 -0000
@@ -70,9 +70,16 @@ struct priq {
 void	ifq_start_task(void *);
 void	ifq_restart_task(void *);
 void	ifq_barrier_task(void *);
+void	ifq_bundle_task(void *);
 
 #define TASK_ONQUEUE 0x1
 
+static inline void
+ifq_run_start(struct ifqueue *ifq)
+{
+	ifq_serialize(ifq, &ifq->ifq_start);
+}
+
 void
 ifq_serialize(struct ifqueue *ifq, struct task *t)
 {
@@ -114,6 +121,16 @@ ifq_is_serialized(struct ifqueue *ifq)
 }
 
 void
+ifq_start(struct ifqueue *ifq)
+{
+	if (ifq_len(ifq) >= min(ifq->ifq_if->if_sndmit, ifq->ifq_maxlen)) {
+		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
+		ifq_run_start(ifq);
+	} else
+		task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
+}
+
+void
 ifq_start_task(void *p)
 {
 	struct ifqueue *ifq = p;
@@ -137,11 +154,31 @@ ifq_restart_task(void *p)
 }
 
 void
+ifq_bundle_task(void *p)
+{
+	struct ifqueue *ifq = p;
+
+	ifq_run_start(ifq);
+}
+
+void
 ifq_barrier(struct ifqueue *ifq)
 {
 	struct cond c = COND_INITIALIZER();
 	struct task t = TASK_INITIALIZER(ifq_barrier_task, &c);
 
+	if (!task_del(ifq->ifq_softnet, &ifq->ifq_bundle)) {
+		int netlocked = (rw_status(&netlock) == RW_WRITE);
+
+		if (netlocked) /* XXXSMP breaks atomicity */
+			NET_UNLOCK();
+
+		taskq_barrier(ifq->ifq_softnet);
Re: interface queue transmit mitigation (again)
On 14/03/18(Wed) 13:00, David Gwynne wrote:
> this adds transmit mitigation back to the tree.
>
> it is basically the same diff as last time. the big difference this
> time is that all the tunnel drivers all defer ip_output calls, which
> avoids having to play games with NET_LOCK in the ifq transmit paths.

Comments inline.

> Index: ifq.c
> ===================================================================
> RCS file: /cvs/src/sys/net/ifq.c,v
> retrieving revision 1.22
> diff -u -p -r1.22 ifq.c
> --- ifq.c	25 Jan 2018 14:04:36 -0000	1.22
> +++ ifq.c	14 Mar 2018 02:58:13 -0000
> @@ -70,9 +70,16 @@ struct priq {
>  void	ifq_start_task(void *);
>  void	ifq_restart_task(void *);
>  void	ifq_barrier_task(void *);
> +void	ifq_bundle_task(void *);
>  
>  #define TASK_ONQUEUE 0x1
>  
> +static inline void
> +ifq_run_start(struct ifqueue *ifq)
> +{
> +	ifq_serialize(ifq, &ifq->ifq_start);
> +}
> +
>  void
>  ifq_serialize(struct ifqueue *ifq, struct task *t)
>  {
> @@ -114,6 +121,16 @@ ifq_is_serialized(struct ifqueue *ifq)
>  }
>  
>  void
> +ifq_start(struct ifqueue *ifq)
> +{
> +	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {

Why 4? DragonFly recently bumped `ifsq_stage_cntmax' to 16. Did you
try other values? They also have an XXX comment that this value should
be per-interface. Why?

In any case I'd be happier with a define.

> +		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
> +		ifq_run_start(ifq);
> +	} else
> +		task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
> +}
> +
> +void
>  ifq_start_task(void *p)
>  {
>  	struct ifqueue *ifq = p;
> @@ -137,11 +154,33 @@ ifq_restart_task(void *p)
>  }
>  
>  void
> +ifq_bundle_task(void *p)
> +{
> +	struct ifqueue *ifq = p;
> +
> +	ifq_run_start(ifq);
> +}
> +
> +void
>  ifq_barrier(struct ifqueue *ifq)
>  {
>  	struct cond c = COND_INITIALIZER();
>  	struct task t = TASK_INITIALIZER(ifq_barrier_task, &c);
>  
> +	NET_ASSERT_UNLOCKED();

Since you assert here...

> +
> +	if (!task_del(ifq->ifq_softnet, &ifq->ifq_bundle)) {
> +		int netlocked = (rw_status(&netlock) == RW_WRITE);
                                 ^
You can remove this code.

> +
> +		if (netlocked) /* XXXSMP breaks atomicity */
> +			NET_UNLOCK();
> +
> +		taskq_barrier(ifq->ifq_softnet);
> +
> +		if (netlocked)
> +			NET_LOCK();
> +	}
> +
>  	if (ifq->ifq_serializer == NULL)
>  		return;
>  
> @@ -166,6 +205,7 @@ void
>  ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx)
>  {
>  	ifq->ifq_if = ifp;
> +	ifq->ifq_softnet = net_tq(ifp->if_index);
>  	ifq->ifq_softc = NULL;
>  
>  	mtx_init(&ifq->ifq_mtx, IPL_NET);
> @@ -187,6 +227,7 @@ ifq_init(struct ifqueue *ifq, struct ifn
>  	mtx_init(&ifq->ifq_task_mtx, IPL_NET);
>  	TAILQ_INIT(&ifq->ifq_task_list);
>  	ifq->ifq_serializer = NULL;
> +	task_set(&ifq->ifq_bundle, ifq_bundle_task, ifq);
>  
>  	task_set(&ifq->ifq_start, ifq_start_task, ifq);
>  	task_set(&ifq->ifq_restart, ifq_restart_task, ifq);
> @@ -237,6 +278,8 @@ void
>  ifq_destroy(struct ifqueue *ifq)
>  {
>  	struct mbuf_list ml = MBUF_LIST_INITIALIZER();
> +
> +	ifq_barrier(ifq); /* ensure nothing is running with the ifq */
>  
>  	/* don't need to lock because this is the last use of the ifq */
>  
> Index: ifq.h
> ===================================================================
> RCS file: /cvs/src/sys/net/ifq.h,v
> retrieving revision 1.20
> diff -u -p -r1.20 ifq.h
> --- ifq.h	4 Jan 2018 11:02:57 -0000	1.20
> +++ ifq.h	14 Mar 2018 02:58:13 -0000
> @@ -25,6 +25,7 @@ struct ifq_ops;
>  
>  struct ifqueue {
>  	struct ifnet		*ifq_if;
> +	struct taskq		*ifq_softnet;
>  	union {
>  		void		*_ifq_softc;
>  		/*
> @@ -57,6 +58,7 @@ struct ifqueue {
>  	struct mutex		 ifq_task_mtx;
>  	struct task_list	 ifq_task_list;
>  	void			*ifq_serializer;
> +	struct task		 ifq_bundle;
>  
>  	/* work to be serialised */
>  	struct task		 ifq_start;
> @@ -405,6 +407,7 @@ void		 ifq_attach(struct ifqueue *, cons
>  void		 ifq_destroy(struct ifqueue *);
>  void		 ifq_add_data(struct ifqueue *, struct if_data *);
>  int		 ifq_enqueue(struct ifqueue *, struct mbuf *);
> +void		 ifq_start(struct ifqueue *);
>  struct mbuf	*ifq_deq_begin(struct ifqueue *);
>  void		 ifq_deq_commit(struct ifqueue *, struct mbuf *);
>  void		 ifq_deq_rollback(struct ifqueue *, struct mbuf *);
> @@ -438,12 +441,6 @@ static inline unsigned int
>  ifq_is_oactive(struct ifqueue *ifq)
>  {
>  	return (ifq->ifq_oactive);
> -}
> -
> -static inline void
> -ifq_start(struct ifqueue *ifq)
> -{
> -	ifq_serialize(ifq, &ifq->ifq_start);
>  }
>  
>  static inline void
>
interface queue transmit mitigation (again)
this adds transmit mitigation back to the tree.

it is basically the same diff as last time. the big difference this
time is that all the tunnel drivers all defer ip_output calls, which
avoids having to play games with NET_LOCK in the ifq transmit paths.

tests? ok?

Index: ifq.c
===================================================================
RCS file: /cvs/src/sys/net/ifq.c,v
retrieving revision 1.22
diff -u -p -r1.22 ifq.c
--- ifq.c	25 Jan 2018 14:04:36 -0000	1.22
+++ ifq.c	14 Mar 2018 02:58:13 -0000
@@ -70,9 +70,16 @@ struct priq {
 void	ifq_start_task(void *);
 void	ifq_restart_task(void *);
 void	ifq_barrier_task(void *);
+void	ifq_bundle_task(void *);
 
 #define TASK_ONQUEUE 0x1
 
+static inline void
+ifq_run_start(struct ifqueue *ifq)
+{
+	ifq_serialize(ifq, &ifq->ifq_start);
+}
+
 void
 ifq_serialize(struct ifqueue *ifq, struct task *t)
 {
@@ -114,6 +121,16 @@ ifq_is_serialized(struct ifqueue *ifq)
 }
 
 void
+ifq_start(struct ifqueue *ifq)
+{
+	if (ifq_len(ifq) >= min(4, ifq->ifq_maxlen)) {
+		task_del(ifq->ifq_softnet, &ifq->ifq_bundle);
+		ifq_run_start(ifq);
+	} else
+		task_add(ifq->ifq_softnet, &ifq->ifq_bundle);
+}
+
+void
 ifq_start_task(void *p)
 {
 	struct ifqueue *ifq = p;
@@ -137,11 +154,33 @@ ifq_restart_task(void *p)
 }
 
 void
+ifq_bundle_task(void *p)
+{
+	struct ifqueue *ifq = p;
+
+	ifq_run_start(ifq);
+}
+
+void
 ifq_barrier(struct ifqueue *ifq)
 {
 	struct cond c = COND_INITIALIZER();
 	struct task t = TASK_INITIALIZER(ifq_barrier_task, &c);
 
+	NET_ASSERT_UNLOCKED();
+
+	if (!task_del(ifq->ifq_softnet, &ifq->ifq_bundle)) {
+		int netlocked = (rw_status(&netlock) == RW_WRITE);
+
+		if (netlocked) /* XXXSMP breaks atomicity */
+			NET_UNLOCK();
+
+		taskq_barrier(ifq->ifq_softnet);
+
+		if (netlocked)
+			NET_LOCK();
+	}
+
 	if (ifq->ifq_serializer == NULL)
 		return;
 
@@ -166,6 +205,7 @@ void
 ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx)
 {
 	ifq->ifq_if = ifp;
+	ifq->ifq_softnet = net_tq(ifp->if_index);
 	ifq->ifq_softc = NULL;
 
 	mtx_init(&ifq->ifq_mtx, IPL_NET);
@@ -187,6 +227,7 @@ ifq_init(struct ifqueue *ifq, struct ifn
 	mtx_init(&ifq->ifq_task_mtx, IPL_NET);
 	TAILQ_INIT(&ifq->ifq_task_list);
 	ifq->ifq_serializer = NULL;
+	task_set(&ifq->ifq_bundle, ifq_bundle_task, ifq);
 
 	task_set(&ifq->ifq_start, ifq_start_task, ifq);
 	task_set(&ifq->ifq_restart, ifq_restart_task, ifq);
@@ -237,6 +278,8 @@ void
 ifq_destroy(struct ifqueue *ifq)
 {
 	struct mbuf_list ml = MBUF_LIST_INITIALIZER();
+
+	ifq_barrier(ifq); /* ensure nothing is running with the ifq */
 
 	/* don't need to lock because this is the last use of the ifq */
 
Index: ifq.h
===================================================================
RCS file: /cvs/src/sys/net/ifq.h,v
retrieving revision 1.20
diff -u -p -r1.20 ifq.h
--- ifq.h	4 Jan 2018 11:02:57 -0000	1.20
+++ ifq.h	14 Mar 2018 02:58:13 -0000
@@ -25,6 +25,7 @@ struct ifq_ops;
 
 struct ifqueue {
 	struct ifnet		*ifq_if;
+	struct taskq		*ifq_softnet;
 	union {
 		void		*_ifq_softc;
 		/*
@@ -57,6 +58,7 @@ struct ifqueue {
 	struct mutex		 ifq_task_mtx;
 	struct task_list	 ifq_task_list;
 	void			*ifq_serializer;
+	struct task		 ifq_bundle;
 
 	/* work to be serialised */
 	struct task		 ifq_start;
@@ -405,6 +407,7 @@ void		 ifq_attach(struct ifqueue *, cons
 void		 ifq_destroy(struct ifqueue *);
 void		 ifq_add_data(struct ifqueue *, struct if_data *);
 int		 ifq_enqueue(struct ifqueue *, struct mbuf *);
+void		 ifq_start(struct ifqueue *);
 struct mbuf	*ifq_deq_begin(struct ifqueue *);
 void		 ifq_deq_commit(struct ifqueue *, struct mbuf *);
 void		 ifq_deq_rollback(struct ifqueue *, struct mbuf *);
@@ -438,12 +441,6 @@ static inline unsigned int
 ifq_is_oactive(struct ifqueue *ifq)
 {
 	return (ifq->ifq_oactive);
-}
-
-static inline void
-ifq_start(struct ifqueue *ifq)
-{
-	ifq_serialize(ifq, &ifq->ifq_start);
 }
 
 static inline void