Hi,
This series provides a set of interfaces that can be used by
applications that require (time-based) scheduled transmission of
packets. It adds 3 new components to the kernel:

  - etf: the per-queue TxTime-based scheduling qdisc;
  - taprio: the per-port Time-Aware scheduler qdisc;
  - SO_TXTIME: a socket option + cmsg APIs.

ETF and SO_TXTIME are already applied[1] into the net-next tree. This is
the remaining piece.


Overview
========

The CBS qdisc proposal RFC [2] included some rough ideas about the
design and API of a "taprio" (Time Aware Priority) qdisc. The idea of
presenting the taprio ideas at that point (almost 10 months ago!) was to
show our vision of how things would fit together going forward.

From that concept stage to this (almost) realised stage the main
differences are:

  - As of now, taprio is a software-only implementation of a schedule
    executor;

  - Instead of taprio centralising all the time-based decisions, taprio
    can work together with ETF (Earliest TxTime First), a qdisc meant
    to use the LaunchTime (or similar) feature of various network
    controllers;

In a nutshell, taprio is a root qdisc that can execute a predefined
schedule, etf is a qdisc that provides time-based admission control and
an "earliest deadline first" dequeue mode, and SO_TXTIME is a socket
option that enables a socket for time-based transmission and configures
its parameters.


taprio
======

This scheduler allows the network administrator to configure schedules
for classes of traffic. The configuration interface is similar to what
IEEE 802.1Qbv-2015 defines.

Example configuration:

$ tc qdisc add dev enp2s0 parent root handle 100 taprio \
          num_tc 3 \
          map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
          queues 1@0 1@1 2@2 \
          sched-file ~/gates.sched \
          base-time 1528743495910289987 \
          clockid CLOCK_TAI

This qdisc borrows a few concepts from mqprio, so most of the
parameters are similar to mqprio.
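
To make the mqprio-style parameters concrete, here is a minimal sketch
of how 'map' and 'queues' from the example above would be read, assuming
taprio gives them the same meaning mqprio does: 'map' has one entry per
packet priority (0..15) naming its traffic class, and each 'queues'
entry is <count>@<offset>, one per traffic class. The helper names
(parse_queues, classify) are made up for illustration and exist neither
in the kernel nor in iproute2.

```python
# Illustrative only: follows mqprio's meaning of `map` and `queues`.
# parse_queues() and classify() are hypothetical helpers, not kernel
# or iproute2 code.

def parse_queues(specs):
    """Turn ["1@0", "1@1", "2@2"] into one range of Tx queues per traffic class."""
    ranges = []
    for spec in specs:
        count, offset = (int(part) for part in spec.split("@"))
        ranges.append(range(offset, offset + count))
    return ranges


def classify(priority, prio_map, queue_ranges):
    """Return (traffic class, list of Tx queues) for a packet priority 0..15."""
    tc = prio_map[priority]
    return tc, list(queue_ranges[tc])


# Values taken from the example configuration above.
PRIO_MAP = [2, 2, 1, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
QUEUES = parse_queues(["1@0", "1@1", "2@2"])

for prio in (3, 2, 0):
    tc, queues = classify(prio, PRIO_MAP, QUEUES)
    print(f"priority {prio} -> traffic class {tc} -> queues {queues}")
```

Under that assumption, priority 3 is the only one mapped to traffic
class 0 (queue 0), priority 2 goes to traffic class 1 (queue 1), and
every other priority shares traffic class 2 (queues 2 and 3).
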
The main difference is the 'sched-file' parameter. An example of a
schedule file would be:

gates.sched
-----------
S 01 300000
S 02 300000
S 04 300000

The format of each line is: <CMD> <GATE MASK> <INTERVAL>

The only supported <CMD> is "S", which means "SetGateStates", following
the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK> is a bitmask
where each bit is associated with a traffic class, so bit 0 (the least
significant bit) being "on" means that traffic class 0 is "active" for
that schedule entry. <INTERVAL> is a time duration in nanoseconds that
specifies for how long the state defined by <CMD> and <GATE MASK> should
be held before moving to the next entry.

This schedule is circular, that is, after the last entry is executed it
starts from the first one, indefinitely.

The other parameters can be defined as follows:

 - base-time: allows multiple systems to have synchronised schedules;
   it specifies the instant when the schedule starts;

 - clockid: specifies the reference clock to be used;

A more complete example can be found here, with instructions on how to
test it:

https://gist.github.com/jeez/bd3afeff081ba64a695008dd8215866f [3]

The basic design of the scheduler is simple: after we calculate the
first expiration of the hrtimer, we set the next expiration to be the
previous one plus the current entry's interval. Each time the timer
function runs, we set the current_entry, which has a gate_mask (that
controls which traffic classes are allowed to "go out" during each
interval), and we reuse this callback to "kick" the qdisc (this is the
reason that the usual qdisc watchdog isn't used).


Known Issues
============

 - As taprio is a software-only implementation, and there's another
   layer of queuing in the network controller, packets can still leave
   the controller outside their "correct" windows. This happens mostly
   for low-priority classes, and is more evident if they are 'starved'
   by the higher priority ones;

 - There's no support for changing the schedule during runtime;


This series is also hosted on github and can be found at [4]. The
companion iproute2 patches can be found at [5].


Cheers,
--
Vinicius

[1] https://patchwork.ozlabs.org/cover/938991/
[2] https://patchwork.ozlabs.org/cover/808504/
[3] github doesn't make it clear, but the gist can be cloned like this:
$ git clone https://gist.github.com/jeez/bd3afeff081ba64a695008dd8215866f taprio-test
[4] https://github.com/vcgomes/linux/tree/taprio-RFC-v1
[5] https://github.com/vcgomes/iproute2/tree/taprio-RFC-v1


Vinicius Costa Gomes (1):
  net/sched: Introduce the taprio scheduler

 include/uapi/linux/pkt_sched.h |  49 ++
 net/sched/Kconfig              |  11 +
 net/sched/Makefile             |   1 +
 net/sched/sch_taprio.c         | 952 +++++++++++++++++++++++++++++++++
 4 files changed, 1013 insertions(+)
 create mode 100644 net/sched/sch_taprio.c

--
2.18.0