Re: [PATCH net-next] net:sched: add gkprio scheduler
Sorry I dropped this.

On 14/05/18 10:08 AM, Michel Machado wrote:
On 09/05/18 01:37 PM, Michel Machado wrote:

A simplified description of what DSprio is meant to do is as follows: when a link is overloaded at a router, DSprio makes this router drop the packets of lower priority.

Makes sense. Any priority-based work-conserving scheduler will work fine. The only small difference you have with prio qdisc is you drop an enqueued low prio packet to make room for a new higher prio packet. Can you look at pfifo_head_drop qdisc to see if it suffices? It may not: in such a case, I would suggest a hybrid between pfifo_head_drop and pfifo_fast for the new qdisc. [Cong has suggested writing a classful qdisc but it may be sufficient to just replicate what pfifo_fast does since it tracks virtual queues]

These priorities are assigned by Gatekeeper in such a way that well behaving sources are favored (Theorem 4.1 of the Portcullis paper pointed out in my previous email). Moreover, attackers cannot do much better than well behaving sources (Theorem 4.2). This description is simplified because it omits many other components of Gatekeeper that affect the packets that go to DSprio.

I am sorry - I have no access to this document so I don't know what these theorems are. I understand your requirements.
1) You are looking to use priority identifiers to select queues.
2) You want to prioritize treatment of favorably tagged packets. The enqueueing will drop lower priority packets to make space for higher priority under congestion.
Did I miss anything?

For #1 my suggestion is to use skbmod to set the priority tag.
For #2 if you didn't have to drop at enqueue time you could have used any of the existing priority-favoring qdiscs which recognize skb->priority. Otherwise, as I suggested above, look at pfifo_fast/pfifo_head_drop.

Like you, I'm all in for less code. If someone can instruct us on how to accomplish the same thing that our patch is doing, we would be happy to withdraw it.
We have submitted this patch because we want to lower the bar to deploy Gatekeeper as much as possible, and requiring network operators willing to deploy Gatekeeper to keep patching the kernel is an operational burden.

So I would suggest you keep this real simple - especially if you want to go backwards in kernels. For existing kernels you can implement the basic policies of what you need by using prio qdisc with a combination of a classifier that knows how to match on dsfield (trivial to do with u32) and skbedit action to tag the skb->priority. Then let prio qdisc use the priomap to select the queue. If you must drop enqueued low prio packets then you may need the new qdisc. And to optimize, you will need the skbmod change. I really think it is a bad idea to encapsulate the classifier in the qdisc.

Look at the priomap or prio2band arrangement on prio qdisc or pfifo_fast qdisc. You take an skbprio as an index into the array and retrieve a queue to enqueue to. The size of the array is 16. In the past this was based IIRC on ip precedence + 1 bit. Those map similarly to DS fields (class selectors, assured forwarding etc). So no need to even increase the array beyond current 16.

What application is this change supposed to enable or help? I think this change should be left for when one can explain the need for it.

I meant to take a look at the prio map. It is an array of size 16 which holds the skb->priority implicit classifier (prio, pfifo_fast etc). A packet's skb priority is used as an index into this array and from the result a queue is selected to put the packet onto. The map of this array can be configured from user space. I was saying earlier that it may be tempting to make a size 64 array to map the possible dsfields - in practice that has never been pragmatic (so 16 was sufficient).

cheers,
jamal
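The priomap lookup described in the message above can be sketched in user-space C. This is only an illustration under stated assumptions: the names `example_prio2band` and `example_select_band` are hypothetical, the table contents are modeled on the pfifo_fast default as best recalled, and masking the 32-bit priority into the 16-entry range is assumed to be how the index is kept in bounds.

```c
#include <stdint.h>

/* Highest index into the priority map; the thread gives the map size as 16. */
#define EXAMPLE_PRIO_MAX 15

/*
 * Illustrative map (assumption: modeled on the pfifo_fast default).
 * Band 0 is served first, band 2 last.
 */
static const uint8_t example_prio2band[EXAMPLE_PRIO_MAX + 1] = {
	1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1
};

/* Select a band by masking the 32-bit priority into the map's range. */
static unsigned int example_select_band(uint32_t skb_priority)
{
	return example_prio2band[skb_priority & EXAMPLE_PRIO_MAX];
}
```

With this table, priorities 6 and 7 land in the first-served band, and any priority above 15 wraps around because of the mask, which is one reason a 2^32-entry array is not needed.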
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 09/05/18 01:37 PM, Michel Machado wrote:
On 05/09/2018 10:43 AM, Jamal Hadi Salim wrote:
On 08/05/18 10:27 PM, Cong Wang wrote:
On Tue, May 8, 2018 at 6:29 AM, Jamal Hadi Salim wrote:

I like the suggestion of extending skbmod to mark skbprio based on ds. Given that DSprio would no longer depend on the DS field, would you have a name suggestion for this new queue discipline since the name "prio" is currently in use?

Not sure what to call it. My struggle is still with the intended end goal of the qdisc. It looks like prio qdisc except for the enqueue part which attempts to use a shared global queue size for all prios. I would have pointed to other approaches which use a global priority queue pool and do early congestion detection, like RED or variants like GRED, but those use average values of the queue lengths, not instantaneous values such as you do. I am tempted to say - based on my current understanding - that you don't need a new qdisc; rather you need to map your dsfields to skbprio (via skbmod) and stick with prio qdisc. I also think the skbmod mapping is useful regardless of this need.

A simplified description of what DSprio is meant to do is as follows: when a link is overloaded at a router, DSprio makes this router drop the packets of lower priority. These priorities are assigned by Gatekeeper in such a way that well behaving sources are favored (Theorem 4.1 of the Portcullis paper pointed out in my previous email). Moreover, attackers cannot do much better than well behaving sources (Theorem 4.2). This description is simplified because it omits many other components of Gatekeeper that affect the packets that go to DSprio.

Like you, I'm all in for less code. If someone can instruct us on how to accomplish the same thing that our patch is doing, we would be happy to withdraw it.
We have submitted this patch because we want to lower the bar to deploy Gatekeeper as much as possible, and requiring network operators willing to deploy Gatekeeper to keep patching the kernel is an operational burden.

What should be the range of priorities that this new queue discipline would accept? skb->priority is of type __u32, but supporting 2^32 priorities would require too large of an array to index packets by priority; the DS field is only 6 bits long. Do you have a use case in mind to guide us here?

Look at the priomap or prio2band arrangement on prio qdisc or pfifo_fast qdisc. You take an skbprio as an index into the array and retrieve a queue to enqueue to. The size of the array is 16. In the past this was based IIRC on ip precedence + 1 bit. Those map similarly to DS fields (class selectors, assured forwarding etc). So no need to even increase the array beyond current 16.

What application is this change supposed to enable or help? I think this change should be left for when one can explain the need for it.

2) Dropping already enqueued packets will not work well for local feedback (__NET_XMIT_BYPASS return code is about the packet that has been dropped from earlier enqueueing because it is lower priority - it does not signify anything about the current skb, which actually just got enqueued). Perhaps (off the top of my head) one option is to always enqueue packets on high priority when their limit is exceeded as long as lower prio has some space. That means you'd have to increment low prio accounting if their space is used.

I don't understand the point you are making here. Could you develop it further?

Sorry - I was meaning NET_XMIT_CN. If you drop an already enqueued packet - it makes sense to signify as such using NET_XMIT_CN; this does not make sense for forwarded packets but it does for locally sourced packets.

Thank you for bringing this detail to our attention; we've overlooked the return code NET_XMIT_CN.
We'll adopt it when the queue is full and the lowest priority packet in the queue is being dropped to make room for the higher-priority, incoming packet. [ ]'s Michel Machado
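The enqueue behavior agreed on above (drop the lowest-priority queued packet to admit a higher-priority one, and report congestion) can be sketched as a toy user-space model. This is not the patch's code: packets are reduced to per-priority counters, every `toy_` name is hypothetical, the return codes are local stand-ins named after the kernel's NET_XMIT_* codes, and clamping out-of-range priorities is an assumption made only to keep the sketch safe.

```c
#include <stddef.h>

#define TOY_MAX_PRIORITY 64	/* the DS field is 6 bits long, per the thread */

/* Local stand-ins named after NET_XMIT_SUCCESS, NET_XMIT_DROP, NET_XMIT_CN. */
enum toy_xmit { TOY_XMIT_SUCCESS = 0, TOY_XMIT_DROP = 1, TOY_XMIT_CN = 2 };

struct toy_sched {
	unsigned int qlen[TOY_MAX_PRIORITY];	/* packets queued per priority */
	unsigned int total;			/* packets queued overall */
	unsigned int limit;			/* limit shared by all priorities */
};

/* Higher numbers mean higher priority, as in the GKprio discussion. */
static enum toy_xmit toy_enqueue(struct toy_sched *q, unsigned int prio)
{
	unsigned int lowest;

	if (prio >= TOY_MAX_PRIORITY)	/* assumption: clamp rather than reject */
		prio = TOY_MAX_PRIORITY - 1;

	if (q->total < q->limit) {
		q->qlen[prio]++;
		q->total++;
		return TOY_XMIT_SUCCESS;
	}

	/* Queue is full: find the lowest priority that still holds a packet. */
	for (lowest = 0; lowest < TOY_MAX_PRIORITY && q->qlen[lowest] == 0; lowest++)
		;

	/* Nothing queued, or the newcomer is not higher: drop the newcomer. */
	if (lowest >= TOY_MAX_PRIORITY || prio <= lowest)
		return TOY_XMIT_DROP;

	/* Evict one lowest-priority packet; total stays at the limit. */
	q->qlen[lowest]--;
	q->qlen[prio]++;
	return TOY_XMIT_CN;
}
```

The point of returning the congestion code in the eviction case is that the incoming packet was accepted, yet some earlier packet was lost, which matters for locally sourced traffic as noted in the thread.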
Re: [PATCH net-next] net:sched: add gkprio scheduler
Sorry for the latency..

On 09/05/18 01:37 PM, Michel Machado wrote:
On 05/09/2018 10:43 AM, Jamal Hadi Salim wrote:
On 08/05/18 10:27 PM, Cong Wang wrote:
On Tue, May 8, 2018 at 6:29 AM, Jamal Hadi Salim wrote:

I like the suggestion of extending skbmod to mark skbprio based on ds. Given that DSprio would no longer depend on the DS field, would you have a name suggestion for this new queue discipline since the name "prio" is currently in use?

Not sure what to call it. My struggle is still with the intended end goal of the qdisc. It looks like prio qdisc except for the enqueue part which attempts to use a shared global queue size for all prios. I would have pointed to other approaches which use a global priority queue pool and do early congestion detection, like RED or variants like GRED, but those use average values of the queue lengths, not instantaneous values such as you do. I am tempted to say - based on my current understanding - that you don't need a new qdisc; rather you need to map your dsfields to skbprio (via skbmod) and stick with prio qdisc. I also think the skbmod mapping is useful regardless of this need.

What should be the range of priorities that this new queue discipline would accept? skb->priority is of type __u32, but supporting 2^32 priorities would require too large of an array to index packets by priority; the DS field is only 6 bits long. Do you have a use case in mind to guide us here?

Look at the priomap or prio2band arrangement on prio qdisc or pfifo_fast qdisc. You take an skbprio as an index into the array and retrieve a queue to enqueue to. The size of the array is 16. In the past this was based IIRC on ip precedence + 1 bit. Those map similarly to DS fields (class selectors, assured forwarding etc). So no need to even increase the array beyond current 16.

I find the cleverness in changing the highest/low prios confusing.
It looks error-prone (I guess that is why there is a BUG check). To the authors: Is there a document/paper on the theory of this thing as to why no explicit queues are "faster"?

The priority orientation in GKprio is due to two factors: failing safe and elegance. If zero were the highest priority, any operational mistake that leads not-classified packets through GKprio would potentially disrupt the system. We are humans, we'll make mistakes. The elegance aspect comes from the fact that the assigned priority is not massaged to fit the DS field. We find it helpful while inspecting packets on the wire.

The reason for us to avoid explicit queues in GKprio, which could change the behavior within a given priority, is to closely abide by the expected behavior assumed to prove Theorem 4.1 in the paper "Portcullis: Protecting Connection Setup from Denial-of-Capability Attacks": https://dl.acm.org/citation.cfm?id=1282413

Paper seems to be under paywall. Googling didn't help. My concern is still the science behind this; if you had written up some test setup which shows how you concluded this was a better approach at DOS prevention and showed some numbers, it would have helped greatly to clarify.

1) I agree that using multiple queues as in prio qdisc would make it more manageable; does not necessarily need to be classful if you use implicit skbprio classification. i.e. on enqueue use a priority map to select a queue; on dequeue always dequeue from highest prio until it has no more packets to send.
In my reply to Cong, I point out that there is a technical limitation in the interface of queue disciplines that prevents GKprio from having explicit sub-queues: https://www.mail-archive.com/netdev@vger.kernel.org/msg234201.html

2) Dropping already enqueued packets will not work well for local feedback (__NET_XMIT_BYPASS return code is about the packet that has been dropped from earlier enqueueing because it is lower priority - it does not signify anything about the current skb, which actually just got enqueued). Perhaps (off the top of my head) one option is to always enqueue packets on high priority when their limit is exceeded as long as lower prio has some space. That means you'd have to increment low prio accounting if their space is used.

I don't understand the point you are making here. Could you develop it further?

Sorry - I was meaning NET_XMIT_CN. If you drop an already enqueued packet - it makes sense to signify as such using NET_XMIT_CN; this does not make sense for forwarded packets but it does for locally sourced packets.

cheers,
jamal
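The dequeue rule suggested in point 1 of the message above (always serve the highest priority until it has no more packets) can be sketched as follows. This is a toy user-space model, not kernel code: `toy_bands` and `toy_dequeue_band` are hypothetical names, queues are reduced to counters, and band 0 is assumed to be the highest priority, following the prio qdisc convention rather than GKprio's.

```c
#define TOY_BANDS 3

struct toy_bands {
	unsigned int qlen[TOY_BANDS];	/* band 0 is the highest priority */
};

/*
 * Strict-priority dequeue: take one packet from the first non-empty band.
 * Returns the band served, or -1 when every band is empty.
 */
static int toy_dequeue_band(struct toy_bands *b)
{
	for (int band = 0; band < TOY_BANDS; band++) {
		if (b->qlen[band] > 0) {
			b->qlen[band]--;
			return band;
		}
	}
	return -1;
}
```

A lower band is only served once every band above it is empty, so sustained high-priority traffic can starve the lower bands; that is the intended behavior for this kind of scheduler.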
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 05/10/2018 01:38 PM, Cong Wang wrote: On Wed, May 9, 2018 at 7:09 AM, Michel Machado wrote: On 05/08/2018 10:24 PM, Cong Wang wrote: On Tue, May 8, 2018 at 5:59 AM, Michel Machado wrote: Overall it looks good to me, just one thing below: +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { + .id = "gkprio", + .priv_size = sizeof(struct gkprio_sched_data), + .enqueue= gkprio_enqueue, + .dequeue= gkprio_dequeue, + .peek = qdisc_peek_dequeued, + .init = gkprio_init, + .reset = gkprio_reset, + .change = gkprio_change, + .dump = gkprio_dump, + .destroy= gkprio_destroy, + .owner = THIS_MODULE, +}; You probably want to add Qdisc_class_ops here so that you can dump the stats of each internal queue. Hi Cong, In the production scenario we are targeting, this priority queue must be classless; being classful would only bloat the code for us. I don't see making this queue classful as a problem per se, but I suggest leaving it as a future improvement for when someone can come up with a useful scenario for it. Take a look at sch_prio, it is fairly simple since your internal queues are just an array... Per-queue stats are quite useful in production, we definitely want to observe which queues are full which are not. DSprio cannot add Qdisc_class_ops without a rewrite of other queue disciplines, which doesn't seem desirable. Since the method cops->leaf is required (see register_qdisc()), we would need to replace the array struct sk_buff_head qdiscs[GKPRIO_MAX_PRIORITY] in struct gkprio_sched_data with the array struct Qdisc *queues[GKPRIO_MAX_PRIORITY] to be able to return a Qdisc in dsprio_leaf(). The problem with this change is that Qdisc does not have a method to dequeue from its tail. This new method may not even make sense in other queue disciplines. But without this method, gkprio_enqueue() cannot drop the lowest priority packet when the queue is full and an incoming packet has higher priority. Sorry for giving you a bad example. 
Take a look at sch_fq_codel instead, it returns NULL for ->leaf() and maps its internal flows to classes. I thought sch_prio uses internal qdiscs, but I was wrong, as you noticed it actually exposes them to user via classes. My point is never to make it classful, just want to expose the useful stats, like how fq_codel dumps its internal flows. Nevertheless, I see your point on being able to observe the distribution of queued packets per priority. A solution for that would be to add the array __u32 qlen[GKPRIO_MAX_PRIORITY] in struct tc_gkprio_qopt. This solution even avoids adding overhead in the critical paths of DSprio. Do you see a better solution? I believe you can return NULL for ->leaf() and don't need to worry about ->graft() either. ;) Thank you for pointing sch_fq_codel out. We'll follow its example. [ ]'s Michel Machado
Re: [PATCH net-next] net:sched: add gkprio scheduler
On Wed, May 9, 2018 at 7:09 AM, Michel Machado wrote: > On 05/08/2018 10:24 PM, Cong Wang wrote: >> >> On Tue, May 8, 2018 at 5:59 AM, Michel Machado >> wrote: > > Overall it looks good to me, just one thing below: > >> +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { >> + .id = "gkprio", >> + .priv_size = sizeof(struct gkprio_sched_data), >> + .enqueue= gkprio_enqueue, >> + .dequeue= gkprio_dequeue, >> + .peek = qdisc_peek_dequeued, >> + .init = gkprio_init, >> + .reset = gkprio_reset, >> + .change = gkprio_change, >> + .dump = gkprio_dump, >> + .destroy= gkprio_destroy, >> + .owner = THIS_MODULE, >> +}; > > > > You probably want to add Qdisc_class_ops here so that you can > dump the stats of each internal queue. >>> >>> >>> >>> Hi Cong, >>> >>> In the production scenario we are targeting, this priority queue must >>> be >>> classless; being classful would only bloat the code for us. I don't see >>> making this queue classful as a problem per se, but I suggest leaving it >>> as >>> a future improvement for when someone can come up with a useful scenario >>> for >>> it. >> >> >> >> Take a look at sch_prio, it is fairly simple since your internal >> queues are just an array... Per-queue stats are quite useful >> in production, we definitely want to observe which queues are >> full which are not. >> > > DSprio cannot add Qdisc_class_ops without a rewrite of other queue > disciplines, which doesn't seem desirable. Since the method cops->leaf is > required (see register_qdisc()), we would need to replace the array struct > sk_buff_head qdiscs[GKPRIO_MAX_PRIORITY] in struct gkprio_sched_data with > the array struct Qdisc *queues[GKPRIO_MAX_PRIORITY] to be able to return a > Qdisc in dsprio_leaf(). The problem with this change is that Qdisc does not > have a method to dequeue from its tail. This new method may not even make > sense in other queue disciplines. 
But without this method, gkprio_enqueue() > cannot drop the lowest priority packet when the queue is full and an > incoming packet has higher priority. Sorry for giving you a bad example. Take a look at sch_fq_codel instead, it returns NULL for ->leaf() and maps its internal flows to classes. I thought sch_prio uses internal qdiscs, but I was wrong, as you noticed it actually exposes them to user via classes. My point is never to make it classful, just want to expose the useful stats, like how fq_codel dumps its internal flows. > > Nevertheless, I see your point on being able to observe the distribution of > queued packets per priority. A solution for that would be to add the array > __u32 qlen[GKPRIO_MAX_PRIORITY] in struct tc_gkprio_qopt. This solution even > avoids adding overhead in the critical paths of DSprio. Do you see a better > solution? I believe you can return NULL for ->leaf() and don't need to worry about ->graft() either. ;) > > By the way, I've used GKPRIO_MAX_PRIORITY and other names that include > "gkprio" above to reflect the version 1 of this patch that we are > discussing. We will rename these identifiers for version 2 of this patch to > replace "gkprio" with "dsprio". > Sounds good. Thanks.
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 05/09/2018 10:43 AM, Jamal Hadi Salim wrote:
On 08/05/18 10:27 PM, Cong Wang wrote:
On Tue, May 8, 2018 at 6:29 AM, Jamal Hadi Salim wrote:

Have you considered using skb->prio instead of peeking into the packet header? Also have you looked at the dsmark qdisc?

dsmark modifies ds fields, while this one just maps ds fields into different queues.

Yeah, I was thinking more of re-using it for the purpose of mapping to queues - but would require a lot more work. Once skbprio is set by something[1] then this qdisc could be used by other subsystems (8021q, sockets etc); so I would argue for removal of the embedded classification and instead maybe writing a simple extension to skbmod to mark skbprio based on ds.

I like the suggestion of extending skbmod to mark skbprio based on ds. Given that DSprio would no longer depend on the DS field, would you have a name suggestion for this new queue discipline since the name "prio" is currently in use?

What should be the range of priorities that this new queue discipline would accept? skb->priority is of type __u32, but supporting 2^32 priorities would require too large of an array to index packets by priority; the DS field is only 6 bits long. Do you have a use case in mind to guide us here?

I find the cleverness in changing the highest/low prios confusing. It looks error-prone (I guess that is why there is a BUG check). To the authors: Is there a document/paper on the theory of this thing as to why no explicit queues are "faster"?

The priority orientation in GKprio is due to two factors: failing safe and elegance. If zero were the highest priority, any operational mistake that leads not-classified packets through GKprio would potentially disrupt the system. We are humans, we'll make mistakes. The elegance aspect comes from the fact that the assigned priority is not massaged to fit the DS field. We find it helpful while inspecting packets on the wire.
The reason for us to avoid explicit queues in GKprio, which could change the behavior within a given priority, is to closely abide by the expected behavior assumed to prove Theorem 4.1 in the paper "Portcullis: Protecting Connection Setup from Denial-of-Capability Attacks": https://dl.acm.org/citation.cfm?id=1282413

1) I agree that using multiple queues as in prio qdisc would make it more manageable; does not necessarily need to be classful if you use implicit skbprio classification. i.e. on enqueue use a priority map to select a queue; on dequeue always dequeue from highest prio until it has no more packets to send.

In my reply to Cong, I point out that there is a technical limitation in the interface of queue disciplines that prevents GKprio from having explicit sub-queues: https://www.mail-archive.com/netdev@vger.kernel.org/msg234201.html

2) Dropping already enqueued packets will not work well for local feedback (__NET_XMIT_BYPASS return code is about the packet that has been dropped from earlier enqueueing because it is lower priority - it does not signify anything about the current skb, which actually just got enqueued). Perhaps (off the top of my head) one option is to always enqueue packets on high priority when their limit is exceeded as long as lower prio has some space. That means you'd have to increment low prio accounting if their space is used.

I don't understand the point you are making here. Could you develop it further?

[ ]'s
Michel Machado
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 08/05/18 10:27 PM, Cong Wang wrote:
On Tue, May 8, 2018 at 6:29 AM, Jamal Hadi Salim wrote:

Have you considered using skb->prio instead of peeking into the packet header? Also have you looked at the dsmark qdisc?

dsmark modifies ds fields, while this one just maps ds fields into different queues.

Yeah, I was thinking more of re-using it for the purpose of mapping to queues - but would require a lot more work. Once skbprio is set by something[1] then this qdisc could be used by other subsystems (8021q, sockets etc); so I would argue for removal of the embedded classification and instead maybe writing a simple extension to skbmod to mark skbprio based on ds.

I find the cleverness in changing the highest/low prios confusing. It looks error-prone (I guess that is why there is a BUG check). To the authors: Is there a document/paper on the theory of this thing as to why no explicit queues are "faster"?

Some other feedback:

1) I agree that using multiple queues as in prio qdisc would make it more manageable; does not necessarily need to be classful if you use implicit skbprio classification. i.e. on enqueue use a priority map to select a queue; on dequeue always dequeue from highest prio until it has no more packets to send.

2) Dropping already enqueued packets will not work well for local feedback (__NET_XMIT_BYPASS return code is about the packet that has been dropped from earlier enqueueing because it is lower priority - it does not signify anything about the current skb, which actually just got enqueued). Perhaps (off the top of my head) one option is to always enqueue packets on high priority when their limit is exceeded as long as lower prio has some space. That means you'd have to increment low prio accounting if their space is used.

cheers,
jamal

[1] something like:
tc filter add match all ip action skbmod inheritdsfield
tc filter add match all ip6 action skbmod inheritdsfield

inheritdsfield maps ds to skb->priority
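The mapping that the proposed inheritdsfield option would perform can be illustrated with a small user-space sketch. It assumes the priority would be the 6-bit DSCP value, which the message does not spell out; the function names are hypothetical and nothing here is the skbmod implementation.

```c
#include <stdint.h>

/*
 * IPv4: the DS field occupies the upper 6 bits of the former TOS byte;
 * the lower 2 bits carry ECN and are discarded here.
 */
static uint32_t example_dscp_from_ipv4_tos(uint8_t tos)
{
	return tos >> 2;
}

/*
 * IPv6: the first 32-bit word (in host byte order) is laid out as
 * version (4 bits), traffic class (8 bits), flow label (20 bits).
 */
static uint32_t example_dscp_from_ipv6_word(uint32_t first_word_host_order)
{
	uint8_t traffic_class = (first_word_host_order >> 20) & 0xff;

	return traffic_class >> 2;
}
```

Either function yields a value in the range 0 to 63, so a queue discipline consuming the result would need at most 64 priority levels, or a smaller map if priorities are folded into fewer bands.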
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 05/08/2018 10:24 PM, Cong Wang wrote: On Tue, May 8, 2018 at 5:59 AM, Michel Machado wrote: Overall it looks good to me, just one thing below: +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { + .id = "gkprio", + .priv_size = sizeof(struct gkprio_sched_data), + .enqueue= gkprio_enqueue, + .dequeue= gkprio_dequeue, + .peek = qdisc_peek_dequeued, + .init = gkprio_init, + .reset = gkprio_reset, + .change = gkprio_change, + .dump = gkprio_dump, + .destroy= gkprio_destroy, + .owner = THIS_MODULE, +}; You probably want to add Qdisc_class_ops here so that you can dump the stats of each internal queue. Hi Cong, In the production scenario we are targeting, this priority queue must be classless; being classful would only bloat the code for us. I don't see making this queue classful as a problem per se, but I suggest leaving it as a future improvement for when someone can come up with a useful scenario for it. Take a look at sch_prio, it is fairly simple since your internal queues are just an array... Per-queue stats are quite useful in production, we definitely want to observe which queues are full which are not. DSprio cannot add Qdisc_class_ops without a rewrite of other queue disciplines, which doesn't seem desirable. Since the method cops->leaf is required (see register_qdisc()), we would need to replace the array struct sk_buff_head qdiscs[GKPRIO_MAX_PRIORITY] in struct gkprio_sched_data with the array struct Qdisc *queues[GKPRIO_MAX_PRIORITY] to be able to return a Qdisc in dsprio_leaf(). The problem with this change is that Qdisc does not have a method to dequeue from its tail. This new method may not even make sense in other queue disciplines. But without this method, gkprio_enqueue() cannot drop the lowest priority packet when the queue is full and an incoming packet has higher priority. Nevertheless, I see your point on being able to observe the distribution of queued packets per priority. 
A solution for that would be to add the array __u32 qlen[GKPRIO_MAX_PRIORITY] in struct tc_gkprio_qopt. This solution even avoids adding overhead in the critical paths of DSprio. Do you see a better solution? By the way, I've used GKPRIO_MAX_PRIORITY and other names that include "gkprio" above to reflect the version 1 of this patch that we are discussing. We will rename these identifiers for version 2 of this patch to replace "gkprio" with "dsprio". [ ]'s Michel Machado
Re: [PATCH net-next] net:sched: add gkprio scheduler
On Tue, May 8, 2018 at 6:29 AM, Jamal Hadi Salim wrote: > Have you considered using skb->prio instead of peeking into the packet > header. > Also have you looked at the dsmark qdisc? > dsmark modifies ds fields, while this one just maps ds fields into different queues.
Re: [PATCH net-next] net:sched: add gkprio scheduler
On Tue, May 8, 2018 at 5:59 AM, Michel Machado wrote: >>> Overall it looks good to me, just one thing below: >>> +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { + .id = "gkprio", + .priv_size = sizeof(struct gkprio_sched_data), + .enqueue= gkprio_enqueue, + .dequeue= gkprio_dequeue, + .peek = qdisc_peek_dequeued, + .init = gkprio_init, + .reset = gkprio_reset, + .change = gkprio_change, + .dump = gkprio_dump, + .destroy= gkprio_destroy, + .owner = THIS_MODULE, +}; >>> >>> >>> You probably want to add Qdisc_class_ops here so that you can >>> dump the stats of each internal queue. > > > Hi Cong, > >In the production scenario we are targeting, this priority queue must be > classless; being classful would only bloat the code for us. I don't see > making this queue classful as a problem per se, but I suggest leaving it as > a future improvement for when someone can come up with a useful scenario for > it. Take a look at sch_prio, it is fairly simple since your internal queues are just an array... Per-queue stats are quite useful in production, we definitely want to observe which queues are full which are not.
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 05/08/2018 09:29 AM, Jamal Hadi Salim wrote: On 08/05/18 08:59 AM, Michel Machado wrote: Overall it looks good to me, just one thing below: +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { + .id = "gkprio", + .priv_size = sizeof(struct gkprio_sched_data), + .enqueue = gkprio_enqueue, + .dequeue = gkprio_dequeue, + .peek = qdisc_peek_dequeued, + .init = gkprio_init, + .reset = gkprio_reset, + .change = gkprio_change, + .dump = gkprio_dump, + .destroy = gkprio_destroy, + .owner = THIS_MODULE, +}; You probably want to add Qdisc_class_ops here so that you can dump the stats of each internal queue. Hi Cong, In the production scenario we are targeting, this priority queue must be classless; being classful would only bloat the code for us. I don't see making this queue classful as a problem per se, but I suggest leaving it as a future improvement for when someone can come up with a useful scenario for it. I am actually struggling with this whole thing. Have you considered using skb->prio instead of peeking into the packet header. Also have you looked at the dsmark qdisc? As far as I know, skb->priority (skb->prio has been renamed) is unsigned for packets that come from the network. DSprio, adopting Cong's name suggestion, is most useful "merging" packets that come from different network interfaces. Had we relied on DSmark to mark skb->tc_index with the DS field, we would have forced anyone using DSprio to use DSmark. This may sound as a good idea, but DSmark always requires writable socket buffers while setting skb->tc_index with the DS field of the packet (see dsmark_enqueue()), what means that the kernel may drop high priority packets instead of low priority packets due to memory pressure. [ ]'s Michel Machado
Re: [PATCH net-next] net:sched: add gkprio scheduler
Overall it looks good to me, just one thing below: +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { + .id = "gkprio", + .priv_size = sizeof(struct gkprio_sched_data), + .enqueue= gkprio_enqueue, + .dequeue= gkprio_dequeue, + .peek = qdisc_peek_dequeued, + .init = gkprio_init, + .reset = gkprio_reset, + .change = gkprio_change, + .dump = gkprio_dump, + .destroy= gkprio_destroy, + .owner = THIS_MODULE, +}; You probably want to add Qdisc_class_ops here so that you can dump the stats of each internal queue. Hi Cong, In the production scenario we are targeting, this priority queue must be classless; being classful would only bloat the code for us. I don't see making this queue classful as a problem per se, but I suggest leaving it as a future improvement for when someone can come up with a useful scenario for it. [ ]'s Michel Machado
Re: [PATCH net-next] net:sched: add gkprio scheduler
On 08/05/18 08:59 AM, Michel Machado wrote: Overall it looks good to me, just one thing below: +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = { + .id = "gkprio", + .priv_size = sizeof(struct gkprio_sched_data), + .enqueue = gkprio_enqueue, + .dequeue = gkprio_dequeue, + .peek = qdisc_peek_dequeued, + .init = gkprio_init, + .reset = gkprio_reset, + .change = gkprio_change, + .dump = gkprio_dump, + .destroy = gkprio_destroy, + .owner = THIS_MODULE, +}; You probably want to add Qdisc_class_ops here so that you can dump the stats of each internal queue. Hi Cong, In the production scenario we are targeting, this priority queue must be classless; being classful would only bloat the code for us. I don't see making this queue classful as a problem per se, but I suggest leaving it as a future improvement for when someone can come up with a useful scenario for it. I am actually struggling with this whole thing. Have you considered using skb->prio instead of peeking into the packet header. Also have you looked at the dsmark qdisc? cheers, jamal
Re: [PATCH net-next] net:sched: add gkprio scheduler
On Mon, May 07, 2018 at 10:24:51PM -0700, Cong Wang wrote:
> On Mon, May 7, 2018 at 2:36 AM, Nishanth Devarajan wrote:
> > net/sched: add gkprio scheduler
> >
> > Gkprio (Gatekeeper Priority Queue) is a queueing discipline that prioritizes
> > IPv4 and IPv6 packets according to their DSCP field. Although Gkprio can be
> > employed in any QoS scenario in which a higher DSCP field means a higher
> > priority packet, Gkprio was conceived as a solution for denial-of-service
> > defenses that need to route packets with different priorities.
>
> Can we give it a better name? "Gatekeeper" is meaningless if we read
> it alone, it ties to your Gatekeeper project which is more than just this
> kernel module. Maybe "DS Priority Queue"?
>

Yes, we should be able to come up with a better name, we'll work on it.

> Overall it looks good to me, just one thing below:
>
> > +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = {
> > +	.id		= "gkprio",
> > +	.priv_size	= sizeof(struct gkprio_sched_data),
> > +	.enqueue	= gkprio_enqueue,
> > +	.dequeue	= gkprio_dequeue,
> > +	.peek		= qdisc_peek_dequeued,
> > +	.init		= gkprio_init,
> > +	.reset		= gkprio_reset,
> > +	.change		= gkprio_change,
> > +	.dump		= gkprio_dump,
> > +	.destroy	= gkprio_destroy,
> > +	.owner		= THIS_MODULE,
> > +};
>
> You probably want to add Qdisc_class_ops here so that you can
> dump the stats of each internal queue.

Alright, will make some changes and send in a v2.

Thanks,
Nishanth
Re: [PATCH net-next] net:sched: add gkprio scheduler
On Mon, May 7, 2018 at 2:36 AM, Nishanth Devarajan wrote:
> net/sched: add gkprio scheduler
>
> Gkprio (Gatekeeper Priority Queue) is a queueing discipline that prioritizes
> IPv4 and IPv6 packets according to their DSCP field. Although Gkprio can be
> employed in any QoS scenario in which a higher DSCP field means a higher
> priority packet, Gkprio was conceived as a solution for denial-of-service
> defenses that need to route packets with different priorities.

Can we give it a better name? "Gatekeeper" is meaningless if we read it alone, it ties to your Gatekeeper project which is more than just this kernel module. Maybe "DS Priority Queue"?

Overall it looks good to me, just one thing below:

> +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = {
> +	.id		= "gkprio",
> +	.priv_size	= sizeof(struct gkprio_sched_data),
> +	.enqueue	= gkprio_enqueue,
> +	.dequeue	= gkprio_dequeue,
> +	.peek		= qdisc_peek_dequeued,
> +	.init		= gkprio_init,
> +	.reset		= gkprio_reset,
> +	.change		= gkprio_change,
> +	.dump		= gkprio_dump,
> +	.destroy	= gkprio_destroy,
> +	.owner		= THIS_MODULE,
> +};

You probably want to add Qdisc_class_ops here so that you can dump the stats of each internal queue.