Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
| Patch #3: | - |Renames s/tfrc_tx_hist/tfrc_tx_hist_slab/g; no problems. | | This was for consistency with the other slabs in DCCP: | | [EMAIL PROTECTED] net-2.6.25]$ find net/dccp/ -name *.c | xargs grep 'struct kmem_cache' Thanks, I will put the same naming scheme also in the test tree (tomorrow). | Patch #4: | - | Just a suggestion, it is possible to simplify this further, |by doing all the initialisation / cleanup in tfrc.c: | int __init {rx,tx}_packet_history_init() | { | tfrc_{rx,tx}_hist_slab = kmem_cache_create(tfrc_{rx,tx}_hist, ...); | return tfrc_{rx,tx}_hist_slab == NULL ? - ENOBUFS : 0; | } |and then call these successively in tfrc_module_init(). | | Idea here was to have each C source file to have a init module. Perhaps | we should try to break packet_history.c into tx_packet_history and | rx_packet_history.c. We can do that later to try to meet the goal of | being able to see what is being replaced. | I think this is a great idea, since then rx_packet_history.c could also take up all the internals of the RX packet history list, as it is currently done for the TX history, and it could also possibly incorporate. packet_history_internal.h. | | Patch #7: | - | | * tfrc_rx_hist_entry_data_packet() is not needed: | - a similar function, called dccp_data_packet(), was introduced in patch 2/7 | | I thought about that, but dccp_data_packet is for skbs, | tfrc_rx_hist_entry_data_packet is for tfrc_rx_hist_entries, I guess we | should just make dccp_data_packet receive the packet type and not an | object that has a packet type field. | The question which of the two interfaces is generally better to use is best left to you. Two functions doing almost the same thing can probably be replaced by just one. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
Note: removed Ingo from the CC list, I had added it first just because he advocated reducing the number of mailing lists, so I wanted him to know that we're trying to do that. Em Wed, Dec 05, 2007 at 10:27:36AM +, Gerrit Renker escreveu: Today being Wednesday, below is some feedback after working through the patch set. Hope this is helpful. Patch #1: - Several new good points are introduced: - IP_DCCP_TFRC_DEBUG is made dependent on IP_DCCP_TFRC_DEBUG which is useful - the select from CONFIG_IP_DCCP_CCID3 = CONFIG_DCCP_TFRC_LIB - the cleanup action in tfrc_module_init() (when packet_history_init() fails) was previously missing, this is a good catch. Also a note: tfrc_pr_debug() is not currently used (but may be later should the code common to both CCID3 and CCID4 be shared). OK Patches #2/#6: -- Separated from main patch, no problems (were initially submitted in this format). I wonder whether switching back to smaller patch sizes is better? Patches that do one thing that is logically separated from others are preferred, as its analisys is made faster. Patches that rewrite an existing functionality are done best when the functions being replaced keep as much as possible the same name. For instance, in the new RX handling code we need to alloc the RX hist internal structures, in the previous implementation we also needed to do that, it also needs to purge entries, as the old one needed. Previous one was a linked list, the new one is a ring. As we go we can and should add new APIs when needed, but trying to keep an API so that if in the future we decide to rework the internal structures, this can be done in a plugin fashion, without changing too much its users (CCIDs in this case), so that we can get back to the old one with minor disruption to the users if needed. Sometimes we manage to do that, some times we need to improve upon the current API, but the goal remains the same. Patch #3: - Renames s/tfrc_tx_hist/tfrc_tx_hist_slab/g; no problems. This was for consistency with the other slabs in DCCP: [EMAIL PROTECTED] net-2.6.25]$ find net/dccp/ -name *.c | xargs grep 'struct kmem_cache' net/dccp/ccids/lib/packet_history.c:static struct kmem_cache *tfrc_tx_hist_slab; net/dccp/ccids/lib/packet_history.c:static struct kmem_cache *tfrc_rx_hist_slab; net/dccp/ccids/lib/loss_interval.c:static struct kmem_cache *dccp_li_cachep __read_mostly; net/dccp/ackvec.c:static struct kmem_cache *dccp_ackvec_slab; net/dccp/ackvec.c:static struct kmem_cache *dccp_ackvec_record_slab; net/dccp/ccid.c:struct kmem_cache *slab; net/dccp/ccid.c:static void ccid_kmem_cache_destroy(struct kmem_cache *slab) [EMAIL PROTECTED] net-2.6.25]$ Patch #4: - packet_history_init() initialises both RX and TX history and is later called by the module_init() function in net/dccp/ccids/lib/tfrc.c. Just a suggestion, it is possible to simplify this further, by doing all the initialisation / cleanup in tfrc.c: int __init {rx,tx}_packet_history_init() { tfrc_{rx,tx}_hist_slab = kmem_cache_create(tfrc_{rx,tx}_hist, ...); return tfrc_{rx,tx}_hist_slab == NULL ? - ENOBUFS : 0; } and then call these successively in tfrc_module_init(). Idea here was to have each C source file to have a init module. Perhaps we should try to break packet_history.c into tx_packet_history and rx_packet_history.c. We can do that later to try to meet the goal of being able to see what is being replaced. Patch #5: - The naming scheme introduced here is s/dccp_rx/tfrc_rx/g; s/dccphrx/tfrchrx/g; I wonder, while at it, would it be possible to extend this and extend this to other parts of the code? Basically this is the same discussion as earlier on [EMAIL PROTECTED] with Leandro, who first came up with this naming scheme. There the same question came up and the result of the discussion was that a prefix of `tfrchrx' is cryptic; if something simpler is possible then it would be great. As we did with the ackvecs, will do that later. Patch #7: - * ccid3_hc_rx_detect_loss() can be fully removed since no other code references it any further. I left it to try have the diff not mixing up removal of its implementation with the implementation of the function just after it, ccid3_hx_rx_packet_recv. Will remove it later. * bytes_recv also counts the payload_size's of non-data packets, which is wrong (it should only sum up the sum of bytes transferred as data packets) As said to you in private message I'll get back to the way you wrote, as you mentioned that upcoming patches will make good use of that. I'll try to restrain me to not do too many changes. * loss handling is not correctly taken care of: unlike in the other part, both data and non-data packets are used to detect loss (this is
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
Today being Wednesday, below is some feedback after working through the patch set. Hope this is helpful. Patch #1: - Several new good points are introduced: - IP_DCCP_TFRC_DEBUG is made dependent on IP_DCCP_TFRC_DEBUG which is useful - the select from CONFIG_IP_DCCP_CCID3 = CONFIG_DCCP_TFRC_LIB - the cleanup action in tfrc_module_init() (when packet_history_init() fails) was previously missing, this is a good catch. Also a note: tfrc_pr_debug() is not currently used (but may be later should the code common to both CCID3 and CCID4 be shared). Patches #2/#6: -- Separated from main patch, no problems (were initially submitted in this format). I wonder whether switching back to smaller patch sizes is better? Patch #3: - Renames s/tfrc_tx_hist/tfrc_tx_hist_slab/g; no problems. Patch #4: - packet_history_init() initialises both RX and TX history and is later called by the module_init() function in net/dccp/ccids/lib/tfrc.c. Just a suggestion, it is possible to simplify this further, by doing all the initialisation / cleanup in tfrc.c: int __init {rx,tx}_packet_history_init() { tfrc_{rx,tx}_hist_slab = kmem_cache_create(tfrc_{rx,tx}_hist, ...); return tfrc_{rx,tx}_hist_slab == NULL ? - ENOBUFS : 0; } and then call these successively in tfrc_module_init(). Patch #5: - The naming scheme introduced here is s/dccp_rx/tfrc_rx/g; s/dccphrx/tfrchrx/g; I wonder, while at it, would it be possible to extend this and extend this to other parts of the code? Basically this is the same discussion as earlier on [EMAIL PROTECTED] with Leandro, who first came up with this naming scheme. There the same question came up and the result of the discussion was that a prefix of `tfrchrx' is cryptic; if something simpler is possible then it would be great. Patch #7: - * ccid3_hc_rx_detect_loss() can be fully removed since no other code references it any further. * bytes_recv also counts the payload_size's of non-data packets, which is wrong (it should only sum up the sum of bytes transferred as data packets) * loss handling is not correctly taken care of: unlike in the other part, both data and non-data packets are used to detect loss (this is not correctly handled in the current Linux implementation). * tfrc_rx_hist_entry_data_packet() is not needed: - a similar function, called dccp_data_packet(), was introduced in patch 2/7 - code compiles cleanly without tfrc_rx_hist_entry_data_packet() - all references to this function are deleted by patch 7/7 * is another header file (net/dccp/ccids/lib/packet_history_internal.h) really necessary? - net/dccp/ccids/lib has already 3 x header file, 4 x source file - with the removal of tfrc_rx_hist_entry_data_packet(), only struct tfrc_rx_hist_entry remains as the sole occupant of that file - how about hiding the internals of struct tfrc_rx_hist_entry by putting it into packet_history.c, as it was done previously in a similar way for the TX packet history? * in ccid3_hc_rx_insert_options(), due to hunk-shifting or something else, there is still the call to dccp_insert_option_timestamp(sk, skb) -- this was meant to be removed by an earlier patch (which also removed the Elapsed Time option); -- in the original submission of this patch the call to dccp_insert_option_timestamp() did no longer appear (as can be found in the [EMAIL PROTECTED] mailing list archives), and the test tree likewise does not use it; -- it can be removed with full confidence since no part of the code uses timestamps sent by the HC-receiver (only the HC-sender timestamps are used); and it is further not required by the spec to send HC-receiver timestamps (RFC 4342, section 6) * one of the two variables ccid3hcrx_tstamp_last_feedback and ccid3hcrx_tstamp_last_ack is redundant and can be removed (this may be part of a later patch); the variable ccid3hcrx_tstamp_last_feedback is very long (function is clear due to type of ktime_t). * the inlines are a good idea regarding type safety, but I take it that we can now throw overboard the old rule of 80 characters per line? Due to the longer names of the inlines, some expressions end at column 98 (cf. tfrc_rx_hist_sample_rtt(); but to be honest I'd rather get rid of that function since the receiver-RTT sampling is notoriously inaccurate (wrong place to measure) and then there is little left to argue with the inlines). * with regard to RX history initialisation, would you be amicable to the idea of performing the RX/TX history, and loss intervals history in tfrc.c, as suggested for patch 1/7 (shortens the routines)? * tfrc_rx_hist_entry is lacking documentation (my fault, had been forgotten in the initial submission): /** *
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
On 12/3/07, Arnaldo Carvalho de Melo [EMAIL PROTECTED] wrote: WARNING: After reading some messages from Ingo Molnar on lkml I think we should really trim the number of lists we use for kernel development. And since I moved back to using mutt for reading e-mails, something I should have never, ever stopped doing, I guess we should move the DCCP discussions to netdev, where we hopefully can get more people interested and reviewing the work we do, so please consider moving DCCP discussion to netdev@vger.kernel.org, where lots of smart networking folks are present and can help our efforts on turning RFCs to code. I (and others too) don't necessarily have time to read netdev so would vote on keeping dccp. I would totally agree to making sure that cross-post to netdev as well as dccp. Ian -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
Hi Arnaldo, hank you for going through this. I have just backported your recent patches of 2.6.25 to the DCCP/CCID4/Faster Restart test tree at git://eden-feed.erg.abdn.ac.uk/dccp_exp {dccp,ccid4,dccp_fr} as per subsequent message. |do, so please consider moving DCCP discussion to netdev@vger.kernel.org, |where lots of smart networking folks are present and can help our efforts |on turning RFCs to code. Are you suggesting using netdev exclusively or in addition to [EMAIL PROTECTED] | Please take a look at this patch series where I reorganized your work on the new | TFRC rx history handling code. I'll wait for your considerations and then do as many | interactions as reasonable to get your work merged. | | It should be completely equivalent, plus some fixes and optimizations, such as: It will be necessary to address these points one-by-one. Before diving into making fixes and `optimisations', have you tested your code? The patches you are referring to have been posted and re-posted for a period of over 9 months on [EMAIL PROTECTED], and there are regression tests which show that this code improves on the existing Linux implementation. These are labelled as `test tree' on http://www.linux-foundation.org/en/Net:DCCP_Testing#Regression_testing So if you are making changes to the code, I would like to ask if you have run similar regression tests, to avoid having to step back later. | . The code that allocates the RX ring deals with failures when one of the entries in | the ring buffer is not successfully allocated, the original code was leaking the | successfully allocated entries. | | . We do just one allocation for the ring buffer, as the number of entries is fixed we | should just do one allocation and not TFRC_NDUPACK times. Will look at the first point in the patch; with regard to the second point I agree, this will make the code simpler, which is good. | . I haven't checked if all the code was commited, as I tried to introduce just what was | immediatelly used, probably we'll need to do some changes when working on the merge | of your loss intervals code. Sorry I don't understand this point. | . I changed the ccid3_hc_rx_packet_recv code to set hcrx-ccid3hcrx_s for the first | non-data packet instead of calling ccid3_hc_rx_set_state, that would use 0 as the | initial value in the EWMA calculation. This is a misunderstanding. Non-data packets are not considered in the moving average for the data packet size `s'; and it would be an error to do (consider 40byte Acks vs. 1460byte data packets, also it is in RFC 4342). Where would the zero initial value come from? I think this is also a misunderstanding. Please have a look below: static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb) { // ... u32 sample, payload_size = skb-len - dccp_hdr(skb)-dccph_doff * 4; if (unlikely(hcrx-ccid3hcrx_state == TFRC_RSTATE_NO_DATA)) { if (is_data_packet) { do_feedback = FBACK_INITIAL; ccid3_hc_rx_set_state(sk, TFRC_RSTATE_DATA); ccid3_hc_rx_update_s(hcrx, payload_size); } goto update_records; } == Non-data packets are ignored for the purposes of computing s (this is in the RFC), consequently update_s() is only called for data packets; using the two following functions: static inline u32 tfrc_ewma(const u32 avg, const u32 newval, const u8 weight) { return avg ? (weight * avg + (10 - weight) * newval) / 10 : newval; } static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, int len) { if (likely(len 0))/* don't update on empty packets (e.g. ACKs) */ hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 9); } == Hence I can't see where a zero value should come from: ccid3hrx_s is initially initialised with zero (memset(...,0,...)); when first called, update_s() will feed a non-zero payload size to tfrc_ewma(), which will return `newval' = payload_size, hence the first data packet will contribute a non-zero payload_size. Zero-sized DCCP-Data packets are pathological and are ignored by the CCID calculations (not by the receiver); a corresponding counterpart for zero-sized | | It is available at: | | master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25 | Need to do this separately. As said, the code has been developed and tested over a long time, it took a long while until it acted predictably, so being careful is very important. I would rather not have my patches merged and continue to run a test tree if the current changes alter the behaviour to the worse. -- To unsubscribe
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
Em Mon, Dec 03, 2007 at 08:35:12AM +, Gerrit Renker escreveu: Hi Arnaldo, hank you for going through this. I have just backported your recent patches of 2.6.25 to the DCCP/CCID4/Faster Restart test tree at git://eden-feed.erg.abdn.ac.uk/dccp_exp {dccp,ccid4,dccp_fr} as per subsequent message. | do, so please consider moving DCCP discussion to netdev@vger.kernel.org, | where lots of smart networking folks are present and can help our efforts | on turning RFCs to code. Are you suggesting using netdev exclusively or in addition to [EMAIL PROTECTED] Well, since at least one person that has contributed significantly in the past has said he can't cope with traffic on netdev, we can CC [EMAIL PROTECTED] | Please take a look at this patch series where I reorganized your work on the new | TFRC rx history handling code. I'll wait for your considerations and then do as many | interactions as reasonable to get your work merged. | | It should be completely equivalent, plus some fixes and optimizations, such as: It will be necessary to address these points one-by-one. Before diving into making fixes and `optimisations', have you tested your code? The patches you are referring to I have performed basic tests, and when doing so noticed a bug in inet_diag, that I commented with Herbert Xu and he has since provided a fix. have been posted and re-posted for a period of over 9 months on [EMAIL PROTECTED], and Being posted and re-posted does not guarantee that the patch is OK or that is in a form that is acceptable to all tree maintainers. DaveM is subscribed to [EMAIL PROTECTED] and could have picked this if he had the time to do the review. I didn't, but now I'm trying to spend my time on reviewing your patches and in the process doing modifications I find appropriate while trying hard not to introduce bugs in your hard work to get it merged. there are regression tests which show that this code improves on the existing Linux implementation. These are labelled as `test tree' on http://www.linux-foundation.org/en/Net:DCCP_Testing#Regression_testing So if you are making changes to the code, I would like to ask if you have run similar regression tests, to avoid having to step back later. Fair enough, I will do that before asking Herbert or Dave to pull from my tree. | . The code that allocates the RX ring deals with failures when one of the entries in | the ring buffer is not successfully allocated, the original code was leaking the | successfully allocated entries. Sorry for not point out exactly this, here it goes: Your original patch: +int tfrc_rx_hist_init(struct tfrc_rx_hist *h) +{ + int i; + + for (i = 0; i = NDUPACK; i++) { + h-ring[i] = kmem_cache_alloc(tfrc_rxh_cache, GFP_ATOMIC); + if (h-ring[i] == NULL) + return 1; + } + h-loss_count = 0; + h-loss_start = 0; + return 0; +} +EXPORT_SYMBOL_GPL(tfrc_rx_hist_init); Then, in ccid3_hc_rx_init you do: static int ccid3_hc_rx_init(struct ccid *ccid, struct sock *sk) { struct ccid3_hc_rx_sock *hcrx = ccid_priv(ccid); ccid3_pr_debug(entry\n); hcrx-ccid3hcrx_state = TFRC_RSTATE_NO_DATA; tfrc_lh_init(hcrx-ccid3hcrx_li_hist); return tfrc_rx_hist_init(hcrx-ccid3hcrx_hist); } So if tfrc_rx_hist_init fail to allocate, say, the last entry in the ring, it will return 1, and from looking at how you initialize h-loss_{count,start} to zero I assumed that the whole tfrc_rx_hist is not zeroed when tfrc_rx_hist_init is called, so one can also assume that the ring entries are not initialized to NULL and that if the error recovery is to assume that later on tfrc_rx_hist_cleanup is called we would not have a leak, but an OOPS when tfrc_rx_hist_cleanup tries to call kfree_cache_free on the uninitialized ring entries. But if you look at ccid_new(), that calls ccid3_hc_rx_init(), you'll see that when the ccid rx or hx routine fails, it just frees the struct ccid with the area where h-ring is, so, what was not cleaned up by the ccid init routine is effectively forgot, leaked. I first did the cleanup at tfrc_rx_hist_init (that I renamed to tfrc_rx_hist_alloc, since it doesn't just initializes things, but allocates entries from slab), but then I just made the rx history slab have arrays of tfrc_rx_hist_entry objects, not individual ones as your code always allocates them as arrays. | . We do just one allocation for the ring buffer, as the number of entries is fixed we | should just do one allocation and not TFRC_NDUPACK times. Will look at the first point in the patch; with regard to the second point I agree, this will make the code simpler, which is good. good | . I haven't checked if all the code was commited, as I tried to introduce just what was | immediatelly used, probably we'll need to do some changes when working on the merge
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
| Are you suggesting using netdev exclusively or in addition to [EMAIL PROTECTED] | | Well, since at least one person that has contributed significantly in | the past has said he can't cope with traffic on netdev, we can CC | [EMAIL PROTECTED] I have a similar problem with the traffic but agree and will copy as well. | have been posted and re-posted for a period of over 9 months on [EMAIL PROTECTED], and | | Being posted and re-posted does not guarantee that the patch is OK or | that is in a form that is acceptable to all tree maintainers. With the first point I couldn't agree more, but this is not really what I meant - the point was that despite posting and re-posting there was often silence. And now there is feedback, in form of a patchset made by you; and all that I am asking for is just to be given the time to look that through. Last time a RFC patch appeared at 3pm and was checked in the same evening (only found out next morning). With your experience and long background as a maintainer maybe you expect quicker turn-around times, but I also had waited patiently until you had had a chance to review the stuff I sent. | | . The code that allocates the RX ring deals with failures when one of the entries in | | the ring buffer is not successfully allocated, the original code was leaking the | | successfully allocated entries. | | Sorry for not point out exactly this, here it goes: | | Your original patch: | | +int tfrc_rx_hist_init(struct tfrc_rx_hist *h) | +{ | + int i; | + | + for (i = 0; i = NDUPACK; i++) { | + h-ring[i] = kmem_cache_alloc(tfrc_rxh_cache, GFP_ATOMIC); | + if (h-ring[i] == NULL) | + return 1; | + } | + h-loss_count = 0; | + h-loss_start = 0; | + return 0; | +} | +EXPORT_SYMBOL_GPL(tfrc_rx_hist_init); | I expected this and it actually was very clear from your original message. I fully back up your solution in this point, see below. But my question above was rather: are there any other bugs rather than the above leakage, which is what the previous email seemed to indicate? With regard to your solution - you are using int tfrc_rx_hist_alloc(struct tfrc_rx_hist *h) { h-ring = kmem_cache_alloc(tfrc_rx_hist_slab, GFP_ATOMIC); h-loss_count = h-loss_start = 0; return h-ring == NULL; } which is better not only for one but for two reasons. It solves the leakage and in addition makes the entire code simpler. Fully agreed. | | | . I haven't checked if all the code was commited, as I tried to introduce just what was | | immediatelly used, probably we'll need to do some changes when working on the merge | | of your loss intervals code. | Sorry I don't understand this point. | | Let me check now and tell you for sure: | | tfrc_rx_hist_delta_seqno and tfrc_rx_hist_swap were not included, as | they were not used, we should introduce them later, when getting to the | working on the loss interval code. Ah thanks, that was really not clear. Just beginning to work through the set. | static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb) | { | // ... | u32 sample, payload_size = skb-len - dccp_hdr(skb)-dccph_doff * 4; | | if (unlikely(hcrx-ccid3hcrx_state == TFRC_RSTATE_NO_DATA)) { | if (is_data_packet) { | do_feedback = FBACK_INITIAL; | ccid3_hc_rx_set_state(sk, TFRC_RSTATE_DATA); | ccid3_hc_rx_update_s(hcrx, payload_size); | } | goto update_records; | } | | == Non-data packets are ignored for the purposes of computing s (this is in the RFC), | consequently update_s() is only called for data packets; using the two following | functions: | | | static inline u32 tfrc_ewma(const u32 avg, const u32 newval, const u8 weight) | { | return avg ? (weight * avg + (10 - weight) * newval) / 10 : newval; | } | | I hadn't considered that tfrc_ewma would for every packet check if the | avg was 0 and I find it suboptimal now that I look at it, we are just | feeding data packets, no? Yes exactly, only data packets are used for s. | static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, int len) | { | if (likely(len 0))/* don't update on empty packets (e.g. ACKs) */ | hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 9); | } | | And we also just do test for len 0 in update_s, that looks like | also excessive, no? Hm, I think we need to make it robust against API bugs and/or zero-sized data packets. The check `len 0' may seem redundant but it catches such a condition. For a moving average an accidental zero value will have quite an impact on the overall value. In CCID3 it
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
Em Mon, Dec 03, 2007 at 01:49:47PM +, Gerrit Renker escreveu: | Are you suggesting using netdev exclusively or in addition to [EMAIL PROTECTED] | | Well, since at least one person that has contributed significantly in | the past has said he can't cope with traffic on netdev, we can CC | [EMAIL PROTECTED] I have a similar problem with the traffic but agree and will copy as well. | have been posted and re-posted for a period of over 9 months on [EMAIL PROTECTED], and | | Being posted and re-posted does not guarantee that the patch is OK or | that is in a form that is acceptable to all tree maintainers. With the first point I couldn't agree more, but this is not really what I meant - the point was that despite posting and re-posting there was often silence. And now there is feedback, in form of a patchset made by you; and all that I am asking for is just to be given the time to look that through. Last time a RFC patch appeared at 3pm and was checked in the same evening (only found out next morning). I was too optimistic about that one, feeling that it was safe, sorry about that, will avoid doing that in the future. With your experience and long background as a maintainer maybe you expect quicker turn-around times, but I also had waited patiently until you had had a chance to review the stuff I sent. Agreed, my bad, will be more patient with my side as you've been with yours :-) | | . The code that allocates the RX ring deals with failures when one of the entries in | | the ring buffer is not successfully allocated, the original code was leaking the | | successfully allocated entries. | | Sorry for not point out exactly this, here it goes: | | Your original patch: | | +int tfrc_rx_hist_init(struct tfrc_rx_hist *h) | +{ | + int i; | + | + for (i = 0; i = NDUPACK; i++) { | + h-ring[i] = kmem_cache_alloc(tfrc_rxh_cache, GFP_ATOMIC); | + if (h-ring[i] == NULL) | + return 1; | + } | + h-loss_count = 0; | + h-loss_start = 0; | + return 0; | +} | +EXPORT_SYMBOL_GPL(tfrc_rx_hist_init); | I expected this and it actually was very clear from your original message. I fully back up your solution in this point, see below. But my question above was rather: are there any other bugs rather than the above leakage, which is what the previous email seemed to indicate? With regard to your solution - you are using int tfrc_rx_hist_alloc(struct tfrc_rx_hist *h) { h-ring = kmem_cache_alloc(tfrc_rx_hist_slab, GFP_ATOMIC); h-loss_count = h-loss_start = 0; return h-ring == NULL; } which is better not only for one but for two reasons. It solves the leakage and in addition makes the entire code simpler. Fully agreed. good | | | . I haven't checked if all the code was commited, as I tried to introduce just what was | | immediatelly used, probably we'll need to do some changes when working on the merge | | of your loss intervals code. | Sorry I don't understand this point. | | Let me check now and tell you for sure: | | tfrc_rx_hist_delta_seqno and tfrc_rx_hist_swap were not included, as | they were not used, we should introduce them later, when getting to the | working on the loss interval code. Ah thanks, that was really not clear. Just beginning to work through the set. great |static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb) |{ |// ... |u32 sample, payload_size = skb-len - dccp_hdr(skb)-dccph_doff * 4; | |if (unlikely(hcrx-ccid3hcrx_state == TFRC_RSTATE_NO_DATA)) { |if (is_data_packet) { |do_feedback = FBACK_INITIAL; |ccid3_hc_rx_set_state(sk, TFRC_RSTATE_DATA); |ccid3_hc_rx_update_s(hcrx, payload_size); |} |goto update_records; |} | | == Non-data packets are ignored for the purposes of computing s (this is in the RFC), | consequently update_s() is only called for data packets; using the two following | functions: | | |static inline u32 tfrc_ewma(const u32 avg, const u32 newval, const u8 weight) |{ |return avg ? (weight * avg + (10 - weight) * newval) / 10 : newval; |} | | I hadn't considered that tfrc_ewma would for every packet check if the | avg was 0 and I find it suboptimal now that I look at it, we are just | feeding data packets, no? Yes exactly, only data packets are used for s. |static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, int len) |{ |if (likely(len 0))/* don't update on empty packets (e.g. ACKs) */ |hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 9); |} | | And we also
Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches
| | static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, int len) | | { | | if (likely(len 0))/* don't update on empty packets (e.g. ACKs) */ | | hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 9); | | } | | | | And we also just do test for len 0 in update_s, that looks like | | also excessive, no? | Hm, I think we need to make it robust against API bugs and/or zero-sized | data packets. The check `len 0' may seem redundant but it catches such | a condition. For a moving average an accidental zero value will have | quite an impact on the overall value. In CCID3 it is | | x_new = 0.9 * x_old + 0.1 * len | | So if len is accidentally 0 where it should not be, then each time the | moving average is reduced to 90%. | | So we must make it a BUG_ON, not something that is to be always present. | I think it should be a warning condition since it can be triggered when the remote party sends zero-sized packets. It may be good to log this into the syslog to warn about possibly misbehaving apps/peers/remote stacks. | As a comparison - the entire patch set took about a full month to do. | But that meant I am reasonably sure the algorithm is sound and can cope | with problematic conditions. | | And from what I saw so far that is my impression too, if you look at | what I'm doing it is: | | 1. go thru the whole patch trying to understand hunk by hunk You are doing a great job - in particular as it really is a lot of material. | 2. do consistency changes (add namespace prefixes) | 3. reorganize the code to look more like what is already there, we |both have different backgrounds and tastes about how code should |be written, so its only normal that if we want to keep the code |consistent, and I want that, I jump into things I think should be |reworded, while trying to keep the algorithm expressed by you. | Agree, that is not always easy to get right. I try to stick as close as possible to existing conventions but of course that is my interpretation, so I am already anticipating such changes/comments here. | think about further automatization on regression testing. | If it is of any use, some scripts and setups are at the bottom of the page at http://www.erg.abdn.ac.uk/users/gerrit/dccp/testing_dccp/ -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html