Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-05 Thread Gerrit Renker
|  Patch #3: 
|  -
|Renames s/tfrc_tx_hist/tfrc_tx_hist_slab/g; no problems.
| 
| This was for consistency with the other slabs in DCCP:
| 
| [EMAIL PROTECTED] net-2.6.25]$ find net/dccp/ -name *.c | xargs grep 
'struct kmem_cache'
Thanks, I will put the same naming scheme also in the test tree (tomorrow).

 
|  Patch #4:
|  -
|  Just a suggestion, it is possible to simplify this further,
|by doing all the initialisation / cleanup in tfrc.c:
|  int __init {rx,tx}_packet_history_init()
|  {
|  tfrc_{rx,tx}_hist_slab = kmem_cache_create(tfrc_{rx,tx}_hist, 
...);
|  return tfrc_{rx,tx}_hist_slab == NULL ? - ENOBUFS : 0;
|  }
|and then call these successively in tfrc_module_init().
| 
| Idea here was to have each C source file to have a init module. Perhaps
| we should try to break packet_history.c into tx_packet_history and
| rx_packet_history.c. We can do that later to try to meet the goal of
| being able to see what is being replaced.
|
I think this is a great idea, since then rx_packet_history.c could also
take up all the internals of the RX packet history list, as it is
currently done for the TX history, and it could also possibly
incorporate. packet_history_internal.h.


|  
|  Patch #7:
|  -
|  
|   * tfrc_rx_hist_entry_data_packet() is not needed:
| - a similar function, called dccp_data_packet(), was introduced in patch 
2/7
| 
| I thought about that, but dccp_data_packet is for skbs,
| tfrc_rx_hist_entry_data_packet is for tfrc_rx_hist_entries, I guess we
| should just make dccp_data_packet receive the packet type and not an
| object that has a packet type field.
| 
The question which of the two interfaces is generally better to use is
best left to you. Two functions doing almost the same thing can probably
be replaced by just one.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-05 Thread Arnaldo Carvalho de Melo
Note: removed Ingo from the CC list, I had added it first just because
he advocated reducing the number of mailing lists, so I wanted him to
know that we're trying to do that.

Em Wed, Dec 05, 2007 at 10:27:36AM +, Gerrit Renker escreveu:
 Today being Wednesday, below is some feedback after working through the patch 
 set.
 Hope this is helpful. 
 
 Patch #1:
 -
   Several new good points are introduced:
   - IP_DCCP_TFRC_DEBUG is made dependent on IP_DCCP_TFRC_DEBUG which is useful
   - the select from CONFIG_IP_DCCP_CCID3 = CONFIG_DCCP_TFRC_LIB
   - the cleanup action in tfrc_module_init() (when packet_history_init() 
 fails)
 was previously missing, this is a good catch.
   Also a note: tfrc_pr_debug() is not currently used (but may be later should 
 the 
code common to both CCID3 and CCID4 be shared).

OK
  
 Patches #2/#6:
 --
   Separated from main patch, no problems (were initially submitted in this 
 format).
   I wonder whether switching back to smaller patch sizes is better?

Patches that do one thing that is logically separated from others are
preferred, as its analisys is made faster.

Patches that rewrite an existing functionality are done best when the
functions being replaced keep as much as possible the same name. For
instance, in the new RX handling code we need to alloc the RX hist
internal structures, in the previous implementation we also needed to do
that, it also needs to purge entries, as the old one needed. Previous
one was a linked list, the new one is a ring. As we go we can and should
add new APIs when needed, but trying to keep an API so that if in the
future we decide to rework the internal structures, this can be done in
a plugin fashion, without changing too much its users (CCIDs in this
case), so that we can get back to the old one with minor disruption to
the users if needed. Sometimes we manage to do that, some times we need
to improve upon the current API, but the goal remains the same.
 
 Patch #3: 
 -
   Renames s/tfrc_tx_hist/tfrc_tx_hist_slab/g; no problems.

This was for consistency with the other slabs in DCCP:

[EMAIL PROTECTED] net-2.6.25]$ find net/dccp/ -name *.c | xargs grep 'struct 
kmem_cache'
net/dccp/ccids/lib/packet_history.c:static struct kmem_cache *tfrc_tx_hist_slab;
net/dccp/ccids/lib/packet_history.c:static struct kmem_cache *tfrc_rx_hist_slab;
net/dccp/ccids/lib/loss_interval.c:static struct kmem_cache *dccp_li_cachep 
__read_mostly;
net/dccp/ackvec.c:static struct kmem_cache *dccp_ackvec_slab;
net/dccp/ackvec.c:static struct kmem_cache *dccp_ackvec_record_slab;
net/dccp/ccid.c:struct kmem_cache *slab;
net/dccp/ccid.c:static void ccid_kmem_cache_destroy(struct kmem_cache *slab)
[EMAIL PROTECTED] net-2.6.25]$

 Patch #4:
 -
   packet_history_init() initialises both RX and TX history and is later 
 called by the module_init()
   function in net/dccp/ccids/lib/tfrc.c. Just a suggestion, it is possible to 
 simplify this further,
   by doing all the initialisation / cleanup in tfrc.c:
 int __init {rx,tx}_packet_history_init()
 {
   tfrc_{rx,tx}_hist_slab = kmem_cache_create(tfrc_{rx,tx}_hist, 
 ...);
   return tfrc_{rx,tx}_hist_slab == NULL ? - ENOBUFS : 0;
 }
   and then call these successively in tfrc_module_init().

Idea here was to have each C source file to have a init module. Perhaps
we should try to break packet_history.c into tx_packet_history and
rx_packet_history.c. We can do that later to try to meet the goal of
being able to see what is being replaced.
 
 Patch #5:
 -
   The naming scheme introduced here is 
 s/dccp_rx/tfrc_rx/g; 
 s/dccphrx/tfrchrx/g;
   I wonder, while at it, would it be possible to extend this and extend this 
 to other parts
   of the code? Basically this is the same discussion as earlier on [EMAIL 
 PROTECTED] with Leandro,
   who first came up with this naming scheme. There the same question came up 
 and the result
   of the discussion was that a prefix of  `tfrchrx' is cryptic; if something 
 simpler is 
   possible then it would be great.

As we did with the ackvecs, will do that later.
 
 Patch #7:
 -
  * ccid3_hc_rx_detect_loss() can be fully removed since no other code 
 references it any
further.

I left it to try have the diff not mixing up removal of its
implementation with the implementation of the function just after it,
ccid3_hx_rx_packet_recv. Will remove it later.

  * bytes_recv also counts the payload_size's of non-data packets, which is 
 wrong (it should only
sum up the sum of bytes transferred as data packets)

As said to you in private message I'll get back to the way you wrote, as
you mentioned that upcoming patches will make good use of that.

I'll try to restrain me to not do too many changes.
 
  * loss handling is not correctly taken care of: unlike in the other part, 
 both data and non-data
packets are used to detect loss (this is 

Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-05 Thread Gerrit Renker
Today being Wednesday, below is some feedback after working through the patch 
set.
Hope this is helpful. 

Patch #1:
-
  Several new good points are introduced:
  - IP_DCCP_TFRC_DEBUG is made dependent on IP_DCCP_TFRC_DEBUG which is useful
  - the select from CONFIG_IP_DCCP_CCID3 = CONFIG_DCCP_TFRC_LIB
  - the cleanup action in tfrc_module_init() (when packet_history_init() fails)
was previously missing, this is a good catch.
  Also a note: tfrc_pr_debug() is not currently used (but may be later should 
the 
   code common to both CCID3 and CCID4 be shared).
  
Patches #2/#6:
--
  Separated from main patch, no problems (were initially submitted in this 
format).
  I wonder whether switching back to smaller patch sizes is better?

Patch #3: 
-
  Renames s/tfrc_tx_hist/tfrc_tx_hist_slab/g; no problems.

Patch #4:
-
  packet_history_init() initialises both RX and TX history and is later called 
by the module_init()
  function in net/dccp/ccids/lib/tfrc.c. Just a suggestion, it is possible to 
simplify this further,
  by doing all the initialisation / cleanup in tfrc.c:
int __init {rx,tx}_packet_history_init()
{
tfrc_{rx,tx}_hist_slab = kmem_cache_create(tfrc_{rx,tx}_hist, 
...);
return tfrc_{rx,tx}_hist_slab == NULL ? - ENOBUFS : 0;
}
  and then call these successively in tfrc_module_init().

Patch #5:
-
  The naming scheme introduced here is 
  s/dccp_rx/tfrc_rx/g; 
  s/dccphrx/tfrchrx/g;
  I wonder, while at it, would it be possible to extend this and extend this to 
other parts
  of the code? Basically this is the same discussion as earlier on [EMAIL 
PROTECTED] with Leandro,
  who first came up with this naming scheme. There the same question came up 
and the result
  of the discussion was that a prefix of  `tfrchrx' is cryptic; if something 
simpler is 
  possible then it would be great.

Patch #7:
-
 * ccid3_hc_rx_detect_loss() can be fully removed since no other code 
references it any
   further.
 * bytes_recv also counts the payload_size's of non-data packets, which is 
wrong (it should only
   sum up the sum of bytes transferred as data packets)  

 * loss handling is not correctly taken care of: unlike in the other part, both 
data and non-data
   packets are used to detect loss (this is not correctly handled in the 
current Linux implementation).


 * tfrc_rx_hist_entry_data_packet() is not needed:
   - a similar function, called dccp_data_packet(), was introduced in patch 2/7
   - code compiles cleanly without tfrc_rx_hist_entry_data_packet()
   - all references to this function are deleted by patch 7/7
 * is another header file (net/dccp/ccids/lib/packet_history_internal.h) really 
necessary?
   - net/dccp/ccids/lib has already 3 x header file, 4 x source file
   - with the removal of tfrc_rx_hist_entry_data_packet(), only struct 
tfrc_rx_hist_entry
 remains as the sole occupant of that file
   - how about hiding the internals of struct tfrc_rx_hist_entry by putting it 
into packet_history.c,
 as it was done previously in a similar way for the TX packet history?

 * in ccid3_hc_rx_insert_options(), due to hunk-shifting or something else, 
there is still the
   call to dccp_insert_option_timestamp(sk, skb)
   -- this was meant to be removed by an earlier patch (which also removed the 
Elapsed Time option);
   -- in the original submission of this patch the call to 
dccp_insert_option_timestamp() did no
   longer appear (as can be found in the [EMAIL PROTECTED] mailing list 
archives), and the test tree
   likewise does not use it;
   -- it can be removed with full confidence since no part of the code uses 
timestamps sent by the
   HC-receiver (only the HC-sender timestamps are used); and it is further 
not required by the
   spec to send HC-receiver timestamps (RFC 4342, section 6)

 * one of the two variables ccid3hcrx_tstamp_last_feedback and 
ccid3hcrx_tstamp_last_ack is
   redundant and can be removed (this may be part of a later patch); the 
variable ccid3hcrx_tstamp_last_feedback
   is very long (function is clear due to type of ktime_t).


 * the inlines are a good idea regarding type safety, but I take it that we can 
now throw overboard the old rule
   of 80 characters per line? Due to the longer names of the inlines, some 
expressions end at column 98 (cf. 
   tfrc_rx_hist_sample_rtt(); but to be honest I'd rather get rid of that 
function since the receiver-RTT
   sampling is notoriously inaccurate (wrong place to measure) and then there 
is little left to argue with the inlines).

 * with regard to RX history initialisation, would you be amicable to the idea 
of performing the RX/TX history, and
   loss intervals history in tfrc.c, as suggested for patch 1/7 (shortens the 
routines)?


 * tfrc_rx_hist_entry is lacking documentation (my fault, had been forgotten in 
the initial submission):
/** 
 * 

Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-03 Thread Ian McDonald
On 12/3/07, Arnaldo Carvalho de Melo [EMAIL PROTECTED] wrote:
 WARNING: After reading some messages from Ingo Molnar on lkml I think we 
 should really
  trim the number of lists we use for kernel development. And since I 
 moved
  back to using mutt for reading e-mails, something I should have 
 never, ever
  stopped doing, I guess we should move the DCCP discussions to netdev,
  where we hopefully can get more people interested and reviewing the 
 work we
  do, so please consider moving DCCP discussion to 
 netdev@vger.kernel.org,
  where lots of smart networking folks are present and can help our 
 efforts
  on turning RFCs to code.

I (and others too) don't necessarily have time to read netdev so would
vote on keeping dccp. I would totally agree to making sure that
cross-post to netdev as well as dccp.

Ian
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-03 Thread Gerrit Renker
Hi Arnaldo,

hank you for going through this. I have just backported your recent patches of 
2.6.25
to the DCCP/CCID4/Faster Restart test tree at 
git://eden-feed.erg.abdn.ac.uk/dccp_exp {dccp,ccid4,dccp_fr}
as per subsequent message.
|do, so please consider moving DCCP discussion to 
netdev@vger.kernel.org,
|where lots of smart networking folks are present and can help our 
efforts
|on turning RFCs to code.
Are you suggesting using netdev exclusively or in addition to [EMAIL PROTECTED]
  

|   Please take a look at this patch series where I reorganized your work 
on the new
| TFRC rx history handling code. I'll wait for your considerations and then do 
as many
| interactions as reasonable to get your work merged.
| 
|   It should be completely equivalent, plus some fixes and optimizations, 
such as:
It will be necessary to address these points one-by-one. Before diving into 
making
fixes and `optimisations', have you tested your code? The patches you are 
referring to
have been posted and re-posted for a period of over 9 months on [EMAIL 
PROTECTED], and
there are regression tests which show that this code improves on the existing 
Linux
implementation. These are labelled as `test tree' on 
http://www.linux-foundation.org/en/Net:DCCP_Testing#Regression_testing
So if you are making changes to the code, I would like to ask if you have run 
similar
regression tests, to avoid having to step back later.


 
| . The code that allocates the RX ring deals with failures when one of the 
entries in
|   the ring buffer is not successfully allocated, the original code was 
leaking the
|   successfully allocated entries.
| 
| . We do just one allocation for the ring buffer, as the number of entries is 
fixed we
|   should just do one allocation and not TFRC_NDUPACK times.
Will look at the first point in the patch; with regard to the second point I 
agree, this
will make the code simpler, which is good. 

| . I haven't checked if all the code was commited, as I tried to introduce 
just what was
|   immediatelly used, probably we'll need to do some changes when working on 
the merge
|   of your loss intervals code.
Sorry I don't understand this point.

| . I changed the ccid3_hc_rx_packet_recv code to set hcrx-ccid3hcrx_s for the 
first
|   non-data packet instead of calling ccid3_hc_rx_set_state, that would use 0 
as the
|   initial value in the EWMA calculation.
This is a misunderstanding. Non-data packets are not considered in the moving 
average
for the data packet size `s'; and it would be an error to do (consider 40byte 
Acks vs.
1460byte data packets, also it is in RFC 4342). 
Where would the zero initial value come from? I think this is also a 
misunderstanding.
Please have a look below:
static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff 
*skb)
{
// ...
u32 sample, payload_size = skb-len - dccp_hdr(skb)-dccph_doff 
* 4;

if (unlikely(hcrx-ccid3hcrx_state == TFRC_RSTATE_NO_DATA)) {
if (is_data_packet) {
do_feedback = FBACK_INITIAL;
ccid3_hc_rx_set_state(sk, TFRC_RSTATE_DATA);
ccid3_hc_rx_update_s(hcrx, payload_size);
}
goto update_records;
}

== Non-data packets are ignored for the purposes of computing s (this is in 
the RFC),
consequently update_s() is only called for data packets; using the two 
following
functions:


static inline u32 tfrc_ewma(const u32 avg, const u32 newval, const u8 
weight)
{
return avg ? (weight * avg + (10 - weight) * newval) / 10 : 
newval;
}

static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, 
int len)
{
if (likely(len  0))/* don't update on empty packets (e.g. 
ACKs) */
hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 
9);
}

== Hence I can't see where a zero value should come from: ccid3hrx_s is 
initially 
initialised with zero (memset(...,0,...)); when first called, update_s() 
will
feed a non-zero payload size to tfrc_ewma(), which will return  `newval' = 
payload_size,
hence the first data packet will contribute a non-zero payload_size.
Zero-sized DCCP-Data packets are pathological and are ignored by the CCID 
calculations
(not by the receiver); a corresponding counterpart for zero-sized

| 
|   It is available at:
| 
| master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25
| 
Need to do this separately. As said, the code has been developed and tested 
over a long time,
it took a long while until it acted predictably, so being careful is very 
important.

I would rather not have my patches merged and continue to run a test tree if 
the current
changes alter the behaviour to the worse.
--
To unsubscribe 

Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-03 Thread Arnaldo Carvalho de Melo
Em Mon, Dec 03, 2007 at 08:35:12AM +, Gerrit Renker escreveu:
 Hi Arnaldo,
 
 hank you for going through this. I have just backported your recent patches 
 of 2.6.25
 to the DCCP/CCID4/Faster Restart test tree at 
   git://eden-feed.erg.abdn.ac.uk/dccp_exp {dccp,ccid4,dccp_fr}
 as per subsequent message.
 |  do, so please consider moving DCCP discussion to 
 netdev@vger.kernel.org,
 |  where lots of smart networking folks are present and can help our 
 efforts
 |  on turning RFCs to code.
 Are you suggesting using netdev exclusively or in addition to [EMAIL 
 PROTECTED]

Well, since at least one person that has contributed significantly in
the past has said he can't cope with traffic on netdev, we can CC
[EMAIL PROTECTED]

 
 | Please take a look at this patch series where I reorganized your work 
 on the new
 | TFRC rx history handling code. I'll wait for your considerations and then 
 do as many
 | interactions as reasonable to get your work merged.
 | 
 | It should be completely equivalent, plus some fixes and optimizations, 
 such as:
 It will be necessary to address these points one-by-one. Before diving into 
 making
 fixes and `optimisations', have you tested your code? The patches you are 
 referring to

I have performed basic tests, and when doing so noticed a bug in
inet_diag, that I commented with Herbert Xu and he has since provided a
fix.

 have been posted and re-posted for a period of over 9 months on [EMAIL 
 PROTECTED], and

Being posted and re-posted does not guarantee that the patch is OK or
that is in a form that is acceptable to all tree maintainers. DaveM is
subscribed to [EMAIL PROTECTED] and could have picked this if he had the time to
do the review. I didn't, but now I'm trying to spend my time on
reviewing your patches and in the process doing modifications I find
appropriate while trying hard not to introduce bugs in your hard work
to get it merged.

 there are regression tests which show that this code improves on the existing 
 Linux
 implementation. These are labelled as `test tree' on 
   http://www.linux-foundation.org/en/Net:DCCP_Testing#Regression_testing
 So if you are making changes to the code, I would like to ask if you have run 
 similar
 regression tests, to avoid having to step back later.

Fair enough, I will do that before asking Herbert or Dave to pull from
my tree.
 
 | . The code that allocates the RX ring deals with failures when one of the 
 entries in
 |   the ring buffer is not successfully allocated, the original code was 
 leaking the
 |   successfully allocated entries.

Sorry for not point out exactly this, here it goes:

Your original patch:

+int tfrc_rx_hist_init(struct tfrc_rx_hist *h)
+{
+   int i;
+
+   for (i = 0; i = NDUPACK; i++) {
+   h-ring[i] = kmem_cache_alloc(tfrc_rxh_cache, GFP_ATOMIC);
+   if (h-ring[i] == NULL)
+   return 1;
+   }
+   h-loss_count = 0;
+   h-loss_start = 0;
+   return 0;
+}
+EXPORT_SYMBOL_GPL(tfrc_rx_hist_init);

Then, in ccid3_hc_rx_init you do:

static int ccid3_hc_rx_init(struct ccid *ccid, struct sock *sk)
{
struct ccid3_hc_rx_sock *hcrx = ccid_priv(ccid);

ccid3_pr_debug(entry\n);

hcrx-ccid3hcrx_state = TFRC_RSTATE_NO_DATA;
tfrc_lh_init(hcrx-ccid3hcrx_li_hist);
return tfrc_rx_hist_init(hcrx-ccid3hcrx_hist);
}

So if tfrc_rx_hist_init fail to allocate, say, the last entry in the
ring, it will return 1, and from looking at how you initialize
h-loss_{count,start} to zero I assumed that the whole tfrc_rx_hist
is not zeroed when tfrc_rx_hist_init is called, so one can also assume
that the ring entries are not initialized to NULL and that if the
error recovery is to assume that later on tfrc_rx_hist_cleanup is called
we would not have a leak, but an OOPS when tfrc_rx_hist_cleanup tries
to call kfree_cache_free on the uninitialized ring entries.

But if you look at ccid_new(), that calls ccid3_hc_rx_init(), you'll see
that when the ccid rx or hx routine fails, it just frees the
struct ccid with the area where h-ring is, so, what was not cleaned up
by the ccid init routine is effectively forgot, leaked.

I first did the cleanup at tfrc_rx_hist_init (that I renamed to
tfrc_rx_hist_alloc, since it doesn't just initializes things, but
allocates entries from slab), but then I just made the rx history slab
have arrays of tfrc_rx_hist_entry objects, not individual ones as your
code always allocates them as arrays.

 | . We do just one allocation for the ring buffer, as the number of entries 
 is fixed we
 |   should just do one allocation and not TFRC_NDUPACK times.
 Will look at the first point in the patch; with regard to the second point I 
 agree, this
 will make the code simpler, which is good. 

good

 | . I haven't checked if all the code was commited, as I tried to introduce 
 just what was
 |   immediatelly used, probably we'll need to do some changes when working on 
 the merge
 

Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-03 Thread Gerrit Renker
|  Are you suggesting using netdev exclusively or in addition to [EMAIL 
PROTECTED]
| 
| Well, since at least one person that has contributed significantly in
| the past has said he can't cope with traffic on netdev, we can CC
| [EMAIL PROTECTED]
I have a similar problem with the traffic but agree and will copy as well.


|  have been posted and re-posted for a period of over 9 months on [EMAIL 
PROTECTED], and
| 
| Being posted and re-posted does not guarantee that the patch is OK or
| that is in a form that is acceptable to all tree maintainers.
With the first point I couldn't agree more, but this is not really what I meant 
- the point
was that despite posting and re-posting there was often silence. And now there 
is feedback,
in form of a patchset made by you; and all that I am asking for is just to be 
given the time to
look that through. Last time a RFC patch appeared at 3pm and was checked in the 
same evening
(only found out next morning).
With your experience and long background as a maintainer maybe you expect 
quicker turn-around
times, but I also had waited patiently until you had had a chance to review the 
stuff I sent.


|  | . The code that allocates the RX ring deals with failures when one of the 
entries in
|  |   the ring buffer is not successfully allocated, the original code was 
leaking the
|  |   successfully allocated entries.
| 
| Sorry for not point out exactly this, here it goes:
| 
| Your original patch:
| 
| +int tfrc_rx_hist_init(struct tfrc_rx_hist *h)
| +{
| + int i;
| +
| + for (i = 0; i = NDUPACK; i++) {
| + h-ring[i] = kmem_cache_alloc(tfrc_rxh_cache, GFP_ATOMIC);
| + if (h-ring[i] == NULL)
| + return 1;
| + }
| + h-loss_count = 0;
| + h-loss_start = 0;
| + return 0;
| +}
| +EXPORT_SYMBOL_GPL(tfrc_rx_hist_init);
| 
I expected this and it actually was very clear from your original message. I 
fully back up
your solution in this point, see below. But my question above was rather: are 
there any
other bugs rather than the above leakage, which is what the previous email 
seemed to indicate?

With regard to your solution - you are using

int tfrc_rx_hist_alloc(struct tfrc_rx_hist *h)
{
h-ring = kmem_cache_alloc(tfrc_rx_hist_slab, GFP_ATOMIC);
h-loss_count = h-loss_start = 0;
return h-ring == NULL;
}

which is better not only for one but for two reasons. It solves the leakage and 
in addition makes
the entire code simpler. Fully agreed.
  

| 
|  | . I haven't checked if all the code was commited, as I tried to introduce 
just what was
|  |   immediatelly used, probably we'll need to do some changes when working 
on the merge
|  |   of your loss intervals code.
|  Sorry I don't understand this point.
| 
| Let me check now and tell you for sure:
| 
| tfrc_rx_hist_delta_seqno and tfrc_rx_hist_swap were not included, as
| they were not used, we should introduce them later, when getting to the
| working on the loss interval code.
Ah thanks, that was really not clear. Just beginning to work through the set.

|  static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff 
*skb)
|  {
|  // ...
|  u32 sample, payload_size = skb-len - dccp_hdr(skb)-dccph_doff 
* 4;
|  
|  if (unlikely(hcrx-ccid3hcrx_state == TFRC_RSTATE_NO_DATA)) {
|  if (is_data_packet) {
|  do_feedback = FBACK_INITIAL;
|  ccid3_hc_rx_set_state(sk, TFRC_RSTATE_DATA);
|  ccid3_hc_rx_update_s(hcrx, payload_size);
|  }
|  goto update_records;
|  }
|  
|  == Non-data packets are ignored for the purposes of computing s (this is 
in the RFC),
|  consequently update_s() is only called for data packets; using the two 
following
|  functions:
|  
|  
|  static inline u32 tfrc_ewma(const u32 avg, const u32 newval, const u8 
weight)
|  {
|  return avg ? (weight * avg + (10 - weight) * newval) / 10 : 
newval;
|  }
| 
| I hadn't considered that tfrc_ewma would for every packet check if the
| avg was 0 and I find it suboptimal now that I look at it, we are just
| feeding data packets, no? 
Yes exactly, only data packets are used for s.

|  static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, 
int len)
|  {
|  if (likely(len  0))/* don't update on empty packets (e.g. 
ACKs) */
|  hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 
9);
|  }
| 
|   And we also just do test for len  0 in update_s, that looks like
| also excessive, no?
Hm, I think we need to make it robust against API bugs and/or zero-sized
data packets. The check `len  0' may seem redundant but it catches such
a condition. For a moving average an accidental zero value will have
quite an impact on the overall value. In CCID3 it 

Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-03 Thread Arnaldo Carvalho de Melo
Em Mon, Dec 03, 2007 at 01:49:47PM +, Gerrit Renker escreveu:
 |  Are you suggesting using netdev exclusively or in addition to [EMAIL 
 PROTECTED]
 | 
 | Well, since at least one person that has contributed significantly in
 | the past has said he can't cope with traffic on netdev, we can CC
 | [EMAIL PROTECTED]
 I have a similar problem with the traffic but agree and will copy as well.
 
 
 |  have been posted and re-posted for a period of over 9 months on [EMAIL 
 PROTECTED], and
 | 
 | Being posted and re-posted does not guarantee that the patch is OK or
 | that is in a form that is acceptable to all tree maintainers.

 With the first point I couldn't agree more, but this is not really what I 
 meant - the point
 was that despite posting and re-posting there was often silence. And now 
 there is feedback,
 in form of a patchset made by you; and all that I am asking for is just to be 
 given the time to
 look that through. Last time a RFC patch appeared at 3pm and was checked in 
 the same evening
 (only found out next morning).

I was too optimistic about that one, feeling that it was safe, sorry
about that, will avoid doing that in the future.

 With your experience and long background as a maintainer maybe you expect 
 quicker turn-around
 times, but I also had waited patiently until you had had a chance to review 
 the stuff I sent.

Agreed, my bad, will be more patient with my side as you've been with
yours :-)

 |  | . The code that allocates the RX ring deals with failures when one of 
 the entries in
 |  |   the ring buffer is not successfully allocated, the original code was 
 leaking the
 |  |   successfully allocated entries.
 | 
 | Sorry for not point out exactly this, here it goes:
 | 
 | Your original patch:
 | 
 | +int tfrc_rx_hist_init(struct tfrc_rx_hist *h)
 | +{
 | +   int i;
 | +
 | +   for (i = 0; i = NDUPACK; i++) {
 | +   h-ring[i] = kmem_cache_alloc(tfrc_rxh_cache, GFP_ATOMIC);
 | +   if (h-ring[i] == NULL)
 | +   return 1;
 | +   }
 | +   h-loss_count = 0;
 | +   h-loss_start = 0;
 | +   return 0;
 | +}
 | +EXPORT_SYMBOL_GPL(tfrc_rx_hist_init);
 | 
 I expected this and it actually was very clear from your original message. I 
 fully back up
 your solution in this point, see below. But my question above was rather: are 
 there any
 other bugs rather than the above leakage, which is what the previous email 
 seemed to indicate?
 
 With regard to your solution - you are using
 
   int tfrc_rx_hist_alloc(struct tfrc_rx_hist *h)
   {
   h-ring = kmem_cache_alloc(tfrc_rx_hist_slab, GFP_ATOMIC);
   h-loss_count = h-loss_start = 0;
   return h-ring == NULL;
   }
 
 which is better not only for one but for two reasons. It solves the leakage 
 and in addition makes
 the entire code simpler. Fully agreed.

good
   
 
 | 
 |  | . I haven't checked if all the code was commited, as I tried to 
 introduce just what was
 |  |   immediatelly used, probably we'll need to do some changes when 
 working on the merge
 |  |   of your loss intervals code.
 |  Sorry I don't understand this point.
 | 
 | Let me check now and tell you for sure:
 | 
 | tfrc_rx_hist_delta_seqno and tfrc_rx_hist_swap were not included, as
 | they were not used, we should introduce them later, when getting to the
 | working on the loss interval code.
 Ah thanks, that was really not clear. Just beginning to work through the set.

great
 
 |static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff 
 *skb)
 |{
 |// ...
 |u32 sample, payload_size = skb-len - dccp_hdr(skb)-dccph_doff 
 * 4;
 |  
 |if (unlikely(hcrx-ccid3hcrx_state == TFRC_RSTATE_NO_DATA)) {
 |if (is_data_packet) {
 |do_feedback = FBACK_INITIAL;
 |ccid3_hc_rx_set_state(sk, TFRC_RSTATE_DATA);
 |ccid3_hc_rx_update_s(hcrx, payload_size);
 |}
 |goto update_records;
 |}
 |  
 |  == Non-data packets are ignored for the purposes of computing s (this is 
 in the RFC),
 |  consequently update_s() is only called for data packets; using the 
 two following
 |  functions:
 |  
 |  
 |static inline u32 tfrc_ewma(const u32 avg, const u32 newval, const u8 
 weight)
 |{
 |return avg ? (weight * avg + (10 - weight) * newval) / 10 : 
 newval;
 |}
 | 
 | I hadn't considered that tfrc_ewma would for every packet check if the
 | avg was 0 and I find it suboptimal now that I look at it, we are just
 | feeding data packets, no? 
 Yes exactly, only data packets are used for s.
 
 |static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, 
 int len)
 |{
 |if (likely(len  0))/* don't update on empty packets (e.g. 
 ACKs) */
 |hcrx-ccid3hcrx_s = tfrc_ewma(hcrx-ccid3hcrx_s, len, 
 9);
 |}
 | 
 | And we also 

Re: [RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-03 Thread Gerrit Renker
|  |  static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock 
*hcrx, int len)
|  |  {
|  |  if (likely(len  0))/* don't update on empty 
packets (e.g. ACKs) */
|  |  hcrx-ccid3hcrx_s = 
tfrc_ewma(hcrx-ccid3hcrx_s, len, 9);
|  |  }
|  | 
|  |   And we also just do test for len  0 in update_s, that looks like
|  | also excessive, no?
|  Hm, I think we need to make it robust against API bugs and/or zero-sized
|  data packets. The check `len  0' may seem redundant but it catches such
|  a condition. For a moving average an accidental zero value will have
|  quite an impact on the overall value. In CCID3 it is
|  
|  x_new = 0.9 * x_old + 0.1 * len
|  
|  So if len is accidentally 0 where it should not be, then each time the
|  moving average is reduced to 90%.
| 
| So we must make it a BUG_ON, not something that is to be always present.
|  
I think it should be a warning condition since it can be triggered when
the remote party sends zero-sized packets. It may be good to log this
into the syslog to warn about possibly misbehaving apps/peers/remote
stacks.


|  As a comparison - the entire patch set took about a full month to do.
|  But that meant I am reasonably sure the algorithm is sound and can cope
|  with problematic conditions.
| 
| And from what I saw so far that is my impression too, if you look at
| what I'm doing it is:
| 
| 1. go thru the whole patch trying to understand hunk by hunk
You are doing a great job - in particular as it really is a lot of material.

| 2. do consistency changes (add namespace prefixes)
| 3. reorganize the code to look more like what is already there, we
|both have different backgrounds and tastes about how code should
|be written, so its only normal that if we want to keep the code
|consistent, and I want that, I jump into things I think should be
|reworded, while trying to keep the algorithm expressed by you.
|
Agree, that is not always easy to get right. I try to stick as close as
possible to existing conventions but of course that is my
interpretation, so I am already anticipating such changes/comments here.

| think about further automatization on regression testing.
| 
If it is of any use, some scripts and setups are at the bottom of the page at
http://www.erg.abdn.ac.uk/users/gerrit/dccp/testing_dccp/
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html