On Wednesday 08 August 2007 17:06, Matthew Toseland wrote:
> > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.10 -
> > > > 21:01:58GMT -----
> > > >
> > > > Oh no, instantly rejecting traffic is usually a bad idea (an
> > > > exception would be certain kinds of realtime traffic, but that is
> > > > not applicable to freenet at all). I have some experience with
> > > > independent/competing ISPs and their broken traffic-shaping routers
> > > > that simply dropped packets exceeding the current shaping limits;
> > > > TCP performance took a major hit there, and several TCP connections
> > > > running under the same shaper always took seriously unfair
> > > > bandwidth shares (unless you measured over quite long intervals,
> > > > like 10+ minutes). Changing the shaper to queue an over-quota
> > > > packet (even with a single-packet queue!) until the calculated
> > > > average bandwidth allowed it to be sent (thus slightly increasing
> > > > the roundtrip time) was always sufficient for a TCP flow to run at
> > > > 100% of the shaped rate, and for simultaneous TCP streams to share
> > > > the available bandwidth very evenly, even over sub-second
> > > > measurement intervals; no other working solution was found (aside
> > > > from raising the shaping limit above the maximum speed of the TCP
> > > > peers).
> > > >
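The single-packet delay queue described above can be sketched as a
virtual-clock shaper in a few lines of Java. This is a minimal
illustration, not Freenet code; sendPacket() is a hypothetical
transport hook, and the rate is the configured shaping limit:

    import java.util.concurrent.TimeUnit;

    // Sketch: delay an over-quota packet until the average rate permits
    // sending it, instead of dropping it. sendPacket() is hypothetical.
    class DelayShaper {
        private final long bytesPerSecond;          // configured shaping limit
        private long busyUntil = System.nanoTime(); // budget is spent up to here

        DelayShaper(long bytesPerSecond) { this.bytesPerSecond = bytesPerSecond; }

        synchronized void send(byte[] packet) throws InterruptedException {
            long now = System.nanoTime();
            if (busyUntil < now) busyUntil = now;   // idle credit does not accumulate
            long wait = busyUntil - now;
            if (wait > 0) TimeUnit.NANOSECONDS.sleep(wait); // the single-packet "queue"
            // charge this packet's transmission time against the budget
            busyUntil += packet.length * 1000000000L / bytesPerSecond;
            sendPacket(packet);                     // hand off to the real transport
        }

        private void sendPacket(byte[] packet) { /* transport goes here */ }
    }

Because each packet only advances the virtual clock by its own
transmission time at the limit rate, the added roundtrip is at most one
packet's worth, which is exactly the "slightly increased roundtrip"
trade-off described above.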
> > > > I am not sure that can be directly applied to the current freenet
> > > > networking code; honestly, the mechanism of first quickly
> > > > accepting packets and then slowly picking them up using some kind
> > > > of filters looks unnecessarily complicated and suboptimal for
> > > > performance, to say the least. Here is another good example why:
> > > > the mechanism quite resembles traditional O/S network packet
> > > > handling (received packets are extracted from the NIC at the
> > > > highest priority, during the hardware interrupt, and then the
> > > > CPU/server business logic fails to process all the received
> > > > packets, leading to internal queue overflows), and after years and
> > > > decades it is generally agreed that such an approach does not work
> > > > well for server applications. Instead, linux has for several years
> > > > had a mechanism named NAPI (optional for some NIC drivers - check
> > > > your kernel config - but default and mandatory for most
> > > > server-grade and/or 1Gb NIC drivers): the hardware interrupt just
> > > > sets a flag/semaphore indicating that the NIC has received
> > > > something and instantly exits, leaving that NIC's interrupt line
> > > > disabled (the actual algorithm is a little more complex, allowing
> > > > the hardware interrupt to extract a very limited number of packets
> > > > if the host is mostly idle). Then a lowest-priority kernel thread
> > > > (the "software interrupt"), woken up by the flag/semaphore, starts
> > > > reading packets from the NIC into the O/S queues (from which
> > > > user-level read()s are satisfied), extracting only a limited
> > > > number of packets at a time (then yielding the CPU to other
> > > > runnable processes), and re-enabling the NIC interrupts only once
> > > > it has emptied the hardware queue - relying on TCP flow control
> > > > and on modern ethernet hardware flow control, which works
> > > > exceptionally well. Thus the server business logic (i.e. the
> > > > useful work), running at a priority much higher than the software
> > > > interrupt thread, is never starved of CPU by hardware interrupts
> > > > that first pull in packets only for more CPU to be wasted dropping
> > > > them from an overflowing system queue - resulting in smooth
> > > > behaviour and the best sustained performance.
> > > >
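For illustration, here is roughly what the NAPI pattern looks like when
transposed into Java; the HardwareQueue interface and the packet budget
are assumptions for the sketch, not a real kernel or Freenet API:

    import java.util.concurrent.Semaphore;

    // NAPI-style polling sketch: the "interrupt" only signals; a
    // low-priority thread drains a bounded budget per wakeup and
    // re-arms the interrupt once the hardware queue is empty.
    class NapiStylePoller implements Runnable {
        private static final int BUDGET = 64;      // max packets per poll
        private final Semaphore work = new Semaphore(0);
        private final HardwareQueue nic;           // illustrative abstraction

        NapiStylePoller(HardwareQueue nic) { this.nic = nic; }

        // Called from the "interrupt": cheap, never touches packets.
        void onInterrupt() {
            nic.disableInterrupts();
            work.release();
        }

        public void run() {
            Thread.currentThread().setPriority(Thread.MIN_PRIORITY);
            while (true) {
                work.acquireUninterruptibly();
                int n = 0;
                byte[] pkt;
                while (n < BUDGET && (pkt = nic.poll()) != null) {
                    deliver(pkt);                  // into O/S-style queues
                    n++;
                }
                if (n == BUDGET) work.release();   // more pending: poll again
                else nic.enableInterrupts();       // drained: re-arm interrupt
                Thread.yield();                    // let business logic run
            }
        }

        private void deliver(byte[] pkt) { /* enqueue for readers */ }
    }

    interface HardwareQueue {
        byte[] poll();                             // null when empty
        void disableInterrupts();
        void enableInterrupts();
    }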
> > > > Or in short: on overload, delaying the reading/processing of
> > > > input packets is better than dropping or rejecting them instantly.
> > > >
> > > > Toad - if you know a simple way to delay freenet's reads from the
> > > > UDP socket in order to enforce the configured input bandwidth
> > > > limit, please do so. (And with that UDP read delay, I would be
> > > > very interested to test a freenet node with no input bandwidth
> > > > limiter other than the input bandwidth liability - chances are
> > > > that the UDP socket read delay will be sufficient for quality
> > > > shaping, with the valuable help of the sending node tracking the
> > > > roundtrip time - an already well-implemented feature.)
> > > >
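A sketch of what such a read delay might look like with
java.net.DatagramSocket (the buffer size and accounting are
illustrative assumptions): the reader sleeps before each receive so the
average inbound rate cannot exceed the configured limit, letting the
O/S buffer and the sender's RTT tracking absorb the backpressure:

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;

    // Sketch: enforce the input bandwidth limit by delaying the UDP read
    // itself, rather than rejecting packets that were already read.
    class ThrottledUdpReader {
        private final DatagramSocket socket;
        private final long limitBytesPerSec;        // configured input limit
        private long earliestNextRead = System.nanoTime();

        ThrottledUdpReader(DatagramSocket socket, long limitBytesPerSec) {
            this.socket = socket;
            this.limitBytesPerSec = limitBytesPerSec;
        }

        DatagramPacket read() throws Exception {
            long wait = earliestNextRead - System.nanoTime();
            if (wait > 0)                           // pace the reads
                Thread.sleep(wait / 1000000L, (int) (wait % 1000000L));
            DatagramPacket p = new DatagramPacket(new byte[2048], 2048);
            socket.receive(p);  // the O/S buffers (and eventually drops) while we wait
            // charge the bytes just read against the configured limit
            earliestNextRead = Math.max(earliestNextRead, System.nanoTime())
                    + p.getLength() * 1000000000L / limitBytesPerSec;
            return p;
        }
    }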
> > > > If the delay cannot be done easily with the current codebase, I
> > > > will consider doing a major rewrite of the traffic-accepting part
> > > > of the code. Not of the highest priority though, due to the
> > > > anticipated large amount of work - but those high fruits look big
> > > > and tasty.
> > >
> > > ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.11 -
> > > 14:56:37GMT -----
> > >
> > > > I am not sure that can be directly applied to the current freenet
> > > > networking code;
> > >
> > > We're working on an idea called token-passing that's supposed to
> > > address this: you can only send a search (request/insert) to a peer if
> > > you have a flow control token from that peer. If you don't have a token
> > > you either keep the search in a queue until you receive a token, or
> > > send it to the next-best peer if the queue is full.
> > >
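As a sketch of the sending side of this rule (the queue bound and the
sendTo/routeToNextBestPeer helpers are hypothetical): a search goes out
only when a token is on hand, waits in a bounded queue otherwise, and
is rerouted when the queue is full:

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Sketch of token-gated forwarding; helper names are hypothetical.
    class TokenGate {
        private static final int MAX_QUEUE = 32;    // assumed queue bound
        private int tokens = 0;                     // tokens granted by this peer
        private final Queue<Search> queue = new ArrayDeque<Search>();

        // Peer granted us n flow-control tokens: drain the queue first.
        synchronized void onTokens(int n) {
            tokens += n;
            while (tokens > 0 && !queue.isEmpty()) {
                tokens--;
                sendTo(queue.poll());
            }
        }

        synchronized void submit(Search s) {
            if (tokens > 0) {
                tokens--;
                sendTo(s);                          // token in hand: send now
            } else if (queue.size() < MAX_QUEUE) {
                queue.add(s);                       // wait for a token
            } else {
                routeToNextBestPeer(s);             // queue full: try another peer
            }
        }

        private void sendTo(Search s) { /* hand to the transport */ }
        private void routeToNextBestPeer(Search s) { /* rerouting logic */ }
    }

    class Search { /* a request or insert */ }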
> > > > the mechanism quite resembles traditional O/S network packet
> > > > handling (received packets are extracted from the NIC at the
> > > > highest priority, during the hardware interrupt, and then the
> > > > CPU/server business logic fails to process all the received
> > > > packets, leading to internal queue overflows)
> > >
> > > Interesting point - in the new congestion control layer, maybe the UDP
> > > reader shouldn't advance the receiver window until the internal queues
> > > have dropped below a certain size... but it might be tricky to
> > > implement because the internal queues all belong to different
> > > threads...
> > >
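For concreteness, the idea might be sketched like this (the threshold
is an assumed tunable), though as noted above it glosses over the fact
that the internal queues belong to different threads:

    // Sketch: only re-open the advertised receiver window while the
    // internal queues are short, so peers back off when we fall behind.
    class QueueAwareWindow {
        private static final int QUEUE_LIMIT = 256; // assumed threshold
        private int receiverWindow;                 // bytes we advertise

        synchronized int onBytesConsumed(int consumedBytes, int internalQueueSize) {
            if (internalQueueSize < QUEUE_LIMIT)
                receiverWindow += consumedBytes;    // advance the window
            // else: keep the window closed until the queues drain
            return receiverWindow;
        }
    }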
> > > > If the delay cannot be done easily with the current codebase, I
> > > > will consider doing a major rewrite of the traffic-accepting part
> > > > of the code.
> > >
> > > This is due to be rewritten soon anyway, so now's probably a good time
> > > to make suggestions.
> > >
> > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.14 -
> > > 16:46:24GMT -----
> > >
> > > While token passing would indeed smooth the traffic out, it feels
> > > excessive:
> > >
> > > - it adds extra traffic;
> > > - it creates additional traffic patterns that considerably simplify
> > > attacks (like those aiming to reliably prove that a particular
> > > request originates from the attacked node) against a node whose
> > > connections are all monitored (by the ISP) and some of which are
> > > fast but compromised (compromised peers);
> > > - it requires pulling a multidimensional set of heuristics, on whom
> > > to send new tokens, out of thin air, and those heuristics will tend
> > > to disagree for different connection types.
> > >
> > > The method of delaying network reads (that's important - and AFAIK
> > > the only major thing still missing to get shaping rolling smoothly)
> > > should work similarly well (maybe even better): just treat the
> > > metric 'the current peer roundtrip time is lower than the [peer]
> > > average roundtrip time' as equivalent to 'the peer gave us a few
> > > tokens', and enjoy bandwidth/crypto(CPU)-free virtual token passing
> > > which obeys both the limits imposed by hardware/ISP traffic shaping
> > > and the software-configured limits - whichever is stricter.
> > >
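That heuristic could be sketched as follows; the EWMA smoothing factor
is an assumption borrowed from TCP's RTT estimator:

    // Sketch of the RTT-based "virtual token": a below-average roundtrip
    // time is read as the peer implicitly granting more capacity.
    class VirtualToken {
        private static final double ALPHA = 0.125;  // TCP-style smoothing factor
        private double avgRtt = -1;                 // EWMA of roundtrip time, ms

        void onRttSample(double rttMillis) {
            avgRtt = (avgRtt < 0) ? rttMillis
                                  : (1 - ALPHA) * avgRtt + ALPHA * rttMillis;
        }

        // "The peer gave us a few tokens" iff current RTT is below average.
        boolean mayIncreaseLoad(double currentRttMillis) {
            return avgRtt > 0 && currentRttMillis < avgRtt;
        }
    }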
> > > So I currently discourage implementing explicit token passing, in
> > > favor of the lower, equally tasty fruit.
> >
> > ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.17 - 21:40:27GMT
> > -----
> >
> > > - it adds extra traffic
> >
> > Um, right. "Here are n tokens" takes about 6 bytes: two for the message
> > type, two for the message size, and two for the number of tokens (we're
> > never going to hand out more than 65535 tokens in one go). It uses less
> > traffic than "Can I send you a request?" "Yes" "Here's the request", and
> > it avoids a round-trip. It also uses less traffic than "Can I send you a
> > request?" "No", because if you don't have a token, you don't need to ask!
> >
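The 6-byte figure follows directly if the grant is encoded as three
16-bit fields; a sketch (the field order is an assumption, not the
actual FNP wire format):

    import java.nio.ByteBuffer;

    // Sketch: "here are n tokens" as three 16-bit fields = 6 bytes.
    class TokenGrant {
        static byte[] encode(int messageType, int tokens) {
            if (tokens > 0xFFFF) tokens = 0xFFFF;   // never more than 65535 per grant
            ByteBuffer buf = ByteBuffer.allocate(6);
            buf.putShort((short) messageType);      // 2 bytes: message type
            buf.putShort((short) 6);                // 2 bytes: message size
            buf.putShort((short) tokens);           // 2 bytes: number of tokens
            return buf.array();
        }
    }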
> > > - it creates additional traffic patterns that considerably simplify
> > > attacks (like those aiming to reliably prove that a particular
> > > request originates from the attacked node) against a node whose
> > > connections are all monitored (by the ISP) and some of which are
> > > fast but compromised (compromised peers)
> >
> > Please explain how handing my peer some tokens reveals anything about
> > traffic patterns that wasn't already visible to traffic analysis. If they
> > can see the requests and results going back and forth, who cares if they
> > can also see the tokens?
> >
> > > - it requires pulling a multidimensional set of heuristics, on whom
> > > to send new tokens, out of thin air, and those heuristics will tend
> > > to disagree for different connection types.
> >
> > No magical heuristics are needed - we hand out tokens as long as we're
> > not overloaded (measured by total queueing delay, including the bandwidth
> > limiter). That alone should be enough to outperform the current system,
> > because we'll avoid wasting traffic on rejected searches. Then we can
> > start thinking about clever token allocation policies to enforce fairness
> > when the network's busy, without imposing unnecessary limits when the
> > network's idle, etc etc. But token passing doesn't depend on any such
> > policy - it's just a lower-bandwidth alternative to pre-emptive
> > rejection.
>
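That policy is simple enough to state as code; a sketch, with the
overload threshold and grant size as assumed tunables rather than a
tuned policy:

    // Sketch: grant tokens whenever total queueing delay (including
    // the bandwidth limiter) stays under a threshold.
    class SimpleTokenPolicy {
        private static final long MAX_DELAY_MS = 500; // assumed overload threshold
        private static final int BATCH = 10;          // assumed tokens per grant

        int tokensToGrant(long totalQueueingDelayMs) {
            return (totalQueueingDelayMs < MAX_DELAY_MS) ? BATCH : 0;
        }
    }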
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 -
> 11:57:46GMT -----
>
> As far as I see, the tokens should be transferred in a timely enough
> manner to keep the burstiness problem moderate; so the majority of
> them will not be coalesced, frequently resulting in the following
> overhead on top of the 6 bytes (summed below):
>
> - up to 100 bytes of random padding;
> - 50+ bytes of FNP headers;
> - 8 bytes of UDP header;
> - 20+ bytes of IP header.
>
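Summing the upper estimates in that list: 6 (payload) + 100 (padding) +
50 (FNP) + 8 (UDP) + 20 (IP) = 184 bytes, which is where the 150-200
byte range below comes from.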
> Those 150-200 byte packets, aside from being numerous enough to be
> noticeable, unavoidably create additional traffic patterns that could
> be useful for estimating the node's activity with other peers (even if
> the other peers use some local connection like Bluetooth/WiFi/LAN,
> which is much more expensive to monitor remotely/centrally).

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.25 - 17:28:07GMT 
-----

I still don't understand your argument about "extra traffic patterns". If an 
eavesdropper can see that your node is sending requests to my node, why does 
it make a difference if the eavesdropper can also see that my node is sending 
tokens to your node? The eavesdropper could deduce that anyway.

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 - 
21:56:54GMT -----

If the observed number of tokens closely matches the observed traffic,
it gives the additional knowledge that the node does not use other
links. Thus, especially in the case of a degraded topology, it becomes
somewhat (how much?) easier to prove that particular data originates
from the attacked node, or was requested by the attacked node.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.26 - 19:44:07GMT 
-----

What? The amount of tokens will always match the amount of traffic, because 
you need a token to send traffic. It will not prove anything about the 
ultimate source or destination of the traffic, and it will not reveal *more* 
about the amount of traffic than is visible *from the traffic itself*.

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - 
08:30:21GMT -----

A node that is working near its bandwidth limits will feed its peers
only a limited number of tokens; a seriously underloaded node will tend
to allow bursts in an attempt to saturate its allotted bandwidth.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.27 - 15:05:29GMT 
-----

So you're saying that by observing the number of tokens, an eavesdropper can 
infer the node's bandwidth limit?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - 
18:34:02GMT -----

Something like that. And by comparing it with the known connection
bandwidth and its saturation, the attacker can potentially derive
information about previously hidden/unknown connections/peers.

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.28 - 22:07:55GMT 
-----

OK, so we're assuming the node has some connections that are invisible to the 
eavesdropper? So how does the traffic on the visible connections (including 
tokens passed in both directions) reveal the presence of the invisible 
connections? How would an eavesdropper distinguish a request received on an 
invisible connection from an internally generated request?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.30 - 
11:07:46GMT -----

The situation of an attacker strongly suspecting/knowing for sure that
the only active traffic generator is the node "itself" is quite
different from an attacker strongly suspecting/knowing for sure that
the seemingly actively generated traffic originates "either from the
node or from its unknown peers".

----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.01 - 21:05:16GMT 
-----

True, but how does the presence or absence of tokens make any difference to 
the attacker's knowledge? Can you describe an example so I can see what 
information you think the tokens are revealing?

----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.06.04 - 
17:32:56GMT -----

- If the node never gives away more tokens than can be spent in 10
seconds, but the attacker sees that connections A, B, and C together
are given more explicit tokens than that, then it is reasonable to
assume that a currently mostly-idle connection D exists.

- If the node has a physical connection bandwidth of N, but hands out
tokens for only N/2 across A, B, and C taken together, chances are the
node has a hidden connection D which is well loaded at the moment. This
seriously differs from observing that A, B, and C voluntarily send less
traffic than N - maybe they just have nothing else to send.

Of course this gives no guarantees, but after a certain period of
observation it should be possible to decide quite reliably whether the
node has hidden connections (in which case additional resources could
be spent to make the attack easier/more successful), or whether that is
very unlikely (in which case perhaps taking control of the node itself
should be attempted). Without this estimate, the attacker either has to
spend additional resources even when they may not really be needed, or
adopt a very strict anti-freenet policy, which will attract additional
undesired public attention and is more likely to result in drastic
political consequences for the enforcing party.