On Wednesday 08 August 2007 17:11, Matthew Toseland wrote:
> On Wednesday 08 August 2007 17:06, Matthew Toseland wrote:
> > > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.10 - 21:01:58GMT -----
> > > > >
> > > > > Oh no, getting stuff instantly rejected is most often not good (an exception would be certain kinds of realtime traffic, but that is not applicable to freenet at all). I have some experience with independent/competing ISPs and their broken traffic-shaping routers that were always dropping packets not fitting the current shaping limits; TCP performance took a major hit there, and several TCP connections running under the same shaping always took seriously unfair bandwidth shares (unless you use quite long intervals for the stats, like 10+ minutes). Changing the shaping to queue an over-quota packet (even with a single-packet queue!) until the calculated average bandwidth allows it to be sent (thus slightly increasing the roundtrip in the end) was always sufficient for a TCP flow to work at 100% of the shaped level, and for simultaneous TCP streams to share the available bandwidth very equally even at sub-second stats intervals; no other working solution was found (aside from raising the shaping limit above the maximum speed of the TCP peers).
> > > > >
> > > > > I am not sure that can be directly applied to the current freenet networking code; honestly, the mechanism of first quickly accepting packets and then slowly picking them off using some kind of filters looks unnecessarily complicated and suboptimal for performance, to say the least. I have another good example why: the mechanism quite resembles traditional O/S network packet handling (with received packets extracted from the NIC at highest priority - during the hardware interrupt - and the CPU/server business logic then failing to process all the received packets, leading to internal queue overflow), and after years and decades it is generally agreed that such an approach does not work well for server applications. Instead, Linux has for several years had a mechanism named NAPI (optional for some NIC drivers - check your kernel config - but default and mandatory for most server-grade and/or 1Gb NIC drivers): the hardware interrupt just sets a flag/semaphore saying the NIC has received something and quits instantly, leaving that NIC's interrupt line disabled (the actual algorithm is a little more complex, allowing the hardware interrupt to extract a very limited number of packets if the host is very idle). A lowest-priority kernel thread (the "software interrupt"), woken up by the flag/semaphore, then starts reading packets from the NIC into the O/S queues (from which user-level read()s are satisfied), extracting only a limited number of packets at a time (then yielding the CPU to other runnable processes) and re-enabling the NIC interrupts only once it has managed to empty the hardware queue - and with TCP flow control, plus modern ethernet hardware flow control, that works exceptionally well. Thus the server business logic (i.e. useful work), running at a priority much higher than the software-interrupt thread, is never starved of CPU by hardware interrupts that first pull in packets only for CPU to be wasted dropping them from an overflowing system queue - resulting in smooth behaviour and the best sustained performance.
> > > > >
> > > > > Or in short - on overload, delaying the reading/processing of input packets is better than dropping or rejecting them instantly.
> > > > >
> > > > > Toad - if you know a simple way to delay freenet's reads from the UDP socket in order to enforce the configured input bandwidth limit, please do so. (And with that UDP read delay, I would be very interested to test a freenet node without any other input bandwidth limiters aside from the input bandwidth liability - chances are that the UDP socket read delay will be sufficient for quality shaping, with the valuable help of the sending node tracking the roundtrip - an already well-implemented feature.)
> > > > >
> > > > > If the delay cannot be done easily with the current codebase, I will consider doing a major rewrite of the traffic-accepting part of the code. Not of the highest priority though, due to the anticipated large amount of work - but those high fruits look big and tasty.
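For concreteness, here is a minimal sketch in Java of the read-delay idea proposed above: the thread reading from the UDP socket simply sleeps until a token bucket for the configured input bandwidth allows another packet, instead of rejecting or dropping anything. This is not Freenet's actual networking code; the class and field names are hypothetical.

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;

    /**
     * Hypothetical sketch, not Freenet code: delay reads from the UDP socket
     * until the configured input bandwidth budget allows another packet.
     * The kernel socket buffer (and the sender's roundtrip tracking)
     * provides the back-pressure; nothing is dropped or rejected here.
     */
    class DelayedUdpReader implements Runnable {
        private final DatagramSocket socket;
        private final int inputBytesPerSecond;  // configured input bandwidth limit
        private double budget;                  // token bucket, in bytes
        private long lastRefill = System.currentTimeMillis();

        DelayedUdpReader(DatagramSocket socket, int inputBytesPerSecond) {
            this.socket = socket;
            this.inputBytesPerSecond = inputBytesPerSecond;
            this.budget = inputBytesPerSecond;  // allow a small initial burst
        }

        public void run() {
            byte[] buf = new byte[2048];
            while (true) {
                try {
                    waitForBudget(buf.length);  // delay the read, don't drop
                    DatagramPacket p = new DatagramPacket(buf, buf.length);
                    socket.receive(p);          // blocks until a packet arrives
                    budget -= p.getLength();    // charge only what was actually read
                    process(p);
                } catch (Exception e) {
                    return;
                }
            }
        }

        private void waitForBudget(int bytesNeeded) throws InterruptedException {
            while (true) {
                long now = System.currentTimeMillis();
                budget = Math.min(2.0 * inputBytesPerSecond,
                        budget + (now - lastRefill) * inputBytesPerSecond / 1000.0);
                lastRefill = now;
                if (budget >= bytesNeeded) return;
                // Sleep roughly until enough budget has accumulated.
                long waitMs = (long) ((bytesNeeded - budget) * 1000.0 / inputBytesPerSecond);
                Thread.sleep(Math.max(1, waitMs));
            }
        }

        private void process(DatagramPacket p) {
            // hand the packet to the rest of the node
        }
    }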
> > > > ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.11 - 14:56:37GMT -----
> > > >
> > > > > I am not sure that can be directly applied to the current freenet networking code;
> > > >
> > > > We're working on an idea called token-passing that's supposed to address this: you can only send a search (request/insert) to a peer if you have a flow control token from that peer. If you don't have a token you either keep the search in a queue until you receive a token, or send it to the next-best peer if the queue is full.
> > > >
> > > > > the mechanism quite resembles traditional O/S network packet handling (with received packets extracted from the NIC at highest priority - during the hardware interrupt - and the CPU/server business logic then failing to process all the received packets, leading to internal queue overflow)
> > > >
> > > > Interesting point - in the new congestion control layer, maybe the UDP reader shouldn't advance the receiver window until the internal queues have dropped below a certain size... but it might be tricky to implement because the internal queues all belong to different threads...
> > > >
> > > > > If the delay cannot be done easily with the current codebase, I will consider doing a major rewrite of the traffic-accepting part of the code.
> > > >
> > > > This is due to be rewritten soon anyway, so now's probably a good time to make suggestions.
> > > >
> > > > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.14 - 16:46:24GMT -----
> > > >
> > > > While token passing would indeed smooth the traffic out, it feels excessive:
> > > >
> > > > - it adds extra traffic;
> > > > - it creates additional traffic patterns, which considerably simplify attacks (like those aiming at reliably proving that a particular request originates from the attacked node) against a node whose connections are all monitored (by the ISP) and some of whose peers are fast but compromised;
> > > > - it requires pulling a multidimensional set of heuristics - about whom to send new tokens to - out of thin air, and those heuristics will tend to disagree for different connection types.
> > > >
> > > > The method of delaying network reads (that's the important bit - and AFAIK the only major missing piece needed to get shaping rolling smoothly already) should work similarly well (maybe even better): just treat the metric "the current peer roundtrip time is lower than the [peer] average roundtrip time" as equivalent to "the peer gave us a few tokens", and enjoy bandwidth- and crypto(CPU)-free virtual token passing which obeys both the limits imposed by hardware/ISP traffic shaping and the software-configured limits - whichever is stricter.
> > > >
> > > > So I currently discourage implementing explicit token passing, in favor of lower, equally tasty fruit.
> > >
> > > ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.17 - 21:40:27GMT -----
> > >
> > > > - it adds extra traffic
> > >
> > > Um, right. "Here are n tokens" takes about 6 bytes: two for the message type, two for the message size, and two for the number of tokens (we're never going to hand out more than 65535 tokens in one go). It uses less traffic than "Can I send you a request?" "Yes" "Here's the request", and it avoids a round-trip. It also uses less traffic than "Can I send you a request?" "No", because if you don't have a token, you don't need to ask!
> > >
> > > > - it creates additional traffic patterns, which considerably simplify attacks (like those aiming at reliably proving that a particular request originates from the attacked node) against a node whose connections are all monitored (by the ISP) and some of whose peers are fast but compromised.
> > >
> > > Please explain how handing my peer some tokens reveals anything about traffic patterns that wasn't already visible to traffic analysis. If they can see the requests and results going back and forth, who cares if they can also see the tokens?
> > >
> > > > - it requires pulling a multidimensional set of heuristics - about whom to send new tokens to - out of thin air, and those heuristics will tend to disagree for different connection types.
> > >
> > > No magical heuristics are needed - we hand out tokens as long as we're not overloaded (measured by total queueing delay, including the bandwidth limiter). That alone should be enough to outperform the current system, because we'll avoid wasting traffic on rejected searches. Then we can start thinking about clever token allocation policies to enforce fairness when the network's busy without imposing unnecessary limits when the network's idle, etc. But token passing doesn't depend on any such policy - it's just a lower-bandwidth alternative to pre-emptive rejection.
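Purely as an illustration, the token-passing scheme sketched in the two messages above might look roughly like this in Java. It is not an actual implementation: the class and method names are hypothetical, and the queue size, grant size, and delay threshold are made-up example numbers; only the per-peer token count and the "grant while not overloaded" rule come from the description above.

    import java.util.ArrayDeque;
    import java.util.Queue;

    /**
     * Hypothetical sketch of per-peer token passing, not Freenet code.
     */
    class TokenPassingPeer {
        private int tokens;                       // tokens this peer has granted us
        private final Queue<Object> pending = new ArrayDeque<>();
        private static final int MAX_QUEUE = 20;  // arbitrary example value

        /** Sender side: only forward a search if we hold a token. */
        synchronized boolean offerSearch(Object search) {
            if (tokens > 0) {
                tokens--;
                sendToPeer(search);
                return true;
            }
            if (pending.size() < MAX_QUEUE) {
                pending.add(search);              // wait for a token
                return true;
            }
            return false;                         // caller tries the next-best peer
        }

        /** Called when a "here are n tokens" message (the ~6-byte message above) arrives. */
        synchronized void onTokensReceived(int n) {
            tokens += n;
            while (tokens > 0 && !pending.isEmpty()) {
                tokens--;
                sendToPeer(pending.poll());
            }
        }

        /**
         * Receiver side: grant tokens as long as we are not overloaded, e.g.
         * while total queueing delay (including the bandwidth limiter) stays
         * below some threshold. No per-peer heuristics required.
         */
        int tokensToGrant(long queueingDelayMs) {
            return queueingDelayMs < 500 ? 10 : 0;   // example numbers only
        }

        private void sendToPeer(Object search) {
            // serialize and transmit the search to this peer
        }
    }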
> > ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 - 11:57:46GMT -----
> >
> > As far as I can see, the tokens need to be transferred in a timely enough manner to keep the burst problem moderate; so the majority of them will not be coalesced, frequently resulting in the following overhead for those 6 bytes:
> >
> > - up to 100 bytes of random padding;
> > - 50+ bytes of FNP headers;
> > - 8 bytes of UDP header;
> > - 20+ bytes of IP header.
> >
> > Those 150-200 byte packets, aside from being numerous enough to be noticeable, unavoidably create additional traffic patterns that could be useful for estimating the node's activity with other peers (even if the other peers use some local connection like Bluetooth/WiFi/LAN which is much more expensive to monitor remotely/centrally).
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.25 - 17:28:07GMT -----
>
> I still don't understand your argument about "extra traffic patterns". If an eavesdropper can see that your node is sending requests to my node, why does it make a difference if the eavesdropper can also see that my node is sending tokens to your node? The eavesdropper could deduce that anyway.
>
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.25 - 21:56:54GMT -----
>
> If the observed amount of tokens closely matches the observed traffic, it gives the additional knowledge that the node does not use other links. Thus, especially in the case of a degraded topology, it makes it somewhat (how much?) easier to prove that particular data originates from, or was requested by, the attacked node.
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.26 - 19:44:07GMT -----
>
> What? The amount of tokens will always match the amount of traffic, because you need a token to send traffic. It will not prove anything about the ultimate source or destination of the traffic, and it will not reveal *more* about the amount of traffic than is visible *from the traffic itself*.
>
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - 08:30:21GMT -----
>
> A node that is working near its bandwidth limits will feed its peers only a limited amount of tokens; a seriously underloaded node will tend to allow bursts in an attempt to saturate the allotted bandwidth.
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.27 - 15:05:29GMT -----
>
> So you're saying that by observing the number of tokens, an eavesdropper can infer the node's bandwidth limit?
>
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.27 - 18:34:02GMT -----
>
> Something like that. And by comparing it with the known connection bandwidth and its saturation, the eavesdropper can potentially derive information about previously hidden/unknown connections/peers.
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.05.28 - 22:07:55GMT -----
>
> OK, so we're assuming the node has some connections that are invisible to the eavesdropper? So how does the traffic on the visible connections (including tokens passed in both directions) reveal the presence of the invisible connections? How would an eavesdropper distinguish a request received on an invisible connection from an internally generated request?
>
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.05.30 - 11:07:46GMT -----
>
> The situation where the attacker strongly suspects/knows for sure that the only active traffic generator is the node "itself" is quite different from the situation where the attacker strongly suspects/knows for sure that the seemingly actively generated traffic originates "either from the node or from its unknown peers".
>
> ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.01 - 21:05:16GMT -----
>
> True, but how does the presence or absence of tokens make any difference to the attacker's knowledge? Can you describe an example so I can see what information you think the tokens are revealing?
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.06.04 - 17:32:56GMT -----
>
> - If the node never gives away more tokens than can be spent in 10 seconds, but the attacker sees that connections A, B, and C are given more explicit tokens than that, then it is reasonable to assume that a currently mostly-idle connection D exists.
>
> - If the node has physical connection bandwidth N, but gives out tokens for only N/2 to A, B, and C taken together, chances are the node has a hidden connection D which is well loaded at the moment. This is seriously different from observing that A, B, and C voluntarily send less traffic than N - maybe they just have nothing else to send.
>
> Of course that gives no guarantees, but after a certain period of observation it should be possible to decide quite reliably whether the node has hidden connections (and so additional resources could be spent to make the attack easier/more successful), or whether that is very unlikely (and maybe taking control of the node should be attempted). Without that estimate, the attacker either has to spend additional resources whether they are really needed or not, or adopt a very strict anti-freenet policy which will attract additional undesired public attention and is more likely to result in drastic political consequences for the force.
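As a toy illustration of the second observation above, the attacker is essentially comparing the node's observed token-grant rate against its known line speed, while checking that the visible peers actually use most of what they are granted. The class name and the thresholds here are made up purely for illustration.

    /**
     * Hypothetical sketch of the inference described above; made-up thresholds.
     */
    class HiddenLinkEstimator {
        /**
         * @param physicalBandwidth   known line speed of the node, bytes/s
         * @param grantedTokenRate    token grants observed on visible links, bytes/s
         * @param visibleTrafficRate  traffic actually observed on visible links, bytes/s
         * @return true if the observations suggest a hidden, loaded connection
         */
        boolean suggestsHiddenConnection(double physicalBandwidth,
                                         double grantedTokenRate,
                                         double visibleTrafficRate) {
            // Visible peers are being throttled well below the line speed...
            boolean throttled = grantedTokenRate < 0.6 * physicalBandwidth;
            // ...even though they are using most of what they were granted,
            // so it is not that they simply have nothing to send.
            boolean demandExists = visibleTrafficRate > 0.8 * grantedTokenRate;
            return throttled && demandExists;
        }
    }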
mrogers: ----- mrogers at UU62+3E1vKT1k+7fR0Gx7ZN2IB0 ----- 2007.06.07 - 11:32:12GMT -----

In either case, the attacker can't distinguish between requests that arrived over a hidden connection and requests that originated locally (we should attempt to ensure this by treating clients and peers identically throughout the code). But anyway it's a moot point because you've convinced me that explicit token-passing isn't necessary. :-)

toad: ----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.07 - 14:45:23GMT -----

If the attacker is connected to the node, he has a good idea of how many connections it has anyway, through intercepting swapping data. Also, having identified a single node, traffic analysis isn't so hard.