[freenet-dev] How to fix UID exploits properly?

2012-09-04 Thread Matthew Toseland
On Tuesday 04 Sep 2012 18:45:47 Matthew Toseland wrote:
> On Tuesday 04 Sep 2012 18:13:04 Evan Daniel wrote:
> > On Tue, Sep 4, 2012 at 12:29 PM, Matthew Toseland
> >  wrote:
> > > The paper, "A Traceback Attack on Freenet" ( 
> > > http://www.ee.hawaii.edu/~dong/traceback/1569649421.pdf ) presents a new 
> > > attack which relies on being able to 1) quickly connect to a node via 
> > > announcement and 2) Query it to determine whether a given request UID has 
> > > visited this node. The attack allows them (prior to 1411) to trace a 
> > > request back to its originator.
> > >
> > > 1411 makes this dramatically more difficult by not tracking UIDs of 
> > > completed requests. However, it may still be possible to do some variant 
> > > of this attack, and we should improve things further.
> > >
> ...
> > >
> > > Ian's solution
> > > ==
> > >
> > > Get rid of RejectedLoop. Always accept, never route to the same peer as 
> > > we've already routed that UID to, and RNF if we can't find any more nodes 
> > > to route to.
> > >
> > > I am worried about what this could do to routing. I don't think we should 
> > > implement it without some theoretical/simulation analysis? I can see that 
> > > it might improve things, but we need more than that given it could be 
> > > fairly significant.
> > >
> > > However it is the most comprehensive way to get rid of these problems, 
> > > and might have the least performance impact.
> > 
> > I like this solution. It was my immediate reaction to the problem 
> > description.
> > 
> > It will make local minimums harder to escape. Basically, you prevent
> > duplicating an edge along a route, rather than a node. That's a much
> > less powerful approach to avoiding minimums. I suspect FOAF routing
> > helps a lot here, but that seems like it might be problematic from a
> > security perspective as well.
> > 
> > In general, making routing better (link length distribution, mainly)
> > will make this less of an issue; local minimums are a problem that
> > results when you have too few short links, which is the current
> > problem with the network.
> 
> How concrete is this view that it will improve performance? What would it 
> take to put it on a solid footing? Is there relevant published work? Can you 
> suggest how to build a simulation to compare the two approaches (say with 
> many-nodes-one-VM to make it easy)?
> 
> I would be delighted to implement it ... but only if I can be fairly sure it 
> won't make things worse.

Okay, Evan has clarified: It will reduce performance but only slightly. I 
suspect it might be more than slightly on the poor topology we are likely to 
see on darknet. So IMHO we need more information.

Having said that, 99% of the risk is gone right now...


[freenet-dev] How to fix UID exploits properly?

2012-09-04 Thread Matthew Toseland
On Tuesday 04 Sep 2012 18:13:04 Evan Daniel wrote:
> On Tue, Sep 4, 2012 at 12:29 PM, Matthew Toseland
>  wrote:
> > The paper, "A Traceback Attack on Freenet" ( 
> > http://www.ee.hawaii.edu/~dong/traceback/1569649421.pdf ) presents a new 
> > attack which relies on being able to 1) quickly connect to a node via 
> > announcement and 2) Query it to determine whether a given request UID has 
> > visited this node. The attack allows them (prior to 1411) to trace a 
> > request back to its originator.
> >
> > 1411 makes this dramatically more difficult by not tracking UIDs of 
> > completed requests. However, it may still be possible to do some variant of 
> > this attack, and we should improve things further.
> >
...
> >
> > Ian's solution
> > ==
> >
> > Get rid of RejectedLoop. Always accept, never route to the same peer as 
> > we've already routed that UID to, and RNF if we can't find any more nodes 
> > to route to.
> >
> > I am worried about what this could do to routing. I don't think we should 
> > implement it without some theoretical/simulation analysis? I can see that 
> > it might improve things, but we need more than that given it could be 
> > fairly significant.
> >
> > However it is the most comprehensive way to get rid of these problems, and 
> > might have the least performance impact.
> 
> I like this solution. It was my immediate reaction to the problem description.
> 
> It will make local minimums harder to escape. Basically, you prevent
> duplicating an edge along a route, rather than a node. That's a much
> less powerful approach to avoiding minimums. I suspect FOAF routing
> helps a lot here, but that seems like it might be problematic from a
> security perspective as well.
> 
> In general, making routing better (link length distribution, mainly)
> will make this less of an issue; local minimums are a problem that
> results when you have too few short links, which is the current
> problem with the network.

How concrete is this view that it will improve performance? What would it take 
to put it on a solid footing? Is there relevant published work? Can you suggest 
how to build a simulation to compare the two approaches (say with 
many-nodes-one-VM to make it easy)?

I would be delighted to implement it ... but only if I can be fairly sure it 
won't make things worse.
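
For concreteness, here is the sort of many-nodes-one-VM toy I have in mind. 
Everything in it is guesswork: a plain Kleinberg-style ring stands in for the 
real topology, the parameters (N, SHORTCUTS, HTL) are made up, and greedy 
routing with backtracking stands in for fred's actual routing. It only 
compares the two loop rules in isolation:

import java.util.*;

/** Toy comparison of loop-handling rules on a Kleinberg-style ring.
 *  Not fred code; all parameters are illustrative guesses. */
public class LoopRuleSim {
    static final int N = 1000, SHORTCUTS = 4, HTL = 18, TRIALS = 10000;
    static final Random rng = new Random(42);
    static final List<Set<Integer>> graph = new ArrayList<>();

    public static void main(String[] args) {
        for (int i = 0; i < N; i++) graph.add(new TreeSet<>());
        for (int i = 0; i < N; i++) {
            link(i, (i + 1) % N);                  // ring lattice
            for (int s = 0; s < SHORTCUTS; s++) {  // ~1/d shortcut lengths
                int d = (int) Math.pow(N / 2.0, rng.nextDouble());
                link(i, (i + 1 + d) % N);
            }
        }
        for (boolean edgeRule : new boolean[]{false, true}) {
            int ok = 0; long hops = 0;
            for (int t = 0; t < TRIALS; t++) {
                int h = route(rng.nextInt(N), rng.nextInt(N), edgeRule);
                if (h >= 0) { ok++; hops += h; }
            }
            System.out.printf("%s: success %.1f%%, mean hops %.1f%n",
                edgeRule ? "never-reuse-edge (Ian)" : "never-revisit-node",
                100.0 * ok / TRIALS, ok == 0 ? 0.0 : (double) hops / ok);
        }
    }

    static void link(int a, int b) { graph.get(a).add(b); graph.get(b).add(a); }

    static double dist(int a, int b) {
        int d = Math.abs(a - b);
        return Math.min(d, N - d);                 // circular distance
    }

    /** Greedy route with backtracking; returns hops, or -1 for RNF/HTL. */
    static int route(int src, int dst, boolean edgeRule) {
        Set<Integer> visited = new HashSet<>();
        Map<Integer, Set<Integer>> routedTo = new HashMap<>();
        Deque<Integer> stack = new ArrayDeque<>();
        visited.add(src);
        stack.push(src);
        int hops = 0;
        while (!stack.isEmpty() && hops < HTL) {
            int cur = stack.peek();
            if (cur == dst) return hops;
            Integer best = null;
            for (int nb : graph.get(cur)) {
                boolean blocked = edgeRule
                    ? routedTo.containsKey(cur) && routedTo.get(cur).contains(nb)
                    : visited.contains(nb);
                if (!blocked && (best == null || dist(nb, dst) < dist(best, dst)))
                    best = nb;
            }
            if (best == null) { stack.pop(); continue; } // dead end: RNF upstream
            if (edgeRule) {
                Set<Integer> used = routedTo.get(cur);
                if (used == null) routedTo.put(cur, used = new HashSet<>());
                used.add(best);     // this node never reuses this edge
            } else {
                visited.add(best);  // this UID never revisits this node
            }
            stack.push(best);
            hops++;
        }
        return (!stack.isEmpty() && stack.peek() == dst) ? hops : -1;
    }
}

The interesting outputs are the success rate and mean hop count under each 
rule, swept over SHORTCUTS and HTL; a darknet-like comparison would need a 
much less friendly graph than this.
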
> 
> > DARKNET
> > ==
> >
> > Best solution is to make darknet easy!
> >
> > We also need to fix darknet. That means fixing Pitch Black.

We need to contact our friend who was working on that. If he's disappeared, we 
need to use what he said last, and what Oskar said, and write a simulation of 
our own. I'm not sure I 100% understand the attack, so that's one of the 
problems.
> 
> Among other problems. Location stability interactions with datastore,

Yes, that's a longer term problem.

> and opennet/darknet hybrid nodes, in particular.

I am open to any suggestions.
> 
> It also means we need to focus on the user experience when setting up
> darknet, which currently sucks.

I understand how to make major improvements to functionality/ease of use. 
However, I need to be about 5 people at the moment. :)


[freenet-dev] How to fix UID exploits properly?

2012-09-04 Thread Matthew Toseland
...we don't want the nodes to get dropped immediately after their 5 minute 
grace period expires, by the first request that happens and tries to path 
fold, or by announcements? One thing we could do would be to make sure the 
inter-accept interval is longer than the average request time?

We could cripple announcement to make it a bit slower to reach targeted nodes. 
However, this would reduce first-time performance. Freenet 0.4/0.5 used a 
collaborative random number generation algorithm and hoped the node would 
migrate to where it should be; I don't think it worked well. We could perhaps 
reduce the precision of announcement a bit, as a partial solution, but again 
it would reduce first-time performance *and* it would reduce performance under 
a slashdot effect.

Paper's solution
==

The paper suggests we get rid of the various failure modes. Unfortunately, we 
need most of them:
- RejectedOverload: We need this to be distinct for load management.
- RouteNotFound: This is non-fatal, and reduces the number of hops.
- DataNotFound: This is fatal, and terminates the request.

Combining RejectedLoop with RNF might be possible. I'm not sure whether or not 
there would be enough information to figure out when it's a loop and when it's 
a genuine RNF; although the attacker may know your peers, lots of things can 
affect routing, e.g. per-node failure tables.

We definitely need to reduce the number of distinct failure messages.

Purely random termination (no HTL) might allow for major simplifications, as 
well as reducing the information on the requests, but it would greatly increase 
the variability of request completion times (making timeouts more difficult), 
and might undermine some of the more important optimisations such as per-node 
failure tables. (I'm not sure whether that's absolutely certain, as it acts 
only after multiple requests...)
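
To make the variability concrete, a throwaway Monte Carlo (nothing here is 
fred code; p is picked so the mean path length matches today's HTL of 18):

import java.util.Random;

/** Illustrative only: hop-count spread under purely random termination.
 *  With per-hop stop probability p, hops are geometric: mean 1/p but
 *  standard deviation sqrt(1-p)/p, i.e. about as large as the mean. */
public class RandomTerminationSpread {
    public static void main(String[] args) {
        double p = 1.0 / 18.0;        // chosen so the mean matches HTL 18
        Random rng = new Random();
        int trials = 100000; long sum = 0, sumSq = 0;
        for (int i = 0; i < trials; i++) {
            int hops = 1;
            while (rng.nextDouble() > p) hops++;  // continue with prob 1-p
            sum += hops; sumSq += (long) hops * hops;
        }
        double mean = sum / (double) trials;
        double sd = Math.sqrt(sumSq / (double) trials - mean * mean);
        // Expect mean ~18, stddev ~17.5: timeouts become hard to choose.
        System.out.printf("mean=%.1f stddev=%.1f%n", mean, sd);
    }
}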

Ian's solution
==

Get rid of RejectedLoop. Always accept, never route to the same peer as we've 
already routed that UID to, and RNF if we can't find any more nodes to route to.
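
Roughly the bookkeeping this implies; a minimal sketch with hypothetical 
names, not fred's actual classes:

import java.util.*;

/** Minimal sketch of the bookkeeping Ian's rule implies. The names are
 *  hypothetical, not fred's actual classes. */
class UidRoutingTable {
    interface PeerNode {}  // stands in for fred's peer abstraction

    // UID -> peers we have already routed that UID to.
    private final Map<Long, Set<PeerNode>> routedTo = new HashMap<>();

    /** Best peer not yet tried for this UID, or null, meaning RNF.
     *  Note there is no loop check at all: incoming requests are
     *  always accepted. */
    synchronized PeerNode routeNext(long uid, List<PeerNode> byCloseness) {
        Set<PeerNode> tried = routedTo.get(uid);
        if (tried == null) routedTo.put(uid, tried = new HashSet<>());
        for (PeerNode p : byCloseness)      // assumed sorted, closest first
            if (tried.add(p)) return p;     // mark as tried and route
        return null;                        // all peers tried: RouteNotFound
    }

    /** Entries must still be dropped once the request completes, or this
     *  leaks memory and, worse, becomes a "seen this UID?" oracle again. */
    synchronized void forget(long uid) { routedTo.remove(uid); }
}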

I am worried about what this could do to routing. I don't think we should 
implement it without some theoretical/simulation analysis? I can see that it 
might improve things, but we need more than that given it could be fairly 
significant.

However it is the most comprehensive way to get rid of these problems, and 
might have the least performance impact.

DARKNET
==

Best solution is to make darknet easy!

We also need to fix darknet. That means fixing Pitch Black.


[freenet-dev] Opennet sucks: major public expose imminent

2012-09-04 Thread Matthew Toseland
http://www.ee.hawaii.edu/~dong/traceback/index.htm
I will post in more detail about the first paper, although IMHO it's largely 
fixed. I haven't read the second paper yet. However the bigger threat is their 
claim to have a range of attacks (some of them?) in their public GitHub repo. I 
haven't looked at that yet either.

However, I repeat my traditional line:
OPENNET SUCKS, AND IT'S ONLY A MATTER OF TIME BEFORE SOMEBODY PRODUCES A GOOD 
SET OF TOOLS FOR ATTACKING IT.

This is more relevant because "somebody" appears to be a specific group of 
academics!


[freenet-dev] How to fix UID exploits properly?

2012-09-04 Thread Evan Daniel
On Tue, Sep 4, 2012 at 12:29 PM, Matthew Toseland
 wrote:
> The paper, "A Traceback Attack on Freenet" ( 
> http://www.ee.hawaii.edu/~dong/traceback/1569649421.pdf ) presents a new 
> attack which relies on being able to 1) quickly connect to a node via 
> announcement and 2) Query it to determine whether a given request UID has 
> visited this node. The attack allows them (prior to 1411) to trace a request 
> back to its originator.
>
> 1411 makes this dramatically more difficult by not tracking UIDs of completed 
> requests. However, it may still be possible to do some variant of this 
> attack, and we should improve things further.
>
> REQUEST UIDS
> =
>
> Several kinds of requests were easily exploitable for #2. All that is needed 
> is:
> - Send a bogus request, which will be parsed lazily.
> - If it's not been seen, it will be Accepted, and then fail because it fails 
> to parse, so it won't pollute the attacker's search space.
> - If it has been seen, it will be RejectedLoop.
>
> Several kinds of requests allowing this have been fixed:
> - Old-style probe requests (removed)
> - Announcements (fixed)
>
> However, for inserts, especially CHK inserts, this does not appear to be 
> easily fixable:
> A -> B: I want to do an insert
> B -> A: Accepted, send me the data
> A -> B: Here is the data ...
> B routes the insert onwards once the data transfer starts.
> The data transfer fails, so B *stops routing the insert*, as do downstream 
> nodes.
>
> For SSK inserts, we don't route onwards until we have all 3 components (data, 
> headers, and pubkey). Sometimes we don't need to send the pubkey, and given 
> there are a lot more SSKs than CHKs,
>
> In both cases, we currently accept or reject the insert before we have all 
> the data.
>
> Possible solution for CHKs:
> - Always route to the end even if the data transfer fails. Cost relatively 
> low as we don't need to wait for the data transfers, so it's close to a 
> normal failing request. Messy to implement but only because CHKInsertSender 
> is messy. Also we'd have to turn off kill-on-fatalTimeout (temporarily) 
> before deploying this. AND
> - Combine the DataInsert with the initial InsertRequest. All it really 
> includes is the transfer UID. However do not start sending the data until we 
> get the Accepted (may require that we tolerate a slightly longer timeout on 
> the first block received). Slight improvement in latency, so this is actually 
> beneficial; no bandwidth wasted.
>
> Related issues for CHKs:
> - One day, we may want bulk inserts and requests to transfer and verify the 
> whole block before relaying. The basic advantage is we could then do a kind 
> of non-onion mixing to confuse traffic analysis.
>
> Solution for SSKs:
> - Don't check the UID until after we have received all the data.
>
> Related issues for SSKs:
> - We don't send the pubkey unless asked for it in the accepted message. This 
> is marginally bad in that it allows a relatively easy way to probe the 
> datastore for an SSK pubkey. On the other hand it's a useful bandwidth 
> optimisation, especially given the sheer numbers of SSK requests, and remote 
> datastore probing is trivial anyway (and not much use given we don't store 
> locally requested stuff in it).
> - We could always send the pubkey, or just send it when doing a realtime 
> request.
> - Currently we have two ways to send an SSK request - one just sends the key, 
> the other sends the data and headers as well. We could always send the data 
> and headers; this would reduce latency but would waste bandwidth when we are 
> rejected. So it may make sense for realtime inserts only. Obviously we'd want 
> the message to indicate whether to expect such data, so we can get rid of it 
> even if we reject.
>
> More general issue:
> Maybe we should check for overload before we check for loops? This would 
> require some new (and slightly hacky) structures, and would mean we have to 
> get rid of or alter some less important current load heuristics. But it would 
> allow us to avoid wasting bandwidth in the case of SSK inserts (we check for 
> overload, send accepted, then wait for all the data, then check for loops, 
> then relay), and would reduce the amount of information leaked in all cases.
>
> Overload is much more common than loops, although some small darknets may 
> have a lot of loops.
>
> Overall impact of UID-level fixes: The group of nodes with UIDs equal to the 
> original request is clouded by the attacker's requests. He might still be 
> able to make some progress but only if he correlates the same procedure for 
> multiple nodes. In which case he's probably better off doing some of the 
> other known opennet attacks, or combining it ... However, see Ian's solution.
>
> Opennet/Announcement:
> ===
>
> Section IIIA:
> - Perverse effects of the current replacement heuristics? The dominant 
> heuristic is that any of the neighbour categories has served at least 10 
> > requests... This means if you do requests near the node's location you can 
> > get your node accepted.

[freenet-dev] Disk I/O thread

2012-09-04 Thread Matthew Toseland
On Sunday 02 Sep 2012 17:51:49 xor wrote:
> On Thursday 30 August 2012 00:40:13 Matthew Toseland wrote:
> > Sadly Freetalk/WoT do use rollback so they have to
> > commit EVERY TIME.
> 
> I object to the "sadly". Transaction-based programming has proven to be a 
> valid approach to solving many of the problems of traditional "manually undo 
> everything upon error" programming.

Let's get one thing clear to begin with: I am not advocating that Freetalk/WoT 
aggregate transactions by simply not committing. The basic design flaw in 
Fred's use of the database layer was to assume that object databases work as 
advertised: that you can simply take a big, complex in-memory structure, 
persist it more or less transparently, and then add a load of (de)activation 
code to reduce memory usage.

Freetalk and WoT may be better designed on that level. However, they produce 
more disk I/O. The reason is that mainstream database practice and theory are 
designed for two cases:
1. Lots of data (in absolute amount and transactions/sec) on fast, reliable 
hardware with professional sysadmins.
2. Tiny amounts of data on commodity hardware.

If you have lots of data on commodity disks, it falls down. And note that 
mainstream use of databases in the second case frequently fails. For example, I 
have a huge set of bookmarks in a tree on one of my browsers. Every time I 
create a new category, another one gets renamed.

> The approach of fred to not use it is just wrong from a computer-science 
> perspective.
> 
> The fact that it does not perform so well is an implementation issue of the 
> database, not of the client code which uses the database.

No, it is not. The load you put on it requires many seeks.

Of course, it might require fewer seeks if it were using a well-designed SQL 
schema rather than trying to store objects. And I'm only talking about writes 
here; obviously if everything doesn't fit in memory you need to do seeks on 
read as well, and they can be very involved given db4o's lack of two-column 
indexes. However, because of the constant fsyncs, we still need loads of seeks 
*even if the whole thing fits in the OS disk cache*!
> 
> Also, Freetalk/WoT themselves are CPU/IO hogs due to them using the wrong 
> algorithms, so we are not even at the point where we can tell whether their 
> ACID-usage is a problem. There is too much noise generated from the highly 
> inefficient algorithms which are being used by them, we cannot measure the 
> performance impact of commit/rollback.

That's possible.

However, the bottom line is that if you have to commit every few seconds, you 
have to fsync every few seconds. IMHO avoiding that problem, for instance by 
turning off fsync and making periodic backups, would dramatically improve 
performance.
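
For instance, and purely as a generic illustration (this is not db4o's API; 
the class and the interval are invented): treat commit as a buffered write and 
fsync on a timer, so a crash can lose at most the last interval's writes, 
which the periodic backups then cover:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Invented example, not db4o's API: commits are plain buffered writes;
 *  one fsync per interval replaces one fsync per commit. */
class BatchedFsyncLog implements AutoCloseable {
    private final FileChannel ch;
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();

    BatchedFsyncLog(Path path, long fsyncIntervalMs) throws IOException {
        ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        timer.scheduleAtFixedRate(this::force, fsyncIntervalMs,
                fsyncIntervalMs, TimeUnit.MILLISECONDS);
    }

    /** "Commit": no fsync on the hot path, hence no forced seek. */
    synchronized void commit(byte[] record) throws IOException {
        ch.write(ByteBuffer.wrap(record));
    }

    private synchronized void force() {
        try { ch.force(false); }  // the only fsync, once per interval
        catch (IOException e) { e.printStackTrace(); }
    }

    @Override public synchronized void close() throws IOException {
        timer.shutdown();
        ch.force(true);
        ch.close();
    }
}
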

As regards Freenet, I am leaning towards a hand-coded on-disk structure. We 
don't use queries anyway, and mostly we don't need them; most of the data could 
be handled as a series of flat files, and it would be far more robust than 
db4o, and likely faster too. But the first thing to do is to upgrade the 
database and see whether it 1) breaks even worse or 2) fixes the problems. Past 
testing suggests #1, but that testing was based on later versions of 7.4, not 
on the latest 7.12.


[freenet-dev] How to fix UID exploits properly?

2012-09-04 Thread Matthew Toseland
The paper, "A Traceback Attack on Freenet" ( 
http://www.ee.hawaii.edu/~dong/traceback/1569649421.pdf ) presents a new attack 
which relies on being able to 1) quickly connect to a node via announcement and 
2) Query it to determine whether a given request UID has visited this node. The 
attack allows them (prior to 1411) to trace a request back to its originator.

1411 makes this dramatically more difficult by not tracking UIDs of completed 
requests. However, it may still be possible to do some variant of this attack, 
and we should improve things further.

REQUEST UIDS
=

Several kinds of requests were easily exploitable for #2. All that is needed 
is the following (sketched in code after the list):
- Send a bogus request, which will be parsed lazily.
- If it's not been seen, it will be Accepted, and then fail because it fails to 
parse, so it won't pollute the attacker's search space.
- If it has been seen, it will be RejectedLoop.
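
Concretely, the oracle looks something like this; a sketch with invented 
names, where the only real point is that the loop check runs before parsing:

/** Hypothetical attacker logic; the names are invented for illustration.
 *  Lazy parsing turns the accept/reject decision into a one-bit oracle
 *  for "has this node seen UID x?". */
class UidProbe {
    enum Reply { ACCEPTED, REJECTED_LOOP }

    interface Target {                     // stands in for a peer connection
        Reply send(long uid, byte[] body); // body deliberately fails to parse
    }

    /** True if the target has already handled the traced request. */
    static boolean hasSeen(Target target, long tracedUid) {
        // An ACCEPTED probe dies during parsing and never enters the
        // target's UID table, so probing leaves no trace in the
        // attacker's own search space.
        Reply r = target.send(tracedUid, new byte[]{0x00});
        return r == Reply.REJECTED_LOOP;   // loop check ran before parsing
    }
}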

Several kinds of requests allowing this have been fixed:
- Old-style probe requests (removed)
- Announcements (fixed)

However, for inserts, especially CHK inserts, this does not appear to be easily 
fixable:
A -> B: I want to do an insert
B -> A: Accepted, send me the data
A -> B: Here is the data ...
B routes the insert onwards once the data transfer starts.
The data transfer fails, so B *stops routing the insert*, as do downstream 
nodes.

For SSK inserts, we don't route onwards until we have all 3 components (data, 
headers, and pubkey). Sometimes we don't need to send the pubkey, and given 
there are a lot more SSKs than CHKs, 

In both cases, we currently accept or reject the insert before we have all the 
data.

Possible solution for CHKs:
- Always route to the end even if the data transfer fails. Cost relatively low 
as we don't need to wait for the data transfers, so it's close to a normal 
failing request. Messy to implement but only because CHKInsertSender is messy. 
Also we'd have to turn off kill-on-fatalTimeout (temporarily) before deploying 
this. AND
- Combine the DataInsert with the initial InsertRequest. All it really includes 
is the transfer UID. However, do not start sending the data until we get the 
Accepted (may require that we tolerate a slightly longer timeout on the first 
block received). Slight improvement in latency, so this is actually beneficial; 
no bandwidth wasted. See the sketch below.
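
Roughly (hypothetical fields, not the actual wire format):

/** Rough shape of the combined message. The fields are hypothetical,
 *  not the actual wire format. */
class CombinedCHKInsert {
    static final class InsertRequest {
        final long requestUid;
        final long transferUid;  // previously carried by a separate DataInsert
        InsertRequest(long requestUid, long transferUid) {
            this.requestUid = requestUid;
            this.transferUid = transferUid;
        }
    }
    // A-side outline:
    //   send(new InsertRequest(uid, transferUid));
    //   waitFor(Accepted);               // tolerate a longer first-block
    //   startBulkTransfer(transferUid);  // timeout on B's side
    // B-side outline:
    //   on InsertRequest: accept/reject exactly as before;
    //   once Accepted is sent: expect blocks on transferUid, and keep
    //   routing onwards to completion even if the transfer later fails.
}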

Related issues for CHKs:
- One day, we may want bulk inserts and requests to transfer and verify the 
whole block before relaying. The basic advantage is we could then do a kind of 
non-onion mixing to confuse traffic analysis.

Solution for SSKs:
- Don't check the UID until after we have received all the data.

Related issues for SSKs:
- We don't send the pubkey unless asked for it in the accepted message. This is 
marginally bad in that it allows a relatively easy way to probe the datastore 
for an SSK pubkey. On the other hand it's a useful bandwidth optimisation, 
especially given the sheer numbers of SSK requests, and remote datastore 
probing is trivial anyway (and not much use given we don't store locally 
requested stuff in it).
- We could always send the pubkey, or just send it when doing a realtime 
request.
- Currently we have two ways to send an SSK request - one just sends the key, 
the other sends the data and headers as well. We could always send the data and 
headers; this would reduce latency but would waste bandwidth when we are 
rejected. So it may make sense for realtime inserts only. Obviously we'd want 
the message to indicate whether to expect such data, so we can get rid of it 
even if we reject.

More general issue:
Maybe we should check for overload before we check for loops? This would 
require some new (and slightly hacky) structures, and would mean we have to get 
rid of or alter some less important current load heuristics. But it would allow 
us to avoid wasting bandwidth in the case of SSK inserts (we check for 
overload, send accepted, then wait for all the data, then check for loops, then 
relay), and would reduce the amount of information leaked in all cases.
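
In outline (invented names again):

/** Sketch of the reordered checks, with invented names. Load is checked
 *  first; for SSK inserts the loop check waits until data, headers and
 *  pubkey have all arrived, so RejectedLoop stops being a cheap probe
 *  and no payload is relayed for a request we would reject anyway. */
class InsertAcceptor {
    enum Decision { REJECTED_OVERLOAD, ACCEPTED }

    Decision onSSKInsertRequest(long uid) {
        if (overloaded()) return Decision.REJECTED_OVERLOAD; // checked first
        return Decision.ACCEPTED;      // loop check deliberately deferred
    }

    /** Called only once all the data has been received. */
    boolean onSSKInsertDataComplete(long uid) {
        if (!registerUid(uid)) return false; // loop: fail late, relay nothing
        relay(uid);
        return true;
    }

    boolean overloaded() { return false; }          // placeholder heuristic
    boolean registerUid(long uid) { return true; }  // placeholder UID table
    void relay(long uid) {}                         // placeholder routing
}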

Overload is much more common than loops, although some small darknets may have 
a lot of loops.

Overall impact of UID-level fixes: the set of nodes that have seen the original 
request's UID is clouded by the attacker's own requests. He might still be able 
to make some progress, but only if he correlates the same procedure across 
multiple nodes. In which case he's probably better off doing some of the other 
known opennet attacks, or combining it ... However, see Ian's solution.

Opennet/Announcement:
===

Section IIIA:
- Perverse effects of the current replacement heuristics? The dominant 
heuristic is that any of the neighbour categories has served at least 10 
requests... This means if you do requests near the node's location you can get 
your node accepted. Can we improve on this? I do think the principle of 
limiting by the number of requests makes sense - we don't want the nodes to get 
dropped immediately after their 5 minute grace period expires, by the first 
request that happens and tries to path fold, or by announcements?
