Re: Logging of Web Usage

2003-04-05 Thread Bill Stewart
At 11:32 AM 04/03/2003 -0800, Bill Frantz wrote:
> Ah yes, I haven't updated my timings for the new machines that are faster
> than my 550 MHz.  :-)
> The only other item of importance is that the exhaustive search time isn't
> the time to reverse one IP, but the time to reverse all the IPs that have
> been recorded.

Also, until recently, there was the problem that storing a hash value
for every IP address took 8-10 bytes * 2**32, and the resulting 32-40GB
was an annoyingly large storage quantity, requiring a deck of Exabyte tapes
or corporate-budget quantities of disk drives, which meant that
sorting the results was also awkward.  These days, disk drive prices
are $1/GB at Fry's for 3.5" IDE drives, so there's no reason not to have
120GB on your desktop.
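
As a rough check of the storage figure, here is a minimal Python sketch. The 8 and 10 byte entry sizes come from the paragraph above; the function name is only illustrative, and sizes are reported in binary gigabytes (2**30 bytes).

```python
# Size of a table holding one (truncated) hash value per IPv4 address.
ADDRESS_SPACE = 2 ** 32  # number of possible IPv4 addresses


def table_size_gib(bytes_per_entry: int) -> float:
    """Return the table size in GiB (2**30 bytes) for the given entry size."""
    return bytes_per_entry * ADDRESS_SPACE / 2 ** 30


for entry_size in (8, 10):
    print(f"{entry_size} bytes/entry -> {table_size_gib(entry_size):.0f} GiB")
```

Either size fits on a single 120GB drive with room left over for sorting.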
This does mean that if you're keeping hashed logs you should probably
use some sort of keyed hash - even if you don't change the keys often,
you've at least prevented pre-computed dictionary attacks over the
entire IPv4 address space, and the key should be long enough (e.g. 128 bit)
so that dictionary attacks on the IP addresses of Usual Suspects
also can't be precomputed.
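
A keyed hash of this kind might look like the following sketch. HMAC-SHA256 is my choice for illustration, not something named in this thread, and `LOG_KEY` and `pseudonymize_ip` are hypothetical names. It assumes the 128-bit key is generated once and stored apart from the logs.

```python
import hashlib
import hmac
import ipaddress
import secrets

# Hypothetical 128-bit secret key. In practice it would be generated once,
# kept separately from the logs, and reused so that hashes stay comparable.
LOG_KEY = secrets.token_bytes(16)


def pseudonymize_ip(ip: str, key: bytes = LOG_KEY) -> str:
    """Return the HMAC-SHA256 of an IPv4 address as a hex string."""
    packed = ipaddress.IPv4Address(ip).packed  # the 4 raw address bytes
    return hmac.new(key, packed, hashlib.sha256).hexdigest()


print(pseudonymize_ip("192.0.2.1"))
```

Without the key, an attacker cannot precompute a dictionary over the address space; with the key, the whole space can still be enumerated, so the key has to be protected at least as well as raw logs would be.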
A related question is keeping lists of public information,
e.g. don't-spam lists, in some form that isn't readily abusable,
such as hashed addresses.  The possible namespace there is much larger,
but the actual namespace isn't likely to be more than a couple of billion,
in spite of the number of spammers selling their lists of 9 billion names.
There's also the question of how exact a match you need:
if mail is for [EMAIL PROTECTED], you'd ideally like to be able to check
[EMAIL PROTECTED], [EMAIL PROTECTED], and @example.com,
which makes the lookup process more complex.
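
One way the multi-level lookup could work is sketched below. The exact variants to check are an assumption on my part (the full address, then each enclosing domain as a wildcard), as are the function names, the use of SHA-256, and the choice to lowercase the whole address.

```python
import hashlib


def lookup_keys(address: str) -> list[str]:
    """Return candidate strings to check, from most to least specific."""
    # Assumption: the list stores lowercased entries.
    local, _, domain = address.lower().partition("@")
    keys = [f"{local}@{domain}"]
    labels = domain.split(".")
    # Walk up the domain, stopping before the bare top-level domain.
    for i in range(len(labels) - 1):
        keys.append("@" + ".".join(labels[i:]))
    return keys


def hashed(value: str) -> str:
    """Hash one list entry or lookup key."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def is_listed(address: str, hashed_list: set[str]) -> bool:
    """Check every candidate key against the hashed list."""
    return any(hashed(key) in hashed_list for key in lookup_keys(address))


print(lookup_keys("user@mail.example.com"))
```

Each incoming address costs a handful of hash lookups rather than one, which is the extra complexity mentioned above.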
-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: Logging of Web Usage

2003-04-04 Thread Ben Laurie
Bill Frantz wrote:
> At 6:16 PM -0800 4/2/03, Seth David Schoen wrote:
> > Bill Frantz writes:
> > > The http://cryptome.org/usage-logs.htm URL says:
> > >
> > > Low resolution data in most cases is intended to be sufficient for
> > > marketing analyses.  It may take the form of IP addresses that have been
> > > subjected to a one way hash, to refer URLs that exclude information other
> > > than the high level domain, or temporary cookies.
> > >
> > > Note that since IPv4 addresses are 32 bits, anyone willing to dedicate a
> > > computer for a few hours can reverse a one way hash by exhaustive search.
> > > Truncating IPs seems a much more privacy friendly approach.
> > >
> > > This problem would be less acute with IPv6 addresses.
> >
> > I'm skeptical that it will even take a few hours; on a 1.5 GHz
> > desktop machine, using openssl speed, I see about a million hash
> > operations per second.  (It depends slightly on which hash you choose.)
> > This is without compiling OpenSSL with processor-specific optimizations.
>
> Ah yes, I haven't updated my timings for the new machines that are faster
> than my 550 MHz.  :-)
> The only other item of importance is that the exhaustive search time isn't
> the time to reverse one IP, but the time to reverse all the IPs that have
> been recorded.

You only need to build the dictionary once.
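
To illustrate, here is a minimal sketch of such a dictionary. It assumes the logger applied unkeyed SHA-1 to the four raw address bytes, which is my assumption and not something stated in the thread, and it covers only a /16 so that it runs quickly; the function name is hypothetical.

```python
import hashlib
import ipaddress


def build_dictionary(network: str) -> dict[bytes, str]:
    """Map the SHA-1 digest of each packed address in `network` to the address."""
    table = {}
    for addr in ipaddress.ip_network(network):
        table[hashlib.sha1(addr.packed).digest()] = str(addr)
    return table


# A /16 (65,536 addresses) keeps the demo quick; the full IPv4 space is 2**32.
table = build_dictionary("192.0.0.0/16")

# Once the table exists, each logged hash is reversed with a single lookup.
logged = [
    hashlib.sha1(ipaddress.IPv4Address(ip).packed).digest()
    for ip in ("192.0.2.1", "192.0.113.7")
]
print([table.get(digest) for digest in logged])
```

The build cost is paid once; after that, reversing a whole log is one lookup per entry.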

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html   http://www.thebunker.net/
There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff


Re: Logging of Web Usage

2003-04-03 Thread Bill Frantz
At 6:16 PM -0800 4/2/03, Seth David Schoen wrote:
> Bill Frantz writes:
>
> > The http://cryptome.org/usage-logs.htm URL says:
> >
> > Low resolution data in most cases is intended to be sufficient for
> > marketing analyses.  It may take the form of IP addresses that have been
> > subjected to a one way hash, to refer URLs that exclude information other
> > than the high level domain, or temporary cookies.
> >
> > Note that since IPv4 addresses are 32 bits, anyone willing to dedicate a
> > computer for a few hours can reverse a one way hash by exhaustive search.
> > Truncating IPs seems a much more privacy friendly approach.
> >
> > This problem would be less acute with IPv6 addresses.
>
> I'm skeptical that it will even take a few hours; on a 1.5 GHz
> desktop machine, using openssl speed, I see about a million hash
> operations per second.  (It depends slightly on which hash you choose.)
> This is without compiling OpenSSL with processor-specific optimizations.

Ah yes, I haven't updated my timings for the new machines that are faster
than my 550 MHz.  :-)

The only other item of importance is that the exhaustive search time isn't
the time to reverse one IP, but the time to reverse all the IPs that have
been recorded.
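
For a sense of scale, a quick back-of-the-envelope calculation in Python. The rate of a million hashes per second is the figure quoted above; the function name is illustrative only.

```python
ADDRESS_SPACE = 2 ** 32        # every possible IPv4 address
HASHES_PER_SECOND = 1_000_000  # rate quoted above for a 1.5 GHz desktop


def full_search_minutes(rate: float = HASHES_PER_SECOND) -> float:
    """Minutes needed to hash the entire IPv4 address space once."""
    return ADDRESS_SPACE / rate / 60


print(f"{full_search_minutes():.0f} minutes")
```

One pass over the address space produces the hash of every possible address, so the same pass covers every recorded IP, however many there are.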

Cheers - Bill


-
Bill Frantz   | Due process for all| Periwinkle -- Consulting
(408)356-8506 | used to be the | 16345 Englewood Ave.
[EMAIL PROTECTED] | American way.  | Los Gatos, CA 95032, USA





Re: Logging of Web Usage

2003-04-03 Thread Roop Mukherjee
Could this not use most of the code from the Onion Router itself? I am
assuming that the code was made freely available and someone has a copy of
it.

-- roop

On Thu, 3 Apr 2003, Ben Laurie wrote:
> Ben.
>
> [1] FWIW, I'd be willing to work on that, but not on my own (unless
> someone wants to keep me in the style to which I am accustomed, that is).




Re: Logging of Web Usage

2003-04-03 Thread Ben Laurie
John Young wrote:
> Ben,
>
> Would you care to comment for publication on web logging
> described in these two files:
>
>   http://cryptome.org/no-logs.htm
>
>   http://cryptome.org/usage-logs.htm
>
> Cryptome invites comments from others who know the capabilities
> of servers to log or not, and other means for protecting user privacy
> by users themselves rather than by reliance upon privacy policies
> of site operators and government regulation.
>
> This relates to the data retention debate and current initiatives
> of law enforcement to subpoena, surveil, steal and manipulate
> log data.

I don't have time right now to comment in detail (I will try to later),
but it seems to me that, as someone else commented, relying on operators 
to not keep logs is really not the way to go. If you want privacy or 
anonymity, then you have to create it for yourself, not expect others to 
provide it for you.

Of course, it is possible to reduce your exposure to others whilst still 
taking advantage of privacy-enhancing services they offer. Two obvious 
examples of this are the mixmaster anonymous remailer network, and onion 
routing.

It seems to me if you want to make serious inroads into privacy w.r.t. 
logging of traffic, then what you want to put your energy into is onion 
routing. There is _still_ no deployable free software to do it, and that 
is ridiculous[1]. It seems to me that this is the single biggest win we 
can have against all sorts of privacy invasions.

Make log retention useless for any purpose other than statistics and 
maintenance. Don't try to make it only used for those purposes.

Cheers,

Ben.

[1] FWIW, I'd be willing to work on that, but not on my own (unless 
someone wants to keep me in the style to which I am accustomed, that is).

--
http://www.apache-ssl.org/ben.html   http://www.thebunker.net/
There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff


Re: Logging of Web Usage

2003-04-02 Thread Bill Frantz
At 2:58 PM -0800 4/2/03, John Young wrote:
> Ben,
>
> Would you care to comment for publication on web logging
> described in these two files:
>
>   http://cryptome.org/no-logs.htm
>
>   http://cryptome.org/usage-logs.htm
>
> Cryptome invites comments from others who know the capabilities
> of servers to log or not, and other means for protecting user privacy
> by users themselves rather than by reliance upon privacy policies
> of site operators and government regulation.
>
> This relates to the data retention debate and current initiatives
> of law enforcement to subpoena, surveil, steal and manipulate
> log data.
>
> Thanks,
>
> John

The http://cryptome.org/usage-logs.htm URL says:

Low resolution data in most cases is intended to be sufficient for
marketing analyses.  It may take the form of IP addresses that have been
subjected to a one way hash, to refer URLs that exclude information other
than the high level domain, or temporary cookies.

Note that since IPv4 addresses are 32 bits, anyone willing to dedicate a
computer for a few hours can reverse a one way hash by exhaustive search.
Truncating IPs seems a much more privacy friendly approach.

This problem would be less acute with IPv6 addresses.
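
Truncation could be as simple as the sketch below. Keeping the top 24 bits is an assumed default, not a recommendation from the text, and `truncate_ip` is a hypothetical name.

```python
import ipaddress


def truncate_ip(ip: str, keep_bits: int = 24) -> str:
    """Zero the low-order bits of an IPv4 address, keeping `keep_bits`."""
    network = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(network.network_address)


print(truncate_ip("192.0.2.123"))      # 192.0.2.0
print(truncate_ip("192.0.2.123", 16))  # 192.0.0.0
```

Unlike a hash, the discarded bits are simply gone, so no amount of searching recovers the original address from the log entry alone.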

Cheers - Bill


-
Bill Frantz   | Due process for all| Periwinkle -- Consulting
(408)356-8506 | used to be the | 16345 Englewood Ave.
[EMAIL PROTECTED] | American way.  | Los Gatos, CA 95032, USA


