What's the big fuss about IP addresses?
http://www.aquick.org/blog/2006/01/29/whats-the-big-fuss-about-ip-addresses/

Given the recent fuss about the government asking for search terms and what
qualifies as personally identifiable information, I want to explain why IP
address logging is a big deal. This explanation is somewhat simplified to
make the cases easier to understand without going into complete detail of
all of the possible configurations, of which there are many. I think I've
kept the important stuff without dwelling on the boundary cases, and be
aware that your setup may differ somewhat. If you feel I've glossed over
something important, please leave a comment.

First, a brief discussion of what IP addresses are and how they work.
Slightly simplified, every device that is connected to the Internet has a
unique number that identifies it, and this number is called an IP address.
Whenever you send any normal network traffic to any other computer on the
network (request a web page, send an email, etc.), it is marked with your IP
address.

There are three standard cases to worry about:

   1. If you use dialup, your analog modem has an IP address. Remote
computers see this IP address. (This case also applies if you're using a
data aircard, or using your cell phone as a modem.)
   2. If you have a DSL or cable connection, your DSL/cable modem has an IP
address when it's connected, and your computer has a separate internal IP
address that it uses only to communicate with the DSL or cable modem,
typically mediated by a home router. Remote computers see the IP address of
the DSL/cable modem. (This case also applies if you're using a mobile wifi
hotspot.)
   3. If you're directly connected to the internet via a network adapter,
your network adapter has an IP address. Remote computers see this IP
address.
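
The difference between case 2 and the other two can be sketched in a few
lines of Python. This is only an illustration under simplifying assumptions:
the function name is made up, the home router is assumed to substitute the
modem's address for any internal address in the standard private ranges, and
the sample addresses come from ranges reserved for documentation.

```python
import ipaddress

# The private ranges commonly used for internal home and office networks.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def seen_by_remote(computer_ip: str, modem_ip: str) -> str:
    """Return the address a remote server would record (simplified)."""
    addr = ipaddress.ip_address(computer_ip)
    # An internal address never leaves the home network; the router
    # presents the modem's address to the outside world instead.
    if any(addr in net for net in PRIVATE_RANGES):
        return modem_ip
    # Otherwise the computer's own address is what the remote side sees.
    return computer_ip

# Case 2: a computer behind a DSL/cable modem and home router.
print(seen_by_remote("192.168.1.10", "203.0.113.7"))
# Cases 1 and 3: the device's own address is visible directly.
print(seen_by_remote("198.51.100.20", "203.0.113.7"))
```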

Sometimes, IP addresses are static, meaning they're manually assigned and
don't change automatically unless someone changes them (typically, only for
case #3). Often, they're dynamic, which means they're assigned automatically
with a protocol called DHCP, which allows a new network connection to
automatically pick up an IP address from an available pool. But just because
they can change doesn't mean they will change. Even dynamic IP addresses can
remain the same for months or years at a time. (The servers you're
communicating with also have IP addresses, and they are typically static.)

In order to see how an IP address may be personally identifiable
information, there's a critical question to ask: "where do IP addresses
come from, and what information can they be correlated with?"

Depending on how you connect to the internet, your IP address may come from
different places:

    * If you use dialup, your modem will get its IP address from the dialup
ISP, with which you have an account. The ISP knows who you are and can
correlate the IP address they give you with your account. Your name and
billing details are part of your account information. By recording the phone
number you call from, they may be able to identify your physical location.
    * If you have a DSL or cable connection, your DSL/cable modem will get
its IP address from the DSL/cable provider. The ISP knows who you are and
can correlate the IP address they give you with your account. Your name and
physical location, and probably other information about you, are part of
your account information.
    * If you're using a public wifi access point, you're probably using the
IP address of the access point itself. If you had to log in to an account,
your name and physical location, and probably other information about you,
are part of your account information. If you're using someone else's open
wifi point, you look like them to the rest of the internet. This case is an
exception to the rest of the points outlined in this article.
    * If you're directly connected to the internet via a network adapter,
your network adapter will get its IP address from the network provider. In
an office, this is typically the network administrator of the company. Your
network administrator knows which computer has which IP address.

None of this information is secret in the traditional sense. It is probably
confidential business information, but in all cases, someone knows it, and
the only thing keeping it from being further revealed is the willingness or
lack thereof of the company or person who knows it.

While an IP address may not be enough to identify you personally, there are
strong correlations of various degrees, and in most cases, those
correlations are only one step away. By itself, an IP address is just a
number. But it's trivial to find out who is responsible for that address,
and thus who to ask if you want to know who it's been given out to. In some
cases the logs are kept indefinitely; in others they are destroyed on a
regular basis. It's entirely up to each individual organization.
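
To make "finding out who is responsible" concrete, here is a rough sketch of
the lookup. The allocation table is invented for illustration (the
organization names are hypothetical and the blocks are ranges reserved for
documentation). Real lookups go through the public address registries, but
the idea is the same: find the block that contains the address.

```python
import ipaddress

# Hypothetical allocation table: address block -> responsible organization.
ALLOCATIONS = {
    "192.0.2.0/24": "Example Dialup ISP",
    "198.51.100.0/24": "Example Cable Co.",
    "203.0.113.0/24": "Example Corp. office network",
}

def responsible_party(ip: str):
    """Return the organization whose block contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for block, owner in ALLOCATIONS.items():
        if addr in ipaddress.ip_network(block):
            return owner
    return None

print(responsible_party("198.51.100.42"))
```

The result only says who to ask. Mapping the address to a customer still
requires that organization's own records.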

Up until now, I've only discussed the implications of having an IP address.
The situation gets much worse when you start using it. Because every bit of
network traffic you send is marked with your IP address, it can be used to
link all of those disparate transactions together.
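
As a small illustration of that linking, the sketch below groups the entries
of a single log by IP address. The log entries and the function name are
invented, and the addresses come from ranges reserved for documentation.

```python
from collections import defaultdict

# Invented log entries: (ip_address, what was requested).
traffic_log = [
    ("203.0.113.7", "search: flu symptoms"),
    ("198.51.100.20", "page: /news"),
    ("203.0.113.7", "page: /pharmacy/locations"),
    ("203.0.113.7", "search: jane doe"),
]

def group_by_ip(log):
    """Collect every request in the log under the IP address that made it."""
    by_ip = defaultdict(list)
    for ip, request in log:
        by_ip[ip].append(request)
    return dict(by_ip)

for ip, requests in group_by_ip(traffic_log).items():
    print(ip, requests)
```

Individually, each line of the log says very little. Grouped by address, the
three requests from 203.0.113.7 start to look like one person's afternoon.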

Despite these possible correlations, not one of the major search engines
considers your IP address to be personally identifiable information.
[Update: someone asked where I got this conclusion. It's from my reading of
the Google, Yahoo, and MSN Search privacy policies. In all cases, they
discuss server logs separately from the collection of personal information
(although MSN Search does have it under the heading of "Collection of Your
Personal Information", it's clearly a separate topic). If you have some
reason to believe I've made a mistake, I'm all ears.] While this may
technically be true if you take an IP address by itself, it is a highly
disingenuous position to take when logs exist that link IP addresses with
computers, physical locations, and account information... and from there with
people. Not always, but often. The inability to link your IP address with
you depends always on the relative secrecy of these logs, what information
is gathered before you get access to your IP address, and what other
information you give out while using it.

Let's bring one more piece into the puzzle. It's the idea of a key. A key is
a piece of data in common between two disparate data sources. Let's say
there's one log which records which websites you visit, and it contains only
the URL of the website and your IP address. No personal information, right?
But there's another log somewhere that records your
account information and the IP address that you happened to be using. Now,
the IP address is a key into your account information, and bringing the two
logs together allows the website list to be associated with your account
information.
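
Here is a minimal sketch of that join. Both logs and the function name are
invented for illustration, and it assumes each IP address maps to a single
account for the whole period the logs cover. With dynamic addresses, a real
join would also have to match on timestamps.

```python
# Log 1: a website's log. Only URLs and IP addresses, no personal details.
web_log = [
    {"ip": "203.0.113.7", "url": "/search?q=flu+symptoms"},
    {"ip": "198.51.100.20", "url": "/news"},
    {"ip": "203.0.113.7", "url": "/pharmacy/locations"},
]

# Log 2: an ISP's record of which account was using which address.
account_log = [
    {"ip": "203.0.113.7", "account": "Jane Doe, 12 Example St."},
]

def join_on_ip(web_log, account_log):
    """Attach account information to web log entries, using the IP as the key."""
    accounts = {row["ip"]: row["account"] for row in account_log}
    return [
        (accounts[row["ip"]], row["url"])
        for row in web_log
        if row["ip"] in accounts
    ]

for account, url in join_on_ip(web_log, account_log):
    print(account, "visited", url)
```

Neither log holds a list of sites next to a name. The combination does.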

    * Have you ever searched for your name? Your IP address is now a key to
your name in a log somewhere.
    * Have you ever ordered a product on the internet and had it shipped to
you? Your IP address is now a key to your home address in a log somewhere.
    * Have you ever viewed a web page with an ad in it served from an ad
network? Both the operator of the web site and the operator of the ad
network have your IP address in a log somewhere, as a key to the sites you
visited.

The list goes on, and it's not limited to IP addresses. Any piece of unique
data - IP addresses, cookie values, email addresses - can be used as a key.

Data mining is the act of taking a whole bunch of separate logs, or
databases, and looking for the keys to tie information together into a
comprehensive profile representing the correlations. To say that this
information is definitely being mined, used for anything, stored, or even
ever viewed is certainly alarmist, and I don't want to imply that it is. But
the possibility is there, and in many cases these logs are being kept. If
they're not being used in that way now, the only thing really standing in
the way is the inaction of those who have access to the pieces, or can get
them.
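
To show how keys chain together across several logs, here is a rough sketch
of profile building. The records, identifiers, and function name are all
invented, and this is not how any particular company does it. It only shows
that once one record shares a key with another, everything attached to
either can be pulled into the same profile.

```python
# Invented records from separate logs. Each carries whatever identifiers
# that log happened to capture, plus one recorded fact.
records = [
    {"ids": {"ip:203.0.113.7"}, "fact": "searched for 'jane doe'"},
    {"ids": {"ip:203.0.113.7", "cookie:abc123"}, "fact": "viewed ad on news site"},
    {"ids": {"cookie:abc123", "email:jane@example.com"}, "fact": "ordered a book"},
    {"ids": {"ip:198.51.100.20"}, "fact": "viewed /weather"},
]

def build_profile(start_id, records):
    """Follow shared keys from start_id; return (identifiers, facts)."""
    known = {start_id}
    facts = []
    used = set()
    changed = True
    while changed:  # repeat until no new record can be linked in
        changed = False
        for i, rec in enumerate(records):
            if i not in used and rec["ids"] & known:
                used.add(i)
                known |= rec["ids"]  # new keys unlock further records
                facts.append(rec["fact"])
                changed = True
    return known, facts

ids, facts = build_profile("ip:203.0.113.7", records)
print(sorted(ids))
print(facts)
```

Starting from nothing but an IP address, the sketch ends up with a cookie
value, an email address, and three recorded activities.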

If the information is recorded somewhere, it can be used. This is a big
problem.

There are various ways to mask your IP address, but that's not the whole
scope of the problem, and it's still very easy to leak personally
identifiable information.

I'll start with one suggestion for how to begin to address this problem:

Any key information associated with personally identifiable information must
also be considered personally identifiable.

Tags: IP address, privacy, tracking, logs, retention

