Re: [Int-area] Int-area Digest, Vol 152, Issue 52

2018-04-25 Thread Povl H. Pedersen
I know what the web people are using the logs for. Most of the stuff they could 
likely do without an IP address. 

If we have performance issues, a drill down might be performed when the right 
people are involved. And in a few cases we have located some low and slow 
attacks and ended up blocking IPs. Usually 1 or 2. So it is crucial for 
operations to pinpoint specific IPs for say a month. 

I also know that the Bluetooth MACs used for traffic mapping are hashed with a 
key that changes every 6 hours to provide anonymity. But here there is no need 
for the specific MAC. 
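
A minimal sketch of that kind of rotating-key hashing, for illustration only. I do not know how the traffic-mapping system actually does it; the class name, the HMAC-SHA256 construction and the 16-character pseudonym length are my assumptions. Only the 6-hour key lifetime comes from the description above.

```python
import hashlib
import hmac
import secrets

ROTATION_SECONDS = 6 * 60 * 60  # the 6-hour key lifetime mentioned above


class RotatingMacHasher:
    """Hypothetical helper: pseudonymize MACs with a key replaced every 6 hours."""

    def __init__(self):
        self._epoch = None
        self._key = b""

    def _key_for(self, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        epoch = int(timestamp // ROTATION_SECONDS)
        if epoch != self._epoch:
            # New 6-hour window: draw a fresh random key and forget the old one,
            # so pseudonyms from different windows cannot be linked.
            self._epoch = epoch
            self._key = secrets.token_bytes(32)
        return self._key

    def pseudonym(self, mac, timestamp):
        normalized = mac.lower().replace("-", ":").encode("ascii")
        digest = hmac.new(self._key_for(timestamp), normalized, hashlib.sha256)
        return digest.hexdigest()[:16]


hasher = RotatingMacHasher()
same_window_a = hasher.pseudonym("AA:BB:CC:DD:EE:FF", 0)
same_window_b = hasher.pseudonym("aa-bb-cc-dd-ee-ff", 100)
next_window = hasher.pseudonym("AA:BB:CC:DD:EE:FF", ROTATION_SECONDS)
```

Within one window the same device always gets the same pseudonym, so movement can be mapped. Once the key is discarded, the pseudonyms from that window cannot be tied to later ones or back to the MAC.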

Telecoms are tracking tourists' phones and selling the data. Anonymized, of 
course. But they are selling info on hotels used and tourist destinations 
visited. This is abuse and oversteps any privacy expectations. 

But an IP address is different. We can’t map it to a person. The legal system 
can map it to a physical location unless that location has shared WiFi, VPN or 
is a tor exit node. I have all 3. 

I see the abuse if my son surfs Fortnite sites and I start getting Fortnite 
ads. And he gets lawnmower ads. Then somebody is inferring more from an IP 
address than it can support with any certainty. 

The last attack we tracked down to 2 IP addresses. Same city. Different Chrome 
on OS X. Same C net. 
We could then use these 2 IPs to find their interests. Newest iPhone. And we 
saw a Samsung Galaxy visiting women’s fashion. This, together with the IP and 
port number, was something the engineer at the police was happy about. It 
would make it easier for them to talk to the criminals. 

We were not able to find any physical person or address. And we will not know 
how the case goes until we are awarded damages after a conviction. 

But the police engineer has repeated that they want as much info and background 
as we can get them. 

We don’t send in armed police to confiscate everything here in Denmark. Often 
it is just a friendly knock on the door and a talk/confession. 
___
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area


Re: [Int-area] WG adoption call: Availability of Information in Criminal Investigations Involving Large-Scale IP Address Sharing Technologies

2018-04-25 Thread Povl H. Pedersen
I would keep full IP address + port info in my firewall log, separate from
the webserver log. This helps keep the web guys from abusing collected data.
Having talked to the web guys, they use the logfiles in daily operations,
and they see them as necessary to provide continuous delivery of the services
to the end user. That is another obligation we have.
Our legal department actually suggested we keep logs for 5 years, as some
data must be kept that long.

The big privacy issue here is more about abuse and losing the data (moving
it away from the internet-facing server within 3 days would be a good
recommendation). This must be controlled by internal company rules, not by
this RFC saying we must cripple data after 3 days. And 3 days is a
stupid limit if there is a longer weekend, holidays etc. Easter is an
example: Thursday to Monday are non-working days. That is 5 days + the
extra. So the 3 days should be 6 days without even accounting for holidays.



On Wed, Apr 25, 2018 at 11:22 AM, <mohamed.boucad...@orange.com> wrote:

> Re-,
>
>
>
> Please see inline.
>
>
>
> Cheers,
>
> Med
>
>
>
> *From:* Povl H. Pedersen [mailto:p...@my.terminal.dk]
> *Sent:* Wednesday, 25 April 2018 11:05
> *To:* BOUCADAIR Mohamed IMT/OLN
> *Cc:* int-a...@ietfa.amsl.com
> *Subject:* Re: [Int-area] WG adoption call: Availability of Information in
> Criminal Investigations Involving Large-Scale IP Address Sharing
> Technologies
>
>
>
> If we are at say a /20 or /22 (that is roughly 1,000-4,000 possible IP addresses),
> and we have the source port, then the ISP should be able to see which of
> these addresses has the given source port to our destination IP and port.
>
> [Med] The assumption about destination IP at the provider side is broken.
> Further, logging destination IP address is not recommended. RFC6888 says
> the following:
>
>
>
>    REQ-12: A CGN SHOULD NOT log destination addresses or ports unless
>       required to do so for administrative reasons.
>
>    Justification: Destination logging at the CGN creates privacy
>       issues.
>
>
>
> Note also that recent advances in optimizing logs at CGNs (e.g. port set
> assignment, deterministic NAT) conflict with keeping track of the
> destination IP address.
>
>
>
> Also, there are stateless address sharing techniques which do not even
> involve a CGN (MAP-E, MAP-T, …). Recording the destination IP
> address per new session is not an option there.
>
>
>
>
>
> With a timestamp, the risk of collision is low. And the police can at
> least minimize number of suspects.
>
>
>
> [Med] If the destination IP address is not logged at the provider side
> (which is likely), the collision probability of your proposal may be bigger
> for deployments which use a low address sharing ratio (1:2, 1:4).
>
> CGN does not break GeoIP. It still allows us to pinpoint the ISP, but
> might not allow us to pinpoint the user any closer than the breakout point.
>
> [Med] This is exactly what we meant by broken GeoIP in
> https://tools.ietf.org/html/rfc6269#section-7
>
>
>
> If we have an ISP, with CGN, and the police can come with a timestamp, and
> source port, and a destination ip/port, the carrier can likely determine
> the physical person. If he has say 255 possible external IP addresses in
> use, the chance of the same source port to the same destination across
> these is small.
>
>
> With address sharing, we can't point to one physical person.
>
> [Med] OK.
>
> I have a dynamic public IP at home (changes rarely). It is difficult to
> pinpoint anything to me, my wife or my children. Or any user of my open
> WiFi SSID. From a legal point of view, this is impossible.
>
> [Med] OK.
>
> But the privacy protection in GDPR should protect the 20-year-old having
> a fixed public IP, living alone. And here a fixed IP is enough for an ISP
> to locate a person (or rather a machine) with some certainty.
>
> [Med] ISPs operating fixed networks can locate their customers/subscribers
> whatever scheme used for assigning IP addresses. The identification is
> based on the line, not IP addresses.
>
> I think this is all a tradeoff between protecting individuals, while not
> completely giving up investigative tools - At least to do investigation
> with some statistical probability. And since you do not know which
> addresses are used by CGN, you can't handle them differently from other IPs.
>
> [Med] Given that you stated above that it is difficult to track an
> individual user based on the IP address, then what is the value of
> complicating the investigation by not recording the full IP address + port
> (for this specific investigation purpose)?
>
>
> Having the full

Re: [Int-area] WG adoption call: Availability of Information in Criminal Investigations Involving Large-Scale IP Address Sharing Technologies

2018-04-25 Thread Povl H. Pedersen
If we are at say a /20 or /22 (that is roughly 1,000-4,000 possible IP
addresses), and we have the source port, then the ISP should be able to see
which of these addresses has the given source port to our destination IP and
port. With a timestamp, the risk of collision is low. And the police can at
least narrow down the number of suspects.
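
For reference, a small sketch of what truncating to a prefix does, using Python's standard ipaddress module. The address 198.51.100.77 is a documentation address picked for illustration and the helper name is mine. A /22 leaves 1024 possible addresses, a /20 leaves 4096, and the /16 that was suggested leaves 65536.

```python
import ipaddress


def truncate(ip, prefix_len):
    """Zero the host bits of an IPv4 address, keeping the first prefix_len bits."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


for prefix_len in (16, 20, 22):
    net = truncate("198.51.100.77", prefix_len)
    # Prints the truncated network and how many addresses it could stand for.
    print(net, net.num_addresses)
```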

CGN does not break GeoIP. It still allows us to pinpoint the ISP, but might
not allow us to pinpoint the user any closer than the breakout point. If we
have an ISP with CGN, and the police can come with a timestamp, a source
port, and a destination IP/port, the carrier can likely determine the
physical person. If the carrier has, say, 255 possible external IP addresses
in use, the chance of the same source port to the same destination across
these is small.
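
A rough back-of-the-envelope for "small", under assumptions that are mine and not measured: each of the other 254 external addresses has exactly one connection open to the same destination IP/port in the same time window, and the CGN picks source ports uniformly from 1024-65535. Real port allocation (port sets, deterministic NAT) is not uniform, so treat this as an order-of-magnitude sketch only.

```python
def collision_probability(other_flows, usable_ports=65536 - 1024):
    """Chance that at least one of other_flows uses the same source port.

    Assumes independent, uniformly random source ports (an assumption,
    not how every CGN behaves).
    """
    return 1.0 - (1.0 - 1.0 / usable_ports) ** other_flows


# 255 external addresses in use: ours plus 254 others, one flow each.
p = collision_probability(254)
print(f"{p:.4%}")
```

Under these assumptions the chance of an ambiguous match is well below one percent. It grows roughly linearly with the number of simultaneous flows to the same destination.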

With address sharing, we can't point to one physical person. I have a
dynamic public IP at home (changes rarely). It is difficult to pinpoint
anything to me, my wife or my children. Or any user of my open WiFi SSID.
From a legal point of view, this is impossible.

But the privacy protection in GDPR should protect the 20-year-old having a
fixed public IP, living alone. And here a fixed IP is enough for an ISP to
locate a person (or rather a machine) with some certainty.

I think this is all a tradeoff between protecting individuals, while not
completely giving up investigative tools - At least to do investigation
with some statistical probability. And since you do not know which
addresses are used by CGN, you can't handle them differently from other IPs.

Having the full firewall logs as a separate supplement to webserver logs
will allow you (in many cases) to use the truncated source IP + port to
find one or a few possible IP addresses. Since you need data from 2
systems, they are pseudonymized, and our legal department would agree it is
then acceptable.
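
A sketch of that lookup, assuming the webserver log keeps a truncated prefix, the source port and a timestamp, while the firewall log keeps the full source IP. The record layout, the function name and the 2-second matching window are made up for illustration; they are not how our systems are built.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallEntry:
    timestamp: float  # seconds since epoch
    src_ip: str       # full, untruncated source address
    src_port: int


def candidates(firewall_log, truncated_net, src_port, timestamp, window=2.0):
    """Return full source IPs in the firewall log matching a webserver event."""
    net = ipaddress.ip_network(truncated_net)
    return sorted({
        entry.src_ip
        for entry in firewall_log
        if entry.src_port == src_port
        and abs(entry.timestamp - timestamp) <= window
        and ipaddress.ip_address(entry.src_ip) in net
    })


firewall_log = [
    FirewallEntry(1000.0, "198.51.100.77", 51515),  # matches
    FirewallEntry(1000.5, "198.51.101.9", 40000),   # same /22, other port
    FirewallEntry(1001.0, "203.0.113.5", 51515),    # same port, other network
    FirewallEntry(5000.0, "198.51.102.1", 51515),   # same /22 and port, too late
]
matches = candidates(firewall_log, "198.51.100.0/22", 51515, 1000.2)
```

The join only works for someone who has access to both logs, which is the point of keeping them separate.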

Today we keep logs for 18-24 months, and most police investigations come
to us 12-14 months after the crime asking for more details. Sometimes for
cases we did not know existed. We are a PCI audited level 1 retailer with a
few web stores.

We do not have people at work every day to look in logs, so the 3 days
retention is impossible. It may take weeks for us to discover things. If 3
days is to cover the weekend (no 24/7), it should instead be 30 days, as
key employees might have the normal 21 days of holiday and a week to catch
up. Smaller companies might not have overlapping staff skills.


On Wed, Apr 25, 2018 at 10:20 AM, <mohamed.boucad...@orange.com> wrote:

> Dear Povl,
>
>
>
> Thank you for sharing your thoughts.
>
>
>
> I have one comment and two clarification questions:
>
> - Wouldn’t logging based on /20-/22 nullify the interest in logging source
> ports for investigations? Multiple subscribers may be assigned the same
> port in the /20 or /22 range.
>
> - GeoIP (whatever that means) is broken when CGNs are in use.
>
>   - How and under which conditions can an IP address + port be used to
> point to “ONE physical person”, especially when address sharing is in use?
>
>
>
> Cheers,
>
> Med
>
>
>
> *From:* Int-area [mailto:int-area-boun...@ietf.org] *On behalf of* Povl
> H. Pedersen
> *Sent:* Wednesday, 25 April 2018 09:55
> *To:* int-a...@ietfa.amsl.com
> *Subject:* Re: [Int-area] WG adoption call: Availability of Information in
> Criminal Investigations Involving Large-Scale IP Address Sharing
> Technologies
>
>
>
> Where I work, we keep the firewall logs with port numbers completely
> separate from the webserver logs.
>
> Looking at article 25 of GDPR, it is clear that IP addresses are
> pseudonymized data in the firewall logs, as there are only 2 ways to
> connect the IP address to a physical person.
> 1. Court order to ISP etc.
> 2. Have the web people look up the IP address in their system, trace
> requests, and see if they can associate it with a known user identity.
>
> So firewall logs, unless the web people have access to them, are
> pseudonymized data. So secure by design (article 25). And we can keep them
> for statistics, or investigation purposes.
>
> Now, the question then is, how can we keep enough data in the webserver
> etc log to be able to actually do enough investigation? A /16
> shortening was suggested. I think this is too large a grouping. It can not even
> be used for country/city statistical purposes. But of course we can enrich
> data with that from the likes of MaxMind, when throwing away trailing bits.
>
> I think we need a minimum /20-/22 and source port in the logs to, with
> some degree of confidence, go from events in the webserver logs back to the
> firewall log to have necessary information for investigation/authorities. If
> we have a /20-/22 and GeoIP data, we might have a few candidates. Then this
> is good enough to ensure we can not get back to ONE physica

Re: [Int-area] WG adoption call: Availability of Information in Criminal Investigations Involving Large-Scale IP Address Sharing Technologies

2018-04-25 Thread Povl H. Pedersen
Where I work, we keep the firewall logs with port numbers completely
separate from the webserver logs.

Looking at article 25 of GDPR, it is clear that IP addresses are
pseudonymized data in the firewall logs, as there are only 2 ways to
connect the IP address to a physical person.
1. Court order to ISP etc.
2. Have the web people look up the IP address in their system, trace
requests, and see if they can associate it with a known user identity.

So firewall logs, unless the web people have access to them, are
pseudonymized data. So secure by design (article 25). And we can keep them
for statistics, or investigation purposes.

Now, the question then is, how can we keep enough data in the webserver etc
log to be able to actually do enough investigation? A /16 shortening
was suggested. I think this is too large a grouping. It can not even be used
for country/city statistical purposes. But of course we can enrich data with
that from the likes of MaxMind, when throwing away trailing bits.

I think we need a minimum /20-/22 and source port in the logs to, with some
degree of confidence, go from events in the webserver logs back to the
firewall log to have necessary information for investigation/authorities. If
we have a /20-/22 and GeoIP data, we might have a few candidates. Then this
is good enough to ensure we can not get back to ONE physical person.

I think that updating RFC6302 might be a bit early, and we risk that it
has to be revised after the first court makes a decision.

If we keep RFC6302 as is, then companies can defend themselves by saying
they use best practice.

We have another obligation as data owners/processors. We should keep enough
data to verify a suspected data breach, and judge the impact. If I can not
see if 1 profiles was downloaded by the same IP, or from 1
different IPs (out of 65535), I might not be able to fulfill some of the
other requirements in GDPR.

I think the big question here is how the data is stored/processed, and it
must be governed by organizational measures (policies and training). It
would likely be illegal to use the logs to profile a person. But there can be
other interests allowing us to keep the logs, disassociated from user
profiles or other things that allow us to map an IP to an individual.