Re: [Int-area] Int-area Digest, Vol 152, Issue 52
I know what the web people are using the logs for. Most of it they could likely do without an IP address. If we have performance issues, a drill-down might be performed when the right people are involved. And in a few cases we have located some low-and-slow attacks and ended up blocking IPs. Usually 1 or 2. So it is crucial for operations to be able to pinpoint specific IPs for, say, a month.

I also know that the Bluetooth MACs used for traffic mapping are hashed with a key that changes every 6 hours to provide anonymity. But here there is no need for the specific MAC.

Telecoms are tracking tourists’ phones and selling the data. Anonymous, of course. But they are selling info on hotels used and tourist destinations visited. This is abuse, and it oversteps any privacy expectations.

But an IP address is different. We can’t map it to a person. The legal system can map it to a physical location, unless that location has shared WiFi, a VPN or is a Tor exit node. I have all 3. I see the abuse if my son surfs Fortnite sites and I start getting Fortnite ads. And he gets lawnmower ads. Then somebody is assuming more from an IP address than they can with any certainty.

The last attack we tracked down to 2 IP addresses. Same city. Different Chrome on OSX. Same C net. We could then use these 2 IPs to find their interests. Newest iPhone. And we saw a Samsung Galaxy visiting women’s fashion. This, together with the IP and port number, was something the engineer at the police was happy about. It would make it easier for them to talk to the criminals. We were not able to find any physical person or address. And we will not know how the case goes before we are awarded damages after conviction. But the police engineer has repeated that they want as much info and background as we can get them. We don’t send in armed police confiscating everything here in Denmark. Often it is just a friendly knock on the door and a talk/confession.
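As an illustration of the rotating-key hashing mentioned for the Bluetooth MACs, here is a minimal sketch. The message only says the key changes every 6 hours; everything else is an assumption made for the example: HMAC-SHA256 as the keyed hash, a random 32-byte key, the 16-character truncation, and the hypothetical class name `RotatingMacHasher`. It also assumes timestamps arrive in non-decreasing order.

```python
import hashlib
import hmac
import secrets

ROTATION_SECONDS = 6 * 60 * 60  # key lifetime from the message: 6 hours


class RotatingMacHasher:
    """Hypothetical helper: pseudonymize MACs with a key replaced every window."""

    def __init__(self, rotation_seconds=ROTATION_SECONDS):
        self.rotation_seconds = rotation_seconds
        self._window = None  # index of the time window the current key belongs to
        self._key = None

    def _key_for(self, timestamp):
        window = int(timestamp // self.rotation_seconds)
        if window != self._window:
            # A new window started: overwrite the old key, so pseudonyms from
            # earlier windows can no longer be recomputed or linked.
            self._window = window
            self._key = secrets.token_bytes(32)
        return self._key

    def pseudonym(self, mac, timestamp):
        key = self._key_for(timestamp)
        # Normalize the notation so the same device always maps to the same input.
        normalized = mac.lower().replace("-", ":").encode("ascii")
        # Assumption: HMAC-SHA256, truncated to 16 hex characters.
        return hmac.new(key, normalized, hashlib.sha256).hexdigest()[:16]


hasher = RotatingMacHasher()
first = hasher.pseudonym("AA:BB:CC:DD:EE:FF", timestamp=0)
same_window = hasher.pseudonym("aa-bb-cc-dd-ee-ff", timestamp=100)
next_window = hasher.pseudonym("AA:BB:CC:DD:EE:FF", timestamp=ROTATION_SECONDS + 1)
```

Within one window the same device keeps the same pseudonym, which is enough for traffic mapping; once the key is replaced, the device cannot be followed across windows.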
___ Int-area mailing list Int-area@ietf.org https://www.ietf.org/mailman/listinfo/int-area
Re: [Int-area] WG adoption call: Availability of Information in Criminal Investigations Involving Large-Scale IP Address Sharing Technologies
I would keep full IP address + port info in my firewall log, separate from the webserver log. This is to help keep the web guys from abusing collected data.

Having talked to the web guys, they use the logfiles in daily operations, and they see them as necessary to provide continuous delivery of the services to the end user. That is another obligation we have.

Our legal department actually suggested we keep logs for 5 years, as some data must be kept that long.

The big privacy issue here is more about abuse and losing the data (moving them away from the internet-facing server within 3 days would be a good recommendation). This must be controlled by internal company rules, not by this RFC saying we must cripple data after 3 days.

And 3 days is a stupid limit if there is a longer weekend, holidays etc. Easter is an example: Thursday to Monday are non-working days. That is 5 days + the extra. So the 3 days should be 6 days, without even accounting for holidays.

On Wed, Apr 25, 2018 at 11:22 AM, <mohamed.boucad...@orange.com> wrote:

> Re-,
>
> Please see inline.
>
> Cheers,
> Med
>
> *From:* Povl H. Pedersen [mailto:p...@my.terminal.dk]
> *Sent:* Wednesday, 25 April 2018 11:05
> *To:* BOUCADAIR Mohamed IMT/OLN
> *Cc:* int-a...@ietfa.amsl.com
> *Subject:* Re: [Int-area] WG adoption call: Availability of Information in
> Criminal Investigations Involving Large-Scale IP Address Sharing
> Technologies
>
> If we are at say a /20 or /22 (that is 4096 or 1024 possible IP addresses),
> and we have the source port, then the ISP should be able to see which of
> these addresses has the given source port to our destination IP and port.
>
> [Med] The assumption about destination IP at the provider side is broken.
> Further, logging destination IP address is not recommended. RFC6888 says
> the following:
>
> > REQ-12: A CGN SHOULD NOT log destination addresses or ports unless
> > required to do so for administrative reasons.
> >
> > Justification: Destination logging at the CGN creates privacy
> > issues.
>
> Note also that recent advances in optimizing logs at CGNs (e.g. port set
> assignment, deterministic NAT) conflict with keeping track of the
> destination IP address.
>
> Also, there are stateless address sharing techniques which do not even
> involve a CGN (MAP-E, MAP-T, …). Information about the destination IP
> address per new session is not an option.
>
> With a timestamp, the risk of collision is low. And the police can at
> least minimize number of suspects.
>
> [Med] If the destination IP address is not logged at the provider side
> (which is likely), the collision probability of your proposal may be bigger
> for deployments which use a low address sharing ratio (1:2, 1:4).
>
> CGN does not break GeoIP. It still allows us to pinpoint the ISP, but
> might not allow us to pinpoint the user any closer than the breakout point.
>
> [Med] This is exactly what we meant by broken GeoIP in
> https://tools.ietf.org/html/rfc6269#section-7
>
> If we have an ISP, with CGN, and the police can come with a timestamp, and
> source port, and a destination ip/port, the carrier can likely determine
> the physical person. If he has say 255 possible external IP addresses in
> use, the chance of the same source port to the same destination across
> these is small.
>
> With address sharing, we can't point to one physical person.
>
> [Med] OK.
>
> I have a dynamic public IP at home (changes rarely). It is difficult to
> pinpoint anything to me, my wife or my children. Or any user of my open
> WiFi SSID. From a legal point of view, this is impossible.
>
> [Med] OK.
>
> But, the privacy protection in GDPR should protect the 20-year-old having
> a fixed public IP, living alone. And here a fixed IP is enough for an ISP
> to locate a person (or rather a machine) with some certainty.
>
> [Med] ISPs operating fixed networks can locate their customers/subscribers
> whatever scheme is used for assigning IP addresses. The identification is
> based on the line, not IP addresses.
>
> I think this is all a tradeoff between protecting individuals, while not
> completely giving up investigative tools - at least to do investigation
> with some statistical probability. And since you do not know which
> addresses are used by CGN, you can't handle them differently from other IPs.
>
> [Med] Given that you stated above that it is difficult to track an
> individual user based on the IP address, then what is the value of
> complicating the investigation by not recording the full IP address + port
> (for this specific investigation purpose)?
>
> Having the full
Re: [Int-area] WG adoption call: Availability of Information in Criminal Investigations Involving Large-Scale IP Address Sharing Technologies
If we are at say a /20 or /22 (that is 4096 or 1024 possible IP addresses), and we have the source port, then the ISP should be able to see which of these addresses has the given source port to our destination IP and port. With a timestamp, the risk of collision is low. And the police can at least minimize the number of suspects.

CGN does not break GeoIP. It still allows us to pinpoint the ISP, but might not allow us to pinpoint the user any closer than the breakout point.

If we have an ISP, with CGN, and the police can come with a timestamp, and source port, and a destination ip/port, the carrier can likely determine the physical person. If he has say 255 possible external IP addresses in use, the chance of the same source port to the same destination across these is small.

With address sharing, we can't point to one physical person.

I have a dynamic public IP at home (changes rarely). It is difficult to pinpoint anything to me, my wife or my children. Or any user of my open WiFi SSID. From a legal point of view, this is impossible.

But, the privacy protection in GDPR should protect the 20-year-old having a fixed public IP, living alone. And here a fixed IP is enough for an ISP to locate a person (or rather a machine) with some certainty.

I think this is all a tradeoff between protecting individuals, while not completely giving up investigative tools - at least to do investigation with some statistical probability. And since you do not know which addresses are used by CGN, you can't handle them differently from other IPs.

Having the full firewall logs as a separate supplement to webserver logs will allow you (in many cases) to use the truncated source IP + port to find one or a few possible IP addresses. Since you need data from 2 systems, they are pseudonymized, and our legal department would agree it is then acceptable.

Today we keep logs for 18-24 months, and most police investigations come to us 12-14 months after the crime asking for more details. Sometimes for cases we did not know existed.

We are a PCI audited level 1 retailer with a few web stores. We do not have people at work every day to look in logs, so the 3 days retention is impossible. It may take weeks for us to discover things. If 3 days is to cover the weekend (no 24/7), it should instead be 30 days, as key employees might have the normal 21 days of holiday and a week to catch up. Smaller companies might not have overlapping staff skills.

On Wed, Apr 25, 2018 at 10:20 AM, <mohamed.boucad...@orange.com> wrote:

> Dear Povl,
>
> Thank you for sharing your thoughts.
>
> I have one comment and two clarification questions:
>
> - Wouldn’t logging based on /20-/22 nullify the interest to log source ports
> for investigations? Multiple subscribers may be assigned the same port in
> the /20 or /22 range.
>
> - GeoIP (whatever that means) is broken when CGNs are in use.
>
> - How and under which conditions can an IP address + port be used to
> point to “ONE physical person”, especially when address sharing is in use?
>
> Cheers,
> Med
>
> *From:* Int-area [mailto:int-area-boun...@ietf.org] *On behalf of* Povl
> H. Pedersen
> *Sent:* Wednesday, 25 April 2018 09:55
> *To:* int-a...@ietfa.amsl.com
> *Subject:* Re: [Int-area] WG adoption call: Availability of Information in
> Criminal Investigations Involving Large-Scale IP Address Sharing
> Technologies
>
> Where I work, we keep the firewall logs with port numbers completely
> separate from the webserver logs.
>
> Looking at article 25 of GDPR, it is clear that IP addresses are
> pseudonymized data in the firewall logs, as there are only 2 ways to
> connect the IP address to a physical person.
> 1. Court order to ISP etc.
> 2. Have the web people look up the IP address in their system, trace
> requests, and see if they can associate it with a known user identity.
>
> So firewall logs, unless the web people have access to them, are
> pseudonymized data. So secure by design (article 25). And we can keep them
> for statistics, or investigation purposes.
>
> Now, the question then is, how can we keep enough data in the webserver
> etc log to be able to actually do enough investigation? A /16
> shortening was suggested. I think this is too large a grouping. It cannot even
> be used for country/city statistical purposes. But of course we can enrich
> data with that from the likes of MaxMind, when throwing away trailing bits.
>
> I think we need a minimum /20-/22 and source port in the logs to, with
> some degree of confidence, go from events in the webserver logs back to the
> firewall log to have the necessary information for investigation/authorities. If
> we have a /20-/22 and GeoIP data, we might have a few candidates. Then this
> is good enough to ensure we can not get back to ONE physica
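The lookup described in this message, going from a truncated source prefix + source port in the webserver log back to full addresses in the separately kept firewall log, could be sketched roughly as follows. This is only an illustration under assumptions: the record layout `FirewallEntry`, the function names, the 2-second clock-skew tolerance and the sample addresses are all hypothetical, and real firewall and webserver log formats will differ.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallEntry:
    """Hypothetical firewall log record holding the full source address."""
    timestamp: float  # seconds since epoch
    src_ip: str
    src_port: int


def truncate_ip(ip, prefix_len=22):
    """Return the network an address falls into, as stored in the web log."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def candidates(firewall_log, web_prefix, web_port, web_time, max_skew=2.0):
    """Full source IPs in the firewall log that match a truncated web log event."""
    network = ipaddress.ip_network(web_prefix)
    return sorted({
        entry.src_ip
        for entry in firewall_log
        if entry.src_port == web_port
        and abs(entry.timestamp - web_time) <= max_skew  # assumed skew tolerance
        and ipaddress.ip_address(entry.src_ip) in network
    })


firewall_log = [
    FirewallEntry(1000.0, "198.51.100.77", 50123),
    FirewallEntry(1000.5, "198.51.101.9", 50123),   # same /22, same port, same time
    FirewallEntry(1000.2, "198.51.100.80", 40000),  # different port
    FirewallEntry(5000.0, "198.51.102.3", 50123),   # too far away in time
    FirewallEntry(1000.1, "203.0.113.5", 50123),    # outside the /22
]

web_prefix = str(truncate_ip("198.51.100.77", 22))
matches = candidates(firewall_log, web_prefix, 50123, 1000.0)
```

In this made-up log the webserver event maps back to two candidate addresses rather than one, which is the "one or a few possible IP addresses" situation described above. For reference, `truncate_ip(ip, 20).num_addresses` is 4096 and `truncate_ip(ip, 22).num_addresses` is 1024.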
Re: [Int-area] WG adoption call: Availability of Information in Criminal Investigations Involving Large-Scale IP Address Sharing Technologies
Where I work, we keep the firewall logs with port numbers completely separate from the webserver logs.

Looking at article 25 of GDPR, it is clear that IP addresses are pseudonymized data in the firewall logs, as there are only 2 ways to connect the IP address to a physical person.
1. Court order to ISP etc.
2. Have the web people look up the IP address in their system, trace requests, and see if they can associate it with a known user identity.

So firewall logs, unless the web people have access to them, are pseudonymized data. So secure by design (article 25). And we can keep them for statistics, or investigation purposes.

Now, the question then is, how can we keep enough data in the webserver etc log to be able to actually do enough investigation? A /16 shortening was suggested. I think this is too large a grouping. It cannot even be used for country/city statistical purposes. But of course we can enrich data with that from the likes of MaxMind, when throwing away trailing bits.

I think we need a minimum /20-/22 and source port in the logs to, with some degree of confidence, go from events in the webserver logs back to the firewall log to have the necessary information for investigation/authorities. If we have a /20-/22 and GeoIP data, we might have a few candidates. Then this is good enough to ensure we can not get back to ONE physical person.

I think that updating RFC6302 might be a bit early, and we risk that it has to be revised after the first court makes a decision. If we keep RFC6302 as is, then companies can defend themselves by saying they use best practice.

We have another obligation as data owners/processors. We should keep enough data to verify a suspected data breach, and judge the impact. If I cannot see whether the profiles were downloaded by the same IP or from different IPs (out of 65535), I might not be able to fulfill some of the other requirements in GDPR.

I think the big question here is how the data is stored/processed, and it must be governed by organizational measures (policies and training). It would likely be illegal to use the logs to profile a person. But there can be other interests allowing us to keep the logs, disassociated from user profiles or other things that allow us to map an IP to an individual.
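To make the data breach point concrete, the sketch below counts how many distinct sources remain visible in a log at different truncation lengths. The sample addresses and the helper name `distinct_sources` are made up for illustration; they are not taken from any real log.

```python
import ipaddress


def distinct_sources(ips, prefix_len=32):
    """Count distinct sources after truncating each address to prefix_len bits."""
    return len({
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in ips
    })


# Hypothetical source addresses of four profile downloads.
downloads = ["198.51.100.7", "198.51.101.8", "198.51.120.9", "198.51.200.3"]

full = distinct_sources(downloads, 32)       # 4 distinct addresses
shortened = distinct_sources(downloads, 22)  # 3: the first two share a /22
coarse = distinct_sources(downloads, 16)     # 1: everything collapses into one /16
```

With the /16 shortening all four downloads look like a single source, so the log can no longer show whether one client or several were involved.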