I would keep full IP address + port info in my firewall log, separate from
the webserver log. This helps keep the webguys from abusing the collected data.
Having talked to the webguys, they use the logfiles in daily operations,
and they see them as necessary to provide continuous delivery of the services
to the end user. That is another obligation we have.
Our legal department actually suggested we keep logs for 5 years, as some
data must be kept that long.

The big privacy issue here is more about abuse and losing the data (moving
it away from the internet-facing server within 3 days would be a good
recommendation). This must be controlled by internal company rules, not by
this RFC that says we must cripple data after 3 days. And 3 days is a
stupid limit if there is a longer weekend, holidays, etc. Easter is an
example: Thursday to Monday are non-working days. That is 5 days + the
extra. So the 3 days should be 6 days without even accounting for holidays.
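
The Easter arithmetic can be checked with a small sketch. It assumes a calendar where Maundy Thursday, Good Friday and Easter Monday are public holidays (as they were for Easter 2018, with Easter Sunday on April 1), and that nobody reads logs on weekends. The function name `days_until_staff_return` and the closed-day set are illustrative only.

```python
from datetime import date, timedelta

# Assumed non-working days for Easter 2018: Thursday March 29 to Monday April 2.
EASTER_2018_CLOSED = {date(2018, 3, 29) + timedelta(days=i) for i in range(5)}


def days_until_staff_return(event_day, closed_days):
    """Calendar days from event_day to the next day that is neither weekend nor closed."""
    day = event_day + timedelta(days=1)
    while day.weekday() >= 5 or day in closed_days:
        day += timedelta(days=1)
    return (day - event_day).days


# Something logged on Wednesday March 28 is first looked at on Tuesday April 3.
print(days_until_staff_return(date(2018, 3, 28), EASTER_2018_CLOSED))
# Something logged on an ordinary Friday waits until Monday.
print(days_until_staff_return(date(2018, 4, 20), set()))
```

Under these assumptions the Easter case gives 6 days and an ordinary weekend gives 3, so a 3-day limit only just covers a normal weekend and leaves no margin for public holidays.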



On Wed, Apr 25, 2018 at 11:22 AM, <mohamed.boucad...@orange.com> wrote:

> Re-,
>
>
>
> Please see inline.
>
>
>
> Cheers,
>
> Med
>
>
>
> *From:* Povl H. Pedersen [mailto:p...@my.terminal.dk]
> *Sent:* Wednesday, April 25, 2018 11:05
> *To:* BOUCADAIR Mohamed IMT/OLN
> *Cc:* int-a...@ietfa.amsl.com
> *Subject:* Re: [Int-area] WG adoption call: Availability of Information in
> Criminal Investigations Involving Large-Scale IP Address Sharing
> Technologies
>
>
>
> If we are at say a /20 or /22 (that is 1024-4096 possible IP addresses),
> and we have the source port, then the ISP should be able to see which of
> these addresses has the given source port to our destination IP and port.
>
> [Med] The assumption about destination IP at the provider side is broken.
> Further, logging destination IP address is not recommended. RFC6888 says
> the following:
>
>
>
>    REQ-12: A CGN SHOULD NOT log destination addresses or ports unless
>
>       required to do so for administrative reasons.
>
>
>
>    Justification:  Destination logging at the CGN creates privacy
>
>       issues.
>
>
>
> Note also that recent advances in optimizing logs at CGNs (e.g. port set
> assignment, deterministic NAT) conflict with keeping track of the
> destination IP address.
>
>
>
> Also, there are stateless address sharing techniques which do not even
> involve a CGN (MAP-E, MAP-T, …). With those, recording the destination IP
> address for each new session is not an option.
>
>
>
>
>
> With a timestamp, the risk of collision is low. And the police can at
> least minimize the number of suspects.
>
>
>
> [Med] If the destination IP address is not logged at the provider side
> (which is likely), the collision probability of your proposal may be bigger
> for deployments which use a low address sharing ratio (1:2, 1:4).
>
> CGN does not break GeoIP. It still allows us to pinpoint the ISP, but
> might not allow us to pinpoint the user any closer than the breakout point.
>
> [Med] This is exactly what we meant by broken GeoIP in
> https://tools.ietf.org/html/rfc6269#section-7
>
>
>
> If we have an ISP with CGN, and the police can come with a timestamp, a
> source port, and a destination IP/port, the carrier can likely determine
> the physical person. If it has, say, 255 possible external IP addresses in
> use, the chance of the same source port to the same destination across
> these is small.
>
>
> With address sharing, we can't point to one physical person.
>
> [Med] OK.
>
> I have a dynamic public IP at home (changes rarely). It is difficult to
> pinpoint anything to me, my wife or my children. Or any user of my open
> WiFi SSID. From a legal point of view, this is impossible.
>
> [Med] OK.
>
> But the privacy protection in GDPR should protect the 20-year-old having
> a fixed public IP, living alone. And here a fixed IP is enough for an ISP
> to locate a person (or rather a machine) with some certainty.
>
> [Med] ISPs operating fixed networks can locate their customers/subscribers
> whatever scheme used for assigning IP addresses. The identification is
> based on the line, not IP addresses.
>
> I think this is all a tradeoff between protecting individuals and not
> completely giving up investigative tools, at least to do investigation
> with some statistical probability. And since you do not know which
> addresses are used by CGN, you can't handle them differently from other IPs.
>
> [Med] Given that you stated above that it is difficult to track an
> individual user based on the IP address, then what is the value of
> complicating the investigation by not recording the full IP address + port
> (for this specific investigation purpose)?
>
>
> Having the full firewall logs as a separate supplement to webserver logs
> will allow you (in many cases) to use the truncated source IP + port to
> find one or a few possible IP addresses. Since you need data from 2
> systems, the logs are pseudonymized, and our legal department would agree
> it is then acceptable.
>
> Today we keep logs for 18-24 months, and most police investigations come
> to us 12-14 months after the crime asking for more details. Sometimes for
> cases we did not know existed. We are a PCI audited level 1 retailer with a
> few web stores.
>
> We do not have people at work every day to look in logs, so the 3 days
> retention is impossible. It may take weeks for us to discover things. If 3
> days is to cover the weekend (no 24/7), it should instead be 30 days, as
> key employees might have the normal 21 days of holiday and a week to catch
> up. Smaller companies might not have overlapping staff skills.
>
>
>
> On Wed, Apr 25, 2018 at 10:20 AM, <mohamed.boucad...@orange.com> wrote:
>
> Dear Povl,
>
>
>
> Thank you for sharing your thoughts.
>
>
>
> I have one comment and two clarification questions:
>
> - Wouldn’t logging based on /20-/22 nullify the interest of logging source
> ports for investigations? Multiple subscribers may be assigned the same
> port in the /20 or /22 range.
>
> - GeoIP (whatever that means) is broken when CGNs are in use.
>
> - How and under which conditions can an IP address + port be used to
> point to “ONE physical person”, especially when address sharing is in use?
>
>
>
> Cheers,
>
> Med
>
>
>
> *From:* Int-area [mailto:int-area-boun...@ietf.org] *On Behalf Of* Povl
> H. Pedersen
> *Sent:* Wednesday, April 25, 2018 09:55
> *To:* int-a...@ietfa.amsl.com
> *Subject:* Re: [Int-area] WG adoption call: Availability of Information in
> Criminal Investigations Involving Large-Scale IP Address Sharing
> Technologies
>
>
>
> Where I work, we keep the firewall logs with port numbers completely
> separate from the webserver logs.
>
> Looking at article 25 of GDPR, it is clear that IP addresses are
> pseudonymized data in the firewall logs, as there are only 2 ways to
> connect the IP address to a physical person.
> 1. Court order to ISP etc.
> 2. Have the web people look up the IP address in their system, trace
> requests, and see if they can associate it with a known user identity.
>
> So firewall logs, unless the web people have access to them, are
> pseudonymized data. So secure by design (article 25). And we can keep them
> for statistics, or investigation purposes.
>
> Now, the question then is, how can we keep enough data in the webserver
> etc. log to actually be able to do enough investigation? A /16
> shortening was suggested. I think this is too large a grouping. It can not
> even be used for country/city statistical purposes. But of course we can
> enrich data with that from the likes of MaxMind before throwing away the
> trailing bits.
>
> I think we need a minimum /20-/22 and source port in the logs to, with
> some degree of confidence, go from events in the webserver logs back to the
> firewall log to have the necessary information for investigation/authorities.
> If we have a /20-/22 and GeoIP data, we might have a few candidates. Then
> this is good enough to ensure we can not get back to ONE physical person.
>
> I think that updating RFC6302 might be a bit early, and we risk that it
> has to be revised after the first court makes a decision.
>
> If we keep RFC6302 as is, then companies can defend themselves by saying
> they use best practice.
>
> We have another obligation as data owners/processors. We should keep enough
> data to verify a suspected data breach, and judge the impact. If I can not
> see if 10000 profiles were downloaded by the same IP, or from 10000
> different IPs (out of 65535), I might not be able to fulfill some of the
> other requirements in GDPR.
>
> I think the big question here is how the data is stored/processed, and it
> must be governed by organizational measures (policies and training). It
> would likely be illegal to use the logs to profile a person. But there can
> be other interests allowing us to keep the logs, disassociated from user
> profiles or other things that allow us to map an IP to an individual.
>
>
>
_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
