All-

New member here, but this draft attracted my attention (and that of several of my colleagues). For reference, my career is in investigations and intelligence, tracking cybercrime (and election-related tomfoolery). My comments are in light of that.

My first comment is that, despite the concern over GDPR, Recital 49 is a thing (http://www.privacy-regulation.eu/en/recital-49-GDPR.htm). The biggest use case for logging IP and port involves authentication events. As an initial point, GDPR isn't the only regulatory game in town. Most other regulations involving security (at least in the US) REQUIRE keeping much of this information for a variety of periods of time. Getting in the middle of dueling regulations is probably not a good thing, but I'd argue that, a priori, Recital 49 allows this collection.

With that said, I want to tackle the 5 recommendations individually.
- SHOULD only store entire incoming IP addresses for as long as is necessary to provide the specific service requested by the user.

Historical login information is often essential for a variety of reasons. For instance, fail2ban tracks failed logins and then blocks the offending IP address for a defined period of time to prevent brute-force attacks. Many user-profiling tools look at IP addresses to flag abnormal logins (e.g., this person lives in the US, has only logged in via US IP addresses, and now I see a login event from Russia). It is essential for investigations, which are often called after the fact: the average "dwell time" for a breach in the US is about 6 months (and for most of that time the entity is unaware). It is used to correlate multiple malicious events, such as the same IP address hitting a web server, doing SSH brute forcing, and sending phishing docs. There are a wide variety of reasons to store this information longer than just until a "logout" event. This requirement would all but criminalize everything we hold as appropriate and a best practice in authentication.

- SHOULD keep only the first two octets (of an IPv4 address) or the first three octets (of an IPv6 address) with remaining octets set to zero, when logging.

This recommendation sounds like a middle ground, but it renders the information completely useless for any use case I can think of. In many cases, the first two octets wouldn't even tell me what country the user is coming from. If I had to investigate an incident based on the first two octets of IP addresses alone, I simply could not investigate at all. If data minimization is the goal and someone told me I could only log the first two octets, I'd counter with: why log them at all? That's simply filling disk with useless data.

- SHOULD NOT store logs of incoming IP addresses from inbound traffic for longer than three days.
If breach dwell time is 6 months, with most of that time spent unaware, I would be deleting essential investigatory information long before I even knew I needed it. If everyone did this, I would need to know all but immediately that I had an incident, law enforcement would have to be informed and start acting immediately, and a legal order to produce (or at least retain) information would have to be sent to the other party. All within 3 days. If two of those days are Saturday and Sunday, forget about it. If enacted, this willful evidence destruction would severely hamper the ability of the private sector and law enforcement to ever investigate an online incident. The idea that all meaningful use of this data expires in 3 days does not comport with how things work in the real world.

- SHOULD NOT log unnecessary identifiers, such as source port number, time stamps, transport protocol numbers or destination port numbers.

This would fly in the face of almost every other regulatory regime. The idea that I don't have a right to know when someone is logging into my servers is mind-boggling. Recital 49 is a thing. Further, if I were attacked and needed to subpoena the other party, and timestamps and ports were simply never logged, there would be, quite literally, minimal to no ability to EVER investigate anything online again. I can't figure out why the transport protocol number would ever be considered sensitive. Does whether someone used TCP or UDP really matter from a privacy perspective? If this information was never logged, I wouldn't ever be able to lawfully request it from a third party or ISP to identify someone. I can't overstate how much this would all but end meaningful law enforcement investigations for any internet-related case.

- SHOULD ensure adequate log access control, with suitable mechanisms for keeping track of which entity accesses logged identifiers, for what reason and at what time.

I agree with this as a best practice FOR ORGANIZATIONS.
It is one that, for the most part, is already widely practiced, at least at places that can afford a SIEM. A small organization, though, would have to purchase or build such tooling (something that I don't think exists in open source), at great cost. Many individuals and home users expose SSH or a VPN on their home routers. They'd never be able to implement that.

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
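
P.S. To make the truncation point concrete, here is a minimal Python sketch. It assumes "first two octets" means a /16 mask for IPv4 and "first three octets" means a /24 mask for IPv6. The helper name `truncate` and the sample addresses (private-range, made up) are mine, purely for illustration, and are not taken from the draft.

```python
import ipaddress
from collections import Counter


def truncate(addr: str) -> str:
    """Zero everything after the first two octets (IPv4) or three octets (IPv6)."""
    ip = ipaddress.ip_address(addr)
    prefix_len = 16 if ip.version == 4 else 24  # assumed reading of the recommendation
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network.network_address)


# Hypothetical failed-login sources: one noisy host and one unrelated host.
failed_logins = ["10.20.1.5"] * 5 + ["10.20.200.9"]

# Counting by full address keeps the two hosts apart.
full_counts = Counter(failed_logins)

# Counting by truncated address merges them into a single log value.
truncated_counts = Counter(truncate(a) for a in failed_logins)

# Number of distinct IPv4 addresses that collapse into one truncated value.
addresses_per_value = ipaddress.ip_network("10.20.0.0/16").num_addresses

print(full_counts)
print(truncated_counts)
print(addresses_per_value)
```

With full addresses the two sources stay distinct. After truncation both land on the same logged value, which they share with 65,536 possible IPv4 addresses, so a fail2ban-style block or an after-the-fact correlation can no longer single out the offending host.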