ztzg commented on pull request #1519:
URL: https://github.com/apache/zookeeper/pull/1519#issuecomment-725535273


   > > I was considering adding a "fourth commit," making sure field values 
written to audit log entries are systematically escaped, but am not sure which 
encoding to use. Is there a precedent in the code base? In any case, a subset 
of URL encoding may be good enough; e.g.: % → %25, \t → %09, \n → %0A, and 
everything non-ASCII to %-encoded UTF-8 bytes. WDYT?
   > 
   > I don't have a strong opinion about this. I don't know about any precedent 
for this... do you think this would be necessary? Do we expect that user names 
/ schema ids would contain any "dangerous" characters? If the logs are 
processed by some scripts, then maybe escaping \n (or even \r) might be good. 
On the other hand the log processing tools are usually more robust and can 
handle multiline logs too (e.g. stacktraces). Also you can configure log4j to 
produce UTF-8 log files I guess.
   
   I am not adding that "fourth commit" for now, and have also "disabled" the 
third one, which does per-scheme filtering.  (I have kept it in the individual 
commits on this PR in case somebody wants to fish it out, but it will 
"disappear" once everything is squashed by the committer.)
   
   I think we should be careful not to inject unsanitized user data into logs 
in general. But the escaping and per-scheme filtering discussed above seem 
like overkill, because authentication IDs are normally not under user control, 
except when the `digest` provider is enabled, and we now have a flag to block 
that vector.
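
   For reference, the escaping subset floated in the quoted discussion 
(% → %25, \t → %09, \n → %0A, non-ASCII → %-encoded UTF-8 bytes) could look 
roughly like the sketch below. This is only an illustration, not code from 
this PR: the class and method names are hypothetical, and escaping every 
ASCII control character (including \r) plus DEL is an assumption that goes 
slightly beyond the three characters listed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZooKeeper: escapes a field value
// before it is written to an audit log entry.
public final class AuditFieldEscaper {
    private AuditFieldEscaper() {
    }

    public static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length());
        // Work on UTF-8 bytes so each non-ASCII character becomes a
        // sequence of %XX escapes, as in URL encoding.
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            // Assumption: escape '%', all control characters
            // (\t, \n, \r, ...), DEL, and every byte outside ASCII.
            if (c == '%' || c < 0x20 || c >= 0x7F) {
                out.append('%').append(String.format("%02X", c));
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }
}
```

   With this sketch, `AuditFieldEscaper.escape("user\tname\n")` yields 
`user%09name%0A`, so an escaped value can no longer break a tab-separated, 
one-entry-per-line log format, while printable ASCII stays readable.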


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

