Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10793 )

Change subject: WIP IMPALA-7178: Add the possibility to reduce logging for 
common data errors
......................................................................


Patch Set 1:

Thanks for the comments! I will not update the code today, but I would like to 
respond to the comments about the general approach:

"Currently we only store the first concrete error message per error code, but 
we might need all unique error messages."

For this error, every message would be the same for a given file+column, and 
column readers are not reused between columns/files. Things would be different 
if the values were logged, but the data itself is considered sensitive 
information.

A new function like AddIfMessageIsUnique() could be added to LogCollector that 
would keep a map of the messages seen so far and check uniqueness against it, 
but I would only add it if it were actually used somewhere. Such a function 
would also raise some questions: can the map grow without limit? If not, what 
should happen once the limit is reached?
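
To make those questions concrete, here is a minimal sketch of what such a 
function might look like. It is not the actual LogCollector code: the class 
name UniqueMessageLog, the max_unique cap, and the dropped counter are all 
hypothetical, and it uses a set rather than a map because only membership is 
needed. It assumes one possible answer to the limit question: stop storing new 
messages once the cap is reached and only count them.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical sketch only; not Impala's LogCollector.
class UniqueMessageLog {
 public:
  // 'max_unique' is an assumed cap on how many distinct messages are kept.
  explicit UniqueMessageLog(std::size_t max_unique) : max_unique_(max_unique) {}

  // Returns true if 'msg' was not seen before and was stored.
  // Returns false for a duplicate, or when the cap has been reached; in the
  // latter case the message is not stored and is counted as dropped.
  bool AddIfMessageIsUnique(const std::string& msg) {
    if (seen_.count(msg) > 0) return false;  // Duplicate, nothing to do.
    if (seen_.size() >= max_unique_) {
      ++dropped_;  // Cap reached: remember only that something was lost.
      return false;
    }
    seen_.insert(msg);
    return true;
  }

  std::size_t num_unique() const { return seen_.size(); }
  std::size_t num_dropped() const { return dropped_; }

 private:
  const std::size_t max_unique_;
  std::unordered_set<std::string> seen_;
  std::size_t dropped_ = 0;
};
```

One consequence of this choice is that, after the cap is hit, a message that 
was dropped earlier is counted as dropped again each time it reappears, since 
it was never stored. Whether that is acceptable is part of the open question.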


--
To view, visit http://gerrit.cloudera.org:8080/10793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3b7c1fd020a7ba5e0d9c619e1b67236dce198aa
Gerrit-Change-Number: 10793
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Jun 2018 18:17:27 +0000
Gerrit-HasComments: No
