Ashutosh Bapat created HIVE-20644:
-------------------------------------

             Summary: Avoid exposing sensitive information through an error 
message
                 Key: HIVE-20644
                 URL: https://issues.apache.org/jira/browse/HIVE-20644
             Project: Hive
          Issue Type: Improvement
          Components: HiveServer2
            Reporter: Ashutosh Bapat
            Assignee: Ashutosh Bapat


The HiveException raised from the following methods exposes the data row that 
caused the runtime exception.
 # ReduceRecordSource::GroupIterator::next() - around line 372
 # MapOperator::process() - around line 567
 # ExecReducer::reduce() - around line 243

In all these cases, a string representation of the row is constructed on the fly 
and included in the error message.

VectorMapOperator::process() (around line 973) raises the same exception but 
does not expose the row, since the row contents are not included in the error 
message.

While trying to reproduce the above error, I also found that the arguments to a 
UDF are exposed in log messages from FunctionRegistry::invoke() around line 
1114. This too can leak sensitive information through an error message.

In this way, sensitive information is leaked to a user through the exception 
message. That information may not be available to the user otherwise. Hence 
it's a kind of security breach or violation of access control.

The contents of the row or the arguments to a function may be useful for 
debugging, so they are worth adding to the logs. The proposal here is to log 
a separate message, at log level DEBUG or INFO, containing the string 
representation of the row. Users can configure their logging so that DEBUG/INFO 
messages do not go to the client but are still available in the Hive server 
logs for debugging. The actual exception message will not contain any 
sensitive data such as row data or argument data.
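
A minimal sketch of the proposed pattern, not the actual Hive change. All 
names here (RowErrorSketch, ProcessingException, secondField) are hypothetical 
stand-ins, and java.util.logging is used only to keep the example free of 
dependencies; its FINE level plays the role of DEBUG. The row is written to the 
server-side log, while the exception message that may reach the client omits it.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class RowErrorSketch {
    private static final Logger LOG =
        Logger.getLogger(RowErrorSketch.class.getName());

    /** Hypothetical stand-in for HiveException. */
    public static class ProcessingException extends Exception {
        public ProcessingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical per-row work: fails if the row has fewer than 2 fields. */
    private static String secondField(String row) {
        String[] fields = row.split(",");
        if (fields.length < 2) {
            // The underlying error message must not embed the row either,
            // because it travels with the wrapping exception as its cause.
            throw new IllegalStateException("expected at least 2 fields");
        }
        return fields[1];
    }

    public static String process(String row) throws ProcessingException {
        try {
            return secondField(row);
        } catch (RuntimeException e) {
            // Row contents go only to the log, at a debug-equivalent level.
            LOG.log(Level.FINE, "Error while processing row: {0}", row);
            // The message that may be shown to the client omits the row.
            throw new ProcessingException(
                "Error while processing row (row contents logged at debug level)",
                e);
        }
    }

    public static void main(String[] args) {
        try {
            process("ssn-123-45-6789");
        } catch (ProcessingException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that the wrapped cause is also visible to whoever receives the exception, 
so its message has to be free of row data as well. Some standard exceptions 
(NumberFormatException, for example) include the offending input in their 
message.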



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
