jkovacs-hwx commented on code in PR #4910:
URL: https://github.com/apache/hive/pull/4910#discussion_r1412090637


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergInputFormat.java:
##########
@@ -176,6 +179,20 @@ public RecordReader<Void, Container<Record>> getRecordReader(InputSplit split, J
     }
   }
 
+  private static void validateFilesWithinTableDirectory(InputSplit split, JobConf job) throws IOException {
+    boolean dataFilesWithingTableLocationOnly =
+        job.getBoolean(HiveConf.ConfVars.HIVE_ICEBERG_ALLOW_DATA_IN_TABLE_LOCATION_ONLY.varname,
+            HiveConf.ConfVars.HIVE_ICEBERG_ALLOW_DATA_IN_TABLE_LOCATION_ONLY.defaultBoolVal);
+    if (dataFilesWithingTableLocationOnly) {
+      Path tableLocation = new Path(job.get(InputFormatConfig.TABLE_LOCATION));

Review Comment:
   If only an admin can inject a jar to override the behaviour, then this 
should not be a blocker for this scope. For example, an admin could just as 
well inject a jar to override the AuthN chain and dump the username/password 
pairs of everyone connecting to HS2 with that auth method, e.g. via JDBC.
   
   doAs=true is not an option for Fine-Grained Access Control (FGAC), where 
this issue is most significant (e.g. data that should be masked is exposed 
unmasked). Spark has neither FGAC nor elevated-privilege data access that is 
decoupled from end-user file access.
   
   > if the data location is sensitive - why should we leak it?
   
   Historically it was not sensitive, and even with Iceberg it should not be 
treated as sensitive. The issue comes from Iceberg's new behaviour: whereas a 
Hive-format table limits reads to the table directory, Iceberg can read data 
files from anywhere the Hive service user has access to, as long as the 
location is listed in its manifest file.
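
   To make the containment idea concrete, here is a minimal sketch of a 
"data file must live under the table location" check. It is not the code from 
this PR: the class name TableLocationCheck and the method 
isWithinTableLocation are hypothetical, and it uses java.net.URI instead of 
Hadoop's Path so that it runs without extra dependencies.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical helper, for illustration only; not part of Hive or Iceberg.
public final class TableLocationCheck {
  private TableLocationCheck() {
  }

  /**
   * Returns true only if dataFile is located under tableLocation:
   * same scheme, same authority, and a path below the table directory.
   */
  public static boolean isWithinTableLocation(String tableLocation, String dataFile) {
    // normalize() collapses "." and ".." segments, so "t1/../t2" cannot
    // be used to step out of the table directory.
    URI table = URI.create(tableLocation).normalize();
    URI file = URI.create(dataFile).normalize();

    if (!Objects.equals(table.getScheme(), file.getScheme())
        || !Objects.equals(table.getAuthority(), file.getAuthority())) {
      return false;
    }

    String tablePath = table.getPath();
    String filePath = file.getPath();
    if (tablePath == null || filePath == null) {
      return false;
    }

    // Compare against "<table>/" so that a sibling such as ".../t10" does
    // not pass as being inside ".../t1".
    String tableDir = tablePath.endsWith("/") ? tablePath : tablePath + "/";
    return filePath.startsWith(tableDir);
  }
}
```

   With a check of this kind, a manifest entry pointing at another table's 
directory, or at a different filesystem altogether, would be rejected before 
the service user opens the file.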



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

