deniskuzZ commented on code in PR #4910:
URL: https://github.com/apache/hive/pull/4910#discussion_r1412079925


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergInputFormat.java:
##########
@@ -176,6 +179,20 @@ public RecordReader<Void, Container<Record>> getRecordReader(InputSplit split, J
     }
   }
 
+  private static void validateFilesWithinTableDirectory(InputSplit split, JobConf job) throws IOException {
+    boolean dataFilesWithingTableLocationOnly =
+        job.getBoolean(HiveConf.ConfVars.HIVE_ICEBERG_ALLOW_DATA_IN_TABLE_LOCATION_ONLY.varname,
+            HiveConf.ConfVars.HIVE_ICEBERG_ALLOW_DATA_IN_TABLE_LOCATION_ONLY.defaultBoolVal);
+    if (dataFilesWithingTableLocationOnly) {
+      Path tableLocation = new Path(job.get(InputFormatConfig.TABLE_LOCATION));

Review Comment:
   > Could a custom/per-user jar lead to the same class override?
   
   yes, but that could be altered only by an admin
   
   > Why don't we consider storage-based authorization
   
   why can't we use `doAs=true` as a quick fix? AFAIK Spark has it enabled
   
   > not expose the metadata
   
   if the data location is sensitive, why should we leak it? Also, this way authorization would be done in one dedicated place (a service designed for that) rather than in every engine.
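
For context on what the engine-side check in this hunk implies, here is a minimal sketch of a "file must live under the table location" test. The quoted hunk is cut off before the actual comparison, so this is not the PR's implementation: the class name `TableLocationCheck` and the method `isWithinTableLocation` are hypothetical, and the sketch uses only `java.net.URI` from the JDK instead of Hadoop's `Path`.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical helper, not taken from the PR: decides whether a data file
// location is inside a table location using plain JDK URI handling.
public final class TableLocationCheck {

  private TableLocationCheck() {
  }

  static boolean isWithinTableLocation(String tableLocation, String fileLocation) {
    // normalize() collapses "." and ".." segments, so a path such as
    // ".../tbl/../other/file" cannot pass as a child of ".../tbl".
    URI table = URI.create(tableLocation).normalize();
    URI file = URI.create(fileLocation).normalize();

    // Scheme and authority (bucket or namenode) must match exactly.
    if (!Objects.equals(table.getScheme(), file.getScheme())
        || !Objects.equals(table.getAuthority(), file.getAuthority())) {
      return false;
    }

    String tablePath = table.getPath();
    String filePath = file.getPath();
    if (tablePath == null || filePath == null) {
      return false;
    }

    // Compare against the directory prefix with a trailing slash so that
    // "/db/tbl" does not accidentally match "/db/tbl2/...".
    String prefix = tablePath.endsWith("/") ? tablePath : tablePath + "/";
    return filePath.startsWith(prefix);
  }
}
```

A check like this would have to be repeated in every engine that reads the table, which is the concern raised above about doing authorization in one dedicated service instead.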
   


