FatalLin edited a comment on pull request #32202: URL: https://github.com/apache/spark/pull/32202#issuecomment-822210112
About the configurations `mapred.input.dir.recursive` and `hive.mapred.supports.subdirectories`: I found a brief introduction in the Hive documentation (https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties):

> **hive.mapred.supports.subdirectories**
> Default Value: `false`
> Added In: Hive 0.10.0 with HIVE-3276
> Whether the version of Hadoop which is running supports sub-directories for tables/partitions. Many Hive optimizations can be applied if the Hadoop version supports sub-directories for tables/partitions. This support was added by MAPREDUCE-1501.

It looks like `mapred.input.dir.recursive` allows MapReduce to read files from sub-directories, while `hive.mapred.supports.subdirectories` allows Hive to apply some sub-directory-related optimizations. My first thought was that since Hive and MapReduce are separate projects, it makes sense for each to have its own configuration for this. But in Spark the operation only happens in Spark SQL, which is why I only checked the Hive-side configuration `hive.mapred.supports.subdirectories` earlier. What do you think? @attilapiros

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
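For context, a minimal sketch of how both settings could be enabled in a Spark SQL session, assuming Spark forwards them to the underlying Hive/Hadoop readers (the session commands are illustrative, not taken from this PR):

```sql
-- Hadoop/MapReduce side: let input formats recurse into sub-directories
SET mapred.input.dir.recursive=true;

-- Hive side: declare that the running Hadoop supports sub-directories
-- for tables/partitions, enabling the related Hive optimizations
SET hive.mapred.supports.subdirectories=true;
```

Whether both are needed, or only the Hive-side one, is exactly the question raised above.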
