FatalLin edited a comment on pull request #32202: URL: https://github.com/apache/spark/pull/32202#issuecomment-822210112
About the configurations `mapred.input.dir.recursive` and `hive.mapred.supports.subdirectories`: I found a brief introduction in the Hive documentation (https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties):

> **hive.mapred.supports.subdirectories**
> Default Value: `false`
> Added In: Hive 0.10.0 with HIVE-3276
> Whether the version of Hadoop which is running supports sub-directories for tables/partitions. Many Hive optimizations can be applied if the Hadoop version supports sub-directories for tables/partitions. This support was added by MAPREDUCE-1501.

It looks like `mapred.input.dir.recursive` allows MapReduce to read files from sub-directories, while `hive.mapred.supports.subdirectories` allows Hive to apply some sub-directory-related optimizations. My first thought was that since Hive and MapReduce are separate projects, it makes sense for each to have its own configuration for this. But in Spark the operation only happens in Spark SQL, which is why I only checked the Hive-side configuration `hive.mapred.supports.subdirectories` earlier. What do you think? @attilapiros

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
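For context, a minimal sketch of how both settings could be enabled in a Spark SQL session, assuming Spark forwards them to the underlying Hive/Hadoop readers (the session commands are illustrative, not taken from this PR):

```sql
-- Hadoop/MapReduce side: let input formats recurse into sub-directories
SET mapred.input.dir.recursive=true;

-- Hive side: declare that the running Hadoop supports sub-directories
-- for tables/partitions, enabling the related Hive optimizations
SET hive.mapred.supports.subdirectories=true;
```

Whether both are needed, or only the Hive-side one, is exactly the question raised above.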
