[ https://issues.apache.org/jira/browse/HIVE-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Sichi updated HIVE-2408:
-----------------------------

    Component/s:     (was: HBase Handler)
                 Query Processor

> Perpetually degrading performance in checkPaths
> -----------------------------------------------
>
>                 Key: HIVE-2408
>                 URL: https://issues.apache.org/jira/browse/HIVE-2408
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.7.1, 0.8.0
>            Reporter: Grisha Trubetskoy
>
> In ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, checkPaths()
> tacks on a copy_N if a file exists, working its way up until an available
> file name is found. The problem is that the exists() check is quite expensive
> in HDFS, and if you have hundreds of files to go through this becomes a
> serious bottleneck.
> A better solution would be to use a timestamp in the file name, then followed
> by the "copy_N" scheme.
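
To make the cost difference concrete, here is a minimal Java sketch of the two naming strategies described in the issue. It is not the actual Hive.checkPaths() code. The class CopyNameSketch, the in-memory FakeDir stand-in for an HDFS directory, the method names, and the exact "_copy_N" and "_<timestamp>" suffix formats are all assumptions made for illustration. FakeDir only counts how often exists() is called, which is the operation the report identifies as expensive.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the two naming schemes; not taken from Hive. */
public class CopyNameSketch {

    /** In-memory stand-in for an HDFS directory that counts exists() calls. */
    public static final class FakeDir {
        private final Set<String> names = new HashSet<>();
        public int existsCalls = 0;

        public boolean exists(String name) {
            existsCalls++; // each call models one round trip to the NameNode
            return names.contains(name);
        }

        public void create(String name) {
            names.add(name);
        }
    }

    /** Scheme described in the issue: probe name, name_copy_1, name_copy_2, ... */
    public static String copyNName(FakeDir dir, String name) {
        String candidate = name;
        for (int n = 1; dir.exists(candidate); n++) {
            candidate = name + "_copy_" + n; // suffix format is an assumption
        }
        return candidate;
    }

    /** Proposed scheme: append a timestamp first, then fall back to copy_N. */
    public static String timestampedName(FakeDir dir, String name, long timestampMillis) {
        // A collision now requires the same base name and the same timestamp,
        // so the copy_N loop normally stops after a single exists() call.
        return copyNName(dir, name + "_" + timestampMillis);
    }
}
```

With a base file and 100 existing copies in the directory, copyNName() in this sketch makes 102 exists() calls before it finds a free name, and the count grows by one with every further load. timestampedName() makes one call as long as no file with the same name and timestamp already exists.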