optimize mapjoin to use distributedcache
----------------------------------------
Key: HIVE-1599
URL: https://issues.apache.org/jira/browse/HIVE-1599
Project: Hadoop Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain
Fix For: 0.7.0

Currently, in a mapjoin, each mapper reads the file from HDFS on its own. This creates problems when the number of mappers is very high. It would be better to put the files in the distributed cache before the job starts, so that the mappers read them from the cache instead of from HDFS as they do currently.
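To make the scaling concern concrete, here is a rough toy model of how many times the file would be fetched from HDFS under each approach. It is not Hive or Hadoop code. The class `CacheReadModel`, its method names, and the round-robin placement of mappers on nodes are all assumptions made for illustration. The only property it relies on is that a distributed cache copies a file to each node once, after which tasks on that node read the local copy.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: counts HDFS fetches of the mapjoin file, nothing more.
public class CacheReadModel {

    // Current behaviour described in the issue: every mapper fetches the file itself.
    static long remoteReadsWithoutCache(int[] mapperToNode) {
        return mapperToNode.length;
    }

    // Proposed behaviour: the file is localized once per node that runs
    // at least one mapper; mappers on that node share the local copy.
    static long remoteReadsWithCache(int[] mapperToNode) {
        Set<Integer> nodes = new HashSet<>();
        for (int node : mapperToNode) {
            nodes.add(node);
        }
        return nodes.size();
    }

    // Hypothetical placement: mapper i runs on node (i mod nodeCount).
    static int[] assignRoundRobin(int mapperCount, int nodeCount) {
        int[] placement = new int[mapperCount];
        for (int i = 0; i < mapperCount; i++) {
            placement[i] = i % nodeCount;
        }
        return placement;
    }

    public static void main(String[] args) {
        int[] placement = assignRoundRobin(10_000, 100);
        System.out.println("HDFS reads without cache: " + remoteReadsWithoutCache(placement));
        System.out.println("HDFS reads with cache:    " + remoteReadsWithCache(placement));
    }
}
```

Under these assumptions, 10,000 mappers on 100 nodes fetch the file 10,000 times without the cache and 100 times with it. The load on HDFS then grows with the number of nodes rather than the number of mappers.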