Szehon Ho created HIVE-13217:
--------------------------------

Summary: Replication for HoS mapjoin small file needs to respect dfs.replication.max
Key: HIVE-13217
URL: https://issues.apache.org/jira/browse/HIVE-13217
Project: Hive
Issue Type: Bug
Components: Spark
Affects Versions: 2.0.0, 1.2.1
Reporter: Szehon Ho
Assignee: Xuefu Zhang
Priority: Minor

Currently, Hive on Spark (HoS) mapjoin replicates the small-table file with a hard-coded replication factor of 10; see SparkHashTableSinkOperator.MIN_REPLICATION. When dfs.replication.max is less than 10, the HoS query fails. This constant should be capped at dfs.replication.max. Normally dfs.replication.max seems to be set to 512.
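
A minimal sketch of the proposed cap, not the actual Hive patch. To stay self-contained it uses a plain Map as a stand-in for the Hadoop configuration; the class name ReplicationCap, the method effectiveReplication, and the fallback to 10 when the key is unset are all assumptions made for illustration. Only the constant value 10 and the key dfs.replication.max come from the report above.

```java
import java.util.HashMap;
import java.util.Map;

public class ReplicationCap {
    // Mirrors the hard-coded value described in the report
    // (SparkHashTableSinkOperator.MIN_REPLICATION).
    static final int MIN_REPLICATION = 10;
    static final String DFS_REPLICATION_MAX = "dfs.replication.max";

    // Hypothetical helper: returns the replication factor to request,
    // capped at dfs.replication.max when that key is present.
    static int effectiveReplication(Map<String, String> conf) {
        String raw = conf.get(DFS_REPLICATION_MAX);
        // Assumption: if the key is unset, keep the existing behaviour.
        int dfsMax = (raw == null) ? MIN_REPLICATION : Integer.parseInt(raw.trim());
        return Math.min(MIN_REPLICATION, dfsMax);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        conf.put(DFS_REPLICATION_MAX, "3");
        System.out.println(effectiveReplication(conf)); // prints 3

        conf.put(DFS_REPLICATION_MAX, "512");
        System.out.println(effectiveReplication(conf)); // prints 10
    }
}
```

With a cluster limit of 3 the requested replication drops to 3, so the request stays within the limit; with the usual 512 the behaviour is unchanged.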