Xuefu Zhang created HIVE-15682:
----------------------------------

             Summary: Eliminate the dummy iterator and optimize the per row based reducer-side processing
                 Key: HIVE-15682
                 URL: https://issues.apache.org/jira/browse/HIVE-15682
             Project: Hive
          Issue Type: Improvement
          Components: Spark
    Affects Versions: 2.2.0
            Reporter: Xuefu Zhang
            Assignee: Xuefu Zhang

HIVE-15580 introduced a dummy iterator per input row. It can be eliminated, because {{SparkReduceRecordHandler}} is able to handle single key-value pairs. We can refactor this part of the code to:

1. remove the need for an iterator, and
2. optimize the code path for per-(key, value) processing, as opposed to (key, value iterator) processing.

It would also be great to measure performance after these optimizations and compare it with performance prior to HIVE-15580.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
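
To illustrate the refactoring proposed in this issue, here is a minimal, self-contained sketch contrasting the two code paths. It is not Hive's actual code: {{RowHandler}}, {{processWithDummyIterator}}, and {{processPerRow}} are hypothetical names made up for this example, and plain strings stand in for the real key and value objects. The sketch only assumes what the description states, namely that the handler can accept a single (key, value) pair directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class PerRowReduceSketch {

    /** Hypothetical stand-in for a reduce-side record handler; not Hive's API. */
    static class RowHandler {
        final List<String> seen = new ArrayList<>();

        // Grouped path: one key together with an iterator over its values.
        void processRow(String key, Iterator<String> values) {
            while (values.hasNext()) {
                seen.add(key + "=" + values.next());
            }
        }

        // Per-row path: a single (key, value) pair, no iterator involved.
        void processRow(String key, String value) {
            seen.add(key + "=" + value);
        }
    }

    /** "Before": each row is wrapped in a one-element dummy iterator. */
    static int processWithDummyIterator(RowHandler handler, List<String[]> rows) {
        int iteratorsCreated = 0;
        for (String[] row : rows) {
            Iterator<String> dummy = Collections.singletonList(row[1]).iterator();
            iteratorsCreated++;
            handler.processRow(row[0], dummy);
        }
        return iteratorsCreated;
    }

    /** "After": each row is handed to the handler directly. */
    static int processPerRow(RowHandler handler, List<String[]> rows) {
        for (String[] row : rows) {
            handler.processRow(row[0], row[1]);
        }
        return 0; // no per-row iterator is allocated on this path
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"k1", "a"},
            new String[] {"k1", "b"},
            new String[] {"k2", "c"});

        RowHandler before = new RowHandler();
        RowHandler after = new RowHandler();
        int allocatedBefore = processWithDummyIterator(before, rows);
        int allocatedAfter = processPerRow(after, rows);

        System.out.println("same output: " + before.seen.equals(after.seen));
        System.out.println("iterators allocated: " + allocatedBefore + " vs " + allocatedAfter);
    }
}
```

Both paths hand the same rows to the handler in the same order; the per-row path simply skips one short-lived wrapper list and iterator per input row. Whether that saving is measurable in practice is exactly what the benchmark comparison requested above would need to show.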