Xuefu Zhang created HIVE-7492:
---------------------------------

    Summary: Enhance SparkCollector
        Key: HIVE-7492
        URL: https://issues.apache.org/jira/browse/HIVE-7492
    Project: Hive
 Issue Type: Sub-task
 Components: Spark
   Reporter: Xuefu Zhang

SparkCollector is used to collect the rows generated by HiveMapFunction or HiveReduceFunction. It is currently backed by an ArrayList and thus has unbounded memory usage. Ideally, the collector should have bounded memory usage and be able to spill to disk when its quota is reached.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
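
For illustration only, the bounded, spill-to-disk behavior requested in the description could look roughly like the sketch below. It is not Hive's SparkCollector: the class name SpillingCollector, its methods, and the row-count quota are all hypothetical, and rows are assumed to be Serializable so that plain Java serialization can stand in for whatever row format the real collector would use.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: a collector whose in-memory buffer is capped and
// whose overflow is written to a temporary file. Not Hive's actual class.
public class SpillingCollector<T extends Serializable> implements AutoCloseable {
    // Assumption: the quota is a row count; a real quota would likely be bytes.
    private final int maxRowsInMemory;
    private final List<T> buffer = new ArrayList<>();
    private File spillFile;
    private ObjectOutputStream spillOut;
    private long spilledRows = 0;

    public SpillingCollector(int maxRowsInMemory) {
        if (maxRowsInMemory < 1) {
            throw new IllegalArgumentException("maxRowsInMemory must be >= 1");
        }
        this.maxRowsInMemory = maxRowsInMemory;
    }

    // Buffers the row; once the buffer reaches the quota it is moved to disk.
    public void collect(T row) throws IOException {
        buffer.add(row);
        if (buffer.size() >= maxRowsInMemory) {
            spill();
        }
    }

    private void spill() throws IOException {
        if (spillOut == null) {
            spillFile = File.createTempFile("collector-spill", ".bin");
            spillFile.deleteOnExit();
            spillOut = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(spillFile)));
        }
        for (T row : buffer) {
            spillOut.writeObject(row);
        }
        // Clear the stream's back-reference table so spilled rows are not
        // kept reachable by the ObjectOutputStream itself.
        spillOut.reset();
        spilledRows += buffer.size();
        buffer.clear();
    }

    public long spilledRowCount() { return spilledRows; }

    public int bufferedRowCount() { return buffer.size(); }

    // Streams every collected row in insertion order: spilled rows first,
    // then the rows still held in memory.
    @SuppressWarnings("unchecked")
    public void forEachRow(Consumer<T> action) throws IOException, ClassNotFoundException {
        if (spillOut != null) {
            spillOut.flush();
            try (ObjectInputStream in = new ObjectInputStream(
                    new BufferedInputStream(new FileInputStream(spillFile)))) {
                for (long i = 0; i < spilledRows; i++) {
                    action.accept((T) in.readObject());
                }
            }
        }
        for (T row : buffer) {
            action.accept(row);
        }
    }

    // Closes the spill stream and removes the temporary file, if any.
    @Override
    public void close() throws IOException {
        if (spillOut != null) {
            spillOut.close();
            spillOut = null;
        }
        if (spillFile != null) {
            spillFile.delete();
            spillFile = null;
        }
    }
}
```

With a quota of 2, collecting five rows would leave four on disk and one in memory, and forEachRow would still visit them in the order they were collected. Counting rows keeps the sketch short; bounding by estimated bytes would be closer to the "quota" the issue describes, but needs a size estimate per row.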