[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-1342:
-------------------------------

    Attachment: patch-1342.txt

This patch adds a limit on the number of unique values for the UniqueValueCount aggregator.
If the actual number of unique values is greater than the limit, the counter will be limit + 1.

The limit is set through the attribute "aggregate.max.num.unique.values".
It can be set by calling job.setLong("aggregate.max.num.unique.values", 200).
The default is Long.MAX_VALUE (same as the current behavior).

> A configurable limit on the number of unique values should be set on the UniqueValueCount and ValueHistogram aggregators
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1342
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1342
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>         Attachments: patch-1342.txt
>
>
> In the current implementation, the number of unique values may grow
> without bound, eventually causing an out-of-memory error.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
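
For readers who want to see the capping behavior described above in isolation, here is a minimal sketch. It is not the code from patch-1342.txt: the class name CappedUniqueCounter and its methods are hypothetical, and it only assumes the semantics stated in the comment (once more than "limit" distinct values have been seen, the reported count stays at limit + 1, and the default limit is Long.MAX_VALUE).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a unique-value counter with a configurable cap.
// Illustrative only; not the Hadoop aggregator itself.
final class CappedUniqueCounter {
    private final long maxNumUniqueValues;
    private final Set<String> seen = new HashSet<>();

    // Default: effectively unbounded, matching the pre-patch behavior.
    CappedUniqueCounter() {
        this(Long.MAX_VALUE);
    }

    CappedUniqueCounter(long maxNumUniqueValues) {
        this.maxNumUniqueValues = maxNumUniqueValues;
    }

    // Record a value. New values are stored only while the set holds
    // at most maxNumUniqueValues entries, so the set never grows beyond
    // maxNumUniqueValues + 1 and memory use stays bounded.
    void add(String value) {
        if (seen.size() <= maxNumUniqueValues) {
            seen.add(value);
        }
    }

    // Number of distinct values seen, capped at maxNumUniqueValues + 1.
    long count() {
        return seen.size();
    }
}
```

With a limit of 3, adding five distinct values leaves count() at 4 (limit + 1), which signals that the limit was exceeded without tracking every value. In a real job the limit would come from the configuration, as in job.setLong("aggregate.max.num.unique.values", 200).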