Ted Malaska created HBASE-14339:
-----------------------------------
Summary: HBase Bulk Load and super wide rows
Key: HBASE-14339
URL: https://issues.apache.org/jira/browse/HBASE-14339
Project: HBase
Issue Type: Bug
Reporter: Ted Malaska
Priority: Minor
This may not be a huge issues but it does come up. If the number of columns in
a row are to many then KeyValueSortReducer will blow up with a out of memory
exception, because it uses a TreeMap to sort the columns with in the memory of
the reducer.
A solution would be to add the column family and qualifier to the key so the
shuffle would handle the sort.
The partitioner would only partition on the rowKey but ordering would apply to
the RowKey, Column Family, and Column Qualifier.
Look at the Spark Bulk load as an example. HBASE-14150
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)