[
https://issues.apache.org/jira/browse/PHOENIX-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132577#comment-15132577
]
James Taylor commented on PHOENIX-2649:
---------------------------------------
[~maghamravikiran] and/or [~gabriel.reid] - got time for a quick re-review?
[~sergey.soldatov] - a couple of questions:
How do we know the offset is 4 here? Are we skipping over the length bytes?
I thought this was encoded as a vint - or did we change it to an int? If the
latter, can you use Bytes.SIZEOF_INT instead? Should we be using the original
comparator that Ravi wrote?
{code}
+ public static class Comparator extends WritableComparator {
+
+ private static final int LENGTH_BYTES = 4;
{code}
Would another solution be to write the length at the end of the row key and go
ahead and use the regular/built-in comparator?
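For context, here is a minimal, self-contained sketch of the fixed-width-length approach under discussion - it is not the actual Phoenix patch, and the class and method names (PairComparatorSketch, serialize, compare) are hypothetical. It assumes the layout [4-byte int table-name length][table name][row key], with LENGTH_BYTES == 4 playing the role of Bytes.SIZEOF_INT:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch, not the Phoenix patch: compare two serialized
// TableRowkeyPair-like values laid out as
// [4-byte int length][table name bytes][row key bytes].
public final class PairComparatorSketch {
    private static final int LENGTH_BYTES = 4; // == Bytes.SIZEOF_INT

    // Writer side: plain 4-byte int length prefix (no vint/zero-compression).
    static byte[] serialize(byte[] tableName, byte[] rowKey) {
        return ByteBuffer.allocate(LENGTH_BYTES + tableName.length + rowKey.length)
                .putInt(tableName.length)
                .put(tableName)
                .put(rowKey)
                .array();
    }

    // Reader side: compare table names first, then row keys. The fixed-width
    // prefix makes the offsets trivial to compute.
    static int compare(byte[] a, byte[] b) {
        int aLen = ByteBuffer.wrap(a).getInt();
        int bLen = ByteBuffer.wrap(b).getInt();
        int cmp = Arrays.compare(a, LENGTH_BYTES, LENGTH_BYTES + aLen,
                                 b, LENGTH_BYTES, LENGTH_BYTES + bLen);
        if (cmp != 0) {
            return cmp;
        }
        return Arrays.compare(a, LENGTH_BYTES + aLen, a.length,
                              b, LENGTH_BYTES + bLen, b.length);
    }
}
```

The key point is that writer and comparator must agree on the length encoding; either a fixed int on both sides (as above) or a vint on both sides works, but not a mix.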
> GC/OOM during BulkLoad
> ----------------------
>
> Key: PHOENIX-2649
> URL: https://issues.apache.org/jira/browse/PHOENIX-2649
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0
> Environment: Mac OS, Hadoop 2.7.2, HBase 1.1.2
> Reporter: Sergey Soldatov
> Assignee: maghamravikiran
> Priority: Critical
> Fix For: 4.7.0
>
> Attachments: PHOENIX-2649-1.patch, PHOENIX-2649-2.patch,
> PHOENIX-2649.patch
>
>
> Phoenix fails to complete a bulk load of 40 MB of CSV data, hitting a GC heap
> error during the reduce phase. The problem is in the comparator for
> TableRowkeyPair. It expects the serialized value to have been written using
> zero-compressed encoding, but at least in my case it was written the regular
> way. So when it tries to read the lengths of the table name and row key it
> always gets zero and reports the byte arrays as equal. As a result, the
> reducer receives all the data produced by the mappers in a single reduce
> call and fails with OOM.
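The "length always reads as zero" failure mode described above can be illustrated with a small sketch (hypothetical class and method names, not Phoenix code). If the writer emits a plain 4-byte int length, any length below 2^24 starts with a 0x00 byte, and a Hadoop-style zero-compressed (vint) decoder reads that first byte as the complete value 0:

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of the encoding mismatch: a plain 4-byte int
// length decoded as if it were a Hadoop-style vint.
public final class VIntMismatchSketch {
    // Minimal decoder for the single-byte vint case used by Hadoop's
    // zero-compressed encoding: a first byte in [-112, 127] is the value.
    static int decodeVIntFirstByte(byte[] buf, int offset) {
        byte firstByte = buf[offset];
        if (firstByte >= -112) {
            return firstByte; // single-byte vint
        }
        throw new UnsupportedOperationException("multi-byte vint not needed here");
    }

    // Writer side: a plain big-endian 4-byte int, e.g. 5 -> {0x00,0x00,0x00,0x05}.
    static byte[] writePlainInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }
}
```

With every length decoding to 0, the comparator sees all keys as equal, so the whole mapper output collapses into one reduce call, matching the OOM described above.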
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)