[jira] [Commented] (HBASE-15773) CellCounter improvements

Enis Soztutar (JIRA) Thu, 05 May 2016 18:43:37 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-15773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273468#comment-15273468
 ]


Enis Soztutar commented on HBASE-15773:
---------------------------------------

bq. generating job counters containing row keys and column qualifiers is 
guaranteed to blow up on anything but the smallest table. 
Yep, YARN will actually fail the job if it tries to write more than ~50 
counters. 
I skimmed the patch and it looks good. 
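
To illustrate the failure mode, here is a minimal sketch in plain Java. It is not Hadoop's counter API: the class CounterLimitSketch, the LimitedCounters registry, and the limit of 50 (taken from the rough figure above) are hypothetical stand-ins. It only shows that one counter per row key hits a fixed cap as soon as the table has more rows than the cap, while a fixed set of aggregate counters never does.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a capped job-counter registry (hypothetical, not Hadoop's API). */
public class CounterLimitSketch {

    /** Rejects the creation of new counter names once maxCounters exist. */
    static class LimitedCounters {
        private final int maxCounters;
        private final Map<String, Long> counters = new HashMap<>();

        LimitedCounters(int maxCounters) {
            this.maxCounters = maxCounters;
        }

        void increment(String name) {
            // Only a brand-new counter name can push us over the cap.
            if (!counters.containsKey(name) && counters.size() >= maxCounters) {
                throw new IllegalStateException(
                    "Too many counters: " + (counters.size() + 1) + " max=" + maxCounters);
            }
            counters.merge(name, 1L, Long::sum);
        }
    }

    /** One counter per row key: the counter count grows with the table. */
    static boolean perRowCountersFit(int rows, int limit) {
        LimitedCounters c = new LimitedCounters(limit);
        try {
            for (int i = 0; i < rows; i++) {
                c.increment("row-" + i);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** A fixed set of aggregate counters: the counter count is constant. */
    static boolean aggregateCountersFit(int rows, int limit) {
        LimitedCounters c = new LimitedCounters(limit);
        try {
            for (int i = 0; i < rows; i++) {
                c.increment("ROWS");
                c.increment("CELLS");
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(perRowCountersFit(1000, 50));    // false
        System.out.println(aggregateCountersFit(1000, 50)); // true
    }
}
```

The per-row variant fails on the 51st distinct row key regardless of how the cap is tuned relative to a real table, which is why dropping those counters is the sensible fix.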

> CellCounter improvements
> ------------------------
>
>                 Key: HBASE-15773
>                 URL: https://issues.apache.org/jira/browse/HBASE-15773
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>             Fix For: 1.3.0
>
>         Attachments: HBASE-15773.001.patch
>
>
> Looking at the CellCounter MapReduce job, it seems like it can be improved in 
> a few areas:
> * It does not currently support setting scan batching.  This is important 
> when we're fetching all versions for columns.  Actually, it would be nice to 
> support all of the scan configuration currently provided in TableInputFormat.
> * Generating job counters containing row keys and column qualifiers is 
> guaranteed to blow up on anything but the smallest table.  This is not usable 
> and doesn't make any sense when the same counts are in the job output.  The 
> row- and qualifier-specific counters should be dropped.
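
As a rough illustration of why scan batching matters when fetching all versions, the sketch below models batching in plain Java without any HBase dependency. The class ScanBatchSketch and its batch helper are hypothetical. They only mimic the idea behind a batch limit on a scan: a wide row's cells are handed back in chunks of at most batchSize cells instead of all at once.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of scan batching (hypothetical helper, not the HBase client API). */
public class ScanBatchSketch {

    /** Splits one row's cells into chunks of at most batchSize cells each. */
    static <T> List<List<T>> batch(List<T> cells, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < cells.size(); i += batchSize) {
            int end = Math.min(i + batchSize, cells.size());
            chunks.add(new ArrayList<>(cells.subList(i, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Ten versions of a single column in one row.
        List<String> cells = new ArrayList<>();
        for (int v = 0; v < 10; v++) {
            cells.add("q1@v" + v);
        }
        // With a batch size of 4 the row arrives as chunks of 4, 4 and 2 cells.
        System.out.println(batch(cells, 4).size()); // 3
    }
}
```

Without a batch limit, a row holding many versions has to be materialized in a single result, so memory use grows with the widest row rather than with the chosen batch size.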



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
