[ 
https://issues.apache.org/jira/browse/CRUNCH-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820242#comment-13820242
 ] 

Josh Wills commented on CRUNCH-230:
-----------------------------------

Seems useful and straightforward to me. My week is sort of shot, but I should 
have time to cook something up this weekend if someone else doesn't beat me to 
it.

> Attempt to estimate HBase table sizes when we're given a trivial Scan object
> ----------------------------------------------------------------------------
>
>                 Key: CRUNCH-230
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-230
>             Project: Crunch
>          Issue Type: Improvement
>          Components: IO
>    Affects Versions: 0.6.0
>            Reporter: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-230.patch
>
>
> If we're asked to do a scan of an entire HBase table, we can actually do a 
> pretty good job of estimating how large it is by looking up its directory in 
> HDFS. This patch checks the input scan given to the HBaseSourceTarget, and if 
> it doesn't specify any filters, looks up the size of the input table on HDFS.
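
The check described above can be sketched roughly as follows. This is not the code in CRUNCH-230.patch; it is a minimal, self-contained illustration that uses the local filesystem (java.nio) as a stand-in for HDFS. `ScanSpec`, `TableSizeEstimator`, `directorySize`, and the `fallback` parameter are all hypothetical names made up for the example. On a real cluster the directory size would more likely come from Hadoop's `FileSystem.getContentSummary(path).getLength()`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class TableSizeEstimator {

    /** Hypothetical stand-in for the one property of a scan that matters here. */
    static final class ScanSpec {
        final boolean hasFilter;

        ScanSpec(boolean hasFilter) {
            this.hasFilter = hasFilter;
        }
    }

    /** Sums the sizes of all regular files under dir, recursively. */
    static long directorySize(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths
                .filter(Files::isRegularFile)
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        }
    }

    /**
     * Returns the on-disk size of the table directory when the scan has no
     * filter (assumed here to mean a full-table scan), otherwise the
     * caller-supplied fallback estimate.
     */
    static long estimateSize(ScanSpec scan, Path tableDir, long fallback) throws IOException {
        if (scan.hasFilter) {
            return fallback;
        }
        return directorySize(tableDir);
    }
}
```

The fallback is left to the caller because the issue only covers the unfiltered case; how a filtered scan should be sized is outside what the description states.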



--
This message was sent by Atlassian JIRA
(v6.1#6144)
