[
https://issues.apache.org/jira/browse/CRUNCH-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820242#comment-13820242
]
Josh Wills commented on CRUNCH-230:
-----------------------------------
Seems useful and straightforward to me. My week is sort of shot, but I should
have time to cook something up this weekend if someone else doesn't beat me to
it.
> Attempt to estimate HBase table sizes when we're given a trivial Scan object
> ----------------------------------------------------------------------------
>
> Key: CRUNCH-230
> URL: https://issues.apache.org/jira/browse/CRUNCH-230
> Project: Crunch
> Issue Type: Improvement
> Components: IO
> Affects Versions: 0.6.0
> Reporter: Josh Wills
> Priority: Minor
> Attachments: CRUNCH-230.patch
>
>
> If we're asked to do a scan of an entire HBase table, we can actually do a
> pretty good job of estimating how large it is by looking up its directory in
> HDFS. This patch checks the input scan given to the HBaseSourceTarget, and if
> it doesn't specify any filters, looks up the size of the input table on HDFS.
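
For illustration only, the check described above might be sketched roughly as
follows. This is not the code in CRUNCH-230.patch. ScanSpec and
TableSizeEstimator are hypothetical names, the local filesystem stands in for
HDFS, and treating empty start/stop rows as part of "trivial" is an assumption
(the description only mentions filters).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.LongSupplier;
import java.util.stream.Stream;

// Hypothetical stand-in for the parts of an HBase Scan that matter here.
final class ScanSpec {
    final Object filter;   // null means no filter was set
    final byte[] startRow; // empty means "from the start of the table"
    final byte[] stopRow;  // empty means "to the end of the table"

    ScanSpec(Object filter, byte[] startRow, byte[] stopRow) {
        this.filter = filter;
        this.startRow = startRow;
        this.stopRow = stopRow;
    }

    // Assumption: "trivial" means no filter and no row bounds.
    boolean isTrivial() {
        return filter == null && startRow.length == 0 && stopRow.length == 0;
    }
}

// Hypothetical helper, not part of Crunch.
final class TableSizeEstimator {
    private TableSizeEstimator() {}

    // Use the on-disk size only for a full-table scan; otherwise fall back.
    // The supplier is not invoked for non-trivial scans.
    static long estimate(ScanSpec scan, LongSupplier tableBytesOnDisk, long fallbackBytes) {
        return scan.isTrivial() ? tableBytesOnDisk.getAsLong() : fallbackBytes;
    }

    // Sums the sizes of all regular files under dir (local stand-in for HDFS).
    static long directorySize(Path dir) {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile).mapToLong(p -> {
                try {
                    return Files.size(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }).sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Against a real cluster, something like Hadoop's
FileSystem.getContentSummary(path).getLength() would presumably take the place
of directorySize, with the table's directory under the HBase root as the path.
Passing the size lookup as a supplier keeps the filesystem call out of the way
when the scan is filtered and the fallback estimate is used instead.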