[ 
https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894755#comment-13894755
 ] 

Ted Yu commented on HBASE-10413:
--------------------------------

{code}
[email protected]
+public class RegionSizeCalculator {
{code}
@InterfaceAudience.Public should be used.
{code}
+  RegionSizeCalculator (HTable table, HBaseAdmin admin) throws IOException {
{code}
admin is only used in the constructor. Close it in a finally block.
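
A minimal sketch of the close-in-finally pattern suggested above. FakeAdmin is a hypothetical stand-in for the admin handle (it is not the HBase class) so that the example runs without a cluster; the region count it returns is made up.

```java
import java.io.Closeable;

public class CloseInFinallySketch {

  /** Hypothetical stand-in for an admin handle; it records whether close() ran. */
  public static class FakeAdmin implements Closeable {
    public boolean closed = false;
    private final boolean fail;

    public FakeAdmin(boolean fail) {
      this.fail = fail;
    }

    /** Pretends to query the cluster; throws when asked to simulate a failure. */
    public int regionCount() {
      if (fail) {
        throw new IllegalStateException("simulated cluster failure");
      }
      return 3;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  private final int regionCount;

  /** The admin is needed only here, so it is released before the constructor exits. */
  public CloseInFinallySketch(FakeAdmin admin) {
    try {
      this.regionCount = admin.regionCount();
    } finally {
      // Runs on both the normal and the exceptional path.
      admin.close();
    }
  }

  public int getRegionCount() {
    return regionCount;
  }
}
```

The finally block ensures the handle is released even when reading the region data throws.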
{code}
+          long regionSizeBytes = (memSize + fileSize) * megaByte;
{code}
Does the memstore size have to be included?
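
To make the question concrete, here is a small sketch of the two alternatives. It assumes both sizes are reported in whole megabytes and that the patch's megaByte constant is 1024 * 1024; the class and method names are illustrative and not taken from the patch.

```java
public class RegionSizeSketch {

  /** Assumed value of the patch's megaByte constant. */
  static final long MEGABYTE = 1024L * 1024L;

  /** Region size in bytes as computed in the patch: memstore plus store files. */
  public static long sizeWithMemstore(long memSizeMb, long fileSizeMb) {
    return (memSizeMb + fileSizeMb) * MEGABYTE;
  }

  /** Alternative that counts only the data already flushed to store files. */
  public static long sizeFilesOnly(long fileSizeMb) {
    return fileSizeMb * MEGABYTE;
  }
}
```

For a region with 2 MB in the memstore and 10 MB in store files, the first variant yields 12582912 bytes and the second 10485760 bytes, so the two differ only by the unflushed data.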
{code}
+          LOG.debug(MessageFormat.format("Region {0} has size {1}", 
regionLoad.getNameAsString(), regionSizeBytes));
{code}
Wrap the long line; the limit is 100 characters.

The license header appears twice in RegionSizeCalculatorTest.

> Tablesplit.getLength returns 0
> ------------------------------
>
>                 Key: HBASE-10413
>                 URL: https://issues.apache.org/jira/browse/HBASE-10413
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, mapreduce
>    Affects Versions: 0.96.1.1
>            Reporter: Lukas Nalezenec
>            Assignee: Lukas Nalezenec
>         Attachments: HBASE-10413-2.patch, HBASE-10413.patch
>
>
> InputSplits should be sorted by length, but TableSplit does not contain a real 
> getLength implementation:
>   @Override
>   public long getLength() {
>     // Not clear how to obtain this... seems to be used only for sorting splits
>     return 0;
>   }
> This is causing us problems with scheduling: we have jobs that are 
> supposed to finish in a limited time, but they often get stuck in the last mapper, 
> which is working on a large region.
> Can we implement this method?
> What is the best way?
> We were thinking about estimating the size from the size of the files on HDFS.
> We would like to get the Scanner from the TableSplit, use startRow, stopRow, and 
> column families to find the corresponding region, then compute the HDFS size for 
> the given region and column family.
> Update:
> This ticket was about a production issue. I talked with the person who worked on it, 
> and he said our production issue was probably not directly caused by 
> getLength() returning 0.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
