[ 
https://issues.apache.org/jira/browse/BEAM-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067629#comment-16067629
 ] 

Jingsong Lee commented on BEAM-1531:
------------------------------------

Of course, the version with the embedded HBase server is better, since it is a 
complete mini HBase cluster. So I only changed the testReadingSplitAtFraction 
test (which only involves Scanner.iterator); the other tests remain unchanged.
I think there is a tradeoff here between test accuracy and test speed. For the 
testReadingSplitAtFraction test, if we can substantially improve the speed and 
also have a good mock (one that supports queries by startRow and stopRow), we 
can still achieve the purpose of the test, which is to test 
HBaseIO.splitAtFraction.
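
To illustrate the kind of mock meant by "query by startRow and stopRow", here 
is a minimal sketch. It is not the code from the patch. FakeRangeScan and its 
methods are hypothetical names, it uses plain Java collections rather than any 
HBase class, and it assumes ASCII row keys so that String ordering matches 
HBase's byte ordering. It follows the usual HBase Scan convention: startRow is 
inclusive, stopRow is exclusive, and an empty key means unbounded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table, used only for illustration.
class FakeRangeScan {
    // Row key -> value, kept sorted by row key like rows in an HBase table.
    private final NavigableMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Returns the row keys in [startRow, stopRow).
    // An empty startRow or stopRow leaves that side of the range unbounded.
    List<String> scan(String startRow, String stopRow) {
        List<String> result = new ArrayList<>();
        for (String key : rows.navigableKeySet()) {
            boolean afterStart = startRow.isEmpty() || key.compareTo(startRow) >= 0;
            boolean beforeStop = stopRow.isEmpty() || key.compareTo(stopRow) < 0;
            if (afterStart && beforeStop) {
                result.add(key);
            }
        }
        return result;
    }
}
```

A split test could then scan the two key ranges on either side of a split key 
and check that together they cover every row exactly once.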

I ran some tests to understand how HBaseTestingUtility is implemented. It 
starts a complete miniHBaseCluster and miniZKCluster, and the JVM has 8000+ 
loaded classes and 300+ threads while it runs, so it is very slow. I do not 
understand the details, but it probably has to do the work of a whole cluster 
inside a single JVM, which makes it run very slowly.
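
For anyone who wants to check numbers like these, the following is a rough 
sketch of one way to read the loaded class count and live thread count from 
inside a JVM with the standard java.lang.management API. JvmFootprint is a 
hypothetical helper name, and this is not necessarily how the figures above 
were obtained.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
class JvmFootprint {
    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    // Number of live threads (daemon and non-daemon) in this JVM.
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("loaded classes: " + loadedClassCount());
        System.out.println("live threads:   " + liveThreadCount());
    }
}
```

Calling these two methods from a test after the mini cluster has started 
would show how much it adds to the JVM.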

> Support dynamic work rebalancing for HBaseIO
> --------------------------------------------
>
>                 Key: BEAM-1531
>                 URL: https://issues.apache.org/jira/browse/BEAM-1531
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Ismaël Mejía
>            Assignee: Ismaël Mejía
>            Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
