[
https://issues.apache.org/jira/browse/FLINK-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578957#comment-14578957
]
Hilmi Yildirim edited comment on FLINK-2188 at 6/9/15 2:19 PM:
---------------------------------------------------------------
It takes time to embed the original TableInputFormat from HBase into the Flink
code, and unfortunately I do not have the time to do this. It would be great if
you could provide me with a code snippet that I can test.
Thanks.
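
As a starting point, here is a minimal sketch of such a test. It wraps HBase's original org.apache.hadoop.hbase.mapreduce.TableInputFormat in Flink's Hadoop-compatibility HadoopInputFormat and counts the rows; the table name "bigTable" and the exact package locations are assumptions for illustration, not code taken from this issue or its attachment.

{code:java}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.Job;

public class OriginalTableInputFormatCount {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Point HBase's own TableInputFormat at the table under test
        // ("bigTable" is a placeholder; hbase-site.xml must be on the classpath).
        Job job = Job.getInstance(HBaseConfiguration.create());
        job.getConfiguration().set(TableInputFormat.INPUT_TABLE, "bigTable");

        TableInputFormat hbaseInput = new TableInputFormat();
        hbaseInput.setConf(job.getConfiguration()); // builds the Scan from the configuration

        // Wrap the unmodified HBase input format with Flink's Hadoop compatibility layer,
        // so splits and records come from the original HBase code path.
        HadoopInputFormat<ImmutableBytesWritable, Result> wrapped =
                new HadoopInputFormat<>(hbaseInput, ImmutableBytesWritable.class, Result.class, job);

        DataSet<Tuple2<ImmutableBytesWritable, Result>> rows = env.createInput(wrapped);

        // count() triggers execution and returns the number of rows read.
        System.out.println("Row count: " + rows.count());
    }
}
{code}

If this job reproduces the Spark/Hive count while the flink-hbase TableInputFormat overcounts, that would point at Flink's split generation rather than at the cluster.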
> Reading from big HBase Tables
> -----------------------------
>
> Key: FLINK-2188
> URL: https://issues.apache.org/jira/browse/FLINK-2188
> Project: Flink
> Issue Type: Bug
> Reporter: Hilmi Yildirim
> Priority: Critical
> Attachments: flinkTest.zip
>
>
> I detected a bug when reading from a big HBase table.
> I used a cluster of 13 machines with 13 processing slots per machine, which
> gives a total of 169 processing slots. Our cluster runs CDH 5.4.1 and the
> HBase version is 1.0.0-cdh5.4.1. There is an HBase table with nearly 100
> million rows. I used Spark and Hive to count the number of rows, and both
> results are identical (nearly 100 million).
> Then I used Flink to count the number of rows. For that, I added the
> hbase-client 1.0.0-cdh5.4.1 Java API as a dependency in Maven and excluded the
> other hbase-client dependencies. The result of the Flink job is nearly 102
> million rows, 2 million more than the results of Spark and Hive. Moreover, I
> ran the Flink job multiple times, and sometimes the result fluctuates by +-5.
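
For context, a hypothetical reconstruction of the counting job described above, using the flink-hbase addon's TableInputFormat as it looked around Flink 0.9; the class name RowKeyCountJob and the table name "bigTable" are placeholders, and the attached flinkTest.zip presumably contains the actual job:

{code:java}
import org.apache.flink.addons.hbase.TableInputFormat;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple1;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyCountJob {

    /** Reads every row of the table and emits its row key. */
    public static class RowKeyInputFormat extends TableInputFormat<Tuple1<String>> {

        @Override
        protected String getTableName() {
            return "bigTable"; // placeholder table name
        }

        @Override
        protected Scan getScanner() {
            return new Scan(); // full scan, no filters
        }

        @Override
        protected Tuple1<String> mapResultToTuple(Result r) {
            return new Tuple1<>(Bytes.toString(r.getRow()));
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple1<String>> rows = env.createInput(new RowKeyInputFormat());

        // Compare this count with the Spark and Hive results.
        System.out.println("Row count: " + rows.count());
    }
}
{code}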