Have you looked at the JanusGraph source code? It also uses HBase as a
storage backend:
http://janusgraph.org/
It combines HBase with Elasticsearch for indexing. Maybe you can take some
inspiration from the architecture there.

In general, with HBase a lot depends on how the data is written to regions,
the order of the data, and the choice of row key. This in turn affects how the
data is read, including whether Flink can take advantage of data locality.
There is of course more to it, and the details depend on the use case. The
HBase documentation is rather good on this.
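
To make the row key point a bit more concrete, here is a minimal sketch in
plain Python (no HBase client involved). It builds a row key for a graph edge
with a small hash-derived prefix ("salt") so that consecutive vertex ids are
spread over several key ranges, while all out-edges of one vertex stay next to
each other. The names edge_row_key, split_points and NUM_BUCKETS are made up
for this example, and the layout is only one possible option under the
assumption that the table is pre-split into 16 regions; it is not what
JanusGraph does.

```python
import hashlib

# Assumption for this sketch: the table is pre-split into this many regions.
NUM_BUCKETS = 16


def edge_row_key(src_id: int, dst_id: int, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Build a hypothetical row key: 1 salt byte + 8-byte source + 8-byte target."""
    src_bytes = src_id.to_bytes(8, "big")
    # The salt depends only on the source vertex, so all out-edges of one
    # vertex share the same 9-byte prefix and sort next to each other.
    salt = hashlib.md5(src_bytes).digest()[0] % num_buckets
    return bytes([salt]) + src_bytes + dst_id.to_bytes(8, "big")


def split_points(num_buckets: int = NUM_BUCKETS) -> list:
    """Region boundaries that match the salt byte (one key range per bucket)."""
    return [bytes([b]) for b in range(1, num_buckets)]


if __name__ == "__main__":
    # HBase orders rows lexicographically by key bytes, like sorted() on bytes.
    keys = sorted(edge_row_key(src, dst) for src in range(4) for dst in range(3))
    for key in keys:
        print(key.hex())
```

With a layout like this, a scan over the 9-byte prefix of one vertex returns
all of its out-edges in one contiguous range, and a batch job can read the
buckets in parallel. Whether that trade-off fits depends on the access pattern.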

> On 4. Apr 2018, at 23:38, santoshg <santo...@uber.com> wrote:
> 
> Restarting this thread since it is relevant to us. We are thinking of using
> HBase/Cassandra to store graph data and then load the data from here into
> Flink/Gelly. One of the issues we are concerned about is the read
> performance. So far we tried our tests with data residing on HDFS and that
> worked fine. 
> 
> Is there any guidance on reading from HBase for batch jobs? Does anyone have
> experience with this approach (do's and don'ts, etc.)?
> 
> Thanks
> 
> 
> 
> --
> Sent from: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
