Have you looked at the JanusGraph source code? It also uses HBase as a storage backend: http://janusgraph.org/ It combines HBase with Elasticsearch for indexing. Maybe you can draw some inspiration from the architecture there.
Generally, with HBase a lot depends on how the data is written to regions, the order of the data, and the choice of row key. This in turn affects how the data is read, including in Flink, where it determines how well reads can use locality. There is of course more detail to it, and it depends on the use case. The HBase documentation is generally rather good.

> On 4. Apr 2018, at 23:38, santoshg <santo...@uber.com> wrote:
>
> Restarting this thread since it is relevant to us. We are thinking of using
> HBase/Cassandra to store graph data and then load the data from here into
> Flink/Gelly. One of the issues we are concerned about is the read
> performance. So far we tried our tests with data residing on HDFS and that
> worked fine.
>
> Is there any guidance on reading from HBase for batch jobs ? Wondering if
> any experience with this approach. Do's/Don'ts etc..
>
> Thanks
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
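
To make the row-key point from the reply a bit more concrete: HBase keeps rows sorted by the raw bytes of the row key, and each region holds one contiguous key range. Below is a minimal sketch of one possible key layout for graph edges; it is not taken from JanusGraph or from any Flink connector. It uses plain Python and no HBase client, and a sorted list stands in for the table. The names `edge_row_key` and `scan_range_for_vertex` and the layout itself (8-byte big-endian source id followed by 8-byte big-endian destination id) are assumptions made for illustration only.

```python
import struct
from typing import List, Tuple


def edge_row_key(src: int, dst: int) -> bytes:
    """Hypothetical row key: fixed-width big-endian source id, then destination id.

    Fixed-width big-endian encoding makes byte order match numeric order,
    so all edges leaving one vertex sit next to each other.
    """
    return struct.pack(">QQ", src, dst)


def scan_range_for_vertex(src: int) -> Tuple[bytes, bytes]:
    """Start key (inclusive) and stop key (exclusive) covering all edges of src.

    Assumes src is below 2**64 - 1 so that src + 1 still fits in 8 bytes.
    """
    return struct.pack(">Q", src), struct.pack(">Q", src + 1)


def simulate_scan(table: List[bytes], start: bytes, stop: bytes) -> List[bytes]:
    """Stand-in for a range scan: keep keys with start <= key < stop."""
    return [key for key in table if start <= key < stop]


# A toy edge list; sorting the keys imitates HBase's byte-wise row ordering.
edges = [(2, 7), (1, 300), (1, 5), (10, 1), (2, 1)]
table = sorted(edge_row_key(s, d) for s, d in edges)

start, stop = scan_range_for_vertex(1)
hits = [struct.unpack(">QQ", key) for key in simulate_scan(table, start, stop)]
print(hits)  # [(1, 5), (1, 300)]

# Decimal-string keys would break numeric ordering: "10" sorts before "2".
print(sorted(["2", "10", "1"]))  # ['1', '10', '2']
```

With a layout like this, the out-edges of a vertex come back from one contiguous range scan. The trade-off is that sequential vertex ids can concentrate writes on a single region; whether that matters, and whether a salted or hashed prefix is the better choice, depends on the use case, as noted in the reply.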