Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
Have you checked the JanusGraph source code? It also uses HBase as a storage backend: http://janusgraph.org/

It combines HBase with Elasticsearch for indexing. Maybe you can draw inspiration from the architecture there.

Generally, with HBase a lot depends on how the data is written to regions, the order of the data, and the choice of the right row key. This in turn affects how the data is read, including whether Flink can make use of locality. There is of course more detail to it, and it depends on the use case. The HBase documentation is generally rather good.

> On 4. Apr 2018, at 23:38, santoshg wrote:
>
> Restarting this thread since it is relevant to us. We are thinking of using
> HBase/Cassandra to store graph data and then load the data from here into
> Flink/Gelly. One of the issues we are concerned about is the read
> performance. So far we tried our tests with data residing on HDFS and that
> worked fine.
>
> Is there any guidance on reading from HBase for batch jobs ? Wondering if
> any experience with this approach. Do's/Don'ts etc..
>
> Thanks
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
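As a rough illustration of the row-key remark in the reply above, here is a minimal Python sketch of one possible edge row-key layout. Everything in it is an assumption made for illustration: the layout (salt byte, source id, target id), the names NUM_BUCKETS, salt, edge_row_key and vertex_prefix, and the bucket count. It is not the schema used by JanusGraph or HGraphDB. The only HBase behaviour it relies on is that rows are ordered lexicographically by their key bytes, which Python's bytes comparison mirrors.

```python
import struct
import zlib

# Hypothetical number of salt buckets, e.g. one per pre-split region.
NUM_BUCKETS = 16


def salt(vertex_id: int) -> int:
    """Stable bucket for a vertex, so writes are spread over several key ranges."""
    return zlib.crc32(struct.pack(">Q", vertex_id)) % NUM_BUCKETS


def edge_row_key(src: int, dst: int) -> bytes:
    """Hypothetical layout: salt (1 byte) + source id (8 bytes) + target id (8 bytes).

    Big-endian encoding keeps the numeric order of non-negative ids
    identical to the byte order of the keys.
    """
    return struct.pack(">BQQ", salt(src), src, dst)


def vertex_prefix(src: int) -> bytes:
    """Key prefix shared by all out-edges of one source vertex."""
    return struct.pack(">BQ", salt(src), src)


edges = [(7, 300), (2, 9), (7, 1), (256, 7), (7, 42), (2, 7)]

# Sorting bytes in Python is lexicographic on unsigned byte values,
# which is the same ordering HBase applies to row keys.
sorted_keys = sorted(edge_row_key(s, d) for s, d in edges)

# All out-edges of vertex 7 share one prefix, so they sit next to each
# other in key order and could be fetched with a single prefix scan.
prefix = vertex_prefix(7)
out_edges_of_7 = [struct.unpack(">BQQ", k)[2] for k in sorted_keys if k.startswith(prefix)]
print(out_edges_of_7)  # → [1, 42, 300]
```

The trade-off in this sketch is the usual one for salted keys: the salt spreads writes across key ranges, but reading all vertices in id order then needs one scan per bucket rather than a single scan.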
Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
Restarting this thread since it is relevant to us. We are thinking of using HBase/Cassandra to store graph data and then loading the data from there into Flink/Gelly. One of the issues we are concerned about is read performance. So far we have run our tests with data residing on HDFS, and that worked fine.

Is there any guidance on reading from HBase for batch jobs? I am wondering whether anyone has experience with this approach: dos and don'ts, etc.

Thanks

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
Thank you for sharing!

On 28 July 2017 at 05:01, Robert Yokota wrote:

> Also Google Cloud Bigtable has such a page at
> https://cloud.google.com/bigtable/docs/integrations
Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
Also Google Cloud Bigtable has such a page at https://cloud.google.com/bigtable/docs/integrations

On Thu, Jul 27, 2017 at 6:57 PM, Robert Yokota wrote:

> One thing I really appreciate about HBase is its flexibility. It doesn't
> enforce a schema, but also doesn't prevent you from building a schema layer
> on top. It is very customizable, allowing you to push arbitrary code to
> the server in the form of filters and coprocessors.
>
> Not having such higher-layer features built into HBase allows it to remain
> flexible, but it does have a down-side. One complaint is that for a new
> user coming to HBase, who perhaps does want to work with things like query
> languages, schemas, secondary indices, transactions, and so forth, it can
> be daunting to research and understand what other projects in the HBase
> ecosystem can help him/her, how others have used such projects, and under
> what use cases each project might be successful or not.
>
> Perhaps a good start would be something like an "HBase ecosystem" page at
> the website that would list projects like Phoenix, Tephra, and others in
> the HBase ecosystem. The Apache TinkerPop site has a listing of projects
> in its ecosystem at http://tinkerpop.apache.org. I think new users
> coming to HBase aren't even aware of the larger ecosystem, and sometimes
> end up selecting alternative data stores as a result.
>
> P.S. I'm using HBase 1.1.2
Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
One thing I really appreciate about HBase is its flexibility. It doesn't enforce a schema, but it also doesn't prevent you from building a schema layer on top. It is very customizable, allowing you to push arbitrary code to the server in the form of filters and coprocessors.

Not having such higher-layer features built into HBase allows it to remain flexible, but it does have a downside. One complaint is that a new user coming to HBase, who perhaps does want to work with things like query languages, schemas, secondary indices, transactions, and so forth, can find it daunting to research and understand which other projects in the HBase ecosystem can help them, how others have used such projects, and for which use cases each project might or might not be successful.

Perhaps a good start would be something like an "HBase ecosystem" page on the website that would list projects like Phoenix, Tephra, and others in the HBase ecosystem. The Apache TinkerPop site has a listing of projects in its ecosystem at http://tinkerpop.apache.org. I think new users coming to HBase often aren't even aware of the larger ecosystem, and sometimes end up selecting alternative data stores as a result.

P.S. I'm using HBase 1.1.2

On Thu, Jul 27, 2017 at 5:42 PM, Ted Yu wrote:

> Interesting blog.
>
> From your experience, is there anything on hbase side which you see room
> for improvement ?
>
> Which hbase release are you using ?
>
> Cheers
Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
Interesting blog.

From your experience, is there anything on the HBase side where you see room for improvement?

Which HBase release are you using?

Cheers

On Thu, Jul 27, 2017 at 3:11 PM, Robert Yokota wrote:

> In case anyone is interested, I wrote a blog on how to analyze graphs
> stored in HBase with Apache Flink Gelly:
>
> https://yokota.blog/2017/07/27/graph-analytics-on-hbase-with-hgraphdb-and-apache-flink-gelly/