Hi Stephen! Have you taken a look at Apache Gora? It uses Avro for its data model, so it supports nested data structures, and it can persist to a variety of backing stores, including HBase.
-Sean

On Tue, Sep 9, 2014 at 4:20 PM, Stephen Boesch <[email protected]> wrote:

> Thanks Michael, yes cells are byte[]; therefore, storing JSON or other
> document structures is always possible. Our use cases include querying
> individual elements in the structure - so that would require reconstituting
> the documents and then parsing them for every row. We probably are not
> headed in the direction of HBase for those use cases: but we are trying to
> make that determination after having carefully considered the extent of the
> mismatch.
>
> 2014-09-09 13:37 GMT-07:00 Michael Segel <[email protected]>:
>
> > You do realize that everything you store in Hbase are byte arrays, right?
> > That is each cell is a blob.
> >
> > So you have the ability to create nested structures like… JSON records? ;-)
> >
> > So to your point. You can have a column A which represents a set of values.
> >
> > This is one reason why you shouldn’t think of HBase in terms of being
> > relational. In fact for Hadoop, you really don’t want to think in terms of
> > relational structures.
> > Think more of Hierarchical.
> >
> > So yes, you can do what you want to do…
> >
> > HTH
> >
> > -Mike
> >
> > On Sep 8, 2014, at 10:06 PM, Stephen Boesch <[email protected]> wrote:
> >
> > > While I am aware that HBase does not have native support for nested
> > > structures, surely there are some of you that have thought through this
> > > use case carefully.
> > >
> > > Our particular use case is likely having single digit nested layers with
> > > tens to hundreds of items in the lists at each level.
> > >
> > > An example would be a
> > >
> > > top Level 300 items
> > > middle level : 1 to 100 items ("1 value" may indicate a single value as
> > > opposed to a list)
> > > third level: 1 to 50 items
> > > fourth level 1 to 20 items
> > >
> > > The column names are likely known ahead of time- which may or may not
> > > matter for hbase. We could model the above structure in a Parquet File or
> > > in Hive (with nested struct's)- but we would like to consider whether
> > > HBase might also be an option.
> > >
>

-- Sean
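
To make the trade-off in the thread a bit more concrete, here is a rough sketch in plain Python (no HBase client involved). It compares storing a nested document as one JSON blob in a single cell with flattening it into one cell per leaf, keyed by a dotted path. The `flatten` helper, the dotted qualifier scheme, and the sample `doc` are all made up for illustration; a plain dict stands in for the cells of one row, and this is not how Gora or HBase itself lays data out.

```python
import json


def flatten(doc, prefix=""):
    """Yield (qualifier, value_bytes) pairs for every leaf of a nested dict/list.

    Hypothetical scheme: path segments are joined with '.', so keys that
    contain '.' would be ambiguous, and empty dicts/lists produce no cells.
    """
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}{key}.")
    elif isinstance(doc, list):
        for index, value in enumerate(doc):
            yield from flatten(value, f"{prefix}{index}.")
    else:
        # Leaf: drop the trailing separator, store the value as JSON bytes.
        yield prefix[:-1], json.dumps(doc).encode("utf-8")


# Illustrative nested document (far smaller than the sizes in the thread).
doc = {"top": [{"middle": [{"third": [1, 2]}]}, {"middle": 7}]}

# Option 1: the whole document as a single byte[] cell.
blob_cell = json.dumps(doc).encode("utf-8")

# Option 2: one byte[] cell per leaf; the dict stands in for one row's columns.
cells = dict(flatten(doc))

# Reading one element from the blob means parsing the entire document.
value_from_blob = json.loads(blob_cell)["top"][0]["middle"][0]["third"][1]

# Reading the same element from the flattened layout touches a single cell.
value_from_cell = json.loads(cells["top.0.middle.0.third.1"])
```

With the blob layout every element lookup pays for a full parse of the row, which is the mismatch Stephen describes. The flattened layout avoids that, but at the sizes mentioned in the thread it would mean a very large number of columns per row, so it is a trade-off rather than a free win.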
