Hi Stephen!

Have you taken a look at Apache Gora? It uses Avro for its data model,
which supports nested data structures, and it can persist to a variety of
backing stores, including HBase.
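
To make the trade-off in the quoted thread concrete, here is a minimal sketch of the "store the whole document as a blob" approach. It is plain Python with no HBase client involved: the `table` dict only stands in for a table, and the names `put_document`, `get_element`, and the `d:doc` qualifier are made up for illustration. It shows that reading a single nested element means decoding and parsing the entire cell first.

```python
import json

# Hypothetical stand-in for an HBase table:
# row key -> {column qualifier -> cell bytes}. Not a real client API.
table = {}

def put_document(row_key, doc):
    """Store the whole nested document as one JSON blob in a single cell."""
    table[row_key] = {b"d:doc": json.dumps(doc).encode("utf-8")}

def get_element(row_key, path):
    """Read one nested element; the entire blob is decoded and parsed first."""
    doc = json.loads(table[row_key][b"d:doc"].decode("utf-8"))
    for key in path:
        doc = doc[key]
    return doc

# A small nested document with made-up level names.
put_document(b"row1", {"top": [{"middle": [{"third": [1, 2, 3]}]}]})
value = get_element(b"row1", ["top", 0, "middle", 0, "third", 2])
```

The parse cost in `get_element` is paid for every row touched, which is the mismatch Stephen describes for element-level queries.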

-Sean

On Tue, Sep 9, 2014 at 4:20 PM, Stephen Boesch <[email protected]> wrote:

> Thanks Michael. Yes, cells are byte[], so storing JSON or other
> document structures is always possible. Our use cases include querying
> individual elements in the structure, so that would require reconstituting
> the documents and then parsing them for every row. We are probably not
> headed in the direction of HBase for those use cases, but we are trying to
> make that determination after carefully considering the extent of the
> mismatch.
>
> 2014-09-09 13:37 GMT-07:00 Michael Segel <[email protected]>:
>
> > You do realize that everything you store in HBase is a byte array, right?
> > That is, each cell is a blob.
> >
> > So you have the ability to create nested structures like… JSON records? ;-)
> >
> > So, to your point: you can have a column A which represents a set of values.
> >
> > This is one reason why you shouldn’t think of HBase in terms of being
> > relational. In fact, for Hadoop, you really don’t want to think in terms of
> > relational structures.
> > Think more in terms of hierarchical structures.
> >
> > So yes, you can do what you want to do…
> >
> > HTH
> >
> > -Mike
> >
> > On Sep 8, 2014, at 10:06 PM, Stephen Boesch <[email protected]> wrote:
> >
> > > While I am aware that HBase does not have native support for nested
> > > structures, surely some of you have thought through this use case
> > > carefully.
> > >
> > > Our particular use case is likely to have a single-digit number of nested
> > > layers, with tens to hundreds of items in the lists at each level.
> > >
> > > An example would be:
> > >
> > > top level: 300 items
> > > middle level: 1 to 100 items ("1 value" may indicate a single value as
> > > opposed to a list)
> > > third level: 1 to 50 items
> > > fourth level: 1 to 20 items
> > >
> > > The column names are likely known ahead of time, which may or may not
> > > matter for HBase. We could model the above structure in a Parquet file
> > > or in Hive (with nested structs), but we would like to consider whether
> > > HBase might also be an option.
> >
> >
>



-- 
Sean
