RE: Time-series schema

Jonathan Gray Fri, 29 Oct 2010 21:58:28 -0700

For small datasets and workloads, yes.  But HBase works well at large scale.


> -----Original Message-----
> From: Sean Bigdatafun [mailto:[email protected]]
> Sent: Friday, October 29, 2010 9:54 PM
> To: [email protected]
> Subject: Re: Time-series schema
> 
> For a scenario where TableIndexed fits, would an RDBMS not be a better
> choice?
> 
> Sean
> 
> On Fri, Oct 29, 2010 at 7:01 PM, Jonathan Gray <[email protected]>
> wrote:
> 
> > >
> > > > There is no such atomicity provided by HBase. Recent TableIndexed may
> > > > help, but I have not personally tried it.
> > > >
> > > >
> > >
> > > Uhm actually there is. :-)
> > >
> > > Like I said in the other post, when you insert the rows, you can fetch
> > > the local time on the node and use it when you insert the row as the
> > > time stamp for the row.
> > > So you can get an 'atomic' write.
> > >
> > > Just a word of advice. Make sure you use the right System method. Had a
> > > developer accidentally use nanoTime() and now you have rows in a table
> > > that don't make sense...
> > >
> > >
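A minimal sketch of the timestamp pitfall described above, in plain Java with no HBase dependency. System.currentTimeMillis() returns milliseconds since the Unix epoch, while System.nanoTime() returns nanoseconds from an arbitrary origin and is only meaningful for measuring elapsed time. The class name and the plausibleEpochMillis guard are hypothetical, for illustration only; the guard is a coarse range check and cannot catch every bad value.

```java
class TimestampSource {
    // Epoch milliseconds for 2000-01-01T00:00:00Z; used only as a lower sanity bound.
    static final long YEAR_2000_MILLIS = 946_684_800_000L;

    /** Wall-clock time in epoch milliseconds, the kind of value a row timestamp should carry. */
    static long rowTimestamp() {
        return System.currentTimeMillis();
    }

    /** Hypothetical guard: rejects values outside a plausible epoch-millisecond range. */
    static boolean plausibleEpochMillis(long ts) {
        long now = System.currentTimeMillis();
        return ts >= YEAR_2000_MILLIS && ts <= now + 60_000L; // allow one minute of clock skew
    }

    public static void main(String[] args) {
        long wallClock = rowTimestamp();
        long elapsedCounter = System.nanoTime(); // arbitrary origin: not a wall-clock time
        System.out.println("currentTimeMillis: " + wallClock
                + " plausible=" + plausibleEpochMillis(wallClock));
        System.out.println("nanoTime:          " + elapsedCounter
                + " plausible=" + plausibleEpochMillis(elapsedCounter));
    }
}
```

The two values differ in both unit and origin, so rows stamped with nanoTime() sort and expire unpredictably relative to rows stamped with wall-clock time.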
> >
> > Using the same timestamp does not make the writes to different rows atomic.
> >
> > Atomic generally means that in addition to something seeming to appear at
> > the same instant, the entire thing must be successful or none of it at all.
> > That is really a critical element of atomicity and why you need some type of
> > transactions or transaction log.  A client that does two separate Put
> > operations on two different rows can fail at any point.  The existing
> > TableIndexed implementation used OCC (optimistic concurrency control) and
> > allowed for things to be undone if an operation failed.
> >
> > There should be some new work around indexing using Coprocessors in the
> > next few months.  I'm excited about the prospects there.  Rather than the
> > OCC approach, I'm thinking more like asynchronous, eventually consistent
> > secondary indexing.
> >
> > JG
> >
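The all-or-nothing point in the quoted message can be illustrated with a small in-memory sketch. This is not HBase code and not how TableIndexed was implemented: the map stands in for a table, and the class and method names (TwoRowWrite, writeBoth, writeBothOrUndo) are hypothetical. It only shows that two independent writes can leave partial state behind, and that undoing the first write on failure is one way to avoid that.

```java
import java.util.HashMap;
import java.util.Map;

class TwoRowWrite {
    // Stand-in for a table: row key -> value. Not an HBase API.
    final Map<String, String> rows = new HashMap<>();

    /** Simulates a single-row write that can be made to fail. */
    void put(String rowKey, String value, boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated failure writing " + rowKey);
        }
        rows.put(rowKey, value);
    }

    /** Two independent writes: if the second fails, the first stays visible. */
    void writeBoth(String dataRow, String indexRow, String value, boolean failSecond) {
        put(dataRow, value, false);
        put(indexRow, dataRow, failSecond);
    }

    /** The same two writes, but the first is undone when the second fails. */
    void writeBothOrUndo(String dataRow, String indexRow, String value, boolean failSecond) {
        String previous = rows.get(dataRow); // remember the old state so it can be restored
        put(dataRow, value, false);
        try {
            put(indexRow, dataRow, failSecond);
        } catch (IllegalStateException e) {
            if (previous == null) {
                rows.remove(dataRow);
            } else {
                rows.put(dataRow, previous);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        TwoRowWrite plain = new TwoRowWrite();
        try {
            plain.writeBoth("data-1", "index-1", "v", true);
        } catch (IllegalStateException e) {
            // expected: the index write was made to fail
        }
        System.out.println("plain, after failure: " + plain.rows);

        TwoRowWrite undo = new TwoRowWrite();
        try {
            undo.writeBothOrUndo("data-1", "index-1", "v", true);
        } catch (IllegalStateException e) {
            // expected: the index write was made to fail
        }
        System.out.println("undo, after failure:  " + undo.rows);
    }
}
```

After the simulated failure, the plain variant still holds the data row without its index row, while the undo variant holds neither. The undo step itself runs in the client, so a client crash between the two writes would still leave partial state, which is why a transaction log or similar mechanism is needed for real atomicity.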
