checkAndSave does look nice, but optimistic concurrency control is based on the assumption that most database transactions <http://en.wikipedia.org/wiki/Database_transaction> don't conflict with other transactions.
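A minimal sketch of that optimistic pattern, assuming the later HBase client API (checkAndPut rather than the checkAndSave discussed in this thread) and made-up table/column names: read the current cell value, then ask the server to apply the write only if the cell still holds that value, retrying on conflict.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class OptimisticUpdate {
    // Hypothetical table/column names, purely for illustration.
    private static final byte[] CF  = Bytes.toBytes("d");
    private static final byte[] COL = Bytes.toBytes("balance");

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("accounts"))) {

            byte[] row = Bytes.toBytes("account-42");
            boolean applied = false;
            for (int attempt = 0; attempt < 5 && !applied; attempt++) {
                // Read the current value (the state we are optimistic about).
                Result current = table.get(new Get(row).addColumn(CF, COL));
                byte[] expected = current.getValue(CF, COL);
                long newBalance = (expected == null ? 0L : Bytes.toLong(expected)) + 100L;

                Put put = new Put(row).addColumn(CF, COL, Bytes.toBytes(newBalance));
                // Atomic compare-and-set on the server: succeeds only if the
                // cell still contains `expected`; no explicit row lock needed.
                applied = table.checkAndPut(row, CF, COL, expected, put);
            }
            System.out.println(applied ? "update applied" : "gave up after retries");
        }
    }
}

checkAndPut returns false when another writer got in first, which is the "optimistic lock failure" case; the loop simply re-reads and tries again.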
In most cases, yes, but what happens if we are not in an optimistic mode?

2009/6/16 Ryan Rawson <[email protected]>

> The IPC threading can become an issue on a really busy server. There are by
> default 10 IPC listener threads; once you have 10 concurrent operations you
> must wait for one to end before starting the next one. You can raise this if it
> ends up becoming a problem. It has to be bounded or else resource consumption
> will eventually crash the server.
>
> The only area where this becomes a problem is explicit row locking - if you take
> out a lock in one client, then a different client comes to get the same lock,
> the second client has to wait, and while waiting it consumes an IPC thread.
>
> But you shouldn't need to use explicit row locking.
> - Mutations (puts, deletes) take out a row lock and then release it.
> - There is a checkAndSave() which gives you some kinds of optimistic concurrency.
> - You can use the multi-version mechanism to test for optimistic lock failure.
> - atomicIncrement allows you to maintain sequences/counters without the use of locks.
>
> I would recommend against designing a schema/application that uses row locks.
> Use one of the other excellent mechanisms provided. If your needs are really
> above and beyond those, let's talk in detail. A column-oriented store has all
> sorts of powerful things available to it that RDBMSs don't have.
>
> On Tue, Jun 16, 2009 at 1:22 PM, Alexandre Jaquet <[email protected]> wrote:
>
> > Thanks Ryan for your explanation.
> >
> > But as I understand it, IPC calls can generate deadlock through over-consumption
> > of services? What is the exact role of a region server?
> >
> > Thanks again.
> >
> > 2009/6/16 Ryan Rawson <[email protected]>
> >
> > > Hey,
> > >
> > > So the issue there was that when you are using the built-in row-lock support,
> > > the waiters for a row lock use up an IPC responder thread. There are only so
> > > many of them. Then your clients start failing as regionservers are busy
> > > waiting for locks to be released.
> > >
> > > The suggestion there was to use zookeeper-based locks. The suggestion is
> > > still valid.
> > >
> > > I don't get your question about whether timestamp is better than "Long
> > > versioning". A timestamp is a long - its default value is
> > > System.currentTimeMillis(), thus it's the milliseconds since epoch 1970 - a
> > > slight variation on time_t.
> > >
> > > Generally I would recommend people avoid setting timestamps unless they have
> > > special needs. Timestamps order the multiple versions for a given row/column,
> > > so if you 'mess it up', you get wrong data returned.
> > >
> > > I personally believe that timestamps are not necessarily the best way to
> > > store time-series data. While in 0.20 we have better query mechanisms (all
> > > values between X and Y is the general mechanism), you can probably do better
> > > with indexes.
> > >
> > > -ryan
> > >
> > > On Tue, Jun 16, 2009 at 1:04 PM, Alexandre Jaquet <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > I'm also evaluating HBase for some applications and found an old post about
> > > > transactions and concurrent access:
> > > >
> > > > http://osdir.com/ml/java.hadoop.hbase.user/2008-05/msg00169.html
> > > >
> > > > Is a timestamp really better than "Long versioning"?
> > > >
> > > > Any workaround?
> > > > 2009/6/16 Xinan Wu <[email protected]>
> > > >
> > > > > I am aware that inserting data into HBase in random timestamp order
> > > > > results in indeterminate results.
> > > > >
> > > > > e.g. comments here
> > > > > https://issues.apache.org/jira/browse/HBASE-1249#action_12682369
> > > > >
> > > > > I've personally experienced indeterminate results before when I insert
> > > > > in random timestamp order (i.e., multiple versions with the same timestamp
> > > > > in the same cell, out-of-order timestamps when getting multiple versions).
> > > > >
> > > > > In other words, we don't want to go back in time when inserting cells.
> > > > > Deletion is ok. But is updating pretty much the same story as inserting?
> > > > >
> > > > > i.e., if I make sure the timestamp already exists in the cell, and then I
> > > > > _update_ it with that timestamp (and the same value length), sometimes
> > > > > HBase still just inserts a new version without touching the old one, and
> > > > > of course the timestamps of this cell become out of order. Even if I
> > > > > delete all versions in that cell and reinsert in time order, the result
> > > > > is still out of order. I assume that if I do a major compact between
> > > > > deleting everything and reinserting, it would be ok, but that's not a good
> > > > > solution. Is there any good way to update a version of a cell in the
> > > > > past? Or will that simply not work?
> > > > >
> > > > > Thanks,
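Ryan's point above about maintaining sequences/counters without locks (atomicIncrement) can be sketched with the later client API's incrementColumnValue; the table and column names here are invented for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CounterExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("metrics"))) {

            byte[] row = Bytes.toBytes("page:/index.html");
            byte[] cf  = Bytes.toBytes("d");
            byte[] col = Bytes.toBytes("hits");

            // Atomic server-side increment: no read-modify-write on the client,
            // no explicit row lock, safe under concurrent callers.
            long newValue = table.incrementColumnValue(row, cf, col, 1L);
            System.out.println("hit count is now " + newValue);
        }
    }
}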

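On the timestamp/versioning questions, here is a sketch (again with the later client API and invented names) of the difference between letting the server assign timestamps and setting them explicitly, plus a multi-version read. As the thread describes, explicit out-of-order or reused timestamps are what lead to the indeterminate results, so they are best avoided unless genuinely needed.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class VersionsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("readings"))) {

            byte[] row = Bytes.toBytes("sensor-7");
            byte[] cf  = Bytes.toBytes("d");
            byte[] col = Bytes.toBytes("temp");

            // Default path: let the server assign the timestamp
            // (System.currentTimeMillis()), which keeps versions in insert order.
            table.put(new Put(row).addColumn(cf, col, Bytes.toBytes("21.5")));

            // Explicit timestamp: only do this if the application really needs
            // to control version ordering; out-of-order or duplicate timestamps
            // are where the indeterminate results described above come from.
            long ts = System.currentTimeMillis() - 60_000L;
            table.put(new Put(row).addColumn(cf, col, ts, Bytes.toBytes("20.9")));

            // Read back up to 3 versions (assuming the column family keeps that
            // many), newest timestamp first.
            Get get = new Get(row).addColumn(cf, col);
            get.setMaxVersions(3);
            Result result = table.get(get);
            for (Cell cell : result.rawCells()) {
                System.out.println(cell.getTimestamp() + " -> "
                        + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
    }
}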