Re: Is HBase RPC-Handling idempotent for reads?

Jerry He Sun, 09 Apr 2017 23:23:52 -0700

Yes. In the context of the underlying physical region or database, a read
is idempotent.



Thanks

Jerry
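
A minimal sketch of the distinction discussed in this thread, using a toy
versioned cell rather than any real HBase class (MvccReadSketch, put, get and
currentReadPoint are all hypothetical names). It assumes a read is served at a
"read point" and sees only writes committed at or before it. Under that
assumption, re-executing a read never changes the store, but a re-execution
that picks up a newer read point may return a newer value:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only, not HBase code: a single cell with versioned values. */
class MvccReadSketch {
    /** One committed version: the value and the write number that produced it. */
    static final class Version {
        final long writeNumber;
        final String value;
        Version(long writeNumber, String value) {
            this.writeNumber = writeNumber;
            this.value = value;
        }
    }

    private final List<Version> versions = new ArrayList<>();
    private long lastCommitted = 0;

    /** Commits a new value and returns its write number. */
    long put(String value) {
        lastCommitted++;
        versions.add(new Version(lastCommitted, value));
        return lastCommitted;
    }

    /** The read point a newly started read would receive. */
    long currentReadPoint() {
        return lastCommitted;
    }

    /** Returns the newest value visible at readPoint; never modifies the store. */
    String get(long readPoint) {
        String result = null;
        for (Version v : versions) {
            if (v.writeNumber <= readPoint) {
                result = v.value;
            }
        }
        return result;
    }

    int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) {
        MvccReadSketch cell = new MvccReadSketch();
        cell.put("v1");
        long t1 = cell.currentReadPoint(); // read starts at t1, then is aborted
        cell.put("v2");                    // a concurrent write lands in between
        long t2 = cell.currentReadPoint(); // read is re-executed at t2

        System.out.println(cell.get(t1));        // what the aborted read would have seen
        System.out.println(cell.get(t2));        // what the re-executed read sees
        System.out.println(cell.get(t2));        // repeating it gives the same answer
        System.out.println(cell.versionCount()); // the reads did not change the store
    }
}
```

The sketch deliberately leaves out the resources a real read holds (scanners,
locks), which is the cleanup concern raised further down the thread.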

On Apr 9, 2017 9:15 PM, "Yu Li" <[email protected]> wrote:

> Correct me if I'm wrong, but I think we should consider only the single
> operation itself when checking whether it's idempotent. Similar to the
> Wikipedia example <https://en.wikipedia.org/wiki/Idempotence#Examples>:
> "A function looking up a customer's name and address in a database
> <https://en.wikipedia.org/wiki/Database> is typically idempotent, since
> this will not cause the database to change", I think all Get/MultiGet/Scan
> operations in HBase are idempotent.
>
> About "speculative rpc handling", I doubt it would help in HBase.
> Normally, if a request has already arrived at the server but executes
> slowly, the problem might be:
> 1. The server is too busy and requests get queued
> 2. The processing itself is slow due to the request pattern or some
> hardware failure
> I don't think speculative execution of the request could help in either
> of the above cases. It's different from speculative task execution in MR:
> there we could choose another node to execute the task, while here we have
> no choice.
>
> OTOH, we already have timeout mechanisms to make sure server resources
> won't be wasted:
> 1. For scan
>     - When request handling times out, the server will stop further
> processing; refer to RSRpcServices#getTimeLimit and
> ScannerContext#checkTimeLimit
>     - If the client goes away during processing, the server will also stop
> processing; check the SimpleRpcServer#disconnectSince and
> RegionScannerImpl#nextInternal methods for more details.
>
> 2. For single Get
>     - Controlled by rpc and operation timeout
>
> 3. For MultiGet
>     - I think this is something we could improve. On the client side we
> have a timeout mechanism, but on the server side there seems to be no
> corresponding interrupt logic.
>
>
> Best Regards,
> Yu
>
> On 10 April 2017 at 11:12, Jerry He <[email protected]> wrote:
>
> > Again, it depends on how you abort, and 'idempotent' can have different
> > definitions.
> >
> > For example, even if you are only concerned about reads,
> > there are resources on the HRegion that the read touches or acquires
> > (scanner, lock, mvcc etc.) that hopefully will be cleaned up/released
> > with the abort.
> > Otherwise you may leave it in a bad/inconsistent state.
> >
> > Thanks.
> >
> > Jerry
> >
> >
> > On Sun, Apr 9, 2017 at 7:14 PM, 张铎(Duo Zhang) <[email protected]>
> > wrote:
> >
> > > I think this depends on how you model the problem. On the server side,
> > > if you re-execute a read operation with a new mvcc, then you may read a
> > > value that should not be visible under the old mvcc. If you define this
> > > as an error then I think there will be conflicts.
> > >
> > > But on the client side, there is no guarantee that the request you send
> > > first will be executed first. So as long as the read request has not
> > > returned, I think it is OK to read a value written by a write request
> > > that was sent after the read request?
> > >
> > > Thanks.
> > >
> > > 2017-04-10 9:52 GMT+08:00 杨苏立 Yang Su Li <[email protected]>:
> > >
> > > > We are only concerned about read operations here. Are you suggesting
> > > > they are completely idempotent?
> > > > Are there any read-after-write conflicts?
> > > >
> > > > Thanks
> > > >
> > > > Suli
> > > >
> > > > On Sun, Apr 9, 2017 at 8:48 PM, 张铎(Duo Zhang) <[email protected]>
> > > > wrote:
> > > >
> > > > > It depends on how you abort the rpc request. For HBase, there will
> > > > > be no write conflict, but a write operation can only finish once all
> > > > > write operations with a lower mvcc number have finished. So if you
> > > > > just stop a write operation without recovering the mvcc (I do not
> > > > > know how to recover, but I think you need to do something...) then
> > > > > the writes will be stuck.
> > > > >
> > > > > And one more thing: for a read operation you may interrupt it at any
> > > > > time, but for a write operation, I do not think you can re-execute it
> > > > > with a new mvcc number if the WAL entry has already been flushed out.
> > > > > That means the re-execution process will be different if you abort
> > > > > the write operation at different stages.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > 2017-04-10 6:47 GMT+08:00 杨苏立 Yang Su Li <[email protected]>:
> > > > >
> > > > > > We are trying to implement speculative rpc handling for our
> > > > > > workloads. So we want to allow an RPC handler to stop executing an
> > > > > > RPC call, put it back in the queue, and later re-execute it.
> > > > > >
> > > > > > Suppose at time t1 we execute an RPC call halfway, abort, and put
> > > > > > the call back in the queue.
> > > > > > Then at time t2 another RPC handler picks up the call and
> > > > > > re-executes it.
> > > > > > I understand that we might get a different mvcc number and
> > > > > > different results at t2 compared to executing it at t1.
> > > > > > My question is: would this situation be any different from the
> > > > > > situation where the call was never executed at t1 and is executed
> > > > > > at t2 for the first time?
> > > > > >
> > > > > >
> > > > > > My guess is that since at t1 we may already have gotten an mvcc
> > > > > > number, it might potentially cause some write conflicts and force
> > > > > > certain write operations to retry. But correctness-wise, is there
> > > > > > any difference?
> > > > > >
> > > > > > Thanks a lot!
> > > > > >
> > > > > > Suli
> > > > > >
> > > > > >
> > > > > > On Sun, Apr 9, 2017 at 5:14 PM, Jerry He <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > I don't know what your intention and your context are.
> > > > > > >
> > > > > > > You may get a different mvcc number and get different results
> > > > > > > next time around if there are concurrent writes.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jerry
> > > > > > >
> > > > > > > On Sun, Apr 9, 2017 at 12:48 PM 杨苏立 Yang Su Li <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am wondering, for read requests like Get/MultiGet/Scan, is
> > > > > > > > the RPC handling idempotent in HBase?
> > > > > > > >
> > > > > > > > More specifically, if in the middle of RPC handling we stop the
> > > > > > > > handling thread, put the RPC call back in the queue, and later
> > > > > > > > another RPC handler picks up this call and starts all over
> > > > > > > > again, will the result be the same as if this call were being
> > > > > > > > handled for the first time now? Or are there any unexpected
> > > > > > > > side effects?
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Suli
> > > > > > > >
> > > > > > > > --
> > > > > > > > Suli Yang
> > > > > > > >
> > > > > > > > Department of Physics
> > > > > > > > University of Wisconsin Madison
> > > > > > > >
> > > > > > > > 4257 Chamberlin Hall
> > > > > > > > Madison WI 53703
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Suli Yang
> > > > > >
> > > > > > Department of Physics
> > > > > > University of Wisconsin Madison
> > > > > >
> > > > > > 4257 Chamberlin Hall
> > > > > > Madison WI 53703
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Suli Yang
> > > >
> > > > Department of Physics
> > > > University of Wisconsin Madison
> > > >
> > > > 4257 Chamberlin Hall
> > > > Madison WI 53703
> > > >
> > >
> >
>
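
Yu Li's remark in the quoted thread above, that scans check a time limit
while processing but MultiGet appears to lack a comparable server-side
interrupt, can be illustrated with a small sketch. This is a hypothetical
stand-alone example, not HBase code: DeadlineMultiGetSketch, BatchResult and
multiGet are invented names, and the lookup function stands in for whatever
actually serves a single Get. The idea is only that a batch loop can check a
deadline between items and stop early:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch, not HBase code: a batch of gets that stops at a deadline. */
class DeadlineMultiGetSketch {
    /** Result of a partially or fully processed batch. */
    static final class BatchResult {
        final List<String> values;
        final boolean timedOut;
        BatchResult(List<String> values, boolean timedOut) {
            this.values = values;
            this.timedOut = timedOut;
        }
    }

    /**
     * Looks up each key in order, checking the clock before every lookup.
     * Stops early once the deadline has passed, so no further work is done
     * for a caller that has already given up.
     */
    static BatchResult multiGet(List<String> keys,
                                Function<String, String> lookup,
                                LongSupplier clockMillis,
                                long deadlineMillis) {
        List<String> values = new ArrayList<>();
        for (String key : keys) {
            if (clockMillis.getAsLong() >= deadlineMillis) {
                return new BatchResult(values, true);
            }
            values.add(lookup.apply(key));
        }
        return new BatchResult(values, false);
    }

    public static void main(String[] args) {
        // Fake clock that advances 10 ms per reading, to keep the run deterministic.
        long[] now = {0};
        LongSupplier clock = () -> { long t = now[0]; now[0] += 10; return t; };

        BatchResult r = multiGet(List.of("a", "b", "c", "d"), k -> "value-" + k, clock, 25);
        System.out.println(r.values + " timedOut=" + r.timedOut);
    }
}
```

With the fake clock the fourth lookup is skipped because the deadline of 25 ms
has passed by then. Whether partial results should be returned or discarded on
timeout is a design choice the thread does not settle.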
