Yes. In the context of the underlying physical region or database, a read is idempotent.
Thanks

Jerry

On Apr 9, 2017 9:15 PM, "Yu Li" <[email protected]> wrote:

> Correct me if I'm wrong, but I think we should consider only the single
> operation itself when checking whether it's idempotent. Similar to the
> Wikipedia example <https://en.wikipedia.org/wiki/Idempotence#Examples>:
> "A function looking up a customer's name and address in a database
> <https://en.wikipedia.org/wiki/Database> is typically idempotent, since
> this will not cause the database to change", I think all
> Get/MultiGet/Scan operations in HBase are idempotent.
>
> About "speculative rpc handling", I doubt whether it helps in HBase.
> Normally, if a request has already arrived at the server side but executes
> slowly, the problem might be:
> 1. The server is too busy and the request gets queued
> 2. The processing itself is slow due to the request pattern or some
>    hardware failure
> I don't think a speculative execution of the request could help in either
> of the above cases. It's different from speculative task execution in MR:
> there we could choose another node to execute the task, while here we have
> no choice.
>
> OTOH, we already have a timeout mechanism to make sure server resources
> won't be wasted:
> 1. For scan
>    - When request handling times out, the server will stop further
>      processing; refer to RSRpcServices#getTimeLimit and
>      ScannerContext#checkTimeLimit
>    - If the client went away during processing, the server will also stop
>      processing; check the SimpleRpcServer#disconnectSince and
>      RegionScannerImpl#nextInternal methods for more details.
>
> 2. For single Get
>    - Controlled by rpc and operation timeout
>
> 3. For MultiGet
>    - I think this is something we could improve. On the client side we
>      have a timeout mechanism, but on the server side there seems to be no
>      corresponding interrupt logic.
>
>
> Best Regards,
> Yu
>
> On 10 April 2017 at 11:12, Jerry He <[email protected]> wrote:
>
> > Again, it depends on how you abort and 'idempotent' can have different
> > definitions.
> >
> > For example, even if you are only concerned about read,
> > there are resources on the HRegion that the read touches or acquires
> > (scanner, lock, mvcc etc) that hopefully will be cleaned up/released
> > with the abort.
> > Or you may have it in a bad/inconsistent state.
> >
> > Thanks.
> >
> > Jerry
> >
> >
> > On Sun, Apr 9, 2017 at 7:14 PM, 张铎(Duo Zhang) <[email protected]>
> > wrote:
> >
> > > I think this depends on how you model the problem. At server side, if
> > > you re-execute a read operation with a new mvcc, then you may read a
> > > value that should not be visible if you use the old mvcc. If you
> > > define this as an error then I think there will be conflicts.
> > >
> > > But at client side, there is guarantee that the request you send first
> > > will be executed first. So as long as the read request does not
> > > return, I think it is OK to read a value which is written by a write
> > > request which is sent after the read request?
> > >
> > > Thanks.
> > >
> > > 2017-04-10 9:52 GMT+08:00 杨苏立 Yang Su Li <[email protected]>:
> > >
> > > > We are only concerned about read operations here. Are you suggesting
> > > > they are completely idempotent?
> > > > Are there any read-after-write conflicts?
> > > >
> > > > Thanks
> > > >
> > > > Suli
> > > >
> > > > On Sun, Apr 9, 2017 at 8:48 PM, 张铎(Duo Zhang) <[email protected]>
> > > > wrote:
> > > >
> > > > > It depends on how you abort the rpc request. For hbase, there will
> > > > > be no write conflict, but a write operation can only be finished
> > > > > iff all the write operations with a lower mvcc number have been
> > > > > finished. So if you just stop a write operation without recovering
> > > > > the mvcc (I do not know how to recover but I think you need to do
> > > > > something...) then the writes will be stuck.
> > > > >
> > > > > And one more thing, for read operation you may interrupt it at any
> > > > > time, but for write operation, I do not think you can re-execute
> > > > > it with a new mvcc number if the WAL entry has already been
> > > > > flushed out. That means the re-execution process will be different
> > > > > if you abort the write operation at different stages.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > 2017-04-10 6:47 GMT+08:00 杨苏立 Yang Su Li <[email protected]>:
> > > > >
> > > > > > We are trying to implement speculative rpc handling for our
> > > > > > workloads. So we want to allow an RPC Handler to stop executing
> > > > > > an RPC call, put it back to the queue, and later re-execute it.
> > > > > >
> > > > > > Suppose at time t1 we execute an RPC call halfway, abort, and
> > > > > > put the call back to the queue.
> > > > > > Then at time t2 another RPC handler picks up the call and
> > > > > > re-executes it.
> > > > > > I understand that we might get a different mvcc number and
> > > > > > different results at t2 compared to when we execute it at t1.
> > > > > > My question is: would this situation be any different compared
> > > > > > to the situation where the call was never executed at t1, and is
> > > > > > executed at t2 for the first time?
> > > > > >
> > > > > >
> > > > > > My guess is that since at t1 we may have already gotten an mvcc
> > > > > > number, it might potentially cause some write conflicts and
> > > > > > certain write operations to retry. But correctness wise, is
> > > > > > there any difference?
> > > > > >
> > > > > > Thanks a lot!
> > > > > >
> > > > > > Suli
> > > > > >
> > > > > >
> > > > > > On Sun, Apr 9, 2017 at 5:14 PM, Jerry He <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > I don't know what your intention and your context are.
> > > > > > >
> > > > > > > You may get a different mvcc number and get different results
> > > > > > > next time around if there are concurrent writes.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jerry
> > > > > > >
> > > > > > > On Sun, Apr 9, 2017 at 12:48 PM 杨苏立 Yang Su Li
> > > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am wondering, for read requests like Get/MultiGet/Scan, is
> > > > > > > > the RPC handling idempotent in HBase?
> > > > > > > >
> > > > > > > > More specifically, if in the middle of RPC handling we stop
> > > > > > > > the handling thread, put the RPC call back to the queue, and
> > > > > > > > later another RPC Handler picks up this call and starts all
> > > > > > > > over again, will the result be the same as if this call is
> > > > > > > > being handled for the first time now? Or are there any
> > > > > > > > unexpected side effects?
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Suli
> > > > > > > >
> > > > > > > > --
> > > > > > > > Suli Yang
> > > > > > > >
> > > > > > > > Department of Physics
> > > > > > > > University of Wisconsin Madison
> > > > > > > >
> > > > > > > > 4257 Chamberlin Hall
> > > > > > > > Madison WI 53703
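
To make the read-side point in this thread concrete, here is a minimal Java sketch. It is a toy model and not HBase code: the class ToyMvccStore and its methods write, currentReadPoint, read, and versionCount are hypothetical names invented for illustration, and the "read point" is simply the sequence number of the latest write. It shows the two senses of "idempotent" discussed above: a read never changes the stored data, yet a read re-executed at a later read point may return a newer value if a write landed in between.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model (assumption, not HBase code): versions keyed by write sequence number. */
final class ToyMvccStore {
    private final TreeMap<Long, String> versions = new TreeMap<>();
    private long lastSeq = 0;

    /** Stores a new version and returns the sequence number assigned to it. */
    long write(String value) {
        versions.put(++lastSeq, value);
        return lastSeq;
    }

    /** The read point a read starting right now would use. */
    long currentReadPoint() {
        return lastSeq;
    }

    /** Newest value visible at readPoint, or null if none; never mutates the store. */
    String read(long readPoint) {
        Map.Entry<Long, String> visible = versions.floorEntry(readPoint);
        return visible == null ? null : visible.getValue();
    }

    int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) {
        ToyMvccStore store = new ToyMvccStore();
        store.write("v1");
        long t1 = store.currentReadPoint(); // read point of the first, aborted attempt
        store.write("v2");                  // a concurrent write lands before the retry
        long t2 = store.currentReadPoint(); // read point of the re-executed attempt
        System.out.println(store.read(t1)); // prints v1
        System.out.println(store.read(t2)); // prints v2
    }
}
```

In this model the retry at t2 behaves exactly like a first execution at t2, which matches the question asked in the thread. The sketch leaves out the scanner, lock, and mvcc cleanup that an aborted read would need on a real server.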

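
The write-side concern raised in the thread (a write can only finish once every write with a lower mvcc number has finished) can be sketched the same way. Again this is a toy model under stated assumptions, not the HBase implementation: ToyWriteOrdering, begin, complete, and readPoint are hypothetical names, and the model only tracks which sequence numbers are still in flight.

```java
import java.util.TreeMap;

/** Toy model (assumption, not HBase code): the read point only advances over a completed prefix. */
final class ToyWriteOrdering {
    // Sequence number -> whether that write has completed.
    private final TreeMap<Long, Boolean> inFlight = new TreeMap<>();
    private long nextSeq = 1;
    private long readPoint = 0;

    /** Starts a write and returns its sequence number. */
    long begin() {
        long seq = nextSeq++;
        inFlight.put(seq, Boolean.FALSE);
        return seq;
    }

    /** Marks a write as done, then advances the read point past every leading completed write. */
    void complete(long seq) {
        if (!inFlight.containsKey(seq)) {
            throw new IllegalArgumentException("unknown write " + seq);
        }
        inFlight.put(seq, Boolean.TRUE);
        while (!inFlight.isEmpty() && inFlight.firstEntry().getValue()) {
            readPoint = inFlight.pollFirstEntry().getKey();
        }
    }

    long readPoint() {
        return readPoint;
    }
}
```

If a handler calls begin() and is then stopped without ever calling complete(), the read point in this model cannot move past that sequence number, so later writes stay invisible. That is the "writes will be stuck" situation described above, and it is why an aborted write needs some form of mvcc recovery.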