On Sun, Apr 9, 2017 at 11:14 PM, Yu Li wrote:
Correct me if I'm wrong, but I think we should consider only the single
operation itself when checking whether it is idempotent. Similar to the
Wikipedia example <https://en.wikipedia.org/wiki/Idempotence#Examples>:
"A function looking up a customer's name and address in a database
<https://en.wikipedia.org/wiki/Database> is typically idempotent, since
this will not cause the database to change", I think all
Get/MultiGet/Scan operations in HBase are idempotent.
About "speculative rpc handling", I doubt whether it would help in HBase.
Normally, if a request has already arrived at the server side but
executes slowly, the problem might be:
1. The server is too busy and requests get queued
2. The processing itself is slow due to the request pattern or some
   hardware failure
I don't think speculative execution of the request could help in either
of the above cases. It's different from speculative task execution in MR,
where we could choose another node to execute the task, while here we
have no choice.
We have a different use case here. Basically, we are trying to enforce
scheduling in HBase.
Consider the following scenario: client-1 and client-2 are both competing
for I/O resources, but client-2 is also issuing a bunch of requests that
do not require any I/O resources (say, the data is cached).
Since we have idle CPU/memory, we want to serve these cached requests for
client-2, but we do not want client-2 to use more than its fair share of
I/O.
Unfortunately, at the time we pick an RPC call to handle, we don't know
whether it will cause I/O or not.
So we think we can abort a request if it requires I/O resources that are
not allocated to it, and re-schedule it later based on our scheduling
policy.
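To make the abort-and-reschedule idea concrete, here is a minimal toy
sketch in Python. It is not HBase code: the names Request, NeedsIO,
handle and run_round, and the per-client token budget, are all
hypothetical and only illustrate the policy described above, assuming a
request only learns that it needs I/O after it has started running.

```python
from collections import deque
from dataclasses import dataclass


class NeedsIO(Exception):
    """Raised when a request turns out to need I/O it has no budget for."""


@dataclass
class Request:
    client: str
    key: str
    attempts: int = 0


def handle(req, cache, io_budget):
    """Serve from cache if possible; otherwise spend one I/O token or abort."""
    req.attempts += 1
    if req.key in cache:
        return cache[req.key]
    if io_budget.get(req.client, 0) <= 0:
        raise NeedsIO(req.key)
    io_budget[req.client] -= 1
    return "disk:" + req.key


def run_round(queue, cache, io_budget):
    """Try every queued request once; return (served, deferred)."""
    served, deferred = [], deque()
    while queue:
        req = queue.popleft()
        try:
            served.append((req.client, req.key, handle(req, cache, io_budget)))
        except NeedsIO:
            deferred.append(req)  # abort now, re-schedule later
    return served, deferred


def demo():
    cache = {"hot": "cached:hot"}
    io_budget = {"client-1": 1, "client-2": 0}
    queue = deque([
        Request("client-1", "cold-a"),
        Request("client-2", "hot"),
        Request("client-2", "cold-b"),
    ])
    served, deferred = run_round(queue, cache, io_budget)
    # A later round, after the (hypothetical) policy grants client-2 a token.
    io_budget["client-2"] = 1
    served_later, deferred_later = run_round(deferred, cache, io_budget)
    return served, served_later, list(deferred_later)
```

In this toy, client-2's cached request is served right away, while its
uncached request is aborted and only served in the later round. Whether
a real abort is this clean is exactly the open question in this thread.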
OTOH, we already have a timeout mechanism to make sure server resources
won't be wasted:
1. For scan
   - When request handling times out, the server will stop further
     processing; refer to RSRpcServices#getTimeLimit and
     ScannerContext#checkTimeLimit
   - If the client went away during processing, the server will also stop
     processing; check the SimpleRpcServer#disconnectSince and
     RegionScannerImpl#nextInternal methods for more details.
2. For single Get
   - Controlled by rpc and operation timeout
3. For MultiGet
   - I think this is something we could improve. On the client side we
     have a timeout mechanism, but on the server side there seems to be
     no corresponding interrupt logic.
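As a rough illustration of the kind of server-side interrupt logic a
batched get could use, here is a toy Python sketch that checks a deadline
before each individual lookup. It does not use any HBase API; multi_get,
DeadlineExceeded and FakeClock are hypothetical names, and the store is
just a dict.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when a batch runs past its deadline; carries partial results."""

    def __init__(self, partial):
        super().__init__("deadline exceeded after %d gets" % len(partial))
        self.partial = partial


def multi_get(store, keys, timeout_s, clock=time.monotonic):
    """Look up keys one by one, checking the deadline before each lookup."""
    deadline = clock() + timeout_s
    results = []
    for key in keys:
        if clock() >= deadline:
            raise DeadlineExceeded(results)
        results.append(store.get(key))
    return results


class FakeClock:
    """Deterministic clock for testing: advances by `step` on every call."""

    def __init__(self, step):
        self.now = 0.0
        self.step = step

    def __call__(self):
        self.now += self.step
        return self.now
```

The point of the sketch is only that the check sits inside the loop, so a
long batch stops between lookups instead of running to completion after
the client has already given up.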
Best Regards,
Yu
On 10 April 2017 at 11:12, Jerry He wrote:
Again, it depends on how you abort, and 'idempotent' can have different
definitions.
For example, even if you are only concerned about reads, there are
resources on the HRegion that the read touches or acquires (scanner,
lock, mvcc etc.) that hopefully will be cleaned up/released with the
abort. Otherwise you may leave it in a bad/inconsistent state.
Thanks.
Jerry
On Sun, Apr 9, 2017 at 7:14 PM, 张铎 (Duo Zhang) wrote:
I think this depends on how you model the problem. At server side, if you
re-execute a read operation with a new mvcc, then you may read a value
that should not be visible if you use the old mvcc. If you define this as
an error then I think there will be conflicts.
But at client side, there is no guarantee that the request you send first
will be executed first. So as long as the read request has not returned,
I think it is OK to read a value which is written by a write request
which is sent after the read request?
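The visibility point above can be shown with a toy model in Python. This
is only a sketch of the general MVCC idea, not HBase's implementation;
ToyMvccStore and its methods are hypothetical names. A read uses a read
point, and re-executing the same read with a newer read point may see a
write that the original execution would not have seen.

```python
class ToyMvccStore:
    """Versioned key-value store; every write gets the next sequence number."""

    def __init__(self):
        self.versions = {}  # key -> list of (seq, value), in write order
        self.last_seq = 0

    def put(self, key, value):
        self.last_seq += 1
        self.versions.setdefault(key, []).append((self.last_seq, value))
        return self.last_seq

    def read_point(self):
        """Read point a new read would use: the latest completed write."""
        return self.last_seq

    def get(self, key, read_point):
        """Return the newest value whose sequence number is <= read_point."""
        visible = [v for seq, v in self.versions.get(key, []) if seq <= read_point]
        return visible[-1] if visible else None
```

For example, a read that first ran with the read point taken after "v1"
and is re-executed with one taken after "v2" returns a different value,
even though neither execution changed the store.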
Thanks.
2017-04-10 9:52 GMT+08:00 杨苏立 Yang Su Li:
We are only concerned about read operations here. Are you suggesting they
are completely idempotent?
Are there any read-after-write conflicts?
Thanks
Suli
On Sun, Apr 9, 2017 at 8:48 PM, 张铎 (Duo Zhang) wrote:
It depends on how you abort the rpc request. For hbase, there will be no
write conflict, but a write operation can be finished only after all the
write operations with a lower mvcc number have finished. So if you just
stop a write operation without recovering the mvcc (I do not know how to
recover, but I think you need to do something...), then the writes will
be stuck.
And one more thing: for a read operation you may interrupt it at any
time, but for a write operation, I do not think you can re-execute it
with a new mvcc number if the WAL entry has already been flushed out.
That means the re-execution process will be different depending on the
stage at which you abort the write operation.
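The ordering constraint on writes can be sketched with another toy model
in Python. Again, this is an illustration under assumptions rather than
HBase's actual code; ToyWriteQueue, begin and complete are hypothetical
names. The read point only advances past a write once every write with a
lower sequence number has completed, so a write that is abandoned without
being completed holds back everything behind it.

```python
class ToyWriteQueue:
    """Tracks in-flight writes; the read point only moves past finished ones."""

    def __init__(self):
        self.next_seq = 1
        self.pending = []   # sequence numbers handed out, in order
        self.done = set()   # sequence numbers that have completed
        self.read_point = 0

    def begin(self):
        """Hand out the next sequence number to a starting write."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append(seq)
        return seq

    def complete(self, seq):
        """Mark seq finished, then advance over the finished prefix."""
        self.done.add(seq)
        while self.pending and self.pending[0] in self.done:
            self.read_point = self.pending.pop(0)
        return self.read_point
```

If three writes begin and only the second and third complete, the read
point stays at 0 until the first one completes as well. That is the
"stuck" situation described above for a write that is simply stopped.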
Thanks.
2017-04-10 6:47 GMT+08:00 杨苏立 Yang Su Li:
We are trying to implement speculative rpc handling for our workloads, so
we want to allow an RPC handler to stop executing an RPC call, put it
back in the queue, and later re-execute it.
Say that at time t1 we execute an RPC call halfway, abort it, and put the
call back in the queue.
Then at time t2 another RPC handler picks up the call and re-executes it.
I understand that we might get a different mvcc number and different
results at t2 compared to executing it at t1.
My question is: would this situation be any different from the one where
the call was never executed at t1 and is executed at t2 for the first
time?
My guess is that since at t1 we may have already gotten an mvcc number,
it might potentially cause some write conflicts and force certain write
operations to retry. But correctness-wise, is there any difference?
Thanks a lot!
Suli
On Sun, Apr 9, 2017 at 5:14 PM, Jerry He wrote:
I don't know what your intention and your context are.
You may get a different mvcc number and get different results next time
around if there are concurrent writes.
Thanks,
Jerry
On Sun, Apr 9, 2017 at 12:48 PM, 杨苏立 Yang Su Li wrote:
Hi,
I am wondering, for read requests like Get/MultiGet/Scan, is the RPC
handling idempotent in HBase?
More specifically, if in the middle of RPC handling we stop the handling
thread, put the RPC call back in the queue, and later another RPC handler
picks up this call and starts all over again, will the result be the same
as if this call were being handled for the first time? Or are there any
unexpected side effects?
Thanks!
Suli
--
Suli Yang
Department of Physics
University of Wisconsin Madison
4257 Chamberlin Hall
Madison WI 53703