[
https://issues.apache.org/jira/browse/HDFS-11520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515968#comment-16515968
]
James Clampffer commented on HDFS-11520:
----------------------------------------
Thanks for picking this up [~anatoli.shein]!
It doesn't look like you're touching the java build so I doubt the shadedclient
failures are related to this. Shading just does a name mangling pass on java
class files to avoid library versioning issues.
Some general feedback:
-Some of the comments in my original patch are no longer applicable (I had
never intended to merge in that state). If you're basing this work off of that
make sure to update or remove them where appropriate. There's several in
rpc_connection.h that were reminders for future work I want to do but aren't
directly related to this. Example where the comment is alluding a change that
already happened so it's just confusing now:
{code:java}
+ // This should be an hdfs::IoService.
+ // Can get away for it at the moment because the FileSystemImpl is the only
thing that uses it.
std::shared_ptr<IoService> io_service_;
{code}
-Certain RPC (mainly the recursive ones) can't be cancelled right now and just
no-op. Rather than silently no-op explicitly fail and log a warning where it's
invoked so the user is aware of this.
-Break the unit test up into multiple blocks. StatCancel is also testing a lot
of other RPC. I'd most likely write a macro or templated function that allows
a method name and args to be plugged in for different calls rather than
repeating the same 15ish lines for each call but that's mostly preference.
-Add unit tests that do cancel in other phases of RPC. Right now the only place
being tested is when the NN response is being decoded. Realistically cancel is
going to be used when the network is stalled or NN is doing GC so you want to
test cancel of outgoing messages, messages in the queue to be sent, messages
that have been sent but a response hasn't been received etc.
-Add cancel tests to the mini stress tests. Cancelling from another thread
introduces potential for race conditions that are hard to perfectly emulate in
unit tests.
-Is this able to cancel requests that have shifted to a new RPCConnection due
to failover? What about retry?
> libhdfs++: Support cancellation of individual RPC calls in C++ API
> ------------------------------------------------------------------
>
> Key: HDFS-11520
> URL: https://issues.apache.org/jira/browse/HDFS-11520
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: James Clampffer
> Assignee: Anatoli Shein
> Priority: Major
> Attachments: HDFS-11520.002.patch, HDFS-11520.003.patch,
> HDFS-11520.004.patch, HDFS-11520.005.patch, HDFS-11520.HDFS-8707.000.patch,
> HDFS-11520.trunk.001.patch
>
>
> RPC calls done by FileSystem methods like Mkdirs, GetFileInfo etc should be
> individually cancelable without impacting other pending RPC calls.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]