[ 
https://issues.apache.org/jira/browse/HDFS-11520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515968#comment-16515968
 ] 

James Clampffer commented on HDFS-11520:
----------------------------------------

Thanks for picking this up [~anatoli.shein]!

It doesn't look like you're touching the java build so I doubt the shadedclient 
failures are related to this. Shading just does a name mangling pass on java 
class files to avoid library versioning issues.

Some general feedback:
 -Some of the comments in my original patch are no longer applicable (I had 
never intended to merge in that state). If you're basing this work off of that 
make sure to update or remove them where appropriate. There's several in 
rpc_connection.h that were reminders for future work I want to do but aren't 
directly related to this. Example where the comment is alluding a change that 
already happened so it's just confusing now:
{code:java}
+  // This should be an hdfs::IoService.
+  // Can get away for it at the moment because the FileSystemImpl is the only 
thing that uses it.
   std::shared_ptr<IoService> io_service_;
{code}
-Certain RPC (mainly the recursive ones) can't be cancelled right now and just 
no-op. Rather than silently no-op explicitly fail and log a warning where it's 
invoked so the user is aware of this.

-Break the unit test up into multiple blocks. StatCancel is also testing a lot 
of other RPC.  I'd most likely write a macro or templated function that allows 
a method name and args to be plugged in for different calls rather than 
repeating the same 15ish lines for each call but that's mostly preference.

-Add unit tests that do cancel in other phases of RPC. Right now the only place 
being tested is when the NN response is being decoded. Realistically cancel is 
going to be used when the network is stalled or NN is doing GC so you want to 
test cancel of outgoing messages, messages in the queue to be sent, messages 
that have been sent but a response hasn't been received etc.

-Add cancel tests to the mini stress tests. Cancelling from another thread 
introduces potential for race conditions that are hard to perfectly emulate in 
unit tests.

-Is this able to cancel requests that have shifted to a new RPCConnection due 
to failover?  What about retry?

> libhdfs++: Support cancellation of individual RPC calls in C++ API
> ------------------------------------------------------------------
>
>                 Key: HDFS-11520
>                 URL: https://issues.apache.org/jira/browse/HDFS-11520
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: Anatoli Shein
>            Priority: Major
>         Attachments: HDFS-11520.002.patch, HDFS-11520.003.patch, 
> HDFS-11520.004.patch, HDFS-11520.005.patch, HDFS-11520.HDFS-8707.000.patch, 
> HDFS-11520.trunk.001.patch
>
>
> RPC calls done by FileSystem methods like Mkdirs, GetFileInfo etc should be 
> individually cancelable without impacting other pending RPC calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to