[jira] [Commented] (HADOOP-10389) Native RPCv9 client

Colin Patrick McCabe (JIRA) Fri, 16 May 2014 07:32:39 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999175#comment-13999175
 ]


Colin Patrick McCabe commented on HADOOP-10389:
-----------------------------------------------

bq. <multiple call ids in flight discussion>

My belief that we don't support multiple call IDs in flight at once came from
a casual, off-the-record conversation I had with some folks at Hadoop Summit
Europe.  It's possible that the code has improved since then, or that they
were out of date.  I admit that I haven't scrutinized the server code closely
enough to give a definitive answer here.

There is an easy way to resolve this, of course: we can modify the C code to 
put multiple call IDs in flight at once, and see if the server loses its 
marbles :)

The important thing to remember is that this is an optimization.  Even if we
never do it, we'll still have a usable native client.  So I'm going to try to
get the basic stuff done first, and then perhaps we can circle back to this.
If one of you wants to work on it in parallel, that would work too.  The
tricky part is testing: we need some way to *force* multiple calls to be in
flight on the channel at the same time, so that we know it works.
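
To make the idea concrete, here is a minimal sketch of how a client could track several outstanding calls by call ID and match responses that arrive out of order.  Everything in it ({{call_table}}, {{MAX_INFLIGHT}}, the function names) is hypothetical and is not taken from the patch; a plain integer stands in for a decoded response.  A test could use something like {{call_table_inflight}} to check that more than one call really was outstanding at once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_INFLIGHT 8  /* hypothetical cap on outstanding calls per connection */

/* One outstanding RPC, keyed by the call ID sent in the request header. */
struct pending_call {
    int32_t call_id;
    int in_use;      /* slot is occupied */
    int done;        /* a response has been matched to this call */
    int32_t result;  /* stand-in for a decoded response payload */
};

struct call_table {
    struct pending_call slots[MAX_INFLIGHT];
    int32_t next_call_id;
};

void call_table_init(struct call_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Register a new call.  Returns its call ID, or -1 if the table is full. */
int32_t call_table_start(struct call_table *t)
{
    int i;
    for (i = 0; i < MAX_INFLIGHT; i++) {
        struct pending_call *c = &t->slots[i];
        if (!c->in_use) {
            c->call_id = t->next_call_id++;
            c->in_use = 1;
            c->done = 0;
            c->result = 0;
            return c->call_id;
        }
    }
    return -1;
}

static struct pending_call *call_table_find(struct call_table *t, int32_t id)
{
    int i;
    for (i = 0; i < MAX_INFLIGHT; i++) {
        if (t->slots[i].in_use && t->slots[i].call_id == id)
            return &t->slots[i];
    }
    return NULL;
}

/* Match a response to its call by ID, whatever order responses arrive in.
 * Returns 0 on success, -1 if no such call is waiting for a response. */
int call_table_complete(struct call_table *t, int32_t id, int32_t result)
{
    struct pending_call *c = call_table_find(t, id);
    if (c == NULL || c->done)
        return -1;
    c->done = 1;
    c->result = result;
    return 0;
}

/* Hand the result to the caller and free the slot.
 * Returns 0 on success, -1 if the call is unknown or not yet complete. */
int call_table_take(struct call_table *t, int32_t id, int32_t *result)
{
    struct pending_call *c = call_table_find(t, id);
    if (c == NULL || !c->done)
        return -1;
    *result = c->result;
    c->in_use = 0;
    return 0;
}

/* Number of calls that have been sent but not yet answered. */
int call_table_inflight(const struct call_table *t)
{
    int i, n = 0;
    for (i = 0; i < MAX_INFLIGHT; i++) {
        if (t->slots[i].in_use && !t->slots[i].done)
            n++;
    }
    return n;
}
```

The sketch ignores locking and call ID wraparound, both of which a real client would have to handle.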

bq. We can keep functions, but the repeated code in these functions can be 
eliminated using abstraction, so as to reduce the binary code size.

I have a patch which implements most of the native client, based on the
existing RPC code.  The library I generate is only 3 MB, even including a bunch
of stuff which has nothing to do with RPC.  So although I can see that there
may be room to optimize code size, I don't think it should be our highest
priority right now.

You also have to keep in mind that code which is not used will be stripped out
by the linker.  If we don't use the async version of a certain RPC (for
example), that code will not become part of {{libhdfs.so}}.  So although you
might look at the generated code and go "OMG so much code!", it's really not
that bad.  This is similar to how in C++, every time you instantiate
{{std::map}} with a different type, you get another set of {{std::map}}
functions in your binary.  In practice, it is usually not a problem.
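
As a rough illustration of the {{std::map}} analogy in C, the sketch below uses a macro as a stand-in for the code generator: each use stamps out a separate sync and async stub.  The macro and the stub names are hypothetical, not the generated code from the patch.  How much of the unreferenced code actually gets dropped depends on how the library is built; with GCC and GNU ld, one common approach is to compile with {{-ffunction-sections}} and link with {{-Wl,--gc-sections}}, and whether the libhdfs build does this is an assumption here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for generated code: every use of this macro emits
 * two more functions, much as each new std::map<K, V> instantiation emits
 * its own copy of the map functions. */
#define DEFINE_RPC_STUBS(name)                                   \
    int name##_sync(char *msg, size_t msglen)                    \
    {                                                            \
        assert(msg != NULL && msglen > 0);                       \
        snprintf(msg, msglen, "%s: sync stub called", #name);    \
        return 0;                                                \
    }                                                            \
    int name##_async(char *msg, size_t msglen)                   \
    {                                                            \
        assert(msg != NULL && msglen > 0);                       \
        snprintf(msg, msglen, "%s: async stub called", #name);   \
        return 0;                                                \
    }

/* Four functions are generated here, whether or not anything calls them. */
DEFINE_RPC_STUBS(mkdirs)
DEFINE_RPC_STUBS(get_file_info)
```

If nothing in the client ever references {{get_file_info_async}}, it is a candidate for removal at link time under a build like the one assumed above.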

Still, if you want to work on optimizing generated code size, I would welcome 
any patches.  The challenge would be to reduce the code size while still 
maintaining RPC-specific error messages and not regressing performance.
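
One possible shape for such a patch, sketched under assumptions: keep a small per-RPC descriptor (name plus whatever differs between calls) and route every call through one shared function, so the error text stays RPC-specific while the logic exists only once.  The {{rpc_desc}} struct, the {{needs_payload}} rule, and {{rpc_call}} are all hypothetical and not part of the existing patch; the extra indirection is also the kind of thing that would need measuring for performance.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Per-RPC data: the only thing that differs between calls in this sketch. */
struct rpc_desc {
    const char *name;   /* used to keep error messages RPC-specific */
    int needs_payload;  /* hypothetical per-RPC validation rule */
};

/* Illustrative descriptors; one small constant per RPC instead of one
 * full copy of the call logic per RPC. */
static const struct rpc_desc RPC_MKDIRS = { "mkdirs", 1 };
static const struct rpc_desc RPC_GET_SERVER_DEFAULTS = { "getServerDefaults", 0 };

/* Single shared implementation.  Returns 0 on success; on failure returns -1
 * and writes a message prefixed with the RPC name into err. */
int rpc_call(const struct rpc_desc *d, const char *payload,
             char *err, size_t errlen)
{
    assert(d != NULL && err != NULL && errlen > 0);
    if (d->needs_payload && (payload == NULL || payload[0] == '\0')) {
        snprintf(err, errlen, "%s: request payload is empty", d->name);
        return -1;
    }
    /* A real client would serialize and send the request here. */
    err[0] = '\0';
    return 0;
}
```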

> Native RPCv9 client
> -------------------
>
>                 Key: HADOOP-10389
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10389
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: HADOOP-10388
>            Reporter: Binglin Chang
>            Assignee: Colin Patrick McCabe
>         Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
> HADOOP-10389.004.patch, HADOOP-10389.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
