[ https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032698#comment-14032698 ]

Haohui Mai edited comment on HADOOP-10389 at 6/16/14 6:47 PM:
--------------------------------------------------------------

bq. If the code is so impenetrable that review is truly impossible, please 
consider demonstrating the virtues of C++11 in a competing branch.

As a weekend project, I've uploaded a proof-of-concept patch in C++ for this 
jira. It implements a synchronous RPC engine and the {{exists()}} call for 
libhdfs.

The main point is not (and was never intended to be) to demonstrate that C++ is
superior to C, but to show that it is practical to use current technology to
improve code quality and to cut maintenance costs down the road. In my personal
opinion, it is important to ensure that the code can be easily maintained by the
community. The patch demonstrates a viable way to approach this goal.

Just a few points to highlight:

# Simpler implementation. Thanks to the standard libraries, there is no need to
put things like RB trees, hash tables, and linked lists into the patch and to
ask for reviews of them. Though not fully equivalent, the patch is an order of
magnitude smaller in terms of size.
# Automatic resource management. Explicit resource management (e.g.,
{{malloc()}} / {{free()}} and closing sockets) is no longer required. The
lifetime of a resource matches the scope of the code that owns it. It is a
systematic way to avoid resource leaks.
# Stronger guarantees from the type system. A modern language allows the code to
be written in a mostly type-safe way, where the type system is able to show that
the majority of the code is safe. Only one cast is required in the code,
compared to many (each in the linked list) in the old implementation. New
constructs like {{std::array}} also make it possible to integrate bounds checks
into the code to help prevent buffer overflows.
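
As a rough illustration of points 2 and 3, here is a minimal C++11 sketch. It is not code from the patch: the names {{FileDescriptor}}, {{read_checked}} and {{open_and_release_pipe}} are hypothetical and exist only for this example, which assumes a POSIX system.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <unistd.h>  // POSIX pipe() and close()

// Hypothetical RAII wrapper (not from the patch): the descriptor is closed
// when the object goes out of scope, on every exit path.
class FileDescriptor {
 public:
  explicit FileDescriptor(int fd) : fd_(fd) {}
  ~FileDescriptor() { if (fd_ >= 0) ::close(fd_); }

  // Copying would lead to a double close, so it is disabled.
  FileDescriptor(const FileDescriptor&) = delete;
  FileDescriptor& operator=(const FileDescriptor&) = delete;

  // Moving transfers ownership and leaves the source empty.
  FileDescriptor(FileDescriptor&& other) noexcept : fd_(other.fd_) {
    other.fd_ = -1;
  }

  int get() const { return fd_; }

 private:
  int fd_;
};

// Hypothetical helper: std::array::at() throws std::out_of_range instead of
// reading past the end of the buffer.
template <std::size_t N>
char read_checked(const std::array<char, N>& buf, std::size_t index) {
  return buf.at(index);
}

// Usage sketch: opens a pipe, hands both ends to wrappers, and returns the
// read-end descriptor number. Both ends are closed when the function returns,
// without any explicit close() call.
int open_and_release_pipe() {
  int fds[2];
  if (::pipe(fds) != 0) throw std::runtime_error("pipe() failed");
  FileDescriptor reader(fds[0]);
  FileDescriptor writer(fds[1]);
  assert(reader.get() >= 0);
  return reader.get();
}
```

The equivalent C code would need a {{close()}} on each error path and a manual length check before each indexed read; here both are enforced by the types themselves.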

Just to reiterate, I am by no means trying to claim that the current patch is
perfect and free of bugs. People, myself included, make mistakes all the time. A
modern tool, however, can help people catch mistakes at the beginning of the
development cycle and avoid them. Thoughts?


was (Author: wheat9):
bq. If the code is so impenetrable that review is truly impossible, please 
consider demonstrating the virtues of C++11 in a competing branch.

As a weekend project, I've uploaded a proof-of-concept patch in C++ for this 
jira. It implements a synchronous RPC engine and the {{exists()}} call for 
libhdfs.

The main point is not (and never intends to be) to demonstrate c++ is superior 
to c, but to show that it is practical to using current technology to improve 
code quality and to cut down the maintenance cost down the road. In my personal 
opinion it is important to ensure that the code can be easily maintained by the 
community. The patch demonstrates a viable way to approach this goal.

Just a couple points to highlight:

# Simpler implementation. Thanks to the standard libraries, there is no need to 
put things like RB trees, hash tables, linked lists into the patch and to ask 
for reviews. Though not fully equivalent, the patch is an order of magnitude 
smaller in terms of size.
# Automatic resource management. Explicit resource management, (e.g., 
{{malloc()}} / {{free()}}, and closing sockets) are no longer required. The 
life time of resource matches the scope of the code. It is a systematic way to 
avoid resource leaks.
# Stronger claims from the type systems. Modern language allows the code to be 
written in a mostly type-safe way, where the type system is able to show that 
the majority of the code is safe. There is only one cast required in the code 
compared to many (each in the linked list) in the old implementation. New 
constructs like {{std::array}} also allow integrate bounds check in the code to 
help prevent buffer overflows.

Just to reiterate, by no means I'm trying to claim that the current patch is 
perfect and free of bugs. People myself included make mistakes all the time. A 
modern tool, however, can help people to catch the mistakes at the beginning of 
the development cycle and to avoid them. I believe that this is what good 
software engineering practice should do. I don't see why this is a 
philosophical debate on which language is better (though I can be happily 
convinced in either way), and why writing safer, and easier to approach 
correctness can be out-of-the-scope during development.

> Native RPCv9 client
> -------------------
>
>                 Key: HADOOP-10389
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10389
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: HADOOP-10388
>            Reporter: Binglin Chang
>            Assignee: Colin Patrick McCabe
>         Attachments: HADOOP-10388.001.patch, 
> HADOOP-10389-alternative.000.patch, HADOOP-10389.002.patch, 
> HADOOP-10389.004.patch, HADOOP-10389.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
