[
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028506#comment-14028506
]
Haohui Mai commented on HADOOP-10389:
-------------------------------------
bq. So far, we haven't found any defects.
So far I'm not yet convinced that the code is of good quality. I can see two
possible readings of "haven't found any defects": one is that the code is of
good quality, the other is that the code has not yet gone through full-scale
review and testing. I'm unsure which reading I should take. I haven't gone
through this patch, but other patches have shown similar issues:
https://issues.apache.org/jira/browse/HADOOP-10640?focusedCommentId=14025478&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14025478
Given the complexity of the code, I'm deeply concerned that it cannot be
properly reviewed and brought to at least the same quality as the existing code
in Hadoop.
bq. This JIRA doesn't have anything to do with C versus C++. It is about the
RPC code, which was already reviewed and committed.
You're missing the point. From the perspective of a reviewer, the key question
to answer is what the current quality of the code is. I can see two problems
with the current state of the code:
# The code has yet to be separated into logical pieces that can be thoroughly reviewed.
# The control flow involves several heap allocations, and it is nontrivial to ensure that there are no resource leaks (see the sketch after this list).
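To make the second point concrete, here is a hypothetical sketch (not taken from the patch; the names and sizes are invented) of the kind of call path I'm worried about: several heap allocations on one path, where every early return has to free exactly what has already been allocated.
{code}
/* Hypothetical sketch, not from the patch.  Each failure path must release
 * exactly the allocations that have already succeeded; forgetting one free on
 * one path is a silent leak. */
#include <stdlib.h>

struct conn_ctx {
    char *host;
    char *user;
    unsigned char *send_buf;
};

/* Returns 0 on success, nonzero on failure. */
int conn_ctx_create(struct conn_ctx **out)
{
    struct conn_ctx *ctx = (struct conn_ctx *)calloc(1, sizeof(*ctx));
    if (!ctx)
        return 1;
    ctx->host = (char *)malloc(256);
    if (!ctx->host)
        goto err;
    ctx->user = (char *)malloc(256);
    if (!ctx->user)
        goto err;
    ctx->send_buf = (unsigned char *)malloc(64 * 1024);
    if (!ctx->send_buf)
        goto err;
    *out = ctx;
    return 0;

err:
    /* calloc zeroed the struct, so freeing unassigned members is safe, but the
     * cleanup block must be kept in sync with the allocation order by hand. */
    free(ctx->send_buf);
    free(ctx->user);
    free(ctx->host);
    free(ctx);
    return 1;
}
{code}
This pattern is reviewable when there are three allocations in one function; it becomes much harder to audit once the allocations are spread across more functions and error paths.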
I call out C++ because it is widely available for developing native code.
You're more than welcome to do it in C if you can address the above
requirements. I'm not sure what your plan is for this branch, but if you plan
to merge it into trunk, in my opinion the above comments have to be properly
addressed.
bq. And C++ doesn't provide "type checked proofs" of anything, since it is an
unsafe language (much like C, which it's based on). If you wanted provable
correctness, you might try something like SML or Haskell plus a theorem prover
like Coq. But what does this have to do with a native client?
At the very least, the code should be free of the following common defects:
resource leaks, buffer overflows, and control-flow hijacking. It would be great
if you could rework the code to show that systematically. I call out C++
because it lets you make a much stronger claim that the code is correct
(obviously it can't save you if you intentionally screw it up), but it seems
the current code does not even have this luxury.
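To illustrate that claim (again a hypothetical sketch, not a proposal for how the actual patch must look), the same shape of code written with owning types releases everything acquired so far on every early return, by construction:
{code}
// Hypothetical RAII sketch of the same shape of code.  Ownership lives in the
// members' destructors, so every early return (or exception) releases whatever
// has already been acquired; there is no cleanup label to keep in sync.
#include <memory>
#include <string>
#include <vector>

class ConnCtx {
public:
    // On allocation failure std::bad_alloc propagates and the partially built
    // object is destroyed automatically; there is no leak to reason about.
    static std::unique_ptr<ConnCtx> create(std::string host, std::string user) {
        std::unique_ptr<ConnCtx> ctx(new ConnCtx());
        ctx->host_ = std::move(host);
        ctx->user_ = std::move(user);
        ctx->send_buf_.resize(64 * 1024);   // may throw std::bad_alloc
        return ctx;
    }

private:
    ConnCtx() = default;
    std::string host_;
    std::string user_;
    std::vector<unsigned char> send_buf_;
};
{code}
The point is not this particular API shape; it is that a reviewer only has to check that every resource is held by an owning type, instead of auditing the cleanup bookkeeping on every early return.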
By the way, if you're interested in how far C/C++ can be pushed towards being a
memory-safe language, you might want to take a look at the SAFECode project
(http://safecode.cs.illinois.edu/index.html). My impression is that the current
code is too far off.
bq. If I opened a JIRA about switching Hadoop to Scala, would the discussion
continue "in the right tone"? If I pointed out a bunch of stuff that would be
"fixed by scala" (that wasn't actually buggy), what would the tone be then?
(Note that I have nothing against Scala-- it's a fine language.)
The interest here is in writing more robust and efficient code, not in which
language you implement it in. Personally I'm fairly open to this discussion, if
you can demonstrate the benefits I pointed out in this jira.
> Native RPCv9 client
> -------------------
>
> Key: HADOOP-10389
> URL: https://issues.apache.org/jira/browse/HADOOP-10389
> Project: Hadoop Common
> Issue Type: Sub-task
> Affects Versions: HADOOP-10388
> Reporter: Binglin Chang
> Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch,
> HADOOP-10389.004.patch, HADOOP-10389.005.patch
>
>