[
https://issues.apache.org/jira/browse/HDFS-7207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164440#comment-14164440
]
Colin Patrick McCabe commented on HDFS-7207:
--------------------------------------------
I thought about this a little bit more. I thought it would be nice to have a
C++ API for libhdfs, but to be usable, it had to meet a few conditions:
* Not use exceptions (as [~wheat9] mentioned, many C++ programs don't use
exceptions, including the LLVM compiler, the Chrome web browser, and pretty much
any code Google develops).
* Be usable with all of {{libhdfs}}, {{libwebhdfs}}, and {{libhdfs3}}. Since
{{libhdfs3}} is hdfs-specific, it cannot replace the functionality of the
original {{libhdfs}}. Many users are using {{libhdfs}} to talk to S3.
{{libhdfs3}} also lacks support for things like encryption (although unlike S3
support, it will eventually get it). So this flexibility is essential.
* Work with both C\+\+11 and earlier C\+\+ revisions.
* Not force {{libhdfs3}} to expose its guts to the world. We want to be able
to change {{libhdfs3}} in the future without breaking applications. So
exposing internal header files and data structures is out of the question. We
need a stable ABI.
What I came up with is {{libhdfs.hpp}}. It is implemented entirely in a header
file. Because it calls functions from {{libhdfs.h}}, it will work with both
{{libhdfs}} and {{libwebhdfs}}, as well as {{libhdfs3}}. We can change the
interface without worrying about breaking the ABI, since when the header file
is compiled, the code becomes a part of the client. The only functions that
the libraries need to export are the original {{libhdfs}} functions.
We never pass C\+\+ types across the application / shared library boundary, so
we don't have to worry about issues like the application using a different name
mangling scheme or libstdc++ than the library. The only types passed across
the application / shared library boundary are C types, and we've done a very
good job in the past of not breaking that ABI. In practice, this means that
users of {{libhdfs.hpp}} will be able to upgrade {{libhdfs3}} without recompiling.
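
To make the boundary argument concrete, here is a minimal sketch of a header-only wrapper whose only contact with the shared library is through C types. Everything named {{demo...}} is a hypothetical stand-in, not the real {{libhdfs}} API or the code in the patch; the stand-in C functions are defined inline here only so the example compiles on its own, whereas in practice they would live in the shared library.

```cpp
#include <cstddef>
#include <string>

// Stand-in for the C ABI a shared library would export (hypothetical names).
extern "C" {
    struct demo_fs;                     // opaque to the application
    typedef struct demo_fs* demoFS;
    demoFS demoConnect(const char* host, unsigned short port);
    int demoExists(demoFS fs, const char* path);  // 0 if the path exists
    int demoDisconnect(demoFS fs);
}

// Header-only C++ wrapper: compiled into the client, so no C++ type
// (std::string, classes, exceptions) ever crosses the library boundary.
namespace demo {
class FileSystem {
public:
    FileSystem(const std::string& host, unsigned short port)
        : fs_(demoConnect(host.c_str(), port)) {}
    ~FileSystem() { if (fs_ != NULL) demoDisconnect(fs_); }

    bool connected() const { return fs_ != NULL; }

    // std::string is converted to const char* before the call.
    bool exists(const std::string& path) const {
        return fs_ != NULL && demoExists(fs_, path.c_str()) == 0;
    }

private:
    FileSystem(const FileSystem&);             // non-copyable, pre-C++11 style
    FileSystem& operator=(const FileSystem&);
    demoFS fs_;
};
}  // namespace demo

// Toy definitions of the stand-in C functions, for this example only.
extern "C" {
struct demo_fs { int unused; };
demoFS demoConnect(const char* host, unsigned short) {
    return (host != NULL && host[0] != '\0') ? new demo_fs() : NULL;
}
int demoExists(demoFS, const char* path) {
    return (path != NULL && path[0] == '/') ? 0 : -1;
}
int demoDisconnect(demoFS fs) { delete fs; return 0; }
}
```

Because the wrapper class is compiled into the application, its layout can change between releases without affecting the exported symbols, which stay plain C.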
{{libhdfs.hpp}} gets rid of the need to check {{errno}} by introducing a
{{Status}} class that is returned by functions that can fail. (This is similar
to what Haohui suggested above.) You can call {{Status#toError}} to get a nice
error message.
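
As a rough illustration of the idea, a {{Status}} type along these lines could wrap an errno-style code. This is a sketch under assumptions, not the class from the patch: the member names other than {{toError}} are made up, and {{fake_c_delete}} is a hypothetical stand-in for a C call that returns -1 and sets {{errno}} on failure.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical sketch of a Status value; the real class may differ.
class Status {
public:
    Status() : code_(0) {}                      // success
    explicit Status(int err) : code_(err) {}    // errno-style failure code

    // Capture errno immediately after a failed C call.
    static Status fromErrno() { return Status(errno); }

    bool ok() const { return code_ == 0; }
    int code() const { return code_; }

    // Human-readable message for the stored code.
    std::string toError() const {
        return ok() ? std::string("OK") : std::string(std::strerror(code_));
    }

private:
    int code_;
};

// Stand-in for a C function that reports failure via -1 and errno.
static int fake_c_delete(const char* path) {
    if (path == NULL || path[0] == '\0') {
        errno = EINVAL;
        return -1;
    }
    return 0;
}

// Wrapper that returns a Status instead of making callers read errno.
inline Status removePath(const std::string& path) {
    if (fake_c_delete(path.c_str()) != 0) {
        return Status::fromErrno();
    }
    return Status();
}
```

A caller would then write something like {{Status s = removePath(p); if (!s.ok()) \{ log(s.toError()); \}}} with no exceptions and no direct use of {{errno}}.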
{{libhdfs.hpp}} uses {{shared_ptr}} to allow the input and output stream to
keep a reference to the filesystem object. This prevents the filesystem object
from going away as long as at least one stream referencing it is open. Using
{{shared_ptr}} helps avoid the resource leaks that programmers had to guard
against manually in the C version.
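
The ownership pattern can be sketched as follows. This assumes C\+\+11 {{std::shared_ptr}} for brevity (an older compiler would need a TR1 or Boost equivalent), and the {{fake_connect}} / {{fake_disconnect}} functions and the counter are hypothetical stand-ins for the C connect and disconnect calls, not the actual code in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Stand-ins for the C handle and calls (hypothetical).
static int g_open_connections = 0;   // number of live stand-in connections
struct FakeFsHandle {};
static FakeFsHandle* fake_connect() { ++g_open_connections; return new FakeFsHandle(); }
static void fake_disconnect(FakeFsHandle* h) { --g_open_connections; delete h; }

class FileSystem {
public:
    static std::shared_ptr<FileSystem> connect() {
        return std::shared_ptr<FileSystem>(new FileSystem(fake_connect()));
    }
    // Runs only when the last shared_ptr (user or stream) goes away.
    ~FileSystem() { fake_disconnect(handle_); }

private:
    explicit FileSystem(FakeFsHandle* h) : handle_(h) {}
    FileSystem(const FileSystem&);             // non-copyable
    FileSystem& operator=(const FileSystem&);
    FakeFsHandle* handle_;
};

class InputStream {
public:
    InputStream(const std::shared_ptr<FileSystem>& fs, const std::string& path)
        : fs_(fs), path_(path) {
        assert(fs_.get() != NULL);  // a stream must reference a filesystem
    }
    const std::string& path() const { return path_; }

private:
    std::shared_ptr<FileSystem> fs_;  // keeps the connection alive
    std::string path_;
};
```

With this arrangement, dropping the application's own {{shared_ptr}} to the filesystem while a stream is still open does not disconnect; the disconnect happens when the last stream is destroyed.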
I implemented *all* current functionality of libhdfs except short-circuit
reads, which can be implemented later pretty easily.
> libhdfs3 should not expose exceptions in public C++ API
> -------------------------------------------------------
>
> Key: HDFS-7207
> URL: https://issues.apache.org/jira/browse/HDFS-7207
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haohui Mai
> Assignee: Colin Patrick McCabe
> Priority: Blocker
> Attachments: HDFS-7207.001.patch
>
>
> There are three major disadvantages of exposing exceptions in the public API:
> * Exposing exceptions in public APIs forces the downstream users to be
> compiled with {{-fexceptions}}, which might be infeasible in many use cases.
> * It forces other bindings to properly handle all C++ exceptions, which might
> be infeasible especially when the binding is generated by tools like SWIG.
> * It forces the downstream users to properly handle all C++ exceptions, which
> can be cumbersome as in certain cases it will lead to undefined behavior
> (e.g., throwing an exception in a destructor is undefined.)