[
https://issues.apache.org/jira/browse/HADOOP-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052118#comment-13052118
]
Allen Wittenauer commented on HADOOP-7405:
------------------------------------------
bq. Since Hadoop Kerberos Mac OS X support was never fully there, it is not
possible to compile libhadoop due to some compiler errors.
The compiler errors are fairly simple to fix on Darwin. I don't know why, but
it seems like 9 times out of 10, we favor BSD functionality when we go with
something non-portable.
bq. Because of this my take is that if we require native code to run Hadoop, we
should provide the full set of native code for each platform we are building
for.
Regardless of what happens in this JIRA, we need a test suite for the C code
anyway. OS X actually proves that even if the code compiles, that doesn't
necessarily mean it works properly (see HADOOP-7367).
bq. A while ago I've opened a HADOOP-7083 to enable running Hadoop with
Kerberos ON without relying on some libhadoop functionality and the argument
there was that doing that was a security risk.
Right. It wanted to create a third security mode where some stuff worked and
some stuff didn't. That's not quite what I'm asking for here and it wouldn't
actually fix the problem we're hitting anyway. The security functionality is
orthogonal to the compression functionality. That's the base, surface issue.
Since both live in one big chunk (libhadoop), we broke *both*.
(While I guess it wasn't obvious, I should probably state that I'm not looking
for a "partially working" security mode. The scope of what constitutes a
working unit would still need to be defined. It is more than reasonable to say
that all of the functions that are directly security related would need to be
ported and treated like one block. Asking libhadoop.so if it "supports
security" seems like a reasonable thing to ask it.)
The problem that we've got is that we have a lot of unrelated code sitting in
libhadoop.so. Every time we add something we run the risk of regressing
features out of platforms other than Linux since those other platforms are an
afterthought. HADOOP-7206 may actually be a great example of this: if we go
with a pure native implementation, we won't be able to support Snappy on
anything but Linux with the current state of things. Lack of compression
support has a *direct* impact on the client. I'd be surprised if the majority
of shops are only using Linux clients.
Wouldn't it be great to be able to ask the lib "do you support gzip, do you
support snappy, do you support lzo, do you support security, ..."? Then we
could add code as needed, do ports as needed, etc. An alternative would be
that we start breaking libhadoop up into at least related functionality.
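To make the "ask the lib" idea concrete, here is a minimal sketch in C of what such a query could look like. Every name in it (the HADOOP_FEATURE_* bits, the HAVE_* build switches, hadoop_native_features(), hadoop_native_supports()) is hypothetical and made up for illustration; none of them exist in libhadoop today. The defaults assume a port where only gzip has been vetted so far.

```c
#include <stdint.h>

/* Hypothetical feature bits, one per independently portable unit. */
enum hadoop_native_feature {
    HADOOP_FEATURE_GZIP     = 1 << 0,
    HADOOP_FEATURE_SNAPPY   = 1 << 1,
    HADOOP_FEATURE_LZO      = 1 << 2,
    HADOOP_FEATURE_SECURITY = 1 << 3
};

/* Hypothetical build-time switches. Each port would turn a unit on
 * (e.g. -DHAVE_SNAPPY=1) only once it has been ported and vetted.
 * The defaults below model a port where only gzip is known to work. */
#ifndef HAVE_GZIP
#define HAVE_GZIP 1
#endif
#ifndef HAVE_SNAPPY
#define HAVE_SNAPPY 0
#endif
#ifndef HAVE_LZO
#define HAVE_LZO 0
#endif
#ifndef HAVE_SECURITY
#define HAVE_SECURITY 0
#endif

/* Report the set of features this build of the library supports. */
uint32_t hadoop_native_features(void)
{
    uint32_t mask = 0;
#if HAVE_GZIP
    mask |= HADOOP_FEATURE_GZIP;
#endif
#if HAVE_SNAPPY
    mask |= HADOOP_FEATURE_SNAPPY;
#endif
#if HAVE_LZO
    mask |= HADOOP_FEATURE_LZO;
#endif
#if HAVE_SECURITY
    mask |= HADOOP_FEATURE_SECURITY;
#endif
    return mask;
}

/* Return nonzero only if every requested feature bit is supported. */
int hadoop_native_supports(uint32_t features)
{
    return (hadoop_native_features() & features) == features;
}
```

The Java side would then call something like hadoop_native_supports(HADOOP_FEATURE_SNAPPY) through JNI before using a codec, instead of assuming that a successful load means everything works. Security stays a single bit on purpose, so it is ported and reported as one block rather than function by function.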
I suppose the other outcome might be that we as a community just admit that we
don't support Hadoop on anything but Linux and give up on any semblance of
portability. More and more code is being added or rewritten in C. I would be
surprised if this trend changes.
> libhadoop is all or nothing
> ---------------------------
>
> Key: HADOOP-7405
> URL: https://issues.apache.org/jira/browse/HADOOP-7405
> Project: Hadoop Common
> Issue Type: Bug
> Components: native
> Affects Versions: 0.20.203.0, 0.23.0
> Environment: Everything not Linux
> Reporter: Allen Wittenauer
> Priority: Blocker
> Labels: regression
>
> As a result of a ton of new code in libhadoop being added in 0.20.203/0.22, a
> lot of features that used to work no longer do reliably. The most common
> problem is native compression, but other issues such as Mac OS X's group
> support broke as well. The native code checks need to be refactored such
> that libhadoop.so should report what it supports rather than having the
> Java-side assume that if it loads, it is all supported. This would allow us
> to stub routines until they've been vetted, removing the chances of such
> regressions appearing in the future.