[
https://issues.apache.org/jira/browse/FLUME-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233890#comment-13233890
]
[email protected] commented on FLUME-1020:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4360/
-----------------------------------------------------------
(Updated 2012-03-20 22:33:43.076249)
Review request for Flume.
Changes
-------
No Java code has changed.
Updated the build environment and the runtime environment as follows:
1. At runtime, if the hadoop binary can be found on the system, Flume
interrogates it to extract the CLASSPATH and JAVA_LIBRARY_PATH variables,
using tricks kindly shared by Roman in Bigtop. This lets us locate the hadoop
configuration files and the JARs appropriate to the system being accessed. For
better or worse, that's basically the state of the art for Hadoop
compatibility right now.
2. To allow the tricks above to work at runtime, the hadoop artifacts have been
marked as optional in the POM, which means that they will not be included in
the binary distribution. That's fine, because they are only needed if the HDFS
Sink is used, and we jump through hoops to find those artifacts if they're on
the system.
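The runtime lookup in point 1 can be sketched roughly as follows. This is a
hypothetical simplification, not the actual flume-ng code; the real script is
more involved, and only the `hadoop classpath` subcommand is assumed here:

```shell
#!/bin/sh
# Sketch (illustrative only): if a `hadoop` binary is on the PATH, ask it
# for its classpath; otherwise fall back to whatever FLUME_CLASSPATH the
# user already provided.
FLUME_CLASSPATH="${FLUME_CLASSPATH:-}"
if command -v hadoop >/dev/null 2>&1; then
  # `hadoop classpath` prints the jars and conf dirs Hadoop itself uses,
  # so Flume picks up both the client JARs and the cluster configuration.
  FLUME_CLASSPATH="${FLUME_CLASSPATH:+$FLUME_CLASSPATH:}$(hadoop classpath)"
  echo "using hadoop-provided classpath"
else
  echo "hadoop not found on PATH; using FLUME_CLASSPATH as given"
fi
```

The JAVA_LIBRARY_PATH (native libraries) can be recovered from the hadoop
launcher environment in a similar fashion.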
As a result, I am able to compile Flume against the default hadoop version
(0.20.205.0) and run against versions of Hadoop that I didn't explicitly build
against, such as 0.23.x, without a problem. This is a huge improvement over
last week, when I was adding and fixing profiles just to get anything to work
at all, since different hadoop versions generally refuse to talk to each
other.
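The POM change in point 2 amounts to flagging the hadoop dependency as
optional. A sketch of what that looks like in flume-hdfs-sink/pom.xml (the
artifactId and version shown are illustrative; the key line is the
optional flag):

```xml
<!-- Sketch: marking the hadoop artifact optional keeps it out of the
     binary distribution; it is located at runtime instead. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.205.0</version>
  <optional>true</optional>
</dependency>
```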
I also refactored the flume-ng script to duplicate less code and be a bit
friendlier, since I was doing surgery in there anyway.
This is ready for review now.
Summary
-------
This is an initial pass at an implementation of HDFS security. I think it will
probably work. Currently trying to get Kerberos to play nice with the cluster
on my VM though, so I haven't successfully tested it yet. It still works when
used on HDFS with security disabled. :)
The only thing I don't like is that configure() throws a FlumeException when
authentication fails. I'll trace up the call stack and see how bad that would
be, but it seems likely to break something. Just logging the error is kind of
a bummer as well, though; we need to ensure process() doesn't fill up the disk
by spewing copious error messages into the logs. Maybe this is a use case for
some kind of FatalException.
This addresses bug FLUME-1020.
https://issues.apache.org/jira/browse/FLUME-1020
Diffs (updated)
-----
bin/flume-ng 0796a5b
flume-ng-sinks/flume-hdfs-sink/pom.xml 1a35baf
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java da82f7e
Diff: https://reviews.apache.org/r/4360/diff
Testing
-------
Thanks,
Mike
> Implement Kerberos security for HDFS Sink
> -----------------------------------------
>
> Key: FLUME-1020
> URL: https://issues.apache.org/jira/browse/FLUME-1020
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.0.0
> Reporter: Arvind Prabhakar
> Assignee: Mike Percy
>
> Make flume HDFS sink work with secure clusters.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira