If you just want to write data to HDFS, then Flume might not be the best fit; however, there is a Flume Embedded Agent <https://github.com/apache/flume/blob/trunk/flume-ng-doc/sphinx/FlumeDeveloperGuide.rst#embedded-agent> that will embed Flume into your application. I don't believe it works with the HDFS sink yet, but some tinkering can likely make it work.
- Connor

On Tue, Apr 30, 2013 at 11:00 AM, Chen Song <[email protected]> wrote:
> I am looking at options for Java programs that can write files into HDFS
> with the following requirements.
>
> 1) Transaction support: each file, when being written, is either fully
> written successfully or fails entirely with no partial file blocks
> written.
>
> 2) Compression support/file formats: the compression type or file format
> can be specified when writing contents.
>
> I know how to write data into a file on HDFS by opening an
> FSDataOutputStream, as shown
> here<http://stackoverflow.com/questions/13457934/writing-to-a-file-in-hdfs-in-hadoop>.
> I am just wondering if there are libraries or out-of-the-box solutions
> that provide the support mentioned above.
>
> I stumbled upon Flume, which provides an HDFS sink that supports
> transactions, compression, file rotation, etc. But it doesn't seem to
> provide an API to be used as a library. The features Flume provides are
> tightly coupled to Flume's architectural components (sources, channels,
> and sinks) and don't seem to be usable independently. All I need is the
> HDFS loading part.
>
> Does anyone have good suggestions?
>
> --
> Chen Song
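For what it's worth, the two requirements in the quoted question can be covered without Flume by the usual write-to-temp-then-rename pattern: write (optionally through a compressing stream) to a temporary file, close it, then rename it into place, so readers never see a partially written file. Below is a minimal JDK-only sketch of that pattern on the local filesystem; the class name, file names, and directory are illustrative, and an actual HDFS version would use Hadoop's FileSystem/FSDataOutputStream plus a CompressionCodec in place of java.nio and GZIPOutputStream. This is a sketch of the idea, not a drop-in HDFS implementation.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPOutputStream;

public class AtomicCompressedWrite {

    // Write gzip-compressed bytes to a temp file in the target directory,
    // then atomically rename it to the final name. If anything fails before
    // the rename, the target path never exists, so there is no partial file.
    static void writeAtomically(Path target, byte[] data) throws IOException {
        Path tmp = Files.createTempFile(
                target.getParent(), target.getFileName().toString(), ".tmp");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(tmp))) {
            out.write(data);
        } // try-with-resources closes the stream and flushes the gzip trailer
        // Same-directory rename, so ATOMIC_MOVE is supported on POSIX filesystems.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Path target = dir.resolve("part-00000.gz");
        writeAtomically(target, "hello hdfs".getBytes());
        System.out.println(Files.exists(target)); // prints "true"
    }
}
```

The same structure maps onto HDFS because HDFS's rename is also atomic; compression is swapped in simply by changing which stream wraps the raw output.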
