[
https://issues.apache.org/jira/browse/FLUME-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852128#comment-13852128
]
Dennis Waldron commented on FLUME-1228:
---------------------------------------
I've come across this problem before; the solution was to use a Hadoop version
with jets3t 0.7.1.
In our Flume infrastructure at work we use Flume NG 1.4.0 (agents hosted on EC2
with an HDFS sink writing to S3), supplemented with the following libraries:
{noformat}
-rw-r--r-- 1 root root 298829 Aug 19 14:27 commons-configuration-1.6.jar
-rw-r--r-- 1 root root 279781 Aug 19 14:27 commons-httpclient-3.0.1.jar
-rw-r--r-- 1 root root 3929148 Aug 19 14:27 hadoop-core-1.0.4.jar
-rw-r--r-- 1 root root 377780 Aug 19 14:27 jets3t-0.7.1.jar
{noformat}
We've also included hadoop-gpl-compression-0.2.0-dev.jar and the Hadoop native
extensions (libgplcompression.so, libhadoop.so and libsnappy.so) to support
compression. All libraries can be downloaded from
http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.0.4, except
hadoop-gpl-compression, which can be compiled from source
(https://code.google.com/a/apache-extras.org/p/hadoop-gpl-compression/).
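
As a quick sanity check, the jar listing above can be compared against what is actually in the agent's lib directory. The following is only a minimal sketch: the function names are hypothetical, the directory is whatever you pass in, and the idea that a second jets3t jar on the classpath could shadow 0.7.1 is an assumption rather than something confirmed in this issue.

```python
from pathlib import Path

# Jar names copied from the listing above.
REQUIRED_JARS = [
    "commons-configuration-1.6.jar",
    "commons-httpclient-3.0.1.jar",
    "hadoop-core-1.0.4.jar",
    "jets3t-0.7.1.jar",
]


def missing_jars(lib_dir, required=REQUIRED_JARS):
    """Return the required jar names that are not present in lib_dir."""
    present = {p.name for p in Path(lib_dir).glob("*.jar")}
    return [name for name in required if name not in present]


def conflicting_jets3t(lib_dir, wanted="jets3t-0.7.1.jar"):
    """Return other jets3t jars in lib_dir (assumed able to shadow the wanted one)."""
    return sorted(
        p.name for p in Path(lib_dir).glob("jets3t-*.jar") if p.name != wanted
    )
```

Calling missing_jars() and conflicting_jets3t() with the path of the Flume lib directory should both return an empty list when the setup matches the listing.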
> flume-ng fails while writing to S3 sink
> ---------------------------------------
>
> Key: FLUME-1228
> URL: https://issues.apache.org/jira/browse/FLUME-1228
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0, v1.4.0
> Reporter: Prashanth Jonnalagadda
> Assignee: Ashish Paliwal
> Priority: Critical
>
> flume-ng (version 1.2.0) fails while writing to an S3 sink because it gets
> back a 404 response code. The files with data are created on S3, though.
> The Hadoop version used is 0.20.2-cdh3u4.
> I followed all the steps documented in the JIRA issue
> https://issues.cloudera.org/browse/FLUME-66
> and also tried swapping the hadoop-core.jar that comes with CDH for the
> emr-hadoop-core-0.20.jar that comes with an EC2 Hadoop cluster instance, as
> suggested in the following blog post:
> http://eric.lubow.org/2011/system-administration/distributed-flume-setup-with-an-s3-sink/
> but the issue remains.
> The following errors are seen in the log:
> 2012-05-25 05:04:28,889 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857995.tmp_%24folder%24' - Unexpected
> response code 404, expected 200
> 2012-05-25 05:04:28,964 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857995.tmp' writing to tempfile
> '/tmp/hadoop-root/s3/output-8042215269186280519.tmp'
> 2012-05-25 05:04:28,972 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857995.tmp' closed. Now beginning upload
> 2012-05-25 05:04:29,044 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857995.tmp' upload complete
> 2012-05-25 05:04:29,074 INFO hdfs.BucketWriter: Renaming
> s3n://flume-ng/flumedata/FlumeData.122585423857995.tmp to
> s3n://flume-ng/flumedata/FlumeData.122585423857995
> 2012-05-25 05:04:29,097 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857995' - Unexpected response code 404,
> expected 200
> 2012-05-25 05:04:29,120 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857995_%24folder%24' - Unexpected response
> code 404, expected 200
> 2012-05-25 05:04:29,203 WARN httpclient.RestS3Service: Response '/flumedata'
> - Unexpected response code 404, expected 200
> 2012-05-25 05:04:29,224 WARN httpclient.RestS3Service: Response
> '/flumedata_%24folder%24' - Unexpected response code 404, expected 200
> 2012-05-25 05:04:29,608 INFO hdfs.BucketWriter: Creating
> s3n://flume-ng/flumedata/FlumeData.122585423857996.tmp
> 2012-05-25 05:04:29,720 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857996.tmp' - Unexpected response code 404,
> expected 200
> 2012-05-25 05:04:29,748 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857996.tmp_%24folder%24' - Unexpected
> response code 404, expected 200
> 2012-05-25 05:04:29,791 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857996.tmp' writing to tempfile
> '/tmp/hadoop-root/s3/output-2477068572058013384.tmp'
> 2012-05-25 05:04:29,793 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857996.tmp' closed. Now beginning upload
> 2012-05-25 05:04:29,828 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857996.tmp' upload complete
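
The keys in the RestS3Service warnings quoted above are percent-encoded and wrapped across lines, which makes them hard to read. A small sketch for pulling them out of a pasted log and decoding them follows; the function name and regular expression are illustrative assumptions, written against the log format shown in this issue only.

```python
import re
from urllib.parse import unquote

# Matches one RestS3Service warning once wrapped lines have been rejoined.
WARN_RE = re.compile(
    r"WARN httpclient\.RestS3Service: Response\s+'(?P<key>[^']+)'\s+-\s+"
    r"Unexpected\s+response\s+code\s+(?P<code>\d+)"
)


def unexpected_responses(log_text):
    """Return (decoded_key, status_code) for each RestS3Service warning."""
    # Drop the "> " quote prefix, then rejoin lines wrapped by the mailer.
    lines = [re.sub(r"^>\s?", "", line) for line in log_text.splitlines()]
    joined = " ".join(lines)
    return [
        (unquote(m.group("key")), int(m.group("code")))
        for m in WARN_RE.finditer(joined)
    ]
```

For the first warning in the log, this turns '/flumedata%2FFlumeData.122585423857995.tmp_%24folder%24' into '/flumedata/FlumeData.122585423857995.tmp_$folder$' with status code 404.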