[
https://issues.apache.org/jira/browse/FLUME-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064029#comment-14064029
]
Rob Slifka edited comment on FLUME-1228 at 7/16/14 8:20 PM:
------------------------------------------------------------
This bit us pretty hard today. We're on Flume 1.4 via CDH 4.4. Events began
backing up on two of our three nodes with the following logs:
{noformat}
16 Jul 2014 19:04:18,366 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200166' -
Unexpected response code 404, expected 200
16 Jul 2014 19:04:18,377 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200166_%24folder%24'
- Unexpected response code 404, expected 200
16 Jul 2014 19:04:18,508 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response '/2014-07-16%2F1900' - Unexpected response code 404, expected 200
16 Jul 2014 19:04:18,518 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response '/2014-07-16%2F1900_%24folder%24' - Unexpected response code 404,
expected 200
16 Jul 2014 19:04:21,632 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor]
(org.apache.flume.sink.hdfs.BucketWriter.open:219) - Creating
s3n://ACCESS_KEY:SECRET@PRODUCTION_BUCKET/2014-07-16/1900/nginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp
16 Jul 2014 19:04:21,734 WARN [hdfs-s3sink-beacons-cold-call-runner-0]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp' -
Unexpected response code 404, expected 200
16 Jul 2014 19:04:21,745 WARN [hdfs-s3sink-beacons-cold-call-runner-0]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp_%24folder%24'
- Unexpected response code 404, expected 200
16 Jul 2014 19:04:21,761 INFO [hdfs-s3sink-beacons-cold-call-runner-0]
(org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.<init>:182)
- OutputStream for key
'2014-07-16/1900/nginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp' writing
to tempfile '/tmp/hadoop-root/s3/output-2622317072117705911.tmp'
16 Jul 2014 19:04:21,761 INFO [hdfs-s3sink-beacons-cold-call-runner-0]
(org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas:185)
- FileSystem's output stream doesn't support getNumCurrentReplicas;
--HDFS-826 not available;
fsOut=org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream;
err=java.lang.NoSuchMethodException:
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.getNumCurrentReplicas()
16 Jul 2014 19:04:39,254 INFO [Log-BackgroundWorker-fileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:214)
- Start checkpoint for /mnt/flume/checkpoint/checkpoint, elements to sync =
41091
16 Jul 2014 19:04:39,261 INFO [Log-BackgroundWorker-fileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:239) -
Updating checkpoint metadata: logWriteOrderID: 1408437260518, queueSize: 600,
queueHead: 705519
16 Jul 2014 19:04:39,272 INFO [Log-BackgroundWorker-fileChannel]
(org.apache.flume.channel.file.Log.writeCheckpoint:1019) - Updated checkpoint
for file: /mnt/flume/data/log-648 position: 1599320032 logWriteOrderID:
1408437260518
16 Jul 2014 19:04:39,347 INFO [Log-BackgroundWorker-hotFileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:214)
- Start checkpoint for /mnt/flume/checkpoint-hot/checkpoint, elements to sync =
40628
16 Jul 2014 19:04:39,358 INFO [Log-BackgroundWorker-hotFileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:239) -
Updating checkpoint metadata: logWriteOrderID: 1408437261136, queueSize:
999972, queueHead: 783009
16 Jul 2014 19:04:39,368 INFO [Log-BackgroundWorker-hotFileChannel]
(org.apache.flume.channel.file.Log.writeCheckpoint:1019) - Updated checkpoint
for file: /mnt/flume/data-hot/log-668 position: 1621862329 logWriteOrderID:
1408437261136
16 Jul 2014 19:04:39,988 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:944) - Roll start /mnt/flume/data-hot
16 Jul 2014 19:04:39,989 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.<init>:188) - Opened
/mnt/flume/data-hot/log-669
16 Jul 2014 19:04:39,993 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.close:327) - Closing
/mnt/flume/data-hot/log-668
16 Jul 2014 19:04:39,993 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:959) - Roll end
16 Jul 2014 19:04:52,026 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:944) - Roll start /mnt/flume/data
16 Jul 2014 19:04:52,026 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.<init>:188) - Opened
/mnt/flume/data/log-649
16 Jul 2014 19:04:52,030 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.close:327) - Closing
/mnt/flume/data/log-648
16 Jul 2014 19:04:52,030 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:959) - Roll end
16 Jul 2014 19:04:57,917 INFO [hdfs-s3sink-beacons-cold-call-runner-0]
(org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.close:217)
- OutputStream for key
'2014-07-16/1900/nginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp' closed.
Now beginning upload
{noformat}
When we restarted Flume, things began draining (slowly) though we're still
seeing these errors. If a restart solves it or if folks like Dennis have
figured it out some other way, I suspect they may not be discussing it. A
definitive position or fix would be awesome! We're happy to help if you need
something verified.
was (Author: rslifka):
How are you evaluating whether or not people use Flume to sync to S3? Is there
usage information being collected? Usage surveys, etc.? If you're not hearing
much then I would suggest it's either not used or working well (enough) :)
With respect to this particular issue, it bit us pretty hard today. We're on
Flume 1.4 via CDH 4.4. Events began backing up on two of our three nodes with
the following logs:
{noformat}
16 Jul 2014 19:04:18,366 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200166' -
Unexpected response code 404, expected 200
16 Jul 2014 19:04:18,377 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200166_%24folder%24'
- Unexpected response code 404, expected 200
16 Jul 2014 19:04:18,508 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response '/2014-07-16%2F1900' - Unexpected response code 404, expected 200
16 Jul 2014 19:04:18,518 WARN [hdfs-s3sink-beacons-cold-call-runner-8]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response '/2014-07-16%2F1900_%24folder%24' - Unexpected response code 404,
expected 200
16 Jul 2014 19:04:21,632 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor]
(org.apache.flume.sink.hdfs.BucketWriter.open:219) - Creating
s3n://ACCESS_KEY:SECRET@PRODUCTION_BUCKET/2014-07-16/1900/nginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp
16 Jul 2014 19:04:21,734 WARN [hdfs-s3sink-beacons-cold-call-runner-0]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp' -
Unexpected response code 404, expected 200
16 Jul 2014 19:04:21,745 WARN [hdfs-s3sink-beacons-cold-call-runner-0]
(org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest:393) -
Response
'/2014-07-16%2F1900%2Fnginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp_%24folder%24'
- Unexpected response code 404, expected 200
16 Jul 2014 19:04:21,761 INFO [hdfs-s3sink-beacons-cold-call-runner-0]
(org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.<init>:182)
- OutputStream for key
'2014-07-16/1900/nginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp' writing
to tempfile '/tmp/hadoop-root/s3/output-2622317072117705911.tmp'
16 Jul 2014 19:04:21,761 INFO [hdfs-s3sink-beacons-cold-call-runner-0]
(org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas:185)
- FileSystem's output stream doesn't support getNumCurrentReplicas;
--HDFS-826 not available;
fsOut=org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream;
err=java.lang.NoSuchMethodException:
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.getNumCurrentReplicas()
16 Jul 2014 19:04:39,254 INFO [Log-BackgroundWorker-fileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:214)
- Start checkpoint for /mnt/flume/checkpoint/checkpoint, elements to sync =
41091
16 Jul 2014 19:04:39,261 INFO [Log-BackgroundWorker-fileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:239) -
Updating checkpoint metadata: logWriteOrderID: 1408437260518, queueSize: 600,
queueHead: 705519
16 Jul 2014 19:04:39,272 INFO [Log-BackgroundWorker-fileChannel]
(org.apache.flume.channel.file.Log.writeCheckpoint:1019) - Updated checkpoint
for file: /mnt/flume/data/log-648 position: 1599320032 logWriteOrderID:
1408437260518
16 Jul 2014 19:04:39,347 INFO [Log-BackgroundWorker-hotFileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:214)
- Start checkpoint for /mnt/flume/checkpoint-hot/checkpoint, elements to sync =
40628
16 Jul 2014 19:04:39,358 INFO [Log-BackgroundWorker-hotFileChannel]
(org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:239) -
Updating checkpoint metadata: logWriteOrderID: 1408437261136, queueSize:
999972, queueHead: 783009
16 Jul 2014 19:04:39,368 INFO [Log-BackgroundWorker-hotFileChannel]
(org.apache.flume.channel.file.Log.writeCheckpoint:1019) - Updated checkpoint
for file: /mnt/flume/data-hot/log-668 position: 1621862329 logWriteOrderID:
1408437261136
16 Jul 2014 19:04:39,988 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:944) - Roll start /mnt/flume/data-hot
16 Jul 2014 19:04:39,989 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.<init>:188) - Opened
/mnt/flume/data-hot/log-669
16 Jul 2014 19:04:39,993 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.close:327) - Closing
/mnt/flume/data-hot/log-668
16 Jul 2014 19:04:39,993 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:959) - Roll end
16 Jul 2014 19:04:52,026 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:944) - Roll start /mnt/flume/data
16 Jul 2014 19:04:52,026 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.<init>:188) - Opened
/mnt/flume/data/log-649
16 Jul 2014 19:04:52,030 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.LogFile$Writer.close:327) - Closing
/mnt/flume/data/log-648
16 Jul 2014 19:04:52,030 INFO [pool-5-thread-1]
(org.apache.flume.channel.file.Log.roll:959) - Roll end
16 Jul 2014 19:04:57,917 INFO [hdfs-s3sink-beacons-cold-call-runner-0]
(org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsOutputStream.close:217)
- OutputStream for key
'2014-07-16/1900/nginx-ip-10-80-54-224.ec2.internal.1405537200167.tmp' closed.
Now beginning upload
{noformat}
> flume-ng fails while writing to S3 sink
> ---------------------------------------
>
> Key: FLUME-1228
> URL: https://issues.apache.org/jira/browse/FLUME-1228
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0, v1.4.0
> Reporter: Prashanth Jonnalagadda
> Assignee: Ashish Paliwal
> Priority: Critical
>
> flume-ng (version 1.2.0) fails while writing to S3 sink since it gets back
> 404 response code. The files with data is created on S3 though.
> Hadoop version used is 0.20.2-cdh3u4
> Followed all the steps documented in the jira -
> https://issues.cloudera.org/browse/FLUME-66
> and also I tried swapping out hadoop-core.jar that comes with CDH, with
> emr-hadoop-core-0.20.jar that comes with EC2 hadoop cluster instance as
> suggested in the following blog post -
> http://eric.lubow.org/2011/system-administration/distributed-flume-setup-with-an-s3-sink/
> but the issue still remains.
> Following errors are seen in the log:
> 2012-05-25 05:04:28,889 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857995.tmp_%24folder%24' - Unexpected
> response code 404, expected 200
> 2012-05-25 05:04:28,964 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857995.tmp' writing to tempfile
> '/tmp/hadoop-root/s3/output-8042215269186280519.tmp'
> 2012-05-25 05:04:28,972 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857995.tmp' closed. Now beginning upload
> 2012-05-25 05:04:29,044 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857995.tmp' upload complete
> 2012-05-25 05:04:29,074 INFO hdfs.BucketWriter: Renaming
> s3n://flume-ng/flumedata/FlumeData.122585423857995.tmp to
> s3n://flume-ng/flumedata/FlumeData.122585423857995
> 2012-05-25 05:04:29,097 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857995' - Unexpected response code 404,
> expected 200
> 2012-05-25 05:04:29,120 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857995_%24folder%24' - Unexpected response
> code 404, expected 200
> 2012-05-25 05:04:29,203 WARN httpclient.RestS3Service: Response '/flumedata'
> - Unexpected response code 404, expected 200
> 2012-05-25 05:04:29,224 WARN httpclient.RestS3Service: Response
> '/flumedata_%24folder%24' - Unexpected response code 404, expected 200
> 2012-05-25 05:04:29,608 INFO hdfs.BucketWriter: Creating
> s3n://flume-ng/flumedata/FlumeData.122585423857996.tmp
> 2012-05-25 05:04:29,720 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857996.tmp' - Unexpected response code 404,
> expected 200
> 2012-05-25 05:04:29,748 WARN httpclient.RestS3Service: Response
> '/flumedata%2FFlumeData.122585423857996.tmp_%24folder%24' - Unexpected
> response code 404, expected 200
> 2012-05-25 05:04:29,791 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857996.tmp' writing to tempfile
> '/tmp/hadoop-root/s3/output-2477068572058013384.tmp'
> 2012-05-25 05:04:29,793 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857996.tmp' closed. Now beginning upload
> 2012-05-25 05:04:29,828 INFO s3native.NativeS3FileSystem: OutputStream for
> key 'flumedata/FlumeData.122585423857996.tmp' upload complete
--
This message was sent by Atlassian JIRA
(v6.2#6252)