alex balzer created FLUME-3118:
----------------------------------

             Summary: S3 urls do not find the correct region.
                 Key: FLUME-3118
                 URL: https://issues.apache.org/jira/browse/FLUME-3118
             Project: Flume
          Issue Type: Improvement
          Components: Sinks+Sources
            Reporter: alex balzer


I am trying to write to S3 through the hdfs sink, but I am running into hurdles 
at every turn. I need to push to S3 without using Amazon access/secret keys, 
authenticating through the underlying instance profile instead. I also need to 
add the AWS server-side encryption header for AES256. 
I am using the base path `s3://something.us-east-2.something/else`, 
but when I try it I get this error: 
`<Error><Code>AuthorizationHeaderMalformed</Code><Message>The authorization 
header is malformed; the region 'us-east-1' is wrong; expecting 
'us-east-2'</Message><Region>us-east-2</Region><RequestId>N/A</RequestId><HostId>N/A</HostId></Error>`
 

Here is my Flume config:
```
tier1.sources  = source1
tier1.channels = channel1
tier1.sinks = sink1

tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.source1.zookeeperConnect = localhost:2181
tier1.sources.source1.topic = lynch
# tier1.sources.source1.groupId = flume
tier1.sources.source1.channels = channel1
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = timestamp
tier1.sources.source1.kafka.consumer.timeout.ms = 100

tier1.channels.channel1.type = memory
#tier1.channels.channel1.capacity = 10000
#tier1.channels.channel1.transactionCapacity = 1000

tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.hdfs.path = s3://something.us-east-2.something/else
tier1.sinks.sink1.hdfs.rollInterval = 5
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.channel = channel1
```

Here is the command to run it:
```
bin/flume-ng agent -c . -f kafka-source.conf -n tier1
```
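
For context, the region, credentials, and encryption requirements described above are the kind of settings that normally live on the Hadoop side rather than in the Flume sink config. The fragment below is only a sketch of what that could look like. It assumes the S3A connector (`hadoop-aws`, with an `s3a://` path instead of `s3://`), a `core-site.xml` on the Flume classpath, and a Hadoop version recent enough to recognize these properties. None of this has been verified against the setup in this report.

```xml
<!-- core-site.xml (assumed location: a directory on the Flume agent's classpath) -->
<configuration>
  <!-- Point S3A at the regional endpoint so requests are signed for us-east-2 -->
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.us-east-2.amazonaws.com</value>
  </property>
  <!-- Ask S3 to encrypt objects server-side with AES256 -->
  <property>
    <name>fs.s3a.server-side-encryption-algorithm</name>
    <value>AES256</value>
  </property>
  <!-- Assumption: use the EC2 instance profile instead of access/secret keys;
       this property may not exist in older Hadoop releases -->
  <property>
    <name>fs.s3a.aws.credentials.provider</name>
    <value>com.amazonaws.auth.InstanceProfileCredentialsProvider</value>
  </property>
</configuration>
```

With something like this in place, the sink path would use the `s3a://` scheme followed by the same bucket and prefix. Whether the bundled Hadoop libraries support all three properties depends on the Hadoop version shipped alongside Flume.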

It should not be this difficult to push to S3. Flume needs to support s3:// 
addresses and instance profiles. I have tried many permutations 
to get this to work, and I would really like to see Flume become a friendlier 
tool in these situations.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
