Hey guys,

I've made a decent amount of progress, and now have the settings correct. For completeness, the settings look like this:
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.hdfs.path = s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/

You can see the full setup at this gist: https://gist.github.com/crowdmatt/5256881

However, I've run into the following problem:

2013-03-29 19:05:28,954 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)] process failed
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Request Error. HEAD '/FlumeData.1364583927762.tmp' on Host 'mybucket.s3.amazonaws.com' @ 'Fri, 29 Mar 2013 19:05:28 GMT' -- ResponseCode: 404, ResponseStatus: Not Found, RequestId: 00864FE1DCD5AD95, HostId: 68AuSUe/XsP9zUiwe4yqhhDjETjVEnXVuTdZjYKQfj6VBKyACLH++MD1i8xgrEE4
        at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:122)

Does anyone have any pointers on how I can start debugging?

Best,
Matt
--
Matthew Moore
Co-Founder & CTO, CrowdMob Inc.
Mobile: (650) 888-5962

Need to schedule a meeting? Invite me via Google Calendar!
[email protected]


On Fri, Mar 29, 2013 at 8:47 AM, Matthew Moore <[email protected]> wrote:

> Hey,
>
> Thanks for the links to the JIRAs. It seems like someone implemented
> an S3BufferedWriter, which might be helpful in the future.
>
> However, I'm still not sure what to set in the configuration (flume.conf) to
> use S3 as a sink. Has anyone done that?
>
> Best,
> Matt
> --
> Matthew Moore
> Co-Founder & CTO, CrowdMob Inc.
> Mobile: (650) 888-5962
>
> Need to schedule a meeting? Invite me via Google Calendar!
> [email protected]
>
>
> On Fri, Mar 29, 2013 at 7:49 AM, Brock Noland <[email protected]> wrote:
>
>> Sorry, I don't know much about this, but here are two relevant JIRAs:
>>
>> https://issues.apache.org/jira/browse/FLUME-1228
>> https://issues.apache.org/jira/browse/FLUME-951
>>
>>
>> On Fri, Mar 29, 2013 at 9:44 AM, Matthew Moore <[email protected]> wrote:
>>
>>> Hey there,
>>>
>>> I know this is a really newbish question, but I'm hoping to get a little
>>> assistance here so I'm not stuck guess-and-checking.
>>>
>>> I'm trying to figure out how to configure Flume NG (1.3.1), but I
>>> couldn't figure out how to set up the HDFS sink to use the S3
>>> implementations.
>>>
>>> I'm keeping track of my progress on this gist I made:
>>> https://gist.github.com/crowdmatt/5256881
>>>
>>> From what I've gathered, I should be using the hdfs type, which I'm
>>> setting up like this:
>>>
>>> agent.sinks = s3Sink
>>> agent.sinks.s3Sink.type = hdfs
>>> agent.sinks.s3Sink.channel = recoverableMemoryChannel
>>>
>>> ... but that's where I end up hitting my head against the wall. I know
>>> I should be specifying my S3 access key, secret, and bucket in this format:
>>> s3n://ACCESS_KEY_ID:SECRET_ACCESS_KEY@my-hdfs/
>>>
>>> However, I don't know where to specify that, or what dot notation to use.
>>>
>>> Can anyone point me in the right direction?
>>>
>>> Best,
>>> Matt
>>> --
>>> Matthew Moore
>>> Co-Founder & CTO, CrowdMob Inc.
>>> Mobile: (650) 888-5962
>>>
>>> Need to schedule a meeting? Invite me via Google Calendar!
>>> [email protected]
>>
>>
>>
>> --
>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
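[Editor's note] For readers hitting the same wall, a fuller sink definition might look like the sketch below. This is an untested sketch assembled from the snippets in the thread, not a verified working config: the bucket path, prefix, and channel name are placeholders, and it uses the s3n:// (native S3 filesystem) scheme mentioned in the original question rather than the block-based s3:// scheme in the later message. The hdfs.* parameters shown (fileType, rollInterval, rollSize, rollCount) are standard Flume 1.3 HDFS-sink settings.

```
# flume.conf sketch -- placeholder names throughout
agent.sinks = s3Sink
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.channel = recoverableMemoryChannel

# s3n:// is Hadoop's native S3 filesystem; s3:// (as in the config at the
# top of the thread) is the older block-based store. The
# Jets3tNativeFileSystemStore frame in the stack trace suggests the
# native store is what's actually in play.
agent.sinks.s3Sink.hdfs.path = s3n://ACCESS_KEY_ID:SECRET_ACCESS_KEY@BUCKET-NAME/flume/

# Write raw events rather than the default SequenceFile, and roll on a
# timer instead of the small default size/count thresholds.
agent.sinks.s3Sink.hdfs.fileType = DataStream
agent.sinks.s3Sink.hdfs.rollInterval = 300
agent.sinks.s3Sink.hdfs.rollSize = 0
agent.sinks.s3Sink.hdfs.rollCount = 0
```

Instead of embedding credentials in the URI (which leaks them into logs and process listings), Hadoop of this era also accepts fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey in core-site.xml on Flume's classpath. As a debugging step, running `hadoop fs -ls s3n://...` against the same path outside Flume is a quick way to confirm the credentials and URI are valid before blaming the sink.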
