Re: S3 Sink in FlumeNG Configuration?

Matthew Moore Fri, 29 Mar 2013 12:43:09 -0700

I am awesome at answering my own questions =\

The problem was the jets3t version: I was using jets3t 0.7.4 instead of the
0.6.1 that ships with Hadoop (jets3t isn't bundled with Flume, so it has to
be added separately).
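
For reference, a minimal sketch of the sink section, assembled from the
settings quoted further down in this thread. The agent, sink, and channel
names are the ones used in the thread; the hdfs.fileType and
hdfs.rollInterval lines are assumptions added for illustration (they are
standard HDFS sink options, but the values are not from the original setup),
and the credentials and bucket name are placeholders.

```properties
# Sketch only: names follow the thread; lines marked "assumed" are illustrative.
agent.sinks = s3Sink
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.channel = recoverableMemoryChannel

# Credentials and bucket are placeholders; the URI form is the one quoted in the thread.
agent.sinks.s3Sink.hdfs.path = s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/

# Assumed: write raw event bodies instead of the default SequenceFile format.
agent.sinks.s3Sink.hdfs.fileType = DataStream

# Assumed: roll to a new file every 300 seconds.
agent.sinks.s3Sink.hdfs.rollInterval = 300
```

Whichever jets3t jar ends up on Flume's classpath should be the version
Hadoop expects (0.6.1 in this case).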


Best,
Matt
--
Matthew Moore
Co-Founder & CTO, CrowdMob Inc.
Mobile: (650) 888-5962

Need to schedule a meeting?  Invite me via Google Calendar!
[email protected]


On Fri, Mar 29, 2013 at 12:15 PM, Matthew Moore <[email protected]> wrote:

> Hey Guys,
>
> I've made a decent amount of progress, and now have the settings correct.
> For completeness, the settings look like this:
>
> agent.sinks.s3Sink.type = hdfs
> agent.sinks.s3Sink.hdfs.path = s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/
>
> You can see the full setup at this gist:
> https://gist.github.com/crowdmatt/5256881
>
>
> However, I've run into the following problem:
>
>
> 2013-03-29 19:05:28,954 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)] process failed
> org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Request Error. HEAD '/FlumeData.1364583927762.tmp' on Host 'mybucket.s3.amazonaws.com' @ 'Fri, 29 Mar 2013 19:05:28 GMT' -- ResponseCode: 404, ResponseStatus: Not Found, RequestId: 00864FE1DCD5AD95, HostId: 68AuSUe/XsP9zUiwe4yqhhDjETjVEnXVuTdZjYKQfj6VBKyACLH++MD1i8xgrEE4
>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:122)
>
>
> Does anyone have any pointers on how I can start debugging?
>
> Best,
> Matt
>
>
> On Fri, Mar 29, 2013 at 8:47 AM, Matthew Moore <[email protected]> wrote:
>
>> Hey,
>>
>> Thanks for the links to the Jiras.  It seems like someone implemented
>> an S3BufferedWriter which might be helpful in the future.
>>
>> However, I'm still not sure how to set up the configuration (flume.conf)
>> to use S3 as a sink. Has anyone done that?
>>
>> Best,
>> Matt
>>
>>
>> On Fri, Mar 29, 2013 at 7:49 AM, Brock Noland <[email protected]> wrote:
>>
>>> Sorry, I don't know much about this, but here are two relevant JIRAs:
>>>
>>> https://issues.apache.org/jira/browse/FLUME-1228
>>> https://issues.apache.org/jira/browse/FLUME-951
>>>
>>>
>>> On Fri, Mar 29, 2013 at 9:44 AM, Matthew Moore <[email protected]> wrote:
>>>
>>>> Hey there,
>>>>
>>>> I know this is a really newbish question, but I'm hoping to get a
>>>> little assistance here so I'm not stuck guess-and-checking.
>>>>
>>>> I'm trying to configure FlumeNG (1.3.1), but I can't figure out how to
>>>> set up the hdfs sink to use the s3 implementations.
>>>>
>>>> I'm keeping track of my progress on this gist I made:
>>>> https://gist.github.com/crowdmatt/5256881
>>>>
>>>> From what I've gathered, I should be using the hdfs type, which I'm
>>>> setting up as such:
>>>>
>>>> agent.sinks = s3Sink
>>>> agent.sinks.s3Sink.type = hdfs
>>>> agent.sinks.s3Sink.channel = recoverableMemoryChannel
>>>>
>>>> ... but that's where I end up hitting my head against the wall.  I know
>>>> I should be specifying my s3 access key, secret, and bucket in this format:
>>>> s3n://ACCESS_KEY_ID:SECRET_ACCESS_KEY@my-hdfs/
>>>>
>>>> However, I don't know where to specify that, or what dot notation to
>>>> use.
>>>>
>>>> Can anyone point me in the right direction?
>>>>
>>>> Best,
>>>> Matt
>>>>
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>>
>>
>>
>
