I am awesome at answering my own questions =\

The problem was a jets3t version mismatch: I was using jets3t 0.7.4 instead of the 0.6.1 included with Hadoop (jets3t isn't bundled with Flume itself).
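For anyone landing on this thread later, here is a minimal sketch of the sink section, pieced together from the settings quoted further down. The agent, sink, and channel names are just the ones used in this thread, and AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and BUCKET-NAME are placeholders, not real values. This only covers the sink; the complete setup is in the gist linked below.

```properties
# Sketch only: assembled from the settings quoted in this thread.
# Source and channel definitions are omitted (see the gist for the full file).
agent.sinks = s3Sink
agent.sinks.s3Sink.type = hdfs
agent.sinks.s3Sink.channel = recoverableMemoryChannel

# Credentials and bucket are embedded in the path URI (placeholders, not real values).
# The value must sit on the same line as the key.
agent.sinks.s3Sink.hdfs.path = s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/
```

The config on its own was not enough in this case: the jets3t jar visible to Flume also had to be the version Hadoop ships with (0.6.1 here), not a newer one.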
Best,
Matt
--
Matthew Moore
Co-Founder & CTO, CrowdMob Inc.
Mobile: (650) 888-5962

Need to schedule a meeting? Invite me via Google Calendar!
[email protected]


On Fri, Mar 29, 2013 at 12:15 PM, Matthew Moore <[email protected]> wrote:

> Hey Guys,
>
> I've made a decent amount of progress, and now have the settings correct.
> For completeness, the settings look like this:
>
> agent.sinks.s3Sink.type = hdfs
> agent.sinks.s3Sink.hdfs.path = s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@BUCKET-NAME/
>
> You can see the full setup at this gist:
> https://gist.github.com/crowdmatt/5256881
>
>
> However, I've run into the following problem:
>
>
> 2013-03-29 19:05:28,954 (SinkRunner-PollingRunner-DefaultSinkProcessor)
> [ERROR -
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)]
> process failed
> org.apache.hadoop.fs.s3.S3Exception:
> org.jets3t.service.S3ServiceException: Request Error. HEAD
> '/FlumeData.1364583927762.tmp' on Host 'mybucket.s3.amazonaws.com' @
> 'Fri, 29 Mar 2013 19:05:28 GMT' -- ResponseCode: 404, ResponseStatus: Not
> Found, RequestId: 00864FE1DCD5AD95, HostId:
> 68AuSUe/XsP9zUiwe4yqhhDjETjVEnXVuTdZjYKQfj6VBKyACLH++MD1i8xgrEE4
> at
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:122)
>
>
> Does anyone have any pointers on how I can start debugging?
>
> Best,
> Matt
> --
> Matthew Moore
> Co-Founder & CTO, CrowdMob Inc.
> Mobile: (650) 888-5962
>
> Need to schedule a meeting? Invite me via Google Calendar!
> [email protected]
>
>
> On Fri, Mar 29, 2013 at 8:47 AM, Matthew Moore <[email protected]> wrote:
>
>> Hey,
>>
>> Thanks for the links to the Jiras. It seems like someone implemented
>> an S3BufferedWriter which might be helpful in the future.
>>
>> However, I'm still not sure what to set the configuration (flume.conf) to
>> use s3 as a sink? Has anyone done that?
>>
>> Best,
>> Matt
>> --
>> Matthew Moore
>> Co-Founder & CTO, CrowdMob Inc.
>> Mobile: (650) 888-5962
>>
>> Need to schedule a meeting? Invite me via Google Calendar!
>> [email protected]
>>
>>
>> On Fri, Mar 29, 2013 at 7:49 AM, Brock Noland <[email protected]> wrote:
>>
>>> Sorry, I don't know much about this, but here are two relevant JIRA's:
>>>
>>> https://issues.apache.org/jira/browse/FLUME-1228
>>> https://issues.apache.org/jira/browse/FLUME-951
>>>
>>>
>>> On Fri, Mar 29, 2013 at 9:44 AM, Matthew Moore <[email protected]> wrote:
>>>
>>>> Hey there,
>>>>
>>>> I know this is a really newbish question, but I'm hoping to get a
>>>> little assistance here so I'm not stuck guess-and-checking.
>>>>
>>>> I'm trying to figure out how to configure FlumeNG (1.3.1), but I
>>>> couldn't figure out how to setup the hdfs sink to use the s3
>>>> implementations.
>>>>
>>>> I'm keeping track of my progress on this gist I made:
>>>> https://gist.github.com/crowdmatt/5256881
>>>>
>>>> From what I've gathered, I should be using the hdfs type, which I'm
>>>> setting up as such:
>>>>
>>>> agent.sinks = s3Sink
>>>> agent.sinks.s3Sink.type = hdfs
>>>> agent.sinks.s3Sink.channel = recoverableMemoryChannel
>>>>
>>>> ... but that's where I end up hitting my head against the wall. I know
>>>> I should be specifying my s3 access key, secret, and bucket in this format:
>>>> s3n://ACCESS_KEY_ID:SECRET_ACCESS_KEY@my-hdfs/
>>>>
>>>> However, I don't know where to specify that, or what dot notation to
>>>> use.
>>>>
>>>> Can anyone point me in the right direction?
>>>>
>>>> Best,
>>>> Matt
>>>> --
>>>> Matthew Moore
>>>> Co-Founder & CTO, CrowdMob Inc.
>>>> Mobile: (650) 888-5962
>>>>
>>>> Need to schedule a meeting? Invite me via Google Calendar!
>>>> [email protected]
>>>>
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>>
>>
>>
>
