When using S3 you do not need to run any component of HDFS at all; S3 is meant to be an alternate filesystem choice. You need to run only MapReduce (MR).
The wiki page at http://wiki.apache.org/hadoop/AmazonS3 describes how to specify your auth details to S3, either directly via the fs.default.name URI or via the additional properties fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey. Does this not work for you?

On Wed, Jan 18, 2012 at 12:14 PM, Mark Kerzner <[email protected]> wrote:
> Well, here is my error message
>
> Starting Hadoop namenode daemon: starting namenode, logging to
> /usr/lib/hadoop-0.20/logs/hadoop-hadoop-namenode-ip-10-126-11-26.out
> ERROR. Could not start Hadoop namenode daemon
> Starting Hadoop secondarynamenode daemon: starting secondarynamenode,
> logging to
> /usr/lib/hadoop-0.20/logs/hadoop-hadoop-secondarynamenode-ip-10-126-11-26.out
> Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI
> for NameNode address (check fs.default.name): s3n://myname.testdata is not
> of scheme 'hdfs'.
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:224)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:209)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:182)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:150)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:624)
> ERROR. Could not start Hadoop secondarynamenode daemon
>
> And, if I don't need to start the NameNode, then where do I give the S3
> credentials?
>
> Thank you,
> Mark
>
>
> On Wed, Jan 18, 2012 at 12:36 AM, Harsh J <[email protected]> wrote:
>
>> Hey Mark,
>>
>> What is the exact trouble you run into? What do the error messages
>> indicate?
>>
>> This should be definitive enough I think:
>> http://wiki.apache.org/hadoop/AmazonS3
>>
>> On Wed, Jan 18, 2012 at 11:55 AM, Mark Kerzner <[email protected]>
>> wrote:
>> > Hi,
>> >
>> > whatever I do, I can't make it work, that is, I cannot use
>> >
>> > s3://host
>> >
>> > or s3n://host
>> >
>> > as a replacement for HDFS while running an EC2 cluster. I change the
>> > settings in core-site.xml and hdfs-site.xml, and start the Hadoop
>> > services, and it fails with error messages.
>> >
>> > Is there a place where this is clearly described?
>> >
>> > Thank you so much.
>> >
>> > Mark
>>
>> --
>> Harsh J
>> Customer Ops. Engineer, Cloudera

--
Harsh J
Customer Ops. Engineer, Cloudera
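[Editor's note: to make the advice above concrete, here is a minimal core-site.xml sketch for pointing Hadoop 0.20 at S3 via the s3n:// native scheme, per the wiki page linked in the thread. The bucket name and credential values are placeholders. For the s3n:// scheme the credential properties are fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey; the fs.s3.* variants mentioned above apply to the s3:// block store scheme.]

```xml
<?xml version="1.0"?>
<!-- core-site.xml: use S3 (native filesystem) as the default FS in place of HDFS.
     "mybucket" and the key values are placeholders, not real credentials. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3n://mybucket</value>
  </property>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```

With this in place, start only the MapReduce daemons (JobTracker/TaskTracker) and skip the HDFS start scripts entirely: the NameNode and SecondaryNameNode refuse any fs.default.name that is not of scheme 'hdfs', which is exactly the IllegalArgumentException in the log above.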
