Hi,

Thank you for the reply.
My requirement is to set up a Hadoop cluster that uses S3 as the backing store (performance won't be an issue).

My architecture is: Hive has external tables mapped to HBase, HBase stores its data in HDFS, and Hive uses Hadoop to access the HBase table data. Can I make this work using S3?

The HBase regionserver is failing with the error:

Caused by: java.lang.ClassNotFoundException: org.jets3t.service.S3ServiceException

and the HBase master log has lots of "Unexpected response code 404, expected 200" messages.

Do I need to start a DataNode at all when using S3? The DataNode log says:

2012-08-02 17:50:20,021 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = datarpm-desktop/192.168.2.4
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
2012-08-02 17:50:20,145 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-08-02 17:50:20,156 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-08-02 17:50:20,157 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-08-02 17:50:20,157 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-08-02 17:50:20,277 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-08-02 17:50:20,281 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-08-02 17:50:20,317 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-08-02 17:50:22,006 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to <bucket-name>/67.215.65.132:8020 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy5.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:370)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:429)
    at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:331)
    at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:296)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:356)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
2012-08-02 17:50:22,007 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at datarpm-desktop/192.168.2.4
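
In case it matters, this is roughly the hbase-site.xml entry I understand is needed to point HBase at the bucket; the s3:// rootdir value and the /hbase path under it are my own assumptions, not something I have confirmed works:

<property>
  <name>hbase.rootdir</name>
  <value>s3://BUCKET/hbase</value>
</property>

I am also assuming that the jets3t jar (which I believe ships under Hadoop's lib directory) has to be on the HBase classpath, and that this is why the regionserver throws the ClassNotFoundException above, but please correct me if that is wrong.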
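
To rule out a credentials or connectivity problem, I was also planning to run a small standalone check against the bucket using the plain Hadoop FileSystem API. The class name and the s3://BUCKET/ URI below are placeholders of mine; it simply lists the root of the bucket using whatever is configured in core-site.xml:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Standalone sanity check: can this machine reach the S3 bucket through the
// Hadoop S3 FileSystem, independently of the HDFS/HBase daemons?
public class S3Check {
  public static void main(String[] args) throws Exception {
    // Picks up fs.default.name and the fs.s3.* keys from core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("s3://BUCKET/"), conf);

    // List whatever sits at the root of the bucket.
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}

If that listing works but the daemons still fail, I will take it as a classpath/configuration issue rather than a problem with the bucket or the keys.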
Thanks,

On Thu, Aug 2, 2012 at 5:22 PM, Harsh J <ha...@cloudera.com> wrote:
> With S3 you do not need a NameNode. The NameNode is part of HDFS.
>
> On Thu, Aug 2, 2012 at 12:44 PM, Alok Kumar <alok...@gmail.com> wrote:
> > Hi,
> >
> > I followed the setup instructions from this link:
> > http://wiki.apache.org/hadoop/AmazonS3
> >
> > My "core-site.xml" contains only these 3 properties:
> >
> > <property>
> >   <name>fs.default.name</name>
> >   <value>s3://BUCKET</value>
> > </property>
> >
> > <property>
> >   <name>fs.s3.awsAccessKeyId</name>
> >   <value>ID</value>
> > </property>
> >
> > <property>
> >   <name>fs.s3.awsSecretAccessKey</name>
> >   <value>SECRET</value>
> > </property>
> >
> > hdfs-site.xml is empty!
> >
> > The NameNode log says it is trying to connect to the local HDFS, not S3.
> > Am I missing anything?
> >
> > Regards,
> > Alok
>
> --
> Harsh J

--
Alok