Help with Hadoop/Hbase on s3

2009-08-07 Thread Ananth T. Sarathy
I can't seem to get HBase to run using the Hadoop I have connected to my S3 bucket. Running HBase 0.19.2, Hadoop 0.19.2. hadoop-site.xml configuration:

<property>
  <name>fs.default.name</name>
  <value>s3://hbase</value>
</property>
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ID</value>
</property>
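For reference, an S3-backed hadoop-site.xml normally also needs the secret key alongside the access key id. A minimal sketch with placeholder values (the bucket name and credentials below are not from the thread):

```xml
<!-- Sketch of an S3-backed hadoop-site.xml using the s3:// block
     filesystem; bucket and credentials are placeholders. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3://your-bucket</value>
  </property>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```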

Re: Help with Hadoop/Hbase on s3

2009-08-07 Thread tim robertson
Do you need to add the Amazon S3 toolkit to the HBase classpath directly to use S3 as a store? http://developer.amazonwebservices.com/connect/entry.jspa?externalID=617&categoryID=47 I'm guessing based on the java.lang.NoClassDefFoundError: org/jets3t/service/S3ServiceException. Cheers, Tim
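One common fix for that NoClassDefFoundError is to put the JetS3t jar shipped with Hadoop on HBase's classpath. A sketch, assuming standard install locations (the jar version and paths below are examples, not taken from the thread):

```sh
# Copy the JetS3t library from the Hadoop distribution into HBase's lib dir
# (jar version is an assumption; check what your Hadoop release ships).
cp $HADOOP_HOME/lib/jets3t-0.6.1.jar $HBASE_HOME/lib/

# Alternatively, extend the classpath in conf/hbase-env.sh:
# export HBASE_CLASSPATH=$HADOOP_HOME/lib/jets3t-0.6.1.jar
```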

~ Replacement for MapReduceBase ~

2009-08-07 Thread Naga Vijayapuram
Hello, I am using hadoop-0.20.0. What's the replacement for the deprecated MapReduceBase? Thanks, Naga Vijayapuram

Re: ~ Replacement for MapReduceBase ~

2009-08-07 Thread Naga Vijayapuram
Appears we just need to extend the Mapper class and not use MapReduceBase anymore (in hadoop-0.20.0). If that is not the case, I would like to know the recommended approach in hadoop-0.20.0. Thanks, Naga Vijayapuram On Fri, Aug 7, 2009 at 8:06 AM, Naga Vijayapuram nvija...@gmail.com wrote:
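That matches the 0.20 "new" API: org.apache.hadoop.mapreduce.Mapper is a concrete class with setup/map/cleanup hooks, so there is no MapReduceBase to extend. A minimal word-count-style sketch (class and field names are illustrative; requires the Hadoop 0.20 jars on the classpath):

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// In the 0.20 API you subclass Mapper directly and override map();
// Context replaces the old OutputCollector/Reporter pair.
public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            word.set(token);
            context.write(word, ONE);
        }
    }
}
```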

Re: Help in running hadoop from eclipse

2009-08-07 Thread ashish pareek
Thanks Aron :) I got the external debugging tool working with Eclipse. But I would like to know if I can debug the entire Hadoop code instead of just the NameNode, and if yes, how? Thanks to you all. Regards, Ashish On Fri, Aug 7, 2009 at 9:34 AM, ashish pareek pareek...@gmail.com wrote:
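A common way to debug any Hadoop daemon from Eclipse is to start its JVM with a JDWP debug agent and attach via a "Remote Java Application" debug configuration. A sketch, assuming the stock conf/hadoop-env.sh (the port number is arbitrary, and the per-daemon *_OPTS variable names may differ across releases):

```sh
# In conf/hadoop-env.sh: make the NameNode JVM wait for a remote
# debugger on port 8000 before starting up (suspend=y).
export HADOOP_NAMENODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"

# Other daemons can be attached the same way with their own ports, e.g.:
# export HADOOP_DATANODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8001"
```

Then point an Eclipse Remote Java Application configuration at the daemon's host and port, with the Hadoop source attached to the project.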

Re: Help with Hadoop/Hbase on s3

2009-08-07 Thread Ananth T. Sarathy
Tim, that got me a little further! Thanks... but now I get a different error. hbase-site.xml configuration:

<property>
  <name>hbase.master</name>
  <value>174.129.15.236:6</value>
  <description>The host and port that the HBase master runs at. A value of 'local' runs the master and a

Re: Help with Hadoop/Hbase on s3

2009-08-07 Thread tim robertson
Pointing out the obvious but something somewhere is trying to create a bucket that has already been created. Sorry, but I don't think I can help further - perhaps change s3://testbucket to s3://testbucket2 just to be sure it is not that you have created it in another process by accident? Cheers

Re: How to HA the NameNode?

2009-08-07 Thread Jeff Hammerbacher
As David mentions, you can find more information on one approach to HA for the NN at http://www.cloudera.com/blog/2009/07/22/hadoop-ha-configuration/. On Thu, Jul 30, 2009 at 5:39 AM, David B. Ritch david.ri...@gmail.comwrote: Check out the Cloudera blog (http://www.cloudera.com). They posted

How to redistribute files on HDFS after adding new machines to cluster?

2009-08-07 Thread prashant ullegaddi
Hi, We had a cluster of 9 machines with one name node and 8 data nodes (2 had 220GB of hard disk space, the rest had 450GB). Most of the space on the machines with 220GB disks has been consumed. Now we have added two new machines, each with 450GB of hard disk space, as data nodes. Is there any way to

Re: Help with Hadoop/Hbase on s3

2009-08-07 Thread Ananth T. Sarathy
Thanks for whatever help you could give... when I try something else I get: Fri Aug 7 13:31:34 EDT 2009 Starting master on ip-10-244-131-228 ulimit -n 1024 2009-08-07 13:31:34,829 INFO org.apache.hadoop.hbase.master.HMaster: vmName=Java HotSpot(TM) Client VM, vmVendor=Sun Microsystems Inc.,

Re: How to redistribute files on HDFS after adding new machines to cluster?

2009-08-07 Thread Ravi Phulari
Use Rebalancer http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Rebalancer - Ravi On 8/7/09 10:38 AM, prashant ullegaddi prashullega...@gmail.com wrote: Hi, We had a cluster of 9 machines with one name node, and 8 data nodes (2 had 220GB hard disk space, rest had 450GB).
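The rebalancer is started from the command line; the threshold controls how far a datanode's utilization may deviate from the cluster average before it is considered unbalanced. A sketch (the threshold value shown is the documented default, chosen here for illustration):

```sh
# Kick off rebalancing in the background; -threshold is the allowed
# per-datanode deviation from average cluster utilization, in percent.
bin/start-balancer.sh -threshold 10

# Stop it at any point before it finishes:
bin/stop-balancer.sh
```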

Re: How to redistribute files on HDFS after adding new machines to cluster?

2009-08-07 Thread Ted Dunning
Make sure you rebalance soon after adding the new node. Otherwise, you will have an age bias in file distribution. This can, in some applications, lead to some strange effects. For example, if you have log files that you delete when they get too old, disk space will be freed non-uniformly.

DataNode Drive failure. (Tails from the front lines)

2009-08-07 Thread Edward Capriolo
I have a hadoop 18.3 cluster. Today I got two nagios alerts. Actually, I was excited by one of them because I never had a way to test check_hpacucli. Worked first time NIIICC. (Borat voice.) --- Notification Type: PROBLEM Service: check_remote_datanode Host:

Re: HADOOP-4539 question

2009-08-07 Thread Stas Oskin
Hi. What is the recommended utility for this? Thanks. 2009/8/7 Steve Loughran ste...@apache.org Stas Oskin wrote: Hi. I checked this ticket and I like what I found. I had a question about it, and hoped someone could answer it: If I have a NN and BN, and the NN fails, how the DFS clients

Job Posting: Sales Engineer @ Cloudera

2009-08-07 Thread Christophe Bisciglia
Hadoop Fans, We are looking to hire a sales engineer based in the bay area that can travel as needed. Following is the full job description, but if you know your Hadoop, have experience working with customers, and are excited about helping new enterprise users work with lots of data, please read

Re: Job Posting: Sales Engineer @ Cloudera

2009-08-07 Thread Naga Vijayapuram
Hello, I am resident of bay area and have experience with business development / customer support. You can view my resume at - http://www.sansthal-us.com/naga/vijay.html - Use naga / vijayRes1729 to login. Thanks, Naga Vijayapuram - 650-759-6990 On Fri, Aug 7, 2009 at 1:39 PM, Christophe

Re: DataNode Drive failure. (Tails from the front lines)

2009-08-07 Thread Koji Noguchi
You probably want this. https://issues.apache.org/jira/browse/HDFS-457 Koji As we can see here the data node shutdown. Should one disk entering a Read Only state really shut down the entire datanode? The datanode happily restarts once the disk was unmounted. just wondering? On 8/7/09 12:11

Re: How to redistribute files on HDFS after adding new machines to cluster?

2009-08-07 Thread prashant ullegaddi
Thank you Ravi and Ted. I ran hadoop balancer with the default threshold. It's been running for the last 8 hours! How long does it take, given the following DFS stats: 3140 files and directories, 10295 blocks = 13435 total. Heap Size is 17.88 MB / 963 MB (1%). Capacity: 3.93 TB, DFS Remaining:

Re: How to redistribute files on HDFS after adding new machines to cluster?

2009-08-07 Thread Ted Dunning
I think that I remember that you essentially doubled your storage before starting balancing. This means that about 1 TB will need to be copied. By default the balancer only moves 1MB/s (per node, I believe). This means that it will take a LONG time to balance your cluster. You can increase
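A back-of-envelope estimate of why this takes so long, under the numbers in the thread: roughly 1 TB to move at the default bandwidth cap of about 1 MB/s per datanode. The 10-node figure below is an assumption for illustration, not from the thread:

```java
// Rough balancer running time: bytes to move divided by aggregate
// bandwidth. The default dfs.balance.bandwidthPerSec is ~1 MB/s per
// datanode; node count here is an assumed example value.
public class BalancerEstimate {
    static double hoursToBalance(double bytesToMove, double bytesPerSecPerNode, int nodes) {
        return bytesToMove / (bytesPerSecPerNode * nodes) / 3600.0;
    }

    public static void main(String[] args) {
        double hours = hoursToBalance(1e12, 1 << 20, 10);
        System.out.printf("~%.1f hours%n", hours);  // on the order of a day
    }
}
```

Raising the bandwidth cap (as Ted suggests) shrinks this proportionally, at the cost of more network load during normal operation.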