Re: what's going on :( ?

2009-02-09 Thread Amar Kamat
Mark Kerzner wrote: Hi, why is hadoop suddenly telling me "Retrying connect to server: localhost/127.0.0.1:8020" with this configuration: fs.default.name hdfs://localhost:9000, mapred.job.tracker localhost:9001. Shouldn't this be hdfs://localhost:9001? Amar
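
The retry to port 8020 (the default NameNode port) suggests the hdfs://localhost:9000 value in fs.default.name is not the one actually being applied. For reference, a minimal hadoop-site.xml sketch with the ports from this thread (0.18/0.19-era property names; illustrative only, not a recommended layout):

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>  <!-- NameNode RPC address, with hdfs:// scheme -->
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>         <!-- JobTracker address: host:port only, no scheme -->
      </property>
    </configuration>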

what's going on :( ?

2009-02-09 Thread Mark Kerzner
Hi, why is hadoop suddenly telling me "Retrying connect to server: localhost/127.0.0.1:8020" with this configuration: fs.default.name hdfs://localhost:9000, mapred.job.tracker localhost:9001, dfs.replication 1, and both this http://localhost:50070/dfs

Re: TaskTrackers being double counted after restart job recovery

2009-02-09 Thread Amar Kamat
Stefan Will wrote: Hi, I'm using the new persistent job state feature in 0.19.0, and it's worked really well so far. However, this morning my JobTracker died with an OOM error (even though the heap size is set to 768M). So I killed it and all the TaskTrackers. Any specific reason why you kille

Re: lost TaskTrackers

2009-02-09 Thread Vadim Zaliva
I am starting to wonder if hadoop 19 is stable enough for production? Vadim On 2/9/09, Vadim Zaliva wrote: > yes, I can access DFS from the cluster. namenode status seems to be OK > and I see no errors in namenode log files. > > initially all trackers were visible, and 9433 maps completed > succes

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
I tried modifying the settings, and I'm still running into the same issue. I increased the xceivers count (fs.datanode.max.xcievers) in the hadoop-site.xml file. I also checked to make sure the file handles were increased, but they were fairly high to begin with. I don't think I'm dealing

Re: using HDFS for a distributed storage system

2009-02-09 Thread Mark Kerzner
It is a good and useful overview, thank you. It also mentions Stuart Sierra's post, where Stuart mentions that the process is slow. Does anybody know why? I have written code to write from the PC file system to HDFS, and I also noticed that it is very slow. Instead of 40M/sec, as promised by the To

Re: using HDFS for a distributed storage system

2009-02-09 Thread Jeff Hammerbacher
Yo, I don't want to sound all spammy, but Tom White wrote a pretty nice blog post about small files in HDFS recently that you might find helpful. The post covers some potential solutions, including Hadoop Archives: http://www.cloudera.com/blog/2009/02/02/the-small-files-problem. Later, Jeff On M
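
For reference, a Hadoop Archive is built with the archive tool; exact flags vary a little between releases, but usage is roughly the following (the paths and archive name are made up for illustration):

    hadoop archive -archiveName small-files.har -p /user/mark/input /user/mark/archived
    hadoop fs -ls har:///user/mark/archived/small-files.har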

Re: using HDFS for a distributed storage system

2009-02-09 Thread lohit
> I am planning to add the individual files initially, and after a while (let's > say 2 days after insertion) will make a SequenceFile out of each directory > (I am currently looking into SequenceFile) and delete the previous files of > that directory from HDFS. That way in future, I can access any
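
A rough sketch of the "one SequenceFile per directory" idea with the 0.18/0.19-era API: file names become keys and raw file contents become values. The class name and paths are illustrative, not part of the thread.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackDirectory {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path dir = new Path(args[0]);   // directory of small files in HDFS
        Path out = new Path(args[1]);   // destination SequenceFile
        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, out, Text.class, BytesWritable.class);
        try {
          for (FileStatus stat : fs.listStatus(dir)) {
            byte[] data = new byte[(int) stat.getLen()];
            FSDataInputStream in = fs.open(stat.getPath());
            try {
              in.readFully(data);       // files are small, so reading each one whole is fine
            } finally {
              in.close();
            }
            // file name as key, file contents as value
            writer.append(new Text(stat.getPath().getName()), new BytesWritable(data));
          }
        } finally {
          writer.close();
        }
      }
    }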

Re: copyFromLocal *

2009-02-09 Thread lohit
Which version of hadoop are you using? I think from 0.18 or 0.19 copyFromLocal accepts multiple files as input, but the destination should be a directory. Lohit - Original Message From: S D To: Hadoop Mailing List Sent: Monday, February 9, 2009 3:34:22 PM Subject: copyFromLocal * I'm us
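
For what it's worth, the behaviour lohit describes looks roughly like this (paths are made up; the local shell expands the wildcard into multiple arguments, and the HDFS destination must be an existing directory):

    hadoop fs -mkdir /user/sd/incoming
    hadoop fs -copyFromLocal /data/logs/*.log /user/sd/incoming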

Re: Backing up HDFS?

2009-02-09 Thread lohit
We copy over selected files from HDFS to KFS and use an instance of KFS as backup file system. We use distcp to take backup. Lohit - Original Message From: Allen Wittenauer To: core-user@hadoop.apache.org Sent: Monday, February 9, 2009 5:22:38 PM Subject: Re: Backing up HDFS? On 2/9/
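
A distcp-based backup along the lines lohit describes might look like the following (the cluster addresses and paths are made up for illustration):

    # copy a directory tree from the production HDFS instance to a backup KFS instance
    hadoop distcp hdfs://prod-namenode:9000/data/important kfs://backup-metaserver:20000/backups/important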

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Brian Bockelman
On Feb 9, 2009, at 7:50 PM, jason hadoop wrote: The other issue you may run into, with many files in your HDFS is that you may end up with more than a few 100k worth of blocks on each of your datanodes. At present this can lead to instability due to the way the periodic block reports to the n

Hadoop Workshop for College Teaching Faculty

2009-02-09 Thread Christophe Bisciglia
Hey Hadoop Fans, I wanted to call your attention to an event we're putting on next month that would be great for your academic contacts. Please take a moment and forward this to any faculty you think might be interested. http://www.cloudera.com/sigcse-2009-disc-workshop One of the big challenges

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread jason hadoop
The other issue you may run into, with many files in your HDFS is that you may end up with more than a few 100k worth of blocks on each of your datanodes. At present this can lead to instability due to the way the periodic block reports to the namenode are handled. The more blocks per datanode, the

Re: Backing up HDFS?

2009-02-09 Thread Jeff Hammerbacher
Hey, There's also a ticket open to enable global snapshots for a single HDFS instance: https://issues.apache.org/jira/browse/HADOOP-3637. While this doesn't solve the multi-site backup issue, it does provide stronger protection against programmatic deletion of data in a single cluster. Regards, J

Re: Backing up HDFS?

2009-02-09 Thread Allen Wittenauer
On 2/9/09 4:41 PM, "Amandeep Khurana" wrote: > Why would you want to have another backup beyond HDFS? HDFS itself > replicates your data so the reliability of the system shouldn't be a > concern (if at all it is)... I'm reminded of a previous job where a site administrator refused to make tape

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Bryan Duxbury
Correct. +1 to Jason's more unix file handles suggestion. That's a must-have. -Bryan On Feb 9, 2009, at 3:09 PM, Scott Whitecross wrote: This would be an addition to the hadoop-site.xml file, to up dfs.datanode.max.xcievers? Thanks. On Feb 9, 2009, at 5:54 PM, Bryan Duxbury wrote: Smal

Re: Backing up HDFS?

2009-02-09 Thread Brian Bockelman
On Feb 9, 2009, at 6:41 PM, Amandeep Khurana wrote: Why would you want to have another backup beyond HDFS? HDFS itself replicates your data so the reliability of the system shouldn't be a concern (if at all it is)... It should be. HDFS is not an archival system. Multiple replicas does

Re: Backing up HDFS?

2009-02-09 Thread Nathan Marz
Replication only protects against single node failure. If there's a fire and we lose the whole cluster, replication doesn't help. Or if there's human error and someone accidentally deletes data, then it's deleted from all the replicas. We want our backups to protect against all these scenar

Re: Backing up HDFS?

2009-02-09 Thread Amandeep Khurana
Why would you want to have another backup beyond HDFS? HDFS itself replicates your data, so the reliability of the system shouldn't be a concern (if at all it is)... Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Mon, Feb 9, 2009 at 4:17 PM

Backing up HDFS?

2009-02-09 Thread Nathan Marz
How do people back up their data that they keep on HDFS? We have many TB of data which we need to get backed up but are unclear on how to do this efficiently/reliably.

Re: using HDFS for a distributed storage system

2009-02-09 Thread Brian Bockelman
Hey Amit, That plan sounds much better. I think you will find the system much more scalable. From our experience, it takes a while to get the right amount of monitoring and infrastructure in place to have a very dependable system with 2 replicas. I would recommend using 3 replicas until

copyFromLocal *

2009-02-09 Thread S D
I'm using the Hadoop FS shell to move files into my data store (either HDFS or S3Native). I'd like to use wildcards with copyFromLocal but this doesn't seem to work. Is there any way I can get that kind of functionality? Thanks, John

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
This would be an addition to the hadoop-site.xml file, to up dfs.datanode.max.xcievers? Thanks. On Feb 9, 2009, at 5:54 PM, Bryan Duxbury wrote: Small files are bad for hadoop. You should avoid keeping a lot of small files if possible. That said, that error is something I've seen a lot.

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Bryan Duxbury
Small files are bad for hadoop. You should avoid keeping a lot of small files if possible. That said, that error is something I've seen a lot. It usually happens when the number of xcievers hasn't been adjusted upwards from the default of 256. We run with 8000 xcievers, and that seems to
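
The setting under discussion lives in hadoop-site.xml on the datanodes; note the property name really is spelled "xcievers". 8000 is the value mentioned above, not a universal recommendation:

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>8000</value>
    </property>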

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread jason hadoop
You will have to increase the per user file descriptor limit. For most linux machines the file /etc/security/limits.conf controls this on a per user basis. You will need to log in a fresh shell session after making the changes, to see them. Any login shells started before the change and process sta
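
A sketch of the limits.conf change being described, assuming the daemons run as a user named "hadoop" (the user name and the limit are examples only):

    # /etc/security/limits.conf: raise the open-file limit for the Hadoop user
    hadoop  soft  nofile  16384
    hadoop  hard  nofile  16384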

Re: using HDFS for a distributed storage system

2009-02-09 Thread Amit Chandel
Thanks Brian for your inputs. I am eventually targeting to store 200k directories each containing 75 files on avg, with average size of directory being 300MB (ranging from 50MB to 650MB) in this storage system. It will mostly be an archival storage from where I should be able to access any of th

java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
Hi all - I've been running into this error the past few days: java.io.IOException: Could not get block locations. Aborting... at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient

Re: can't read the SequenceFile correctly

2009-02-09 Thread Raghu Angadi
+1 on something like getValidBytes(). Just the existence of this would warn many programmers about getBytes(). Raghu. Owen O'Malley wrote: On Feb 6, 2009, at 8:52 AM, Bhupesh Bansal wrote: Hey Tom, I also got burned by this ?? Why does BytesWritable.getBytes() return non-valid bytes ??

Re: TaskTrackers being double counted after restart job recovery

2009-02-09 Thread Owen O'Malley
There is a bug that when we restart the TaskTrackers they get counted twice. The problem is the name is generated from the hostname and port number. When TaskTrackers restart they get a new port number and get counted again. The problem goes away when the old TaskTrackers time out in 10 minutes or

TaskTrackers being double counted after restart job recovery

2009-02-09 Thread Stefan Will
Hi, I'm using the new persistent job state feature in 0.19.0, and it's worked really well so far. However, this morning my JobTracker died with an OOM error (even though the heap size is set to 768M). So I killed it and all the TaskTrackers. After starting everything up again, all my nodes were s
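
For context, the job recovery feature referred to here is switched on in 0.19 with a property along these lines (name as I recall it; check your release's hadoop-default.xml):

    <property>
      <name>mapred.jobtracker.restart.recover</name>
      <value>true</value>
    </property>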

Re: using HDFS for a distributed storage system

2009-02-09 Thread Brian Bockelman
Hey Amit, Your current thoughts on keeping block size larger and removing the very small files are along the right line. Why not choose the default size of 64MB or larger? You don't seem too concerned about the number of replicas. However, you're still fighting against the tide. You've

Using the Open Source Hadoop to Generate Data-Intensive Insights

2009-02-09 Thread Bonesata
Wednesday Feb 11, Mountain View, CA info/registration: http://www.meetup.com/CIO-IT-Executives/calendar/9528874/ Speaker: Rob Weltman has been Director of Engineering in Enterprise Software at Netscape, Chief Architect at AOL, and Director of Engineering for Yahoo's data warehouse technology. He

Re: Reduce won't start until Map stage reaches 100%?

2009-02-09 Thread Arun C Murthy
On Feb 8, 2009, at 11:26 PM, Taeho Kang wrote: Dear All, With Hadoop 0.19.0, the Reduce stage does not start until the Map stage gets to 100% completion. Has anyone faced a similar situation? How many maps and reduces does your job have? Arun

Re: Reduce won't start until Map stage reaches 100%?

2009-02-09 Thread Matei Zaharia
I believe that in Hadoop 0.19, scheduling was changed so that reduces don't start until 5% of maps have completed. The reasoning for this is that reduces can't do anything until there is some map output to copy over the network. So, if your job has very few map tasks, you won't see reduces start un
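
In releases that expose it, the threshold Matei describes is controlled by a property along these lines (0.05 corresponds to the 5% default; verify the name against your version's documentation):

    <property>
      <name>mapred.reduce.slowstart.completed.maps</name>
      <value>0.05</value>
    </property>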

Re: can't read the SequenceFile correctly

2009-02-09 Thread Owen O'Malley
On Feb 6, 2009, at 8:52 AM, Bhupesh Bansal wrote: Hey Tom, I also got burned by this ?? Why does BytesWritable.getBytes() return non-valid bytes ?? Or we should add a BytesWritable.getValidBytes() kind of function. It does it because continually resizing the array to the "valid" length
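
The usual workaround, sketched out: only the first getLength() bytes of the array returned by getBytes() are meaningful, so copy that prefix out explicitly.

    import java.util.Arrays;
    import org.apache.hadoop.io.BytesWritable;

    public class ValidBytes {
      // The backing array is grown geometrically, so it is usually longer than
      // the logical contents; only the first getLength() bytes are valid.
      public static byte[] validBytes(BytesWritable w) {
        return Arrays.copyOf(w.getBytes(), w.getLength());
      }
    }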

Re: only one reducer running in a hadoop cluster

2009-02-09 Thread Nick Cen
Thanks everyone. I found the solution for this one: in my main method, I call setNumReduceTasks() on JobConf with the value I want. 2009/2/9 Owen O'Malley > > On Feb 7, 2009, at 11:52 PM, Nick Cen wrote: > > Hi, >> >> I have a hadoop cluster with 4 PCs. And I want to integrate hadoop and >> l
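
Roughly what that fix looks like with the old JobConf API (class name and task count are illustrative; input/output setup omitted):

    import java.io.IOException;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class IndexDriver {
      public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(IndexDriver.class);
        conf.setJobName("index");
        // the default is a single reduce task, which is why only one reducer
        // ran; raise it so the whole cluster is used
        conf.setNumReduceTasks(4);
        // ... set input/output paths, mapper and reducer classes here ...
        JobClient.runJob(conf);
      }
    }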

Re: lost TaskTrackers

2009-02-09 Thread Vadim Zaliva
yes, I can access DFS from the cluster. namenode status seems to be OK and I see no errors in namenode log files. initially all trackers were visible, and 9433 maps completed successfully. Then this was followed by 65975 that were killed. In the logs they all show the same error: Error initializing atte

RE: Reduce won't start until Map stage reaches 100%?

2009-02-09 Thread zhuweimin
Hi, I think the number of reduce tasks in your job is 1, because if the number of reduce tasks is 1 then the reduce stage does not start until the map stage is 100% complete. zhuweimin -Original Message- From: Taeho Kang [mailto:tka...@gmail.com] Sent: Monday, February 09, 2009 4:26 PM To: hadoop-u..

Re: only one reducer running in a hadoop cluster

2009-02-09 Thread Owen O'Malley
On Feb 7, 2009, at 11:52 PM, Nick Cen wrote: Hi, I have a hadoop cluster with 4 PCs. And I want to integrate hadoop and lucene together, so I copied some of the source code from nutch's Indexer class, but when I run my job, I found that there is only 1 reducer running on 1 PC, so the perfor