Finding the input file of a failed map task
In the JobTracker web UI, when I click on a JobId, there is a listing of completed maps and killed maps. When I click on the number under the Completed or Killed column, there is a table with the following columns: Task, Complete, Status, Start Time, Finish Time, Errors. The Status column is blank for failed tasks, while for completed tasks it lists the actual input file/block on which the map was executed. This is the exact information that I'm looking for in the case of a failed job. Our jobs run on numerous files, and sometimes some input files are corrupt. If a failed map task could also show me which input file it was working on, I could quickly remove that corrupt input file and rerun the job. Please let me know if this information can be obtained in any other way. Thanks Regards Sandhya
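[Editor's note: one workaround, sketched here with the old 0.18/0.19 mapred API, is to have each mapper report its own input file; the class name below is made up, and it assumes a FileInputFormat-based job so the framework sets the standard map.input.file property.]

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class InputFileLoggingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private String inputFile;

  @Override
  public void configure(JobConf conf) {
    // Set by the framework for FileInputFormat-based jobs.
    inputFile = conf.get("map.input.file");
    // Ends up in the attempt's stderr log, even if the attempt later fails.
    System.err.println("Processing input file: " + inputFile);
  }

  public void map(LongWritable key, Text value,
      OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    // Visible in the task's Status column on the web UI while the task runs.
    reporter.setStatus("file: " + inputFile);
    // ... actual map logic here ...
  }
}

With something like this in place, the stderr log and status line of a failed attempt should name the file it was reading, which can then be removed before rerunning the job.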
IO Exception in Map Tasks
Hi, In one of the map tasks, I get the following exception: java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) What could be the reason? Thanks, Raakhi
Re: Storing data-node content to other machine
There is no requirement that your HDFS and MapReduce clusters share an installation directory; it is just done that way because it is simple and most people have a datanode and tasktracker on each slave node. Simply have 2 configuration directories on your cluster machines, use the bin/start-dfs.sh script in one and the bin/start-mapred.sh script in the other, and maintain different slaves files in the two directories. You will lose the benefit of data locality for tasktrackers which do not reside on the datanode machines. On Sun, Apr 26, 2009 at 10:06 PM, Vishal Ghawate vishal_ghaw...@persistent.co.in wrote: Hi, I want to store the contents of all the client machines (datanodes) of the hadoop cluster on a centralized machine with high storage capacity, so that the tasktracker will be on the client machine but the contents are stored on the centralized machine. Can anybody help me with this please. -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
Re: IO Exception in Map Tasks
The jvm had a hard failure and crashed On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani rakhi.khatw...@gmail.comwrote: Hi, In one of the map tasks, i get the following exception: java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) what could be the reason? Thanks, Raakhi -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
Re: IO Exception in Map Tasks
Thanks Jason, is there any way we can avoid this exception?? Thanks, Raakhi On Mon, Apr 27, 2009 at 1:20 PM, jason hadoop jason.had...@gmail.comwrote: The jvm had a hard failure and crashed On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani rakhi.khatw...@gmail.comwrote: Hi, In one of the map tasks, i get the following exception: java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) what could be the reason? Thanks, Raakhi -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
Balancing datanodes - Running hadoop 0.18.3
Hi, I had sent out an email yesterday asking how to balance the cluster after setting the replication level to 2. I have 4 datanodes and one namenode in my setup. Using the -R switch with -setrep did the trick, but one of my nodes became under-utilized. I then ran hadoop balancer and it did help, but only up to a certain extent. Datanode 4, noted below, is now up to almost 5%, but when I try to balance again using the hadoop balancer command it says that the cluster is already balanced, which it isn't. I wonder if there is an alternative way, or maybe over time Datanode 4 will pick up more blocks? Any clues? Thanks, Usman Name: 1 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 35858599(206.97 GB) Used raw bytes: 48140136448 (44.83 GB) % used: 16.39% Last contact: Mon Apr 27 08:34:46 UTC 2009 Name: 2 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 231235100994(215.35 GB) Used raw bytes: 40704245760 (37.91 GB) % used: 13.86% Last contact: Mon Apr 27 08:34:45 UTC 2009 Name: 3 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 211936026161(197.38 GB) Used raw bytes: 59591700480 (55.5 GB) % used: 20.28% Last contact: Mon Apr 27 08:34:45 UTC 2009 Name: 4 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 258876991693(241.1 GB) Used raw bytes: 12142653440 (11.31 GB) % used: 4.13% Last contact: Mon Apr 27 08:34:46 UTC 2009
write a large file to HDFS?
hi, If I write a large file to HDFS, will it be split into blocks, with multiple blocks written to HDFS at the same time? Or can HDFS only write block by block? Thanks.
Blocks replication in downtime event
Hi. I have a question: If I have N DataNodes, and one or several of the nodes become unavailable, will HDFS re-synchronize the blocks automatically, according to the replication level set? And if yes, when? As soon as the offline node is detected, or only on file access? Regards.
Re: Balancing datanodes - Running hadoop 0.18.3
Hi, The balancer works with the average utilization of all the nodes in the cluster - in your case it's about 13%. Only nodes that are more than +/- 10% off the average will be rebalanced. Node 4 isn't considered under-utilized because the lower bound is 13% - 10% = 3%, which is below Node 4's 4.13% usage. You can use a different threshold than the default 10% (hadoop balancer -threshold 5). Read more here: http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer Tamir On Mon, Apr 27, 2009 at 11:36 AM, Usman Waheed usm...@opera.com wrote: Hi, I had sent out an email yesterday asking about how to balance the cluster after setting the replication level to 2. I have 4 datanodes and one namenode in my setup. Using the -R switch with -setrep did the trick but one of my nodes became under utilized. I then ran hadoop balancer and it did help but upto a certain extent. Datanode 4 noted below is now up to almost 5% but when i try to balance the datanode again using the hadoop balance command it says that the cluster is already balanced which isnt. I wonder if there is an alternate way(s) or maybe overtime Datanode-4 will pick up more blocks? Any clues? Thanks, Usman Name: 1 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 35858599(206.97 GB) Used raw bytes: 48140136448 (44.83 GB) % used: 16.39% Last contact: Mon Apr 27 08:34:46 UTC 2009 Name: 2 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 231235100994(215.35 GB) Used raw bytes: 40704245760 (37.91 GB) % used: 13.86% Last contact: Mon Apr 27 08:34:45 UTC 2009 Name: 3 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 211936026161(197.38 GB) Used raw bytes: 59591700480 (55.5 GB) % used: 20.28% Last contact: Mon Apr 27 08:34:45 UTC 2009 Name: 4 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 258876991693(241.1 GB) Used raw bytes: 12142653440 (11.31 GB) % used: 4.13% Last contact: Mon Apr 27 08:34:46 UTC 2009
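[Editor's note: to make the arithmetic concrete, here is a small sketch in plain Java (not the balancer's actual code) of the rule as described above: a node is only a rebalancing candidate when its utilization differs from the cluster average by more than the threshold.]

public class BalancerThresholdCheck {
  public static void main(String[] args) {
    double[] used = {16.39, 13.86, 20.28, 4.13};   // % used from the report above
    double threshold = 10.0;                        // balancer default
    double avg = 0;
    for (double u : used) avg += u;
    avg /= used.length;                             // about 13.7%
    for (double u : used) {
      boolean candidate = Math.abs(u - avg) > threshold;
      System.out.printf("%.2f%% used, avg %.2f%% -> %s%n",
          u, avg, candidate ? "rebalancing candidate" : "considered balanced");
    }
  }
}

With the numbers from the report above, every node is within 10 points of the roughly 13.7% average (Node 4 is about 9.5 points below it), which is why the default run reports the cluster as already balanced; a lower threshold such as 5 would flag Node 4 as under-utilized.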
Re: Balancing datanodes - Running hadoop 0.18.3
Hi Tamir, Thanks for the info, makes sense now :). Cheers, Usman Hi, The balancer works with the average utilization of all the nodes in the cluster - in your case it's about 13%. Only nodes that are +/- 10% off the average will be rebalanced. Node 4 isn't under-utilized because 13-10=3 which is less than 4%. You can use a different threshold than the default 10% (hadoop balancer -threshold 5). Read more here: http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer Tamir On Mon, Apr 27, 2009 at 11:36 AM, Usman Waheed usm...@opera.com wrote: Hi, I had sent out an email yesterday asking about how to balance the cluster after setting the replication level to 2. I have 4 datanodes and one namenode in my setup. Using the -R switch with -setrep did the trick but one of my nodes became under utilized. I then ran hadoop balancer and it did help but upto a certain extent. Datanode 4 noted below is now up to almost 5% but when i try to balance the datanode again using the hadoop balance command it says that the cluster is already balanced which isnt. I wonder if there is an alternate way(s) or maybe overtime Datanode-4 will pick up more blocks? Any clues? Thanks, Usman Name: 1 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 35858599(206.97 GB) Used raw bytes: 48140136448 (44.83 GB) % used: 16.39% Last contact: Mon Apr 27 08:34:46 UTC 2009 Name: 2 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 231235100994(215.35 GB) Used raw bytes: 40704245760 (37.91 GB) % used: 13.86% Last contact: Mon Apr 27 08:34:45 UTC 2009 Name: 3 State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 211936026161(197.38 GB) Used raw bytes: 59591700480 (55.5 GB) % used: 20.28% Last contact: Mon Apr 27 08:34:45 UTC 2009 *Name: 4 *State : In Service Total raw bytes: 293778976768 (273.6 GB) Remaining raw bytes: 258876991693(241.1 GB) Used raw bytes: 12142653440 (11.31 GB) % used: 4.13% Last contact: Mon Apr 27 08:34:46 UTC 2009
.20.0, Partitioners?
Is there some magic to get a Partitioner working on .20.0? Setting the partitioner class on the Job object doesn't take effect; hadoop always uses the HashPartitioner. Looking through the source code, it looks like the MapOutputBuffer in MapTask only ever fetches mapred.partitioner.class, and doesn't check for the new API's mapreduce.partitioner.class, but I'm not confident in my understanding of how things work. I was eventually able to get my test program working correctly by: 1) Creating a partitioner that implements the deprecated org.apache.hadoop.mapred.Partitioner interface. 2) Calling job.getConfiguration().set("mapred.partitioner.class", DeprecatedTestPartitioner.class.getCanonicalName()); 3) Commenting out line 395 of org.apache.hadoop.mapreduce.Job.java, where it asserts that mapred.partitioner.class is null. But I'm assuming editing the hadoop core source code is not the intended path. Am I missing some simple switch or something? rf
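[Editor's note: a minimal sketch of steps 1 and 2 of the workaround described above, assuming Text map output keys and values; the class name and partitioning logic are made up for illustration, and whether the assertion in Job (step 3) still gets in the way depends on the 0.20.0 build.]

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class DeprecatedTestPartitioner implements Partitioner<Text, Text> {
  public void configure(JobConf conf) {
    // no per-job setup needed for this example
  }
  public int getPartition(Text key, Text value, int numPartitions) {
    // route by the hash of the key, like HashPartitioner, but via the old API
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}

// When building the new-API Job:
//   job.getConfiguration().set("mapred.partitioner.class",
//       DeprecatedTestPartitioner.class.getName());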
ANN: R and Hadoop = RHIPE 0.1
Hello, I'd like to announce the release of the 0.1 version of RHIPE - R and Hadoop Integrated Processing Environment. Using RHIPE, it is possible to write map-reduce algorithms using the R language and start them from within R. RHIPE is built on Hadoop and so benefits from Hadoop's fault tolerance, distributed file system and job scheduling features. For the R user, there is rhlapply which runs an lapply across the cluster. For the Hadoop user, there is rhmr which runs a general map-reduce program. The tired example of counting words:
m <- function(key, val){
  words <- strsplit(val, " +")[[1]]
  wc <- table(words)
  cln <- names(wc)
  return(sapply(1:length(wc), function(r) list(key=cln[r], value=wc[[r]]), simplify=F))
}
r <- function(key, value){
  value <- do.call(rbind, value)
  return(list(list(key=key, value=sum(value))))
}
rhmr(mapper=m, reduce=r, input.folder="X", output.folder="Y")
URL: http://ml.stat.purdue.edu/rhipe There are some downsides to RHIPE which are described at http://ml.stat.purdue.edu/rhipe/install.html#sec-5 Regards Saptarshi Guha
Re: Can't start fully-distributed operation of Hadoop in Sun Grid Engine
I have contacted the administrator of our cluster and he gave me the access. Now my program can work under fully distributed mode. Thanks a lot. Jasmine - Original Message - From: jason hadoop jason.had...@gmail.com To: core-user@hadoop.apache.org Sent: Sunday, April 26, 2009 12:13 PM Subject: Re: Can't start fully-distributed operation of Hadoop in Sun Grid Engine It may be that the Sun grid is similar to EC2 and the machines have an internal IP address/name that MUST be used for inter-machine communication and an external IP address/name that is only for internet access. The above overly complex sentence basically states there may be some firewall rules/tools in the Sun grid that you need to be aware of and use. On Sun, Apr 26, 2009 at 6:31 AM, Jasmine (Xuanjing) Huang xjhu...@cs.umass.edu wrote: Hi, Jason, Thanks for your advice; after inserting the port into hadoop-site.xml, I can start the namenode and run jobs now. But my system works only when I set localhost in the masters file and add localhost (as well as some other nodes) to the slaves file. And all the tasks are data-local map tasks. I wonder whether I have entered fully distributed mode, or am still in pseudo-distributed mode. As for the SGE, I am only a user and know little about it. This is the user manual of our cluster: http://www.cs.umass.edu/~swarm/index.php?n=Main.UserDoc Best, Jasmine - Original Message - From: jason hadoop jason.had...@gmail.com To: core-user@hadoop.apache.org Sent: Sunday, April 26, 2009 12:06 AM Subject: Re: Can't start fully-distributed operation of Hadoop in Sun Grid Engine The parameter you specify for fs.default.name should be of the form hdfs://host:port and the parameter you specify for mapred.job.tracker MUST be host:port. I haven't looked at 18.3, but it appears that the :port is mandatory. In your case, the piece of code parsing the fs.default.name variable is not able to tokenize it into protocol, host and port correctly. Recap: fs.default.name hdfs://namenodeHost:port mapred.job.tracker jobtrackerHost:port Specify all the parts above and try again. Can you please point me at information on using the Sun grid, I want to include a paragraph or two about it in my book. On Sat, Apr 25, 2009 at 4:28 PM, Jasmine (Xuanjing) Huang xjhu...@cs.umass.edu wrote: Hi, there, My hadoop system (version: 0.18.3) works well under standalone and pseudo-distributed operation. But if I try to run hadoop in fully-distributed mode in Sun Grid Engine, Hadoop always failed -- in fact, the JobTracker and TaskTracker can be started, but the namenode and secondary namenode cannot be started. Could anyone help me with it? My SGE script looks like: #!/bin/bash #$ -cwd #$ -S /bin/bash #$ -l long=TRUE #$ -v JAVA_HOME=/usr/java/latest #$ -v HADOOP_HOME=* #$ -pe hadoop 6 PATH=$HADOOP_HOME/bin:$PATH hadoop fs -put hadoop jar * hadoop fs -get * Then the output looks like: Exception in thread "main" java.lang.NumberFormatException: For input string: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:468) at java.lang.Integer.parseInt(Integer.java:497) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:144) at org.apache.hadoop.dfs.NameNode.getAddress(NameNode.java:116) at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:66) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:118) at org.apache.hadoop.fs.FsShell.init(FsShell.java:88) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1703) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1852) And the log of NameNode looks like 2009-04-25 17:27:17,032 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = STARTUP_MSG: args = [] STARTUP_MSG: version = 0.18.3 / 2009-04-25 17:27:17,147 ERROR org.apache.hadoop.dfs.NameNode: java.lang.NumberFormatException: For input string: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Integer.parseInt(Integer.java:468) at java.lang.Integer.parseInt(Integer.java:497) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:144) at
RE: Blocks replication in downtime event
http://hadoop.apache.org/core/docs/current/hdfs_design.html#Data+Disk+Failure%2C+Heartbeats+and+Re-Replication hope this helps. Koji -Original Message- From: Stas Oskin [mailto:stas.os...@gmail.com] Sent: Monday, April 27, 2009 4:11 AM To: core-user@hadoop.apache.org Subject: Blocks replication in downtime event Hi. I have a question: If I have N of DataNodes, and one or several of the nodes have become unavailable, would HDFS re-synchronize the blocks automatically, according to replication level set? And if yes, when? As soon as the offline node was detected, or only on file access? Regards.
Re: Datanode Setup
bump* Any suggestions?
Re: IO Exception in Map Tasks
You will need to figure out why your task crashed. Check the task logs; there may be messages there that give you a hint as to what is going on. You can enable saving failed task files and then run the task standalone in the IsolationRunner. Chapter 7 of my book (alpha available) provides details on this, hoping the failure repeats in the controlled environment. You could also remove the limit on the core dump size via hadoop-env.sh (*ulimit -c unlimited*), but that requires that the failed task files be kept, as the core will be in the task working directory. On Mon, Apr 27, 2009 at 1:30 AM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Thanks Jason, is there any way we can avoid this exception?? Thanks, Raakhi On Mon, Apr 27, 2009 at 1:20 PM, jason hadoop jason.had...@gmail.com wrote: The jvm had a hard failure and crashed On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, In one of the map tasks, i get the following exception: java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) what could be the reason? Thanks, Raakhi -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422 -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
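[Editor's note: a hedged sketch of the job setup being alluded to; keep.failed.task.files is the pre-0.21 property name, from the Hadoop versions discussed in this thread, that asks the TaskTracker to keep a failed attempt's working files around so the attempt can be inspected and rerun locally.]

import org.apache.hadoop.mapred.JobConf;

public class FailedTaskDebugSetup {
  public static JobConf configureForDebugging(JobConf conf) {
    // Preserve the working directory (and any core dump) of failed task attempts.
    conf.setBoolean("keep.failed.task.files", true);
    return conf;
  }
}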
Re: write a large file to HDFS?
Block by block. Open multiple connections and write multiple files if you are not saturating your network connection. Generally a single file writer writing large blocks rapidly will do a decent job of saturating things. On Mon, Apr 27, 2009 at 2:22 AM, Xie, Tao xietao1...@gmail.com wrote: hi, If I write a large file to HDFS, will it be split into blocks and multi-blocks are written to HDFS at the same time? Or HDFS can only write block by block? Thanks. -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
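[Editor's note: to illustrate the client side, a minimal sketch of writing one file; the path and sizes are made up. The client writes a single sequential stream, and HDFS carves it into blocks and pipelines them to the datanodes one block at a time.]

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();   // picks up the cluster config
    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream out = fs.create(new Path("/user/example/large-file.dat"));
    byte[] buffer = new byte[64 * 1024];
    try {
      for (int i = 0; i < 1000; i++) {          // write ~64 MB of dummy data
        out.write(buffer);
      }
    } finally {
      out.close();                              // blocks are finalized on close
    }
    fs.close();
  }
}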
Re: .20.0, Partitioners?
Ryan, I observed this behavior too -- the Partitioner does not seem to work with the new API, exactly for the reason you have mentioned. Till this gets fixed, you probably need to use the old API. Jothi On 4/27/09 7:14 PM, Ryan Farris farri...@gmail.com wrote: Is there some magic to get a Partitioner working on .20.0? Setting the partitioner class on the Job object doesn't take, hadoop always uses the HashPartitioner. Looking through the source code, it looks like the MapOutputBuffer in MapTask only ever fetches the mapred.partitioner.class, and doesn't check for new api's mapreduce.partitioner.class, but I'm not confident in my understanding of how things work. I was eventually able to get my test program working correctly by: 1) Creating a partitioner that extends the deprecated org.apache.hadoop.mapred.Partitioner class. 2) Calling job.getConfiguration().set(mapred.partitioner.class, DeprecatedTestPartitioner.class.getCanonicalName()); 3) Commenting out line 395 of org.apache.hadoop.mapreduce.Job.java, where it asserts that mapred.partitioner.class is null But I'm assuming editing the hadoop core sourcecode is not the intended path. Am I missing some simple switch or something? rf
Re: .20.0, Partitioners?
I created https://issues.apache.org/jira/browse/HADOOP-5750 to follow this up. Thanks Jothi On 4/27/09 10:10 PM, Jothi Padmanabhan joth...@yahoo-inc.com wrote: Ryan, I observed this behavior too -- Partitioner does not seems to work with the new API exactly for the reason you have mentioned. Till this gets fixed, you probably need to use the old API. Jothi On 4/27/09 7:14 PM, Ryan Farris farri...@gmail.com wrote: Is there some magic to get a Partitioner working on .20.0? Setting the partitioner class on the Job object doesn't take, hadoop always uses the HashPartitioner. Looking through the source code, it looks like the MapOutputBuffer in MapTask only ever fetches the mapred.partitioner.class, and doesn't check for new api's mapreduce.partitioner.class, but I'm not confident in my understanding of how things work. I was eventually able to get my test program working correctly by: 1) Creating a partitioner that extends the deprecated org.apache.hadoop.mapred.Partitioner class. 2) Calling job.getConfiguration().set(mapred.partitioner.class, DeprecatedTestPartitioner.class.getCanonicalName()); 3) Commenting out line 395 of org.apache.hadoop.mapreduce.Job.java, where it asserts that mapred.partitioner.class is null But I'm assuming editing the hadoop core sourcecode is not the intended path. Am I missing some simple switch or something? rf
Rescheduling of already completed map/reduce task
Hi, The job froze after the filesystem hung on a machine which had successfully completed a map task. Is there a flag to enable the re scheduling of such a task ? Jstack of job tracker SocketListener0-2 prio=10 tid=0x08916000 nid=0x4a4f runnable [0x4d05c000..0x4d05ce30] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at org.mortbay.util.LineInput.fill(LineInput.java:469) at org.mortbay.util.LineInput.fillLine(LineInput.java:547) at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293) at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277) at org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238) at org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861) at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907) at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831) at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244) at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357) at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534) Locked ownable synchronizers: - None SocketListener0-1 prio=10 tid=0x4da8c800 nid=0xeeb runnable [0x4d266000..0x4d2670b0] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at org.mortbay.util.LineInput.fill(LineInput.java:469) at org.mortbay.util.LineInput.fillLine(LineInput.java:547) at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293) at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277) at org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238) at org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861) at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907) at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831) at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244) at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357) at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534) IPC Server listener on 54311 daemon prio=10 tid=0x4df70400 nid=0xe86 runnable [0x4d9fe000..0x4d9feeb0] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked 0x54fb4320 (a sun.nio.ch.Util$1) - locked 0x54fb4310 (a java.util.Collections$UnmodifiableSet) - locked 0x54fb40b8 (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84) at org.apache.hadoop.ipc.Server$Listener.run(Server.java:296) Locked ownable synchronizers: - None IPC Server Responder daemon prio=10 tid=0x4da22800 nid=0xe85 runnable [0x4db75000..0x4db75e30] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked 0x54f0 (a sun.nio.ch.Util$1) - locked 0x54fdce10 (a java.util.Collections$UnmodifiableSet) - locked 0x54fdcc18 (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at 
org.apache.hadoop.ipc.Server$Responder.run(Server.java:455) Locked ownable synchronizers: - None RMI TCP Accept-0 daemon prio=10 tid=0x4da13400 nid=0xe31 runnable [0x4de55000..0x4de56130] java.lang.Thread.State: RUNNABLE at java.net.PlainSocketImpl.socketAccept(Native Method) at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384) - locked 0x54f6dae0 (a java.net.SocksSocketImpl) at java.net.ServerSocket.implAccept(ServerSocket.java:453) at java.net.ServerSocket.accept(ServerSocket.java:421) at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341) at java.lang.Thread.run(Thread.java:619) Locked ownable synchronizers: - None -Sagar
Re: Blocks replication in downtime event
Thanks. 2009/4/27 Koji Noguchi knogu...@yahoo-inc.com http://hadoop.apache.org/core/docs/current/hdfs_design.html#Data+Disk+Failure%2C+Heartbeats+and+Re-Replication hope this helps. Koji -Original Message- From: Stas Oskin [mailto:stas.os...@gmail.com] Sent: Monday, April 27, 2009 4:11 AM To: core-user@hadoop.apache.org Subject: Blocks replication in downtime event Hi. I have a question: If I have N of DataNodes, and one or several of the nodes have become unavailable, would HDFS re-synchronize the blocks automatically, according to replication level set? And if yes, when? As soon as the offline node was detected, or only on file access? Regards.
Re: How to set System property for my job
I think what you want is the section Task Execution Environment in http://hadoop.apache.org/core/docs/current/mapred_tutorial.html . Here is a sample from that document:
<property>
  <name>mapred.child.java.opts</name>
  <value>
    -Xmx512M -Djava.library.path=/home/mycompany/lib
    -verbose:gc -Xloggc:/tmp/@tas...@.gc
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
  </value>
</property>
-Marc Tarandeep wrote: Hi, While submitting a job to Hadoop, how can I set system properties that are required by my code? Passing -Dmy.prop=myvalue to the hadoop job command is not going to work as the hadoop command will pass this to my program as a command-line argument. Is there any way to achieve this? Thanks, Taran
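[Editor's note: a small sketch of the other half, for completeness. Any -D JVM option passed via mapred.child.java.opts becomes an ordinary Java system property inside the child task JVM, so task code can read it directly; my.prop is just the example name from the question.]

public class MyPropReader {
  public static String readMyProp() {
    // Returns "default-value" if the property was not passed to the child JVM.
    return System.getProperty("my.prop", "default-value");
  }
}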
Debian support for Cloudera's Distribution
Hey Hadoop fans, just wanted to drop a quick note to let you know that we now have Debian packages for our distribution in addition to RPMs. We will continue to support both platforms going forward. Todd Lipcon put in many late nights for this, so next time you see him, buy him a beer :-) http://www.cloudera.com/hadoop-deb Cheers, Christophe -- get hadoop: cloudera.com/hadoop online training: cloudera.com/hadoop-training blog: cloudera.com/blog twitter: twitter.com/cloudera
Hadoop Training, May 15th: SF Bay Area with Online Participation Available
OK, last announcement from me today :-) We're hosting a training session in the SF bay area (at the Cloudera office) on Friday, May 15th. We're doing two things differently: 1) We've allocated a chunk of discounted early bird registrations - first come first serve until May 1st, at which point, only regular registration is available. 2) We're enabling people from outside the bay area to attend through some pretty impressive web based video remote presence software we've been piloting - all you need is a browser with flash. If you have a webcam and mic, all the better. We're working with a startup on this, and we're really impressed with the technology. Since this is new for us, we've discounted web based participation significantly for this session. registration: http://cloudera.eventbrite.com/ Cheers, Christophe -- get hadoop: cloudera.com/hadoop online training: cloudera.com/hadoop-training blog: cloudera.com/blog twitter: twitter.com/cloudera