Re: Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
Include the ${HADOOP}/conf/ dir in the classpath of the Java program. Alternatively, you can also try:

    bin/hadoop jar your_jar main_class args

-Sagar

Saju K K wrote:
> This is in reference to the sample application in the JavaWorld article:
> http://www.javaworld.com/javaworld/jw-09-2008/jw-09-hadoop.html?page=5
>
>     bin/hadoop dfs -mkdir /opt/www/hadoop/hadoop-0.18.2/words
>     bin/hadoop dfs -put word1 /opt/www/hadoop/hadoop-0.18.2/words
>     bin/hadoop dfs -put word2 /opt/www/hadoop/hadoop-0.18.2/words
>     bin/hadoop dfs -put word3 /opt/www/hadoop/hadoop-0.18.2/words
>     bin/hadoop dfs -put word4 /opt/www/hadoop/hadoop-0.18.2/words
>
> When I browse through http://serdev40.apac.nokia.com:50075/browseDirectory.jsp I can see the files in the directory. The commands below also execute properly:
>
>     bin/hadoop dfs -ls /opt/www/hadoop/hadoop-0.18.2/words/
>     bin/hadoop dfs -ls /opt/www/hadoop/hadoop-0.18.2/words/word1
>     bin/hadoop dfs -cat /opt/www/hadoop/hadoop-0.18.2/words/word1
>
> But on executing either of these commands, I get an error:
>
>     java -Xms1024m -Xmx1024m com.nokia.tag.test.EchoOhce /opt/www/hadoop/hadoop-0.18.2/words/ result
>     java -Xms1024m -Xmx1024m com.nokia.tag.test.EchoOhce /opt/www/hadoop/hadoop-0.18.2/words result
>
>     08/11/24 10:52:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>     Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/opt/www/hadoop/hadoop-0.18.2/words
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:179)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:210)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
>         at com.nokia.tag.test.EchoOhce.run(EchoOhce.java:123)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at com.nokia.tag.test.EchoOhce.main(EchoOhce.java:129)
>
> Does anybody know why there is a failure from the Java application?
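For context on Sagar's fix: without hadoop-site.xml from the conf directory on the classpath, Configuration falls back to the default fs.default.name of file:///, so JobClient resolves input paths against the local filesystem, which is exactly the file:/... path in the stack trace above. A minimal sketch to confirm which filesystem a client sees (0.18-era API; run it with and without ${HADOOP}/conf on the classpath):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class WhichFs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Prints "file:///" when conf/hadoop-site.xml is not on the
            // classpath, and the cluster's hdfs://... URI when it is.
            System.out.println(conf.get("fs.default.name", "file:///"));
            System.out.println(FileSystem.get(conf).getUri());
        }
    }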
Re: Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
It works ... Thanks Sagar

--saju

Sagar Naik-3 wrote:
> Include the ${HADOOP}/conf/ dir in the classpath of the Java program.
> Alternatively, you can also try:
>
>     bin/hadoop jar your_jar main_class args
>
> [snip]
hdfs read failure
Hello!

I'm trying to perform a read test of HDFS files through libhdfs using the hadoop-0.18.2/src/c++/libhdfs/hdfs_read.c test program. Creating the files succeeds but reading them fails. I create two 1MB local files with hdfs_write.c and then put them under HDFS using hadoop fs -put. The files go under dfs.data.dir as:

    hdfs://server:port/dfs.data.dir/file1
    hdfs://server:port/dfs.data.dir/file2

Then I try to read them back with hdfs_read and measure the time it takes, but I get the following exceptions:

    Reading file:///home/sony/hadoop/dfs/blocks/file1 1MB
    Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://myserver.com:23000/home/sony/hadoop/dfs/blocks/file1, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:320)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:52)
    Call to org.apache.hadoop.fs.FileSystem::open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;) failed!
    hdfs_read.c: Failed to open hdfs://myserver.com:23000/home/sony/hadoop/dfs/blocks/file1 for writing!
    ..
    Reading file:home/sony/hadoop/dfs/blocks/file2 1MB
    Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://myserver.com:23000/home/sony/hadoop/dfs/blocks/file2, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:320)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:52)
    Call to org.apache.hadoop.fs.FileSystem::open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;) failed!
    hdfs_read.c: Failed to open hdfs://myserver.com:23000/home/sony/hadoop/dfs/blocks/file2 for writing!

Am I using an incorrect URI? What can be the problem?

Cheers,
Tamas
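The "Wrong FS" exception means the Path carries an hdfs:// scheme while the FileSystem it is handed to is the local one, typically because the client's fs.default.name is still file:///. In libhdfs terms, hdfsConnect must be given the namenode host and port rather than "default" when no config is on the classpath. The Java equivalent of the safe pattern, as a sketch with a placeholder host, port, and path:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class OpenByPath {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path p = new Path("hdfs://myserver.com:23000/user/sony/file1");
            // Ask the Path for a FileSystem that matches its own scheme, so
            // an hdfs:// path is never handed to RawLocalFileSystem.
            FileSystem fs = p.getFileSystem(conf);
            FSDataInputStream in = fs.open(p);
            byte[] buf = new byte[65536];
            int n = in.read(buf);
            System.out.println("read " + n + " bytes");
            in.close();
        }
    }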
s3n exceptions
Hi all,

I am testing the s3n filesystem facilities and trying to copy from HDFS to S3 in the original format, and I get the following errors:

    08/11/24 05:04:49 INFO mapred.JobClient: Running job: job_200811240437_0004
    08/11/24 05:04:50 INFO mapred.JobClient: map 0% reduce 0%
    08/11/24 05:05:00 INFO mapred.JobClient: map 44% reduce 0%
    08/11/24 05:05:03 INFO mapred.JobClient: map 0% reduce 0%
    08/11/24 05:05:03 INFO mapred.JobClient: Task Id : attempt_200811240437_0004_m_00_0, Status : FAILED
    java.io.IOException: Copied: 0 Skipped: 0 Failed: 7
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:542)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
    08/11/24 05:05:15 INFO mapred.JobClient: Task Id : attempt_200811240437_0004_m_00_1, Status : FAILED
    java.io.IOException: Copied: 0 Skipped: 0 Failed: 7
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:542)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

-- 
Best Regards
Alexander Aristov
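DistCp only reports the aggregate count here; the per-file causes are in the logs of the failed task attempts. One common cause when the destination is s3n is missing AWS credentials in the configuration. A sketch for checking that the s3n filesystem is reachable at all, with placeholder bucket and keys:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class S3nCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");      // placeholder
            conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");  // placeholder
            FileSystem s3 = FileSystem.get(URI.create("s3n://your-bucket/"), conf);
            // Listing the bucket root fails fast if the credentials or the
            // bucket name are wrong.
            for (FileStatus f : s3.listStatus(new Path("/"))) {
                System.out.println(f.getPath());
            }
        }
    }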
Re: Hadoop Installation
Mithila Nagendra wrote:
> I tried dropping the jar files into the lib. It still doesn't work. The following is how the lib looks after the new files were put in:
>
>     [EMAIL PROTECTED] hadoop-0.17.2.1]$ cd bin
>     [EMAIL PROTECTED] bin]$ ls
>     hadoop            hadoop-daemon.sh   rcc        start-all.sh       start-dfs.sh     stop-all.sh       stop-dfs.sh
>     hadoop-config.sh  hadoop-daemons.sh  slaves.sh  start-balancer.sh  start-mapred.sh  stop-balancer.sh  stop-mapred.sh
>     [EMAIL PROTECTED] bin]$ cd ..
>     [EMAIL PROTECTED] hadoop-0.17.2.1]$ mv commons-logging-1.1.1/* lib
>     [EMAIL PROTECTED] hadoop-0.17.2.1]$ cd lib
>     [EMAIL PROTECTED] lib]$ ls
>     commons-cli-2.0-SNAPSHOT.jar  commons-logging-1.1.1-javadoc.jar   commons-logging-tests.jar  junit-3.8.1.jar          log4j-1.2.13.jar   site
>     commons-codec-1.3.jar         commons-logging-1.1.1-sources.jar   jets3t-0.5.0.jar           junit-3.8.1.LICENSE.txt  native             xmlenc-0.52.jar
>     commons-httpclient-3.0.1.jar  commons-logging-adapters-1.1.1.jar  jetty-5.1.4.jar            kfs-0.1.jar              NOTICE.txt
>     commons-logging-1.0.4.jar     commons-logging-api-1.0.4.jar       jetty-5.1.4.LICENSE.txt    kfs-0.1.LICENSE.txt      RELEASE-NOTES.txt
>     commons-logging-1.1.1.jar     commons-logging-api-1.1.1.jar       jetty-ext                  LICENSE.txt              servlet-api.jar

OK, you now have two copies of commons-logging in there. I would delete the -1.0.4 version and the -api and -sources JARs. But I don't think that is the root cause of this problem.

Are you running on a Linux system that has commons-logging installed as an RPM or .deb package? Because that could be making a real mess of your classpath.

The error you are seeing implies that log4j isn't there or it won't load. And as log4j is there, it looks like a classloader problem of some sort. These are tractable, but they are hard to track down, and it exists only on your system(s). There's not much that can be done remotely.

-steve
Re: Hadoop+log4j
On Nov 24, 2008, at 9:49 AM, Steve Loughran wrote:
> Scott Whitecross wrote:
>> Thanks Brian. So you have had luck w/ log4j?
>
> We grab logs off machines by not using log4j and routing to our own logging infrastructure that can feed events to other boxes via RMI and queues. This stuff slots in behind commons-logging, with a custom commons-logging bridge specified on the command line. To get this into Hadoop I had to patch hadoop.jar and remove the properties file that bound it only to log4j. The central receiver/SPOF logs events by sent time and received time and can store all results into text files intermixed for post-processing. It's good for testing, but on a big production cluster you'd want something more robust and scalable.

Hey Steve,

Sounds like a cool setup, but it might be a little much for Scott's purposes (trying to debug a single map phase...).

Scott, I have been able to successfully add new log4j loggers, but in Hadoop code, not in a M-R task. If you try things in local mode, you'll be guaranteed to have the same JVM, so the configuration should be loaded the same way. Then again, I might be putting words into Scott's mouth: maybe he does indeed want to scale this way up and turn it into a logging infrastructure.

Scott, did you have any luck debugging the job through the wiki document on debugging MapReduce? I'd make sure to start there before you take too much of a detour into log4j-land.

Brian
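On the narrow point of logging from a map task: going through commons-logging, which Hadoop binds to log4j, a task's messages land in that task's syslog file under the tasktracker's userlogs directory. A minimal sketch against the 0.18 mapred API:

    import java.io.IOException;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class LoggingMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, LongWritable> {
        private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, LongWritable> out, Reporter reporter)
                throws IOException {
            // Shows up in the task's syslog, not on the client console.
            LOG.info("processing record at offset " + key.get());
            out.collect(value, new LongWritable(1));
        }
    }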
Re: Hadoop+log4j
Scott Whitecross wrote:
> Thanks Brian. So you have had luck w/ log4j?

We grab logs off machines by not using log4j and routing to our own logging infrastructure that can feed events to other boxes via RMI and queues. This stuff slots in behind commons-logging, with a custom commons-logging bridge specified on the command line. To get this into Hadoop I had to patch hadoop.jar and remove the properties file that bound it only to log4j. The central receiver/SPOF logs events by sent time and received time and can store all results into text files intermixed for post-processing. It's good for testing, but on a big production cluster you'd want something more robust and scalable.
Re: Hadoop Installation
Thanks Steve! Will take a look at it.

Mithila

On Mon, Nov 24, 2008 at 6:32 PM, Steve Loughran [EMAIL PROTECTED] wrote:
> [snip]
Re: Hadoop Installation
Hey Steve,

Out of the following, which one do I remove? Just making sure. I got rid of:

    commons-logging-1.0.4.jar
    commons-logging-api-1.0.4.jar
    commons-logging-1.1.1-sources.jar

Thanks!
Mithila

On Mon, Nov 24, 2008 at 6:32 PM, Steve Loughran [EMAIL PROTECTED] wrote:
> [snip]
Third Hadoop Get Together @ Berlin
The third German Hadoop Get Together is going to take place on the 9th of December at the newthinking store in Berlin:

    http://upcoming.yahoo.com/event/1383706/?ps=6

You can order drinks directly at the bar in the newthinking store. As this Get Together takes place in December, Christmas time, there will be cookies as well. There are quite a few good restaurants nearby, so we can go there after the official part.

Stefan Groschupf offered to prepare a talk on his project katta. We are still looking for one or more interesting talks. We would like to invite you, the visitor, to tell your Hadoop story. If you like, you can bring slides; there will be a beamer. Please send your proposal to [EMAIL PROTECTED] There will be slots of 20 min each for talks on your Hadoop topic. After each talk there will be time to discuss.

A big Thanks goes to the newthinking store for again providing a room in the center of Berlin for us.

Looking forward to seeing you in Berlin,
Isabel Drost

-- 
QOTD: It's not an optical illusion, it just looks like one. -- Phil White
Web: http://www.isabel-drost.de
VoIP: sip://[EMAIL PROTECTED]
Tel: (+49) 30 6920 6101
IM: xmpp://[EMAIL PROTECTED]
Re: Hadoop Installation
Mithila Nagendra wrote:
> Hey Steve,
> Out of the following, which one do I remove? Just making sure. I got rid of:
>     commons-logging-1.0.4.jar
>     commons-logging-api-1.0.4.jar
>     commons-logging-1.1.1-sources.jar

Hadoop is currently built with commons-logging-1.0.4.jar, so strictly speaking that should be the only one to retain; all the others can go. But you could also delete everything except commons-logging-1.1.1.jar and it should work just as well. None of the -sources JARs are needed, none of the -api JARs are needed.

-steve
Re: Hadoop Installation
Hey Steve,

I deleted whatever I needed to, still no luck. You said that the classpath might be messed up. Is there some way I can reset it? For the root user? What path do I set it to?

Mithila

On Mon, Nov 24, 2008 at 8:54 PM, Steve Loughran [EMAIL PROTECTED] wrote:
> [snip]
Re: Facing issues Integrating PIG with Hadoop.
Pig questions should be sent to [EMAIL PROTECTED]

The error you're getting usually means that you have a version of Hadoop that doesn't match your version of Pig. If you downloaded the latest for Hadoop, that will be the case, as Pig currently supports Hadoop 0.18, but not 0.19 or top of trunk. Try running your Pig version with a released 0.18 version of Hadoop and see if you get better results.

Alan.

On Nov 24, 2008, at 10:09 AM, cutepooja54321 wrote:
> Hi, I am also trying to do the same thing. Can you please help me if you managed to do it? Please. I am also a student.
>
> us latha wrote:
>> Hi All,
>>
>> I am a student trying to integrate the Pig and Hadoop technologies to build a custom application as part of my MS project. I am trying out a simple scenario where I have set up a single-node Hadoop cluster and am trying to execute the script script1-hadoop.pig from the Pig tutorial. I am hitting several issues like "Failed to create DataStorage" etc. I had already posted the same to the groups:
>> http://www.nabble.com/Integration-of-pig-and-hadoop-fails-with-%22Failed-to-create-DataStorage%22-error.-td18931962.html
>>
>> Could you please suggest the proper steps to integrate Pig and Hadoop? Right now, I am following the ones below:
>> 1) Downloaded the latest source for Hadoop and Pig
>> 2) Compiled Hadoop and started a single-node cluster
>> 3) Compiled Pig and replaced the Hadoop class files in pig.jar with the new ones from step 2
>> 4) Executed the Pig script with HADOOPSITEPATH set
>>
>> Please let me know if the above steps are incorrect, or whether I should use specific Pig and Hadoop versions. We are stuck on these errors. Request you to please help in resolving the same.
>>
>> Thank you,
>> Srilatha
Is Hudson Patch verifier stuck?
The Hudson patch verifier has been running for the last 10 hours on a patch. Is it stuck, or is it normal for it to take so long on some patches?

Abdul Qadeer
do NOT start reduce task until all mappers are finished
Hi,

I am using 0.18.2 with the fair scheduler (HADOOP-3476). The purpose of the fair scheduler is to prevent long-running jobs from blocking short jobs. I gave it a try: start a long job first, then a short one. The short job is able to grab some map slots and finishes its map phase quickly, but it still blocks on the reduce phase, because the long job has taken all the reduce slots (the long job starts first and its reducers are started shortly after). The long job's reducers won't finish until all its mappers have finished. So my short job is still blocked by the long job, making the fair scheduler useless for my workload.

I am wondering if there is a way to not start reduce tasks until all of a job's mappers have finished.

Thanks
Haijun Cao
[ANNOUNCE] Hadoop release 0.19.0 available
Release 0.19.0 contains many improvements, new features, bug fixes and optimizations. For release details and downloads, visit: http://hadoop.apache.org/core/releases.html Thanks to all who contributed to this release! Nigel
Re: How to integrate hadoop framework with web application
Thanks for your feedback. I think I have found an initial solution. The Hadoop job execution and the web application execution are two different processes, so I plan to use intermediate files as the inter-process communication medium. It seems that it is impossible to call Hadoop functions directly from the Java servlet class. So my steps are:

1) Start the Hadoop job execution
2) Get the resulting output files and put them in the Tomcat web application file folder
3) Get the Hadoop job result by reading the files from the Java servlet class

regards

On Mon, Nov 24, 2008 at 3:17 PM, Alexander Aristov [EMAIL PROTECTED] wrote:
> Hi
>
> You may want to take a look at the Nutch project - a Hadoop-based search engine. It has a web application with Hadoop integration. As far as I remember, you should add the Hadoop libs and configuration files to the classpath and init Hadoop on startup.
>
> Alexander
>
> 2008/11/24 柳松 [EMAIL PROTECTED]
>> Dear 晋光峰: Glad to see another Chinese name here. It sounds possible, but could you give us a little more detail? Best Regards.
>>
>> On 2008-11-24 09:41:15, 晋光峰 [EMAIL PROTECTED] wrote:
>>> Dear all,
>>> Does anyone know how to integrate Hadoop into web applications? I want to start a Hadoop job from a Java servlet (in a web server servlet container), then get the result and send it back to the browser. Is this possible? How do I connect the web server with the Hadoop framework?
>>> Please give me any advice or suggestions about this.
>>> Thanks
>>> --
>>> Guangfeng Jin
>>> Software Engineer
>>> iZENEsoft (Shanghai) Co., Ltd

--
Guangfeng Jin
Software Engineer
iZENEsoft (Shanghai) Co., Ltd
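The direct route is possible too, as Alexander suggests, when the Hadoop jars and config are on the webapp classpath; the file-based handoff above just avoids that coupling. A rough sketch of submitting a job from a servlet (host names, ports, and paths are placeholders, and the mapper/reducer are left at the identity defaults):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class JobLauncherServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            JobConf conf = new JobConf(JobLauncherServlet.class);
            conf.set("fs.default.name", "hdfs://namenode:9000");   // placeholder
            conf.set("mapred.job.tracker", "jobtracker:9001");     // placeholder
            FileInputFormat.setInputPaths(conf, new Path("/input"));
            FileOutputFormat.setOutputPath(conf, new Path("/output"));
            RunningJob job = JobClient.runJob(conf);  // blocks until the job ends
            resp.getWriter().println("job succeeded: " + job.isSuccessful());
        }
    }

Note that the blocking runJob call ties up a servlet thread for the life of the job; a real application would submit asynchronously and poll for completion.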
Re: ls command output format
Filed HADOOP-4719 for this.

Nicholas Sze.

----- Original Message -----
From: Tsz Wo (Nicholas), Sze [EMAIL PROTECTED]
To: core-user@hadoop.apache.org
Sent: Friday, November 21, 2008 7:54:27 AM
Subject: Re: ls command output format

> Hi Alex,
>
> Yes, the doc about ls is out-dated. Thanks for pointing this out. Would you mind filing a JIRA?
>
> Nicholas Sze

----- Original Message -----
From: Alexander Aristov
To: core-user@hadoop.apache.org
Sent: Friday, November 21, 2008 6:08:08 AM
Subject: Re: ls command output format

> Found out that the output was changed in 0.18; see HADOOP-2865. The docs should then also be updated.
>
> Alex
>
> 2008/11/21 Alexander Aristov
>> Hello
>>
>> I wonder if the hadoop shell command ls has changed its output format. Trying hadoop-0.18.2 I got this output:
>>
>>     [root]# hadoop fs -ls /
>>     Found 2 items
>>     drwxr-xr-x - root supergroup 0 2008-11-21 08:08 /mnt
>>     drwxr-xr-x - root supergroup 0 2008-11-21 08:19 /repos
>>
>> Though according to the docs the file name should go first:
>> http://hadoop.apache.org/core/docs/r0.18.2/hdfs_shell.html#ls
>>
>> Usage: hadoop fs -ls <args>
>> For a file, returns stat on the file with the following format:
>>     filename filesize modification_date modification_time permissions userid groupid
>> For a directory, it returns a list of its direct children, as in Unix. A directory is listed as:
>>     dirname modification_date modification_time permissions userid groupid
>> Example: hadoop fs -ls /user/hadoop/file1 /user/hadoop/file2 hdfs://nn.example.com/user/hadoop/dir1 /nonexistentfile
>> Exit Code: Returns 0 on success and -1 on error.
>>
>> I wouldn't have noticed the issue if I didn't have scripts which rely on the formatting.
>>
>> --
>> Best Regards
>> Alexander Aristov
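Scripts that screen-scrape the shell's ls output will keep breaking as the format evolves; listing through the filesystem API sidesteps the problem, since the field layout is then under the script author's control. A small sketch:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class StableLs {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // The output format here is ours to choose, so shell-side
            // changes such as HADOOP-2865 cannot break downstream scripts.
            for (FileStatus stat : fs.listStatus(new Path("/"))) {
                System.out.println(stat.getPath() + "\t" + stat.getLen());
            }
        }
    }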
Re: ls command output format
Thanks for creating it. I haven't tried Jira yet and didn't know how to do this.

Alex

2008/11/25 Tsz Wo Sze [EMAIL PROTECTED]
> Filed HADOOP-4719 for this.
>
> Nicholas Sze.
>
> [snip]

-- 
Best Regards
Alexander Aristov
Re: Block placement in HDFS
On Nov 24, 2008, at 8:44 PM, Mahadev Konar wrote:
> Hi Dennis,
> I don't think that is possible to do.

No, it is not possible.

> The block placement is determined by HDFS internally (which is local, rack local and off rack).

Actually, it was changed in 0.17 or so to be node-local, off-rack, and a second node off rack.

-- Owen
RE: do NOT start reduce task until all mappers are finished
Amar,

Thanks for the pointer.

Haijun

-----Original Message-----
From: Amar Kamat [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2008 8:43 PM
To: core-user@hadoop.apache.org
Subject: Re: do NOT start reduce task until all mappers are finished

Haijun Cao wrote:
> [snip]
> I am wondering if there is a way to not start reduce tasks until all of a job's mappers have finished.
>
> Thanks
> Haijun Cao

https://issues.apache.org/jira/browse/HADOOP-4666 was opened to address something similar. Starting the reducers after all the maps are done might result in an increased runtime for the job. The reason for starting the reducers along with the maps is to interleave/parallelize the map and shuffle (data-pulling) phases, since maps are typically CPU-bound while the shuffle is IO-bound.

Amar
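For completeness: later Hadoop releases (via the HADOOP-4666 work mentioned above, not available in 0.18.2) grew a knob for exactly this, mapred.reduce.slowstart.completed.maps, the fraction of a job's maps that must complete before its reducers are scheduled. Setting it to 1.0 gives the behavior Haijun asks for, at the cost Amar describes of serializing the map and shuffle phases. A sketch, assuming a release that supports the property:

    import org.apache.hadoop.mapred.JobConf;

    public class LateReducers {
        public static void configure(JobConf conf) {
            // 1.0f: schedule reducers only after every map has completed.
            // The property does not exist in 0.18.2; there this is a no-op.
            conf.setFloat("mapred.reduce.slowstart.completed.maps", 1.0f);
        }
    }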