[help]how to stop HDFS
Actually I started to play with the latest release 0.23.0 on two nodes yesterday. It was easy to start HDFS, but it took me a while to configure YARN. I set the variable HADOOP_COMMON_HOME to where I extracted the tarball and HADOOP_HDFS_HOME to the local directory where I pointed HDFS to. After that I could bring up YARN and run the benchmark. But I am facing a problem: I cannot see the jobs in the UI. Also, when I started the historyserver, I got the following error. 11/11/30 20:53:19 FATAL hs.JobHistoryServer: Error starting JobHistoryServer java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1179) at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:142) at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:233) at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:315) at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:313) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:313) at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:426) at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:448) at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:183) at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58) at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:62) at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:77) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1125) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1177) ... 14 more Any clue? Hailong *** * Hailong Yang, PhD. Candidate * Sino-German Joint Software Institute, * School of Computer Science and Engineering, Beihang University * Phone: (86-010)82315908 * Email: hailong.yang1...@gmail.com * Address: G413, New Main Building in Beihang University, * No.37 XueYuan Road, HaiDian District, * Beijing, P.R. China, 100191 *** From: cat fa Sent: 2011-11-30 10:28 To: common-user Subject: Re: Re: [help]how to stop HDFS In fact it's me to say sorry. I used the word install which was misleading. In fact I downloaded a tar file and extracted it to /usr/bin/hadoop Could you please tell me where to point those variables? 2011/11/30, Prashant Sharma prashant.ii...@gmail.com: I am sorry, I had no idea you have done a rpm install, my suggestion was based on the assumption that you have done a tar extract install where all three distribution have to extracted and then export variables. Also I have no experience with rpm based installs - so no comments about what went wrong in your case.
Basically from the error i can say that it is not able to find the jars needed on classpath which is referred by scripts through HADOOP_COMMON_HOME. I would say check with the access permission as in which user was it installed with and which user is it running with ? On Tue, Nov 29, 2011 at 10:48 PM, cat fa boost.subscrib...@gmail.comwrote: Thank you for your help, but I'm still a little confused. Suppose I installed hadoop in /usr/bin/hadoop/ .Should I point HADOOP_COMMON_HOME to /usr/bin/hadoop ? Where should I point HADOOP_HDFS_HOME? Also to /usr/bin/hadoop/ ? 2011/11/30 Prashant Sharma prashant.ii...@gmail.com I mean, you have to export the variables export HADOOP_CONF_DIR=/path/to/your/configdirectory. also export HADOOP_HDFS_HOME ,HADOOP_COMMON_HOME. before your run your command. I suppose this should fix the problem. -P On Tue, Nov 29, 2011 at 6:23 PM, cat fa boost.subscrib...@gmail.com wrote: it didn't work. It gave me the Usage information. 2011/11/29 hailong.yang1115 hailong.yang1...@gmail.com Try $HADOOP_PREFIX_HOME/bin/hdfs namenode stop --config $HADOOP_CONF_DIR and
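For reference, a minimal sketch of the setup this thread is describing, as shell commands. It is not taken from any single message: the /usr/bin/hadoop path comes from the thread, while the conf directory and the exact start commands are assumptions to adapt to your own layout.

  # Point the split-tarball variables at the extracted tree before running any scripts.
  export HADOOP_COMMON_HOME=/usr/bin/hadoop        # where the tarball was extracted
  export HADOOP_HDFS_HOME=/usr/bin/hadoop          # HDFS home (often the same tree)
  export HADOOP_CONF_DIR=/usr/bin/hadoop/conf      # directory holding the *-site.xml files

  # Start the HDFS daemons with the sbin script; use "stop" in place of "start" to shut them down.
  $HADOOP_COMMON_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
  $HADOOP_COMMON_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode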
mapreduce matrix multiplication on hadoop
Hi I am trying to run the matrix multiplication example mentioned(with source code) on the following link: http://www.norstad.org/matrix-multiply/index.html I have hadoop setup in pseudodistributed mode and I configured it using this tutorial: http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-pseudo-distributed.html?showComment=1321528406255#c3661776111033973764 When I run my jar file then I get the following error: Identity test 11/11/30 10:37:34 INFO input.FileInputFormat: Total input paths to process : 2 11/11/30 10:37:34 INFO mapred.JobClient: Running job: job_20291041_0010 11/11/30 10:37:35 INFO mapred.JobClient: map 0% reduce 0% 11/11/30 10:37:44 INFO mapred.JobClient: map 100% reduce 0% 11/11/30 10:37:56 INFO mapred.JobClient: map 100% reduce 100% 11/11/30 10:37:58 INFO mapred.JobClient: Job complete: job_20291041_0010 11/11/30 10:37:58 INFO mapred.JobClient: Counters: 17 11/11/30 10:37:58 INFO mapred.JobClient: Job Counters 11/11/30 10:37:58 INFO mapred.JobClient: Launched reduce tasks=1 11/11/30 10:37:58 INFO mapred.JobClient: Launched map tasks=2 11/11/30 10:37:58 INFO mapred.JobClient: Data-local map tasks=2 11/11/30 10:37:58 INFO mapred.JobClient: FileSystemCounters 11/11/30 10:37:58 INFO mapred.JobClient: FILE_BYTES_READ=114 11/11/30 10:37:58 INFO mapred.JobClient: HDFS_BYTES_READ=248 11/11/30 10:37:58 INFO mapred.JobClient: FILE_BYTES_WRITTEN=298 11/11/30 10:37:58 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=124 11/11/30 10:37:58 INFO mapred.JobClient: Map-Reduce Framework 11/11/30 10:37:58 INFO mapred.JobClient: Reduce input groups=2 11/11/30 10:37:58 INFO mapred.JobClient: Combine output records=0 11/11/30 10:37:58 INFO mapred.JobClient: Map input records=4 11/11/30 10:37:58 INFO mapred.JobClient: Reduce shuffle bytes=60 11/11/30 10:37:58 INFO mapred.JobClient: Reduce output records=2 11/11/30 10:37:58 INFO mapred.JobClient: Spilled Records=8 11/11/30 10:37:58 INFO mapred.JobClient: Map output bytes=100 11/11/30 10:37:58 INFO mapred.JobClient: Combine input records=0 11/11/30 10:37:58 INFO mapred.JobClient: Map output records=4 11/11/30 10:37:58 INFO mapred.JobClient: Reduce input records=4 11/11/30 10:37:58 INFO input.FileInputFormat: Total input paths to process : 1 11/11/30 10:37:59 INFO mapred.JobClient: Running job: job_20291041_0011 11/11/30 10:38:00 INFO mapred.JobClient: map 0% reduce 0% 11/11/30 10:38:09 INFO mapred.JobClient: map 100% reduce 0% 11/11/30 10:38:21 INFO mapred.JobClient: map 100% reduce 100% 11/11/30 10:38:23 INFO mapred.JobClient: Job complete: job_20291041_0011 11/11/30 10:38:23 INFO mapred.JobClient: Counters: 17 11/11/30 10:38:23 INFO mapred.JobClient: Job Counters 11/11/30 10:38:23 INFO mapred.JobClient: Launched reduce tasks=1 11/11/30 10:38:23 INFO mapred.JobClient: Launched map tasks=1 11/11/30 10:38:23 INFO mapred.JobClient: Data-local map tasks=1 11/11/30 10:38:23 INFO mapred.JobClient: FileSystemCounters 11/11/30 10:38:23 INFO mapred.JobClient: FILE_BYTES_READ=34 11/11/30 10:38:23 INFO mapred.JobClient: HDFS_BYTES_READ=124 11/11/30 10:38:23 INFO mapred.JobClient: FILE_BYTES_WRITTEN=100 11/11/30 10:38:23 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=124 11/11/30 10:38:23 INFO mapred.JobClient: Map-Reduce Framework 11/11/30 10:38:23 INFO mapred.JobClient: Reduce input groups=2 11/11/30 10:38:23 INFO mapred.JobClient: Combine output records=2 11/11/30 10:38:23 INFO mapred.JobClient: Map input records=2 11/11/30 10:38:23 INFO mapred.JobClient: Reduce shuffle bytes=0 11/11/30 10:38:23 INFO mapred.JobClient: Reduce output records=2 11/11/30 
10:38:23 INFO mapred.JobClient: Spilled Records=4 11/11/30 10:38:23 INFO mapred.JobClient: Map output bytes=24 11/11/30 10:38:23 INFO mapred.JobClient: Combine input records=2 11/11/30 10:38:23 INFO mapred.JobClient: Map output records=2 11/11/30 10:38:23 INFO mapred.JobClient: Reduce input records=2 Exception in thread "main" java.io.IOException: Cannot open filename /tmp/MatrixMultiply/out/_logs at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1497) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1488) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:376) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178) at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1437) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412) at TestMatrixMultiply.fillMatrix(TestMatrixMultiply.java:62) at TestMatrixMultiply.readMatrix(TestMatrixMultiply.java:84) at
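The exception shows the test harness trying to open /tmp/MatrixMultiply/out/_logs, the job-history side-directory, as if it were a SequenceFile. A quick way to confirm and work around this, sketched below; the path comes from the stack trace, and the workarounds are suggestions to verify against your own setup rather than anything stated in the thread.

  # See what the job actually left in the output directory -- the _logs subdirectory
  # written by the 0.20-era job history is being read back along with the result files.
  hadoop fs -ls /tmp/MatrixMultiply/out

  # One workaround: delete the _logs directory before the results are read back.
  hadoop fs -rmr /tmp/MatrixMultiply/out/_logs

  # Alternatively, 0.20-era MapReduce has a hadoop.job.history.user.location property;
  # setting it to "none" for the job is reported to suppress the _logs output
  # (treat this as an assumption and verify against your version).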
Re: [help]how to stop HDFS
On 30/11/11 04:29, Nitin Khandelwal wrote: Thanks, I missed the sbin directory, was using the normal bin directory. Thanks, Nitin On 30 November 2011 09:54, Harsh J ha...@cloudera.com wrote: Like I wrote earlier, its in the $HADOOP_HOME/sbin directory. Not the regular bin/ directory. On Wed, Nov 30, 2011 at 9:52 AM, Nitin Khandelwal nitin.khandel...@germinait.com wrote: I am using Hadoop 0.23.0 There is no hadoop-daemon.sh in bin directory.. I found the 0.23 scripts hard to set up and get working: https://issues.apache.org/jira/browse/HADOOP-7838 https://issues.apache.org/jira/browse/MAPREDUCE-3430 https://issues.apache.org/jira/browse/MAPREDUCE-3432 I'd like to see what Bigtop will offer in this area, as their test process will involve installing onto system images and walking through the scripts. The basic Hadoop tars assume your system is well configured and you know how to do this - and how to debug problems.
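Concretely, with the 0.23 tarball layout discussed here, stopping HDFS looks roughly like the sketch below, assuming $HADOOP_HOME points at the extracted tarball (adjust the path to your install):

  # The daemon control scripts live under sbin/, not bin/, in 0.23.
  ls $HADOOP_HOME/sbin/

  # Stop the HDFS daemons individually; "start" works the same way to bring them back.
  $HADOOP_HOME/sbin/hadoop-daemon.sh stop namenode
  $HADOOP_HOME/sbin/hadoop-daemon.sh stop datanode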
How Jobtracker stores Tasktracker Information
Friends, I want to know how the Jobtracker stores information about tasktrackers and their tasks. Is it stored in memory or in a file? If anyone knows, please let me know. Thanks Regards, Mohmmadanis Moulavi Student, MTech (Computer Sci. Engg.) Walchand College of Engg. Sangli (M.S.) India
Re: How Jobtracker stores Tasktracker Information
I'm not sure what the exact meaning of the tasktracker information you mentioned is. There is a TaskTrackerStatus class, and when the system runs the TaskTracker transmits serialized objects of this class, which carry that information, to the JobTracker through the heartbeat; the JobTracker keeps a HashMap<String, TaskTrackerStatus>, so this information lives in the JobTracker's memory rather than in a file. Best, Nan On Wed, Nov 30, 2011 at 9:26 PM, mohmmadanis moulavi anis_moul...@yahoo.co.in wrote: Friends, I want to know how the Jobtracker stores information about tasktrackers and their tasks. Is it stored in memory or in a file? If anyone knows, please let me know. Thanks Regards, Mohmmadanis Moulavi Student, MTech (Computer Sci. Engg.) Walchand College of Engg. Sangli (M.S.) India -- Nan Zhu School of Electronic, Information and Electrical Engineering, 229 Shanghai Jiao Tong University 800, Dongchuan Road, Shanghai, China E-Mail: zhunans...@gmail.com
Re: Re: [help]how to stop HDFS
Thank you for your help. I can use /sbin/hadoop-daemon.sh {start|stop} {service} script to start a namenode, but I can't start a resourcemanager. 2011/11/30 Harsh J ha...@cloudera.com I simply use the /sbin/hadoop-daemon.sh {start|stop} {service} script to control daemons at my end. Does this not work for you? Or perhaps this thread is more about documenting that? 2011/11/30 Nitin Khandelwal nitin.khandel...@germinait.com: Hi, Even i am facing the same problem. There may be some issue with script . The doc says to start namenode type : bin/hdfs namenode start But start is not recognized. There is a hack to start namenode with command bin/hdfs namenode , but no idea how to stop. If it had been a issue with config , the later also should not have worked. Thanks, Nitin 2011/11/30 cat fa boost.subscrib...@gmail.com In fact it's me to say sorry. I used the word install which was misleading. In fact I downloaded a tar file and extracted it to /usr/bin/hadoop Could you please tell me where to point those variables? 2011/11/30, Prashant Sharma prashant.ii...@gmail.com: I am sorry, I had no idea you have done a rpm install, my suggestion was based on the assumption that you have done a tar extract install where all three distribution have to extracted and then export variables. Also I have no experience with rpm based installs - so no comments about what went wrong in your case. Basically from the error i can say that it is not able to find the jars needed on classpath which is referred by scripts through HADOOP_COMMON_HOME. I would say check with the access permission as in which user was it installed with and which user is it running with ? On Tue, Nov 29, 2011 at 10:48 PM, cat fa boost.subscrib...@gmail.com wrote: Thank you for your help, but I'm still a little confused. Suppose I installed hadoop in /usr/bin/hadoop/ .Should I point HADOOP_COMMON_HOME to /usr/bin/hadoop ? Where should I point HADOOP_HDFS_HOME? Also to /usr/bin/hadoop/ ? 2011/11/30 Prashant Sharma prashant.ii...@gmail.com I mean, you have to export the variables export HADOOP_CONF_DIR=/path/to/your/configdirectory. also export HADOOP_HDFS_HOME ,HADOOP_COMMON_HOME. before your run your command. I suppose this should fix the problem. -P On Tue, Nov 29, 2011 at 6:23 PM, cat fa boost.subscrib...@gmail.com wrote: it didn't work. It gave me the Usage information. 2011/11/29 hailong.yang1115 hailong.yang1...@gmail.com Try $HADOOP_PREFIX_HOME/bin/hdfs namenode stop --config $HADOOP_CONF_DIR and $HADOOP_PREFIX_HOME/bin/hdfs datanode stop --config $HADOOP_CONF_DIR. It would stop namenode and datanode separately. The HADOOP_CONF_DIR is the directory where you store your configuration files. Hailong *** * Hailong Yang, PhD. Candidate * Sino-German Joint Software Institute, * School of Computer ScienceEngineering, Beihang University * Phone: (86-010)82315908 * Email: hailong.yang1...@gmail.com * Address: G413, New Main Building in Beihang University, * No.37 XueYuan Road,HaiDian District, * Beijing,P.R.China,100191 *** From: cat fa Date: 2011-11-29 20:22 To: common-user Subject: Re: [help]how to stop HDFS use $HADOOP_CONF or $HADOOP_CONF_DIR ? I'm using hadoop 0.23. you mean which class? the class of hadoop or of java? 2011/11/29 Prashant Sharma prashant.ii...@gmail.com Try making $HADOOP_CONF point to right classpath including your configuration folder. On Tue, Nov 29, 2011 at 3:58 PM, cat fa boost.subscrib...@gmail.com wrote: I used the command : $HADOOP_PREFIX_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR to sart HDFS. 
This command is in Hadoop document (here http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html ) However, I got errors as Exception in thread main java.lang.NoClassDefFoundError:start Could anyone tell me how to start and stop HDFS? By the way, how to set Gmail so that it doesn't top post my reply? -- Nitin Khandelwal -- Harsh J
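The hadoop-daemon.sh script covers the HDFS daemons; the YARN daemons, including the resourcemanager mentioned above, have their own launcher. A sketch of what that looks like for 0.23; the script location is an assumption, since yarn-daemon.sh may sit under bin/ or sbin/ depending on how the tarballs were combined:

  # Run the ResourceManager in the foreground (Ctrl-C stops it):
  $HADOOP_HOME/bin/yarn resourcemanager

  # Or run it (and a NodeManager) as background daemons:
  $HADOOP_HOME/bin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
  $HADOOP_HOME/bin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
  # ...and stop them again with "stop" in place of "start".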
Re: mapreduce matrix multiplication on hadoop
The error is that you cannot open /tmp/MatrixMultiply/out/_logs Does the directory exist? Do you have proper access rights set? Joep On Wed, Nov 30, 2011 at 3:23 AM, ChWaqas waqas...@gmail.com wrote: Hi I am trying to run the matrix multiplication example mentioned(with source code) on the following link: http://www.norstad.org/matrix-multiply/index.html I have hadoop setup in pseudodistributed mode and I configured it using this tutorial: http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-pseudo-distributed.html?showComment=1321528406255#c3661776111033973764 When I run my jar file then I get the following error: Identity test 11/11/30 10:37:34 INFO input.FileInputFormat: Total input paths to process : 2 11/11/30 10:37:34 INFO mapred.JobClient: Running job: job_20291041_0010 11/11/30 10:37:35 INFO mapred.JobClient: map 0% reduce 0% 11/11/30 10:37:44 INFO mapred.JobClient: map 100% reduce 0% 11/11/30 10:37:56 INFO mapred.JobClient: map 100% reduce 100% 11/11/30 10:37:58 INFO mapred.JobClient: Job complete: job_20291041_0010 11/11/30 10:37:58 INFO mapred.JobClient: Counters: 17 11/11/30 10:37:58 INFO mapred.JobClient: Job Counters 11/11/30 10:37:58 INFO mapred.JobClient: Launched reduce tasks=1 11/11/30 10:37:58 INFO mapred.JobClient: Launched map tasks=2 11/11/30 10:37:58 INFO mapred.JobClient: Data-local map tasks=2 11/11/30 10:37:58 INFO mapred.JobClient: FileSystemCounters 11/11/30 10:37:58 INFO mapred.JobClient: FILE_BYTES_READ=114 11/11/30 10:37:58 INFO mapred.JobClient: HDFS_BYTES_READ=248 11/11/30 10:37:58 INFO mapred.JobClient: FILE_BYTES_WRITTEN=298 11/11/30 10:37:58 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=124 11/11/30 10:37:58 INFO mapred.JobClient: Map-Reduce Framework 11/11/30 10:37:58 INFO mapred.JobClient: Reduce input groups=2 11/11/30 10:37:58 INFO mapred.JobClient: Combine output records=0 11/11/30 10:37:58 INFO mapred.JobClient: Map input records=4 11/11/30 10:37:58 INFO mapred.JobClient: Reduce shuffle bytes=60 11/11/30 10:37:58 INFO mapred.JobClient: Reduce output records=2 11/11/30 10:37:58 INFO mapred.JobClient: Spilled Records=8 11/11/30 10:37:58 INFO mapred.JobClient: Map output bytes=100 11/11/30 10:37:58 INFO mapred.JobClient: Combine input records=0 11/11/30 10:37:58 INFO mapred.JobClient: Map output records=4 11/11/30 10:37:58 INFO mapred.JobClient: Reduce input records=4 11/11/30 10:37:58 INFO input.FileInputFormat: Total input paths to process : 1 11/11/30 10:37:59 INFO mapred.JobClient: Running job: job_20291041_0011 11/11/30 10:38:00 INFO mapred.JobClient: map 0% reduce 0% 11/11/30 10:38:09 INFO mapred.JobClient: map 100% reduce 0% 11/11/30 10:38:21 INFO mapred.JobClient: map 100% reduce 100% 11/11/30 10:38:23 INFO mapred.JobClient: Job complete: job_20291041_0011 11/11/30 10:38:23 INFO mapred.JobClient: Counters: 17 11/11/30 10:38:23 INFO mapred.JobClient: Job Counters 11/11/30 10:38:23 INFO mapred.JobClient: Launched reduce tasks=1 11/11/30 10:38:23 INFO mapred.JobClient: Launched map tasks=1 11/11/30 10:38:23 INFO mapred.JobClient: Data-local map tasks=1 11/11/30 10:38:23 INFO mapred.JobClient: FileSystemCounters 11/11/30 10:38:23 INFO mapred.JobClient: FILE_BYTES_READ=34 11/11/30 10:38:23 INFO mapred.JobClient: HDFS_BYTES_READ=124 11/11/30 10:38:23 INFO mapred.JobClient: FILE_BYTES_WRITTEN=100 11/11/30 10:38:23 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=124 11/11/30 10:38:23 INFO mapred.JobClient: Map-Reduce Framework 11/11/30 10:38:23 INFO mapred.JobClient: Reduce input groups=2 11/11/30 10:38:23 INFO mapred.JobClient: Combine output 
records=2 11/11/30 10:38:23 INFO mapred.JobClient: Map input records=2 11/11/30 10:38:23 INFO mapred.JobClient: Reduce shuffle bytes=0 11/11/30 10:38:23 INFO mapred.JobClient: Reduce output records=2 11/11/30 10:38:23 INFO mapred.JobClient: Spilled Records=4 11/11/30 10:38:23 INFO mapred.JobClient: Map output bytes=24 11/11/30 10:38:23 INFO mapred.JobClient: Combine input records=2 11/11/30 10:38:23 INFO mapred.JobClient: Map output records=2 11/11/30 10:38:23 INFO mapred.JobClient: Reduce input records=2 Exception in thread "main" java.io.IOException: Cannot open filename /tmp/MatrixMultiply/out/_logs at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1497) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1488) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:376) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178) at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1437) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424) at
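The two checks Joep suggests can be run from the command line; a small sketch, with the path taken from the stack trace above:

  # Does the output directory exist, and what did the job leave in it?
  hadoop fs -ls /tmp/MatrixMultiply/out
  hadoop fs -ls /tmp/MatrixMultiply/out/_logs

  # Who owns the directory tree, and with what permissions?
  hadoop fs -ls /tmp/MatrixMultiply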
Re: [help]how to stop HDFS
It seems the ClassNotFoundException is the most common problem. Try pointing HADOOP_COMMON_HOME to HADOOP_HOME/share/hadoop/common. On my computer it's /usr/bin/hadoop/share/hadoop/common On Nov 30, 2011 at 6:50 PM, hailong.yang1115 hailong.yang1...@gmail.com wrote: Actually I started to play with the latest release 0.23.0 on two nodes yesterday. It was easy to start HDFS, but it took me a while to configure YARN. I set the variable HADOOP_COMMON_HOME to where I extracted the tarball and HADOOP_HDFS_HOME to the local directory where I pointed HDFS to. After that I could bring up YARN and run the benchmark. But I am facing a problem: I cannot see the jobs in the UI. Also, when I started the historyserver, I got the following error. 11/11/30 20:53:19 FATAL hs.JobHistoryServer: Error starting JobHistoryServer java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1179) at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:142) at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:233) at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:315) at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:313) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:313) at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:426) at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:448) at org.apache.hadoop.mapreduce.v2.hs.JobHistory.init(JobHistory.java:183) at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58) at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.init(JobHistoryServer.java:62) at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:77) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Hdfs at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1125) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1177) ... 14 more Any clue? Hailong *** * Hailong Yang, PhD. Candidate * Sino-German Joint Software Institute, * School of Computer Science and Engineering, Beihang University * Phone: (86-010)82315908 * Email: hailong.yang1...@gmail.com * Address: G413, New Main Building in Beihang University, * No.37 XueYuan Road, HaiDian District, * Beijing, P.R. China, 100191 *** From: cat fa Sent: 2011-11-30 10:28 To: common-user Subject: Re: Re: [help]how to stop HDFS In fact it's me to say sorry. I used the word install which was misleading. In fact I downloaded a tar file and extracted it to /usr/bin/hadoop Could you please tell me where to point those variables?
2011/11/30, Prashant Sharma prashant.ii...@gmail.com: I am sorry, I had no idea you have done a rpm install, my suggestion was based on the assumption that you have done a tar extract install where all three distribution have to extracted and then export variables. Also I have no experience with rpm based installs - so no comments about what went wrong in your case. Basically from the error i can say that it is not able to find the jars needed on classpath which is referred by scripts through HADOOP_COMMON_HOME. I would say check with the access permission as in which user was it installed with and which user is it running with ? On Tue, Nov 29, 2011 at 10:48 PM, cat fa boost.subscrib...@gmail.com wrote: Thank you for your help, but I'm still a little confused. Suppose I installed hadoop in /usr/bin/hadoop/ .Should I point HADOOP_COMMON_HOME to /usr/bin/hadoop ? Where should I point HADOOP_HDFS_HOME? Also to /usr/bin/hadoop/ ? 2011/11/30 Prashant Sharma prashant.ii...@gmail.com I mean, you have to export the variables export HADOOP_CONF_DIR=/path/to/your/configdirectory. also export HADOOP_HDFS_HOME ,HADOOP_COMMON_HOME. before your run your command. I suppose this should fix the
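A quick way to check whether that fixes the classpath is sketched below. The share/hadoop/common path is the one named above; the hdfs counterpart is a guess by analogy with the 0.23 layout, and the classpath subcommand of bin/hadoop is assumed to be available in your release, so verify both against your extracted tree.

  # Print the classpath the launcher scripts will build and look for the HDFS jars.
  hadoop classpath | tr ':' '\n' | grep -i hdfs

  # If nothing shows up, point the *_HOME variables at the directories holding the jars.
  export HADOOP_COMMON_HOME=/usr/bin/hadoop/share/hadoop/common
  export HADOOP_HDFS_HOME=/usr/bin/hadoop/share/hadoop/hdfs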
HDFS Explained as Comics
For your reading pleasure! PDF (3.3MB) uploaded at the link below (the mailing list has a cap of 1MB on attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 I'd appreciate it if you can spare some time to peruse this little experiment of mine to use comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. [My personal motivations: I have always found text descriptions to be too verbose, as a lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual medium); sequence diagrams are unwieldy for non-trivial protocols, and they do not explain concepts; and finally, animations/videos happen too fast and do not offer a self-paced learning experience.] All forms of criticism, comments (and encouragement) welcome :) Thanks Maneesh
Re: HDFS Explained as Comics
Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.comwrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. [My personal motivations: I have always found text descriptions to be too verbose as lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual medium); sequence diagrams are unwieldy for non-trivial protocols, and they do not explain concepts; and finally, animations/videos happen too fast and do not offer self-paced learning experience.] All forms of criticisms, comments (and encouragements) welcome :) Thanks Maneesh
Re: HDFS Explained as Comics
Thanks Maneesh. Quick question: does a client really need to know the block size and replication factor? A lot of the time the client has no control over these (they are set at the cluster level). -Prashant Kommireddi On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote: Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. [My personal motivations: I have always found text descriptions to be too verbose as lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual medium); sequence diagrams are unwieldy for non-trivial protocols, and they do not explain concepts; and finally, animations/videos happen too fast and do not offer self-paced learning experience.] All forms of criticisms, comments (and encouragements) welcome :) Thanks Maneesh
Re: HDFS Explained as Comics
Hi Prashant, Others may correct me if I am wrong here.. The client (org.apache.hadoop.hdfs.DFSClient) has knowledge of the block size and replication factor. In the source code, I see the following in the DFSClient constructor: defaultBlockSize = conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE); defaultReplication = (short) conf.getInt("dfs.replication", 3); My understanding is that the client considers the following chain for the values: 1. Manual values (the long form constructor; when a user provides these values) 2. Configuration file values (these are cluster level defaults: dfs.block.size and dfs.replication) 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3) Moreover, in org.apache.hadoop.hdfs.protocol.ClientProtocol the API to create a file is void create(..., short replication, long blocksize); I presume it means that the client already has knowledge of these values and passes them to the NameNode when creating a new file. Hope that helps. thanks -Maneesh On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi prash1...@gmail.com wrote: Thanks Maneesh. Quick question, does a client really need to know Block size and replication factor - A lot of times client has no control over these (set at cluster level) -Prashant Kommireddi On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote: Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. [My personal motivations: I have always found text descriptions to be too verbose as lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual medium); sequence diagrams are unwieldy for non-trivial protocols, and they do not explain concepts; and finally, animations/videos happen too fast and do not offer self-paced learning experience.] All forms of criticisms, comments (and encouragements) welcome :) Thanks Maneesh
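The same precedence chain is visible from the shell: a per-request value (step 1) wins over the cluster default (step 2), which wins over the hardcoded fallback (step 3). A sketch, not from the thread; the file names and sizes are made up, and the property names are the 0.20-era ones quoted above.

  # Per-request override: 64MB blocks and replication 2 for this one copy.
  hadoop fs -D dfs.block.size=67108864 -D dfs.replication=2 -put localfile.txt /user/me/data.txt

  # No override: whatever hdfs-site.xml (or the hardcoded defaults) says applies.
  hadoop fs -put localfile.txt /user/me/data-defaults.txt

  # Check what the NameNode actually recorded (%o = block size, %r = replication),
  # assuming your release's fs -stat supports format strings.
  hadoop fs -stat "%o %r" /user/me/data.txt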
Re: HDFS Explained as Comics
Sure, its just a case of how readers interpret it. 1. Client is required to specify block size and replication factor each time 2. Client does not need to worry about it since an admin has set the properties in default configuration files A client could not be allowed to override the default configs if they are set final (well there are ways to go around it as well as you suggest by using create() :) The information is great and helpful. Just want to make sure a beginner who wants to write a WordCount in Mapreduce does not worry about specifying block size' and replication factor in his code. Thanks, Prashant On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney mvarsh...@gmail.comwrote: Hi Prashant Others may correct me if I am wrong here.. The client (org.apache.hadoop.hdfs.DFSClient) has a knowledge of block size and replication factor. In the source code, I see the following in the DFSClient constructor: defaultBlockSize = conf.getLong(dfs.block.size, DEFAULT_BLOCK_SIZE); defaultReplication = (short) conf.getInt(dfs.replication, 3); My understanding is that the client considers the following chain for the values: 1. Manual values (the long form constructor; when a user provides these values) 2. Configuration file values (these are cluster level defaults: dfs.block.size and dfs.replication) 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3) Moreover, in the org.apache.hadoop.hdfs.protocool.ClientProtocol the API to create a file is void create(, short replication, long blocksize); I presume it means that the client already has knowledge of these values and passes them to the NameNode when creating a new file. Hope that helps. thanks -Maneesh On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi prash1...@gmail.com wrote: Thanks Maneesh. Quick question, does a client really need to know Block size and replication factor - A lot of times client has no control over these (set at cluster level) -Prashant Kommireddi On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote: Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. [My personal motivations: I have always found text descriptions to be too verbose as lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual medium); sequence diagrams are unwieldy for non-trivial protocols, and they do not explain concepts; and finally, animations/videos happen too fast and do not offer self-paced learning experience.] All forms of criticisms, comments (and encouragements) welcome :) Thanks Maneesh
RE: HDFS Explained as Comics
Maneesh, Firstly, I love the comic :) Secondly, I am inclined to agree with Prashant on this latest point. While one code path could take us through the user defining command line overrides (e.g. hadoop fs -D blah -put foo bar) I think it might confuse a person new to Hadoop. The most common flow would be using admin determined values from hdfs-site and the only thing that would need to change is that conversation happening between client / server and not user / client. Matt -Original Message- From: Prashant Kommireddi [mailto:prash1...@gmail.com] Sent: Wednesday, November 30, 2011 3:28 PM To: common-user@hadoop.apache.org Subject: Re: HDFS Explained as Comics Sure, its just a case of how readers interpret it. 1. Client is required to specify block size and replication factor each time 2. Client does not need to worry about it since an admin has set the properties in default configuration files A client could not be allowed to override the default configs if they are set final (well there are ways to go around it as well as you suggest by using create() :) The information is great and helpful. Just want to make sure a beginner who wants to write a WordCount in Mapreduce does not worry about specifying block size' and replication factor in his code. Thanks, Prashant On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney mvarsh...@gmail.comwrote: Hi Prashant Others may correct me if I am wrong here.. The client (org.apache.hadoop.hdfs.DFSClient) has a knowledge of block size and replication factor. In the source code, I see the following in the DFSClient constructor: defaultBlockSize = conf.getLong(dfs.block.size, DEFAULT_BLOCK_SIZE); defaultReplication = (short) conf.getInt(dfs.replication, 3); My understanding is that the client considers the following chain for the values: 1. Manual values (the long form constructor; when a user provides these values) 2. Configuration file values (these are cluster level defaults: dfs.block.size and dfs.replication) 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3) Moreover, in the org.apache.hadoop.hdfs.protocool.ClientProtocol the API to create a file is void create(, short replication, long blocksize); I presume it means that the client already has knowledge of these values and passes them to the NameNode when creating a new file. Hope that helps. thanks -Maneesh On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi prash1...@gmail.com wrote: Thanks Maneesh. Quick question, does a client really need to know Block size and replication factor - A lot of times client has no control over these (set at cluster level) -Prashant Kommireddi On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote: Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. 
[My personal motivations: I have always found text descriptions to be too verbose as lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual medium); sequence diagrams are unwieldy for non-trivial protocols, and they do not explain concepts; and finally, animations/videos happen too fast and do not offer self-paced learning experience.] All forms of criticisms, comments (and encouragements) welcome :) Thanks Maneesh
Re: HDFS Explained as Comics
Hi, This is indeed a good way to explain, most of the improvement has already been discussed. waiting for sequel of this comic. Regards, Abhishek On Wed, Nov 30, 2011 at 1:55 PM, maneesh varshney mvarsh...@gmail.comwrote: Hi Matthew I agree with both you and Prashant. The strip needs to be modified to explain that these can be default values that can be optionally overridden (which I will fix in the next iteration). However, from the 'understanding concepts of HDFS' point of view, I still think that block size and replication factors are the real strengths of HDFS, and the learners must be exposed to them so that they get to see how hdfs is significantly different from conventional file systems. On personal note: thanks for the first part of your message :) -Maneesh On Wed, Nov 30, 2011 at 1:36 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com wrote: Maneesh, Firstly, I love the comic :) Secondly, I am inclined to agree with Prashant on this latest point. While one code path could take us through the user defining command line overrides (e.g. hadoop fs -D blah -put foo bar) I think it might confuse a person new to Hadoop. The most common flow would be using admin determined values from hdfs-site and the only thing that would need to change is that conversation happening between client / server and not user / client. Matt -Original Message- From: Prashant Kommireddi [mailto:prash1...@gmail.com] Sent: Wednesday, November 30, 2011 3:28 PM To: common-user@hadoop.apache.org Subject: Re: HDFS Explained as Comics Sure, its just a case of how readers interpret it. 1. Client is required to specify block size and replication factor each time 2. Client does not need to worry about it since an admin has set the properties in default configuration files A client could not be allowed to override the default configs if they are set final (well there are ways to go around it as well as you suggest by using create() :) The information is great and helpful. Just want to make sure a beginner who wants to write a WordCount in Mapreduce does not worry about specifying block size' and replication factor in his code. Thanks, Prashant On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney mvarsh...@gmail.com wrote: Hi Prashant Others may correct me if I am wrong here.. The client (org.apache.hadoop.hdfs.DFSClient) has a knowledge of block size and replication factor. In the source code, I see the following in the DFSClient constructor: defaultBlockSize = conf.getLong(dfs.block.size, DEFAULT_BLOCK_SIZE); defaultReplication = (short) conf.getInt(dfs.replication, 3); My understanding is that the client considers the following chain for the values: 1. Manual values (the long form constructor; when a user provides these values) 2. Configuration file values (these are cluster level defaults: dfs.block.size and dfs.replication) 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3) Moreover, in the org.apache.hadoop.hdfs.protocool.ClientProtocol the API to create a file is void create(, short replication, long blocksize); I presume it means that the client already has knowledge of these values and passes them to the NameNode when creating a new file. Hope that helps. thanks -Maneesh On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi prash1...@gmail.com wrote: Thanks Maneesh. 
Quick question, does a client really need to know Block size and replication factor - A lot of times client has no control over these (set at cluster level) -Prashant Kommireddi On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote: Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual medium to teach complex protocols and algorithms. [My personal motivations: I have always found text descriptions to be too verbose as lot of effort is spent putting the concepts in proper time-space context (which can be easily avoided in a visual
Re: HDFS Explained as Comics
Hi all, very cool comic! Thanks, Alex On Wed, Nov 30, 2011 at 11:58 PM, Abhishek Pratap Singh manu.i...@gmail.com wrote: Hi, This is indeed a good way to explain, most of the improvement has already been discussed. waiting for sequel of this comic. Regards, Abhishek On Wed, Nov 30, 2011 at 1:55 PM, maneesh varshney mvarsh...@gmail.com wrote: Hi Matthew I agree with both you and Prashant. The strip needs to be modified to explain that these can be default values that can be optionally overridden (which I will fix in the next iteration). However, from the 'understanding concepts of HDFS' point of view, I still think that block size and replication factors are the real strengths of HDFS, and the learners must be exposed to them so that they get to see how hdfs is significantly different from conventional file systems. On personal note: thanks for the first part of your message :) -Maneesh On Wed, Nov 30, 2011 at 1:36 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com wrote: Maneesh, Firstly, I love the comic :) Secondly, I am inclined to agree with Prashant on this latest point. While one code path could take us through the user defining command line overrides (e.g. hadoop fs -D blah -put foo bar) I think it might confuse a person new to Hadoop. The most common flow would be using admin determined values from hdfs-site and the only thing that would need to change is that conversation happening between client / server and not user / client. Matt -Original Message- From: Prashant Kommireddi [mailto:prash1...@gmail.com] Sent: Wednesday, November 30, 2011 3:28 PM To: common-user@hadoop.apache.org Subject: Re: HDFS Explained as Comics Sure, its just a case of how readers interpret it. 1. Client is required to specify block size and replication factor each time 2. Client does not need to worry about it since an admin has set the properties in default configuration files A client could not be allowed to override the default configs if they are set final (well there are ways to go around it as well as you suggest by using create() :) The information is great and helpful. Just want to make sure a beginner who wants to write a WordCount in Mapreduce does not worry about specifying block size' and replication factor in his code. Thanks, Prashant On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney mvarsh...@gmail.com wrote: Hi Prashant Others may correct me if I am wrong here.. The client (org.apache.hadoop.hdfs.DFSClient) has a knowledge of block size and replication factor. In the source code, I see the following in the DFSClient constructor: defaultBlockSize = conf.getLong(dfs.block.size, DEFAULT_BLOCK_SIZE); defaultReplication = (short) conf.getInt(dfs.replication, 3); My understanding is that the client considers the following chain for the values: 1. Manual values (the long form constructor; when a user provides these values) 2. Configuration file values (these are cluster level defaults: dfs.block.size and dfs.replication) 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3) Moreover, in the org.apache.hadoop.hdfs.protocool.ClientProtocol the API to create a file is void create(, short replication, long blocksize); I presume it means that the client already has knowledge of these values and passes them to the NameNode when creating a new file. Hope that helps. thanks -Maneesh On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi prash1...@gmail.com wrote: Thanks Maneesh. 
Quick question, does a client really need to know Block size and replication factor - A lot of times client has no control over these (set at cluster level) -Prashant Kommireddi On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote: Hi Maneesh, Thanks a lot for this! Just distributed it over the team and comments are great :) Best regards, Dejan On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote: For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 Appreciate if you can spare some time to peruse this little experiment of mine to use Comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual