Re: [VOTE] Release Apache Hadoop 2.0.4-alpha
Arun, MAPREDUCE-5094 would be a useful jira to include in the 2.0.4-alpha release. It's not an absolute blocker since the values can be controlled explicitly by changing tests which use the cluster.
Thanks
- Sid
On Tue, Apr 9, 2013 at 8:39 PM, Arun C Murthy a...@hortonworks.com wrote:
Folks, I've created a release candidate (rc0) for hadoop-2.0.4-alpha that I would like to release. This is a bug-fix release which solves a number of issues discovered during integration testing of the full-stack.
The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.0.4-alpha-rc0/
The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.4-alpha-rc0
The maven artifacts are available via repository.apache.org.
Please try the release and vote; the vote will run for the usual 7 days.
thanks, Arun
P.S. Many thanks are in order - Roman/Cos and rest of BigTop community for helping to find a number of integration issues, Ted Yu for co-ordinating on HBase, Alejandro for co-ordinating on Oozie, Vinod/Sid/Alejandro/Xuan/Daryn and rest of devs for quickly jumping and fixing these.
-- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.0.4-alpha
HADOOP-9467 has a patch available. It would be nice to include that as well. Thanks
Re: [VOTE] Release Apache Hadoop 2.0.4-alpha
Ok, I'll spin rc1 after. Thanks.
Newbie question - How to start working on an issue?
Hello everyone, I have been using Hadoop in my projects for some time and now I want to contribute back to it. This is my first time trying to contribute to Hadoop, and I have no experience contributing to any open source project. I would like to know how to start working on an issue. So far I have downloaded the Hadoop source code and built it successfully. I have chosen one trivial issue which I think I can solve, but I do not know how to start working on it. If I have a question about the functionality of some piece of code, whom can I ask? Do we need to learn by debugging, or will other people who know that piece of code help us? Please help. Thanks and Regards, Chandrash3khar
Re: Newbie question - How to start working on an issue?
@Chandrashekhar, how did you build Hadoop? Please guide me; I also want to build it. Which version of Hadoop are you using?
-- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Re: Newbie question - How to start working on an issue?
Hi Chandrashekhar, Follow the Hadoop How To Contribute wiki page [1]; you can find more information about the Hadoop project on the Hadoop wiki [2]. If you like, you can also follow the trainings [3], [4], [5].
[1] - http://wiki.apache.org/hadoop/HowToContribute
[2] - http://wiki.apache.org/hadoop/FrontPage
[3] - http://hortonworks.com/hadoop-training/
[4] - http://academy.mapr.com/
[5] - http://university.cloudera.com/
-- Charitha Madusanka LinkedIn : http://www.linkedin.com/pub/charith-madusanka/1a/508/42a Twitter : http://twitter.com/#!/charithccmc
Re: Newbie question - How to start working on an issue?
@Mohammad Mustaqeem, Hadoop can be built using Maven or Ant.
1. Building with Maven, assuming you are using Ubuntu:
First download Hadoop by issuing the command svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk/ hadoop-trunk (you can find this in the How To Contribute Hadoop wiki). Now cd to hadoop-trunk; there you will find a file called pom.xml.
a. If you have not installed Maven, type the command sudo apt-get install maven (this will install the latest Maven). If you have an older Maven installed (Hadoop trunk builds with Maven 3), it will give you errors. Type mvn -version to see which version you have; if you don't have the latest one, just follow step a.
b. Then use the command mvn package -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip=true (Maven will now start downloading the dependencies and build).
c. The final hadoop-x.x.x-SNAPSHOT.tar.gz will be inside hadoop-trunk/hadoop-dist/target/, along with other files.
d. Extract the tar.gz and start your Hadoop as if you had downloaded it from Cloudera or from Apache. I hope you know how to start the daemons and run the basic MapReduce programs.
As for Ant, you mentioned earlier that you are not using it, so I am leaving it out. I have already discussed building with Ant earlier.
Regards, Niranjan Singh (sorry, my caps lock is not working properly)
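The Maven steps in the message above can be sketched as a small script. This is only a sketch under stated assumptions: the `run` helper, the `DRY_RUN` switch, and the function name `build_hadoop_dist` are illustrative and not part of Hadoop; the checkout URL and the `mvn package` flags are the ones quoted in the message.

```shell
#!/bin/sh
# Illustrative helper (an assumption, not a Hadoop tool):
# with DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Check out trunk and build the distribution tarball.
# The body runs in a subshell, so the cd does not leak to the caller.
build_hadoop_dist() (
  src_dir=${1:-hadoop-trunk}   # directory name used in the message

  run svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk/ "$src_dir" || exit 1

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "cd $src_dir"
  else
    cd "$src_dir" || exit 1
  fi

  # Show which Maven is installed before building.
  run mvn -version || exit 1

  # Build the tarball, skipping tests and javadoc as in the message.
  run mvn package -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip=true || exit 1

  # The tarball should end up under hadoop-dist/target/.
  run ls hadoop-dist/target/
)
```

Running `DRY_RUN=1 build_hadoop_dist` first prints the commands without touching the network, which is a cheap way to check the sequence before starting a long build.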
Re: git clone hadoop taking too much time almost 12 hrs
Hi Niranjan, Try doing your initial clone from the github mirror instead; I found it to be much faster: https://github.com/apache/hadoop-common
I use the apache git for subsequent pulls.
Best, Andrew
On Tue, Apr 9, 2013 at 6:15 PM, maisnam ns maisnam...@gmail.com wrote:
Hi, I am trying to execute - git clone git://git.apache.org/hadoop-common.git so that I could set up a development environment for Hadoop under the Eclipse IDE, but it is taking too much time. Can somebody let me know why it is taking too much time? I have a high speed internet connection and I don't think connectivity is the issue here.
Thanks Niranjan Singh
[jira] [Created] (HADOOP-9468) JVM path embedded in fuse binaries
Sean Mackrory created HADOOP-9468:
Summary: JVM path embedded in fuse binaries
Key: HADOOP-9468
URL: https://issues.apache.org/jira/browse/HADOOP-9468
Project: Hadoop Common
Issue Type: Bug
Reporter: Sean Mackrory
Assignee: Sean Mackrory
When the FUSE binaries are built, the paths to libraries in the JVMs are embedded in the RPATH so that they can be found at run-time. From an Apache Bigtop perspective, this is not sufficient because the software may be run on a machine configured very differently from the one on which it was built - so a wrapper sets LD_LIBRARY_PATH according to JAVA_HOME. I recently saw an issue where the original JVM path existed, causing LD_LIBRARY_PATH to be ignored in favor of the RPATH, but it was not the JVM intended for running Hadoop (not JAVA_HOME), and this caused problems. I'm told that setting LD_LIBRARY_PATH is standard practice before using the fuse program anyway, and if that's the case, I think removing the RPATH from the binaries is a good idea.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
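The wrapper approach mentioned in the report, where LD_LIBRARY_PATH is derived from JAVA_HOME, might look roughly like the sketch below. The function name `jvm_library_path`, the `jre/lib/<arch>/server` layout (typical of Linux JDKs of that period), and the `amd64` default are assumptions for illustration; this is not Bigtop's actual wrapper.

```shell
#!/bin/sh
# Hypothetical helper: derive the JVM library directories from a JAVA_HOME.
# The directory layout below is an assumption, not taken from the report.
jvm_library_path() {
  java_home=$1
  arch=${2:-amd64}   # assumed default architecture directory
  echo "$java_home/jre/lib/$arch/server:$java_home/jre/lib/$arch"
}

# A wrapper would export the result before launching the FUSE program, e.g.:
#   LD_LIBRARY_PATH="$(jvm_library_path "$JAVA_HOME")${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
#   export LD_LIBRARY_PATH
#
# Whether a binary still carries an embedded RPATH can be checked with:
#   readelf -d /path/to/binary | grep -E 'RPATH|RUNPATH'
```

The `${LD_LIBRARY_PATH:+:...}` form avoids a trailing colon when the variable was previously empty, since an empty entry would add the current directory to the search path.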
Re: git clone hadoop taking too much time almost 12 hrs
Thanks Andrew for your suggestion, I will clone it from the mirror. Regards Niranjan Singh
Re: git clone hadoop taking too much time almost 12 hrs
The whole repo is around 290 MB, so make sure you have a decent internet connection.
Re: git clone hadoop taking too much time almost 12 hrs
I once blogged about cloning big repositories after experiencing how mammoth Android's repos were: http://www.harshj.com/2010/08/29/a-less-known-thing-about-cloning-git-repositories/
Try a git clone with the --depth=1 option, to reduce the total download by not getting all the history objects. This would have some side-effects vs. a regular clone, but should be fine for contributions.
-- Harsh J
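The shallow-clone suggestion can be sketched as follows. The wrapper function `shallow_clone` is a hypothetical name introduced only for illustration; `git clone --depth=1` is the option named in the message, and the mirror URL in the comment is the one Andrew gave earlier in the thread.

```shell
#!/bin/sh
# Hypothetical wrapper: clone only the most recent commit of a repository.
shallow_clone() {
  repo_url=$1
  dest=$2
  # --depth=1 truncates history to the latest commit, which keeps the download small.
  git clone --depth=1 "$repo_url" "$dest"
}

# Example (needs network access):
#   shallow_clone https://github.com/apache/hadoop-common hadoop-common
```

One visible side-effect is that `git log` in such a clone shows only the truncated history; if the full history is needed later, `git fetch --unshallow` can retrieve it on reasonably recent versions of git.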
Re: Newbie question - How to start working on an issue?
@niranjan, So far I have used hadoop-0.20.2 to install a Hadoop cluster. Will the installation steps be the same for hadoop-3.0.0-SNAPSHOT.tar.gz? I am asking because the directory structure is not the same as that of the hadoop-0.20.2.tar.gz downloaded from the Apache site. If the steps are different, please provide a link with instructions for installing a Hadoop cluster from hadoop-3.0.0-SNAPSHOT.tar.gz.
-- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Re: git clone hadoop taking too much time almost 12 hrs
Thanks moses and Harsh. Harsh, I've bookmarked your blog, nice info. Regards Niranjan
[jira] [Created] (HADOOP-9470) eliminate duplicate FQN tests in different Hadoop modules
Ivan A. Veselovsky created HADOOP-9470:
Summary: eliminate duplicate FQN tests in different Hadoop modules
Key: HADOOP-9470
URL: https://issues.apache.org/jira/browse/HADOOP-9470
Project: Hadoop Common
Issue Type: Improvement
Reporter: Ivan A. Veselovsky
Assignee: Ivan A. Veselovsky
In different modules of the Hadoop project there are tests with identical FQNs (fully qualified names). For example, the test with FQN org.apache.hadoop.util.TestRunJar is contained in 2 modules:
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestRunJar.java
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/util/TestRunJar.java
This situation causes problems with test result reporting and other code analysis tools (such as Clover) because almost all the tools identify tests by their Java FQN. So I suggest renaming all such test classes to avoid duplicate FQNs in different modules. I'm attaching a simple shell script that can find all such problematic test classes. Currently Hadoop trunk has 9 such test classes:
$ ~/bin/find-duplicate-fqns.sh
# Module [./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-classes] has 7 duplicate FQN tests:
org.apache.hadoop.ipc.TestSocketFactory
org.apache.hadoop.mapred.TestFileOutputCommitter
org.apache.hadoop.mapred.TestJobClient
org.apache.hadoop.mapred.TestJobConf
org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
org.apache.hadoop.util.TestReflectionUtils
org.apache.hadoop.util.TestRunJar
# Module [./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/target/test-classes] has 2 duplicate FQN tests:
org.apache.hadoop.yarn.TestRecordFactory
org.apache.hadoop.yarn.TestRPCFactories
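The script attached to the JIRA is not reproduced in this thread. As a rough illustration of the idea only, the sketch below finds duplicate test FQNs by scanning source trees rather than the compiled target/test-classes directories the reporter's output refers to. The function name and the `src/test/java` and `Test*.java` conventions are assumptions.

```shell
#!/bin/sh
# Illustrative sketch (not the script attached to HADOOP-9470):
# print every test FQN that occurs under more than one src/test/java tree.
find_duplicate_test_fqns() {
  root=${1:-.}
  find "$root" -path '*/src/test/java/*' -name 'Test*.java' |
    # Strip everything up to the source root, drop the extension,
    # and turn the remaining path into a dotted class name.
    sed -e 's|.*/src/test/java/||' -e 's|\.java$||' -e 's|/|.|g' |
    sort |
    # uniq -d prints one copy of each line that appears more than once.
    uniq -d
}
```

Called as `find_duplicate_test_fqns /path/to/hadoop-trunk`, it prints one FQN per line. Unlike the reporter's script, it does not say which modules contain each duplicate.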
Re: Newbie question - How to start working on an issue?
Mohammad, hadoop 0.20.2 is very old. You may want to try 1.0.4 first before even touching 2.x, and forget 3.x for that matter.
-- Nitin Pawar
[jira] [Created] (HADOOP-9471) hadoop-client wrongfully excludes jetty-util JAR, breaking webhdfs
Alejandro Abdelnur created HADOOP-9471:
Summary: hadoop-client wrongfully excludes jetty-util JAR, breaking webhdfs
Key: HADOOP-9471
URL: https://issues.apache.org/jira/browse/HADOOP-9471
Project: Hadoop Common
Issue Type: Bug
Components: build
Affects Versions: 2.0.3-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 2.0.4-alpha
WebHdfsFileSystem uses jetty-util's JSON class. hadoop-client excludes that JAR, so applications built using the hadoop-client POM fail:
{code}
java.lang.NoClassDefFoundError: org/mortbay/util/ajax/JSON
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:277)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:561)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:480)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:413)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:580)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:591)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1332)
{code}
Re: Newbie question - How to start working on an issue?
Nitin, then the compilation steps will be different. I think it can be done using Ant, not Maven, and I don't know how to use Ant. Can you please give instructions on how to build Hadoop using Ant? One more thing: I want to make small changes to replication in Hadoop. Which version would be better for that purpose?
-- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Re: [VOTE] Release Apache Hadoop 2.0.4-alpha
I've committed HADOOP-9471 to trunk and branch-2 and closed the JIRA with fixedVersion 2.0.5. If this JIRA makes it to 2.0.4 we need to update CHANGES.txt in trunk/branch-2 and the fixedVersion in the JIRA. Thx.
-- Alejandro