I want to revise the source codes, which version is proper?

2012-08-23 Thread Li Shengmei
Hi, all

 I want to run some experiments with Hadoop, and I may need to modify its
source code. Which version is stable enough for this kind of work?

Thanks,

May


Build failed in Jenkins: Hadoop-Common-trunk #511

2012-08-23 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Common-trunk/511/changes

Changes:

[tucu] MAPREDUCE-4470. Fix TestCombineFileInputFormat.testForEmptyFile (ikatsov 
via tucu)

[eli] HADOOP-8075. Lower native-hadoop library log from info to debug. 
Contributed by Hızır Sefa İrken

[atm] MAPREDUCE-4577. HDFS-3672 broke 
TestCombineFileInputFormat.testMissingBlocks() test. Contributed by Aaron T. 
Myers.

[tucu] MAPREDUCE-4068. Jars in lib subdirectory of the submittable JAR are not 
added to the classpath (rkanter via tucu)

[atm] HADOOP-8721. ZKFC should not retry 45 times when attempting a graceful 
fence during a failover. Contributed by Vinayakumar B.

[atm] HDFS-3835. Long-lived 2NN cannot perform a checkpoint if security is 
enabled and the NN restarts with outstanding delegation tokens. Contributed by 
Aaron T. Myers.

[jitendra] HDFS-3819. Should check whether invalidate work percentage default 
value is not greater than 1.0f. Contributed by Jing Zhao.

[suresh] Moving HDFS-2686 and HDFS-3832 to branch-2 section in CHANGES.txt

[suresh] HDFS-3832. Remove protocol methods related to DistributedUpgrade. 
Contributed by Suresh Srinivas.

[eli] HDFS-3837. Fix DataNode.recoverBlock findbugs warning. Contributed by Eli 
Collins

[eli] HADOOP-8720. TestLocalFileSystem should use test root subdirectory. 
Contributed by Vlad Rozov

[eli] HDFS-3830. test_libhdfs_threaded: use forceNewInstance. Contributed by 
Colin Patrick McCabe

[suresh] HDFS-3817. Avoid printing SafeModeException stack trace in RPC server. 
Contributed by Brandon Li.

--
[...truncated 27185 lines...]
[DEBUG]   (s) debug = false
[DEBUG]   (s) effort = Default
[DEBUG]   (s) failOnError = true
[DEBUG]   (s) findbugsXmlOutput = false
[DEBUG]   (s) findbugsXmlOutputDirectory = 
https://builds.apache.org/job/Hadoop-Common-trunk/ws/trunk/hadoop-common-project/target
[DEBUG]   (s) fork = true
[DEBUG]   (s) includeTests = false
[DEBUG]   (s) localRepository =id: local
  url: file:///home/jenkins/.m2/repository/
   layout: none

[DEBUG]   (s) maxHeap = 512
[DEBUG]   (s) nested = false
[DEBUG]   (s) outputDirectory = 
https://builds.apache.org/job/Hadoop-Common-trunk/ws/trunk/hadoop-common-project/target/site
[DEBUG]   (s) outputEncoding = UTF-8
[DEBUG]   (s) pluginArtifacts = 
[org.codehaus.mojo:findbugs-maven-plugin:maven-plugin:2.3.2:, 
com.google.code.findbugs:bcel:jar:1.3.9:compile, 
org.codehaus.gmaven:gmaven-mojo:jar:1.3:compile, 
org.codehaus.gmaven.runtime:gmaven-runtime-api:jar:1.3:compile, 
org.codehaus.gmaven.feature:gmaven-feature-api:jar:1.3:compile, 
org.codehaus.gmaven.runtime:gmaven-runtime-1.5:jar:1.3:compile, 
org.codehaus.gmaven.feature:gmaven-feature-support:jar:1.3:compile, 
org.codehaus.groovy:groovy-all-minimal:jar:1.5.8:compile, 
org.apache.ant:ant:jar:1.7.1:compile, 
org.apache.ant:ant-launcher:jar:1.7.1:compile, jline:jline:jar:0.9.94:compile, 
org.codehaus.plexus:plexus-interpolation:jar:1.1:compile, 
org.codehaus.gmaven:gmaven-plugin:jar:1.3:compile, 
org.codehaus.gmaven.runtime:gmaven-runtime-loader:jar:1.3:compile, 
org.codehaus.gmaven.runtime:gmaven-runtime-support:jar:1.3:compile, 
org.sonatype.gshell:gshell-io:jar:2.0:compile, 
com.thoughtworks.qdox:qdox:jar:1.10:compile, 
org.apache.maven.shared:file-management:jar:1.2.1:compile, 
org.apache.maven.shared:maven-shared-io:jar:1.1:compile, 
commons-lang:commons-lang:jar:2.4:compile, 
org.slf4j:slf4j-api:jar:1.5.10:compile, 
org.sonatype.gossip:gossip:jar:1.2:compile, 
org.apache.maven.reporting:maven-reporting-impl:jar:2.1:compile, 
commons-validator:commons-validator:jar:1.2.0:compile, 
commons-beanutils:commons-beanutils:jar:1.7.0:compile, 
commons-digester:commons-digester:jar:1.6:compile, 
commons-logging:commons-logging:jar:1.0.4:compile, oro:oro:jar:2.0.8:compile, 
xml-apis:xml-apis:jar:1.0.b2:compile, 
org.codehaus.groovy:groovy-all:jar:1.7.4:compile, 
org.apache.maven.reporting:maven-reporting-api:jar:3.0:compile, 
org.apache.maven.doxia:doxia-core:jar:1.1.3:compile, 
org.apache.maven.doxia:doxia-logging-api:jar:1.1.3:compile, 
xerces:xercesImpl:jar:2.9.1:compile, 
commons-httpclient:commons-httpclient:jar:3.1:compile, 
commons-codec:commons-codec:jar:1.2:compile, 
org.apache.maven.doxia:doxia-sink-api:jar:1.1.3:compile, 
org.apache.maven.doxia:doxia-decoration-model:jar:1.1.3:compile, 
org.apache.maven.doxia:doxia-site-renderer:jar:1.1.3:compile, 
org.apache.maven.doxia:doxia-module-xhtml:jar:1.1.3:compile, 
org.apache.maven.doxia:doxia-module-fml:jar:1.1.3:compile, 
org.codehaus.plexus:plexus-i18n:jar:1.0-beta-7:compile, 
org.codehaus.plexus:plexus-velocity:jar:1.1.7:compile, 
org.apache.velocity:velocity:jar:1.5:compile, 
commons-collections:commons-collections:jar:3.2:compile, 
org.apache.maven.shared:maven-doxia-tools:jar:1.2.1:compile, 
commons-io:commons-io:jar:1.4:compile, 
com.google.code.findbugs:findbugs-ant:jar:1.3.9:compile, 
com.google.code.findbugs:findbugs:jar:1.3.9:compile, 

Re: How to get TaskId from ContainerId or ApplicationId or Request in Hadoop 0.23??

2012-08-23 Thread Robert Evans
There really is no way.  The RM also has no knowledge of map tasks vs
reduce tasks nor should it know.

--Bobby

On 8/22/12 8:23 PM, Shekhar Gupta shkhr...@gmail.com wrote:

In the ResourceManager, is there any way to find out whether an assigned
container is going to execute a map task or a reduce task? I can access the
Container, Application and Request objects in the ResourceManager; can I
somehow get the TaskId from any of these objects? Please let me know a way.

Thanks.



[jira] [Created] (HADOOP-8723) Remove tests and tests-sources jars from classpath

2012-08-23 Thread Jason Lowe (JIRA)
Jason Lowe created HADOOP-8723:
--

 Summary: Remove tests and tests-sources jars from classpath
 Key: HADOOP-8723
 URL: https://issues.apache.org/jira/browse/HADOOP-8723
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Jason Lowe


Currently {{hadoop-config.sh}} is including any tests and tests-sources jars in 
the default classpath, as those jars are shipped in the dist tarball next to 
the main jars and the script is globbing everything in that directory.

The tests and tests-sources jars aren't required to run Hadoop, but they can 
cause breakage when inadvertently picked up.  See HDFS-3831 as an example.  
Ideally we should not be adding these jars to the classpath.
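
To illustrate the kind of filtering described above, here is a minimal Python 
sketch. It is not the actual hadoop-config.sh change, only a model of the idea: 
build the classpath from a directory listing while skipping tests and 
tests-sources jars. The suffix patterns, the function names and the example 
directory are assumptions made for illustration; check them against the jar 
names in your own dist tarball.

```python
# Assumed suffixes for jars that should stay off the runtime classpath.
# These patterns are illustrative, not taken from hadoop-config.sh.
EXCLUDED_SUFFIXES = ("-tests.jar", "-test-sources.jar", "-tests-sources.jar")


def is_runtime_jar(filename):
    """Return True for .jar files that are not tests or tests-sources jars."""
    return filename.endswith(".jar") and not filename.endswith(EXCLUDED_SUFFIXES)


def build_classpath(jar_names, directory):
    """Join the runtime jars found in `directory` into a ':'-separated classpath."""
    kept = sorted(name for name in jar_names if is_runtime_jar(name))
    return ":".join(f"{directory}/{name}" for name in kept)


if __name__ == "__main__":
    # Hypothetical listing of a dist directory.
    listing = [
        "hadoop-common-2.0.0-alpha.jar",
        "hadoop-common-2.0.0-alpha-tests.jar",
        "hadoop-common-2.0.0-alpha-test-sources.jar",
    ]
    print(build_classpath(listing, "/opt/hadoop/share/hadoop/common"))
```

Filtering by explicit suffix keeps the main jars on the classpath without 
having to enumerate them, which is roughly what a glob with exclusions would do 
in the shell script.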

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: How to get TaskId from ContainerId or ApplicationId or Request in Hadoop 0.23??

2012-08-23 Thread Vinod Kumar Vavilapalli

Moving to yarn-dev, as that is the right place for this discussion.

Can you let us know more about what you are trying to accomplish? Are you 
working with MapReduce over YARN, or with your own YARN application?

If you are working with MR over YARN, note that TaskId, TaskAttemptID, and map 
and reduce tasks are MapReduce concepts, so the ResourceManager has no idea 
about them. The MapReduce ApplicationMaster is the place where you can obtain 
such MR-specific information.

If it is the latter, your question is moot.

HTH,
+Vinod

On Aug 22, 2012, at 6:23 PM, Shekhar Gupta wrote:

 In the ResourceManager, is there any way to find out whether an assigned
 container is going to execute a map task or a reduce task? I can access the
 Container, Application and Request objects in the ResourceManager; can I
 somehow get the TaskId from any of these objects? Please let me know a way.
 
 Thanks.



Re: How to get TaskId from ContainerId or ApplicationId or Request in Hadoop 0.23??

2012-08-23 Thread Shekhar Gupta
Thanks, Vinod. I am working with MR over YARN.

When I run a job, I am trying to compute, for a specific machine, how much
time that machine spends executing map tasks and reduce tasks. What I do now
is compute the time difference between the container assignment and container
release steps, but I can't determine whether a given container was assigned
to a map task or a reduce task. I'll look into the ApplicationMaster to see
whether I can get this information there.
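
A minimal sketch of that per-machine breakdown, assuming the task attempt IDs
and their start and finish times have already been collected on the MapReduce
side (for example from the ApplicationMaster). The record layout, host names
and function names are hypothetical. The only convention relied on is that a
MapReduce task attempt ID embeds the task type as m or r, as in
attempt_1345700000000_0001_m_000003_0.

```python
from collections import defaultdict

# Task type letters as they appear in MapReduce task attempt IDs.
TASK_KINDS = {"m": "map", "r": "reduce"}


def task_type(attempt_id):
    """Return 'map' or 'reduce' for an ID like attempt_<ts>_<job>_<m|r>_<task>_<n>."""
    parts = attempt_id.split("_")
    if len(parts) != 6 or parts[0] != "attempt" or parts[3] not in TASK_KINDS:
        raise ValueError(f"unexpected attempt id: {attempt_id!r}")
    return TASK_KINDS[parts[3]]


def time_per_host(records):
    """Sum elapsed seconds per (host, task type).

    `records` is an iterable of (host, attempt_id, start_s, finish_s) tuples.
    This layout is an assumption made for the sketch, not a Hadoop API.
    """
    totals = defaultdict(float)
    for host, attempt_id, start_s, finish_s in records:
        totals[(host, task_type(attempt_id))] += finish_s - start_s
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical measurements, in seconds since the job started.
    sample = [
        ("node1", "attempt_1345700000000_0001_m_000000_0", 0.0, 10.0),
        ("node1", "attempt_1345700000000_0001_r_000000_0", 10.0, 25.0),
        ("node2", "attempt_1345700000000_0001_m_000001_0", 0.0, 4.0),
    ]
    for (host, kind), seconds in sorted(time_per_host(sample).items()):
        print(host, kind, seconds)
```

Keying the totals by (host, task type) avoids needing the ResourceManager to
know anything about map or reduce tasks; the distinction comes entirely from
the MapReduce-level attempt ID.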







[jira] [Created] (HADOOP-8724) Add improved APIs for globbing

2012-08-23 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created HADOOP-8724:
---

 Summary: Add improved APIs for globbing
 Key: HADOOP-8724
 URL: https://issues.apache.org/jira/browse/HADOOP-8724
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans


After the discussion on HADOOP-8709 it was decided that we need better APIs for 
globbing to remove some of the inconsistencies with other APIs.  In order to 
maintain backwards compatibility we should deprecate the existing APIs and add 
new ones.

See HADOOP-8709 for more information about exactly how those APIs should look 
and behave.





[jira] [Created] (HADOOP-8725) MR is broken when security is off

2012-08-23 Thread Daryn Sharp (JIRA)
Daryn Sharp created HADOOP-8725:
---

 Summary: MR is broken when security is off
 Key: HADOOP-8725
 URL: https://issues.apache.org/jira/browse/HADOOP-8725
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker


HADOOP-8225 broke MR when security is off.  MR was changed to stop re-reading 
the credentials that UGI had already read, and to stop putting those tokens 
back into the UGI where they already were.  UGI only reads a credentials file 
when security is enabled, but MR uses tokens (i.e., the job token) even when 
security is disabled...





Re: New to Hadoop

2012-08-23 Thread Chandrashekhar Kotekar
Hi,

You don't need to take any course for Hadoop. I suggest you first read the
documentation and books to understand Hadoop's architecture and concepts.
Then start using Hadoop and write some programs with it. Once you know the
basics you can contribute, but don't jump directly into contributing code;
understand Hadoop first.

All the best.


Regards,
Chandrash3khar K0tekar
Mobile - 9766632117



On Fri, Aug 24, 2012 at 3:58 AM, Subodhini smv260...@gmail.com wrote:

 Hello,

 I want to contribute to the Apache Hadoop project. I am absolutely new to
 Hadoop. Can you please tell me the best place to start with Hadoop, and
 which computer science courses are prerequisites for it?

 I have work experience in Java programming language and hold a background
 in Computer Science.

 Regards,
 Subodhini



Does anyone do related work on the log of hadoop?

2012-08-23 Thread Li Shengmei
Hi, all

 I want to study Hadoop's logs. In particular, I would like to know how
Hadoop records I/O in its logs.

Has anyone done related work, and could you offer some suggestions?

Thanks,

May