[jira] [Created] (MAPREDUCE-5250) Searching for ';' in JobTracker History throws ArrayOutOfBoundException

2013-05-15 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created MAPREDUCE-5250:
---

 Summary: Searching for ';' in JobTracker History throws 
ArrayOutOfBoundException 
 Key: MAPREDUCE-5250
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5250
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Minor


Searching for ';' in JobTracker History throws ArrayOutOfBoundException 

{noformat}
Problem accessing /jobhistoryhome.jsp. Reason:

0
Caused by:

java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.hadoop.mapred.jobhistoryhome_jsp._jspService(jobhistoryhome_jsp.java:221)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at 
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
at 
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at 
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{noformat}
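The JSP source isn't quoted in the report, but the `ArrayIndexOutOfBoundsException: 0` is exactly what indexing `[0]` on an empty `String.split` result produces: in Java, splitting a string that consists only of the delimiter yields a zero-length array because trailing empty tokens are discarded. A minimal standalone sketch (class and method names are illustrative, not the actual `jobhistoryhome_jsp` code):

```java
public class SplitGuardDemo {
    // Return the first token of the split, or a default when the
    // split yields no tokens at all (e.g. the query is just the delimiter).
    static String firstTokenOrDefault(String query, String sep, String dflt) {
        String[] parts = query.split(sep);
        return parts.length > 0 ? parts[0] : dflt;
    }

    public static void main(String[] args) {
        // ";".split(";") drops trailing empty tokens, leaving an empty array,
        // so an unguarded parts[0] throws ArrayIndexOutOfBoundsException: 0
        // -- matching the reported stack trace.
        System.out.println(";".split(";").length);               // 0
        System.out.println(firstTokenOrDefault(";", ";", "all")); // falls back safely
    }
}
```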

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


which mapreduce project to work on?

2013-05-15 Thread Elma Barjami
Hi all,

I'm new to Hadoop and I need some help.
What I'm trying to do is modify the MapReduce framework so that some reduce 
work begins before all maps finish.

The first question is: which project from SVN is the most appropriate to work 
on/develop?

Second, is it possible to implement such a thing, and if so, which 
classes/packages should I look into?


thanks,
Elma

  

Re: which mapreduce project to work on?

2013-05-15 Thread Harsh J
Are you looking to do something similar to
http://www.neilconway.org/docs/nsdi2010_hop.pdf?

Ideally you should be working on MR2 which replaces MR1, and MR2 is
available in the current 2.0.x releases as well as in trunk. If not a
specific revision of trunk, you can pick the 2.x release (or branch-2)
if you seek some stability while developing your work - merging to
trunk is easier from there.

On Wed, May 15, 2013 at 3:13 PM, Elma Barjami elmabarj...@hotmail.com wrote:
 Hi all,

 I'm new to Hadoop and I need some help.
 What I'm trying to do is modify the MapReduce framework so that some reduce 
 work begins before all maps finish.

 The first question is: which project from SVN is the most appropriate to work 
 on/develop?

 Second, is it possible to implement such a thing, and if so, which 
 classes/packages should I look into?


 thanks,
 Elma





-- 
Harsh J


recent enhancements on map reduce project

2013-05-15 Thread Samaneh Shokuhi
Hello All,
I want to know about the enhancements and optimizations done recently on
Hadoop (the MapReduce project). Is there any documentation or paper
explaining the aim of those modifications?


Best,
Samaneh


RE: which mapreduce project to work on?

2013-05-15 Thread Elma Barjami
Hi Harsh,




Thank you for your fast reply. Yes, I want to implement something similar to 
the paper you sent me.
But I still have some problems with the structure of the projects in SVN.

You wrote that I should work on MR2. That means YARN, doesn't it? After going to 
http://svn.apache.org/repos/asf/hadoop/common/trunk/ 
I see a lot of projects there. From what you wrote, I think I should work on 
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-yarn-project/.
Is that right? But even there 
(http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/)
there are many sub-projects. Could you please explain which project is the 
right one to work on, and how I can import the projects into Eclipse as
normal Eclipse projects?
Thank you

 From: ha...@cloudera.com
 Date: Wed, 15 May 2013 16:24:12 +0530
 Subject: Re: which mapreduce project to work on?
 To: mapreduce-dev@hadoop.apache.org
 
 Are you looking to do something similar to
 http://www.neilconway.org/docs/nsdi2010_hop.pdf?
 
 Ideally you should be working on MR2 which replaces MR1, and MR2 is
 available in the current 2.0.x releases as well as in trunk. If not a
 specific revision of trunk, you can pick the 2.x release (or branch-2)
 if you seek some stability while developing your work - merging to
 trunk is easier from there.
 
 On Wed, May 15, 2013 at 3:13 PM, Elma Barjami elmabarj...@hotmail.com wrote:
  Hi all,

  I'm new to Hadoop and I need some help.
  What I'm trying to do is modify the MapReduce framework so that some 
  reduce work begins before all maps finish.

  The first question is: which project from SVN is the most appropriate to work 
  on/develop?

  Second, is it possible to implement such a thing, and if so, which 
  classes/packages should I look into?
 
 
  thanks,
  Elma
 
 
 
 
 
 -- 
 Harsh J
  

[jira] [Created] (MAPREDUCE-5252) Fair scheduler should use SchedulerUtils.normalizeRequest

2013-05-15 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5252:
-

 Summary: Fair scheduler should use SchedulerUtils.normalizeRequest
 Key: MAPREDUCE-5252
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5252
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Minor


The capacity scheduler and the FIFO scheduler use the same normalizeRequest in 
SchedulerUtils.  The fair scheduler has its own version of this method that 
does exactly the same thing.  It should use the common one.
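For context, a hedged sketch of what a normalizeRequest-style helper does; this is illustrative, not the actual `SchedulerUtils` source. The idea is to round each container request up to a multiple of the scheduler's minimum allocation, so that every scheduler hands out containers in the same granularity:

```java
public class NormalizeSketch {
    // Round a requested memory size up to the nearest multiple of the
    // scheduler's minimum allocation; a non-positive request becomes one
    // minimum-sized container. Names and signature are illustrative only.
    static int normalizeMemory(int requestedMb, int minAllocMb) {
        if (requestedMb <= 0) {
            return minAllocMb;                       // never grant a zero-sized container
        }
        int multiples = (requestedMb + minAllocMb - 1) / minAllocMb; // ceiling division
        return multiples * minAllocMb;
    }

    public static void main(String[] args) {
        System.out.println(normalizeMemory(1500, 1024)); // 2048
        System.out.println(normalizeMemory(1024, 1024)); // 1024
        System.out.println(normalizeMemory(0, 1024));    // 1024
    }
}
```

Keeping one shared implementation means a change to the rounding policy (or a bug fix in it) automatically applies to all three schedulers.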



Patch for v1.1.2 (if this tree is under development)

2013-05-15 Thread Karl Gierach
Hi,

Below is a patch for Hadoop v1.1.2.  I'm new to this list, so if I need to 
write up a JIRA ticket for this, please let me know.

The defect scenario is that if you enter any white space within values in this 
file:

    /etc/hadoop/mapred-site.xml


e.g.: (a white space prior to the -X...)


  <property>
    <name>mapred.reduce.child.java.opts</name>
    <value> -Xmx1G</value>
  </property>


All of the child jobs fail, and each child gets an error in the stderr log like:

Could not find the main class: . Program will exit.


The root cause is obvious in the patch below - the value is split on 
whitespace, and any preceding whitespace ultimately becomes a zero-length 
entry on the child JVM command line, causing the JVM to treat the empty '' 
argument as the main class.   The patch just skips over any zero-length entries 
prior to adding them to the JVM vargs list.  I looked in trunk as well, to see 
if the patch would apply there, but it looks like Tasks were refactored and this 
code file is no longer present.


This error occurred on OpenJDK, CentOS 6.2, 32-bit.


Regards,
Karl


Index: src/mapred/org/apache/hadoop/mapred/TaskRunner.java
===
--- src/mapred/org/apache/hadoop/mapred/TaskRunner.java    (revision 1482686)
+++ src/mapred/org/apache/hadoop/mapred/TaskRunner.java    (working copy)
@@ -437,7 +437,9 @@
   vargs.add("-Djava.library.path=" + libraryPath);
 }
 for (int i = 0; i < javaOptsSplit.length; i++) {
-  vargs.add(javaOptsSplit[i]);
+  if( javaOptsSplit[i].trim().length() > 0 ) {
+    vargs.add(javaOptsSplit[i]);
+  }
 }
 
 Path childTmpDir = createChildTmpDir(workDir, conf, false);
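The failure mode is easy to reproduce in isolation. The following standalone snippet (illustrative, not Hadoop code) shows how the leading space in the configured value turns into a zero-length first token, which is what the child JVM then mistakes for the main class name:

```java
public class JavaOptsSplitDemo {
    public static void main(String[] args) {
        String opts = " -Xmx1G";          // leading space, as in the misconfigured <value>
        String[] split = opts.split(" ");
        // Java's split keeps *leading* empty tokens (only trailing ones are
        // dropped), so split is ["", "-Xmx1G"] -- a zero-length first argument.
        System.out.println(split.length);          // 2
        System.out.println("[" + split[0] + "]");  // [] : the empty "main class"
        System.out.println("[" + split[1] + "]");  // [-Xmx1G]
    }
}
```

The patched loop drops exactly that empty token, so only `-Xmx1G` reaches the child command line.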

[jira] [Created] (MAPREDUCE-5253) Whitespace value entry in mapred-site.xml for name=mapred.reduce.child.java.opts causes child tasks to fail at launch

2013-05-15 Thread Karl D. Gierach (JIRA)
Karl D. Gierach created MAPREDUCE-5253:
--

 Summary: Whitespace value entry in mapred-site.xml for 
name=mapred.reduce.child.java.opts causes child tasks to fail at launch
 Key: MAPREDUCE-5253
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5253
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 1.1.2
 Environment: Centos 6.2 32Bit, OpenJDK
Reporter: Karl D. Gierach
 Fix For: 1.1.3


Hi,

Below is a patch for Hadoop v1.1.2.  I'm new to this list, so if I need to 
write up a JIRA ticket for this, please let me know.

The defect scenario is that if you enter any white space within values in this 
file:
/etc/hadoop/mapred-site.xml

e.g.: (a white space prior to the -X...)

  <property>
    <name>mapred.reduce.child.java.opts</name>
    <value> -Xmx1G</value>
  </property>

All of the child jobs fail, and each child gets an error in the stderr log like:

Could not find the main class: . Program will exit.

The root cause is obvious in the patch below - the value is split on 
whitespace, and any preceding whitespace ultimately becomes a zero-length 
entry on the child JVM command line, causing the JVM to treat the empty '' 
argument as the main class.   The patch just skips over any zero-length entries 
prior to adding them to the JVM vargs list.  I looked in trunk as well, to see 
if the patch would apply there, but it looks like Tasks were refactored and this 
code file is no longer present.

This error occurred on Open JDK, Centos 6.2, 32 bit.

Regards,
Karl


Index: src/mapred/org/apache/hadoop/mapred/TaskRunner.java
===
--- src/mapred/org/apache/hadoop/mapred/TaskRunner.java(revision 1482686)
+++ src/mapred/org/apache/hadoop/mapred/TaskRunner.java(working copy)
@@ -437,7 +437,9 @@
   vargs.add("-Djava.library.path=" + libraryPath);
 }
 for (int i = 0; i < javaOptsSplit.length; i++) {
-  vargs.add(javaOptsSplit[i]);
+  if( javaOptsSplit[i].trim().length() > 0 ) {
+    vargs.add(javaOptsSplit[i]);
+  }
 }
 
 Path childTmpDir = createChildTmpDir(workDir, conf, false);



Re: Patch for v1.1.2 (if this tree is under development)

2013-05-15 Thread Chris Nauroth
Hello Karl,

Thank you for investigating the issue and preparing a patch!  Yes, the 1.x
line is still maintained.  (We just released 1.2.0.)  All patches do
require a corresponding issue in JIRA, so I recommend logging in to
https://issues.apache.org/jira/ and submitting this information in a new
issue there.  Additional details on contribution are available here:

http://wiki.apache.org/hadoop/HowToContribute

Regarding trunk, it appears that the current version of the code would also
suffer from the same problem.  See
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java,
method getVMCommand:

https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java#L156

The full scope of your JIRA issue likely would need to cover fixes for both
trunk and branch-1, as well as a unit test that covers this case.
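A hedged sketch of the kind of unit test that could cover this case; the class and helper names here are illustrative, not the actual Hadoop test structure, and the filter logic mirrors the patch above:

```java
public class ChildJvmOptsTest {
    // Mirror of the patched loop: split the opts string on spaces and keep
    // only non-blank tokens for the child JVM argument list.
    static java.util.List<String> filterOpts(String javaOpts) {
        java.util.List<String> vargs = new java.util.ArrayList<>();
        for (String opt : javaOpts.split(" ")) {
            if (opt.trim().length() > 0) {
                vargs.add(opt);
            }
        }
        return vargs;
    }

    public static void main(String[] args) {
        // Leading and doubled spaces must not produce empty arguments.
        java.util.List<String> vargs = filterOpts(" -Xmx1G  -Dfoo=bar");
        if (vargs.size() != 2) throw new AssertionError(vargs.toString());
        if (!vargs.get(0).equals("-Xmx1G")) throw new AssertionError();
        if (!vargs.get(1).equals("-Dfoo=bar")) throw new AssertionError();
        System.out.println("ok: " + vargs);
    }
}
```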

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Wed, May 15, 2013 at 5:25 PM, Karl Gierach k...@gierach.net wrote:

 Hi,

 Below is a patch for Hadoop v1.1.2.  I'm new to this list, so if I need to
 write up a JIRA ticket for this, please let me know.

 The defect scenario is that if you enter any white space within values in
 this file:

 /etc/hadoop/mapred-site.xml


 e.g.: (a white space prior to the -X...)


   <property>
 <name>mapred.reduce.child.java.opts</name>
 <value> -Xmx1G</value>
   </property>


 All of the child jobs fail, and each child gets an error in the stderr log
 like:

 Could not find the main class: . Program will exit.


 The root cause is obvious in the patch below - the split on the value was
 done on whitespace, and any preceding whitespace ultimately becomes a
 zero-length entry on the child jvm command line, causing the jvm to think
 that a '' argument is the main class.   The patch just skips over any
 zero-length entries prior to adding them to the jvm vargs list.  I looked
 in trunk as well, to see if the patch would apply there but it looks like
 Tasks were refactored and this code file is not present any more.


 This error occurred on Open JDK, Centos 6.2, 32 bit.


 Regards,
 Karl


 Index: src/mapred/org/apache/hadoop/mapred/TaskRunner.java
 ===
 --- src/mapred/org/apache/hadoop/mapred/TaskRunner.java(revision
 1482686)
 +++ src/mapred/org/apache/hadoop/mapred/TaskRunner.java(working copy)
 @@ -437,7 +437,9 @@
   vargs.add("-Djava.library.path=" + libraryPath);
  }
  for (int i = 0; i < javaOptsSplit.length; i++) {
 -  vargs.add(javaOptsSplit[i]);
 +  if( javaOptsSplit[i].trim().length() > 0 ) {
 +    vargs.add(javaOptsSplit[i]);
 +  }
  }

  Path childTmpDir = createChildTmpDir(workDir, conf, false);