date:20120711

status update

2012-07-11 Thread miomir


Hi everyone,

Could someone update the status of MAPREDUCE-2825.
Class TestRackAwareTaskPlacement doesn't seem to exist anymore?

Miomir Boljanovic

Re: status update

2012-07-11 Thread Harsh J

Hey Miomir,

Commented.

On Wed, Jul 11, 2012 at 1:05 PM,  mio...@internet.is wrote:
 Hi everyone,

 Could someone update the status of MAPREDUCE-2825.
 Class TestRackAwareTaskPlacement doesn't seem to exist anymore?

 Miomir Boljanovic



-- 
Harsh J

[jira] [Resolved] (MAPREDUCE-2825) Factor out commonly used code in mapred testcases

2012-07-11 Thread Harsh J (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved MAPREDUCE-2825.


Resolution: Not A Problem

Miomir,

This issue has gone stale over the past few years. The current MR tests already 
use common classes/methods, so if you find a new spot for improvement in the 
current set of tests, please file a new JIRA for it.

Am resolving this as Not A Problem (anymore).

Sorry that the newbie tag misled you here!

 Factor out commonly used code in mapred testcases
 -

 Key: MAPREDUCE-2825
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2825
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Reporter: Amar Kamat
Priority: Minor

 The commonly used code in the testcases is made _static_, like 
 {{TestRackAwareTaskPlacement.configureJobConf()}}. It would be nice to factor 
 out these APIs and add them either to a class like {{StringUtils}} or to a 
 separate dir like {{utils}}.
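
A rough sketch of the kind of refactoring described above, for illustration only. Everything here is hypothetical: the class name MapredTestUtils, the method signature, and the property keys are assumptions, and java.util.Properties stands in for JobConf. The real signature of configureJobConf() is not shown in this thread.

```java
import java.util.Properties;

// Hypothetical shared helper class; not an actual Hadoop class.
final class MapredTestUtils {
    private MapredTestUtils() {}

    // Applies settings that several tests would otherwise each define as a
    // static method. Properties stands in for JobConf; keys are illustrative.
    static Properties configureJobConf(Properties conf, String jobName,
                                       int numMaps, int numReduces) {
        conf.setProperty("job.name", jobName);
        conf.setProperty("num.maps", Integer.toString(numMaps));
        conf.setProperty("num.reduces", Integer.toString(numReduces));
        return conf;
    }
}

// Two hypothetical test classes that share the helper instead of calling a
// static method owned by another test class.
final class RackAwareTestSketch {
    static Properties newConf() {
        return MapredTestUtils.configureJobConf(new Properties(), "rack-aware", 4, 0);
    }
}

final class OtherTestSketch {
    static Properties newConf() {
        return MapredTestUtils.configureJobConf(new Properties(), "other", 2, 1);
    }
}
```

With this layout, removing or renaming one test class no longer breaks the tests that borrowed its static helpers.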

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Reopened] (MAPREDUCE-3837) Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-07-11 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reopened MAPREDUCE-3837:
--


Looks like this needs a minor update to get it to work on Mac OSX...

 Job tracker is not able to recover job in case of crash and after that no 
 user can submit job.
 --

 Key: MAPREDUCE-3837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 0.22.0, 1.1.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 0.24.0, 1.2.0, 0.22.1, 0.23.2

 Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
 PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837-3.patch, 
 PATCH-HADOOP-1-MAPREDUCE-3837-4.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
 PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch


 If the job tracker crashes while jobs are running, and the job tracker's 
 property mapreduce.jobtracker.restart.recover is true, then it should recover 
 the jobs.
 However, the current behavior is as follows: the jobtracker tries to restore 
 the jobs but cannot. After that, the jobtracker closes its handle to HDFS and 
 nobody else can submit a job. 
 Thanks,
 Mayank


[jira] [Reopened] (MAPREDUCE-4253) Tests for mapreduce-client-core are lying under mapreduce-client-jobclient

2012-07-11 Thread Harsh J (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reopened MAPREDUCE-4253:



Reopening to locate and commit the missed 12 files.

 Tests for mapreduce-client-core are lying under mapreduce-client-jobclient
 --

 Key: MAPREDUCE-4253
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4253
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: client
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
 Fix For: 2.0.1-alpha

 Attachments: MR-4253.1.patch, MR-4253.2.patch, 
 crossing_project_checker.rb, result.txt


 Many of the tests for client libs from mapreduce-client-core are lying under 
 mapreduce-client-jobclient.
 We should investigate if this is the right thing to do and if not, move the 
 tests back into client-core.


[jira] [Resolved] (MAPREDUCE-3837) Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-07-11 Thread Arun C Murthy (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Arun C Murthy resolved MAPREDUCE-3837.
--

Resolution: Fixed

Thanks for the reviews Tom & Mayank. I've just committed the small patch.

Job tracker is not able to recover job in case of crash and after that no
user can submit job.
--

Key: MAPREDUCE-3837
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 0.22.0, 1.1.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
Fix For: 1.2.0, 0.22.1

Attachments: MAPREDUCE-3837_addendum.patch,
PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, PATCH-HADOOP-1-MAPREDUCE-3837-2.patch,
PATCH-HADOOP-1-MAPREDUCE-3837-3.patch, PATCH-HADOOP-1-MAPREDUCE-3837-4.patch,
PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch,
PATCH-TRUNK-MAPREDUCE-3837.patch

If the job tracker crashes while jobs are running, and the job tracker's
property mapreduce.jobtracker.restart.recover is true, then it should recover
the jobs.
However, the current behavior is as follows: the jobtracker tries to restore
the jobs but cannot. After that, the jobtracker closes its handle to HDFS and
nobody else can submit a job.
Thanks,
Mayank


shim for metrics2 framework across hadoop 1 and hadoop 2

2012-07-11 Thread Ted Yu

Hi,
Please take a look at Alex's comments (especially option 2):
https://issues.apache.org/jira/browse/HBASE-4050?focusedCommentId=13411693&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13411693

Your feedback is welcome.

[jira] [Created] (MAPREDUCE-4428) A failed job is not available under job history if the job is killed right around the time job is notified as failed

2012-07-11 Thread Rahul Jain (JIRA)

Rahul Jain created MAPREDUCE-4428:
-

 Summary: A failed job is not available under job history if the 
job is killed right around the time job is notified as failed 
 Key: MAPREDUCE-4428
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4428
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, jobtracker
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain


We have observed this issue consistently when running the hadoop CDH4 version 
(based upon the 2.0 alpha release):

When our hadoop client code gets a notification for a completed job (using a 
RunningJob object job, with job.isComplete() && job.isSuccessful()==false), 
the hadoop client code does an unconditional job.killJob() to terminate the job.

With earlier hadoop versions (verified on hadoop 0.20.2), we still have full 
access to the job logs afterwards through the hadoop console. However, when 
using MapReduce V2, the failed hadoop job no longer shows up under the 
jobhistory server. Also, the tracking URL of the job still points to the 
non-existent Application Master HTTP port.

Once we removed the call to job.killJob() for failed jobs from our hadoop 
client code, we were able to access the job in job history with MapReduce V2 
as well. Therefore this appears to be a race condition in job management with 
respect to job history for failed jobs.

We do have the application master and node manager logs collected for this 
scenario if that'll help isolate the problem and the fix better.
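
For illustration, the client-side pattern described above and the workaround can be sketched as follows. This is only a sketch under stated assumptions: JobHandle is a hypothetical stand-in that exposes the three RunningJob methods named in this report (the checked exceptions are left out for brevity), and FakeJob exists only to make the example self-contained. It does not reproduce the race itself.

```java
// Hypothetical stand-in for the three RunningJob methods named in the report.
// Checked exceptions are omitted here to keep the sketch short.
interface JobHandle {
    boolean isComplete();
    boolean isSuccessful();
    void killJob();
}

final class FailedJobHandling {
    private FailedJobHandling() {}

    // Original client behavior: unconditionally kill a job that has
    // completed unsuccessfully. Returns true if killJob() was called.
    static boolean killIfFailed(JobHandle job) {
        if (job.isComplete() && !job.isSuccessful()) {
            job.killJob();
            return true;
        }
        return false;
    }

    // Workaround from the report: detect the failure but never call
    // killJob() on a job that has already completed.
    static boolean isFailed(JobHandle job) {
        return job.isComplete() && !job.isSuccessful();
    }
}

// Fake job used only to exercise the two code paths.
final class FakeJob implements JobHandle {
    private final boolean complete;
    private final boolean successful;
    int killCalls = 0;

    FakeJob(boolean complete, boolean successful) {
        this.complete = complete;
        this.successful = successful;
    }

    @Override public boolean isComplete() { return complete; }
    @Override public boolean isSuccessful() { return successful; }
    @Override public void killJob() { killCalls++; }
}
```

The only difference between the two paths is whether killJob() is issued after the job has already been reported as failed, which is the call the report identifies as triggering the missing history entry.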


[jira] [Created] (MAPREDUCE-4429) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Zhihong Ted Yu (JIRA)

Zhihong Ted Yu created MAPREDUCE-4429:
-

 Summary: Upgrade Guava for critical performance bug fix
 Key: MAPREDUCE-4429
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4429
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Zhihong Ted Yu


The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055

See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
CacheBuilder/LoadingCache fixed!'



[jira] [Resolved] (MAPREDUCE-4257) Support fair-sharing option within a MR2 Capacity Scheduler queue

2012-07-11 Thread Tom White (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved MAPREDUCE-4257.
--

Resolution: Invalid

Yes, I don't think we need this.

 Support fair-sharing option within a MR2 Capacity Scheduler queue
 -

 Key: MAPREDUCE-4257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4257
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: capacity-sched, mrv2
Reporter: Tom White
Assignee: Karthik Kambatla

 The fair scheduler can run jobs in a single pool (queue) in FIFO or fair 
 share mode. In FIFO mode one job runs at a time, in priority order, while in 
 fair share mode multiple jobs can run at the same time, and they share the 
 capacity of the pool. This JIRA is to add the latter feature to Capacity 
 Scheduler as an option - the default would remain FIFO.
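
 The difference between the two modes can be sketched as a simple slot 
 allocation. This is a deliberately simplified illustration, not scheduler 
 code: the class and method names are hypothetical, jobs are assumed to be 
 already sorted in priority order, and per-job demand is ignored.

```java
final class PoolSharingSketch {
    private PoolSharingSketch() {}

    // FIFO mode: the first job in priority order gets the whole pool,
    // and the remaining jobs wait.
    static int[] fifo(int capacity, int numJobs) {
        int[] slots = new int[numJobs];
        if (numJobs > 0) {
            slots[0] = capacity;
        }
        return slots;
    }

    // Fair share mode: all jobs run at once and split the pool evenly;
    // any remainder goes to the earliest jobs, one slot each.
    static int[] fairShare(int capacity, int numJobs) {
        int[] slots = new int[numJobs];
        for (int i = 0; i < numJobs; i++) {
            slots[i] = capacity / numJobs + (i < capacity % numJobs ? 1 : 0);
        }
        return slots;
    }
}
```

 For a pool of 10 slots and 3 jobs, fifo gives all 10 slots to the first job, 
 while fairShare gives the jobs 4, 3 and 3 slots.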


[jira] [Created] (MAPREDUCE-4430) Adding child queues to any queue need the process restart ./yarn rmadmin -refreshQueues throws IO exception

2012-07-11 Thread Nishan Shetty (JIRA)

Nishan Shetty created MAPREDUCE-4430:


 Summary: Adding child queues to any queue need the process restart 
./yarn rmadmin -refreshQueues throws IO exception
 Key: MAPREDUCE-4430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Nishan Shetty


1. Configure different queues for the capacity scheduler, say a and b under root.
2. Start the process.
3. Now add the child queues a1 and a2 under a.
4. Now refresh the queues with the command ./yarn rmadmin -refreshQueues

Observed that it throws the following IO exception:

{noformat}
java.io.IOException: Failed to re-init queues
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:216)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:174)
    at org.apache.hadoop.yarn.server.resourcemanager.api.impl.pb.service.RMAdminProtocolPBServiceImpl.refreshQueues(RMAdminProtocolPBServiceImpl.java:62)
    at org.apache.hadoop.yarn.proto.RMAdminProtocol$RMAdminProtocolService$2.callBlockingMethod(RMAdminProtocol.java:122)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
Caused by: java.io.IOException: Trying to reinitialize root.b from root.b
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.reinitialize(LeafQueue.java:554)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.reinitialize(ParentQueue.java:387)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:257)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:213)
    ... 11 more
 at LocalTrace:
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Failed to re-init queues
    at org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:50)
    at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:40)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:184)
    at org.apache.hadoop.yarn.server.resourcemanager.api.impl.pb.service.RMAdminProtocolPBServiceImpl.refreshQueues(RMAdminProtocolPBServiceImpl.java:62)
    at org.apache.hadoop.yarn.proto.RMAdminProtocol$RMAdminProtocolService$2.callBlockingMethod(RMAdminProtocol.java:122)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
Caused by: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to reinitialize root.b from root.b
    at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.getCause(YarnRemoteExceptionPBImpl.java:94)
    at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.getCause(YarnRemoteExceptionPBImpl.java:32)
    at java.lang.Throwable.printStackTrace(Throwable.java:514)
    at org.apache.hadoop.yarn.exceptions.YarnRemoteException.printStackTrace(YarnRemoteException.java:48)
    at org.apache.hadoop.util.StringUtils.stringifyException(StringUtils.java:69)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1715)
{noformat}
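
The reproduction steps above can be sketched as a before/after view of the queue hierarchy. This is an illustration under assumptions: the property names are assumed to follow the yarn.scheduler.capacity.<queue-path>.queues convention of capacity-scheduler.xml, a plain map stands in for the real configuration, and the helper class is hypothetical. The sketch only shows which queue gains children across the refresh; it does not reproduce the exception or establish its cause.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; a Map stands in for the scheduler configuration.
final class QueueRefreshSketch {
    // Assumed property prefix and suffix for child-queue lists.
    static final String PREFIX = "yarn.scheduler.capacity.";
    static final String SUFFIX = ".queues";

    private QueueRefreshSketch() {}

    // Step 1: queues a and b configured under root.
    static Map<String, String> before() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(PREFIX + "root" + SUFFIX, "a,b");
        return conf;
    }

    // Step 3: child queues a1 and a2 added under a.
    static Map<String, String> after() {
        Map<String, String> conf = before();
        conf.put(PREFIX + "root.a" + SUFFIX, "a1,a2");
        return conf;
    }

    // A queue path counts as a parent if it has a ".queues" entry.
    static Set<String> parents(Map<String, String> conf) {
        Set<String> result = new TreeSet<>();
        for (String key : conf.keySet()) {
            if (key.startsWith(PREFIX) && key.endsWith(SUFFIX)) {
                result.add(key.substring(PREFIX.length(),
                                         key.length() - SUFFIX.length()));
            }
        }
        return result;
    }

    // Queue paths that have children after the refresh but had none before.
    static Set<String> newParents(Map<String, String> oldConf,
                                  Map<String, String> newConf) {
        Set<String> changed = new TreeSet<>(parents(newConf));
        changed.removeAll(parents(oldConf));
        return changed;
    }
}
```

In this example, newParents(before(), after()) contains only root.a, the queue that was a leaf when the process started and has children at the time of the refresh.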
