Re: Jiras need review
Please add me (jwills) to that query, which should grab MR-2630 and MR-2641-44. On Tue, Jul 12, 2011 at 8:13 AM, Nathan Roberts nrobe...@yahoo-inc.com wrote: I sent this query to Suresh yesterday. He is supposed to be driving this from the Hortonworks side; if this list doesn't look correct, let me know. project in (MAPREDUCE,HDFS,HADOOP) AND assignee in (tgraves,revans2,jeagles,sherri_chen,naisbitt,daryn,davet,sseth,eepayne,johnvijoe,kihwal,nroberts,anupamseth,raviprak) AND status = Patch Available On 7/12/11 10:09 AM, Thomas Graves tgra...@yahoo-inc.com wrote: I just talked to Mahadev on IM and he said we should send the list of Jiras we need reviewed to him and Arun. Let's make a list. Please respond with the jiras you need reviewed to this email and I'll forward them the list. Tom
Re: Jiras need review
Thanks Mahadev! On Tue, Jul 12, 2011 at 10:41 AM, Mahadev Konar maha...@hortonworks.com wrote: Hi Josh, I'll review the JIRAs by EOD today. We should clean up our PAs (currently we have 102 of them) and get them down to a reasonable/maintainable/searchable number. thanks mahadev On Tue, Jul 12, 2011 at 9:13 AM, Josh Wills jwi...@cloudera.com wrote: Please add me (jwills) to that query, which should grab MR-2630 and MR-2641-44. On Tue, Jul 12, 2011 at 8:13 AM, Nathan Roberts nrobe...@yahoo-inc.com wrote: I sent this query to Suresh yesterday. He is supposed to be driving this from the Hortonworks side; if this list doesn't look correct, let me know. project in (MAPREDUCE,HDFS,HADOOP) AND assignee in (tgraves,revans2,jeagles,sherri_chen,naisbitt,daryn,davet,sseth,eepayne,johnvijoe,kihwal,nroberts,anupamseth,raviprak) AND status = Patch Available On 7/12/11 10:09 AM, Thomas Graves tgra...@yahoo-inc.com wrote: I just talked to Mahadev on IM and he said we should send the list of Jiras we need reviewed to him and Arun. Let's make a list. Please respond with the jiras you need reviewed to this email and I'll forward them the list. Tom
Re: Jiras need review
FWIW, I'd still like my JIRAs reviewed. :) On Tue, Jul 12, 2011 at 11:30 AM, Thomas Graves tgra...@yahoo-inc.com wrote: Apologies to all - this was my screw up, this went to the wrong mailing list. Please disregard this request. Tom On 7/12/11 1:22 PM, Allen Wittenauer a...@apache.org wrote: On Jul 12, 2011, at 8:09 AM, Thomas Graves wrote: I just talked to Mahadev on IM and he said we should send the list of Jiras we need reviewed to him and Arun. Let's make a list. Please respond with the jiras you need reviewed to this email and I'll forward them the list. I'm going to query jira and send you guys every single patch that is in patch available then, since this wasn't qualified as what exactly you were looking for...
Re: Problem while running eclipse-files for Next Gen Mapreduce branch
You want to generate them using mvn instead. See the mapreduce/yarn/README file for how to do it. On Fri, Jul 8, 2011 at 7:00 AM, Devaraj K devara...@huawei.com wrote: Hi, I am getting this below errors when I try to generate eclipse files using eclipse-files target. Can anybody help me?
Buildfile: D:\svn\nextgenmapreduce\mapreduce\build.xml
ivy-download:
[get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
[get] To: D:\svn\nextgenmapreduce\mapreduce\ivy\ivy-2.2.0.jar
[get] Not modified - so not downloaded
ivy-init-dirs:
ivy-probe-antlib:
ivy-init-antlib:
ivy-init:
[ivy:configure] :: Ivy non official version - :: http://ant.apache.org/ivy/ ::
[ivy:configure] :: loading settings :: file = D:\svn\nextgenmapreduce\mapreduce\ivy\ivysettings.xml
ivy-resolve-common:
[ivy:resolve]
[ivy:resolve] :: problems summary ::
[ivy:resolve] WARNINGS
[ivy:resolve] module not found: org.apache.hadoop#yarn-server-common;1.0-SNAPSHOT
[ivy:resolve] apache-snapshot: tried
[ivy:resolve] https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/yarn-server-common/1.0-SNAPSHOT/yarn-server-common-1.0-SNAPSHOT.pom
[ivy:resolve] -- artifact org.apache.hadoop#yarn-server-common;1.0-SNAPSHOT!yarn-server-common.jar:
[ivy:resolve] https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/yarn-server-common/1.0-SNAPSHOT/yarn-server-common-1.0-SNAPSHOT.jar
[ivy:resolve] maven2: tried
[ivy:resolve] http://repo1.maven.org/maven2/org/apache/hadoop/yarn-server-common/1.0-SNAPSHOT/yarn-server-common-1.0-SNAPSHOT.pom
[ivy:resolve] -- artifact org.apache.hadoop#yarn-server-common;1.0-SNAPSHOT!yarn-server-common.jar:
[ivy:resolve] http://repo1.maven.org/maven2/org/apache/hadoop/yarn-server-common/1.0-SNAPSHOT/yarn-server-common-1.0-SNAPSHOT.jar
[ivy:resolve] module not found: org.apache.hadoop#hadoop-mapreduce-client-core;1.0-SNAPSHOT
[ivy:resolve] apache-snapshot: tried
[ivy:resolve] https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-mapreduce-client-core/1.0-SNAPSHOT/hadoop-mapreduce-client-core-1.0-SNAPSHOT.pom
[ivy:resolve] -- artifact org.apache.hadoop#hadoop-mapreduce-client-core;1.0-SNAPSHOT!hadoop-mapreduce-client-core.jar:
[ivy:resolve] https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-mapreduce-client-core/1.0-SNAPSHOT/hadoop-mapreduce-client-core-1.0-SNAPSHOT.jar
[ivy:resolve] maven2: tried
[ivy:resolve] http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/1.0-SNAPSHOT/hadoop-mapreduce-client-core-1.0-SNAPSHOT.pom
[ivy:resolve] -- artifact org.apache.hadoop#hadoop-mapreduce-client-core;1.0-SNAPSHOT!hadoop-mapreduce-client-core.jar:
[ivy:resolve] http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/1.0-SNAPSHOT/hadoop-mapreduce-client-core-1.0-SNAPSHOT.jar
[ivy:resolve] module not found: org.apache.hadoop#yarn-common;1.0-SNAPSHOT
[ivy:resolve] apache-snapshot: tried
[ivy:resolve] https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/yarn-common/1.0-SNAPSHOT/yarn-common-1.0-SNAPSHOT.pom
[ivy:resolve] -- artifact org.apache.hadoop#yarn-common;1.0-SNAPSHOT!yarn-common.jar:
[ivy:resolve]
Re: Problem while running eclipse-files for Next Gen Mapreduce branch
Hey Bobby- Vinod and I did a cleanup pass over INSTALL and yarn/README in https://issues.apache.org/jira/browse/MAPREDUCE-2645. Please take a look and check if there's anything else we need to add/update. Josh On Fri, Jul 8, 2011 at 7:25 AM, Robert Evans ev...@yahoo-inc.com wrote: I mapreduce/INSTALL also has some important information in it, and be aware that you do not have to install the avro plugin any more. Maven can download it and install it automatically now, but the README was never updated. Also be sure to install protocol buffers. The build will fail without it. --Bobby On 7/8/11 9:04 AM, Josh Wills jwi...@cloudera.com wrote: You want to generate them using mvn instead. See the mapreduce/yarn/README file for how to do it. On Fri, Jul 8, 2011 at 7:00 AM, Devaraj K devara...@huawei.com wrote: Hi, I am getting this below errors when I try to generate eclipse files using eclipse-files target. Can anybody help me?
Re: Problem while running eclipse-files for Next Gen Mapreduce branch
Haoyuan, Did you specify HADOOP_CONF_DIR, YARN_CONF_DIR, etc. as per Step 8 of the INSTALL doc? I've been running my mrv2 cluster in pseudo-distributed mode on my linux box- happy to send you my configs if you'd like to play. There are also a couple of changes you need to make to the yarn config files to get real MR running, as Bobby alluded to in his email. To run the example in the INSTALL file, you'll need to patch in this change: https://issues.apache.org/jira/browse/MAPREDUCE-2644 Josh On Fri, Jul 8, 2011 at 8:07 AM, Haoyuan Li haoyuan...@gmail.com wrote: Hi Josh, Which configuration file could tell Yarn the location of HDFS? This could be found on http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL Thank you. Best, Haoyuan On Fri, Jul 8, 2011 at 10:58 PM, Josh Wills jwi...@cloudera.com wrote: Hey Bobby- Vinod and I did a cleanup pass over INSTALL and yarn/README in https://issues.apache.org/jira/browse/MAPREDUCE-2645. Please take a look and check if there's anything else we need to add/update. Josh On Fri, Jul 8, 2011 at 7:25 AM, Robert Evans ev...@yahoo-inc.com wrote: I mapreduce/INSTALL also has some important information in it, and be aware that you do not have to install the avro plugin any more. Maven can download it and install it automatically now, but the README was never updated. Also be sure to install protocol buffers. The build will fail without it. --Bobby On 7/8/11 9:04 AM, Josh Wills jwi...@cloudera.com wrote: You want to generate them using mvn instead. See the mapreduce/yarn/README file for how to do it. On Fri, Jul 8, 2011 at 7:00 AM, Devaraj K devara...@huawei.com wrote: Hi, I am getting this below errors when I try to generate eclipse files using eclipse-files target. Can anybody help me? 
[jira] [Created] (MAPREDUCE-2641) Fix the ExponentiallySmoothedTaskRuntimeEstimator and its unit test
Fix the ExponentiallySmoothedTaskRuntimeEstimator and its unit test --- Key: MAPREDUCE-2641 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2641 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Reporter: Josh Wills Assignee: Josh Wills Priority: Minor Attachments: MAPREDUCE-2641.patch Fixed the ExponentiallySmoothedTaskRuntimeEstimator so that it can run and pass the test defined for it in TestRuntimeEstimators. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-2642) Fix two bugs in v2.app.speculate.DataStatistics
Fix two bugs in v2.app.speculate.DataStatistics --- Key: MAPREDUCE-2642 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2642 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Reporter: Josh Wills Assignee: Josh Wills Priority: Minor Fixes two bugs in DataStatistics: a divide by zero in the variance calculation when count == 0, and a synchronization issue in how the updateStatistics method was implemented. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
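The two fixes described in this issue (guarding the variance calculation when count == 0, and making updates thread-safe) can be illustrated with a minimal sketch. This is not the actual DataStatistics code or the attached patch: the class name RunningStats and all of its method names are hypothetical, and the choice of returning 0.0 for an empty data set is an assumption.

```java
/** Hypothetical stand-in for a running-statistics holder; not the real DataStatistics class. */
final class RunningStats {
    private int count = 0;
    private double sum = 0.0;
    private double sumSquares = 0.0;

    /** Records a new sample. */
    public synchronized void add(double value) {
        count++;
        sum += value;
        sumSquares += value * value;
    }

    /**
     * Replaces a previously recorded sample with a new value.
     * Both fields change under one lock, so readers never see a half-applied update.
     */
    public synchronized void update(double oldValue, double newValue) {
        sum += newValue - oldValue;
        sumSquares += newValue * newValue - oldValue * oldValue;
    }

    public synchronized int count() {
        return count;
    }

    /** Mean of the samples; 0.0 (an assumed convention) when nothing has been recorded. */
    public synchronized double mean() {
        return count == 0 ? 0.0 : sum / count;
    }

    /** Population variance; returns 0.0 for an empty data set instead of dividing by zero. */
    public synchronized double variance() {
        if (count == 0) {
            return 0.0;
        }
        double mean = sum / count;
        // Clamp at zero to absorb small negative values caused by floating-point rounding.
        return Math.max(0.0, sumSquares / count - mean * mean);
    }
}
```

Synchronizing every accessor is the simplest way to keep count, sum, and sumSquares consistent with each other; a real implementation might prefer finer-grained locking.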
[jira] [Created] (MAPREDUCE-2643) Fixing typos/formatting/null checking in v2.app.speculate package
Fixing typos/formatting/null checking in v2.app.speculate package - Key: MAPREDUCE-2643 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2643 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Reporter: Josh Wills Assignee: Josh Wills Priority: Minor No functional changes in this patch: just fixing some typos in the comments, fixing some formatting issues, renaming classes for consistency, and making the null checking around Jobs/Tasks more consistent. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Introduction
Hello MR Developers, My name is Josh, and I started working at Cloudera last week as a data scientist. I spent my last three years at Google, where I worked on the ads system for a while and then built a lot of the data analysis and experiment infrastructure behind Google+. I am really excited about the NextGen MR project, especially because of the potential to create new classes of data-intensive applications to run on Hadoop. I'm looking forward to helping out wherever I can, and while I apologize for all of the JIRA spam you've been receiving from me lately, I have no intentions of letting up. :) Best, Josh
Re: Introduction
Thanks Arun! On Tue, Jul 5, 2011 at 9:51 AM, Arun C Murthy a...@hortonworks.com wrote: Josh, Welcome on board. Apache Hadoop is a volunteer driven community project and we are glad to have you help us. I, personally, look forward to working with you on NG MR and am excited to see you report issues and fix them! Arun On Jul 5, 2011, at 8:34 AM, Josh Wills wrote: Hello MR Developers, My name is Josh, and I started working at Cloudera last week as a data scientist. I spent my last three years at Google, where I worked on the ads system for a while and then built a lot of the data analysis and experiment infrastructure behind Google+. I am really excited about the NextGen MR project, especially because of the potential to create new classes of data-intensive applications to run on Hadoop. I'm looking forward to helping out wherever I can, and while I apologize for all of the JIRA spam you've been receiving from me lately, I have no intentions of letting up. :) Best, Josh
[jira] [Created] (MAPREDUCE-2639) MR-279: Fixup the exponentially smoothed runtime estimator, fix a couple of bugs in DataStatistics, and do a little bit of cleanup.
MR-279: Fixup the exponentially smoothed runtime estimator, fix a couple of bugs in DataStatistics, and do a little bit of cleanup. --- Key: MAPREDUCE-2639 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2639 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Environment: All Reporter: Josh Wills Assignee: Josh Wills Priority: Minor Attachments: MAPREDUCE-2639.patch A catch-all JIRA for a pass I took through the v2.app.speculate package. 1) Fixed the ExponentiallySmoothedTaskRuntimeEstimator so that it can run and pass the test defined in TestRuntimeEstimators. 2) Fixed two bugs in DataStatistics: (a) a divide by zero in the variance calculation in the case that count == 0 and (b) a synchronization issue in how the updateStatistics method was implemented. 3) A bunch of typo corrections, formatting fixes, and some added consistency around the null value checking. I probably need to do a couple more passes through this code to get it into better shape, but this seemed like a good start. Will attach my patch momentarily. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-2633) MR-279: Add a getCounter(Enum) method to the Counters interface
MR-279: Add a getCounter(Enum) method to the Counters interface --- Key: MAPREDUCE-2633 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2633 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Environment: All Reporter: Josh Wills Priority: Minor I'm fixing a few TODOs I came across in TaskAttemptImpl.java related to the fact that the MRv2 Counters interface doesn't expose a getCounter(Enum) method for accessing a Counter using the enum's class as the group name and the enum's value as the name of the counter. Will add the patch momentarily. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
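The lookup rule described in this issue (enum class as the group name, enum constant as the counter name) could look roughly like the sketch below. It is only an illustration: SimpleCounters and DemoTaskCounter are made-up names, not the MRv2 Counters interface or any real Hadoop enum, and using the fully qualified class name for the group is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical enum standing in for a framework counter enum. */
enum DemoTaskCounter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }

/** Hypothetical counter store; not the MRv2 Counters interface. */
final class SimpleCounters {
    // group name -> (counter name -> value)
    private final Map<String, Map<String, Long>> groups = new HashMap<>();

    /** String-keyed access: the form assumed to exist already. */
    public synchronized long getCounter(String group, String name) {
        Map<String, Long> counters = groups.get(group);
        if (counters == null) {
            return 0L;
        }
        return counters.getOrDefault(name, 0L);
    }

    public synchronized void increment(String group, String name, long delta) {
        groups.computeIfAbsent(group, g -> new HashMap<>()).merge(name, delta, Long::sum);
    }

    /** Enum-keyed access: group = the enum's class name, counter = the constant's name. */
    public long getCounter(Enum<?> key) {
        return getCounter(key.getDeclaringClass().getName(), key.name());
    }

    public void increment(Enum<?> key, long delta) {
        increment(key.getDeclaringClass().getName(), key.name(), delta);
    }
}
```

getDeclaringClass() is used rather than getClass() so that enum constants with their own class bodies still map to the same group.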
[jira] [Created] (MAPREDUCE-2630) MR-279: refreshQueues leads to NPEs when used w/FifoScheduler
MR-279: refreshQueues leads to NPEs when used w/FifoScheduler - Key: MAPREDUCE-2630 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2630 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Environment: All Reporter: Josh Wills Priority: Minor The RM's admin service exposes a method refreshQueues that is used to update the queue configuration when used with the CapacityScheduler, but if it is used with the FifoScheduler, it will set the containerTokenSecretManager/clusterTracker fields on the FifoScheduler to null, which eventually leads to NPE. Since the FifoScheduler only has one queue that cannot be refreshed, the correct behavior is for the refreshQueues call to be a no-op. I will attach a patch that fixes this by splitting the ResourceScheduler's reinitialize method into separate initialize/updateQueues methods. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
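The proposed split can be sketched as follows. This is a simplified illustration under stated assumptions, not the attached patch: DemoScheduler, DemoFifoScheduler, and DemoCapacityScheduler are hypothetical types, the collaborators are modeled as plain Objects, and the real ResourceScheduler signatures are not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical scheduler contract with one-time setup separated from queue refresh. */
interface DemoScheduler {
    /** Called once at startup with the collaborators the scheduler needs. */
    void initialize(Object containerTokenSecretManager, Object clusterTracker);

    /** Called on every refreshQueues request; must not touch the collaborators. */
    void updateQueues(Map<String, Integer> queueCapacities);
}

/** Single fixed queue, so a refresh has nothing to do. */
final class DemoFifoScheduler implements DemoScheduler {
    private Object containerTokenSecretManager;
    private Object clusterTracker;

    @Override
    public void initialize(Object containerTokenSecretManager, Object clusterTracker) {
        this.containerTokenSecretManager = containerTokenSecretManager;
        this.clusterTracker = clusterTracker;
    }

    @Override
    public void updateQueues(Map<String, Integer> queueCapacities) {
        // Intentionally a no-op: the fields set in initialize() stay intact.
    }

    boolean isInitialized() {
        return containerTokenSecretManager != null && clusterTracker != null;
    }
}

/** Multi-queue scheduler that actually applies the refreshed configuration. */
final class DemoCapacityScheduler implements DemoScheduler {
    private Object containerTokenSecretManager;
    private Object clusterTracker;
    private Map<String, Integer> queueCapacities = new HashMap<>();

    @Override
    public void initialize(Object containerTokenSecretManager, Object clusterTracker) {
        this.containerTokenSecretManager = containerTokenSecretManager;
        this.clusterTracker = clusterTracker;
    }

    @Override
    public synchronized void updateQueues(Map<String, Integer> newCapacities) {
        // Copy the map so later changes by the caller do not leak into the scheduler.
        this.queueCapacities = new HashMap<>(newCapacities);
    }

    synchronized int capacityOf(String queue) {
        return queueCapacities.getOrDefault(queue, 0);
    }
}
```

With the two responsibilities separated, an admin refresh only ever reaches updateQueues, so it cannot null out state that was set during startup.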
Re: mappers
(redirected to mapreduce-user@, mapreduce-dev bcc'd) The param you're referring to controls the maximum number of simultaneously active mappers on a given task tracker, i.e., how many map slots are available on that node. But a single task tracker can be used for multiple MR jobs, so you can't look at the metrics for the task tracker to see how many mappers ran on a job. For a single job, the total number of mappers that are run == the number of input splits. Hoping that anyone who knows this stuff better than I do will reply to correct any mistakes in my answer, Josh On Sun, Jun 26, 2011 at 5:16 AM, Keren Ouaknine ker...@gmail.com wrote: Hello, I am looking for the actual number of mappers on each machine for the job. I know how to configure the max number (mapred.tasktracker.map.tasks.maximum in the mapred-site.xml file), but not the actual number of mappers that were running for a completed job. Any idea where I can find this data? Thanks, Keren -- Keren Ouaknine Cell: +972 54 2565404 Web: www.kereno.com
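The distinction in the reply above (slots cap concurrency, while splits determine the total) can be written out as simple arithmetic. The helper below is purely illustrative: MapTaskMath and its method names are hypothetical, and the "waves" figure assumes the job has every map slot in the cluster to itself, which the reply notes is not guaranteed.

```java
/** Hypothetical helper illustrating map-task arithmetic; not part of Hadoop. */
final class MapTaskMath {
    private MapTaskMath() {}

    /** Per the reply above: a job runs one map task per input split. */
    static int totalMapTasks(int inputSplits) {
        return inputSplits;
    }

    /** Upper bound on maps running at once: trackers times map slots per tracker. */
    static int maxConcurrentMaps(int taskTrackers, int mapSlotsPerTracker) {
        return taskTrackers * mapSlotsPerTracker;
    }

    /** Minimum scheduling rounds, assuming no other job competes for the slots. */
    static int minimumWaves(int inputSplits, int taskTrackers, int mapSlotsPerTracker) {
        int slots = maxConcurrentMaps(taskTrackers, mapSlotsPerTracker);
        if (slots <= 0) {
            throw new IllegalArgumentException("need at least one map slot");
        }
        // Integer ceiling division.
        return (inputSplits + slots - 1) / slots;
    }
}
```

For instance, under these assumptions 100 splits on 10 task trackers with 4 map slots each means 100 map tasks in total, at most 40 running at a time.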
Re: Queries on MRv2
Hey Praveen, I'm in the same boat as you re: getting started with the MR2 code. I have a couple of answers and a couple of followup questions for Arun et al. to keep in mind as they're writing a design doc. On Wed, Jun 15, 2011 at 5:27 AM, Praveen Sripati praveensrip...@gmail.com wrote: Hi, - How to specify that an ApplicationMaster use a particular version of the MapReduce library dynamically? I don't totally grok the question-- doesn't the client-side code that configures the ApplicationMaster decide this? - How does the ApplicationManager pick a node to run the ApplicationMaster? What resource considerations are taken if any while picking a particular node to run the ApplicationMaster? There is an ApplicationsManager (note the extra 's') that is part of the functionality of the RM. See reference: http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/ - Who observes the ResourceManager/ApplicationMaster/NodeManager for failures to be restarted later? From the blog entry it seems that the state of the ResourceManager is stored in the ZooKeeper and the state of the ApplicationManager is stored in the HDFS. So this is the classic problem of any such system-- who watches the watchmen? It seems like the client would be notified when an ApplicationManager failed by the ApplicationSManager (see above blog post again, it's actually a good blog post, it would be great to have a few more of them), the ResourceManager would know when a NodeManager failed, and it falls to an admin and/or an external monitoring system to detect ResourceManager failure and handle the restart. - Looks like the containers are based on Linux cgroups. So, is the MRv2 limited only to the Linux boxes? Yeah, I bumped into this when I was doing a naive build + install on my Mac. 
Not that I see folks running a lot of hadoop clusters on Macs, but it would be cool if the basic build/install just worked on every platform, even if it's just as simple as detecting the platform and skipping the build of the native container-executor stuff. (Note: I actually got the container-executor stuff to build by using the standard Mac tricks, but I'm not sure if it's worth checking in.) Hope the design document from Arun will make me ask fewer queries in this forum :) Thanks, Praveen On Wed, Jun 15, 2011 at 9:17 AM, Mahadev Konar maha...@apache.org wrote: Praveen, In that case, if a just launched container is released, the NM will be notified via the RM that the container is no longer valid and the NM will go ahead and kill the container. On Tue, Jun 14, 2011 at 8:38 PM, Praveen Sripati praveensrip...@gmail.com wrote: Mahadev, MapReduce ApplicationMaster might behave well, but what about custom ApplicationMasters for other models. Q) What happens if an ApplicationMaster asks a NM to launch a container and then releases the container in the allocate call later? A) The Application Master only releases the container once the container is done. Thanks, Praveen On Wed, Jun 15, 2011 at 8:59 AM, Mahadev Konar maha...@apache.org wrote: Praveen, Answers in line: Q) What happens if an ApplicationMaster asks a NM to launch a container and then releases the container in the allocate call later? The Application Master only releases the container once the container is done. Q) So, the NM watches the UNIX Process/Containers and sends the status to the ApplicationManager. Later the ApplicationManager sends the status of the containers in response to the allocate call to the ApplicationMaster. Why should the ApplicationMaster be aware of the container status, since it's already tracking the map/reduce tasks in the containers? It's just a way to notify the application master as soon as possible when the containers fail.
This helps in speeding up the notification of failed containers; otherwise the AM has to wait to discover failures via timeouts. Q) Does the ApplicationMaster notify the NodeManager to exit the UNIX Process when the map/reduce tasks in that particular container are completed? Are the containers re-used? Yes, it notifies the NM. Containers are not reused as of now. In future we do see the containers being reused, but we'll need leases to do that. Q) The ApplicationManager asks the NodeManager to create a container and also launch the map/reduce task in it. From then on the ApplicationManager and Map/Reduce tasks interact directly without the NodeManager. Am I correct? I think you mean ApplicationMaster. Yes, the ApplicationMaster and map/reduce tasks talk directly without the NM being involved. Praveen On Wed, Jun 15, 2011 at 12:59 AM, Arun C Murthy a...@yahoo-inc.com wrote: On Jun 14, 2011, at 6:31 PM, Praveen Sripati wrote: Hi, I have gone through MapReduce NextGen Blog entries and JIRA and have the following