[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-09 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16431378#comment-16431378
 ] 

Jason Lowe commented on MAPREDUCE-7069:
---

Thanks for updating the patch!  Looks like TestPipeApplication and 
TestMRIntermediateDataEncryption both exited abnormally.  I was not able to 
reproduce either failure locally with the patch applied.

In mapred-default.xml, "UEnvironment" should be "Environment".

The example added to the tutorial is missing the "-D" flag required in front 
of each of the example options.  These properties are not valid command-line 
options on their own.

Nit: "alternateForm" confused me as a parameter name to testAMStandardEnv at 
first, and the comment in the method body explaining what it did was key to 
understanding it.  Maybe "useSeparateEnvProps" or something similar would be a 
better parameter name?

Nit: It would be nice to have some whitespace between the unit test methods for 
readability.

I'm not sure the DockerContainers.md change is necessary.  We still support the 
old, single-property way to set a list of environment variables whose values 
don't contain commas.  I'm not sure we need to make the example more 
complicated, given the variable settings don't have commas and therefore 
wouldn't require the separate property form.

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> {{mapreduce.reduce.env}}, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g., mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}
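The per-variable scheme quoted above can be sketched in miniature. This is a 
hypothetical, self-contained illustration of the resolution order under 
discussion (per-variable properties such as mapreduce.map.env.FOO overriding 
pairs parsed from the legacy comma-separated value); the helper names are 
invented and this is not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: resolve the environment for a task, letting
// per-variable properties ("mapreduce.map.env.FOO=bar") override entries
// parsed from the legacy comma-separated "mapreduce.map.env" value.
public class EnvPropertySketch {
  static Map<String, String> resolveEnv(Map<String, String> conf, String base) {
    Map<String, String> env = new HashMap<>();
    // Legacy form: one comma-separated list of NAME=VALUE pairs.
    String legacy = conf.get(base);
    if (legacy != null && !legacy.isEmpty()) {
      for (String pair : legacy.split(",")) {
        int eq = pair.indexOf('=');
        if (eq > 0) {
          env.put(pair.substring(0, eq).trim(), pair.substring(eq + 1));
        }
      }
    }
    // New form: one property per variable, e.g. base + ".FOO" -> "bar".
    // Values may contain commas because no list splitting happens here.
    String prefix = base + ".";
    for (Map.Entry<String, String> e : conf.entrySet()) {
      if (e.getKey().startsWith(prefix)) {
        env.put(e.getKey().substring(prefix.length()), e.getValue());
      }
    }
    return env;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("mapreduce.map.env", "A=1,B=2");
    conf.put("mapreduce.map.env.LIST", "x,y,z"); // comma-containing value
    Map<String, String> env = resolveEnv(conf, "mapreduce.map.env");
    System.out.println(env.get("A") + " " + env.get("LIST")); // 1 x,y,z
  }
}
```

Note that the per-variable form is what makes comma-containing values 
expressible at all, since only the legacy value is ever split on commas.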



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430945#comment-16430945
 ] 

Hadoop QA commented on MAPREDUCE-7073:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 48s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
38s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | MAPREDUCE-7073 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918193/MAPREDUCE-7073.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 497c96b6d9d5 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ac32b35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7383/testReport/ |
| Max. process+thread count | 408 (vs. ulimit of 1) |
| modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
U: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7383/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Optimize TokenCache#obtai

[jira] [Updated] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated MAPREDUCE-7073:

Attachment: MAPREDUCE-7073.002.patch

> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: MAPREDUCE-7073.001.patch, MAPREDUCE-7073.002.patch
>
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is 
> disabled. In {{TokenCache#obtainTokensForNamenodesInternal}}, 
> {{Master.getMasterPrincipal(conf)}} is called for every filesystem instance, 
> which reloads YarnConfiguration.
> For file input with 1k files, YarnConfiguration will be reloaded 1k times.
> The result of {{Master.getMasterPrincipal(conf)}} can instead be passed to 
> {{obtainTokensForNamenodesInternal}} for each filesystem call.
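The optimization described above can be sketched as follows. This is a 
hypothetical stand-in, not the actual patch: the counter stands in for the 
cost of reloading YarnConfiguration, and the method names merely echo 
TokenCache and Master.

```java
import java.util.Arrays;
import java.util.List;

public class TokenCacheSketch {
  static int principalLoads = 0; // stands in for YarnConfiguration reloads

  // Stand-in for Master.getMasterPrincipal(conf): expensive on every call.
  static String getMasterPrincipal() {
    principalLoads++;
    return "rm/_HOST@EXAMPLE.COM";
  }

  // Before: the principal is resolved inside the per-filesystem loop.
  static void obtainTokensPerFs(List<String> fileSystems) {
    for (String fs : fileSystems) {
      String principal = getMasterPrincipal(); // reloaded for every fs
      // ... obtain a delegation token for fs using principal ...
    }
  }

  // After: the principal is resolved once and passed down.
  static void obtainTokensOnce(List<String> fileSystems) {
    String principal = getMasterPrincipal(); // loaded a single time
    for (String fs : fileSystems) {
      // ... obtain a delegation token for fs using principal ...
    }
  }

  public static void main(String[] args) {
    List<String> thousandFs = Arrays.asList(new String[1000]);
    obtainTokensPerFs(thousandFs);
    System.out.println("per-fs loads: " + principalLoads);  // 1000
    principalLoads = 0;
    obtainTokensOnce(thousandFs);
    System.out.println("single load: " + principalLoads);   // 1
  }
}
```

With 1k input files the hoisted call turns 1k configuration reloads into one, 
which is the whole point of the patch.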






[jira] [Updated] (MAPREDUCE-7074) Shuffle get stuck in fetch failures loop, when a few mapoutput were lost or corrupted and task timeout was set to 0

2018-04-09 Thread smarthan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

smarthan updated MAPREDUCE-7074:

Attachment: MAPREDUCE-7074.patch
ExceptionMsg.txt

> Shuffle  get stuck in fetch failures loop, when a few mapoutput were lost or 
> corrupted and task timeout was set to 0
> 
>
> Key: MAPREDUCE-7074
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7074
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, task
>Affects Versions: 2.8.0, 3.0.0
> Environment: cdh 5.10.0 ,  apache hadoop 2.8.0
>Reporter: smarthan
>Priority: Major
> Fix For: 2.8.0
>
> Attachments: ExceptionMsg.txt, MAPREDUCE-7074.patch
>
>
> For an MR job like this:
>  - an MR job with many map tasks, such as 1 or more
>  - a few map outputs were lost or corrupted after the map tasks completed 
> successfully and before shuffle started
>  - mapreduce.task.timeout was set to 0 and 
> mapreduce.task.progress-report.interval was not set
> the shuffle of the reduce task will get stuck in a fetch-failures loop for a 
> long time, several or even dozens of hours.
> This was caused by MAPREDUCE-6740, which ties mapreduce.task.timeout to 
> mapreduce.task.progress-report.interval via 
> MRJobConfUtil.getTaskProgressReportInterval():
> {code:java}
>   public static long getTaskProgressReportInterval(final Configuration conf) {
> long taskHeartbeatTimeOut = conf.getLong(
> MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
> return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
> (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * 
> taskHeartbeatTimeOut));
>   }
> {code}
> When mapreduce.task.timeout was set to 0 and 
> mapreduce.task.progress-report.interval was not set, 
> getTaskProgressReportInterval will return 0L.
> The TaskReporter class, which reports task progress and status to the AM, 
> sets taskProgressInterval = MRJobConfUtil.getTaskProgressReportInterval() and 
> calls lock.wait(taskProgressInterval) before every progress report.
> {code:java}
>  public void run() {
>   ...skip...
>   long taskProgressInterval = MRJobConfUtil.
>   getTaskProgressReportInterval(conf);
>   while (!taskDone.get()) {
> ...skip...
> try {
>   // sleep for a bit
>   synchronized(lock) {
> if (taskDone.get()) {
>   break;
> }
> lock.wait(taskProgressInterval);
>   }
>   if (taskDone.get()) {
> break;
>   }
>   if (sendProgress) {
> // we need to send progress update
> updateCounters();
> taskStatus.statusUpdate(taskProgress.get(),
> taskProgress.toString(), 
> counters);
> taskFound = umbilical.statusUpdate(taskId, taskStatus);
> taskStatus.clearStatus();
>   }
>   ...skip...
> } 
> ...skip...
>   }
>}
> {code}
> When mapreduce.task.timeout was set to 0, lock.wait(taskProgressInterval) 
> becomes lock.wait(0), and because nothing ever notifies the lock, the 
> reporter waits indefinitely and reports nothing to the AM.
> So when fetch failures happen during shuffle, TaskReporter will not report 
> fetch failures to the AM, even though the reducer log shows "Reporting fetch 
> failure...", and the fetch-failures loop will not stop until the reduce task 
> fails for exceeding MAX_FAILED_UNIQUE_FETCHES.
> So it's necessary to set a TASK_PROGRESS_REPORT_INTERVAL_MAX value (such as 
> 30s): when the taskProgressInterval returned by 
> MRJobConfUtil.getTaskProgressReportInterval() equals 0 or exceeds the max 
> value, set taskProgressInterval = TASK_PROGRESS_REPORT_INTERVAL_MAX.
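The proposed clamp can be sketched as follows. This is a hypothetical 
illustration: the ratio and cap constants are assumed stand-ins (the 30s cap 
is the value proposed in this issue, not an existing Hadoop constant), and 
the method merely mirrors the shape of getTaskProgressReportInterval.

```java
public class ProgressIntervalSketch {
  // Assumed stand-ins for MRJobConfUtil's values; not the real constants.
  static final double REPORT_RATIO = 0.01;      // interval-to-timeout ratio
  static final long INTERVAL_MAX_MS = 30_000L;  // proposed 30s cap

  // Mirrors getTaskProgressReportInterval, then applies the proposed clamp
  // so a task timeout of 0 can never produce lock.wait(0), i.e. wait forever.
  static long progressIntervalMs(long taskTimeoutMs, Long explicitIntervalMs) {
    long interval = (explicitIntervalMs != null)
        ? explicitIntervalMs
        : (long) (REPORT_RATIO * taskTimeoutMs);
    if (interval <= 0 || interval > INTERVAL_MAX_MS) {
      interval = INTERVAL_MAX_MS;  // the fix: fall back to a bounded wait
    }
    return interval;
  }

  public static void main(String[] args) {
    // Timeout disabled (0) and no explicit interval: the clamp kicks in.
    System.out.println(progressIntervalMs(0L, null));        // 30000
    // A 10-minute timeout: the ratio gives a 6-second interval.
    System.out.println(progressIntervalMs(600_000L, null));  // 6000
  }
}
```

The key property is that the returned value is always strictly positive, so 
Object.wait (where an argument of 0 means "wait until notified") never blocks 
the reporter thread forever.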






[jira] [Updated] (MAPREDUCE-7074) Shuffle get stuck in fetch failures loop, when a few mapoutput were lost or corrupted and task timeout was set to 0

2018-04-09 Thread smarthan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

smarthan updated MAPREDUCE-7074:

Attachment: (was: fetch_failures_report.patch)

> Shuffle  get stuck in fetch failures loop, when a few mapoutput were lost or 
> corrupted and task timeout was set to 0
> 
>
> Key: MAPREDUCE-7074
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7074
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, task
>Affects Versions: 2.8.0, 3.0.0
> Environment: cdh 5.10.0 ,  apache hadoop 2.8.0
>Reporter: smarthan
>Priority: Major
> Fix For: 2.8.0
>
>
> For an MR job like this:
>  - an MR job with many map tasks, such as 1 or more
>  - a few map outputs were lost or corrupted after the map tasks completed 
> successfully and before shuffle started
>  - mapreduce.task.timeout was set to 0 and 
> mapreduce.task.progress-report.interval was not set
> the shuffle of the reduce task will get stuck in a fetch-failures loop for a 
> long time, several or even dozens of hours.
> This was caused by MAPREDUCE-6740, which ties mapreduce.task.timeout to 
> mapreduce.task.progress-report.interval via 
> MRJobConfUtil.getTaskProgressReportInterval():
> {code:java}
>   public static long getTaskProgressReportInterval(final Configuration conf) {
> long taskHeartbeatTimeOut = conf.getLong(
> MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
> return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
> (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * 
> taskHeartbeatTimeOut));
>   }
> {code}
> When mapreduce.task.timeout was set to 0 and 
> mapreduce.task.progress-report.interval was not set, 
> getTaskProgressReportInterval will return 0L.
> The TaskReporter class, which reports task progress and status to the AM, 
> sets taskProgressInterval = MRJobConfUtil.getTaskProgressReportInterval() and 
> calls lock.wait(taskProgressInterval) before every progress report.
> {code:java}
>  public void run() {
>   ...skip...
>   long taskProgressInterval = MRJobConfUtil.
>   getTaskProgressReportInterval(conf);
>   while (!taskDone.get()) {
> ...skip...
> try {
>   // sleep for a bit
>   synchronized(lock) {
> if (taskDone.get()) {
>   break;
> }
> lock.wait(taskProgressInterval);
>   }
>   if (taskDone.get()) {
> break;
>   }
>   if (sendProgress) {
> // we need to send progress update
> updateCounters();
> taskStatus.statusUpdate(taskProgress.get(),
> taskProgress.toString(), 
> counters);
> taskFound = umbilical.statusUpdate(taskId, taskStatus);
> taskStatus.clearStatus();
>   }
>   ...skip...
> } 
> ...skip...
>   }
>}
> {code}
> When mapreduce.task.timeout was set to 0, lock.wait(taskProgressInterval) 
> becomes lock.wait(0), and because nothing ever notifies the lock, the 
> reporter waits indefinitely and reports nothing to the AM.
> So when fetch failures happen during shuffle, TaskReporter will not report 
> fetch failures to the AM, even though the reducer log shows "Reporting fetch 
> failure...", and the fetch-failures loop will not stop until the reduce task 
> fails for exceeding MAX_FAILED_UNIQUE_FETCHES.
> So it's necessary to set a TASK_PROGRESS_REPORT_INTERVAL_MAX value (such as 
> 30s): when the taskProgressInterval returned by 
> MRJobConfUtil.getTaskProgressReportInterval() equals 0 or exceeds the max 
> value, set taskProgressInterval = TASK_PROGRESS_REPORT_INTERVAL_MAX.






[jira] [Updated] (MAPREDUCE-7074) Shuffle get stuck in fetch failures loop, when a few mapoutput were lost or corrupted and task timeout was set to 0

2018-04-09 Thread smarthan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

smarthan updated MAPREDUCE-7074:

Description: 
For an MR job like this:
 - an MR job with many map tasks, such as 1 or more
 - a few map outputs were lost or corrupted after the map tasks completed 
successfully and before shuffle started
 - mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set

the shuffle of the reduce task will get stuck in a fetch-failures loop for a 
long time, several or even dozens of hours.

This was caused by MAPREDUCE-6740, which ties mapreduce.task.timeout to 
mapreduce.task.progress-report.interval via 
MRJobConfUtil.getTaskProgressReportInterval():
{code:java}
  public static long getTaskProgressReportInterval(final Configuration conf) {
long taskHeartbeatTimeOut = conf.getLong(
MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
(long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
  }
{code}
When mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set, 
getTaskProgressReportInterval will return 0L.
The TaskReporter class, which reports task progress and status to the AM, 
sets taskProgressInterval = MRJobConfUtil.getTaskProgressReportInterval() and 
calls lock.wait(taskProgressInterval) before every progress report.
{code:java}
 public void run() {
  ...skip...
  long taskProgressInterval = MRJobConfUtil.
  getTaskProgressReportInterval(conf);
  while (!taskDone.get()) {
...skip...
try {
  // sleep for a bit
  synchronized(lock) {
if (taskDone.get()) {
  break;
}
lock.wait(taskProgressInterval);
  }
  if (taskDone.get()) {
break;
  }
  if (sendProgress) {
// we need to send progress update
updateCounters();
taskStatus.statusUpdate(taskProgress.get(),
taskProgress.toString(), 
counters);
taskFound = umbilical.statusUpdate(taskId, taskStatus);
taskStatus.clearStatus();
  }
  ...skip...
} 
...skip...
  }
   }
{code}
When mapreduce.task.timeout was set to 0, lock.wait(taskProgressInterval) 
becomes lock.wait(0), and because nothing ever notifies the lock, the reporter 
waits indefinitely and reports nothing to the AM.
So when fetch failures happen during shuffle, TaskReporter will not report 
fetch failures to the AM, even though the reducer log shows "Reporting fetch 
failure...", and the fetch-failures loop will not stop until the reduce task 
fails for exceeding MAX_FAILED_UNIQUE_FETCHES.

So it's necessary to set a TASK_PROGRESS_REPORT_INTERVAL_MAX value (such as 
30s): when the taskProgressInterval returned by 
MRJobConfUtil.getTaskProgressReportInterval() equals 0 or exceeds the max 
value, set taskProgressInterval = TASK_PROGRESS_REPORT_INTERVAL_MAX.

  was:
For an MR job like this:
 - an MR job with many map tasks, such as 1 or more
 - a few map outputs were lost or corrupted after the map tasks completed 
successfully and before shuffle started
 - mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set

the shuffle of the reduce task will get stuck in a fetch-failures loop for a 
long time, several or even dozens of hours.

This was caused by MAPREDUCE-6740, which ties mapreduce.task.timeout to 
mapreduce.task.progress-report.interval via 
MRJobConfUtil.getTaskProgressReportInterval():
{code:java}
  public static long getTaskProgressReportInterval(final Configuration conf) {
long taskHeartbeatTimeOut = conf.getLong(
MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
(long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
  }
{code}
When mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set, 
getTaskProgressReportInterval will return 0L.
The TaskReporter class, which reports task progress and status to the AM, 
sets taskProgressInterval = MRJobConfUtil.getTaskProgressReportInterval() and 
calls lock.wait(taskProgressInterval) before every progress report.
{code:java}
 public void run() {
  ...skip...
  long taskProgressInterval = MRJobConfUtil.
  getTaskProgressReportInterval(conf);
  while (!taskDone.get()) {
...skip...
try {
  // sleep for a bit
  synchronized(lock) {
if (taskDone.get()) {
  break;
}
lock.wait(taskProgressInterval);
  }
  if (taskDone.get()) {
break;
  }
  if (sendP

[jira] [Updated] (MAPREDUCE-7074) Shuffle get stuck in fetch failures loop, when a few mapoutput were lost or corrupted and task timeout was set to 0

2018-04-09 Thread smarthan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

smarthan updated MAPREDUCE-7074:

Description: 
For an MR job like this:
 - an MR job with many map tasks, such as 1 or more
 - a few map outputs were lost or corrupted after the map tasks completed 
successfully and before shuffle started
 - mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set

the shuffle of the reduce task will get stuck in a fetch-failures loop for a 
long time, several or even dozens of hours.

This was caused by MAPREDUCE-6740, which ties mapreduce.task.timeout to 
mapreduce.task.progress-report.interval via 
MRJobConfUtil.getTaskProgressReportInterval():
{code:java}
  public static long getTaskProgressReportInterval(final Configuration conf) {
long taskHeartbeatTimeOut = conf.getLong(
MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
(long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
  }
{code}
When mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set, 
getTaskProgressReportInterval will return 0L.
The TaskReporter class, which reports task progress and status to the AM, 
sets taskProgressInterval = MRJobConfUtil.getTaskProgressReportInterval() and 
calls lock.wait(taskProgressInterval) before every progress report.
{code:java}
 public void run() {
  ...skip...
  long taskProgressInterval = MRJobConfUtil.
  getTaskProgressReportInterval(conf);
  while (!taskDone.get()) {
...skip...
try {
  // sleep for a bit
  synchronized(lock) {
if (taskDone.get()) {
  break;
}
lock.wait(taskProgressInterval);
  }
  if (taskDone.get()) {
break;
  }
  if (sendProgress) {
// we need to send progress update
updateCounters();
taskStatus.statusUpdate(taskProgress.get(),
taskProgress.toString(), 
counters);
taskFound = umbilical.statusUpdate(taskId, taskStatus);
taskStatus.clearStatus();
  }
  ...skip...
} 
...skip...
  }
   }
{code}
When mapreduce.task.timeout was set to 0, lock.wait(taskProgressInterval) 
becomes lock.wait(0), and because nothing ever notifies the lock, the reporter 
waits indefinitely and reports nothing to the AM.
So when fetch failures happen during shuffle, TaskReporter will not report 
fetch failures to the AM, even though the reducer log shows "Reporting fetch 
failure...", and the fetch-failures loop will not stop until the reduce task 
fails for exceeding MAX_FAILED_UNIQUE_FETCHES.

So it's necessary to set a TASK_PROGRESS_REPORT_INTERVAL_MAX value (such as 
30s): when the taskProgressInterval returned by 
MRJobConfUtil.getTaskProgressReportInterval() equals 0 or exceeds the max 
value, set taskProgressInterval = TASK_PROGRESS_REPORT_INTERVAL_MAX.

Exception Message:

2018-04-09 14:57:08,610 INFO [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: for 
url=6562/mapOutput?job=job_152320039_13196&reduce=0&map=attempt_152320039_13196_m_003652_0,attempt_152320039_13196_m_001331_0,attempt_152320039_13196_m_000342_0,attempt_152320039_13196_m_000105_0,attempt_152320039_13196_m_001211_0,attempt_152320039_13196_m_002219_0,attempt_152320039_13196_m_004747_0,attempt_152320039_13196_m_62_0
 sent hash and received reply
2018-04-09 14:57:08,612 WARN [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id 
java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal 
Server Error
Content-Type: text/plain; charset=UTF is not properly formed
at 
org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:201)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:510)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:348)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)
2018-04-09 14:57:08,612 WARN [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: copyMapOutput failed for tasks 
[attempt_152320039_13196_m_003652_0, 
attempt_152320039_13196_m_001331_0, attempt_152320039_13196_m_000342_0, 
attempt_152320039_13196_m_000105_0, attempt_152320039_13196_m_001211_0, 
attempt_152320039_13196_m_002219_0, attempt_152320039_13196_m_004747_0, 
attempt_152320039_13196_m_62_0]
2018-04-09 14:57:08,612 INFO [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Reporting fetch 
failure for attempt_152320039_13196_m_003652_0 to jobtracker.
2018-04-09 14:57:08,612 INF

[jira] [Created] (MAPREDUCE-7074) Shuffle get stuck in fetch failures loop, when a few mapoutput were lost or corrupted and task timeout was set to 0

2018-04-09 Thread smarthan (JIRA)
smarthan created MAPREDUCE-7074:
---

 Summary: Shuffle  get stuck in fetch failures loop, when a few 
mapoutput were lost or corrupted and task timeout was set to 0
 Key: MAPREDUCE-7074
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7074
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, task
Affects Versions: 3.0.0, 2.8.0
 Environment: cdh 5.10.0 ,  apache hadoop 2.8.0
Reporter: smarthan
 Fix For: 2.8.0
 Attachments: fetch_failures_report.patch

For an MR job like this:
 - an MR job with many map tasks, such as 1 or more
 - a few map outputs were lost or corrupted after the map tasks completed 
successfully and before shuffle started
 - mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set

the shuffle of the reduce task will get stuck in a fetch-failures loop for a 
long time, several or even dozens of hours.

This was caused by MAPREDUCE-6740, which ties mapreduce.task.timeout to 
mapreduce.task.progress-report.interval via 
MRJobConfUtil.getTaskProgressReportInterval():
{code:java}
  public static long getTaskProgressReportInterval(final Configuration conf) {
long taskHeartbeatTimeOut = conf.getLong(
MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
(long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
  }
{code}
When mapreduce.task.timeout was set to 0 and 
mapreduce.task.progress-report.interval was not set, 
getTaskProgressReportInterval will return 0L.
The TaskReporter class, which reports task progress and status to the AM, 
sets taskProgressInterval = MRJobConfUtil.getTaskProgressReportInterval() and 
calls lock.wait(taskProgressInterval) before every progress report.
{code:java}
 public void run() {
  ...skip...
  long taskProgressInterval = MRJobConfUtil.
  getTaskProgressReportInterval(conf);
  while (!taskDone.get()) {
...skip...
try {
  // sleep for a bit
  synchronized(lock) {
if (taskDone.get()) {
  break;
}
lock.wait(taskProgressInterval);
  }
  if (taskDone.get()) {
break;
  }
  if (sendProgress) {
// we need to send progress update
updateCounters();
taskStatus.statusUpdate(taskProgress.get(),
taskProgress.toString(), 
counters);
taskFound = umbilical.statusUpdate(taskId, taskStatus);
taskStatus.clearStatus();
  }
  ...skip...
} 
...skip...
  }
   }
{code}
When mapreduce.task.timeout was set to 0, lock.wait(taskProgressInterval) 
becomes lock.wait(0), and because nothing ever notifies the lock, the reporter 
waits indefinitely and reports nothing to the AM.
So when fetch failures happen during shuffle, TaskReporter will not report 
fetch failures to the AM, even though the reducer log shows "Reporting fetch 
failure...", and the fetch-failures loop will not stop until the reduce task 
fails for exceeding MAX_FAILED_UNIQUE_FETCHES.

So it's necessary to set a TASK_PROGRESS_REPORT_INTERVAL_MAX value (such as 
30s): when the taskProgressInterval returned by 
MRJobConfUtil.getTaskProgressReportInterval() equals 0 or exceeds the max 
value, set taskProgressInterval = TASK_PROGRESS_REPORT_INTERVAL_MAX.

Exception Message:

{code:java}
2018-04-09 14:57:08,610 INFO [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: for 
url=6562/mapOutput?job=job_152320039_13196&reduce=0&map=attempt_152320039_13196_m_003652_0,attempt_152320039_13196_m_001331_0,attempt_152320039_13196_m_000342_0,attempt_152320039_13196_m_000105_0,attempt_152320039_13196_m_001211_0,attempt_152320039_13196_m_002219_0,attempt_152320039_13196_m_004747_0,attempt_152320039_13196_m_62_0
 sent hash and received reply
2018-04-09 14:57:08,612 WARN [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id 
java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal 
Server Error
Content-Type: text/plain; charset=UTF is not properly formed
at 
org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:201)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:510)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:348)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)
2018-04-09 14:57:08,612 WARN [fetcher#3] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: copyMapOutput failed for tasks 
[attempt_152320039_13196_m_003652_0, 
attempt_152320039_13196_m_001331_0, attempt_152320039_13196_m_000342_0,

[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-09 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430549#comment-16430549
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], I think this is ready for review.  The unit test failure is 
unrelated; I'm not sure if there is a bug filed for that jobclient test 
failure?

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> {{mapreduce.reduce.env}}, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g., mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}






[jira] [Commented] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread lindongdong (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430494#comment-16430494
 ] 

lindongdong commented on MAPREDUCE-7073:


*Thanks~~*

> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: MAPREDUCE-7073.001.patch
>
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.
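The optimization described above amounts to hoisting the expensive principal lookup out of the per-filesystem loop. A minimal sketch of the shape of the change, with stand-in names rather than the actual Hadoop classes ({{getMasterPrincipal}} here simulates the real call's configuration reload with a counter):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class TokenCacheSketch {
    static final AtomicInteger reloads = new AtomicInteger();

    // Stand-in for Master.getMasterPrincipal(conf), which in the unpatched
    // code reloads YarnConfiguration on every call.
    static String getMasterPrincipal() {
        reloads.incrementAndGet();  // simulate the expensive config reload
        return "rm/_HOST@EXAMPLE.COM";
    }

    // Patched shape: resolve the principal once, then reuse it for each
    // filesystem instead of recomputing it inside the loop.
    static void obtainTokens(List<String> fileSystems) {
        String principal = getMasterPrincipal();  // hoisted out of the loop
        for (String fs : fileSystems) {
            obtainTokenFor(fs, principal);
        }
    }

    static void obtainTokenFor(String fs, String principal) {
        // would fetch a delegation token for fs using the renewer principal
    }

    public static void main(String[] args) {
        obtainTokens(Arrays.asList("hdfs://nn1", "hdfs://nn2", "hdfs://nn3"));
        System.out.println(reloads.get());  // one reload instead of three
    }
}
```

With 1k input files the same hoisting turns 1k configuration reloads into one.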






[jira] [Commented] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430362#comment-16430362
 ] 

Hadoop QA commented on MAPREDUCE-7073:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
59s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-mapreduce-client-core in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-mapreduce-client-core in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 25s{color} 
| {color:red} hadoop-mapreduce-client-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-mapreduce-client-core in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m  
7s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-mapreduce-client-core in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 27s{color} 
| {color:red} hadoop-mapreduce-client-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | MAPREDUCE-7073 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918059/MAPREDUCE-7073.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 71c9167e4433 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5700556 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7382/artifact/out/patch-mvninstall-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7382/artifact/out/patch-compile-hadoop-mapreduce-project_hadoop-mapredu

[jira] [Commented] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430324#comment-16430324
 ] 

Bibin A Chundatt commented on MAPREDUCE-7073:
-

Issue credits to [~lindongdong]

> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: MAPREDUCE-7073.001.patch
>
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.






[jira] [Updated] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated MAPREDUCE-7073:

Description: 
{{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 

In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
is called for every filesystem instance, which reloads YarnConfiguration each time.
For a file input with 1k files, YarnConfiguration will be reloaded 1k times.

{{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
{{obtainTokensForNamenodesInternal}} per filesystem call.


  was:
{{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 

In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
is called for every filesystem instance, which reloads YarnConfiguration each time.
For a file input with 1k files, YarnConfiguration will be reloaded 1k times.

{{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
{{obtainTokensForNamenodesInternal}} per filesystem call.


> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: MAPREDUCE-7073.001.patch
>
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.






[jira] [Updated] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated MAPREDUCE-7073:

Status: Patch Available  (was: Open)

> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: MAPREDUCE-7073.001.patch
>
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.






[jira] [Updated] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated MAPREDUCE-7073:

Attachment: MAPREDUCE-7073.001.patch

> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: MAPREDUCE-7073.001.patch
>
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.






[jira] [Updated] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated MAPREDUCE-7073:

Description: 
{{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 

In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
is called for every filesystem instance, which reloads YarnConfiguration each time.
For a file input with 1k files, YarnConfiguration will be reloaded 1k times.

{{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
{{obtainTokensForNamenodesInternal}} per filesystem call.

  was:
FileInputFormat#listStatus is too slow when the file system cache is disabled. 

In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
is called for every filesystem instance, which reloads YarnConfiguration each time.
For a file input with 1k files, YarnConfiguration will be reloaded 1k times.

{{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
{{obtainTokensForNamenodesInternal}} per filesystem call.


> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>
> {{FileInputFormat#listStatus}} is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.






[jira] [Assigned] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt reassigned MAPREDUCE-7073:
---

Assignee: Bibin A Chundatt

> Optimize TokenCache#obtainTokensForNamenodesInternal
> 
>
> Key: MAPREDUCE-7073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>
> FileInputFormat#listStatus is too slow when the file system cache is disabled. 
> In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
> is called for every filesystem instance, which reloads YarnConfiguration each time.
> For a file input with 1k files, YarnConfiguration will be reloaded 1k times.
> {{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
> {{obtainTokensForNamenodesInternal}} per filesystem call.






[jira] [Created] (MAPREDUCE-7073) Optimize TokenCache#obtainTokensForNamenodesInternal

2018-04-09 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created MAPREDUCE-7073:
---

 Summary: Optimize TokenCache#obtainTokensForNamenodesInternal
 Key: MAPREDUCE-7073
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7073
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Bibin A Chundatt


FileInputFormat#listStatus is too slow when the file system cache is disabled. 

In {{TokenCache#obtainTokensForNamenodesInternal}}, {{Master.getMasterPrincipal(conf)}} 
is called for every filesystem instance, which reloads YarnConfiguration each time.
For a file input with 1k files, YarnConfiguration will be reloaded 1k times.

{{Master.getMasterPrincipal(conf)}} could instead be resolved once and passed to 
{{obtainTokensForNamenodesInternal}} per filesystem call.


