[jira] [Created] (MAPREDUCE-6920) Cannot find the effect of mapreduce.job.speculative.slowtaskthreshold parameter

2017-07-25 Thread NING DING (JIRA)
NING DING created MAPREDUCE-6920:


 Summary: Cannot find the effect of 
mapreduce.job.speculative.slowtaskthreshold parameter
 Key: MAPREDUCE-6920
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6920
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: NING DING
Priority: Minor


The description of the mapreduce.job.speculative.slowtaskthreshold parameter is as 
below.
{code:xml}
<property>
  <name>mapreduce.job.speculative.slowtaskthreshold</name>
  <value>1.0</value>
  <description>The number of standard deviations by which a task's
  ave progress-rates must be lower than the average of all running tasks'
  for the task to be considered too slow.
  </description>
</property>
{code}

But from the source code I find it has no effect on starting speculative tasks.
The call stack is as below:
DefaultSpeculator.speculationValue -> StartEndTimesBase.thresholdRuntime -> 
DataStatistics.outlier
{code:title=DefaultSpeculator.java|borderStyle=solid}
  private TaskItem speculationValue(TaskId taskID, long now) {
    TaskItem taskItem = new TaskItem();
    Job job = context.getJob(taskID.getJobId());
    Task task = job.getTask(taskID);
    Map<TaskAttemptId, TaskAttempt> attempts = task.getAttempts();
    long acceptableRuntime = Long.MIN_VALUE;
    long speculationValue = Long.MIN_VALUE;

    if (!mayHaveSpeculated.contains(taskID)) {
      acceptableRuntime = estimator.thresholdRuntime(taskID);
      if (acceptableRuntime == Long.MAX_VALUE) {
        taskItem.setSpeculationValue(ON_SCHEDULE);
        return taskItem;
      }
    }
    ...
  }
{code}
{code:title=StartEndTimesBase.java|borderStyle=solid}
  public long thresholdRuntime(TaskId taskID) {
    JobId jobID = taskID.getJobId();
    Job job = context.getJob(jobID);

    TaskType type = taskID.getTaskType();

    DataStatistics statistics = dataStatisticsForTask(taskID);

    int completedTasksOfType = type == TaskType.MAP
        ? job.getCompletedMaps() : job.getCompletedReduces();

    int totalTasksOfType = type == TaskType.MAP
        ? job.getTotalMaps() : job.getTotalReduces();

    if (completedTasksOfType < MINIMUM_COMPLETE_NUMBER_TO_SPECULATE
        || (((float) completedTasksOfType) / totalTasksOfType)
            < MINIMUM_COMPLETE_PROPORTION_TO_SPECULATE) {
      return Long.MAX_VALUE;
    }

    long result = statistics == null
        ? Long.MAX_VALUE
        : (long) statistics.outlier(slowTaskRelativeTresholds.get(job));
    return result;
  }
{code}
{code:title=DataStatistics.java|borderStyle=solid}
  public synchronized double outlier(float sigma) {
    if (count != 0.0) {
      return mean() + std() * sigma;
    }

    return 0.0;
  }
{code}

StartEndTimesBase.contextualize reads the 
mapreduce.job.speculative.slowtaskthreshold parameter value, then uses it as 
the sigma argument of the outlier method.
{code:title=StartEndTimesBase.java|borderStyle=solid}
  public void contextualize(Configuration conf, AppContext context) {
    this.context = context;

    Map<JobId, Job> allJobs = context.getAllJobs();

    for (Map.Entry<JobId, Job> entry : allJobs.entrySet()) {
      final Job job = entry.getValue();
      mapperStatistics.put(job, new DataStatistics());
      reducerStatistics.put(job, new DataStatistics());
      slowTaskRelativeTresholds.put(
          job, conf.getFloat(MRJobConfig.SPECULATIVE_SLOWTASK_THRESHOLD, 1.0f));
    }
  }
{code}

I think the outlier return value can hardly ever be Long.MAX_VALUE, no matter 
what the mapreduce.job.speculative.slowtaskthreshold parameter value is. 
So the parameter cannot affect the return value of the 
DefaultSpeculator.speculationValue method.
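The arithmetic behind this claim can be illustrated with a small standalone sketch (the class name and sample figures below are hypothetical, not Hadoop code): mean() + std() * sigma stays many orders of magnitude below Long.MAX_VALUE for any plausible progress-rate statistics, even with an enormous sigma.

```java
// Standalone sketch modeling DataStatistics.outlier(); not Hadoop code.
public class SpeculationMath {

    // Same formula as DataStatistics.outlier(): mean plus sigma std deviations.
    static double outlier(double mean, double std, float sigma) {
        return mean + std * sigma;
    }

    public static void main(String[] args) {
        // Hypothetical runtime statistics (milliseconds) with an absurdly
        // large sigma of 1000: the result is ~1.5e7, nowhere near
        // Long.MAX_VALUE (~9.2e18), so the acceptableRuntime == Long.MAX_VALUE
        // branch in speculationValue() is effectively unreachable once
        // statistics exist.
        double threshold = outlier(60000.0, 15000.0, 1000.0f);
        System.out.println(threshold);
        System.out.println((double) Long.MAX_VALUE);
    }
}
```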
Then I ran a test for this parameter. The test source code is as below.

{code:title=TestSpeculativeTask.java|borderStyle=solid}
package test.speculation;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class TestSpeculativeTask extends Configured implements Tool {

  public TestSpeculativeTask() {

  }

  public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    Job job = Job.getInstance(conf);
    FileInputFormat.setInputPaths(job, args[0]);
    job.setMapperClass(SpeculativeTestMapper.class);
    job.setNumReduceTasks(0);
    job.setJarByClass(SpeculativeTestMapper.class);
    Path output = new Path(args[1]);
    FileOutputFormat.setOutputPath(job, output);
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(Text.class);
    ...
{code}

Re: Pre-Commit build is failing

2017-07-25 Thread Sean Busbey
-dev@yetus to bcc, since I think this is a Hadoop issue and not a Yetus
issue.

Please review/commit HADOOP-14686 (which I am providing as a
volunteer/contributor on the Hadoop project).

On Tue, Jul 25, 2017 at 7:54 PM, Allen Wittenauer 
wrote:

>
> Again: just grab the .gitignore file from trunk and update it in
> branch-2.7. It hasn't been touched (outside of one patch) in years.  The
> existing jobs should then work.
>
> The rest of this stuff, yes, I know and yes it's intentional.  The
> directory structure was inherited from the original jobs that Nigel set up
> with the old version of test-patch.  Maybe some day I'll fix it.  But
> that's a project for a different day.  In order to fix it, it means taking
> down the patch testing for Hadoop while I work it out.  You'll notice that
> all of the other Yetus jobs for Hadoop have a much different layout.
>
>
>
>
> > On Jul 25, 2017, at 7:24 PM, suraj acharya  wrote:
> >
> > Hi,
> >
> > Seems like the issue was incorrect/unclean checkout.
> > I made a few changes[1] to the directories the checkout happens to  and
> it is now running.
> > Of course, this build[2] will take some time to run, but at the moment,
> it is running maven install.
> >
> > I am not sure who sets up/ manages the jenkins job of HDFS and dont want
> to change that, but I will keep the dummy job around for a couple of days
> in case anyone wants to see.
> > Also, I see that you'll were using the master branch of Yetus. If there
> is no patch present there that is of importance, then I would recommend to
> use the latest stable release version 0.5.0
> >
> > If you have more questions, feel free to ping dev@yetus.
> > Hope this helps.
> >
> > [1]: https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/configure
> > [2]: https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/12/console
> >
> > -Suraj Acharya
> >
> > On Tue, Jul 25, 2017 at 6:57 PM, suraj acharya 
> wrote:
> > For anyone looking. I created another job here. [1].
> > Set it with debug to see the issue.
> > The error is being seen here[2].
> > From the looks of it, it looks like, the way the checkout is happening
> is not very clean.
> > I will continue to look at it, but in case anyone wants to jump in.
> >
> > [1] : https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/
> > [2] : https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/11/console
> >
> > -Suraj Acharya
> >
> > On Tue, Jul 25, 2017 at 6:28 PM, Konstantin Shvachko <
> shv.had...@gmail.com> wrote:
> > Hi Yetus developers,
> >
> > We cannot build Hadoop branch-2.7 anymore. Here is a recent example of a
> > failed build:
> > https://builds.apache.org/job/PreCommit-HDFS-Build/20409/console
> >
> > It seems the build is failing because Yetus cannot apply the patch from
> the
> > jira.
> >
> > ERROR: HDFS-11896 does not apply to branch-2.7.
> >
> > As far as I understand this is Yetus problem. Probably in 0.3.0.
> > I can apply this patch successfully, but Yetus test-patch.sh script
> clearly
> > failed to apply. Cannot say why because Yetus does not report it.
> > I also ran Hadoop's test-patch.sh script locally and it passed
> successfully
> > on branch-2.7.
> >
> > Could anybody please take a look and help fixing the build.
> > This would be very helpful for the release (2.7.4) process.
> >
> > Thanks,
> > --Konst
> >
> > On Mon, Jul 24, 2017 at 10:41 PM, Konstantin Shvachko <
> shv.had...@gmail.com>
> > wrote:
> >
> > > Or should we backport the entire HADOOP-11917
> > >  ?
> > >
> > > Thanks,
> > > --Konst
> > >
> > > On Mon, Jul 24, 2017 at 6:56 PM, Konstantin Shvachko <
> shv.had...@gmail.com
> > > > wrote:
> > >
> > >> Allen,
> > >>
> > >> Should we add "patchprocess/" to .gitignore, is that the problem for
> 2.7?
> > >>
> > >> Thanks,
> > >> --Konstantin
> > >>
> > >> On Fri, Jul 21, 2017 at 6:24 PM, Konstantin Shvachko <
> > >> shv.had...@gmail.com> wrote:
> > >>
> > >>> What stuff? Is there a jira?
> > >>> It did work like a week ago. Is it a new Yetus requirement.
> > >>> Anyways I can commit a change to fix the build on our side.
> > >>> Just need to know what is missing.
> > >>>
> > >>> Thanks,
> > >>> --Konst
> > >>>
> > >>> On Fri, Jul 21, 2017 at 5:50 PM, Allen Wittenauer <
> > >>> a...@effectivemachines.com> wrote:
> > >>>
> > 
> >  > On Jul 21, 2017, at 5:46 PM, Konstantin Shvachko <
> >  shv.had...@gmail.com> wrote:
> >  >
> >  > + d...@yetus.apache.org
> >  >
> >  > Guys, could you please take a look. Seems like Yetus problem with
> >  > pre-commit build for branch-2.7.
> > 
> > 
> >  branch-2.7 is missing stuff in .gitignore.
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
> >
>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org

Re: Pre-Commit build is failing

2017-07-25 Thread Allen Wittenauer

Again: just grab the .gitignore file from trunk and update it in 
branch-2.7. It hasn't been touched (outside of one patch) in years.  The 
existing jobs should then work. 

The rest of this stuff, yes, I know and yes it's intentional.  The 
directory structure was inherited from the original jobs that Nigel set up with 
the old version of test-patch.  Maybe some day I'll fix it.  But that's a 
project for a different day.  In order to fix it, it means taking down the 
patch testing for Hadoop while I work it out.  You'll notice that all of the 
other Yetus jobs for Hadoop have a much different layout.






-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: Pre-Commit build is failing

2017-07-25 Thread suraj acharya
Hi,

It seems the issue was an incorrect/unclean checkout.
I made a few changes[1] to the directories the checkout happens in, and it
is now running.
Of course, this build[2] will take some time to run, but at the moment it
is running maven install.

I am not sure who sets up/manages the Jenkins job for HDFS and don't want to
change that, but I will keep the dummy job around for a couple of days in
case anyone wants to see it.
Also, I see that you all were using the master branch of Yetus. If there is
no patch of importance present there, then I would recommend using
the latest stable release, version 0.5.0.

If you have more questions, feel free to ping dev@yetus.
Hope this helps.

[1]: https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/configure
[2]:
https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/12/console

-Suraj Acharya



Re: Pre-Commit build is failing

2017-07-25 Thread suraj acharya
For anyone looking: I created another job here [1].
Set it with debug to see the issue.
The error can be seen here [2].
From the looks of it, the way the checkout is happening is
not very clean.
I will continue to look at it, but in case anyone wants to jump in.

[1] : https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/

[2] :
https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/11/console

-Suraj Acharya



Re: Pre-Commit build is failing

2017-07-25 Thread Konstantin Shvachko
Hi Yetus developers,

We cannot build Hadoop branch-2.7 anymore. Here is a recent example of a
failed build:
https://builds.apache.org/job/PreCommit-HDFS-Build/20409/console

It seems the build is failing because Yetus cannot apply the patch from the
jira.

ERROR: HDFS-11896 does not apply to branch-2.7.

As far as I understand, this is a Yetus problem, probably in 0.3.0.
I can apply this patch successfully, but the Yetus test-patch.sh script clearly
failed to apply it. I cannot say why, because Yetus does not report it.
I also ran Hadoop's test-patch.sh script locally and it passed successfully
on branch-2.7.

Could anybody please take a look and help fix the build?
This would be very helpful for the release (2.7.4) process.

Thanks,
--Konst



[jira] [Created] (MAPREDUCE-6919) ShuffleMetrics.ShuffleConnections Gauge Metric Rises Infinitely

2017-07-25 Thread Erik Krogen (JIRA)
Erik Krogen created MAPREDUCE-6919:
--

 Summary: ShuffleMetrics.ShuffleConnections Gauge Metric Rises 
Infinitely
 Key: MAPREDUCE-6919
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6919
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Erik Krogen


We recently noticed that the mapred.ShuffleMetrics.ShuffleConnections metric 
rises indefinitely (see attached graph), despite supposedly being a gauge 
measuring the number of currently open connections:
{code:title=ShuffleHandler.java}
@Metric("# of current shuffle connections")
MutableGaugeInt shuffleConnections;
{code}

It seems this is because the metric is incremented once for each map file sent, 
but decremented once for each request. Thus a request which fetches multiple 
map files permanently increments shuffleConnections by (mapsFetched - 1).
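The imbalance described above can be reproduced with a tiny standalone model (names are illustrative; this is not the actual ShuffleHandler code):

```java
// Standalone model of the reported accounting bug: the gauge goes up once
// per map file sent but comes down only once per request.
public class ShuffleGaugeDrift {
    int shuffleConnections; // stands in for the MutableGaugeInt

    // One shuffle request that fetches several map outputs.
    void handleRequest(int mapsFetched) {
        for (int i = 0; i < mapsFetched; i++) {
            shuffleConnections++; // incremented per map file sent
        }
        shuffleConnections--;     // decremented once per request
    }

    public static void main(String[] args) {
        ShuffleGaugeDrift gauge = new ShuffleGaugeDrift();
        gauge.handleRequest(5);
        gauge.handleRequest(5);
        // Two finished requests should leave the gauge at 0, but each one
        // leaked (mapsFetched - 1) = 4, so it reads 8 and never recovers.
        System.out.println(gauge.shuffleConnections);
    }
}
```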



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Created] (MAPREDUCE-6918) ShuffleMetrics.ShuffleConnections Gauge Metric Climbs Infinitely

2017-07-25 Thread Erik Krogen (JIRA)
Erik Krogen created MAPREDUCE-6918:
--

 Summary: ShuffleMetrics.ShuffleConnections Gauge Metric Climbs 
Infinitely
 Key: MAPREDUCE-6918
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6918
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Erik Krogen


We recently noticed that the {{mapred.ShuffleMetrics.ShuffleConnections}} 
metric seems to climb infinitely, up to many millions (see attached graph), 
despite supposedly being a gauge measuring the number of open connections:
{code:title=ShuffleHandler.java}
@Metric("# of current shuffle connections")
MutableGaugeInt shuffleConnections;
{code}

It seems that shuffleConnections gets incremented once for every map fetched, 
but only decremented once for every request. It seems to me it should be 
modified to only be incremented once for every request rather than for every 
map fetched, but I'm not familiar with the original intent.
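The direction proposed above, incrementing once per request, would keep the gauge balanced. A hypothetical standalone sketch (again a model, not an actual ShuffleHandler patch):

```java
// Standalone model of the proposed fix: balance the gauge by incrementing
// once per request, regardless of how many map outputs the request fetches.
public class ShuffleGaugeFixed {
    int shuffleConnections; // stands in for the MutableGaugeInt

    void handleRequest(int mapsFetched) {
        shuffleConnections++;     // one open connection per request
        for (int i = 0; i < mapsFetched; i++) {
            // serve one map output; no gauge update per map
        }
        shuffleConnections--;     // connection closed with the request
    }

    public static void main(String[] args) {
        ShuffleGaugeFixed gauge = new ShuffleGaugeFixed();
        gauge.handleRequest(5);
        gauge.handleRequest(5);
        // The gauge returns to 0 once both requests complete.
        System.out.println(gauge.shuffleConnections);
    }
}
```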


