[jira] [Commented] (TEZ-3824) MRCombiner creates new JobConf copy per spill

Jason Lowe (JIRA) Thu, 10 May 2018 07:43:23 -0700

    [ 
https://issues.apache.org/jira/browse/TEZ-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470476#comment-16470476
 ]


Jason Lowe commented on TEZ-3824:
---------------------------------

bq. As is, the patch will send in a null for the config in case the old API is 
being used?

The new JobConf object added by the patch is initialized only when the new API 
is in use, and it is also read only in that case: createReduceContext is 
called only by runNewCombiner, which in turn is called only if useNewApi is 
true.

My main question about the patch is whether we really need a separate jobConf 
field.  The constructor can simply check the new API flag from the parsed 
configuration and assign {{conf}} either the parsed Configuration object or a 
JobConf built from it.  That way we don't have to hold onto a Configuration 
object _and_ a JobConf object when the new API is in use.
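
A minimal sketch of the single-field idea, for illustration only.  It does not 
use the real Hadoop or Tez classes: Configuration, JobConf, JobContextSketch, 
CombinerSketch and the key sketch.use.new.api are simplified, hypothetical 
stand-ins, and only the instanceof check is modeled on the JobContextImpl 
constructor quoted in the issue description.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a Hadoop-style configuration; NOT the real class.
class Configuration {
  static int copies = 0;  // counts copy constructions, for the demo only
  final Map<String, String> props = new HashMap<>();

  Configuration() {}

  // Copy constructor: duplicates every property, which is the costly step.
  Configuration(Configuration other) {
    copies++;
    props.putAll(other.props);
  }

  boolean getBoolean(String key, boolean defaultValue) {
    String value = props.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }
}

// Stand-in for JobConf: building one from a Configuration copies it.
class JobConf extends Configuration {
  JobConf(Configuration conf) {
    super(conf);
  }
}

// Modeled on the instanceof check in the quoted JobContextImpl constructor.
class JobContextSketch {
  final JobConf conf;

  JobContextSketch(Configuration conf) {
    this.conf = (conf instanceof JobConf) ? (JobConf) conf : new JobConf(conf);
  }
}

// Hypothetical combiner that keeps a single conf field.
class CombinerSketch {
  static final String NEW_API_KEY = "sketch.use.new.api";  // hypothetical key

  final boolean useNewApi;
  final Configuration conf;  // holds a JobConf when useNewApi is true

  CombinerSketch(Configuration parsed) {
    this.useNewApi = parsed.getBoolean(NEW_API_KEY, false);
    // Build the JobConf once here so each spill can reuse it.
    this.conf = useNewApi ? new JobConf(parsed) : parsed;
  }

  // Stands in for the per-spill context creation on the new-API path.
  JobContextSketch createReduceContext() {
    return new JobContextSketch(conf);
  }
}

class Demo {
  public static void main(String[] args) {
    Configuration parsed = new Configuration();
    parsed.props.put(CombinerSketch.NEW_API_KEY, "true");

    CombinerSketch combiner = new CombinerSketch(parsed);
    for (int spill = 0; spill < 3; spill++) {
      combiner.createReduceContext();
    }
    // One copy in the constructor, none per spill.
    System.out.println("copies = " + Configuration.copies);  // copies = 1
  }
}
```

In this sketch the old-API path keeps the parsed object untouched and never 
builds a JobConf, so there is no second field to leave null.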


> MRCombiner creates new JobConf copy per spill
> ---------------------------------------------
>
>                 Key: TEZ-3824
>                 URL: https://issues.apache.org/jira/browse/TEZ-3824
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>         Attachments: TEZ-3824.001.patch
>
>
> {noformat:title=scope-57(HASH_JOIN) stack trace}
> "SpillThread {scope_60_" #99 daemon prio=5 os_prio=0 tid=0x00007f2128d21800 nid=0x7487 runnable [0x00007f21154c4000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1012)
>         at java.util.concurrent.ConcurrentHashMap.putAll(ConcurrentHashMap.java:1084)
>         at java.util.concurrent.ConcurrentHashMap.<init>(ConcurrentHashMap.java:852)
>         at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:728)
>         - locked <0x00000000d1dc5240> (a org.apache.hadoop.conf.Configuration)
>         at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:442)
>         at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:67)
>         at org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.<init>(TaskAttemptContextImpl.java:49)
>         at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.<init>(TaskInputOutputContextImpl.java:54)
>         at org.apache.hadoop.mapreduce.task.ReduceContextImpl.<init>(ReduceContextImpl.java:95)
>         at org.apache.tez.mapreduce.combine.MRCombiner.createReduceContext(MRCombiner.java:237)
>         at org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:181)
>         at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115)
>         at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:313)
>         at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:937)
>         at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:861)
>         at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$SpillThread.run(DefaultSorter.java:780)
> {noformat}
> {code:title=JobConf copy construction for tez}
>   public JobContextImpl(Configuration conf, JobID jobId) {
>     if (conf instanceof JobConf) {
>       this.conf = (JobConf)conf;
>     } else {
>       this.conf = new JobConf(conf); // <--- new JobConf copy is made here
>     }
>     this.jobId = jobId;
>     this.credentials = this.conf.getCredentials();
>     try {
>       this.ugi = UserGroupInformation.getCurrentUser();
>     } catch (IOException e) {
>       throw new RuntimeException(e);
>     }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
