[jira] [Commented] (TEZ-4087) Shuffle: Fix shuffle cleanup to prevent thread leaks
[ https://issues.apache.org/jira/browse/TEZ-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162299#comment-17162299 ] Jonathan Turner Eagles commented on TEZ-4087: - [~rajesh.balamohan], [~ashutoshc]. This commit has caused findbugs warning filed in TEZ-4203. Can either of you put up a patch to address? > Shuffle: Fix shuffle cleanup to prevent thread leaks > > > Key: TEZ-4087 > URL: https://issues.apache.org/jira/browse/TEZ-4087 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Major > Attachments: TEZ-4087.1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > In certain cases, Shuffle's cleanupIgnoreErrors() is not called. This leaves > 4 threads (inmem, diskmerger, Referee, ShuffleAndMergeRunner) run forever. > When these are run in long running processes (e.g LLAP in Hive), they reach > the thread limits over time. > Note: Root cause why cleanupIgnoreErrors() is not invoked is not yet known. I > will share the details when i get more details on this. Creating this ticket > to add additional safety knobs to ensure that thread leaks do not happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TEZ-4133) key class implements writableComparable and configurable use default configuration
[ https://issues.apache.org/jira/browse/TEZ-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162114#comment-17162114 ] Jonathan Turner Eagles commented on TEZ-4133: - +1. Will commit this change to master and branch-0.9 > key class implements writableComparable and configurable use default > configuration > -- > > Key: TEZ-4133 > URL: https://issues.apache.org/jira/browse/TEZ-4133 > Project: Apache Tez > Issue Type: Bug >Reporter: wang qiang >Assignee: wang qiang >Priority: Major > Attachments: TEZ-4133-1.patch, TEZ-4133-2.patch > > > key class implments writableComparable and configurable ,setConf be invoked > with a default configuration,it should like > mapreduce(org.apache.hadoop.mapred.JobConf#getOutputKeyComparator) > {code:java} > //org.apache.tez.runtime.library.common.ConfigUtils#getIntermediateOutputKeyComparator > public static RawComparator > getIntermediateOutputKeyComparator(Configuration conf) { > Class theClass = conf.getClass( > TezRuntimeConfiguration.TEZ_RUNTIME_KEY_COMPARATOR_CLASS, null, > RawComparator.class); > if (theClass != null) > return ReflectionUtils.newInstance(theClass, conf); > // aybe change to > // return > WritableComparator.get(getIntermediateOutputKeyClass(conf).asSubclass( > //WritableComparable.class),conf); > return > WritableComparator.get(getIntermediateOutputKeyClass(conf).asSubclass( > WritableComparable.class)); > } > public static RawComparator > getIntermediateInputKeyComparator(Configuration conf) { > Class theClass = conf.getClass( > TezRuntimeConfiguration.TEZ_RUNTIME_KEY_COMPARATOR_CLASS, null, > RawComparator.class); > if (theClass != null) > return ReflectionUtils.newInstance(theClass, conf); > // maybe change to > // return > WritableComparator.get(getIntermediateInputKeyClass(conf).asSubclass( > //WritableComparable.class),conf); > return WritableComparator.get(getIntermediateInputKeyClass(conf).asSubclass( > WritableComparable.class)); > } > {code} > > > > > > > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TEZ-4196) Add a possibility to provide configuration when creating TezClient
[ https://issues.apache.org/jira/browse/TEZ-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161837#comment-17161837 ] Peter Vary commented on TEZ-4196: - [~abstractdog]: TEZ-4175 seems like a perfect solution for this specific problem - I am not sure about the other consequences. Any timeline for that patch to get committed? Thanks, Peter > Add a possibility to provide configuration when creating TezClient > -- > > Key: TEZ-4196 > URL: https://issues.apache.org/jira/browse/TEZ-4196 > Project: Apache Tez > Issue Type: Improvement >Reporter: Peter Vary >Priority: Major > Attachments: TEZ-4196.01.patch, TEZ-4196.02.patch > > > Currently Tez/YarnConfiguration creation is not too flexible. > It would be good to have a possibility to provide YarnConfiguration from > outside, and also making sure that when a TezConfiguration is created it > should not parse the files again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TEZ-4175) Consider removing YarnConfiguration where it's possible
[ https://issues.apache.org/jira/browse/TEZ-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161824#comment-17161824 ] László Bodor commented on TEZ-4175: --- could you please take a look at this patch [~jeagles], [~gopalv]? 1. eliminates the doubled tez + yarn config from AMConfiguration and relies on single tezconf 2. eliminates the yarn conf from classes extending FrameworkClient (the only relevant is TezYarnClient which can use tezconf with this patch) {code} public void init(TezConfiguration tezConf) { yarnClient.init(tezConf); } {code} I don't know the history of doubled tez/yarn confs, but is seems like this patch could prevent creating unnecessary conf objects > Consider removing YarnConfiguration where it's possible > --- > > Key: TEZ-4175 > URL: https://issues.apache.org/jira/browse/TEZ-4175 > Project: Apache Tez > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: TEZ-4175.01.patch, TEZ-4175.02.patch, TEZ-4175.03.patch, > TEZ-4175.03.patch > > > A comment in DAGAppmaster made me think that we don't need to rely on > [YarnConfiguration|https://github.com/apache/hadoop/blob/branch-3.1.3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java] > in all cases, what if it can be replace with base Configuration... > {code} > // TODO Does this really need to be a YarnConfiguration ? > Configuration conf = new Configuration(new YarnConfiguration()); > {code} > In hadoop 3.1.3 source, I cannot see that it adds e.g. yarn-site as a > resource: > {code} > public YarnConfiguration() { > super(); > } > > public YarnConfiguration(Configuration conf) { > super(conf); > if (! (conf instanceof YarnConfiguration)) { > this.reloadConfiguration(); > } > } > {code} > in current codebase: > {code} > grep -iRH "new YarnConfiguration" --include="*.java" > tez-plugins/tez-history-parser/src/main/java/org/apache/tez/history/ATSImportTool.java: > YarnConfiguration yarnConf = new YarnConfiguration(conf); > tez-plugins/tez-aux-services/src/main/java/org/apache/tez/auxservices/ShuffleHandler.java: > super.serviceInit(new YarnConfiguration(conf)); > tez-api/src/test/java/org/apache/tez/dag/api/client/rpc/TestDAGClient.java: > YarnConfiguration yarnConf = new YarnConfiguration(tezConf); > tez-api/src/test/java/org/apache/tez/dag/api/client/rpc/TestDAGClient.java: > YarnConfiguration yarnConf = new YarnConfiguration(tezConf); > tez-api/src/test/java/org/apache/tez/dag/api/client/rpc/TestDAGClient.java: > YarnConfiguration yarnConf = new YarnConfiguration(tezConf); > tez-api/src/test/java/org/apache/tez/client/TestTezClient.java: > tezClient.init(new TezConfiguration(false), new YarnConfiguration()); > tez-api/src/main/java/org/apache/tez/client/TezClient.java: > amConfig.setYarnConfiguration(new > YarnConfiguration(amConfig.getTezConfiguration())); > tez-api/src/main/java/org/apache/tez/client/TezClient.java: > amConfig.setYarnConfiguration(new > YarnConfiguration(amConfig.getTezConfiguration())); > tez-api/src/main/java/org/apache/tez/client/TezClient.java:return > getDAGClient(appId, tezConf, new YarnConfiguration(tezConf), frameworkClient, > ugi); > tez-tests/src/test/java/org/apache/tez/test/FaultToleranceTestRunner.java: > tezConf = new TezConfiguration(new YarnConfiguration()); > tez-tests/src/test/java/org/apache/tez/test/FaultToleranceTestRunner.java: >tezConf = new TezConfiguration(new YarnConfiguration(this.conf)); > tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java: > Configuration testConf = new YarnConfiguration( > tez-mapreduce/src/main/java/org/apache/tez/mapreduce/client/YARNRunner.java: > this(conf, new ResourceMgrDelegate(new YarnConfiguration(conf))); > tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: > Configuration conf = new Configuration(new YarnConfiguration()); > tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: > Configuration conf = new Configuration(new YarnConfiguration()); > tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: > Configuration tezConf = new Configuration(new YarnConfiguration()); > tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: > Configuration tezConf = new Configuration(new YarnConfiguration()); > tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: > Configuration tezConf = new Configuration(new YarnConfiguration()); > tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: > Configuration tezConf = new Configuration(new
[jira] [Commented] (TEZ-4196) Add a possibility to provide configuration when creating TezClient
[ https://issues.apache.org/jira/browse/TEZ-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161819#comment-17161819 ] László Bodor commented on TEZ-4196: --- [~pvary]: thanks for the patch, I can understand the purpose of this fix according to your internal conversation [~rbalamohan], this indeed opens an API where an incorrect YarnConfiguration can be passed by the user...however, the getClient API (introduced in TEZ-3892) already assumes the user is aware what's going on with already running AM's (I know this is not a 100% valid explanation) before merging this particular patch, please let's consider taking a look at TEZ-4175, as it also eliminates the overhead which is supposed to be eliminated by this patch (right?): {code} amConfig.setYarnConfiguration(new YarnConfiguration(amConfig.getTezConfiguration())); {code} > Add a possibility to provide configuration when creating TezClient > -- > > Key: TEZ-4196 > URL: https://issues.apache.org/jira/browse/TEZ-4196 > Project: Apache Tez > Issue Type: Improvement >Reporter: Peter Vary >Priority: Major > Attachments: TEZ-4196.01.patch, TEZ-4196.02.patch > > > Currently Tez/YarnConfiguration creation is not too flexible. > It would be good to have a possibility to provide YarnConfiguration from > outside, and also making sure that when a TezConfiguration is created it > should not parse the files again. -- This message was sent by Atlassian Jira (v8.3.4#803005)