[jira] [Updated] (TEZ-4076) Add hadoop-cloud-storage jar to aws and azure mvn profiles
[ https://issues.apache.org/jira/browse/TEZ-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4076:
-------------------------
    Fix Version/s: 0.10.1

> Add hadoop-cloud-storage jar to aws and azure mvn profiles
> ----------------------------------------------------------
>
>                 Key: TEZ-4076
>                 URL: https://issues.apache.org/jira/browse/TEZ-4076
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>             Fix For: 0.10.1
>
>         Attachments: TEZ-4076.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would make sense to include the dependencies in the
> {{hadoop-cloud-storage}} jar file when choosing aws or azure profiles.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Assigned] (TEZ-4076) Add hadoop-cloud-storage jar to aws and azure mvn profiles
[ https://issues.apache.org/jira/browse/TEZ-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V reassigned TEZ-4076:
----------------------------
    Assignee: Jesus Camacho Rodriguez

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (TEZ-4076) Add hadoop-cloud-storage jar to aws and azure mvn profiles
[ https://issues.apache.org/jira/browse/TEZ-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882550#comment-16882550 ]

Gopal V commented on TEZ-4076:
------------------------------
LGTM - +1
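[Editor's sketch] The change discussed above amounts to adding the Hadoop cloud-storage aggregator artifact to the existing aws/azure Maven profiles. A minimal pom fragment showing the shape of that change; the profile id and the `${hadoop.version}` property are assumptions about the Tez pom layout, not taken from the patch:

```xml
<!-- Hypothetical sketch: activate with `mvn package -Paws` -->
<profile>
  <id>aws</id>
  <dependencies>
    <!-- Pulls in the cloud connector dependencies transitively -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-cloud-storage</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</profile>
```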
[jira] [Commented] (TEZ-4075) Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
[ https://issues.apache.org/jira/browse/TEZ-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868950#comment-16868950 ]

Gopal V commented on TEZ-4075:
------------------------------
The corresponding test-scenario for this is
https://github.com/apache/tez/blob/master/tez-tests/src/main/java/org/apache/tez/mapreduce/examples/BroadcastLoadGen.java

> Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
> -------------------------------------------------------------
>
>                 Key: TEZ-4075
>                 URL: https://issues.apache.org/jira/browse/TEZ-4075
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal V
>            Priority: Major
>
> This was factored out by TEZ-2196, which does skip buffers for 1-partition
> data exchanges (therefore goes to disk directly).
> {code}
> if (shufflePayload.hasData()) {
>   shuffleManager.addKnownInput(shufflePayload.getHost(),
>       shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
>   DataProto dataProto = shufflePayload.getData();
>   FetchedInput fetchedInput = inputAllocator.allocate(dataProto.getRawLength(),
>       dataProto.getCompressedLength(), srcAttemptIdentifier);
>   moveDataToFetchedInput(dataProto, fetchedInput, hostIdentifier);
>   shuffleManager.addCompletedInputWithData(srcAttemptIdentifier, fetchedInput);
> } else {
>   shuffleManager.addKnownInput(shufflePayload.getHost(),
>       shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
> }
> {code}
> got removed in
> https://github.com/apache/tez/commit/1ba1f927c16a1d7c273b6cd1a8553e5269d1541a
> It would be better to buffer up to the 512-byte limit for the event size
> before writing to disk, since creating a new file always incurs disk traffic,
> even if the file is eventually served out of the buffer cache.
> The total overhead of receiving an event, then firing an HTTP call to fetch
> the data etc. adds approx 100-150ms to a query - the data transfer through
> the event skips the disk entirely for this & also removes the extra IOPS
> incurred.
> This channel is not suitable for large-scale event transport, but
> specifically the workload here deals with 1-row control tables which consume
> more bandwidth with HTTP headers and hostnames than the 93 byte payload.
[jira] [Created] (TEZ-4075) Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
Gopal V created TEZ-4075:
-------------------------

             Summary: Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
                 Key: TEZ-4075
                 URL: https://issues.apache.org/jira/browse/TEZ-4075
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Gopal V
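[Editor's sketch] The proposal above is a size-based routing decision: a completed partition that fits inside the small event budget rides in the data-movement event itself, skipping the file create, the HTTP fetch, and the extra IOPS; anything larger takes the normal disk path. A self-contained sketch of just that decision; `EVENT_DATA_LIMIT` and `sendViaEvent` are illustrative names, not Tez APIs:

```java
import java.nio.charset.StandardCharsets;

public class SmallPayloadRouter {
    // Assumed event-size budget from the JIRA discussion (512 bytes).
    static final int EVENT_DATA_LIMIT = 512;

    /** True when the payload is small enough to ride inside the event. */
    static boolean sendViaEvent(byte[] payload) {
        return payload.length <= EVENT_DATA_LIMIT;
    }

    public static void main(String[] args) {
        // A tiny 1-row control-table style payload vs. a real partition
        byte[] controlRow = "k1\tv1".getBytes(StandardCharsets.UTF_8);
        byte[] bigPartition = new byte[64 * 1024];
        System.out.println(sendViaEvent(controlRow));   // event channel
        System.out.println(sendViaEvent(bigPartition)); // disk + HTTP fetch
    }
}
```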
[jira] [Commented] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16853602#comment-16853602 ]

Gopal V commented on TEZ-4073:
------------------------------
The HiveConf source annotation allows for a source-filtered version of
{{public static ByteString createByteStringFromConf(Configuration conf) throws IOException}}

> Configuration: Reduce Vertex and DAG Payload Size
> -------------------------------------------------
>
>                 Key: TEZ-4073
>                 URL: https://issues.apache.org/jira/browse/TEZ-4073
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal V
>            Priority: Major
>         Attachments: tez-am-protobuf-reading.png, tez-protobuf-writing.png
>
> As the total number of vertices goes up, the Tez protobuf transport starts to
> show up as a potential scalability problem for the task submission and the AM.
> {code}
> public TezTaskRunner2(Configuration tezConf, UserGroupInformation ugi, String[] localDirs,
> ...
>     this.taskConf = new Configuration(tezConf);
>     if (taskSpec.getTaskConf() != null) {
>       Iterator<Entry<String, String>> iter = taskSpec.getTaskConf().iterator();
>       while (iter.hasNext()) {
>         Entry<String, String> entry = iter.next();
>         taskConf.set(entry.getKey(), entry.getValue());
>       }
>     }
> {code}
> The TaskSpec getTaskConf() need not include any of the default configs, since
> the keys are placed into an existing task conf.
> {code}
> // Security framework already loaded the tokens into current ugi
> DAGProtos.ConfigurationProto confProto =
>     TezUtilsInternal.readUserSpecifiedTezConfiguration(System.getenv(Environment.PWD.name()));
> TezUtilsInternal.addUserSpecifiedTezConfiguration(defaultConf, confProto.getConfKeyValuesList());
> UserGroupInformation.setConfiguration(defaultConf);
> Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
> {code}
> At the very least, the DAG and Vertex do not both need to have the same
> configs repeated in them.
> !tez-protobuf-writing.png!
> +
> !tez-am-protobuf-reading.png!
[jira] [Commented] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16853529#comment-16853529 ]

Gopal V commented on TEZ-4073:
------------------------------
We can also differentiate between the configs set via HiveConf::initialize()
&& HiveConf::set() using the source variable.
[jira] [Commented] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16853527#comment-16853527 ]

Gopal V commented on TEZ-4073:
------------------------------
The trivial version of this is to skip all the configs which come from *.xml
files when writing this out, since all the XML files have already been parsed
+ sent for the AM init.
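[Editor's sketch] The "skip configs that come from *.xml" idea above can be sketched in a few lines. This is a standalone simulation: the `sources` map stands in for what Hadoop's `Configuration.getPropertySources()` reports per key, and `shouldShip` is a hypothetical helper, not a Tez or Hadoop API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfigSourceFilter {
    /** Skip keys whose every source is an already-shipped XML file. */
    static boolean shouldShip(String[] sources) {
        if (sources == null || sources.length == 0) {
            return true; // unknown provenance: ship it to be safe
        }
        for (String src : sources) {
            if (!src.endsWith(".xml")) {
                return true; // set programmatically (e.g. HiveConf::set): must ship
            }
        }
        return false; // came only from *.xml files already parsed at AM init
    }

    public static void main(String[] args) {
        Map<String, String[]> sources = new LinkedHashMap<>();
        sources.put("tez.am.resource.memory.mb", new String[] {"tez-site.xml"});
        sources.put("hive.query.id", new String[] {"programmatically"});
        for (Map.Entry<String, String[]> e : sources.entrySet()) {
            System.out.println(e.getKey() + " -> " + shouldShip(e.getValue()));
        }
    }
}
```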
[jira] [Comment Edited] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851545#comment-16851545 ]

Gopal V edited comment on TEZ-4073 at 5/30/19 6:07 AM:
-------------------------------------------------------
Async dispatcher CPU is mostly spent on the protobuf codepaths.

The AM side shows hotspots in places like
{code}
public VertexManagerPluginDescriptor build() {
  VertexManagerPluginDescriptor desc =
      VertexManagerPluginDescriptor.create(RootInputVertexManager.class.getName());
  try {
    return desc.setUserPayload(TezUtils.createUserPayloadFromConf(this.conf));
  } catch (IOException e) {
    throw new TezUncheckedException(e);
  }
}
{code}

was (Author: gopalv):
Async dispatcher CPU is mostly spent on the protobuf codepaths.
[jira] [Commented] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851545#comment-16851545 ]

Gopal V commented on TEZ-4073:
------------------------------
Async dispatcher CPU is mostly spent on the protobuf codepaths.
[jira] [Updated] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4073:
-------------------------
    Description:
        (issue text as quoted above, now ending with:)
        !tez-protobuf-writing.png!
        +
        !tez-am-protobuf-reading.png!
[jira] [Updated] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4073:
-------------------------
    Attachment: tez-am-protobuf-reading.png
[jira] [Updated] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
[ https://issues.apache.org/jira/browse/TEZ-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4073:
-------------------------
    Attachment: tez-protobuf-writing.png
[jira] [Created] (TEZ-4073) Configuration: Reduce Vertex and DAG Payload Size
Gopal V created TEZ-4073:
-------------------------

             Summary: Configuration: Reduce Vertex and DAG Payload Size
                 Key: TEZ-4073
                 URL: https://issues.apache.org/jira/browse/TEZ-4073
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Gopal V
[jira] [Commented] (TEZ-3310) Handle splits grouping better when locality information is not available (or only when localhost is available)
[ https://issues.apache.org/jira/browse/TEZ-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833956#comment-16833956 ]

Gopal V commented on TEZ-3310:
------------------------------
The reason "localhost" is scrubbed is because the RM was waiting for
"localhost" to heartbeat before allocating containers.

IIRC, this was because abfsStore.getAbfsConfiguration().getAzureBlockLocationHost()
defaulted to "localhost".

> Handle splits grouping better when locality information is not available (or
> only when localhost is available)
> ----------------------------------------------------------------------------
>
>                 Key: TEZ-3310
>                 URL: https://issues.apache.org/jira/browse/TEZ-3310
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Minor
>
> This is a follow-up JIRA to TEZ-3291. TEZ-3291 tries to handle the case when
> only localhost is specified in the locations. It would be good to improve the
> handling of splits grouping when Tez does not have enough information about
> the locality.
[jira] [Updated] (TEZ-4057) Fix Unsorted broadcast shuffle umasks
[ https://issues.apache.org/jira/browse/TEZ-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4057:
-------------------------
    Fix Version/s:     (was: 0.9.2)
                   0.9.3

> Fix Unsorted broadcast shuffle umasks
> -------------------------------------
>
>                 Key: TEZ-4057
>                 URL: https://issues.apache.org/jira/browse/TEZ-4057
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.9.2
>            Reporter: Gopal V
>            Assignee: Eric Wohlstadter
>            Priority: Major
>             Fix For: 0.10.1, 0.9.3
>
>         Attachments: TEZ-4057.1.patch
>
> {code}
> if (numPartitions == 1 && !pipelinedShuffle) {
>   //special case, where in only one partition is available.
>   finalOutPath = outputFileHandler.getOutputFileForWrite();
>   finalIndexPath = outputFileHandler.getOutputIndexFileForWrite(indexFileSizeEstimate);
>   skipBuffers = true;
>   writer = new IFile.Writer(conf, rfs, finalOutPath, keyClass, valClass,
>       codec, outputRecordsCounter, outputRecordBytesCounter);
> } else {
>   skipBuffers = false;
>   writer = null;
> }
> {code}
> The broadcast events don't update the file umasks, because they have 1
> partition.
> {code}
> total 8.0K
> -rw------- 1 hive hadoop 15 Mar 27 20:30 file.out
> -rw-r----- 1 hive hadoop 32 Mar 27 20:30 file.out.index
> {code}
> ending up with readable index files and unreadable .out files.
[jira] [Updated] (TEZ-4057) Fix Unsorted broadcast shuffle umasks
[ https://issues.apache.org/jira/browse/TEZ-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4057:
-------------------------
    Fix Version/s:     (was: 0.9.3)
                   0.9.2
[jira] [Updated] (TEZ-4057) Fix Unsorted broadcast shuffle umasks
[ https://issues.apache.org/jira/browse/TEZ-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-4057:
-------------------------
    Fix Version/s: 0.10.1
[jira] [Commented] (TEZ-4057) Fix Unsorted broadcast shuffle umasks
[ https://issues.apache.org/jira/browse/TEZ-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804265#comment-16804265 ]

Gopal V commented on TEZ-4057:
------------------------------
LGTM - +1
[jira] [Created] (TEZ-4057) Fix Unsorted broadcast shuffle umasks
Gopal V created TEZ-4057: Summary: Fix Unsorted broadcast shuffle umasks Key: TEZ-4057 URL: https://issues.apache.org/jira/browse/TEZ-4057 Project: Apache Tez Issue Type: Bug Affects Versions: 0.9.2 Reporter: Gopal V {code} if (numPartitions == 1 && !pipelinedShuffle) { //special case, where in only one partition is available. finalOutPath = outputFileHandler.getOutputFileForWrite(); finalIndexPath = outputFileHandler.getOutputIndexFileForWrite(indexFileSizeEstimate); skipBuffers = true; writer = new IFile.Writer(conf, rfs, finalOutPath, keyClass, valClass, codec, outputRecordsCounter, outputRecordBytesCounter); } else { skipBuffers = false; writer = null; } {code} The broadcast events don't update the file umasks, because they have 1 partition. {code} total 8.0K -rw------- 1 hive hadoop 15 Mar 27 20:30 file.out -rw-r----- 1 hive hadoop 32 Mar 27 20:30 file.out.index {code} ending up with readable index files and unreadable .out files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
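The bug above is that the single-partition fast path skips the umask fixup, leaving file.out owner-only while file.out.index is group-readable. A minimal, hypothetical sketch (not the actual TEZ-4057 patch) of aligning both files on the 0640 permissions the shuffle handler needs, using plain java.nio:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Illustrative helper: give the data file the same group-readable
// permissions (rw-r-----, 0640) the index file already gets, so the
// shuffle service user can read both. Class and method names are
// assumptions for the sketch, not Tez identifiers.
public class ShufflePermissions {
    static final Set<PosixFilePermission> SHUFFLE_PERMS =
        PosixFilePermissions.fromString("rw-r-----");   // 0640

    static void fixPermissions(Path outFile, Path indexFile) throws IOException {
        // POSIX filesystems only; applies 0640 to both shuffle files.
        Files.setPosixFilePermissions(outFile, SHUFFLE_PERMS);
        Files.setPosixFilePermissions(indexFile, SHUFFLE_PERMS);
    }
}
```

With this applied, an `ls -l` of the output directory would show matching `-rw-r-----` entries for both file.out and file.out.index.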
[jira] [Commented] (TEZ-4044) Zookeeper: exclude jline from Zookeeper client from tez dist
[ https://issues.apache.org/jira/browse/TEZ-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790803#comment-16790803 ] Gopal V commented on TEZ-4044: -- thanks [~jeagles]! > Zookeeper: exclude jline from Zookeeper client from tez dist > > > Key: TEZ-4044 > URL: https://issues.apache.org/jira/browse/TEZ-4044 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Fix For: 0.9.2, 0.10.1 > > Attachments: TEZ-4044.1.patch > > > {code} > [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.4.9:compile > [INFO] | | \- jline:jline:jar:0.9.94:compile > {code} > Breaks CLI clients further down the dependency tree. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-4044) Zookeeper: exclude jline from Zookeeper client from tez dist
[ https://issues.apache.org/jira/browse/TEZ-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790132#comment-16790132 ] Gopal V commented on TEZ-4044: -- bq. I wonder how this is failing for you Beeline stops working - because it uses jline too. bq. It's a simple change, but it seems more natural to me that the version of zookeeper would be provided by the environment and not Tez at all. ZK doesn't have a zk-client.jar as far as I know. I tried skipping all of Zookeeper, but that's got other knock-on effects on mini-cluster based tests. JLine is used by the ZK CLI and is not needed at all. > Zookeeper: exclude jline from Zookeeper client from tez dist > > > Key: TEZ-4044 > URL: https://issues.apache.org/jira/browse/TEZ-4044 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-4044.1.patch > > > {code} > [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.4.9:compile > [INFO] | | \- jline:jline:jar:0.9.94:compile > {code} > Breaks CLI clients further down the dependency tree. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-4048) Make proto history logger queue size configurable
[ https://issues.apache.org/jira/browse/TEZ-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779880#comment-16779880 ] Gopal V commented on TEZ-4048: -- The real problem is that a huge number of events being logged aren't significant to diagnostics - workload management in LLAP dequeuing tasks is showing up as KILLED tasks, which is why the queue is overflowing right now (& the output writer is writing to S3 right now, which is another slow point). > Make proto history logger queue size configurable > - > > Key: TEZ-4048 > URL: https://issues.apache.org/jira/browse/TEZ-4048 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.9.next >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Attachments: TEZ-4048.1.patch > > > Currently, the queue size is hard-coded to 10K which may be small for some > bigger cluster. Make it configurable and bump up the default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-4048) Make proto history logger queue size configurable
[ https://issues.apache.org/jira/browse/TEZ-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779879#comment-16779879 ] Gopal V commented on TEZ-4048: -- Since I'm overflowing 10k items right now easily (~200k items every 30s), I don't think OOM'ing AMs is a good idea here by accumulating indefinitely. A diagnostic plugin crashing queries is not a good thing by any means. > Make proto history logger queue size configurable > - > > Key: TEZ-4048 > URL: https://issues.apache.org/jira/browse/TEZ-4048 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.9.next >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Attachments: TEZ-4048.1.patch > > > Currently, the queue size is hard-coded to 10K which may be small for some > bigger cluster. Make it configurable and bump up the default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-4048) Make proto history logger queue size configurable
[ https://issues.apache.org/jira/browse/TEZ-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779833#comment-16779833 ] Gopal V commented on TEZ-4048: -- LGTM - +1 The events that overflow are just dropped right now - I suspect we'll end up with a survivor ratio like 2 queues (one for events that actually happened and one to hold all the "KILLED" container events). > Make proto history logger queue size configurable > - > > Key: TEZ-4048 > URL: https://issues.apache.org/jira/browse/TEZ-4048 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.9.next >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Attachments: TEZ-4048.1.patch > > > Currently, the queue size is hard-coded to 10K which may be small for some > bigger cluster. Make it configurable and bump up the default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
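The behaviour discussed in these TEZ-4048 comments — a bounded queue whose overflowing events are dropped rather than accumulated (which would OOM the AM) — can be sketched as follows. This is an illustrative model, not the actual Tez proto history logger code; all names here are assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Drop-on-overflow event queue: offer() never blocks the producer, so a
// slow writer (e.g. one flushing to S3) cannot stall the AM, and events
// that don't fit are counted and dropped instead of growing without bound.
public class BoundedEventQueue<T> {
    private final BlockingQueue<T> queue;
    private final AtomicLong dropped = new AtomicLong();

    public BoundedEventQueue(int capacity) {   // previously hard-coded to 10_000
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false (and counts a drop) when the queue is full. */
    public boolean log(T event) {
        boolean accepted = queue.offer(event); // non-blocking
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    public long droppedCount() { return dropped.get(); }

    public T poll() { return queue.poll(); }   // writer thread drains here
}
```

Making `capacity` configurable, as the patch proposes, lets bigger clusters trade memory for fewer dropped events.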
[jira] [Assigned] (TEZ-4044) Zookeeper: exclude jline from Zookeeper client from tez dist
[ https://issues.apache.org/jira/browse/TEZ-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-4044: Assignee: Gopal V > Zookeeper: exclude jline from Zookeeper client from tez dist > > > Key: TEZ-4044 > URL: https://issues.apache.org/jira/browse/TEZ-4044 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > > {code} > [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.4.9:compile > [INFO] | | \- jline:jline:jar:0.9.94:compile > {code} > Breaks CLI clients further down the dependency tree. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-4044) Zookeeper: exclude jline from Zookeeper client from tez dist
Gopal V created TEZ-4044: Summary: Zookeeper: exclude jline from Zookeeper client from tez dist Key: TEZ-4044 URL: https://issues.apache.org/jira/browse/TEZ-4044 Project: Apache Tez Issue Type: Bug Affects Versions: 0.10.0 Reporter: Gopal V {code} [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.4.9:compile [INFO] | | \- jline:jline:jar:0.9.94:compile {code} Breaks CLI clients further down the dependency tree. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
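For reference, a typical Maven exclusion for the jline transitive dependency shown in the tree above would look like this (illustrative pom fragment; coordinates taken from the dependency tree, placement in the actual Tez poms may differ):

```xml
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <version>3.4.9</version>
  <exclusions>
    <!-- jline is only needed by the ZK CLI; shipping it breaks other
         CLI clients (e.g. Beeline) further down the dependency tree -->
    <exclusion>
      <groupId>jline</groupId>
      <artifactId>jline</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```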
[jira] [Commented] (TEZ-3985) Correctness: Throw a clear exception for DMEs sent during cleanup
[ https://issues.apache.org/jira/browse/TEZ-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763925#comment-16763925 ] Gopal V commented on TEZ-3985: -- Fixed minor nits. > Correctness: Throw a clear exception for DMEs sent during cleanup > - > > Key: TEZ-3985 > URL: https://issues.apache.org/jira/browse/TEZ-3985 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3985.1.patch, TEZ-3985.2.patch, TEZ-3985.3.patch, > TEZ-3985.3.patch, TEZ-3985.4.patch > > > If a DME is sent during cleanup, that implies that the .close() of the > LogicalIOProcessorRuntimeTask did not succeed and therefore these events are > an error condition. > These events should not be sent and more importantly should be received by > the AM. > Throw a clear exception, in case of this & allow the developers to locate the > extraneous event from the backtrace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-4038) Add a /prof profiler endpoint like HiveServer2 has
[ https://issues.apache.org/jira/browse/TEZ-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-4038: Assignee: Gopal V > Add a /prof profiler endpoint like HiveServer2 has > -- > > Key: TEZ-4038 > URL: https://issues.apache.org/jira/browse/TEZ-4038 > Project: Apache Tez > Issue Type: New Feature >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > > HIVE-20202 makes it very easy to generate flamegraph profiles for Hive > queries. > Extending the same to Tez would be helpful (the profiler toolkit is already > Apache licensed). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-4038) Add a /prof profiler endpoint like HiveServer2 has
Gopal V created TEZ-4038: Summary: Add a /prof profiler endpoint like HiveServer2 has Key: TEZ-4038 URL: https://issues.apache.org/jira/browse/TEZ-4038 Project: Apache Tez Issue Type: New Feature Reporter: Gopal V HIVE-20202 makes it very easy to generate flamegraph profiles for Hive queries. Extending the same to Tez would be helpful (the profiler toolkit is already Apache licensed). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3985) Correctness: Throw a clear exception for DMEs sent during cleanup
[ https://issues.apache.org/jira/browse/TEZ-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3985: - Attachment: TEZ-3985.3.patch > Correctness: Throw a clear exception for DMEs sent during cleanup > - > > Key: TEZ-3985 > URL: https://issues.apache.org/jira/browse/TEZ-3985 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3985.1.patch, TEZ-3985.2.patch, TEZ-3985.3.patch, > TEZ-3985.3.patch > > > If a DME is sent during cleanup, that implies that the .close() of the > LogicalIOProcessorRuntimeTask did not succeed and therefore these events are > an error condition. > These events should not be sent and more importantly should be received by > the AM. > Throw a clear exception, in case of this & allow the developers to locate the > extraneous event from the backtrace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-4033) Add a -Pozone flag to include Ozone/HDDS
[ https://issues.apache.org/jira/browse/TEZ-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-4033: - Issue Type: New Feature (was: Bug) > Add a -Pozone flag to include Ozone/HDDS > - > > Key: TEZ-4033 > URL: https://issues.apache.org/jira/browse/TEZ-4033 > Project: Apache Tez > Issue Type: New Feature >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > > Add Ozone as an optional dependency to the Tez dist tarball. > https://hadoop.apache.org/ozone/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-4033) Add a -Pozone flag to include Ozone/HDDS
[ https://issues.apache.org/jira/browse/TEZ-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-4033: Assignee: Gopal V > Add a -Pozone flag to include Ozone/HDDS > - > > Key: TEZ-4033 > URL: https://issues.apache.org/jira/browse/TEZ-4033 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > > Add Ozone as an optional dependency to the Tez dist tarball. > https://hadoop.apache.org/ozone/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-4033) Add a -Pozone flag to include Ozone/HDDS
Gopal V created TEZ-4033: Summary: Add a -Pozone flag to include Ozone/HDDS Key: TEZ-4033 URL: https://issues.apache.org/jira/browse/TEZ-4033 Project: Apache Tez Issue Type: Bug Reporter: Gopal V Add Ozone as an optional dependency to the Tez dist tarball. https://hadoop.apache.org/ozone/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-4028) Events not visible from proto history logging for s3a filesystem until dag completes.
[ https://issues.apache.org/jira/browse/TEZ-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-4028: - Labels: history (was: ) > Events not visible from proto history logging for s3a filesystem until dag > completes. > - > > Key: TEZ-4028 > URL: https://issues.apache.org/jira/browse/TEZ-4028 > Project: Apache Tez > Issue Type: Bug >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash >Priority: Major > Labels: history > Fix For: 0.10.1 > > Attachments: TEZ-4028.01.patch, TEZ-4028.02.patch > > > The events are not visible in the files because s3 filesystem > * flush writes to local disk and only upload/commit to s3 on close. > * does not support append > As an initial fix we log the dag submitted, initialized and started events > into a file and these can be read to get the dag plan, config from the AM. > The counters are anyways not available until the dag completes. > The in-progress information cannot be read, this can be obtained from the AM > once we have the above events. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-4028) Events not visible from proto history logging for s3a filesystem until dag completes.
[ https://issues.apache.org/jira/browse/TEZ-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752776#comment-16752776 ] Gopal V commented on TEZ-4028: -- LGTM - +1 The DAG now gets a flush the moment it starts, so that the configs + plans are flushed early. > Events not visible from proto history logging for s3a filesystem until dag > completes. > - > > Key: TEZ-4028 > URL: https://issues.apache.org/jira/browse/TEZ-4028 > Project: Apache Tez > Issue Type: Bug >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash >Priority: Major > Attachments: TEZ-4028.01.patch, TEZ-4028.02.patch > > > The events are not visible in the files because s3 filesystem > * flush writes to local disk and only upload/commit to s3 on close. > * does not support append > As an initial fix we log the dag submitted, initialized and started events > into a file and these can be read to get the dag plan, config from the AM. > The counters are anyways not available until the dag completes. > The in-progress information cannot be read, this can be obtained from the AM > once we have the above events. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3957) Report TASK_DURATION_MILLIS as a Counter for completed tasks
[ https://issues.apache.org/jira/browse/TEZ-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711829#comment-16711829 ] Gopal V commented on TEZ-3957: -- LGTM - +1 This counter is aggregated for capacity planning models with container reuse. > Report TASK_DURATION_MILLIS as a Counter for completed tasks > > > Key: TEZ-3957 > URL: https://issues.apache.org/jira/browse/TEZ-3957 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3957.01.patch, TEZ-3957.02.patch, TEZ-3957.02.patch, > TEZ-3957.03.patch, TEZ-3957.patch > > > timeTaken is already being reported by {{TaskAttemptFinishedEvent}}, but not > as a Counter. > Combined with TEZ-3911, this provides min(timeTaken), max(timeTaken), > avg(timeTaken). > The value will be: {{finishTime - launchTime}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3957) Report TASK_DURATION_MILLIS as a Counter for completed tasks
[ https://issues.apache.org/jira/browse/TEZ-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677669#comment-16677669 ] Gopal V commented on TEZ-3957: -- {code} [ERROR] Process Exit Code: 1 [ERROR] ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called? [ERROR] Command was /bin/sh -c cd /testptch/tez/tez-runtime-internals && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -jar /testptch/tez/tez-runtime-internals/target/surefire/surefirebooter2594127067763002469.jar /testptch/tez/tez-runtime-internals/target/surefire 2018-11-07T04-40-29_190-jvmRun1 surefire5873371651625085212tmp surefire_91017432904624962456tmp [ERROR] Error occurred in starting fork, check output in log {code} Looks like openjdk8 isn't working on these machines? > Report TASK_DURATION_MILLIS as a Counter for completed tasks > > > Key: TEZ-3957 > URL: https://issues.apache.org/jira/browse/TEZ-3957 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3957.01.patch, TEZ-3957.02.patch, TEZ-3957.02.patch, > TEZ-3957.patch > > > timeTaken is already being reported by {{TaskAttemptFinishedEvent}}, but not > as a Counter. > Combined with TEZ-3911, this provides min(timeTaken), max(timeTaken), > avg(timeTaken). > The value will be: {{finishTime - launchTime}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3957) Report TASK_DURATION_MILLIS as a Counter for completed tasks
[ https://issues.apache.org/jira/browse/TEZ-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3957: - Attachment: TEZ-3957.02.patch > Report TASK_DURATION_MILLIS as a Counter for completed tasks > > > Key: TEZ-3957 > URL: https://issues.apache.org/jira/browse/TEZ-3957 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3957.01.patch, TEZ-3957.02.patch, TEZ-3957.02.patch, > TEZ-3957.patch > > > timeTaken is already being reported by {{TaskAttemptFinishedEvent}}, but not > as a Counter. > Combined with TEZ-3911, this provides min(timeTaken), max(timeTaken), > avg(timeTaken). > The value will be: {{finishTime - launchTime}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3957) Report TASK_DURATION_MILLIS as a Counter for completed tasks
[ https://issues.apache.org/jira/browse/TEZ-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677425#comment-16677425 ] Gopal V commented on TEZ-3957: -- [~sershe]: can you fix the NS_TO_MS with the java time conversions? MapReduce would not need a similar counter, the container lifetimes can be used as a proxy for this in wall-clock seconds, but with Tez we're moved down from the second to the millisecond time-frame where the ntpd slew would be visible. > Report TASK_DURATION_MILLIS as a Counter for completed tasks > > > Key: TEZ-3957 > URL: https://issues.apache.org/jira/browse/TEZ-3957 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3957.01.patch, TEZ-3957.patch > > > timeTaken is already being reported by {{TaskAttemptFinishedEvent}}, but not > as a Counter. > Combined with TEZ-3911, this provides min(timeTaken), max(timeTaken), > avg(timeTaken). > The value will be: {{finishTime - launchTime}} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
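The review suggestion above — replacing a hand-rolled NS_TO_MS divisor with the standard Java time conversions — could look like this minimal sketch (method and class names are illustrative, not from the patch):

```java
import java.util.concurrent.TimeUnit;

// Sketch of the review comment: derive the millisecond counter value via
// TimeUnit rather than a hand-written nanosecond-to-millisecond constant.
public class TaskDuration {
    static long durationMillis(long launchTimeNanos, long finishTimeNanos) {
        // TimeUnit handles the unit conversion; the counter value itself
        // is still finishTime - launchTime, as the issue describes.
        return TimeUnit.NANOSECONDS.toMillis(finishTimeNanos - launchTimeNanos);
    }
}
```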
[jira] [Updated] (TEZ-3976) Batch ShuffleManager error report events
[ https://issues.apache.org/jira/browse/TEZ-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3976: - Fix Version/s: 0.10.1 > Batch ShuffleManager error report events > > > Key: TEZ-3976 > URL: https://issues.apache.org/jira/browse/TEZ-3976 > Project: Apache Tez > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Fix For: 0.10.1 > > Attachments: TEZ-3976.1.patch, TEZ-3976.2.patch, TEZ-3976.3.patch, > TEZ-3976.4.patch, TEZ-3976.5.patch, TEZ-3976.6.patch, TEZ-3976.7.patch, > TEZ-3976.8.patch, TEZ-3976.9.patch > > > The symptoms are a lot of these logs are being shown: > {code:java} > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #0 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0], connectFailed: true > 2018-06-15T18:09:35,811 WARN [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.Fetcher: copyInputs failed for > tasks [InputAttemptIdentifier [inputIdentifier=589, attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]] > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0], connectFailed: true > {code} > Each of those 
translate into an event in the AM which finally crashes due to > OOM after around 30 minutes and around 10 million shuffle input errors (and > 10 million lines like the previous ones). When the ShufflerManager is closed > and the counters reported there are many shuffle input errors, some of those > logs are: > {code:java} > 2018-06-15T17:46:30,988 INFO [TezTR-441963_21_34_4_0_4 > (152901963_0021_34_04_00_4)] runtime.LogicalIOProcessorRuntimeTask: > Final Counters for attempt_152901963_0021_34_04_00_4: Counters: 43 > [[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=0, > NUM_SHUFFLED_INPUTS=26, NUM_FAILED_SHUFFLE_INPUTS=858965, > INPUT_RECORDS_PROCESSED=26, OUTPUT_RECORDS=1, OUTPUT_LARGE_RECORDS=0, > OUTPUT_BYTES=779472, OUTPUT_BYTES_WITH_OVERHEAD=779483, > OUTPUT_BYTES_PHYSICAL=780146, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILL_COUNT=0, > SHUFFLE_BYTES=4207563, SHUFFLE_BYTES_DECOMPRESSED=20266603, > SHUFFLE_BYTES_TO_MEM=3380616, SHUFFLE_BYTES_TO_DISK=0, > SHUFFLE_BYTES_DISK_DIRECT=826947, SHUFFLE_PHASE_TIME=52516, > FIRST_EVENT_RECEIVED=1, LAST_EVENT_RECEIVED=1185][HIVE > RECORDS_OUT_INTERMEDIATE_^[[1;35;40m^[[KReducer_12^[[m^[[K=1, > RECORDS_OUT_OPERATOR_GBY_159=1, > RECORDS_OUT_OPERATOR_RS_160=1][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_INPUT_Map_11 > FIRST_EVENT_RECEIVED=1, INPUT_RECORDS_PROCESSED=26, > LAST_EVENT_RECEIVED=1185, NUM_FAILED_SHUFFLE_INPUTS=858965, > NUM_SHUFFLED_INPUTS=26, SHUFFLE_BYTES=4207563, > SHUFFLE_BYTES_DECOMPRESSED=20266603, SHUFFLE_BYTES_DISK_DIRECT=826947, > SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_TO_MEM=3380616, > SHUFFLE_PHASE_TIME=52516][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_OUTPUT_Map_1 > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILL_COUNT=0, OUTPUT_BYTES=779472, OUTPUT_BYTES_PHYSICAL=780146, > OUTPUT_BYTES_WITH_OVERHEAD=779483, OUTPUT_LARGE_RECORDS=0, OUTPUT_RECORDS=1, > SPILLED_RECORDS=0]] > 2018-06-15T17:46:32,271 INFO 
[TezTR-441963_21_34_3_15_1 ()] > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Final Counters for > attempt_152901963_0021_34_03_15_1: Counters: 87 [[File System > Counters FILE_BYTES_READ=0, FILE_BYTES_WRITTEN=0, FILE_READ_OPS=0, > FILE_LARGE_READ_OPS=0, FILE_WRITE_OPS=0, HDFS_BYTES_READ=2344929, > HDFS_BYTES_WRITTEN=0, HDFS_READ_OPS=5, HDFS_LARGE_READ_OPS=0, > HDFS_WRITE_OPS=0][org.apache.tez.common.counters.TaskCounter > SPILLED_RECORDS=0, NUM_SHUFFLED_INPUTS=1, NUM_FAILED_SHUFFLE_INPUTS=105195, > INPUT_RECORDS_PROCESSED=397, INPUT_SPLIT_LENGTH_BYTES=21563271, > OUTPUT_RECORDS=15737, OUTPUT_LARGE_RECORDS=0, OUTPUT_BYTES=1235818, > OUTPUT_BYTES_WITH_OVERHEAD=1267307,
[jira] [Updated] (TEZ-3976) Batch ShuffleManager error report events
[ https://issues.apache.org/jira/browse/TEZ-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3976: - Summary: Batch ShuffleManager error report events (was: ShuffleManager reporting too many errors) > Batch ShuffleManager error report events > > > Key: TEZ-3976 > URL: https://issues.apache.org/jira/browse/TEZ-3976 > Project: Apache Tez > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3976.1.patch, TEZ-3976.2.patch, TEZ-3976.3.patch, > TEZ-3976.4.patch, TEZ-3976.5.patch, TEZ-3976.6.patch, TEZ-3976.7.patch, > TEZ-3976.8.patch, TEZ-3976.9.patch > > > The symptoms are a lot of these logs are being shown: > {code:java} > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #0 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0], connectFailed: true > 2018-06-15T18:09:35,811 WARN [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.Fetcher: copyInputs failed for > tasks [InputAttemptIdentifier [inputIdentifier=589, attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]] > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > 
spillId=0], connectFailed: true > {code} > Each of those translate into an event in the AM which finally crashes due to > OOM after around 30 minutes and around 10 million shuffle input errors (and > 10 million lines like the previous ones). When the ShufflerManager is closed > and the counters reported there are many shuffle input errors, some of those > logs are: > {code:java} > 2018-06-15T17:46:30,988 INFO [TezTR-441963_21_34_4_0_4 > (152901963_0021_34_04_00_4)] runtime.LogicalIOProcessorRuntimeTask: > Final Counters for attempt_152901963_0021_34_04_00_4: Counters: 43 > [[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=0, > NUM_SHUFFLED_INPUTS=26, NUM_FAILED_SHUFFLE_INPUTS=858965, > INPUT_RECORDS_PROCESSED=26, OUTPUT_RECORDS=1, OUTPUT_LARGE_RECORDS=0, > OUTPUT_BYTES=779472, OUTPUT_BYTES_WITH_OVERHEAD=779483, > OUTPUT_BYTES_PHYSICAL=780146, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILL_COUNT=0, > SHUFFLE_BYTES=4207563, SHUFFLE_BYTES_DECOMPRESSED=20266603, > SHUFFLE_BYTES_TO_MEM=3380616, SHUFFLE_BYTES_TO_DISK=0, > SHUFFLE_BYTES_DISK_DIRECT=826947, SHUFFLE_PHASE_TIME=52516, > FIRST_EVENT_RECEIVED=1, LAST_EVENT_RECEIVED=1185][HIVE > RECORDS_OUT_INTERMEDIATE_^[[1;35;40m^[[KReducer_12^[[m^[[K=1, > RECORDS_OUT_OPERATOR_GBY_159=1, > RECORDS_OUT_OPERATOR_RS_160=1][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_INPUT_Map_11 > FIRST_EVENT_RECEIVED=1, INPUT_RECORDS_PROCESSED=26, > LAST_EVENT_RECEIVED=1185, NUM_FAILED_SHUFFLE_INPUTS=858965, > NUM_SHUFFLED_INPUTS=26, SHUFFLE_BYTES=4207563, > SHUFFLE_BYTES_DECOMPRESSED=20266603, SHUFFLE_BYTES_DISK_DIRECT=826947, > SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_TO_MEM=3380616, > SHUFFLE_PHASE_TIME=52516][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_OUTPUT_Map_1 > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILL_COUNT=0, OUTPUT_BYTES=779472, OUTPUT_BYTES_PHYSICAL=780146, > OUTPUT_BYTES_WITH_OVERHEAD=779483, OUTPUT_LARGE_RECORDS=0, OUTPUT_RECORDS=1, > 
SPILLED_RECORDS=0]] > 2018-06-15T17:46:32,271 INFO [TezTR-441963_21_34_3_15_1 ()] > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Final Counters for > attempt_152901963_0021_34_03_15_1: Counters: 87 [[File System > Counters FILE_BYTES_READ=0, FILE_BYTES_WRITTEN=0, FILE_READ_OPS=0, > FILE_LARGE_READ_OPS=0, FILE_WRITE_OPS=0, HDFS_BYTES_READ=2344929, > HDFS_BYTES_WRITTEN=0, HDFS_READ_OPS=5, HDFS_LARGE_READ_OPS=0, > HDFS_WRITE_OPS=0][org.apache.tez.common.counters.TaskCounter > SPILLED_RECORDS=0, NUM_SHUFFLED_INPUTS=1, NUM_FAILED_SHUFFLE_INPUTS=105195, > INPUT_RECORDS_PROCESSED=397, INPUT_SPLIT_LENGTH_BYTES=21563271, > OUTPUT_RECORDS=15737, OUTPUT_LARGE_RECORDS=0, OUTPUT_BYTES=1235818, >
[jira] [Commented] (TEZ-3976) ShuffleManager reporting too many errors
[ https://issues.apache.org/jira/browse/TEZ-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661392#comment-16661392 ] Gopal V commented on TEZ-3976: -- LGTM - +1 > ShuffleManager reporting too many errors > > > Key: TEZ-3976 > URL: https://issues.apache.org/jira/browse/TEZ-3976 > Project: Apache Tez > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3976.1.patch, TEZ-3976.2.patch, TEZ-3976.3.patch, > TEZ-3976.4.patch, TEZ-3976.5.patch, TEZ-3976.6.patch, TEZ-3976.7.patch, > TEZ-3976.8.patch, TEZ-3976.9.patch > > > The symptoms are a lot of these logs are being shown: > {code:java} > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #0 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0], connectFailed: true > 2018-06-15T18:09:35,811 WARN [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.Fetcher: copyInputs failed for > tasks [InputAttemptIdentifier [inputIdentifier=589, attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]] > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0], connectFailed: true > {code} > Each of those 
translate into an event in the AM which finally crashes due to > OOM after around 30 minutes and around 10 million shuffle input errors (and > 10 million lines like the previous ones). When the ShufflerManager is closed > and the counters reported there are many shuffle input errors, some of those > logs are: > {code:java} > 2018-06-15T17:46:30,988 INFO [TezTR-441963_21_34_4_0_4 > (152901963_0021_34_04_00_4)] runtime.LogicalIOProcessorRuntimeTask: > Final Counters for attempt_152901963_0021_34_04_00_4: Counters: 43 > [[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=0, > NUM_SHUFFLED_INPUTS=26, NUM_FAILED_SHUFFLE_INPUTS=858965, > INPUT_RECORDS_PROCESSED=26, OUTPUT_RECORDS=1, OUTPUT_LARGE_RECORDS=0, > OUTPUT_BYTES=779472, OUTPUT_BYTES_WITH_OVERHEAD=779483, > OUTPUT_BYTES_PHYSICAL=780146, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILL_COUNT=0, > SHUFFLE_BYTES=4207563, SHUFFLE_BYTES_DECOMPRESSED=20266603, > SHUFFLE_BYTES_TO_MEM=3380616, SHUFFLE_BYTES_TO_DISK=0, > SHUFFLE_BYTES_DISK_DIRECT=826947, SHUFFLE_PHASE_TIME=52516, > FIRST_EVENT_RECEIVED=1, LAST_EVENT_RECEIVED=1185][HIVE > RECORDS_OUT_INTERMEDIATE_^[[1;35;40m^[[KReducer_12^[[m^[[K=1, > RECORDS_OUT_OPERATOR_GBY_159=1, > RECORDS_OUT_OPERATOR_RS_160=1][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_INPUT_Map_11 > FIRST_EVENT_RECEIVED=1, INPUT_RECORDS_PROCESSED=26, > LAST_EVENT_RECEIVED=1185, NUM_FAILED_SHUFFLE_INPUTS=858965, > NUM_SHUFFLED_INPUTS=26, SHUFFLE_BYTES=4207563, > SHUFFLE_BYTES_DECOMPRESSED=20266603, SHUFFLE_BYTES_DISK_DIRECT=826947, > SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_TO_MEM=3380616, > SHUFFLE_PHASE_TIME=52516][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_OUTPUT_Map_1 > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILL_COUNT=0, OUTPUT_BYTES=779472, OUTPUT_BYTES_PHYSICAL=780146, > OUTPUT_BYTES_WITH_OVERHEAD=779483, OUTPUT_LARGE_RECORDS=0, OUTPUT_RECORDS=1, > SPILLED_RECORDS=0]] > 2018-06-15T17:46:32,271 INFO 
[TezTR-441963_21_34_3_15_1 ()] > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Final Counters for > attempt_152901963_0021_34_03_15_1: Counters: 87 [[File System > Counters FILE_BYTES_READ=0, FILE_BYTES_WRITTEN=0, FILE_READ_OPS=0, > FILE_LARGE_READ_OPS=0, FILE_WRITE_OPS=0, HDFS_BYTES_READ=2344929, > HDFS_BYTES_WRITTEN=0, HDFS_READ_OPS=5, HDFS_LARGE_READ_OPS=0, > HDFS_WRITE_OPS=0][org.apache.tez.common.counters.TaskCounter > SPILLED_RECORDS=0, NUM_SHUFFLED_INPUTS=1, NUM_FAILED_SHUFFLE_INPUTS=105195, > INPUT_RECORDS_PROCESSED=397, INPUT_SPLIT_LENGTH_BYTES=21563271, > OUTPUT_RECORDS=15737, OUTPUT_LARGE_RECORDS=0, OUTPUT_BYTES=1235818, > OUTPUT_BYTES_WITH_OVERHEAD=1267307, OUTPUT_BYTES_PHYSICAL=357520,
[jira] [Commented] (TEZ-3976) ShuffleManager reporting too many errors
[ https://issues.apache.org/jira/browse/TEZ-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657617#comment-16657617 ] Gopal V commented on TEZ-3976: -- [~jmarhuen]: I'm fine with this patch (+1), except for one detail - the InputReadErrorEvent equals() & hashCode should not use the numFailures to reference it. {code} + private final HashMap failedEvents + = new HashMap<>() {code} I know you cannot modify numFailures there, but it is just adding noise to the hashtable because it is always the same value (& go into the lower bits of the hash function). The lookup should match whether you have 1 failure or two - currently it assumes you're always merging +1, which is true, but I've debugged too many hash function skews this week to let this one go. > ShuffleManager reporting too many errors > > > Key: TEZ-3976 > URL: https://issues.apache.org/jira/browse/TEZ-3976 > Project: Apache Tez > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3976.1.patch, TEZ-3976.2.patch, TEZ-3976.3.patch, > TEZ-3976.4.patch, TEZ-3976.5.patch, TEZ-3976.6.patch, TEZ-3976.7.patch > > > The symptoms are a lot of these logs are being shown: > {code:java} > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #0 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0], connectFailed: true > 2018-06-15T18:09:35,811 WARN [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.Fetcher: copyInputs failed for > tasks [InputAttemptIdentifier [inputIdentifier=589, attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, 
spillType=2, > spillId=0]] > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0], connectFailed: true > {code} > Each of those translates into an event in the AM which finally crashes due to > OOM after around 30 minutes and around 10 million shuffle input errors (and > 10 million lines like the previous ones). When the ShuffleManager is closed > and the counters are reported, there are many shuffle input errors; some of those > logs are: > {code:java} > 2018-06-15T17:46:30,988 INFO [TezTR-441963_21_34_4_0_4 > (152901963_0021_34_04_00_4)] runtime.LogicalIOProcessorRuntimeTask: > Final Counters for attempt_152901963_0021_34_04_00_4: Counters: 43 > [[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=0, > NUM_SHUFFLED_INPUTS=26, NUM_FAILED_SHUFFLE_INPUTS=858965, > INPUT_RECORDS_PROCESSED=26, OUTPUT_RECORDS=1, OUTPUT_LARGE_RECORDS=0, > OUTPUT_BYTES=779472, OUTPUT_BYTES_WITH_OVERHEAD=779483, > OUTPUT_BYTES_PHYSICAL=780146, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILL_COUNT=0, > SHUFFLE_BYTES=4207563, SHUFFLE_BYTES_DECOMPRESSED=20266603, > SHUFFLE_BYTES_TO_MEM=3380616, SHUFFLE_BYTES_TO_DISK=0, > SHUFFLE_BYTES_DISK_DIRECT=826947, SHUFFLE_PHASE_TIME=52516, > FIRST_EVENT_RECEIVED=1, LAST_EVENT_RECEIVED=1185][HIVE > RECORDS_OUT_INTERMEDIATE_Reducer_12=1, > RECORDS_OUT_OPERATOR_GBY_159=1, > RECORDS_OUT_OPERATOR_RS_160=1][TaskCounter_Reducer_12_INPUT_Map_11 > FIRST_EVENT_RECEIVED=1, INPUT_RECORDS_PROCESSED=26, > LAST_EVENT_RECEIVED=1185, 
NUM_FAILED_SHUFFLE_INPUTS=858965, > NUM_SHUFFLED_INPUTS=26, SHUFFLE_BYTES=4207563, > SHUFFLE_BYTES_DECOMPRESSED=20266603, SHUFFLE_BYTES_DISK_DIRECT=826947, > SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_TO_MEM=3380616, > SHUFFLE_PHASE_TIME=52516][TaskCounter_Reducer_12_OUTPUT_Map_1 > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILL_COUNT=0, OUTPUT_BYTES=779472, OUTPUT_BYTES_PHYSICAL=780146, > OUTPUT_BYTES_WITH_OVERHEAD=779483, OUTPUT_LARGE_RECORDS=0, OUTPUT_RECORDS=1, > SPILLED_RECORDS=0]] > 2018-06-15T17:46:32,271 INFO [TezTR-441963_21_34_3_15_1 ()] > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Final Counters for > attempt_152901963_0021_34_03_15_1: Counters: 87
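The review note above, keeping {{numFailures}} out of equals()/hashCode() so that repeated failures for the same input land in the same hash bucket, can be sketched as follows. This is a simplified stand-in class, not the actual InputReadErrorEvent; the field names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified stand-in for InputReadErrorEvent: the mutable failure count is
// deliberately excluded from equals()/hashCode(), so a lookup in the
// failedEvents map matches whether the event carries 1 failure or 100.
public class ReadErrorKeySketch {
    static final class ErrorKey {
        final int inputIndex;
        final int attemptNumber;
        int numFailures; // running counter only; not part of identity

        ErrorKey(int inputIndex, int attemptNumber, int numFailures) {
            this.inputIndex = inputIndex;
            this.attemptNumber = attemptNumber;
            this.numFailures = numFailures;
        }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ErrorKey)) return false;
            ErrorKey k = (ErrorKey) o;
            return inputIndex == k.inputIndex && attemptNumber == k.attemptNumber;
        }

        @Override public int hashCode() {
            // only the immutable identity fields feed the hash
            return Objects.hash(inputIndex, attemptNumber);
        }
    }

    public static void main(String[] args) {
        Map<ErrorKey, Integer> failedEvents = new HashMap<>();
        failedEvents.merge(new ErrorKey(7, 0, 1), 1, Integer::sum);
        // Same input/attempt with a different failure count still merges
        // into the same entry instead of adding hash-table noise.
        failedEvents.merge(new ErrorKey(7, 0, 99), 1, Integer::sum);
        System.out.println(failedEvents.size() + " entry, count="
            + failedEvents.get(new ErrorKey(7, 0, 0)));
    }
}
```

Including a value that is always the same (or always off by one) in the hash only pushes identical low bits into every bucket computation, which is the skew being objected to.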
[jira] [Commented] (TEZ-4010) Use InputReadyVertexManager for broadcast connection
[ https://issues.apache.org/jira/browse/TEZ-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654147#comment-16654147 ] Gopal V commented on TEZ-4010: -- [~jeagles]: you are right - the scenario in TEZ-3274 is related, but this particular case applies to the scenario at the end of the condition (in the else). {code} // Intended order of picking a vertex manager // If there is an InputInitializer then we use the RootInputVertexManager. May be fixed by TEZ-703 // If there is a custom edge we fall back to default ImmediateStartVertexManager // If there is a one to one edge then we use the InputReadyVertexManager // If there is a scatter-gather edge then we use the ShuffleVertexManager // Else we use the default ImmediateStartVertexManager {code} For this specific case there are no explicit flags set for broadcast vertex manager (i.e it is currently handled by the catch-all else statement). That did give us a performance improvement in the past for very short running queries (where the broadcast would start & finish before the join side ended up spinning up YARN containers), so not waiting for the broadcast to start asking for the YARN containers was an effective speedup. I even have slides talking about this as a feature https://www.slideshare.net/t3rmin4t0r/performance-hive/34 > Use InputReadyVertexManager for broadcast connection > > > Key: TEZ-4010 > URL: https://issues.apache.org/jira/browse/TEZ-4010 > Project: Apache Tez > Issue Type: Improvement >Reporter: Hitesh Sharma >Priority: Minor > > As per > [VertexImpl::assignVertexManager|https://github.com/apache/tez/blob/39d76a656216d4843908279ef8eaa29a4cc83104/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2677] > an ImmediateStartVertexManager is used for broadcast connections. This seems > to be inefficient as tasks are started way before they can run in the DAG. > Thoughts on using InputReadyVertexManager for broadcast connections? 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
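The quoted pick order amounts to a cascade like the following. This is an illustrative sketch only; the real logic lives in VertexImpl.assignVertexManager, and the EdgeKind enum and string results here are simplifications rather than Tez APIs.

```java
import java.util.EnumSet;

// Illustrative sketch of the vertex-manager selection order quoted above.
public class VertexManagerPickSketch {
    enum EdgeKind { ONE_TO_ONE, SCATTER_GATHER, CUSTOM, BROADCAST }

    static String pick(boolean hasInputInitializer, EnumSet<EdgeKind> inEdges) {
        if (hasInputInitializer) {
            return "RootInputVertexManager";      // may be fixed by TEZ-703
        } else if (inEdges.contains(EdgeKind.CUSTOM)) {
            return "ImmediateStartVertexManager"; // custom edge fallback
        } else if (inEdges.contains(EdgeKind.ONE_TO_ONE)) {
            return "InputReadyVertexManager";
        } else if (inEdges.contains(EdgeKind.SCATTER_GATHER)) {
            return "ShuffleVertexManager";
        } else {
            // Broadcast-only vertices have no explicit flag of their own and
            // fall through to this catch-all, which is the case in question.
            return "ImmediateStartVertexManager";
        }
    }

    public static void main(String[] args) {
        System.out.println(pick(false, EnumSet.of(EdgeKind.BROADCAST)));
    }
}
```

Running the broadcast-only case shows it landing in the catch-all, i.e. starting tasks immediately rather than waiting for the broadcast input to be ready.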
[jira] [Commented] (TEZ-4010) Use InputReadyVertexManager for broadcast connection
[ https://issues.apache.org/jira/browse/TEZ-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653140#comment-16653140 ] Gopal V commented on TEZ-4010: -- bq. This seems to be inefficient as tasks are started way before they can run in the DAG. This made sense a long time ago when the BI queries need to start the YARN containers ahead of the broadcast tasks (the YARN task spinups take between 3-10 seconds). At this point (with LLAP replacing YARN containers for BI), I'd say that this is an appendix to be removed and replaced with the InputReadyVertexManager instead. > Use InputReadyVertexManager for broadcast connection > > > Key: TEZ-4010 > URL: https://issues.apache.org/jira/browse/TEZ-4010 > Project: Apache Tez > Issue Type: Improvement >Reporter: Hitesh Sharma >Priority: Minor > > As per > [VertexImpl::assignVertexManager|https://github.com/apache/tez/blob/39d76a656216d4843908279ef8eaa29a4cc83104/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2677] > an ImmediateStartVertexManager is used for broadcast connections. This seems > to be inefficient as tasks are started way before they can run in the DAG. > Thoughts on using InputReadyVertexManager for broadcast connections? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3991) Unmanaged tez sessions
[ https://issues.apache.org/jira/browse/TEZ-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3991: - Labels: Kubernetes (was: ) > Unmanaged tez sessions > -- > > Key: TEZ-3991 > URL: https://issues.apache.org/jira/browse/TEZ-3991 > Project: Apache Tez > Issue Type: New Feature >Affects Versions: 0.10.0 >Reporter: Prasanth Jayachandran >Assignee: Eric Wohlstadter >Priority: Major > Labels: Kubernetes > > Provide an option for launching tez AM in unmanaged mode. In unmanaged mode, > tez AMs can register itself with Zookeeper which clients (like HiveServer2) > can discover via zk registry client. > HiveServer2 currently manages the lifecycle of tez AMs. The unmanaged mode > will let AM come up on their own (can be via simple java launcher) and be > discoverable for others. > Example use case for this is, HiveServer2 can discover already running AMs > and can attach to it for DAG submission and detach when done executing > queries. AMs can similarly discover LLAP daemons via task scheduler plugin > for submitting tasks. > A mode to cut off interactions with RM will also useful since for LLAP no > on-demand containers are required. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-4003) Add gop...@apache.org to KEYS file
[ https://issues.apache.org/jira/browse/TEZ-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-4003: - Attachment: TEZ-4003.patch > Add gop...@apache.org to KEYS file > -- > > Key: TEZ-4003 > URL: https://issues.apache.org/jira/browse/TEZ-4003 > Project: Apache Tez > Issue Type: Task >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Attachments: TEZ-4003.patch > > > {code} > -END PGP PUBLIC KEY BLOCK- > pub rsa4096 2018-09-20 [SC] > 6CFAA64865AD19C55C5662680C5267F97FBEC4F9 > uid [ultimate] Gopal Vijayaraghavan (CODE SIGNING KEY) > > sig 30C5267F97FBEC4F9 2018-09-20 Gopal Vijayaraghavan (CODE SIGNING > KEY) > sub rsa4096 2018-09-20 [E] > sig 0C5267F97FBEC4F9 2018-09-20 Gopal Vijayaraghavan (CODE SIGNING > KEY) > -BEGIN PGP PUBLIC KEY BLOCK- > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-4003) Add gop...@apache.org to KEYS file
[ https://issues.apache.org/jira/browse/TEZ-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-4003: Assignee: Gopal V > Add gop...@apache.org to KEYS file > -- > > Key: TEZ-4003 > URL: https://issues.apache.org/jira/browse/TEZ-4003 > Project: Apache Tez > Issue Type: Task >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Attachments: TEZ-4003.patch > > > {code} > -END PGP PUBLIC KEY BLOCK- > pub rsa4096 2018-09-20 [SC] > 6CFAA64865AD19C55C5662680C5267F97FBEC4F9 > uid [ultimate] Gopal Vijayaraghavan (CODE SIGNING KEY) > > sig 30C5267F97FBEC4F9 2018-09-20 Gopal Vijayaraghavan (CODE SIGNING > KEY) > sub rsa4096 2018-09-20 [E] > sig 0C5267F97FBEC4F9 2018-09-20 Gopal Vijayaraghavan (CODE SIGNING > KEY) > -BEGIN PGP PUBLIC KEY BLOCK- > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-4003) Add gop...@apache.org to KEYS file
Gopal V created TEZ-4003: Summary: Add gop...@apache.org to KEYS file Key: TEZ-4003 URL: https://issues.apache.org/jira/browse/TEZ-4003 Project: Apache Tez Issue Type: Task Reporter: Gopal V Attachments: TEZ-4003.patch {code} -----END PGP PUBLIC KEY BLOCK----- pub rsa4096 2018-09-20 [SC] 6CFAA64865AD19C55C5662680C5267F97FBEC4F9 uid [ultimate] Gopal Vijayaraghavan (CODE SIGNING KEY) sig 30C5267F97FBEC4F9 2018-09-20 Gopal Vijayaraghavan (CODE SIGNING KEY) sub rsa4096 2018-09-20 [E] sig 0C5267F97FBEC4F9 2018-09-20 Gopal Vijayaraghavan (CODE SIGNING KEY) -----BEGIN PGP PUBLIC KEY BLOCK----- {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3888) Update Jetty to org.eclipse.jetty 9.x
[ https://issues.apache.org/jira/browse/TEZ-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642412#comment-16642412 ] Gopal V commented on TEZ-3888: -- Sure, standardizing the versions for ABI makes sense - I think HIVE-19421 moved HS2 ahead of where you are. > Update Jetty to org.eclipse.jetty 9.x > - > > Key: TEZ-3888 > URL: https://issues.apache.org/jira/browse/TEZ-3888 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Eric Wohlstadter >Priority: Major > Fix For: 0.9.2 > > Attachments: TEZ-3888.1.patch > > > mortbay Jetty 6 is no longer supported and has multiple CVEs. > Tez can't be used in scenarios where compliance against vulnerability > scanning tools is required. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3888) Update Jetty to org.eclipse.jetty 9.x
[ https://issues.apache.org/jira/browse/TEZ-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642405#comment-16642405 ] Gopal V commented on TEZ-3888: -- bq. this is causing NoSuchMethod errors in HiveServer2. Which version is this? Since the public interface for HS2 goes over HTTP, CVE-2009-1523 needs to go back in. > Update Jetty to org.eclipse.jetty 9.x > - > > Key: TEZ-3888 > URL: https://issues.apache.org/jira/browse/TEZ-3888 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Eric Wohlstadter >Priority: Major > Fix For: 0.9.2 > > Attachments: TEZ-3888.1.patch > > > mortbay Jetty 6 is no longer supported and has multiple CVEs. > Tez can't be used in scenarios where compliance against vulnerability > scanning tools is required. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
[ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3984: Assignee: Jaume M (was: Gopal V) > Shuffle: Out of Band DME event sending causes errors > > > Key: TEZ-3984 > URL: https://issues.apache.org/jira/browse/TEZ-3984 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.4, 0.9.1, 0.10.0 >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Labels: correctness > Attachments: TEZ-3984.1.patch, TEZ-3984.2.patch, TEZ-3984.3.patch, > TEZ-3984.4.patch, TEZ-3984.5.patch, TEZ-3984.5.patch > > > In case of a task Input throwing an exception, the outputs are also closed in > the LogicalIOProcessorRuntimeTask.cleanup(). > Cleanup ignore all the events returned by output close, however if any output > tries to send an event out of band by directly calling > outputContext.sendEvents(events), then those events can reach the AM before > the task failure is reported. > This can cause correctness issues with shuffle since zero sized events can be > sent out due to an input failure and downstream tasks may never reattempt a > fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
[ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3984: - Attachment: TEZ-3984.5.patch > Shuffle: Out of Band DME event sending causes errors > > > Key: TEZ-3984 > URL: https://issues.apache.org/jira/browse/TEZ-3984 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.4, 0.9.1, 0.10.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Labels: correctness > Attachments: TEZ-3984.1.patch, TEZ-3984.2.patch, TEZ-3984.3.patch, > TEZ-3984.4.patch, TEZ-3984.5.patch, TEZ-3984.5.patch > > > In case of a task Input throwing an exception, the outputs are also closed in > the LogicalIOProcessorRuntimeTask.cleanup(). > Cleanup ignore all the events returned by output close, however if any output > tries to send an event out of band by directly calling > outputContext.sendEvents(events), then those events can reach the AM before > the task failure is reported. > This can cause correctness issues with shuffle since zero sized events can be > sent out due to an input failure and downstream tasks may never reattempt a > fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
[ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3984: Assignee: Gopal V (was: Jaume M) > Shuffle: Out of Band DME event sending causes errors > > > Key: TEZ-3984 > URL: https://issues.apache.org/jira/browse/TEZ-3984 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.4, 0.9.1, 0.10.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Labels: correctness > Attachments: TEZ-3984.1.patch, TEZ-3984.2.patch, TEZ-3984.3.patch, > TEZ-3984.4.patch, TEZ-3984.5.patch > > > In case of a task Input throwing an exception, the outputs are also closed in > the LogicalIOProcessorRuntimeTask.cleanup(). > Cleanup ignore all the events returned by output close, however if any output > tries to send an event out of band by directly calling > outputContext.sendEvents(events), then those events can reach the AM before > the task failure is reported. > This can cause correctness issues with shuffle since zero sized events can be > sent out due to an input failure and downstream tasks may never reattempt a > fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
[ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596751#comment-16596751 ] Gopal V commented on TEZ-3984: -- The patch looks good - minor NIT, the OrderedPartitionedKVOutput event lists need to be prefixed. The sorter events need to be inserted at 0, not appended (for event order related issues - which doesn't exist today, because it is likely to be no events in generateEvents but is neater to see them in order in the AM). > Shuffle: Out of Band DME event sending causes errors > > > Key: TEZ-3984 > URL: https://issues.apache.org/jira/browse/TEZ-3984 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.4, 0.9.1, 0.10.0 >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Labels: correctness > Attachments: TEZ-3984.1.patch > > > In case of a task Input throwing an exception, the outputs are also closed in > the LogicalIOProcessorRuntimeTask.cleanup(). > Cleanup ignore all the events returned by output close, however if any output > tries to send an event out of band by directly calling > outputContext.sendEvents(events), then those events can reach the AM before > the task failure is reported. > This can cause correctness issues with shuffle since zero sized events can be > sent out due to an input failure and downstream tasks may never reattempt a > fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
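The ordering nit above (sorter spill events going ahead of anything produced by generateEvents) boils down to an addAll at index 0 rather than an append, roughly as below. The event names are placeholders, not real Tez event classes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the event-ordering fix: insert the sorter's spill events at the
// front of the outgoing list instead of appending them, so the AM sees them
// in order even if generateEvents() ever starts returning events.
public class EventOrderSketch {
    static List<String> ordered(List<String> generated, List<String> sorterEvents) {
        List<String> out = new ArrayList<>(generated);
        out.addAll(0, sorterEvents); // prefix, not append
        return out;
    }

    public static void main(String[] args) {
        List<String> out = ordered(
            List.of("VertexManagerEvent(generateEvents)"),
            List.of("DataMovementEvent(spill=0)", "DataMovementEvent(spill=1)"));
        System.out.println(out); // sorter events first, then generated ones
    }
}
```

Today the generated list is likely empty, so this is cosmetic, but it keeps the AM-side ordering well-defined.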
[jira] [Commented] (TEZ-3985) Correctness: Throw a clear exception for DMEs sent during cleanup
[ https://issues.apache.org/jira/browse/TEZ-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595259#comment-16595259 ] Gopal V commented on TEZ-3985: -- Interestingly, this does not get logged at all - because the cleanup ignores all exceptions (however, the correctness issue is resolved). This needs to instead log something explicitly like "Cowardly not sending events during cleanup", new Throwable() or something like that. > Correctness: Throw a clear exception for DMEs sent during cleanup > - > > Key: TEZ-3985 > URL: https://issues.apache.org/jira/browse/TEZ-3985 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3985.1.patch > > > If a DME is sent during cleanup, that implies that the .close() of the > LogicalIOProcessorRuntimeTask did not succeed and therefore these events are > an error condition. > These events should not be sent and more importantly should be received by > the AM. > Throw a clear exception, in case of this & allow the developers to locate the > extraneous event from the backtrace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
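The explicit "cowardly" log suggested above, with a Throwable so the offending sender shows up in a backtrace, could look roughly like this. It uses java.util.logging as a stand-in for Tez's logger, and the class and method names are illustrative, not the actual Tez API.

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the proposed guard: during cleanup, drop outgoing events and log
// loudly with a Throwable so the extraneous sender is visible in a backtrace
// instead of being silently swallowed by cleanup's catch-all.
public class CleanupEventGuard {
    private static final Logger LOG =
        Logger.getLogger(CleanupEventGuard.class.getName());

    /** Returns true when the events were dropped because we are in cleanup. */
    static boolean dropIfInCleanup(boolean inCleanup, List<String> events) {
        if (inCleanup && !events.isEmpty()) {
            LOG.log(Level.WARNING,
                "Cowardly not sending " + events.size() + " event(s) during cleanup",
                new Throwable("event source backtrace"));
            return true; // never forward these to the AM
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(dropIfInCleanup(true, List.of("DME")));  // dropped
        System.out.println(dropIfInCleanup(false, List.of("DME"))); // sent normally
    }
}
```

Logging with a Throwable (rather than throwing) keeps cleanup from masking the message while still pinpointing the call site.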
[jira] [Updated] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
[ https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3958: - Attachment: TEZ-3958.5.patch > Add internal vertex priority information into the tez dag.dot debug > information > --- > > Key: TEZ-3958 > URL: https://issues.apache.org/jira/browse/TEZ-3958 > Project: Apache Tez > Issue Type: Improvement >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3958.1.patch, TEZ-3958.2.patch, TEZ-3958.3.patch, > TEZ-3958.4.patch, TEZ-3958.5.patch > > > Adding the actual vertex priority as computed by Tez into the debug dag.dot > file would allow the debugging of task pre-emption issues when the DAG is no > longer a tree. > There are pre-emption issues with isomerization of Tez DAGs, where an > R-isomer dag with mirror rotation runs at a different speed than the L-isomer > dag, due to priorities at the same level changing due to the vertex-id order. > Since the problem is hard to debug through, it would be good to record the > computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
[ https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3958: Assignee: Jaume M (was: Gopal V) > Add internal vertex priority information into the tez dag.dot debug > information > --- > > Key: TEZ-3958 > URL: https://issues.apache.org/jira/browse/TEZ-3958 > Project: Apache Tez > Issue Type: Improvement >Reporter: Gopal V >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3958.1.patch, TEZ-3958.2.patch, TEZ-3958.3.patch, > TEZ-3958.4.patch, TEZ-3958.5.patch > > > Adding the actual vertex priority as computed by Tez into the debug dag.dot > file would allows the debugging of task pre-emption issues when the DAG is no > longer a tree. > There are pre-emption issues with isomerization of Tez DAGs, where the a > R-isomer dag with mirror rotation runs at a different speed than the L-isomer > dag, due to priorities at the same level changing due to the vertex-id order. > Since the problem is hard to debug through, it would be good to record the > computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
[ https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3958: Assignee: Gopal V (was: Jaume M) > Add internal vertex priority information into the tez dag.dot debug > information > --- > > Key: TEZ-3958 > URL: https://issues.apache.org/jira/browse/TEZ-3958 > Project: Apache Tez > Issue Type: Improvement >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3958.1.patch, TEZ-3958.2.patch, TEZ-3958.3.patch, > TEZ-3958.4.patch > > > Adding the actual vertex priority as computed by Tez into the debug dag.dot > file would allows the debugging of task pre-emption issues when the DAG is no > longer a tree. > There are pre-emption issues with isomerization of Tez DAGs, where the a > R-isomer dag with mirror rotation runs at a different speed than the L-isomer > dag, due to priorities at the same level changing due to the vertex-id order. > Since the problem is hard to debug through, it would be good to record the > computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3985) Correctness: Throw a clear exception for DMEs sent during cleanup
Gopal V created TEZ-3985: Summary: Correctness: Throw a clear exception for DMEs sent during cleanup Key: TEZ-3985 URL: https://issues.apache.org/jira/browse/TEZ-3985 Project: Apache Tez Issue Type: Bug Reporter: Gopal V If a DME is sent during cleanup, that implies that the .close() of the LogicalIOProcessorRuntimeTask did not succeed and therefore these events are an error condition. These events should not be sent and, more importantly, should not be received by the AM. Throw a clear exception in this case & allow the developers to locate the extraneous event from the backtrace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
[ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594304#comment-16594304 ] Gopal V commented on TEZ-3984: -- Specific sequence of events is - input throws exception. {code} 2018-08-27T17:25:15,579 WARN [TezTR-437616_7273_9_0_0_0 (1520459437616_7273_9_00_00_0)] runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when closing input calls(cleanup). Exception class=java.io.IOException, message ... {code} Output gets closed for memory recovery {code} 2018-08-27T17:25:15,579 INFO [TezTR-437616_7273_9_0_0_0 (1520459437616_7273_9_00_00_0)] impl.PipelinedSorter: Reducer 2: Starting flush of map output {code} Sorter pushes event to the output context directly {code} 2018-08-27T17:25:15,990 INFO [TezTR-437616_7273_9_0_0_0 (1520459437616_7273_9_00_00_0)] impl.PipelinedSorter: Reducer 2: Adding spill event for spill (final update=true), spillId=0 {code} And the Reducer 2 gets the event routed to it. > Shuffle: Out of Band DME event sending causes errors > > > Key: TEZ-3984 > URL: https://issues.apache.org/jira/browse/TEZ-3984 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.4, 0.9.1, 0.10.0 >Reporter: Gopal V >Priority: Critical > Labels: correctness > > In case of a task Input throwing an exception, the outputs are also closed in > the LogicalIOProcessorRuntimeTask.cleanup(). > Cleanup ignore all the events returned by output close, however if any output > tries to send an event out of band by directly calling > outputContext.sendEvents(events), then those events can reach the AM before > the task failure is reported. > This can cause correctness issues with shuffle since zero sized events can be > sent out due to an input failure and downstream tasks may never reattempt a > fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
[ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3984: - Labels: correctness (was: ) > Shuffle: Out of Band DME event sending causes errors > > > Key: TEZ-3984 > URL: https://issues.apache.org/jira/browse/TEZ-3984 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.4, 0.9.1, 0.10.0 >Reporter: Gopal V >Priority: Critical > Labels: correctness > > In case of a task Input throwing an exception, the outputs are also closed in > the LogicalIOProcessorRuntimeTask.cleanup(). > Cleanup ignore all the events returned by output close, however if any output > tries to send an event out of band by directly calling > outputContext.sendEvents(events), then those events can reach the AM before > the task failure is reported. > This can cause correctness issues with shuffle since zero sized events can be > sent out due to an input failure and downstream tasks may never reattempt a > fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors
Gopal V created TEZ-3984: Summary: Shuffle: Out of Band DME event sending causes errors Key: TEZ-3984 URL: https://issues.apache.org/jira/browse/TEZ-3984 Project: Apache Tez Issue Type: Bug Affects Versions: 0.9.1, 0.8.4, 0.10.0 Reporter: Gopal V In case of a task Input throwing an exception, the outputs are also closed in the LogicalIOProcessorRuntimeTask.cleanup(). Cleanup ignores all the events returned by output close; however, if any output tries to send an event out of band by directly calling outputContext.sendEvents(events), then those events can reach the AM before the task failure is reported. This can cause correctness issues with shuffle since zero-sized events can be sent out due to an input failure and downstream tasks may never reattempt a fetch from the valid attempt. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
[ https://issues.apache.org/jira/browse/TEZ-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582757#comment-16582757 ] Gopal V commented on TEZ-3980: -- Testing issues with LLAP task pre-emption. When reducers doing the unsorted shuffle join (or the bloom filter semi-join) are pre-empted, they leave behind a shuffle runner thread. After 32k threads leak, this fails with a "cannot create thread" in some other random IPC thread. > ShuffleRunner: the wake loop needs to check for shutdown > > > Key: TEZ-3980 > URL: https://issues.apache.org/jira/browse/TEZ-3980 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3980.1.patch > > > In the ShuffleRunner threads, there's a loop which does not terminate if the > task threads get killed. > {code} > while ((runningFetchers.size() >= numFetchers || > pendingHosts.isEmpty()) > && numCompletedInputs.get() < numInputs) { > inputContext.notifyProgress(); > boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); > } > {code} > The wakeLoop signal does not exit this out of the loop and is missing a break > for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
[ https://issues.apache.org/jira/browse/TEZ-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582752#comment-16582752 ] Gopal V commented on TEZ-3980: -- The shufflescheduler has a check for shutdown.get() + a break inside the loop (also uses thread wait). This is a shufflemanager only bug right now. > ShuffleRunner: the wake loop needs to check for shutdown > > > Key: TEZ-3980 > URL: https://issues.apache.org/jira/browse/TEZ-3980 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3980.1.patch > > > In the ShuffleRunner threads, there's a loop which does not terminate if the > task threads get killed. > {code} > while ((runningFetchers.size() >= numFetchers || > pendingHosts.isEmpty()) > && numCompletedInputs.get() < numInputs) { > inputContext.notifyProgress(); > boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); > } > {code} > The wakeLoop signal does not exit this out of the loop and is missing a break > for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
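Applying the same shutdown.get() + break pattern from the ShuffleScheduler to the quoted ShuffleManager loop might look like the following self-contained sketch. The lock, condition, and method names are stand-ins for the real fields, and the "fetchers still pending" condition is hard-wired to true so the missing check is the only way out.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the wake loop with the missing shutdown check added. Without the
// !shutdown.get() test and the break, a killed task would leave this thread
// spinning (and, across 32k kills, exhaust the ability to create threads).
public class WakeLoopSketch {
    static boolean runLoop(AtomicBoolean shutdown,
                           ReentrantLock lock, Condition wakeLoop) {
        lock.lock();
        try {
            while (!shutdown.get() && fetchersStillPending()) {
                // inputContext.notifyProgress() would go here in the real code
                try {
                    wakeLoop.await(50, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // treat interruption like a shutdown request
                }
                if (shutdown.get()) {
                    break; // task was killed; stop instead of leaking the thread
                }
            }
        } finally {
            lock.unlock();
        }
        // In this sketch work never completes, so reaching here means one of
        // the shutdown checks fired.
        return true;
    }

    static boolean fetchersStillPending() { return true; }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean shutdown = new AtomicBoolean(false);
        ReentrantLock lock = new ReentrantLock();
        Condition wakeLoop = lock.newCondition();

        // Simulate a task kill arriving from another thread after ~100 ms.
        Thread killer = new Thread(() -> {
            try { Thread.sleep(100); } catch (InterruptedException ignored) {}
            shutdown.set(true);
        });
        killer.start();
        System.out.println("exited=" + runLoop(shutdown, lock, wakeLoop));
        killer.join();
    }
}
```

The timed await already wakes the loop periodically; the fix is only that each wakeup (and the loop condition itself) must consult the shutdown flag.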
[jira] [Updated] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
[ https://issues.apache.org/jira/browse/TEZ-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3980: - Attachment: (was: TEZ-3980.1.patch) > ShuffleRunner: the wake loop needs to check for shutdown > > > Key: TEZ-3980 > URL: https://issues.apache.org/jira/browse/TEZ-3980 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3980.1.patch > > > In the ShuffleRunner threads, there's a loop which does not terminate if the > task threads get killed. > {code} > while ((runningFetchers.size() >= numFetchers || > pendingHosts.isEmpty()) > && numCompletedInputs.get() < numInputs) { > inputContext.notifyProgress(); > boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); > } > {code} > The wakeLoop signal does not exit this out of the loop and is missing a break > for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
[ https://issues.apache.org/jira/browse/TEZ-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3980: - Attachment: TEZ-3980.1.patch > ShuffleRunner: the wake loop needs to check for shutdown > > > Key: TEZ-3980 > URL: https://issues.apache.org/jira/browse/TEZ-3980 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3980.1.patch > > > In the ShuffleRunner threads, there's a loop which does not terminate if the > task threads get killed. > {code} > while ((runningFetchers.size() >= numFetchers || > pendingHosts.isEmpty()) > && numCompletedInputs.get() < numInputs) { > inputContext.notifyProgress(); > boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); > } > {code} > The wakeLoop signal does not exit this out of the loop and is missing a break > for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
[ https://issues.apache.org/jira/browse/TEZ-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3980: - Attachment: TEZ-3980.1.patch > ShuffleRunner: the wake loop needs to check for shutdown > > > Key: TEZ-3980 > URL: https://issues.apache.org/jira/browse/TEZ-3980 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: TEZ-3980.1.patch > > > In the ShuffleRunner threads, there's a loop which does not terminate if the > task threads get killed. > {code} > while ((runningFetchers.size() >= numFetchers || > pendingHosts.isEmpty()) > && numCompletedInputs.get() < numInputs) { > inputContext.notifyProgress(); > boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); > } > {code} > The wakeLoop signal does not exit this out of the loop and is missing a break > for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
Gopal V created TEZ-3980: Summary: ShuffleRunner: the wake loop needs to check for shutdown Key: TEZ-3980 URL: https://issues.apache.org/jira/browse/TEZ-3980 Project: Apache Tez Issue Type: Bug Reporter: Gopal V In the ShuffleRunner threads, there's a loop which does not terminate if the task threads get killed. {code} while ((runningFetchers.size() >= numFetchers || pendingHosts.isEmpty()) && numCompletedInputs.get() < numInputs) { inputContext.notifyProgress(); boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); } {code} The wakeLoop signal does not exit this out of the loop and is missing a break for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3980) ShuffleRunner: the wake loop needs to check for shutdown
[ https://issues.apache.org/jira/browse/TEZ-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3980: Assignee: Gopal V > ShuffleRunner: the wake loop needs to check for shutdown > > > Key: TEZ-3980 > URL: https://issues.apache.org/jira/browse/TEZ-3980 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > > In the ShuffleRunner threads, there's a loop which does not terminate if the > task threads get killed. > {code} > while ((runningFetchers.size() >= numFetchers || > pendingHosts.isEmpty()) > && numCompletedInputs.get() < numInputs) { > inputContext.notifyProgress(); > boolean ret = wakeLoop.await(1000, TimeUnit.MILLISECONDS); > } > {code} > The wakeLoop signal does not exit this out of the loop and is missing a break > for shut-down. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3974: - Fix Version/s: 0.10.0 > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.1 >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Fix For: 0.10.0 > > Attachments: TEZ-3974.1.patch, TEZ-3974.2.patch, TEZ-3974.3.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. 
> So if processor.close() throws an exception, this introduces a race condition > and if the AM is too fast, we end up with correctness issues. > This was originally reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
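[Editor's note] One way to reason about the TEZ-3974 race is that a failed attempt should not let its never-started outputs publish empty close events at all, since those events can escape obsoletion in the AM and permanently mark the fetch SKIPPED downstream. The toy model below simulates that guard; all names are hypothetical and this is not the actual TEZ-3974 patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: an unstarted output reports an "empty" event on close; the
// guard suppresses event publication once the attempt has failed, while
// still calling close() so resources are released.
public class CloseEventSketch {
    static class Output {
        final boolean started;
        Output(boolean started) { this.started = started; }
        List<String> close() {
            return started ? List.of("DATA_MOVEMENT") : List.of("EMPTY");
        }
    }

    static List<String> closeAll(List<Output> outputs, boolean attemptFailed) {
        List<String> sent = new ArrayList<>();
        for (Output o : outputs) {
            List<String> events = o.close();  // always close, to free resources
            if (!attemptFailed) {
                sent.addAll(events);          // publish only from healthy attempts
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        List<Output> outputs = List.of(new Output(false), new Output(true));
        if (!closeAll(outputs, true).isEmpty())
            throw new AssertionError("failed attempt must not publish close events");
        if (closeAll(outputs, false).size() != 2)
            throw new AssertionError("healthy attempt publishes all close events");
        System.out.println("ok");
    }
}
```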
[jira] [Updated] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3974: - Affects Version/s: 0.9.1 > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.1 >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Fix For: 0.10.0 > > Attachments: TEZ-3974.1.patch, TEZ-3974.2.patch, TEZ-3974.3.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. 
> So if processor.close() throws an exception, this introduces a race condition > and if the AM is too fast, we end up with correctness issues. > This was originally reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575211#comment-16575211 ] Gopal V commented on TEZ-3974: -- I see - https://builds.apache.org/job/PreCommit-TEZ-Build/2871/ has kicked off for this. > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Attachments: TEZ-3974.1.patch, TEZ-3974.2.patch, TEZ-3974.3.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. 
> So if processor.close() throws an exception, this introduces a race condition > and if the AM is too fast, we end up with correctness issues. > This was originally reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575203#comment-16575203 ] Gopal V commented on TEZ-3974: -- Reuploading to force QE runs. > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Attachments: TEZ-3974.1.patch, TEZ-3974.2.patch, TEZ-3974.3.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. 
> So if processor.close() throws an exception, this introduces a race condition > and if the AM is too fast, we end up with correctness issues. > This was originally sent by Atlassian JIRA -- reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3974: Assignee: Gopal V (was: Jaume M) > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Attachments: TEZ-3974.1.patch, TEZ-3974.2.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. > So if processor.close() throws an exception, this introduce a race condition > and if the AM is too fast, we end up with correctness issues. 
> This was originally reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3976) ShuffleManager reporting too many errors
[ https://issues.apache.org/jira/browse/TEZ-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572034#comment-16572034 ] Gopal V commented on TEZ-3976: -- bq. Normally this scenario does not occur because one of the following triggers first: That is what happens with ShuffleScheduler (the one that matches the MapReduce shuffle). ShuffleManager is the other codepath which is used for unsorted shuffle joins. > ShuffleManager reporting too many errors > > > Key: TEZ-3976 > URL: https://issues.apache.org/jira/browse/TEZ-3976 > Project: Apache Tez > Issue Type: Bug >Reporter: Jaume M >Priority: Major > > The symptoms are a lot of these logs are being shown: > {code:java} > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #0 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=701, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000701_0_12541_0, spillType=2, > spillId=0], connectFailed: true > 2018-06-15T18:09:35,811 WARN [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.Fetcher: copyInputs failed for > tasks [InputAttemptIdentifier [inputIdentifier=589, attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]] > 2018-06-15T18:09:35,811 INFO [Fetcher_B {Reducer_5} #1 ()] > org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager: Reducer_5: > Fetch failed for src: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0]InputIdentifier: InputAttemptIdentifier [inputIdentifier=589, > attemptNumber=0, > pathComponent=attempt_152901963_0021_34_01_000589_0_12445_0, spillType=2, > spillId=0], 
connectFailed: true > {code} > Each of those translate into an event in the AM which finally crashes due to > OOM after around 30 minutes and around 10 million shuffle input errors (and > 10 million lines like the previous ones). When the ShufflerManager is closed > and the counters reported there are many shuffle input errors, some of those > logs are: > {code:java} > 2018-06-15T17:46:30,988 INFO [TezTR-441963_21_34_4_0_4 > (152901963_0021_34_04_00_4)] runtime.LogicalIOProcessorRuntimeTask: > Final Counters for attempt_152901963_0021_34_04_00_4: Counters: 43 > [[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=0, > NUM_SHUFFLED_INPUTS=26, NUM_FAILED_SHUFFLE_INPUTS=858965, > INPUT_RECORDS_PROCESSED=26, OUTPUT_RECORDS=1, OUTPUT_LARGE_RECORDS=0, > OUTPUT_BYTES=779472, OUTPUT_BYTES_WITH_OVERHEAD=779483, > OUTPUT_BYTES_PHYSICAL=780146, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILL_COUNT=0, > SHUFFLE_BYTES=4207563, SHUFFLE_BYTES_DECOMPRESSED=20266603, > SHUFFLE_BYTES_TO_MEM=3380616, SHUFFLE_BYTES_TO_DISK=0, > SHUFFLE_BYTES_DISK_DIRECT=826947, SHUFFLE_PHASE_TIME=52516, > FIRST_EVENT_RECEIVED=1, LAST_EVENT_RECEIVED=1185][HIVE > RECORDS_OUT_INTERMEDIATE_^[[1;35;40m^[[KReducer_12^[[m^[[K=1, > RECORDS_OUT_OPERATOR_GBY_159=1, > RECORDS_OUT_OPERATOR_RS_160=1][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_INPUT_Map_11 > FIRST_EVENT_RECEIVED=1, INPUT_RECORDS_PROCESSED=26, > LAST_EVENT_RECEIVED=1185, NUM_FAILED_SHUFFLE_INPUTS=858965, > NUM_SHUFFLED_INPUTS=26, SHUFFLE_BYTES=4207563, > SHUFFLE_BYTES_DECOMPRESSED=20266603, SHUFFLE_BYTES_DISK_DIRECT=826947, > SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_TO_MEM=3380616, > SHUFFLE_PHASE_TIME=52516][TaskCounter_^[[1;35;40m^[[KReducer_12^[[m^[[K_OUTPUT_Map_1 > ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, > ADDITIONAL_SPILL_COUNT=0, OUTPUT_BYTES=779472, OUTPUT_BYTES_PHYSICAL=780146, > OUTPUT_BYTES_WITH_OVERHEAD=779483, OUTPUT_LARGE_RECORDS=0, OUTPUT_RECORDS=1, > 
SPILLED_RECORDS=0]] > 2018-06-15T17:46:32,271 INFO [TezTR-441963_21_34_3_15_1 ()] > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Final Counters for > attempt_152901963_0021_34_03_15_1: Counters: 87 [[File System > Counters FILE_BYTES_READ=0, FILE_BYTES_WRITTEN=0, FILE_READ_OPS=0, > FILE_LARGE_READ_OPS=0, FILE_WRITE_OPS=0, HDFS_BYTES_READ=2344929, > HDFS_BYTES_WRITTEN=0, HDFS_READ_OPS=5, HDFS_LARGE_READ_OPS=0, > HDFS_WRITE_OPS=0][org.apache.tez.common.counters.TaskCounter > SPILLED_RECORDS=0, NUM_SHUFFLED_INPUTS=1, NUM_FAILED_SHUFFLE_INPUTS=105195, > INPUT_RECORDS_PROCESSED=397, INPUT_SPLIT_LENGTH_BYTES=21563271, > OUTPUT_RECORDS=15737, OUTPUT_LARGE_RECORDS=0, OUTPUT_BYTES=1235818, > OUTPUT_BYTES_WITH_OVERHEAD=1267307,
[jira] [Assigned] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3974: Assignee: Jaume M > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Critical > Attachments: TEZ-3974.1.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. > So if processor.close() throws an exception, this introduce a race condition > and if the AM is too fast, we end up with correctness issues. 
> This was originally reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3974) Tez: Correctness regression of TEZ-955 in TEZ-2937
[ https://issues.apache.org/jira/browse/TEZ-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568979#comment-16568979 ] Gopal V commented on TEZ-3974: -- LGTM - +1 [~jaume]: can you give brief note about testing, because this is an issue which doesn't break when running in a mini-cluster? > Tez: Correctness regression of TEZ-955 in TEZ-2937 > -- > > Key: TEZ-3974 > URL: https://issues.apache.org/jira/browse/TEZ-3974 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Priority: Critical > Attachments: TEZ-3974.1.patch > > > TEZ-2937 might have introduced a race condition for Tez output events, along > with TEZ-2237 > {code} > // Close the Outputs. > for (OutputSpec outputSpec : outputSpecs) { > String destVertexName = outputSpec.getDestinationVertexName(); > initializedOutputs.remove(destVertexName); > List closeOutputEvents = > ((LogicalOutputFrameworkInterface)outputsMap.get(destVertexName)).close(); > sendTaskGeneratedEvents(closeOutputEvents, > EventProducerConsumerType.OUTPUT, taskSpec.getVertexName(), > destVertexName, taskSpec.getTaskAttemptID()); > } > // Close the Processor. > processorClosed = true; > processor.close(); > {code} > As part of TEZ-2237, the outputs send empty events when the output is closed > without being started (which happens in task init failures). > These events are obsoleted when a task fails and this happens in the AM, but > not before the dispatcher looks at them. > Depending on the timing, the empty events can escape obsoletion & be sent to > a downstream task. > This gets marked as a SKIPPED event in the downstream task, which means that > further obsoletion events sent to the downstream task is ignored (because a > zero byte fetch is not repeated on node failure). > So the downstream task can exit without actually waiting for the retry of the > failed task and cause silent dataloss in case where the retry succeeds in > another attempt. 
> So if processor.close() throws an exception, this introduces a race condition > and if the AM is too fast, we end up with correctness issues. > This was originally reported in TEZ-955 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (TEZ-3971) Incorrect query result in hive when hive.convert.join.bucket.mapjoin.tez=true
[ https://issues.apache.org/jira/browse/TEZ-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V resolved TEZ-3971. -- Resolution: Won't Do > Incorrect query result in hive when hive.convert.join.bucket.mapjoin.tez=true > - > > Key: TEZ-3971 > URL: https://issues.apache.org/jira/browse/TEZ-3971 > Project: Apache Tez > Issue Type: Bug > Environment: We are using Hive 3, Hadoop 3.1 and Tez 0.91 >Reporter: Karthik >Priority: Major > Attachments: extended_explain.txt > > > When hive.convert.join.bucket.mapjoin.tez=true and bucketed column is in > select clause but not in where clause, hive is performing a bucket map join > and returning incorrect results. When the bucketed column is removed from > select clause or hive.convert.join.bucket.mapjoin.tez=false, returned query > results are correct. > > create table my_fact(AMT decimal(20,3),bucket_col string ,join_col string ) > PARTITIONED BY (FISCAL_YEAR string ,ACCOUNTING_PERIOD string ) > CLUSTERED BY (bucket_col) INTO 10 > BUCKETS > stored as ORC > ; > create table my_dim(join_col string,filter_col string) stored as orc; > After populating and analyzing above tables, explain plan looks as below > when hive.convert.join.bucket.mapjoin.tez=TRUE: > > explain select T4.join_col as account1,my_fact.accounting_period > FROM my_fact JOIN my_dim T4 ON my_fact.join_col = T4.join_col > WHERE my_fact.fiscal_year = '2015' > AND T4.filter_col IN ( 'VAL1', 'VAL2' ) > and my_fact.accounting_period in (10); > Vertex dependency in root stage > Map 1 <- Map 2 (CUSTOM_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_24] > Select Operator [SEL_23] (rows=15282589 width=291) > Output:["_col0","_col1","_col2"] > Map Join Operator [MAPJOIN_22] (rows=15282589 width=291) > > *BucketMapJoin*:true,Conds:SEL_21._col1=RS_19._col0(Inner),Output:["_col0","_col3","_col4"] > <-Map 2 [CUSTOM_EDGE] vectorized, llap > MULTICAST [RS_19] > PartitionCols:_col0 > Select 
Operator [SEL_18] (rows=818 width=186) > Output:["_col0"] > Filter Operator [FIL_17] (rows=818 width=186) > predicate:((filter_col) IN ('VAL1', 'VAL2') and join_col is not null) > TableScan [TS_3] (rows=1635 width=186) > default@my_dim,t4,Tbl:COMPLETE,Col:NONE,Output:["join_col","filter_col"] > <-Select Operator [SEL_21] (rows=13893263 width=291) > Output:["_col0","_col1","_col3"] > Filter Operator [FIL_20] (rows=13893263 width=291) > predicate:join_col is not null > TableScan [TS_0] (rows=13893263 width=291) > > default@my_fact,my_fact,Tbl:COMPLETE,Col:NONE,Output:["bucket_col","join_col"] > [^extended_explain.txt] has more detailed plan. > When hive.convert.join.bucket.mapjoin.tez=false, plan no longer has > bucketjoin and query results are correct. > Vertex dependency in root stage > Map 1 <- Map 2 (BROADCAST_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_24] > Select Operator [SEL_23] (rows=15282589 width=291) > Output:["_col0","_col1","_col2"] > Map Join Operator [MAPJOIN_22] (rows=15282589 width=291) > Conds:SEL_21._col1=RS_19._col0(Inner),Output:["_col0","_col3","_col4"] > <-Map 2 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_19] > PartitionCols:_col0 > Select Operator [SEL_18] (rows=818 width=186) > Output:["_col0"] > Filter Operator [FIL_17] (rows=818 width=186) > predicate:((filter_col) IN ('VAL1', 'VAL2') and join_col is not null) > TableScan [TS_3] (rows=1635 width=186) > default@my_dim,t4,Tbl:COMPLETE,Col:NONE,Output:["join_col","filter_col"] > <-Select Operator [SEL_21] (rows=13893263 width=291) > Output:["_col0","_col1","_col3"] > Filter Operator [FIL_20] (rows=13893263 width=291) > predicate:join_col is not null > TableScan [TS_0] (rows=13893263 width=291) > > default@my_fact,my_fact,Tbl:COMPLETE,Col:NONE,Output:["bucket_col","join_col"] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3965) TestMROutput: Fix the hard-coded "/tmp/output" paths
[ https://issues.apache.org/jira/browse/TEZ-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542533#comment-16542533 ] Gopal V commented on TEZ-3965: -- LGTM - +1 > TestMROutput: Fix the hard-coded "/tmp/output" paths > > > Key: TEZ-3965 > URL: https://issues.apache.org/jira/browse/TEZ-3965 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Assignee: Jaume M >Priority: Minor > Attachments: TEZ-3965.1.patch > > > {code} > testNewAPI_SequenceFileOutputFormat(org.apache.tez.mapreduce.output.TestMROutput) > Time elapsed: 0.086 sec <<< ERROR! > java.io.IOException: Mkdirs failed to create > /tmp/output/_temporary/0/_temporary/attempt_15306467542521_0001_r_00_1 > {code} > To reproduce issue > {code} > sudo mkdir -p /tmp/output > sudo chown a-x /tmp/output > {code} > Having a diff user owned /tmp/output will fail this test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3967) DAGImpl: dag lock is unfair and can starve the writers
[ https://issues.apache.org/jira/browse/TEZ-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542205#comment-16542205 ] Gopal V commented on TEZ-3967: -- {code} TEZ_DAG_STATUS_CHECK_INTERVAL("hive.tez.dag.status.check.interval", "500ms", new TimeValidator(TimeUnit.MILLISECONDS), "Interval between subsequent DAG status invocation."), {code} bq. I can see unfair locking being better for through put for the exact exact same reason you are suggesting The trouble is that overlapping read-locks are starving out the write-lock (because the read-lock can lock over the other read-lock) right now. Yes, fairness is not free - if you have a different write-preferred locking for this, I think that's what we are really looking for. > DAGImpl: dag lock is unfair and can starve the writers > -- > > Key: TEZ-3967 > URL: https://issues.apache.org/jira/browse/TEZ-3967 > Project: Apache Tez > Issue Type: Bug >Reporter: Gopal V >Priority: Major > > Found when debugging HIVE-20103, that a reader arriving when another reader > is active can postpone a writer from obtaining a write-lock. > This is fundamentally bad for the DAGImpl as useful progress can only happen > when the writeLock is held. > {code} > public void handle(DAGEvent event) { > ... 
> try { > writeLock.lock(); > {code} > {code} >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x7efb02246f40> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1162) > at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:149) > at > org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:2251) > at > org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:2242) > at > org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180) > at > org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115) > at java.lang.Thread.run(Thread.java:745) > {code} > while read-lock is passed around between > {code} >at > org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:901) > at > org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:940) > at > org.apache.tez.dag.api.client.DAGClientHandler.getDAGStatus(DAGClientHandler.java:73) > {code} > calls. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3968) Tez Job Fails with Shuffle failures too fast when NM returns a 401 error
[ https://issues.apache.org/jira/browse/TEZ-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537776#comment-16537776 ] Gopal V commented on TEZ-3968: -- The core issue is that the 400+ retries happened in under a second and the downstream tasks exited without waiting for the producer retry to start. This does not happen when a machine is unreachable; but with an NM that has lost data yet is otherwise healthy, the error arrives too fast for the retry to catch up and obsolete the older shuffle output. This is a scenario where being slower would fix the issue and being faster makes it worse. > Tez Job Fails with Shuffle failures too fast when NM returns a 401 error > > > Key: TEZ-3968 > URL: https://issues.apache.org/jira/browse/TEZ-3968 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.7.1 >Reporter: Prabhu Joseph >Priority: Major > > Tez Job failed with a reduce task failed on all four attempts while fetching > a particular map output from a Node. NodeManager where MapTask has succeeded > was stopped and got NM local directories cleared and started again (as disks > were full). This has caused the shuffle failure in NodeManager as there is no > Job Token found. > NodeManager Logs shows reason for Shuffle Failure: > {code} > 2018-07-05 00:26:00,371 WARN mapred.ShuffleHandler > (ShuffleHandler.java:messageReceived(947)) - Shuffle failure > org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't find job > token for job job_1530690553693_17267 !!
> at > org.apache.hadoop.mapreduce.security.token.JobTokenSecretManager.retrieveTokenSecret(JobTokenSecretManager.java:112) > at > org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:1133) > at > org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:944) > {code} > Analysis of Application Logs: > Application application_1530690553693_17267 failed with task > task_1530690553693_17267_4_02_000496 failed on all four attempts. > Four Attempts: > {code} > attempt_1530690553693_17267_4_02_000496_3 -> > container_e270_1530690553693_17267_01_014554 -> bigdata2.openstacklocal > attempt_1530690553693_17267_4_02_000496_2 -> > container_e270_1530690553693_17267_01_014423 -> bigdata3.openstacklocal > attempt_1530690553693_17267_4_02_000496_1 -> > container_e270_1530690553693_17267_01_014311 -> bigdata4.openstacklocal > attempt_1530690553693_17267_4_02_000496_0 -> > container_e270_1530690553693_17267_01_014613 -> bigdata5.openstacklocal > {code} > All the four attempts failed while fetching the same Map Output: > {code} > 2018-07-05 00:26:54,161 [WARN] [fetcher {Map_1} #51] > |orderedgrouped.FetcherOrderedGrouped|: Failed to verify reply after > connecting to bigdata6.openstacklocal:13562 with 1 inputs pending > java.io.IOException: Server returned HTTP response code: 401 for URL: > http://bigdata6.openstacklocal:13562/mapOutput?job=job_1530690553693_17267=496=attempt_1530690553693_17267_4_01_000874_0_10003 > {code} > The failures are being reported back to the AM correctly in Tez, though they are > not reported as "source unhealthy" because the NodeManager is healthy (due > to the cleanup).
> {code} > 2018-07-04 23:47:42,344 [INFO] [fetcher {Map_1} #10] > |orderedgrouped.ShuffleScheduler|: Map_1: Reporting fetch failure for > InputIdentifier: InputAttemptIdentifier [inputIdentifier=InputIdentifier > [inputIndex=874], attemptNumber=0, > pathComponent=ttempt_1530690553693_17267_4_01_000874_0_10003, spillType=0, > spillId=-1] taskAttemptIdentifier: Map 1_000874_00 to AM. > {code} > Approximately 460 errors like this are reported back to the AM, each of which > keeps getting marked as "fetcher unhealthy", probably because the > restarted NM showed up as healthy. > This scenario of shuffle failures is not handled, as the NM showed up as healthy. > The Mapper (source InputIdentifier) has to be marked as unhealthy and rerun. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
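Since the comment above notes that "being slower would fix the issue", one way to picture the fix direction is retry pacing. The sketch below is an illustration only (not the actual Tez change): exponential backoff between fetch retries, so hundreds of failures cannot burn through in under a second and the AM has time to re-run the producer before the consumer gives up. The class and method names are hypothetical.

```java
// Hypothetical sketch: exponential backoff for shuffle fetch retries.
// Not the real Tez fetcher API; for illustration of the pacing idea only.
public class FetchBackoffSketch {
    // delay for the Nth retry: base * 2^attempt, capped at maxMillis
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt, 16); // cap the shift to avoid overflow
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        long total = 0;
        for (int attempt = 0; attempt < 10; attempt++) {
            total += backoffMillis(attempt, 100, 30_000);
        }
        // 10 retries now span tens of seconds instead of sub-second
        assert total > 10_000;
        assert backoffMillis(20, 100, 30_000) == 30_000; // capped
        System.out.println("total backoff ms: " + total);
    }
}
```

With sub-second retries, all 460 failures hit the restarted (healthy-looking) NM before any producer re-run can obsolete the stale shuffle output; spacing the retries gives the AM's failure accounting time to act.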
[jira] [Updated] (TEZ-3968) Tez Job Fails with Shuffle failures too fast when NM returns a 401 error
[ https://issues.apache.org/jira/browse/TEZ-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3968: - Description: Tez Job failed with a reduce task failed on all four attempts while fetching a particular map output from a Node. NodeManager where MapTask has succeeded was stopped and got NM local directories cleared and started again (as disks were full). This has caused the shuffle failure in NodeManager as there is no Job Token found. NodeManager Logs shows reason for Shuffle Failure: {code} 2018-07-05 00:26:00,371 WARN mapred.ShuffleHandler (ShuffleHandler.java:messageReceived(947)) - Shuffle failure org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't find job token for job job_1530690553693_17267 !! at org.apache.hadoop.mapreduce.security.token.JobTokenSecretManager.retrieveTokenSecret(JobTokenSecretManager.java:112) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:1133) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:944) {code} Analysis of Application Logs: Application application_1530690553693_17267 failed with task task_1530690553693_17267_4_02_000496 failed on all four attempts. 
Four Attempts: {code} attempt_1530690553693_17267_4_02_000496_3 -> container_e270_1530690553693_17267_01_014554 -> bigdata2.openstacklocal attempt_1530690553693_17267_4_02_000496_2 -> container_e270_1530690553693_17267_01_014423 -> bigdata3.openstacklocal attempt_1530690553693_17267_4_02_000496_1 -> container_e270_1530690553693_17267_01_014311 -> bigdata4.openstacklocal attempt_1530690553693_17267_4_02_000496_0 -> container_e270_1530690553693_17267_01_014613 -> bigdata5.openstacklocal {code} All the four attempts failed while fetching the same Map Output: {code} 2018-07-05 00:26:54,161 [WARN] [fetcher {Map_1} #51] |orderedgrouped.FetcherOrderedGrouped|: Failed to verify reply after connecting to bigdata6.openstacklocal:13562 with 1 inputs pending java.io.IOException: Server returned HTTP response code: 401 for URL: http://bigdata6.openstacklocal:13562/mapOutput?job=job_1530690553693_17267=496=attempt_1530690553693_17267_4_01_000874_0_10003 {code} The failures are being reported back to the AM correctly in Tez, though they are not reported as "source unhealthy" because the NodeManager is healthy (due to the cleanup). {code} 2018-07-04 23:47:42,344 [INFO] [fetcher {Map_1} #10] |orderedgrouped.ShuffleScheduler|: Map_1: Reporting fetch failure for InputIdentifier: InputAttemptIdentifier [inputIdentifier=InputIdentifier [inputIndex=874], attemptNumber=0, pathComponent=ttempt_1530690553693_17267_4_01_000874_0_10003, spillType=0, spillId=-1] taskAttemptIdentifier: Map 1_000874_00 to AM. {code} Approximately 460 errors like this are reported back to the AM, each of which keeps getting marked as "fetcher unhealthy", probably because the restarted NM showed up as healthy. This scenario of shuffle failures is not handled, as the NM showed up as healthy. The Mapper (source InputIdentifier) has to be marked as unhealthy and rerun. was: Tez Job failed with a reduce task failed on all four attempts while fetching a particular map output from a Node.
NodeManager where MapTask has succeeded was stopped and got NM local directories cleared and started again (as disks were full). This has caused the shuffle failure in NodeManager as there is no Job Token found. NodeManager Logs shows reason for Shuffle Failure: {code} 2018-07-05 00:26:00,371 WARN mapred.ShuffleHandler (ShuffleHandler.java:messageReceived(947)) - Shuffle failure org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't find job token for job job_1530690553693_17267 !! at org.apache.hadoop.mapreduce.security.token.JobTokenSecretManager.retrieveTokenSecret(JobTokenSecretManager.java:112) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:1133) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:944) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148) at
[jira] [Updated] (TEZ-3968) Tez Job Fails with Shuffle failures too fast when NM returns a 401 error
[ https://issues.apache.org/jira/browse/TEZ-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3968: - Summary: Tez Job Fails with Shuffle failures too fast when NM returns a 401 error (was: Tez Job Fails with Shuffle failures without rerunning the producer MapTask) > Tez Job Fails with Shuffle failures too fast when NM returns a 401 error > > > Key: TEZ-3968 > URL: https://issues.apache.org/jira/browse/TEZ-3968 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.7.1 >Reporter: Prabhu Joseph >Priority: Major > > Tez Job failed with a reduce task failed on all four attempts while fetching > a particular map output from a Node. NodeManager where MapTask has succeeded > was stopped and got NM local directories cleared and started again (as disks > were full). This has caused the shuffle failure in NodeManager as there is no > Job Token found. > NodeManager Logs shows reason for Shuffle Failure: > {code} > 2018-07-05 00:26:00,371 WARN mapred.ShuffleHandler > (ShuffleHandler.java:messageReceived(947)) - Shuffle failure > org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't find job > token for job job_1530690553693_17267 !! 
> at > org.apache.hadoop.mapreduce.security.token.JobTokenSecretManager.retrieveTokenSecret(JobTokenSecretManager.java:112) > at > org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:1133) > at > org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:944) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) > at > org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459) > at > org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536) > at > org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > at > 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) > at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) > at > org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) > at > org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88) > at > org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) > at > org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) > at > org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > Analysis of Application Logs: > Application application_1530690553693_17267 failed with task > task_1530690553693_17267_4_02_000496 failed on all four
[jira] [Created] (TEZ-3969) TaskAttemptImpl: static fields initialized in instance ctor
Gopal V created TEZ-3969: Summary: TaskAttemptImpl: static fields initialized in instance ctor Key: TEZ-3969 URL: https://issues.apache.org/jira/browse/TEZ-3969 Project: Apache Tez Issue Type: Bug Reporter: Gopal V The TODO is probably well-placed (and the bug looks somewhat intentional, to minimize the size of the TaskAttemptImpl object). This isn't causing any bugs at the moment, because the block is always called from the same thread. {code} public TaskAttemptImpl(TezTaskAttemptID attemptId, EventHandler eventHandler, ... // TODO: Move these configs over to Vertex.VertexConfig MAX_ALLOWED_OUTPUT_FAILURES = conf.getInt(TezConfiguration .TEZ_TASK_MAX_ALLOWED_OUTPUT_FAILURES, TezConfiguration .TEZ_TASK_MAX_ALLOWED_OUTPUT_FAILURES_DEFAULT); MAX_ALLOWED_OUTPUT_FAILURES_FRACTION = conf.getDouble(TezConfiguration .TEZ_TASK_MAX_ALLOWED_OUTPUT_FAILURES_FRACTION, TezConfiguration .TEZ_TASK_MAX_ALLOWED_OUTPUT_FAILURES_FRACTION_DEFAULT); MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC = conf.getInt( TezConfiguration.TEZ_AM_MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC, TezConfiguration.TEZ_AM_MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC_DEFAULT); {code} But these fields are static members of the class, and they are excluded in the findbugs rules to avoid warnings. {code} private static double MAX_ALLOWED_OUTPUT_FAILURES_FRACTION; private static int MAX_ALLOWED_OUTPUT_FAILURES; private static int MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC; {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
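A minimal sketch of one way to make the one-time publication explicit while keeping the fields static (an assumption for illustration; the class, interface, and config key below are hypothetical stand-ins, not the real Tez Configuration API): guard the static writes with a compare-and-set so only the first constructed instance publishes the values, and later instances cannot silently overwrite them.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: enforce the "only one thread initializes" assumption
// instead of leaving it implicit. Conf is a stand-in for the real Configuration.
public class TaskAttemptConfigSketch {
    private static volatile int MAX_ALLOWED_OUTPUT_FAILURES;
    private static final AtomicBoolean INITIALIZED = new AtomicBoolean(false);

    interface Conf { int getInt(String key, int dflt); }

    TaskAttemptConfigSketch(Conf conf) {
        // only the first constructed instance publishes the static value
        if (INITIALIZED.compareAndSet(false, true)) {
            MAX_ALLOWED_OUTPUT_FAILURES =
                conf.getInt("tez.task.max.allowed.output.failures", 10);
        }
    }

    static int maxAllowedOutputFailures() { return MAX_ALLOWED_OUTPUT_FAILURES; }

    public static void main(String[] args) {
        new TaskAttemptConfigSketch((k, d) -> 25);
        new TaskAttemptConfigSketch((k, d) -> 99); // second ctor does not overwrite
        assert maxAllowedOutputFailures() == 25;
        System.out.println("published value: " + maxAllowedOutputFailures());
    }
}
```

The TODO's suggestion (moving the fields into Vertex.VertexConfig as instance state) removes the issue entirely; the guard above is the lighter-weight alternative if the static footprint optimization is to be kept.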
[jira] [Comment Edited] (TEZ-3916) Add hadoop-azure-datalake jar to azure profile
[ https://issues.apache.org/jira/browse/TEZ-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537541#comment-16537541 ] Gopal V edited comment on TEZ-3916 at 7/9/18 8:45 PM: -- These jars are in the tez.tar.gz (similar to other hadoop-* jars) when building with -Pazure options. The change LGTM - +1 [~ewohlstadter]: kick a test again, so that we're testing against a release build of the jars? was (Author: gopalv): These jars are in the tez.tar.gz (similar to other hadoop-* jars) when building with -Pazure options. The change LGTM - +1 > Add hadoop-azure-datalake jar to azure profile > -- > > Key: TEZ-3916 > URL: https://issues.apache.org/jira/browse/TEZ-3916 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Eric Wohlstadter >Priority: Critical > Fix For: 0.10.0 > > Attachments: TEZ-3916.1.patch > > > This jar is required for secure access to Azure object storage: > https://hadoop.apache.org/docs/current/hadoop-azure-datalake/index.html > There is already an azure profile in Tez but it doesn't include this jar. > Since the jar is only supported on Hadoop 2.8+, will either need to: > 1. Determine that including it in a 2.7 build is fine > 2. Or if it is not fine, then include the jar only when both the 2.8 profile > and the azure profile are activated -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3916) Add hadoop-azure-datalake jar to azure profile
[ https://issues.apache.org/jira/browse/TEZ-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537541#comment-16537541 ] Gopal V commented on TEZ-3916: -- These jars are in the tez.tar.gz (similar to other hadoop-* jars) when building with -Pazure options. The change LGTM - +1 > Add hadoop-azure-datalake jar to azure profile > -- > > Key: TEZ-3916 > URL: https://issues.apache.org/jira/browse/TEZ-3916 > Project: Apache Tez > Issue Type: Improvement >Reporter: Eric Wohlstadter >Assignee: Eric Wohlstadter >Priority: Critical > Fix For: 0.10.0 > > Attachments: TEZ-3916.1.patch > > > This jar is required for secure access to Azure object storage: > https://hadoop.apache.org/docs/current/hadoop-azure-datalake/index.html > There is already an azure profile in Tez but it doesn't include this jar. > Since the jar is only supported on Hadoop 2.8+, will either need to: > 1. Determine that including it in a 2.7 build is fine > 2. Or if it is not fine, then include the jar only when both the 2.8 profile > and the azure profile are activated -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3967) DAGImpl: dag lock is unfair and can starve the writers
Gopal V created TEZ-3967: Summary: DAGImpl: dag lock is unfair and can starve the writers Key: TEZ-3967 URL: https://issues.apache.org/jira/browse/TEZ-3967 Project: Apache Tez Issue Type: Bug Reporter: Gopal V Found when debugging HIVE-20103, that a reader arriving when another reader is active can postpone a writer from obtaining a write-lock. This is fundamentally bad for the DAGImpl as useful progress can only happen when the writeLock is held. {code} public void handle(DAGEvent event) { ... try { writeLock.lock(); {code} {code} java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x7efb02246f40> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1162) at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:149) at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:2251) at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:2242) at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180) at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115) at java.lang.Thread.run(Thread.java:745) {code} while read-lock is passed around between {code} at org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:901) at org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:940) at 
org.apache.tez.dag.api.client.DAGClientHandler.getDAGStatus(DAGClientHandler.java:73) {code} calls. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
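A minimal sketch of the fairness knob involved, using plain java.util.concurrent locks (this is an illustration of the option, not the eventual Tez patch): a ReentrantReadWriteLock constructed with fair=true queues later-arriving readers behind a waiting writer, which removes the starvation described above at some throughput cost.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: the default (unfair) lock lets a new reader barge past a queued
// writer when another reader already holds the lock; the fair variant imposes
// FIFO ordering, so a waiting writeLock.lock() cannot be starved indefinitely.
public class FairLockDemo {
    public static void main(String[] args) {
        ReentrantReadWriteLock unfair = new ReentrantReadWriteLock();    // default
        ReentrantReadWriteLock fair = new ReentrantReadWriteLock(true);  // FIFO

        assert !unfair.isFair();
        assert fair.isFair();

        // Overlapping read-locks are still allowed under fairness; what changes
        // is that once a writer is queued, later readers park behind it.
        fair.readLock().lock();
        try {
            assert fair.getReadLockCount() == 1;
        } finally {
            fair.readLock().unlock();
        }
        System.out.println("fair=" + fair.isFair());
    }
}
```

As the comment above notes, fairness is not free: a write-preferred (but otherwise unfair) lock would keep reader throughput while still bounding writer wait time, which is the alternative being suggested.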
[jira] [Assigned] (TEZ-3963) Possible InflaterInputStream leaked in TezCommonUtils and related classes
[ https://issues.apache.org/jira/browse/TEZ-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3963: Assignee: Jaume M > Possible InflaterInputStream leaked in TezCommonUtils and related classes > -- > > Key: TEZ-3963 > URL: https://issues.apache.org/jira/browse/TEZ-3963 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.1 >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > > I don't think [this is > closed|https://github.com/apache/tez/blob/314dfc79b4b3f528b680b4fee73ad0dca3a3a19b/tez-api/src/main/java/org/apache/tez/common/TezCommonUtils.java#L397] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
[ https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned TEZ-3958: Assignee: Jaume M > Add internal vertex priority information into the tez dag.dot debug > information > --- > > Key: TEZ-3958 > URL: https://issues.apache.org/jira/browse/TEZ-3958 > Project: Apache Tez > Issue Type: Improvement >Reporter: Gopal V >Assignee: Jaume M >Priority: Major > Attachments: TEZ-3958.1.patch > > > Adding the actual vertex priority as computed by Tez into the debug dag.dot > file would allow the debugging of task pre-emption issues when the DAG is no > longer a tree. > There are pre-emption issues with isomerization of Tez DAGs, where an > R-isomer dag with mirror rotation runs at a different speed than the L-isomer > dag, due to priorities at the same level changing with the vertex-id order. > Since the problem is hard to debug, it would be good to record the > computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3965) TestMROutput: Fix the hard-coded "/tmp/output" paths
Gopal V created TEZ-3965: Summary: TestMROutput: Fix the hard-coded "/tmp/output" paths Key: TEZ-3965 URL: https://issues.apache.org/jira/browse/TEZ-3965 Project: Apache Tez Issue Type: Bug Reporter: Gopal V {code} testNewAPI_SequenceFileOutputFormat(org.apache.tez.mapreduce.output.TestMROutput) Time elapsed: 0.086 sec <<< ERROR! java.io.IOException: Mkdirs failed to create /tmp/output/_temporary/0/_temporary/attempt_15306467542521_0001_r_00_1 {code} To reproduce the issue: {code} sudo mkdir -p /tmp/output sudo chmod a-x /tmp/output {code} A /tmp/output directory owned by a different user will also fail this test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
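A sketch of the suggested direction, using a unique per-run temp directory instead of the fixed /tmp/output (the class name below is hypothetical; JUnit's TemporaryFolder rule would achieve the same in the actual test):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: a per-run temp directory cannot collide with a pre-existing
// /tmp/output owned by another user, so Mkdirs cannot fail for that reason.
public class TempOutputDirSketch {
    static Path createOutputDir() {
        try {
            return Files.createTempDirectory("test-mroutput-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = createOutputDir();
        assert Files.isDirectory(outputDir);
        assert Files.isWritable(outputDir);
        System.out.println("output dir: " + outputDir);
        Files.delete(outputDir); // cleanup
    }
}
```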
[jira] [Commented] (TEZ-3963) Possible InflaterInputStream leaked in TezCommonUtils and related classes
[ https://issues.apache.org/jira/browse/TEZ-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528568#comment-16528568 ] Gopal V commented on TEZ-3963: -- +1 tests pending > Possible InflaterInputStream leaked in TezCommonUtils and related classes > -- > > Key: TEZ-3963 > URL: https://issues.apache.org/jira/browse/TEZ-3963 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.1 >Reporter: Jaume M >Priority: Major > > I don't think [this is > closed|https://github.com/apache/tez/blob/314dfc79b4b3f528b680b4fee73ad0dca3a3a19b/tez-api/src/main/java/org/apache/tez/common/TezCommonUtils.java#L397] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3962) Configuration decode leaks an Inflater object
Gopal V created TEZ-3962: Summary: Configuration decode leaks an Inflater object Key: TEZ-3962 URL: https://issues.apache.org/jira/browse/TEZ-3962 Project: Apache Tez Issue Type: Bug Affects Versions: 0.9.2, 0.10.0 Reporter: Gopal V {code} public static Configuration createConfFromByteString(ByteString byteString) throws IOException { ... InflaterInputStream uncompressIs = new InflaterInputStream(byteString.newInput()); DAGProtos.ConfigurationProto confProto = DAGProtos.ConfigurationProto.parseFrom(uncompressIs); {code} The InflaterInputStream is never closed; it will eventually be garbage-collected, but the off-heap buffers of its Inflater leak until then. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
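A minimal sketch of the fix direction, assuming try-with-resources is acceptable at the call site: closing an InflaterInputStream calls Inflater.end() on its internally-created Inflater, which releases the native zlib buffers promptly instead of waiting for the garbage collector.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Sketch: deflate/inflate round-trip with both streams closed deterministically.
public class InflaterCloseSketch {
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // try-with-resources closes the stream, which ends the internal Inflater
    // and frees its native zlib buffers immediately
    static byte[] decompress(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream iis =
                 new InflaterInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[256];
            int n;
            while ((n = iis.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "tez-conf-bytes".getBytes("UTF-8");
        byte[] roundTrip = decompress(compress(original));
        assert java.util.Arrays.equals(original, roundTrip);
        System.out.println("round-trip ok");
    }
}
```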
[jira] [Commented] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
[ https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522662#comment-16522662 ] Gopal V commented on TEZ-3958: -- Let me take a very specific case here and talk about a DAG in this order. {code} digraph g { store_returns -> Map1 -> ReducerJoin1; store_sales -> Map2 -> ReducerJoin1; Map1 -> BloomGen1 -> Map2; } {code} for the hive bloom filter impl (where the small side sends a bloom filter to the bigger side before a shuffle join). This has many different stable (i.e. no deadlocks) ways to mark out priority, but not all of them are optimal & it is very hard to debug (i.e. Map1 & Map2 might run in a different order if they have their vertex ids swapped). > Add internal vertex priority information into the tez dag.dot debug > information > --- > > Key: TEZ-3958 > URL: https://issues.apache.org/jira/browse/TEZ-3958 > Project: Apache Tez > Issue Type: Improvement >Reporter: Gopal V >Priority: Major > > Adding the actual vertex priority as computed by Tez into the debug dag.dot > file would allow the debugging of task pre-emption issues when the DAG is no > longer a tree. > There are pre-emption issues with isomerization of Tez DAGs, where an > R-isomer dag with mirror rotation runs at a different speed than the L-isomer > dag, due to priorities at the same level changing with the vertex-id order. > Since the problem is hard to debug, it would be good to record the > computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
[ https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522650#comment-16522650 ] Gopal V commented on TEZ-3958: -- [~jmarhuen]: yes, you are right - however the priority limits for the tasks themselves are computed within the scheduler here. https://github.com/apache/tez/blob/3f2373e2b2ab3825ef50e9f19b8704265542a8b2/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/DAGSchedulerNaturalOrder.java#L47 {code} int priorityLowLimit = ((vertexDistanceFromRoot + 1) * dag.getTotalVertices() * 3) + (vertex.getVertexId().getId() * 3); {code} > Add internal vertex priority information into the tez dag.dot debug > information > --- > > Key: TEZ-3958 > URL: https://issues.apache.org/jira/browse/TEZ-3958 > Project: Apache Tez > Issue Type: Improvement >Reporter: Gopal V >Priority: Major > > Adding the actual vertex priority as computed by Tez into the debug dag.dot > file would allow the debugging of task pre-emption issues when the DAG is no > longer a tree. > There are pre-emption issues with isomerization of Tez DAGs, where an > R-isomer dag with mirror rotation runs at a different speed than the L-isomer > dag, due to priorities at the same level changing with the vertex-id order. > Since the problem is hard to debug, it would be good to record the > computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
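To make the quoted priorityLowLimit formula concrete, here is a small worked example. The vertex ids and root distances below are assumed for illustration (roughly matching the 4-vertex bloom-filter DAG from the earlier comment), not taken from a real run.

```java
// Worked example of:
//   priorityLowLimit = ((vertexDistanceFromRoot + 1) * totalVertices * 3)
//                      + (vertexId * 3)
// Assumed ids/distances: Map1=(d0,id0), BloomGen1=(d1,id1),
// Map2=(d2,id2), ReducerJoin1=(d3,id3); totalVertices=4.
public class PrioritySketch {
    static int priorityLowLimit(int distanceFromRoot, int totalVertices, int vertexId) {
        return ((distanceFromRoot + 1) * totalVertices * 3) + (vertexId * 3);
    }

    public static void main(String[] args) {
        int total = 4;
        assert priorityLowLimit(0, total, 0) == 12; // Map1
        assert priorityLowLimit(1, total, 1) == 27; // BloomGen1
        assert priorityLowLimit(2, total, 2) == 42; // Map2
        assert priorityLowLimit(3, total, 3) == 57; // ReducerJoin1
        // swapping vertex ids at the same depth shifts the limits, which is
        // the ordering sensitivity the comment above describes
        assert priorityLowLimit(0, total, 2) != priorityLowLimit(0, total, 0);
        System.out.println("priority limits computed");
    }
}
```

Because the vertexId term contributes at the same depth, two isomorphic DAGs whose vertices were numbered in a different order get different priorities for otherwise-equivalent vertices, which is why recording the computed priority in the dag.dot file helps debugging.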
[jira] [Created] (TEZ-3958) Add internal vertex priority information into the tez dag.dot debug information
Gopal V created TEZ-3958: Summary: Add internal vertex priority information into the tez dag.dot debug information Key: TEZ-3958 URL: https://issues.apache.org/jira/browse/TEZ-3958 Project: Apache Tez Issue Type: Improvement Reporter: Gopal V Adding the actual vertex priority as computed by Tez into the debug dag.dot file would allow the debugging of task pre-emption issues when the DAG is no longer a tree. There are pre-emption issues with isomerization of Tez DAGs, where an R-isomer dag with mirror rotation runs at a different speed than the L-isomer dag, due to priorities at the same level changing with the vertex-id order. Since the problem is hard to debug, it would be good to record the computed priority in the DAG .dot file in the logging directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3953) Restore ABI-compat for DAGClient for TEZ-3951
[ https://issues.apache.org/jira/browse/TEZ-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3953: - Summary: Restore ABI-compat for DAGClient for TEZ-3951 (was: make interface change from TEZ-3951 non-breaking) > Restore ABI-compat for DAGClient for TEZ-3951 > - > > Key: TEZ-3953 > URL: https://issues.apache.org/jira/browse/TEZ-3953 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3953.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3953) make interface change from TEZ-3951 non-breaking
[ https://issues.apache.org/jira/browse/TEZ-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated TEZ-3953: - Affects Version/s: 0.10.0 > make interface change from TEZ-3951 non-breaking > > > Key: TEZ-3953 > URL: https://issues.apache.org/jira/browse/TEZ-3953 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3953.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3953) make interface change from TEZ-3951 non-breaking
[ https://issues.apache.org/jira/browse/TEZ-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508933#comment-16508933 ] Gopal V commented on TEZ-3953: -- LGTM - +1 Is there a hive-3 branch impl patch for Hive SyncDagClient for this? > make interface change from TEZ-3951 non-breaking > > > Key: TEZ-3953 > URL: https://issues.apache.org/jira/browse/TEZ-3953 > Project: Apache Tez > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: TEZ-3953.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)