[
https://issues.apache.org/jira/browse/TEZ-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195367#comment-14195367
]
Jason Lowe commented on TEZ-14:
---
Apologies for the late reply. I haven't had time to look at
[
https://issues.apache.org/jira/browse/TEZ-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305247#comment-14305247
]
Jason Lowe commented on TEZ-2018:
-
bq. Maybe this plugin could be enhanced to do the
[
https://issues.apache.org/jira/browse/TEZ-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316386#comment-14316386
]
Jason Lowe commented on TEZ-2073:
-
bq. Is the fs.permissions.umask-mode applicable to all
[
https://issues.apache.org/jira/browse/TEZ-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317013#comment-14317013
]
Jason Lowe commented on TEZ-2073:
-
+1 lgtm. RawLocalFileSystem explicitly overrides the
[
https://issues.apache.org/jira/browse/TEZ-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527296#comment-14527296
]
Jason Lowe commented on TEZ-2393:
-
I think the main problems will be from anyone who
[
https://issues.apache.org/jira/browse/TEZ-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545652#comment-14545652
]
Jason Lowe commented on TEZ-2311:
-
It's kind of a pain to scrub the logs for posting, but I
[
https://issues.apache.org/jira/browse/TEZ-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496800#comment-14496800
]
Jason Lowe commented on TEZ-2319:
-
MR does not dump the final state all at once, rather it
[
https://issues.apache.org/jira/browse/TEZ-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2304:
Attachment: 168563_recovery.gz
Posting the logs of the second AM attempt up to the point of the first invalid
Jason Lowe created TEZ-2311:
---
Summary: AM can hang if kill received while recovering from
previous attempt
Key: TEZ-2311
URL: https://issues.apache.org/jira/browse/TEZ-2311
Project: Apache Tez
[
https://issues.apache.org/jira/browse/TEZ-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492617#comment-14492617
]
Jason Lowe commented on TEZ-2311:
-
The AM appeared to hang because it was still waiting for
[
https://issues.apache.org/jira/browse/TEZ-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488258#comment-14488258
]
Jason Lowe commented on TEZ-2304:
-
Log snippets showing state transitions and eventual
Jason Lowe created TEZ-2303:
---
Summary: ConcurrentModificationException while processing recovery
Key: TEZ-2303
URL: https://issues.apache.org/jira/browse/TEZ-2303
Project: Apache Tez
Issue Type:
Jason Lowe created TEZ-2304:
---
Summary: InvalidStateTransitonException TA_SCHEDULE at START_WAIT
during recovery
Key: TEZ-2304
URL: https://issues.apache.org/jira/browse/TEZ-2304
Project: Apache Tez
[
https://issues.apache.org/jira/browse/TEZ-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488219#comment-14488219
]
Jason Lowe commented on TEZ-2303:
-
{noformat}
2015-04-09 19:36:11,231 INFO [main]
[
https://issues.apache.org/jira/browse/TEZ-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2485:
Attachment: ats-omit-dup-display-names-and-zero-counters_v2.patch
Minor fix to patch, was trying to apply
[
https://issues.apache.org/jira/browse/TEZ-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2485:
Attachment: ats-omit-dup-display-names-and-zero-counters.patch
Posting a prototype patch that does two main
[
https://issues.apache.org/jira/browse/TEZ-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590735#comment-14590735
]
Jason Lowe commented on TEZ-2549:
-
bq. can you shed some more light on the counter proto
[
https://issues.apache.org/jira/browse/TEZ-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589809#comment-14589809
]
Jason Lowe commented on TEZ-2018:
-
bq. Will Application History Server continue to hold the
[
https://issues.apache.org/jira/browse/TEZ-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589854#comment-14589854
]
Jason Lowe commented on TEZ-2549:
-
Thanks for moving the patch forward, Jon. Couple of
Jason Lowe created TEZ-2711:
---
Summary: Tez fails to submit a job where user.name is not UGI user
Key: TEZ-2711
URL: https://issues.apache.org/jira/browse/TEZ-2711
Project: Apache Tez
Issue Type:
[
https://issues.apache.org/jira/browse/TEZ-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14692287#comment-14692287
]
Jason Lowe commented on TEZ-2711:
-
This same scenario works for MapReduce jobs because the
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2628:
Attachment: TEZ-2628.002.patch
Yes, there's a problem with retention on a secure cluster. The timeline
[
https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699963#comment-14699963
]
Jason Lowe commented on TEZ-2726:
-
+1 for throwing an exception. I think it could be
Jason Lowe created TEZ-2677:
---
Summary: NPE while submitting MRRSleepJob if
tez.runtime.sort.threads is set
Key: TEZ-2677
URL: https://issues.apache.org/jira/browse/TEZ-2677
Project: Apache Tez
Jason Lowe created TEZ-2679:
---
Summary: Admin forms of launch env and java opts settings
Key: TEZ-2679
URL: https://issues.apache.org/jira/browse/TEZ-2679
Project: Apache Tez
Issue Type:
[
https://issues.apache.org/jira/browse/TEZ-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2679:
Summary: Admin forms of launch env settings (was: Admin forms of launch
env and java opts settings)
Thanks,
[
https://issues.apache.org/jira/browse/TEZ-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14646232#comment-14646232
]
Jason Lowe commented on TEZ-2654:
-
Since we already provide a REST service for getting dag
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696028#comment-14696028
]
Jason Lowe commented on TEZ-2628:
-
One advantage of getting Hive to post this via HDFS is
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695967#comment-14695967
]
Jason Lowe commented on TEZ-2628:
-
Alternatively, if the Hive server knows which app ID the
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695937#comment-14695937
]
Jason Lowe commented on TEZ-2628:
-
bq. It seems to me that all the code in EntityLogger
[
https://issues.apache.org/jira/browse/TEZ-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638970#comment-14638970
]
Jason Lowe commented on TEZ-2311:
-
Ideally we would like this fixed in a 0.7 patch release
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2628:
Attachment: TEZ-2628.001.patch
Posting a prototype patch a bit early since there was some interest expressed
Jason Lowe created TEZ-2628:
---
Summary: History logging plugin to write ATS events to HDFS
Key: TEZ-2628
URL: https://issues.apache.org/jira/browse/TEZ-2628
Project: Apache Tez
Issue Type:
[
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967361#comment-14967361
]
Jason Lowe commented on TEZ-2581:
-
I noticed the document doesn't discuss much about how user-provided code
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981205#comment-14981205
]
Jason Lowe commented on TEZ-808:
Thanks for updating the patch, Bikas! Patch still doesn't treat 0 as a
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977056#comment-14977056
]
Jason Lowe commented on TEZ-808:
bq. If we use a boolean, then I think it will be fine to not use volatile
[
https://issues.apache.org/jira/browse/TEZ-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979050#comment-14979050
]
Jason Lowe commented on TEZ-2914:
-
If we're trying to port the capability of MAPREDUCE-5583 then it would be
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974432#comment-14974432
]
Jason Lowe commented on TEZ-808:
Thanks for the patch, Bikas! I haven't had a chance to look at it in great
[
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969319#comment-14969319
]
Jason Lowe commented on TEZ-2581:
-
bq. Ideally we should provide API in VertexManagerPlugin to allow user to
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-808:
---
Attachment: TEZ-808.branch-0.7.patch
Would it be possible to backport this to branch-0.7? We're going to be on
[
https://issues.apache.org/jira/browse/TEZ-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988152#comment-14988152
]
Jason Lowe commented on TEZ-2918:
-
Latest patch lgtm, assuming the test is unrelated. Only nit is that it's
[
https://issues.apache.org/jira/browse/TEZ-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986084#comment-14986084
]
Jason Lowe commented on TEZ-2918:
-
Thanks for the patch, Bikas!
I agree that using AtomicBoolean.lazySet is
[
https://issues.apache.org/jira/browse/TEZ-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe reassigned TEZ-2886:
---
Assignee: Jason Lowe
bq. Did you mean "AM NodeManager" and "Other NodeManagers" ?
Technically yes.
[
https://issues.apache.org/jira/browse/TEZ-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961288#comment-14961288
]
Jason Lowe commented on TEZ-2679:
-
+1 looks OK to me. This makes the behavior more in line with how
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957536#comment-14957536
]
Jason Lowe commented on TEZ-808:
bq. Fixing IOs vs Fixing processor callback - which one of these would
[
https://issues.apache.org/jira/browse/TEZ-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957624#comment-14957624
]
Jason Lowe commented on TEZ-2679:
-
bq. Before they would only get './' and now they will get './' +
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957683#comment-14957683
]
Jason Lowe commented on TEZ-808:
Sounds pretty good to me. Only suggestion is to have a pure progress() API
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957448#comment-14957448
]
Jason Lowe commented on TEZ-808:
bq. Like I said in item 1) above - add finer grained updates of processed
[
https://issues.apache.org/jira/browse/TEZ-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949368#comment-14949368
]
Jason Lowe commented on TEZ-2872:
-
This is similar to a scenario MapReduce encountered before, see
Jason Lowe created TEZ-2872:
---
Summary: Tez AM can be overwhelmed by
TezTaskUmbilicalProtocol.getTask responses
Key: TEZ-2872
URL: https://issues.apache.org/jira/browse/TEZ-2872
Project: Apache Tez
[
https://issues.apache.org/jira/browse/TEZ-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949535#comment-14949535
]
Jason Lowe commented on TEZ-2872:
-
The problem with TEZ-754 is that it only helps when containers are
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2628:
Attachment: TEZ-2628.004.patch
Minor update to the patch to fix a bug that [~jeagles] pointed out offline.
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955483#comment-14955483
]
Jason Lowe commented on TEZ-808:
Correct, in this latest case the tasks were part of Pig streaming jobs and
Jason Lowe created TEZ-2886:
---
Summary: Ability to merge AM credentials with DAG credentials
Key: TEZ-2886
URL: https://issues.apache.org/jira/browse/TEZ-2886
Project: Apache Tez
Issue Type:
[
https://issues.apache.org/jira/browse/TEZ-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955549#comment-14955549
]
Jason Lowe commented on TEZ-2886:
-
For example, the RM will automatically add tokens for the log aggregation
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955389#comment-14955389
]
Jason Lowe commented on TEZ-808:
Just ran across the lack of this for some Tez jobs that hung forever. Tasks
[
https://issues.apache.org/jira/browse/TEZ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955678#comment-14955678
]
Jason Lowe commented on TEZ-808:
No, we can't key off the progress field. In practice progress can go
[
https://issues.apache.org/jira/browse/TEZ-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949296#comment-14949296
]
Jason Lowe commented on TEZ-2864:
-
Is this a Tez issue? I'm not sure how Tez is supposed to know about, and
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715831#comment-14715831
]
Jason Lowe commented on TEZ-2628:
-
I believe this is a bug in the MemoryTimelineStore. The
Jason Lowe created TEZ-2787:
---
Summary: Tez AM should have java.io.tmpdir=./tmp to be consistent
with tasks
Key: TEZ-2787
URL: https://issues.apache.org/jira/browse/TEZ-2787
Project: Apache Tez
[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933876#comment-14933876
]
Jason Lowe commented on TEZ-2628:
-
Yes, sorry this wasn't clear. The group of the timeline server
[
https://issues.apache.org/jira/browse/TEZ-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034642#comment-15034642
]
Jason Lowe commented on TEZ-2914:
-
One advantage to doing it at the YARN level is that we can tell the RM
[
https://issues.apache.org/jira/browse/TEZ-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034642#comment-15034642
]
Jason Lowe edited comment on TEZ-2914 at 12/1/15 9:44 PM:
--
One advantage to doing
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2972:
Attachment: TEZ-2972.001.patch
Patch that adds a tez.am.node-updates.enabled property to control whether the
Jason Lowe created TEZ-2972:
---
Summary: Ability for Tez AM to ignore node updates from YARN
Key: TEZ-2972
URL: https://issues.apache.org/jira/browse/TEZ-2972
Project: Apache Tez
Issue Type:
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe reassigned TEZ-2972:
---
Assignee: Jason Lowe
This can also be important on clusters where the UNHEALTHY state is used as part
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2972:
Attachment: TEZ-2972.002.patch
Thanks for the review, Bikas! I updated the patch to avoid sending the node
[
https://issues.apache.org/jira/browse/TEZ-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062896#comment-15062896
]
Jason Lowe commented on TEZ-3009:
-
Sample container log showing the problem:
{noformat}
2015-12-11
Jason Lowe created TEZ-3009:
---
Summary: Errors that occur during container task acquisition are
not logged
Key: TEZ-3009
URL: https://issues.apache.org/jira/browse/TEZ-3009
Project: Apache Tez
Jason Lowe created TEZ-3010:
---
Summary: Container task acquisition has no retries for errors
Key: TEZ-3010
URL: https://issues.apache.org/jira/browse/TEZ-3010
Project: Apache Tez
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2972:
Hadoop Flags: Incompatible change
> Ability for Tez AM to ignore node updates from YARN
>
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2972:
Attachment: TEZ-2972.003.patch
Updated the patch to use tez.am.node-unhealthy-reschedule-tasks instead of
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2972:
Fix Version/s: 0.7.1
Thanks, Bikas! I committed the branch-0.7 patch.
> Avoid task rescheduling when a node
[
https://issues.apache.org/jira/browse/TEZ-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15064650#comment-15064650
]
Jason Lowe commented on TEZ-3009:
-
I don't see any indication it's fixed in master. From TezChild.run:
[
https://issues.apache.org/jira/browse/TEZ-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3009:
Attachment: TEZ-3009.001.patch
Patch that adds logging of errors that occur during task fetch. Manually
[
https://issues.apache.org/jira/browse/TEZ-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-2972:
Attachment: TEZ-2972.003.addendum.patch
TEZ-2972-branch-0.7.001.patch
Attached is a patch for
[
https://issues.apache.org/jira/browse/TEZ-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3009:
Attachment: TEZ-3009.002.patch
Thanks for the review, Sid! Updated the patch to log at the ERROR level
[
https://issues.apache.org/jira/browse/TEZ-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321370#comment-15321370
]
Jason Lowe commented on TEZ-3293:
-
The same type of error was fixed in MapReduce's version of the
[
https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325186#comment-15325186
]
Jason Lowe commented on TEZ-3296:
-
bq. Could you please help me understand the logic to make these unique.
Jason Lowe created TEZ-3296:
---
Summary: Tez job can hang if two vertices at the same root
distance have different task requirements
Key: TEZ-3296
URL: https://issues.apache.org/jira/browse/TEZ-3296
Project:
[
https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3296:
Attachment: TEZ-3296.001.patch
Patch that changes the container priority calculations to generate a unique
[
https://issues.apache.org/jira/browse/TEZ-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3293:
Attachment: TEZ-3293.001.patch
Patch to have unreserve only adjust usedMemory so it's symmetrical with the
Jason Lowe created TEZ-3293:
---
Summary: Fetch failures can cause a shuffle hang waiting for
memory merge that never starts
Key: TEZ-3293
URL: https://issues.apache.org/jira/browse/TEZ-3293
Project: Apache
[
https://issues.apache.org/jira/browse/TEZ-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3293:
Priority: Critical (was: Major)
> Fetch failures can cause a shuffle hang waiting for memory merge that
[
https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3296:
Attachment: taskschedulerlog
We no longer have the logs from the original job that hung. However it's easy
[
https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334521#comment-15334521
]
Jason Lowe commented on TEZ-3296:
-
bq. Could you please attach the task scheduler logs for the hung job and
Jason Lowe created TEZ-3306:
---
Summary: Improve container priority assignments for vertices
Key: TEZ-3306
URL: https://issues.apache.org/jira/browse/TEZ-3306
Project: Apache Tez
Issue Type:
[
https://issues.apache.org/jira/browse/TEZ-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334129#comment-15334129
]
Jason Lowe commented on TEZ-3296:
-
bq. Wondering why the app was hung.
As I mentioned in the description
[
https://issues.apache.org/jira/browse/TEZ-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3036:
Attachment: TEZ-3036.001.patch
Attaching a prototype patch that seems to fix the issue. This has the
[
https://issues.apache.org/jira/browse/TEZ-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102478#comment-15102478
]
Jason Lowe commented on TEZ-3036:
-
My apologies, I misread the heap dump info. NoSuchMethodError was being
Jason Lowe created TEZ-3036:
---
Summary: Tez AM can hang on startup with no indication of error
Key: TEZ-3036
URL: https://issues.apache.org/jira/browse/TEZ-3036
Project: Apache Tez
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TEZ-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097107#comment-15097107
]
Jason Lowe commented on TEZ-3036:
-
In this particular instance the hang occurred because
[
https://issues.apache.org/jira/browse/TEZ-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097108#comment-15097108
]
Jason Lowe commented on TEZ-3036:
-
Haven't verified this, but I suspect this can be replicated by simply
Jason Lowe created TEZ-3103:
---
Summary: Shuffle can hang when memory to memory merging enabled
Key: TEZ-3103
URL: https://issues.apache.org/jira/browse/TEZ-3103
Project: Apache Tez
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TEZ-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated TEZ-3102:
Attachment: TEZ-3102.001.patch
Attaching a patch that does sufficient processing of the kill event for the
[
https://issues.apache.org/jira/browse/TEZ-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15137056#comment-15137056
]
Jason Lowe commented on TEZ-3102:
-
Hang occurs because TaskImpl.shouldScheduleNewAttempt returns false as it
Jason Lowe created TEZ-3102:
---
Summary: Fetch failure of a speculated task causes job hang
Key: TEZ-3102
URL: https://issues.apache.org/jira/browse/TEZ-3102
Project: Apache Tez
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TEZ-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15137059#comment-15137059
]
Jason Lowe commented on TEZ-1944:
-
This seems likely caused by the same problem reported in TEZ-1911.
> OOM
Jason Lowe created TEZ-3114:
---
Summary: Shuffle OOM due to EventMetaData flood
Key: TEZ-3114
URL: https://issues.apache.org/jira/browse/TEZ-3114
Project: Apache Tez
Issue Type: Bug
Affects
[
https://issues.apache.org/jira/browse/TEZ-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142873#comment-15142873
]
Jason Lowe commented on TEZ-3114:
-
There's no flow control to prevent shuffle transfer events from arriving