[ https://issues.apache.org/jira/browse/TEZ-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744187#comment-17744187 ]
okumin edited comment on TEZ-4344 at 7/18/23 11:47 AM:
-------------------------------------------------------

[~ayushtkn] [~abstractdog]

Hi, is there a chance we could make this more pluggable? As background, we inject a similar capability at the same point.

* https://speakerdeck.com/okumin/hive-distributed-profiling-system-in-treasure-data-english-version-number-tdtechtalk?slide=26
* https://api-docs.treasuredata.com/blog/hive-distributed-profiling/

I don't think we can fully replace our patch with TezThreadDumpHelper, because we need to add some context specific to us (e.g. a global job ID of our platform) and we want to send it to our DWH. So I wonder if we could generalize the feature into something like `TaskAttemptHook`.

I'm really surprised and glad to see people who are on the same page!

> Collect jstack periodically from all containers
> -----------------------------------------------
>
>                 Key: TEZ-4344
>                 URL: https://issues.apache.org/jira/browse/TEZ-4344
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Assignee: Ayush Saxena
>            Priority: Major
>             Fix For: 0.10.3
>
>          Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> 1. set a property for the interval in seconds (default: 0 ==> off)
> 2. attach jstack files to app logs (this is maybe easy: putting jstack log files next to the app syslog files can get them included by yarn)
> jstack files should have a name like containername_dagname_timestamp
> +option to control whether containers create jstacks even when idle (no task assigned to them); by default they're not supposed to do so
> I don't want to have a jstack dependency for this (configure path, etc.), so an internal thread dump facility is preferred, with zero configuration.
> Also, this doesn't require new endpoints on the AM and task containers (like TEZ-4345), so it can be implemented quite easily.
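
To illustrate the "internal thread dump facility" idea from the description, here is a minimal sketch that uses only JDK APIs (`ThreadMXBean`, `ScheduledExecutorService`) and no external jstack binary. It is not the TEZ-4344 patch and not the TezThreadDumpHelper code: the class name `ThreadDumpSketch`, its method names, and the way the interval, container name, and DAG name are passed in are all assumptions made for illustration. It only mirrors the points stated above: an interval in seconds where 0 means off, and files named like containername_dagname_timestamp written to a log directory.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the actual Tez implementation.
class ThreadDumpSketch {

    /** Builds a jstack-like text dump of all live threads using only JDK APIs. */
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked monitor/synchronizer details to keep the dump cheap.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** File name in the containername_dagname_timestamp shape from the description. */
    static String fileName(String containerName, String dagName, long timestampMillis) {
        return containerName + "_" + dagName + "_" + timestampMillis;
    }

    /**
     * Starts periodic dumps into logDir. Returns null when intervalSeconds <= 0,
     * matching the "default: 0 ==> off" behaviour described above.
     */
    static ScheduledExecutorService start(Path logDir, String containerName,
                                          String dagName, long intervalSeconds) {
        if (intervalSeconds <= 0) {
            return null; // feature disabled
        }
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "thread-dump-sketch");
                t.setDaemon(true); // never keep the container JVM alive
                return t;
            });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                Path target = logDir.resolve(
                    fileName(containerName, dagName, System.currentTimeMillis()));
                Files.write(target, dumpAllThreads().getBytes(StandardCharsets.UTF_8));
            } catch (IOException | RuntimeException e) {
                // Swallow failures so one bad dump does not cancel later runs.
                System.err.println("Thread dump failed: " + e);
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("dumps");
        ScheduledExecutorService s = start(dir, "container_x", "dag_y", 1);
        Thread.sleep(2500); // let a couple of dumps happen
        s.shutdownNow();
        System.out.println("Dumps written to " + dir);
    }
}
```

A pluggable variant along the lines of the `TaskAttemptHook` suggestion could replace the hard-coded file write with a caller-supplied callback, so a site could attach its own context or ship the dump elsewhere; that interface is not defined in this thread, so it is left out of the sketch.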