[
https://issues.apache.org/jira/browse/BEAM-10291?focusedWorklogId=456216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456216
]
ASF GitHub Bot logged work on BEAM-10291:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Jul/20 15:45
Start Date: 08/Jul/20 15:45
Worklog Time Spent: 10m
Work Description: davidyan74 commented on a change in pull request #12143:
URL: https://github.com/apache/beam/pull/12143#discussion_r451644932
##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowOperationContext.java
##########
@@ -264,7 +276,49 @@ public void reportLull(Thread trackedThread, long millis) {
logRecord.setLoggerName(DataflowOperationContext.LOG.getName());
// Publish directly in the context of this specific ExecutionState.
- DataflowWorkerLoggingInitializer.getLoggingHandler().publish(this, logRecord);
+ DataflowWorkerLoggingHandler dataflowLoggingHandler =
+ DataflowWorkerLoggingInitializer.getLoggingHandler();
+ dataflowLoggingHandler.publish(this, logRecord);
+
+ if (shouldLogFullThreadDump()) {
+ Map<Thread, StackTraceElement[]> threadSet = Thread.getAllStackTraces();
+ for (Map.Entry<Thread, StackTraceElement[]> entry : threadSet.entrySet()) {
+ Thread thread = entry.getKey();
+ StackTraceElement[] stackTrace = entry.getValue();
+ StringBuilder message = new StringBuilder();
+ message.append(thread.toString()).append(":\n");
+ message.append(getStackTraceForLullMessage(stackTrace));
+ logRecord = new LogRecord(Level.INFO, message.toString());
+ logRecord.setLoggerName(DataflowOperationContext.LOG.getName());
+ dataflowLoggingHandler.publish(this, logRecord);
+ }
+ }
+ }
+
+ // A full thread dump is performed at most once every 20 minutes.
+ private static final long LOG_LULL_FULL_THREAD_DUMP_MS = 20 * 60 * 1000;
+
+ // Last time when a full thread dump was performed.
+ private long lastFullThreadDumpMillis = 0;
+
+ private boolean shouldLogFullThreadDump() {
Review comment:
Thanks! Makes sense. Done. PTAL.
Issue Time Tracking
-------------------
Worklog Id: (was: 456216)
Time Spent: 4h 40m (was: 4.5h)
> Lull detection log to include full thread dump
> ----------------------------------------------
>
> Key: BEAM-10291
> URL: https://issues.apache.org/jira/browse/BEAM-10291
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: David Yan
> Assignee: David Yan
> Priority: P2
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> What we have today is a stack dump of only the thread that's stuck, but in
> many cases (most notably BigQuery) the I/O happens in a separate thread that
> is not included in the dump. Ideally, the lull log would include a full
> thread dump of the entire process.
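
The idea in the issue description can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the pull request: the class name ThrottledThreadDump, the methods shouldDump and formatAllThreads, and the constant DUMP_INTERVAL_MS are hypothetical. Only the use of Thread.getAllStackTraces() and the 20-minute interval are taken from the diff above.

```java
import java.util.Map;

public class ThrottledThreadDump {
    // Assumed interval, mirroring the 20-minute constant in the diff.
    static final long DUMP_INTERVAL_MS = 20L * 60 * 1000;

    // Timestamp of the last full dump; 0 means no dump has happened yet.
    private long lastDumpMillis = 0;

    // Hypothetical throttle: returns true at most once per interval and
    // records the time of the dump it permits.
    boolean shouldDump(long nowMillis) {
        if (nowMillis - lastDumpMillis > DUMP_INTERVAL_MS) {
            lastDumpMillis = nowMillis;
            return true;
        }
        return false;
    }

    // Formats the stack traces of all live threads in the JVM, not just
    // the thread in which the lull was detected.
    static String formatAllThreads() {
        StringBuilder message = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry :
                Thread.getAllStackTraces().entrySet()) {
            message.append(entry.getKey()).append(":\n");
            for (StackTraceElement frame : entry.getValue()) {
                message.append("  at ").append(frame).append('\n');
            }
        }
        return message.toString();
    }

    public static void main(String[] args) {
        ThrottledThreadDump dumper = new ThrottledThreadDump();
        if (dumper.shouldDump(System.currentTimeMillis())) {
            System.out.print(formatAllThreads());
        }
    }
}
```

Passing the current time into shouldDump as a parameter, rather than reading the clock inside it, is a choice made here only to keep the sketch easy to test. The sketch is also not thread-safe, so a real worker would have to decide how to handle concurrent callers.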