[
https://issues.apache.org/jira/browse/HUDI-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762464#comment-17762464
]
Lokesh Jain commented on HUDI-6820:
-----------------------------------
Usually the error message is of the form:
{code:java}
##[error]We stopped hearing from agent Hosted Agent. Verify the agent machine
is running and has a healthy network connection. Anything that terminates an
agent process, starves it for CPU, or blocks its network access can cause this
error. For more information, see:
https://go.microsoft.com/fwlink/?linkid=846610 {code}
Or of the form:
{code:java}
The job running on agent Azure Pipelines 7 ran longer than the maximum time of
150 minutes. {code}
It could be an actual timeout where tests were running for 150 minutes. But in
many cases the issue we are seeing is that the raw logs show:
{code:java}
2023-07-28T06:01:14.7898970Z at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_372]
2023-07-28T06:01:14.7899483Z at org.apache.hudi.metrics.Metrics.shutdown(Metrics.java:116) ~[hudi-client-common-0.14.0-SNAPSHOT.jar:0.14.0-SNAPSHOT]
2023-07-28T06:01:14.7899862Z at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_372]
2023-07-28T07:27:41.0195722Z ##[error]The operation was canceled.
2023-07-28T07:27:41.0242402Z ##[section]Finishing: FT client/spark-client
{code}
There is a gap of more than an hour (about 1 hr 26 min in this log) between the
last logged test output and the operation cancellation.
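To spot such silent periods in a downloaded raw log, something like the following could be used. This is only a rough sketch: the helper names (parse_ts, find_gaps) and the 30-minute threshold are assumptions for illustration, not part of the Hudi or Azure tooling. It assumes each log entry starts with a UTC timestamp in the format shown above and skips lines that do not.

```python
import re
from datetime import datetime, timedelta

# Raw log entries start with an ISO-8601 UTC timestamp that has 7 fractional
# digits, e.g. 2023-07-28T06:01:14.7898970Z. strptime's %f accepts at most 6
# digits, so the fraction is captured separately and truncated.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    m = TS_RE.match(line)
    if not m:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    frac = (m.group(2) or "")[:6].ljust(6, "0")
    return base + timedelta(microseconds=int(frac))


def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (gap, line_before, line_after) for each silence >= threshold.

    The 30-minute default is an arbitrary value chosen for illustration.
    """
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # wrapped or continuation line without a timestamp
        if prev_ts is not None and ts - prev_ts >= threshold:
            gaps.append((ts - prev_ts, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps


# Lines taken from the raw log excerpt quoted in this comment.
sample = [
    "2023-07-28T06:01:14.7899483Z at org.apache.hudi.metrics.Metrics.shutdown(Metrics.java:116)",
    "2023-07-28T06:01:14.7899862Z at java.lang.Thread.run(Thread.java:750)",
    "2023-07-28T07:27:41.0195722Z ##[error]The operation was canceled.",
    "2023-07-28T07:27:41.0242402Z ##[section]Finishing: FT client/spark-client",
]

for gap, before, after in find_gaps(sample):
    print(f"silent for {gap}")
    print(f"  last line before gap: {before}")
    print(f"  first line after gap: {after}")
```

On the excerpt above this reports a single gap, between the Thread.run stack frame and the cancellation message. Running it over a full raw log should help separate jobs that genuinely ran tests for 150 minutes from jobs that went silent well before being canceled.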
> Fix Azure CI timeout for UT FT other modules
> --------------------------------------------
>
> Key: HUDI-6820
> URL: https://issues.apache.org/jira/browse/HUDI-6820
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lokesh Jain
> Priority: Major
>