[ 
https://issues.apache.org/jira/browse/HUDI-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762464#comment-17762464
 ] 

Lokesh Jain commented on HUDI-6820:
-----------------------------------

Usually the error message is of the form:
{code:java}
,##[error]We stopped hearing from agent Hosted Agent. Verify the agent machine 
is running and has a healthy network connection. Anything that terminates an 
agent process, starves it for CPU, or blocks its network access can cause this 
error. For more information, see: 
https://go.microsoft.com/fwlink/?linkid=846610 {code}
Or of the form:
{code:java}
The job running on agent Azure Pipelines 7 ran longer than the maximum time of 
150 minutes. {code}
It could be an actual timeout, with tests running for the full 150 minutes. But 
in many cases the issue we are seeing is that the raw logs show the following:
{code:java}
2023-07-28T06:01:14.7898970Z    at 
java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_372]
2023-07-28T06:01:14.7899483Z    at 
org.apache.hudi.metrics.Metrics.shutdown(Metrics.java:116) 
~[hudi-client-common-0.14.0-SNAPSHOT.jar:0.14.0-SNAPSHOT]
2023-07-28T06:01:14.7899862Z    at java.lang.Thread.run(Thread.java:750) 
~[?:1.8.0_372]
2023-07-28T07:27:41.0195722Z ##[error]The operation was canceled.
2023-07-28T07:27:41.0242402Z ##[section]Finishing: FT client/spark-client
 {code}
There is a gap of more than an hour (about 1 hr 26 min in this log) between the 
last logged test output and the operation cancellation.
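
To check other raw logs for the same pattern, something like the following could 
be used. It is only a rough sketch: it assumes every log record starts with a 
timestamp in the format shown above, and the names parse_ts, largest_gap and 
SAMPLE are made up for illustration, not part of any Hudi or Azure tooling.

```python
import re
from datetime import datetime, timedelta

# Assumption: each raw log record starts with a UTC timestamp such as
# 2023-07-28T06:01:14.7898970Z (seven fractional digits) followed by whitespace.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z\s")


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    m = TS_RE.match(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    # datetime only holds microseconds, so keep the first six fractional digits.
    micros = int(m.group(2)[:6].ljust(6, "0"))
    return base + timedelta(microseconds=micros)


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence, or None."""
    best = None
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # wrapped or continuation line without a timestamp
        if prev_ts is not None:
            gap = ts - prev_ts
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


# The tail of the raw log quoted above, with the mail line wrapping undone.
SAMPLE = [
    "2023-07-28T06:01:14.7898970Z    at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_372]",
    "2023-07-28T06:01:14.7899483Z    at org.apache.hudi.metrics.Metrics.shutdown(Metrics.java:116)",
    "2023-07-28T06:01:14.7899862Z    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_372]",
    "2023-07-28T07:27:41.0195722Z ##[error]The operation was canceled.",
    "2023-07-28T07:27:41.0242402Z ##[section]Finishing: FT client/spark-client",
]

if __name__ == "__main__":
    gap, before, after = largest_gap(SAMPLE)
    print("longest silence:", gap)
    print("last line before:", before)
    print("first line after:", after)
```

On the sample lines the longest silence sits between the Thread.run stack frame 
and the "operation was canceled" record, which matches the gap described above.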

> Fix Azure CI timeout for UT FT other modules
> --------------------------------------------
>
>                 Key: HUDI-6820
>                 URL: https://issues.apache.org/jira/browse/HUDI-6820
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Lokesh Jain
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
