This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new 508edd4c4232 [SPARK-51630][CORE][TESTS] Remove `pids` size check from "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter"
508edd4c4232 is described below

commit 508edd4c4232e811efa4924068246dde95d565e4
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Thu Apr 10 15:24:53 2025 +0800

    [SPARK-51630][CORE][TESTS] Remove `pids` size check from "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter"
    
    ### What changes were proposed in this pull request?
    This PR removes the size check for `pids` from the test case titled "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter".
    
    ### Why are the changes needed?
    To avoid potential test flakiness. The test case "SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter" may fail in the following environment:
    
    ```
    Apple M3
    macOS 15.4
    zulu 17.0.14
    ```
    
    Running `build/sbt "core/testOnly org.apache.spark.ui.UISeleniumSuite org.apache.spark.executor.ProcfsMetricsGetterSuite"` produces:
    
    ```
    [info] UISeleniumSuite:
    [info] - all jobs page should be rendered even though we configure the scheduling mode to fair (4 seconds, 202 milliseconds)
    [info] - effects of unpersist() / persist() should be reflected (2 seconds, 845 milliseconds)
    [info] - failed stages should not appear to be active (2 seconds, 455 milliseconds)
    [info] - spark.ui.killEnabled should properly control kill button display (8 seconds, 610 milliseconds)
    [info] - jobs page should not display job group name unless some job was submitted in a job group (2 seconds, 546 milliseconds)
    [info] - job progress bars should handle stage / task failures (2 seconds, 610 milliseconds)
    [info] - job details page should display useful information for stages that haven't started (2 seconds, 292 milliseconds)
    [info] - job progress bars / cells reflect skipped stages / tasks (2 seconds, 304 milliseconds)
    [info] - stages that aren't run appear as 'skipped stages' after a job finishes (2 seconds, 201 milliseconds)
    [info] - jobs with stages that are skipped should show correct link descriptions on all jobs page (2 seconds, 188 milliseconds)
    [info] - attaching and detaching a new tab (2 seconds, 268 milliseconds)
    [info] - kill stage POST/GET response is correct (173 milliseconds)
    [info] - kill job POST/GET response is correct (141 milliseconds)
    [info] - stage & job retention (2 seconds, 661 milliseconds)
    [info] - live UI json application list (2 seconds, 187 milliseconds)
    [info] - job stages should have expected dotfile under DAG visualization (2 seconds, 126 milliseconds)
    [info] - stages page should show skipped stages (2 seconds, 651 milliseconds)
    [info] - Staleness of Spark UI should not last minutes or hours (2 seconds, 167 milliseconds)
    [info] - description for empty jobs (2 seconds, 242 milliseconds)
    [info] - Support disable event timeline (4 seconds, 585 milliseconds)
    [info] - SPARK-41365: Stage page can be accessed if URI was encoded twice (2 seconds, 306 milliseconds)
    [info] - SPARK-44895: Add 'daemon', 'priority' for ThreadStackTrace (2 seconds, 219 milliseconds)
    [info] ProcfsMetricsGetterSuite:
    [info] - testGetProcessInfo (1 millisecond)
    OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
    [info] - SPARK-34845: partial metrics shouldn't be returned (493 milliseconds)
    [info] - SPARK-45907: Use ProcessHandle APIs to computeProcessTree in ProcfsMetricsGetter *** FAILED *** (10 seconds, 149 milliseconds)
    [info]   The code passed to eventually never returned normally. Attempted 102 times over 10.036665625 seconds. Last failure message: 1 did not equal 3. (ProcfsMetricsGetterSuite.scala:87)
    [info]   org.scalatest.exceptions.TestFailedDueToTimeoutException:
    [info]   at org.scalatest.enablers.Retrying$$anon$4.tryTryAgain$2(Retrying.scala:219)
    [info]   at org.scalatest.enablers.Retrying$$anon$4.retry(Retrying.scala:226)
    [info]   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:313)
    [info]   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:312)
    [info]   at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:457)
    [info]   at org.apache.spark.executor.ProcfsMetricsGetterSuite.$anonfun$new$3(ProcfsMetricsGetterSuite.scala:87)
    ```
    
    Investigation showed that the `eventually` block does not always observe the state in which `pids.size` is 3; due to timing, it may instead first observe a state in which `pids.size` is 4. Since the checks for `pids.contains(currentPid)` and `pids.contains(child)` are the crucial ones, this PR removes the `pids` size check.
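    For context, the pattern behind the fix can be sketched in a minimal, self-contained form. The object and helper below are hypothetical (not Spark's actual `ProcfsMetricsGetter`): they compute a process tree with the JDK's `ProcessHandle` API and assert on membership, which is always true for the current pid, rather than on the exact size, which depends on how many descendants happen to be alive at sampling time.
    
    ```scala
    // Hypothetical sketch of the pattern behind the fix; not Spark's
    // ProcfsMetricsGetter. Uses only the JDK's ProcessHandle API (Java 9+).
    object ProcessTreeSketch {
      // Collect the current JVM's pid plus all live descendant pids,
      // analogous to what computeProcessTree does via ProcessHandle.
      def computeProcessTree(): Set[Long] = {
        val current = ProcessHandle.current()
        val pids = scala.collection.mutable.Set(current.pid())
        current.descendants().forEach(h => pids += h.pid())
        pids.toSet
      }
    
      def main(args: Array[String]): Unit = {
        val currentPid = ProcessHandle.current().pid()
        val pids = computeProcessTree()
        // Membership of the current pid is stable across samples...
        assert(pids.contains(currentPid))
        // ...whereas pids.size fluctuates with short-lived children, which is
        // why asserting an exact size made the test flaky.
        println(s"process tree size = ${pids.size}")
      }
    }
    ```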
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    - Pass GitHub Actions
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #50545 from LuciferYang/ProcfsMetricsGetterSuite.
    
    Lead-authored-by: yangjie01 <yangji...@baidu.com>
    Co-authored-by: YangJie <yangji...@baidu.com>
    Signed-off-by: yangjie01 <yangji...@baidu.com>
    (cherry picked from commit 6cdf54b341ee63039cb71734daafce7e628793e2)
    Signed-off-by: yangjie01 <yangji...@baidu.com>
---
 .../test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala  | 1 -
 1 file changed, 1 deletion(-)

diff --git a/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala b/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala
index 573540180e6c..77d782461a2e 100644
--- a/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala
+++ b/core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala
@@ -86,7 +86,6 @@ class ProcfsMetricsGetterSuite extends SparkFunSuite {
       val child = process.toHandle.pid()
       eventually(timeout(10.seconds), interval(100.milliseconds)) {
         val pids = p.computeProcessTree()
-        assert(pids.size === 3)
         assert(pids.contains(currentPid))
         assert(pids.contains(child))
       }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
