[
https://issues.apache.org/jira/browse/YUNIKORN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727747#comment-17727747
]
Wilfred Spiegelenburg commented on YUNIKORN-1278:
-------------------------------------------------
Captured the failure so we have something to investigate: based on
(approximately) 1.3 code base. Only 1 instance failed all other K8s setup passed
{code:java}
Running Suite: TestSimplePreemptor -
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor
1877[TestSimplePreemptor]
1878================================================================================================================
1879Random Seed: 1685476199
1880
1881Will run 2 of 2 specs
1882------------------------------
1883[BeforeSuite]
1884/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:55
1885 STEP: Create initial configMap if not exists @ 05/30/23 20:08:48.388
1886 STEP: Port-forward the scheduler pod @ 05/30/23 20:08:48.399
1887Forwarding from 127.0.0.1:9080 -> 9080
1888Forwarding from [::1]:9080 -> 9080
1889Port-forwarding traffic for yunikorn-scheduler... STEP: Enabling new
scheduling config @ 05/30/23 20:08:48.438
1890Handling connection for 9080
1891 STEP: Port-forward the scheduler pod @ 05/30/23 20:08:48.473
1892port-forward is already running STEP: create development namespace @
05/30/23 20:08:48.473
1893 STEP: Tainting some nodes.. @ 05/30/23 20:08:49.496
1894[BeforeSuite] PASSED [1.114 seconds]
1895------------------------------
1896SimplePreemptor Verify_basic_simple_preemption. Use case: Only one pod is
running and same pod has been selected as victim
1897/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:154
1898 STEP: Deploy the sleep pod sleepjob to the development namespace @
05/30/23 20:08:49.499
1899 STEP: Deploy the sleep pod sleepjob2 to the development namespace @
05/30/23 20:08:53.294
1900 [FAILED] in [It] -
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:172
@ 05/30/23 20:09:54.293
1901 STEP: Delete all sleep pods @ 05/30/23 20:09:54.293
1902 STEP: Deleting sleep pod: sleepjob @ 05/30/23 20:09:54.494
1903 STEP: Deleting sleep pod: sleepjob2 @ 05/30/23 20:09:54.894
1904• [FAILED] [65.795 seconds]
1905SimplePreemptor [It] Verify_basic_simple_preemption. Use case: Only one pod
is running and same pod has been selected as victim
1906/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:154
1907
1908 [FAILED] Unexpected error:
1909 <*errors.errorString | 0xc00044c680>: {
1910 s: "timed out waiting for the condition",
1911 }
1912 timed out waiting for the condition
1913 occurred
1914 In [It] at:
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:172
@ 05/30/23 20:09:54.293
1915------------------------------
1916SimplePreemptor Verify_simple_preemption. Use case: When 3 sleep pods (2
opted out, regular) are running, regular pod should be victim to free up
resources for 4th sleep pod
1917/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:179
1918 STEP: Deploy the sleep pod sleepjob to the development namespace @
05/30/23 20:09:55.294
1919 STEP: Deploy the sleep pod sleepjob2 to the development namespace @
05/30/23 20:09:59.295
1920 STEP: Deploy the sleep pod sleepjob3 to the development namespace @
05/30/23 20:10:03.493
1921 STEP: Deploy the sleep pod sleepjob4 to the development namespace @
05/30/23 20:10:07.494
1922 [FAILED] in [It] -
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:204
@ 05/30/23 20:14:09.094
1923 STEP: Delete all sleep pods @ 05/30/23 20:14:09.094
1924 STEP: Deleting sleep pod: sleepjob @ 05/30/23 20:14:09.295
1925 STEP: Deleting sleep pod: sleepjob2 @ 05/30/23 20:14:09.695
1926 STEP: Deleting sleep pod: sleepjob3 @ 05/30/23 20:14:10.093
1927 STEP: Deleting sleep pod: sleepjob4 @ 05/30/23 20:14:10.495
1928• [FAILED] [255.600 seconds]
1929SimplePreemptor [It] Verify_simple_preemption. Use case: When 3 sleep pods
(2 opted out, regular) are running, regular pod should be victim to free up
resources for 4th sleep pod
1930/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:179
1931
1932 [FAILED] Unexpected error:
1933 <*errors.errorString | 0xc00044c680>: {
1934 s: "timed out waiting for the condition",
1935 }
1936 timed out waiting for the condition
1937 occurred
1938 In [It] at:
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:204
@ 05/30/23 20:14:09.094
1939------------------------------
1940[AfterSuite]
1941/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:127
1942 STEP: Untainting some nodes @ 05/30/23 20:14:10.894
1943 STEP: Check Yunikorn's health @ 05/30/23 20:14:10.894
1944Handling connection for 9080
1945 STEP: Tearing down namespace: dev1719k @ 05/30/23 20:14:10.901
1946 STEP: Restoring the old config maps @ 05/30/23 20:14:11.297
1947[AfterSuite] PASSED [0.463 seconds]
1948------------------------------
1949[ReportAfterSuite] TestSimplePreemptor
1950/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_suite_test.go:37
1951[ReportAfterSuite] PASSED [0.001 seconds]
1952------------------------------
1953
1954Summarizing 2 Failures:
1955 [FAIL] SimplePreemptor [It] Verify_basic_simple_preemption. Use case:
Only one pod is running and same pod has been selected as victim
1956
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:172
1957 [FAIL] SimplePreemptor [It] Verify_simple_preemption. Use case: When 3
sleep pods (2 opted out, regular) are running, regular pod should be victim to
free up resources for 4th sleep pod
1958
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:204
1959
1960Ran 2 of 2 Specs in 322.975 seconds
1961FAIL! -- 0 Passed | 2 Failed | 0 Pending | 0 Skipped
1962--- FAIL: TestSimplePreemptor (322.98s) {code}
> Fix flaky E2E simple preemptor test suite runs
> ----------------------------------------------
>
> Key: YUNIKORN-1278
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1278
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: test - e2e
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
>
> e2e preemption test seems to be flaky. It is failing on different version of
> K8s for different PR's.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]