[ 
https://issues.apache.org/jira/browse/YUNIKORN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727747#comment-17727747
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-1278:
-------------------------------------------------

Captured the failure so we have something to investigate. The run is based on 
(approximately) the 1.3 code base. Only one instance failed; all other K8s setups passed.
{code:java}
Running Suite: TestSimplePreemptor - /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor
[TestSimplePreemptor] 
================================================================================================================
Random Seed: 1685476199

Will run 2 of 2 specs
------------------------------
[BeforeSuite] 
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:55
  STEP: Create initial configMap if not exists @ 05/30/23 20:08:48.388
  STEP: Port-forward the scheduler pod @ 05/30/23 20:08:48.399
Forwarding from 127.0.0.1:9080 -> 9080
Forwarding from [::1]:9080 -> 9080
Port-forwarding traffic for yunikorn-scheduler...  STEP: Enabling new scheduling config @ 05/30/23 20:08:48.438
Handling connection for 9080
  STEP: Port-forward the scheduler pod @ 05/30/23 20:08:48.473
port-forward is already running  STEP: create development namespace @ 05/30/23 20:08:48.473
  STEP: Tainting some nodes.. @ 05/30/23 20:08:49.496
[BeforeSuite] PASSED [1.114 seconds]
------------------------------
SimplePreemptor Verify_basic_simple_preemption. Use case: Only one pod is running and same pod has been selected as victim
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:154
  STEP: Deploy the sleep pod sleepjob to the development namespace @ 05/30/23 20:08:49.499
  STEP: Deploy the sleep pod sleepjob2 to the development namespace @ 05/30/23 20:08:53.294
  [FAILED] in [It] - /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:172 @ 05/30/23 20:09:54.293
  STEP: Delete all sleep pods @ 05/30/23 20:09:54.293
  STEP: Deleting sleep pod: sleepjob @ 05/30/23 20:09:54.494
  STEP: Deleting sleep pod: sleepjob2 @ 05/30/23 20:09:54.894
• [FAILED] [65.795 seconds]
SimplePreemptor [It] Verify_basic_simple_preemption. Use case: Only one pod is running and same pod has been selected as victim
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:154

  [FAILED] Unexpected error:
      <*errors.errorString | 0xc00044c680>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  occurred
  In [It] at: /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:172 @ 05/30/23 20:09:54.293
------------------------------
SimplePreemptor Verify_simple_preemption. Use case: When 3 sleep pods (2 opted out, regular) are running, regular pod should be victim to free up resources for 4th sleep pod
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:179
  STEP: Deploy the sleep pod sleepjob to the development namespace @ 05/30/23 20:09:55.294
  STEP: Deploy the sleep pod sleepjob2 to the development namespace @ 05/30/23 20:09:59.295
  STEP: Deploy the sleep pod sleepjob3 to the development namespace @ 05/30/23 20:10:03.493
  STEP: Deploy the sleep pod sleepjob4 to the development namespace @ 05/30/23 20:10:07.494
  [FAILED] in [It] - /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:204 @ 05/30/23 20:14:09.094
  STEP: Delete all sleep pods @ 05/30/23 20:14:09.094
  STEP: Deleting sleep pod: sleepjob @ 05/30/23 20:14:09.295
  STEP: Deleting sleep pod: sleepjob2 @ 05/30/23 20:14:09.695
  STEP: Deleting sleep pod: sleepjob3 @ 05/30/23 20:14:10.093
  STEP: Deleting sleep pod: sleepjob4 @ 05/30/23 20:14:10.495
• [FAILED] [255.600 seconds]
SimplePreemptor [It] Verify_simple_preemption. Use case: When 3 sleep pods (2 opted out, regular) are running, regular pod should be victim to free up resources for 4th sleep pod
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:179

  [FAILED] Unexpected error:
      <*errors.errorString | 0xc00044c680>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  occurred
  In [It] at: /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:204 @ 05/30/23 20:14:09.094
------------------------------
[AfterSuite] 
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:127
  STEP: Untainting some nodes @ 05/30/23 20:14:10.894
  STEP: Check Yunikorn's health @ 05/30/23 20:14:10.894
Handling connection for 9080
  STEP: Tearing down namespace: dev1719k @ 05/30/23 20:14:10.901
  STEP: Restoring the old config maps @ 05/30/23 20:14:11.297
[AfterSuite] PASSED [0.463 seconds]
------------------------------
[ReportAfterSuite] TestSimplePreemptor
/home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_suite_test.go:37
[ReportAfterSuite] PASSED [0.001 seconds]
------------------------------

Summarizing 2 Failures:
  [FAIL] SimplePreemptor [It] Verify_basic_simple_preemption. Use case: Only one pod is running and same pod has been selected as victim
  /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:172
  [FAIL] SimplePreemptor [It] Verify_simple_preemption. Use case: When 3 sleep pods (2 opted out, regular) are running, regular pod should be victim to free up resources for 4th sleep pod
  /home/runner/work/yunikorn-k8shim/yunikorn-k8shim/test/e2e/simple_preemptor/simple_preemptor_test.go:204

Ran 2 of 2 Specs in 322.975 seconds
FAIL! -- 0 Passed | 2 Failed | 0 Pending | 0 Skipped
--- FAIL: TestSimplePreemptor (322.98s)
{code}
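
Both specs fail with the same message, "timed out waiting for the condition". Going by the timestamps, the first spec gave up about 61 seconds after sleepjob2 was deployed and the second about 241.6 seconds after sleepjob4 was deployed. The log does not show which condition was being waited on at simple_preemptor_test.go:172 and :204. For anyone picking this up, here is a minimal sketch of how a poll-until-deadline loop produces that error. The helper pollUntil and the variable errWaitTimeout are hypothetical names for illustration, not the e2e framework's actual code.
{code:go}
package main

import (
	"errors"
	"fmt"
	"time"
)

// errWaitTimeout is a hypothetical stand-in carrying the same message
// that appears in the failure log.
var errWaitTimeout = errors.New("timed out waiting for the condition")

// pollUntil is a hypothetical helper: it calls cond every interval until
// cond returns true or the timeout has elapsed.
func pollUntil(interval, timeout time.Duration, cond func() bool) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errWaitTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Assumption: a condition that never becomes true, standing in for
	// whatever pod state the test was waiting for.
	neverTrue := func() bool { return false }
	err := pollUntil(10*time.Millisecond, 50*time.Millisecond, neverTrue)
	fmt.Println(err)
}
{code}
The point of the sketch is that the error only says the deadline passed. It carries no information about the pod or scheduler state, so the next step would be to capture that state when the wait fails.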

> Fix flaky E2E simple preemptor test suite runs
> ----------------------------------------------
>
>                 Key: YUNIKORN-1278
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1278
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: test - e2e
>            Reporter: Manikandan R
>            Assignee: Manikandan R
>            Priority: Major
>
> The e2e preemption test seems to be flaky. It fails on different versions of 
> K8s for different PRs.



