[
https://issues.apache.org/jira/browse/PHOENIX-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072381#comment-17072381
]
Istvan Toth commented on PHOENIX-5769:
--------------------------------------
Some more observations based on
[https://builds.apache.org/job/PreCommit-PHOENIX-Build/3691]:
* The Jenkins build hosts run the test suite about three times slower than a
modern machine (Ryzen 2700X + NVMe SSD). Running a single query takes ~13s vs
4.5s.
* mvn verify for the phoenix-core subproject with the current settings takes
about 4.5 hours, so the 5 hour default limit was really too optimistic,
considering that we have other subprojects, and that the downloads and Yetus
setup also take ~20 minutes before mvn verify even starts.
* However, the timeouts look very strange. It is as if Jenkins simply
disconnected from Maven and received no Maven output between 22:20 and 00:10,
when it aborted the build. The SplitSystemCatalogTests did actually run:
Jenkins processes their output files, as can be seen later in the log.
22:15:10 [WARNING] Tests run: 26, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 322.952 s - in org.apache.phoenix.schema.stats.NamespaceDisabledStatsCollectorIT
22:15:40 [WARNING] Tests run: 39, Failures: 0, Errors: 0, Skipped: 6, Time elapsed: 328.538 s - in org.apache.phoenix.schema.stats.NonTxStatsCollectorIT
22:15:51 [WARNING] Tests run: 26, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 363.015 s - in org.apache.phoenix.schema.stats.NamespaceEnabledStatsCollectorIT
22:20:05 [WARNING] Tests run: 78, Failures: 0, Errors: 0, Skipped: 12, Time elapsed: 530.679 s - in org.apache.phoenix.schema.stats.TxStatsCollectorIT
00:10:34 Build timed out (after 400 minutes). Marking the build as aborted.
00:10:35 Build was aborted
00:10:35 Archiving artifacts
00:10:38 [INFO]
00:10:38 [INFO] Results:
00:10:38 [INFO]
00:10:38 [WARNING] Tests run: 1079, Failures: 0, Errors: 0, Skipped: 65
00:10:38 [INFO]
00:10:38 [INFO]
00:10:38 [INFO] --- maven-failsafe-plugin:2.22.0:integration-test (SplitSystemCatalogTests) @ phoenix-core ---
00:10:38 [INFO]
00:10:38 [INFO] -------------------------------------------------------
00:10:38 [INFO] T E S T S
00:10:38 [INFO] -------------------------------------------------------
* There also seems to be some interference in minicluster shutdown/startup
between tests, but failsafe helpfully erases the information that may help us
track this down. I opened PHOENIX-5814 to fix the maven setting.
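For anyone who wants to look for the same pattern in other console logs, here is a minimal sketch of how the silent period could be detected from the timestamps. The function name `silence_gaps` and the 30 minute threshold are my own assumptions, not anything from Jenkins or Yetus. It also assumes the timestamps are in log order and that a timestamp going backwards means the log crossed midnight.

```python
from datetime import datetime, timedelta


def silence_gaps(stamps, threshold=timedelta(minutes=30)):
    """Return (start, end, gap) for consecutive stamps further apart than threshold.

    stamps: HH:MM:SS strings in log order (hypothetical input format).
    A timestamp smaller than its predecessor is treated as a midnight rollover.
    """
    gaps = []
    day_offset = timedelta(0)
    prev = None
    prev_raw = None
    for raw in stamps:
        current = datetime.strptime(raw, "%H:%M:%S") + day_offset
        if prev is not None and current < prev:
            # The clock went backwards, so assume the log crossed midnight.
            day_offset += timedelta(days=1)
            current += timedelta(days=1)
        if prev is not None and current - prev > threshold:
            gaps.append((prev_raw, raw, current - prev))
        prev, prev_raw = current, raw
    return gaps


# Timestamps taken from the console excerpt above.
stamps = ["22:15:10", "22:15:40", "22:15:51", "22:20:05",
          "00:10:34", "00:10:35", "00:10:38"]

for start, end, gap in silence_gaps(stamps):
    print(f"no output from {start} to {end} ({gap})")

# Time budget from the figures above: ~4.5 h for phoenix-core plus ~20 min setup.
used_minutes = 4.5 * 60 + 20
print(f"phoenix-core alone needs ~{used_minutes:.0f} of the 300 minute limit")
```

On the excerpt above this reports a single gap, between 22:20:05 and 00:10:34. The budget line only restates the arithmetic from the second bullet and does not include the other subprojects.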
> Phoenix precommit Flapping HadoopQA Tests in master
> ----------------------------------------------------
>
> Key: PHOENIX-5769
> URL: https://issues.apache.org/jira/browse/PHOENIX-5769
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Daniel Wong
> Assignee: Istvan Toth
> Priority: Major
> Attachments: PHOENIX-5769.master.v1.patch,
> PHOENIX-5769.master.v3.patch, consoleFull (1).html, consoleFull (2).html,
> consoleFull (3).html, consoleFull (4).html, consoleFull (5).html, consoleFull
> (6).html, consoleFull (7).html, consoleFull (8).html, consoleFull.html
>
>
> I was recently trying to commit changes to Phoenix for multiple issues and
> was asked to get clean HadoopQA runs. However, this took a huge effort, as I
> had to resubmit the same patch multiple times in order to get one "clean".
> Looking at the errors, the most common ones were 3 "Multiple regions on
> <hostname,regions>" and 3 for Apache infra issues (host shutdown), 1 for
> org.apache.hadoop.hbase.NotServingRegionException, 1 for
> SnapshotDoesNotExistException. See builds
> [https://builds.apache.org/job/PreCommit-PHOENIX-Build/] from the 3540s to
> the 3560s. In addition, I see multiple builds running simultaneously; limiting
> tests to running on 1 host should be configurable, right?
> I was also told by [~yanxinyi] that master was less likely to
> have issues getting a clean run than 4.x. FYI [~ckulkarni]
>