[
https://issues.apache.org/jira/browse/IGNITE-28729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Pavlov updated IGNITE-28729:
-----------------------------------
Description:
TC Bot experienced severe slowdown/unavailability while cleanup, missing fat
build reconciliation, and TeamCity build loading were running. Monitoring/logs
show Ignite page replacements, JVM pauses, long Ignite queries, and repeated
TeamCity 403 errors for old/inaccessible builds.
Scope:
* Tune cleaner to run in smaller chunks and avoid large removeAll batches.
* Prevent missing build reconciliation from repeatedly reloading
old/inaccessible builds.
* Treat TeamCity 403 "Not enough permissions to access build" as
permanent/skip-worthy instead of retrying indefinitely.
* Review Ignite memory split/data region sizing to avoid page replacements
under normal load.
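A minimal sketch of the "smaller chunks" idea from the first Scope item,
assuming the cleaner ends up calling a bulk removal such as
IgniteCache.removeAll(Set). To stay self-contained it uses a plain
java.util.Map as a stand-in for the cache; ChunkedCleaner, chunkSize, and
pauseMs are hypothetical names, not taken from the TC Bot code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: removes keys in small sorted batches instead of one large removeAll. */
final class ChunkedCleaner {
    /** Splits keys into sorted sets of at most chunkSize elements each. */
    static <K extends Comparable<K>> List<Set<K>> chunks(Iterable<K> keys, int chunkSize) {
        if (chunkSize <= 0)
            throw new IllegalArgumentException("chunkSize must be positive");

        List<Set<K>> res = new ArrayList<>();
        Set<K> cur = new TreeSet<>(); // Sorted set keeps a stable key order inside a batch.

        for (K k : keys) {
            cur.add(k);

            if (cur.size() == chunkSize) {
                res.add(cur);
                cur = new TreeSet<>();
            }
        }

        if (!cur.isEmpty())
            res.add(cur);

        return res;
    }

    /**
     * Removes keys batch by batch, optionally pausing between batches.
     * The Map stands in for the real cache (assumption).
     *
     * @return Number of batches issued.
     */
    static <K extends Comparable<K>> int removeInChunks(Map<K, ?> cache, Iterable<K> keys,
        int chunkSize, long pauseMs) {
        int batches = 0;

        for (Set<K> chunk : chunks(keys, chunkSize)) {
            cache.keySet().removeAll(chunk); // One bounded removal per batch.
            batches++;

            if (pauseMs > 0) {
                try {
                    Thread.sleep(pauseMs); // Leave room for other work between batches.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }

        return batches;
    }
}
```

The batch size and pause would need tuning against the observed page
replacements and JVM pauses; the values are not specified in this issue.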
Further investigation:
* Check interaction between cleaner and reconciliation to prevent deleted old
builds from being reintroduced.
* Investigate StatisticsCompacted.keys compatibility error involving
GridIntList.
* Profile expensive scans/queries in findMissingBuildsFromBuildRef,
getMissingBuilds, and suite history loading.
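One possible shape for the 403 handling and for the cleaner/reconciliation
interaction listed above is a shared skip set: builds that TeamCity reports
as inaccessible, or that the cleaner has deleted, are recorded once and
reconciliation consults the set before reloading. This is only a sketch.
BuildLoadPolicy, Outcome, record, markDeleted, and shouldLoad are
hypothetical names, and a real implementation would presumably need to
persist the set across restarts.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical policy: decides whether reconciliation should (re)load a build. */
final class BuildLoadPolicy {
    /** Result of one load attempt. */
    enum Outcome { LOADED, RETRY_LATER, SKIP_PERMANENTLY }

    /** Build IDs that must not be reloaded (in memory only in this sketch). */
    private final Set<Integer> skipped = ConcurrentHashMap.newKeySet();

    /** Maps an HTTP status to an outcome; only 403 is treated as permanent, as the issue proposes. */
    static Outcome classify(int httpStatus) {
        if (httpStatus >= 200 && httpStatus < 300)
            return Outcome.LOADED;

        if (httpStatus == 403)
            return Outcome.SKIP_PERMANENTLY;

        return Outcome.RETRY_LATER;
    }

    /** Records the result of a load attempt and remembers permanent failures. */
    Outcome record(int buildId, int httpStatus) {
        Outcome outcome = classify(httpStatus);

        if (outcome == Outcome.SKIP_PERMANENTLY)
            skipped.add(buildId);

        return outcome;
    }

    /** Called by the cleaner so that a deleted old build is not reintroduced. */
    void markDeleted(int buildId) {
        skipped.add(buildId);
    }

    /** Reconciliation checks this before requesting a build from TeamCity. */
    boolean shouldLoad(int buildId) {
        return !skipped.contains(buildId);
    }
}
```

With this in place, a build that returned 403 once is skipped on later
reconciliation passes, while transient failures remain eligible for retry.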
was:
TC Bot experienced severe slowdown/unavailability while cleanup, missing fat
build reconciliation, and TeamCity build loading were running. Monitoring/logs
show Ignite page replacements, JVM pauses, long Ignite queries, and repeated
TeamCity 403 errors for old/inaccessible builds.
Scope:
* Tune cleaner to run in smaller chunks and avoid large removeAll batches.
* Prevent missing build reconciliation from repeatedly reloading
old/inaccessible builds.
* Treat TeamCity 403 "Not enough permissions to access build" as
permanent/skip-worthy instead of retrying indefinitely.
* Move TeamCity/GitHub/Jira service tokens to ENV-based configuration and
verify effective credentials after restart.
* Review Ignite memory split/data region sizing to avoid page replacements
under normal load.
Further investigation:
* Check interaction between cleaner and reconciliation to prevent deleted old
builds from being reintroduced.
* Investigate StatisticsCompacted.keys compatibility error involving
GridIntList.
* Profile expensive scans/queries in findMissingBuildsFromBuildRef,
getMissingBuilds, and suite history loading.
> [TC Bot] Reduce performance degradation during cleanup and missing build
> reconciliation
> ---------------------------------------------------------------------------------------
>
> Key: IGNITE-28729
> URL: https://issues.apache.org/jira/browse/IGNITE-28729
> Project: Ignite
> Issue Type: Task
> Reporter: Ignite TC Bot
> Assignee: Dmitry Pavlov
> Priority: Major
> Labels: ise