sandynz opened a new issue, #38192: URL: https://github.com/apache/shardingsphere/issues/38192
## Background
Our GitHub Actions cache has reached the 10GB limit, retaining only data
from the last ~14 hours. Analysis reveals:
1. **Cache Strategy Issues**: 11 workflows use `actions/cache`, with 15
instances using `github.sha` as cache keys, causing new caches to be created
for every commit without reuse
2. **Low Cache Hit Rate**: The average cache entry is about 1GB, but the hit
rate is only slightly above 10%, even though `restore-keys` is configured
3. **Mixed Storage**: Third-party dependencies and project build artifacts
are stored together, increasing cache size
Related commits:
- #38189 Extract upload-e2e-artifacts and download-e2e-artifacts for
e2e-operation.yml and nightly-e2e-operation.yml
- #37946 Extracted shared actions for Operation E2E
- #37743 Simplified Maven cache for Operation E2E
## Goals
1. Reduce cache usage from 10GB to 2-3GB
2. Increase cache hit rate from 10%+ to 80%+
3. Extend cache retention from 14 hours to 7+ days
4. Reduce dependency download time by 30%
## Proposal
### 1. Multi-Tier Caching Strategy
| Tier | Storage | Content | Key Strategy |
|------|---------|---------|--------------|
| L1: Third-party deps | `setup-java cache: 'maven'` | `~/.m2/repository` (exclude project) | `hashFiles('**/pom.xml')` |
| L2: Project artifacts | `actions/upload-artifact` | Project jars, Docker images | `github.run_id` |
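One way to keep the two tiers apart inside a single build job is sketched below. It reuses the `org/apache/shardingsphere` path and the `/tmp/maven-repo-output.tar.gz` archive name that appear elsewhere in this proposal. The step names, the `@v4` version tags, and the `./mvnw` command with its flags are assumptions for illustration, not taken from the existing workflows.

```yaml
# Sketch only: step names, version tags and the Maven command are assumptions.
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-java@v4
    with:
      distribution: 'temurin'
      java-version: 17
      cache: 'maven'   # L1: third-party dependencies
  - name: Build and install project modules
    run: ./mvnw -B install -DskipTests
  - name: Move project artifacts out of the local repository
    run: |
      # Pack the project jars so they can be uploaded as the L2 artifact.
      tar -czf /tmp/maven-repo-output.tar.gz -C "$HOME/.m2/repository" org/apache/shardingsphere
      # Remove them so that a Maven cache saved at the end of the job
      # contains third-party dependencies only.
      rm -rf "$HOME/.m2/repository/org/apache/shardingsphere"
```

Removing the project directory before the job ends is what keeps the two tiers from mixing; without that step the project jars would still be present when the cache is saved.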
### 2. Specific Improvements
#### A. Replace `actions/cache` with `setup-java cache: 'maven'`
**Benefits**:
- Built-in multi-level restore-keys (`<platform>-maven-<hash>` →
`<platform>-maven-` → `maven-`)
- Automatically generates keys based on `pom.xml` hash
- Smarter cache lifecycle management
**Migration Example**:
```yaml
# Old approach
- uses: actions/[email protected]
  with:
    path: |
      ~/.m2/repository
      !~/.m2/repository/org/apache/shardingsphere
    key: ${{ needs.global-environment.outputs.GLOBAL_CACHE_PREFIX }}-maven-third-party-cache-${{ github.sha }}
# New approach
- uses: actions/[email protected]
  with:
    distribution: 'temurin'
    java-version: 17
    cache: 'maven'
```
#### B. Use artifact for Project Build Artifacts
    - uses: actions/upload-artifact@v4
      with:
        name: shardingsphere-build-${{ github.run_id }}
        path: /tmp/maven-repo-output.tar.gz
        retention-days: 1
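A downstream job in the same workflow run could then fetch the archive and unpack it into its own local repository. The following is a minimal sketch: the job names `build` and `e2e` are hypothetical, and it assumes the archive was created relative to the root of `~/.m2/repository`. Because `github.run_id` is identical for all jobs in one run, the artifact name resolves to the same value on both sides.

```yaml
# Sketch only: job names are hypothetical.
jobs:
  e2e:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: shardingsphere-build-${{ github.run_id }}
          path: /tmp
      - name: Restore project artifacts into the local Maven repository
        run: |
          mkdir -p "$HOME/.m2/repository"
          # Assumes the archive paths are relative to the repository root.
          tar -xzf /tmp/maven-repo-output.tar.gz -C "$HOME/.m2/repository"
```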
#### C. Abstract Shared Actions (Based on Proven Practices)
Building on the success of upload-e2e-artifacts and
download-e2e-artifacts, introduce:
- setup-build-environment/action.yml: Unified Java setup + Maven cache
- upload-build-artifacts/action.yml: Package and upload project build
artifacts
- download-build-artifacts/action.yml: Download and restore project build
artifacts
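As a rough illustration of the first shared action, `setup-build-environment/action.yml` could be a composite action that wraps `actions/setup-java`. The input name `java-version`, its default, and the `@v4` version tag are assumptions made for this sketch; the real action may expose different inputs.

```yaml
# setup-build-environment/action.yml
# Sketch only: the input name, default and version tag are assumptions.
name: Setup Build Environment
description: Unified Java setup with the built-in Maven dependency cache
inputs:
  java-version:
    description: JDK version to install
    required: false
    default: '17'
runs:
  using: composite
  steps:
    - uses: actions/setup-java@v4
      with:
        distribution: 'temurin'
        java-version: ${{ inputs.java-version }}
        cache: 'maven'
```

A workflow would then reference the action by its directory path in a `uses:` step and pass `java-version` through `with:` only when it needs something other than the default.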
### 3. Migration Plan
Phase 1 (High Priority):
- nightly-build.yml
- nightly-check.yml
- nightly-e2e-sql.yml, nightly-e2e-agent.yml
Phase 2:
- e2e-sql.yml, e2e-agent.yml
- ci.yml, nightly-ci.yml
- graalvm.yml
### 4. Compatibility
- Backward compatible: Preserve restore-keys from actions/cache for smooth
migration
- Commercial version support: Shared actions support behavior control via
environment variables
## Expected Impact
- ✅ Cache usage: 10GB → 2-3GB
- ✅ Cache hit rate: 10%+ → 80%+
- ✅ Cache retention: 14 hours → 7+ days
- ✅ Build speed: 30% reduction in dependency download time
## Discussion
Community feedback and suggestions are welcome!
