kgyrtkirk commented on PR #15106: URL: https://github.com/apache/druid/pull/15106#issuecomment-1770453353
> We cannot assume that every phase will always call install first, since that is already not the case today, e.g. our web console step does not https://github.com/apache/druid/blob/master/.github/workflows/static-checks.yml#L173-L177

That's pretty interesting :D A cache should never be taken for granted... For the record: it does run the tests with 0 downloaded artifacts, so:
* it's either working incorrectly even in that case
* or it doesn't care at all about what's in the cache

> The cache might be deleted for any reason, and in cases where we fall back to using the setup-java maven cache such as here https://github.com/apache/druid/blob/master/.github/workflows/unit-and-integration-tests-unified.yml#L80 it's possible the maven repo would contain artifacts that do not get built again.
> For example, if a PR removes a submodule, but some code depending on that submodule still exists, it could pass if the cache contains that artifact, but fail if the cache does not.

I agree that we should prefer to not keep artifacts produced from the current sources in the cache.

I think that we would need to cache more than what the proposed PR tries (include the node stuff, etc.) and possibly remove the attempts to avoid compilation in some jobs, because they use up too much cache space and cause churn.
So instead of some random maven commands which are supposed to work, consider:
* somehow configure maven to install artifacts into a secondary location (and also look them up there)
  * last time I checked this was impossible; I don't see a way to do it right now either
* configure the cache to exclude a set of directories during save
  * seems like `path` is a [glob](https://github.com/actions/cache/tree/main/save), so it could possibly be configured to include+exclude (I haven't tried it yet)
  * this will need an extra action in every job to maintain the cache
* purge the druid artifacts from the `.m2` at the end (before save)
  * this could be compatible with the `setup-java` we are using, so it could be less intrusive
  * failing to add this step will pollute the cache with artifacts from the project...
* change the project to not use maven local
  * not sure if this is possible...

I'm right now biased toward the `configure the cache to not include stuff` direction, for a couple of reasons:
* it may not pick up garbage
* less likely to make it suck

Some notes on the current state:
* a standard build produces `1.2G+2G+1.5G = 4.7G` of "cache stuff"
* it takes 2:30 to build the docker image, and 30sec to restore it
  * with some effort we could probably also cache the baseimage (possibly put it into the main cached image)
  * I think this should be done on-demand, instead of utilizing a cache

https://github.com/apache/druid/actions/runs/6568490107/job/17842978452?pr=15106
https://github.com/apache/druid/actions/runs/6568490107/job/17844540576?pr=15106
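For illustration, the "purge the druid artifacts from the `.m2` at the end (before save)" option could look roughly like the sketch below. It is not taken from the PR: the function name `purge_project_artifacts` is hypothetical, and the defaults (local repository at `~/.m2/repository`, project groupId `org.apache.druid`) are assumptions that would need to be checked against the actual workflow.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: remove locally installed project artifacts from a
# Maven local repository so they are not saved into the CI cache.
#   $1: Maven local repository  (assumed default: ~/.m2/repository)
#   $2: groupId of the project  (assumed default: org.apache.druid)
purge_project_artifacts() {
  repo="${1:-$HOME/.m2/repository}"
  group_id="${2:-org.apache.druid}"
  # Maven lays out artifacts by groupId, with dots mapped to directories.
  group_dir="$repo/$(printf '%s' "$group_id" | tr . /)"
  if [ -d "$group_dir" ]; then
    rm -rf -- "$group_dir"
    echo "purged $group_dir"
  else
    echo "nothing to purge at $group_dir"
  fi
}

# Example call, to be run as the last step before the cache is saved:
#   purge_project_artifacts "$HOME/.m2/repository" org.apache.druid
```

Third-party dependencies stay in place, so only artifacts built from the current sources are dropped. As noted above, any job that skips this step would still pollute the cache.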
