kgyrtkirk commented on PR #15106:
URL: https://github.com/apache/druid/pull/15106#issuecomment-1770453353

   > We cannot assume that every phase will always call install first, since 
that is already not the case today, e.g. our web console step does not 
https://github.com/apache/druid/blob/master/.github/workflows/static-checks.yml#L173-L177
   
   that's pretty interesting :D a cache should never be taken for granted...
   for the record: it does run the tests with 0 downloaded artifacts...so:
   * it's either working incorrectly even in that case
   * or it doesn't care at all about what's in the cache
   
   >The cache might be deleted for any reason, and in cases where we fall back 
to using the setup-java maven cache such as here 
https://github.com/apache/druid/blob/master/.github/workflows/unit-and-integration-tests-unified.yml#L80
 it's possible the maven repo would contain artifacts that do not get built 
again.
   
   > For example, if a PR removes a submodule, but some code depending on that 
submodule still exists, it could pass if the cache contains that artifact, but 
fail if the cache does not.
   
   I agree that we should prefer to not keep artifacts produced from the 
current sources in the cache
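
One possible guard for this, as a sketch only: right after the cache is restored, a job could check that the local repository holds nothing under the project's groupId, and fail early if it does. The function name is hypothetical; the default path `~/.m2/repository` and the groupId `org.apache.druid` are assumptions about the setup.

```shell
# Hypothetical helper: fail if the restored Maven repository already holds
# artifacts built from the project itself.
# $1: local repository root (assumed default: ~/.m2/repository)
# $2: project groupId (assumed: org.apache.druid)
check_no_project_artifacts() {
  local repo="${1:-$HOME/.m2/repository}"
  local group_id="${2:-org.apache.druid}"
  local group_dir
  local found
  # Maven lays out artifacts under the groupId with dots replaced by slashes.
  group_dir="$repo/$(printf '%s' "$group_id" | tr '.' '/')"
  # No directory for the groupId means nothing from the project is cached.
  [ -d "$group_dir" ] || return 0
  # Report at most 20 offending files to keep the job log readable.
  found="$(find "$group_dir" -type f | head -n 20)"
  if [ -n "$found" ]; then
    printf 'project artifacts found in the cache:\n%s\n' "$found" >&2
    return 1
  fi
  return 0
}
```

A job would call `check_no_project_artifacts` with no arguments to use the assumed defaults.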
   
   I think we would need to cache more than what the proposed PR tries to 
(including the node stuff, etc.) and possibly remove the attempts to avoid 
compilation in some jobs, because they use up too much cache space and cause 
churn.
   So instead of some random maven commands which are supposed to work, consider:
   
   * somehow configure maven to install artifacts into a secondary location 
(and also look them up there)
       * last time I checked this was impossible; I don't see a way to do it 
right now either
   * configure the cache to exclude a set of directories during save
       * seems like `path` is a 
[glob](https://github.com/actions/cache/tree/main/save), so it could possibly 
be configured to include+exclude (I haven't tried it yet)
       * this will need an extra action in every job to maintain the cache
   * purge the druid artifacts from the `.m2` at the end (before save)
       * this could be compatible with the `setup-java` we are using; so it 
could be less intrusive
       * failing to add this step will pollute the cache with artifacts from 
the project...
   * change the project to not use maven local 
      * not sure if this is possible...
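
As a rough sketch of the `purge the druid artifacts from the .m2` option, a job could run something like the following right before the cache is saved. It assumes the default local repository location `~/.m2/repository` and that every project artifact lives under the `org.apache.druid` groupId; the function name is made up for illustration.

```shell
# Hypothetical helper: delete the project's own artifacts from the Maven
# local repository so they never end up in the saved cache.
# $1: local repository root (assumed default: ~/.m2/repository)
# $2: project groupId (assumed: org.apache.druid)
purge_project_artifacts() {
  local repo="${1:-$HOME/.m2/repository}"
  local group_id="${2:-org.apache.druid}"
  local group_dir
  # Maven lays out artifacts under the groupId with dots replaced by slashes,
  # so sub-groups of the project groupId sit below the same directory.
  group_dir="$repo/$(printf '%s' "$group_id" | tr '.' '/')"
  if [ -d "$group_dir" ]; then
    rm -rf -- "$group_dir"
  fi
}
```

Third-party dependencies stay untouched, so the next run would still restore them from the cache and only rebuild the project's own modules.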
   
   I'm currently leaning toward the `configure the cache to not include stuff` 
direction, for a couple of reasons:
   * it may not pick up garbage
   * less likely to make it suck
   
   some notes on current state:
   * a standard build produces `1.2G+2G+1.5G = 4.7G` of "cache stuff"
   * it takes 2:30 to build the docker image, and 30 sec to restore it 
      * with some effort we could probably also cache the baseimage (possibly 
put it into the main cached image)
      * I think this should be done on-demand; instead of utilizing a cache
   
https://github.com/apache/druid/actions/runs/6568490107/job/17842978452?pr=15106
   
https://github.com/apache/druid/actions/runs/6568490107/job/17844540576?pr=15106
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

