lhotari commented on pull request #8485:
URL: https://github.com/apache/pulsar/pull/8485#issuecomment-724474645


   I'm currently experimenting with a solution where the build is split into multiple phases:
   
   1. license check, build Pulsar artifacts
   2. run unit tests
   3. build docker images
   4. run integration tests
   
   Each phase is a "job" in a GitHub Actions workflow. The unit test and integration test jobs run parallel sub-jobs by using the matrix feature of GitHub Actions.
   
   The challenge is the large size of Pulsar artifacts. Currently the 
~/.m2/repository/org/apache/pulsar files installed with "mvn install" are about 
2.5 GB in size.
   Breakdown of directory sizes in MB:
   https://gist.github.com/lhotari/3da3b220edd5684e54a005f358f3d045
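
A breakdown like this can be reproduced locally with a short script. The following is only a minimal sketch, not the command used to produce the gist: the helper name `dir_sizes_mb` is hypothetical, and it assumes that summing file sizes per immediate subdirectory is a good enough approximation of disk usage.

```python
import os
from pathlib import Path


def dir_sizes_mb(root):
    """Return {immediate subdirectory name: total file size in MB} for root.

    Hypothetical helper for illustration; symlinked files are skipped so
    they are not counted twice.
    """
    sizes = {}
    for child in Path(root).iterdir():
        if not child.is_dir():
            continue  # only directories are reported, loose files are ignored
        total = 0
        for dirpath, _dirnames, filenames in os.walk(child):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[child.name] = total / (1024 * 1024)
    return sizes


def print_breakdown(root):
    """Print the subdirectories of root, largest first."""
    ranked = sorted(dir_sizes_mb(root).items(), key=lambda kv: kv[1], reverse=True)
    for name, mb in ranked:
        print(f"{mb:10.1f}  {name}")
```

Calling `print_breakdown(os.path.expanduser("~/.m2/repository/org/apache/pulsar"))` after a local "mvn install" would list the largest modules first.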
   
   The large size of the artifacts seems to be caused by shaded and bundled dependencies.
   The bundled dependencies seem to come from the pulsar-io modules built with [nifi-nar-maven-plugin](https://github.com/apache/nifi-maven). This results in excessive IO during builds.
   
   The solution seems to be to create yet another Maven profile that builds just the essentials for running unit tests. Unit tests should be able to run without building the shaded jars, the distribution, or the nar modules with their embedded dependencies.
   
   Perhaps there's also a way to share the dependencies across the Pulsar IO 
nar modules. It seems like a waste to duplicate most of the same dependencies 
in each nar file.
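
To get a rough idea of how much is duplicated, the nar files can be inspected as zip archives. This is a sketch under assumptions: it assumes the nar plugin places bundled jars under `META-INF/bundled-dependencies/` inside each archive, and the function names `bundled_jars` and `duplicated_jars` are hypothetical. Jars are matched by file name only, so different builds of the same file name would be counted as duplicates.

```python
import zipfile
from collections import Counter

# Assumed location of bundled jars inside a nar archive.
BUNDLED_PREFIX = "META-INF/bundled-dependencies/"


def bundled_jars(nar_path):
    """Return {jar file name: uncompressed size in bytes} for one nar file."""
    with zipfile.ZipFile(nar_path) as nar:
        return {
            info.filename[len(BUNDLED_PREFIX):]: info.file_size
            for info in nar.infolist()
            if info.filename.startswith(BUNDLED_PREFIX)
            and info.filename.endswith(".jar")
        }


def duplicated_jars(nar_paths):
    """Return {jar name: (number of nars containing it, size in bytes)}
    for every jar that appears in more than one nar file."""
    counts = Counter()
    sizes = {}
    for nar_path in nar_paths:
        for name, size in bundled_jars(nar_path).items():
            counts[name] += 1
            sizes[name] = size  # last seen size; names are assumed to match content
    return {name: (n, sizes[name]) for name, n in counts.items() if n > 1}
```

Summing `(n - 1) * size` over the result gives an estimate of the bytes that a shared dependency layout could save.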

