Please comment on a couple of enhancements related to performing an optimized multi-module incremental build.
I am driving an effort at Ariba, Inc. to use these approaches to optimize their large product build system. The work is not yet complete, and I've been given a green light to contribute this work to the Apache community if it makes sense.

1. Given a set of identified leaf modules that are known to have changed, perform an optimal transitive rebuild only if a binary compatibility check finds that the leaf's artifacts are incompatible with the previously published artifacts. This implies that the build/analysis would be performed while traversing the dependency DAG, not by pre-computing a list as is done with ivy:buildlist. The analysis code should be pluggable, and certainly the binary compatibility checker plug-in would be available. I have a BCEL-based dependency checker that implements the JLS binary compatibility specification (Ch. 13).

2. Another plug-in could be a metadata validator, which would compare the declared ivy.xml dependencies against the external dependencies discovered by byte code analysis of just the product artifacts. The point of the validator is to detect whether the correct first-order dependencies are declared. For this to work, an index that maps Java package names to defining modules is consulted. If there is an ivy dependency that is not in the discovered set of available defining modules, there is an option to fail the build. The validator also leverages BCEL to get the external dependencies.

3. Another plug-in is an indexer tool that incrementally updates the Java package to owning module index mentioned above. This also leverages BCEL to report what packages the artifact publishes.

So to recap:

While actually doing the DFS in the dependency DAG, identify leaf components and call out to a module that:

a) performs the module build
b) calls out to the analysis modules that:
   b1) checks binary compatibility of the latest artifact vs.
       prior artifact and sets a flag that influences whether the next module in the DAG traversal needs to be transitively rebuilt.
   b2) updates the Java package to owning module index, again by leveraging BCEL, to keep it up to date
   b3) validates that the artifact's external first-order dependencies from byte code analysis match what's declared in ivy.xml, optionally failing the build.

I think that captures the gist of it.

I'm having a hard time explaining to the VPs why nobody else has thought of or implemented this. Any helpful feedback on that is appreciated.

Thanks,
Richard

-----Original Message-----
From: Richard Mauri [mailto:[email protected]]
Sent: Thursday, August 18, 2011 12:21 PM
To: [email protected]
Subject: Opinions on a possible optimization enhancement to Ivy buildlist to support a form of incremental build

I would like to get a discussion started around an enhancement I am considering contributing to ivy buildlist. I think it would be a valuable addition, and would like to hear from others before submitting a formal patch/contribution.

The goal of this contribution is to support a form of incremental build where, given a set of changed components, the buildlist would return a short list of components that must be visited for rebuild.

Consider a hierarchical component directory structure (the project) and a master build.xml with subdirectories containing ant/ivy components.

Parent
|
|-A
|-B
|-C
|-D
|-E

Consider a dependency relationship like (A depends on B etc.)

  A
 / \
B   C
 \ /
  D
  |
  E

Today, ivy buildlist can be used by this parent build to ultimately determine the ordered build sequence (in this case starting from most independent).

E D B C A

Now, given that the top-level build has available to it a set of components that have been touched (the Change-Set), I'd like to constrain the list returned by ivy buildlist so that only those components in the change set PLUS the components that directly depend on the change set are returned (the Rebuild-List).
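
To make the intended filtering concrete, here is a minimal Python sketch of the Rebuild-List computation. It is not Ivy code: the `DEPENDS_ON` map and the `build_order` and `rebuild_list` functions are hypothetical names invented for illustration, and the graph is assumed to be acyclic. It uses the diamond-shaped example graph with a change set of {E, C}.

```python
from typing import Dict, List, Set

# Hypothetical in-memory model of the example project:
# module name -> modules it depends on directly.
DEPENDS_ON: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def build_order(depends_on: Dict[str, List[str]]) -> List[str]:
    """Depth-first post-order: each module appears after its dependencies."""
    order: List[str] = []
    visited: Set[str] = set()

    def visit(module: str) -> None:
        if module in visited:
            return
        visited.add(module)
        for dep in depends_on[module]:
            visit(dep)
        order.append(module)

    for module in depends_on:
        visit(module)
    return order


def rebuild_list(depends_on: Dict[str, List[str]], change_set: Set[str]) -> List[str]:
    """Changed modules plus their direct dependents, kept in build order."""
    selected = set(change_set)
    for module, deps in depends_on.items():
        # A direct dependent is any module that names a changed module
        # among its first-order dependencies.
        if any(dep in change_set for dep in deps):
            selected.add(module)
    return [m for m in build_order(depends_on) if m in selected]


print(build_order(DEPENDS_ON))               # ['E', 'D', 'B', 'C', 'A']
print(rebuild_list(DEPENDS_ON, {"E", "C"}))  # ['E', 'D', 'C', 'A']
```

The sketch filters the full build order rather than walking only from the change set, so the relative ordering of the returned modules is the same as in an unfiltered build.
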
The reason the direct dependents are included is to catch API contract violations, which must be caught during the rebuild.

Change-Set: {E,C}
Rebuild-List: E D C A

I assert that a subsequent subant iteration over the Rebuild-List would be more efficient than iterating over the entire project (imagine hundreds of components in the project).

The enhancement to buildlist would be to add an attribute that represents the Change-Set and to enhance the DFS DAG traversal reporting to return only the Change-Set and the components that depend directly on the Change-Set.

Thoughts?

Thanks,
Rich
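
For comparison, the traversal described in the recap of the follow-up message (build a module, check its binary compatibility, and only then decide whether its dependents must be rebuilt) might look roughly like the Python sketch below. This is one reading of that recap, not the actual implementation: `incremental_rebuild`, `build`, and `is_compatible` are hypothetical names, the two callbacks stand in for the real build step and the BCEL-based checker, and the graph is assumed to be acyclic.

```python
from typing import Callable, Dict, List, Set

# Hypothetical model of the example project:
# module name -> modules it depends on directly.
DEPENDS_ON: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def incremental_rebuild(
    depends_on: Dict[str, List[str]],
    changed: Set[str],
    build: Callable[[str], None],
    is_compatible: Callable[[str], bool],
) -> List[str]:
    """Rebuild changed modules, propagating upward only past incompatible ones."""
    rebuilt: List[str] = []
    incompatible: Set[str] = set()  # the "flag" set by the compatibility check
    visited: Set[str] = set()

    def visit(module: str) -> None:
        if module in visited:
            return
        visited.add(module)
        for dep in depends_on[module]:
            visit(dep)  # dependencies are decided before the module itself
        needs_build = module in changed or any(
            dep in incompatible for dep in depends_on[module]
        )
        if not needs_build:
            return
        build(module)
        rebuilt.append(module)
        # Pluggable analysis step: compare the new artifact with the prior one.
        if not is_compatible(module):
            incompatible.add(module)

    for module in depends_on:
        visit(module)
    return rebuilt


# E changed and broke compatibility; D rebuilds cleanly, so B, C and A are skipped.
result = incremental_rebuild(
    DEPENDS_ON, {"E"}, build=lambda m: None, is_compatible=lambda m: m != "E"
)
print(result)  # ['E', 'D']
```

Unlike a pre-computed list, the set of modules visited here depends on the outcome of each compatibility check, which is why the decision has to be made during the traversal itself.
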
