2013/8/9 Romain Manni-Bucau <[email protected]>:
> When I tested on TomEE the gain was ridiculous too, so maybe not the first place
> to hack on to make Maven fast ;)
> On 9 Aug 2013 18:36, "Jason van Zyl" <[email protected]> wrote:
>> And what's the net difference then, before and after trying to parallelize the
>> classloading?

I'll read up on the Java 7 classloading this weekend. I think this really depends on how we're able to exploit it. Our domain is partitioned into lots of small classloaders, so there should be a bit of potential. How did you try to partition your classloading in TomEE?

From what I've seen of "asm" performance, class loading is mostly IO. Within a single classloader I think you'd need some kind of preemptive/recording-based strategy (record the class loading order from one invocation and re-use it in another). Implementing that in the ClassRealm class in classworlds should be almost trivial, and unless someone beats me to it, I'll do that over a few glasses of red wine some time. Parallel construction of multiple classloaders should have some potential.

As for "making maven fast", well, that's a topic I've spent considerable time & energy on. Apart from class loading, the computationally intensive parts of the maven core are basically pom loading, pom merging and artifact resolution. Class loading and artifact resolution are the big ones; the actual XML parsing/merging is really not that much.

Most of the inefficiencies are in plugins. And sometimes there are inefficiencies related to layering. An example of this is maven-install-plugin; it uses maven core to install (copy) the jar file into the local repository, but then it re-reads the file to calculate SHA1/MD5 checksums. Until recently it actually read the files 3 times; I just reduced that to 2 times.

I have been profiling the heck out of a bunch of builds, and the big stuff is in the plugins. For maven core I think it is safe to say you'll need to look for algorithmic improvements to gain anything significant; stuff like requesting a bunch of artifacts from the remote repository in one HTTP request comes to mind.
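
To make the install-plugin point concrete, here is a minimal sketch of computing both checksums in a single pass over a file, using only the JDK's MessageDigest. This is not the plugin's actual code: the class name SinglePassChecksums and the method md5AndSha1 are made up for illustration, and a real fix would ideally feed the digests while the file is being copied rather than in a separate read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not taken from maven-install-plugin.
public class SinglePassChecksums {

    /** Reads the file once and returns { md5Hex, sha1Hex }. */
    public static String[] md5AndSha1(Path file)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // Every chunk read from disk is fed to both digests,
            // so the file is only traversed once.
            while ((n = in.read(buffer)) != -1) {
                md5.update(buffer, 0, n);
                sha1.update(buffer, 0, n);
            }
        }
        return new String[] { toHex(md5.digest()), toHex(sha1.digest()) };
    }

    // Lower-case hex encoding, the usual format of .md5/.sha1 files.
    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Calling SinglePassChecksums.md5AndSha1(path) yields both hex strings from one read of the file, which is the kind of layering fix meant above.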
One could work on parallelizing classloading, which should be doable. Other than that there's not much left.

As a theory for my really long runs in the woods, I consider parallelizing the entire pom loading, interpolation and artifact resolution process. Unfortunately the massive amount of mutable state within the maven model and the maven core makes this infeasible. Simply put: the availability of setters all over the place allows the construction of models/data to decay into spaghetti. Such spaghetti also creates wasted computation, since the same values are recalculated repeatedly. It also hinders parallelization. Maven core has its share of such spaghetti.

On my last long run in the woods I contemplated writing another totally immutable layer of objects beneath the current objects and simply transferring all the state to the current model objects when done. But we're looking at quite a tremendous effort to catch that last second of wasted computation - better spend that energy optimizing plugins :)

On the non-radical front, parallel classloading is probably the last "simple" thing that can be optimized in core. For multi-module builds there's the potential of re-using state/data computed in one module for the next. Surefire could conceivably keep the forked process alive between modules if the classpath is only expanded in the next run. Or surefire could run an additional invocation early in the lifecycle and start the forked VM while the compiler plugin is running (if it forks, which it can decide early); although the actual .class files may not be available, it knows everything it needs to know.

Kristian

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
