hmm, thinking a bit of it we have a nice default behavior = implicit scanning for rest so i guess we can avoid link by default.
still to avoid it we can use web.xml... so a flag for tck should be enough....but if you do so some tests need to be fixed. - Romain 2012/4/16 David Blevins <[email protected]> > Ok, fixed the getAnnotatedClasses() bug. Added XBEAN-206 and more code > like it to JarArchive. > > As well i've split out the 'link()' method so we can see the times of the > related functionality: > > 2.87 = scan > 1.62 = linkSubclasses > 4.05 = linkImplementations > 0.03 = linkMetaAnnotations > 8.57 = total > > (times are in seconds) > > Most the cost is linkImplementations for enabling 'findImplementations' > methods, which we don't even use. So those can easily go without debate. > > The linkMetaAnnotations call is negligible, even still, we could only call > it if there are meta-annotations in the app. We can happily disable that > unless it's needed. > > That leaves linkSubclasses which at the very least should be disableable. > > > -David > > > On Apr 15, 2012, at 2:16 PM, Romain Manni-Bucau wrote: > > > added a patch: https://issues.apache.org/jira/browse/XBEAN-206 > > > > can you test it against your mini bench please? > > > > - Romain > > > > > > 2012/4/15 Romain Manni-Bucau <[email protected]> > > > >> Hi David, > >> > >> for me only 1 should be done. > >> > >> well, i didnt understand the whole mail: why do we need to browse the > zip > >> file multiple times? only for the getbytecode method? i think we can get > >> rid of multiple scannings and keep the link() features. Another point is > >> getAnnotatedClasses() should be able to return sthg even when link() was > >> not called. > >> > >> If the zip parsing is badly done by the jre (if it doesn't use fseek for > >> instance) we simply have to rewrite it. > >> > >> well in > org.apache.xbean.finder.archive.JarArchive.JarIterator#JarIterator > >> why Jarfile is not used when possible? > >> > >> - Romain > >> > >> > >> > >> 2012/4/15 David Blevins <[email protected]> > >> > >>> (decision and 4 choices at the bottom -- feedback requested) > >>> > >>> I did some studying of the zip file format and determined that part of > >>> the reworked xbean-finder Archive API was plain wrong. > >>> > >>> Using maps as an analogy here is how we were effectively scanning zips > >>> (jars): > >>> > >>> "Style A" > >>> > >>> Map<String, InputStream> zip = new HashMap<String, InputStream>(); > >>> for (String entryName : zip.keySet()) { > >>> InputStream inputStream = zip.get(entryName); > >>> // scan the stream > >>> } > >>> > >>> While there is some indexing in a zip file in what is called the > central > >>> directory, it isn't nearly good enough to support this type of random > >>> access. The actual reading is done in C code when a zip file is > randomly > >>> accessed in this way, but basically it seems about as slow as starting > at > >>> the beginning of a stream and reading ahead in the stream until the > index > >>> is hit and then reading for "real". I doubt it's doing exactly that > as in > >>> C code you should be able to start in the middle of a file, but let's > put > >>> it this way... at the very minimum you are reading the Central > Directory > >>> each and every single random access. > >>> > >>> I've reworked the Archive API so that when you iterate over it, you > >>> iterate over actual entries. Using map again as an analogy it looks > like > >>> this now: > >>> > >>> "Style B" > >>> > >>> for (Map.Entry<String, InputStream> entry : zip.entrySet()) { > >>> String className = entry.getKey(); > >>> InputStream inputStream = entry.getValue(); > >>> // scan the stream > >>> } > >>> > >>> > >>> Using Altassian Confluence as a driver to benchmark only the call to > 'new > >>> AnnotationFinder(archive)' which is where our scanning happens, here > are > >>> the results before (style A) and after (style b): > >>> > >>> > >>> StyleA: 8.89s - 9.02s > >>> StyleB: 3.33s - 3.52s > >>> > >>> Now unfortunately the 'link()' call used to resolve parent classes that > >>> are not in the jars scanned as well as to resolve meta-annotations > still > >>> needs the StyleA random access. These things don't involve going in > "jar > >>> order", but definitely are random access. With the new and improved > code > >>> that scans Confluence at around 3.4s, here is the time with 'link()' > added > >>> > >>> StyleB scan + StyleA link: 15.61s - 15.75s > >>> > >>> That link() call adds another 12 seconds. Roughly equivalent to the > cost > >>> of 4 more scans. > >>> > >>> So the good news is we don't need the link. We very much like the > link, > >>> but we don't need the link for Java EE 6 certification. We have two > very > >>> excellent features associated with that linking. > >>> > >>> - Meta-Annotations > >>> - Discovery JAX-RS of non-annotated Application subclasses (Application > >>> is a concrete class you subclass, like HttpServlet) > >>> > >>> We have more or less 4 kinds of choices on how we deal with this: > >>> > >>> 1. Link() is always called. (always slow, extra features always > enabled) > >>> 2. Link() can be disabled but is enabled by default. (slow, > w/optional > >>> fast flag, extra features enabled by default) > >>> 3. Link() can be enabled but is disabled by default. (fast, > w/optional > >>> slow flag, extra features disabled by default) > >>> 4. Link is never enabled. (always fast, extra features permanently > >>> disabled) > >>> > >>> > >>> Thoughts, preferences? > >>> > >>> > >>> -David > >>> > >>> > >> > >
