Re: XBean and scanning performance

Romain Manni-Bucau Sun, 15 Apr 2012 16:39:50 -0700

hmm,

thinking about it a bit, we have a nice default behavior (implicit scanning
for REST), so i guess we can avoid link by default.


still to avoid it we can use web.xml...

so a flag for the TCK should be enough... but if you do so, some tests need
to be fixed.

- Romain
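
A minimal sketch of what such a flag could look like, assuming the reading where linking is off by default and switched on explicitly (option 3 in the quoted list below). The class name, the property name `example.scan.link` and the `phases()` helper are hypothetical and only illustrate the idea; they are not xbean-finder or OpenEJB API. The phase names are the ones from the timing breakdown quoted below, minus linkImplementations, which that message says is not used.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkFlagSketch {

    // Hypothetical property name, chosen only for this illustration.
    public static final String LINK_PROPERTY = "example.scan.link";

    // Linking is opt-in: an absent or non-"true" value means "scan only".
    public static boolean linkEnabled() {
        return Boolean.parseBoolean(System.getProperty(LINK_PROPERTY, "false"));
    }

    // Returns the phases that would run for the current flag value.
    public static List<String> phases() {
        List<String> phases = new ArrayList<String>();
        phases.add("scan");
        if (linkEnabled()) {
            phases.add("linkSubclasses");
            phases.add("linkMetaAnnotations");
        }
        return phases;
    }

    public static void main(String[] args) {
        System.out.println("phases: " + phases());
    }
}
```

Running it with `-Dexample.scan.link=true` would include the link phases; without the property only the scan runs.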


2012/4/16 David Blevins

> Ok, fixed the getAnnotatedClasses() bug.  Added XBEAN-206 and more code
> like it to JarArchive.
>
> I've also split out the 'link()' method so we can see the times of the
> related functionality:
>
> 2.87 = scan
> 1.62 = linkSubclasses
> 4.05 = linkImplementations
> 0.03 = linkMetaAnnotations
> 8.57 = total
>
> (times are in seconds)
>
> Most of the cost is linkImplementations for enabling 'findImplementations'
> methods, which we don't even use.  So those can easily go without debate.
>
> The linkMetaAnnotations call is negligible; even so, we could call it only
> if there are meta-annotations in the app.  We can happily disable that
> unless it's needed.
>
> That leaves linkSubclasses which at the very least should be disableable.
>
>
> -David
>
>
> On Apr 15, 2012, at 2:16 PM, Romain Manni-Bucau wrote:
>
> > added a patch: https://issues.apache.org/jira/browse/XBEAN-206
> >
> > can you test it against your mini bench please?
> >
> > - Romain
> >
> >
> > 2012/4/15 Romain Manni-Bucau
> >
> >> Hi David,
> >>
> >> for me only 1 should be done.
> >>
> >> well, i didn't understand the whole mail: why do we need to browse the
> >> zip file multiple times? only for the getbytecode method? i think we can
> >> get rid of multiple scannings and keep the link() features. Another point
> >> is getAnnotatedClasses() should be able to return something even when
> >> link() was not called.
> >>
> >> If the zip parsing is badly done by the jre (if it doesn't use fseek for
> >> instance) we simply have to rewrite it.
> >>
> >> well, in org.apache.xbean.finder.archive.JarArchive.JarIterator#JarIterator
> >> why is JarFile not used when possible?
> >>
> >> - Romain
> >>
> >>
> >>
> >> 2012/4/15 David Blevins
> >>
> >>> (decision and 4 choices at the bottom -- feedback requested)
> >>>
> >>> I did some studying of the zip file format and determined that part of
> >>> the reworked xbean-finder Archive API was plain wrong.
> >>>
> >>> Using maps as an analogy here is how we were effectively scanning zips
> >>> (jars):
> >>>
> >>>   "Style A"
> >>>
> >>>   Map<String, InputStream> zip = new HashMap<String, InputStream>();
> >>>   for (String entryName : zip.keySet()) {
> >>>       InputStream inputStream = zip.get(entryName);
> >>>       // scan the stream
> >>>   }
> >>>
> >>> While there is some indexing in a zip file in what is called the
> >>> central directory, it isn't nearly good enough to support this type of
> >>> random access.  The actual reading is done in C code when a zip file
> >>> is randomly accessed in this way, but basically it seems about as slow
> >>> as starting at the beginning of a stream and reading ahead in the
> >>> stream until the index is hit and then reading for "real".  I doubt
> >>> it's doing exactly that as in C code you should be able to start in
> >>> the middle of a file, but let's put it this way... at the very minimum
> >>> you are reading the Central Directory each and every single random
> >>> access.
> >>>
> >>> I've reworked the Archive API so that when you iterate over it, you
> >>> iterate over actual entries.  Using map again as an analogy it looks
> >>> like this now:
> >>>
> >>>   "Style B"
> >>>
> >>>   for (Map.Entry<String, InputStream> entry : zip.entrySet()) {
> >>>       String className = entry.getKey();
> >>>       InputStream inputStream = entry.getValue();
> >>>       // scan the stream
> >>>   }
> >>>
> >>>
> >>> Using Atlassian Confluence as a driver to benchmark only the call to
> >>> 'new AnnotationFinder(archive)' which is where our scanning happens,
> >>> here are the results before (Style A) and after (Style B):
> >>>
> >>>
> >>> StyleA: 8.89s - 9.02s
> >>> StyleB: 3.33s - 3.52s
> >>>
> >>> Now unfortunately the 'link()' call used to resolve parent classes that
> >>> are not in the jars scanned as well as to resolve meta-annotations
> >>> still needs the StyleA random access.  These things don't involve going
> >>> in "jar order", but definitely are random access.  With the new and
> >>> improved code that scans Confluence at around 3.4s, here is the time
> >>> with 'link()' added
> >>>
> >>> StyleB scan + StyleA link: 15.61s - 15.75s
> >>>
> >>> That link() call adds another 12 seconds.  Roughly equivalent to the
> >>> cost of 4 more scans.
> >>>
> >>> So the good news is we don't need the link.  We very much like the
> >>> link, but we don't need the link for Java EE 6 certification.  We have
> >>> two very excellent features associated with that linking.
> >>>
> >>> - Meta-Annotations
> >>> - JAX-RS discovery of non-annotated Application subclasses (Application
> >>>   is a concrete class you subclass, like HttpServlet)
> >>>
> >>> We have more or less 4 kinds of choices on how we deal with this:
> >>>
> >>> 1. Link() is always called.  (always slow, extra features always enabled)
> >>> 2. Link() can be disabled but is enabled by default.  (slow, w/optional
> >>>    fast flag, extra features enabled by default)
> >>> 3. Link() can be enabled but is disabled by default.  (fast, w/optional
> >>>    slow flag, extra features disabled by default)
> >>> 4. Link is never enabled.  (always fast, extra features permanently
> >>>    disabled)
> >>>
> >>>
> >>> Thoughts, preferences?
> >>>
> >>>
> >>> -David
> >>>
> >>>
> >>
>
>
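
For reference, the "Style A" / "Style B" contrast from the quoted message can be sketched against the standard java.util.jar API. This is only an illustration of the two access patterns: the class and method names (JarScanStyles, scanByName, scanInOrder, writeSampleJar) are hypothetical and this is not the xbean-finder code. It also makes no claim about timings, which will depend on how the archive is actually opened and read.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

public class JarScanStyles {

    // "Style A": collect the entry names first, then look each one up by name.
    public static long scanByName(File jar) throws IOException {
        long total = 0;
        try (JarFile jarFile = new JarFile(jar)) {
            List<String> names = new ArrayList<String>();
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            for (String name : names) {
                JarEntry entry = jarFile.getJarEntry(name); // one lookup per name
                try (InputStream in = jarFile.getInputStream(entry)) {
                    total += drain(in); // "scan the stream"
                }
            }
        }
        return total;
    }

    // "Style B": walk the entries in jar order and read each one as it comes.
    public static long scanInOrder(File jar) throws IOException {
        long total = 0;
        try (JarInputStream in = new JarInputStream(new FileInputStream(jar))) {
            while (in.getNextJarEntry() != null) {
                total += drain(in); // read() returns -1 at the end of the current entry
            }
        }
        return total;
    }

    // Stand-in for the real bytecode scan: just counts the bytes read.
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[4096];
        long count = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            count += read;
        }
        return count;
    }

    // Builds a small throwaway jar with two dummy entries (100 and 250 bytes).
    public static File writeSampleJar() throws IOException {
        File jar = File.createTempFile("sample", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("a/Foo.class"));
            out.write(new byte[100]);
            out.closeEntry();
            out.putNextEntry(new JarEntry("b/Bar.class"));
            out.write(new byte[250]);
            out.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        File jar = writeSampleJar();
        System.out.println("by name:  " + scanByName(jar) + " bytes");
        System.out.println("in order: " + scanInOrder(jar) + " bytes");
    }
}
```

Both methods visit the same entries and read the same bytes; the difference is only whether each entry is reached through a lookup by name or handed over by the iteration itself.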
