I tried sending this previously, but Stefano seems not to have received it (at least I got no response or followup) and (as I suspected) the list definitely didn't because I wasn't subscribed. I've now subscribed to the digest version so that I'm not so excluded from japitools-related discussion on harmony-dev.

My response to Stefano's questions is below. I've seen further discussion on japitools in the harmony-dev archives and I'll send separate mail addressing the japi-related bits of that.

Stuart.

---------- Forwarded message ----------
From: Stuart Ballard <[EMAIL PROTECTED]>
Date: Nov 3, 2006 12:29 PM
Subject: Re: Japi diffs for harmony
To: Stefano Mazzocchi <[EMAIL PROTECTED]>
Cc: harmony-dev@incubator.apache.org

(this will probably bounce from harmony-dev, I'm not subscribed; feel free to forward it)

On 11/3/06, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:
Stuart, I'm a *huge* fan of your Japi diffs (I admit that I have a sort of obsessive compulsion about it, getting happy with every small percentage increase and sad when it decreases).
*grin* I do the same thing for Classpath. To be honest I find the Harmony results confusing myself; sometimes they seem to bounce up and down with the same errors getting added and removed over and over again. I'm dependent on an unofficial source for Japi files of Harmony, though, because the project doesn't provide official nightly builds as far as I know - and my suspicion is that perhaps different variations are getting built and I'm getting whichever happens to be the last one that was built. I haven't gotten around to asking, though, sorry about that. If I could grab an actual tarball of the very latest build at any given time, I'd like that better. See http://builder.classpath.org/dist/ for what Classpath does, which is perfect for me.
That said, I have a few questions to ask, see below.
Great, I'm always happy to clarify this stuff :)
It tells me there are 10 packages and 52 classes missing. Great. Then it gives me a percentage. How is that percentage calculated? Is it by class coverage? method coverage? package coverage? a weighted sum of all the above?
The percentages are admittedly difficult to define in a way that's truly meaningful, but the idea of *having* a percentage was too compelling for me not to try to come up with a definition. So here's how Japi calculates it:

First of all, anything whatsoever that's neither public nor protected is discarded.

Each class or interface (or enum or annotation) counts as one item.

Each field, method or constructor within a class counts as one item. *Including* inherited fields and methods, regardless of whether they get overridden in the class itself.

Each of these items gets classified as good, bad, minor or missing. If an entire class is missing, *all* its members are classified as missing. Likewise if an entire package is classified as missing, all the classes within it and all their members are counted as missing. If there is more than one error on the same item (eg a class is incorrectly abstract, doesn't implement the right interface *and* has the wrong serialVersionUID) it still only counts as bad once (the serialVersionUID is a "minor" error but the "bad" errors take precedence).

The percentages are then calculated based on the total count in the "left-hand" API. In other words if a particular package in the RI had only one single interface with four methods, of which Harmony implemented two correctly and one incorrectly, the percentages for that package would be:

60% good (the interface itself + 2 methods)
20% bad (one method)
20% missing (one method)

If Harmony also incorrectly added an extra method to the interface, it would be 60% good, 20% bad, 20% missing, 20% abs. add. (Because while adding members in general is legal, because it's backwards compatible, adding methods to interfaces or abstract classes that have public or protected constructors is not). And yes, that's a total of 120%, because it's a percentage of the left-hand API, and an abs-add error indicates something "outside" that API.
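For anyone who wants to check the arithmetic, here is a minimal Python sketch of the counting rules described above. It is not Japi's actual code: the item names, the `classify` and `percentages` helpers, and the severity ranking (beyond "bad beats minor", which is stated above) are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical ranking: when one item has several errors, the highest wins.
# Only "bad takes precedence over minor" comes from the description above.
SEVERITY = {"good": 0, "minor": 1, "bad": 2, "missing": 3}

def classify(errors):
    """Collapse every error reported on one item into a single category."""
    return max(errors, key=SEVERITY.__getitem__, default="good")

def percentages(left_api, abs_added=0):
    """left_api maps each public/protected item of the left-hand API to the
    list of errors found for it in the right-hand API. abs_added counts
    illegal additions, which lie outside the left-hand API."""
    counts = Counter(classify(errors) for errors in left_api.values())
    counts["abs-add"] = abs_added
    total = len(left_api)  # the denominator is always the left-hand API
    return {category: 100.0 * n / total for category, n in counts.items()}

# The worked example: one interface with four methods (five items in all).
example = {
    "Shape": [],                     # the interface itself, present and correct
    "Shape.area()": [],              # implemented correctly
    "Shape.perimeter()": [],         # implemented correctly
    "Shape.scale(double)": ["bad"],  # implemented incorrectly
    "Shape.name()": ["missing"],     # not implemented
}
result = percentages(example, abs_added=1)
print(result)
```

Because the one extra interface method is divided by the same five-item total, the categories add up to 120% rather than 100%.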
Incidentally the inclusion of inherited members for calculating the percentage causes some strange effects. Because of inherited methods from Object, simply declaring a class has a disproportionate effect, even if none of the declared members of that class itself are present. And because an enum, for example, inherits so many members from Object and Enum, merely declaring an enum has a hugely disproportionate effect compared to the tiny amount of code it takes. But I couldn't come up with another way of defining the percentages that would make sense and be self-consistent. The reasons why are a little complicated and this email is already long so I won't go into them, though.
How is it possible that jdk15 vs harmony is 94.66% and harmony vs jdk15 is 90.88%?

Wow, I'm impressed that harmony is 94.66 against 1.5. That's incredibly good progress - especially if all of that is actual functional implementations rather than stubbed out methods. (If you have stubbed out methods by the way, I suggest defining a RuntimeException subclass called "NotImplementedException" in any package and adding it to the throws clause of the methods in question. Japi will recognize those methods as not implemented.)

But back to your question. The reason for providing two different comparisons in the first place is that they're *intentionally* not symmetric. Japitools compares for "backwards" compatibility. In an ideal world, for example, a comparison of jdk14 vs jdk15 would give precisely 100% coverage and zero errors of any kind, whereas a comparison of jdk15 versus jdk14 would indicate exactly what new features were introduced in 1.5 (by flagging them as "missing in jdk14"). You can actually use the obvious url construction (h-jdk14-jdk15 etc) to get the results of each of those comparisons to see how well reality matches that ideal...

So the jdk15 vs harmony comparisons indicate how well harmony covers the JDK1.5 API set. For the purposes of that comparison, if Harmony pulls a Microsoft and adds a dozen proprietary methods into java.lang.Object for talking to the Apache webserver ;) it wouldn't matter in the slightest.

But since in reality Harmony *doesn't* want to pull a Microsoft (I hope), the other comparison is used for that. That indicates "how well JDK1.5 covers the Harmony API" which is equivalent to how well Harmony sticks to *only* implementing what's in JDK1.5.

So the fact that Harmony's doing worse in the latter type of comparison means it's doing a better job of implementing stuff than it is of keeping internal stuff out of the public APIs of the standard packages.
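A rough way to picture the asymmetry is to treat each API as a plain set of member names, which is a big simplification of what Japi really checks. The `coverage` function and all of the member names below are invented for illustration only.

```python
def coverage(reference, candidate):
    """Percentage of the reference ("left-hand") API's items that also
    appear in the candidate ("right-hand") API."""
    return 100.0 * len(reference & candidate) / len(reference)

# Hypothetical member names, not real comparison data.
jdk = {
    "Object",
    "Object.toString()",
    "Object.hashCode()",
    "Object.equals(Object)",
}
# An implementation that has everything in jdk plus one extra public method.
impl = jdk | {"Object.talkToWebserver()"}

jdk_vs_impl = coverage(jdk, impl)   # how well impl covers the JDK API
impl_vs_jdk = coverage(impl, jdk)   # how well the JDK covers impl's API
print(jdk_vs_impl, impl_vs_jdk)
```

In this toy case the first number is 100% and the second is 80%: the extra method costs nothing in the first direction and only shows up when the implementation is placed on the left-hand side.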
Ah, also, would it be possible to have a list of the percentage over time? It would be very interesting to plot the evolution of coverage over time.
That's something I've considered for a long time; I just haven't gotten around to making my scripts capable of keeping track of it. One issue there is that sometimes the percentage changes as a result of changing something in Japi to fix a bug or better reflect the reality of the situation, rather than as a result of any changes to Harmony itself. So any such graph would have to be taken with a grain of salt - but that applies to the percentages as a whole too of course.

Does that help?

Stuart.

--
http://sab39.dev.netreach.com/