Alex,
I think Jim's comments get to the desired granularity of the information
that could be available. Although it is not *necessary* to recompile on
the fly, it may be *preferable*. By compiling on demand and
comparing byte codes, you can achieve as high a resolution as is
desired, without needing any static information at all, whether
in class files or in "side files". In principle, you should be able to
step through an expression, sub-expression by sub-expression. By
contrast, with statically generated information, you get what you get
-- and line-based info is not necessarily very good, and more detailed
information is potentially very big.
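For example, the compile-on-demand half of this might look roughly like the following sketch using the JDK 6 compiler API (class and method names here are made up for illustration; a real tool would go on to diff the emitted bytecodes against those in the running VM):

```java
import javax.tools.*;
import java.io.File;
import java.net.URI;
import java.nio.file.Files;
import java.util.List;

public class OnDemandCompile {
    // A source "file" held entirely in memory, so nothing needs to exist on disk.
    static class MemSource extends SimpleJavaFileObject {
        final String code;
        MemSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"),
                  Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignore) { return code; }
    }

    // Compiles the given source on demand; returns true on success.
    // A real tool would then read the emitted .class file and compare its
    // bytecodes against the ones in the running VM, expression by expression.
    static boolean compile(String className, String source, File outDir) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        return javac.getTask(null, null, null,
                             List.of("-d", outDir.getPath()),
                             null,
                             List.of(new MemSource(className, source)))
                    .call();
    }

    public static void main(String[] args) throws Exception {
        File out = Files.createTempDirectory("ondemand").toFile();
        boolean ok = compile("Hello",
            "public class Hello { int twice(int x) { return 2 * x; } }", out);
        System.out.println("compiled: " + ok);
    }
}
```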
However, back on the side of static information, I guess there is
potentially a performance issue. In a debugger, you're mostly dealing
with "line at a time" code at human speeds, as the user steps through
code, or sets up breakpoints. In profiling tools, you probably want
info about the entire contents of the class file, "in batch time". But
the problem remains with statically generated files -- how do you
determine the resolution of the information that should be generated?
-- Jon
On Apr 29, 2008, at 11:41 PM, Alex Rau wrote:
Hi Jim,
thanks for the detailed info. Unfortunately I've not had much time
this week to investigate your proposal in depth (compiler API /
debugger API). Here are some things I have come up with so far - please
correct me in case I got something wrong:
1) The debugger API is based on a design with two virtual machines
involved (the debugger VM and the VM being debugged). While
this fits a debugging or profiling scenario perfectly, since two
virtual machines are always involved there, it does not line up properly
with my scenario, where only one instance of a virtual machine
exists. Our software is based on top of a readily available
(compiled) build. It performs modifications on the byte code of the
build, runs all unit tests and generates XML reports (all done in
the mentioned single VM in one shot). That's all. A second VM
simply does not exist, and adding one would mean much more overhead
in our design just to get column information.
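For illustration, a byte code modification step in such a single-VM setup could be hooked in via a java.lang.instrument agent (this is a hypothetical sketch of the general technique, not necessarily Alex's actual mechanism):

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// A minimal premain agent: the JVM hands every class's bytecodes to the
// transformer before defining it, so modifications happen inside the same,
// single VM that later runs the unit tests -- no second (debugger) VM needed.
public class SingleVmAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain pd, byte[] classBytes) {
                // Inspect or rewrite classBytes here (e.g. with a byte code
                // library); returning null means "leave the class unchanged".
                if (!looksLikeClassFile(classBytes)) return null;
                return null;
            }
        });
    }

    // Helper used by the transformer: sanity-check the class file magic.
    static boolean looksLikeClassFile(byte[] b) {
        return b != null && b.length >= 4
            && (b[0] & 0xFF) == 0xCA && (b[1] & 0xFF) == 0xFE
            && (b[2] & 0xFF) == 0xBA && (b[3] & 0xFF) == 0xBE;
    }
}
```

(The agent is activated with -javaagent and a Premain-Class manifest entry.)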
2) I could not yet find my way through the compiler and debugger API
from a technical point of view to really get the column information
in the end. I've already had a look at the NetBeans sources and
(probably) found the right code location, but I have to investigate
that in more detail. However, this suggests that it gets much
trickier than the variant where the compiler
itself outputs the column information into the byte code via
additional attributes. A question here: is it necessary to recompile
on the fly during debugging to get the line/column information? If
yes, then this would make it even more difficult and would mean that
we have to support an additional compilation process, while up to now
we strictly rely on already performed compilations. We work on byte
code exclusively, and the sources are only required for the report
generation.
3) I think that line numbers and column information are actually
"attributes" of the compiler (result) in a broader sense. It
always depends on the compiler what values these attributes will
have. Compared to, for example, the duration of a method invocation
(profiling) or a certain value of a variable (debugging), the latter
are *always* runtime-dependent values. What I'd like to say is:
there are static (runtime-independent, "compiler only"-dependent)
attributes (line and column info) and dynamic (runtime-
and execution-dependent) attributes (invocation duration, variable
value). I see a "natural" separation between those, where static
attributes should be stored statically (e.g. in the byte code) and
dynamic attributes should be accessible dynamically (as the
debugger API allows). This also implies that while we are
interested in static attributes of the compiler, it's really not
necessary to reread these attributes with every modification on the
bytecode level. Having this information at a single point in time
(after the compilation is finished) is totally sufficient, compared
to getting the information at runtime every time.
It looks to me like what I want to achieve belongs more to the
compiler than anywhere else. Any comments?
Best regards,
Alex
On 24.04.2008, at 04:53, Jim Holmlund wrote:
Just to summarize:
- jcov is an internal Sun tool.
- to support jcov, a .class file attribute called the
CharacterRangeTable attribute was
defined, and javac was changed to output it in response to the
-Xjcov (I think) command line option:
CharacterRangeTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 character_range_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u4 character_range_start;
        u4 character_range_end;
        u2 flags;
    } character_range_table[character_range_table_length];
}
The 'flags' item describes the kind of range, e.g. statement,
block, assignment,
flow_controller, etc.
- the CharacterRangeTable was never added to the VM Spec.
- jcov used the old JVMPI. Robert rewrote it to do byte code
instrumentation
via java.lang.instrument. It still uses the CharacterRangeTable.
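Given the layout above, a reader for the attribute body could be sketched as follows (hypothetical helper names; it assumes the caller has already located the attribute's info bytes, i.e. the bytes following attribute_name_index and attribute_length):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CrtReader {
    // One entry of the character_range_table defined above.
    record Range(int startPc, int endPc, long charStart, long charEnd, int flags) {}

    // Parses character_range_table_length followed by the entries themselves.
    static List<Range> parse(byte[] info) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(info));
        int count = in.readUnsignedShort();           // character_range_table_length
        List<Range> table = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            table.add(new Range(
                in.readUnsignedShort(),               // start_pc (u2)
                in.readUnsignedShort(),               // end_pc (u2)
                in.readInt() & 0xFFFFFFFFL,           // character_range_start (u4)
                in.readInt() & 0xFFFFFFFFL,           // character_range_end (u4)
                in.readUnsignedShort()));             // flags (u2)
        }
        return table;
    }
}
```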
As Robert mentioned, we have had requests from debuggers to include
this kind of info in the .class file, for example to allow stepping
through the terms of an expression, multiple statements on one line, etc.
We planned to do something for this in JDK 6, e.g., formalize the
CharacterRangeTable attribute by adding it to the definition of the
class file in the VM spec, and add functionality to JVM TI, JDWP,
and JDI to allow debuggers to access this information.
When Peter von der Ahé heard about this, he suggested that we not
do this and instead proposed a solution that required no changes to
be made to the JDK. His idea was that an IDE has the source code
for a file in which fine-grained stepping is desired, and the IDE
can get the bytecodes from the debuggee VM via JDI
(Method.bytecodes()). The IDE can then use the compiler APIs
introduced in JDK 6
http://www.artima.com/lejava/articles/compiler_api.html
to match the source code to the bytecodes to find the bytecodes
that correspond to source constructs of interest. This idea was
investigated by the NetBeans debugger team and found to be
effective, so it was implemented as the 'expression stepping'
feature in NetBeans 6.0:
http://www.netbeans.org/features/java/debugger.html
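Roughly, the compiler-API half of Peter's technique might look like this (a hypothetical sketch; the real NetBeans code then matches these character positions against the bytecodes obtained from the debuggee via JDI's Method.bytecodes()):

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.StatementTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.SourcePositions;
import com.sun.source.util.TreeScanner;
import com.sun.source.util.Trees;
import javax.tools.*;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class SourceOffsets {
    // A source "file" held in memory, so the IDE's buffer can be used directly.
    static class MemSource extends SimpleJavaFileObject {
        final String code;
        MemSource(String code) {
            super(URI.create("string:///Demo.java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignore) { return code; }
    }

    // Returns the character range [start,end) of every statement in the source --
    // the per-expression positions an IDE can correlate with the bytecodes.
    static List<String> statementRanges(String src) throws IOException {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) javac.getTask(null, null, null, null, null,
                List.of(new MemSource(src)));
        SourcePositions pos = Trees.instance(task).getSourcePositions();
        List<String> out = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override public Void scan(Tree node, Void p) {
                    if (node instanceof StatementTree) {
                        out.add(node.getKind() + " ["
                            + pos.getStartPosition(unit, node) + ","
                            + pos.getEndPosition(unit, node) + ")");
                    }
                    return super.scan(node, p);
                }
            }.scan(unit, null);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String src = "class Demo { int f(int x) { int y = x + 1; return y * 2; } }";
        statementRanges(src).forEach(System.out::println);
    }
}
```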
So, we ended up not needing character offset information in JPDA,
and we didn't add the CharacterRangeTable attribute to the VM
spec. Adding this information to JPDA would be very low on our list
of things to do, unless
some needs arise that can't be handled by Peter's technique.
I wonder if Alex could also use Peter's idea. Alex did mention that
the tools he is interested
in normally have the source code available, so maybe he could.
- jjh
Jonathan Gibbons wrote:
Hi Serviceability folk,
The Subject line is from a thread on the compiler-dev list. You
might be interested to check it out here:
http://mail.openjdk.java.net/pipermail/compiler-dev/2008-April/thread.html#300
The thread concerns an interest in improving the information about
source location generated by the compiler, javac, and more
specifically, increasing the resolution of the info from
line-based coordinates to source-based coordinates. The submitter is
also talking about using side files for the info, which (if I
recall correctly) I have heard folk such as Jim discuss before now.
What would be the interest from the serviceability group about any
such work? Is it "on your radar", "sometime eventually", or "it'll
never happen"? :-)
-- Jon
P.S. Warning: the submitter has provided a patch on the
compiler-dev thread but has not yet signed the SCA.