Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-28 Thread Vladimir Ivanov

Looking very good, thanks.  Ship it!

Thanks, John!


Actually, can you insert a comment why the injected counts are not scaled?  (Or 
perhaps they should be??)
Sure! I intentionally don't scale the counts because I don't see any 
reason to do so. Profiling is done on per-MethodHandle basis, so the 
counts should be very close (considering racy updates) to the actual 
behavior.



Also, we may need a followup bug for the code with this comment:
   // Look for the following shape: AndI (ProfileBoolean) (ConI 1))

Since profileBoolean returns a TypeInt::BOOL, the AndI with (ConI 1) should 
fold up.
So there's some work to do in MulNode, which may allow that special pattern 
match to go away.
But I don't want to divert the present bug by a possibly complex dive into 
fixing AndI::Ideal.

Good catch! It's an overlook on my side.
The following change for ProfileBooleanNode solves the problem:
-  virtual const Type *bottom_type() const { return TypeInt::INT; }
+  virtual const Type *bottom_type() const { return TypeInt::BOOL; }

I polished the change a little according to your comments (diff against 
v03):

http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03-04/hotspot

Changes:
  - added short explanation why injected counts aren't scaled

  - adjusted ProfileBooleanNode type to TypeInt::BOOL and removed 
excessive pattern matching in has_injected_profile()


  - added an assert when ProfileBooleanNode is removed to catch the 
cases when injected profile isn't used: if we decide to generalize the 
API, I'd be happy to remove it, but current usages assumes that injected 
counts are always consumed during parsing and missing cases can cause 
hard-to-diagnose performance problems.


Best regards,
Vladimir Ivanov



(Generally speaking, pattern matching should assume strong normalization of its inputs.  Otherwise 
you end up duplicating pattern match code in many places, inconsistently.  Funny one-off idiom 
checks like this are evidence of incomplete IR normalization.  See 
http://en.wikipedia.org/wiki/Rewriting for some background on terms like normalization 
and confluence which are relevant to C2.)

— John

On Jan 27, 2015, at 8:05 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com 
wrote:


Thanks for the feedback, John!

Updated webrev:
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot

Changes:
  - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode to 
ProfileBooleanNode;
  - restructured profile layout ([0] = false_cnt, [1] = true_cnt)
  - factored out profile injection in a separate function 
(has_injected_profile() in parse2.cpp)
  - ProfileBooleanNode stores true/false counts instead of taken/not_taken 
counts
  - matching from value counts to taken/not_taken happens in 
has_injected_profile();
  - added BoolTest::ne support
  - sharpened test for AndI case: now it checks AndI (ProfileBoolean) (ConI 1) 
shape

Best regards,
Vladimir Ivanov



___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-28 Thread John Rose
On Jan 28, 2015, at 1:00 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com 
wrote:
 I polished the change a little according to your comments (diff against v03):
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03-04/hotspot 
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03-04/hotspot

+1  Glad to see the AndI folds up easily; thanks for the cleanup.___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-27 Thread John Rose
Looking very good, thanks.  Ship it!

Actually, can you insert a comment why the injected counts are not scaled?  (Or 
perhaps they should be??)

Also, we may need a followup bug for the code with this comment:
  // Look for the following shape: AndI (ProfileBoolean) (ConI 1))

Since profileBoolean returns a TypeInt::BOOL, the AndI with (ConI 1) should 
fold up.
So there's some work to do in MulNode, which may allow that special pattern 
match to go away.
But I don't want to divert the present bug by a possibly complex dive into 
fixing AndI::Ideal.

(Generally speaking, pattern matching should assume strong normalization of its 
inputs.  Otherwise you end up duplicating pattern match code in many places, 
inconsistently.  Funny one-off idiom checks like this are evidence of 
incomplete IR normalization.  See http://en.wikipedia.org/wiki/Rewriting for 
some background on terms like normalization and confluence which are 
relevant to C2.)

— John

On Jan 27, 2015, at 8:05 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com 
wrote:
 
 Thanks for the feedback, John!
 
 Updated webrev:
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot
 
 Changes:
  - renamed MHI::profileBranch to MHI::profileBoolean, and ProfileBranchNode 
 to ProfileBooleanNode;
  - restructured profile layout ([0] = false_cnt, [1] = true_cnt)
  - factored out profile injection in a separate function 
 (has_injected_profile() in parse2.cpp)
  - ProfileBooleanNode stores true/false counts instead of taken/not_taken 
 counts
  - matching from value counts to taken/not_taken happens in 
 has_injected_profile();
  - added BoolTest::ne support
  - sharpened test for AndI case: now it checks AndI (ProfileBoolean) (ConI 1) 
 shape
 
 Best regards,
 Vladimir Ivanov

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-27 Thread Vladimir Ivanov

Thanks for the feedback, John!

Updated webrev:
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/jdk
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.03/hotspot

Changes:
  - renamed MHI::profileBranch to MHI::profileBoolean, and 
ProfileBranchNode to ProfileBooleanNode;

  - restructured profile layout ([0] = false_cnt, [1] = true_cnt)
  - factored out profile injection in a separate function 
(has_injected_profile() in parse2.cpp)
  - ProfileBooleanNode stores true/false counts instead of 
taken/not_taken counts
  - matching from value counts to taken/not_taken happens in 
has_injected_profile();

  - added BoolTest::ne support
  - sharpened test for AndI case: now it checks AndI (ProfileBoolean) 
(ConI 1) shape


Best regards,
Vladimir Ivanov

On 1/27/15 3:04 AM, John Rose wrote:

On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov
vladimir.x.iva...@oracle.com mailto:vladimir.x.iva...@oracle.com wrote:


What do you think about the following version?
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02

As you suggested, I reified MHI::profileBranch on LambdaForm level and
removed @LambdaForm.Shared. My main concern about removing @Sharen was
that profile pollution can affect the code before profileBranch call
(akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is
sensitive to that change (there's a 10% difference in peak performance
between @Shared and has_injected_profile()).

I can leave @Shared as is for now or remove it and work on the fix to
the deoptimization counts pollution. What do you prefer?


Generic advice here:  It's better to leave it out, if in doubt.  If it
has a real benefit, and we don't have time to make it clean, put it in
and file a tracking bug to clean it up.

I re-read the change.  It's simpler and more coherent now.

I see one more issue which we should fix now, while we can.  It's the
sort of thing which is hard to clean up later.

The two fields of the profileBranch array have obscure and inconsistent
labelings.  It took me some hard thought and the inspection of three
files to decide what taken and not taken mean in the C2 code that
injects the profile.  The problem is that, when you look at
profileBranch, all you see is an integer (boolean) argument and an
array, and no clear indication about which array element corresponds to
which argument value.  It's made worse by the fact that taken and not
taken are not mentioned at all in the JDK code, which instead wires
together the branches of selectAlternative without much comment.

My preferred formulation, for making things clearer:  Decouple the idea
of branching from the idea of profile injection.  Name the intrinsic
(yes, one more bikeshed color) profileBoolean (or even
injectBooleanProfile), and use the natural indexing of the array:  0
(Java false) is a[0], and 1 (Java true) is a[1].  We might later extend
this to work with booleans (more generally, small-integer flags), of
more than two possible values, klasses, etc.

This line then goes away, and 'result' is used directly as the profile
index:
+int idx = result ? 0 : 1;

The ProfileBooleanNode should have an embedded (or simply indirect)
array of ints which is a simple copy of the profile array, so there's no
doubt about which count is which.

The parsing of the predicate that contains profileBoolean should
probably be more robust, at least allowing for 'eq' and 'ne' versions of
the test.  (C2 freely flips comparison senses, in various places.)  The
check for Op_AndI must be more precise; make sure n-in(2) is a constant
of the expected value (1).  The most robust way to handle it (but try
this another time, I think) would be to make two temp copies of the
predicate, substituting the occurrence of ProfileBoolean with '0' and
'1', respectively; if they both fold to '0' and '1' or '1' and '0', then
you take the indicated action.

I suggest putting the new code in Parse::dynamic_branch_prediction,
which pattern-matches for injected profiles, into its own subroutine.
  Maybe:
bool use_mdo = true;
if (has_injected_profile(btest, test, taken, not_taken)) {
  use_mdo = false;
}
if (use_mdo) { ... old code

I see why you used the opposite order in the existing code:  It mirrors
the order of the second and third arguments to selectAlternative.  But
the JVM knows nothing about selectAlternative, so it's just confusing
when reading the VM code to know which profile array element means what.

— John

P.S.  Long experience with byte-order bugs in HotSpot convinces me that
if you are not scrupulously clear in your terms, when working with equal
and opposite configuration pairs, you will have a long bug tail,
especially if you have to maintain agreement about the configurations
through many layers of software.  This is one of those cases.  The best
chance to fix such bugs is not to allow them in the first place.  In the
case of byte-order, we have first vs. second, MSB vs. LSB, and
high vs. low parts of values, for values in memory and 

Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-26 Thread Vladimir Ivanov

John,

What do you think about the following version?
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02

As you suggested, I reified MHI::profileBranch on LambdaForm level and 
removed @LambdaForm.Shared. My main concern about removing @Sharen was 
that profile pollution can affect the code before profileBranch call 
(akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is 
sensitive to that change (there's a 10% difference in peak performance 
between @Shared and has_injected_profile()).


I can leave @Shared as is for now or remove it and work on the fix to 
the deoptimization counts pollution. What do you prefer?


Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8068915

On 1/23/15 4:31 AM, John Rose wrote:

On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov
vladimir.x.iva...@oracle.com mailto:vladimir.x.iva...@oracle.com wrote:



What I'm mainly poking at here is that 'isGWT' is not informative about
the intended use of the flag.

I agree. It was an interim solution. Initially, I planned to introduce
customization and guide the logic based on that property. But it's not
there yet and I needed something for GWT case. Unfortunately, I missed
the case when GWT is edited. In that case, isGWT flag is missed and no
annotation is set.
So, I removed isGWT flag and introduced a check for selectAlternative
occurence in LambdaForm shape, as you suggested.


Good.

I think there is a sweeter spot just a little further on.  Make
profileBranch be an LF intrinsic and expose it like this:
   GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) {
selectAlternative(profileBranch(p.invoke( *), a), t, f).invoke( *); }

Then selectAlternative triggers branchy bytecodes in the IBGen, and
profileBranch injects profiling in C2.
The presence of profileBranch would then trigger the @Shared annotation,
if you still need it.

After thinking about it some more, I still believe it would be better to
detect the use of profileBranch during a C2 compile task, and feed that
to the too_many_traps logic.  I agree it is much easier to stick the
annotation on in the IBGen; the problem is that because of a minor phase
ordering problem you are introducing an annotation which flows from the
JDK to the VM.  Here's one more suggestion at reducing this coupling…

Note that C-set_trap_count is called when each Parse phase processes a
whole method.  This means that information about the contents of the
nmethod accumulates during the parse.  Likewise, add a flag method
C-{has,set}_injected_profile, and set the flag whenever the parser sees
a profileBranch intrinsic (with or without a constant profile array;
your call).  Then consult that flag from too_many_traps.  It is true
that code which is parsed upstream of the very first profileBranch will
potentially issue a non-trapping fallback, but by definition that code
would be unrelated to the injected profile, so I don't see a harm in
that.  If this approach works, then you can remove the annotation
altogether, which is clearly preferable.  We understand the annotation
now, but it has the danger of becoming a maintainer's puzzlement.




In 'updateCounters', if the counter overflows, you'll get continuous
creation of ArithmeticExceptions.  Will that optimize or will it cause a
permanent slowdown?  Consider a hack like this on the exception path:
   counters[idx] = Integer.MAX_VALUE / 2;

I had an impression that VM optimizes overflows in Math.exact*
intrinsics, but it's not the case - it always inserts an uncommon
trap. I used the workaround you proposed.


Good.




On the Name Bikeshed:  It looks like @IgnoreProfile (ignore_profile in
the VM) promises too much ignorance, since it suppresses branch counts
and traps, but allows type profiles to be consulted.  Maybe something
positive like @ManyTraps or @SharedMegamorphic?  (It's just a name,
and this is just a suggestion.)

What do you think about @LambdaForm.Shared?


That's fine.  Suggest changing the JVM accessor to
is_lambda_form_shared, because the term shared is already overused in
the VM.

Or, to be much more accurate, s/@Shared/@CollectiveProfile/.  Better
yet, get rid of it, as suggested above.

(I just realized that profile pollution looks logically parallel to the
http://en.wikipedia.org/wiki/Tragedy_of_the_commons .)

Also, in the comment explaining the annotation:
   s/mostly useless/probably polluted by conflicting behavior from
multiple call sites/

I very much like the fact that profileBranch is the VM intrinsic, not
selectAlternative.  A VM intrinsic should be nice and narrow like that.
  In fact, you can delete selectAlternative from vmSymbols while you are
at it.

(We could do profileInteger and profileClass in a similar way, if that
turned out to be useful.)

— John

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-26 Thread Vladimir Ivanov

As you suggested, I reified MHI::profileBranch on LambdaForm level and
removed @LambdaForm.Shared. My main concern about removing @Sharen was
that profile pollution can affect the code before profileBranch call
(akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is
sensitive to that change (there's a 10% difference in peak performance
between @Shared and has_injected_profile()).
Ignore that. Additional runs don't prove there's a regression on Gbemu. 
There's some variance on Gbemu and it's present w/ and w/o @Shared.


Best regards,
Vladimir Ivanov


I can leave @Shared as is for now or remove it and work on the fix to
the deoptimization counts pollution. What do you prefer?

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8068915

On 1/23/15 4:31 AM, John Rose wrote:

On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov
vladimir.x.iva...@oracle.com mailto:vladimir.x.iva...@oracle.com
wrote:



What I'm mainly poking at here is that 'isGWT' is not informative about
the intended use of the flag.

I agree. It was an interim solution. Initially, I planned to introduce
customization and guide the logic based on that property. But it's not
there yet and I needed something for GWT case. Unfortunately, I missed
the case when GWT is edited. In that case, isGWT flag is missed and no
annotation is set.
So, I removed isGWT flag and introduced a check for selectAlternative
occurence in LambdaForm shape, as you suggested.


Good.

I think there is a sweeter spot just a little further on.  Make
profileBranch be an LF intrinsic and expose it like this:
   GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) {
selectAlternative(profileBranch(p.invoke( *), a), t, f).invoke( *); }

Then selectAlternative triggers branchy bytecodes in the IBGen, and
profileBranch injects profiling in C2.
The presence of profileBranch would then trigger the @Shared annotation,
if you still need it.

After thinking about it some more, I still believe it would be better to
detect the use of profileBranch during a C2 compile task, and feed that
to the too_many_traps logic.  I agree it is much easier to stick the
annotation on in the IBGen; the problem is that because of a minor phase
ordering problem you are introducing an annotation which flows from the
JDK to the VM.  Here's one more suggestion at reducing this coupling…

Note that C-set_trap_count is called when each Parse phase processes a
whole method.  This means that information about the contents of the
nmethod accumulates during the parse.  Likewise, add a flag method
C-{has,set}_injected_profile, and set the flag whenever the parser sees
a profileBranch intrinsic (with or without a constant profile array;
your call).  Then consult that flag from too_many_traps.  It is true
that code which is parsed upstream of the very first profileBranch will
potentially issue a non-trapping fallback, but by definition that code
would be unrelated to the injected profile, so I don't see a harm in
that.  If this approach works, then you can remove the annotation
altogether, which is clearly preferable.  We understand the annotation
now, but it has the danger of becoming a maintainer's puzzlement.




In 'updateCounters', if the counter overflows, you'll get continuous
creation of ArithmeticExceptions.  Will that optimize or will it
cause a
permanent slowdown?  Consider a hack like this on the exception path:
   counters[idx] = Integer.MAX_VALUE / 2;

I had an impression that VM optimizes overflows in Math.exact*
intrinsics, but it's not the case - it always inserts an uncommon
trap. I used the workaround you proposed.


Good.




On the Name Bikeshed:  It looks like @IgnoreProfile (ignore_profile in
the VM) promises too much ignorance, since it suppresses branch
counts
and traps, but allows type profiles to be consulted.  Maybe something
positive like @ManyTraps or @SharedMegamorphic?  (It's just a name,
and this is just a suggestion.)

What do you think about @LambdaForm.Shared?


That's fine.  Suggest changing the JVM accessor to
is_lambda_form_shared, because the term shared is already overused in
the VM.

Or, to be much more accurate, s/@Shared/@CollectiveProfile/.  Better
yet, get rid of it, as suggested above.

(I just realized that profile pollution looks logically parallel to the
http://en.wikipedia.org/wiki/Tragedy_of_the_commons .)

Also, in the comment explaining the annotation:
   s/mostly useless/probably polluted by conflicting behavior from
multiple call sites/

I very much like the fact that profileBranch is the VM intrinsic, not
selectAlternative.  A VM intrinsic should be nice and narrow like that.
  In fact, you can delete selectAlternative from vmSymbols while you are
at it.

(We could do profileInteger and profileClass in a similar way, if that
turned out to be useful.)

— John

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-26 Thread John Rose
On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com 
wrote:
 
 What do you think about the following version?
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02
 
 As you suggested, I reified MHI::profileBranch on LambdaForm level and 
 removed @LambdaForm.Shared. My main concern about removing @Sharen was that 
 profile pollution can affect the code before profileBranch call (akin to 
 8068915 [1]) and it seems it's the case: Gbemu (at least) is sensitive to 
 that change (there's a 10% difference in peak performance between @Shared and 
 has_injected_profile()).
 
 I can leave @Shared as is for now or remove it and work on the fix to the 
 deoptimization counts pollution. What do you prefer?

Generic advice here:  It's better to leave it out, if in doubt.  If it has a 
real benefit, and we don't have time to make it clean, put it in and file a 
tracking bug to clean it up.

I re-read the change.  It's simpler and more coherent now.

I see one more issue which we should fix now, while we can.  It's the sort of 
thing which is hard to clean up later.

The two fields of the profileBranch array have obscure and inconsistent 
labelings.  It took me some hard thought and the inspection of three files to 
decide what taken and not taken mean in the C2 code that injects the 
profile.  The problem is that, when you look at profileBranch, all you see is 
an integer (boolean) argument and an array, and no clear indication about which 
array element corresponds to which argument value.  It's made worse by the fact 
that taken and not taken are not mentioned at all in the JDK code, which 
instead wires together the branches of selectAlternative without much comment.

My preferred formulation, for making things clearer:  Decouple the idea of 
branching from the idea of profile injection.  Name the intrinsic (yes, one 
more bikeshed color) profileBoolean (or even injectBooleanProfile), and use 
the natural indexing of the array:  0 (Java false) is a[0], and 1 (Java true) 
is a[1].  We might later extend this to work with booleans (more generally, 
small-integer flags), of more than two possible values, klasses, etc.

This line then goes away, and 'result' is used directly as the profile index:
+int idx = result ? 0 : 1;

The ProfileBooleanNode should have an embedded (or simply indirect) array of 
ints which is a simple copy of the profile array, so there's no doubt about 
which count is which.

The parsing of the predicate that contains profileBoolean should probably be 
more robust, at least allowing for 'eq' and 'ne' versions of the test.  (C2 
freely flips comparison senses, in various places.)  The check for Op_AndI must 
be more precise; make sure n-in(2) is a constant of the expected value (1).  
The most robust way to handle it (but try this another time, I think) would be 
to make two temp copies of the predicate, substituting the occurrence of 
ProfileBoolean with '0' and '1', respectively; if they both fold to '0' and '1' 
or '1' and '0', then you take the indicated action.

I suggest putting the new code in Parse::dynamic_branch_prediction, which 
pattern-matches for injected profiles, into its own subroutine.  Maybe:
   bool use_mdo = true;
   if (has_injected_profile(btest, test, taken, not_taken)) {
 use_mdo = false;
   }
   if (use_mdo) { ... old code

I see why you used the opposite order in the existing code:  It mirrors the 
order of the second and third arguments to selectAlternative.  But the JVM 
knows nothing about selectAlternative, so it's just confusing when reading the 
VM code to know which profile array element means what.

— John

P.S.  Long experience with byte-order bugs in HotSpot convinces me that if you 
are not scrupulously clear in your terms, when working with equal and opposite 
configuration pairs, you will have a long bug tail, especially if you have to 
maintain agreement about the configurations through many layers of software.  
This is one of those cases.  The best chance to fix such bugs is not to allow 
them in the first place.  In the case of byte-order, we have first vs. 
second, MSB vs. LSB, and high vs. low parts of values, for values in 
memory and in registers, and all possible misunderstandings about them and 
their relation have probably happened and caused bugs.___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-22 Thread John Rose
On Jan 20, 2015, at 11:09 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com 
wrote:
 
 What I'm mainly poking at here is that 'isGWT' is not informative about
 the intended use of the flag.
 I agree. It was an interim solution. Initially, I planned to introduce 
 customization and guide the logic based on that property. But it's not there 
 yet and I needed something for GWT case. Unfortunately, I missed the case 
 when GWT is edited. In that case, isGWT flag is missed and no annotation is 
 set.
 So, I removed isGWT flag and introduced a check for selectAlternative 
 occurence in LambdaForm shape, as you suggested.

Good.

I think there is a sweeter spot just a little further on.  Make profileBranch 
be an LF intrinsic and expose it like this:
  GWT(p,t,f;S) := let(a=new int[3]) in lambda(*: S) { 
selectAlternative(profileBranch(p.invoke( *), a), t, f).invoke( *); }

Then selectAlternative triggers branchy bytecodes in the IBGen, and 
profileBranch injects profiling in C2.
The presence of profileBranch would then trigger the @Shared annotation, if you 
still need it.

After thinking about it some more, I still believe it would be better to detect 
the use of profileBranch during a C2 compile task, and feed that to the 
too_many_traps logic.  I agree it is much easier to stick the annotation on in 
the IBGen; the problem is that because of a minor phase ordering problem you 
are introducing an annotation which flows from the JDK to the VM.  Here's one 
more suggestion at reducing this coupling…

Note that C-set_trap_count is called when each Parse phase processes a whole 
method.  This means that information about the contents of the nmethod 
accumulates during the parse.  Likewise, add a flag method 
C-{has,set}_injected_profile, and set the flag whenever the parser sees a 
profileBranch intrinsic (with or without a constant profile array; your call).  
Then consult that flag from too_many_traps.  It is true that code which is 
parsed upstream of the very first profileBranch will potentially issue a 
non-trapping fallback, but by definition that code would be unrelated to the 
injected profile, so I don't see a harm in that.  If this approach works, then 
you can remove the annotation altogether, which is clearly preferable.  We 
understand the annotation now, but it has the danger of becoming a maintainer's 
puzzlement.

 
 In 'updateCounters', if the counter overflows, you'll get continuous
 creation of ArithmeticExceptions.  Will that optimize or will it cause a
 permanent slowdown?  Consider a hack like this on the exception path:
counters[idx] = Integer.MAX_VALUE / 2;
 I had an impression that VM optimizes overflows in Math.exact* intrinsics, 
 but it's not the case - it always inserts an uncommon trap. I used the 
 workaround you proposed.

Good.

 
 On the Name Bikeshed:  It looks like @IgnoreProfile (ignore_profile in
 the VM) promises too much ignorance, since it suppresses branch counts
 and traps, but allows type profiles to be consulted.  Maybe something
 positive like @ManyTraps or @SharedMegamorphic?  (It's just a name,
 and this is just a suggestion.)
 What do you think about @LambdaForm.Shared?

That's fine.  Suggest changing the JVM accessor to is_lambda_form_shared, 
because the term shared is already overused in the VM.

Or, to be much more accurate, s/@Shared/@CollectiveProfile/.  Better yet, get 
rid of it, as suggested above.

(I just realized that profile pollution looks logically parallel to the 
http://en.wikipedia.org/wiki/Tragedy_of_the_commons 
http://en.wikipedia.org/wiki/Tragedy_of_the_commons .)

Also, in the comment explaining the annotation:
  s/mostly useless/probably polluted by conflicting behavior from multiple call 
sites/

I very much like the fact that profileBranch is the VM intrinsic, not 
selectAlternative.  A VM intrinsic should be nice and narrow like that.  In 
fact, you can delete selectAlternative from vmSymbols while you are at it.

(We could do profileInteger and profileClass in a similar way, if that turned 
out to be useful.)

— John___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-21 Thread Vladimir Ivanov
Duncan, sorry for that.
Updated webrev inplace.

Best regards,
Vladimir Ivanov

On 1/21/15 1:39 PM, MacGregor, Duncan (GE Energy Management) wrote:
 This version seems to have inconsistent removal of ignore profile in the
 hotspot patch. It’s no longer added to vmSymbols but is still referenced
 in classFileParser.
 
 On 19/01/2015 20:21, MacGregor, Duncan (GE Energy Management)
 duncan.macgre...@ge.com wrote:
 
 Okay, I¹ve done some tests of this with the micro benchmarks for our
 language  runtime which show pretty much no change except for one test
 which is now almost 3x slower. It uses nested loops to iterate over an
 array and concatenate the string-like objects it contains, and replaces
 elements with these new longer string-llike objects. It¹s a bit of a
 pathological case, and I haven¹t seen the same sort of degradation in the
 other benchmarks or in real applications, but I haven¹t done serious
 benchmarking of them with this change.

 I shall see if the test case can be reduced down to anything simpler while
 still showing the same performance behaviour, and try add some compilation
 logging options to narrow down what¹s going on.

 Duncan.

 On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com
 wrote:

 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
 https://bugs.openjdk.java.net/browse/JDK-8063137

 After GuardWithTest (GWT) LambdaForms became shared, profile pollution
 significantly distorted compilation decisions. It affected inlining and
 hindered some optimizations. It causes significant performance
 regressions for Nashorn (on Octane benchmarks).

 Inlining was fixed by 8059877 [1], but it didn't cover the case when a
 branch is never taken. It can cause missed optimization opportunity, and
 not just increase in code size. For example, non-pruned branch can break
 escape analysis.

 Currently, there are 2 problems:
- branch frequencies profile pollution
- deoptimization counts pollution

 Branch frequency pollution hides from JIT the fact that a branch is
 never taken. Since GWT LambdaForms (and hence their bytecode) are
 heavily shared, but the behavior is specific to MethodHandle, there's no
 way for JIT to understand how particular GWT instance behaves.

 The solution I propose is to do profiling in Java code and feed it to
 JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
 profiling info is stored. Once JIT kicks in, it can retrieve these
 counts, if corresponding MethodHandle is a compile-time constant (and it
 is usually the case). To communicate the profile data from Java code to
 JIT, MethodHandleImpl::profileBranch() is used.

 If GWT MethodHandle isn't a compile-time constant, profiling should
 proceed. It happens when corresponding LambdaForm is already shared, for
 newly created GWT MethodHandles profiling can occur only in native code
 (dedicated nmethod for a single LambdaForm). So, when compilation of the
 whole MethodHandle chain is triggered, the profile should be already
 gathered.

 Overriding branch frequencies is not enough. Statistics on
 deoptimization events is also polluted. Even if a branch is never taken,
 JIT doesn't issue an uncommon trap there unless corresponding bytecode
 doesn't trap too much and doesn't cause too many recompiles.

 I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
 sees it on some method, Compile::too_many_traps 
 Compile::too_many_recompiles for that method always return false. It
 allows JIT to prune the branch based on custom profile and recompile the
 method, if the branch is visited.

 For now, I wanted to keep the fix very focused. The next thing I plan to
 do is to experiment with ignoring deoptimization counts for other
 LambdaForms which are heavily shared. I already saw problems caused by
 deoptimization counts pollution (see JDK-8068915 [2]).

 I plan to backport the fix into 8u40, once I finish extensive
 performance testing.

 Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
 Octane).

 Thanks!

 PS: as a summary, my experiments show that fixes for 8063137  8068915
 [2] almost completely recovers peak performance after LambdaForm sharing
 [3]. There's one more problem left (non-inlined MethodHandle invocations
 are more expensive when LFs are shared), but it's a story for another
 day.

 Best regards,
 Vladimir Ivanov

 [1] https://bugs.openjdk.java.net/browse/JDK-8059877
  8059877: GWT branch frequencies pollution due to LF sharing
 [2] https://bugs.openjdk.java.net/browse/JDK-8068915
 [3] https://bugs.openjdk.java.net/browse/JDK-8046703
  JEP 210: LambdaForm Reduction and Caching
 ___
 mlvm-dev mailing list
 mlvm-dev@openjdk.java.net
 http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

 ___
 mlvm-dev mailing list
 mlvm-dev@openjdk.java.net
 

Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-21 Thread MacGregor, Duncan (GE Energy Management)
This version seems to have inconsistent removal of ignore profile in the
hotspot patch. It’s no longer added to vmSymbols but is still referenced
in classFileParser.

On 19/01/2015 20:21, MacGregor, Duncan (GE Energy Management)
duncan.macgre...@ge.com wrote:

Okay, I¹ve done some tests of this with the micro benchmarks for our
language  runtime which show pretty much no change except for one test
which is now almost 3x slower. It uses nested loops to iterate over an
array and concatenate the string-like objects it contains, and replaces
elements with these new longer string-llike objects. It¹s a bit of a
pathological case, and I haven¹t seen the same sort of degradation in the
other benchmarks or in real applications, but I haven¹t done serious
benchmarking of them with this change.

I shall see if the test case can be reduced down to anything simpler while
still showing the same performance behaviour, and try add some compilation
logging options to narrow down what¹s going on.

Duncan.

On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com
wrote:

http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8063137

After GuardWithTest (GWT) LambdaForms became shared, profile pollution
significantly distorted compilation decisions. It affected inlining and
hindered some optimizations. It causes significant performance
regressions for Nashorn (on Octane benchmarks).

Inlining was fixed by 8059877 [1], but it didn't cover the case when a
branch is never taken. It can cause missed optimization opportunity, and
not just increase in code size. For example, non-pruned branch can break
escape analysis.

Currently, there are 2 problems:
   - branch frequencies profile pollution
   - deoptimization counts pollution

Branch frequency pollution hides from JIT the fact that a branch is
never taken. Since GWT LambdaForms (and hence their bytecode) are
heavily shared, but the behavior is specific to MethodHandle, there's no
way for JIT to understand how particular GWT instance behaves.

The solution I propose is to do profiling in Java code and feed it to
JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
profiling info is stored. Once JIT kicks in, it can retrieve these
counts, if corresponding MethodHandle is a compile-time constant (and it
is usually the case). To communicate the profile data from Java code to
JIT, MethodHandleImpl::profileBranch() is used.

If GWT MethodHandle isn't a compile-time constant, profiling should
proceed. It happens when corresponding LambdaForm is already shared, for
newly created GWT MethodHandles profiling can occur only in native code
(dedicated nmethod for a single LambdaForm). So, when compilation of the
whole MethodHandle chain is triggered, the profile should be already
gathered.

Overriding branch frequencies is not enough. Statistics on
deoptimization events is also polluted. Even if a branch is never taken,
JIT doesn't issue an uncommon trap there unless corresponding bytecode
doesn't trap too much and doesn't cause too many recompiles.

I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
sees it on some method, Compile::too_many_traps 
Compile::too_many_recompiles for that method always return false. It
allows JIT to prune the branch based on custom profile and recompile the
method, if the branch is visited.

For now, I wanted to keep the fix very focused. The next thing I plan to
do is to experiment with ignoring deoptimization counts for other
LambdaForms which are heavily shared. I already saw problems caused by
deoptimization counts pollution (see JDK-8068915 [2]).

I plan to backport the fix into 8u40, once I finish extensive
performance testing.

Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
Octane).

Thanks!

PS: as a summary, my experiments show that fixes for 8063137  8068915
[2] almost completely recovers peak performance after LambdaForm sharing
[3]. There's one more problem left (non-inlined MethodHandle invocations
are more expensive when LFs are shared), but it's a story for another
day.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8059877
 8059877: GWT branch frequencies pollution due to LF sharing
[2] https://bugs.openjdk.java.net/browse/JDK-8068915
[3] https://bugs.openjdk.java.net/browse/JDK-8046703
 JEP 210: LambdaForm Reduction and Caching
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-20 Thread Vladimir Ivanov

Duncan, thanks a lot for giving it a try!

If you plan to spend more time on it, please, apply 8068915 as well. I 
saw huge intermittent performance regressions due to continuous 
deoptimization storm. You can look into -XX:+LogCompilation output and 
look for repeated deoptimization events in steady state w/ Action_none. 
Also, there's deoptimization statistics in the log (at least, in jdk9). 
It's located right before compilation_log tag.


Thanks again for the valuable feedback!

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00

On 1/19/15 11:21 PM, MacGregor, Duncan (GE Energy Management) wrote:

Okay, I¹ve done some tests of this with the micro benchmarks for our
language  runtime which show pretty much no change except for one test
which is now almost 3x slower. It uses nested loops to iterate over an
array and concatenate the string-like objects it contains, and replaces
elements with these new longer string-llike objects. It¹s a bit of a
pathological case, and I haven¹t seen the same sort of degradation in the
other benchmarks or in real applications, but I haven¹t done serious
benchmarking of them with this change.

I shall see if the test case can be reduced down to anything simpler while
still showing the same performance behaviour, and try add some compilation
logging options to narrow down what¹s going on.

Duncan.

On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com
wrote:


http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8063137

After GuardWithTest (GWT) LambdaForms became shared, profile pollution
significantly distorted compilation decisions. It affected inlining and
hindered some optimizations. It causes significant performance
regressions for Nashorn (on Octane benchmarks).

Inlining was fixed by 8059877 [1], but it didn't cover the case when a
branch is never taken. It can cause missed optimization opportunity, and
not just increase in code size. For example, non-pruned branch can break
escape analysis.

Currently, there are 2 problems:
   - branch frequencies profile pollution
   - deoptimization counts pollution

Branch frequency pollution hides from JIT the fact that a branch is
never taken. Since GWT LambdaForms (and hence their bytecode) are
heavily shared, but the behavior is specific to MethodHandle, there's no
way for JIT to understand how particular GWT instance behaves.

The solution I propose is to do profiling in Java code and feed it to
JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
profiling info is stored. Once JIT kicks in, it can retrieve these
counts, if corresponding MethodHandle is a compile-time constant (and it
is usually the case). To communicate the profile data from Java code to
JIT, MethodHandleImpl::profileBranch() is used.

If GWT MethodHandle isn't a compile-time constant, profiling should
proceed. It happens when corresponding LambdaForm is already shared, for
newly created GWT MethodHandles profiling can occur only in native code
(dedicated nmethod for a single LambdaForm). So, when compilation of the
whole MethodHandle chain is triggered, the profile should be already
gathered.

Overriding branch frequencies is not enough. Statistics on
deoptimization events is also polluted. Even if a branch is never taken,
JIT doesn't issue an uncommon trap there unless corresponding bytecode
doesn't trap too much and doesn't cause too many recompiles.

I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
sees it on some method, Compile::too_many_traps 
Compile::too_many_recompiles for that method always return false. It
allows JIT to prune the branch based on custom profile and recompile the
method, if the branch is visited.

For now, I wanted to keep the fix very focused. The next thing I plan to
do is to experiment with ignoring deoptimization counts for other
LambdaForms which are heavily shared. I already saw problems caused by
deoptimization counts pollution (see JDK-8068915 [2]).

I plan to backport the fix into 8u40, once I finish extensive
performance testing.

Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
Octane).

Thanks!

PS: as a summary, my experiments show that fixes for 8063137  8068915
[2] almost completely recovers peak performance after LambdaForm sharing
[3]. There's one more problem left (non-inlined MethodHandle invocations
are more expensive when LFs are shared), but it's a story for another day.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8059877
 8059877: GWT branch frequencies pollution due to LF sharing
[2] https://bugs.openjdk.java.net/browse/JDK-8068915
[3] https://bugs.openjdk.java.net/browse/JDK-8046703
 JEP 210: LambdaForm Reduction and Caching
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net

Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-20 Thread MacGregor, Duncan (GE Energy Management)
Hmm, 8068915 hasn’t fixed it, but running fewer benchmarks seems to make
the problem go away, so it looks like there’s something going wrong fairly
deep in our runtime. Trying the full suite with compilation logging
enabled now to see if I can find a smoking gun.

On 20/01/2015 12:40, Vladimir Ivanov vladimir.x.iva...@oracle.com
wrote:

Duncan, thanks a lot for giving it a try!

If you plan to spend more time on it, please, apply 8068915 as well. I
saw huge intermittent performance regressions due to continuous
deoptimization storm. You can look into -XX:+LogCompilation output and
look for repeated deoptimization events in steady state w/ Action_none.
Also, there's deoptimization statistics in the log (at least, in jdk9).
It's located right before compilation_log tag.

Thanks again for the valuable feedback!

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00

On 1/19/15 11:21 PM, MacGregor, Duncan (GE Energy Management) wrote:
 Okay, I¹ve done some tests of this with the micro benchmarks for our
 language  runtime which show pretty much no change except for one test
 which is now almost 3x slower. It uses nested loops to iterate over an
 array and concatenate the string-like objects it contains, and replaces
 elements with these new longer string-llike objects. It¹s a bit of a
 pathological case, and I haven¹t seen the same sort of degradation in
the
 other benchmarks or in real applications, but I haven¹t done serious
 benchmarking of them with this change.

 I shall see if the test case can be reduced down to anything simpler
while
 still showing the same performance behaviour, and try add some
compilation
 logging options to narrow down what¹s going on.

 Duncan.

 On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com
 wrote:

 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
 https://bugs.openjdk.java.net/browse/JDK-8063137

 After GuardWithTest (GWT) LambdaForms became shared, profile pollution
 significantly distorted compilation decisions. It affected inlining and
 hindered some optimizations. It causes significant performance
 regressions for Nashorn (on Octane benchmarks).

 Inlining was fixed by 8059877 [1], but it didn't cover the case when a
 branch is never taken. It can cause missed optimization opportunity,
and
 not just increase in code size. For example, non-pruned branch can
break
 escape analysis.

 Currently, there are 2 problems:
- branch frequencies profile pollution
- deoptimization counts pollution

 Branch frequency pollution hides from JIT the fact that a branch is
 never taken. Since GWT LambdaForms (and hence their bytecode) are
 heavily shared, but the behavior is specific to MethodHandle, there's
no
 way for JIT to understand how particular GWT instance behaves.

 The solution I propose is to do profiling in Java code and feed it to
 JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
 profiling info is stored. Once JIT kicks in, it can retrieve these
 counts, if corresponding MethodHandle is a compile-time constant (and
it
 is usually the case). To communicate the profile data from Java code to
 JIT, MethodHandleImpl::profileBranch() is used.

 If GWT MethodHandle isn't a compile-time constant, profiling should
 proceed. It happens when corresponding LambdaForm is already shared,
for
 newly created GWT MethodHandles profiling can occur only in native code
 (dedicated nmethod for a single LambdaForm). So, when compilation of
the
 whole MethodHandle chain is triggered, the profile should be already
 gathered.

 Overriding branch frequencies is not enough. Statistics on
 deoptimization events is also polluted. Even if a branch is never
taken,
 JIT doesn't issue an uncommon trap there unless corresponding bytecode
 doesn't trap too much and doesn't cause too many recompiles.

 I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
 sees it on some method, Compile::too_many_traps 
 Compile::too_many_recompiles for that method always return false. It
 allows JIT to prune the branch based on custom profile and recompile
the
 method, if the branch is visited.

 For now, I wanted to keep the fix very focused. The next thing I plan
to
 do is to experiment with ignoring deoptimization counts for other
 LambdaForms which are heavily shared. I already saw problems caused by
 deoptimization counts pollution (see JDK-8068915 [2]).

 I plan to backport the fix into 8u40, once I finish extensive
 performance testing.

 Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
 Octane).

 Thanks!

 PS: as a summary, my experiments show that fixes for 8063137  8068915
 [2] almost completely recovers peak performance after LambdaForm
sharing
 [3]. There's one more problem left (non-inlined MethodHandle
invocations
 are more expensive when LFs are shared), but it's a story for another
day.

 Best regards,
 Vladimir Ivanov


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-20 Thread MacGregor, Duncan (GE Energy Management)
So, very few deopt events in the logs (exactly 4 in fact, in both the
performant and non-performant cases, and for the exact same methods), but
in the case where performance has degraded I only see an initial
compilation for the problem method and not the later inlining I see in the
performant case. I’ll dig through the rest of the logs and try see if
there’s any differences leading up to the inlining.

On the bright side while going through the logs I did spot one obvious
snafu in our code (unnecessary MutableCallSite usage), and have got a 2.5
times speed up on another benchmark, so I’m not too unhappy. :-)

On 20/01/2015 17:14, MacGregor, Duncan (GE Energy Management)
duncan.macgre...@ge.com wrote:

Hmm, 8068915 hasn’t fixed it, but running fewer benchmarks seems to make
the problem go away, so it looks like there’s something going wrong fairly
deep in our runtime. Trying the full suite with compilation logging
enabled now to see if I can find a smoking gun.

On 20/01/2015 12:40, Vladimir Ivanov vladimir.x.iva...@oracle.com
wrote:

Duncan, thanks a lot for giving it a try!

If you plan to spend more time on it, please, apply 8068915 as well. I
saw huge intermittent performance regressions due to continuous
deoptimization storm. You can look into -XX:+LogCompilation output and
look for repeated deoptimization events in steady state w/ Action_none.
Also, there's deoptimization statistics in the log (at least, in jdk9).
It's located right before compilation_log tag.

Thanks again for the valuable feedback!

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/8068915/webrev.00

On 1/19/15 11:21 PM, MacGregor, Duncan (GE Energy Management) wrote:
 Okay, I¹ve done some tests of this with the micro benchmarks for our
 language  runtime which show pretty much no change except for one test
 which is now almost 3x slower. It uses nested loops to iterate over an
 array and concatenate the string-like objects it contains, and replaces
 elements with these new longer string-llike objects. It¹s a bit of a
 pathological case, and I haven¹t seen the same sort of degradation in
the
 other benchmarks or in real applications, but I haven¹t done serious
 benchmarking of them with this change.

 I shall see if the test case can be reduced down to anything simpler
while
 still showing the same performance behaviour, and try add some
compilation
 logging options to narrow down what¹s going on.

 Duncan.

 On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com
 wrote:

 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
 https://bugs.openjdk.java.net/browse/JDK-8063137

 After GuardWithTest (GWT) LambdaForms became shared, profile pollution
 significantly distorted compilation decisions. It affected inlining
and
 hindered some optimizations. It causes significant performance
 regressions for Nashorn (on Octane benchmarks).

 Inlining was fixed by 8059877 [1], but it didn't cover the case when a
 branch is never taken. It can cause missed optimization opportunity,
and
 not just increase in code size. For example, non-pruned branch can
break
 escape analysis.

 Currently, there are 2 problems:
- branch frequencies profile pollution
- deoptimization counts pollution

 Branch frequency pollution hides from JIT the fact that a branch is
 never taken. Since GWT LambdaForms (and hence their bytecode) are
 heavily shared, but the behavior is specific to MethodHandle, there's
no
 way for JIT to understand how particular GWT instance behaves.

 The solution I propose is to do profiling in Java code and feed it to
 JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
 profiling info is stored. Once JIT kicks in, it can retrieve these
 counts, if corresponding MethodHandle is a compile-time constant (and
it
 is usually the case). To communicate the profile data from Java code
to
 JIT, MethodHandleImpl::profileBranch() is used.

 If GWT MethodHandle isn't a compile-time constant, profiling should
 proceed. It happens when corresponding LambdaForm is already shared,
for
 newly created GWT MethodHandles profiling can occur only in native
code
 (dedicated nmethod for a single LambdaForm). So, when compilation of
the
 whole MethodHandle chain is triggered, the profile should be already
 gathered.

 Overriding branch frequencies is not enough. Statistics on
 deoptimization events is also polluted. Even if a branch is never
taken,
 JIT doesn't issue an uncommon trap there unless corresponding bytecode
 doesn't trap too much and doesn't cause too many recompiles.

 I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
 sees it on some method, Compile::too_many_traps 
 Compile::too_many_recompiles for that method always return false. It
 allows JIT to prune the branch based on custom profile and recompile
the
 method, if the branch is visited.

 For now, I wanted to keep the fix very focused. The next 

Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-20 Thread Vladimir Ivanov

John, thanks for the review!

Updated webrev:
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/jdk

See my answers inline.

On 1/17/15 2:13 AM, John Rose wrote:

On Jan 16, 2015, at 9:16 AM, Vladimir Ivanov
vladimir.x.iva...@oracle.com mailto:vladimir.x.iva...@oracle.com wrote:


http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8063137
...
PS: as a summary, my experiments show that fixes for 8063137  8068915
[2] almost completely recovers peak performance after LambdaForm
sharing [3]. There's one more problem left (non-inlined MethodHandle
invocations are more expensive when LFs are shared), but it's a story
for another day.


This performance bump is excellent news.  LFs are supposed to express
emergently common behaviors, like hidden classes.  We are much closer to
that goal now.

I'm glad to see that the library-assisted profiling turns out to be
relatively clean.

In effect this restores the pre-LF CountingMethodHandle logic from 2011,
which was so beneficial in JDK 7:
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/02de5cdbef21/src/share/classes/java/lang/invoke/CountingMethodHandle.java

I have some suggestions to make this version a little cleaner; see below.

Starting with the JDK changes:

In LambdaForm.java, I'm feeling flag pressure from all the little
boolean fields and constructor parameters.

(Is it time to put in a bit-encoded field private byte
LambdaForm.flags, or do we wait for another boolean to come along?  But
see next questions, which are more important.)

What happens when a GWT LF gets inlined into a larger LF?  Then there
might be two or more selectAlternative calls.
Will this confuse anything or will it Just Work?  The combined LF will
get profiled as usual, and the selectAlternative calls will also collect
profile (or not?).

This leads to another question:  Why have a boolean 'isGWT' at all?  Why
not just check for one or more occurrence of selectAlternative, and
declare that those guys override (some of) the profiling.  Something like:

   -+ if (PROFILE_GWT  lambdaForm.isGWT)
   ++ if (PROFILE_GWT  lambdaForm.containsFunction(NF_selectAlternative))
(...where LF.containsFunction(NamedFunction) is a variation of
LF.contains(Name).)

I suppose the answer may be that you want to inline GWTs (if ever) into
customized code where the JVM profiling should get maximum benefit.  In
that case case you might want to set the boolean to false to
distinguish immature GWT combinators from customized ones.

If that's the case, perhaps the real boolean flag you want is not
'isGWT' but 'sharedProfile' or 'immature' or some such, or (inverting)
'customized'.  (I like the feel of a 'customized' flag.)  Then
@IgnoreProfile would get attached to a LF that (a ) contains
selectAlternative and (b ) is marked as non-customized/immature/shared.
  You might also want to adjust the call to 'profileBranch' based on
whether the containing LF was shared or customized.

What I'm mainly poking at here is that 'isGWT' is not informative about
the intended use of the flag.
I agree. It was an interim solution. Initially, I planned to introduce 
customization and guide the logic based on that property. But it's not 
there yet and I needed something for GWT case. Unfortunately, I missed 
the case when GWT is edited. In that case, isGWT flag is missed and no 
annotation is set.
So, I removed isGWT flag and introduced a check for selectAlternative 
occurence in LambdaForm shape, as you suggested.



In 'updateCounters', if the counter overflows, you'll get continuous
creation of ArithmeticExceptions.  Will that optimize or will it cause a
permanent slowdown?  Consider a hack like this on the exception path:
counters[idx] = Integer.MAX_VALUE / 2;
I had an impression that VM optimizes overflows in Math.exact* 
intrinsics, but it's not the case - it always inserts an uncommon trap. 
I used the workaround you proposed.



On the Name Bikeshed:  It looks like @IgnoreProfile (ignore_profile in
the VM) promises too much ignorance, since it suppresses branch counts
and traps, but allows type profiles to be consulted.  Maybe something
positive like @ManyTraps or @SharedMegamorphic?  (It's just a name,
and this is just a suggestion.)

What do you think about @LambdaForm.Shared?


Going to the JVM:

In library_call.cpp, I think you should change the assert to a guard:
   -+ assert(aobj-length() == 2, );
   ++  aobj-length() == 2) {

Done.


In Parse::dynamic_branch_prediction, the mere presence of the Opaque4
node is enough to trigger replacement of profiling.  I think there
should *not* be a test of method()-ignore_profile().  That should
provide better integration between the two sources of profile data to
JVM profiling?

Done.


Also, I think the name 'Opaque4Node' is way too… opaque.  Suggest
'ProfileBranchNode', since that's 

Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-19 Thread Vladimir Ivanov

Thanks, Vladimir!


I would suggest to add more detailed comment (instead of simple Stop
profiling) to inline_profileBranch() intrinsic explaining what it is
doing because it is not strictly intrinsic - it does not implement
profileBranch() java code when counts is constant.

Sure, will do.


You forgot to mark Opaque4Node as macro node. I would suggest to base it
on Opaque2Node then you will get some methods from it.
Do I really need to do so? I expect it to go away during IGVN pass right 
after parsing is over. That's why I register the node for igvn in 
LibraryCallKit::inline_profileBranch(). Changes in macro.cpp  
compile.cpp are leftovers from the version when Opaque4 was macro node. 
I plan to remove them.


Best regards,
Vladimir Ivanov


On 1/16/15 9:16 AM, Vladimir Ivanov wrote:

http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8063137

After GuardWithTest (GWT) LambdaForms became shared, profile pollution
significantly distorted compilation decisions. It affected inlining and
hindered some optimizations. It causes significant performance
regressions for Nashorn (on Octane benchmarks).

Inlining was fixed by 8059877 [1], but it didn't cover the case when a
branch is never taken. It can cause missed optimization opportunity, and
not just increase in code size. For example, non-pruned branch can break
escape analysis.

Currently, there are 2 problems:
   - branch frequencies profile pollution
   - deoptimization counts pollution

Branch frequency pollution hides from JIT the fact that a branch is
never taken. Since GWT LambdaForms (and hence their bytecode) are
heavily shared, but the behavior is specific to MethodHandle, there's no
way for JIT to understand how particular GWT instance behaves.

The solution I propose is to do profiling in Java code and feed it to
JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
profiling info is stored. Once JIT kicks in, it can retrieve these
counts, if corresponding MethodHandle is a compile-time constant (and it
is usually the case). To communicate the profile data from Java code to
JIT, MethodHandleImpl::profileBranch() is used.

If GWT MethodHandle isn't a compile-time constant, profiling should
proceed. It happens when corresponding LambdaForm is already shared, for
newly created GWT MethodHandles profiling can occur only in native code
(dedicated nmethod for a single LambdaForm). So, when compilation of the
whole MethodHandle chain is triggered, the profile should be already
gathered.

Overriding branch frequencies is not enough. Statistics on
deoptimization events is also polluted. Even if a branch is never taken,
JIT doesn't issue an uncommon trap there unless corresponding bytecode
doesn't trap too much and doesn't cause too many recompiles.

I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
sees it on some method, Compile::too_many_traps 
Compile::too_many_recompiles for that method always return false. It
allows JIT to prune the branch based on custom profile and recompile the
method, if the branch is visited.

For now, I wanted to keep the fix very focused. The next thing I plan to
do is to experiment with ignoring deoptimization counts for other
LambdaForms which are heavily shared. I already saw problems caused by
deoptimization counts pollution (see JDK-8068915 [2]).

I plan to backport the fix into 8u40, once I finish extensive
performance testing.

Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
Octane).

Thanks!

PS: as a summary, my experiments show that fixes for 8063137  8068915
[2] almost completely recovers peak performance after LambdaForm sharing
[3]. There's one more problem left (non-inlined MethodHandle invocations
are more expensive when LFs are shared), but it's a story for another
day.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8059877
 8059877: GWT branch frequencies pollution due to LF sharing
[2] https://bugs.openjdk.java.net/browse/JDK-8068915
[3] https://bugs.openjdk.java.net/browse/JDK-8046703
 JEP 210: LambdaForm Reduction and Caching
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-19 Thread MacGregor, Duncan (GE Energy Management)
Okay, I¹ve done some tests of this with the micro benchmarks for our
language  runtime which show pretty much no change except for one test
which is now almost 3x slower. It uses nested loops to iterate over an
array and concatenate the string-like objects it contains, and replaces
elements with these new longer string-llike objects. It¹s a bit of a
pathological case, and I haven¹t seen the same sort of degradation in the
other benchmarks or in real applications, but I haven¹t done serious
benchmarking of them with this change.

I shall see if the test case can be reduced down to anything simpler while
still showing the same performance behaviour, and try add some compilation
logging options to narrow down what¹s going on.

Duncan.

On 16/01/2015 17:16, Vladimir Ivanov vladimir.x.iva...@oracle.com
wrote:

http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8063137

After GuardWithTest (GWT) LambdaForms became shared, profile pollution
significantly distorted compilation decisions. It affected inlining and
hindered some optimizations. It causes significant performance
regressions for Nashorn (on Octane benchmarks).

Inlining was fixed by 8059877 [1], but it didn't cover the case when a
branch is never taken. It can cause missed optimization opportunity, and
not just increase in code size. For example, non-pruned branch can break
escape analysis.

Currently, there are 2 problems:
   - branch frequencies profile pollution
   - deoptimization counts pollution

Branch frequency pollution hides from JIT the fact that a branch is
never taken. Since GWT LambdaForms (and hence their bytecode) are
heavily shared, but the behavior is specific to MethodHandle, there's no
way for JIT to understand how particular GWT instance behaves.

The solution I propose is to do profiling in Java code and feed it to
JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
profiling info is stored. Once JIT kicks in, it can retrieve these
counts, if corresponding MethodHandle is a compile-time constant (and it
is usually the case). To communicate the profile data from Java code to
JIT, MethodHandleImpl::profileBranch() is used.

If GWT MethodHandle isn't a compile-time constant, profiling should
proceed. It happens when corresponding LambdaForm is already shared, for
newly created GWT MethodHandles profiling can occur only in native code
(dedicated nmethod for a single LambdaForm). So, when compilation of the
whole MethodHandle chain is triggered, the profile should be already
gathered.

Overriding branch frequencies is not enough. Statistics on
deoptimization events is also polluted. Even if a branch is never taken,
JIT doesn't issue an uncommon trap there unless corresponding bytecode
doesn't trap too much and doesn't cause too many recompiles.

I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
sees it on some method, Compile::too_many_traps 
Compile::too_many_recompiles for that method always return false. It
allows JIT to prune the branch based on custom profile and recompile the
method, if the branch is visited.

For now, I wanted to keep the fix very focused. The next thing I plan to
do is to experiment with ignoring deoptimization counts for other
LambdaForms which are heavily shared. I already saw problems caused by
deoptimization counts pollution (see JDK-8068915 [2]).

I plan to backport the fix into 8u40, once I finish extensive
performance testing.

Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
Octane).

Thanks!

PS: as a summary, my experiments show that fixes for 8063137  8068915
[2] almost completely recovers peak performance after LambdaForm sharing
[3]. There's one more problem left (non-inlined MethodHandle invocations
are more expensive when LFs are shared), but it's a story for another day.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8059877
 8059877: GWT branch frequencies pollution due to LF sharing
[2] https://bugs.openjdk.java.net/browse/JDK-8068915
[3] https://bugs.openjdk.java.net/browse/JDK-8046703
 JEP 210: LambdaForm Reduction and Caching
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-16 Thread Vladimir Kozlov

Nice! At least Hotspot part since I don't understand jdk part :)

I would suggest to add more detailed comment (instead of simple Stop 
profiling) to inline_profileBranch() intrinsic explaining what it is 
doing because it is not strictly intrinsic - it does not implement 
profileBranch() java code when counts is constant.


You forgot to mark Opaque4Node as macro node. I would suggest to base it 
on Opaque2Node then you will get some methods from it.


Thanks,
Vladimir

On 1/16/15 9:16 AM, Vladimir Ivanov wrote:

http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8063137

After GuardWithTest (GWT) LambdaForms became shared, profile pollution
significantly distorted compilation decisions. It affected inlining and
hindered some optimizations. It causes significant performance
regressions for Nashorn (on Octane benchmarks).

Inlining was fixed by 8059877 [1], but it didn't cover the case when a
branch is never taken. It can cause missed optimization opportunity, and
not just increase in code size. For example, non-pruned branch can break
escape analysis.

Currently, there are 2 problems:
   - branch frequencies profile pollution
   - deoptimization counts pollution

Branch frequency pollution hides from JIT the fact that a branch is
never taken. Since GWT LambdaForms (and hence their bytecode) are
heavily shared, but the behavior is specific to MethodHandle, there's no
way for JIT to understand how particular GWT instance behaves.

The solution I propose is to do profiling in Java code and feed it to
JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
profiling info is stored. Once JIT kicks in, it can retrieve these
counts, if corresponding MethodHandle is a compile-time constant (and it
is usually the case). To communicate the profile data from Java code to
JIT, MethodHandleImpl::profileBranch() is used.

If GWT MethodHandle isn't a compile-time constant, profiling should
proceed. It happens when corresponding LambdaForm is already shared, for
newly created GWT MethodHandles profiling can occur only in native code
(dedicated nmethod for a single LambdaForm). So, when compilation of the
whole MethodHandle chain is triggered, the profile should be already
gathered.

Overriding branch frequencies is not enough. Statistics on
deoptimization events is also polluted. Even if a branch is never taken,
JIT doesn't issue an uncommon trap there unless corresponding bytecode
doesn't trap too much and doesn't cause too many recompiles.

I added @IgnoreProfile and place it only on GWT LambdaForms. When JIT
sees it on some method, Compile::too_many_traps 
Compile::too_many_recompiles for that method always return false. It
allows JIT to prune the branch based on custom profile and recompile the
method, if the branch is visited.

For now, I wanted to keep the fix very focused. The next thing I plan to
do is to experiment with ignoring deoptimization counts for other
LambdaForms which are heavily shared. I already saw problems caused by
deoptimization counts pollution (see JDK-8068915 [2]).

I plan to backport the fix into 8u40, once I finish extensive
performance testing.

Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite, Octane).

Thanks!

PS: as a summary, my experiments show that fixes for 8063137  8068915
[2] almost completely recovers peak performance after LambdaForm sharing
[3]. There's one more problem left (non-inlined MethodHandle invocations
are more expensive when LFs are shared), but it's a story for another day.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8059877
 8059877: GWT branch frequencies pollution due to LF sharing
[2] https://bugs.openjdk.java.net/browse/JDK-8068915
[3] https://bugs.openjdk.java.net/browse/JDK-8046703
 JEP 210: LambdaForm Reduction and Caching
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: [9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

2015-01-16 Thread John Rose
On Jan 16, 2015, at 9:16 AM, Vladimir Ivanov vladimir.x.iva...@oracle.com 
wrote:
 
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ 
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ 
 http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
 https://bugs.openjdk.java.net/browse/JDK-8063137 
 https://bugs.openjdk.java.net/browse/JDK-8063137
 ...
 PS: as a summary, my experiments show that fixes for 8063137  8068915 [2] 
 almost completely recovers peak performance after LambdaForm sharing [3]. 
 There's one more problem left (non-inlined MethodHandle invocations are more 
 expensive when LFs are shared), but it's a story for another day.

This performance bump is excellent news.  LFs are supposed to express 
emergently common behaviors, like hidden classes.  We are much closer to that 
goal now.

I'm glad to see that the library-assisted profiling turns out to be relatively 
clean.

In effect this restores the pre-LF CountingMethodHandle logic from 2011, which 
was so beneficial in JDK 7:
  
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/02de5cdbef21/src/share/classes/java/lang/invoke/CountingMethodHandle.java

I have some suggestions to make this version a little cleaner; see below.

Starting with the JDK changes:

In LambdaForm.java, I'm feeling flag pressure from all the little boolean 
fields and constructor parameters.

(Is it time to put in a bit-encoded field private byte LambdaForm.flags, or 
do we wait for another boolean to come along?  But see next questions, which 
are more important.)

What happens when a GWT LF gets inlined into a larger LF?  Then there might be 
two or more selectAlternative calls.
Will this confuse anything or will it Just Work?  The combined LF will get 
profiled as usual, and the selectAlternative calls will also collect profile 
(or not?).

This leads to another question:  Why have a boolean 'isGWT' at all?  Why not 
just check for one or more occurrence of selectAlternative, and declare that 
those guys override (some of) the profiling.  Something like:

  -+ if (PROFILE_GWT  lambdaForm.isGWT) 
  ++ if (PROFILE_GWT  lambdaForm.containsFunction(NF_selectAlternative))
(...where LF.containsFunction(NamedFunction) is a variation of 
LF.contains(Name).)

I suppose the answer may be that you want to inline GWTs (if ever) into 
customized code where the JVM profiling should get maximum benefit.  In that 
case case you might want to set the boolean to false to distinguish 
immature GWT combinators from customized ones.

If that's the case, perhaps the real boolean flag you want is not 'isGWT' but 
'sharedProfile' or 'immature' or some such, or (inverting) 'customized'.  (I 
like the feel of a 'customized' flag.)  Then @IgnoreProfile would get attached 
to a LF that (a ) contains selectAlternative and (b ) is marked as 
non-customized/immature/shared.  You might also want to adjust the call to 
'profileBranch' based on whether the containing LF was shared or customized.

What I'm mainly poking at here is that 'isGWT' is not informative about the 
intended use of the flag.

In 'updateCounters', if the counter overflows, you'll get continuous creation 
of ArithmeticExceptions.  Will that optimize or will it cause a permanent 
slowdown?  Consider a hack like this on the exception path:
   counters[idx] = Integer.MAX_VALUE / 2;

On the Name Bikeshed:  It looks like @IgnoreProfile (ignore_profile in the VM) 
promises too much ignorance, since it suppresses branch counts and traps, but 
allows type profiles to be consulted.  Maybe something positive like 
@ManyTraps or @SharedMegamorphic?  (It's just a name, and this is just a 
suggestion.)

Going to the JVM:

In library_call.cpp, I think you should change the assert to a guard:
  -+ assert(aobj-length() == 2, );
  ++  aobj-length() == 2) {

In Parse::dynamic_branch_prediction, the mere presence of the Opaque4 node is 
enough to trigger replacement of profiling.  I think there should *not* be a 
test of method()-ignore_profile().  That should provide better integration 
between the two sources of profile data to JVM profiling?

Also, I think the name 'Opaque4Node' is way too… opaque.  Suggest 
'ProfileBranchNode', since that's exactly what it does.

Suggest changing the log element profile_branch to observe 
source='profileBranch', to make a better hint as to the source of the info.

— John

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev