Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-16 Thread Jatin Bhateja
On Wed, 16 Feb 2022 12:26:45 GMT, Jatin Bhateja  wrote:

>>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in 
>>> > the evex prefix, removing the need to rely on global MXCSR register. 
>>> > Thanks.
>>> 
>>> Hi @merykitty , You are correct, we can embed RC mode in instruction 
>>> encoding of round instruction (towards -inf,+inf, zero). But to match the 
>>> semantics of Math.round API one needs to add 0.5[f] to input value and then 
>>> perform rounding over resultant value, which is why @sviswa7 suggested to 
>>> use a global rounding mode driven by MXCSR.RC so that intermediate floating 
>>> inexact values are resolved as desired, but OOO execution may misplace 
>>> LDMXCSR and hence may have undesired side effects.
>> 
>> **Just want to correct above statement, LDMXCSR will not be 
>> re-ordered/re-scheduled early OOO backend.**
>
>> That pseudocode would make a very useful comment too. This whole patch is 
>> very thinly commented.
> 
> I have replaced earlier bulky sequence, new sequence is having similar 
> performance but reduction in code may improve inlining behavior.  Added 
> descriptive comments around the special cases.

> There are already `RoundFloat`, `RoundDouble`, and `RoundDoubleMode` nodes 
> defined.
> 
> Though `RoundFloat` and `RoundDouble` are legacy nodes used only on x86-32, 
> `RoundDoubleMode` supports multiple rounding modes and is amenable to 
> auto-vectorization.
> 
> What do you think about the following alternative?
> 
> Reuse `RoundDoubleMode` (with a new rounding mode) and introduce 
> `RoundFloatMode`.
> 
> Special rounding rules is not the only peculiarity of `Math.round()`. It also 
> converts the result to an integral type. It can be represented as `ConvF2I 
> (RoundFloatMode f #rmode)` / `ConvD2L (RoundDoubleMode d #rmode)`. In scalar 
> case, it can be matched as a single AD instruction.
> 
> Auto-vectorizer can then convert it to `VectorCastF2X (RoundFloatModeV vf 
> #rmode)` / `VectorCastD2X (RoundDoubleModeV vd #rmode)` and match it in a 
> similar manner.

Adding a new rounding mode to RoundDoubleMode may disturb other targets. The 
match_rule_supported routine operates on opcodes, and currently any target 
that supports RoundDoubleMode generates code for all of its rounding modes. 
Your solution also relies on creating new scalar and vector IR nodes for the 
floating-point rounding operation, which is what the patch currently does.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-16 Thread Jatin Bhateja
On Mon, 14 Feb 2022 17:14:10 GMT, Jatin Bhateja  wrote:

>> That pseudocode would make a very useful comment too. This whole patch is 
>> very thinly commented.
>
>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in 
>> > the evex prefix, removing the need to rely on global MXCSR register. 
>> > Thanks.
>> 
>> Hi @merykitty , You are correct, we can embed RC mode in instruction 
>> encoding of round instruction (towards -inf,+inf, zero). But to match the 
>> semantics of Math.round API one needs to add 0.5[f] to input value and then 
>> perform rounding over resultant value, which is why @sviswa7 suggested to 
>> use a global rounding mode driven by MXCSR.RC so that intermediate floating 
>> inexact values are resolved as desired, but OOO execution may misplace 
>> LDMXCSR and hence may have undesired side effects.
> 
> **Just want to correct above statement, LDMXCSR will not be 
> re-ordered/re-scheduled early OOO backend.**

> That pseudocode would make a very useful comment too. This whole patch is 
> very thinly commented.

I have replaced the earlier bulky sequence. The new sequence has similar 
performance, but the reduction in code size may improve inlining behavior. I 
have also added descriptive comments around the special cases.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-14 Thread Jatin Bhateja
On Mon, 14 Feb 2022 09:12:54 GMT, Andrew Haley  wrote:

>>> What does this do? Comment, even pseudo code, would be nice.
>> 
>> Thanks @theRealAph , I shall append the comments over the routine.
>> BTW, entire rounding algorithm can also be implemented using  Vector API 
>> which can perform if-conversion using masked operations.
>> 
>> class roundf {
>>    public static VectorSpecies<Integer> ISPECIES = IntVector.SPECIES_512;
>>    public static VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;
>> 
>>    public static int round_vector(float[] a, int[] r, int ctr) {
>>       IntVector shiftVBC = (IntVector) ISPECIES.broadcast(24 - 2 + 127);
>>       for (int i = 0; i < a.length; i += SPECIES.length()) {
>>          FloatVector fv = FloatVector.fromArray(SPECIES, a, i);
>>          IntVector iv = fv.reinterpretAsInts();
>>          IntVector biasedExpV = iv.lanewise(VectorOperators.AND, 0x7F800000);
>>          biasedExpV = biasedExpV.lanewise(VectorOperators.ASHR, 23);
>>          IntVector shiftV = shiftVBC.lanewise(VectorOperators.SUB, biasedExpV);
>>          VectorMask<Integer> cond = shiftV.lanewise(VectorOperators.AND, -32)
>>                                           .compare(VectorOperators.EQ, 0);
>>          IntVector res = iv.lanewise(VectorOperators.AND, 0x007FFFFF)
>>                            .lanewise(VectorOperators.OR, 0x007FFFFF + 1);
>>          VectorMask<Integer> cond1 = iv.compare(VectorOperators.LT, 0);
>>          VectorMask<Integer> cond2 = cond1.and(cond);
>>          res = res.lanewise(VectorOperators.NEG, cond2);
>>          res = res.lanewise(VectorOperators.ASHR, shiftV)
>>                   .lanewise(VectorOperators.ADD, 1)
>>                   .lanewise(VectorOperators.ASHR, 1);
>>          res = fv.convert(VectorOperators.F2I, 0)
>>                  .reinterpretAsInts()
>>                  .blend(res, cond);
>>          res.intoArray(r, i);
>>       }
>>       return r[ctr];
>>    }
>> }
>
> That pseudocode would make a very useful comment too. This whole patch is 
> very thinly commented.

> > Hi, IIRC for evex encoding you can embed the RC control bit directly in the 
> > evex prefix, removing the need to rely on global MXCSR register. Thanks.
> 
> Hi @merykitty , You are correct, we can embed RC mode in instruction encoding 
> of round instruction (towards -inf,+inf, zero). But to match the semantics of 
> Math.round API one needs to add 0.5[f] to input value and then perform 
> rounding over resultant value, which is why @sviswa7 suggested to use a 
> global rounding mode driven by MXCSR.RC so that intermediate floating inexact 
> values are resolved as desired, but OOO execution may misplace LDMXCSR and 
> hence may have undesired side effects.

**Just want to correct the above statement: LDMXCSR will not be 
re-ordered/re-scheduled early by the OOO backend.**

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-14 Thread Andrew Haley
On Sun, 13 Feb 2022 13:12:35 GMT, Jatin Bhateja  wrote:

>>> Hi, IIRC for evex encoding you can embed the RC control bit directly in the 
>>> evex prefix, removing the need to rely on global MXCSR register. Thanks.
>> 
>> Hi @merykitty ,  You are correct, we can embed RC mode in instruction 
>> encoding of round instruction (towards -inf,+inf, zero). But to match the 
>> semantics of Math.round API one needs to add 0.5[f] to input value and then 
>> perform rounding over resultant value, which is why @sviswa7 suggested to 
>> use a global rounding mode driven by MXCSR.RC so that intermediate floating 
>> inexact values are resolved as desired, but OOO execution may misplace 
>> LDMXCSR and hence may have undesired side effects.
>
>> What does this do? Comment, even pseudo code, would be nice.
> 
> Thanks @theRealAph , I shall append the comments over the routine.
> BTW, entire rounding algorithm can also be implemented using  Vector API 
> which can perform if-conversion using masked operations.
> 
> class roundf {
>    public static VectorSpecies<Integer> ISPECIES = IntVector.SPECIES_512;
>    public static VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;
> 
>    public static int round_vector(float[] a, int[] r, int ctr) {
>       IntVector shiftVBC = (IntVector) ISPECIES.broadcast(24 - 2 + 127);
>       for (int i = 0; i < a.length; i += SPECIES.length()) {
>          FloatVector fv = FloatVector.fromArray(SPECIES, a, i);
>          IntVector iv = fv.reinterpretAsInts();
>          IntVector biasedExpV = iv.lanewise(VectorOperators.AND, 0x7F800000);
>          biasedExpV = biasedExpV.lanewise(VectorOperators.ASHR, 23);
>          IntVector shiftV = shiftVBC.lanewise(VectorOperators.SUB, biasedExpV);
>          VectorMask<Integer> cond = shiftV.lanewise(VectorOperators.AND, -32)
>                                           .compare(VectorOperators.EQ, 0);
>          IntVector res = iv.lanewise(VectorOperators.AND, 0x007FFFFF)
>                            .lanewise(VectorOperators.OR, 0x007FFFFF + 1);
>          VectorMask<Integer> cond1 = iv.compare(VectorOperators.LT, 0);
>          VectorMask<Integer> cond2 = cond1.and(cond);
>          res = res.lanewise(VectorOperators.NEG, cond2);
>          res = res.lanewise(VectorOperators.ASHR, shiftV)
>                   .lanewise(VectorOperators.ADD, 1)
>                   .lanewise(VectorOperators.ASHR, 1);
>          res = fv.convert(VectorOperators.F2I, 0)
>                  .reinterpretAsInts()
>                  .blend(res, cond);
>          res.intoArray(r, i);
>       }
>       return r[ctr];
>    }
> }

That pseudocode would make a very useful comment too. This whole patch is very 
thinly commented.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-13 Thread Jatin Bhateja
On Sun, 13 Feb 2022 13:08:41 GMT, Jatin Bhateja  wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:
>> 
>>> 4064: }
>>> 4065: 
>>> 4066: void 
>>> C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, 
>>> XMMRegister src, XMMRegister xtmp1,
>> 
>> What does this do? Comment, even pseudo code, would be nice.
>
>> Hi, IIRC for evex encoding you can embed the RC control bit directly in the 
>> evex prefix, removing the need to rely on global MXCSR register. Thanks.
> 
> Hi @merykitty ,  You are correct, we can embed RC mode in instruction 
> encoding of round instruction (towards -inf,+inf, zero). But to match the 
> semantics of Math.round API one needs to add 0.5[f] to input value and then 
> perform rounding over resultant value, which is why @sviswa7 suggested to use 
> a global rounding mode driven by MXCSR.RC so that intermediate floating 
> inexact values also are resolved as desired, but OOO execution may misplace 
> LDMXCSR and hence may have undesired side effects.

> What does this do? Comment, even pseudo code, would be nice.

Thanks @theRealAph, I shall add comments to the routine.
BTW, the entire rounding algorithm can also be implemented using the Vector 
API, which can perform if-conversion using masked operations.

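import jdk.incubator.vector.*;

class roundf {
   public static VectorSpecies<Integer> ISPECIES = IntVector.SPECIES_512;
   public static VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;

   // Assumes a.length is a multiple of SPECIES.length() and r.length >= a.length.
   public static int round_vector(float[] a, int[] r, int ctr) {
      // SIGNIFICAND_WIDTH - 2 + EXP_BIAS for float.
      IntVector shiftVBC = (IntVector) ISPECIES.broadcast(24 - 2 + 127);
      for (int i = 0; i < a.length; i += SPECIES.length()) {
         FloatVector fv = FloatVector.fromArray(SPECIES, a, i);
         IntVector iv = fv.reinterpretAsInts();
         // Extract the biased exponent of each lane.
         IntVector biasedExpV = iv.lanewise(VectorOperators.AND, 0x7F800000);
         biasedExpV = biasedExpV.lanewise(VectorOperators.ASHR, 23);
         IntVector shiftV = shiftVBC.lanewise(VectorOperators.SUB, biasedExpV);
         // cond: 0 <= shift < 32, i.e. lanes that take the bit-manipulation path.
         VectorMask<Integer> cond = shiftV.lanewise(VectorOperators.AND, -32)
                                          .compare(VectorOperators.EQ, 0);
         // Significand with the implicit leading one restored.
         IntVector res = iv.lanewise(VectorOperators.AND, 0x007FFFFF)
                           .lanewise(VectorOperators.OR, 0x007FFFFF + 1);
         // Negate the lanes that hold negative inputs.
         VectorMask<Integer> cond1 = iv.compare(VectorOperators.LT, 0);
         VectorMask<Integer> cond2 = cond1.and(cond);
         res = res.lanewise(VectorOperators.NEG, cond2);
         // ((r >> shift) + 1) >> 1 evaluates to floor(a + 1/2).
         res = res.lanewise(VectorOperators.ASHR, shiftV)
                  .lanewise(VectorOperators.ADD, 1)
                  .lanewise(VectorOperators.ASHR, 1);
         // The remaining lanes (tiny values, integral values, infinities, NaN)
         // use a plain float-to-int conversion.
         res = fv.convert(VectorOperators.F2I, 0)
                 .reinterpretAsInts()
                 .blend(res, cond);
         res.intoArray(r, i);
      }
      return r[ctr];
   }
}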
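
For readers following the lane-wise steps, here is a scalar sketch of the same bit manipulation applied to one float at a time. It assumes the IEEE 754 single-precision layout (24-bit significand, exponent bias 127). The class and method names are made up for illustration and are not part of the patch.

```java
final class ScalarRoundSketch {
    // Hypothetical scalar counterpart of the vector routine above.
    static int roundBits(float a) {
        int bits = Float.floatToRawIntBits(a);
        // Biased exponent: bits 23..30.
        int biasedExp = (bits & 0x7F800000) >> 23;
        // (SIGNIFICAND_WIDTH - 2 + EXP_BIAS) - biasedExp
        int shift = (24 - 2 + 127) - biasedExp;
        if ((shift & -32) == 0) { // 0 <= shift < 32
            // Significand with the implicit leading one restored.
            int r = (bits & 0x007FFFFF) | (0x007FFFFF + 1);
            if (bits < 0) {
                r = -r;
            }
            // ((r >> shift) + 1) >> 1 evaluates to floor(a + 1/2).
            return ((r >> shift) + 1) >> 1;
        }
        // Tiny values, integral values, infinities and NaN: plain cast.
        return (int) a;
    }

    public static void main(String[] args) {
        float[] samples = {2.5f, -2.5f, 0.49999997f, 1.0e10f, Float.NaN};
        for (float f : samples) {
            System.out.println(f + " -> " + roundBits(f)
                    + " (Math.round: " + Math.round(f) + ")");
        }
    }
}
```

The `cond` mask in the vector version plays the role of the `if` here, and the final `blend` selects between the two branches per lane.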

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-13 Thread Jatin Bhateja
On Sun, 13 Feb 2022 10:58:19 GMT, Andrew Haley  wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains four additional 
>> commits since the last revision:
>> 
>>  - 8279508: Adding vectorized algorithms to match the semantics of rounding 
>> operations.
>>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>>  - 8279508: Adding a test for scalar intrinsification.
>>  - 8279508: Auto-vectorize Math.round API
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:
> 
>> 4064: }
>> 4065: 
>> 4066: void 
>> C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, 
>> XMMRegister src, XMMRegister xtmp1,
> 
> What does this do? Comment, even pseudo code, would be nice.

> Hi, IIRC for evex encoding you can embed the RC control bit directly in the 
> evex prefix, removing the need to rely on global MXCSR register. Thanks.

Hi @merykitty, you are correct: we can embed the RC mode in the instruction 
encoding of the round instructions (toward -inf, +inf, or zero). But to match 
the semantics of the Math.round API, one needs to add 0.5[f] to the input 
value and then round the resulting value. That is why @sviswa7 suggested using 
a global rounding mode driven by MXCSR.RC, so that inexact intermediate 
floating-point values are also resolved as desired. However, OOO execution may 
misplace LDMXCSR and hence may have undesired side effects.
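
To illustrate why the rounding of the intermediate sum matters, here is a small Java sketch. `naiveRound` is a hypothetical helper, not code from the patch. It adds 0.5f under Java's default round-to-nearest-even arithmetic and then takes the floor, which disagrees with Math.round for the largest float below 0.5.

```java
final class NaiveRoundSketch {
    // Hypothetical helper: float add (round-to-nearest-even), then floor.
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);
    }

    public static void main(String[] args) {
        float x = 0.49999997f; // largest float strictly below 0.5
        // x + 0.5f is not exactly representable and rounds up to 1.0f,
        // so the naive version returns 1 while Math.round returns 0.
        System.out.println("naive: " + naiveRound(x));
        System.out.println("Math.round: " + Math.round(x));
    }
}
```

If the addition were instead performed with rounding toward negative infinity, the sum would stay below 1.0, which appears to be the motivation for controlling the rounding mode of the intermediate operation.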

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-13 Thread Andrew Haley
On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding 
> operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:

> 4064: }
> 4065: 
> 4066: void 
> C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, 
> XMMRegister src, XMMRegister xtmp1,

What does this do? Comment, even pseudo code, would be nice.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-13 Thread Quan Anh Mai
On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding 
> operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

Also, it seems you have tried using `roundss/sd/ps/pd` followed by a cast to 
correct the rounding behaviour but decided to take another approach. Some 
comments around the functions explaining why that is so would be preferable. 
Thanks.
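
A rough Java-level analogy, offered as an illustration rather than as the patch's reasoning: Math.rint rounds to the nearest integer with ties going to even, much like a round instruction in nearest mode, so following it with a cast disagrees with Math.round on some ties. The class and method names below are hypothetical.

```java
final class RintVersusRound {
    // Hypothetical helper: round-to-nearest-even, then convert to long.
    static long rintThenCast(double x) {
        return (long) Math.rint(x);
    }

    public static void main(String[] args) {
        double[] ties = {0.5, 1.5, 2.5, -2.5};
        for (double d : ties) {
            // Math.round resolves ties toward positive infinity,
            // Math.rint resolves them toward the even neighbour.
            System.out.println(d + ": rint+cast=" + rintThenCast(d)
                    + ", Math.round=" + Math.round(d));
        }
    }
}
```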

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-12 Thread Quan Anh Mai
On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding 
> operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

Hi, IIRC for evex encoding you can embed the RC control bit directly in the 
evex prefix, removing the need to rely on global MXCSR register. Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-12 Thread Jatin Bhateja
> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR 
> nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> The following are the performance numbers of a JMH microbenchmark included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin
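
For context, the kind of scalar loop that an auto-vectorizer change like this targets looks roughly like the sketch below. The actual FpRoundingBenchmark source is not shown in this thread, so the class name, method name, and sample values here are hypothetical.

```java
final class RoundLoopSketch {
    // A plain element-wise loop over Math.round(float): each iteration is
    // independent, which is the shape a loop vectorizer looks for.
    static void roundAll(float[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.round(in[i]);
        }
    }

    public static void main(String[] args) {
        float[] in = {1.4f, 1.5f, -1.5f, 2.5f};
        int[] out = new int[in.length];
        roundAll(in, out);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```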

Jatin Bhateja has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains four additional 
commits since the last revision:

 - 8279508: Adding vectorized algorithms to match the semantics of rounding 
operations.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
 - 8279508: Adding a test for scalar intrinsification.
 - 8279508: Auto-vectorize Math.round API

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/575d2935..2dc364fa

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=01-02

  Stats: 33695 lines in 1192 files changed: 23243 ins; 5703 del; 4749 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094