[ 
https://issues.apache.org/jira/browse/GROOVY-12065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Sun updated GROOVY-12065:
--------------------------------
    Description: 
h2. Overview
This improvement introduces a dedicated, single-pass bytecode compaction layer, 
{{PeepholeOptimizingMethodVisitor}}. The adapter wraps the underlying ASM 
{{MethodVisitor}} during class generation ({{AsmClassGenerator}}) and inspects 
the instruction stream within each basic block to eliminate redundant 
operations, rewrite conditional branches, and narrow constant instructions to 
their most compact forms.

Decoupling peephole optimization from structural code generation also 
simplifies the constant emission logic inside {{OperandStack}}. Instead of 
maintaining verbose, duplicated routing logic for specialized primitive opcodes 
(such as {{ICONST_x}}, {{BIPUSH}}, or {{FCONST_x}}), {{OperandStack}} now pushes 
every constant uniformly via {{visitLdcInsn}}. The stateful peephole layer then 
condenses these instructions into their shortest bytecode representations.

h2. Key Optimizations Implemented

h3. 1. Redundant Instruction & Dead Code Elimination
* *Discarded Assignment Values:* Eliminates wasteful {{DUP}} -> {{[X]STORE}} -> 
{{POP}} patterns typically produced during assignments where the expression 
result on the operand stack is unused. The optimizer flattens these directly 
into a single {{[X]STORE}}.
* *Redundant Loads:* Detects situations where a local variable or constant is 
loaded ({{[X]LOAD}} / {{LDC}}) but immediately discarded via {{POP}}/{{POP2}}, 
or followed immediately by a void {{RETURN}}. The optimizer safely drops both 
operations while preserving interleaved local increments ({{IINC}}).
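
The discarded-assignment rule can be illustrated with a minimal sketch. It is not the Groovy implementation: instructions are modelled as plain strings, the class and method names are hypothetical, and only single-slot stores ({{ISTORE}}, {{FSTORE}}, {{ASTORE}}) are assumed to pair with {{DUP}}/{{POP}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; instructions are modelled as strings such as "ISTORE 2".
final class DiscardedValueSketch {

    /** True for single-slot stores, the only ones this sketch pairs with DUP/POP. */
    static boolean isSingleSlotStore(String insn) {
        return insn.matches("[IFA]STORE \\d+");
    }

    /** Collapses every DUP, xSTORE, POP triple in the window into the bare xSTORE. */
    static List<String> collapse(List<String> window) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < window.size()) {
            boolean pattern = i + 2 < window.size()
                    && window.get(i).equals("DUP")
                    && isSingleSlotStore(window.get(i + 1))
                    && window.get(i + 2).equals("POP");
            if (pattern) {
                out.add(window.get(i + 1)); // keep only the store
                i += 3;
            } else {
                out.add(window.get(i));
                i++;
            }
        }
        return out;
    }
}
```

Calling {{collapse}} on {{ILOAD 1, DUP, ISTORE 2, POP, RETURN}} leaves {{ILOAD 1, ISTORE 2, RETURN}}; any sequence that does not match the triple passes through unchanged.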

h3. 2. Conditional Jump & Zero-Comparison Rewriting
* Optimizes integer comparisons against zero by rewriting binary comparison 
sequences (e.g., {{ILOAD}} -> {{ICONST_0}} -> {{IF_ICMPxx}}) into the shorter 
unary zero-comparison instructions ({{IFxx}}). For instance, an {{IF_ICMPEQ}} 
branch against a buffered {{0}} constant is rewritten to {{IFEQ}}, and the 
constant push is dropped.
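
As a rough sketch of this rewrite, the mapping from binary to unary branch opcodes can be written as a lookup keyed by mnemonic. The class name, the string-based opcode model, and the {{bufferedConstant}} parameter are assumptions made for illustration; the sketch covers only the case where the zero is the second (top-of-stack) operand.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; opcodes are modelled by their mnemonics.
final class ZeroCompareSketch {

    private static final Map<String, String> UNARY = new HashMap<>();
    static {
        UNARY.put("IF_ICMPEQ", "IFEQ");
        UNARY.put("IF_ICMPNE", "IFNE");
        UNARY.put("IF_ICMPLT", "IFLT");
        UNARY.put("IF_ICMPGE", "IFGE");
        UNARY.put("IF_ICMPGT", "IFGT");
        UNARY.put("IF_ICMPLE", "IFLE");
    }

    /**
     * Returns the IFxx replacement when the buffered top-of-stack operand is the
     * int constant 0; otherwise returns the branch opcode unchanged.
     */
    static String rewrite(String branchOpcode, Object bufferedConstant) {
        boolean isIntZero = bufferedConstant instanceof Integer
                && ((Integer) bufferedConstant).intValue() == 0;
        String unary = UNARY.get(branchOpcode);
        return (isIntZero && unary != null) ? unary : branchOpcode;
    }
}
```

When the rewrite applies, the caller would also discard the buffered zero instead of emitting it, since {{IFxx}} consumes a single operand.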

h3. 3. Instruction Narrowing & Constant Compaction
* Intercepts standard literal definitions and transparently narrows them down 
to the smallest possible specialized opcodes ({{ICONST_M1}} through 
{{ICONST_5}}, {{BIPUSH}}, {{SIPUSH}}, {{LCONST_0/1}}, {{FCONST_0/1/2}}, and 
{{DCONST_0/1}}).
* Explicitly safeguards signed floating-point zeros ({{-0.0f}} and {{-0.0d}}) 
from accidental flattening to guarantee strict IEEE 754 runtime compliance.
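
The integer part of this narrowing can be sketched as a cascade of range checks. This is a hypothetical illustration rather than the actual visitor code: the result is returned as a mnemonic string instead of being emitted through ASM, and the {{BIPUSH}} and {{SIPUSH}} ranges are the signed byte and short ranges defined by the JVM.

```java
// Hypothetical illustration only; returns a mnemonic instead of emitting bytecode.
final class IntNarrowingSketch {

    /** Picks the shortest instruction that pushes the given int constant. */
    static String narrow(int value) {
        if (value == -1) {
            return "ICONST_M1";
        }
        if (value >= 0 && value <= 5) {
            return "ICONST_" + value;
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "BIPUSH " + value;   // one-byte signed operand
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "SIPUSH " + value;   // two-byte signed operand
        }
        return "LDC " + value;          // falls back to the constant pool
    }
}
```

The checks run from the shortest encoding to the longest, so a value such as {{3}} never reaches the {{BIPUSH}} branch even though it would also fit there.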

h3. 4. Big Number Literal Lowering
* Centralizes the string-based construction of {{BigDecimal}} and 
{{BigInteger}} literals: a buffered constant is rewritten into an inline 
{{NEW}} -> {{DUP}} -> {{LDC [string]}} -> {{INVOKESPECIAL <init>}} sequence 
before it is emitted to the delegate.

h2. Implementation Strategy & Safety Guardrails
The {{PeepholeOptimizingMethodVisitor}} operates on a lightweight, stack-local 
sliding window of buffered instructions. To preserve runtime semantics, 
debugging information, and catch-block structures, it flushes the window to 
the delegate visitor whenever it encounters a non-local boundary or a 
frame-altering instruction, including:
* Control flow jumps and basic block {{Label}} targets.
* Stack map frames ({{visitFrame}}).
* Line numbers and local variable debug maps.
* Method invocations and {{InvokeDynamic}} instructions.
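
The flushing behaviour can be modelled with a small sketch. It is an assumption-laden illustration, not the real class: a list stands in for the delegate {{MethodVisitor}}, events are strings, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; a list stands in for the delegate MethodVisitor.
final class FlushingWindowSketch {

    private final List<String> window = new ArrayList<>();
    private final List<String> delegate = new ArrayList<>();

    /** Buffers an instruction that is a candidate for peephole rewriting. */
    void visitBufferable(String insn) {
        window.add(insn);
    }

    /** A label, frame, line number, jump, or invocation: flush first, then forward. */
    void visitBoundary(String event) {
        flush();
        delegate.add(event);
    }

    /** End of method: nothing may remain buffered. */
    void visitEnd() {
        flush();
    }

    private void flush() {
        delegate.addAll(window); // a real optimizer would rewrite the window here
        window.clear();
    }

    int pending() {
        return window.size();
    }

    List<String> emitted() {
        return delegate;
    }
}
```

Flushing before the boundary event is forwarded keeps the original instruction order, so a label or line number never ends up ahead of instructions that preceded it.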



  was:
h3. Background

Groovy's bytecode generator relied on ad-hoc, per-type conditional branches 
scattered across {{OperandStack}} and {{BytecodeHelper}} to select compact 
constant-loading instructions ({{ICONST_x}}, {{LCONST_x}}, {{FCONST_x}}, 
{{DCONST_x}}, {{BIPUSH}}, {{SIPUSH}}) over the generic {{LDC}} instruction. 
This logic was fragile, inconsistent, and incomplete; for instance, 
{{BytecodeHelper.pushConstant()}} was missing the {{ICONST_M1}} case for the 
integer constant {{-1}}.
h3. Change

A new {{PeepholeOptimizingMethodVisitor}}, a {{MethodVisitor}} decorator, is 
introduced and inserted into the per-method visitor chain in 
{{AsmClassGenerator}}. It intercepts every {{visitLdcInsn}} and 
{{visitIntInsn(BIPUSH/SIPUSH, ...)}} call and rewrites each to the shortest 
valid equivalent opcode:
||Value||Emitted instruction||
|{{int}} −1|{{ICONST_M1}}|
|{{int}} 0–5|{{ICONST_0}} – {{ICONST_5}}|
|{{int}} in [−128, 127]|{{BIPUSH}}|
|{{int}} in [−32768, 32767]|{{SIPUSH}}|
|{{long}} 0 / 1|{{LCONST_0}} / {{LCONST_1}}|
|{{float}} 0 / 1 / 2|{{FCONST_0}} / {{FCONST_1}} / {{FCONST_2}}|
|{{double}} 0 / 1|{{DCONST_0}} / {{DCONST_1}}|

Signed-zero {{float}} and {{double}} values are handled correctly by comparing 
raw bit patterns (via {{Float.floatToRawIntBits}} / 
{{Double.doubleToRawLongBits}}) rather than {{==}}, preserving the 
{{-0.0}} / {{+0.0}} distinction that a straight equality check would collapse.
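
A minimal sketch of the raw-bit comparison is shown below. The class and method names are hypothetical and the result is a mnemonic string rather than emitted bytecode; only {{Float.floatToRawIntBits}} and {{Double.doubleToRawLongBits}} are taken from the description above.

```java
// Hypothetical illustration only; returns a mnemonic instead of emitting bytecode.
final class FloatNarrowingSketch {

    /** Narrows a float constant; -0.0f falls through to LDC because its bits differ. */
    static String narrow(float value) {
        int bits = Float.floatToRawIntBits(value);
        if (bits == Float.floatToRawIntBits(0.0f)) return "FCONST_0";
        if (bits == Float.floatToRawIntBits(1.0f)) return "FCONST_1";
        if (bits == Float.floatToRawIntBits(2.0f)) return "FCONST_2";
        return "LDC " + value;
    }

    /** Narrows a double constant with the same raw-bit comparison. */
    static String narrow(double value) {
        long bits = Double.doubleToRawLongBits(value);
        if (bits == Double.doubleToRawLongBits(0.0d)) return "DCONST_0";
        if (bits == Double.doubleToRawLongBits(1.0d)) return "DCONST_1";
        return "LDC " + value;
    }
}
```

Because {{-0.0f == 0.0f}} evaluates to {{true}} in Java, an equality-based check would turn {{-0.0f}} into {{FCONST_0}} and lose the sign; the bit patterns differ in the sign bit, so the comparison above keeps the two apart.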

All per-type branches previously inlined in {{OperandStack}} are replaced with 
plain {{visitLdcInsn(value)}} calls; the peephole visitor centralizes the 
selection uniformly for every generated method. 
{{BytecodeHelper.pushConstant()}} also gains the previously missing 
{{ICONST_M1}} branch.

 


> Implement peephole optimization for bytecode generation
> -------------------------------------------------------
>
>                 Key: GROOVY-12065
>                 URL: https://issues.apache.org/jira/browse/GROOVY-12065
>             Project: Groovy
>          Issue Type: Improvement
>            Reporter: Daniel Sun
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
