[ 
https://issues.apache.org/jira/browse/CALCITE-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

krooswu updated CALCITE-7109:
-----------------------------
    Description: 
h1. Proposal: Add support for << (bitwise left shift) operator
h2. Overview

Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in 
Apache Calcite, following mainstream database semantics.
h2. Implementation Details
h3. Core Components
 * {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with 
appropriate precedence (32, matching standard shift operators)
 * {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in 
{{SqlStdOperatorTable}}
 * {*}Integration{*}: Support in {{{}SqlToRelConverter{}}}, {{{}RexBuilder{}}}, 
and code generation
 * {*}Type System{*}: Add type inference for all INTEGER family types (TINYINT, 
SMALLINT, INTEGER, BIGINT)

h3. Type Handling Strategy
 * {*}Return Type{*}: Result type matches left operand type 
({{{}ReturnTypes.ARG0_NULLABLE{}}})
 * {*}Mixed Type Promotion{*}: When operand types differ, promote to 
accommodate both operands
 * {*}Supported Types{*}:
 ** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT)
 ** VARBINARY types (following BigQuery semantics)

h3. Behavioral Semantics
h4. Shift Count Handling
 * {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to 
inconsistent database behaviors
{{SELECT 8 << -1; – Throws exception for clarity and safety}}
 * {*}Large Shift{*}: Shift counts >= type bit width return 0
{{SELECT 1 << 40; – Returns 0 for INTEGER types}}

h4. Cross-Database Negative Shift Behavior Analysis

Testing reveals significant inconsistencies in negative shift handling:

{*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)

Negative shifts
{code:java}
SELECT 1 << -1; – Returns -2147483648 (equivalent to 1 << 31)
SELECT 1 << -2; – Returns 1073741824 (equivalent to 1 << 30)
SELECT 1 << -3; – Returns 536870912 (equivalent to 1 << 29){code}
– Large positive shifts 
{code:java}
SELECT 1 << 40; – Returns 256 (equivalent to 1 << 8)
SELECT 1 << 33; – Returns 2 (equivalent to 1 << 1)}}{code}
{*}SQL Server{*}: Treats negative shift as right shift

Negative shifts
{code:java}
SELECT 8 << -1; – Equivalent to 8 >> 1, returns 4{code}
{*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large 
positive shifts
{code:java}
SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
SELECT 1 << -1; – Returns 0 (safe handling)
SELECT 1 << -2; – Returns 0 (safe handling){code}
Given these inconsistencies, Calcite adopts a *strict validation approach* to 
avoid user confusion and ensure predictable behavior.
h4. Overflow Behavior
 * {*}Silent Truncation{*}: Overflow bits are silently discarded (no exceptions 
thrown)
 * {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
 * {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow handling

h4. Null Handling
 * {*}Standard SQL Semantics{*}: NULL operands result in NULL output
 * {*}Consistent{*}: Follows existing Calcite null propagation patterns

h3. Database Compatibility Matrix
||Database||Negative Shift||Overflow||Implementation Status||
|*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges (strict 
validation)|
|*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict 
validation)|
|*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges 
(strict validation)|
|*BigQuery*|[Needs verification]|Silent truncation|✅ Aligned (BYTES support)|
h3. Test Coverage
 * {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and precedence
 * {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error 
conditions
 * {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
 * {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression 
tree handling
 * {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution scenarios
 * {*}Edge Cases{*}: Boundary conditions, type promotions, null handling

h3. Design Decisions & Rationales
h4. Why Silent Truncation?
 * {*}Industry Standard{*}: All major databases handle overflow this way
 * {*}Performance{*}: Avoids runtime exception overhead
 * {*}Predictability{*}: Consistent bit-level behavior across platforms

h4. Why Strict Negative Shift Validation?
 * {*}Database Inconsistency{*}: PostgreSQL, SQL Server, and likely others 
handle negative shifts completely differently
 * {*}User Clarity{*}: Exception with clear message prevents confusion about 
intended behavior
 * {*}Predictable Behavior{*}: Consistent error handling across all Calcite 
deployments
 * {*}Safety First{*}: Avoids potentially unexpected results from modular 
arithmetic or implicit right shifts

h4. Why ARG0 Return Type?
 * {*}Consistency{*}: Matches other bitwise operators in Calcite
 * {*}Control{*}: Users can explicitly cast for type promotion when needed
 * {*}Performance{*}: Avoids unnecessary type conversions

h3. VARBINARY Implementation Details
 * {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
 * {*}Length Preservation{*}: Result maintains same byte length as input
 * {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are discarded
 * {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior

h3. Future Extensions
 * {*}Configuration Options{*}: Optional strict mode for overflow checking
 * {*}Right Shift Operator{*}: Companion {{>>}} operator implementation

h3. Breaking Changes
 * {*}None{*}: This is a new feature addition with no existing functionality 
changes

This proposal ensures Calcite's left shift operator behaves consistently with 
mainstream SQL databases while maintaining performance and providing clear, 
predictable semantics.

  was:
h1. Proposal: Add support for << (bitwise left shift) operator
h2. Overview

Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in 
Apache Calcite, following mainstream database semantics.
h2. Implementation Details
h3. Core Components
 * {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with 
appropriate precedence (32, matching standard shift operators)
 * {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in 
{{SqlStdOperatorTable}}
 * {*}Integration{*}: Support in {{{}SqlToRelConverter{}}}, {{{}RexBuilder{}}}, 
and code generation
 * {*}Type System{*}: Add type inference for all INTEGER family types (TINYINT, 
SMALLINT, INTEGER, BIGINT)

h3. Type Handling Strategy
 * {*}Return Type{*}: Result type matches left operand type 
({{{}ReturnTypes.ARG0_NULLABLE{}}})
 * {*}Mixed Type Promotion{*}: When operand types differ, promote to 
accommodate both operands
 * {*}Supported Types{*}:
 ** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT)
 ** VARBINARY types (following BigQuery semantics)

h3. Behavioral Semantics
h4. Shift Count Handling
 * {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to 
inconsistent database behaviors
{{SELECT 8 << -1;  -- Throws exception for clarity and safety}}
 * {*}Large Shift{*}: Shift counts >= type bit width return 0
{{SELECT 1 << 40;  -- Returns 0 for INTEGER types}}

h4. Cross-Database Negative Shift Behavior Analysis

Testing reveals significant inconsistencies in negative shift handling:

{*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)

 
h1. {{-- Negative shifts
SELECT 1 << -1;  -- Returns -2147483648 (equivalent to 1 << 31)
SELECT 1 << -2;  -- Returns 1073741824 (equivalent to 1 << 30)
SELECT 1 << -3;  -- Returns 536870912 (equivalent to 1 << 29)

-- Large positive shifts  
SELECT 1 << 40;  -- Returns 256 (equivalent to 1 << 8)
SELECT 1 << 33;  -- Returns 2 (equivalent to 1 << 1)}}

{*}SQL Server{*}: Treats negative shift as right shift

 
h1. {{SELECT 8 << -1;  -- Equivalent to 8 >> 1, returns 4}}

{*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large 
positive shifts

 
h1. {{SELECT 1 << 40;  -- Returns 1099511627776 (normal calculation)
SELECT 1 << -1;  -- Returns 0 (safe handling)
SELECT 1 << -2;  -- Returns 0 (safe handling)}}

Given these inconsistencies, Calcite adopts a *strict validation approach* to 
avoid user confusion and ensure predictable behavior.
h4. Overflow Behavior
 * {*}Silent Truncation{*}: Overflow bits are silently discarded (no exceptions 
thrown)
 * {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
 * {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow handling

h4. Null Handling
 * {*}Standard SQL Semantics{*}: NULL operands result in NULL output
 * {*}Consistent{*}: Follows existing Calcite null propagation patterns

h3. Database Compatibility Matrix
||Database||Negative Shift||Overflow||Implementation Status||
|*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges (strict 
validation)|
|*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict 
validation)|
|*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges 
(strict validation)|
|*BigQuery*|[Needs verification]|Silent truncation|✅ Aligned (BYTES support)|
h3. Test Coverage
 * {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and precedence
 * {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error 
conditions
 * {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
 * {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression 
tree handling
 * {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution scenarios
 * {*}Edge Cases{*}: Boundary conditions, type promotions, null handling

h3. Design Decisions & Rationales
h4. Why Silent Truncation?
 * {*}Industry Standard{*}: All major databases handle overflow this way
 * {*}Performance{*}: Avoids runtime exception overhead
 * {*}Predictability{*}: Consistent bit-level behavior across platforms

h4. Why Strict Negative Shift Validation?
 * {*}Database Inconsistency{*}: PostgreSQL, SQL Server, and likely others 
handle negative shifts completely differently
 * {*}User Clarity{*}: Exception with clear message prevents confusion about 
intended behavior
 * {*}Predictable Behavior{*}: Consistent error handling across all Calcite 
deployments
 * {*}Safety First{*}: Avoids potentially unexpected results from modular 
arithmetic or implicit right shifts

h4. Why ARG0 Return Type?
 * {*}Consistency{*}: Matches other bitwise operators in Calcite
 * {*}Control{*}: Users can explicitly cast for type promotion when needed
 * {*}Performance{*}: Avoids unnecessary type conversions

h3. VARBINARY Implementation Details
 * {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
 * {*}Length Preservation{*}: Result maintains same byte length as input
 * {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are discarded
 * {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior

h3. Future Extensions
 * {*}Configuration Options{*}: Optional strict mode for overflow checking
 * {*}Right Shift Operator{*}: Companion {{>>}} operator implementation

h3. Breaking Changes
 * {*}None{*}: This is a new feature addition with no existing functionality 
changes

This proposal ensures Calcite's left shift operator behaves consistently with 
mainstream SQL databases while maintaining performance and providing clear, 
predictable semantics.


> Add support for << operator in Calcite
> --------------------------------------
>
>                 Key: CALCITE-7109
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7109
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: krooswu
>            Assignee: krooswu
>            Priority: Major
>              Labels: pull-request-available
>
> h1. Proposal: Add support for << (bitwise left shift) operator
> h2. Overview
> Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in 
> Apache Calcite, following mainstream database semantics.
> h2. Implementation Details
> h3. Core Components
>  * {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with 
> appropriate precedence (32, matching standard shift operators)
>  * {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in 
> {{SqlStdOperatorTable}}
>  * {*}Integration{*}: Support in {{{}SqlToRelConverter{}}}, 
> {{{}RexBuilder{}}}, and code generation
>  * {*}Type System{*}: Add type inference for all INTEGER family types 
> (TINYINT, SMALLINT, INTEGER, BIGINT)
> h3. Type Handling Strategy
>  * {*}Return Type{*}: Result type matches left operand type 
> ({{{}ReturnTypes.ARG0_NULLABLE{}}})
>  * {*}Mixed Type Promotion{*}: When operand types differ, promote to 
> accommodate both operands
>  * {*}Supported Types{*}:
>  ** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT)
>  ** VARBINARY types (following BigQuery semantics)
> h3. Behavioral Semantics
> h4. Shift Count Handling
>  * {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to 
> inconsistent database behaviors
> {{SELECT 8 << -1; – Throws exception for clarity and safety}}
>  * {*}Large Shift{*}: Shift counts >= type bit width return 0
> {{SELECT 1 << 40; – Returns 0 for INTEGER types}}
> h4. Cross-Database Negative Shift Behavior Analysis
> Testing reveals significant inconsistencies in negative shift handling:
> {*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)
> Negative shifts
> {code:java}
> SELECT 1 << -1; – Returns -2147483648 (equivalent to 1 << 31)
> SELECT 1 << -2; – Returns 1073741824 (equivalent to 1 << 30)
> SELECT 1 << -3; – Returns 536870912 (equivalent to 1 << 29){code}
> – Large positive shifts 
> {code:java}
> SELECT 1 << 40; – Returns 256 (equivalent to 1 << 8)
> SELECT 1 << 33; – Returns 2 (equivalent to 1 << 1)}}{code}
> {*}SQL Server{*}: Treats negative shift as right shift
> Negative shifts
> {code:java}
> SELECT 8 << -1; – Equivalent to 8 >> 1, returns 4{code}
> {*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large 
> positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
> SELECT 1 << -1; – Returns 0 (safe handling)
> SELECT 1 << -2; – Returns 0 (safe handling){code}
> Given these inconsistencies, Calcite adopts a *strict validation approach* to 
> avoid user confusion and ensure predictable behavior.
> h4. Overflow Behavior
>  * {*}Silent Truncation{*}: Overflow bits are silently discarded (no 
> exceptions thrown)
>  * {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
>  * {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow 
> handling
> h4. Null Handling
>  * {*}Standard SQL Semantics{*}: NULL operands result in NULL output
>  * {*}Consistent{*}: Follows existing Calcite null propagation patterns
> h3. Database Compatibility Matrix
> ||Database||Negative Shift||Overflow||Implementation Status||
> |*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges 
> (strict validation)|
> |*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict 
> validation)|
> |*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges 
> (strict validation)|
> |*BigQuery*|[Needs verification]|Silent truncation|✅ Aligned (BYTES support)|
> h3. Test Coverage
>  * {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and 
> precedence
>  * {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error 
> conditions
>  * {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
>  * {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression 
> tree handling
>  * {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution 
> scenarios
>  * {*}Edge Cases{*}: Boundary conditions, type promotions, null handling
> h3. Design Decisions & Rationales
> h4. Why Silent Truncation?
>  * {*}Industry Standard{*}: All major databases handle overflow this way
>  * {*}Performance{*}: Avoids runtime exception overhead
>  * {*}Predictability{*}: Consistent bit-level behavior across platforms
> h4. Why Strict Negative Shift Validation?
>  * {*}Database Inconsistency{*}: PostgreSQL, SQL Server, and likely others 
> handle negative shifts completely differently
>  * {*}User Clarity{*}: Exception with clear message prevents confusion about 
> intended behavior
>  * {*}Predictable Behavior{*}: Consistent error handling across all Calcite 
> deployments
>  * {*}Safety First{*}: Avoids potentially unexpected results from modular 
> arithmetic or implicit right shifts
> h4. Why ARG0 Return Type?
>  * {*}Consistency{*}: Matches other bitwise operators in Calcite
>  * {*}Control{*}: Users can explicitly cast for type promotion when needed
>  * {*}Performance{*}: Avoids unnecessary type conversions
> h3. VARBINARY Implementation Details
>  * {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
>  * {*}Length Preservation{*}: Result maintains same byte length as input
>  * {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are 
> discarded
>  * {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior
> h3. Future Extensions
>  * {*}Configuration Options{*}: Optional strict mode for overflow checking
>  * {*}Right Shift Operator{*}: Companion {{>>}} operator implementation
> h3. Breaking Changes
>  * {*}None{*}: This is a new feature addition with no existing functionality 
> changes
> This proposal ensures Calcite's left shift operator behaves consistently with 
> mainstream SQL databases while maintaining performance and providing clear, 
> predictable semantics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to