[
https://issues.apache.org/jira/browse/CALCITE-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
krooswu updated CALCITE-7109:
-----------------------------
Description:
h1. Proposal: Add support for << (bitwise left shift) operator
h2. Overview
Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in
Apache Calcite, following mainstream database semantics.
h2. Implementation Details
h3. Core Components
* {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with
appropriate precedence (32, matching standard shift operators)
* {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in
{{SqlStdOperatorTable}}
* {*}Integration{*}: Support in {{{}SqlToRelConverter{}}}, {{{}RexBuilder{}}},
and code generation
* {*}Type System{*}: Add type inference for all INTEGER family types (TINYINT,
SMALLINT, INTEGER, BIGINT)
h3. Type Handling Strategy
* {*}Return Type{*}: Always BIGINT ({{{}ReturnTypes.BIGINT_NULLABLE{}}}) to
prevent overflow
* {*}Mixed Type Promotion{*}: All INTEGER family operands promoted to BIGINT
for safety
* {*}Supported Types{*}:
** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT) - all results as
BIGINT
** VARBINARY types (following BigQuery semantics) - maintain original type
h3. Behavioral Semantics
h4. Shift Count Handling
* {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to
inconsistent database behaviors
{{SELECT 8 << -1; – Throws exception for clarity and safety}}
* {*}Large Shift{*}: Shift counts >= type bit width return 0
{{SELECT 1 << 40; – Returns 0 for INTEGER types}}
h4. Cross-Database Negative Shift Behavior Analysis
Testing reveals significant inconsistencies in negative shift handling:
{*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)
{code:java}
SELECT 1 << -1; – Returns -2147483648 (equivalent to 1 << 31)
SELECT 1 << -2; – Returns 1073741824 (equivalent to 1 << 30)
SELECT 1 << -3; – Returns 536870912 (equivalent to 1 << 29){code}
– Large positive shifts
{code:java}
SELECT 1 << 40; – Returns 256 (equivalent to 1 << 8)
SELECT 1 << 33; – Returns 2 (equivalent to 1 << 1)}}{code}
{*}SQL Server{*}: Treats negative shift as right shift
{code:java}
SELECT 8 << -1; – Equivalent to 8 >> 1, returns 4{code}
{*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large
positive shifts
{code:java}
SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
SELECT 1 << -1; – Returns 0 (safe handling)
SELECT 1 << -2; – Returns 0 (safe handling)}}
{code}
*SQL Server* (treats negative as right shift, large shifts as 0):
{code:java}
SELECT 8 << -1; – Returns 4 (equivalent to 8 >> 1)
SELECT 1 << 40; – Returns 0 (shifts >= 32 yield 0)}}
{code}
{*}ClickHouse{*}: Throws error for negative shifts, normal calculation for
large positive shifts
{code:java}
SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
SELECT 1 << -1; – Error (exception thrown)}}
{code}
Given that {*}all major modern/cloud databases (BigQuery, Snowflake,
ClickHouse) use strict validation{*}, Calcite adopts the same approach as the
clear industry standard for contemporary SQL engines.
h4. Overflow Behavior
* {*}Silent Truncation{*}: Overflow bits are silently discarded (no exceptions
thrown)
* {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
* {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow handling
h4. Null Handling
* {*}Standard SQL Semantics{*}: NULL operands result in NULL output
* {*}Consistent{*}: Follows existing Calcite null propagation patterns
h3. Database Compatibility Matrix
||Database||Negative Shift||Overflow||Implementation Status||
|*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges (strict
validation)|
|*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict
validation)|
|*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges
(strict validation)|
|*Snowflake*|Error (throws exception)|No limit on large positive shifts|✅
Aligned (strict validation)|
|*BigQuery*|Error (throws exception)|No limit on large positive shifts|✅
Aligned (strict validation)|
h3. Test Coverage
* {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and precedence
* {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error
conditions
* {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
* {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression
tree handling
* {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution scenarios
* {*}Edge Cases{*}: Boundary conditions, type promotions, null handling
h3. Design Decisions & Rationales
h4. Why Silent Truncation?
* {*}Industry Standard{*}: All major databases handle overflow this way
* {*}Performance{*}: Avoids runtime exception overhead
* {*}Predictability{*}: Consistent bit-level behavior across platforms
h4. Why Strict Negative Shift Validation?
* {*}Modern Database Consensus{*}: BigQuery, Snowflake, and ClickHouse (all
major modern/cloud SQL engines) throw exceptions for negative shifts
* {*}Industry Standard{*}: This is clearly the established pattern for
contemporary database design
* {*}Clear Evolution{*}: Legacy databases have quirky behaviors, modern
databases use strict validation
* {*}User Clarity{*}: Exception with clear message prevents confusion about
intended behavior
* {*}Future-Proof{*}: Aligns with the direction the SQL industry is moving
h4. Why Always Return BIGINT?
* {*}Modern Database Alignment{*}: BigQuery, Snowflake, and ClickHouse all
handle large shift results naturally
* {*}MySQL Pattern{*}: MySQL always returns unsigned 64-bit integers for bit
operations
* {*}Overflow Prevention{*}: Eliminates unexpected overflow for common
operations like {{1000000 << 10}}
* {*}User Convenience{*}: No need for explicit casting to avoid overflow
* {*}Industry Trend{*}: Contemporary databases favor safety over
micro-optimization
h3. VARBINARY Implementation Details
* {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
* {*}Length Preservation{*}: Result maintains same byte length as input
* {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are discarded
* {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior
h3. Future Extensions
* {*}Configuration Options{*}: Optional strict mode for overflow checking
* {*}Right Shift Operator{*}: Companion {{>>}} operator implementation
h3. Breaking Changes
* {*}None{*}: This is a new feature addition with no existing functionality
changes
This proposal ensures Calcite's left shift operator behaves consistently with
mainstream SQL databases while maintaining performance and providing clear,
predictable semantics.
was:
h1. Proposal: Add support for << (bitwise left shift) operator
h2. Overview
Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in
Apache Calcite, following mainstream database semantics.
h2. Implementation Details
h3. Core Components
* {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with
appropriate precedence (32, matching standard shift operators)
* {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in
{{SqlStdOperatorTable}}
* {*}Integration{*}: Support in {{{}SqlToRelConverter{}}}, {{{}RexBuilder{}}},
and code generation
* {*}Type System{*}: Add type inference for all INTEGER family types (TINYINT,
SMALLINT, INTEGER, BIGINT)
h3. Type Handling Strategy
* {*}Return Type{*}: Always BIGINT ({{{}ReturnTypes.BIGINT_NULLABLE{}}}) to
prevent overflow
* {*}Mixed Type Promotion{*}: All INTEGER family operands promoted to BIGINT
for safety
* {*}Supported Types{*}:
** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT) - all results as
BIGINT
** VARBINARY types (following BigQuery semantics) - maintain original type
h3. Behavioral Semantics
h4. Shift Count Handling
* {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to
inconsistent database behaviors
{{SELECT 8 << -1; – Throws exception for clarity and safety}}
* {*}Large Shift{*}: Shift counts >= type bit width return 0
{{SELECT 1 << 40; – Returns 0 for INTEGER types}}
h4. Cross-Database Negative Shift Behavior Analysis
Testing reveals significant inconsistencies in negative shift handling:
{*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)
{code:java}
SELECT 1 << -1; – Returns -2147483648 (equivalent to 1 << 31)
SELECT 1 << -2; – Returns 1073741824 (equivalent to 1 << 30)
SELECT 1 << -3; – Returns 536870912 (equivalent to 1 << 29){code}
– Large positive shifts
{code:java}
SELECT 1 << 40; – Returns 256 (equivalent to 1 << 8)
SELECT 1 << 33; – Returns 2 (equivalent to 1 << 1)}}{code}
{*}SQL Server{*}: Treats negative shift as right shift
{code:java}
SELECT 8 << -1; – Equivalent to 8 >> 1, returns 4{code}
{*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large
positive shifts
{code:java}
SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
SELECT 1 << -1; – Returns 0 (safe handling)
SELECT 1 << -2; – Returns 0 (safe handling)}}
{code}
*SQL Server* (treats negative as right shift, large shifts as 0):
h1.
{code:java}
SELECT 8 << -1; – Returns 4 (equivalent to 8 >> 1)
SELECT 1 << 40; – Returns 0 (shifts >= 32 yield 0)}}
{code}
{*}ClickHouse{*}: Throws error for negative shifts, normal calculation for
large positive shifts
h1.
{code:java}
SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
SELECT 1 << -1; – Error (exception thrown)}}
{code}
Given that {*}all major modern/cloud databases (BigQuery, Snowflake,
ClickHouse) use strict validation{*}, Calcite adopts the same approach as the
clear industry standard for contemporary SQL engines.
h4. Overflow Behavior
* {*}Silent Truncation{*}: Overflow bits are silently discarded (no exceptions
thrown)
* {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
* {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow handling
h4. Null Handling
* {*}Standard SQL Semantics{*}: NULL operands result in NULL output
* {*}Consistent{*}: Follows existing Calcite null propagation patterns
h3. Database Compatibility Matrix
||Database||Negative Shift||Overflow||Implementation Status||
|*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges (strict
validation)|
|*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict
validation)|
|*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges
(strict validation)|
|*Snowflake*|Error (throws exception)|No limit on large positive shifts|✅
Aligned (strict validation)|
|*BigQuery*|Error (throws exception)|No limit on large positive shifts|✅
Aligned (strict validation)|
h3. Test Coverage
* {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and precedence
* {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error
conditions
* {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
* {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression
tree handling
* {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution scenarios
* {*}Edge Cases{*}: Boundary conditions, type promotions, null handling
h3. Design Decisions & Rationales
h4. Why Silent Truncation?
* {*}Industry Standard{*}: All major databases handle overflow this way
* {*}Performance{*}: Avoids runtime exception overhead
* {*}Predictability{*}: Consistent bit-level behavior across platforms
h4. Why Strict Negative Shift Validation?
* {*}Modern Database Consensus{*}: BigQuery, Snowflake, and ClickHouse (all
major modern/cloud SQL engines) throw exceptions for negative shifts
* {*}Industry Standard{*}: This is clearly the established pattern for
contemporary database design
* {*}Clear Evolution{*}: Legacy databases have quirky behaviors, modern
databases use strict validation
* {*}User Clarity{*}: Exception with clear message prevents confusion about
intended behavior
* {*}Future-Proof{*}: Aligns with the direction the SQL industry is moving
h4. Why Always Return BIGINT?
* {*}Modern Database Alignment{*}: BigQuery, Snowflake, and ClickHouse all
handle large shift results naturally
* {*}MySQL Pattern{*}: MySQL always returns unsigned 64-bit integers for bit
operations
* {*}Overflow Prevention{*}: Eliminates unexpected overflow for common
operations like {{1000000 << 10}}
* {*}User Convenience{*}: No need for explicit casting to avoid overflow
* {*}Industry Trend{*}: Contemporary databases favor safety over
micro-optimization
h3. VARBINARY Implementation Details
* {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
* {*}Length Preservation{*}: Result maintains same byte length as input
* {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are discarded
* {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior
h3. Future Extensions
* {*}Configuration Options{*}: Optional strict mode for overflow checking
* {*}Right Shift Operator{*}: Companion {{>>}} operator implementation
h3. Breaking Changes
* {*}None{*}: This is a new feature addition with no existing functionality
changes
This proposal ensures Calcite's left shift operator behaves consistently with
mainstream SQL databases while maintaining performance and providing clear,
predictable semantics.
> Add support for << operator in Calcite
> --------------------------------------
>
> Key: CALCITE-7109
> URL: https://issues.apache.org/jira/browse/CALCITE-7109
> Project: Calcite
> Issue Type: New Feature
> Reporter: krooswu
> Assignee: krooswu
> Priority: Major
> Labels: pull-request-available
>
> h1. Proposal: Add support for << (bitwise left shift) operator
> h2. Overview
> Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in
> Apache Calcite, following mainstream database semantics.
> h2. Implementation Details
> h3. Core Components
> * {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with
> appropriate precedence (32, matching standard shift operators)
> * {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in
> {{SqlStdOperatorTable}}
> * {*}Integration{*}: Support in {{{}SqlToRelConverter{}}},
> {{{}RexBuilder{}}}, and code generation
> * {*}Type System{*}: Add type inference for all INTEGER family types
> (TINYINT, SMALLINT, INTEGER, BIGINT)
> h3. Type Handling Strategy
> * {*}Return Type{*}: Always BIGINT ({{{}ReturnTypes.BIGINT_NULLABLE{}}}) to
> prevent overflow
> * {*}Mixed Type Promotion{*}: All INTEGER family operands promoted to BIGINT
> for safety
> * {*}Supported Types{*}:
> ** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT) - all results
> as BIGINT
> ** VARBINARY types (following BigQuery semantics) - maintain original type
> h3. Behavioral Semantics
> h4. Shift Count Handling
> * {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to
> inconsistent database behaviors
> {{SELECT 8 << -1; – Throws exception for clarity and safety}}
> * {*}Large Shift{*}: Shift counts >= type bit width return 0
> {{SELECT 1 << 40; – Returns 0 for INTEGER types}}
> h4. Cross-Database Negative Shift Behavior Analysis
> Testing reveals significant inconsistencies in negative shift handling:
> {*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)
> {code:java}
> SELECT 1 << -1; – Returns -2147483648 (equivalent to 1 << 31)
> SELECT 1 << -2; – Returns 1073741824 (equivalent to 1 << 30)
> SELECT 1 << -3; – Returns 536870912 (equivalent to 1 << 29){code}
> – Large positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 256 (equivalent to 1 << 8)
> SELECT 1 << 33; – Returns 2 (equivalent to 1 << 1)}}{code}
> {*}SQL Server{*}: Treats negative shift as right shift
> {code:java}
> SELECT 8 << -1; – Equivalent to 8 >> 1, returns 4{code}
> {*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large
> positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
> SELECT 1 << -1; – Returns 0 (safe handling)
> SELECT 1 << -2; – Returns 0 (safe handling)}}
> {code}
> *SQL Server* (treats negative as right shift, large shifts as 0):
> {code:java}
> SELECT 8 << -1; – Returns 4 (equivalent to 8 >> 1)
> SELECT 1 << 40; – Returns 0 (shifts >= 32 yield 0)}}
> {code}
> {*}ClickHouse{*}: Throws error for negative shifts, normal calculation for
> large positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
> SELECT 1 << -1; – Error (exception thrown)}}
> {code}
>
> Given that {*}all major modern/cloud databases (BigQuery, Snowflake,
> ClickHouse) use strict validation{*}, Calcite adopts the same approach as the
> clear industry standard for contemporary SQL engines.
> h4. Overflow Behavior
> * {*}Silent Truncation{*}: Overflow bits are silently discarded (no
> exceptions thrown)
> * {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
> * {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow
> handling
> h4. Null Handling
> * {*}Standard SQL Semantics{*}: NULL operands result in NULL output
> * {*}Consistent{*}: Follows existing Calcite null propagation patterns
> h3. Database Compatibility Matrix
> ||Database||Negative Shift||Overflow||Implementation Status||
> |*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges
> (strict validation)|
> |*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict
> validation)|
> |*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges
> (strict validation)|
> |*Snowflake*|Error (throws exception)|No limit on large positive shifts|✅
> Aligned (strict validation)|
> |*BigQuery*|Error (throws exception)|No limit on large positive shifts|✅
> Aligned (strict validation)|
> h3. Test Coverage
> * {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and
> precedence
> * {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error
> conditions
> * {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
> * {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression
> tree handling
> * {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution
> scenarios
> * {*}Edge Cases{*}: Boundary conditions, type promotions, null handling
> h3. Design Decisions & Rationales
> h4. Why Silent Truncation?
> * {*}Industry Standard{*}: All major databases handle overflow this way
> * {*}Performance{*}: Avoids runtime exception overhead
> * {*}Predictability{*}: Consistent bit-level behavior across platforms
> h4. Why Strict Negative Shift Validation?
> * {*}Modern Database Consensus{*}: BigQuery, Snowflake, and ClickHouse (all
> major modern/cloud SQL engines) throw exceptions for negative shifts
> * {*}Industry Standard{*}: This is clearly the established pattern for
> contemporary database design
> * {*}Clear Evolution{*}: Legacy databases have quirky behaviors, modern
> databases use strict validation
> * {*}User Clarity{*}: Exception with clear message prevents confusion about
> intended behavior
> * {*}Future-Proof{*}: Aligns with the direction the SQL industry is moving
> h4. Why Always Return BIGINT?
> * {*}Modern Database Alignment{*}: BigQuery, Snowflake, and ClickHouse all
> handle large shift results naturally
> * {*}MySQL Pattern{*}: MySQL always returns unsigned 64-bit integers for bit
> operations
> * {*}Overflow Prevention{*}: Eliminates unexpected overflow for common
> operations like {{1000000 << 10}}
> * {*}User Convenience{*}: No need for explicit casting to avoid overflow
> * {*}Industry Trend{*}: Contemporary databases favor safety over
> micro-optimization
> h3. VARBINARY Implementation Details
> * {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
> * {*}Length Preservation{*}: Result maintains same byte length as input
> * {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are
> discarded
> * {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior
> h3. Future Extensions
> * {*}Configuration Options{*}: Optional strict mode for overflow checking
> * {*}Right Shift Operator{*}: Companion {{>>}} operator implementation
> h3. Breaking Changes
> * {*}None{*}: This is a new feature addition with no existing functionality
> changes
> This proposal ensures Calcite's left shift operator behaves consistently with
> mainstream SQL databases while maintaining performance and providing clear,
> predictable semantics.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)