[ 
https://issues.apache.org/jira/browse/CALCITE-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010238#comment-18010238
 ] 

krooswu commented on CALCITE-7109:
----------------------------------

Thank you for the feedback! [~mbudiu] 

Based on your suggestion, I plan to follow a more incremental approach:
 # {*}Introduce a new built-in function named {{BIT_SHIFT_SHIFT}}{*}, 
consistent with existing functions like {{{}BIT_AND{}}}, {{{}BIT_OR{}}}, etc. 
This allows users to perform bitwise left shifts in a clear and standard way, 
without introducing a new operator at this stage.

 # Once this function is accepted and merged, I will *submit a follow-up PR to 
add the {{<<}} operator* as syntactic sugar that maps to the same 
implementation. This should make the feature more convenient for users, while 
maintaining separation between semantic support and syntactic enhancement.

This approach will also make it easier to extend support for other types (e.g., 
byte[], decimal) in a modular way going forward.

Please let me know if this plan sounds good to you. I'm happy to update the PR 
accordingly.

> Add support for << operator in Calcite
> --------------------------------------
>
>                 Key: CALCITE-7109
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7109
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: krooswu
>            Assignee: krooswu
>            Priority: Major
>              Labels: pull-request-available
>
> h2. Overview
> Add comprehensive support for the bitwise left shift operator ({{{}<<{}}}) in 
> Apache Calcite, following mainstream database semantics.
> h2. Implementation Details
> h3. Core Components
>  * {*}Parser Extension{*}: Extend parser to recognize {{<<}} operator with 
> appropriate precedence (32, matching standard shift operators)
>  * {*}Operator Definition{*}: Add {{SqlBinaryOperator LEFTSHIFT}} in 
> {{SqlStdOperatorTable}}
>  * {*}Integration{*}: Support in {{{}SqlToRelConverter{}}}, 
> {{{}RexBuilder{}}}, and code generation
>  * {*}Type System{*}: Add type inference for all INTEGER family types 
> (TINYINT, SMALLINT, INTEGER, BIGINT)
> h3. Type Handling Strategy
>  * {*}Return Type{*}: Always BIGINT ({{{}ReturnTypes.BIGINT_NULLABLE{}}}) to 
> prevent overflow
>  * {*}Mixed Type Promotion{*}: All INTEGER family operands promoted to BIGINT 
> for safety
>  * {*}Supported Types{*}:
>  ** INTEGER family types (TINYINT, SMALLINT, INTEGER, BIGINT) - all results 
> as BIGINT
>  ** VARBINARY types (following BigQuery semantics) - maintain original type
> h3. Behavioral Semantics
> h4. Shift Count Handling
>  * {*}Negative Shift{*}: Throws {{IllegalArgumentException}} due to 
> inconsistent database behaviors
> {{SELECT 8 << -1; – Throws exception for clarity and safety}}
>  * {*}Large Shift{*}: Shift counts >= type bit width return 0
> {{SELECT 1 << 40; – Returns 0 for INTEGER types}}
> h4. Cross-Database Negative Shift Behavior Analysis
> Testing reveals significant inconsistencies in negative shift handling:
> {*}PostgreSQL{*}: Uses modular arithmetic (takes low 5 bits for all shifts)
> {code:java}
> SELECT 1 << -1; – Returns -2147483648 (equivalent to 1 << 31)
> SELECT 1 << -2; – Returns 1073741824 (equivalent to 1 << 30)
> SELECT 1 << -3; – Returns 536870912 (equivalent to 1 << 29){code}
> – Large positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 256 (equivalent to 1 << 8)
> SELECT 1 << 33; – Returns 2 (equivalent to 1 << 1)}}{code}
> {*}SQL Server{*}: Treats negative shift as right shift
> {code:java}
> SELECT 8 << -1; – Equivalent to 8 >> 1, returns 4{code}
> {*}MySQL{*}: Returns 0 for negative shifts, normal calculation for large 
> positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
> SELECT 1 << -1; – Returns 0 (safe handling)
> SELECT 1 << -2; – Returns 0 (safe handling)}}
> {code}
> *SQL Server* (treats negative as right shift, large shifts as 0):
> {code:java}
> SELECT 8 << -1; – Returns 4 (equivalent to 8 >> 1)
> SELECT 1 << 40; – Returns 0 (shifts >= 32 yield 0)}}
> {code}
> {*}ClickHouse{*}: Throws error for negative shifts, normal calculation for 
> large positive shifts
> {code:java}
> SELECT 1 << 40; – Returns 1099511627776 (normal calculation)
> SELECT 1 << -1; – Error (exception thrown)}}
> {code}
>  
> Given that {*}all major modern/cloud databases (BigQuery, Snowflake, 
> ClickHouse) use strict validation{*}, Calcite adopts the same approach as the 
> clear industry standard for contemporary SQL engines.
> h4. Overflow Behavior
>  * {*}Silent Truncation{*}: Overflow bits are silently discarded (no 
> exceptions thrown)
>  * {*}Natural Bit Operations{*}: Follow Java's natural bit truncation behavior
>  * {*}Alignment{*}: Matches PostgreSQL, MySQL, and SQL Server overflow 
> handling
> h4. Null Handling
>  * {*}Standard SQL Semantics{*}: NULL operands result in NULL output
>  * {*}Consistent{*}: Follows existing Calcite null propagation patterns
> h3. Database Compatibility Matrix
> ||Database||Negative Shift||Overflow||Implementation Status||
> |*PostgreSQL*|Modular arithmetic (unique)|Silent truncation|❌ Diverges 
> (strict validation)|
> |*MySQL*|Returns 0|No limit on large positive shifts|❌ Diverges (strict 
> validation)|
> |*SQL Server*|Treated as right shift|Returns 0 for large shifts|❌ Diverges 
> (strict validation)|
> |*Snowflake*|Error (throws exception)|No limit on large positive shifts|✅ 
> Aligned (strict validation)|
> |*BigQuery*|Error (throws exception)|No limit on large positive shifts|✅ 
> Aligned (strict validation)|
> |*Clickhouse*| Error (throws exception)|No limit on large positive shifts|✅ 
> Aligned (strict validation)|
> h3. Test Coverage
>  * {*}Parsing Tests{*}: {{SqlParserTest}} - operator recognition and 
> precedence
>  * {*}Validation Tests{*}: {{SqlValidatorTest}} - type checking and error 
> conditions
>  * {*}Execution Tests{*}: {{SqlOperatorTest}} - runtime behavior verification
>  * {*}Rex Tests{*}: {{{}RexImpTableTest{}}}, {{RexProgramTest}} - expression 
> tree handling
>  * {*}Integration Tests{*}: {{operator.iq}} - end-to-end SQL execution 
> scenarios
>  * {*}Edge Cases{*}: Boundary conditions, type promotions, null handling
> h3. Design Decisions & Rationales
> h4. Why Silent Truncation?
>  * {*}Industry Standard{*}: All major databases handle overflow this way
>  * {*}Performance{*}: Avoids runtime exception overhead
>  * {*}Predictability{*}: Consistent bit-level behavior across platforms
> h4. Why Strict Negative Shift Validation?
>  * {*}Modern Database Consensus{*}: BigQuery, Snowflake, and ClickHouse (all 
> major modern/cloud SQL engines) throw exceptions for negative shifts
>  * {*}Industry Standard{*}: This is clearly the established pattern for 
> contemporary database design
>  * {*}Clear Evolution{*}: Legacy databases have quirky behaviors, modern 
> databases use strict validation
>  * {*}User Clarity{*}: Exception with clear message prevents confusion about 
> intended behavior
>  * {*}Future-Proof{*}: Aligns with the direction the SQL industry is moving
> h4. Why Always Return BIGINT?
>  * {*}Modern Database Alignment{*}: BigQuery, Snowflake, and ClickHouse all 
> handle large shift results naturally
>  * {*}MySQL Pattern{*}: MySQL always returns unsigned 64-bit integers for bit 
> operations
>  * {*}Overflow Prevention{*}: Eliminates unexpected overflow for common 
> operations like {{1000000 << 10}}
>  * {*}User Convenience{*}: No need for explicit casting to avoid overflow
>  * {*}Industry Trend{*}: Contemporary databases favor safety over 
> micro-optimization
> h3. VARBINARY Implementation Details
>  * {*}Byte-level Shifting{*}: Shifts entire bytes within the binary array
>  * {*}Length Preservation{*}: Result maintains same byte length as input
>  * {*}Overflow Handling{*}: Bytes shifted beyond array boundaries are 
> discarded
>  * {*}BigQuery Semantics{*}: Follows BigQuery's BYTES shifting behavior
> h3. Future Extensions
>  * {*}Right Shift Operator{*}: Companion {{>>}} operator implementation
> h3. Breaking Changes
>  * {*}None{*}: This is a new feature addition with no existing functionality 
> changes
> This proposal ensures Calcite's left shift operator behaves consistently with 
> mainstream SQL databases while maintaining performance and providing clear, 
> predictable semantics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to