[GitHub] [iceberg] maytasm commented on pull request #8412: Fix Iceberg to handle literal short and byte

via GitHub Fri, 01 Sep 2023 00:00:15 -0700


maytasm commented on PR #8412:
URL: https://github.com/apache/iceberg/pull/8412#issuecomment-1702264636


   > I tested the master code in Spark 3.4.0 and it works correctly.
   > 
   > ```scala
   > scala> spark.sql("create table local.db.test_short(id short) using iceberg").show()
   > ++
   > ||
   > ++
   > ++
   > 
   > scala> spark.sql("insert into local.db.test_short values(1), (2), (3)").show()
   > ++
   > ||
   > ++
   > ++
   > 
   > scala> spark.sql("select * from local.db.test_short").show()
   > +---+
   > | id|
   > +---+
   > |  1|
   > |  2|
   > |  3|
   > +---+
   > 
   > scala> spark.sql("select * from local.db.test_short where id = 2").show()
   > +---+
   > | id|
   > +---+
   > |  2|
   > +---+
   > ```
   > 
   > It seems Iceberg promotes the short to int:
   > 
   > ```scala
   > scala> spark.sql("desc table local.db.test_short").show()
   > +--------+---------+-------+
   > |col_name|data_type|comment|
   > +--------+---------+-------+
   > |      id|      int|   null|
   > +--------+---------+-------+
   > ```
   
   What table format are you using?
   I encountered this problem on a Hive table with tinyint / smallint columns.
   For example, I ran `create table hive.mmonsereenusorn.mmonsereenusorn_test_numbers_3 (a tinyint, b smallint, c int, d bigint, e real, f double, g double precision, h decimal)` in Trino to create the Hive table.
   In Spark, this table has tinyint and smallint columns:
   ```
   spark-sql-3.3> desc mmonsereenusorn.mmonsereenusorn_test_numbers_3;
   a                    tinyint                                     
   b                    smallint                                    
   c                    int                                         
   d                    bigint                                      
   e                    float                                       
   f                    double                                      
   g                    double                                      
   h                    decimal(38,0)                               
                                                                    
   # Partitioning                                                   
   Not partitioned                                                  
   Time taken: 4.125 seconds, Fetched 11 row(s)
   ```
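
   To make the difference from the `short` example above concrete, here is a rough sketch in plain Scala, not the code in this PR. It assumes that a filter on column `a` or `b` carries a `Byte` or `Short` value, and that widening it to `Int` (matching the `int` promotion shown in the quoted `desc` output) is enough for it to be handled. `LiteralWidening` and `widen` are hypothetical names used only for illustration; they are not part of Iceberg or Spark.

   ```scala
   // Hypothetical helper, not an Iceberg or Spark API.
   object LiteralWidening {
     // Widen Byte and Short values to Int; leave every other value unchanged.
     def widen(value: Any): Any = value match {
       case b: Byte  => b.toInt
       case s: Short => s.toInt
       case other    => other
     }
   }

   // tinyint and smallint values would surface on the JVM as Byte and Short.
   val fromTinyint  = LiteralWidening.widen(1.toByte)
   val fromSmallint = LiteralWidening.widen(2.toShort)
   val fromBigint   = LiteralWidening.widen(3L)
   ```

   With this, values from tinyint and smallint columns are compared as `Int`, while `Long` and other types pass through untouched.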
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

