JingsongLi commented on issue #1215:
URL: https://github.com/apache/iceberg/issues/1215#issuecomment-660768444


   +1. We should handle these differences, and we can consider rewriting some 
`ParquetValueWriter`s.
   The same applies to the readers, and to Avro and ORC as well.
   
   This is not a small amount of work. I wonder whether we could implement 
conversions only for Flink's internal data structures, and just provide a 
converter from Flink `Row` to `RowData`.
   
   Here is a table of Flink Table/SQL data structures. The internal data 
structures are more efficient, while the external ones are more commonly used in UDFs.
   
   ```
   +--------------------------------+-----------------------------------------+-------------------+
   | SQL Data Types                 | Internal Data Structures (RowData)      | External (Row)    |
   +--------------------------------+-----------------------------------------+-------------------+
   | BOOLEAN                        | boolean                                 | boolean           |
   | CHAR / VARCHAR / STRING        | StringData                              | String            |
   | BINARY / VARBINARY / BYTES     | byte[]                                  | byte[]            |
   | DECIMAL                        | DecimalData                             | BigDecimal        |
   | TINYINT                        | byte                                    | byte              |
   | SMALLINT                       | short                                   | short             |
   | INT                            | int                                     | int               |
   | BIGINT                         | long                                    | long              |
   | FLOAT                          | float                                   | float             |
   | DOUBLE                         | double                                  | double            |
   | DATE                           | int (number of days since epoch)        | LocalDate         |
   | TIME                           | int (number of milliseconds of the day) | LocalTime         |
   | TIMESTAMP                      | TimestampData                           | LocalDateTime     |
   | TIMESTAMP WITH LOCAL TIME ZONE | TimestampData                           | Instant           |
   | INTERVAL YEAR TO MONTH         | int (number of months)                  | Period            |
   | INTERVAL DAY TO SECOND         | long (number of milliseconds)           | Duration          |
   | ROW / structured types         | RowData                                 | Row               |
   | ARRAY                          | ArrayData                               | T[]               |
   | MAP / MULTISET                 | MapData                                 | Map               |
   | RAW                            | RawValueData                            | T                 |
   +--------------------------------+-----------------------------------------+-------------------+
   ```
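
To make the external-to-internal mapping concrete, here is a minimal sketch of what such a converter would do for the date/time and interval rows of the table. It uses only `java.time`. The class `ExternalToInternal` and its method names are hypothetical and are not part of Flink or Iceberg; a real `Row` to `RowData` converter would also need to wrap values in `StringData`, `DecimalData`, `TimestampData`, and so on.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.Period;

/** Hypothetical helper (not a Flink or Iceberg API) illustrating the table's primitive encodings. */
final class ExternalToInternal {
  private ExternalToInternal() {}

  /** DATE: LocalDate -> int, number of days since 1970-01-01. */
  static int dateToInternal(LocalDate date) {
    return Math.toIntExact(date.toEpochDay());
  }

  /** TIME: LocalTime -> int, number of milliseconds of the day (sub-millisecond part is dropped). */
  static int timeToInternal(LocalTime time) {
    return (int) (time.toNanoOfDay() / 1_000_000L);
  }

  /** Year-month interval: Period -> int, number of months (the days field is ignored). */
  static int yearMonthIntervalToInternal(Period period) {
    return Math.toIntExact(period.toTotalMonths());
  }

  /** Day-time interval: Duration -> long, number of milliseconds. */
  static long dayTimeIntervalToInternal(Duration duration) {
    return duration.toMillis();
  }

  public static void main(String[] args) {
    System.out.println(dateToInternal(LocalDate.of(1970, 1, 11)));         // 10
    System.out.println(timeToInternal(LocalTime.of(0, 0, 1)));             // 1000
    System.out.println(yearMonthIntervalToInternal(Period.of(1, 2, 0)));   // 14
    System.out.println(dayTimeIntervalToInternal(Duration.ofSeconds(90))); // 90000
  }
}
```

Writers that operate directly on these primitive encodings avoid allocating `LocalDate`, `LocalTime`, `Period`, or `Duration` objects per value, which is one reason the internal structures are cheaper.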




