stayrascal commented on issue #8038:
URL: https://github.com/apache/hudi/issues/8038#issuecomment-1443400947

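   And in the Trino case, a count on the RT table does not return all records: the `_rt` table returns the same count as the `_ro` table, and the table without a suffix returns 0.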
   
   ```
   trino> select count(*) from hive.hudi_hms_db.flink_hudi_mor_streaming_tbl_rt;
    _col0
   -------
     8163
   (1 row)
   
   Query 20230224_100730_00008_sxxri, FINISHED, 2 nodes
   Splits: 21 total, 21 done (100.00%)
   0.65 [8.16K rows, 1.72MB] [12.6K rows/s, 2.66MB/s]
   
   trino> select count(*) from hive.hudi_hms_db.flink_hudi_mor_streaming_tbl_ro;
    _col0
   -------
     8163
   (1 row)
   
   Query 20230224_100735_00009_sxxri, FINISHED, 2 nodes
   Splits: 21 total, 21 done (100.00%)
   0.61 [8.16K rows, 1.72MB] [13.3K rows/s, 2.8MB/s]
   
   trino> select count(*) from hive.hudi_hms_db.flink_hudi_mor_streaming_tbl;
    _col0
   -------
        0
   (1 row)
   
   Query 20230224_100738_00010_sxxri, FINISHED, 2 nodes
   Splits: 18 total, 18 done (100.00%)
   0.56 [0 rows, 0B] [0 rows/s, 0B/s]
   ```
   
   And if the MOR table hasn't gone through any compaction yet, a query on the RT table throws an exception saying the base file does not exist. Is this expected behavior?
   
   ```
   trino> select * from hive.hudi_hms_db.flink_hudi_mor_tbl_rt;
   
   Query 20230224_100913_00011_sxxri, FAILED, 2 nodes
   Splits: 4 total, 0 done (0.00%)
   0.51 [0 rows, 0B] [0 rows/s, 0B/s]
   
   Query 20230224_100913_00011_sxxri failed: Not valid Parquet file: hdfs://xxxxxxx/hive/hudi_hms_db/flink_hudi_mor_tbl/par3/.83b4db58-a84b-40b5-b38d-d79acfa8db3c_20230216160153391.log.1_0-1-0 expected magic number: PAR1 got: #
   ```
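
The error above shows the Parquet reader being pointed at a `.log` file and finding `#` where it expected `PAR1`. The following is a minimal sketch for checking what a given file actually starts with. It assumes Hudi log files begin with the bytes `#HUDI#` (consistent with the `#` in the message) and that the file has first been copied to local disk, for example with `hdfs dfs -get`. `sniff_format` is a hypothetical helper, not part of Hudi or Trino.

```python
PARQUET_MAGIC = b"PAR1"      # Parquet files begin (and end) with these 4 bytes
HUDI_LOG_MAGIC = b"#HUDI#"   # assumption: magic bytes at the start of a Hudi log block


def sniff_format(path):
    """Classify a local file by its leading bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(len(HUDI_LOG_MAGIC))
    if head.startswith(PARQUET_MAGIC):
        return "parquet"
    if head.startswith(HUDI_LOG_MAGIC):
        return "hudi-log"
    return "unknown"
```

If this reports `hudi-log` for the path in the error, the query engine is reading a log file with a plain Parquet reader, which would explain the magic-number mismatch independently of whether a base file exists.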

