Maplejw opened a new issue, #13221:
URL: https://github.com/apache/hudi/issues/13221

   **Describe the problem you faced**
   
   I created a Hudi table with ORC as the base file format and inserted data into it successfully.
   But when I query the table, it fails with an error saying the file is not a Parquet file.
   
   
![Image](https://github.com/user-attachments/assets/dbb974ee-4c20-4466-9b57-6f3625366a15)
   
   
   
    
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   I start spark-sql as follows:
   ```
   export SPARK_VERSION=3.4
   spark-sql --packages org.apache.hudi:hudi-spark$SPARK_VERSION-bundle_2.12:1.0.1 \
   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
   --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
   --conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar'
   ```
   
   
   Create the table with ORC as the base file format and insert data:
   ```
   CREATE TABLE `carbus_dw`.`dwd_client_behavior_hudi_5` (
     `user_id` BIGINT,
     `device_id` STRING,
     `network_type` STRING,
     `behavior_id` STRING,
     `sub_behavior_id` STRING,
     `app_version` STRING,
     `sys_version` STRING,
     `using_time` BIGINT,
     `proc_time` BIGINT,
     `nonce_id` STRING,
     `type` INT,
     `proc_date` INT)
   USING hudi
   PARTITIONED BY (proc_date)
   LOCATION 'gs://igg-rd8-data-project/carbus/dw/dwd/client_behavior_hudi_5'
   TBLPROPERTIES (
     'hoodie.base.file.format' = 'ORC',
     'hoodie.table.base.file.format' = 'ORC',
     'type' = 'cow');
   
   insert overwrite table carbus_dw.dwd_client_behavior_hudi_5 partition(proc_date=20250421)
   select user_id,device_id,network_type,behavior_id,sub_behavior_id,
   app_version,sys_version,using_time,proc_time,nonce_id,type
   from carbus_dw.dwd_client_behavior where proc_date=20250421;
   ```
   Then run the query:
   ```
   select count(0) from carbus_dw.dwd_client_behavior_hudi_5 where proc_date=20250421;
   ```
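
One way to narrow this down is to check what was actually written under the partition path. The sketch below is only an illustration and is not part of Hudi: it reads the leading magic bytes of a base file that has been copied to local disk (for example with `gsutil cp`) and reports whether it looks like Parquet (`PAR1`) or ORC (`ORC`). The function name `detect_format` and the example file name are hypothetical.

```python
PARQUET_MAGIC = b"PAR1"  # Parquet files begin (and end) with these 4 bytes
ORC_MAGIC = b"ORC"       # ORC files begin with these 3 bytes


def detect_format(path):
    """Return 'parquet', 'orc', or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head == PARQUET_MAGIC:
        return "parquet"
    if head.startswith(ORC_MAGIC):
        return "orc"
    return "unknown"


# Example (hypothetical local copy of one base file from the partition):
# print(detect_format("some_base_file_from_proc_date_20250421"))
```

If the base files report `orc`, that would suggest the write honored the table property and the failure is on the read path; if they report `parquet`, the property was likely not applied at write time.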
   
   
   **Expected behavior**
   
   The query should run successfully against the ORC table.
   
   **Environment Description**
   
   * Hudi version : 1.0.1
   
   * Spark version : 3.4.4
   
   * Hive version : 2.9.2
   
   * Hadoop version : 2.9.2
   
   * Storage (HDFS/S3/GCS..) : GCS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   None
   
   **Stacktrace**
   
   ```Add the stacktrace of the error.```
   
   

