codejoyan commented on issue #5231:
URL: https://github.com/apache/hudi/issues/5231#issuecomment-1090602244

   @alexeykudinkin here are the contents of the `.hoodie` directory, the data file listing, and the record counts per data file.
   Let me know if you need any further info.
   
   **.hoodie directory content**
   ```
   root@adhoc-2:/opt# hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/.hoodie/
   Found 14 items
   drwxr-xr-x   - root supergroup          0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/.aux
   drwxr-xr-x   - root supergroup          0 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/.temp
   -rw-r--r--   1 root supergroup       4442 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182630323.commit
   -rw-r--r--   1 root supergroup          0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182630323.commit.requested
   -rw-r--r--   1 root supergroup       3017 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182630323.inflight
   -rw-r--r--   1 root supergroup       2825 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182714571.commit
   -rw-r--r--   1 root supergroup          0 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182714571.commit.requested
   -rw-r--r--   1 root supergroup       3131 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182714571.inflight
   -rw-r--r--   1 root supergroup       2823 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182741563.commit
   -rw-r--r--   1 root supergroup          0 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182741563.commit.requested
   -rw-r--r--   1 root supergroup       3129 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182741563.inflight
   drwxr-xr-x   - root supergroup          0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/archived
   -rw-r--r--   1 root supergroup        512 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/hoodie.properties
   drwxr-xr-x   - root supergroup          0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/metadata
   ```
   
   **Data Files Listing:**
   ```
   root@adhoc-2:/opt# hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/2018/08/31/
   Found 4 items
   -rw-r--r--   1 root supergroup         96 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2018/08/31/.hoodie_partition_metadata
   -rw-r--r--   1 root supergroup     443929 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182714571.parquet
   -rw-r--r--   1 root supergroup     443651 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet
   -rw-r--r--   1 root supergroup     443927 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_1-35-37_20220406182630323.parquet
   root@adhoc-2:/opt# hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/2019/08/31/
   Found 2 items
   -rw-r--r--   1 root supergroup         96 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2019/08/31/.hoodie_partition_metadata
   -rw-r--r--   1 root supergroup     443971 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2019/08/31/258177c0-b9eb-43be-9fad-7c1d57dd4279-0_0-35-36_20220406182630323.parquet
   ```
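
For readers comparing the listing against the commits, here is a minimal Python sketch that splits the base file names into their parts and picks the newest base file per file group. It assumes the `<fileId>_<writeToken>_<instantTime>.parquet` naming that the files above follow; `parse_base_file_name` and `latest_per_file_group` are hypothetical helper names, not Hudi APIs.

```python
# File names copied from the two partition listings above.
BASE_FILES = [
    "c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182714571.parquet",
    "c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet",
    "c872d135-bf8f-4c5e-9eee-6347635c32d3-0_1-35-37_20220406182630323.parquet",
    "258177c0-b9eb-43be-9fad-7c1d57dd4279-0_0-35-36_20220406182630323.parquet",
]


def parse_base_file_name(name):
    """Split '<fileId>_<writeToken>_<instantTime>.parquet' into its three parts.

    Assumption: none of the parts contains an underscore, which holds for
    the names in this listing.
    """
    stem = name.rsplit(".", 1)[0]
    file_id, write_token, instant_time = stem.split("_")
    return file_id, write_token, instant_time


def latest_per_file_group(names):
    """Map each file ID to the base file name with the newest instant time.

    Instant times are fixed-width digit strings here, so plain string
    comparison orders them chronologically.
    """
    latest = {}
    for name in names:
        file_id, _token, instant = parse_base_file_name(name)
        current = latest.get(file_id)
        if current is None or instant > parse_base_file_name(current)[2]:
            latest[file_id] = name
    return latest


if __name__ == "__main__":
    for file_id, name in latest_per_file_group(BASE_FILES).items():
        print(file_id, "->", name)
```

With these inputs, the three files in `2018/08/31` share one file ID, so they are three versions of the same file group, one per commit in the `.hoodie` listing.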
   
   **Record count per data file**
   ```
   scala> spark.sql("select _hoodie_file_name, date, count(1) from stock_ticks_cow group by _hoodie_file_name, date").show(false);
   +------------------------------------------------------------------------+----------+--------+
   |_hoodie_file_name                                                       |date      |count(1)|
   +------------------------------------------------------------------------+----------+--------+
   |c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet|2018/08/31|99      |
   |c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182714571.parquet|2018/08/31|98      |
   |258177c0-b9eb-43be-9fad-7c1d57dd4279-0_0-35-36_20220406182630323.parquet|2019/08/31|197     |
   +------------------------------------------------------------------------+----------+--------+
   ```
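
A rough Python sketch for reading the result above: it groups the returned rows by file ID and reports any file group whose rows carry more than one instant time. It assumes the `<fileId>_<writeToken>_<instantTime>.parquet` naming seen in the listing; `instants_by_file_group` is a hypothetical helper name, and the rows are copied by hand from the query output.

```python
from collections import defaultdict

# (_hoodie_file_name, date, count) rows copied from the query output above.
ROWS = [
    ("c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet", "2018/08/31", 99),
    ("c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182714571.parquet", "2018/08/31", 98),
    ("258177c0-b9eb-43be-9fad-7c1d57dd4279-0_0-35-36_20220406182630323.parquet", "2019/08/31", 197),
]


def instants_by_file_group(rows):
    """Collect the set of instant times seen for each file ID."""
    groups = defaultdict(set)
    for name, _date, _count in rows:
        # Drop the extension, then split into fileId, writeToken, instantTime.
        file_id, _token, instant = name.rsplit(".", 1)[0].split("_")
        groups[file_id].add(instant)
    return groups


# File groups whose rows reference more than one instant time.
MULTI_VERSION = {
    file_id: sorted(instants)
    for file_id, instants in instants_by_file_group(ROWS).items()
    if len(instants) > 1
}

if __name__ == "__main__":
    for file_id, instants in MULTI_VERSION.items():
        print(file_id, instants)
```

On these rows the only file group reported is `c872d135-bf8f-4c5e-9eee-6347635c32d3-0`, whose records point at both the `20220406182714571` and `20220406182741563` files.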

