codejoyan commented on issue #5231:
URL: https://github.com/apache/hudi/issues/5231#issuecomment-1090602244
@alexeykudinkin here are the contents of the `.hoodie` directory, the data file
listings, and the record counts per data file.
Let me know if you need any further info.
**.hoodie directory content**
```
root@adhoc-2:/opt# hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/.hoodie/
Found 14 items
drwxr-xr-x - root supergroup 0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/.aux
drwxr-xr-x - root supergroup 0 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/.temp
-rw-r--r-- 1 root supergroup 4442 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182630323.commit
-rw-r--r-- 1 root supergroup 0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182630323.commit.requested
-rw-r--r-- 1 root supergroup 3017 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182630323.inflight
-rw-r--r-- 1 root supergroup 2825 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182714571.commit
-rw-r--r-- 1 root supergroup 0 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182714571.commit.requested
-rw-r--r-- 1 root supergroup 3131 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182714571.inflight
-rw-r--r-- 1 root supergroup 2823 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182741563.commit
-rw-r--r-- 1 root supergroup 0 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182741563.commit.requested
-rw-r--r-- 1 root supergroup 3129 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/.hoodie/20220406182741563.inflight
drwxr-xr-x - root supergroup 0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/archived
-rw-r--r-- 1 root supergroup 512 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/hoodie.properties
drwxr-xr-x - root supergroup 0 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/.hoodie/metadata
```
**Data Files Listing:**
```
root@adhoc-2:/opt# hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/2018/08/31/
Found 4 items
-rw-r--r-- 1 root supergroup 96 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2018/08/31/.hoodie_partition_metadata
-rw-r--r-- 1 root supergroup 443929 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182714571.parquet
-rw-r--r-- 1 root supergroup 443651 2022-04-06 18:27 /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet
-rw-r--r-- 1 root supergroup 443927 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_1-35-37_20220406182630323.parquet
root@adhoc-2:/opt# hdfs dfs -ls /user/hive/warehouse/stock_ticks_cow/2019/08/31/
Found 2 items
-rw-r--r-- 1 root supergroup 96 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2019/08/31/.hoodie_partition_metadata
-rw-r--r-- 1 root supergroup 443971 2022-04-06 18:26 /user/hive/warehouse/stock_ticks_cow/2019/08/31/258177c0-b9eb-43be-9fad-7c1d57dd4279-0_0-35-36_20220406182630323.parquet
```
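For reading the listing above: as far as I understand, Hudi base file names have the form `<fileId>_<writeToken>_<instantTime>.parquet`, so the three parquet files under `2018/08/31` would be three versions of the same file group, one per commit. A minimal sketch to split a name into those parts, assuming that naming convention holds (the function name `parse_hudi_base_file` is made up for illustration and is not part of Hudi):

```shell
# Hypothetical helper: split a Hudi base file name into
# file ID, write token and instant time.
# Assumes the name is <fileId>_<writeToken>_<instantTime>.parquet.
parse_hudi_base_file() {
  local name="${1##*/}"          # drop the directory part
  name="${name%.parquet}"        # drop the extension
  local file_id="${name%%_*}"    # everything before the first underscore
  local instant="${name##*_}"    # everything after the last underscore
  local rest="${name#*_}"        # writeToken_instantTime
  local write_token="${rest%_*}" # strip the trailing _instantTime
  printf '%s %s %s\n' "$file_id" "$write_token" "$instant"
}

parse_hudi_base_file /user/hive/warehouse/stock_ticks_cow/2018/08/31/c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet
```

The instant time in the last field should line up with one of the `.commit` files in the `.hoodie` listing.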
**Record Count per Data File**
```
scala> spark.sql("select _hoodie_file_name, date, count(1) from stock_ticks_cow group by _hoodie_file_name, date").show(false);
+------------------------------------------------------------------------+----------+--------+
|_hoodie_file_name                                                       |date      |count(1)|
+------------------------------------------------------------------------+----------+--------+
|c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182741563.parquet|2018/08/31|99      |
|c872d135-bf8f-4c5e-9eee-6347635c32d3-0_0-21-22_20220406182714571.parquet|2018/08/31|98      |
|258177c0-b9eb-43be-9fad-7c1d57dd4279-0_0-35-36_20220406182630323.parquet|2019/08/31|197     |
+------------------------------------------------------------------------+----------+--------+
```