alexeykudinkin commented on issue #5231:
URL: https://github.com/apache/hudi/issues/5231#issuecomment-1097314266
Validated that #5296 addresses the issue:
```
scala> val partitions = FSUtils.getAllPartitionPaths(engineContext, metadataConfig, basePath).iterator().asScala.toList;
partitions: List[String] = List(2018/08/31, 2019/08/31)
scala>
scala>
scala> partitions.flatMap(x => {
| val engContext = new HoodieLocalEngineContext(conf.get());
| val fsView = new HoodieMetadataFileSystemView(engContext, metaClient, metaClient.getActiveTimeline().getCommitsTimeline().filterCompletedInstants(), metadataConfig);
| fsView.getLatestBaseFiles(x).iterator().asScala.toList.map(_.getFileName)
| })
res0: List[String] = List(c4ea1cd9-0fec-4f7f-8272-e093fe6f9344-0_0-21-22_20220412225124731.parquet, be940ea6-2ece-405b-8de0-626e803050d8-0_0-36-37_20220412224915898.parquet)
scala>
scala> spark.read.format("hudi").load("hdfs:///user/hive/warehouse/stock_ticks_cow").createOrReplaceTempView("stock_ticks_cow")
scala>
scala> spark.sql("select date, count(1) from stock_ticks_cow group by date").show(false)
+----------+--------+
|date      |count(1)|
+----------+--------+
|2019/08/31|197     |
|2018/08/31|197     |
+----------+--------+
scala> spark.sql("select _hoodie_file_name, date, count(1) from stock_ticks_cow group by _hoodie_file_name, date").show(false);
+------------------------------------------------------------------------+----------+--------+
|_hoodie_file_name                                                       |date      |count(1)|
+------------------------------------------------------------------------+----------+--------+
|be940ea6-2ece-405b-8de0-626e803050d8-0_0-36-37_20220412224915898.parquet|2019/08/31|197     |
|c4ea1cd9-0fec-4f7f-8272-e093fe6f9344-0_0-21-22_20220412225124731.parquet|2018/08/31|197     |
+------------------------------------------------------------------------+----------+--------+
```