BalaMahesh opened a new issue #2203: URL: https://github.com/apache/hudi/issues/2203
**Describe the problem you faced**

A Hive query on a Hudi table returns no results for some partitions when the partition column is used in the where condition.

I have verified the partitions using `show partitions`, `desc formatted`, etc. I can also see the `.hoodie_partition_metadata` file and the parquet file in the table partition directory. Running `cat` on the file with parquet-tools shows that it contains exactly one ingested event.

`select count(*),dt from _ro table group by dt;` returns a count of 1 for that partition (y).

I can also run `select * from _ro where id=x;` (x is in partition y), but `select * from _ro where dt="y"` returns an empty result, while other dt values return results.

I am not sure where exactly the issue is: whether it is because the file is small and holds only one record, or because Hive is behaving unexpectedly. The query logs show numFiles = 1, numSplits = 1.

**To Reproduce**

Steps to reproduce the behavior:

1. Ingest records using HoodieDeltaStreamer from a JsonKafka source
2. Partition the data on a date field in yyyy-MM-dd format
3. Query the _ro table

**Expected behavior**

The query should return the single row.

**Environment Description**

* Hudi version : 0.6.1
* Spark version : 2.4.7
* Hive version : 1.2
* Hadoop version : 2.7.1
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : No
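For reference, the expected relationship between the three queries can be sketched as below. This is only an illustration, not a reproduction: it uses Python's built-in `sqlite3` as a stand-in for Hive, and the table name `events_ro`, the ids, and the `dt` values are all hypothetical. On a consistent table, the grouped count, the lookup by id, and the filter on the partition column should agree; in the behavior reported here, the last of these comes back empty.

```python
import sqlite3

# Stand-in for the Hive _ro table (assumption: sqlite3 replaces Hive here;
# table name, ids and dt values are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events_ro (id TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO events_ro VALUES (?, ?)",
    [("x", "2020-10-20"), ("a", "2020-10-21"), ("b", "2020-10-21")],
)


def run(sql, params=()):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql, params).fetchall()


# 1. Row count per partition value.
counts = dict(run("SELECT dt, COUNT(*) FROM events_ro GROUP BY dt"))

# 2. Lookup by record id (the record lives in partition "2020-10-20").
by_id = run("SELECT id, dt FROM events_ro WHERE id = ?", ("x",))

# 3. Filter on the partition column itself.
by_dt = run("SELECT id, dt FROM events_ro WHERE dt = ?", ("2020-10-20",))

# All three views of the single-record partition should agree.
consistent = counts["2020-10-20"] == len(by_dt) and by_id == by_dt
print(consistent)
```

Running the same three checks against the real _ro table would narrow down whether the mismatch is limited to the partition-column filter.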
