RameshkumarChikoti123 opened a new issue, #12152:
URL: https://github.com/apache/hudi/issues/12152

   I added two record keys (`customer_id`, `name`) and configured the record index as below:
   
   ```python
   hudi_options = {
       'hoodie.table.name': "hudi-table-with-rli-two-record-keys",
       'hoodie.datasource.write.recordkey.field': "customer_id,name",
       'hoodie.datasource.write.partitionpath.field': "state",
       'hoodie.datasource.write.precombine.field': "created_at",
       'hoodie.datasource.write.operation': "upsert",  # Use upsert operation
       'hoodie.index.type': "RECORD_INDEX",
       'hoodie.metadata.enable': "true",
       'hoodie.metadata.index.column.stats.enable': "true",
       'hoodie.metadata.record.index.enable': "true"
   }

   df.write.format("hudi").options(**hudi_options).mode("append").save("s3a://bucket/var/proj/hudipoc-proj/hudi-table-with-rli-two-record-key/")
   ```
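   For context, here is a minimal, illustrative sketch (an assumption, not the actual Hudi implementation) of how I understand composite record keys to be encoded when multiple recordkey fields are configured: the key generator joins each field as `name:value`, comma-separated, so the record index is keyed on the combined string rather than on either field alone.

   ```python
   # Illustrative sketch only -- composite_record_key is a hypothetical helper,
   # not a Hudi API. It mimics the "field1:value1,field2:value2" key shape that
   # a complex key generator is assumed to produce for multi-field record keys.
   def composite_record_key(record: dict, key_fields: list) -> str:
       """Join each recordkey field as 'name:value', comma-separated."""
       return ",".join(f"{f}:{record[f]}" for f in key_fields)

   row = {"customer_id": "04da8419-fb9e-47f1-a44f-3cf2199ad20a",
          "name": "Customer_43680", "state": "CA"}
   key = composite_record_key(row, ["customer_id", "name"])
   print(key)  # customer_id:04da8419-fb9e-47f1-a44f-3cf2199ad20a,name:Customer_43680
   ```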
   
   **Reading a record with the composite key**
   
   ```python
   spark.read.format("hudi") \
       .option("hoodie.enable.data.skipping", "true") \
       .option("hoodie.metadata.enable", "true") \
       .option("hoodie.metadata.record.index.enable", "true") \
       .option("hoodie.metadata.index.column.stats.enable", "true") \
       .load("s3a://bucket/var/proj/hudipoc-proj/hudi-table-with-rli-two-record-key/") \
       .createOrReplaceTempView("hudi_snapshot1")

   spark.sql("select * from hudi_snapshot1 where customer_id='04da8419-fb9e-47f1-a44f-3cf2199ad20a' and name='Customer_43680'").show(truncate=False)
   ```
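   My expectation of how the record-level index should serve this query, as a toy model (an assumption for illustration, not Hudi code): the index acts like a map from the full composite record key to the file group holding that record, so an equality predicate on all key fields should resolve to a single probe instead of a full scan.

   ```python
   # Toy model of a record-level index (hypothetical, for illustration only):
   # a map from the full composite key string to the owning file group.
   record_index = {
       "customer_id:abc,name:Customer_1": "file-group-07",
       "customer_id:def,name:Customer_2": "file-group-12",
   }

   # An equality predicate on ALL recordkey fields yields one complete key,
   # so the lookup should resolve to a single file group.
   print(record_index.get("customer_id:abc,name:Customer_1"))  # file-group-07

   # A predicate on only one key field gives no complete composite key to
   # probe, so nothing can be pruned and every file group must be scanned.
   print(record_index.get("customer_id:abc"))  # None
   ```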
   
   
   **Observations**:
   Spark is reading from all partitions, as shown in the attached image.
   
   
![image](https://github.com/user-attachments/assets/3faee159-9157-43c3-bcee-551b2cb85ec1)
   

