RameshkumarChikoti123 opened a new issue, #12152:
URL: https://github.com/apache/hudi/issues/12152
Added two record keys (`customer_id`, `name`) and configured the record index as below:
```python
hudi_options = {
    'hoodie.table.name': "hudi-table-with-rli-two-record-keys",
    'hoodie.datasource.write.recordkey.field': "customer_id,name",
    'hoodie.datasource.write.partitionpath.field': "state",
    'hoodie.datasource.write.precombine.field': "created_at",
    'hoodie.datasource.write.operation': "upsert",  # use upsert operation
    'hoodie.index.type': "RECORD_INDEX",
    'hoodie.metadata.enable': "true",
    'hoodie.metadata.index.column.stats.enable': "true",
    'hoodie.metadata.record.index.enable': "true",
}

df.write.format("hudi").options(**hudi_options).mode("append").save(
    "s3a://bucket/var/proj/hudipoc-proj/hudi-table-with-rli-two-record-key/"
)
```
**Reading records with composite keys**

```python
spark.read.format("hudi") \
    .option("hoodie.enable.data.skipping", "true") \
    .option("hoodie.metadata.enable", "true") \
    .option("hoodie.metadata.record.index.enable", "true") \
    .option("hoodie.metadata.index.column.stats.enable", "true") \
    .load("s3a://bucket/var/proj/hudipoc-proj/hudi-table-with-rli-two-record-key/") \
    .createOrReplaceTempView("hudi_snapshot1")

spark.sql("""
    select * from hudi_snapshot1
    where customer_id = '04da8419-fb9e-47f1-a44f-3cf2199ad20a'
      and name = 'Customer_43680'
""").show(truncate=False)
```
**Observations**:
Spark scans all partitions, as shown in the attached image.
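For context on what the record index has to match against: with multiple record key fields, Hudi's `ComplexKeyGenerator` encodes the record key as `field1:value1,field2:value2` in the `_hoodie_record_key` meta column, and (to my understanding) a record-index point lookup needs an equality predicate on every key field so the full key string can be reconstructed. A minimal sketch of that encoding, assuming the default complex-key format (the `complex_record_key` helper is hypothetical, for illustration only; verify against your Hudi version):

```python
def complex_record_key(key_fields, row):
    # Hypothetical helper: mimics how Hudi's ComplexKeyGenerator
    # encodes a composite record key as "field1:value1,field2:value2"
    # (assumption: default encoding; check your Hudi version).
    return ",".join(f"{field}:{row[field]}" for field in key_fields)

key = complex_record_key(
    ["customer_id", "name"],
    {"customer_id": "04da8419-fb9e-47f1-a44f-3cf2199ad20a",
     "name": "Customer_43680"},
)
# key == "customer_id:04da8419-fb9e-47f1-a44f-3cf2199ad20a,name:Customer_43680"
```

If the query above still scans all partitions despite constraining both key fields, that would point at the index lookup not being applied rather than at the predicate.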
