[
https://issues.apache.org/jira/browse/HUDI-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sagar Sumit updated HUDI-4040:
------------------------------
Status: In Progress (was: Open)
> Add customerColumnPartitionerRow support
> ----------------------------------------
>
> Key: HUDI-4040
> URL: https://issues.apache.org/jira/browse/HUDI-4040
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Hui An
> Priority: Major
> Labels: pull-request-available
>
> Like RDDCustomColumnsSortPartitioner, this partitioner sorts records by the
> custom columns the user specifies, but for the Row-based bulk insert path.
> For example:
> {code:scala}
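> import org.apache.hudi.DataSourceWriteOptions
> import org.apache.hudi.DataSourceWriteOptions._
> import org.apache.hudi.config.HoodieWriteConfig
> import org.apache.spark.sql.SaveMode
>
> // df is assumed to be an existing DataFrame with session_id, date and page_type columns.
> df.write.format("hudi")
>   .option(HoodieWriteConfig.TABLE_NAME, "test_table")
>   .option(OPERATION.key, DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL)
>   .option(RECORDKEY_FIELD.key, "session_id")
>   .option(PARTITIONPATH_FIELD.key, "date")
>   // Use the proposed Row-based partitioner and sort by page_type.
>   .option("hoodie.bulkinsert.user.defined.partitioner.class",
>     "org.apache.hudi.execution.bulkinsert.CustomColumnsSortPartitionerWithRows")
>   .option("hoodie.bulkinsert.user.defined.partitioner.sort.columns",
>     "page_type")
>   .mode(SaveMode.Append)
>   .save("hdfs://test/test_table")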
> {code}
--
This message was sent by Atlassian Jira
(v8.20.7#820007)