satish created HUDI-1443:
----------------------------
Summary: Remove record deserialization in RDDCustomColumnsSortPartitioner
Key: HUDI-1443
URL: https://issues.apache.org/jira/browse/HUDI-1443
Project: Apache Hudi
Issue Type: Sub-task
Components: Performance
Reporter: satish
https://github.com/apache/hudi/pull/2263#discussion_r533653930 has the context.
When sorting is specified as part of clustering, we use the custom partitioner
RDDCustomColumnsSortPartitioner. This deserializes records to get the values of
the sort columns. Check whether it is possible to avoid this and implement the
suggestion in the PR.
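
To illustrate the cost described above, here is a minimal, self-contained Java
sketch. It is not Hudi code and not necessarily the suggestion from the PR:
SerializedRecord, decodeSortColumn, and the comma-separated payload are
hypothetical stand-ins. It contrasts decoding the payload whenever a sort value
is needed with sorting on a key that was captured when the record was built.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortKeySketch {
    // Hypothetical stand-in for a record whose payload is kept serialized.
    static final class SerializedRecord {
        final String sortKey;  // sort column value captured at construction time
        final byte[] payload;  // serialized "id,city" bytes

        SerializedRecord(String sortKey, byte[] payload) {
            this.sortKey = sortKey;
            this.payload = payload;
        }
    }

    // Counts how many times a payload was decoded, to make the cost visible.
    static int decodeCount = 0;

    // Hypothetical decode step: parses the payload and returns the sort column.
    static String decodeSortColumn(byte[] payload) {
        decodeCount++;
        String[] fields = new String(payload, StandardCharsets.UTF_8).split(",");
        return fields[1];
    }

    // Decodes the payload every time a sort value is needed.
    static List<SerializedRecord> sortByDecoding(List<SerializedRecord> records) {
        List<SerializedRecord> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparing((SerializedRecord r) -> decodeSortColumn(r.payload)));
        return sorted;
    }

    // Sorts on the carried key and never touches the payload.
    static List<SerializedRecord> sortByCarriedKey(List<SerializedRecord> records) {
        List<SerializedRecord> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparing((SerializedRecord r) -> r.sortKey));
        return sorted;
    }

    // Builds a record and captures the sort value while it is still available.
    static SerializedRecord record(String id, String city) {
        byte[] payload = (id + "," + city).getBytes(StandardCharsets.UTF_8);
        return new SerializedRecord(city, payload);
    }

    public static void main(String[] args) {
        List<SerializedRecord> input =
            Arrays.asList(record("1", "paris"), record("2", "austin"), record("3", "miami"));
        for (SerializedRecord r : sortByCarriedKey(input)) {
            System.out.println(r.sortKey);
        }
        System.out.println("payload decodes: " + decodeCount);
    }
}
```

Both methods produce the same order; only sortByDecoding increments
decodeCount. Whether a sort value can be carried alongside the record in the
real partitioner is an open question for this ticket.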
--
This message was sent by Atlassian Jira
(v8.3.4#803005)