[ 
https://issues.apache.org/jira/browse/HUDI-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704925#comment-17704925
 ] 

Wally Tang edited comment on HUDI-5982 at 3/25/23 10:54 AM:
------------------------------------------------------------

My idea is that if the user's primary key data contains ",", we can replace it 
with __commas__ when generating the recordKey. When the user wants to retrieve 
the real primary key data from the recordKey, they can replace __commas__ 
with ",".
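
A minimal sketch of that replacement, for illustration only. The class name
RecordKeyEscaper and its two methods are hypothetical and are not part of
Hudi; only the __commas__ placeholder comes from the comment above. One
caveat: a value that already contains the literal text __commas__ would not
round-trip with this approach.

```java
// Hypothetical helper, not a Hudi API: escapes commas in a single
// primary key value before it is embedded in the recordKey.
final class RecordKeyEscaper {
    // Placeholder proposed in the comment above.
    static final String COMMA_PLACEHOLDER = "__commas__";

    // Replace every comma so a later split(",") only sees field separators.
    static String escape(String value) {
        return value.replace(",", COMMA_PLACEHOLDER);
    }

    // Restore the user's original value when reading it back.
    static String unescape(String value) {
        return value.replace(COMMA_PLACEHOLDER, ",");
    }

    public static void main(String[] args) {
        String escaped = escape("a,b");
        System.out.println(escaped);           // a__commas__b
        System.out.println(unescape(escaped)); // a,b
    }
}
```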


was (Author: tangshangwen):
My idea is that if the user's primary key data contains ",", we can replace it 
with "__commas__" when generating the recordKey. When the user wants to 
retrieve the real primary key data from the recordKey, they can replace 
"__commas__" with ",".

> When the user's primary key data contains commas, BucketIdentifier cannot be 
> used
> ---------------------------------------------------------------------------------
>
>                 Key: HUDI-5982
>                 URL: https://issues.apache.org/jira/browse/HUDI-5982
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: index
>    Affects Versions: 0.12.0
>            Reporter: Wally Tang
>            Priority: Major
>
> In the scenario of using composite primary keys and bucket index in a Hudi 
> table, BucketIdentifier splits the recordKey using commas as a delimiter. 
> This can cause exceptions to occur if the user's primary key data contains 
> commas.
> {code:java}
> // BucketIdentifier.java
> private static List<String> getHashKeysUsingIndexFields(String recordKey, 
> List<String> indexKeyFields) {
>   Map<String, String> recordKeyPairs = Arrays.stream(recordKey.split(","))
>       .map(p -> p.split(":"))
>       .collect(Collectors.toMap(p -> p[0], p -> p[1]));
>   return indexKeyFields.stream()
>       .map(recordKeyPairs::get).collect(Collectors.toList());
> } {code}
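
The failure described in the issue can be reproduced outside Hudi with a
standalone copy of the quoted method. This is a sketch under assumptions: the
class name BucketKeySplitDemo, the field names id and name, and the sample
keys are made up for illustration, and the field:value,field:value key layout
is inferred from the split calls in the quoted code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the method body is copied from the issue.
final class BucketKeySplitDemo {
    static List<String> getHashKeysUsingIndexFields(String recordKey,
                                                    List<String> indexKeyFields) {
        Map<String, String> recordKeyPairs = Arrays.stream(recordKey.split(","))
            .map(p -> p.split(":"))
            .collect(Collectors.toMap(p -> p[0], p -> p[1]));
        return indexKeyFields.stream()
            .map(recordKeyPairs::get).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "name");

        // No comma in the data: both fields are recovered.
        System.out.println(getHashKeysUsingIndexFields("id:1,name:ab", fields));

        // Comma in the data: "b" becomes its own segment with no ":",
        // so p[1] is out of bounds.
        try {
            getHashKeysUsingIndexFields("id:1,name:a,b", fields);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("split failed: " + e);
        }
    }
}
```

The first call prints [1, ab]. The second call fails because the segment "b"
has no ":" and therefore no second element after the inner split.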



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
