[ https://issues.apache.org/jira/browse/HUDI-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378984#comment-17378984 ]

ASF GitHub Bot commented on HUDI-1951:
--------------------------------------

minihippo commented on a change in pull request #3173:
URL: https://github.com/apache/hudi/pull/3173#discussion_r667671074



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/SimpleAvroKeyGenerator.java
##########
@@ -30,19 +33,36 @@
 
   public SimpleAvroKeyGenerator(TypedProperties props) {
     this(props, props.getString(KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY.key()),
-        props.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY.key()));
+        props.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY.key()),
+        props.getString(KeyGeneratorOptions.INDEXKEY_FILED_OPT.key(),
+            KeyGeneratorOptions.INDEXKEY_FILED_OPT.defaultValue()));
   }
 
   SimpleAvroKeyGenerator(TypedProperties props, String partitionPathField) {
-    this(props, null, partitionPathField);
+    this(props, null, partitionPathField, null);
   }
 
   SimpleAvroKeyGenerator(TypedProperties props, String recordKeyField, String partitionPathField) {
+    this(props, recordKeyField, partitionPathField, null);
+  }
+
+  SimpleAvroKeyGenerator(TypedProperties props, String recordKeyField, String partitionPathField,
+      String indexKeyField) {
     super(props);
     this.recordKeyFields = recordKeyField == null
         ? Collections.emptyList()
         : Collections.singletonList(recordKeyField);
     this.partitionPathFields = Collections.singletonList(partitionPathField);
+    if (!StringUtils.isNullOrEmpty(indexKeyField) && !indexKeyField.equals(recordKeyField)) {

Review comment:
       This check is incorrect. For the bucket index, indexKeyField can be a
subset of the record key fields. There is a one-to-one mapping between bucketId
and file group ID. Therefore, a record indexed by `colA` always lands in the
same bucket, and an update keyed by `colA` and `colB` is matched against the
old record stored in that bucket.
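
To illustrate the point above, here is a minimal sketch, not Hudi's actual implementation. The class `BucketIndexSketch`, the helpers `bucketId` and `recordKey`, the use of `String.hashCode()`, and the bucket count of 8 are all assumptions made for this example. It shows that when the bucket is derived only from `colA`, two versions of a record with the same (`colA`, `colB`) key necessarily fall into the same bucket, so an upsert by the full key can find the old version there.

```java
import java.util.HashMap;
import java.util.Map;

public class BucketIndexSketch {

    // Hypothetical helper: map the index key value to one of numBuckets buckets.
    // Math.floorMod keeps the result non-negative even if hashCode() is negative.
    static int bucketId(String indexKeyValue, int numBuckets) {
        return Math.floorMod(indexKeyValue.hashCode(), numBuckets);
    }

    // Hypothetical helper: build the full record key from all record key fields.
    static String recordKey(Map<String, String> record, String... recordKeyFields) {
        StringBuilder sb = new StringBuilder();
        for (String field : recordKeyFields) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(field).append(':').append(record.get(field));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int numBuckets = 8; // assumed bucket count, one file group per bucket

        Map<String, String> oldRecord = new HashMap<>();
        oldRecord.put("colA", "a1");
        oldRecord.put("colB", "b1");
        oldRecord.put("value", "old");

        Map<String, String> newRecord = new HashMap<>();
        newRecord.put("colA", "a1");
        newRecord.put("colB", "b1");
        newRecord.put("value", "new");

        // Record key uses colA and colB; the index key is only colA (a subset).
        String oldKey = recordKey(oldRecord, "colA", "colB");
        String newKey = recordKey(newRecord, "colA", "colB");
        int oldBucket = bucketId(oldRecord.get("colA"), numBuckets);
        int newBucket = bucketId(newRecord.get("colA"), numBuckets);

        System.out.println("same record key: " + oldKey.equals(newKey));
        System.out.println("same bucket:     " + (oldBucket == newBucket));
    }
}
```

Under these assumptions, a check that requires indexKeyField to equal recordKeyField would reject this configuration even though the update is still routed to the bucket that holds the old record.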





> Hash Index for HUDI
> -------------------
>
>                 Key: HUDI-1951
>                 URL: https://issues.apache.org/jira/browse/HUDI-1951
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: XiaoyuGeng
>            Assignee: XiaoyuGeng
>            Priority: Major
>              Labels: pull-request-available
>
> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+29%3A+Hash+Index



