[GitHub] [incubator-hudi] jaimin-shah commented on a change in pull request #862: Add support for composite key
jaimin-shah commented on a change in pull request #862: Add support for composite key
URL: https://github.com/apache/incubator-hudi/pull/862#discussion_r327747014

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java ##
@@ -102,7 +109,12 @@ public HoodieKey getKey(GenericRecord record) {
           "Unexpected type for partition field: " + partitionVal.getClass().getName());
     }
-    return new HoodieKey(DataSourceUtils.getNestedFieldValAsString(record, recordKeyField),
+    return new HoodieKey(
+        fields.stream()
+            .map(
+                recordKeyField ->
+                    DataSourceUtils.getNestedFieldValAsString(record, recordKeyField))
+            .collect(Collectors.joining(".")),

Review comment:
Hi @afilipchik, for your use case I don't think there will be any problem, but there can be a problem when there are no restrictions on recordKeyField (e.g. Version is supposed to increase monotonically). You can take a look at https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/ComplexKeyGenerator.java; it has a generic implementation of a complex key for a quite similar requirement.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [incubator-hudi] jaimin-shah commented on a change in pull request #862: Add support for composite key
jaimin-shah commented on a change in pull request #862: Add support for composite key
URL: https://github.com/apache/incubator-hudi/pull/862#discussion_r327502159

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java ##
@@ -102,7 +109,12 @@ public HoodieKey getKey(GenericRecord record) {
           "Unexpected type for partition field: " + partitionVal.getClass().getName());
     }
-    return new HoodieKey(DataSourceUtils.getNestedFieldValAsString(record, recordKeyField),
+    return new HoodieKey(
+        fields.stream()
+            .map(
+                recordKeyField ->
+                    DataSourceUtils.getNestedFieldValAsString(record, recordKeyField))
+            .collect(Collectors.joining(".")),

Review comment:
@bvaradar By ambiguity I meant two different records having the same key. For example, the field values ("US", ".A") and ("US.", "A") both produce the key "US..A":

  US  , .A  =>  US..A
  US. , A   =>  US..A

Keeping recordKeyField as part of the key resolves this, although I agree that these kinds of cases are quite rare.
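The collision described in this comment can be reproduced outside Hudi. The following is a minimal, self-contained sketch and not Hudi code: the class `CompositeKeyDemo`, the field names `country` and `code`, and the `name:value` layout are all hypothetical and chosen only for illustration. It contrasts joining bare values with "." (as the diff under review does) against a key that also carries each field's name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; it does not depend on any Hudi API.
class CompositeKeyDemo {

    // Joins only the field values with ".", mirroring Collectors.joining(".") in the diff.
    static String joinValues(List<String> values) {
        return values.stream().collect(Collectors.joining("."));
    }

    // Illustrative alternative: keep the field name next to each value.
    // The "name:value" layout and "," separator are assumptions made for this sketch.
    static String joinWithNames(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> e.getKey() + ":" + e.getValue())
            .collect(Collectors.joining(","));
    }

    // Builds an ordered two-field record; the field names are hypothetical.
    static Map<String, String> record(String country, String code) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("country", country);
        fields.put("code", code);
        return fields;
    }

    public static void main(String[] args) {
        // Two different records collapse to the same key when only values are joined.
        System.out.println(joinValues(Arrays.asList("US", ".A")));  // US..A
        System.out.println(joinValues(Arrays.asList("US.", "A")));  // US..A

        // With field names included, the two keys differ.
        System.out.println(joinWithNames(record("US", ".A")));      // country:US,code:.A
        System.out.println(joinWithNames(record("US.", "A")));      // country:US.,code:A
    }
}
```

Including the field name separates these two records, but a scheme like this can still collide if a value itself contains the chosen separators, so it narrows the problem rather than eliminating it.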
[GitHub] [incubator-hudi] jaimin-shah commented on a change in pull request #862: Add support for composite key
jaimin-shah commented on a change in pull request #862: Add support for composite key
URL: https://github.com/apache/incubator-hudi/pull/862#discussion_r326138793

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java ##
@@ -102,7 +109,12 @@ public HoodieKey getKey(GenericRecord record) {
           "Unexpected type for partition field: " + partitionVal.getClass().getName());
     }
-    return new HoodieKey(DataSourceUtils.getNestedFieldValAsString(record, recordKeyField),
+    return new HoodieKey(
+        fields.stream()
+            .map(
+                recordKeyField ->
+                    DataSourceUtils.getNestedFieldValAsString(record, recordKeyField))
+            .collect(Collectors.joining(".")),

Review comment:
Hi, I think this kind of key creates ambiguity. Refer to this PR for details: https://github.com/apache/incubator-hudi/pull/728