[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391601#comment-17391601
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan merged pull request #3315:
URL: https://github.com/apache/hudi/pull/3315


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Virtual keys support for Compaction
> ---
>
> Key: HUDI-2177
> URL: https://issues.apache.org/jira/browse/HUDI-2177
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Writer Core
>Reporter: sivabalan narayanan
>Assignee: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Virtual keys support for Compaction



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391380#comment-17391380
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 7c05ed3df1d3e9bff709d9aba9168f7bb7b1ac54 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1295)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391344#comment-17391344
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 4d941eb79ab5e769b2643c5ad1b5b2d767a0af17 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1277)
 
   * 7c05ed3df1d3e9bff709d9aba9168f7bb7b1ac54 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1295)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391325#comment-17391325
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 4d941eb79ab5e769b2643c5ad1b5b2d767a0af17 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1277)
 
   * 7c05ed3df1d3e9bff709d9aba9168f7bb7b1ac54 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391317#comment-17391317
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680658525



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -70,13 +78,24 @@
    */
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
     InputSplitUtils.writeString(getBasePath(), out);
     InputSplitUtils.writeString(getMaxCommitTime(), out);
     out.writeInt(getDeltaLogPaths().size());
     for (String logFilePath : getDeltaLogPaths()) {
       InputSplitUtils.writeString(logFilePath, out);
     }
+    if (!getHoodieVirtualKeyInfoOpt().isPresent()) {

Review comment:
   I could not find any other way to do this. We are not mapping to a
   value; we are just doing X or Y depending on whether it's present or not.
   So we can't use .map().orElse(). I checked the other APIs in Option, but
   could not find a suitable one.
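
The pattern discussed in this comment (write one thing when the optional value is absent, another when it is present) can be sketched as below. This is only an illustration under stated assumptions: it uses `java.util.Optional` as a stand-in for Hudi's `Option`, a plain `String` as a stand-in for the virtual key info, and the class and method names (`OptionalFieldCodec`, `writeOptionalString`, `readOptionalString`, `roundTrip`) are hypothetical. Each branch performs a side effect that can throw a checked `IOException`, which is one reason an explicit `isPresent()` branch can read more naturally here than `.map().orElse()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Optional;

public class OptionalFieldCodec {

  // Writes a boolean presence flag first, then the value only when it is present.
  public static void writeOptionalString(Optional<String> value, DataOutput out) throws IOException {
    if (!value.isPresent()) {
      out.writeBoolean(false);
    } else {
      out.writeBoolean(true);
      out.writeUTF(value.get());
    }
  }

  // Reads the presence flag and, only if it is set, the value that follows it.
  public static Optional<String> readOptionalString(DataInput in) throws IOException {
    return in.readBoolean() ? Optional.of(in.readUTF()) : Optional.empty();
  }

  // Serializes to an in-memory buffer and reads the value back (illustration only).
  public static Optional<String> roundTrip(Optional<String> value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      writeOptionalString(value, new DataOutputStream(bytes));
      return readOptionalString(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The reader has to consume the flag before deciding whether to read the value, so the write and read sides must agree on the field order.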






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391303#comment-17391303
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680653697



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -70,13 +78,24 @@
    */
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
     InputSplitUtils.writeString(getBasePath(), out);
     InputSplitUtils.writeString(getMaxCommitTime(), out);
     out.writeInt(getDeltaLogPaths().size());
     for (String logFilePath : getDeltaLogPaths()) {
       InputSplitUtils.writeString(logFilePath, out);
     }
+    if (!getHoodieVirtualKeyInfoOpt().isPresent()) {

Review comment:
   ifPresent returns void. Will try to see if there are other options.






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391302#comment-17391302
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680653697



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -70,13 +78,24 @@
    */
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
     InputSplitUtils.writeString(getBasePath(), out);
     InputSplitUtils.writeString(getMaxCommitTime(), out);
     out.writeInt(getDeltaLogPaths().size());
     for (String logFilePath : getDeltaLogPaths()) {
       InputSplitUtils.writeString(logFilePath, out);
     }
+    if (!getHoodieVirtualKeyInfoOpt().isPresent()) {

Review comment:
   ifPresent returns void. Will try using .map().orElse().






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391295#comment-17391295
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680647264



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##
@@ -157,6 +157,11 @@
       .withDocumentation("When enabled, populates all meta fields. When disabled, no meta fields are populated "
           + "and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing");
 
+  public static final ConfigProperty HOODIE_TABLE_KEY_GENERATOR_CLASS = ConfigProperty
+      .key("hoodie.datasource.write.keygenerator.class")

Review comment:
   I will follow the same naming convention as record key and partition path:
   hoodie.table.keygenerator.class






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391292#comment-17391292
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680643619



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##
@@ -324,6 +324,14 @@ public void validateTableProperties(Properties properties, WriteOperationType op
         && Boolean.parseBoolean((String) properties.getOrDefault(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(), HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue()))) {
       throw new HoodieException(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key() + " already disabled for the table. Can't be re-enabled back");
     }
+
+    // meta fields can be disabled only with SimpleKeyGenerator
+    if (!getTableConfig().populateMetaFields()
+        && !properties.getProperty(HoodieTableConfig.HOODIE_TABLE_KEY_GENERATOR_CLASS.key(), "org.apache.hudi.keygen.SimpleKeyGenerator")

Review comment:
   I did respond to that already, and I left reviewer notes earlier too.
   SimpleKeyGenerator is not visible from this class, so I had to hard-code
   the class name. Not sure I understand your suggestion.
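
The check in the hunk above can be sketched in isolation as follows. This is a minimal, hedged illustration and not the code from the pull request: the class name `KeyGeneratorValidation`, the `validate` method, the use of `IllegalArgumentException` in place of `HoodieException`, and the `.equals(...)` comparison (the quoted hunk is cut off before it) are all assumptions. The property key follows the `hoodie.table.keygenerator.class` name proposed elsewhere in this thread. The key generator is compared by its fully qualified class name as a string, which is the workaround the comment describes for the class not being visible from the module.

```java
import java.util.Properties;

public class KeyGeneratorValidation {

  // Property key as proposed in the review thread (an assumption for this sketch).
  static final String KEY_GENERATOR_CLASS_KEY = "hoodie.table.keygenerator.class";

  // Hard-coded class name, because the class itself is assumed not to be on this module's classpath.
  static final String SIMPLE_KEY_GENERATOR = "org.apache.hudi.keygen.SimpleKeyGenerator";

  // Rejects the combination "meta fields disabled" + "key generator other than SimpleKeyGenerator".
  public static void validate(boolean populateMetaFields, Properties properties) {
    String keyGenerator = properties.getProperty(KEY_GENERATOR_CLASS_KEY, SIMPLE_KEY_GENERATOR);
    if (!populateMetaFields && !keyGenerator.equals(SIMPLE_KEY_GENERATOR)) {
      throw new IllegalArgumentException(
          "Meta fields can be disabled only with " + SIMPLE_KEY_GENERATOR + ", but found " + keyGenerator);
    }
  }
}
```

Because the default passed to `getProperty` is the simple key generator itself, a table that never sets the property passes the check.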







[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391291#comment-17391291
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680643364



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##
@@ -157,6 +157,11 @@
       .withDocumentation("When enabled, populates all meta fields. When disabled, no meta fields are populated "
           + "and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing");
 
+  public static final ConfigProperty HOODIE_TABLE_KEY_GENERATOR_CLASS = ConfigProperty
+      .key("hoodie.datasource.write.keygenerator.class")

Review comment:
   My intention here was just to reuse the same config that someone uses with
   the Spark datasource, rather than introducing new configs. But I guess we
   don't follow that for partition path or record keys. Will fix it.






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-08-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391265#comment-17391265
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

vinothchandar commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r680614863



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##
@@ -157,6 +157,11 @@
       .withDocumentation("When enabled, populates all meta fields. When disabled, no meta fields are populated "
           + "and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing");
 
+  public static final ConfigProperty HOODIE_TABLE_KEY_GENERATOR_CLASS = ConfigProperty
+      .key("hoodie.datasource.write.keygenerator.class")

Review comment:
   Do we really write this with the `hoodie.datasource` prefix? This is a
   table-level property, not just for the datasource write. Let's fix this?

##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/compact/CompactionTestBase.java
##
@@ -198,7 +198,7 @@ protected void executeCompaction(String compactionInstantTime, SparkRDDWriteClie
     assertEquals(latestCompactionCommitTime, compactionInstantTime,
         "Expect compaction instant time to be the latest commit time");
     assertEquals(expectedNumRecs,
-        HoodieClientTestUtils.countRecordsSince(jsc, basePath, sqlContext, timeline, "000"),
+        HoodieClientTestUtils.countRecordsWithOptionalSince(jsc, basePath, sqlContext, timeline, Option.of("000")),

Review comment:
   Rename to `countRecordOptionallySince`.

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimeFileSplit.java
##
@@ -36,16 +38,20 @@
 
   private String basePath;
 
+  private Option hoodieVirtualKeyInfoOpt = Option.empty();
+
   public HoodieRealtimeFileSplit() {
     super();
   }
 
-  public HoodieRealtimeFileSplit(FileSplit baseSplit, String basePath, List deltaLogPaths, String maxCommitTime)
+  public HoodieRealtimeFileSplit(FileSplit baseSplit, String basePath, List deltaLogPaths, String maxCommitTime,
+                                 Option hoodieVirtualKeyInfoOpt)

Review comment:
   I think it's okay to drop the `Opt` everywhere. `virtualKeyInfo` is
   concise and already conveys the meaning. `virtualKeyInfo.isPresent()` will
   again convey what `Opt` would have conveyed.
   
   Also, let's drop the `hoodie` prefix everywhere in the variables as well.

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -52,8 +53,15 @@
*/
   String getBasePath();
 
+  /**
+   * Returns Virtual key info if meta fields are disabled.
+   * @return
+   */
+  Option getHoodieVirtualKeyInfoOpt();

Review comment:
   fix name.

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##
@@ -204,23 +226,34 @@ private static Configuration addProjectionField(Configuration conf, String field
     return conf;
   }
 
-  public static void addRequiredProjectionFields(Configuration configuration) {
+  public static void addRequiredProjectionFields(Configuration configuration, Option hoodieVirtualKeyInfoOpt) {

Review comment:
   naming

##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/cluster/SparkExecuteClusteringCommitActionExecutor.java
##
@@ -211,8 +213,11 @@ protected String getCommitActionType() {
             .withBitCaskDiskMapCompressionEnabled(config.getCommonConfig().isBitCaskDiskMapCompressionEnabled())
             .build();
 
+        HoodieTableConfig tableConfig = table.getMetaClient().getTableConfig();
         recordIterators.add(HoodieFileSliceReader.getFileSliceReader(baseFileReader, scanner, readerSchema,
-            table.getMetaClient().getTableConfig().getPayloadClass()));
+            tableConfig.getPayloadClass(),
+            tableConfig.populateMetaFields() ? Option.empty() : Option.of(tableConfig.getRecordKeyFieldProp()),

Review comment:
   We can use `Option>` anywhere both a record key and a partition path
   field need to be passed.
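
The suggestion in this comment appears to be to carry both field names in a single optional pair rather than in two separate optionals; the exact generic type was lost in the archived text. The sketch below is an assumption-laden illustration of that idea: it uses `java.util.Optional` and `java.util.Map.Entry` as stand-ins for Hudi's `Option` and `Pair`, and the class `VirtualKeyFields` and method `keyFields` are hypothetical names.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Optional;

public class VirtualKeyFields {

  // Returns (recordKeyField, partitionPathField) only when meta fields are NOT populated,
  // mirroring the "populateMetaFields() ? empty : of(...)" shape in the hunk above.
  public static Optional<Map.Entry<String, String>> keyFields(
      boolean populateMetaFields, String recordKeyField, String partitionPathField) {
    if (populateMetaFields) {
      return Optional.empty();
    }
    Map.Entry<String, String> fields = new SimpleImmutableEntry<>(recordKeyField, partitionPathField);
    return Optional.of(fields);
  }
}
```

With one optional pair, callers cannot end up holding a record key field without its partition path field, which two independent optionals would allow.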

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java
##
@@ -80,6 +82,10 @@
   private final HoodieTableMetaClient hoodieTableMetaClient;
   // Merge strategy to use when combining records from log
   private final String payloadClassFQN;
+  // simple recordKey field
+  private Option simpleRecordKeyField = Option.empty();

Review comment:
   Let's use a Pair.

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimeFileSplit.java
##
@@ -60,6 +66,16 @@ public String 

[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390809#comment-17390809
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 4d941eb79ab5e769b2643c5ad1b5b2d767a0af17 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1277)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390798#comment-17390798
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * dbecf0559f3d42a43b3e311b10e090bd6d03ab86 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1276)
 
   * 4d941eb79ab5e769b2643c5ad1b5b2d767a0af17 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1277)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390792#comment-17390792
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * dbecf0559f3d42a43b3e311b10e090bd6d03ab86 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1276)
 
   * 4d941eb79ab5e769b2643c5ad1b5b2d767a0af17 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390738#comment-17390738
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * dbecf0559f3d42a43b3e311b10e090bd6d03ab86 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1276)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390711#comment-17390711
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 5182e15a24f3b02c95fa9943997fb5810715 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1247)
 
   * dbecf0559f3d42a43b3e311b10e090bd6d03ab86 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1276)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390679#comment-17390679
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 5182e15a24f3b02c95fa9943997fb5810715 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1247)
   * dbecf0559f3d42a43b3e311b10e090bd6d03ab86 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390161#comment-17390161
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r679492701



##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/cluster/SparkExecuteClusteringCommitActionExecutor.java
##
@@ -129,7 +131,7 @@ public SparkExecuteClusteringCommitActionExecutor(HoodieEngineContext context,
   /**
    * Validate actions taken by clustering. In the first implementation, we validate at least one new file is written.
    * But we can extend this to add more validation. E.g. number of records read = number of records written etc.
-   * 
+   *

Review comment:
   I have no idea how these are getting included. I reverted this locally,
ran git add on this file, and pushed. I repeated that twice, but the file
still gets auto-formatted somehow :(






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390158#comment-17390158
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * 5182e15a24f3b02c95fa9943997fb5810715 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1247)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390143#comment-17390143
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * f3e087be2161389fcfdbcdedf748172c29a39ed2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1238)
   * 5182e15a24f3b02c95fa9943997fb5810715 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1247)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390134#comment-17390134
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * f3e087be2161389fcfdbcdedf748172c29a39ed2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1238)
   * 5182e15a24f3b02c95fa9943997fb5810715 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389298#comment-17389298
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * f3e087be2161389fcfdbcdedf748172c29a39ed2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1238)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389277#comment-17389277
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * edf2faa79c773b0f4245f9a2b684eb9e65c61ea5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1230)
   * f3e087be2161389fcfdbcdedf748172c29a39ed2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1238)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389275#comment-17389275
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * edf2faa79c773b0f4245f9a2b684eb9e65c61ea5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1230)
   * f3e087be2161389fcfdbcdedf748172c29a39ed2 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389160#comment-17389160
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * edf2faa79c773b0f4245f9a2b684eb9e65c61ea5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1230)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389153#comment-17389153
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1228)
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * edf2faa79c773b0f4245f9a2b684eb9e65c61ea5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1230)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389150#comment-17389150
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1228)
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   * edf2faa79c773b0f4245f9a2b684eb9e65c61ea5 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389149#comment-17389149
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-888706822


   @vinothchandar: addressed all feedback and left comments on some of it.




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389148#comment-17389148
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1228)
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   * 40cbb91efb34e6360849a42056de6343cc1251d6 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389143#comment-17389143
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678736586



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java
##
@@ -302,7 +310,12 @@ private void processDataBlock(HoodieDataBlock dataBlock) throws Exception {
   }
 
   protected HoodieRecord createHoodieRecord(IndexedRecord rec) {
-    return SpillableMapUtils.convertToHoodieRecordPayload((GenericRecord) rec, this.payloadClassFQN);
+    if (!simpleRecordKeyFieldOpt.isPresent()) {

Review comment:
   I ran into issues, as I mentioned. Here is the stack trace:
https://gist.github.com/nsivabalan/5147fde404e970fab66515af2eddcdcb
   Hence I left it as is.






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389140#comment-17389140
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1228)
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389137#comment-17389137
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 1eed53e2e882b8cf296a99df0664845f99c8b264 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1226)
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1228)
   * c3bdfdd1e47790f46f4263a7fb5242c01dd02188 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389136#comment-17389136
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678696181



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java
##
@@ -36,11 +38,14 @@
   private Iterator> recordsIterator;
 
   public static HoodieFileSliceReader getFileSliceReader(
-      HoodieFileReader baseFileReader, HoodieMergedLogRecordScanner scanner, Schema schema, String payloadClass) throws IOException {
+      HoodieFileReader baseFileReader, HoodieMergedLogRecordScanner scanner, Schema schema, String payloadClass,
+      Option simpleRecordKeyFieldOpt, Option simplePartitionPathFieldOpt) throws IOException {

Review comment:
   Responded elsewhere. I would like to keep the meta-field logic
(HoodieRecord.RECORD_KEY_METADATA_FIELD) within SpillableMapUtils; I don't
want to leak it out to all callers.
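
The design point in the comment above can be illustrated with a minimal sketch. This is not code from the pull request: the class RecordKeyResolver and the method resolveKey are hypothetical, a plain Map stands in for an Avro GenericRecord, and java.util.Optional stands in for Hudi's own Option type. The only name taken from Hudi is the meta column "_hoodie_record_key", which is the value of HoodieRecord.RECORD_KEY_METADATA_FIELD. The assumption is that callers pass only an optional key-field name, and the fallback to the meta column stays inside one helper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not part of Hudi. A Map stands in for an Avro GenericRecord. */
public final class RecordKeyResolver {
    // Name of Hudi's record-key meta column (HoodieRecord.RECORD_KEY_METADATA_FIELD).
    public static final String RECORD_KEY_META_FIELD = "_hoodie_record_key";

    private RecordKeyResolver() {}

    /**
     * Returns the record key for the given record.
     * If no key field is configured, the key is read from the meta column.
     * If a key field is configured (the virtual-keys case in this sketch),
     * the key is read from that user data field instead.
     */
    public static String resolveKey(Map<String, Object> record,
                                    Optional<String> simpleRecordKeyFieldOpt) {
        // The meta-field fallback lives here, so callers never name the meta column.
        String field = simpleRecordKeyFieldOpt.orElse(RECORD_KEY_META_FIELD);
        Object value = record.get(field);
        if (value == null) {
            throw new IllegalArgumentException("Record has no value for key field: " + field);
        }
        return value.toString();
    }

    public static void main(String[] args) {
        // Record written with meta fields populated.
        Map<String, Object> withMeta = new HashMap<>();
        withMeta.put(RECORD_KEY_META_FIELD, "key-1");
        withMeta.put("uuid", "some-other-value");
        System.out.println(resolveKey(withMeta, Optional.empty()));

        // Record without meta fields; the key comes from the "uuid" data field.
        Map<String, Object> virtual = new HashMap<>();
        virtual.put("uuid", "key-2");
        System.out.println(resolveKey(virtual, Optional.of("uuid")));
    }
}
```

In this sketch a caller such as a log scanner or file-slice reader would only forward its optional field name; whether the actual change is structured this way is not confirmed by the thread.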






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389131#comment-17389131
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678729363



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -532,8 +539,8 @@ object HoodieSparkSqlWriter {
  tableConfig: HoodieTableConfig,
  jsc: JavaSparkContext,
  tableInstantInfo: TableInstantInfo
- ): (Boolean, common.util.Option[java.lang.String], common.util.Option[java.lang.String]) = {
-if(writeResult.getWriteStatuses.rdd.filter(ws => ws.hasErrors).isEmpty()) {
+): (Boolean, common.util.Option[java.lang.String], common.util.Option[java.lang.String]) = {

Review comment:
   I didn't know we could do this. Will fix it.






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389127#comment-17389127
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 1eed53e2e882b8cf296a99df0664845f99c8b264 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1226)
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1228)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389126#comment-17389126
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 1eed53e2e882b8cf296a99df0664845f99c8b264 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1226)
 
   * 5c6f69320207cff8459082975140f3d8d4f4ce25 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389117#comment-17389117
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 1eed53e2e882b8cf296a99df0664845f99c8b264 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1226)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389098#comment-17389098
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678696181



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java
##
@@ -36,11 +38,14 @@
   private Iterator> 
recordsIterator;
 
   public static  
HoodieFileSliceReader getFileSliceReader(
-  HoodieFileReader baseFileReader, HoodieMergedLogRecordScanner 
scanner, Schema schema, String payloadClass) throws IOException {
+  HoodieFileReader baseFileReader, HoodieMergedLogRecordScanner 
scanner, Schema schema, String payloadClass,
+  Option simpleRecordKeyFieldOpt, Option 
simplePartitionPathFieldOpt) throws IOException {

Review comment:
   Responded elsewhere. I would like to keep the meta field logic 
(HoodieRecord.RECORD_KEY_METADATA_FIELD) within SpillableMapUtils rather than 
spread it out to all callers. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389093#comment-17389093
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b624c142628ec6b780bc124dbd5c9d0a9d5e1fdc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1225)
 
   * 1eed53e2e882b8cf296a99df0664845f99c8b264 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1226)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389092#comment-17389092
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b624c142628ec6b780bc124dbd5c9d0a9d5e1fdc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1225)
 
   * 1eed53e2e882b8cf296a99df0664845f99c8b264 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389089#comment-17389089
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678682342



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##
@@ -206,16 +206,28 @@ private static Configuration 
addProjectionField(Configuration conf, String field
 
   public static void addRequiredProjectionFields(Configuration configuration) {
 // Need this to do merge records in HoodieRealtimeRecordReader
-addProjectionField(configuration, HoodieRecord.RECORD_KEY_METADATA_FIELD, 
HoodieInputFormatUtils.HOODIE_RECORD_KEY_COL_POS);
-addProjectionField(configuration, HoodieRecord.COMMIT_TIME_METADATA_FIELD, 
HoodieInputFormatUtils.HOODIE_COMMIT_TIME_COL_POS);
-addProjectionField(configuration, 
HoodieRecord.PARTITION_PATH_METADATA_FIELD, 
HoodieInputFormatUtils.HOODIE_PARTITION_PATH_COL_POS);
+if (configuration.get(HoodieInputFormatUtils.HOODIE_USE_META_FIELDS) == 
null || configuration.get(HoodieInputFormatUtils.HOODIE_USE_META_FIELDS)
+.equals(HoodieInputFormatUtils.DEFAULT_HOODIE_USE_META_FIELDS)) {
+  addProjectionField(configuration, 
HoodieRecord.RECORD_KEY_METADATA_FIELD, 
HoodieInputFormatUtils.HOODIE_RECORD_KEY_COL_POS);
+  addProjectionField(configuration, 
HoodieRecord.COMMIT_TIME_METADATA_FIELD, 
HoodieInputFormatUtils.HOODIE_COMMIT_TIME_COL_POS);
+  addProjectionField(configuration, 
HoodieRecord.PARTITION_PATH_METADATA_FIELD, 
HoodieInputFormatUtils.HOODIE_PARTITION_PATH_COL_POS);
+} else {
+  addProjectionField(configuration, 
configuration.get(HoodieInputFormatUtils.RECORD_KEY_FIELD), 
Integer.parseInt(configuration.get(HoodieInputFormatUtils.RECORD_KEY_FIELD_INDEX)));
+  addProjectionField(configuration, 
configuration.get(HoodieInputFormatUtils.PARTITION_PATH_FIELD), 
Integer.parseInt(configuration.get(HoodieInputFormatUtils.PARTITION_PATH_FIELD_INDEX)));
+}
   }
 
   public static boolean requiredProjectionFieldsExistInConf(Configuration 
configuration) {
 String readColNames = 
configuration.get(ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR, "");
-return readColNames.contains(HoodieRecord.RECORD_KEY_METADATA_FIELD)
-&& readColNames.contains(HoodieRecord.COMMIT_TIME_METADATA_FIELD)
-&& readColNames.contains(HoodieRecord.PARTITION_PATH_METADATA_FIELD);
+if (configuration.get(HoodieInputFormatUtils.HOODIE_USE_META_FIELDS) == 
null || configuration.get(HoodieInputFormatUtils.HOODIE_USE_META_FIELDS)
+.equals(HoodieInputFormatUtils.DEFAULT_HOODIE_USE_META_FIELDS)) {
+  return readColNames.contains(HoodieRecord.RECORD_KEY_METADATA_FIELD)
+  && readColNames.contains(HoodieRecord.COMMIT_TIME_METADATA_FIELD)
+  && readColNames.contains(HoodieRecord.PARTITION_PATH_METADATA_FIELD);
+} else {
+  return readColNames.contains(HoodieInputFormatUtils.RECORD_KEY_FIELD)

Review comment:
   When meta fields are disabled, we don't project the commit time field.
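
A minimal sketch of the rule stated in this comment, under a simplified model: with meta fields enabled, the reader needs the record key, commit time, and partition path meta columns; with virtual keys it needs only the user-configured record key and partition path columns. The class `ProjectionSketch` and its parameters are hypothetical and are not part of the PR; the `_hoodie_*` literals stand in for the `HoodieRecord.*_METADATA_FIELD` constants.

```java
import java.util.Arrays;
import java.util.List;

final class ProjectionSketch {
  // Stand-ins for the HoodieRecord meta field constants referenced in the diff.
  static final String RECORD_KEY_META = "_hoodie_record_key";
  static final String COMMIT_TIME_META = "_hoodie_commit_time";
  static final String PARTITION_PATH_META = "_hoodie_partition_path";

  // Returns the columns the merge step needs for the given mode.
  static List<String> requiredFields(boolean useMetaFields,
                                     String recordKeyField,
                                     String partitionPathField) {
    if (useMetaFields) {
      return Arrays.asList(RECORD_KEY_META, COMMIT_TIME_META, PARTITION_PATH_META);
    }
    // Virtual keys: only the user columns are projected, no commit time column.
    return Arrays.asList(recordKeyField, partitionPathField);
  }
}
```

Note that the virtual-key branch compares against the configured column names themselves, not against the configuration keys that hold them.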






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389082#comment-17389082
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678658288



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java
##
@@ -302,7 +310,12 @@ private void processDataBlock(HoodieDataBlock dataBlock) 
throws Exception {
   }
 
   protected HoodieRecord createHoodieRecord(IndexedRecord rec) {
-return SpillableMapUtils.convertToHoodieRecordPayload((GenericRecord) rec, 
this.payloadClassFQN);
+if (!simpleRecordKeyFieldOpt.isPresent()) {

Review comment:
   Not sure whether I tried it here or in the keyGen-related code blocks. But what 
I found was that if 
   ```
   Option.map.orElse
   ```
   is called just once, it works smoothly. But when this piece of code gets 
called N times repeatedly, I ran into issues. 
   I will try it out again just to confirm my understanding. 
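
For reference, the two styles being weighed here can be sketched side by side. This uses `java.util.Optional` as a stand-in for Hudi's own `Option` class (an assumption; the Hudi type may differ in method names), and the `describe*` helpers and their return strings are hypothetical.

```java
import java.util.Optional;

final class OptionStyleSketch {
  // Explicit branch, in the style of the diff above.
  static String describeWithBranch(Optional<String> keyField) {
    if (!keyField.isPresent()) {
      return "meta:_hoodie_record_key";
    }
    return "virtual:" + keyField.get();
  }

  // Functional style: transform the value if present, otherwise fall back.
  static String describeWithMap(Optional<String> keyField) {
    return keyField.map(f -> "virtual:" + f).orElse("meta:_hoodie_record_key");
  }
}
```

Both forms return the same result. With `java.util.Optional`, `orElse` evaluates its argument eagerly on every call, whereas `orElseGet` takes a supplier and defers the work; that difference can matter on a per-record path.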






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389080#comment-17389080
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678660798



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/SpillableMapUtils.java
##
@@ -110,8 +110,15 @@ public static long generateChecksum(byte[] data) {
* Utility method to convert bytes to HoodieRecord using schema and payload 
class.
*/
   public static  R convertToHoodieRecordPayload(GenericRecord rec, String 
payloadClazz) {
-String recKey = rec.get(HoodieRecord.RECORD_KEY_METADATA_FIELD).toString();
-String partitionPath = 
rec.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+return convertToHoodieRecordPayload(rec, payloadClazz, 
HoodieRecord.RECORD_KEY_METADATA_FIELD, 
HoodieRecord.PARTITION_PATH_METADATA_FIELD);

Review comment:
   I had this dilemma too, and this was my reasoning: conversion of a 
GenericRecord to a HoodieRecord is confined within SpillableMapUtils, and hence I felt 
it makes sense for this class to decide whether to use 
HoodieRecord.RECORD_KEY_METADATA_FIELD or what the caller has passed in. 
   If not, HoodieRecord.RECORD_KEY_METADATA_FIELD has to be passed in by 
all N callers of this method. It felt like we would be leaking Hudi's meta field 
into other classes. 
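
The design argued for here is a delegating overload: the existing entry point keeps the meta field names private to the utility class, and a second overload lets virtual-key callers name the fields. A minimal sketch, assuming a `Map` in place of Avro's `GenericRecord` and a plain string in place of the resulting `HoodieRecord` (the class `ConvertSketch` is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

final class ConvertSketch {
  // Stand-ins for the HoodieRecord meta field constants.
  static final String RECORD_KEY_META = "_hoodie_record_key";
  static final String PARTITION_PATH_META = "_hoodie_partition_path";

  // Existing entry point: callers never see the meta field names.
  static String convert(Map<String, Object> rec) {
    return convert(rec, RECORD_KEY_META, PARTITION_PATH_META);
  }

  // Overload for virtual keys: the caller names the columns to read.
  static String convert(Map<String, Object> rec, String keyField, String partitionField) {
    return rec.get(keyField) + "@" + rec.get(partitionField);
  }
}
```

Only the utility class references the meta field constants, so existing callers compile unchanged.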
   






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389076#comment-17389076
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678658288



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java
##
@@ -302,7 +310,12 @@ private void processDataBlock(HoodieDataBlock dataBlock) 
throws Exception {
   }
 
   protected HoodieRecord createHoodieRecord(IndexedRecord rec) {
-return SpillableMapUtils.convertToHoodieRecordPayload((GenericRecord) rec, 
this.payloadClassFQN);
+if (!simpleRecordKeyFieldOpt.isPresent()) {

Review comment:
   Not sure whether I tried it here or in the keyGen-related code blocks. But what 
I found was that if 
   ```
   Option.map.orElse
   ```
   is called just once, it works smoothly. But when this piece of code gets called 
N times repeatedly, I ran into issues. 
   I will try it out again just to confirm my understanding. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389073#comment-17389073
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

vinothchandar commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678653856



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -34,49 +36,73 @@
  */
 public interface RealtimeSplit extends InputSplitWithLocationInfo {
 
+
   /**
* Return Log File Paths.
+   *
* @return
*/
   List getDeltaLogPaths();
 
   /**
* Return Max Instant Time.
+   *
* @return
*/
   String getMaxCommitTime();
 
   /**
* Return Base Path of the dataset.
+   *
* @return
*/
   String getBasePath();
 
+  /**
+   * Returns Virtual key info if meta fields are disabled.
+   * @return
+   */
+  Option getHoodieVirtualKeyInfoOpt();
+
   /**
* Update Log File Paths.
+   *
* @param deltaLogPaths
*/
   void setDeltaLogPaths(List deltaLogPaths);
 
   /**
* Update Maximum valid instant time.
+   *
* @param maxCommitTime
*/
   void setMaxCommitTime(String maxCommitTime);
 
   /**
* Set Base Path.
+   *
* @param basePath
*/
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option 
hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
 InputSplitUtils.writeString(getBasePath(), out);
 InputSplitUtils.writeString(getMaxCommitTime(), out);
 out.writeInt(getDeltaLogPaths().size());
 for (String logFilePath : getDeltaLogPaths()) {
   InputSplitUtils.writeString(logFilePath, out);
 }
+if (!getHoodieVirtualKeyInfoOpt().isPresent()) {
+  InputSplitUtils.writeString(HoodieRealtimeInputFormatUtils.NULL_STR, 
out);

Review comment:
   I'll let you make the call here. It just boils down to storage 
efficiency.
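
To make the storage trade-off concrete, here is a sketch comparing a sentinel string with a one-byte presence flag. The thread does not show which alternative was proposed, so the flag variant is an assumption, as are the length-prefixed string layout, the sentinel value `"null"`, and the class `SplitSerdeSketch`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class SplitSerdeSketch {
  // Placeholder sentinel; the real NULL_STR value is not shown in the thread.
  static final String NULL_SENTINEL = "null";

  // Length-prefixed string write (assumed layout, used only to compare sizes).
  static void writeString(String s, DataOutputStream out) throws IOException {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  // Option A: always write a string, using a sentinel when the value is absent.
  static byte[] withSentinel(Optional<String> keyField) {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      writeString(keyField.orElse(NULL_SENTINEL), out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.toByteArray();
  }

  // Option B: write a one-byte presence flag, then the value only if present.
  static byte[] withFlag(Optional<String> keyField) {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeBoolean(keyField.isPresent());
      if (keyField.isPresent()) {
        writeString(keyField.get(), out);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.toByteArray();
  }
}
```

In this sketch an absent value costs 8 bytes with the sentinel and 1 byte with the flag, while a present value costs one byte more with the flag. The flag also avoids any collision with a column whose name happens to equal the sentinel.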






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389071#comment-17389071
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

vinothchandar commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678651491



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##
@@ -87,6 +93,20 @@
   fsCache.put(metaClient, fsView);
 }
 HoodieTableFileSystemView fsView = fsCache.get(metaClient);
+// fetch virtual key params if required.

Review comment:
   I think we can just do it once. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389070#comment-17389070
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678651147



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##
@@ -324,6 +324,14 @@ public void validateTableProperties(Properties properties, 
WriteOperationType op
 && Boolean.parseBoolean((String) 
properties.getOrDefault(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(), 
HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue( {
   throw new 
HoodieException(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key() + " already 
disabled for the table. Can't be re-enabled back");
 }
+
+// meta fields can be disabled only with SimpleKeyGenerator
+if (!getTableConfig().populateMetaFields()
+&& 
!properties.getProperty(HoodieTableConfig.HOODIE_TABLE_KEY_GENERATOR_CLASS.key(),
 HoodieTableConfig.DEFAULT_HOODIE_TABLE_KEY_GENERATOR_CLASS)

Review comment:
   As I noted (and asked whether there is some other way to go about this), 
this class is not accessible from this module, so we have to hard-code the 
class name. We would get compilation errors if I tried to reference 
"org.apache.hudi.keygen.SimpleKeyGenerator.class".

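The constraint described in this comment (the key generator class lives in a module that this one cannot depend on) is commonly handled by comparing fully qualified class names as strings. A minimal sketch of that check; the property key `example.keygenerator.class` and the class `KeyGenCheckSketch` are hypothetical, and only the `SimpleKeyGenerator` class name comes from the thread.

```java
import java.util.Properties;

final class KeyGenCheckSketch {
  // Hard-coded because the class itself is not on this module's compile classpath.
  static final String SIMPLE_KEY_GENERATOR = "org.apache.hudi.keygen.SimpleKeyGenerator";
  // Hypothetical property key, used only for this illustration.
  static final String KEY_GENERATOR_PROP = "example.keygenerator.class";

  // Meta fields may be disabled only when the simple key generator is configured.
  static boolean canDisableMetaFields(Properties props) {
    String keyGen = props.getProperty(KEY_GENERATOR_PROP, SIMPLE_KEY_GENERATOR);
    return SIMPLE_KEY_GENERATOR.equals(keyGen);
  }
}
```

The cost of this approach is that a rename of the class is no longer caught by the compiler, so a test that loads the class by this name from a module that can see it is a reasonable safeguard.
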





[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389069#comment-17389069
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678649799



##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java
##
@@ -171,6 +171,33 @@ public static long countRecordsSince(JavaSparkContext jsc, 
String basePath, SQLC
 }
   }
 
+  /**
+   * Obtain all new data written into the Hoodie table since the given 
timestamp.
+   */
+  public static long countAllRecords(JavaSparkContext jsc, String basePath, 
SQLContext sqlContext,

Review comment:
   We already have similar helper methods in this class (readCommit, 
countRecordsSince); that's why I placed it here. Anyway, I have changed the existing 
method to take an optional additional argument. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389061#comment-17389061
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678643058



##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java
##
@@ -132,17 +140,27 @@ public void init(HoodieFileFormat baseFileFormat) throws 
IOException {
 
   @BeforeEach
   public void init() throws IOException {
-init(HoodieTableConfig.HOODIE_BASE_FILE_FORMAT_PROP.defaultValue());
+init(HoodieTableConfig.HOODIE_BASE_FILE_FORMAT_PROP.defaultValue(), true);
   }
 
   @AfterEach
   public void clean() throws IOException {
 cleanupResources();
   }
 
-  @Test
-  public void testSimpleInsertAndUpdate() throws Exception {
-HoodieWriteConfig cfg = getConfig(true);
+  private static Stream populateMetaFieldsParams() {

Review comment:
   This will definitely blow up the runtime if we were to add it to every 
test method. I tried to get decent coverage of operations. We can discuss how to 
go about this. Basically, this could double the runtime of every class that we wish 
to run fully against both meta fields disabled and meta fields enabled. Our CI is 
already not able to handle the load. So once we make CI stable (moving to Azure 
completely) with no limits on test runs (as of now, I guess there is a 1-hour limit 
in Azure, and with our bulk insert support for virtual keys, I guess all Azure runs 
are failing), we can revisit this and add more tests with meta fields disabled. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389055#comment-17389055
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678634080



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/KeyGenUtils.java
##
@@ -60,7 +60,7 @@ public static String 
getRecordKeyFromGenericRecord(GenericRecord genericRecord,
* @return the partition path for the passed in generic record.
*/
   public static String getPartitionPathFromGenericRecord(GenericRecord 
genericRecord, Option keyGeneratorOpt) {
-return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getRecordKey(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getPartitionPath(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();

Review comment:
   Yes, I have tests succeeding at writeClient, Spark datasource, and 
DeltaStreamer for both COW and MOR. 

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/KeyGenUtils.java
##
@@ -60,7 +60,7 @@ public static String 
getRecordKeyFromGenericRecord(GenericRecord genericRecord,
* @return the partition path for the passed in generic record.
*/
   public static String getPartitionPathFromGenericRecord(GenericRecord 
genericRecord, Option keyGeneratorOpt) {
-return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getRecordKey(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getPartitionPath(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();

Review comment:
   Yes, we have tests passing at the write client, Spark datasource, and 
DeltaStreamer levels for both COW and MOR tables. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389054#comment-17389054
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678633348



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java
##
@@ -82,6 +82,13 @@
   public static final int HOODIE_PARTITION_PATH_COL_POS = 3;
   public static final String HOODIE_READ_COLUMNS_PROP = 
"hoodie.read.columns.set";
 
+  public static final String HOODIE_USE_META_FIELDS = "hoodie.use.meta.fields";

Review comment:
   I am not sure I fully agree with that in this specific context. In 
general, I agree that using the same property name throughout makes sense, but 
here it somehow does not feel very apparent. 

   populateMetaFields clearly belongs to the write path (it conveys whether 
to populate meta fields or not). I would expect something like 
"hasMetaFields" for the read path. 
   Or we could rename the entire property to something like "enableMetaFields" 
so that it makes sense on both the write and read sides. 
   Anyway, with the latest fix we may not need these variables at all, so I 
will remove them. I just wanted to convey my opinion. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389046#comment-17389046
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678628690



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -34,49 +36,73 @@
  */
 public interface RealtimeSplit extends InputSplitWithLocationInfo {
 
+
   /**
* Return Log File Paths.
+   *
* @return
*/
   List<String> getDeltaLogPaths();
 
   /**
* Return Max Instant Time.
+   *
* @return
*/
   String getMaxCommitTime();
 
   /**
* Return Base Path of the dataset.
+   *
* @return
*/
   String getBasePath();
 
+  /**
+   * Returns Virtual key info if meta fields are disabled.
+   * @return
+   */
+  Option<HoodieVirtualKeyInfo> getHoodieVirtualKeyInfoOpt();
+
   /**
* Update Log File Paths.
+   *
* @param deltaLogPaths
*/
   void setDeltaLogPaths(List<String> deltaLogPaths);
 
   /**
* Update Maximum valid instant time.
+   *
* @param maxCommitTime
*/
   void setMaxCommitTime(String maxCommitTime);
 
   /**
* Set Base Path.
+   *
* @param basePath
*/
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option<HoodieVirtualKeyInfo> 
hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
 InputSplitUtils.writeString(getBasePath(), out);
 InputSplitUtils.writeString(getMaxCommitTime(), out);
 out.writeInt(getDeltaLogPaths().size());
 for (String logFilePath : getDeltaLogPaths()) {
   InputSplitUtils.writeString(logFilePath, out);
 }
+if (!getHoodieVirtualKeyInfoOpt().isPresent()) {
+  InputSplitUtils.writeString(HoodieRealtimeInputFormatUtils.NULL_STR, 
out);

Review comment:
   @vinothchandar: also, is there a more elegant way to do this? I could use 
a boolean, but I am not sure that has any major benefit over this. 
InputSplitUtils only has APIs for strings, so I chose this approach. But I am 
open to writing a 1-byte flag and decoding it. My understanding is that these 
are not stored on disk, so we need not worry about storage.
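
   For illustration only, the boolean alternative mentioned above could look 
roughly like the sketch below. It is not the PR's code: the names 
PresenceFlagCodec, writeOptional, readOptional, and roundTrip are 
hypothetical, and it uses java.util.Optional with plain DataOutput/DataInput 
rather than Hudi's Option and InputSplitUtils. 

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: encodes an optional string as a
// one-byte presence flag, followed by the value only when it is present.
final class PresenceFlagCodec {

  static void writeOptional(Optional<String> value, DataOutput out) throws IOException {
    out.writeBoolean(value.isPresent()); // single byte: 1 = present, 0 = absent
    if (value.isPresent()) {
      out.writeUTF(value.get());
    }
  }

  static Optional<String> readOptional(DataInput in) throws IOException {
    // Read the flag first; only read a value if the writer emitted one.
    return in.readBoolean() ? Optional.of(in.readUTF()) : Optional.empty();
  }

  // Round-trips a value through an in-memory buffer.
  static Optional<String> roundTrip(Optional<String> value) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      writeOptional(value, new DataOutputStream(buffer));
      return readOptional(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(Optional.of("record_key")));
    System.out.println(roundTrip(Optional.empty()));
  }
}
```

   One possible benefit of a flag over a sentinel string: a real value can 
never be mistaken for the "absent" marker. 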






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389044#comment-17389044
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678628690



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -34,49 +36,73 @@
  */
 public interface RealtimeSplit extends InputSplitWithLocationInfo {
 
+
   /**
* Return Log File Paths.
+   *
* @return
*/
   List<String> getDeltaLogPaths();
 
   /**
* Return Max Instant Time.
+   *
* @return
*/
   String getMaxCommitTime();
 
   /**
* Return Base Path of the dataset.
+   *
* @return
*/
   String getBasePath();
 
+  /**
+   * Returns Virtual key info if meta fields are disabled.
+   * @return
+   */
+  Option<HoodieVirtualKeyInfo> getHoodieVirtualKeyInfoOpt();
+
   /**
* Update Log File Paths.
+   *
* @param deltaLogPaths
*/
   void setDeltaLogPaths(List<String> deltaLogPaths);
 
   /**
* Update Maximum valid instant time.
+   *
* @param maxCommitTime
*/
   void setMaxCommitTime(String maxCommitTime);
 
   /**
* Set Base Path.
+   *
* @param basePath
*/
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option<HoodieVirtualKeyInfo> 
hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
 InputSplitUtils.writeString(getBasePath(), out);
 InputSplitUtils.writeString(getMaxCommitTime(), out);
 out.writeInt(getDeltaLogPaths().size());
 for (String logFilePath : getDeltaLogPaths()) {
   InputSplitUtils.writeString(logFilePath, out);
 }
+if (!getHoodieVirtualKeyInfoOpt().isPresent()) {
+  InputSplitUtils.writeString(HoodieRealtimeInputFormatUtils.NULL_STR, 
out);

Review comment:
   @vinothchandar: also, is there a more elegant way to do this? I could use 
a boolean, but I am not sure that has any major benefit over this. 
InputSplitUtils only has APIs for strings, so I chose this approach. But I am 
open to writing a 1-byte flag and decoding it. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389043#comment-17389043
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678628690



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeSplit.java
##
@@ -34,49 +36,73 @@
  */
 public interface RealtimeSplit extends InputSplitWithLocationInfo {
 
+
   /**
* Return Log File Paths.
+   *
* @return
*/
   List<String> getDeltaLogPaths();
 
   /**
* Return Max Instant Time.
+   *
* @return
*/
   String getMaxCommitTime();
 
   /**
* Return Base Path of the dataset.
+   *
* @return
*/
   String getBasePath();
 
+  /**
+   * Returns Virtual key info if meta fields are disabled.
+   * @return
+   */
+  Option<HoodieVirtualKeyInfo> getHoodieVirtualKeyInfoOpt();
+
   /**
* Update Log File Paths.
+   *
* @param deltaLogPaths
*/
   void setDeltaLogPaths(List<String> deltaLogPaths);
 
   /**
* Update Maximum valid instant time.
+   *
* @param maxCommitTime
*/
   void setMaxCommitTime(String maxCommitTime);
 
   /**
* Set Base Path.
+   *
* @param basePath
*/
   void setBasePath(String basePath);
 
+  void setHoodieVirtualKeyInfoOpt(Option<HoodieVirtualKeyInfo> 
hoodieVirtualKeyInfoOpt);
+
   default void writeToOutput(DataOutput out) throws IOException {
 InputSplitUtils.writeString(getBasePath(), out);
 InputSplitUtils.writeString(getMaxCommitTime(), out);
 out.writeInt(getDeltaLogPaths().size());
 for (String logFilePath : getDeltaLogPaths()) {
   InputSplitUtils.writeString(logFilePath, out);
 }
+if (!getHoodieVirtualKeyInfoOpt().isPresent()) {
+  InputSplitUtils.writeString(HoodieRealtimeInputFormatUtils.NULL_STR, 
out);

Review comment:
   @vinothchandar: also, is there a more elegant way to do this? I could use 
a boolean, but I am not sure that has any major benefit over this. 
InputSplitUtils has APIs for strings, so I chose this approach. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389042#comment-17389042
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b624c142628ec6b780bc124dbd5c9d0a9d5e1fdc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1225)
 
   
   
   




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389039#comment-17389039
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * f243dba0eb83068ee8ec39f1e3f703c4d20ee39e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1210)
 
   * b624c142628ec6b780bc124dbd5c9d0a9d5e1fdc UNKNOWN
   
   
   




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389038#comment-17389038
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r678625840



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##
@@ -87,6 +93,20 @@
   fsCache.put(metaClient, fsView);
 }
 HoodieTableFileSystemView fsView = fsCache.get(metaClient);
+// fetch virtual key params if required.

Review comment:
   @vinothchandar: quick question. Here we have N metaClients, one per 
partition path (line 86). 
   Is it safe to assume that all partitions have the same schema? If so, 
should I move this outside the foreach, pick the first entry, and use its 
tableConfig to populate hoodieVirtualKeyInfo? Otherwise, we fetch the schema 
for every partition to find the hoodieVirtualKeyInfo.
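
   As a rough sketch of one alternative that does not depend on that 
assumption, the lookup could be memoized per table base path, so the schema 
is fetched once per table rather than once per partition. Everything below is 
hypothetical (PerTableLookupCache, the String stand-in for 
HoodieVirtualKeyInfo, the loader function) and is not taken from the PR. 

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache, not part of Hudi: computes a per-table value at most
// once per base path, however many partitions map to that table.
final class PerTableLookupCache {
  private final Map<String, String> cache = new HashMap<>();
  private int loads = 0;

  // Returns the cached value for basePath, invoking loader only on a miss.
  String get(String basePath, Function<String, String> loader) {
    return cache.computeIfAbsent(basePath, path -> {
      loads++; // counts how many times the expensive lookup actually ran
      return loader.apply(path);
    });
  }

  int loadCount() {
    return loads;
  }

  // Simulates iterating over partitions and reports how many loads ran.
  static int loadsFor(List<String> basePaths) {
    PerTableLookupCache cache = new PerTableLookupCache();
    for (String basePath : basePaths) {
      cache.get(basePath, path -> "virtual-key-info-for-" + path);
    }
    return cache.loadCount();
  }
}
```

   With this shape, partitions that share a base path trigger a single load, 
while partitions from different tables are still resolved separately. 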
   
   
   
   






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388476#comment-17388476
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r677990702



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimeFileSplit.java
##
@@ -30,22 +32,29 @@
  */
 public class HoodieRealtimeFileSplit extends FileSplit implements 
RealtimeSplit {
 
+  private static final Logger LOG = 
LogManager.getLogger(HoodieRealtimeFileSplit.class);
+
   private List<String> deltaLogPaths;
 
   private String maxCommitTime;
 
   private String basePath;
 
+  private HoodieVirtualKeyInfo virtualKeyInfoOpt = null;

Review comment:
   I initially had this as Option<HoodieVirtualKeyInfo>, but I was not sure 
whether Option was causing any issues, so I removed it to try without it.  






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388475#comment-17388475
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * f243dba0eb83068ee8ec39f1e3f703c4d20ee39e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1210)
 
   
   
   




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388473#comment-17388473
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c20ec0e011d3f7537e95c3b02dc2de2c7d58738a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1202)
 
   * f243dba0eb83068ee8ec39f1e3f703c4d20ee39e UNKNOWN
   
   
   




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388371#comment-17388371
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

vinothchandar commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r677886161



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/KeyGenUtils.java
##
@@ -60,7 +60,7 @@ public static String 
getRecordKeyFromGenericRecord(GenericRecord genericRecord,
* @return the partition path for the passed in generic record.
*/
   public static String getPartitionPathFromGenericRecord(GenericRecord 
genericRecord, Option keyGeneratorOpt) {
-return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getRecordKey(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getPartitionPath(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();

Review comment:
   @nsivabalan Got it. Can you double-check this across the board to make 
sure there are no other typos? 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388369#comment-17388369
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

vinothchandar commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r677872797



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/KeyGenUtils.java
##
@@ -60,7 +60,7 @@ public static String 
getRecordKeyFromGenericRecord(GenericRecord genericRecord,
* @return the partition path for the passed in generic record.
*/
   public static String getPartitionPathFromGenericRecord(GenericRecord 
genericRecord, Option keyGeneratorOpt) {
-return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getRecordKey(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();

Review comment:
   oops

##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/cluster/SparkExecuteClusteringCommitActionExecutor.java
##
@@ -210,8 +211,11 @@ protected String getCommitActionType() {
   .build();
 
   
recordIterators.add(HoodieFileSliceReader.getFileSliceReader(baseFileReader, 
scanner, readerSchema,
-  table.getMetaClient().getTableConfig().getPayloadClass()));
+  table.getMetaClient().getTableConfig().getPayloadClass(),
+  table.getMetaClient().getTableConfig().populateMetaFields() ? 
Option.empty() : 
Option.of(table.getMetaClient().getTableConfig().getRecordKeyFieldProp()),
+  table.getMetaClient().getTableConfig().populateMetaFields() ? 
Option.empty() : 
Option.of(table.getMetaClient().getTableConfig().getPartitionFieldProp())));

Review comment:
   I feel that if we had an `Option.ofBoolean()` that returns an Option.empty 
or a value to map further, this would read much nicer. 
   
   
`Option.ofBoolean(tableConfig.populateMetaFields).map(tableConfig::getPartitionFieldProp)`
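
   A minimal sketch of what such a helper might look like, using 
java.util.Optional as a stand-in for Hudi's Option. The names OptionalSketch, 
ofBoolean, and partitionField are hypothetical. Note that the diff above 
returns an empty Option when meta fields are populated, so this sketch 
negates the flag before mapping. 

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch, not Hudi's Option API.
final class OptionalSketch {

  // Present (holding TRUE) only when the flag is true, otherwise empty.
  static Optional<Boolean> ofBoolean(boolean flag) {
    return flag ? Optional.of(Boolean.TRUE) : Optional.empty();
  }

  // Mirrors the ternary in the diff: empty when meta fields are populated,
  // otherwise the configured partition field.
  static Optional<String> partitionField(boolean populateMetaFields,
                                         Supplier<String> partitionFieldProp) {
    return ofBoolean(!populateMetaFields).map(ignored -> partitionFieldProp.get());
  }
}
```

   The supplier is only invoked when the Optional is present, so the table 
config would not be read when meta fields are populated. 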
 

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/SpillableMapUtils.java
##
@@ -110,8 +110,15 @@ public static long generateChecksum(byte[] data) {
* Utility method to convert bytes to HoodieRecord using schema and payload 
class.
*/
   public static <R> R convertToHoodieRecordPayload(GenericRecord rec, String 
payloadClazz) {
-String recKey = rec.get(HoodieRecord.RECORD_KEY_METADATA_FIELD).toString();
-String partitionPath = 
rec.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+return convertToHoodieRecordPayload(rec, payloadClazz, 
HoodieRecord.RECORD_KEY_METADATA_FIELD, 
HoodieRecord.PARTITION_PATH_METADATA_FIELD);

Review comment:
   Can't you pass these as defaults at a higher layer? Then there will be 
only one method here, which takes 4 args. 

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##
@@ -324,6 +324,14 @@ public void validateTableProperties(Properties properties, 
WriteOperationType op
 && Boolean.parseBoolean((String) 
properties.getOrDefault(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(), 
HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue()))) {
   throw new 
HoodieException(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key() + " already 
disabled for the table. Can't be re-enabled back");
 }
+
+// meta fields can be disabled only with SimpleKeyGenerator
+if (!getTableConfig().populateMetaFields()
+&& 
!properties.getProperty(HoodieTableConfig.HOODIE_TABLE_KEY_GENERATOR_CLASS.key(),
 HoodieTableConfig.DEFAULT_HOODIE_TABLE_KEY_GENERATOR_CLASS)

Review comment:
   Just always use `org.apache.hudi.keygen.SimpleKeyGenerator.class.getName` 
or something similar, as opposed to using strings to hold the class name. The 
advantage is that the IDE can find these usages easily.
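
   A small sketch of the idea, with a nested stand-in class instead of the 
real org.apache.hudi.keygen.SimpleKeyGenerator. The names KeyGeneratorNames, 
SIMPLE_KEYGEN_CLASS, and isSimpleKeyGenerator are hypothetical. 

```java
// Hypothetical sketch: derive the class name from a class literal so that
// renames and "find usages" in the IDE keep working.
final class KeyGeneratorNames {

  // Stand-in for org.apache.hudi.keygen.SimpleKeyGenerator.
  static final class SimpleKeyGenerator { }

  // Derived from the class literal instead of a hand-typed string.
  static final String SIMPLE_KEYGEN_CLASS = SimpleKeyGenerator.class.getName();

  static boolean isSimpleKeyGenerator(String configuredClassName) {
    return SIMPLE_KEYGEN_CLASS.equals(configuredClassName);
  }
}
```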

##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java
##
@@ -82,6 +82,13 @@
   public static final int HOODIE_PARTITION_PATH_COL_POS = 3;
   public static final String HOODIE_READ_COLUMNS_PROP = 
"hoodie.read.columns.set";
 
+  public static final String HOODIE_USE_META_FIELDS = "hoodie.use.meta.fields";

Review comment:
   But it's a table-level property, and the query side understands that it is 
about how the data is stored, not just about how it is written. I would advise 
against having a different name for the same thing. It leads to more harm than 
good over the long run.

##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java
##
@@ -232,21 +253,30 @@ public void testSimpleInsertAndUpdateHFile() throws 
Exception {
 }
   }
 
-  @Test
-  public void testSimpleClusteringNoUpdates() throws Exception {
-testClustering(false);
+  @ParameterizedTest
+  

[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388368#comment-17388368
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r677883351



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/KeyGenUtils.java
##
@@ -60,7 +60,7 @@ public static String 
getRecordKeyFromGenericRecord(GenericRecord genericRecord,
* @return the partition path for the passed in generic record.
*/
   public static String getPartitionPathFromGenericRecord(GenericRecord 
genericRecord, Option keyGeneratorOpt) {
-return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getRecordKey(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+return keyGeneratorOpt.isPresent() ? 
keyGeneratorOpt.get().getPartitionPath(genericRecord) : 
genericRecord.get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();

Review comment:
   This was never invoked for COW tables, and hence we didn't hit this issue. 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388285#comment-17388285
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r677780865



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java
##
@@ -82,6 +82,13 @@
   public static final int HOODIE_PARTITION_PATH_COL_POS = 3;
   public static final String HOODIE_READ_COLUMNS_PROP = 
"hoodie.read.columns.set";
 
+  public static final String HOODIE_USE_META_FIELDS = "hoodie.use.meta.fields";

Review comment:
   Rationale for using a different name from the one in table props: this is 
just a query-side variable used only in the realtime path. 
"UseMetaFields" makes more sense from a query standpoint than 
"hoodiePopulateMetaFields". 






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388226#comment-17388226
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * c20ec0e011d3f7537e95c3b02dc2de2c7d58738a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1202)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388190#comment-17388190
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * ffc86fc0ea2e5eefde7a975d5d74e3ae97858896 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1122)
 
   * c20ec0e011d3f7537e95c3b02dc2de2c7d58738a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1202)
 
   
   


> Virtual keys support for Compaction
> ---
>
> Key: HUDI-2177
> URL: https://issues.apache.org/jira/browse/HUDI-2177
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Virtual keys support for Compaction





[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388188#comment-17388188
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * ffc86fc0ea2e5eefde7a975d5d74e3ae97858896 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1122)
 
   * c20ec0e011d3f7537e95c3b02dc2de2c7d58738a UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386091#comment-17386091
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * ffc86fc0ea2e5eefde7a975d5d74e3ae97858896 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1122)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386059#comment-17386059
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * f01f1ba13d909b1e3bf1300068e410208232c2ea Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1119)
 
   * ffc86fc0ea2e5eefde7a975d5d74e3ae97858896 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1122)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386031#comment-17386031
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * f01f1ba13d909b1e3bf1300068e410208232c2ea Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1119)
 
   * ffc86fc0ea2e5eefde7a975d5d74e3ae97858896 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386026#comment-17386026
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 8e21760032ad596c59cc95eebc6c1991a571b488 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1085)
 
   * f01f1ba13d909b1e3bf1300068e410208232c2ea Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1119)
 
   * ffc86fc0ea2e5eefde7a975d5d74e3ae97858896 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386023#comment-17386023
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-885443569


   @vinothchandar : this patch is ready for review.




[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386010#comment-17386010
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r675342299



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java
##
@@ -80,6 +80,10 @@
   private final HoodieTableMetaClient hoodieTableMetaClient;
   // Merge strategy to use when combining records from log
   private final String payloadClassFQN;
+  // simple recordKey field
+  private Option simpleRecordKeyFieldOpt = Option.empty();

Review comment:
   On the query side, I am naming all variables with the assumption that if 
populateMetaFields is set to false, then the key gen will be simpleKeyGen, as we 
would otherwise have failed the operation during the write itself. 
   Whereas on the write path, we serialize all key gen properties, and hence 
the recordKey and partition path properties are not prefixed with "simple". 
   

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java
##
@@ -36,11 +38,14 @@
   private Iterator> 
recordsIterator;
 
   public static  
HoodieFileSliceReader getFileSliceReader(
-  HoodieFileReader baseFileReader, HoodieMergedLogRecordScanner 
scanner, Schema schema, String payloadClass) throws IOException {
+  HoodieFileReader baseFileReader, HoodieMergedLogRecordScanner 
scanner, Schema schema, String payloadClass,
+  Option simpleRecordKeyFieldOpt, Option 
simplePartitionPathFieldOpt) throws IOException {

Review comment:
   Initially I had 3 args: populateMetaFields, simpleRecordKey and 
simplePartitionPath. Based on feedback on the COW patch, I changed it to this. 
   But just for this class, I feel we could go with 3 args and then, in line 46, 
call either method depending on populateMetaFields. This is mainly because the 
caller of this method wraps the values in an Option before passing them in, but 
then in line 48 we unwrap the Option again and pass the argument on. So why not 
keep it simple? I am fine either way; let me know your thoughts.
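
The two signature shapes weighed in the comment above can be compared with a small, self-contained sketch. This is only an illustration, not the `HoodieFileSliceReader` code from the PR: the class name, the `Map`-based record, and the use of `java.util.Optional` in place of Hudi's `Option` are all assumptions made so the example runs on its own. `_hoodie_record_key` is the name of Hudi's record key meta column.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not the HoodieFileSliceReader code from the PR.
class KeyLookupSketch {
    // Name of Hudi's record key meta column.
    static final String RECORD_KEY_META_FIELD = "_hoodie_record_key";

    // Shape 1: an explicit flag plus a plain field name (the "3 args" idea,
    // reduced here to the record key only).
    static String recordKey(Map<String, String> record,
                            boolean populateMetaFields,
                            String simpleRecordKeyField) {
        return populateMetaFields
            ? record.get(RECORD_KEY_META_FIELD)
            : record.get(simpleRecordKeyField);
    }

    // Shape 2: an Optional field name; empty means the meta fields are populated.
    static String recordKey(Map<String, String> record,
                            Optional<String> simpleRecordKeyFieldOpt) {
        return simpleRecordKeyFieldOpt.isPresent()
            ? record.get(simpleRecordKeyFieldOpt.get())
            : record.get(RECORD_KEY_META_FIELD);
    }

    public static void main(String[] args) {
        Map<String, String> record = new HashMap<>();
        record.put(RECORD_KEY_META_FIELD, "key-from-meta");
        record.put("uuid", "key-from-data");
        System.out.println(recordKey(record, true, "uuid"));
        System.out.println(recordKey(record, Optional.of("uuid")));
    }
}
```

With the flag-based shape the caller never wraps and unwraps a value; with the `Optional` shape the "meta fields disabled" state and the field name travel together and cannot disagree.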

##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -128,8 +129,11 @@ object HoodieSparkSqlWriter {
   .setPayloadClassName(hoodieConfig.getString(PAYLOAD_CLASS_OPT_KEY))
   
.setPreCombineField(hoodieConfig.getStringOrDefault(PRECOMBINE_FIELD_OPT_KEY, 
null))
   .setPartitionColumns(partitionColumns)
-  
.setPopulateMetaFields(parameters.getOrElse(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(),
 HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue()).toBoolean)
+  .setPopulateMetaFields(populateMetaFields)
+  .setRecordKeyFields(hoodieConfig.getString(RECORDKEY_FIELD_OPT_KEY))
+  
.setPartitionColumns(hoodieConfig.getString(PARTITIONPATH_FIELD_OPT_KEY))

Review comment:
   So, we are adding the record key, partition path and key gen class props 
for all tables at the Spark datasource layer. 

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##
@@ -324,6 +324,13 @@ public void validateTableProperties(Properties properties, 
WriteOperationType op
 && Boolean.parseBoolean((String) 
properties.getOrDefault(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(), 
HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue( {
   throw new 
HoodieException(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key() + " already 
disabled for the table. Can't be re-enabled back");
 }
+
+// meta fields can be disabled only with SimpleKeyGenerator
+if (!getTableConfig().populateMetaFields()
+&& 
!properties.getProperty(HoodieTableConfig.HOODIE_TABLE_KEY_GENERATOR_CLASS.key()).equals("org.apache.hudi.keygen.SimpleKeyGenerator"))
 {

Review comment:
   The SimpleKeyGenerator class is not reachable from here, so I have 
hard-coded the class name. Let me know if there is some other way to avoid 
hard-coding it. 
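
One way to keep such a check contained is to hold the fully qualified name in a single constant and compare with the constant on the left, so that a missing property yields `false` rather than a `NullPointerException`. The sketch below is a hedged illustration under that assumption, not the code in the PR: the class, the method, and the `hoodie.table.keygenerator.class` property key are assumed names used only for this example.

```java
import java.util.Properties;

// Hypothetical illustration only; apart from the SimpleKeyGenerator class name,
// every identifier here is an assumption.
class KeyGenValidationSketch {
    // Kept as a string because the class itself is not visible from this module.
    static final String SIMPLE_KEY_GEN_FQN = "org.apache.hudi.keygen.SimpleKeyGenerator";
    // Assumed property key, used here for illustration.
    static final String KEY_GEN_CLASS_PROP = "hoodie.table.keygenerator.class";

    // Meta fields may be disabled only when the simple key generator is configured.
    static boolean isValid(boolean populateMetaFields, Properties props) {
        if (populateMetaFields) {
            return true;
        }
        // Constant-first equals: a missing property returns false instead of throwing.
        return SIMPLE_KEY_GEN_FQN.equals(props.getProperty(KEY_GEN_CLASS_PROP));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(isValid(false, props));
        props.setProperty(KEY_GEN_CLASS_PROP, SIMPLE_KEY_GEN_FQN);
        System.out.println(isValid(false, props));
    }
}
```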





[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386002#comment-17386002
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 8e21760032ad596c59cc95eebc6c1991a571b488 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1085)
 
   * f01f1ba13d909b1e3bf1300068e410208232c2ea Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1119)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385998#comment-17385998
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 8e21760032ad596c59cc95eebc6c1991a571b488 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1085)
 
   * f01f1ba13d909b1e3bf1300068e410208232c2ea UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385170#comment-17385170
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * 8e21760032ad596c59cc95eebc6c1991a571b488 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1085)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385167#comment-17385167
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

codecov-commenter edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883856272


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3315](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (8e21760) into 
[master](https://codecov.io/gh/apache/hudi/commit/a086d255c89d12eb42cad8c5ae0e000f3b83bbe6?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a086d25) will **decrease** coverage by `44.91%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3315/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff            @@
   ##          master    #3315      +/-   ##
   
   - Coverage   47.74%    2.82%   -44.92%
   + Complexity   5591       85     -5506
   
     Files         938      280      -658
     Lines       41823    11862    -29961
     Branches     4213      989     -3224
   
   - Hits        19968      335    -19633
   + Misses      20070    11501     -8569
   + Partials     1785       26     -1759
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.56%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-51.10%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `8.97% <ø> (-50.91%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `0.00% <0.00%> (-43.38%)` | :arrow_down: |
   | 
[...in/java/org/apache/hudi/io/HoodieAppendHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZUFwcGVuZEhhbmRsZS5qYXZh)
 | `0.00% <0.00%> (ø)` | |
   | 
[...g/apache/hudi/io/HoodieKeyLocationFetchHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZUtleUxvY2F0aW9uRmV0Y2hIYW5kbGUuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...ain/java/org/apache/hudi/io/HoodieMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZU1lcmdlSGFuZGxlLmphdmE=)
 | `0.00% <0.00%> (ø)` | |
   | 
[...va/org/apache/hudi/io/HoodieSortedMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZVNvcnRlZE1lcmdlSGFuZGxlLmphdmE=)
 | `0.00% <0.00%> (ø)` | |
   | 

[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385160#comment-17385160
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b4b137968fc3ceed09408fdef934bba73764c5e5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1064)
 
   * 8e21760032ad596c59cc95eebc6c1991a571b488 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1085)
 
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385157#comment-17385157
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b4b137968fc3ceed09408fdef934bba73764c5e5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1064)
 
   * 8e21760032ad596c59cc95eebc6c1991a571b488 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385155#comment-17385155
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r674408672



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##
@@ -608,6 +608,14 @@ public boolean populateMetaFields() {
 HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue()));
   }
 
+  public String getSimpleRecordKeyField() {
+return getString(HoodieTableConfig.HOODIE_TABLE_SIMPLE_RECORDKEY_FIELD);

Review comment:
   I do see that we already have two configs in table props to hold the record 
key and partition path from the SQL layer. But I am not sure whether they work 
across all layers (data source, write client, etc.). In other words, I am not 
sure I can reuse the same variables to store record keys and partition paths. 
   Open to discussing and deciding how to go about this. 
   

##
File path: hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java
##
@@ -182,6 +183,13 @@ public static HoodieWriteConfig createHoodieConfig(String schemaStr, String base
       builder = builder.withSchema(schemaStr);
     }
 
+    boolean isSimpleKeyGen = (parameters.getOrDefault(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY().key(), DataSourceWriteOptions.DEFAULT_KEYGENERATOR_CLASS_OPT_VAL()))
+        .equals(SimpleKeyGenerator.class.getName());
+    if (isSimpleKeyGen) {

Review comment:
   I chose to serialize the props only when the key gen is of type simple. Once we add a validation rule that meta client can be enabled only with the simple key gen, we can serialize these props only when meta fields are disabled, and we won't really need to check for the simple key gen.
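
   The two conditions discussed in this comment can be sketched roughly as below. This is a hypothetical illustration, not the code from the pull request: the class `KeyFieldSerializationSketch`, both method names, and the option and class-name strings are placeholders. It also assumes the validation rule means that disabling meta fields is allowed only with the simple key generator, and that the simple key generator is the default.

```java
import java.util.Map;

/** Hypothetical sketch; names and option keys are placeholders, not Hudi APIs. */
final class KeyFieldSerializationSketch {
  // Placeholder option key and class names, assumed for illustration only.
  static final String KEYGEN_CLASS_OPT = "example.keygenerator.class";
  static final String SIMPLE_KEYGEN = "example.SimpleKeyGenerator";

  /** Current approach: serialize the key fields only when the simple key gen is in use. */
  static boolean serializeToday(Map<String, String> parameters) {
    // Assumption: the simple key generator is the default when nothing is configured.
    return SIMPLE_KEYGEN.equals(parameters.getOrDefault(KEYGEN_CLASS_OPT, SIMPLE_KEYGEN));
  }

  /**
   * Approach after the validation rule exists: reject unsupported combinations up front,
   * then decide purely on whether meta fields are disabled.
   */
  static boolean serializeAfterValidation(Map<String, String> parameters, boolean populateMetaFields) {
    if (!populateMetaFields && !serializeToday(parameters)) {
      throw new IllegalArgumentException("Disabling meta fields requires the simple key generator");
    }
    return !populateMetaFields;
  }
}
```

   With the validation in place, the key-generator check lives in one spot and the serialization decision depends only on the meta-fields flag.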






[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384645#comment-17384645
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b4b137968fc3ceed09408fdef934bba73764c5e5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1064)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384621#comment-17384621
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

codecov-commenter commented on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883856272


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
   > Merging [#3315](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b4b1379) into [master](https://codecov.io/gh/apache/hudi/commit/a086d255c89d12eb42cad8c5ae0e000f3b83bbe6?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (a086d25) will **decrease** coverage by `44.91%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3315/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3315       +/-   ##
   
   - Coverage     47.74%   2.82%   -44.92%
   + Complexity     5591      85     -5506
   
     Files           938     280      -658
     Lines         41823   11862    -29961
     Branches       4213     989     -3224
   
   - Hits          19968     335    -19633
   + Misses        20070   11501     -8569
   + Partials       1785      26     -1759
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.56%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-51.10%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `8.97% <ø> (-50.91%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3315?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-43.38%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/io/HoodieAppendHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZUFwcGVuZEhhbmRsZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/io/HoodieKeyLocationFetchHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZUtleUxvY2F0aW9uRmV0Y2hIYW5kbGUuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ain/java/org/apache/hudi/io/HoodieMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZU1lcmdlSGFuZGxlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...va/org/apache/hudi/io/HoodieSortedMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3315/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZVNvcnRlZE1lcmdlSGFuZGxlLmphdmE=) | `0.00% <0.00%> (ø)` | |

[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384617#comment-17384617
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot edited a comment on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b4b137968fc3ceed09408fdef934bba73764c5e5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1064)
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384615#comment-17384615
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

hudi-bot commented on pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#issuecomment-883851530


   
   ## CI report:
   
   * b4b137968fc3ceed09408fdef934bba73764c5e5 UNKNOWN
   
   


[jira] [Commented] (HUDI-2177) Virtual keys support for Compaction

2021-07-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384611#comment-17384611
 ] 

ASF GitHub Bot commented on HUDI-2177:
--

nsivabalan opened a new pull request #3315:
URL: https://github.com/apache/hudi/pull/3315


   ## What is the purpose of the pull request
   
   - Adding virtual keys support to MOR table
     - Compaction
     - Realtime read
     - Clustering
   
   Constraints:
   Only SimpleKeyGen is supported because, during a real-time read (snapshot read), we can't afford to generate keys using complex key gens. Query times would shoot up, making it unusable from a user standpoint.
   
   ## Brief change log
   
   - Introduced 2 additional configs in HoodieTableConfig to serialize the simple record key field and simple partition field for the table.
   - Added virtual keys support to MOR tables for compaction, clustering and realtime read.
   - Ensured the metadata table works with virtual-keyed MOR tables.
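
   As a rough illustration of what the two serialized field names are for, the sketch below resolves a record key and partition path either from Hudi's meta columns (`_hoodie_record_key`, `_hoodie_partition_path`) or, when meta fields are not populated, directly from the configured data columns. It is a hypothetical sketch under those assumptions, not code from this PR: the class `VirtualKeyLookupSketch`, its methods, and the map-based record are stand-ins for illustration.

```java
import java.util.Map;

/** Hypothetical sketch, not code from this PR: resolving keys with and without meta fields. */
final class VirtualKeyLookupSketch {
  // Hudi's meta columns for the record key and partition path.
  static final String RECORD_KEY_META_FIELD = "_hoodie_record_key";
  static final String PARTITION_PATH_META_FIELD = "_hoodie_partition_path";

  /** Reads the record key from the meta column, or from the configured data column for virtual keys. */
  static String recordKey(Map<String, Object> record, boolean populateMetaFields, String simpleRecordKeyField) {
    return read(record, populateMetaFields ? RECORD_KEY_META_FIELD : simpleRecordKeyField);
  }

  /** Reads the partition path from the meta column, or from the configured data column for virtual keys. */
  static String partitionPath(Map<String, Object> record, boolean populateMetaFields, String simplePartitionField) {
    return read(record, populateMetaFields ? PARTITION_PATH_META_FIELD : simplePartitionField);
  }

  // A single field lookup is cheap, which is the point of restricting virtual keys to simple key gens.
  private static String read(Map<String, Object> record, String field) {
    Object value = record.get(field);
    if (value == null) {
      throw new IllegalStateException("Record has no value for field: " + field);
    }
    return value.toString();
  }
}
```

   For example, with a record containing `uuid` and `region` columns, `recordKey(record, false, "uuid")` returns the value of `uuid` without needing any stored meta column.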
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This change added tests and can be verified as follows:
   
   - Fixed TestHoodieMergeOnReadTable and TestHoodieBackedMetadata for virtual 
keys with MOR table.
   
   ## Committer checklist
   
   - [ ] Has a corresponding JIRA in PR title & commit
   - [ ] Commit message is descriptive of the change
   - [ ] CI is green
   - [ ] Necessary doc changes done or have another open PR
   - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.




> Virtual keys support for Compaction
> ---
>
> Key: HUDI-2177
> URL: https://issues.apache.org/jira/browse/HUDI-2177
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: sivabalan narayanan
>Priority: Major
> Fix For: 0.9.0
>
>
> Virtual keys support for Compaction


