[GitHub] [hudi] garyli1019 commented on a change in pull request #2616: [HUDI-1625] Support range partition keygen

2021-06-05 Thread GitBox


garyli1019 commented on a change in pull request #2616:
URL: https://github.com/apache/hudi/pull/2616#discussion_r646079155



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/RangePartitionAvroKeyGenerator.java
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen;
+
+import org.apache.hudi.avro.HoodieAvroUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.exception.HoodieNotSupportedException;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import org.apache.avro.generic.GenericRecord;
+
+public class RangePartitionAvroKeyGenerator extends SimpleAvroKeyGenerator {
+
+  private final Long rangePerBucket;
+  private final String partitionName;
+
+  public static class Config {
+    public static final String RANGE_PER_PARTITION_PROP = "hoodie.keygen.range.partition.num";
+    public static final Long DEFAULT_RANGE_PER_PARTITION = 10L;
+    public static final String RANGE_PARTITION_NAME_PROP = "hoodie.keygen.range.partition.name";
+    public static final String DEFAULT_RANGE_PARTITION_NAME = "rangePartition";

Review comment:
   For CDC data that has an incrementing primary key and only a few updates, 
this could speed up ingestion, but it does nothing on the query side. It would 
also be possible to implement a range query optimizer that uses the range 
information for data skipping.
   I am not sure whether we should move forward with this patch or aim for a 
bigger picture with a comprehensive range partition design. cc: 
@nsivabalan 
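
   To make the idea under discussion concrete, here is a minimal sketch of how 
a numeric key could be mapped to a range partition. It is not the code from 
this PR (the diff above is truncated before the logic); the class name 
`RangePartitionSketch` and both helper methods are hypothetical, and integer 
division is only an assumption about how `rangePerBucket` might be applied.

```java
public class RangePartitionSketch {

  // Hypothetical helper: assigns a numeric key to a bucket by integer division.
  // Keys 0..(rangePerBucket-1) land in bucket 0, the next range in bucket 1, etc.
  static long bucketFor(long key, long rangePerBucket) {
    if (rangePerBucket <= 0) {
      throw new IllegalArgumentException("rangePerBucket must be positive");
    }
    // floorDiv keeps negative keys in contiguous ranges as well.
    return Math.floorDiv(key, rangePerBucket);
  }

  // Hypothetical helper: builds a partition path such as "rangePartition=2".
  static String partitionPath(String partitionName, long key, long rangePerBucket) {
    return partitionName + "=" + bucketFor(key, rangePerBucket);
  }

  public static void main(String[] args) {
    // The defaults shown in the diff above: name "rangePartition", range 10.
    for (long key : new long[] {0L, 9L, 10L, 25L}) {
      System.out.println(key + " -> " + partitionPath("rangePartition", key, 10L));
    }
  }
}
```

   With an incrementing primary key, new inserts mostly land in the newest 
bucket, which is why ingestion could benefit even without any change on the 
query side.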




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] arun990 commented on issue #3009: Dependency error when attempt to build Hudi from git source ..

2021-06-05 Thread GitBox


arun990 commented on issue #3009:
URL: https://github.com/apache/hudi/issues/3009#issuecomment-855328928


   Hi, I checked it and the build completed without the reported error.
   Thank you.
   Closing the issue.






[GitHub] [hudi] arun990 closed issue #3009: Dependency error when attempt to build Hudi from git source ..

2021-06-05 Thread GitBox


arun990 closed issue #3009:
URL: https://github.com/apache/hudi/issues/3009


   






[hudi] branch master updated (2a7e1e0 -> 08464a6)

2021-06-05 Thread garyli
This is an automated email from the ASF dual-hosted git repository.

garyli pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from 2a7e1e0  [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY in HoodieSparkSqlWriter (#3036)
 add 08464a6  [HUDI-1931] BucketAssignFunction use ValueState instead of MapState (#3026)

No new revisions were added by this update.

Summary of changes:
 .../sink/partitioner/BucketAssignFunction.java     | 56 +++--
 .../sink/partitioner/BucketAssignOperator.java     | 57 ++
 .../org/apache/hudi/table/HoodieTableSink.java     |  3 +-
 .../org/apache/hudi/sink/StreamWriteITCase.java    |  5 +-
 .../hudi/sink/utils/CompactFunctionWrapper.java    | 10 ++--
 .../hudi/sink/utils/MockOperatorStateStore.java    |  6 ++-
 .../org/apache/hudi/sink/utils/MockValueState.java | 28 ++-
 .../sink/utils/StreamWriteFunctionWrapper.java     | 33 -
 8 files changed, 147 insertions(+), 51 deletions(-)
 create mode 100644 hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignOperator.java
 copy hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/InMemoryMetricsReporter.java => hudi-flink/src/test/java/org/apache/hudi/sink/utils/MockValueState.java (66%)


[GitHub] [hudi] garyli1019 merged pull request #3026: [HUDI-1931] BucketAssignFunction use ValueState instead of MapState

2021-06-05 Thread GitBox


garyli1019 merged pull request #3026:
URL: https://github.com/apache/hudi/pull/3026


   






[GitHub] [hudi] garyli1019 commented on pull request #3026: [HUDI-1931] BucketAssignFunction use ValueState instead of MapState

2021-06-05 Thread GitBox


garyli1019 commented on pull request #3026:
URL: https://github.com/apache/hudi/pull/3026#issuecomment-855327366


   Merging. We can address the minor comment in #3024.






[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 480c169776dcf2260cbfebc7dc90bd2f1807e411 UNKNOWN
   * e2cac1f9ec077d1e033c358168864db967dbeea5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=156)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 4b7f6c639c497e89b8ca90be6a0742715ea18209 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=155)
 
   * 480c169776dcf2260cbfebc7dc90bd2f1807e411 UNKNOWN
   * e2cac1f9ec077d1e033c358168864db967dbeea5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=156)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 4b7f6c639c497e89b8ca90be6a0742715ea18209 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=155)
 
   * 480c169776dcf2260cbfebc7dc90bd2f1807e411 UNKNOWN
   * e2cac1f9ec077d1e033c358168864db967dbeea5 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 0070398ad4cd0bdebd084fe91acc2daa046fc16d Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=154)
 
   * 4b7f6c639c497e89b8ca90be6a0742715ea18209 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=155)
 
   * 480c169776dcf2260cbfebc7dc90bd2f1807e411 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 0070398ad4cd0bdebd084fe91acc2daa046fc16d Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=154)
 
   * 4b7f6c639c497e89b8ca90be6a0742715ea18209 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=155)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 0070398ad4cd0bdebd084fe91acc2daa046fc16d Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=154)
 
   * 4b7f6c639c497e89b8ca90be6a0742715ea18209 UNKNOWN
   
   




[GitHub] [hudi] nsivabalan commented on a change in pull request #2967: Added blog for Hudi cleaner service

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #2967:
URL: https://github.com/apache/hudi/pull/2967#discussion_r646041232



##
File path: 
docs/_posts/2021-06-03-employing-right-configurations-for-hudi-cleaner.md
##
@@ -0,0 +1,106 @@
+---
+title: "Employing correct configurations for Hudi's cleaner table service"
+excerpt: "Ensuring isolation between Hudi writers and readers using 
`HoodieCleaner.java`"
+author: pratyakshsharma
+category: blog
+---
+
+Apache Hudi provides snapshot isolation between writers and readers. This is 
made possible by Hudi’s MVCC concurrency model. In this blog, we will explain 
how to employ the right configurations to manage multiple file versions. 
Furthermore, we will discuss mechanisms available to users on how to maintain 
just the required number of old file versions so that long running readers do 
not fail. 
+
+### Reclaiming space and keeping your data lake storage costs in check
+
+Hudi provides different table management services to be able to manage your 
tables on the data lake. One of these services is called the **Cleaner**. As 
you write more data to your table, for every batch of updates received, Hudi 
can either generate a new version of the data file with updates applied to 
records (COPY_ON_WRITE) or write these delta updates to a log file, avoiding 
rewriting newer version of an existing file (MERGE_ON_READ). In such 
situations, depending on the frequency of your updates, the number of file 
versions of log files can grow indefinitely. If your use-cases do not require 
keeping an infinite history of these versions, it is imperative to have a 
process that reclaims older versions of the data. This is Hudi’s cleaner 
service.
+
+### Problem Statement
+
+In a data lake architecture, it is a very common scenario to have readers and 
writers concurrently accessing the same table. As the Hudi cleaner service 
periodically reclaims older file versions, scenarios arise where a long running 
query might be accessing a file version that is deemed to be reclaimed by the 
cleaner. Here, we need to employ the correct configs to ensure readers (aka 
queries) don’t fail.
+
+### Deeper dive into Hudi Cleaner
+
+To deal with the scenario described above, let's understand the different 
cleaning policies that Hudi offers and the corresponding properties that need 
to be configured. Options are available to schedule cleaning asynchronously or 
synchronously. Before going into more details, we would like to explain a few 
underlying concepts:
+
+ - **Hudi base file**: Columnar file which consists of final data after 
compaction. A base file’s name follows the following naming convention: 
`__.parquet`. In subsequent writes of this 
file, file id remains the same and commit time gets updated to show the latest 
version. This also implies any particular version of a record, given its 
partition path, can be uniquely located using the file id and instant time. 
+ - **File slice**: A file slice consists of the base file and any log files 
consisting of the delta, in case of MERGE_ON_READ table type.
+ - **Hudi File Group**: Any file group in Hudi is uniquely identified by the 
partition path and the file id that the files in this group have as part of 
their name. A file group consists of all the file slices that share that file 
id within a particular partition path. Also, any partition path can have 
multiple file groups.
+
+### Cleaning Policies
+
+Hudi cleaner currently supports below cleaning policies:
+
+ - **KEEP_LATEST_COMMITS**: This is the default policy. It is a temporal 
cleaning policy that lets readers look back at all the changes that happened 
in the last X commits. Suppose a writer ingests data into a Hudi dataset every 
30 minutes and the longest running query can take 5 hours to finish; then the 
user should retain at least the last 10 commits. With such a configuration, 
the oldest version of a file is kept on disk for at least 5 hours, thereby 
preventing the longest running query from failing at any point in time. 
Incremental cleaning is also possible using this policy.
+ - **KEEP_LATEST_FILE_VERSIONS**: This policy keeps N file versions 
irrespective of time. It is useful when the maximum number of versions of a 
file to keep at any given time is known. To achieve the same behaviour as 
before of preventing long running queries from failing, one should do the 
calculation based on data patterns. Alternatively, this policy is also useful 
if a user just wants to maintain only the single latest version of the file.
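
The arithmetic behind the KEEP_LATEST_COMMITS example above can be sketched as 
follows. This is only an illustration, not Hudi code: the class 
`CleanerRetentionSketch` and the method `commitsToRetain` are hypothetical 
names, and rounding up is an assumption made so that a query slightly longer 
than a whole number of commit intervals is still covered.

```java
import java.time.Duration;

public class CleanerRetentionSketch {

  // Hypothetical helper: the minimum number of commits to retain so that a file
  // version written at the start of the longest query is still on disk when the
  // query finishes. Uses ceiling division on milliseconds.
  static long commitsToRetain(Duration longestQuery, Duration commitInterval) {
    if (commitInterval.isZero() || commitInterval.isNegative()) {
      throw new IllegalArgumentException("commitInterval must be positive");
    }
    long queryMs = longestQuery.toMillis();
    long intervalMs = commitInterval.toMillis();
    return (queryMs + intervalMs - 1) / intervalMs;
  }

  public static void main(String[] args) {
    // The scenario from the text: a commit every 30 minutes, queries up to 5 hours.
    long commits = commitsToRetain(Duration.ofHours(5), Duration.ofMinutes(30));
    System.out.println("Retain at least " + commits + " commits"); // prints 10
  }
}
```

The resulting number is a lower bound; in practice one would likely add some 
headroom on top of it before setting the cleaner's retained-commits property.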
+
+### Examples
+
+Suppose a user is ingesting data into a Hudi dataset of type COPY_ON_WRITE 
every 30 minutes as shown below:
+
+![Initial timeline](/assets/images/blog/hoodie-cleaner/Initial_timeline.png)
+_Figure 1: Incoming records getting ingested into a Hudi dataset every 30 
minutes_
+
+The figure shows a particular partition on DFS where commits and corresponding 
file 

[hudi] branch master updated: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY in HoodieSparkSqlWriter (#3036)

2021-06-05 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 2a7e1e0  [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY in HoodieSparkSqlWriter (#3036)
2a7e1e0 is described below

commit 2a7e1e091e69c53acc0a19e3d792ca15a3d7db62
Author: Vinay Patil <52563354+veenaypa...@users.noreply.github.com>
AuthorDate: Sun Jun 6 03:32:26 2021 +0530

    [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY in HoodieSparkSqlWriter (#3036)
---
 .../src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala b/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
index abb5f76..17b3cc2 100644
--- a/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
+++ b/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
@@ -155,7 +155,7 @@ private[hudi] object HoodieSparkSqlWriter {
 
   // Convert to RDD[HoodieRecord]
   val genericRecords: RDD[GenericRecord] = HoodieSparkUtils.createRdd(df, schema, structName, nameSpace)
-  val shouldCombine = parameters(INSERT_DROP_DUPS_OPT_KEY).toBoolean || operation.equals(WriteOperationType.UPSERT);
+  val shouldCombine = parameters(INSERT_DROP_DUPS_OPT_KEY).toBoolean || operation.equals(WriteOperationType.UPSERT)
   val hoodieAllIncomingRecords = genericRecords.map(gr => {
 val hoodieRecord = if (shouldCombine) {
   val orderingVal = HoodieAvroUtils.getNestedFieldVal(gr, parameters(PRECOMBINE_FIELD_OPT_KEY), false)
@@ -423,7 +423,8 @@ private[hudi] object HoodieSparkSqlWriter {
 hiveSyncConfig.verifyMetadataFileListing = parameters(HoodieMetadataConfig.METADATA_VALIDATE_PROP).toBoolean
 hiveSyncConfig.ignoreExceptions = parameters.get(HIVE_IGNORE_EXCEPTIONS_OPT_KEY).exists(r => r.toBoolean)
 hiveSyncConfig.supportTimestamp = parameters.get(HIVE_SUPPORT_TIMESTAMP).exists(r => r.toBoolean)
-hiveSyncConfig.autoCreateDatabase = parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
+hiveSyncConfig.autoCreateDatabase = parameters.getOrElse(HIVE_AUTO_CREATE_DATABASE_OPT_KEY,
+  DEFAULT_HIVE_AUTO_CREATE_DATABASE_OPT_KEY).toBoolean
 hiveSyncConfig.decodePartition = parameters.getOrElse(URL_ENCODE_PARTITIONING_OPT_KEY,
   DEFAULT_URL_ENCODE_PARTITIONING_OPT_VAL).toBoolean
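
The behavioural difference between the removed and added `autoCreateDatabase` 
lines can be illustrated with a rough Java analogue of the two Scala idioms. 
This is a sketch under stated assumptions, not Hudi code: the class 
`DefaultLookupSketch`, the key string, and the default value `"true"` are all 
hypothetical stand-ins chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class DefaultLookupSketch {

  // Hypothetical key and default, used only for this illustration.
  static final String KEY = "example.auto.create.database";
  static final String DEFAULT_VALUE = "true";

  // Analogue of parameters.get(KEY).exists(r => r.toBoolean):
  // a missing key always evaluates to false, whatever the intended default is.
  static boolean existsStyle(Map<String, String> parameters) {
    return Optional.ofNullable(parameters.get(KEY))
        .map(Boolean::parseBoolean)
        .orElse(false);
  }

  // Analogue of parameters.getOrElse(KEY, DEFAULT_VALUE).toBoolean:
  // a missing key falls back to the declared default.
  // Note: Boolean.parseBoolean returns false for unrecognised strings, whereas
  // Scala's String.toBoolean throws, so the analogy is not exact.
  static boolean getOrElseStyle(Map<String, String> parameters) {
    return Boolean.parseBoolean(parameters.getOrDefault(KEY, DEFAULT_VALUE));
  }

  public static void main(String[] args) {
    Map<String, String> parameters = new HashMap<>();
    System.out.println("missing key, exists style:    " + existsStyle(parameters));
    System.out.println("missing key, getOrElse style: " + getOrElseStyle(parameters));

    parameters.put(KEY, "false");
    System.out.println("explicit false, exists style:    " + existsStyle(parameters));
    System.out.println("explicit false, getOrElse style: " + getOrElseStyle(parameters));
  }
}
```

The two styles agree whenever the key is set explicitly; they differ only when 
the key is absent and the intended default is not false.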
 


[GitHub] [hudi] nsivabalan merged pull request #3036: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY

2021-06-05 Thread GitBox


nsivabalan merged pull request #3036:
URL: https://github.com/apache/hudi/pull/3036


   






[GitHub] [hudi] nsivabalan commented on a change in pull request #3036: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #3036:
URL: https://github.com/apache/hudi/pull/3036#discussion_r646041047



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -423,7 +423,8 @@ private[hudi] object HoodieSparkSqlWriter {
 hiveSyncConfig.verifyMetadataFileListing = parameters(HoodieMetadataConfig.METADATA_VALIDATE_PROP).toBoolean
 hiveSyncConfig.ignoreExceptions = parameters.get(HIVE_IGNORE_EXCEPTIONS_OPT_KEY).exists(r => r.toBoolean)
 hiveSyncConfig.supportTimestamp = parameters.get(HIVE_SUPPORT_TIMESTAMP).exists(r => r.toBoolean)
-hiveSyncConfig.autoCreateDatabase = parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
+hiveSyncConfig.autoCreateDatabase = parameters.getOrElse(HIVE_AUTO_CREATE_DATABASE_OPT_KEY,

Review comment:
   My bad. Thanks for clarifying.








[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 0070398ad4cd0bdebd084fe91acc2daa046fc16d Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=154)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * b9bab751eb85f5295ee475eba4d75e25d29880e0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=153)
 
   * 0070398ad4cd0bdebd084fe91acc2daa046fc16d Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=154)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * b9bab751eb85f5295ee475eba4d75e25d29880e0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=153)
 
   * 0070398ad4cd0bdebd084fe91acc2daa046fc16d UNKNOWN
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * b9bab751eb85f5295ee475eba4d75e25d29880e0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=153)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * e79a1a416f733e0aeb360849a1ee5162db27bb0f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=152)
 
   * b9bab751eb85f5295ee475eba4d75e25d29880e0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=153)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * e79a1a416f733e0aeb360849a1ee5162db27bb0f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=152)
 
   * b9bab751eb85f5295ee475eba4d75e25d29880e0 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * e79a1a416f733e0aeb360849a1ee5162db27bb0f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=152)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 3b1e629f57e94809e4a79e4583d291c23868b1a7 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=151)
 
   * e79a1a416f733e0aeb360849a1ee5162db27bb0f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=152)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 24baac9967e7f0d3e1c814e41583092068156146 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=150)
 
   * 3b1e629f57e94809e4a79e4583d291c23868b1a7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=151)
 
   * e79a1a416f733e0aeb360849a1ee5162db27bb0f UNKNOWN
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 24baac9967e7f0d3e1c814e41583092068156146 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=150)
 
   * 3b1e629f57e94809e4a79e4583d291c23868b1a7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=151)
 
   
   




[GitHub] [hudi] hudi-bot edited a comment on pull request #2984: (Azure CI) test PR

2021-06-05 Thread GitBox


hudi-bot edited a comment on pull request #2984:
URL: https://github.com/apache/hudi/pull/2984#issuecomment-846794102


   
   ## CI report:
   
   * 24baac9967e7f0d3e1c814e41583092068156146 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=150)
 
   * 3b1e629f57e94809e4a79e4583d291c23868b1a7 UNKNOWN
   
   




[GitHub] [hudi] veenaypatil commented on pull request #3040: [HUDI-1148] Remove Hadoop Conf Logs

2021-06-05 Thread GitBox


veenaypatil commented on pull request #3040:
URL: https://github.com/apache/hudi/pull/3040#issuecomment-855260089


   @bvaradar Can you please review this? I have tested it with DeltaStreamer 
and do not see any other logs that are bloated or not useful.






[GitHub] [hudi] codecov-commenter edited a comment on pull request #3040: [HUDI-1148] Remove Hadoop Conf Logs

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3040:
URL: https://github.com/apache/hudi/pull/3040#issuecomment-855251170










[GitHub] [hudi] codecov-commenter edited a comment on pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#issuecomment-848384059


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2993](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a226428) into 
[master](https://codecov.io/gh/apache/hudi/commit/870e97b5f82f7657ac1547c5ab38c0797aea5c27?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (870e97b) will **decrease** coverage by `15.69%`.
   > The diff coverage is `80.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2993/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
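   ```diff
   @@             Coverage Diff              @@
   ##           master    #2993       +/-   ##
   ==========================================
   - Coverage     70.83%   55.14%   -15.70%
   - Complexity      385     3862     +3477
   ==========================================
     Files            54      487      +433
     Lines          2016    23603    +21587
     Branches        241     2528     +2287
   ==========================================
   + Hits           1428    13015    +11587
   - Misses          454     9429     +8975
   - Partials        134     1159     +1025
   ```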
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (?)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.31% <ø> (?)` | |
   | hudiflink | `63.40% <72.72%> (?)` | |
   | hudihadoopmr | `51.43% <ø> (?)` | |
   | hudisparkdatasource | `74.28% <100.00%> (?)` | |
   | hudisync | `46.60% <ø> (?)` | |
   | huditimelineservice | `64.36% <ø> (?)` | |
   | hudiutilities | `70.83% <100.00%> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=)
 | `0.00% <0.00%> (ø)` | |
   | 
[...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh)
 | `57.39% <ø> (ø)` | |
   | 
[...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh)
 | `91.53% <85.71%> (ø)` | |
   | 
[...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh)
 | `100.00% <100.00%> (ø)` | |
   | 
[...i/bootstrap/SparkParquetBootstrapDataProvider.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYm9vdHN0cmFwL1NwYXJrUGFycXVldEJvb3RzdHJhcERhdGFQcm92aWRlci5qYXZh)
 | `80.00% <100.00%> (ø)` | |
   | 
[...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh)
 | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3040: [HUDI-1148] Remove Hadoop Conf Logs

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3040:
URL: https://github.com/apache/hudi/pull/3040#issuecomment-855251170


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3040](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (b8ed7ed) into 
[master](https://codecov.io/gh/apache/hudi/commit/a658328001218273c3c9153b485340ac0e91db93?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a658328) will **increase** coverage by `15.74%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3040/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##           master    #3040       +/-   ##
   ==========================================
   + Coverage     55.13%   70.88%   +15.74%
   + Complexity     3866      386     -3480
   ==========================================
     Files           487       54      -433
     Lines         23613     2016    -21597
     Branches       2528      241     -2287
   ==========================================
   - Hits          13020     1429    -11591
   + Misses         9436      454     -8982
   + Partials       1157      133     -1024
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `70.88% <ø> (+0.04%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=)
 | | |
   | 
[...in/scala/org/apache/hudi/HoodieEmptyRelation.scala](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUVtcHR5UmVsYXRpb24uc2NhbGE=)
 | | |
   | 
[...java/org/apache/hudi/hive/util/HiveSchemaUtil.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvdXRpbC9IaXZlU2NoZW1hVXRpbC5qYXZh)
 | | |
   | 
[...penJ9MemoryLayoutSpecification64bitCompressed.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvanZtL09wZW5KOU1lbW9yeUxheW91dFNwZWNpZmljYXRpb242NGJpdENvbXByZXNzZWQuamF2YQ==)
 | | |
   | 
[...mon/table/log/block/HoodieCommandBlockVersion.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9ibG9jay9Ib29kaWVDb21tYW5kQmxvY2tWZXJzaW9uLmphdmE=)
 | | |
   | 
[...g/apache/hudi/common/bloom/BloomFilterFactory.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2Jsb29tL0Jsb29tRmlsdGVyRmFjdG9yeS5qYXZh)
 | | |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#issuecomment-848384059


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2993](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a226428) into 
[master](https://codecov.io/gh/apache/hudi/commit/870e97b5f82f7657ac1547c5ab38c0797aea5c27?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (870e97b) will **decrease** coverage by `17.19%`.
   > The diff coverage is `75.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2993/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
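   ```diff
   @@             Coverage Diff              @@
   ##           master    #2993       +/-   ##
   ==========================================
   - Coverage     70.83%   53.64%   -17.20%
   - Complexity      385     3419     +3034
   ==========================================
     Files            54      424      +370
     Lines          2016    20054    +18038
     Branches        241     2083     +1842
   ==========================================
   + Hits           1428    10757     +9329
   - Misses          454     8386     +7932
   - Partials        134      911      +777
   ```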
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (?)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.31% <ø> (?)` | |
   | hudiflink | `63.40% <72.72%> (?)` | |
   | hudihadoopmr | `51.43% <ø> (?)` | |
   | hudiutilities | `70.83% <100.00%> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=)
 | `0.00% <0.00%> (ø)` | |
   | 
[...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh)
 | `57.39% <ø> (ø)` | |
   | 
[...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh)
 | `91.53% <85.71%> (ø)` | |
   | 
[...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh)
 | `100.00% <100.00%> (ø)` | |
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `70.84% <100.00%> (ø)` | |
   | 
[...di-cli/src/main/java/org/apache/hudi/cli/Main.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL01haW4uamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#issuecomment-848384059


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2993](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a226428) into 
[master](https://codecov.io/gh/apache/hudi/commit/870e97b5f82f7657ac1547c5ab38c0797aea5c27?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (870e97b) will **not change** coverage.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2993/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@         Coverage Diff         @@
   ##           master    #2993   ##
   =================================
     Coverage     70.83%   70.83%
     Complexity      385      385
   =================================
     Files            54       54
     Lines          2016     2016
     Branches        241      241
   =================================
     Hits           1428     1428
     Misses          454      454
     Partials        134      134
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `70.83% <100.00%> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `70.84% <100.00%> (ø)` | |
   






[GitHub] [hudi] wangxianghu commented on a change in pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


wangxianghu commented on a change in pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#discussion_r645996864



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/factory/HoodieAvroKeyGeneratorFactory.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen.factory;
+
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.common.util.StringUtils;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.keygen.ComplexAvroKeyGenerator;
+import org.apache.hudi.keygen.CustomAvroKeyGenerator;
+import org.apache.hudi.keygen.GlobalAvroDeleteKeyGenerator;
+import org.apache.hudi.keygen.KeyGenerator;
+import org.apache.hudi.keygen.NonpartitionedAvroKeyGenerator;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorType;
+
+import java.io.IOException;
+import java.util.Locale;
+import java.util.Objects;
+
+/**
+ * Factory help to create {@link org.apache.hudi.keygen.KeyGenerator}.
+ * 
+ * This factory will try {@link HoodieWriteConfig#KEYGENERATOR_CLASS_PROP} firstly, this ensures the class prop
+ * will not be overwritten by {@link KeyGeneratorType}
+ */
+public class HoodieAvroKeyGeneratorFactory {
+  public static KeyGenerator createKeyGenerator(TypedProperties props) throws IOException {
+    // keyGenerator class name has higher priority
+    KeyGenerator keyGenerator = createKeyGeneratorByClassName(props);
+    return Objects.isNull(keyGenerator) ? createKeyGeneratorByType(props) : keyGenerator;
+  }
+
+  public static KeyGenerator createKeyGeneratorByClassName(TypedProperties props) throws IOException {

Review comment:
   > now I get the flow. I guess this has to be moved to a Utils class so that both keyGens can use it.
   
   Yeah, good idea. `org.apache.hudi.keygen.KeyGenUtils` might be a good place 
for it.
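
   For readers following the thread, the flow discussed above (try the configured class name first, fall back to the key generator type) could be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: the class `KeyGenFallbackSketch`, the property names `example.keygen.class` and `example.keygen.type`, and the `KeyGen` interface are all hypothetical stand-ins, and plain `java.util.Properties` is used in place of Hudi's `TypedProperties`.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch; none of these names come from the Hudi code base.
class KeyGenFallbackSketch {
  // Assumed property names, chosen for illustration only.
  static final String CLASS_PROP = "example.keygen.class";
  static final String TYPE_PROP = "example.keygen.type";

  enum KeyGenType { SIMPLE, COMPLEX }

  interface KeyGen {
    String name();
  }

  static final class SimpleKeyGen implements KeyGen {
    public String name() { return "simple"; }
  }

  static final class ComplexKeyGen implements KeyGen {
    public String name() { return "complex"; }
  }

  // Shared helper: returns null when no class name is configured,
  // so that callers can fall back to the type-based lookup.
  static KeyGen createByClassName(Properties props) {
    String className = props.getProperty(CLASS_PROP);
    if (className == null || className.isEmpty()) {
      return null;
    }
    try {
      return (KeyGen) Class.forName(className).getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Could not load key generator class " + className, e);
    }
  }

  // Type-based lookup; defaults to SIMPLE when no type is configured.
  static KeyGen createByType(Properties props) {
    String type = props.getProperty(TYPE_PROP, KeyGenType.SIMPLE.name());
    switch (KeyGenType.valueOf(type.toUpperCase(Locale.ROOT))) {
      case COMPLEX:
        return new ComplexKeyGen();
      default:
        return new SimpleKeyGen();
    }
  }

  // The class name has higher priority than the type.
  static KeyGen create(Properties props) {
    KeyGen byClassName = createByClassName(props);
    return byClassName != null ? byClassName : createByType(props);
  }
}
```

   Keeping `createByClassName` as a standalone static helper is what would let more than one factory reuse it, which is the point of moving it into a utils class.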








[GitHub] [hudi] codecov-commenter commented on pull request #3040: [HUDI-1148] Remove Hadoop Conf Logs

2021-06-05 Thread GitBox


codecov-commenter commented on pull request #3040:
URL: https://github.com/apache/hudi/pull/3040#issuecomment-855251170


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3040](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (b8ed7ed) into 
[master](https://codecov.io/gh/apache/hudi/commit/a658328001218273c3c9153b485340ac0e91db93?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a658328) will **decrease** coverage by `45.86%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3040/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
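   ```diff
   @@             Coverage Diff              @@
   ##           master    #3040       +/-   ##
   ==========================================
   - Coverage     55.13%    9.27%   -45.87%
   + Complexity     3866       48     -3818
   ==========================================
     Files           487       54      -433
     Lines         23613     2016    -21597
     Branches       2528      241     -2287
   ==========================================
   - Hits          13020      187    -12833
   + Misses         9436     1816     -7620
   + Partials       1157       13     -1144
   ```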
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.27% <ø> (-61.56%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3040?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3040/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[jira] [Updated] (HUDI-1148) Revisit log messages seen when writing or reading through Hudi

2021-06-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1148:
-
Labels: pull-request-available  (was: )

> Revisit log messages seen when writing or reading through Hudi
> ---
>
>                 Key: HUDI-1148
>                 URL: https://issues.apache.org/jira/browse/HUDI-1148
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Writer Core
>    Affects Versions: 0.9.0
>            Reporter: Balaji Varadarajan
>            Assignee: Vinay
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> [https://github.com/apache/hudi/issues/1906]
>  
> Some of these log messages can be demoted to debug level. More generally, we 
> need to review the verbosity of log messages when running Hudi operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] veenaypatil opened a new pull request #3040: [HUDI-1148] Remove Hadoop Conf Logs

2021-06-05 Thread GitBox


veenaypatil opened a new pull request #3040:
URL: https://github.com/apache/hudi/pull/3040


   ## What is the purpose of the pull request
   
   Remove the Hadoop configuration log statement: it is printed many times in 
the log and adds no value when debugging.
   
   ## Brief change log
   
   Remove Log statement
   
   ## Verify this pull request
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   ## Committer checklist
   
- [X] Has a corresponding JIRA in PR title & commit

- [X] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] codecov-commenter edited a comment on pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#issuecomment-848384059


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2993](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a226428) into 
[master](https://codecov.io/gh/apache/hudi/commit/870e97b5f82f7657ac1547c5ab38c0797aea5c27?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (870e97b) will **decrease** coverage by `61.55%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2993/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
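   @@              Coverage Diff               @@
   ##               master    #2993      +/-   ##
   ==============================================
   - Coverage       70.83%    9.27%  -61.56%
   + Complexity        385       48     -337
   ==============================================
     Files              54       54
     Lines            2016     2016
     Branches          241      241
   ==============================================
   - Hits             1428      187    -1241
   - Misses            454     1816    +1362
   + Partials          134       13     -121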
   ```
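
As a quick sanity check on the percentages in the report above, the sketch below recomputes them from the Hits, Misses and Partials counts (1428/454/134 on master, 187/1816/13 for the PR). It assumes line coverage is hits divided by the sum of hits, misses and partials, truncated to two decimals. The truncation is an assumption inferred from the reported 9.27% and is not stated anywhere in this thread. `CoverageCheck` is a hypothetical helper name.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class CoverageCheck {

  /**
   * Coverage in percent, assuming coverage = hits / (hits + misses + partials)
   * and truncation (not rounding) to two decimal places.
   */
  static BigDecimal coveragePercent(long hits, long misses, long partials) {
    long total = hits + misses + partials;
    if (total == 0) {
      throw new IllegalArgumentException("no tracked lines");
    }
    return BigDecimal.valueOf(hits)
        .multiply(BigDecimal.valueOf(100))
        .divide(BigDecimal.valueOf(total), 2, RoundingMode.DOWN);
  }

  public static void main(String[] args) {
    // master: 1428 hits, 454 misses, 134 partials (2016 lines in total)
    System.out.println(coveragePercent(1428, 454, 134)); // prints 70.83
    // PR head: 187 hits, 1816 misses, 13 partials
    System.out.println(coveragePercent(187, 1816, 13));  // prints 9.27
  }
}
```

With plain round-half-up the second value would come out as 9.28, which is why truncation is assumed here.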
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `9.27% <0.00%> (-61.56%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2993?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `0.00% <0.00%> (-70.85%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2993/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#issuecomment-846613183


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2923](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (126455c) into 
[master](https://codecov.io/gh/apache/hudi/commit/7a63175a7073d886110c1993eec872ded713e356?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (7a63175) will **decrease** coverage by `15.71%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2923/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
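   @@              Coverage Diff               @@
   ##               master    #2923      +/-   ##
   ==============================================
   - Coverage       70.83%   55.11%  -15.72%
   - Complexity        385     3866    +3481
   ==============================================
     Files              54      487     +433
     Lines            2016    23609   +21593
     Branches          241     2530    +2289
   ==============================================
   + Hits             1428    13013   +11585
   - Misses            454     9437    +8983
   - Partials          134     1159    +1025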
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (?)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.31% <100.00%> (?)` | |
   | hudiflink | `63.25% <100.00%> (?)` | |
   | hudihadoopmr | `51.43% <ø> (?)` | |
   | hudisparkdatasource | `74.28% <ø> (?)` | |
   | hudisync | `46.60% <ø> (?)` | |
   | huditimelineservice | `64.36% <ø> (?)` | |
   | hudiutilities | `70.83% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==)
 | `57.48% <100.00%> (ø)` | |
   | 
[...org/apache/hudi/util/StringToRowDataConverter.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmluZ1RvUm93RGF0YUNvbnZlcnRlci5qYXZh)
 | `66.66% <100.00%> (ø)` | |
   | 
[...common/table/view/PriorityBasedFileSystemView.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvUHJpb3JpdHlCYXNlZEZpbGVTeXN0ZW1WaWV3LmphdmE=)
 | `94.36% <0.00%> (ø)` | |
   | 
[...c/main/java/org/apache/hudi/dla/DLASyncConfig.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktZGxhLXN5bmMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZGxhL0RMQVN5bmNDb25maWcuamF2YQ==)
 | `97.36% <0.00%> (ø)` | |
   | 
[.../common/table/view/RocksDbBasedFileSystemView.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvUm9ja3NEYkJhc2VkRmlsZVN5c3RlbVZpZXcuamF2YQ==)
 | `80.40% <0.00%> (ø)` | |
   | 
[.../java/org/apache/hudi/common/util/FileIOUtils.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvRmlsZUlPVXRpbHMuamF2YQ==)
 | `65.51% <0.00%> (ø)` 

[GitHub] [hudi] wangxianghu commented on a change in pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


wangxianghu commented on a change in pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#discussion_r645997489



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/factory/HoodieAvroKeyGeneratorFactory.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen.factory;
+
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.common.util.StringUtils;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.keygen.ComplexAvroKeyGenerator;
+import org.apache.hudi.keygen.CustomAvroKeyGenerator;
+import org.apache.hudi.keygen.GlobalAvroDeleteKeyGenerator;
+import org.apache.hudi.keygen.KeyGenerator;
+import org.apache.hudi.keygen.NonpartitionedAvroKeyGenerator;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorType;
+
+import java.io.IOException;
+import java.util.Locale;
+import java.util.Objects;
+
+/**
+ * Factory help to create {@link org.apache.hudi.keygen.KeyGenerator}.
+ * 
+ * This factory will try {@link HoodieWriteConfig#KEYGENERATOR_CLASS_PROP} firstly, this ensures the class prop
+ * will not be overwritten by {@link KeyGeneratorType}
+ */
+public class HoodieAvroKeyGeneratorFactory {
+  public static KeyGenerator createKeyGenerator(TypedProperties props) throws IOException {
+    // keyGenerator class name has higher priority
+    KeyGenerator keyGenerator = createKeyGeneratorByClassName(props);
+    return Objects.isNull(keyGenerator) ? createKeyGeneratorByType(props) : keyGenerator;
+  }
+
+  public static KeyGenerator createKeyGeneratorByClassName(TypedProperties props) throws IOException {

Review comment:
   > now I get the flow. I guess this has to be moved to a Utils class so that both keyGens can use it.
   
   done








[GitHub] [hudi] wangxianghu commented on a change in pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


wangxianghu commented on a change in pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#discussion_r645996864



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/factory/HoodieAvroKeyGeneratorFactory.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen.factory;
+
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.common.util.StringUtils;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.keygen.ComplexAvroKeyGenerator;
+import org.apache.hudi.keygen.CustomAvroKeyGenerator;
+import org.apache.hudi.keygen.GlobalAvroDeleteKeyGenerator;
+import org.apache.hudi.keygen.KeyGenerator;
+import org.apache.hudi.keygen.NonpartitionedAvroKeyGenerator;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorType;
+
+import java.io.IOException;
+import java.util.Locale;
+import java.util.Objects;
+
+/**
+ * Factory help to create {@link org.apache.hudi.keygen.KeyGenerator}.
+ * 
+ * This factory will try {@link HoodieWriteConfig#KEYGENERATOR_CLASS_PROP} firstly, this ensures the class prop
+ * will not be overwritten by {@link KeyGeneratorType}
+ */
+public class HoodieAvroKeyGeneratorFactory {
+  public static KeyGenerator createKeyGenerator(TypedProperties props) throws IOException {
+    // keyGenerator class name has higher priority
+    KeyGenerator keyGenerator = createKeyGeneratorByClassName(props);
+    return Objects.isNull(keyGenerator) ? createKeyGeneratorByType(props) : keyGenerator;
+  }
+
+  public static KeyGenerator createKeyGeneratorByClassName(TypedProperties props) throws IOException {

Review comment:
   > now I get the flow. I guess this has to be moved to a Utils class so that both keyGens can use it.
   
   Yeah, good idea. `org.apache.hudi.keygen.KeyGenUtils` might be a good place to 
hold it.
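
The lookup order discussed in this review (an explicit class name wins, the generator type is only a fallback) can be sketched in plain Java as follows. This is a minimal illustration and not the code from the pull request: `KeyGen`, `KeyGenType`, `SimpleKeyGen`, `ComplexKeyGen`, the two property names, and the default to `SIMPLE` when nothing is configured are all hypothetical stand-ins chosen for the sketch.

```java
import java.util.Locale;
import java.util.Properties;

final class KeyGenFactorySketch {

  /** Hypothetical stand-in for a key generator; not the Hudi interface. */
  interface KeyGen {
    String describe();
  }

  static final class SimpleKeyGen implements KeyGen {
    public String describe() {
      return "simple";
    }
  }

  static final class ComplexKeyGen implements KeyGen {
    public String describe() {
      return "complex";
    }
  }

  /** Hypothetical type enum used only by this sketch. */
  enum KeyGenType { SIMPLE, COMPLEX }

  // Hypothetical property names, invented for this sketch.
  static final String CLASS_PROP = "sketch.keygen.class";
  static final String TYPE_PROP = "sketch.keygen.type";

  /** The class name has higher priority; the type is consulted only as a fallback. */
  static KeyGen create(Properties props) {
    KeyGen byClassName = createByClassName(props);
    return byClassName != null ? byClassName : createByType(props);
  }

  /** Returns null when no class name is configured, so the caller can fall back. */
  static KeyGen createByClassName(Properties props) {
    String className = props.getProperty(CLASS_PROP);
    if (className == null || className.trim().isEmpty()) {
      return null;
    }
    try {
      return (KeyGen) Class.forName(className).getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("Could not load key generator class " + className, e);
    }
  }

  /** Maps a case-insensitive type name to an implementation; SIMPLE is this sketch's default. */
  static KeyGen createByType(Properties props) {
    String typeName = props.getProperty(TYPE_PROP, KeyGenType.SIMPLE.name());
    KeyGenType type = KeyGenType.valueOf(typeName.toUpperCase(Locale.ROOT));
    switch (type) {
      case COMPLEX:
        return new ComplexKeyGen();
      case SIMPLE:
      default:
        return new SimpleKeyGen();
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(TYPE_PROP, "complex");
    System.out.println(create(props).describe()); // type only: prints complex

    props.setProperty(CLASS_PROP, SimpleKeyGen.class.getName());
    System.out.println(create(props).describe()); // class name overrides type: prints simple
  }
}
```

Keeping the class-name lookup in a shared static helper, as suggested in the comment, would let several factories reuse the same precedence rule without duplicating the reflection code.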








[GitHub] [hudi] codecov-commenter edited a comment on pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#issuecomment-846613183


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2923](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (126455c) into 
[master](https://codecov.io/gh/apache/hudi/commit/7a63175a7073d886110c1993eec872ded713e356?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (7a63175) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2923/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
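   @@              Coverage Diff               @@
   ##               master    #2923      +/-   ##
   ==============================================
     Coverage       70.83%   70.83%
     Complexity        385      385
   ==============================================
     Files              54       54
     Lines            2016     2016
     Branches          241      241
   ==============================================
     Hits             1428     1428
     Misses            454      454
     Partials          134      134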
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `70.83% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   






[GitHub] [hudi] codecov-commenter edited a comment on pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#issuecomment-855220122


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3038](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c40142f) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `15.69%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3038/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
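   @@              Coverage Diff               @@
   ##               master    #3038      +/-   ##
   ==============================================
   - Coverage       70.83%   55.13%  -15.70%
   - Complexity        385     3865    +3480
   ==============================================
     Files              54      487     +433
     Lines            2016    23605   +21589
     Branches          241     2528    +2287
   ==============================================
   + Hits             1428    13015   +11587
   - Misses            454     9432    +8978
   - Partials          134     1158    +1024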
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (?)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.33% <ø> (?)` | |
   | hudiflink | `63.25% <ø> (?)` | |
   | hudihadoopmr | `51.43% <ø> (?)` | |
   | hudisparkdatasource | `74.28% <ø> (?)` | |
   | hudisync | `46.60% <0.00%> (?)` | |
   | huditimelineservice | `64.36% <ø> (?)` | |
   | hudiutilities | `70.88% <ø> (+0.04%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...in/java/org/apache/hudi/hive/HoodieHiveClient.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSG9vZGllSGl2ZUNsaWVudC5qYXZh)
 | `70.32% <0.00%> (ø)` | |
   | 
[...e/hudi/table/format/mor/MergeOnReadTableState.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvbW9yL01lcmdlT25SZWFkVGFibGVTdGF0ZS5qYXZh)
 | `100.00% <0.00%> (ø)` | |
   | 
[...ache/hudi/common/table/timeline/TimelineUtils.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL1RpbWVsaW5lVXRpbHMuamF2YQ==)
 | `62.71% <0.00%> (ø)` | |
   | 
[...va/org/apache/hudi/sink/utils/HiveSyncContext.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3V0aWxzL0hpdmVTeW5jQ29udGV4dC5qYXZh)
 | `91.89% <0.00%> (ø)` | |
   | 
[...util/jvm/OpenJ9MemoryLayoutSpecification64bit.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvanZtL09wZW5KOU1lbW9yeUxheW91dFNwZWNpZmljYXRpb242NGJpdC5qYXZh)
 | `0.00% <0.00%> (ø)` | |
   | 
[...n/java/org/apache/hudi/common/metrics/Counter.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21ldHJpY3MvQ291bnRlci5qYXZh)
 | `71.42% 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#issuecomment-855220122


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3038](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c40142f) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `15.41%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3038/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
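   @@              Coverage Diff               @@
   ##               master    #3038      +/-   ##
   ==============================================
   - Coverage       70.83%   55.41%  -15.42%
   - Complexity        385     3659    +3274
   ==============================================
     Files              54      463     +409
     Lines            2016    21950   +19934
     Branches          241     2361    +2120
   ==============================================
   + Hits             1428    12164   +10736
   - Misses            454     8710    +8256
   - Partials          134     1076     +942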
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (?)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.33% <ø> (?)` | |
   | hudiflink | `63.25% <ø> (?)` | |
   | hudihadoopmr | `51.43% <ø> (?)` | |
   | hudisparkdatasource | `74.28% <ø> (?)` | |
   | hudiutilities | `70.88% <ø> (+0.04%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[.../hudi/table/format/cow/CopyOnWriteInputFormat.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvY293L0NvcHlPbldyaXRlSW5wdXRGb3JtYXQuamF2YQ==)
 | `55.33% <0.00%> (ø)` | |
   | 
[...e/hudi/hadoop/FileStatusWithBootstrapBaseFile.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0ZpbGVTdGF0dXNXaXRoQm9vdHN0cmFwQmFzZUZpbGUuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[.../hudi/table/format/mor/MergeOnReadInputFormat.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvbW9yL01lcmdlT25SZWFkSW5wdXRGb3JtYXQuamF2YQ==)
 | `64.75% <0.00%> (ø)` | |
   | 
[.../org/apache/hudi/common/engine/EngineProperty.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2VuZ2luZS9FbmdpbmVQcm9wZXJ0eS5qYXZh)
 | `0.00% <0.00%> (ø)` | |
   | 
[...di/common/bootstrap/index/HFileBootstrapIndex.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2Jvb3RzdHJhcC9pbmRleC9IRmlsZUJvb3RzdHJhcEluZGV4LmphdmE=)
 | `81.48% <0.00%> (ø)` | |
   | 
[...rc/main/scala/org/apache/hudi/cli/DeDupeType.scala](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9EZUR1cGVUeXBlLnNjYWxh)
 | `0.00% <0.00%> (ø)` | |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#issuecomment-846613183


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#2923](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (126455c) into 
[master](https://codecov.io/gh/apache/hudi/commit/7a63175a7073d886110c1993eec872ded713e356?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (7a63175) will **decrease** coverage by `61.55%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2923/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
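   @@              Coverage Diff               @@
   ##               master    #2923      +/-   ##
   ==============================================
   - Coverage       70.83%    9.27%  -61.56%
   + Complexity        385       48     -337
   ==============================================
     Files              54       54
     Lines            2016     2016
     Branches          241      241
   ==============================================
   - Hits             1428      187    -1241
   - Misses            454     1816    +1362
   + Partials          134       13     -121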
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `9.27% <ø> (-61.56%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2923?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2923/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] vaibhav-sinha commented on a change in pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


vaibhav-sinha commented on a change in pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#discussion_r645993113



##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/keygen/TestTimestampBasedKeyGenerator.java
##
@@ -378,4 +382,46 @@ public void 
test_ExpectsMatch_MultipleInputFormats_ShortDate_OutputCustomDate()
 baseRow = genericRecordToRow(baseRecord);
 assertEquals("04/01/2020", keyGen.getPartitionPath(baseRow));
   }
+
+  @Test
+  public void test_ExpectsMatch_LogicalType_Date() throws IOException {
+LocalDate today = LocalDate.now();
+baseRecord.put("dob", (int) today.toEpochDay());
+properties = this.getBaseKeyConfig(
+"DATE",
+"yyyy-MM-dd'T'HH:mm:ssZ,yyyy-MM-dd'T'HH:mm:ss.SSSZ,yyyyMMdd",
+"",
+null,
+"yyyy-MM-dd",
+DateTimeZone.getDefault().getID());
+
+properties.setProperty(KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY, 
"dob");
+BuiltinKeyGenerator keyGen = new TimestampBasedKeyGenerator(properties);
+HoodieKey hk1 = keyGen.getKey(baseRecord);
+Assertions.assertEquals(today.toString(), hk1.getPartitionPath());
+
+baseRow = genericRecordToRow(baseRecord);
+assertEquals(today.toString(), keyGen.getPartitionPath(baseRow));
+  }
+
+  @Test
+  public void test_ExpectsMatch_LogicalType_Timestamp() throws IOException {

Review comment:
   Done
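
   For readers following the quoted `test_ExpectsMatch_LogicalType_Date` test: it stores the date as an `int` count of days since the epoch and expects the partition path to be the ISO `yyyy-MM-dd` string. A minimal standalone sketch of that round trip is below. It uses only `java.time`, not Hudi's key generator, and the class and method names are hypothetical.

```java
import java.time.LocalDate;

// Illustrative only: mimics the encode/decode step the quoted test relies on,
// without calling TimestampBasedKeyGenerator.
class EpochDayRoundTripSketch {

    // Decode a days-since-epoch value into the ISO-8601 date string (yyyy-MM-dd).
    static String partitionPathFor(int epochDay) {
        return LocalDate.ofEpochDay(epochDay).toString();
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2021, 6, 5);
        int encoded = (int) date.toEpochDay();         // same cast as in the quoted test
        System.out.println(partitionPathFor(encoded)); // prints 2021-06-05
    }
}
```

   The cast to `int` is safe for any realistic calendar date, since an `int` covers several million years of days on either side of the epoch.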




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] vaibhav-sinha commented on a change in pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


vaibhav-sinha commented on a change in pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#discussion_r645993107



##
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/keygen/TestTimestampBasedKeyGenerator.java
##
@@ -378,4 +382,46 @@ public void 
test_ExpectsMatch_MultipleInputFormats_ShortDate_OutputCustomDate()
 baseRow = genericRecordToRow(baseRecord);
 assertEquals("04/01/2020", keyGen.getPartitionPath(baseRow));
   }
+
+  @Test
+  public void test_ExpectsMatch_LogicalType_Date() throws IOException {

Review comment:
   Had named the test based on the naming convention for existing tests in 
this class. Have updated names of all tests to camelCase.








[GitHub] [hudi] codecov-commenter edited a comment on pull request #3039: [HUDI-1864][WIP] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3039:
URL: https://github.com/apache/hudi/pull/3039#issuecomment-855239677


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3039](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (7dcdd5d) into 
[master](https://codecov.io/gh/apache/hudi/commit/a658328001218273c3c9153b485340ac0e91db93?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a658328) will **increase** coverage by `7.95%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3039/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff            @@
   ##            master    #3039       +/-   ##
   
   + Coverage    55.13%   63.09%    +7.95%
   + Complexity    3866      347     -3519
   
     Files          487       54      -433
     Lines        23613     2016    -21597
     Branches      2528      241     -2287
   
   - Hits         13020     1272    -11748
   + Misses        9436      621     -8815
   + Partials      1157      123     -1034
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `63.09% <ø> (-7.74%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==)
 | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | 
[...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh)
 | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh)
 | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | 
[...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)
 | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter edited a comment on pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#issuecomment-855220122


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3038](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c40142f) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **increase** coverage by `0.04%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3038/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff            @@
   ##            master    #3038       +/-   ##
   
   + Coverage    70.83%   70.88%    +0.04%
   - Complexity     385      386        +1
   
     Files           54       54
     Lines         2016     2016
     Branches       241      241
   
   + Hits          1428     1429        +1
     Misses         454      454
   + Partials       134      133        -1
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `70.88% <ø> (+0.04%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.18% <0.00%> (+0.33%)` | :arrow_up: |
   






[jira] [Created] (HUDI-1981) Introduce --enable-sync option in HoodieMultiTableDeltaStreamer

2021-06-05 Thread Pratyaksh Sharma (Jira)
Pratyaksh Sharma created HUDI-1981:
--

 Summary: Introduce --enable-sync option in 
HoodieMultiTableDeltaStreamer
 Key: HUDI-1981
 URL: https://issues.apache.org/jira/browse/HUDI-1981
 Project: Apache Hudi
  Issue Type: Improvement
  Components: DeltaStreamer
Reporter: Pratyaksh Sharma
Assignee: Pratyaksh Sharma


HoodieDeltaStreamer has both --enable-sync and --enable-hive-sync, but the 
latter is now deprecated. We need to introduce the former in 
HoodieMultiTableDeltaStreamer as well. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] codecov-commenter edited a comment on pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#issuecomment-855220122


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3038](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c40142f) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `61.55%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3038/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff            @@
   ##            master    #3038       +/-   ##
   
   - Coverage    70.83%    9.27%   -61.56%
   + Complexity     385       48      -337
   
     Files           54       54
     Lines         2016     2016
     Branches       241      241
   
   - Hits          1428      187     -1241
   - Misses         454     1816     +1362
   + Partials       134       13      -121
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `9.27% <ø> (-61.56%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] codecov-commenter commented on pull request #3039: [HUDI-1864][WIP] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


codecov-commenter commented on pull request #3039:
URL: https://github.com/apache/hudi/pull/3039#issuecomment-855239677


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3039](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (7dcdd5d) into 
[master](https://codecov.io/gh/apache/hudi/commit/a658328001218273c3c9153b485340ac0e91db93?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (a658328) will **increase** coverage by `7.95%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3039/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff            @@
   ##            master    #3039       +/-   ##
   
   + Coverage    55.13%   63.09%    +7.95%
   + Complexity    3866      347     -3519
   
     Files          487       54      -433
     Lines        23613     2016    -21597
     Branches      2528      241     -2287
   
   - Hits         13020     1272    -11748
   + Misses        9436      621     -8815
   + Partials      1157      123     -1034
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `63.09% <ø> (-7.74%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3039?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==)
 | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | 
[...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh)
 | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh)
 | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | 
[...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3039/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)
 | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | 

[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2967: Added blog for Hudi cleaner service

2021-06-05 Thread GitBox


pratyakshsharma commented on a change in pull request #2967:
URL: https://github.com/apache/hudi/pull/2967#discussion_r645990277



##
File path: 
docs/_posts/2021-06-03-employing-right-configurations-for-hudi-cleaner.md
##
@@ -0,0 +1,106 @@
+---
+title: "Employing correct configurations for Hudi's cleaner table service"
+excerpt: "Ensuring isolation between Hudi writers and readers using 
`HoodieCleaner.java`"
+author: pratyakshsharma
+category: blog
+---
+
+Apache Hudi provides snapshot isolation between writers and readers. This is 
made possible by Hudi’s MVCC concurrency model. In this blog, we will explain 
how to employ the right configurations to manage multiple file versions. 
Furthermore, we will discuss mechanisms available to users on how to maintain 
just the required number of old file versions so that long running readers do 
not fail. 
+
+### Reclaiming space and keeping your data lake storage costs in check
+
+Hudi provides different table management services to be able to manage your 
tables on the data lake. One of these services is called the **Cleaner**. As 
you write more data to your table, for every batch of updates received, Hudi 
can either generate a new version of the data file with updates applied to 
records (COPY_ON_WRITE) or write these delta updates to a log file, avoiding 
rewriting a newer version of an existing file (MERGE_ON_READ). In such 
situations, depending on the frequency of your updates, the number of file 
versions of log files can grow indefinitely. If your use-cases do not require 
keeping an infinite history of these versions, it is imperative to have a 
process that reclaims older versions of the data. This is Hudi’s cleaner 
service.
+
+### Problem Statement
+
+In a data lake architecture, it is a very common scenario to have readers and 
writers concurrently accessing the same table. As the Hudi cleaner service 
periodically reclaims older file versions, scenarios arise where a long running 
query might be accessing a file version that is deemed to be reclaimed by the 
cleaner. Here, we need to employ the correct configs to ensure readers (aka 
queries) don’t fail.
+
+### Deeper dive into Hudi Cleaner
+
+To deal with the mentioned scenario, let's understand the different cleaning 
policies that Hudi offers and the corresponding properties that need to be 
configured. Options are available to schedule cleaning asynchronously or 
synchronously. Before going into more details, we would like to explain a few 
underlying concepts:
+
+ - **Hudi base file**: Columnar file which consists of final data after 
compaction. A base file’s name follows the following naming convention: 
`<fileId>_<writeToken>_<instantTime>.parquet`. In subsequent writes of this 
file, file id remains the same and commit time gets updated to show the latest 
version. This also implies any particular version of a record, given its 
partition path, can be uniquely located using the file id and instant time. 
+ - **File slice**: A file slice consists of the base file and any log files 
consisting of the delta, in case of MERGE_ON_READ table type.
+ - **Hudi File Group**: Any file group in Hudi is uniquely identified by the 
partition path and the file id that the files in this group have as part of 
their name. A file group consists of all the file slices in a particular 
partition path. Also, any partition path can have multiple file groups.
+
+### Cleaning Policies
+
+Hudi cleaner currently supports below cleaning policies:
+
+ - **KEEP_LATEST_COMMITS**: This is the default policy. This is a temporal 
cleaning policy that ensures the effect of having lookback into all the changes 
that happened in the last X commits. Suppose a writer is ingesting data into a 
Hudi dataset every 30 minutes and the longest running query can take 5 hours to 
finish, then the user should retain at least the last 10 commits. With such a 
configuration, we ensure that the oldest version of a file is kept on disk for 
at least 5 hours, thereby preventing the longest running query from failing at 
any point in time. Incremental cleaning is also possible using this policy.
+ - **KEEP_LATEST_FILE_VERSIONS**: This policy has the effect of keeping N 
file versions irrespective of time. This policy is useful when it is 
known how many versions of a file one wants to keep, at most, at any given 
time. To achieve the same behaviour as before of preventing long running 
queries from failing, one should do their calculations based on data patterns. 
Alternatively, this policy is also useful if a user just wants to maintain only 
the latest version of the file.
+
+### Examples
+
+Suppose a user is ingesting data into a Hudi dataset of type COPY_ON_WRITE 
every 30 minutes as shown below:
+
+![Initial timeline](/assets/images/blog/hoodie-cleaner/Initial_timeline.png)
+_Figure 1: Incoming records getting ingested into a Hudi dataset every 30 
minutes_
+
+The figure shows a particular partition on DFS where commits and corresponding 
file 
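
The KEEP_LATEST_COMMITS sizing rule in the quoted post (ingestion every 30 minutes, longest query 5 hours, so retain at least 10 commits) is a ceiling division. A minimal sketch of that arithmetic follows. The class and method names are hypothetical and it does not touch any Hudi API; it only computes the number one would then configure for the cleaner.

```java
import java.time.Duration;

// Illustrative only: computes how many commits must be retained so that a file
// version stays on disk at least as long as the longest-running query.
class CleanerRetentionSketch {

    static long minCommitsToRetain(Duration ingestionInterval, Duration longestQuery) {
        long intervalMinutes = ingestionInterval.toMinutes();
        long queryMinutes = longestQuery.toMinutes();
        if (intervalMinutes <= 0) {
            throw new IllegalArgumentException("ingestion interval must be at least one minute");
        }
        // Ceiling division: a partially covered interval still needs a full commit.
        return (queryMinutes + intervalMinutes - 1) / intervalMinutes;
    }

    public static void main(String[] args) {
        long commits = minCommitsToRetain(Duration.ofMinutes(30), Duration.ofHours(5));
        System.out.println(commits); // prints 10, matching the example in the post
    }
}
```

The result is a lower bound under the assumption that commits arrive at a steady interval; irregular ingestion would call for some headroom on top of it.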

[GitHub] [hudi] nsivabalan commented on pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#issuecomment-855236580


   @vaibhav-sinha : I recreated this PR cleanly 
[here](https://github.com/apache/hudi/pull/3039). I have not made fixes to 
TestCopyOnWrite yet. Will wait for Flink experts to chime in. 






[GitHub] [hudi] nsivabalan opened a new pull request #3039: [HUDI-1864][WIP] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


nsivabalan opened a new pull request #3039:
URL: https://github.com/apache/hudi/pull/3039


   Redo of https://github.com/apache/hudi/pull/2923
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[GitHub] [hudi] Carl-Zhou-CN commented on a change in pull request #3036: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY

2021-06-05 Thread GitBox


Carl-Zhou-CN commented on a change in pull request #3036:
URL: https://github.com/apache/hudi/pull/3036#discussion_r645987505



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -423,7 +423,8 @@ private[hudi] object HoodieSparkSqlWriter {
 hiveSyncConfig.verifyMetadataFileListing = 
parameters(HoodieMetadataConfig.METADATA_VALIDATE_PROP).toBoolean
 hiveSyncConfig.ignoreExceptions = 
parameters.get(HIVE_IGNORE_EXCEPTIONS_OPT_KEY).exists(r => r.toBoolean)
 hiveSyncConfig.supportTimestamp = 
parameters.get(HIVE_SUPPORT_TIMESTAMP).exists(r => r.toBoolean)
-hiveSyncConfig.autoCreateDatabase = 
parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
+hiveSyncConfig.autoCreateDatabase = 
parameters.getOrElse(HIVE_AUTO_CREATE_DATABASE_OPT_KEY,

Review comment:
   I think this is a necessary modification. Originally, this method set 
HIVE_AUTO_CREATE_DATABASE_OPT_KEY to false, which is contrary to the default.
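
The disagreement in this thread comes down to what each lookup style returns when the key is absent from the options map. Below is a minimal Java sketch of the two styles, assuming a plain `Map<String, String>` stands in for `parameters`. The class, method names, and key string are illustrative and not Hudi's API. The `+` line in the diff is cut off, so the `"true"` default used here is an assumption. Note also that Scala's `toBoolean` rejects malformed strings, whereas `Boolean.parseBoolean` quietly returns false.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative stand-in; not part of Hudi.
final class AutoCreateDatabaseFlagSketch {

  // Mirrors `parameters.get(key).exists(r => r.toBoolean)`: an absent key yields false.
  static boolean existsStyle(Map<String, String> parameters, String key) {
    return Optional.ofNullable(parameters.get(key))
        .map(Boolean::parseBoolean)
        .orElse(false);
  }

  // Mirrors `parameters.getOrElse(key, default).toBoolean`: an absent key yields the default.
  static boolean getOrElseStyle(Map<String, String> parameters, String key, String defaultValue) {
    return Boolean.parseBoolean(parameters.getOrDefault(key, defaultValue));
  }

  public static void main(String[] args) {
    String key = "example.auto_create_database"; // hypothetical key name
    Map<String, String> parameters = new HashMap<>();
    System.out.println(existsStyle(parameters, key));            // false
    System.out.println(getOrElseStyle(parameters, key, "true")); // true
    parameters.put(key, "false");
    System.out.println(getOrElseStyle(parameters, key, "true")); // false
  }
}
```

If the map handed to this code is always pre-populated with defaults, both styles behave the same; the difference only shows up when the key is missing.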








[GitHub] [hudi] Carl-Zhou-CN commented on a change in pull request #3036: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY

2021-06-05 Thread GitBox


Carl-Zhou-CN commented on a change in pull request #3036:
URL: https://github.com/apache/hudi/pull/3036#discussion_r645987505



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -423,7 +423,8 @@ private[hudi] object HoodieSparkSqlWriter {
 hiveSyncConfig.verifyMetadataFileListing = 
parameters(HoodieMetadataConfig.METADATA_VALIDATE_PROP).toBoolean
 hiveSyncConfig.ignoreExceptions = 
parameters.get(HIVE_IGNORE_EXCEPTIONS_OPT_KEY).exists(r => r.toBoolean)
 hiveSyncConfig.supportTimestamp = 
parameters.get(HIVE_SUPPORT_TIMESTAMP).exists(r => r.toBoolean)
-hiveSyncConfig.autoCreateDatabase = 
parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
+hiveSyncConfig.autoCreateDatabase = 
parameters.getOrElse(HIVE_AUTO_CREATE_DATABASE_OPT_KEY,

Review comment:
   I think the change is necessary: the original code sets it to false, which 
goes against the default.








[GitHub] [hudi] codecov-commenter edited a comment on pull request #3035: [HUDI-1936] Introduce a optional property for conditional upsert

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3035:
URL: https://github.com/apache/hudi/pull/3035#issuecomment-855229410










[GitHub] [hudi] nsivabalan commented on a change in pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#discussion_r645985325



##
File path: 
hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteCopyOnWrite.java
##
@@ -380,12 +380,12 @@ public void testUpsertWithDelete() throws Exception {
   @Test
   public void testInsertWithMiniBatches() throws Exception {
 // reset the config option
-conf.setDouble(FlinkOptions.WRITE_BATCH_SIZE, 0.0006); // 630 bytes batch 
size
+conf.setDouble(FlinkOptions.WRITE_BATCH_SIZE, 0.00075); // 786 bytes batch 
size

Review comment:
   Also, nishith left a comment elsewhere about backward compatibility. 
Can you check that out as well?








[GitHub] [hudi] nsivabalan commented on pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#issuecomment-855233374


   @vaibhav-sinha : in the meantime, can you address nishith's feedback on 
test naming?






[GitHub] [hudi] nsivabalan commented on a change in pull request #2923: [HUDI-1864] Added support for Date, Timestamp, LocalDate and LocalDateTime in TimestampBasedAvroKeyGenerator

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #2923:
URL: https://github.com/apache/hudi/pull/2923#discussion_r645984981



##
File path: 
hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteCopyOnWrite.java
##
@@ -380,12 +380,12 @@ public void testUpsertWithDelete() throws Exception {
   @Test
   public void testInsertWithMiniBatches() throws Exception {
 // reset the config option
-conf.setDouble(FlinkOptions.WRITE_BATCH_SIZE, 0.0006); // 630 bytes batch 
size
+conf.setDouble(FlinkOptions.WRITE_BATCH_SIZE, 0.00075); // 786 bytes batch 
size

Review comment:
   @danny0405 @leesf @yanghua : hey folks. Can one of you check why these 
tests are failing for this patch and why the fix is required? I also tried 
locally and they fail with this patch, but I don't have a lot of context around 
the tests. I expected that these tests would not be affected by the changes in 
this patch. 
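
For context on the numbers in the diff comments: the byte counts follow from the option value if it is read as megabytes. A quick arithmetic sketch, assuming the unit is MB and a factor of 1024 * 1024 (both are assumptions, and the helper name is made up):

```java
// Illustrative arithmetic only; the MB unit and the 1024 * 1024 factor are assumptions.
final class BatchSizeArithmeticSketch {

  // Converts a fractional megabyte setting to whole bytes, truncating the remainder.
  static long megabytesToBytes(double megabytes) {
    return (long) (megabytes * 1024 * 1024);
  }

  public static void main(String[] args) {
    System.out.println(megabytesToBytes(0.0006));  // 629 (the diff comment says 630)
    System.out.println(megabytesToBytes(0.00075)); // 786
  }
}
```

Under these assumptions the new setting allows roughly 157 more bytes per batch, which would matter if the serialized size of the test records grew with this patch.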








[GitHub] [hudi] codecov-commenter edited a comment on pull request #3035: [HUDI-1936] Introduce a optional property for conditional upsert

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3035:
URL: https://github.com/apache/hudi/pull/3035#issuecomment-855229410


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3035](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (26dadb6) into 
[master](https://codecov.io/gh/apache/hudi/commit/974b476180e61fac58cd87e78699428d4108a482?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (974b476) will **increase** coverage by `15.81%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3035/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
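   @@              Coverage Diff               @@
   ##             master    #3035       +/-   ##
   ==============================================
   + Coverage     55.01%   70.83%   +15.81%     
   + Complexity     3850      385     -3465     
   ==============================================
     Files           485       54      -431     
     Lines         23467     2016    -21451     
     Branches       2497      241     -2256     
   ==============================================
   - Hits          12911     1428    -11483     
   + Misses         9405      454     -8951     
   + Partials       1151      134     -1017     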
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `70.83% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...di-cli/src/main/java/org/apache/hudi/cli/Main.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL01haW4uamF2YQ==)
 | | |
   | 
[.../java/org/apache/hudi/common/metrics/Registry.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21ldHJpY3MvUmVnaXN0cnkuamF2YQ==)
 | | |
   | 
[...e/hudi/common/util/queue/BoundedInMemoryQueue.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvcXVldWUvQm91bmRlZEluTWVtb3J5UXVldWUuamF2YQ==)
 | | |
   | 
[...i/common/util/collection/ExternalSpillableMap.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvY29sbGVjdGlvbi9FeHRlcm5hbFNwaWxsYWJsZU1hcC5qYXZh)
 | | |
   | 
[...rg/apache/hudi/metadata/HoodieMetadataMetrics.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllTWV0YWRhdGFNZXRyaWNzLmphdmE=)
 | | |
   | 
[...c/main/java/org/apache/hudi/dla/DLASyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktZGxhLXN5bmMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZGxhL0RMQVN5bmNDb25maWcuamF2YQ==)
 | | |
   | 

[GitHub] [hudi] codecov-commenter commented on pull request #3035: [HUDI-1936] Introduce a optional property for conditional upsert

2021-06-05 Thread GitBox


codecov-commenter commented on pull request #3035:
URL: https://github.com/apache/hudi/pull/3035#issuecomment-855229410


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3035](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (26dadb6) into 
[master](https://codecov.io/gh/apache/hudi/commit/974b476180e61fac58cd87e78699428d4108a482?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (974b476) will **decrease** coverage by `45.74%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3035/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
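   @@              Coverage Diff               @@
   ##             master    #3035       +/-   ##
   ==============================================
   - Coverage     55.01%    9.27%   -45.75%     
   + Complexity     3850       48     -3802     
   ==============================================
     Files           485       54      -431     
     Lines         23467     2016    -21451     
     Branches       2497      241     -2256     
   ==============================================
   - Hits          12911      187    -12724     
   + Misses         9405     1816     -7589     
   + Partials       1151       13     -1138     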
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.27% <ø> (-61.56%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3035/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[GitHub] [hudi] nsivabalan commented on a change in pull request #2963: [HUDI-1904] Make SchemaProvider spark free and move it to hudi-client-common

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #2963:
URL: https://github.com/apache/hudi/pull/2963#discussion_r645980835



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/schema/SchemaProvider.java
##
@@ -34,18 +32,9 @@
 @PublicAPIClass(maturity = ApiMaturityLevel.STABLE)
 public abstract class SchemaProvider implements Serializable {
 
-  protected TypedProperties config;
+  protected Schema sourceSchema;
 
-  protected JavaSparkContext jssc;
-
-  public SchemaProvider(TypedProperties props) {

Review comment:
   sorry, guess we need to be careful there. 
   old constructor
   ```
   SchemaProvider(TypedProperties props, JavaSparkContext jssc)
   ```
   and new one 
   ```
   SchemaProvider(TypedProperties props, HoodieEngineContext context)
   ```
   JavaSparkContext may not be castable to HoodieEngineContext. So, again, 
this might be backward incompatible. 
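
A minimal sketch of why the signature change could break existing subclasses, assuming user-supplied providers are instantiated reflectively by constructor parameter types (an assumption about the loading path, not something confirmed in this thread). All types below are stand-ins, not the real Hudi or Spark classes.

```java
import java.lang.reflect.Constructor;

// Stand-in types; none of these are the real Hudi or Spark classes.
final class ConstructorCompatSketch {

  static final class Props {}
  static final class SparkContextStandIn {}
  static final class EngineContextStandIn {}

  // A user-defined provider written against the old two-argument signature.
  static final class LegacyProvider {
    public LegacyProvider(Props props, SparkContextStandIn jssc) {}
  }

  // Returns true if the class exposes a public constructor with exactly these parameter types.
  static boolean hasConstructor(Class<?> clazz, Class<?>... parameterTypes) {
    try {
      Constructor<?> ctor = clazz.getConstructor(parameterTypes);
      return ctor != null;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Lookup by the old signature succeeds.
    System.out.println(hasConstructor(LegacyProvider.class, Props.class, SparkContextStandIn.class));
    // Lookup by the new signature fails, because the two context types are unrelated.
    System.out.println(hasConstructor(LegacyProvider.class, Props.class, EngineContextStandIn.class));
  }
}
```

If the loader only asks for the new signature, a subclass written against the old one would not be found, which is one way the incompatibility could surface.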
   








[GitHub] [hudi] veenaypatil commented on a change in pull request #3036: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY

2021-06-05 Thread GitBox


veenaypatil commented on a change in pull request #3036:
URL: https://github.com/apache/hudi/pull/3036#discussion_r645981297



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -423,7 +423,8 @@ private[hudi] object HoodieSparkSqlWriter {
 hiveSyncConfig.verifyMetadataFileListing = 
parameters(HoodieMetadataConfig.METADATA_VALIDATE_PROP).toBoolean
 hiveSyncConfig.ignoreExceptions = 
parameters.get(HIVE_IGNORE_EXCEPTIONS_OPT_KEY).exists(r => r.toBoolean)
 hiveSyncConfig.supportTimestamp = 
parameters.get(HIVE_SUPPORT_TIMESTAMP).exists(r => r.toBoolean)
-hiveSyncConfig.autoCreateDatabase = 
parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
+hiveSyncConfig.autoCreateDatabase = 
parameters.getOrElse(HIVE_AUTO_CREATE_DATABASE_OPT_KEY,

Review comment:
   yeah, I agree; not sure why it would be set to false then. We can close the 
PR if this is not required.
   








[GitHub] [hudi] leesf commented on a change in pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


leesf commented on a change in pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#discussion_r645980679



##
File path: 
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java
##
@@ -35,33 +38,18 @@
 import org.apache.hudi.common.util.Option;
 import org.apache.hudi.common.util.ValidationUtils;
 import org.apache.hudi.hive.util.HiveSchemaUtil;
-
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.IMetaStoreClient;
-import org.apache.hadoop.hive.ql.metadata.Hive;
-import org.apache.hadoop.hive.ql.metadata.HiveException;
-import org.apache.hadoop.hive.ql.processors.CommandProcessorResponse;
-import org.apache.hadoop.hive.ql.session.SessionState;
 import org.apache.hudi.sync.common.AbstractSyncHoodieClient;
 import org.apache.log4j.LogManager;
 import org.apache.log4j.Logger;
 import org.apache.parquet.schema.MessageType;
 import org.apache.thrift.TException;
 
 import java.io.IOException;
-import java.sql.Connection;
-import java.sql.DatabaseMetaData;
-import java.sql.DriverManager;
-import java.sql.ResultSet;
-import java.sql.SQLException;
-import java.sql.Statement;
-import java.util.ArrayList;
-import java.util.Collections;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
+import java.io.UnsupportedEncodingException;
+import java.net.URLDecoder;
+import java.nio.charset.StandardCharsets;
+import java.sql.*;
+import java.util.*;

Review comment:
   please revert the change to avoid using import *








[hudi] branch master updated: [HUDI-1979] Optimize logic to improve code readability (#3037)

2021-06-05 Thread leesf
This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new dab13f7  [HUDI-1979] Optimize logic to improve code readability (#3037)
dab13f7 is described below

commit dab13f7473cfe4fad42189056e4c931ee2e2a297
Author: Wei 
AuthorDate: Sat Jun 5 19:40:45 2021 +0800

[HUDI-1979] Optimize logic to improve code readability (#3037)

Co-authored-by: wei.zhang2 
---
 .../java/org/apache/hudi/hive/HiveMetastoreBasedLockProvider.java  | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git 
a/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveMetastoreBasedLockProvider.java
 
b/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveMetastoreBasedLockProvider.java
index 593adc2..9b5a1b0 100644
--- 
a/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveMetastoreBasedLockProvider.java
+++ 
b/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveMetastoreBasedLockProvider.java
@@ -184,18 +184,13 @@ public class HiveMetastoreBasedLockProvider implements 
LockProvider hiveClient.lock(lockRequestFinal))
   .get(time, unit);
 } catch (InterruptedException | TimeoutException e) {
-  if (this.lock != null && this.lock.getState() == LockState.ACQUIRED) {
-return;
-  } else if (lockRequest != null) {
+  if (this.lock == null || this.lock.getState() != LockState.ACQUIRED) {
 LockResponse lockResponse = 
this.hiveClient.checkLock(lockRequest.getTxnid());
 if (lockResponse.getState() == LockState.ACQUIRED) {
   this.lock = lockResponse;
-  return;
 } else {
   throw e;
 }
-  } else {
-throw e;
   }
 }
   }


[GitHub] [hudi] leesf merged pull request #3037: [HUDI-1979] Optimize logic to improve code readability

2021-06-05 Thread GitBox


leesf merged pull request #3037:
URL: https://github.com/apache/hudi/pull/3037


   






[GitHub] [hudi] leesf commented on a change in pull request #2616: [HUDI-1625] Support range partition keygen

2021-06-05 Thread GitBox


leesf commented on a change in pull request #2616:
URL: https://github.com/apache/hudi/pull/2616#discussion_r645980052



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/RangePartitionAvroKeyGenerator.java
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen;
+
+import org.apache.hudi.avro.HoodieAvroUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.exception.HoodieNotSupportedException;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import org.apache.avro.generic.GenericRecord;
+
+public class RangePartitionAvroKeyGenerator extends SimpleAvroKeyGenerator {
+
+  private final Long rangePerBucket;
+  private final String partitionName;
+
+  public static class Config {
+public static final String RANGE_PER_PARTITION_PROP = 
"hoodie.keygen.range.partition.num";
+public static final Long DEFAULT_RANGE_PER_PARTITION = 10L;
+public static final String RANGE_PARTITION_NAME_PROP = 
"hoodie.keygen.range.partition.name";
+public static final String DEFAULT_RANGE_PARTITION_NAME = "rangePartition";

Review comment:
   I am curious what the partition name would be set to in a CDC scenario: 
should it be the incremental primary key, or some other field, to make the range 
partition useful?
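
To make the question concrete, here is a hypothetical sketch of range bucketing on a numeric key. The PR's actual partition-path logic is not shown in this thread, so the integer-division rule, the `name=bucket` path format, and the method name are all assumptions; only the default name `rangePartition` and the default range of 10 come from the quoted `Config` class.

```java
// Hypothetical illustration; the PR's actual partition-path logic is not shown in this thread.
final class RangePartitionSketch {

  // Maps a numeric key to a bucket by integer division,
  // e.g. keys 0..9 fall into bucket 0 when rangePerBucket is 10.
  static String partitionPath(String partitionName, long key, long rangePerBucket) {
    if (rangePerBucket <= 0) {
      throw new IllegalArgumentException("rangePerBucket must be positive");
    }
    return partitionName + "=" + Math.floorDiv(key, rangePerBucket);
  }

  public static void main(String[] args) {
    System.out.println(partitionPath("rangePartition", 7L, 10L));     // rangePartition=0
    System.out.println(partitionPath("rangePartition", 12345L, 10L)); // rangePartition=1234
  }
}
```

With an auto-incrementing primary key, consecutive ids would land in the same bucket under this rule, which is why the choice of field matters for a CDC workload.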








[GitHub] [hudi] nsivabalan commented on a change in pull request #3036: [HUDI-1942] Add Default value for HIVE_AUTO_CREATE_DATABASE_OPT_KEY

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #3036:
URL: https://github.com/apache/hudi/pull/3036#discussion_r645979488



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##
@@ -423,7 +423,8 @@ private[hudi] object HoodieSparkSqlWriter {
 hiveSyncConfig.verifyMetadataFileListing = 
parameters(HoodieMetadataConfig.METADATA_VALIDATE_PROP).toBoolean
 hiveSyncConfig.ignoreExceptions = 
parameters.get(HIVE_IGNORE_EXCEPTIONS_OPT_KEY).exists(r => r.toBoolean)
 hiveSyncConfig.supportTimestamp = 
parameters.get(HIVE_SUPPORT_TIMESTAMP).exists(r => r.toBoolean)
-hiveSyncConfig.autoCreateDatabase = 
parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
+hiveSyncConfig.autoCreateDatabase = 
parameters.getOrElse(HIVE_AUTO_CREATE_DATABASE_OPT_KEY,

Review comment:
   This is not strictly required. hiveSyncConfig.autoCreateDatabase has a 
default value of true, which is DEFAULT_HIVE_AUTO_CREATE_DATABASE_OPT_KEY. 








[GitHub] [hudi] nsivabalan commented on pull request #2616: [HUDI-1625] Support range partition keygen

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2616:
URL: https://github.com/apache/hudi/pull/2616#issuecomment-855225713


   @garyli1019 : Did you get a chance to follow up on this?






[GitHub] [hudi] nsivabalan commented on a change in pull request #2993: [HUDI-1929] Support configure KeyGenerator by type

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #2993:
URL: https://github.com/apache/hudi/pull/2993#discussion_r645978567



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/factory/HoodieAvroKeyGeneratorFactory.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen.factory;
+
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.common.util.StringUtils;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.keygen.ComplexAvroKeyGenerator;
+import org.apache.hudi.keygen.CustomAvroKeyGenerator;
+import org.apache.hudi.keygen.GlobalAvroDeleteKeyGenerator;
+import org.apache.hudi.keygen.KeyGenerator;
+import org.apache.hudi.keygen.NonpartitionedAvroKeyGenerator;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorType;
+
+import java.io.IOException;
+import java.util.Locale;
+import java.util.Objects;
+
+/**
+ * Factory help to create {@link org.apache.hudi.keygen.KeyGenerator}.
+ * 
+ * This factory will try {@link HoodieWriteConfig#KEYGENERATOR_CLASS_PROP} 
firstly, this ensures the class prop
+ * will not be overwritten by {@link KeyGeneratorType}
+ */
+public class HoodieAvroKeyGeneratorFactory {
+  public static KeyGenerator createKeyGenerator(TypedProperties props) throws 
IOException {
+// keyGenerator class name has higher priority
+KeyGenerator keyGenerator = createKeyGeneratorByClassName(props);
+return Objects.isNull(keyGenerator) ? createKeyGeneratorByType(props) : 
keyGenerator;
+  }
+
+  public static KeyGenerator createKeyGeneratorByClassName(TypedProperties 
props) throws IOException {

Review comment:
   now I get the flow. I guess this has to be moved to a Utils class so 
that both keyGens can use it. 
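
The flow referred to here (class name first, type as fallback) can be sketched as a small shared helper. This is a stand-in for the resolution order described in the Javadoc, not the PR's code: the property names, the enum constants, and the `SIMPLE` default when neither property is set are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Properties;

// Stand-in for the resolution order described in the Javadoc; not the PR's code.
final class KeyGenResolutionSketch {

  // Property names are assumed for illustration.
  static final String CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
  static final String TYPE_PROP = "hoodie.datasource.write.keygenerator.type";

  // Hypothetical set of built-in generator types.
  enum KeyGenType { SIMPLE, COMPLEX, TIMESTAMP, CUSTOM, NON_PARTITION, GLOBAL_DELETE }

  // Returns a description of which generator would be chosen.
  static String resolve(Properties props) {
    // An explicit class name has higher priority than the type.
    String className = props.getProperty(CLASS_PROP);
    if (className != null && !className.trim().isEmpty()) {
      return "class:" + className.trim();
    }
    // Otherwise fall back to the type, defaulting to SIMPLE (assumed default).
    String typeName = props.getProperty(TYPE_PROP, KeyGenType.SIMPLE.name());
    return "type:" + KeyGenType.valueOf(typeName.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(TYPE_PROP, "complex");
    System.out.println(resolve(props)); // type:COMPLEX
    props.setProperty(CLASS_PROP, "com.example.MyKeyGen");
    System.out.println(resolve(props)); // class:com.example.MyKeyGen
  }
}
```

Keeping this order in one helper would let the Avro and Spark factories share it, so the class property cannot be silently overridden by the type in only one of them.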

##
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/factory/HoodieSparkKeyGeneratorFactory.java
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.keygen.factory;
+
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieKeyGeneratorException;
+import org.apache.hudi.keygen.BuiltinKeyGenerator;
+import org.apache.hudi.keygen.ComplexKeyGenerator;
+import org.apache.hudi.keygen.CustomKeyGenerator;
+import org.apache.hudi.keygen.GlobalDeleteKeyGenerator;
+import org.apache.hudi.keygen.NonpartitionedKeyGenerator;
+import org.apache.hudi.keygen.SimpleKeyGenerator;
+import org.apache.hudi.keygen.TimestampBasedKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorType;
+
+import java.io.IOException;
+import java.util.Locale;
+import java.util.Objects;
+
+/**
+ * Factory help to create {@link org.apache.hudi.keygen.KeyGenerator}.
+ * 
+ * This factory will try {@link HoodieWriteConfig#KEYGENERATOR_CLASS_PROP} 
firstly, this ensures the class prop
+ * will not be overwritten by {@link KeyGeneratorType}
+ */
+public class HoodieSparkKeyGeneratorFactory {

Review comment:
   I get it now, thanks. 

##
File path: 

[GitHub] [hudi] nsivabalan commented on pull request #2768: [HUDI-485]: corrected the check for incremental sql

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2768:
URL: https://github.com/apache/hudi/pull/2768#issuecomment-855224613


   yes, please. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] nsivabalan commented on pull request #2967: Added blog for Hudi cleaner service

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2967:
URL: https://github.com/apache/hudi/pull/2967#issuecomment-855224324


   We can land once addressed. 






[GitHub] [hudi] nsivabalan commented on pull request #2967: Added blog for Hudi cleaner service

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2967:
URL: https://github.com/apache/hudi/pull/2967#issuecomment-855224297


   Sorry, one last comment. In the figure, instead of labeling them as fileIds 
(fileId1, fileId2, ...), can we use fileGroup1, fileGroup2, etc.? A file group 
represents all files in that group. 






[GitHub] [hudi] nsivabalan commented on pull request #2925: [HUDI-1879] Fix RO Tables Returning Snapshot Result

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2925:
URL: https://github.com/apache/hudi/pull/2925#issuecomment-855223971


   Once you move the config to HiveSyncConfig or HiveSyncTool, we should be 
good to merge this in.






[GitHub] [hudi] nsivabalan commented on a change in pull request #2925: [HUDI-1879] Fix RO Tables Returning Snapshot Result

2021-06-05 Thread GitBox


nsivabalan commented on a change in pull request #2925:
URL: https://github.com/apache/hudi/pull/2925#discussion_r645977253



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/config/DefaultHoodieConfig.java
##
@@ -26,6 +26,11 @@
  */
 public class DefaultHoodieConfig implements Serializable {
 
+  public static final String QUERY_TYPE_OPT_KEY = "hoodie.datasource.query.type";

Review comment:
   Yeah, I don't see any usages of this config outside of HiveSyncTool, 
so it is better to keep it local. Initially I thought along similar lines as 
@umehrot2, that we could leverage the table name to detect the type. But I later 
realized that this particular config is used while syncing the schema, and the 
caller hard-codes the type (there is only one caller of this method, and it is 
local to HiveSyncTool). So leveraging the table name may not be required. 








[GitHub] [hudi] codecov-commenter edited a comment on pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#issuecomment-855220122


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3038](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (18cc6ec) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `7.73%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3038/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
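   @@             Coverage Diff              @@
   ##             master    #3038      +/-   ##
   ============================================
   - Coverage     70.83%   63.09%   -7.74%
   + Complexity      385      347      -38
   ============================================
     Files            54       54
     Lines          2016     2016
     Branches        241      241
   ============================================
   - Hits           1428     1272     -156
   - Misses          454      621     +167
   + Partials        134      123      -11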
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `∅ <ø> (∅)` | |
   | hudiutilities | `63.09% <ø> (-7.74%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==)
 | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | 
[...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh)
 | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh)
 | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | 
[...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)
 | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.18% <0.00%> (+0.33%)` | :arrow_up: |
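
The figures in this report can be reproduced from the raw counts. The sketch below assumes that a coverage percentage is hits divided by lines, truncated to two decimals; that assumption is inferred from the numbers in this report (for example, 1272 / 2016 is about 63.0952%, shown as 63.09%) and is not taken from Codecov's documentation. Under the same assumption, the table delta (-7.74%) is the difference of the two truncated values, while the headline figure (7.73%) matches the truncated difference of the untruncated values.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CoverageArithmetic {

  // Coverage percentage truncated (not rounded) to two decimals.
  static BigDecimal truncatedPercent(long hits, long lines) {
    return BigDecimal.valueOf(hits * 100)
        .divide(BigDecimal.valueOf(lines), 2, RoundingMode.DOWN);
  }

  // Same ratio kept at ten decimals, used for the headline delta.
  static BigDecimal precisePercent(long hits, long lines) {
    return BigDecimal.valueOf(hits * 100)
        .divide(BigDecimal.valueOf(lines), 10, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    BigDecimal base = truncatedPercent(1428, 2016); // 70.83
    BigDecimal head = truncatedPercent(1272, 2016); // 63.09
    // Table delta: difference of the truncated percentages.
    System.out.println(base + " -> " + head + " (" + head.subtract(base) + ")");
    // Headline delta: difference of the precise percentages, then truncated.
    BigDecimal headline = precisePercent(1428, 2016)
        .subtract(precisePercent(1272, 2016))
        .setScale(2, RoundingMode.DOWN);
    System.out.println("headline decrease: " + headline);
  }
}
```

Running it prints `70.83 -> 63.09 (-7.74)` followed by `headline decrease: 7.73`, matching both numbers in the report.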
   



[GitHub] [hudi] nsivabalan commented on pull request #2896: [HUDI-1790] Added SqlSource to fetch data from any partitions for backfill use case

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2896:
URL: https://github.com/apache/hudi/pull/2896#issuecomment-855221106


   @vingov: Hey Vinoth, did you get a chance to check out my feedback? We can 
merge this in once it is addressed. 






[GitHub] [hudi] nsivabalan commented on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer

2021-06-05 Thread GitBox


nsivabalan commented on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-855221039


   @vingov: Did you get a chance to check out my feedback? Once it is addressed, 
we can land this. 






[GitHub] [hudi] codecov-commenter commented on pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


codecov-commenter commented on pull request #3038:
URL: https://github.com/apache/hudi/pull/3038#issuecomment-855220122


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3038](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (18cc6ec) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `7.73%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3038/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
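   @@             Coverage Diff              @@
   ##             master    #3038      +/-   ##
   ============================================
   - Coverage     70.83%   63.09%   -7.74%
   + Complexity      385      347      -38
   ============================================
     Files            54       54
     Lines          2016     2016
     Branches        241      241
   ============================================
   - Hits           1428     1272     -156
   - Misses          454      621     +167
   + Partials        134      123      -11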
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `63.09% <ø> (-7.74%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3038?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==)
 | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | 
[...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh)
 | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh)
 | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | 
[...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)
 | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3038/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.18% <0.00%> (+0.33%)` | :arrow_up: |
   



[GitHub] [hudi] Xuehai-Chen edited a comment on pull request #2854: [HUDI-1771] Propagate CDC format for hoodie

2021-06-05 Thread GitBox


Xuehai-Chen edited a comment on pull request #2854:
URL: https://github.com/apache/hudi/pull/2854#issuecomment-855218182


   Maybe I have this wrong, but it looks like the "_hoodie_cdc_operation" field is 
not set properly when records are flushed to files.
   I think the problem is that when StreamWriteFunction.bufferRecord converts a 
HoodieRecord to a DataItem, the operation field is lost.
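
To illustrate the kind of loss described here, the sketch below converts a record into a buffered item and back. It is not the Hudi code: `SourceRecord`, `BufferedItem`, and the conversion methods are hypothetical stand-ins that only show how a field disappears when a conversion copies some members but not others.

```java
public class OperationFieldSketch {

  // Hypothetical stand-in for a record that carries a CDC operation.
  static final class SourceRecord {
    final String key;
    final String payload;
    final String operation; // e.g. "I", "U", "D"; null when unknown

    SourceRecord(String key, String payload, String operation) {
      this.key = key;
      this.payload = payload;
      this.operation = operation;
    }
  }

  // Hypothetical stand-in for the compact item kept in the write buffer.
  static final class BufferedItem {
    final String key;
    final String payload;
    final String operation;

    BufferedItem(String key, String payload, String operation) {
      this.key = key;
      this.payload = payload;
      this.operation = operation;
    }
  }

  // Lossy conversion: copies key and payload only, so the operation is dropped.
  static BufferedItem toItemLossy(SourceRecord r) {
    return new BufferedItem(r.key, r.payload, null);
  }

  // Preserving conversion: carries the operation along with the other fields.
  static BufferedItem toItemPreserving(SourceRecord r) {
    return new BufferedItem(r.key, r.payload, r.operation);
  }

  // Rebuilds a record from a buffered item at flush time.
  static SourceRecord fromItem(BufferedItem item) {
    return new SourceRecord(item.key, item.payload, item.operation);
  }

  public static void main(String[] args) {
    SourceRecord in = new SourceRecord("id-1", "row", "D");
    System.out.println(fromItem(toItemLossy(in)).operation);      // null
    System.out.println(fromItem(toItemPreserving(in)).operation); // D
  }
}
```

A round-trip check like the one in `main` is a cheap way to catch this class of bug in a unit test.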






[GitHub] [hudi] Xuehai-Chen commented on pull request #2854: [HUDI-1771] Propagate CDC format for hoodie

2021-06-05 Thread GitBox


Xuehai-Chen commented on pull request #2854:
URL: https://github.com/apache/hudi/pull/2854#issuecomment-855218182


   Maybe I have this wrong, but it looks like the "_hoodie_cdc_operation" field is 
not set properly when StreamWriteFunction.bufferRecord converts a HoodieRecord to a 
DataItem.






[jira] [Updated] (HUDI-1980) Optimize the code to prevent other exceptions from causing resources not to be closed

2021-06-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1980:
-
Labels: pull-request-available  (was: )

> Optimize the code to prevent other exceptions from causing resources not to 
> be closed
> -
>
> Key: HUDI-1980
> URL: https://issues.apache.org/jira/browse/HUDI-1980
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Hive Integration
>Reporter: Wei
>Priority: Critical
>  Labels: pull-request-available
>
>  When *HoodieHiveClient* is initializing resources, some exceptions may prevent 
> those resources from being closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] chaplinthink opened a new pull request #3038: [HUDI-1980] Optimize the code to prevent other exceptions from causin…

2021-06-05 Thread GitBox


chaplinthink opened a new pull request #3038:
URL: https://github.com/apache/hudi/pull/3038


   …g resources not to be closed
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[jira] [Updated] (HUDI-1980) Optimize the code to prevent other exceptions from causing resources not to be closed

2021-06-05 Thread Wei (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei updated HUDI-1980:
--
Description:  When *HoodieHiveClient* is initializing resources, some 
exceptions may prevent those resources from being closed.
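
One common way to avoid this kind of leak is to close whatever has already been opened when a later initialization step throws. The sketch below shows that pattern in isolation; it is not the change made in the pull request, and `TrackedResource`, `Client`, and the helper methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SafeInitSketch {

  // Hypothetical resource that records whether close() was called.
  static final class TrackedResource implements AutoCloseable {
    final String name;
    boolean closed;

    TrackedResource(String name) {
      this.name = name;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Hypothetical client that must open two resources in its constructor.
  static final class Client implements AutoCloseable {
    private final TrackedResource connection;
    private final TrackedResource metastore;

    Client(List<TrackedResource> opened, boolean failSecondStep) {
      TrackedResource conn = null;
      TrackedResource meta = null;
      try {
        conn = open("connection", opened);
        if (failSecondStep) {
          throw new IllegalStateException("metastore init failed");
        }
        meta = open("metastore", opened);
      } catch (RuntimeException e) {
        // Release whatever was opened before the failure, then rethrow.
        closeQuietly(meta);
        closeQuietly(conn);
        throw e;
      }
      this.connection = conn;
      this.metastore = meta;
    }

    @Override
    public void close() {
      closeQuietly(metastore);
      closeQuietly(connection);
    }
  }

  static TrackedResource open(String name, List<TrackedResource> opened) {
    TrackedResource r = new TrackedResource(name);
    opened.add(r); // remember every resource so the caller can inspect it
    return r;
  }

  static void closeQuietly(TrackedResource r) {
    if (r != null) {
      r.close();
    }
  }

  public static void main(String[] args) {
    List<TrackedResource> opened = new ArrayList<>();
    try {
      new Client(opened, true);
    } catch (IllegalStateException e) {
      System.out.println("init failed: " + e.getMessage());
    }
    System.out.println("connection closed: " + opened.get(0).closed);
  }
}
```

With `failSecondStep` set, the constructor throws, yet the connection opened in the first step is reported as closed. Without the catch block it would stay open, because the caller never receives a `Client` to call `close()` on.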

> Optimize the code to prevent other exceptions from causing resources not to 
> be closed
> -
>
> Key: HUDI-1980
> URL: https://issues.apache.org/jira/browse/HUDI-1980
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Hive Integration
>Reporter: Wei
>Priority: Critical
>
>  When *HoodieHiveClient* is initializing resources, some exceptions may prevent 
> those resources from being closed.





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3037: [HUDI-1979] Optimize logic to improve code readability

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3037:
URL: https://github.com/apache/hudi/pull/3037#issuecomment-855211206


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3037](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (0152412) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `15.70%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3037/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
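   @@             Coverage Diff              @@
   ##             master    #3037      +/-   ##
   ============================================
   - Coverage     70.83%   55.12%  -15.71%
   - Complexity      385     3864    +3479
   ============================================
     Files            54      487     +433
     Lines          2016    23605   +21589
     Branches        241     2528    +2287
   ============================================
   + Hits           1428    13012   +11584
   - Misses          454     9433    +8979
   - Partials        134     1160    +1026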
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (?)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.31% <ø> (?)` | |
   | hudiflink | `63.25% <ø> (?)` | |
   | hudihadoopmr | `51.43% <ø> (?)` | |
   | hudisparkdatasource | `74.28% <ø> (?)` | |
   | hudisync | `46.60% <0.00%> (?)` | |
   | huditimelineservice | `64.36% <ø> (?)` | |
   | hudiutilities | `70.83% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=)
 | `85.93% <0.00%> (ø)` | |
   | 
[...in/scala/org/apache/hudi/HoodieEmptyRelation.scala](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUVtcHR5UmVsYXRpb24uc2NhbGE=)
 | `66.66% <0.00%> (ø)` | |
   | 
[...java/org/apache/hudi/hive/util/HiveSchemaUtil.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvdXRpbC9IaXZlU2NoZW1hVXRpbC5qYXZh)
 | `68.93% <0.00%> (ø)` | |
   | 
[...penJ9MemoryLayoutSpecification64bitCompressed.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvanZtL09wZW5KOU1lbW9yeUxheW91dFNwZWNpZmljYXRpb242NGJpdENvbXByZXNzZWQuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 

[jira] [Created] (HUDI-1980) Optimize the code to prevent other exceptions from causing resources not to be closed

2021-06-05 Thread Wei (Jira)
Wei created HUDI-1980:
-

 Summary: Optimize the code to prevent other exceptions from 
causing resources not to be closed
 Key: HUDI-1980
 URL: https://issues.apache.org/jira/browse/HUDI-1980
 Project: Apache Hudi
  Issue Type: Improvement
  Components: Hive Integration
Reporter: Wei








[GitHub] [hudi] codecov-commenter edited a comment on pull request #3037: [HUDI-1979] Optimize logic to improve code readability

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3037:
URL: https://github.com/apache/hudi/pull/3037#issuecomment-855211206










[GitHub] [hudi] codecov-commenter edited a comment on pull request #3037: [HUDI-1979] Optimize logic to improve code readability

2021-06-05 Thread GitBox


codecov-commenter edited a comment on pull request #3037:
URL: https://github.com/apache/hudi/pull/3037#issuecomment-855211206


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3037](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (0152412) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3037/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
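   @@           Coverage Diff            @@
   ##             master    #3037   +/-   ##
   ========================================
     Coverage     70.83%   70.83%
     Complexity      385      385
   ========================================
     Files            54       54
     Lines          2016     2016
     Branches        241      241
   ========================================
     Hits           1428     1428
     Misses          454      454
     Partials        134      134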
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `70.83% <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   






[GitHub] [hudi] codecov-commenter commented on pull request #3037: [HUDI-1979] Optimize logic to improve code readability

2021-06-05 Thread GitBox


codecov-commenter commented on pull request #3037:
URL: https://github.com/apache/hudi/pull/3037#issuecomment-855211206


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3037](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (0152412) into 
[master](https://codecov.io/gh/apache/hudi/commit/c2383ee9040001cfc1a9e8f6add65d24ad991969?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 (c2383ee) will **decrease** coverage by `61.55%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3037/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
   
   ```diff
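   @@             Coverage Diff              @@
   ##             master    #3037      +/-   ##
   ============================================
   - Coverage     70.83%    9.27%  -61.56%
   + Complexity      385       48     -337
   ============================================
     Files            54       54
     Lines          2016     2016
     Branches        241      241
   ============================================
   - Hits           1428      187    -1241
   - Misses          454     1816    +1362
   + Partials        134       13     -121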
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `?` | |
   | hudiutilities | `9.27% <ø> (-61.56%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3037?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3037/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | 

[jira] [Updated] (HUDI-1979) Optimize logic to improve code readability

2021-06-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1979:
-
Labels: pull-request-available  (was: )

>  Optimize logic to improve code readability
> ---
>
> Key: HUDI-1979
> URL: https://issues.apache.org/jira/browse/HUDI-1979
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Hive Integration
>Reporter: Wei
>Priority: Minor
>  Labels: pull-request-available
>
> In HiveMetastoreBasedLockProvider#acquireLockInternal(), remove the
> unnecessary conditional checks and return statement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] chaplinthink opened a new pull request #3037: [HUDI-1979] Optimize logic to improve code readability

2021-06-05 Thread GitBox


chaplinthink opened a new pull request #3037:
URL: https://github.com/apache/hudi/pull/3037


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (HUDI-1847) Add ability to decouple configs for scheduling inline and running async

2021-06-05 Thread Vinay (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay reassigned HUDI-1847:
---

Assignee: Vinay

> Add ability to decouple configs for scheduling inline and running async
> ---
>
> Key: HUDI-1847
> URL: https://issues.apache.org/jira/browse/HUDI-1847
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Nishith Agarwal
>Assignee: Vinay
>Priority: Major
>  Labels: sev:high
>
> Currently, there are 2 ways to enable compaction:
>  
>  # Inline - This will schedule compaction inline and execute inline
>  # Async - This option is only available for HoodieDeltaStreamer based jobs. 
> This turns on scheduling inline and running async as part of the same spark 
> job.
>  
> Users need a config that lets them schedule compaction inline while 
> executing it in their own Spark job.





[jira] [Commented] (HUDI-1847) Add ability to decouple configs for scheduling inline and running async

2021-06-05 Thread Vinay (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357787#comment-17357787
 ] 

Vinay commented on HUDI-1847:
-

[~nishith29] Thank you for laying out all the steps clearly. I would like to 
start working on this issue and will let you know if I run into any problems.

> Add ability to decouple configs for scheduling inline and running async
> ---
>
> Key: HUDI-1847
> URL: https://issues.apache.org/jira/browse/HUDI-1847
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Nishith Agarwal
>Priority: Major
>  Labels: sev:high
>
> Currently, there are 2 ways to enable compaction:
>  
>  # Inline - This will schedule compaction inline and execute inline
>  # Async - This option is only available for HoodieDeltaStreamer based jobs. 
> This turns on scheduling inline and running async as part of the same spark 
> job.
>  
> Users need a config that lets them schedule compaction inline while 
> executing it in their own Spark job.





[jira] [Commented] (HUDI-1942) HIVE_AUTO_CREATE_DATABASE_OPT_KEY This should default to true when Hudi synchronizes Hive

2021-06-05 Thread Vinay (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357786#comment-17357786
 ] 

Vinay commented on HUDI-1942:
-

[~yao.z...@yuanxi.onaliyun.com] Yes, I have already added that in the PR. 
Could you please review it? [GitHub Pull Request 
#3036|https://github.com/apache/hudi/pull/3036]

> HIVE_AUTO_CREATE_DATABASE_OPT_KEY This should default to true when Hudi 
> synchronizes Hive
> -
>
> Key: HUDI-1942
> URL: https://issues.apache.org/jira/browse/HUDI-1942
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: newbie
>Reporter: yao.zhou
>Assignee: Vinay
>Priority: Major
>  Labels: easy-fix, pull-request-available
> Fix For: 0.8.0, 0.9.0
>
>
> HIVE_AUTO_CREATE_DATABASE_OPT_KEY = 
> "hoodie.datasource.hive_sync.auto_create_database"
> DEFAULT_HIVE_AUTO_CREATE_DATABASE_OPT_KEY = "true"
> in HoodieSparkSqlWriter.buildSyncConfig 
> hiveSyncConfig.autoCreateDatabase = 
> parameters.get(HIVE_AUTO_CREATE_DATABASE_OPT_KEY).exists(r => r.toBoolean)
>  * This method sets the parameter to false
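
The behavior described above can be illustrated with a small sketch. This is not Hudi code: the class and method names below are hypothetical, and Java's Optional stands in for the Scala Option used in the quoted expression. It assumes the option key is simply absent from the parameters map, in which case an exists-style check yields false, while an explicit fallback to the documented default yields true.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration class; not part of Hudi.
class AutoCreateDatabaseDefault {
  // Key and default value as quoted in the issue description.
  static final String KEY = "hoodie.datasource.hive_sync.auto_create_database";
  static final String DEFAULT_VALUE = "true";

  // Analog of Scala's parameters.get(KEY).exists(r => r.toBoolean):
  // an absent key yields false, so the default of "true" never applies.
  // (Unlike Scala's toBoolean, parseBoolean does not throw on bad input.)
  static boolean existsStyle(Map<String, String> params) {
    return Optional.ofNullable(params.get(KEY))
        .map(Boolean::parseBoolean)
        .orElse(false);
  }

  // Falls back to the documented default when the key is absent.
  static boolean withDefault(Map<String, String> params) {
    return Boolean.parseBoolean(params.getOrDefault(KEY, DEFAULT_VALUE));
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    System.out.println("exists-style, key absent: " + existsStyle(params));
    System.out.println("with default, key absent: " + withDefault(params));

    // An explicit user setting is honored by both variants.
    params.put(KEY, "false");
    System.out.println("exists-style, key=false:  " + existsStyle(params));
    System.out.println("with default, key=false:  " + withDefault(params));
  }
}
```

Whether the key is actually absent when buildSyncConfig runs depends on how the parameters map is populated upstream, which this sketch does not model.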





[jira] [Updated] (HUDI-1979) Optimize logic to improve code readability

2021-06-05 Thread Wei (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei updated HUDI-1979:
--
Status: In Progress  (was: Open)

>  Optimize logic to improve code readability
> ---
>
> Key: HUDI-1979
> URL: https://issues.apache.org/jira/browse/HUDI-1979
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Hive Integration
>Reporter: Wei
>Priority: Minor
>
> In HiveMetastoreBasedLockProvider#acquireLockInternal(), remove the
> unnecessary conditional checks and return statement.





[jira] [Created] (HUDI-1979) Optimize logic to improve code readability

2021-06-05 Thread Wei (Jira)
Wei created HUDI-1979:
-

 Summary:  Optimize logic to improve code readability
 Key: HUDI-1979
 URL: https://issues.apache.org/jira/browse/HUDI-1979
 Project: Apache Hudi
  Issue Type: Improvement
  Components: Hive Integration
Reporter: Wei


In HiveMetastoreBasedLockProvider#acquireLockInternal(), remove the
unnecessary conditional checks and return statement.




