[jira] [Updated] (HUDI-4409) Improve LockManager wait logic when catch exception

2022-07-18 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-4409:
---
Summary: Improve LockManager wait logic when catch exception  (was: 
LockManager improve wait time logic)

> Improve LockManager wait logic when catch exception
> ---
>
> Key: HUDI-4409
> URL: https://issues.apache.org/jira/browse/HUDI-4409
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> public void lock() {
>   if (writeConfig.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl()) {
>     LockProvider lockProvider = getLockProvider();
>     int retryCount = 0;
>     boolean acquired = false;
>     while (retryCount <= maxRetries) {
>       try {
>         acquired = lockProvider.tryLock(writeConfig.getLockAcquireWaitTimeoutInMs(), TimeUnit.MILLISECONDS);
>         if (acquired) {
>           break;
>         }
>         LOG.info("Retrying to acquire lock...");
>         Thread.sleep(maxWaitTimeInMs);
>       } catch (HoodieLockException | InterruptedException e) {
>         if (retryCount >= maxRetries) {
>           throw new HoodieLockException("Unable to acquire lock, lock object ", e);
>         }
>       } finally {
>         retryCount++;
>       }
>     }
>     if (!acquired) {
>       throw new HoodieLockException("Unable to acquire lock, lock object " + lockProvider.getLock());
>     }
>   }
> } {code}
> We should move the sleep into the catch block as well: today an attempt that throws retries immediately, without waiting.
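A minimal, self-contained sketch of the proposed change, with Hudi's types replaced by hypothetical stand-ins (LockException, Provider, and the lock method below are illustrations, not Hudi's real API). The point is that the wait now also happens on the exception path, so a throwing attempt does not retry immediately:

```java
import java.util.concurrent.TimeUnit;

public class LockRetrySketch {
    /** Stand-in for HoodieLockException (hypothetical, for illustration). */
    static class LockException extends RuntimeException {
        LockException(String msg, Throwable cause) { super(msg, cause); }
    }

    /** Stand-in for Hudi's LockProvider. */
    interface Provider {
        boolean tryLock(long time, TimeUnit unit) throws Exception;
    }

    /**
     * Retry loop where a failed attempt always waits before the next try,
     * whether it timed out or threw an exception.
     */
    public static boolean lock(Provider provider, int maxRetries, long waitMs) {
        int retryCount = 0;
        boolean acquired = false;
        while (retryCount <= maxRetries) {
            try {
                acquired = provider.tryLock(waitMs, TimeUnit.MILLISECONDS);
                if (acquired) {
                    break;                 // success: no extra sleep
                }
                sleepQuietly(waitMs);      // timed out: wait, then retry
            } catch (Exception e) {
                if (retryCount >= maxRetries) {
                    throw new LockException("Unable to acquire lock", e);
                }
                sleepQuietly(waitMs);      // the proposed change: wait in catch too
            } finally {
                retryCount++;
            }
        }
        return acquired;
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
}
```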



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HUDI-4409) Improve LockManager wait logic when catch exception

2022-07-18 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-4409:
--

Assignee: liujinhui

> Improve LockManager wait logic when catch exception
> ---
>
> Key: HUDI-4409
> URL: https://issues.apache.org/jira/browse/HUDI-4409
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Closed] (HUDI-4409) Improve LockManager wait logic when catch exception

2022-07-18 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-4409.
--
Resolution: Done

> Improve LockManager wait logic when catch exception
> ---
>
> Key: HUDI-4409
> URL: https://issues.apache.org/jira/browse/HUDI-4409
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>





[jira] [Updated] (HUDI-4409) Improve LockManager wait logic when catch exception

2022-07-18 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-4409:
---
Fix Version/s: 0.12.0

> Improve LockManager wait logic when catch exception
> ---
>
> Key: HUDI-4409
> URL: https://issues.apache.org/jira/browse/HUDI-4409
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>





[jira] [Closed] (HUDI-184) Integrate Hudi with Apache Flink

2022-03-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-184.
-
Resolution: Implemented

This feature has been tracked via 
https://issues.apache.org/jira/browse/HUDI-1521

> Integrate Hudi with Apache Flink
> 
>
> Key: HUDI-184
> URL: https://issues.apache.org/jira/browse/HUDI-184
> Project: Apache Hudi
>  Issue Type: New Feature
>  Components: writer-core
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Apache Flink is a popular stream processing engine.
> Integrating Hudi with Flink is valuable work.
> The discussion mailing thread is here: 
> [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Reopened] (HUDI-184) Integrate Hudi with Apache Flink

2022-03-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reopened HUDI-184:
---

> Integrate Hudi with Apache Flink
> 
>
> Key: HUDI-184
> URL: https://issues.apache.org/jira/browse/HUDI-184
> Project: Apache Hudi
>  Issue Type: New Feature
>  Components: writer-core
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>





[jira] [Closed] (HUDI-609) Implement a Flink specific HoodieIndex

2022-03-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-609.
-
Resolution: Won't Do

> Implement a Flink specific HoodieIndex
> --
>
> Key: HUDI-609
> URL: https://issues.apache.org/jira/browse/HUDI-609
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Indexing is a key step in Hudi's write flow. {{HoodieIndex}} is the abstract 
> superclass of all index implementations. Currently, {{HoodieIndex}} is coupled 
> with Spark in its design. However, HUDI-538 is restructuring hudi-client so 
> that Hudi can be decoupled from Spark. After that, we would get an 
> engine-agnostic implementation of {{HoodieIndex}}, and by extending that 
> class we could implement a Flink-specific index.





[jira] [Closed] (HUDI-608) Implement a flink datastream execution context

2022-03-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-608.
-
Resolution: Won't Do

> Implement a flink datastream execution context
> --
>
> Key: HUDI-608
> URL: https://issues.apache.org/jira/browse/HUDI-608
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Currently {{HoodieWriteClient}} does something like 
> `hoodieRecordRDD.map().sort()` internally. If we want to support a Flink 
> DataStream as the object, then we need to define an abstraction like 
> {{HoodieExecutionContext}} which will have a common set of map(T) -> T, 
> filter(), and repartition() methods. There will be subclasses like 
> {{HoodieFlinkDataStreamExecutionContext}} which implement it in 
> Flink-specific ways and hand back the transformed T object.
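A rough illustration of the abstraction described above. The interface shape follows the issue text; the List-backed implementation is a hypothetical stand-in (real subclasses would wrap a Spark RDD or a Flink DataStream), and none of these names are Hudi's actual classes:

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Sketch of the proposed engine-agnostic execution context. */
interface ExecutionContextSketch<T> {
    ExecutionContextSketch<T> map(Function<T, T> fn);
    ExecutionContextSketch<T> filter(Predicate<T> fn);
    List<T> collect();
}

/**
 * Trivial List-backed implementation, standing in for engine-specific
 * subclasses such as an RDD-backed or DataStream-backed context.
 */
class ListExecutionContext<T> implements ExecutionContextSketch<T> {
    private final List<T> data;

    ListExecutionContext(List<T> data) {
        this.data = data;
    }

    @Override
    public ExecutionContextSketch<T> map(Function<T, T> fn) {
        return new ListExecutionContext<>(
                data.stream().map(fn).collect(Collectors.toList()));
    }

    @Override
    public ExecutionContextSketch<T> filter(Predicate<T> fn) {
        return new ListExecutionContext<>(
                data.stream().filter(fn).collect(Collectors.toList()));
    }

    @Override
    public List<T> collect() {
        return data;
    }
}
```

The write client would then call map/filter against the interface only, and the same pipeline would run on whichever engine the subclass wraps.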





[jira] [Closed] (HUDI-184) Integrate Hudi with Apache Flink

2022-03-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-184.
-
Resolution: Won't Do

> Integrate Hudi with Apache Flink
> 
>
> Key: HUDI-184
> URL: https://issues.apache.org/jira/browse/HUDI-184
> Project: Apache Hudi
>  Issue Type: New Feature
>  Components: writer-core
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>





[jira] [Updated] (HUDI-2418) Support HiveSchemaProvider

2021-12-02 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2418:
---
Summary: Support HiveSchemaProvider   (was: add HiveSchemaProvider )

> Support HiveSchemaProvider 
> ---
>
> Key: HUDI-2418
> URL: https://issues.apache.org/jira/browse/HUDI-2418
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: DeltaStreamer
>Reporter: Jian Feng
>Assignee: Jian Feng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>
> When using DeltaStreamer to migrate an existing Hive table, it is better to 
> have a HiveSchemaProvider instead of an Avro schema file.
>  





[jira] [Created] (HUDI-2699) Remove duplicated zookeeper with tests classifier exists in bundles

2021-11-05 Thread vinoyang (Jira)
vinoyang created HUDI-2699:
--

 Summary: Remove duplicated zookeeper with tests classifier exists 
in bundles
 Key: HUDI-2699
 URL: https://issues.apache.org/jira/browse/HUDI-2699
 Project: Apache Hudi
  Issue Type: Sub-task
Reporter: vinoyang
Assignee: vinoyang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2643) Remove duplicated hbase-common with tests classifier exists in bundles

2021-11-01 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2643.
--
Resolution: Done

13b637ddc3ab9fba51e303cfa0343a496e476d26

> Remove duplicated hbase-common with tests classifier exists in bundles
> --
>
> Key: HUDI-2643
> URL: https://issues.apache.org/jira/browse/HUDI-2643
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>






[jira] [Updated] (HUDI-2643) Remove duplicated hbase-common with tests classifier exists in bundles

2021-11-01 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2643:
---
Fix Version/s: 0.10.0

> Remove duplicated hbase-common with tests classifier exists in bundles
> --
>
> Key: HUDI-2643
> URL: https://issues.apache.org/jira/browse/HUDI-2643
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>






[jira] [Created] (HUDI-2643) Remove duplicated hbase-common with tests classifier exists in bundles

2021-10-28 Thread vinoyang (Jira)
vinoyang created HUDI-2643:
--

 Summary: Remove duplicated hbase-common with tests classifier 
exists in bundles
 Key: HUDI-2643
 URL: https://issues.apache.org/jira/browse/HUDI-2643
 Project: Apache Hudi
  Issue Type: Sub-task
Reporter: vinoyang
Assignee: vinoyang








[jira] [Closed] (HUDI-2614) Remove duplicated hadoop-hdfs with tests classifier exists in bundles

2021-10-26 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2614.
--
Resolution: Done

b1c4acf0aeb0f3d650c8e704828b1c2b0d2b5b40

> Remove duplicated hadoop-hdfs with tests classifier exists in bundles
> -
>
> Key: HUDI-2614
> URL: https://issues.apache.org/jira/browse/HUDI-2614
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>






[jira] [Updated] (HUDI-2614) Remove duplicated hadoop-hdfs with tests classifier exists in bundles

2021-10-26 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2614:
---
Fix Version/s: 0.10.0

> Remove duplicated hadoop-hdfs with tests classifier exists in bundles
> -
>
> Key: HUDI-2614
> URL: https://issues.apache.org/jira/browse/HUDI-2614
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>






[jira] [Created] (HUDI-2614) Remove duplicated hadoop-hdfs with tests classifier exists in bundles

2021-10-24 Thread vinoyang (Jira)
vinoyang created HUDI-2614:
--

 Summary: Remove duplicated hadoop-hdfs with tests classifier 
exists in bundles
 Key: HUDI-2614
 URL: https://issues.apache.org/jira/browse/HUDI-2614
 Project: Apache Hudi
  Issue Type: Sub-task
Reporter: vinoyang
Assignee: vinoyang








[jira] [Updated] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles

2021-10-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2600:
---
Fix Version/s: 0.10.0

> Remove duplicated hadoop-common with tests classifier exists in bundles
> ---
>
> Key: HUDI-2600
> URL: https://issues.apache.org/jira/browse/HUDI-2600
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Release & Administrative
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> We found many duplicated dependencies in the generated dependency list; 
> `hadoop-common` is one of them:
> {code:java}
> hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> {code}
>  
>  





[jira] [Closed] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles

2021-10-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2600.
--
Resolution: Done

220bf6a7e6f5cdf0efbbbee9df6852a8b2288570

> Remove duplicated hadoop-common with tests classifier exists in bundles
> ---
>
> Key: HUDI-2600
> URL: https://issues.apache.org/jira/browse/HUDI-2600
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Release & Administrative
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>





[jira] [Closed] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type

2021-10-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2592.
--
Resolution: Fixed

> NumberFormatException: Zero length BigInteger when write.precombine.field is 
> decimal type
> -
>
> Key: HUDI-2592
> URL: https://issues.apache.org/jira/browse/HUDI-2592
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0, 0.11.0
>
>
> When write.precombine.field is a decimal type, the written decimal will be an 
> empty byte array, and reading it will throw NumberFormatException: Zero length 
> BigInteger, as below:
> {code:java}
> 2021-10-20 17:14:03
> java.lang.NumberFormatException: Zero length BigInteger
>     at java.math.BigInteger.<init>(BigInteger.java:302)
>     at org.apache.flink.table.data.DecimalData.fromUnscaledBytes(DecimalData.java:223)
>     at org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createDecimalConverter$4dc14f00$1(AvroToRowDataConverters.java:158)
>     at org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createNullableConverter$4568343a$1(AvroToRowDataConverters.java:94)
>     at org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createRowConverter$68595fbd$1(AvroToRowDataConverters.java:75)
>     at org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$1.hasNext(MergeOnReadInputFormat.java:300)
>     at org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$LogFileOnlyIterator.reachedEnd(MergeOnReadInputFormat.java:362)
>     at org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat.reachedEnd(MergeOnReadInputFormat.java:202)
>     at org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:90)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
>     at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:213)
> {code}
> Analysis:
> HoodieAvroUtils.getNestedFieldVal is invoked to extract the precombine field, 
> which then invokes convertValueForAvroLogicalTypes. When the field is a 
> decimal type, the ByteBuffer is consumed by the conversion, so we should 
> rewind it.
> {code:java}
> private static Object convertValueForAvroLogicalTypes(Schema fieldSchema, Object fieldValue) {
>   if (fieldSchema.getLogicalType() == LogicalTypes.date()) {
>     return LocalDate.ofEpochDay(Long.parseLong(fieldValue.toString()));
>   } else if (fieldSchema.getLogicalType() instanceof LogicalTypes.Decimal) {
>     Decimal dc = (Decimal) fieldSchema.getLogicalType();
>     DecimalConversion decimalConversion = new DecimalConversion();
>     if (fieldSchema.getType() == Schema.Type.FIXED) {
>       return decimalConversion.fromFixed((GenericFixed) fieldValue, fieldSchema,
>           LogicalTypes.decimal(dc.getPrecision(), dc.getScale()));
>     } else if (fieldSchema.getType() == Schema.Type.BYTES) {
>       // this method will consume the ByteBuffer
>       return decimalConversion.fromBytes((ByteBuffer) fieldValue, fieldSchema,
>           LogicalTypes.decimal(dc.getPrecision(), dc.getScale()));
>     }
>   }
>   return fieldValue;
> } {code}
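The failure mode can be shown with plain java.nio, independent of Hudi or Avro. This is a self-contained sketch, not Hudi's actual fix: fromBytes reads the buffer with relative gets and advances its position, so a later reader sees zero remaining bytes and new BigInteger(new byte[0]) throws. Reading through duplicate() (or rewinding afterwards) leaves the caller's buffer intact:

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalBufferSketch {
    /**
     * Reads the unscaled decimal bytes without consuming the caller's buffer.
     * duplicate() shares the content but has its own position/limit, so the
     * relative get() below advances only the view, not the original buffer.
     */
    public static BigInteger readUnscaled(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new BigInteger(bytes);
    }
}
```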





[jira] [Reopened] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type

2021-10-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reopened HUDI-2592:


> NumberFormatException: Zero length BigInteger when write.precombine.field is 
> decimal type
> -
>
> Key: HUDI-2592
> URL: https://issues.apache.org/jira/browse/HUDI-2592
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0, 0.11.0
>
>





[jira] [Commented] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type

2021-10-22 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432925#comment-17432925
 ] 

vinoyang commented on HUDI-2592:


[~Matrix42] I have given you Jira contributor permission. Thanks for your 
contribution!

> NumberFormatException: Zero length BigInteger when write.precombine.field is 
> decimal type
> -
>
> Key: HUDI-2592
> URL: https://issues.apache.org/jira/browse/HUDI-2592
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0, 0.11.0
>
>





[jira] [Assigned] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type

2021-10-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2592:
--

Assignee: Matrix42

> NumberFormatException: Zero length BigInteger when write.precombine.field is 
> decimal type
> -
>
> Key: HUDI-2592
> URL: https://issues.apache.org/jira/browse/HUDI-2592
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Matrix42
>Assignee: Matrix42
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0, 0.11.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type

2021-10-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2592:
---
Status: Closed  (was: Patch Available)

> NumberFormatException: Zero length BigInteger when write.precombine.field is 
> decimal type
> -
>
> Key: HUDI-2592
> URL: https://issues.apache.org/jira/browse/HUDI-2592
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Matrix42
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0, 0.11.0
>
>
> When write.precombine.field is a decimal type, the decimal is written as an 
> empty byte array; reading it back then throws NumberFormatException: Zero 
> length BigInteger, as below:
> {code:java}
> 2021-10-20 17:14:03
> java.lang.NumberFormatException: Zero length BigInteger
> at java.math.BigInteger.<init>(BigInteger.java:302)
> at 
> org.apache.flink.table.data.DecimalData.fromUnscaledBytes(DecimalData.java:223)
> at 
> org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createDecimalConverter$4dc14f00$1(AvroToRowDataConverters.java:158)
> at 
> org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createNullableConverter$4568343a$1(AvroToRowDataConverters.java:94)
> at 
> org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createRowConverter$68595fbd$1(AvroToRowDataConverters.java:75)
> at 
> org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$1.hasNext(MergeOnReadInputFormat.java:300)
> at 
> org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$LogFileOnlyIterator.reachedEnd(MergeOnReadInputFormat.java:362)
> at 
> org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat.reachedEnd(MergeOnReadInputFormat.java:202)
> at 
> org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:90)
> at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
> at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
> at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:213)
> {code}
> Analysis:
>  
> HoodieAvroUtils.getNestedFieldVal will be invoked to extract the precombine 
> field, which in turn invokes convertValueForAvroLogicalTypes. When the field 
> is a decimal type, the ByteBuffer will be consumed by the conversion, so we 
> should rewind it.
> {code:java}
> private static Object convertValueForAvroLogicalTypes(Schema fieldSchema, 
> Object fieldValue) {
>   if (fieldSchema.getLogicalType() == LogicalTypes.date()) {
> return LocalDate.ofEpochDay(Long.parseLong(fieldValue.toString()));
>   } else if (fieldSchema.getLogicalType() instanceof LogicalTypes.Decimal) {
> Decimal dc = (Decimal) fieldSchema.getLogicalType();
> DecimalConversion decimalConversion = new DecimalConversion();
> if (fieldSchema.getType() == Schema.Type.FIXED) {
>   return decimalConversion.fromFixed((GenericFixed) fieldValue, 
> fieldSchema,
>   LogicalTypes.decimal(dc.getPrecision(), dc.getScale()));
> } else if (fieldSchema.getType() == Schema.Type.BYTES) {
> // this method will consume the ByteBuffer
>   return decimalConversion.fromBytes((ByteBuffer) fieldValue, fieldSchema,
>   LogicalTypes.decimal(dc.getPrecision(), dc.getScale()));
> }
>   }
>   return fieldValue;
> }
> {code}
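The failure mode and the proposed rewind can be reproduced outside Hudi. The sketch below (illustrative names, not Hudi code) mimics what Avro's DecimalConversion.fromBytes does: it drains the buffer's remaining bytes, so a second read sees an empty array and BigInteger throws exactly the "Zero length BigInteger" error from the stack trace above, while rewinding first restores the bytes:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalBufferDemo {

  // Mimics DecimalConversion.fromBytes: consumes the buffer's remaining bytes.
  static BigDecimal fromBytes(ByteBuffer value, int scale) {
    byte[] bytes = new byte[value.remaining()];
    value.get(bytes); // relative get advances position to the limit
    return new BigDecimal(new BigInteger(bytes), scale);
  }

  // Remaining bytes after one conversion, with or without the proposed rewind.
  static int remainingAfterConvert(boolean rewind) {
    ByteBuffer buf = ByteBuffer.wrap(
        BigDecimal.valueOf(12345, 2).unscaledValue().toByteArray());
    fromBytes(buf, 2); // first conversion drains the buffer
    if (rewind) {
      buf.rewind();    // proposed fix: restore position 0 before the next read
    }
    return buf.remaining();
  }

  // Error message of a second read on an already-consumed buffer.
  static String secondReadError() {
    ByteBuffer buf = ByteBuffer.wrap(
        BigDecimal.valueOf(12345, 2).unscaledValue().toByteArray());
    fromBytes(buf, 2);
    try {
      fromBytes(buf, 2); // buffer drained -> zero-length byte array
      return "no error";
    } catch (NumberFormatException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println("without rewind: " + remainingAfterConvert(false)); // prints 0
    System.out.println("with rewind: " + remainingAfterConvert(true));     // prints 2
    System.out.println("second read: " + secondReadError());
  }
}
```

An alternative to rewinding the shared buffer is to read from `((ByteBuffer) fieldValue).duplicate()`, which leaves the original position untouched.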



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles

2021-10-22 Thread vinoyang (Jira)
vinoyang created HUDI-2600:
--

 Summary: Remove duplicated hadoop-common with tests classifier 
exists in bundles
 Key: HUDI-2600
 URL: https://issues.apache.org/jira/browse/HUDI-2600
 Project: Apache Hudi
  Issue Type: Sub-task
  Components: Release & Administrative
Reporter: vinoyang


We found many duplicated dependencies in the generated dependency list; 
`hadoop-common` is one of them:
{code:java}
hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar
hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar
{code}
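Such duplicates can be spotted mechanically. The sketch below is a hypothetical helper (not part of Hudi's tooling) that groups dependency-list lines of the form `artifact/group/version/classifier/jar` by artifact and version, and reports keys that appear more than once, such as the plain and `tests` hadoop-common jars above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateArtifactFinder {

  // Returns keys ("artifact/version") that occur more than once in the list.
  static List<String> findDuplicates(List<String> lines) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String line : lines) {
      String[] parts = line.split("/"); // artifact/group/version/classifier/jar
      String key = parts[0] + "/" + parts[2];
      counts.merge(key, 1, Integer::sum);
    }
    List<String> dups = new ArrayList<>();
    counts.forEach((k, v) -> {
      if (v > 1) {
        dups.add(k);
      }
    });
    return dups;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList(
        "hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar",
        "hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar",
        "avro/org.apache.avro/1.8.2//avro-1.8.2.jar");
    System.out.println(findDuplicates(lines)); // prints [hadoop-common/2.7.3]
  }
}
```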
 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles

2021-10-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2600:
--

Assignee: vinoyang

> Remove duplicated hadoop-common with tests classifier exists in bundles
> ---
>
> Key: HUDI-2600
> URL: https://issues.apache.org/jira/browse/HUDI-2600
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Release & Administrative
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> We found many duplicated dependencies in the generated dependency list; 
> `hadoop-common` is one of them:
> {code:java}
> hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2507) Generate more dependency list file for other bundles

2021-10-21 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2507.
--
Resolution: Done

b480294e792b6344d37560587f8f6e170e210d14

> Generate more dependency list file for other bundles
> 
>
> Key: HUDI-2507
> URL: https://issues.apache.org/jira/browse/HUDI-2507
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Usability
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2507) Generate more dependency list file for other bundles

2021-10-21 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2507:
---
Fix Version/s: 0.10.0

> Generate more dependency list file for other bundles
> 
>
> Key: HUDI-2507
> URL: https://issues.apache.org/jira/browse/HUDI-2507
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Usability
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-2508) Build GA for the dependency diff check workflow

2021-09-30 Thread vinoyang (Jira)
vinoyang created HUDI-2508:
--

 Summary: Build GA for the dependency diff check workflow
 Key: HUDI-2508
 URL: https://issues.apache.org/jira/browse/HUDI-2508
 Project: Apache Hudi
  Issue Type: Sub-task
  Components: Usability
Reporter: vinoyang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HUDI-2508) Build GA for the dependency diff check workflow

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2508:
--

Assignee: vinoyang

> Build GA for the dependency diff check workflow
> --
>
> Key: HUDI-2508
> URL: https://issues.apache.org/jira/browse/HUDI-2508
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Usability
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HUDI-2507) Generate more dependency list file for other bundles

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2507:
--

Assignee: vinoyang

> Generate more dependency list file for other bundles
> 
>
> Key: HUDI-2507
> URL: https://issues.apache.org/jira/browse/HUDI-2507
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Usability
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-2507) Generate more dependency list file for other bundles

2021-09-30 Thread vinoyang (Jira)
vinoyang created HUDI-2507:
--

 Summary: Generate more dependency list file for other bundles
 Key: HUDI-2507
 URL: https://issues.apache.org/jira/browse/HUDI-2507
 Project: Apache Hudi
  Issue Type: Sub-task
  Components: Usability
Reporter: vinoyang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HUDI-2506) Hudi dependency governance

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2506:
--

Assignee: vinoyang

> Hudi dependency governance
> --
>
> Key: HUDI-2506
> URL: https://issues.apache.org/jira/browse/HUDI-2506
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Usability
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2440) Add dependency change diff script for dependency governance

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2440.
--
Resolution: Done

> Add dependency change diff script for dependency governance
> --
>
> Key: HUDI-2440
> URL: https://issues.apache.org/jira/browse/HUDI-2440
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Usability, Utilities
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Currently, hudi's dependency management is chaotic, e.g. for 
> `hudi-spark-bundle_2.11`, the dependency list is here:
> {code:java}
> HikariCP/2.5.1//HikariCP-2.5.1.jar
> ST4/4.0.4//ST4-4.0.4.jar
> aircompressor/0.15//aircompressor-0.15.jar
> annotations/17.0.0//annotations-17.0.0.jar
> ant-launcher/1.9.1//ant-launcher-1.9.1.jar
> ant/1.6.5//ant-1.6.5.jar
> ant/1.9.1//ant-1.9.1.jar
> antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
> aopalliance/1.0//aopalliance-1.0.jar
> apache-curator/2.7.1//apache-curator-2.7.1.pom
> apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar
> apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar
> api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar
> api-util/1.0.0-M20//api-util-1.0.0-M20.jar
> asm/3.1//asm-3.1.jar
> avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar
> avatica/1.8.0//avatica-1.8.0.jar
> avro/1.8.2//avro-1.8.2.jar
> bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar
> calcite-core/1.10.0//calcite-core-1.10.0.jar
> calcite-druid/1.10.0//calcite-druid-1.10.0.jar
> calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar
> commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar
> commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar
> commons-cli/1.2//commons-cli-1.2.jar
> commons-codec/1.4//commons-codec-1.4.jar
> commons-collections/3.2.2//commons-collections-3.2.2.jar
> commons-compiler/2.7.6//commons-compiler-2.7.6.jar
> commons-compress/1.9//commons-compress-1.9.jar
> commons-configuration/1.6//commons-configuration-1.6.jar
> commons-daemon/1.0.13//commons-daemon-1.0.13.jar
> commons-dbcp/1.4//commons-dbcp-1.4.jar
> commons-digester/1.8//commons-digester-1.8.jar
> commons-el/1.0//commons-el-1.0.jar
> commons-httpclient/3.1//commons-httpclient-3.1.jar
> commons-io/2.4//commons-io-2.4.jar
> commons-lang/2.6//commons-lang-2.6.jar
> commons-lang3/3.1//commons-lang3-3.1.jar
> commons-logging/1.2//commons-logging-1.2.jar
> commons-math/2.2//commons-math-2.2.jar
> commons-math3/3.1.1//commons-math3-3.1.1.jar
> commons-net/3.1//commons-net-3.1.jar
> commons-pool/1.5.4//commons-pool-1.5.4.jar
> curator-client/2.7.1//curator-client-2.7.1.jar
> curator-framework/2.7.1//curator-framework-2.7.1.jar
> curator-recipes/2.7.1//curator-recipes-2.7.1.jar
> datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
> datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
> datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
> derby/10.10.2.0//derby-10.10.2.0.jar
> disruptor/3.3.0//disruptor-3.3.0.jar
> dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
> eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar
> fastutil/7.0.13//fastutil-7.0.13.jar
> findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar
> fluent-hc/4.4.1//fluent-hc-4.4.1.jar
> groovy-all/2.4.4//groovy-all-2.4.4.jar
> gson/2.3.1//gson-2.3.1.jar
> guava/14.0.1//guava-14.0.1.jar
> guice-assistedinject/3.0//guice-assistedinject-3.0.jar
> guice-servlet/3.0//guice-servlet-3.0.jar
> guice/3.0//guice-3.0.jar
> hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar
> hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar
> hadoop-client/2.7.3//hadoop-client-2.7.3.jar
> hadoop-common/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar
> hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar
> hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar
> hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar
> hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar
> hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar
> hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar
> hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar
> hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar
> hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar
> hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar
> hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar
> hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar
> hadoop-yarn-server-resourcemanager/2.7.2//hadoop-yarn-server-resourcemanager-2.7.2.jar
> 
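The ticket above asks for a dependency change diff script. As a rough, hypothetical sketch (the actual script lives in the Hudi repository and may work differently), comparing a bundle's old and new dependency lists reduces to set differences, reporting what was added and what was removed so reviewers can sign off on dependency changes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DependencyDiff {

  // Returns the entries added to and removed from a bundle's dependency list.
  static Map<String, List<String>> diff(Collection<String> oldList, Collection<String> newList) {
    Set<String> added = new TreeSet<>(newList);
    added.removeAll(oldList);   // present now, absent before
    Set<String> removed = new TreeSet<>(oldList);
    removed.removeAll(newList); // present before, absent now
    Map<String, List<String>> result = new LinkedHashMap<>();
    result.put("added", new ArrayList<>(added));
    result.put("removed", new ArrayList<>(removed));
    return result;
  }

  public static void main(String[] args) {
    List<String> before = Arrays.asList(
        "avro/1.8.2//avro-1.8.2.jar",
        "guava/14.0.1//guava-14.0.1.jar");
    List<String> after = Arrays.asList(
        "avro/1.8.2//avro-1.8.2.jar",
        "gson/2.3.1//gson-2.3.1.jar");
    System.out.println(diff(before, after)); // prints the added and removed entries
  }
}
```

A CI check (the GA workflow of HUDI-2508) could then fail the build whenever the diff is non-empty and the committed dependency list was not updated.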

[jira] [Created] (HUDI-2506) Hudi dependency governance

2021-09-30 Thread vinoyang (Jira)
vinoyang created HUDI-2506:
--

 Summary: Hudi dependency governance
 Key: HUDI-2506
 URL: https://issues.apache.org/jira/browse/HUDI-2506
 Project: Apache Hudi
  Issue Type: Task
  Components: Usability
Reporter: vinoyang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HUDI-2440) Add dependency change diff script for dependency governance

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reopened HUDI-2440:


> Add dependency change diff script for dependency governance
> --
>
> Key: HUDI-2440
> URL: https://issues.apache.org/jira/browse/HUDI-2440
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Usability, Utilities
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Currently, hudi's dependency management is chaotic, e.g. for 
> `hudi-spark-bundle_2.11`, the dependency list is here:
> {code:java}
> HikariCP/2.5.1//HikariCP-2.5.1.jar
> ST4/4.0.4//ST4-4.0.4.jar
> aircompressor/0.15//aircompressor-0.15.jar
> annotations/17.0.0//annotations-17.0.0.jar
> ant-launcher/1.9.1//ant-launcher-1.9.1.jar
> ant/1.6.5//ant-1.6.5.jar
> ant/1.9.1//ant-1.9.1.jar
> antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
> aopalliance/1.0//aopalliance-1.0.jar
> apache-curator/2.7.1//apache-curator-2.7.1.pom
> apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar
> apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar
> api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar
> api-util/1.0.0-M20//api-util-1.0.0-M20.jar
> asm/3.1//asm-3.1.jar
> avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar
> avatica/1.8.0//avatica-1.8.0.jar
> avro/1.8.2//avro-1.8.2.jar
> bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar
> calcite-core/1.10.0//calcite-core-1.10.0.jar
> calcite-druid/1.10.0//calcite-druid-1.10.0.jar
> calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar
> commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar
> commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar
> commons-cli/1.2//commons-cli-1.2.jar
> commons-codec/1.4//commons-codec-1.4.jar
> commons-collections/3.2.2//commons-collections-3.2.2.jar
> commons-compiler/2.7.6//commons-compiler-2.7.6.jar
> commons-compress/1.9//commons-compress-1.9.jar
> commons-configuration/1.6//commons-configuration-1.6.jar
> commons-daemon/1.0.13//commons-daemon-1.0.13.jar
> commons-dbcp/1.4//commons-dbcp-1.4.jar
> commons-digester/1.8//commons-digester-1.8.jar
> commons-el/1.0//commons-el-1.0.jar
> commons-httpclient/3.1//commons-httpclient-3.1.jar
> commons-io/2.4//commons-io-2.4.jar
> commons-lang/2.6//commons-lang-2.6.jar
> commons-lang3/3.1//commons-lang3-3.1.jar
> commons-logging/1.2//commons-logging-1.2.jar
> commons-math/2.2//commons-math-2.2.jar
> commons-math3/3.1.1//commons-math3-3.1.1.jar
> commons-net/3.1//commons-net-3.1.jar
> commons-pool/1.5.4//commons-pool-1.5.4.jar
> curator-client/2.7.1//curator-client-2.7.1.jar
> curator-framework/2.7.1//curator-framework-2.7.1.jar
> curator-recipes/2.7.1//curator-recipes-2.7.1.jar
> datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
> datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
> datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
> derby/10.10.2.0//derby-10.10.2.0.jar
> disruptor/3.3.0//disruptor-3.3.0.jar
> dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
> eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar
> fastutil/7.0.13//fastutil-7.0.13.jar
> findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar
> fluent-hc/4.4.1//fluent-hc-4.4.1.jar
> groovy-all/2.4.4//groovy-all-2.4.4.jar
> gson/2.3.1//gson-2.3.1.jar
> guava/14.0.1//guava-14.0.1.jar
> guice-assistedinject/3.0//guice-assistedinject-3.0.jar
> guice-servlet/3.0//guice-servlet-3.0.jar
> guice/3.0//guice-3.0.jar
> hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar
> hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar
> hadoop-client/2.7.3//hadoop-client-2.7.3.jar
> hadoop-common/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar
> hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar
> hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar
> hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar
> hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar
> hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar
> hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar
> hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar
> hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar
> hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar
> hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar
> hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar
> hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar
> hadoop-yarn-server-resourcemanager/2.7.2//hadoop-yarn-server-resourcemanager-2.7.2.jar
> 

[jira] [Updated] (HUDI-2440) Add dependency change diff script for dependency governance

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2440:
---
Parent: HUDI-2506
Issue Type: Sub-task  (was: Improvement)

> Add dependency change diff script for dependency governance
> --
>
> Key: HUDI-2440
> URL: https://issues.apache.org/jira/browse/HUDI-2440
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Usability, Utilities
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Currently, hudi's dependency management is chaotic, e.g. for 
> `hudi-spark-bundle_2.11`, the dependency list is here:
> {code:java}
> HikariCP/2.5.1//HikariCP-2.5.1.jar
> ST4/4.0.4//ST4-4.0.4.jar
> aircompressor/0.15//aircompressor-0.15.jar
> annotations/17.0.0//annotations-17.0.0.jar
> ant-launcher/1.9.1//ant-launcher-1.9.1.jar
> ant/1.6.5//ant-1.6.5.jar
> ant/1.9.1//ant-1.9.1.jar
> antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
> aopalliance/1.0//aopalliance-1.0.jar
> apache-curator/2.7.1//apache-curator-2.7.1.pom
> apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar
> apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar
> api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar
> api-util/1.0.0-M20//api-util-1.0.0-M20.jar
> asm/3.1//asm-3.1.jar
> avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar
> avatica/1.8.0//avatica-1.8.0.jar
> avro/1.8.2//avro-1.8.2.jar
> bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar
> calcite-core/1.10.0//calcite-core-1.10.0.jar
> calcite-druid/1.10.0//calcite-druid-1.10.0.jar
> calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar
> commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar
> commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar
> commons-cli/1.2//commons-cli-1.2.jar
> commons-codec/1.4//commons-codec-1.4.jar
> commons-collections/3.2.2//commons-collections-3.2.2.jar
> commons-compiler/2.7.6//commons-compiler-2.7.6.jar
> commons-compress/1.9//commons-compress-1.9.jar
> commons-configuration/1.6//commons-configuration-1.6.jar
> commons-daemon/1.0.13//commons-daemon-1.0.13.jar
> commons-dbcp/1.4//commons-dbcp-1.4.jar
> commons-digester/1.8//commons-digester-1.8.jar
> commons-el/1.0//commons-el-1.0.jar
> commons-httpclient/3.1//commons-httpclient-3.1.jar
> commons-io/2.4//commons-io-2.4.jar
> commons-lang/2.6//commons-lang-2.6.jar
> commons-lang3/3.1//commons-lang3-3.1.jar
> commons-logging/1.2//commons-logging-1.2.jar
> commons-math/2.2//commons-math-2.2.jar
> commons-math3/3.1.1//commons-math3-3.1.1.jar
> commons-net/3.1//commons-net-3.1.jar
> commons-pool/1.5.4//commons-pool-1.5.4.jar
> curator-client/2.7.1//curator-client-2.7.1.jar
> curator-framework/2.7.1//curator-framework-2.7.1.jar
> curator-recipes/2.7.1//curator-recipes-2.7.1.jar
> datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
> datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
> datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
> derby/10.10.2.0//derby-10.10.2.0.jar
> disruptor/3.3.0//disruptor-3.3.0.jar
> dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
> eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar
> fastutil/7.0.13//fastutil-7.0.13.jar
> findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar
> fluent-hc/4.4.1//fluent-hc-4.4.1.jar
> groovy-all/2.4.4//groovy-all-2.4.4.jar
> gson/2.3.1//gson-2.3.1.jar
> guava/14.0.1//guava-14.0.1.jar
> guice-assistedinject/3.0//guice-assistedinject-3.0.jar
> guice-servlet/3.0//guice-servlet-3.0.jar
> guice/3.0//guice-3.0.jar
> hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar
> hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar
> hadoop-client/2.7.3//hadoop-client-2.7.3.jar
> hadoop-common/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar
> hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar
> hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar
> hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar
> hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar
> hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar
> hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar
> hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar
> hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar
> hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar
> hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar
> hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar
> hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar
> 

[jira] [Closed] (HUDI-2440) Add dependency change diff script for dependency governance

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2440.
--
Resolution: Implemented

47ed91799943271f219419cf209793a98b3f09b5

> Add dependency change diff script for dependency governance
> --
>
> Key: HUDI-2440
> URL: https://issues.apache.org/jira/browse/HUDI-2440
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Usability, Utilities
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Currently, hudi's dependency management is chaotic, e.g. for 
> `hudi-spark-bundle_2.11`, the dependency list is here:
> {code:java}
> HikariCP/2.5.1//HikariCP-2.5.1.jar
> ST4/4.0.4//ST4-4.0.4.jar
> aircompressor/0.15//aircompressor-0.15.jar
> annotations/17.0.0//annotations-17.0.0.jar
> ant-launcher/1.9.1//ant-launcher-1.9.1.jar
> ant/1.6.5//ant-1.6.5.jar
> ant/1.9.1//ant-1.9.1.jar
> antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
> aopalliance/1.0//aopalliance-1.0.jar
> apache-curator/2.7.1//apache-curator-2.7.1.pom
> apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar
> apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar
> api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar
> api-util/1.0.0-M20//api-util-1.0.0-M20.jar
> asm/3.1//asm-3.1.jar
> avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar
> avatica/1.8.0//avatica-1.8.0.jar
> avro/1.8.2//avro-1.8.2.jar
> bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar
> calcite-core/1.10.0//calcite-core-1.10.0.jar
> calcite-druid/1.10.0//calcite-druid-1.10.0.jar
> calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar
> commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar
> commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar
> commons-cli/1.2//commons-cli-1.2.jar
> commons-codec/1.4//commons-codec-1.4.jar
> commons-collections/3.2.2//commons-collections-3.2.2.jar
> commons-compiler/2.7.6//commons-compiler-2.7.6.jar
> commons-compress/1.9//commons-compress-1.9.jar
> commons-configuration/1.6//commons-configuration-1.6.jar
> commons-daemon/1.0.13//commons-daemon-1.0.13.jar
> commons-dbcp/1.4//commons-dbcp-1.4.jar
> commons-digester/1.8//commons-digester-1.8.jar
> commons-el/1.0//commons-el-1.0.jar
> commons-httpclient/3.1//commons-httpclient-3.1.jar
> commons-io/2.4//commons-io-2.4.jar
> commons-lang/2.6//commons-lang-2.6.jar
> commons-lang3/3.1//commons-lang3-3.1.jar
> commons-logging/1.2//commons-logging-1.2.jar
> commons-math/2.2//commons-math-2.2.jar
> commons-math3/3.1.1//commons-math3-3.1.1.jar
> commons-net/3.1//commons-net-3.1.jar
> commons-pool/1.5.4//commons-pool-1.5.4.jar
> curator-client/2.7.1//curator-client-2.7.1.jar
> curator-framework/2.7.1//curator-framework-2.7.1.jar
> curator-recipes/2.7.1//curator-recipes-2.7.1.jar
> datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
> datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
> datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
> derby/10.10.2.0//derby-10.10.2.0.jar
> disruptor/3.3.0//disruptor-3.3.0.jar
> dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
> eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar
> fastutil/7.0.13//fastutil-7.0.13.jar
> findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar
> fluent-hc/4.4.1//fluent-hc-4.4.1.jar
> groovy-all/2.4.4//groovy-all-2.4.4.jar
> gson/2.3.1//gson-2.3.1.jar
> guava/14.0.1//guava-14.0.1.jar
> guice-assistedinject/3.0//guice-assistedinject-3.0.jar
> guice-servlet/3.0//guice-servlet-3.0.jar
> guice/3.0//guice-3.0.jar
> hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar
> hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar
> hadoop-client/2.7.3//hadoop-client-2.7.3.jar
> hadoop-common/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar
> hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar
> hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar
> hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar
> hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar
> hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar
> hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar
> hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar
> hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar
> hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar
> hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar
> hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar
> hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar
> 

[jira] [Updated] (HUDI-2440) Add dependency change diff script for dependency governance

2021-09-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2440:
---
Fix Version/s: 0.10.0

> Add dependency change diff script for dependency governance
> --
>
> Key: HUDI-2440
> URL: https://issues.apache.org/jira/browse/HUDI-2440
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Usability, Utilities
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Currently, hudi's dependency management is chaotic, e.g. for 
> `hudi-spark-bundle_2.11`, the dependency list is here:
> {code:java}
> HikariCP/2.5.1//HikariCP-2.5.1.jar
> ST4/4.0.4//ST4-4.0.4.jar
> aircompressor/0.15//aircompressor-0.15.jar
> annotations/17.0.0//annotations-17.0.0.jar
> ant-launcher/1.9.1//ant-launcher-1.9.1.jar
> ant/1.6.5//ant-1.6.5.jar
> ant/1.9.1//ant-1.9.1.jar
> antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
> aopalliance/1.0//aopalliance-1.0.jar
> apache-curator/2.7.1//apache-curator-2.7.1.pom
> apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar
> apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar
> api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar
> api-util/1.0.0-M20//api-util-1.0.0-M20.jar
> asm/3.1//asm-3.1.jar
> avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar
> avatica/1.8.0//avatica-1.8.0.jar
> avro/1.8.2//avro-1.8.2.jar
> bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar
> calcite-core/1.10.0//calcite-core-1.10.0.jar
> calcite-druid/1.10.0//calcite-druid-1.10.0.jar
> calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar
> commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar
> commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar
> commons-cli/1.2//commons-cli-1.2.jar
> commons-codec/1.4//commons-codec-1.4.jar
> commons-collections/3.2.2//commons-collections-3.2.2.jar
> commons-compiler/2.7.6//commons-compiler-2.7.6.jar
> commons-compress/1.9//commons-compress-1.9.jar
> commons-configuration/1.6//commons-configuration-1.6.jar
> commons-daemon/1.0.13//commons-daemon-1.0.13.jar
> commons-dbcp/1.4//commons-dbcp-1.4.jar
> commons-digester/1.8//commons-digester-1.8.jar
> commons-el/1.0//commons-el-1.0.jar
> commons-httpclient/3.1//commons-httpclient-3.1.jar
> commons-io/2.4//commons-io-2.4.jar
> commons-lang/2.6//commons-lang-2.6.jar
> commons-lang3/3.1//commons-lang3-3.1.jar
> commons-logging/1.2//commons-logging-1.2.jar
> commons-math/2.2//commons-math-2.2.jar
> commons-math3/3.1.1//commons-math3-3.1.1.jar
> commons-net/3.1//commons-net-3.1.jar
> commons-pool/1.5.4//commons-pool-1.5.4.jar
> curator-client/2.7.1//curator-client-2.7.1.jar
> curator-framework/2.7.1//curator-framework-2.7.1.jar
> curator-recipes/2.7.1//curator-recipes-2.7.1.jar
> datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
> datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
> datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
> derby/10.10.2.0//derby-10.10.2.0.jar
> disruptor/3.3.0//disruptor-3.3.0.jar
> dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
> eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar
> fastutil/7.0.13//fastutil-7.0.13.jar
> findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar
> fluent-hc/4.4.1//fluent-hc-4.4.1.jar
> groovy-all/2.4.4//groovy-all-2.4.4.jar
> gson/2.3.1//gson-2.3.1.jar
> guava/14.0.1//guava-14.0.1.jar
> guice-assistedinject/3.0//guice-assistedinject-3.0.jar
> guice-servlet/3.0//guice-servlet-3.0.jar
> guice/3.0//guice-3.0.jar
> hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar
> hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar
> hadoop-client/2.7.3//hadoop-client-2.7.3.jar
> hadoop-common/2.7.3//hadoop-common-2.7.3.jar
> hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar
> hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar
> hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar
> hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar
> hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar
> hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar
> hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar
> hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar
> hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar
> hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar
> hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar
> hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar
> hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar
> hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar
> hadoop-yarn-server-resourcemanager/2.7.2//hadoop-yarn-server-resourcemanager-2.7.2.jar
> 

[jira] [Closed] (HUDI-2487) An empty message in Kafka causes a task exception

2021-09-27 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2487.
--
Fix Version/s: (was: 0.9.0)
   0.10.0
   Resolution: Implemented

9067657a5ff313990c819065ad12d71fa8bb0f06

> An empty message in Kafka causes a task exception
> -
>
> Key: HUDI-2487
> URL: https://issues.apache.org/jira/browse/HUDI-2487
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: DeltaStreamer
>Reporter: qianchutao
>Assignee: qianchutao
>Priority: Major
>  Labels: easyfix, newbie, pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> h1. Question:
>       When I use DeltaStreamer in upsert mode to ingest JSON data from 
> Kafka into Hudi and sync it to Hive tables, the task throws an exception 
> when the value of a Kafka message body is null.
> h2. Exception description:
> Lost task 0.1 in stage 2.0 (TID 24, 
> node-group-1UtpO.1f562475-6982-4b16-a50d-d19b0ebff950.com, executor 6): 
> org.apache.hudi.exception.HoodieException: The value of tmSmp can not be null
>  at 
> org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:463)
>  at 
> org.apache.hudi.utilities.deltastreamer.DeltaSync.lambda$readFromSource$d62e16$1(DeltaSync.java:389)
>  at 
> org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
>  at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
>  at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
>  at 
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:196)
>  at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
>  at 
> org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:58)
>  at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>  at org.apache.spark.scheduler.Task.run(Task.scala:123)
>  at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:413)
>  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1551)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:419)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> h1. The task Settings:
>  
> {code:java}
> hoodie.datasource.write.precombine.field=tmSmp
> hoodie.datasource.write.recordkey.field=subOrderId,activityId,ticketId
> hoodie.datasource.hive_sync.partition_fields=db,dt
> hoodie.datasource.write.partitionpath.field=db:SIMPLE,dt:SIMPLE
> hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator
> hoodie.datasource.hive_sync.enable=true
> hoodie.datasource.meta.sync.enable=true
> hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeysValueExtractor
> hoodie.datasource.hive_sync.support_timestamp=true
> hoodie.datasource.hive_sync.auto_create_database=true
> hoodie.meta.sync.client.tool.class=org.apache.hudi.hive.HiveSyncTool
> hoodie.datasource.hive_sync.base_file_format=PARQUET
> {code}
>  
>  
> h1. Spark-submit Script parameter Settings:
>  
> {code:java}
> --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
> --source-ordering-field tmSmp \
> --table-type MERGE_ON_READ  \
> --target-table ${TABLE_NAME} \
> --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
> --schemaprovider-class 
> org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
> --enable-sync \
> --op UPSERT \
> --continuous \
> {code}
>  
>  
>        So I think this can be optimized to prevent the task from failing, 
> for example by filtering out messages with a null value in Kafka.
>  
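The filtering idea proposed above can be sketched with plain Java; the class and method names below are illustrative assumptions, not Hudi's actual fix, and message values are modeled as raw JSON strings for simplicity:

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch: drop Kafka records whose value is null (for example
// tombstone or empty messages) before they reach the writer, instead of
// letting a missing precombine field like tmSmp fail the whole task.
public class NullValueFilter {

  public static List<String> dropNullValues(List<String> kafkaValues) {
    return kafkaValues.stream()
        .filter(Objects::nonNull) // a null body would otherwise throw downstream
        .collect(Collectors.toList());
  }
}
```

With such a guard, null-valued messages are silently skipped rather than aborting the Spark stage.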



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2447) Extract common parts from 'if' & Fix typo

2021-09-17 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2447.
--
Resolution: Done

> Extract common parts from 'if' & Fix typo
> -
>
> Key: HUDI-2447
> URL: https://issues.apache.org/jira/browse/HUDI-2447
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Hive Integration
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Extract common parts from 'if' & Fix typo



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2447) Extract common parts from 'if' & Fix typo

2021-09-17 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2447:
---
Priority: Minor  (was: Major)

> Extract common parts from 'if' & Fix typo
> -
>
> Key: HUDI-2447
> URL: https://issues.apache.org/jira/browse/HUDI-2447
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Hive Integration
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> Extract common parts from 'if' & Fix typo



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2434) Add GraphiteReporter reporter periodSeconds config

2021-09-17 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2434.
--
Resolution: Done

> Add GraphiteReporter reporter periodSeconds config
> --
>
> Key: HUDI-2434
> URL: https://issues.apache.org/jira/browse/HUDI-2434
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-2440) Add dependency change diff script for dependency governance

2021-09-16 Thread vinoyang (Jira)
vinoyang created HUDI-2440:
--

 Summary: Add dependency change diff script for dependency governance
 Key: HUDI-2440
 URL: https://issues.apache.org/jira/browse/HUDI-2440
 Project: Apache Hudi
  Issue Type: Improvement
  Components: Utilities
Reporter: vinoyang
Assignee: vinoyang


Currently, Hudi's dependency management is chaotic, e.g. for 
`hudi-spark-bundle_2.11`, the dependency list is here:
{code:java}
HikariCP/2.5.1//HikariCP-2.5.1.jar
ST4/4.0.4//ST4-4.0.4.jar
aircompressor/0.15//aircompressor-0.15.jar
annotations/17.0.0//annotations-17.0.0.jar
ant-launcher/1.9.1//ant-launcher-1.9.1.jar
ant/1.6.5//ant-1.6.5.jar
ant/1.9.1//ant-1.9.1.jar
antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar
aopalliance/1.0//aopalliance-1.0.jar
apache-curator/2.7.1//apache-curator-2.7.1.pom
apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar
apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar
api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar
api-util/1.0.0-M20//api-util-1.0.0-M20.jar
asm/3.1//asm-3.1.jar
avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar
avatica/1.8.0//avatica-1.8.0.jar
avro/1.8.2//avro-1.8.2.jar
bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar
calcite-core/1.10.0//calcite-core-1.10.0.jar
calcite-druid/1.10.0//calcite-druid-1.10.0.jar
calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar
commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar
commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar
commons-cli/1.2//commons-cli-1.2.jar
commons-codec/1.4//commons-codec-1.4.jar
commons-collections/3.2.2//commons-collections-3.2.2.jar
commons-compiler/2.7.6//commons-compiler-2.7.6.jar
commons-compress/1.9//commons-compress-1.9.jar
commons-configuration/1.6//commons-configuration-1.6.jar
commons-daemon/1.0.13//commons-daemon-1.0.13.jar
commons-dbcp/1.4//commons-dbcp-1.4.jar
commons-digester/1.8//commons-digester-1.8.jar
commons-el/1.0//commons-el-1.0.jar
commons-httpclient/3.1//commons-httpclient-3.1.jar
commons-io/2.4//commons-io-2.4.jar
commons-lang/2.6//commons-lang-2.6.jar
commons-lang3/3.1//commons-lang3-3.1.jar
commons-logging/1.2//commons-logging-1.2.jar
commons-math/2.2//commons-math-2.2.jar
commons-math3/3.1.1//commons-math3-3.1.1.jar
commons-net/3.1//commons-net-3.1.jar
commons-pool/1.5.4//commons-pool-1.5.4.jar
curator-client/2.7.1//curator-client-2.7.1.jar
curator-framework/2.7.1//curator-framework-2.7.1.jar
curator-recipes/2.7.1//curator-recipes-2.7.1.jar
datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar
datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar
datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar
derby/10.10.2.0//derby-10.10.2.0.jar
disruptor/3.3.0//disruptor-3.3.0.jar
dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar
fastutil/7.0.13//fastutil-7.0.13.jar
findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar
fluent-hc/4.4.1//fluent-hc-4.4.1.jar
groovy-all/2.4.4//groovy-all-2.4.4.jar
gson/2.3.1//gson-2.3.1.jar
guava/14.0.1//guava-14.0.1.jar
guice-assistedinject/3.0//guice-assistedinject-3.0.jar
guice-servlet/3.0//guice-servlet-3.0.jar
guice/3.0//guice-3.0.jar
hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar
hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar
hadoop-client/2.7.3//hadoop-client-2.7.3.jar
hadoop-common/2.7.3//hadoop-common-2.7.3.jar
hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar
hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar
hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar
hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar
hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar
hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar
hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar
hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar
hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar
hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar
hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar
hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar
hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar
hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar
hadoop-yarn-server-resourcemanager/2.7.2//hadoop-yarn-server-resourcemanager-2.7.2.jar
hadoop-yarn-server-web-proxy/2.7.2//hadoop-yarn-server-web-proxy-2.7.2.jar
hamcrest-core/1.3//hamcrest-core-1.3.jar
hbase-annotations/1.2.3//hbase-annotations-1.2.3.jar
hbase-client/1.2.3//hbase-client-1.2.3.jar
hbase-common/1.2.3//hbase-common-1.2.3.jar
hbase-common/1.2.3/tests/hbase-common-1.2.3-tests.jar
hbase-hadoop-compat/1.2.3//hbase-hadoop-compat-1.2.3.jar
hbase-hadoop2-compat/1.2.3//hbase-hadoop2-compat-1.2.3.jar
hbase-prefix-tree/1.2.3//hbase-prefix-tree-1.2.3.jar

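A dependency change check of the kind proposed here boils down to diffing the current bundle's manifest against a checked-in baseline. The sketch below is an illustrative assumption about how such a script could flag additions (class name, method, and manifest format are hypothetical, not the actual Hudi tooling):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch of a dependency governance check: compare the current
// bundle's dependency manifest against a checked-in baseline and report each
// new entry, so dependency changes become visible during code review.
public class DepDiff {

  public static List<String> added(Set<String> baseline, Set<String> current) {
    List<String> out = new ArrayList<>();
    for (String dep : new TreeSet<>(current)) { // sorted for stable output
      if (!baseline.contains(dep)) {
        out.add("+ " + dep);
      }
    }
    return out;
  }
}
```

A CI job could fail when the diff is non-empty unless the baseline file was updated in the same change.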
[jira] [Closed] (HUDI-2423) Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig

2021-09-16 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2423.
--
Resolution: Done

> Separate some config logic from HoodieMetricsConfig into 
> HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig
> ---
>
> Key: HUDI-2423
> URL: https://issues.apache.org/jira/browse/HUDI-2423
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2423) Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig

2021-09-16 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2423:
---
Fix Version/s: 0.10.0

> Separate some config logic from HoodieMetricsConfig into 
> HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig
> ---
>
> Key: HUDI-2423
> URL: https://issues.apache.org/jira/browse/HUDI-2423
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2423) Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig

2021-09-16 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2423:
---
Summary: Separate some config logic from HoodieMetricsConfig into 
HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig  (was:  Breakdown 
HoodieMetricsConfig into HoodieMetricsGraphiteConfig、HoodieMetricsJmxConfig...)

> Separate some config logic from HoodieMetricsConfig into 
> HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig
> ---
>
> Key: HUDI-2423
> URL: https://issues.apache.org/jira/browse/HUDI-2423
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2410) Fix getDefaultBootstrapIndexClass logical error

2021-09-13 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2410:
---
Description: 
 
{code:java}
public static String getDefaultBootstrapIndexClass(Properties props) {
  String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue();
  if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) {
    defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS;
  }
  return defaultClass;
}
{code}
 

When hoodie.bootstrap.index.enable is not passed, the original logic still 
falls back to HFileBootstrapIndex; this decision should not be made here.

  was:
public static String getDefaultBootstrapIndexClass(Properties props) {
  String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue();
  if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) {
    defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS;
  }
  return defaultClass;
}

When hoodie.bootstrap.index.enable is not passed, the original logic will 
follow HFileBootstrapIndex,

This should not be judged here

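The pitfall can be reproduced with plain java.util.Properties; the constant names below are illustrative stand-ins for Hudi's real ones. When the flag is absent, getProperty returns null, "false".equalsIgnoreCase(null) is false, and the HFile-based default survives:

```java
import java.util.Properties;

// Minimal reproduction of the reported pitfall (constant names are
// illustrative, not Hudi's real ones): when the enable flag is absent,
// the "false" check is skipped and the non-NoOp default is kept.
public class BootstrapIndexDefault {
  static final String ENABLE_KEY = "hoodie.bootstrap.index.enable";
  static final String HFILE_INDEX = "HFileBootstrapIndex";
  static final String NO_OP_INDEX = "NoOpBootstrapIndex";

  public static String defaultIndexClass(Properties props) {
    String defaultClass = HFILE_INDEX;
    // getProperty returns null for an absent key, so this branch is
    // only taken when the flag is explicitly set to "false".
    if ("false".equalsIgnoreCase(props.getProperty(ENABLE_KEY))) {
      defaultClass = NO_OP_INDEX;
    }
    return defaultClass;
  }
}
```

This shows why "not passed" and "passed as true" are indistinguishable to the check, which is the behavior the issue objects to.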

> Fix getDefaultBootstrapIndexClass logical error
> ---
>
> Key: HUDI-2410
> URL: https://issues.apache.org/jira/browse/HUDI-2410
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: bootstrap
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
>  
> {code:java}
> public static String getDefaultBootstrapIndexClass(Properties props) {
>   String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue();
>   if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) {
>     defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS;
>   }
>   return defaultClass;
> }
> {code}
>  
> When hoodie.bootstrap.index.enable is not passed, the original logic still 
> falls back to HFileBootstrapIndex; this decision should not be made here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2410) Fix getDefaultBootstrapIndexClass logical error

2021-09-13 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2410.
--
Resolution: Fixed

9f3c4a2a7f565f7bcc32189a202a3d400ece23f1

> Fix getDefaultBootstrapIndexClass logical error
> ---
>
> Key: HUDI-2410
> URL: https://issues.apache.org/jira/browse/HUDI-2410
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: bootstrap
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> public static String getDefaultBootstrapIndexClass(Properties props) {
>   String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue();
>   if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) {
>     defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS;
>   }
>   return defaultClass;
> }
> When hoodie.bootstrap.index.enable is not passed, the original logic still 
> falls back to HFileBootstrapIndex; this decision should not be made here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HUDI-2410) Fix getDefaultBootstrapIndexClass logical error

2021-09-13 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2410:
--

Assignee: liujinhui

> Fix getDefaultBootstrapIndexClass logical error
> ---
>
> Key: HUDI-2410
> URL: https://issues.apache.org/jira/browse/HUDI-2410
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: bootstrap
>Reporter: liujinhui
>Assignee: liujinhui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> public static String getDefaultBootstrapIndexClass(Properties props) {
>   String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue();
>   if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) {
>     defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS;
>   }
>   return defaultClass;
> }
> When hoodie.bootstrap.index.enable is not passed, the original logic still 
> falls back to HFileBootstrapIndex; this decision should not be made here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2411) Remove unnecessary method overridden and note

2021-09-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2411.
--
Resolution: Done

44b9bc145e0d101bcc688f11c6a30ebcbb7a4a7d

> Remove unnecessary method overridden and note
> -
>
> Key: HUDI-2411
> URL: https://issues.apache.org/jira/browse/HUDI-2411
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-2384) Allow log file size more than 2GB

2021-09-01 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2384:
---
Fix Version/s: 0.10.0

> Allow log file size more than 2GB
> -
>
> Key: HUDI-2384
> URL: https://issues.apache.org/jira/browse/HUDI-2384
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core
>Reporter: XiaoyuGeng
>Assignee: XiaoyuGeng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2384) Allow log file size more than 2GB

2021-09-01 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2384.
--
Resolution: Done

21fd6edfe7721c674b40877fbbdbac71b36bf782

> Allow log file size more than 2GB
> -
>
> Key: HUDI-2384
> URL: https://issues.apache.org/jira/browse/HUDI-2384
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core
>Reporter: XiaoyuGeng
>Assignee: XiaoyuGeng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2320) Add support ByteArrayDeserializer in AvroKafkaSource

2021-08-29 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2320.
--
Resolution: Done

bf5a52e51bbeaa089995335a0a4c55884792e505

> Add support ByteArrayDeserializer in AvroKafkaSource
> 
>
> Key: HUDI-2320
> URL: https://issues.apache.org/jira/browse/HUDI-2320
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: DeltaStreamer
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> When the 'value.serializer' of the Kafka Avro producer is 
> 'org.apache.kafka.common.serialization.ByteArraySerializer', use the 
> following configuration:
> {code:java}
> --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
> --schemaprovider-class 
> org.apache.hudi.utilities.schema.JdbcbasedSchemaProvider \
> --hoodie-conf 
> "hoodie.deltastreamer.source.kafka.value.deserializer.class=org.apache.kafka.common.serialization.ByteArrayDeserializer"
> {code}
> Currently, it throws an exception:
> {code:java}
> java.lang.ClassCastException: [B cannot be cast to 
> org.apache.avro.generic.GenericRecord{code}
> After adding support for ByteArrayDeserializer, the configuration above 
> works properly, and there is no need to provide 'schema.registry.url'; for 
> example, we can use the JdbcbasedSchemaProvider to get the source schema.

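The ClassCastException arises because ByteArrayDeserializer hands the source raw byte[] values, which must be decoded into Avro records before any GenericRecord cast; casting the bytes directly fails. The dependency-free sketch below illustrates the failure class, with CharSequence standing in for GenericRecord (the class and method are hypothetical):

```java
// Stand-in for the reported failure: a byte[] produced by
// ByteArrayDeserializer cannot simply be cast to the record type the
// source expects; it must be decoded first. CharSequence plays the role
// of GenericRecord here to keep the sketch free of Avro dependencies.
public class ByteCastDemo {

  public static boolean castSucceeds(Object deserialized) {
    try {
      CharSequence record = (CharSequence) deserialized;
      return record != null;
    } catch (ClassCastException e) {
      return false; // analogous to "[B cannot be cast to ..."
    }
  }
}
```

This is why the source has to special-case the byte-array deserializer and run the Avro decoding itself instead of relying on the cast.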


--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2225) Add compaction example

2021-08-02 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2225.
--
Fix Version/s: 0.9.0
   Resolution: Done

aa857beee00a764cee90d6e790ee4b0ab4ad4862

> Add compaction example
> --
>
> Key: HUDI-2225
> URL: https://issues.apache.org/jira/browse/HUDI-2225
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HUDI-2244) Fix database alreadyExistsException while hive sync

2021-07-28 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2244:
--

Assignee: Zheng yunhong

> Fix database alreadyExistsException while hive sync
> ---
>
> Key: HUDI-2244
> URL: https://issues.apache.org/jira/browse/HUDI-2244
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Hive Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Fix database alreadyExistsException while hive sync.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2244) Fix database alreadyExistsException while hive sync

2021-07-28 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2244.
--
Resolution: Fixed

eedfadeb46d5538bc7efb2c455469f1b42e9385e

> Fix database alreadyExistsException while hive sync
> ---
>
> Key: HUDI-2244
> URL: https://issues.apache.org/jira/browse/HUDI-2244
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Hive Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Fix database alreadyExistsException while hive sync.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2230) "Task not serializable" exception due to non-serializable Codahale Timers

2021-07-28 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2230.
--
Resolution: Fixed

8105cf588e28820b9c021c9ed0e59e3f8b6efa71

> "Task not serializable" exception due to non-serializable Codahale Timers
> -
>
> Key: HUDI-2230
> URL: https://issues.apache.org/jira/browse/HUDI-2230
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 0.9.0
>Reporter: Dave Hagman
>Assignee: Dave Hagman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Steps to reproduce:
>  * Enable graphite metrics via props file. Example:
> {noformat}
> hoodie.metrics.on=true
> hoodie.metrics.reporter.type=GRAPHITE
> hoodie.metrics.graphite.host=
> hoodie.metrics.graphite.port=
> hoodie.metrics.graphite.metric.prefix=
> {noformat}
>  * Run the Deltastreamer
>  * Note the following exception:
> {noformat}
> Exception in thread "main" org.apache.hudi.exception.HoodieException: 
> org.apache.hudi.exception.HoodieException: Task not serializable
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:165)
>   at org.apache.hudi.common.util.Option.ifPresent(Option.java:96)
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:160)
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:501)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>   at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
>   at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>   at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1038)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1047)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hudi.exception.HoodieException: Task not serializable
>   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:90)
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:163)
>   ... 15 more
> Caused by: org.apache.hudi.exception.HoodieException: Task not serializable
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:649)
>   at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.spark.SparkException: Task not serializable
>   at 
> org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:416)
>   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:406)
>   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
>   at org.apache.spark.SparkContext.clean(SparkContext.scala:2502)
>   at org.apache.spark.rdd.RDD.$anonfun$map$1(RDD.scala:422)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>   at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
>   at org.apache.spark.rdd.RDD.map(RDD.scala:421)
>   at org.apache.spark.api.java.JavaRDDLike.map(JavaRDDLike.scala:93)
>   at org.apache.spark.api.java.JavaRDDLike.map$(JavaRDDLike.scala:92)
>   at 
> org.apache.spark.api.java.AbstractJavaRDDLike.map(JavaRDDLike.scala:45)
>   ...
>   at 
> org.apache.hudi.utilities.sources.RowSource.fetchNewData(RowSource.java:43)
>   at org.apache.hudi.utilities.sources.Source.fetchNext(Source.java:76)
>   at 
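The failure mode, a non-serializable metrics object captured by an object that Spark must serialize, can be reproduced with plain Java serialization. The classes below are illustrative stand-ins (CodahaleTimerStandIn plays the role of the real Codahale timer; the actual Hudi fix may differ):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Minimal reproduction of the failure class: an object that must be
// serialized (as Spark task closures are) but holds a non-serializable field.
public class SerializationDemo {

  static class CodahaleTimerStandIn { } // not Serializable, like the real timers

  static class BadClosure implements Serializable {
    CodahaleTimerStandIn timer = new CodahaleTimerStandIn(); // breaks serialization
  }

  static class GoodClosure implements Serializable {
    // transient fields are skipped during serialization and can be
    // re-created lazily on the executor side.
    transient CodahaleTimerStandIn timer = new CodahaleTimerStandIn();
  }

  public static boolean serializes(Object o) {
    try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
      oos.writeObject(o);
      return true;
    } catch (IOException e) { // NotSerializableException extends IOException
      return false;
    }
  }
}
```

Marking the timer transient (or keeping metrics out of the closure entirely) is the usual way to make such a class survive closure serialization.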

[jira] [Updated] (HUDI-2216) the words 'fiels' in the comments is incorrect

2021-07-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2216:
---
Issue Type: Improvement  (was: Bug)

> the words 'fiels' in the comments is incorrect
> --
>
> Key: HUDI-2216
> URL: https://issues.apache.org/jira/browse/HUDI-2216
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: 0.9.0
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Major
>  Labels: documentation, pull-request-available
> Fix For: 0.9.0
>
> Attachments: HUDI-2216.png
>
>
> The word 'fiels' in the comments of MergeIntoHoodieTableCommand is 
> incorrect; it should be 
> 'fields'
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (HUDI-2216) the words 'fiels' in the comments is incorrect

2021-07-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2216.
--
Resolution: Done

a91296f14a037a148d949b2380ad503677e688c7

> the words 'fiels' in the comments is incorrect
> --
>
> Key: HUDI-2216
> URL: https://issues.apache.org/jira/browse/HUDI-2216
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: 0.9.0
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Trivial
>  Labels: documentation, pull-request-available
> Fix For: 0.9.0
>
> Attachments: HUDI-2216.png
>
>
> The word 'fiels' in the comments of MergeIntoHoodieTableCommand is
> incorrect; it should be 'fields'.
>  





[jira] [Updated] (HUDI-2216) the words 'fiels' in the comments is incorrect

2021-07-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2216:
---
Priority: Trivial  (was: Major)

> the words 'fiels' in the comments is incorrect
> --
>
> Key: HUDI-2216
> URL: https://issues.apache.org/jira/browse/HUDI-2216
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: 0.9.0
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Trivial
>  Labels: documentation, pull-request-available
> Fix For: 0.9.0
>
> Attachments: HUDI-2216.png
>
>
> The word 'fiels' in the comments of MergeIntoHoodieTableCommand is
> incorrect; it should be 'fields'.
>  





[jira] [Commented] (HUDI-2216) the words 'fiels' in the comments is incorrect

2021-07-24 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386805#comment-17386805
 ] 

vinoyang commented on HUDI-2216:


[~dongkelun] I have given you Jira contributor permission. thanks~

> the words 'fiels' in the comments is incorrect
> --
>
> Key: HUDI-2216
> URL: https://issues.apache.org/jira/browse/HUDI-2216
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: 0.9.0
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Major
>  Labels: documentation, pull-request-available
> Fix For: 0.9.0
>
> Attachments: HUDI-2216.png
>
>
> The word 'fiels' in the comments of MergeIntoHoodieTableCommand is
> incorrect; it should be 'fields'.
>  





[jira] [Assigned] (HUDI-2216) the words 'fiels' in the comments is incorrect

2021-07-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2216:
--

Assignee: 董可伦

> the words 'fiels' in the comments is incorrect
> --
>
> Key: HUDI-2216
> URL: https://issues.apache.org/jira/browse/HUDI-2216
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: 0.9.0
>Reporter: 董可伦
>Assignee: 董可伦
>Priority: Major
>  Labels: documentation, pull-request-available
> Fix For: 0.9.0
>
> Attachments: HUDI-2216.png
>
>
> The word 'fiels' in the comments of MergeIntoHoodieTableCommand is
> incorrect; it should be 'fields'.
>  





[jira] [Closed] (HUDI-2213) Remove unnecessary parameter for HoodieMetrics constructor and fix NPE in UT

2021-07-23 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2213.
--
Resolution: Done

71e14cf866d2ddeee7238a3f15b59912c2537943

> Remove unnecessary parameter for HoodieMetrics constructor and fix NPE in UT
> 
>
> Key: HUDI-2213
> URL: https://issues.apache.org/jira/browse/HUDI-2213
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Xuedong Luan
>Assignee: Xuedong Luan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Remove the unnecessary table-name parameter from the HoodieMetrics
> constructor, because it can be obtained via config.getTableName(). Also fix
> some NullPointerException failures in the metrics UT.





[jira] [Updated] (HUDI-2213) Remove unnecessary parameter for HoodieMetrics constructor and fix NPE in UT

2021-07-23 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2213:
---
Fix Version/s: 0.9.0

> Remove unnecessary parameter for HoodieMetrics constructor and fix NPE in UT
> 
>
> Key: HUDI-2213
> URL: https://issues.apache.org/jira/browse/HUDI-2213
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Xuedong Luan
>Assignee: Xuedong Luan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Remove the unnecessary table-name parameter from the HoodieMetrics
> constructor, because it can be obtained via config.getTableName(). Also fix
> some NullPointerException failures in the metrics UT.





[jira] [Updated] (HUDI-2211) Fix NullPointerException in TestHoodieConsoleMetrics

2021-07-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2211:
---
Fix Version/s: 0.9.0

> Fix NullPointerException in TestHoodieConsoleMetrics
> 
>
> Key: HUDI-2211
> URL: https://issues.apache.org/jira/browse/HUDI-2211
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Xuedong Luan
>Assignee: Xuedong Luan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> java.lang.NullPointerException: Expected a non-null value. Got null
>   at org.apache.hudi.common.util.Option.<init>(Option.java:65)
>   at org.apache.hudi.common.util.Option.of(Option.java:75)
>   at org.apache.hudi.metrics.Metrics.registerHoodieCommonMetrics(Metrics.java:85)
>   at org.apache.hudi.metrics.Metrics.reportAndCloseReporter(Metrics.java:63)
>   at org.apache.hudi.metrics.Metrics.shutdown(Metrics.java:109)
>   at org.apache.hudi.metrics.TestHoodieConsoleMetrics.stop(TestHoodieConsoleMetrics.java:48)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
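The trace above fails because a null metrics value is wrapped in a container that rejects null. A minimal sketch of the distinction, using java.util.Optional as a stand-in for Hudi's Option class (the class and method names below are illustrative, not Hudi's actual fix):

```java
import java.util.Optional;

public class NullSafeOption {
    // Strict wrapper: like Option.of in the trace above, Optional.of
    // throws NullPointerException when handed a null value.
    static Optional<String> wrapStrict(String value) {
        return Optional.of(value);
    }

    // Null-tolerant wrapper: yields an empty Optional instead of throwing,
    // which is the usual remedy when the value may legitimately be absent.
    static Optional<String> wrapSafe(String value) {
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        System.out.println(wrapSafe(null).isPresent()); // prints false
        try {
            wrapStrict(null);
        } catch (NullPointerException e) {
            System.out.println("strict wrapper threw NPE");
        }
    }
}
```

Either the caller must guarantee non-null input before wrapping, or the wrapping site must switch to the null-tolerant form.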





[jira] [Closed] (HUDI-2211) Fix NullPointerException in TestHoodieConsoleMetrics

2021-07-22 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2211.
--
Resolution: Done

6d592c5896d033c7f781d4a9eef4a43916c084ed

> Fix NullPointerException in TestHoodieConsoleMetrics
> 
>
> Key: HUDI-2211
> URL: https://issues.apache.org/jira/browse/HUDI-2211
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Xuedong Luan
>Assignee: Xuedong Luan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> java.lang.NullPointerException: Expected a non-null value. Got null
>   at org.apache.hudi.common.util.Option.<init>(Option.java:65)
>   at org.apache.hudi.common.util.Option.of(Option.java:75)
>   at org.apache.hudi.metrics.Metrics.registerHoodieCommonMetrics(Metrics.java:85)
>   at org.apache.hudi.metrics.Metrics.reportAndCloseReporter(Metrics.java:63)
>   at org.apache.hudi.metrics.Metrics.shutdown(Metrics.java:109)
>   at org.apache.hudi.metrics.TestHoodieConsoleMetrics.stop(TestHoodieConsoleMetrics.java:48)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)





[jira] [Closed] (HUDI-2165) Support Transformer for HoodieFlinkStreamer

2021-07-14 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2165.
--
Resolution: Implemented

52524b659d2cb64403e8ba87d2fefe6d536156e9

> Support Transformer for HoodieFlinkStreamer
> ---
>
> Key: HUDI-2165
> URL: https://issues.apache.org/jira/browse/HUDI-2165
> Project: Apache Hudi
>  Issue Type: New Feature
>  Components: Flink Integration
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Hoodie's delta streamer supports {{Transformer}}; we can also provide this
> feature for {{HoodieFlinkStreamer}}.





[jira] [Updated] (HUDI-2165) Support Transformer for HoodieFlinkStreamer

2021-07-14 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2165:
---
Fix Version/s: 0.9.0

> Support Transformer for HoodieFlinkStreamer
> ---
>
> Key: HUDI-2165
> URL: https://issues.apache.org/jira/browse/HUDI-2165
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Hoodie's delta streamer supports {{Transformer}}; we can also provide this
> feature for {{HoodieFlinkStreamer}}.





[jira] [Updated] (HUDI-2165) Support Transformer for HoodieFlinkStreamer

2021-07-14 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2165:
---
Issue Type: New Feature  (was: Improvement)

> Support Transformer for HoodieFlinkStreamer
> ---
>
> Key: HUDI-2165
> URL: https://issues.apache.org/jira/browse/HUDI-2165
> Project: Apache Hudi
>  Issue Type: New Feature
>  Components: Flink Integration
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Hoodie's delta streamer supports {{Transformer}}; we can also provide this
> feature for {{HoodieFlinkStreamer}}.





[jira] [Assigned] (HUDI-2165) Support Transformer for HoodieFlinkStreamer

2021-07-13 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2165:
--

Assignee: vinoyang

> Support Transformer for HoodieFlinkStreamer
> ---
>
> Key: HUDI-2165
> URL: https://issues.apache.org/jira/browse/HUDI-2165
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Hoodie's delta streamer supports {{Transformer}}; we can also provide this
> feature for {{HoodieFlinkStreamer}}.





[jira] [Created] (HUDI-2165) Support Transformer for HoodieFlinkStreamer

2021-07-12 Thread vinoyang (Jira)
vinoyang created HUDI-2165:
--

 Summary: Support Transformer for HoodieFlinkStreamer
 Key: HUDI-2165
 URL: https://issues.apache.org/jira/browse/HUDI-2165
 Project: Apache Hudi
  Issue Type: Improvement
  Components: Flink Integration
Reporter: vinoyang


Hoodie's delta streamer supports {{Transformer}}; we can also provide this
feature for {{HoodieFlinkStreamer}}.





[jira] [Assigned] (HUDI-2142) Support setting bucket assign parallelism for flink write task

2021-07-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2142:
--

Assignee: Zheng yunhong

> Support setting bucket assign parallelism for flink write task
> --
>
> Key: HUDI-2142
> URL: https://issues.apache.org/jira/browse/HUDI-2142
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Support setting bucket assign parallelism for flink write task.





[jira] [Closed] (HUDI-2142) Support setting bucket assign parallelism for flink write task

2021-07-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2142.
--
Resolution: Implemented

9b01d2a04520db6230cd16ef2b29013c013b1944

> Support setting bucket assign parallelism for flink write task
> --
>
> Key: HUDI-2142
> URL: https://issues.apache.org/jira/browse/HUDI-2142
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Support setting bucket assign parallelism for flink write task.





[jira] [Closed] (HUDI-2143) Tweak the default compaction target IO to 500GB when flink async compaction is off

2021-07-10 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2143.
--
Resolution: Done

942a024e74af52e09cabbfe967f5da0ef108bdbb

> Tweak the default compaction target IO to 500GB when flink async compaction 
> is off
> --
>
> Key: HUDI-2143
> URL: https://issues.apache.org/jira/browse/HUDI-2143
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2147) Remove unused class AvroConvertor in hudi-flink

2021-07-09 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2147.
--
Resolution: Done

3b2a4f2b6b49e13997292ecafa9accdd3e7b9efd

> Remove unused class AvroConvertor in hudi-flink
> ---
>
> Key: HUDI-2147
> URL: https://issues.apache.org/jira/browse/HUDI-2147
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Updated] (HUDI-2147) Remove unused class AvroConvertor in hudi-flink

2021-07-09 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2147:
---
Fix Version/s: 0.9.0

> Remove unused class AvroConvertor in hudi-flink
> ---
>
> Key: HUDI-2147
> URL: https://issues.apache.org/jira/browse/HUDI-2147
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2087) Support Append only in Flink stream

2021-07-09 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2087.
--
Fix Version/s: 0.9.0
   Resolution: Done

371526789d663dee85041eb31c27c52c81ef87ef

> Support Append only in Flink stream
> ---
>
> Key: HUDI-2087
> URL: https://issues.apache.org/jira/browse/HUDI-2087
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: yuzhaojing
>Assignee: yuzhaojing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
> Attachments: image-2021-07-08-22-04-30-039.png, 
> image-2021-07-08-22-04-40-018.png
>
>
> It is necessary to support append mode in the Flink stream, as the data lake
> should be able to write log-type data as Parquet with high performance and
> without merging.





[jira] [Closed] (HUDI-2136) Fix packet conflict when flink-sql-connector-hive and hudi-flink-bundle both in flink lib

2021-07-08 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2136.
--
Resolution: Fixed

047d956e01b6d7c92320686d8321b2bbe9d2188e

> Fix packet conflict when flink-sql-connector-hive and hudi-flink-bundle both 
> in flink lib
> -
>
> Key: HUDI-2136
> URL: https://issues.apache.org/jira/browse/HUDI-2136
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Fix the package conflict when flink-sql-connector-hive and hudi-flink-bundle
> are both in the Flink lib directory.





[jira] [Assigned] (HUDI-2136) Fix packet conflict when flink-sql-connector-hive and hudi-flink-bundle both in flink lib

2021-07-08 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2136:
--

Assignee: Zheng yunhong

> Fix packet conflict when flink-sql-connector-hive and hudi-flink-bundle both 
> in flink lib
> -
>
> Key: HUDI-2136
> URL: https://issues.apache.org/jira/browse/HUDI-2136
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Fix the package conflict when flink-sql-connector-hive and hudi-flink-bundle
> are both in the Flink lib directory.





[jira] [Closed] (HUDI-2111) Update docs about bootstrap support configure KeyGenerator by type

2021-07-06 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2111.
--
Resolution: Done

0a6e48dd23c73a1ef34852396291cfec388bb0ca

> Update docs about bootstrap support configure KeyGenerator by type
> --
>
> Key: HUDI-2111
> URL: https://issues.apache.org/jira/browse/HUDI-2111
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2106) Fix flink batch compaction bug while user don't set compaction tasks

2021-07-05 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2106.
--
Resolution: Fixed

bc313727e3e89640edad85364022e057c9864ee9

> Fix flink batch compaction bug while user don't set compaction tasks
> 
>
> Key: HUDI-2106
> URL: https://issues.apache.org/jira/browse/HUDI-2106
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> There is a bug in Flink batch compaction when the user does not set the
> compaction tasks: the number of compaction tasks always falls back to the
> default value instead of the compactionPlan operations size.





[jira] [Assigned] (HUDI-2106) Fix flink batch compaction bug while user don't set compaction tasks

2021-07-05 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned HUDI-2106:
--

Assignee: Zheng yunhong

> Fix flink batch compaction bug while user don't set compaction tasks
> 
>
> Key: HUDI-2106
> URL: https://issues.apache.org/jira/browse/HUDI-2106
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Zheng yunhong
>Assignee: Zheng yunhong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> There is a bug in Flink batch compaction when the user does not set the
> compaction tasks: the number of compaction tasks always falls back to the
> default value instead of the compactionPlan operations size.





[jira] [Commented] (HUDI-2132) Make coordinator events as POJO for efficient serialization

2021-07-05 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375078#comment-17375078
 ] 

vinoyang commented on HUDI-2132:


32bd8ce088e0f1d82577575ac048e1a44d44e380

> Make coordinator events as POJO for efficient serialization
> ---
>
> Key: HUDI-2132
> URL: https://issues.apache.org/jira/browse/HUDI-2132
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-1930) Bootstrap support configure KeyGenerator by type

2021-07-03 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-1930.
--
Resolution: Implemented

> Bootstrap support configure KeyGenerator by type
> 
>
> Key: HUDI-1930
> URL: https://issues.apache.org/jira/browse/HUDI-1930
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Updated] (HUDI-1930) Bootstrap support configure KeyGenerator by type

2021-07-03 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-1930:
---
Fix Version/s: 0.9.0

> Bootstrap support configure KeyGenerator by type
> 
>
> Key: HUDI-1930
> URL: https://issues.apache.org/jira/browse/HUDI-1930
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2112) Support reading pure logs file group for flink batch reader after compaction

2021-07-02 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2112.
--
Resolution: Done

7462fdefc39c75dca986d65551d871b8c47d4f55

> Support reading pure logs file group for flink batch reader after compaction
> 
>
> Key: HUDI-2112
> URL: https://issues.apache.org/jira/browse/HUDI-2112
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2088) Missing Partition Fields And PreCombineField In Hoodie Properties For Table Written By Flink

2021-07-01 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2088.
--
Resolution: Fixed

b34d53fa9c1821e7e7191ede1b99bdc96212ce3e

> Missing Partition Fields And PreCombineField In Hoodie Properties For Table 
> Written By Flink
> 
>
> Key: HUDI-2088
> URL: https://issues.apache.org/jira/browse/HUDI-2088
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration, Spark Integration
>Reporter: pengzhiwei
>Assignee: pengzhiwei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Currently the partition fields and preCombineField are missing from
> hoodie.properties when the table is initialized by Flink, which causes two
> problems:
> 1. Spark partition pruning will not work if the partition fields are missing.
> 2. Spark queries on a MOR table will return incorrect results if the
> preCombineField is missing.
>  





[jira] [Closed] (HUDI-2052) Support load logfile in BootstrapFunction

2021-06-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2052.
--
Resolution: Implemented

07e93de8b49560eee23237817fc24fbe763f2891

> Support load logfile in BootstrapFunction
> -
>
> Key: HUDI-2052
> URL: https://issues.apache.org/jira/browse/HUDI-2052
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: yuzhaojing
>Assignee: yuzhaojing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Support loading log files in BootstrapFunction.





[jira] [Updated] (HUDI-2052) Support load logfile in BootstrapFunction

2021-06-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2052:
---
Fix Version/s: 0.9.0

> Support load logfile in BootstrapFunction
> -
>
> Key: HUDI-2052
> URL: https://issues.apache.org/jira/browse/HUDI-2052
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: yuzhaojing
>Assignee: yuzhaojing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Support loading log files in BootstrapFunction.





[jira] [Closed] (HUDI-2103) Add rebalance before index bootstrap

2021-06-30 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2103.
--
Fix Version/s: 0.9.0
   Resolution: Done

1cbf43b6e7731c68661c74bdfa41399f6398172f

> Add rebalance before index bootstrap
> 
>
> Key: HUDI-2103
> URL: https://issues.apache.org/jira/browse/HUDI-2103
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: yuzhaojing
>Assignee: yuzhaojing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> When using Flink SQL to upsert into Hudi, users often set the parallelism
> larger than the Kafka partition num. Currently the bootstrap operator needs
> at least one element to trigger loading.





[jira] [Updated] (HUDI-2092) Fix NPE caused by FlinkStreamerConfig#writePartitionUrlEncode null value

2021-06-29 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2092:
---
Fix Version/s: 0.9.0

> Fix NPE caused by FlinkStreamerConfig#writePartitionUrlEncode null value
> 
>
> Key: HUDI-2092
> URL: https://issues.apache.org/jira/browse/HUDI-2092
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2092) Fix NPE caused by FlinkStreamerConfig#writePartitionUrlEncode null value

2021-06-29 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2092.
--
Resolution: Fixed

202887b8ca27eb6de808ba7a2e737b13ae9eb8c0

> Fix NPE caused by FlinkStreamerConfig#writePartitionUrlEncode null value
> 
>
> Key: HUDI-2092
> URL: https://issues.apache.org/jira/browse/HUDI-2092
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Xianghu Wang
>Assignee: Xianghu Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2067) Sync all the options of FlinkOptions to FlinkStreamerConfig

2021-06-28 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2067.
--
Resolution: Implemented

34fc8a8880b0da5adb82b6eb7678f141af54ca18

> Sync all the options of FlinkOptions to FlinkStreamerConfig
> ---
>
> Key: HUDI-2067
> URL: https://issues.apache.org/jira/browse/HUDI-2067
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Vinay
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Sync the options so that the {{HoodieFlinkStreamer}} can have more config 
> options.





[jira] [Updated] (HUDI-2067) Sync all the options of FlinkOptions to FlinkStreamerConfig

2021-06-28 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-2067:
---
Issue Type: Improvement  (was: Task)

> Sync all the options of FlinkOptions to FlinkStreamerConfig
> ---
>
> Key: HUDI-2067
> URL: https://issues.apache.org/jira/browse/HUDI-2067
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Vinay
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Sync the options so that the {{HoodieFlinkStreamer}} can have more config 
> options.





[jira] [Closed] (HUDI-2074) Use while loop instead of recursive call in MergeOnReadInputFormat#MergeIterator to avoid StackOverflow

2021-06-28 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2074.
--
Resolution: Done

d24341d10ca49ed52fc0c1c86a164fbfb57327d4

> Use while loop instead of recursive call in 
> MergeOnReadInputFormat#MergeIterator to avoid StackOverflow
> ---
>
> Key: HUDI-2074
> URL: https://issues.apache.org/jira/browse/HUDI-2074
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
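The refactoring named in the summary — replacing a self-recursive call with a while loop so that long skip chains cannot overflow the stack — can be sketched generically. This is an illustrative iterator, not Hudi's actual MergeOnReadInputFormat code:

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// An iterator that skips records failing a predicate. A recursive
// hasNext() ("if the record is skipped, return hasNext()") grows the call
// stack once per skipped record; the while loop keeps stack depth constant.
public class SkippingIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private final Predicate<T> keep;
    private T next; // buffered next element, or null when not yet fetched

    public SkippingIterator(Iterator<T> inner, Predicate<T> keep) {
        this.inner = inner;
        this.keep = keep;
    }

    @Override
    public boolean hasNext() {
        if (next != null) {
            return true;
        }
        // Iterative form of the recursive "skip and retry" logic.
        while (inner.hasNext()) {
            T candidate = inner.next();
            if (keep.test(candidate)) {
                next = candidate;
                return true;
            }
        }
        return false;
    }

    @Override
    public T next() {
        hasNext(); // ensure the buffer is filled if possible
        T result = next;
        next = null;
        return result;
    }

    public static void main(String[] args) {
        SkippingIterator<Integer> it = new SkippingIterator<>(
            List.of(1, 2, 3, 4, 5, 6).iterator(), n -> n % 2 == 0);
        while (it.hasNext()) {
            System.out.println(it.next()); // prints 2, 4, 6
        }
    }
}
```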






[jira] [Closed] (HUDI-2068) Skip the assign state for SmallFileAssign when the state can not assign initially

2021-06-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2068.
--
Resolution: Fixed

e64fe5505487f3ee591b2b5d044c2c57989f8991

> Skip the assign state for SmallFileAssign when the state can not assign 
> initially
> -
>
> Key: HUDI-2068
> URL: https://issues.apache.org/jira/browse/HUDI-2068
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Closed] (HUDI-2062) Catch FileNotFoundException in WriteProfiles #getCommitMetadataSafely

2021-06-24 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-2062.
--
Fix Version/s: 0.9.0
   Resolution: Fixed

218f2a6df8a41279d904a76b594d22f4d66d17c2

> Catch FileNotFoundException in WriteProfiles #getCommitMetadataSafely
> -
>
> Key: HUDI-2062
> URL: https://issues.apache.org/jira/browse/HUDI-2062
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: yuzhaojing
>Assignee: yuzhaojing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> The function WriteProfiles#getCommitMetadataSafely is expected to get the 
> instant safely: if the instant was deleted by the cleaner, it should be ignored.
> But in this case, a FileNotFoundException is thrown outside the try/catch.
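The failure mode described above — an exception escaping even though a deleted instant should simply be skipped — corresponds to a catch-and-ignore pattern like the following sketch. The class and method names here are illustrative stand-ins, not Hudi's actual WriteProfiles API:

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical sketch of "get commit metadata safely": return an empty
// result instead of letting a file-not-found error (the instant file was
// deleted by the cleaner between listing and reading) escape the caller.
class CommitMetadataReader {
  static Optional<String> getCommitMetadataSafely(Path instantFile) {
    try {
      return Optional.of(new String(Files.readAllBytes(instantFile)));
    } catch (FileNotFoundException | NoSuchFileException e) {
      // Instant already deleted by the cleaner: ignore it, as intended.
      return Optional.empty();
    } catch (IOException e) {
      // Any other I/O failure is still a real error.
      throw new RuntimeException("Failed to read commit metadata", e);
    }
  }
}
```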





[jira] [Commented] (HUDI-2062) Catch IOException in WriteProfiles #getCommitMetadataSafely

2021-06-23 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367982#comment-17367982
 ] 

vinoyang commented on HUDI-2062:


Hi [~yuzhaojing], what exception and stack trace did you encounter? Can you 
share them in the "Description" section?

> Catch IOException in WriteProfiles #getCommitMetadataSafely
> ---
>
> Key: HUDI-2062
> URL: https://issues.apache.org/jira/browse/HUDI-2062
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Flink Integration
>Reporter: yuzhaojing
>Assignee: yuzhaojing
>Priority: Major
>  Labels: pull-request-available
>
> Catch IOException in WriteProfiles #getCommitMetadataSafely





[jira] [Closed] (HUDI-1826) Add ORC support in HoodieSnapshotExporter

2021-06-23 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-1826.
--
Fix Version/s: 0.9.0
   Resolution: Fixed

43b9c1fa1caf97f6fb2baf68e350615541ea0a0c

> Add ORC support in HoodieSnapshotExporter
> -
>
> Key: HUDI-1826
> URL: https://issues.apache.org/jira/browse/HUDI-1826
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: Storage Management
>Reporter: Teresa Kang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>





