[jira] [Closed] (HUDI-1775) Add option for compaction parallelism

2021-04-08 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1775. Resolution: Done 6786581c4842e47e1a8a8e942f54003dc151c7c6 > Add option for compaction parallelism

[jira] [Updated] (HUDI-1775) Add option for compaction parallelism

2021-04-08 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-1775: Fix Version/s: 0.9.0 > Add option for compaction parallelism

[jira] [Assigned] (HUDI-1775) Add option for compaction parallelism

2021-04-08 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-1775: Assignee: Danny Chen > Add option for compaction parallelism

[hudi] branch master updated: [HUDI-1775] Add option for compaction parallelism (#2785)

2021-04-08 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 6786581 [HUDI-1775] Add option for compaction

[GitHub] [hudi] yanghua merged pull request #2785: [HUDI-1775] Add option for compaction parallelism

2021-04-08 Thread GitBox
yanghua merged pull request #2785: URL: https://github.com/apache/hudi/pull/2785 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [hudi] nsivabalan commented on issue #2787: [SUPPORT] Error upserting bucketType UPDATE for partition

2021-04-08 Thread GitBox
nsivabalan commented on issue #2787: URL: https://github.com/apache/hudi/issues/2787#issuecomment-816416482 Can you give us the configs you used? Is it failing at the very beginning or after a few batches of writes? Looks like this is the root cause. ``` Caused by:

[jira] [Updated] (HUDI-1743) Add support for Spark SQL File based transformer for deltastreamer

2021-04-08 Thread Vinoth Govindarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Govindarajan updated HUDI-1743: -- Status: Patch Available (was: In Progress) > Add support for Spark SQL File based

[jira] [Updated] (HUDI-1743) Add support for Spark SQL File based transformer for deltastreamer

2021-04-08 Thread Vinoth Govindarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Govindarajan updated HUDI-1743: -- Status: In Progress (was: Open) > Add support for Spark SQL File based transformer for

[jira] [Commented] (HUDI-1762) Hive Sync is not working with Hive Style Partitioning

2021-04-08 Thread Vinoth Govindarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17317655#comment-17317655 ] Vinoth Govindarajan commented on HUDI-1762: --- PR merged. > Hive Sync is not working with Hive

[jira] [Updated] (HUDI-1762) Hive Sync is not working with Hive Style Partitioning

2021-04-08 Thread Vinoth Govindarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Govindarajan updated HUDI-1762: -- Status: Closed (was: Patch Available) > Hive Sync is not working with Hive Style

[hudi] branch master updated: [HUDI-1762] Added HiveStylePartitionExtractor to support Hive style partitions (#2769)

2021-04-08 Thread sivabalan
sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 08e82c4 [HUDI-1762] Added

[GitHub] [hudi] nsivabalan merged pull request #2769: [HUDI-1762] Added HiveStylePartitionExtractor to support Hive style partitions

2021-04-08 Thread GitBox
nsivabalan merged pull request #2769: URL: https://github.com/apache/hudi/pull/2769

[GitHub] [hudi] nsivabalan commented on a change in pull request #2769: [HUDI-1762] Added HiveStylePartitionExtractor to support Hive style partitions

2021-04-08 Thread GitBox
nsivabalan commented on a change in pull request #2769: URL: https://github.com/apache/hudi/pull/2769#discussion_r610346604 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveStylePartitionValueExtractor.java ## @@ -0,0 +1,42 @@ +/* + * Licensed to

[GitHub] [hudi] vingov commented on a change in pull request #2769: [HUDI-1762] Added HiveStylePartitionExtractor to support Hive style partitions

2021-04-08 Thread GitBox
vingov commented on a change in pull request #2769: URL: https://github.com/apache/hudi/pull/2769#discussion_r610344542 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveStylePartitionValueExtractor.java ## @@ -0,0 +1,42 @@ +/* + * Licensed to the

[hudi] branch asf-site updated: Travis CI build asf-site

2021-04-08 Thread vinoth
vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new a21c5a4 Travis CI build asf-site a21c5a4 is

[GitHub] [hudi] yanghua commented on pull request #2785: [HUDI-1775] Add option for compaction parallelism

2021-04-08 Thread GitBox
yanghua commented on pull request #2785: URL: https://github.com/apache/hudi/pull/2785#issuecomment-816369965 Will merge after Travis turns green.

[GitHub] [hudi] yanghua commented on a change in pull request #2740: [HUDI-1055] Remove hardcoded parquet in tests

2021-04-08 Thread GitBox
yanghua commented on a change in pull request #2740: URL: https://github.com/apache/hudi/pull/2740#discussion_r610313295 ## File path: hudi-cli/src/main/scala/org/apache/hudi/cli/SparkHelpers.scala ## @@ -40,7 +40,7 @@ import scala.collection.mutable._ object SparkHelpers {

[GitHub] [hudi] danny0405 commented on pull request #2786: [HUDI-1782] Add more options for HUDI Flink

2021-04-08 Thread GitBox
danny0405 commented on pull request #2786: URL: https://github.com/apache/hudi/pull/2786#issuecomment-816368238 > Should we change the 0.8.0 doc as well? It will be merged soon. #2792 I don't think it is necessary. People will always see the master documentation.

[jira] [Updated] (HUDI-1782) Add more options for HUDI Flink

2021-04-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1782: - Labels: pull-request-available (was: ) > Add more options for HUDI Flink >

[GitHub] [hudi] danny0405 commented on pull request #2786: [HUDI-1782] Add more options for HUDI Flink

2021-04-08 Thread GitBox
danny0405 commented on pull request #2786: URL: https://github.com/apache/hudi/pull/2786#issuecomment-816368080 > > > @danny0405 Can you please correct the title of the PR? > > > > > > What title should I use? > > File a Jira ticket and add the Jira ID? Added.

[jira] [Created] (HUDI-1782) Add more options for HUDI Flink

2021-04-08 Thread Danny Chen (Jira)
Danny Chen created HUDI-1782: Summary: Add more options for HUDI Flink Key: HUDI-1782 URL: https://issues.apache.org/jira/browse/HUDI-1782 Project: Apache Hudi Issue Type: Task

[jira] [Created] (HUDI-1781) Flink streaming reader throws ClassCastException when reading from empty table path

2021-04-08 Thread Danny Chen (Jira)
Danny Chen created HUDI-1781: Summary: Flink streaming reader throws ClassCastException when reading from empty table path Key: HUDI-1781 URL: https://issues.apache.org/jira/browse/HUDI-1781 Project:

[hudi] branch asf-site updated: Fix 0.8.0 release doc link (#2795)

2021-04-08 Thread garyli
garyli pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 0ae8969 Fix 0.8.0 release doc link (#2795)

[GitHub] [hudi] garyli1019 merged pull request #2795: [DOCS]Fix 0.8.0 release doc link

2021-04-08 Thread GitBox
garyli1019 merged pull request #2795: URL: https://github.com/apache/hudi/pull/2795

[GitHub] [hudi] li36909 commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-04-08 Thread GitBox
li36909 commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-816365255 @cdmikechen Thank you for your explanation. I use Hudi 0.7 + Spark 2.4.5 + Hive 3.1 and didn't test with Hive 2.*. If possible, please fix this issue for Hive 3 as well. Thank you very much

[GitHub] [hudi] pengzhiwei2018 commented on pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-08 Thread GitBox
pengzhiwei2018 commented on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-816360889 Hi @nsivabalan @umehrot2, could you review this PR again?

[GitHub] [hudi] zherenyu831 commented on a change in pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-08 Thread GitBox
zherenyu831 commented on a change in pull request #2784: URL: https://github.com/apache/hudi/pull/2784#discussion_r610237185 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/MetadataConversionUtils.java ## @@ -72,9 +76,14 @@ public

[GitHub] [hudi] yanghua commented on pull request #2786: Add more options for HUDI Flink

2021-04-08 Thread GitBox
yanghua commented on pull request #2786: URL: https://github.com/apache/hudi/pull/2786#issuecomment-816355918 > > @danny0405 Can you please correct the title of the PR? > > What title should I use? File a Jira ticket and add the Jira ID?

[GitHub] [hudi] garyli1019 opened a new pull request #2795: [DOCS]Fix 0.8.0 release doc link

2021-04-08 Thread GitBox
garyli1019 opened a new pull request #2795: URL: https://github.com/apache/hudi/pull/2795 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] yanghua commented on a change in pull request #2785: [HUDI-1775] Add option for compaction parallelism

2021-04-08 Thread GitBox
yanghua commented on a change in pull request #2785: URL: https://github.com/apache/hudi/pull/2785#discussion_r610278459 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java ## @@ -159,8 +164,11 @@ public void processElement(I

[GitHub] [hudi] ztcheck edited a comment on issue #2680: [SUPPORT]Hive sync error by using run_sync_tool.sh

2021-04-08 Thread GitBox
ztcheck edited a comment on issue #2680: URL: https://github.com/apache/hudi/issues/2680#issuecomment-816336235 I updated my issue description.

[GitHub] [hudi] ztcheck edited a comment on issue #2680: [SUPPORT]Hive sync error by using run_sync_tool.sh

2021-04-08 Thread GitBox
ztcheck edited a comment on issue #2680: URL: https://github.com/apache/hudi/issues/2680#issuecomment-816336235 I updated my issue description.

[GitHub] [hudi] ztcheck commented on issue #2680: [SUPPORT]Hive sync error by using run_sync_tool.sh

2021-04-08 Thread GitBox
ztcheck commented on issue #2680: URL: https://github.com/apache/hudi/issues/2680#issuecomment-816336235 I updated my issue description.

[GitHub] [hudi] ssdong commented on a change in pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-08 Thread GitBox
ssdong commented on a change in pull request #2784: URL: https://github.com/apache/hudi/pull/2784#discussion_r610259030 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/MetadataConversionUtils.java ## @@ -72,9 +76,14 @@ public static

[GitHub] [hudi] ztcheck edited a comment on issue #2680: [SUPPORT]Hive sync error by using run_sync_tool.sh

2021-04-08 Thread GitBox
ztcheck edited a comment on issue #2680: URL: https://github.com/apache/hudi/issues/2680#issuecomment-816335004 @n3nash I'm not sure which jar is missing, so I added all of the jars under ${HIVE_HOME}/lib/*.jar to the classpath. And I use this start command in the script

[GitHub] [hudi] ztcheck commented on issue #2680: [SUPPORT]Hive sync error by using run_sync_tool.sh

2021-04-08 Thread GitBox
ztcheck commented on issue #2680: URL: https://github.com/apache/hudi/issues/2680#issuecomment-816335004 @n3nash I'm not sure which jar is missing, so I added all of the jars under ${HIVE_HOME}/lib/*.jar to the classpath. And I use this start command, and it works.

[GitHub] [hudi] ssdong commented on a change in pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-08 Thread GitBox
ssdong commented on a change in pull request #2784: URL: https://github.com/apache/hudi/pull/2784#discussion_r610256593 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/MetadataConversionUtils.java ## @@ -72,9 +76,14 @@ public static

[GitHub] [hudi] satishkotha commented on a change in pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-08 Thread GitBox
satishkotha commented on a change in pull request #2784: URL: https://github.com/apache/hudi/pull/2784#discussion_r610253248 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/MetadataConversionUtils.java ## @@ -72,9 +76,14 @@ public

[GitHub] [hudi] satishkotha commented on a change in pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-08 Thread GitBox
satishkotha commented on a change in pull request #2784: URL: https://github.com/apache/hudi/pull/2784#discussion_r610252511 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java ## @@ -245,7 +245,7 @@ public final void

[GitHub] [hudi] njalan edited a comment on issue #2791: [SUPPORT]Failed to enable hoodie.metadata.enable

2021-04-08 Thread GitBox
njalan edited a comment on issue #2791: URL: https://github.com/apache/hudi/issues/2791#issuecomment-816318167 After the job had been running for half an hour, I got this error. I then removed all the files from the table directory and got the error again within an hour (1 minute per micro-batch).

[GitHub] [hudi] cdmikechen edited a comment on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-04-08 Thread GitBox
cdmikechen edited a comment on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-816314674 @li36909 As far as I know, `TimestampWritableV2` is a Hive 3 class; we mainly use the Hive 2 lib in Hudi. Also, your class is based on a `MOR` table, while my change is based on a `COW`

[GitHub] [hudi] njalan commented on issue #2791: [SUPPORT]Failed to enable hoodie.metadata.enable

2021-04-08 Thread GitBox
njalan commented on issue #2791: URL: https://github.com/apache/hudi/issues/2791#issuecomment-816318167 After the job had been running for half an hour, I got this error. I then removed all the files from the table directory and got the error again within an hour.

[GitHub] [hudi] cdmikechen commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-04-08 Thread GitBox
cdmikechen commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-816314674 @li36909 As far as I know, `TimestampWritableV2` is a Hive 3 class; we mainly use the Hive 2 lib in Hudi. Also, your class is based on a `MOR` table, while my change is based on a `COW` table. I

[GitHub] [hudi] n3nash opened a new pull request #2794: [MINOR] Fix concurrency docs

2021-04-08 Thread GitBox
n3nash opened a new pull request #2794: URL: https://github.com/apache/hudi/pull/2794 ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes

[GitHub] [hudi] nsivabalan commented on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-04-08 Thread GitBox
nsivabalan commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-816306475 @kimberlyamandalu: Do you have a support ticket for your question? Let's not pollute this issue. We can create a new one for your use case and discuss it there.

[GitHub] [hudi] nsivabalan commented on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-04-08 Thread GitBox
nsivabalan commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-816306202 @codejoyan: Sorry, this somehow slipped off my radar. May I know the scale of data you are dealing with? I see your parallelism is very low (2). Can you try with 100 or more

[GitHub] [hudi] nsivabalan merged pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
nsivabalan merged pull request #2792: URL: https://github.com/apache/hudi/pull/2792

[GitHub] [hudi] TeRS-K commented on pull request #2793: [HUDI-57] Support ORC Storage

2021-04-08 Thread GitBox
TeRS-K commented on pull request #2793: URL: https://github.com/apache/hudi/pull/2793#issuecomment-816163413 The build is currently failing with error `ERROR: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading:

[GitHub] [hudi] rubenssoto removed a comment on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
rubenssoto removed a comment on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815986157 Hello guys, is this a bug in Hudi 0.8.0? I migrated all my workload to Hudi yesterday using Hudi 0.8.0 and I'm having problems with timestamps.

[jira] [Updated] (HUDI-1780) Refactoring of parts of HoodieMetadataArchiveLog have changed behaviour of Archival

2021-04-08 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1780: -- Labels: sev:critical (was: ) > Refactoring of parts of HoodieMetadataArchiveLog have changed

[jira] [Updated] (HUDI-1780) Refactoring of parts of HoodieMetadataArchiveLog have changed behaviour of Archival

2021-04-08 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1780: -- Priority: Major (was: Minor) > Refactoring of parts of HoodieMetadataArchiveLog have changed

[jira] [Assigned] (HUDI-1780) Refactoring of parts of HoodieMetadataArchiveLog have changed behaviour of Archival

2021-04-08 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-1780: - Assignee: Nishith Agarwal > Refactoring of parts of HoodieMetadataArchiveLog have

[jira] [Created] (HUDI-1780) Refactoring of parts of HoodieMetadataArchiveLog have changed behaviour of Archival

2021-04-08 Thread Jagmeet Bali (Jira)
Jagmeet Bali created HUDI-1780: -- Summary: Refactoring of parts of HoodieMetadataArchiveLog have changed behaviour of Archival Key: HUDI-1780 URL: https://issues.apache.org/jira/browse/HUDI-1780 Project:

[GitHub] [hudi] n3nash commented on pull request #2793: [HUDI-57] Support ORC Storage

2021-04-08 Thread GitBox
n3nash commented on pull request #2793: URL: https://github.com/apache/hudi/pull/2793#issuecomment-816037894 @prashantwason Can you review this?

[GitHub] [hudi] TeRS-K opened a new pull request #2793: [HUDI-57] Support ORC Storage

2021-04-08 Thread GitBox
TeRS-K opened a new pull request #2793: URL: https://github.com/apache/hudi/pull/2793 ## What is the purpose of the pull request This pull request supports ORC storage in hudi. ## Brief change log In two separate commits: - Implemented HoodieOrcWriter - Added

[GitHub] [hudi] codejoyan commented on issue #2592: [SUPPORT] Does latest versions of Hudi (0.7.0, 0.6.0) work with Spark 2.3.0 when reading orc files?

2021-04-08 Thread GitBox
codejoyan commented on issue #2592: URL: https://github.com/apache/hudi/issues/2592#issuecomment-816012992 Please let me know if there are any suggestions to try out.

[GitHub] [hudi] stackfun commented on issue #2692: [SUPPORT] Corrupt Blocks in Google Cloud Storage

2021-04-08 Thread GitBox
stackfun commented on issue #2692: URL: https://github.com/apache/hudi/issues/2692#issuecomment-816013345 I'm using GCP dataproc 1.4, the gcs connector version is 1.9.17. The versions of all the libraries can be found here:

[GitHub] [hudi] codejoyan commented on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-04-08 Thread GitBox
codejoyan commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-816012609 @nsivabalan, any inputs would be very helpful.

[GitHub] [hudi] kimberlyamandalu commented on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-04-08 Thread GitBox
kimberlyamandalu commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-815987341 I have a similar issue where bloom index performance is very slow for upserts into a Hudi MOR table. Does anyone know whether, when Hudi performs an upsert, it only looks up

[GitHub] [hudi] rubenssoto edited a comment on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
rubenssoto edited a comment on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815986157 Hello guys, is this a bug in Hudi 0.8.0? I migrated all my workload to Hudi yesterday using Hudi 0.8.0 and I'm having problems with timestamps.

[GitHub] [hudi] rubenssoto commented on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
rubenssoto commented on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815986157 Hello guys, is this a bug in Hudi 0.8.0? I migrated all my workload to Hudi yesterday using Hudi 0.8.0 and I'm having problems with timestamps.

[GitHub] [hudi] n3nash commented on issue #2791: [SUPPORT]Failed to enable hoodie.metadata.enable

2021-04-08 Thread GitBox
n3nash commented on issue #2791: URL: https://github.com/apache/hudi/issues/2791#issuecomment-815973743 @prashantwason Can you take a look at this?

[GitHub] [hudi] nsivabalan commented on a change in pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
nsivabalan commented on a change in pull request #2792: URL: https://github.com/apache/hudi/pull/2792#discussion_r609883525 ## File path: docs/_docs/0.8.0/1_1_spark_quick_start_guide.md ## @@ -0,0 +1,530 @@ +--- +version: 0.8.0 +title: "Quick-Start Guide" +permalink:

[GitHub] [hudi] n3nash commented on a change in pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
n3nash commented on a change in pull request #2792: URL: https://github.com/apache/hudi/pull/2792#discussion_r609890178 ## File path: docs/_docs/1_1_spark_quick_start_guide.md ## @@ -13,13 +13,17 @@ After each write operation we will also show how to read the data both

[GitHub] [hudi] nsivabalan commented on a change in pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
nsivabalan commented on a change in pull request #2792: URL: https://github.com/apache/hudi/pull/2792#discussion_r609884054 ## File path: docs/_docs/1_1_spark_quick_start_guide.md ## @@ -13,13 +13,17 @@ After each write operation we will also show how to read the data both

[GitHub] [hudi] garyli1019 commented on pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
garyli1019 commented on pull request #2792: URL: https://github.com/apache/hudi/pull/2792#issuecomment-815963047 > Did we change to version 0.8.0 in all places wherever applicable? I see quick start was not updated. This version change was generated by an automated tool. The quick

[GitHub] [hudi] garyli1019 commented on a change in pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
garyli1019 commented on a change in pull request #2792: URL: https://github.com/apache/hudi/pull/2792#discussion_r609875712 ## File path: docs/_docs/0.8.0/1_1_spark_quick_start_guide.md ## @@ -0,0 +1,530 @@ +--- +version: 0.8.0 +title: "Quick-Start Guide" +permalink:

[GitHub] [hudi] nsivabalan commented on a change in pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
nsivabalan commented on a change in pull request #2792: URL: https://github.com/apache/hudi/pull/2792#discussion_r609858054 ## File path: docs/_docs/0.8.0/1_1_spark_quick_start_guide.md ## @@ -0,0 +1,530 @@ +--- +version: 0.8.0 +title: "Quick-Start Guide" +permalink:

[GitHub] [hudi] garyli1019 commented on pull request #2786: Add more options for HUDI Flink

2021-04-08 Thread GitBox
garyli1019 commented on pull request #2786: URL: https://github.com/apache/hudi/pull/2786#issuecomment-815943296 Should we change the 0.8.0 doc as well? It will be merged soon. #2792

[GitHub] [hudi] TeRS-K commented on a change in pull request #2740: [HUDI-1055] Remove hardcoded parquet in tests

2021-04-08 Thread GitBox
TeRS-K commented on a change in pull request #2740: URL: https://github.com/apache/hudi/pull/2740#discussion_r609825554 ## File path: hudi-cli/src/main/scala/org/apache/hudi/cli/SparkHelpers.scala ## @@ -40,7 +40,7 @@ import scala.collection.mutable._ object SparkHelpers {

[GitHub] [hudi] garyli1019 opened a new pull request #2792: [DOCS] Add docs for release 0.8.0

2021-04-08 Thread GitBox
garyli1019 opened a new pull request #2792: URL: https://github.com/apache/hudi/pull/2792 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] nsivabalan merged pull request #2772: [MINOR] Update doap with 0.8.0 release

2021-04-08 Thread GitBox
nsivabalan merged pull request #2772: URL: https://github.com/apache/hudi/pull/2772 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[hudi] branch master updated (5b3608f -> cf3d2e2)

2021-04-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 5b3608f [HUDI-1778] Add setter to CompactionPlanEvent and CompactionCommitEvent to have better SE/DE performance

[GitHub] [hudi] nsivabalan edited a comment on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
nsivabalan edited a comment on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815898964 @li36909 : IIUC, this patch is not about failing a bootstrap or upsert w/ timestamp. We are adding support for timestamp column by upgrading parquet version. If yes,

[GitHub] [hudi] nsivabalan commented on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
nsivabalan commented on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815898964 IIUC, this patch is not about failing a bootstrap or upsert w/ timestamp. We are adding support for timestamp column by upgrading parquet version. If yes, please do fix the

[GitHub] [hudi] nsivabalan commented on pull request #2452: [HUDI-1531] Introduce HoodiePartitionCleaner to delete specific partition

2021-04-08 Thread GitBox
nsivabalan commented on pull request #2452: URL: https://github.com/apache/hudi/pull/2452#issuecomment-815878916 sounds good. yeah, cleaning strategy would be great. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [hudi] nsivabalan commented on pull request #2601: [HUDI-1602] update parquet version from 1.10.1 to 1.11.1

2021-04-08 Thread GitBox
nsivabalan commented on pull request #2601: URL: https://github.com/apache/hudi/pull/2601#issuecomment-815870944 CC @li36909 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [hudi] nsivabalan commented on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
nsivabalan commented on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815868898 @li36909 : can you fix the links in the description? I guess it's cut off. ``` parquet.avro.readInt96AsFixed=true, please check https://github

[GitHub] [hudi] njalan opened a new issue #2791: [SUPPORT]Error when enable hoodie.metadata.enable

2021-04-08 Thread GitBox
njalan opened a new issue #2791: URL: https://github.com/apache/hudi/issues/2791 I am facing a performance issue caused by slow S3 file listing, so I tried to enable hoodie metadata to improve performance. **Environment Description** Hudi version : 0.7 Spark version : 3.0.1

[GitHub] [hudi] vburenin commented on issue #2692: [SUPPORT] Corrupt Blocks in Google Cloud Storage

2021-04-08 Thread GitBox
vburenin commented on issue #2692: URL: https://github.com/apache/hudi/issues/2692#issuecomment-815855516 @n3nash I am not sure the issue is still relevant. It was happening with Hudi 0.5.0-snapshot. The symptoms were that there were duplicate records, or records were not upserted

[GitHub] [hudi] li36909 commented on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
li36909 commented on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-815854810 cc @nsivabalan could you help to take a look, thank you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1779: - Labels: pull-request-available (was: ) > Fail to bootstrap/upsert a table which contains

[GitHub] [hudi] li36909 opened a new pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread GitBox
li36909 opened a new pull request #2790: URL: https://github.com/apache/hudi/pull/2790 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] li36909 commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-04-08 Thread GitBox
li36909 commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-815849870 @nsivabalan @cdmikechen I fixed it and passed the test with a simple change like this: in hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeRecordReaderUtils.java

[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread lrz (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lrz updated HUDI-1779: -- Attachment: upsertFail.png > Fail to bootstrap/upsert a table which contains timestamp column >

[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread lrz (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lrz updated HUDI-1779: -- Attachment: unsupportInt96.png > Fail to bootstrap/upsert a table which contains timestamp column >

[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread lrz (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lrz updated HUDI-1779: -- Attachment: upsertFail2.png > Fail to bootstrap/upsert a table which contains timestamp column >

[jira] [Created] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2021-04-08 Thread lrz (Jira)
lrz created HUDI-1779: - Summary: Fail to bootstrap/upsert a table which contains timestamp column Key: HUDI-1779 URL: https://issues.apache.org/jira/browse/HUDI-1779 Project: Apache Hudi Issue Type:

[GitHub] [hudi] codecov-io edited a comment on pull request #2761: [HUDI-1676] Support SQL with spark3

2021-04-08 Thread GitBox
codecov-io edited a comment on pull request #2761: URL: https://github.com/apache/hudi/pull/2761#issuecomment-812815750 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2761?src=pr=h1) Report > Merging [#2761](https://codecov.io/gh/apache/hudi/pull/2761?src=pr=desc) (e862a1a) into

[GitHub] [hudi] codecov-io edited a comment on pull request #2761: [HUDI-1676] Support SQL with spark3

2021-04-08 Thread GitBox
codecov-io edited a comment on pull request #2761: URL: https://github.com/apache/hudi/pull/2761#issuecomment-812815750 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2761?src=pr=h1) Report > Merging [#2761](https://codecov.io/gh/apache/hudi/pull/2761?src=pr=desc) (e862a1a) into

[GitHub] [hudi] codecov-io edited a comment on pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-08 Thread GitBox
codecov-io edited a comment on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-734137301 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For
