[GitHub] [hudi] yanghua merged pull request #2814: [HUDI-1792] Fix flink-client query error when processing files larger than 128mb

2021-04-15 Thread GitBox
yanghua merged pull request #2814: URL: https://github.com/apache/hudi/pull/2814 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[jira] [Created] (HUDI-1803) Hopefully Hudi will officially support BAIDU AFS storage format

2021-04-15 Thread Xu Guang Lv (Jira)
Xu Guang Lv created HUDI-1803: - Summary: Hopefully Hudi will officially support BAIDU AFS storage format Key: HUDI-1803 URL: https://issues.apache.org/jira/browse/HUDI-1803 Project: Apache Hudi

[GitHub] [hudi] ssdong commented on pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-15 Thread GitBox
ssdong commented on pull request #2784: URL: https://github.com/apache/hudi/pull/2784#issuecomment-820928674 @satishkotha @lw309637554 Just to share some updates, this PR fixed the following 2 issues during archival 1. `Positive number of partitions required` 2.

[GitHub] [hudi] yanghua commented on pull request #2814: [HUDI-1792] Fix flink-client query error when processing files larger than 128mb

2021-04-15 Thread GitBox
yanghua commented on pull request #2814: URL: https://github.com/apache/hudi/pull/2814#issuecomment-820928266 > Of course. > Before fixing the problem: > ![Before fix 1](https://user-images.githubusercontent.com/18521084/114977940-1ac31e80-9ebb-11eb-9634-2d8d389701b3.png) >

[GitHub] [hudi] hj2016 commented on pull request #2814: [HUDI-1792] Fix flink-client query error when processing files larger than 128mb

2021-04-15 Thread GitBox
hj2016 commented on pull request #2814: URL: https://github.com/apache/hudi/pull/2814#issuecomment-820927569 Of course. Before fixing the problem: ![Before fix 1](https://user-images.githubusercontent.com/18521084/114977940-1ac31e80-9ebb-11eb-9634-2d8d389701b3.png)

[GitHub] [hudi] xglv1985 commented on issue #2812: [SUPPORT]Got a parquet related error when incremental querying MOR table, using Spark 2.4

2021-04-15 Thread GitBox
xglv1985 commented on issue #2812: URL: https://github.com/apache/hudi/issues/2812#issuecomment-820914205 > Okay, do you mind re-opening that Spark ticket and asking a question there ? Other options are to try a different Spark build to confirm that this is a spark issue and should

[GitHub] [hudi] codecov-io commented on pull request #2835: [HUDI-1802] Timeline Server Bundle need to include com.esotericsoftware package

2021-04-15 Thread GitBox
codecov-io commented on pull request #2835: URL: https://github.com/apache/hudi/pull/2835#issuecomment-820896412 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2835?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

[jira] [Assigned] (HUDI-1802) Timeline Server Bundle need to include com.esotericsoftware package

2021-04-15 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-1802: -- Assignee: cdmikechen > Timeline Server Bundle need to include com.esotericsoftware package >

[jira] [Closed] (HUDI-1801) FlinkMergeHandle rolling over may miss to rename the latest file handle

2021-04-15 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1801. -- Resolution: Fixed b6d949b48a649acac27d5d9b91677bf2e25e9342 > FlinkMergeHandle rolling over may miss to rename

[hudi] branch master updated (191470d -> b6d949b)

2021-04-15 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 191470d [HUDI-1797] Remove the com.google.guave jar from hudi-flink-bundle to avoid conflicts. (#2828) add

[GitHub] [hudi] yanghua merged pull request #2831: [HUDI-1801] FlinkMergeHandle rolling over may miss to rename the late…

2021-04-15 Thread GitBox
yanghua merged pull request #2831: URL: https://github.com/apache/hudi/pull/2831 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[jira] [Updated] (HUDI-1802) Timeline Server Bundle need to include com.esotericsoftware package

2021-04-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1802: - Labels: pull-request-available (was: ) > Timeline Server Bundle need to include

[GitHub] [hudi] cdmikechen opened a new pull request #2835: [HUDI-1802] Timeline Server Bundle need to include com.esotericsoftware package

2021-04-15 Thread GitBox
cdmikechen opened a new pull request #2835: URL: https://github.com/apache/hudi/pull/2835 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] njalan commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet

2021-04-15 Thread GitBox
njalan commented on issue #2609: URL: https://github.com/apache/hudi/issues/2609#issuecomment-820876527 @tooptoop4 So is there any plan to merge it in prestosql? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[jira] [Created] (HUDI-1802) Timeline Server Bundle need to include com.esotericsoftware package

2021-04-15 Thread cdmikechen (Jira)
cdmikechen created HUDI-1802: Summary: Timeline Server Bundle need to include com.esotericsoftware package Key: HUDI-1802 URL: https://issues.apache.org/jira/browse/HUDI-1802 Project: Apache Hudi

[GitHub] [hudi] garyli1019 commented on issue #2818: [SUPPORT] Exception thrown in incremental query(MOR) and potential change data loss after archiving

2021-04-15 Thread GitBox
garyli1019 commented on issue #2818: URL: https://github.com/apache/hudi/issues/2818#issuecomment-820874107 @ssdong Thanks for reporting the issue. For the `NoSuchElementException`, please feel free to submit a fix. For the incremental pulling from archived commits, do you think we should

[GitHub] [hudi] danny0405 commented on a change in pull request #2831: [HUDI-1801] FlinkMergeHandle rolling over may miss to rename the late…

2021-04-15 Thread GitBox
danny0405 commented on a change in pull request #2831: URL: https://github.com/apache/hudi/pull/2831#discussion_r614527384 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/FlinkMergeHandle.java ## @@ -138,12 +128,12 @@ public void

[GitHub] [hudi] yanghua commented on a change in pull request #2831: [HUDI-1801] FlinkMergeHandle rolling over may miss to rename the late…

2021-04-15 Thread GitBox
yanghua commented on a change in pull request #2831: URL: https://github.com/apache/hudi/pull/2831#discussion_r614520445 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/FlinkMergeHandle.java ## @@ -138,12 +128,12 @@ public void rollOver(Iterator>

[GitHub] [hudi] wk888 opened a new issue #2834: [SUPPORT]org.apache.hudi.exception.TableNotFoundException

2021-04-15 Thread GitBox
wk888 opened a new issue #2834: URL: https://github.com/apache/hudi/issues/2834 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)? - Join the mailing list to engage in conversations and get faster

[GitHub] [hudi] yanghua commented on pull request #2814: [HUDI-1792] Fix flink-client query error when processing files larger than 128mb

2021-04-15 Thread GitBox
yanghua commented on pull request #2814: URL: https://github.com/apache/hudi/pull/2814#issuecomment-820843182 @hj2016 Since it is hard to write a test for this fix, did you test it in your local env? -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [hudi] zhedoubushishi commented on pull request #2833: [WIP][HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo

2021-04-15 Thread GitBox
zhedoubushishi commented on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-820836368 @vinothchandar can you take a look when you have time to see if this is something you want to go with? -- This is an automated message from the Apache Git Service. To

[jira] [Updated] (HUDI-89) Clean up placement, naming, defaults of HoodieWriteConfig

2021-04-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-89?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-89: --- Labels: pull-request-available (was: ) > Clean up placement, naming, defaults of HoodieWriteConfig >

[GitHub] [hudi] zhedoubushishi opened a new pull request #2833: [WIP][HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo

2021-04-15 Thread GitBox
zhedoubushishi opened a new pull request #2833: URL: https://github.com/apache/hudi/pull/2833 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of

[GitHub] [hudi] umehrot2 commented on a change in pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-15 Thread GitBox
umehrot2 commented on a change in pull request #2283: URL: https://github.com/apache/hudi/pull/2283#discussion_r614431975 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala ## @@ -388,7 +399,8 @@ private[hudi] object

[GitHub] [hudi] nevgin opened a new issue #2832: [SUPPORT]

2021-04-15 Thread GitBox
nevgin opened a new issue #2832: URL: https://github.com/apache/hudi/issues/2832 I have installed vanilla versions of hive and spark. Put the jar hoodie spark bundle in the spark. Put hudi-hadoop-mr-bundle-x.y.z-SNAPSHOT.jar in aux hive dir and to classpath hadoop on all datanodes.

[jira] [Commented] (HUDI-57) [UMBRELLA] Support ORC Storage

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17322470#comment-17322470 ] Nishith Agarwal commented on HUDI-57: - [~Teresa] Please create the tickets for the remaining work around

[jira] [Assigned] (HUDI-765) Implement OrcReaderIterator

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-765: Assignee: Teresa Kang (was: Yanjia Gary Li) > Implement OrcReaderIterator >

[jira] [Assigned] (HUDI-57) [UMBRELLA] Support ORC Storage

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-57: --- Assignee: Teresa Kang (was: Mani Jindal) > [UMBRELLA] Support ORC Storage >

[jira] [Assigned] (HUDI-764) Implement HoodieOrcWriter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-764: Assignee: Teresa Kang > Implement HoodieOrcWriter > - > >

[jira] [Assigned] (HUDI-764) Implement HoodieOrcWriter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-764: Assignee: (was: lamber-ken) > Implement HoodieOrcWriter > - > >

[jira] [Assigned] (HUDI-764) Implement HoodieOrcWriter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-764: Assignee: (was: lamber-ken) > Implement HoodieOrcWriter > - > >

[jira] [Updated] (HUDI-1796) allow ExternalSpillMap use accurate payload size rather than estimated

2021-04-15 Thread ZiyueGuan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZiyueGuan updated HUDI-1796: Description: Situation: In ExternalSpillMap, we need to control the amount of data in memory map to avoid

[GitHub] [hudi] vburenin commented on issue #2811: [SUPPORT] How to run hudi on dataproc and write to gcs bucket

2021-04-15 Thread GitBox
vburenin commented on issue #2811: URL: https://github.com/apache/hudi/issues/2811#issuecomment-820530141 It looks like core-site.xml is not visible, since it didn't trigger the gs:// scheme handler. One more thing though: I would recommend upgrading the Google GCS connector to the latest

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614153933 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestDelete.scala ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614152609 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestDelete.scala ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614150697 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala ## @@ -0,0 +1,230 @@ +/* + * Licensed to the

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614142342 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableAsSelectCommand.scala ## @@ -0,0 +1,69 @@

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614142192 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableAsSelectCommand.scala ## @@ -0,0 +1,69 @@

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614141295 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala ## @@ -0,0 +1,318 @@ +/* + * Licensed

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614135849 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/SparkSqlAdapterSupport.scala ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614133690 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/MergeOnReadSnapshotRelation.scala ## @@ -61,8 +62,17 @@ class

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614125884 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePayloadProps.java ## @@ -40,4 +40,12 @@ */ public static final String

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614125206 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java ## @@ -97,4 +86,20 @@ public

[GitHub] [hudi] leesf commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-15 Thread GitBox
leesf commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r614124751 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java ## @@ -97,4 +86,20 @@ public

[GitHub] [hudi] rubenssoto commented on issue #2787: [SUPPORT] Error upserting bucketType UPDATE for partition

2021-04-15 Thread GitBox
rubenssoto commented on issue #2787: URL: https://github.com/apache/hudi/issues/2787#issuecomment-820462928 It's a new Hudi table. It happens intermittently, probably due to some schema mismatch, I think. Is there any way to know exactly where the problem is, or will I have to inspect the new

[GitHub] [hudi] codecov-io commented on pull request #2831: [HUDI-1801] FlinkMergeHandle rolling over may miss to rename the late…

2021-04-15 Thread GitBox
codecov-io commented on pull request #2831: URL: https://github.com/apache/hudi/pull/2831#issuecomment-820383022 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2831?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

[GitHub] [hudi] tmac2100 commented on issue #2806: Spark upsert Hudi performance degrades significantly

2021-04-15 Thread GitBox
tmac2100 commented on issue #2806: URL: https://github.com/apache/hudi/issues/2806#issuecomment-820367824 @n3nash Thank you for your help. I can't transfer pictures because of company information security restrictions. 1) BloomIndex is more efficient when the number of fields is smaller

[GitHub] [hudi] dszakallas edited a comment on issue #1751: [SUPPORT] Hudi not working with Spark 3.0.0

2021-04-15 Thread GitBox
dszakallas edited a comment on issue #1751: URL: https://github.com/apache/hudi/issues/1751#issuecomment-820315925 I resolved the issue by deleting these two exclusions from Spark: https://github.com/apache/spark/blob/v3.0.1/pom.xml#L1692-L1699. After that calcite-core becomes part of the

[GitHub] [hudi] dszakallas edited a comment on issue #1751: [SUPPORT] Hudi not working with Spark 3.0.0

2021-04-15 Thread GitBox
dszakallas edited a comment on issue #1751: URL: https://github.com/apache/hudi/issues/1751#issuecomment-820315925 I resolved the issue by deleting these two exclusions from Spark: https://github.com/apache/spark/blob/v3.0.1/pom.xml#L1692-L1699. After that calcite-core becomes part of the

[jira] [Updated] (HUDI-1801) FlinkMergeHandle rolling over may miss to rename the latest file handle

2021-04-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1801: - Labels: pull-request-available (was: ) > FlinkMergeHandle rolling over may miss to rename the

[GitHub] [hudi] danny0405 opened a new pull request #2831: [HUDI-1801] FlinkMergeHandle rolling over may miss to rename the late…

2021-04-15 Thread GitBox
danny0405 opened a new pull request #2831: URL: https://github.com/apache/hudi/pull/2831 …st file handle The FlinkMergeHandle may rename the (N-1)th file handle instead of the latest one, thus causing data duplication. ## *Tips* - *Thank you very much for contributing to

[jira] [Created] (HUDI-1801) FlinkMergeHandle rolling over may miss to rename the latest file handle

2021-04-15 Thread Danny Chen (Jira)
Danny Chen created HUDI-1801: Summary: FlinkMergeHandle rolling over may miss to rename the latest file handle Key: HUDI-1801 URL: https://issues.apache.org/jira/browse/HUDI-1801 Project: Apache Hudi

[GitHub] [hudi] dszakallas edited a comment on issue #1751: [SUPPORT] Hudi not working with Spark 3.0.0

2021-04-15 Thread GitBox
dszakallas edited a comment on issue #1751: URL: https://github.com/apache/hudi/issues/1751#issuecomment-820315925 I resolved the issue by deleting these two exclusions from Spark: https://github.com/apache/spark/blob/v3.0.1/pom.xml#L1692-L1699. After that calcite-core becomes part of the

[GitHub] [hudi] dszakallas commented on issue #1751: [SUPPORT] Hudi not working with Spark 3.0.0

2021-04-15 Thread GitBox
dszakallas commented on issue #1751: URL: https://github.com/apache/hudi/issues/1751#issuecomment-820315925 I resolved the issue by deleting these two exclusions from Spark: https://github.com/apache/spark/blob/v3.0.1/pom.xml#L1692-L1699 -- This is an automated message from the Apache

[GitHub] [hudi] huzekang commented on issue #2656: HUDI insert operation is working same as upsert

2021-04-15 Thread GitBox
huzekang commented on issue #2656: URL: https://github.com/apache/hudi/issues/2656#issuecomment-820287948 I have the same problem. When I set the operation to insert in Hudi, I expect the result to have 10 records, but there are only 8 records. It behaves just like the upsert operation. ``` val spark =

[GitHub] [hudi] wsxGit opened a new issue #2830: [SUPPORT]same _hoodie_record_key has duplicates data

2021-04-15 Thread GitBox
wsxGit opened a new issue #2830: URL: https://github.com/apache/hudi/issues/2830 config is : `props.put("hoodie.datasource.write.table.type", "COPY_ON_WRITE") props.put(RECORDKEY_FIELD_OPT_KEY, "hudi_uuid") props.put(PRECOMBINE_FIELD_OPT_KEY, "opttime")

[GitHub] [hudi] manishbol opened a new issue #2829: Getting an Exception Property hoodie.deltastreamer.schemaprovider.registry.baseUrl not found

2021-04-15 Thread GitBox
manishbol opened a new issue #2829: URL: https://github.com/apache/hudi/issues/2829 What do the below two properties mean? What can be the possible values of these properties? hoodie.deltastreamer.schemaprovider.registry.baseUrl

[jira] [Commented] (HUDI-1797) Shade google guava for hudi-flink-bundle jar

2021-04-15 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321961#comment-17321961 ] vinoyang commented on HUDI-1797: [~wangminchao] Welcome to Hudi community! I have given you jira

[jira] [Closed] (HUDI-1797) Shade google guava for hudi-flink-bundle jar

2021-04-15 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1797. -- Resolution: Done 191470d1fc9b3596eb4da2413e8bef286ccc7135 > Shade google guava for hudi-flink-bundle jar >

[jira] [Assigned] (HUDI-1797) Shade google guava for hudi-flink-bundle jar

2021-04-15 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-1797: -- Assignee: WangMinChao > Shade google guava for hudi-flink-bundle jar >

[GitHub] [hudi] ssdong commented on issue #2818: [SUPPORT] Exception thrown in incremental query(MOR) and potential change data loss after archiving

2021-04-15 Thread GitBox
ssdong commented on issue #2818: URL: https://github.com/apache/hudi/issues/2818#issuecomment-820180331 @n3nash Thank you for getting back to me. Let me know if you need extra manpower to help fix `MergeOnReadIncrementalRelation`. :) As for the second issue, thank you for providing

[jira] [Updated] (HUDI-1797) Shade google guava for hudi-flink-bundle jar

2021-04-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1797: - Labels: pull-request-available (was: ) > Shade google guava for hudi-flink-bundle jar >

[hudi] branch master updated (6d1aec6 -> 191470d)

2021-04-15 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 6d1aec6 [HUDI-1798] Flink streaming reader should always monitor the delta commits files (#2825) add 191470d

[GitHub] [hudi] yanghua merged pull request #2828: [HUDI-1797] Remove the com.google.guave jar from hudi-flink-bundle to avoid conflicts.

2021-04-15 Thread GitBox
yanghua merged pull request #2828: URL: https://github.com/apache/hudi/pull/2828 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #2796: [HUDI-1783] Support Huawei Cloud Object Storage

2021-04-15 Thread GitBox
xiarixiaoyao commented on a change in pull request #2796: URL: https://github.com/apache/hudi/pull/2796#discussion_r613814357 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/StorageSchemes.java ## @@ -53,7 +53,9 @@ // Databricks file system

[GitHub] [hudi] hudi-bot edited a comment on pull request #2643: DO NOT MERGE (Azure CI) test branch ci

2021-04-15 Thread GitBox
hudi-bot edited a comment on pull request #2643: URL: https://github.com/apache/hudi/pull/2643#issuecomment-792368481 ## CI report: * 9831a6c50e9f49f8a71c02fc6ac50ae1446f7c1f UNKNOWN * a569dbe9409910fbb83b3764b300574c0e52612e Azure:

[jira] [Assigned] (HUDI-1696) artifactSet of maven-shade-plugin has not commons-codec

2021-04-15 Thread Harshit Mittal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshit Mittal reassigned HUDI-1696: Assignee: Harshit Mittal > artifactSet of maven-shade-plugin has not commons-codec >

[jira] [Resolved] (HUDI-1696) artifactSet of maven-shade-plugin has not commons-codec

2021-04-15 Thread Harshit Mittal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshit Mittal resolved HUDI-1696. -- Resolution: Fixed > artifactSet of maven-shade-plugin has not commons-codec >

[GitHub] [hudi] hudi-bot edited a comment on pull request #2643: DO NOT MERGE (Azure CI) test branch ci

2021-04-15 Thread GitBox
hudi-bot edited a comment on pull request #2643: URL: https://github.com/apache/hudi/pull/2643#issuecomment-792368481 ## CI report: * 9831a6c50e9f49f8a71c02fc6ac50ae1446f7c1f UNKNOWN * a569dbe9409910fbb83b3764b300574c0e52612e Azure:

[jira] [Commented] (HUDI-1800) Incorrect HoodieTableFileSystem API usage for pending slices causing issues

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321943#comment-17321943 ] Nishith Agarwal commented on HUDI-1800: --- [~varadarb] [~uditme] Can one of you pick this one up ?

[jira] [Updated] (HUDI-1800) Incorrect HoodieTableFileSystem API usage for pending slices causing issues

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1800: -- Labels: sev:critical (was: ) > Incorrect HoodieTableFileSystem API usage for pending slices

[jira] [Created] (HUDI-1800) Incorrect HoodieTableFileSystem API usage for pending slices causing issues

2021-04-15 Thread Nishith Agarwal (Jira)
Nishith Agarwal created HUDI-1800: - Summary: Incorrect HoodieTableFileSystem API usage for pending slices causing issues Key: HUDI-1800 URL: https://issues.apache.org/jira/browse/HUDI-1800 Project:

[GitHub] [hudi] n3nash commented on issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-04-15 Thread GitBox
n3nash commented on issue #2633: URL: https://github.com/apache/hudi/issues/2633#issuecomment-820160205 I've filed a ticket here -> https://issues.apache.org/jira/browse/HUDI-1800. Let's move the discussion and solution there. -- This is an automated message from the Apache Git Service.

[GitHub] [hudi] n3nash closed issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-04-15 Thread GitBox
n3nash closed issue #2633: URL: https://github.com/apache/hudi/issues/2633 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact

[GitHub] [hudi] aditiwari01 commented on issue #2802: Hive read issues when different partition have different schemas.

2021-04-15 Thread GitBox
aditiwari01 commented on issue #2802: URL: https://github.com/apache/hudi/issues/2802#issuecomment-820157419 Actually, we get a NullPointerException when we try to create readerSchema from the writer schema and projectedFields, because the new column would be present in

[GitHub] [hudi] n3nash commented on issue #2812: [SUPPORT]Got a parquet related error when incremental querying MOR table, using Spark 2.4

2021-04-15 Thread GitBox
n3nash commented on issue #2812: URL: https://github.com/apache/hudi/issues/2812#issuecomment-820156418 Okay, do you mind re-opening that Spark ticket and asking a question there ? Other options are to try a different Spark build to confirm that this is a spark issue and should probably

[GitHub] [hudi] n3nash commented on issue #2811: [SUPPORT] How to run hudi on dataproc and write to gcs bucket

2021-04-15 Thread GitBox
n3nash commented on issue #2811: URL: https://github.com/apache/hudi/issues/2811#issuecomment-820155577 @vburenin Are you able to help out here ? Looks like a GCS related issue which you are more familiar with -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [hudi] n3nash commented on issue #2806: Spark upsert Hudi performance degrades significantly

2021-04-15 Thread GitBox
n3nash commented on issue #2806: URL: https://github.com/apache/hudi/issues/2806#issuecomment-820155041 @tmac2100 Thanks for those details; they are very helpful for understanding your job. I need one more piece of information before I can understand the root cause. Can you please put a screenshot of

[GitHub] [hudi] n3nash commented on issue #2802: Hive read issues when different partition have different schemas.

2021-04-15 Thread GitBox
n3nash commented on issue #2802: URL: https://github.com/apache/hudi/issues/2802#issuecomment-820153027 @aditiwari01 Thanks for that explanation. Can you please add the failure message / exception that you see with the current code ? -- This is an automated message from the Apache Git

[GitHub] [hudi] n3nash edited a comment on issue #2797: [SUPPORT] Can not create a Path from an empty string on unpartitioned table

2021-04-15 Thread GitBox
n3nash edited a comment on issue #2797: URL: https://github.com/apache/hudi/issues/2797#issuecomment-820151588 @ismailsimsek Are you saying it was fixed after you fixed the databasePath / location in your glue metastore to include `/` ? Is the `/` expected always at the end of the path ?

[GitHub] [hudi] n3nash commented on issue #2797: [SUPPORT] Can not create a Path from an empty string on unpartitioned table

2021-04-15 Thread GitBox
n3nash commented on issue #2797: URL: https://github.com/apache/hudi/issues/2797#issuecomment-820151588 @ismailsimsek Are you saying it was fixed after you fixed the databasePath / location in your glue metastore to include `/` ? Is the `/` expected always at the end of the path ? If yes,

[GitHub] [hudi] n3nash commented on issue #2787: [SUPPORT] Error upserting bucketType UPDATE for partition

2021-04-15 Thread GitBox
n3nash commented on issue #2787: URL: https://github.com/apache/hudi/issues/2787#issuecomment-820150010 This does look related to some files having an incompatible schema. @rubenssoto 1) Is this a table bootstrapped into Hudi, or is this a new Hudi table? 2) Is this problem

[jira] [Updated] (HUDI-1799) NoSuchMethod error in prometheus reporter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1799: -- Description: Prometheus reporter throws the following exception  

[jira] [Assigned] (HUDI-1799) NoSuchMethod error in prometheus reporter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-1799: - Assignee: sivabalan narayanan > NoSuchMethod error in prometheus reporter >

[jira] [Updated] (HUDI-1799) NoSuchMethod error in prometheus reporter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1799: -- Description: Prometheus reporter throws the following exception  

[jira] [Updated] (HUDI-1799) NoSuchMethod error in prometheus reporter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1799: -- Labels: sev:high (was: ) > NoSuchMethod error in prometheus reporter >

[jira] [Updated] (HUDI-1799) NoSuchMethod error in prometheus reporter

2021-04-15 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1799: -- Issue Type: Bug (was: Improvement) > NoSuchMethod error in prometheus reporter >

[jira] [Created] (HUDI-1799) NoSuchMethod error in prometheus reporter

2021-04-15 Thread Nishith Agarwal (Jira)
Nishith Agarwal created HUDI-1799: - Summary: NoSuchMethod error in prometheus reporter Key: HUDI-1799 URL: https://issues.apache.org/jira/browse/HUDI-1799 Project: Apache Hudi Issue Type:

[GitHub] [hudi] n3nash commented on issue #2818: [SUPPORT] Exception thrown in incremental query(MOR) and potential change data loss after archiving

2021-04-15 Thread GitBox
n3nash commented on issue #2818: URL: https://github.com/apache/hudi/issues/2818#issuecomment-820146127 @ssdong Thanks for the detailed description of the problem and how to reproduce this. First, it looks like there is an issue in `MergeOnReadIncrementalRelation` that is resulting in