[GitHub] [hudi] xushiyan commented on pull request #2250: [HUDI-1395] HoodieSnapshotCopier to work correctly on non-partitioned datasets

2020-12-08 Thread GitBox
xushiyan commented on pull request #2250: URL: https://github.com/apache/hudi/pull/2250#issuecomment-741602312 Made https://github.com/apache/hudi/pull/2312 to include changes in Exporter too. We may close this once that one gets merged. Thanks.

[GitHub] [hudi] xushiyan opened a new pull request #2312: [HUDI-1395] Fix partition path using FSUtils

2020-12-08 Thread GitBox
xushiyan opened a new pull request #2312: URL: https://github.com/apache/hudi/pull/2312 Fixed the logic to get partition path in Copier and Exporter utilities. ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is
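
A minimal sketch of the partition-path problem this pull request refers to, written in plain Python and not taken from Hudi's `FSUtils`. The helper names `relative_partition_path` and `target_dir` are hypothetical. The assumption is that a non-partitioned table keeps its data files directly under the base path, so the relative partition path is an empty string and must not be turned into a path on its own.

```python
import posixpath


def relative_partition_path(base_path: str, file_path: str) -> str:
    """Return the partition directory of file_path relative to base_path.

    Hypothetical helper: returns "" when the file sits directly under the
    base path, which is the non-partitioned case.
    """
    parent = posixpath.dirname(file_path)
    rel = posixpath.relpath(parent, base_path)
    # relpath yields "." when both directories are the same
    return "" if rel == "." else rel


def target_dir(output_base: str, partition: str) -> str:
    """Build the output directory without creating a path from an empty string."""
    return posixpath.join(output_base, partition) if partition else output_base


# Non-partitioned table: the data file lives directly under the base path.
print(repr(relative_partition_path("/data/tbl", "/data/tbl/file.parquet")))
# Partitioned table: the partition path is the directory below the base path.
print(relative_partition_path("/data/tbl", "/data/tbl/2020/12/08/file.parquet"))
```

Handling the empty partition explicitly in `target_dir` is the design choice illustrated here; whether the actual fix takes the same shape is not stated in the truncated message above.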

[GitHub] [hudi] nsivabalan edited a comment on pull request #2311: [HUDI-115] Adding OverwriteWithLatestAvroPayloadV1 to honor ordering with combineAndGetUpdateValue

2020-12-08 Thread GitBox
nsivabalan edited a comment on pull request #2311: URL: https://github.com/apache/hudi/pull/2311#issuecomment-741600626 @vinothchandar : can you take a look at this patch. I am yet to find a good name for the new class added (OverwriteWithLatestAvroPayloadV1). Ignoring that for now,

[GitHub] [hudi] nsivabalan commented on pull request #2311: [HUDI-115] Adding OverwriteWithLatestAvroPayloadV1 to honor ordering with combineAndGetUpdateValue

2020-12-08 Thread GitBox
nsivabalan commented on pull request #2311: URL: https://github.com/apache/hudi/pull/2311#issuecomment-741600626 @vinothchandar : can you take a look at this patch. I am yet to find a good name for the class. Ignoring that for now, basically I have introduced a new config called

[GitHub] [hudi] nsivabalan opened a new pull request #2311: [HUDI-115] Adding OverwriteWithLatestAvroPayloadV1 to honor ordering with combineAndGetUpdateValue

2020-12-08 Thread GitBox
nsivabalan opened a new pull request #2311: URL: https://github.com/apache/hudi/pull/2311 ## What is the purpose of the pull request Existing OverwriteWithLatestAvroPayload always chooses incoming record when combineAndGetUpdateValue is called, but there are chances a record is late
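
A minimal sketch of the behaviour described in this pull request, in plain Python and not using Hudi's Java payload API. `Record`, `combine_latest_wins` and `combine_honoring_ordering` are hypothetical names. The tie-breaking rule (an equal ordering value lets the incoming record win) is an assumption made for the example, not something the message states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    ordering_value: int  # hypothetical ordering field, e.g. an event timestamp
    payload: str


def combine_latest_wins(stored: Record, incoming: Record) -> Record:
    """Always choose the incoming record, regardless of its ordering value."""
    return incoming


def combine_honoring_ordering(stored: Record, incoming: Record) -> Record:
    """Keep the stored record when the incoming one is older (arrived late)."""
    if incoming.ordering_value < stored.ordering_value:
        return stored
    return incoming  # assumption: ties go to the incoming record


stored = Record("id-1", ordering_value=10, payload="newer")
late = Record("id-1", ordering_value=5, payload="older, arrived late")

print(combine_latest_wins(stored, late).payload)
print(combine_honoring_ordering(stored, late).payload)
```

With the first function a late-arriving record overwrites newer data already in storage; the second compares the ordering value before deciding which record survives.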

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-08 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r539041396 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] yanghua commented on a change in pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
yanghua commented on a change in pull request #2307: URL: https://github.com/apache/hudi/pull/2307#discussion_r539021999 ## File path: hudi-flink/src/main/java/org/apache/hudi/source/JsonStringToHoodieRecordMapFunction.java ## @@ -65,10 +67,12 @@ public HoodieRecord

[GitHub] [hudi] n3nash commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-08 Thread GitBox
n3nash commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r539016243 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] n3nash commented on pull request #2309: [HUDI-1441] - HoodieAvroUtils - rewrite() is not handling evolution o…

2020-12-08 Thread GitBox
n3nash commented on pull request #2309: URL: https://github.com/apache/hudi/pull/2309#issuecomment-741532620 @nbalajee Can you please explain why we need this? If the latest schema is passed (which is the case for Hudi now), is this still a problem? @bvaradar can you please take a

[GitHub] [hudi] bhushanamk commented on issue #2294: [SUPPORT] java.lang.IllegalArgumentException: Can not create a Path from an empty string on non partitioned COW table

2020-12-08 Thread GitBox
bhushanamk commented on issue #2294: URL: https://github.com/apache/hudi/issues/2294#issuecomment-741524757 @bvaradar , Sure I will check it and let you know thanks This is an automated message from the Apache Git Service.

[jira] [Updated] (HUDI-1444) fix the error when rollback commit that belong to a non partition table

2020-12-08 Thread ann (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ann updated HUDI-1444: -- Status: Open (was: New) > fix the error when rollback commit that belong to a non partition table >

[jira] [Updated] (HUDI-1444) fix the error when rollback commit that belong to a non partition table

2020-12-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1444: - Labels: pull-request-available (was: ) > fix the error when rollback commit that belong to a non

[GitHub] [hudi] Xoln opened a new pull request #2310: [HUDI-1444] fix rollback for emtpy partition table

2020-12-08 Thread GitBox
Xoln opened a new pull request #2310: URL: https://github.com/apache/hudi/pull/2310 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the pull

[GitHub] [hudi] codecov-io edited a comment on pull request #2208: [HUDI-1040] Make Hudi support Spark 3

2020-12-08 Thread GitBox
codecov-io edited a comment on pull request #2208: URL: https://github.com/apache/hudi/pull/2208#issuecomment-718240937 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2208?src=pr=h1) Report > Merging [#2208](https://codecov.io/gh/apache/hudi/pull/2208?src=pr=desc) (d51d392) into

[jira] [Updated] (HUDI-1444) fix the error when rollback commit that belong to a non partition table

2020-12-08 Thread ann (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ann updated HUDI-1444: -- Attachment: rollback-empty-partition-table.diff > fix the error when rollback commit that belong to a non partition

[jira] [Created] (HUDI-1444) fix the error when rollback commit that belong to a non partition table

2020-12-08 Thread ann (Jira)
ann created HUDI-1444: - Summary: fix the error when rollback commit that belong to a non partition table Key: HUDI-1444 URL: https://issues.apache.org/jira/browse/HUDI-1444 Project: Apache Hudi Issue

[GitHub] [hudi] zherenyu831 commented on issue #2285: [SUPPORT] Exception on snapshot query while compaction (hudi 0.6.0)

2020-12-08 Thread GitBox
zherenyu831 commented on issue #2285: URL: https://github.com/apache/hudi/issues/2285#issuecomment-741401166 @bvaradar I deleted parquet-hadoop-bundle-1.6.0.jar and tried again, but the error still happens. Then I replaced all parquet libs with the official ones, but that did not work

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-08 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r538934862 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] shenh062326 commented on pull request #2306: HUDI-1439 Remove scala dependency from hudi-client-common

2020-12-08 Thread GitBox
shenh062326 commented on pull request #2306: URL: https://github.com/apache/hudi/pull/2306#issuecomment-741379363 > LGTM. what do you think about the checkstyle suggestion It's better to add a checkstyle rule to ensure that scala will not be imported. I will try to add one.

[GitHub] [hudi] satishkotha commented on pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-08 Thread GitBox
satishkotha commented on pull request #2263: URL: https://github.com/apache/hudi/pull/2263#issuecomment-741375755 @n3nash created https://issues.apache.org/jira/browse/HUDI-1443 and https://issues.apache.org/jira/browse/HUDI-1442 for performance related measurements

[jira] [Created] (HUDI-1443) Remove record deserialization in RDDCustomColumnsSortPartitioner

2020-12-08 Thread satish (Jira)
satish created HUDI-1443: Summary: Remove record deserialization in RDDCustomColumnsSortPartitioner Key: HUDI-1443 URL: https://issues.apache.org/jira/browse/HUDI-1443 Project: Apache Hudi Issue

[GitHub] [hudi] nsivabalan commented on a change in pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-12-08 Thread GitBox
nsivabalan commented on a change in pull request #1704: URL: https://github.com/apache/hudi/pull/1704#discussion_r538925260 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -113,6 +113,9 @@ public static final String

[jira] [Created] (HUDI-1442) Simplify clustering executor SparkRunClusteringCommitActionExecutor

2020-12-08 Thread satish (Jira)
satish created HUDI-1442: Summary: Simplify clustering executor SparkRunClusteringCommitActionExecutor Key: HUDI-1442 URL: https://issues.apache.org/jira/browse/HUDI-1442 Project: Apache Hudi Issue

[jira] [Commented] (HUDI-1401) Presto use of Metadata Table for file listings

2020-12-08 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246169#comment-17246169 ] Udit Mehrotra commented on HUDI-1401: - [~vinoth] I agree that this is not something that would be

[GitHub] [hudi] n3nash commented on pull request #2168: [HUDI-1331] Adding support for validating entire dataset and long running tests in test suite framework

2020-12-08 Thread GitBox
n3nash commented on pull request #2168: URL: https://github.com/apache/hudi/pull/2168#issuecomment-741101118 @nsivabalan The use-case you described seems to be intentional but the behavior is not correct. If the number of records to update is explicitly asked by the dag, then `Option

[GitHub] [hudi] n3nash commented on a change in pull request #2168: [HUDI-1331] Adding support for validating entire dataset and long running tests in test suite framework

2020-12-08 Thread GitBox
n3nash commented on a change in pull request #2168: URL: https://github.com/apache/hudi/pull/2168#discussion_r538836367 ## File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/DagUtils.java ## @@ -48,6 +48,15 @@ */ public class DagUtils { +

[GitHub] [hudi] n3nash commented on a change in pull request #2168: [HUDI-1331] Adding support for validating entire dataset and long running tests in test suite framework

2020-12-08 Thread GitBox
n3nash commented on a change in pull request #2168: URL: https://github.com/apache/hudi/pull/2168#discussion_r538833962 ## File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateDatasetNode.java ## @@ -0,0 +1,147 @@ +/* + * Licensed to the

[GitHub] [hudi] n3nash commented on a change in pull request #2168: [HUDI-1331] Adding support for validating entire dataset and long running tests in test suite framework

2020-12-08 Thread GitBox
n3nash commented on a change in pull request #2168: URL: https://github.com/apache/hudi/pull/2168#discussion_r538832728 ## File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateDatasetNode.java ## @@ -0,0 +1,147 @@ +/* + * Licensed to the

[GitHub] [hudi] codecov-io edited a comment on pull request #2309: [HUDI-1441] - HoodieAvroUtils - rewrite() is not handling evolution o…

2020-12-08 Thread GitBox
codecov-io edited a comment on pull request #2309: URL: https://github.com/apache/hudi/pull/2309#issuecomment-741038401 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2309?src=pr=h1) Report > Merging [#2309](https://codecov.io/gh/apache/hudi/pull/2309?src=pr=desc) (5ba2447) into

[GitHub] [hudi] codecov-io commented on pull request #2309: [HUDI-1441] - HoodieAvroUtils - rewrite() is not handling evolution o…

2020-12-08 Thread GitBox
codecov-io commented on pull request #2309: URL: https://github.com/apache/hudi/pull/2309#issuecomment-741038401 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2309?src=pr=h1) Report > Merging [#2309](https://codecov.io/gh/apache/hudi/pull/2309?src=pr=desc) (5ba2447) into

[jira] [Assigned] (HUDI-259) Hadoop 3 support for Hudi writing

2020-12-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-259: --- Assignee: (was: Pratyaksh Sharma) > Hadoop 3 support for Hudi writing >

[GitHub] [hudi] vinothchandar commented on pull request #2208: [HUDI-1040] Make Hudi support Spark 3

2020-12-08 Thread GitBox
vinothchandar commented on pull request #2208: URL: https://github.com/apache/hudi/pull/2208#issuecomment-740945829 @zhedoubushishi looks like we need to rebase this again. @umehrot2 please go ahead and merge once you are happy with this.

[jira] [Updated] (HUDI-1441) HoodieAvroUtils - rewrite() is not handling evolution of a nested record field.

2020-12-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1441: - Labels: pull-request-available (was: ) > HoodieAvroUtils - rewrite() is not handling evolution

[GitHub] [hudi] nbalajee opened a new pull request #2309: [HUDI-1441] - HoodieAvroUtils - rewrite() is not handling evolution o…

2020-12-08 Thread GitBox
nbalajee opened a new pull request #2309: URL: https://github.com/apache/hudi/pull/2309 …f a nested record field. ## What is the purpose of the pull request If schema contains nested records, then HoodieAvroUtils rewrite() function copies the record fields as-is, from the

[jira] [Commented] (HUDI-1425) Performance loss with the additional hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write

2020-12-08 Thread Christopher Dedels (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245964#comment-17245964 ] Christopher Dedels commented on HUDI-1425: -- +1 for this issue.  In addition to isEmpty adding

[GitHub] [hudi] codecov-io edited a comment on pull request #2306: HUDI-1439 Remove scala dependency from hudi-client-common

2020-12-08 Thread GitBox
codecov-io edited a comment on pull request #2306: URL: https://github.com/apache/hudi/pull/2306#issuecomment-740515622 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2306?src=pr=h1) Report > Merging [#2306](https://codecov.io/gh/apache/hudi/pull/2306?src=pr=desc) (9d9887c) into

[jira] [Commented] (HUDI-480) Support a querying delete data methond in incremental view

2020-12-08 Thread cdmikechen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245939#comment-17245939 ] cdmikechen commented on HUDI-480: - [~vinoth] [~yanghua] I opened a project in

[hudi] branch master updated: fix typo (#2308)

2020-12-08 Thread garyli
This is an automated email from the ASF dual-hosted git repository. garyli pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 3a91d26 fix typo (#2308) 3a91d26 is described

[GitHub] [hudi] garyli1019 merged pull request #2308: [MINOR] fix typo

2020-12-08 Thread GitBox
garyli1019 merged pull request #2308: URL: https://github.com/apache/hudi/pull/2308

[GitHub] [hudi] garyli1019 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

2020-12-08 Thread GitBox
garyli1019 commented on pull request #2281: URL: https://github.com/apache/hudi/pull/2281#issuecomment-740613311 Hi @wangxianghu @leesf @yanghua , if any of you have time, please help to review this PR, thanks!

[GitHub] [hudi] wangxianghu commented on a change in pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
wangxianghu commented on a change in pull request #2307: URL: https://github.com/apache/hudi/pull/2307#discussion_r538354015 ## File path: hudi-flink/src/main/java/org/apache/hudi/source/JsonStringToHoodieRecordMapFunction.java ## @@ -65,10 +67,12 @@ public HoodieRecord

[GitHub] [hudi] codecov-io commented on pull request #2308: [MINOR] fix typo

2020-12-08 Thread GitBox
codecov-io commented on pull request #2308: URL: https://github.com/apache/hudi/pull/2308#issuecomment-740606111 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2308?src=pr=h1) Report > Merging [#2308](https://codecov.io/gh/apache/hudi/pull/2308?src=pr=desc) (13b9767) into

[hudi] branch asf-site updated: Travis CI build asf-site

2020-12-08 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new f93 Travis CI build asf-site f93 is

[GitHub] [hudi] jshmchenxi opened a new pull request #2308: [MINOR] fix typo

2020-12-08 Thread GitBox
jshmchenxi opened a new pull request #2308: URL: https://github.com/apache/hudi/pull/2308 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] yanghua commented on a change in pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
yanghua commented on a change in pull request #2307: URL: https://github.com/apache/hudi/pull/2307#discussion_r538265716 ## File path: hudi-flink/src/main/java/org/apache/hudi/source/JsonStringToHoodieRecordMapFunction.java ## @@ -65,10 +67,12 @@ public HoodieRecord

[GitHub] [hudi] vinothchandar commented on a change in pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-12-08 Thread GitBox
vinothchandar commented on a change in pull request #1704: URL: https://github.com/apache/hudi/pull/1704#discussion_r538262064 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -113,6 +113,9 @@ public static final String

[GitHub] [hudi] rahultoall commented on issue #2302: Caused by: java.lang.NoSuchMethodError: org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(Lorg/apache/avro/generic/GenericRecord;Ljava/lang/St

2020-12-08 Thread GitBox
rahultoall commented on issue #2302: URL: https://github.com/apache/hudi/issues/2302#issuecomment-740564293 @bvaradar I have added the hudi-hadoop-mr-bundle jar in the Hive Aux path and also configured the same in hive-site.xml as below hive.aux.jars.path

[hudi] branch asf-site updated: [DOCS] Update more blogs, talks to the powered by page (#2305)

2020-12-08 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new c15326f [DOCS] Update more blogs, talks to

[GitHub] [hudi] vinothchandar merged pull request #2305: [DOCS] Update more blogs, talks to the powered by page

2020-12-08 Thread GitBox
vinothchandar merged pull request #2305: URL: https://github.com/apache/hudi/pull/2305

[GitHub] [hudi] trikota-kc edited a comment on issue #2258: [SUPPORT] Unable to query hudi tables in Presto

2020-12-08 Thread GitBox
trikota-kc edited a comment on issue #2258: URL: https://github.com/apache/hudi/issues/2258#issuecomment-740555412 Ok to anyone out there struggling with data type issues that are related to improper column ordering in Presto queries of Deltastreamer output. This is what worked for me:

[GitHub] [hudi] codecov-io edited a comment on pull request #2136: [HUDI-37] Persist the HoodieIndex type in the hoodie.properties file

2020-12-08 Thread GitBox
codecov-io edited a comment on pull request #2136: URL: https://github.com/apache/hudi/pull/2136#issuecomment-729377388 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2136?src=pr=h1) Report > Merging [#2136](https://codecov.io/gh/apache/hudi/pull/2136?src=pr=desc) (6f960c0) into

[GitHub] [hudi] trikota-kc edited a comment on issue #2258: [SUPPORT] Unable to query hudi tables in Presto

2020-12-08 Thread GitBox
trikota-kc edited a comment on issue #2258: URL: https://github.com/apache/hudi/issues/2258#issuecomment-740555412 Ok to anyone out there struggling with data type issues that are related to improper column ordering in Presto. This is what worked for me: - Use emr of the latest

[GitHub] [hudi] trikota-kc commented on issue #2258: [SUPPORT] Unable to query hudi tables in Presto

2020-12-08 Thread GitBox
trikota-kc commented on issue #2258: URL: https://github.com/apache/hudi/issues/2258#issuecomment-740555412 Ok to anyone out there struggling with data type issues that are related to improper column ordering in Presto. This is what worked for me: - Use emr of the latest version with

[GitHub] [hudi] codecov-io edited a comment on pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
codecov-io edited a comment on pull request #2307: URL: https://github.com/apache/hudi/pull/2307#issuecomment-740554761

[GitHub] [hudi] codecov-io commented on pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
codecov-io commented on pull request #2307: URL: https://github.com/apache/hudi/pull/2307#issuecomment-740554761 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2307?src=pr=h1) Report > Merging [#2307](https://codecov.io/gh/apache/hudi/pull/2307?src=pr=desc) (e472ef4) into

[GitHub] [hudi] bithw1 closed issue #2304: [SUPPORT]Hoodie clean operation result explanation

2020-12-08 Thread GitBox
bithw1 closed issue #2304: URL: https://github.com/apache/hudi/issues/2304

[GitHub] [hudi] bithw1 commented on issue #2304: [SUPPORT]Hoodie clean operation result explanation

2020-12-08 Thread GitBox
bithw1 commented on issue #2304: URL: https://github.com/apache/hudi/issues/2304#issuecomment-740544780 Thanks @bvaradar for the great answer. I am closing this.

[GitHub] [hudi] vinothchandar commented on a change in pull request #2306: HUDI-1439 Remove scala dependency from hudi-client-common

2020-12-08 Thread GitBox
vinothchandar commented on a change in pull request #2306: URL: https://github.com/apache/hudi/pull/2306#discussion_r538235679 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/index/simple/SparkHoodieSimpleIndex.java ## @@ -147,6 +150,12 @@ public

[GitHub] [hudi] bvaradar commented on issue #2304: [SUPPORT]Hoodie clean operation result explanation

2020-12-08 Thread GitBox
bvaradar commented on issue #2304: URL: https://github.com/apache/hudi/issues/2304#issuecomment-740518409 Yeah, cleans are supposed to only delete old versions of the files safely. No data loss is expected as you have noticed. The observation regarding hudi_file_name is expected. We

[GitHub] [hudi] codecov-io edited a comment on pull request #2306: HUDI-1439 Remove scala dependency from hudi-client-common

2020-12-08 Thread GitBox
codecov-io edited a comment on pull request #2306: URL: https://github.com/apache/hudi/pull/2306#issuecomment-740515622 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2306?src=pr=h1) Report > Merging [#2306](https://codecov.io/gh/apache/hudi/pull/2306?src=pr=desc) (6cdb46d) into

[GitHub] [hudi] codecov-io commented on pull request #2306: HUDI-1439 Remove scala dependency from hudi-client-common

2020-12-08 Thread GitBox
codecov-io commented on pull request #2306: URL: https://github.com/apache/hudi/pull/2306#issuecomment-740515622 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2306?src=pr=h1) Report > Merging [#2306](https://codecov.io/gh/apache/hudi/pull/2306?src=pr=desc) (6cdb46d) into

[GitHub] [hudi] bvaradar commented on issue #2302: Caused by: java.lang.NoSuchMethodError: org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(Lorg/apache/avro/generic/GenericRecord;Ljava/lang/Stri

2020-12-08 Thread GitBox
bvaradar commented on issue #2302: URL: https://github.com/apache/hudi/issues/2302#issuecomment-740501997 @rahultoall : The hudi-hadoop-mr-bundle should not be included in spark's class path. You can look at

[jira] [Created] (HUDI-1441) HoodieAvroUtils - rewrite() is not handling evolution of a nested record field.

2020-12-08 Thread Balajee Nagasubramaniam (Jira)
Balajee Nagasubramaniam created HUDI-1441: - Summary: HoodieAvroUtils - rewrite() is not handling evolution of a nested record field. Key: HUDI-1441 URL: https://issues.apache.org/jira/browse/HUDI-1441

[jira] [Updated] (HUDI-1440) Allow option to override schema when doing spark.write

2020-12-08 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1440: - Status: Open (was: New) > Allow option to override schema when doing spark.write >

[jira] [Created] (HUDI-1440) Allow option to override schema when doing spark.write

2020-12-08 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1440: Summary: Allow option to override schema when doing spark.write Key: HUDI-1440 URL: https://issues.apache.org/jira/browse/HUDI-1440 Project: Apache Hudi

[GitHub] [hudi] wangxianghu commented on pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
wangxianghu commented on pull request #2307: URL: https://github.com/apache/hudi/pull/2307#issuecomment-740471281 @yanghua please take a look when free

[GitHub] [hudi] wangxianghu opened a new pull request #2307: [MINOR] Throw an exception when keyGenerator initialization failed in…

2020-12-08 Thread GitBox
wangxianghu opened a new pull request #2307: URL: https://github.com/apache/hudi/pull/2307 …stead of logging it ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

[GitHub] [hudi] bvaradar commented on issue #2282: [SUPPORT] Hoodie table not found in path Unable to find a hudi table for the user provided paths.

2020-12-08 Thread GitBox
bvaradar commented on issue #2282: URL: https://github.com/apache/hudi/issues/2282#issuecomment-740456593 @wosow : If this is a plain parquet dataset, you should be reading like spark.read.parquet("hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125/*") and not use hudi