[hudi] branch hudi_test_suite_refactor updated (de1b58a -> 7cc6c55)

2020-07-26 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/hudi.git. discard de1b58a [HUDI-394] Provide a basic implementation of test suite add 7cc6c55 [HUDI-394]

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #351

2020-07-26 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.29 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[GitHub] [hudi] vinothchandar commented on a change in pull request #1859: [HUDI-1072] Use replace metadata file to filter excluded files in views

2020-07-26 Thread GitBox
vinothchandar commented on a change in pull request #1859: URL: https://github.com/apache/hudi/pull/1859#discussion_r460608513 ## File path: hudi-common/src/main/avro/HoodieReplaceMetadata.avsc ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [hudi] vinothchandar commented on pull request #1869: [HUDI-427] Implement CLI support for performing bootstrap

2020-07-26 Thread GitBox
vinothchandar commented on pull request #1869: URL: https://github.com/apache/hudi/pull/1869#issuecomment-664077582 cc @umehrot2 can you please take a pass at this. This is an automated message from the Apache Git Service.

[hudi] branch hudi_test_suite_refactor updated (ea2c616 -> de1b58a)

2020-07-26 Thread nagarwal
nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/hudi.git. discard ea2c616 [HUDI-394] Provide a basic implementation of test suite add de1b58a [HUDI-394]

[GitHub] [hudi] vinothchandar commented on issue #1830: [SUPPORT] Processing time gradually increases while using Spark Streaming

2020-07-26 Thread GitBox
vinothchandar commented on issue #1830: URL: https://github.com/apache/hudi/issues/1830#issuecomment-664069552 @bvaradar any updates from trying this on emr?

[GitHub] [hudi] zuyanton edited a comment on issue #1847: [SUPPORT] querying MoR tables on S3 becomes slow with number of files growing

2020-07-26 Thread GitBox
zuyanton edited a comment on issue #1847: URL: https://github.com/apache/hudi/issues/1847#issuecomment-664064544 @umehrot2 @bschell Looks like CSE is disabled on the cluster; however, I can see that we still specify a CSE key id in the cluster config. Is `fs.s3.cse.enabled` the only flag

[GitHub] [hudi] zuyanton commented on issue #1847: [SUPPORT] querying MoR tables on S3 becomes slow with number of files growing

2020-07-26 Thread GitBox
zuyanton commented on issue #1847: URL: https://github.com/apache/hudi/issues/1847#issuecomment-664064544 Looks like CSE is disabled on the cluster; however, I can see that we still specify a CSE key id in the cluster config. Is `fs.s3.cse.enabled` the only flag that triggers EMR to

[GitHub] [hudi] vinothchandar commented on pull request #1876: [HUDI-242] Support for RFC-12/Bootstrapping of external datasets

2020-07-26 Thread GitBox
vinothchandar commented on pull request #1876: URL: https://github.com/apache/hudi/pull/1876#issuecomment-664060914 Behind on getting the tests to pass again. Working on it

[jira] [Updated] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-07-26 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1050: - Status: In Progress (was: Open) > Support filter pushdown and column pruning for MOR table on

[GitHub] [hudi] garyli1019 commented on pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-26 Thread GitBox
garyli1019 commented on pull request #1848: URL: https://github.com/apache/hudi/pull/1848#issuecomment-664055034 Added support for `PruneFilterScan`. Please review this PR again. Thank you!

[GitHub] [hudi] xushiyan commented on pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-07-26 Thread GitBox
xushiyan commented on pull request #1704: URL: https://github.com/apache/hudi/pull/1704#issuecomment-664024837 Hi @bhasudha, this means too many logs were printed during the test. In an ideal scenario, a test does not output anything unless there is a failure. Could you check if

[GitHub] [hudi] jackwellsxyz commented on issue #857: http://hudi.apache.org/comparison.html# should mention Iceberg and DeltaLake

2020-07-26 Thread GitBox
jackwellsxyz commented on issue #857: URL: https://github.com/apache/hudi/issues/857#issuecomment-663997844 From what I can tell, the Delta Lake file format is open source (https://github.com/delta-io/delta), but many of the optimizations like ZORDER are part of Delta Engine which sits

[GitHub] [hudi] nsivabalan commented on a change in pull request #1868: [HUDI-1083] Optimization in determining insert bucket location for a given key

2020-07-26 Thread GitBox
nsivabalan commented on a change in pull request #1868: URL: https://github.com/apache/hudi/pull/1868#discussion_r460534315 ## File path: hudi-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java ## @@ -272,21 +275,44 @@ public int

[GitHub] [hudi] nsivabalan commented on a change in pull request #1819: [HUDI-1058] Make delete marker configurable

2020-07-26 Thread GitBox
nsivabalan commented on a change in pull request #1819: URL: https://github.com/apache/hudi/pull/1819#discussion_r460415342 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/OverwriteWithLatestAvroPayload.java ## @@ -36,6 +36,8 @@ public class

[GitHub] [hudi] bhasudha commented on pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-07-26 Thread GitBox
bhasudha commented on pull request #1704: URL: https://github.com/apache/hudi/pull/1704#issuecomment-663955621 @xushiyan This PR fails CI because the log output exceeds the limit - https://travis-ci.org/github/apache/hudi/builds/711685577. I see this error message consistently - `The job

[GitHub] [hudi] bhasudha commented on pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-07-26 Thread GitBox
bhasudha commented on pull request #1704: URL: https://github.com/apache/hudi/pull/1704#issuecomment-663954904 > @bhasudha The PR looks good to me. Looks like the same ordering field will be honored in all places. One high level question before I accept it -> If `preCombine` &