[GitHub] [hudi] lw309637554 commented on a change in pull request #2275: [HUDI-1354] Block updates and replace on file groups in clustering

2020-12-17 Thread GitBox
lw309637554 commented on a change in pull request #2275: URL: https://github.com/apache/hudi/pull/2275#discussion_r541677201 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java ## @@ -103,6 +104,9 @@

[GitHub] [hudi] lw309637554 commented on a change in pull request #2275: [HUDI-1354] Block updates and replace on file groups in clustering

2020-12-17 Thread GitBox
lw309637554 commented on a change in pull request #2275: URL: https://github.com/apache/hudi/pull/2275#discussion_r541675903 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java ## @@ -103,6 +104,9 @@

[GitHub] [hudi] sumihehe opened a new issue #2346: [SUPPORT]The rt view query returns a wrong result with predicate push down.

2020-12-17 Thread GitBox
sumihehe opened a new issue #2346: URL: https://github.com/apache/hudi/issues/2346 Hi all, The rt view query returns a wrong result with predicate push down. This is my query on an rt view of a MOR table: select count(1) from ***mor_rt where platform = "HYLOOP" and

[jira] [Commented] (HUDI-1463) Accomplishments (2019-2020) and Roadmap (2021-2022)

2020-12-17 Thread Mani Jindal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251538#comment-17251538 ] Mani Jindal commented on HUDI-1463: --- Great Idea [~vinoth] ya i would be interested to do that let me

[GitHub] [hudi] kingkongpoon opened a new issue #2345: the option PRECOMBINE_FIELD_OPT_KEY is useless

2020-12-17 Thread GitBox
kingkongpoon opened a new issue #2345: URL: https://github.com/apache/hudi/issues/2345 When I use hudi-0.6.0, I find that the option PRECOMBINE_FIELD_OPT_KEY is useless? I want to use an rt table to update my data by its timestamp (ts) ### Test Data filename a.csv

[GitHub] [hudi] codecov-io edited a comment on pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2263: URL: https://github.com/apache/hudi/pull/2263#issuecomment-730664726 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] codecov-io edited a comment on pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2263: URL: https://github.com/apache/hudi/pull/2263#issuecomment-730664726 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2263?src=pr&el=h1) Report > Merging [#2263](https://codecov.io/gh/apache/hudi/pull/2263?src=pr&el=desc) (039e6a5) into

[GitHub] [hudi] satishkotha commented on pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on pull request #2263: URL: https://github.com/apache/hudi/pull/2263#issuecomment-747852533 > This is great, thanks @satishkotha ! > > I have completed a first pass. Don't have major concerns. May be we can work through these initial comments, as I complete the

[GitHub] [hudi] satishkotha commented on pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on pull request #2263: URL: https://github.com/apache/hudi/pull/2263#issuecomment-747850886 > @satishkotha > > > After clustering all new records have a new commit time. I'm trying to see if it's possible to preserve commit_time from original file to support

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r54517 ## File path: hudi-common/src/main/java/org/apache/hudi/common/util/FileSliceUtils.java ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r54370 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/strategy/LogFileSizeBasedCompactionStrategy.java ## @@

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545554989 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/AbstractBulkInsertHelper.java ## @@ -27,8 +27,21 @@

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545554841 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/strategy/ScheduleClusteringStrategy.java ## @@ -0,0

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545554494 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/strategy/ScheduleClusteringStrategy.java ## @@ -0,0

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545554378 ## File path:

[GitHub] [hudi] umehrot2 commented on a change in pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
umehrot2 commented on a change in pull request #2343: URL: https://github.com/apache/hudi/pull/2343#discussion_r545546646 ## File path: hudi-client/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ## @@ -369,10 +343,56 @@ private void

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545554043 ## File path:

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545553782 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/HoodieMetrics.java ## @@ -48,6 +49,7 @@ private Timer

[GitHub] [hudi] codecov-io edited a comment on pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2311: URL: https://github.com/apache/hudi/pull/2311#issuecomment-741799295 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] codecov-io edited a comment on pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2311: URL: https://github.com/apache/hudi/pull/2311#issuecomment-741799295 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2311?src=pr&el=h1) Report > Merging [#2311](https://codecov.io/gh/apache/hudi/pull/2311?src=pr&el=desc) (d932723) into

[GitHub] [hudi] codecov-io commented on pull request #2344: [HUDI-1470] In the hudi-test-suite, use the latest writer schema, when reading from existing parquet files

2020-12-17 Thread GitBox
codecov-io commented on pull request #2344: URL: https://github.com/apache/hudi/pull/2344#issuecomment-747810064 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2344?src=pr&el=h1) Report > Merging [#2344](https://codecov.io/gh/apache/hudi/pull/2344?src=pr&el=desc) (3649a24) into

[GitHub] [hudi] rmpifer commented on a change in pull request #2342: [RFC-15][HUDI-1325] Merge updates of unsynced instants to metadata table

2020-12-17 Thread GitBox
rmpifer commented on a change in pull request #2342: URL: https://github.com/apache/hudi/pull/2342#discussion_r545511401 ## File path: hudi-client/src/main/java/org/apache/hudi/table/HoodieTable.java ## @@ -635,9 +635,9 @@ public boolean requireSortedRecords() { return

[GitHub] [hudi] umehrot2 merged pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-17 Thread GitBox
umehrot2 merged pull request #2326: URL: https://github.com/apache/hudi/pull/2326 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[hudi] branch rfc-15 updated: Use metadata table for listing in HoodieROTablePathFilter (#2326)

2020-12-17 Thread uditme
This is an automated email from the ASF dual-hosted git repository. uditme pushed a commit to branch rfc-15 in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/rfc-15 by this push: new 3cab54e Use metadata table for listing in

[GitHub] [hudi] umehrot2 commented on a change in pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-17 Thread GitBox
umehrot2 commented on a change in pull request #2326: URL: https://github.com/apache/hudi/pull/2326#discussion_r545510206 ## File path: hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -194,4 +195,42 @@ class TestCOWDataSource extends

[GitHub] [hudi] umehrot2 commented on a change in pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-17 Thread GitBox
umehrot2 commented on a change in pull request #2326: URL: https://github.com/apache/hudi/pull/2326#discussion_r545508178 ## File path: hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -194,4 +195,42 @@ class TestCOWDataSource extends

[GitHub] [hudi] umehrot2 commented on a change in pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-17 Thread GitBox
umehrot2 commented on a change in pull request #2326: URL: https://github.com/apache/hudi/pull/2326#discussion_r545508037 ## File path: hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java ## @@ -128,24 +128,23 @@ public Builder retainCommits(int

[GitHub] [hudi] codecov-io edited a comment on pull request #2168: [HUDI-1331] Adding support for validating entire dataset and long running tests in test suite framework

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2168: URL: https://github.com/apache/hudi/pull/2168#issuecomment-706649760 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2168?src=pr&el=h1) Report > Merging [#2168](https://codecov.io/gh/apache/hudi/pull/2168?src=pr&el=desc) (e6f76b0) into

[jira] [Updated] (HUDI-1470) Hudi-test-suite - DFSHoodieDatasetInputReader.java - Use the latest writer schema, when reading the parquet files.

2020-12-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1470: - Labels: pull-request-available (was: ) > Hudi-test-suite - DFSHoodieDatasetInputReader.java -

[GitHub] [hudi] nbalajee opened a new pull request #2344: [HUDI-1470] In the hudi-test-suite, use the latest writer schema, when reading from existing parquet files

2020-12-17 Thread GitBox
nbalajee opened a new pull request #2344: URL: https://github.com/apache/hudi/pull/2344 ## What is the purpose of the pull request When hudi-test-suite is reading records from the existing parquet files, it is using the reader schema (original schema used to write the parquet file).

[jira] [Created] (HUDI-1470) Hudi-test-suite - DFSHoodieDatasetInputReader.java - Use the latest writer schema, when reading the parquet files.

2020-12-17 Thread Balajee Nagasubramaniam (Jira)
Balajee Nagasubramaniam created HUDI-1470: - Summary: Hudi-test-suite - DFSHoodieDatasetInputReader.java - Use the latest writer schema, when reading the parquet files. Key: HUDI-1470 URL:

[GitHub] [hudi] bvaradar commented on issue #2076: [SUPPORT] load data partition wise

2020-12-17 Thread GitBox
bvaradar commented on issue #2076: URL: https://github.com/apache/hudi/issues/2076#issuecomment-747781069 @AnweshaSen : Can you create a new github issue with full exception stack trace and all the configurations that you are passing.

[jira] [Commented] (HUDI-1399) support clustering operation can run asynchronously

2020-12-17 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251393#comment-17251393 ] Vinoth Chandar commented on HUDI-1399: -- yes starting with 1 and 2 sounds good to me. Followed by 4

[jira] [Commented] (HUDI-1399) support clustering operation can run asynchronously

2020-12-17 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251392#comment-17251392 ] Vinoth Chandar commented on HUDI-1399: -- yes love to get a release out by end of year. There is a lot

[jira] [Commented] (HUDI-1455) Hudi integration with project nessie

2020-12-17 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251390#comment-17251390 ] Vinoth Chandar commented on HUDI-1455: -- Thanks for the information Ryan. Let me process this and come

[hudi] branch master updated (14d5d11 -> 8b5d6f9)

2020-12-17 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 14d5d11 [HUDI-1406] Add date partition based source input selector for Delta streamer (#2264) add 8b5d6f9

[GitHub] [hudi] vinothchandar merged pull request #2322: [HUDI-1437] support more accurate spark JobGroup for better performan tracking

2020-12-17 Thread GitBox
vinothchandar merged pull request #2322: URL: https://github.com/apache/hudi/pull/2322 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [hudi] vinothchandar commented on a change in pull request #2332: [HUDI-1319] Make async operations work with metadata table

2020-12-17 Thread GitBox
vinothchandar commented on a change in pull request #2332: URL: https://github.com/apache/hudi/pull/2332#discussion_r545467367 ## File path: hudi-client/src/main/java/org/apache/hudi/client/HoodieWriteClient.java ## @@ -701,8 +704,6 @@ protected void

[GitHub] [hudi] vinothchandar commented on a change in pull request #2332: [HUDI-1319] Make async operations work with metadata table

2020-12-17 Thread GitBox
vinothchandar commented on a change in pull request #2332: URL: https://github.com/apache/hudi/pull/2332#discussion_r545466691 ## File path: hudi-client/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ## @@ -725,6 +698,13 @@ private synchronized

[GitHub] [hudi] umehrot2 commented on pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
umehrot2 commented on pull request #2343: URL: https://github.com/apache/hudi/pull/2343#issuecomment-747755598 It's much needed. We should have done this for Hudi in general a long time back. This is an automated message from

[GitHub] [hudi] codecov-io edited a comment on pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2343: URL: https://github.com/apache/hudi/pull/2343#issuecomment-747754631 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2343?src=pr&el=h1) Report > Merging [#2343](https://codecov.io/gh/apache/hudi/pull/2343?src=pr&el=desc) (374bbb6) into

[GitHub] [hudi] codecov-io commented on pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
codecov-io commented on pull request #2343: URL: https://github.com/apache/hudi/pull/2343#issuecomment-747754631 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2343?src=pr&el=h1) Report > Merging [#2343](https://codecov.io/gh/apache/hudi/pull/2343?src=pr&el=desc) (374bbb6) into

[jira] [Commented] (HUDI-1463) Accomplishments (2019-2020) and Roadmap (2021-2022)

2020-12-17 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251378#comment-17251378 ] Vinoth Chandar commented on HUDI-1463: -- We can start by outlining the summary of releases this year

[jira] [Commented] (HUDI-1463) Accomplishments (2019-2020) and Roadmap (2021-2022)

2020-12-17 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251379#comment-17251379 ] Vinoth Chandar commented on HUDI-1463: -- wdyt? > Accomplishments (2019-2020) and Roadmap (2021-2022)

[GitHub] [hudi] vinothchandar commented on pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
vinothchandar commented on pull request #2343: URL: https://github.com/apache/hudi/pull/2343#issuecomment-747747412 Good optimization. Will review This is an automated message from the Apache Git Service. To respond to the

[jira] [Updated] (HUDI-1469) Faster initialization for larger datasets

2020-12-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1469: - Labels: pull-request-available (was: ) > Faster initialization for larger datasets >

[GitHub] [hudi] prashantwason commented on pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
prashantwason commented on pull request #2343: URL: https://github.com/apache/hudi/pull/2343#issuecomment-747741409 @vinothchandar Please take a look. This is an automated message from the Apache Git Service. To respond to

[GitHub] [hudi] prashantwason opened a new pull request #2343: [HUDI-1469] Faster initialization of metadata table using parallelized listing.

2020-12-17 Thread GitBox
prashantwason opened a new pull request #2343: URL: https://github.com/apache/hudi/pull/2343 This finds partitions and files in a single scan rather than listing partitions first and then listing each partition. ## Brief change log *(for example:)* - *Modify

[jira] [Created] (HUDI-1469) Faster initialization for larger datasets

2020-12-17 Thread Prashant Wason (Jira)
Prashant Wason created HUDI-1469: Summary: Faster initialization for larger datasets Key: HUDI-1469 URL: https://issues.apache.org/jira/browse/HUDI-1469 Project: Apache Hudi Issue Type:

[jira] [Closed] (HUDI-1305) Prevent log pollution from console metrics logger

2020-12-17 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason closed HUDI-1305. Resolution: Fixed > Prevent log pollution from console metrics logger >

[jira] [Updated] (HUDI-1305) Prevent log pollution from console metrics logger

2020-12-17 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason updated HUDI-1305: - Status: Open (was: New) > Prevent log pollution from console metrics logger >

[jira] [Updated] (HUDI-1303) Some improvements for the HUDI Test Suite

2020-12-17 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason updated HUDI-1303: - Status: Open (was: New) > Some improvements for the HUDI Test Suite >

[jira] [Closed] (HUDI-1303) Some improvements for the HUDI Test Suite

2020-12-17 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason closed HUDI-1303. Resolution: Fixed > Some improvements for the HUDI Test Suite >

[GitHub] [hudi] vinothchandar commented on pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-17 Thread GitBox
vinothchandar commented on pull request #2311: URL: https://github.com/apache/hudi/pull/2311#issuecomment-747703536 @nsivabalan could you check the build? This is an automated message from the Apache Git Service. To respond

[GitHub] [hudi] vinothchandar commented on a change in pull request #2342: [RFC-15][HUDI-1325] Merge updates of unsynced instants to metadata table

2020-12-17 Thread GitBox
vinothchandar commented on a change in pull request #2342: URL: https://github.com/apache/hudi/pull/2342#discussion_r545394203 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataTimelineUtil.java ## @@ -0,0 +1,334 @@ +/* + * Licensed to the

[GitHub] [hudi] AnweshaSen commented on issue #2076: [SUPPORT] load data partition wise

2020-12-17 Thread GitBox
AnweshaSen commented on issue #2076: URL: https://github.com/apache/hudi/issues/2076#issuecomment-747676296 Hi, I am very new to Hudi and I faced similar kind error while dealing with csv. I tried with a simple csv, having a structure like: +---+-+ |age| Name|

[jira] [Created] (HUDI-1468) incremental read support with clustering

2020-12-17 Thread satish (Jira)
satish created HUDI-1468: Summary: incremental read support with clustering Key: HUDI-1468 URL: https://issues.apache.org/jira/browse/HUDI-1468 Project: Apache Hudi Issue Type: Sub-task

[GitHub] [hudi] satishkotha commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r545367096 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java ## @@ -0,0 +1,155 @@ +/* + * Licensed to

[GitHub] [hudi] satishkotha commented on pull request #2254: [HUDI-1350] Support Partition level delete API in HUDI

2020-12-17 Thread GitBox
satishkotha commented on pull request #2254: URL: https://github.com/apache/hudi/pull/2254#issuecomment-747667904 @n3nash @bvaradar Can one of you review as well? This is an automated message from the Apache Git Service. To

[GitHub] [hudi] satishkotha commented on a change in pull request #2254: [HUDI-1350] Support Partition level delete API in HUDI

2020-12-17 Thread GitBox
satishkotha commented on a change in pull request #2254: URL: https://github.com/apache/hudi/pull/2254#discussion_r545364740 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/TestHoodieClientOnCopyOnWriteStorage.java ## @@ -999,6 +999,103 @@

[GitHub] [hudi] pengzhiwei2018 edited a comment on pull request #2283: [HUDI-1415] Incorrect query result for hudi hive table when using spa…

2020-12-17 Thread GitBox
pengzhiwei2018 edited a comment on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-739497762 > @pengzhiwei2018 would you please describe in more details about the issue? Hi @leesf ,Sorry for the late response. I find that when reading a hudi table

[jira] [Updated] (HUDI-1415) Incorrect query result for hudi hive table when using spark sql

2020-12-17 Thread pengzhiwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengzhiwei updated HUDI-1415: - Description: If we update a hudi table twice more, we will get an incorrect query count by spark sql.  

[GitHub] [hudi] brandon-stanley commented on issue #2331: Why does Hudi not support field deletions?

2020-12-17 Thread GitBox
brandon-stanley commented on issue #2331: URL: https://github.com/apache/hudi/issues/2331#issuecomment-747551023 Thanks for your response @prashantwason. Does this mean that the implementation of maintaining schemas within Hudi is more of a _wrapper_ around Avro which has an

[GitHub] [hudi] sbernauer commented on pull request #2316: Add commons-codec to spark and utilities bundle jars

2020-12-17 Thread GitBox
sbernauer commented on pull request #2316: URL: https://github.com/apache/hudi/pull/2316#issuecomment-747528223 I'm using >= 0.6.0 from master branch and Spark 3.0.1. I'm sorry, I can't downgrade to Spark 2.4, but I will try removing the relocation

[GitHub] [hudi] vinothchandar commented on pull request #2342: [RFC-15][HUDI-1325] Merge updates of unsynced instants to metadata table

2020-12-17 Thread GitBox
vinothchandar commented on pull request #2342: URL: https://github.com/apache/hudi/pull/2342#issuecomment-747486716 @rmpifer could you please rebase this against latest `rfc-15` branch. I ll get started with the review in the meantime

[GitHub] [hudi] vinothchandar commented on a change in pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-17 Thread GitBox
vinothchandar commented on a change in pull request #2326: URL: https://github.com/apache/hudi/pull/2326#discussion_r545145067 ## File path: hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java ## @@ -128,24 +128,23 @@ public Builder

[GitHub] [hudi] vinothchandar commented on a change in pull request #2263: [HUDI-1075] Implement simple clustering strategies to create and run ClusteringPlan

2020-12-17 Thread GitBox
vinothchandar commented on a change in pull request #2263: URL: https://github.com/apache/hudi/pull/2263#discussion_r544859334 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java ## @@ -0,0 +1,165 @@ +/* + * Licensed

[hudi] branch master updated: [HUDI-1406] Add date partition based source input selector for Delta streamer (#2264)

2020-12-17 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 14d5d11 [HUDI-1406] Add date partition based

[GitHub] [hudi] vinothchandar merged pull request #2264: [HUDI-1406] Add date partition based source input selector for DeltaStreamer

2020-12-17 Thread GitBox
vinothchandar merged pull request #2264: URL: https://github.com/apache/hudi/pull/2264 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[jira] [Created] (HUDI-1467) Promote Powered by chapter to top level menu

2020-12-17 Thread wangxianghu (Jira)
wangxianghu created HUDI-1467: - Summary: Promote Powered by chapter to top level menu Key: HUDI-1467 URL: https://issues.apache.org/jira/browse/HUDI-1467 Project: Apache Hudi Issue Type:

[GitHub] [hudi] garyli1019 commented on a change in pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2020-12-17 Thread GitBox
garyli1019 commented on a change in pull request #2296: URL: https://github.com/apache/hudi/pull/2296#discussion_r544993950 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -320,4 +320,21 @@ class

[GitHub] [hudi] Karl-WangSK commented on pull request #2260: [HUDI-1381] Schedule compaction based on time elapsed

2020-12-17 Thread GitBox
Karl-WangSK commented on pull request #2260: URL: https://github.com/apache/hudi/pull/2260#issuecomment-747348473 @vinothchandar This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [hudi] Karl-WangSK removed a comment on pull request #2260: [HUDI-1381] Schedule compaction based on time elapsed

2020-12-17 Thread GitBox
Karl-WangSK removed a comment on pull request #2260: URL: https://github.com/apache/hudi/pull/2260#issuecomment-730111282 cc @bvaradar @yanghua This is an automated message from the Apache Git Service. To respond to the

[GitHub] [hudi] codecov-io edited a comment on pull request #2260: [HUDI-1381] Schedule compaction based on time elapsed

2020-12-17 Thread GitBox
codecov-io edited a comment on pull request #2260: URL: https://github.com/apache/hudi/pull/2260#issuecomment-729530724 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2260?src=pr&el=h1) Report > Merging [#2260](https://codecov.io/gh/apache/hudi/pull/2260?src=pr&el=desc) (0750f24) into

[jira] [Created] (HUDI-1466) Migrate CI/CD from travis to Azure pipeline

2020-12-17 Thread wangxianghu (Jira)
wangxianghu created HUDI-1466: - Summary: Migrate CI/CD from travis to Azure pipeline Key: HUDI-1466 URL: https://issues.apache.org/jira/browse/HUDI-1466 Project: Apache Hudi Issue Type: New

[hudi] branch master updated: [MINOR] Make QuickstartUtil generate random timestamp instead of 0 (#2340)

2020-12-17 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 4ddfc61 [MINOR] Make QuickstartUtil generate

[GitHub] [hudi] yanghua merged pull request #2340: [MINOR] Make QuickstartUtil generate random timestamp instead of 0

2020-12-17 Thread GitBox
yanghua merged pull request #2340: URL: https://github.com/apache/hudi/pull/2340 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

2020-12-17 Thread GitBox
garyli1019 commented on a change in pull request #2281: URL: https://github.com/apache/hudi/pull/2281#discussion_r544952178 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java ## @@ -0,0 +1,102 @@ +/* + * Licensed to

[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

2020-12-17 Thread GitBox
liujinhui1994 commented on a change in pull request #2281: URL: https://github.com/apache/hudi/pull/2281#discussion_r544948270 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java ## @@ -0,0 +1,102 @@ +/* + * Licensed

[GitHub] [hudi] wangxianghu commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

2020-12-17 Thread GitBox
wangxianghu commented on a change in pull request #2281: URL: https://github.com/apache/hudi/pull/2281#discussion_r544888525 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java ## @@ -0,0 +1,102 @@ +/* + * Licensed to

[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

2020-12-17 Thread GitBox
garyli1019 commented on a change in pull request #2281: URL: https://github.com/apache/hudi/pull/2281#discussion_r544885347 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java ## @@ -0,0 +1,102 @@ +/* + * Licensed to