[GitHub] [hudi] vinothsiva1989 opened a new issue #2031: [SUPPORT]

2020-08-24 Thread GitBox
vinothsiva1989 opened a new issue #2031: URL: https://github.com/apache/hudi/issues/2031 I am new to Hudi, please help. A clear and concise description of the problem. **To Reproduce** Steps to reproduce the behavior: step1: spark-shell \ --packages

[jira] [Resolved] (HUDI-1103) Improve the code format of Delete data demo in Quick-Start Guide

2020-08-24 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu resolved HUDI-1103. --- Resolution: Fixed > Improve the code format of Delete data demo in Quick-Start Guide >

[jira] [Commented] (HUDI-1103) Improve the code format of Delete data demo in Quick-Start Guide

2020-08-24 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183721#comment-17183721 ] wangxianghu commented on HUDI-1103: --- fixed via asf-site branch : 

[jira] [Updated] (HUDI-1103) Improve the code format of Delete data demo in Quick-Start Guide

2020-08-24 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1103: -- Status: Open (was: New) > Improve the code format of Delete data demo in Quick-Start Guide >

[jira] [Commented] (HUDI-1214) Need ability to set deltastreamer checkpoints when doing Spark datasource writes

2020-08-24 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183706#comment-17183706 ] Balaji Varadarajan commented on HUDI-1214: -- [~Trevorzhang] : Go for it > Need ability to set

[jira] [Commented] (HUDI-1201) HoodieDeltaStreamer: Allow user overrides to read from earliest kafka offset when commit files do not have checkpoint

2020-08-24 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183707#comment-17183707 ] Balaji Varadarajan commented on HUDI-1201: -- [~Trevorzhang] : Go for it *

[GitHub] [hudi] bvaradar commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

2020-08-24 Thread GitBox
bvaradar commented on issue #1979: URL: https://github.com/apache/hudi/issues/1979#issuecomment-679522323 @hughfdjackson : In general getting incremental read to discard duplicates is not possible for MOR table types as we defer the merging of records to compaction. I was thinking

[GitHub] [hudi] bvaradar commented on issue #2007: [SUPPORT] Is Timeline metadata queryable ?

2020-08-24 Thread GitBox
bvaradar commented on issue #2007: URL: https://github.com/apache/hudi/issues/2007#issuecomment-679497286 @ashishmgofficial : If you use DeltaStreamer, it comes with kafka integration and manages checkpoints internally. So, there is no need to query timeline metadata separately. Do you

[GitHub] [hudi] Trevor-zhang commented on a change in pull request #2021: [HUDI-1218] Introduce BulkInsertSortMode as Independent class

2020-08-24 Thread GitBox
Trevor-zhang commented on a change in pull request #2021: URL: https://github.com/apache/hudi/pull/2021#discussion_r476102080 ## File path: hudi-client/src/main/java/org/apache/hudi/execution/bulkinsert/BulkInsertInternalPartitionerFactory.java ## @@ -39,10 +39,4 @@ public

[GitHub] [hudi] yanghua commented on a change in pull request #2021: [HUDI-1218] Introduce BulkInsertSortMode as Independent class

2020-08-24 Thread GitBox
yanghua commented on a change in pull request #2021: URL: https://github.com/apache/hudi/pull/2021#discussion_r476098066 ## File path: hudi-client/src/main/java/org/apache/hudi/execution/bulkinsert/BulkInsertInternalPartitionerFactory.java ## @@ -39,10 +39,4 @@ public static

[GitHub] [hudi] dm-tran commented on issue #2020: [SUPPORT] Compaction fails with "java.io.FileNotFoundException"

2020-08-24 Thread GitBox
dm-tran commented on issue #2020: URL: https://github.com/apache/hudi/issues/2020#issuecomment-679481632 Thank you for your answer @bvaradar ! > Can you please add the details of "commit showfiles --commit 20200821153748" ```

[GitHub] [hudi] n3nash commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
n3nash commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r476026846 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineUtils.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software

[hudi] branch master updated: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 218d4a6 [HUDI-1135] Make timeline server

[GitHub] [hudi] n3nash merged pull request #2026: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread GitBox
n3nash merged pull request #2026: URL: https://github.com/apache/hudi/pull/2026 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] n3nash merged pull request #2027: [HUDI-1136] Add back findInstantsAfterOrEquals to the HoodieTimeline class.

2020-08-24 Thread GitBox
n3nash merged pull request #2027: URL: https://github.com/apache/hudi/pull/2027 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[hudi] branch master updated: [HUDI-1136] Add back findInstantsAfterOrEquals to the HoodieTimeline class.

2020-08-24 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9b1f16b [HUDI-1136] Add back

[GitHub] [hudi] n3nash commented on pull request #2030: [HUDI-1130] hudi-test-suite support for schema evolution (can be trig…

2020-08-24 Thread GitBox
n3nash commented on pull request #2030: URL: https://github.com/apache/hudi/pull/2030#issuecomment-679429416 @vinothchandar Yes, that PR is following later today by @modi95. We will merge this after that. This is an

[GitHub] [hudi] satishkotha commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
satishkotha commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475962559 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineUtils.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] bhasudha merged pull request #2028: Site update and release page for 0.6.0

2020-08-24 Thread GitBox
bhasudha merged pull request #2028: URL: https://github.com/apache/hudi/pull/2028 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] satishkotha commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
satishkotha commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475961204 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineUtils.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] vinothchandar commented on pull request #2030: [HUDI-1130] hudi-test-suite support for schema evolution (can be trig…

2020-08-24 Thread GitBox
vinothchandar commented on pull request #2030: URL: https://github.com/apache/hudi/pull/2030#issuecomment-679411848 can we first make the test-suite tests work on master and run in CI, before we merge more features? cc @n3nash

[jira] [Updated] (HUDI-1130) Allow for schema evolution within DAG for hudi test suite

2020-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1130: - Labels: pull-request-available (was: ) > Allow for schema evolution within DAG for hudi test

[GitHub] [hudi] nbalajee opened a new pull request #2030: [HUDI-1130] hudi-test-suite support for schema evolution (can be trig…

2020-08-24 Thread GitBox
nbalajee opened a new pull request #2030: URL: https://github.com/apache/hudi/pull/2030 …gered on any insert/upsert DAG node). ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475940809 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieHFileRealtimeInputFormat.java ## @@ -0,0 +1,110 @@ +/* + * Licensed

[GitHub] [hudi] umehrot2 commented on issue #1981: [SUPPORT] Huge performance Difference Between Hudi and Regular Parquet in Athena

2020-08-24 Thread GitBox
umehrot2 commented on issue #1981: URL: https://github.com/apache/hudi/issues/1981#issuecomment-679408763 @rubenssoto yes currently EMR presto is on 0.232, but in upcoming releases you will see later versions of presto where you will be able to use this patch. If you want to

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475940393 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieHFileInputFormat.java ## @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475939558 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475939229 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the

[GitHub] [hudi] s-sanjay commented on issue #1895: HUDI Dataset backed by Hive Metastore fails on Presto with Unknown converted type TIMESTAMP_MICROS

2020-08-24 Thread GitBox
s-sanjay commented on issue #1895: URL: https://github.com/apache/hudi/issues/1895#issuecomment-679406205 @FelixKJose I have raised a [PR](https://github.com/prestodb/presto/pull/15074) This is an automated message from the

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475937184 ## File path: hudi-client/src/main/java/org/apache/hudi/io/storage/HoodieHFileConfig.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475936227 ## File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieSortedMergeHandle.java ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475935700 ## File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieSortedMergeHandle.java ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] bvaradar commented on issue #1980: [SUPPORT] Small files (423KB) generated after running delete query

2020-08-24 Thread GitBox
bvaradar commented on issue #1980: URL: https://github.com/apache/hudi/issues/1980#issuecomment-679403428 Yes, this is expected. We retain the penultimate version of the file to prevent a running query from failing. In this case, you might see only one version of some file which did not

[GitHub] [hudi] bvaradar commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
bvaradar commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475931147 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineUtils.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475932748 ## File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java ## @@ -55,7 +56,7 @@ private long recordsWritten = 0; private

[GitHub] [hudi] prashantwason commented on pull request #2026: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread GitBox
prashantwason commented on pull request #2026: URL: https://github.com/apache/hudi/pull/2026#issuecomment-679401505 @n3nash corrected the errors. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [hudi] prashantwason commented on a change in pull request #2026: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread GitBox
prashantwason commented on a change in pull request #2026: URL: https://github.com/apache/hudi/pull/2026#discussion_r475932300 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java ## @@ -147,7 +153,7 @@ public

[GitHub] [hudi] modi95 commented on a change in pull request #1975: [HUDI-1194][WIP] Reorganize HoodieHiveClient based on the way to call Hive API

2020-08-24 Thread GitBox
modi95 commented on a change in pull request #1975: URL: https://github.com/apache/hudi/pull/1975#discussion_r475932170 ## File path: pom.xml ## @@ -94,7 +94,7 @@ 2.9.9 2.7.3 org.apache.hive -2.3.1 +2.3.6 Review comment: Why are we updating Hive?

[GitHub] [hudi] modi95 commented on a change in pull request #1975: [HUDI-1194][WIP] Reorganize HoodieHiveClient based on the way to call Hive API

2020-08-24 Thread GitBox
modi95 commented on a change in pull request #1975: URL: https://github.com/apache/hudi/pull/1975#discussion_r475930539 ## File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala ## @@ -303,8 +305,28 @@ object DataSourceWriteOptions { val

[GitHub] [hudi] jiegzhan commented on issue #1980: [SUPPORT] Small files (423KB) generated after running delete query

2020-08-24 Thread GitBox
jiegzhan commented on issue #1980: URL: https://github.com/apache/hudi/issues/1980#issuecomment-679398095 @bvaradar, before re-clustering is available, I tested [hoodie.cleaner.commits.retained](https://hudi.apache.org/docs/configurations.html#retainCommits). I set

[GitHub] [hudi] n3nash commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
n3nash commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475925314 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineUtils.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] n3nash commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
n3nash commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475925115 ## File path: hudi-common/src/test/java/org/apache/hudi/common/table/TestTimelineUtils.java ## @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] prashanthvg89 opened a new issue #2029: Records seen with _hoodie_is_deleted set to true on non-existing record

2020-08-24 Thread GitBox
prashanthvg89 opened a new issue #2029: URL: https://github.com/apache/hudi/issues/2029 If a Hudi table, let's say, has zero rows and I issue an upsert with _hoodie_is_deleted = true then the record is still visible when I read the table. It works if the record was already existing but as

[GitHub] [hudi] n3nash commented on a change in pull request #2026: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread GitBox
n3nash commented on a change in pull request #2026: URL: https://github.com/apache/hudi/pull/2026#discussion_r475920521 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java ## @@ -147,7 +153,7 @@ public

[GitHub] [hudi] satishkotha commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
satishkotha commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475920279 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java ## @@ -296,6 +300,42 @@ public boolean

[GitHub] [hudi] n3nash commented on a change in pull request #2026: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread GitBox
n3nash commented on a change in pull request #2026: URL: https://github.com/apache/hudi/pull/2026#discussion_r475920074 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewStorageConfig.java ## @@ -52,7 +54,7 @@ public static final

[GitHub] [hudi] satishkotha commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
satishkotha commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475919445 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java ## @@ -296,6 +300,42 @@ public boolean

[GitHub] [hudi] satishkotha commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
satishkotha commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475919254 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java ## @@ -232,6 +233,12 @@ */ Option

[hudi] branch master updated (ea983ff -> f7e02aa)

2020-08-24 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from ea983ff [HUDI-1137] Add option to configure different path selector add f7e02aa [MINOR] Update DOAP with

[GitHub] [hudi] bhasudha merged pull request #2024: [MINOR] Update DOAP with 0.6.0 Release

2020-08-24 Thread GitBox
bhasudha merged pull request #2024: URL: https://github.com/apache/hudi/pull/2024 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] bhasudha commented on pull request #2016: [WIP] Add release page doc for 0.6.0

2020-08-24 Thread GitBox
bhasudha commented on pull request #2016: URL: https://github.com/apache/hudi/pull/2016#issuecomment-679383433 closing this in favor of https://github.com/apache/hudi/pull/2028 . Capture the comment there. This is an

[GitHub] [hudi] bhasudha closed pull request #2016: [WIP] Add release page doc for 0.6.0

2020-08-24 Thread GitBox
bhasudha closed pull request #2016: URL: https://github.com/apache/hudi/pull/2016 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] bhasudha opened a new pull request #2028: Site update and release page for 0.6.0

2020-08-24 Thread GitBox
bhasudha opened a new pull request #2028: URL: https://github.com/apache/hudi/pull/2028 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[jira] [Updated] (HUDI-1136) Add back findInstantsAfterOrEquals to the HoodieTimeline class

2020-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1136: - Labels: pull-request-available (was: ) > Add back findInstantsAfterOrEquals to the

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475907308 ## File path: hudi-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java ## @@ -45,22 +45,39 @@ import

[GitHub] [hudi] prashantwason opened a new pull request #2027: [HUDI-1136] Add back findInstantsAfterOrEquals to the HoodieTimeline class.

2020-08-24 Thread GitBox
prashantwason opened a new pull request #2027: URL: https://github.com/apache/hudi/pull/2027 ## What is the purpose of the pull request Add an API findInstantsAfterOrEquals to HoodieTimeline. ## Verify this pull request This pull request is a trivial rework / code

[GitHub] [hudi] bvaradar commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
bvaradar commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475895369 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java ## @@ -296,6 +300,42 @@ public boolean

[GitHub] [hudi] satishkotha commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
satishkotha commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475901534 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java ## @@ -232,6 +233,12 @@ */ Option

[jira] [Updated] (HUDI-1135) Make timeline server timeout settings configurable

2020-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1135: - Labels: pull-request-available (was: ) > Make timeline server timeout settings configurable >

[GitHub] [hudi] prashantwason opened a new pull request #2026: [HUDI-1135] Make timeline server timeout settings configurable.

2020-08-24 Thread GitBox
prashantwason opened a new pull request #2026: URL: https://github.com/apache/hudi/pull/2026 ## *Tips* ## What is the purpose of the pull request Make timeline server timeout settings configurable. ## Brief change log Add timeout config settings for timeline server.

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475893386 ## File path: hudi-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java ## @@ -45,22 +45,39 @@ import

[GitHub] [hudi] n3nash commented on a change in pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
n3nash commented on a change in pull request #1964: URL: https://github.com/apache/hudi/pull/1964#discussion_r475892688 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java ## @@ -232,6 +233,12 @@ */ Option

[GitHub] [hudi] tooptoop4 edited a comment on issue #1955: [SUPPORT] DMS partition treated as part of pk

2020-08-24 Thread GitBox
tooptoop4 edited a comment on issue #1955: URL: https://github.com/apache/hudi/issues/1955#issuecomment-679356547 @nsivabalan Perfect, that is what I expect. Perhaps the default should be a global index, or the documentation should be updated? Coming from an RDBMS background, the PK is

[GitHub] [hudi] tooptoop4 commented on issue #1955: [SUPPORT] DMS partition treated as part of pk

2020-08-24 Thread GitBox
tooptoop4 commented on issue #1955: URL: https://github.com/apache/hudi/issues/1955#issuecomment-679356547 Perfect, that is what I expect. Perhaps the default should be a global index, or the documentation should be updated? Coming from an RDBMS background, the PK is unique at table level

[GitHub] [hudi] tooptoop4 commented on issue #1954: [SUPPORT] DMS Caused by: java.lang.IllegalArgumentException: Partition key parts [] does not match with partition values

2020-08-24 Thread GitBox
tooptoop4 commented on issue #1954: URL: https://github.com/apache/hudi/issues/1954#issuecomment-679353671 @bvaradar In each comment I am trying brand-new tables with different spark-submit runs, so I am not changing an existing table. Try to reproduce with

[GitHub] [hudi] nsivabalan commented on issue #1955: [SUPPORT] DMS partition treated as part of pk

2020-08-24 Thread GitBox
nsivabalan commented on issue #1955: URL: https://github.com/apache/hudi/issues/1955#issuecomment-679352339 @tooptoop4 : can you clarify what you mean by this. ``` ie for each version_no,group_company combo, i want to get the latest row by TimeCreated (ie the source-ordering-field)

[GitHub] [hudi] n3nash merged pull request #2023: [HUDI-1137] Add option to configure different path selector

2020-08-24 Thread GitBox
n3nash merged pull request #2023: URL: https://github.com/apache/hudi/pull/2023 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[hudi] branch master updated: [HUDI-1137] Add option to configure different path selector

2020-08-24 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new ea983ff [HUDI-1137] Add option to configure

[GitHub] [hudi] nsivabalan commented on a change in pull request #2016: [WIP] Add release page doc for 0.6.0

2020-08-24 Thread GitBox
nsivabalan commented on a change in pull request #2016: URL: https://github.com/apache/hudi/pull/2016#discussion_r475863568 ## File path: docs/_pages/releases.md ## @@ -5,6 +5,72 @@ layout: releases toc: true last_modified_at: 2020-05-28T08:40:00-07:00 --- +## [Release

[jira] [Updated] (HUDI-1056) Ensure validate_staged_release.sh also runs against released version in release repo

2020-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1056: - Labels: pull-request-available (was: ) > Ensure validate_staged_release.sh also runs against

[GitHub] [hudi] nsivabalan opened a new pull request #2025: [HUDI-1056] Fix release validate script for rc_num and release_type

2020-08-24 Thread GitBox
nsivabalan opened a new pull request #2025: URL: https://github.com/apache/hudi/pull/2025 ## What is the purpose of the pull request Fixing release validate script for making rc_num optional and to introduce release_type(dev/release) ## Brief change log Fixing release

[GitHub] [hudi] bvaradar commented on issue #1962: [SUPPORT] Unable to filter hudi table in hive on partition column

2020-08-24 Thread GitBox
bvaradar commented on issue #1962: URL: https://github.com/apache/hudi/issues/1962#issuecomment-679316776 For the second case, Hive Metastore would be filtering out partitions and only return specific paths. I think there is some inconsistency between the path used in the filesystem and

[GitHub] [hudi] vinothchandar commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-08-24 Thread GitBox
vinothchandar commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r475815871 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieStorageConfig.java ## @@ -81,6 +87,7 @@ public Builder fromProperties(Properties

[hudi] branch asf-site updated: Travis CI build asf-site

2020-08-24 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new e995bd2 Travis CI build asf-site e995bd2 is

[GitHub] [hudi] bvaradar commented on issue #1954: [SUPPORT] DMS Caused by: java.lang.IllegalArgumentException: Partition key parts [] does not match with partition values

2020-08-24 Thread GitBox
bvaradar commented on issue #1954: URL: https://github.com/apache/hudi/issues/1954#issuecomment-679305798 @tooptoop4: IIUC, are you effectively changing a table from non-partitioned to partitioned? The exception you added to the last comment was about a missing file which does not tie

[GitHub] [hudi] jpugliesi edited a comment on issue #2002: [SUPPORT] Inconsistent Commits between CLI and Incremental Query

2020-08-24 Thread GitBox
jpugliesi edited a comment on issue #2002: URL: https://github.com/apache/hudi/issues/2002#issuecomment-679298548 @bvaradar I suspected this may have been the case, but I was not able to find any documentation anywhere that states that a commit tracks the timestamp of when a _specific

[GitHub] [hudi] bhasudha opened a new pull request #2024: [MINOR] Update DOAP with 0.6.0 Release

2020-08-24 Thread GitBox
bhasudha opened a new pull request #2024: URL: https://github.com/apache/hudi/pull/2024 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] jpugliesi commented on issue #2002: [SUPPORT] Inconsistent Commits between CLI and Incremental Query

2020-08-24 Thread GitBox
jpugliesi commented on issue #2002: URL: https://github.com/apache/hudi/issues/2002#issuecomment-679298548 @bvaradar I suspected this may have been the case, but I was not able to find any documentation anywhere that states that a commit tracks the timestamp of when a _specific subset of

[GitHub] [hudi] n3nash commented on pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-24 Thread GitBox
n3nash commented on pull request #1964: URL: https://github.com/apache/hudi/pull/1964#issuecomment-679298263 Could we structure this as a "BaseTableMetaClient", "HoodieTableMetaClient" and "HoodieTableIncrementalMetaClient" ?

[jira] [Updated] (HUDI-424) Implement Hive Query Side Integration for querying tables containing bootstrap file slices

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-424: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Implement Hive Query Side Integration for

[jira] [Updated] (HUDI-769) Write blog about HoodieMultiTableDeltaStreamer in cwiki

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-769: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Write blog about

[jira] [Updated] (HUDI-575) Support Async Compaction for spark streaming writes to hudi table

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-575: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Support Async Compaction for spark

[jira] [Updated] (HUDI-806) Implement support for bootstrapping via Spark datasource API

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-806: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Implement support for bootstrapping via

[jira] [Updated] (HUDI-421) Cleanup bootstrap code and create PR for FileStystemView changes

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-421: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Cleanup bootstrap code and create PR for

[jira] [Updated] (HUDI-425) Implement support for bootstrapping in HoodieDeltaStreamer

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-425: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Implement support for bootstrapping in

[jira] [Updated] (HUDI-1031) Document how to set job scheduling configs for Async compaction

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-1031: Fix Version/s: (was: 0.6.0) 0.6.1 > Document how to set job scheduling

[jira] [Updated] (HUDI-422) Cleanup bootstrap code and create write APIs for supporting bootstrap

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-422: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Cleanup bootstrap code and create write

[jira] [Updated] (HUDI-428) Web documentation for explaining how to bootstrap

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-428: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Web documentation for explaining how to

[jira] [Updated] (HUDI-807) Spark DS Support for incremental queries for bootstrapped tables

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-807: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Spark DS Support for incremental queries

[jira] [Updated] (HUDI-423) Implement upsert functionality for handling updates to these bootstrap file slices

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-423: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Implement upsert functionality for handling

[jira] [Updated] (HUDI-420) Automated end to end Integration Test

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-420: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Automated end to end Integration Test >

[jira] [Updated] (HUDI-418) Bootstrap Index - Implementation

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-418: --- Fix Version/s: (was: 0.6.0) 0.6.1 > Bootstrap Index - Implementation >

[jira] [Closed] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha closed HUDI-114. -- > Allow for clients to overwrite the payload implementation in hoodie.properties >

[jira] [Reopened] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha reopened HUDI-114: > Allow for clients to overwrite the payload implementation in hoodie.properties >

[jira] [Resolved] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha resolved HUDI-114. Resolution: Fixed > Allow for clients to overwrite the payload implementation in hoodie.properties >

[jira] [Reopened] (HUDI-590) Cut a new Doc version 0.5.1 explicitly

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha reopened HUDI-590: > Cut a new Doc version 0.5.1 explicitly > -- > >

[jira] [Resolved] (HUDI-590) Cut a new Doc version 0.5.1 explicitly

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha resolved HUDI-590. Resolution: Fixed > Cut a new Doc version 0.5.1 explicitly > --

[jira] [Closed] (HUDI-624) Split some of the code from PR for HUDI-479

2020-08-24 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha closed HUDI-624. -- > Split some of the code from PR for HUDI-479 > > >
