[jira] [Assigned] (HUDI-1016) [Minor] Code optimization

2020-06-08 Thread Hong Shen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen reassigned HUDI-1016: --- Assignee: Hong Shen > [Minor] Code optimization > - > > Key:

[jira] [Created] (HUDI-1016) [Minor] Code optimization

2020-06-08 Thread Hong Shen (Jira)
Hong Shen created HUDI-1016: --- Summary: [Minor] Code optimization Key: HUDI-1016 URL: https://issues.apache.org/jira/browse/HUDI-1016 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-1006) deltastreamer use kafkaSource with offset reset strategy: latest can't consume data

2020-06-08 Thread Tianye Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianye Li updated HUDI-1006: Summary: deltastreamer use kafkaSource with offset reset strategy: latest can't consume data (was:

[jira] [Updated] (HUDI-1006) deltastreamer use kafkaSource set auto.offset.reset=latest can't consume data

2020-06-08 Thread Tianye Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianye Li updated HUDI-1006: Summary: deltastreamer use kafkaSource set auto.offset.reset=latest can't consume data (was: deltastreamer

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #303

2020-06-08 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.40 KB...] settings.xml toolchains.xml /home/jenkins/tools/maven/apache-maven-3.5.4/conf/logging: simplelogger.properties

[hudi] branch master updated: HUDI-494 fix incorrect record size estimation

2020-06-08 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 22cd824 HUDI-494 fix incorrect record size

[jira] [Comment Edited] (HUDI-781) Re-design test utilities

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128778#comment-17128778 ] Raymond Xu edited comment on HUDI-781 at 6/9/20, 2:41 AM: -- [~yanghua] [~vinoth]

[jira] [Comment Edited] (HUDI-781) Re-design test utilities

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128778#comment-17128778 ] Raymond Xu edited comment on HUDI-781 at 6/9/20, 2:36 AM: -- [~yanghua] [~vinoth]

[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128778#comment-17128778 ] Raymond Xu commented on HUDI-781: - [~yanghua] [~yanghua] [~nishith29] [~garyli1019] Here is an execution

[hudi] branch release-0.5.3 updated (ed4bcbc -> e0c45f6)

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git. discard ed4bcbc [HUDI-988] Fix More Unit Test Flakiness new e0c45f6 [HUDI-988] Fix More Unit Test Flakiness

[hudi] 01/01: [HUDI-988] Fix More Unit Test Flakiness

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git commit e0c45f62818da1e285781a5a30622f69accde1af Author: garyli1019 AuthorDate: Fri Jun 5 17:25:59 2020 -0700

[jira] [Updated] (HUDI-1007) When earliestOffsets is greater than checkpoint, Hudi will not be able to successfully consume data

2020-06-08 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1007: Status: Open (was: New) > When earliestOffsets is greater than checkpoint, Hudi will not be able to >

[jira] [Commented] (HUDI-914) support different target data clusters

2020-06-08 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128757#comment-17128757 ] liujinhui commented on HUDI-914: Due to the needs of some business parties, they only want the hudi dataset

[jira] [Updated] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-635: Status: Open (was: New) > MergeHandle's DiskBasedMap entries can be thinner >

[jira] [Updated] (HUDI-69) Support realtime view in Spark datasource #136

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-69: --- Status: Patch Available (was: In Progress) > Support realtime view in Spark datasource #136 >

[jira] [Updated] (HUDI-684) Introduce abstraction for writing and reading and compacting from FileGroups

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-684: Status: Patch Available (was: In Progress) > Introduce abstraction for writing and reading and

[jira] [Updated] (HUDI-684) Introduce abstraction for writing and reading and compacting from FileGroups

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-684: Fix Version/s: 0.6.0 > Introduce abstraction for writing and reading and compacting from FileGroups

[jira] [Updated] (HUDI-684) Introduce abstraction for writing and reading and compacting from FileGroups

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-684: Priority: Blocker (was: Major) > Introduce abstraction for writing and reading and compacting from

[jira] [Assigned] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-635: Assignee: sivabalan narayanan (was: Vinoth Chandar) > MergeHandle's DiskBasedMap

[jira] [Updated] (HUDI-242) Support Efficient bootstrap of large parquet datasets to Hudi

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-242: Status: Patch Available (was: In Progress) > Support Efficient bootstrap of large parquet datasets

[jira] [Updated] (HUDI-882) Update documentation with new configs for 0.6.0 release

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-882: Status: New (was: Open) > Update documentation with new configs for 0.6.0 release >

[jira] [Updated] (HUDI-818) Optimize the default value of hoodie.memory.merge.max.size option

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-818: Status: New (was: Open) > Optimize the default value of hoodie.memory.merge.max.size option >

[jira] [Updated] (HUDI-802) AWSDmsTransformer does not handle insert -> delete of a row in a single batch correctly

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-802: Status: New (was: Open) > AWSDmsTransformer does not handle insert -> delete of a row in a single

[jira] [Updated] (HUDI-686) Implement BloomIndexV2 that does not depend on memory caching

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-686: Status: In Progress (was: Open) > Implement BloomIndexV2 that does not depend on memory caching >

[jira] [Updated] (HUDI-289) Implement a test suite to support long running test for Hudi writing and querying end-end

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-289: Status: Patch Available (was: In Progress) > Implement a test suite to support long running test

[jira] [Updated] (HUDI-860) Ability to do small file handling without need for caching

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-860: Status: New (was: Open) > Ability to do small file handling without need for caching >

[jira] [Updated] (HUDI-686) Implement BloomIndexV2 that does not depend on memory caching

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-686: Status: Patch Available (was: In Progress) > Implement BloomIndexV2 that does not depend on memory

[jira] [Updated] (HUDI-855) Run Auto Cleaner in parallel with ingestion

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-855: Status: New (was: Open) > Run Auto Cleaner in parallel with ingestion >

[jira] [Commented] (HUDI-651) Incremental Query on Hive via Spark SQL does not return expected results

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128742#comment-17128742 ] Vinoth Chandar commented on HUDI-651: - [~bhavanisudha] can you please push your draft impl to a

[jira] [Updated] (HUDI-472) Make sortBy() inside bulkInsertInternal() configurable for bulk_insert

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-472: Status: New (was: Open) > Make sortBy() inside bulkInsertInternal() configurable for bulk_insert >

[jira] [Updated] (HUDI-1013) Bulk Insert w/o converting to RDD

2020-06-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1013: -- Summary: Bulk Insert w/o converting to RDD (was: Bulk Insert w/ converting to RDD) >

[jira] [Updated] (HUDI-115) Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-115: Status: Patch Available (was: In Progress) > Enhance OverwriteWithLatestAvroPayload to also respect

[jira] [Updated] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-635: Priority: Blocker (was: Major) > MergeHandle's DiskBasedMap entries can be thinner >

[jira] [Updated] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-635: Fix Version/s: 0.6.0 > MergeHandle's DiskBasedMap entries can be thinner >

[jira] [Created] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-06-08 Thread Vinoth Chandar (Jira)
Vinoth Chandar created HUDI-1015: Summary: Audit all getAllPartitionPaths() calls and keep em out of fast path Key: HUDI-1015 URL: https://issues.apache.org/jira/browse/HUDI-1015 Project: Apache Hudi

[jira] [Updated] (HUDI-1013) Bulk Insert w/ converting to RDD

2020-06-08 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1013: -- Fix Version/s: 0.6.0 > Bulk Insert w/ converting to RDD >

[jira] [Created] (HUDI-1014) Design and Implement upgrade-downgrade infrastructure

2020-06-08 Thread Vinoth Chandar (Jira)
Vinoth Chandar created HUDI-1014: Summary: Design and Implement upgrade-downgrade infrastructure Key: HUDI-1014 URL: https://issues.apache.org/jira/browse/HUDI-1014 Project: Apache Hudi

[jira] [Updated] (HUDI-839) Implement rollbacks using marker files instead of relying on commit metadata

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-839: Status: In Progress (was: Open) > Implement rollbacks using marker files instead of relying on

[jira] [Updated] (HUDI-839) Implement rollbacks using marker files instead of relying on commit metadata

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-839: Status: Open (was: New) > Implement rollbacks using marker files instead of relying on commit

[jira] [Created] (HUDI-1013) Bulk Insert w/ converting to RDD

2020-06-08 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1013: - Summary: Bulk Insert w/ converting to RDD Key: HUDI-1013 URL: https://issues.apache.org/jira/browse/HUDI-1013 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-839) Implement rollbacks using marker files instead of relying on commit metadata

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-839: Priority: Blocker (was: Major) > Implement rollbacks using marker files instead of relying on

[jira] [Updated] (HUDI-305) Presto MOR "_rt" queries only reads base parquet file

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-305: Priority: Blocker (was: Major) > Presto MOR "_rt" queries only reads base parquet file >

[jira] [Updated] (HUDI-860) Ability to do small file handling without need for caching

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-860: Priority: Blocker (was: Major) > Ability to do small file handling without need for caching >

[jira] [Updated] (HUDI-575) Support Async Compaction for spark streaming writes to hudi table

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-575: Priority: Blocker (was: Major) > Support Async Compaction for spark streaming writes to hudi table

[jira] [Assigned] (HUDI-575) Support Async Compaction for spark streaming writes to hudi table

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-575: --- Assignee: Balaji Varadarajan (was: Prasanna Rajaperumal) > Support Async Compaction for

[jira] [Updated] (HUDI-818) Optimize the default value of hoodie.memory.merge.max.size option

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-818: Priority: Blocker (was: Major) > Optimize the default value of hoodie.memory.merge.max.size option

[jira] [Updated] (HUDI-979) AWSDMSPayload delete handling with MOR

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-979: Priority: Blocker (was: Major) > AWSDMSPayload delete handling with MOR >

[jira] [Updated] (HUDI-855) Run Auto Cleaner in parallel with ingestion

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-855: Priority: Blocker (was: Major) > Run Auto Cleaner in parallel with ingestion >

[jira] [Updated] (HUDI-845) Allow parallel writing and move the pending rollback work into cleaner

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-845: Priority: Blocker (was: Major) > Allow parallel writing and move the pending rollback work into

[jira] [Updated] (HUDI-853) Deprecate/Remove Clean_by_versions functionality in Hudi

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-853: Status: Open (was: New) > Deprecate/Remove Clean_by_versions functionality in Hudi >

[jira] [Closed] (HUDI-853) Deprecate/Remove Clean_by_versions functionality in Hudi

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar closed HUDI-853. --- Resolution: Won't Fix > Deprecate/Remove Clean_by_versions functionality in Hudi >

[jira] [Updated] (HUDI-719) Exception during clean phase: Found org.apache.hudi.avro.model.HoodieCleanMetadata, expecting org.apache.hudi.avro.model.HoodieCleanerPlan

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-719: Status: Open (was: New) > Exception during clean phase: Found >

[jira] [Resolved] (HUDI-719) Exception during clean phase: Found org.apache.hudi.avro.model.HoodieCleanMetadata, expecting org.apache.hudi.avro.model.HoodieCleanerPlan

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-719. - Resolution: Fixed > Exception during clean phase: Found >

[jira] [Updated] (HUDI-882) Update documentation with new configs for 0.6.0 release

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-882: Priority: Blocker (was: Major) > Update documentation with new configs for 0.6.0 release >

[jira] [Updated] (HUDI-115) Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-115: Priority: Blocker (was: Major) > Enhance OverwriteWithLatestAvroPayload to also respect ordering

[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-920: Priority: Blocker (was: Major) > Incremental view on MOR table using Spark Datasource >

[jira] [Updated] (HUDI-844) Store Avro schema string as first-level entity in commit metadata

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-844: Status: Open (was: New) > Store Avro schema string as first-level entity in commit metadata >

[jira] [Updated] (HUDI-69) Support realtime view in Spark datasource #136

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-69: --- Priority: Blocker (was: Major) > Support realtime view in Spark datasource #136 >

[jira] [Closed] (HUDI-844) Store Avro schema string as first-level entity in commit metadata

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar closed HUDI-844. --- Resolution: Won't Fix > Store Avro schema string as first-level entity in commit metadata >

[jira] [Updated] (HUDI-472) Make sortBy() inside bulkInsertInternal() configurable for bulk_insert

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-472: Status: Open (was: New) > Make sortBy() inside bulkInsertInternal() configurable for bulk_insert >

[jira] [Updated] (HUDI-472) Make sortBy() inside bulkInsertInternal() configurable for bulk_insert

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-472: Priority: Blocker (was: Major) > Make sortBy() inside bulkInsertInternal() configurable for

[jira] [Assigned] (HUDI-586) Revisit the release guide

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-586: --- Assignee: sivabalan narayanan (was: leesf) > Revisit the release guide >

[jira] [Closed] (HUDI-672) Spark DataSource - Upsert for S3 Hudi dataset with large partitions takes a lot of time in writing

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar closed HUDI-672. --- Resolution: Duplicate > Spark DataSource - Upsert for S3 Hudi dataset with large partitions takes a >

[jira] [Updated] (HUDI-672) Spark DataSource - Upsert for S3 Hudi dataset with large partitions takes a lot of time in writing

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-672: Status: Open (was: New) > Spark DataSource - Upsert for S3 Hudi dataset with large partitions takes

[jira] [Updated] (HUDI-802) AWSDmsTransformer does not handle insert -> delete of a row in a single batch correctly

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-802: Priority: Blocker (was: Major) > AWSDmsTransformer does not handle insert -> delete of a row in a

[jira] [Updated] (HUDI-802) AWSDmsTransformer does not handle insert -> delete of a row in a single batch correctly

2020-06-08 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-802: Status: Open (was: New) > AWSDmsTransformer does not handle insert -> delete of a row in a single

[hudi] branch release-0.5.3 updated (864a7cd -> ed4bcbc)

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git. discard 864a7cd [HUDI-988] Fix More Unit Test Flakiness new ed4bcbc [HUDI-988] Fix More Unit Test Flakiness

[hudi] 01/01: [HUDI-988] Fix More Unit Test Flakiness

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git commit ed4bcbcf54945d52871e954855b7e8d470dfff26 Author: garyli1019 AuthorDate: Fri Jun 5 17:25:59 2020 -0700

[hudi] 03/04: Making few fixes after cherry picking

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git commit 84ca5b0cae72d9b33271045efee93a4cf1a0cff5 Author: Sivabalan Narayanan AuthorDate: Sun Jun 7 16:23:40 2020 -0400

[hudi] 01/04: [HUDI-988] Fix Unit Test Flakiness : Ensure all instantiations of HoodieWriteClient is closed properly. Fix bug in TestRollbacks. Make CLI unit tests for Hudi CLI check skip redering str

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git commit 6dcd0a3524fe7be0bbbd3e673ed7e1d4b035e0cb Author: Balaji Varadarajan AuthorDate: Tue Jun 2 01:49:37 2020 -0700

[hudi] branch release-0.5.3 updated (5fcc461 -> 864a7cd)

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git. omit 5fcc461 Bumping release candidate number 1 new 6dcd0a3 [HUDI-988] Fix Unit Test Flakiness : Ensure

[hudi] 04/04: [HUDI-988] Fix More Unit Test Flakiness

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git commit 864a7cd880cf80aac056aac0658ee94f53b36ac9 Author: garyli1019 AuthorDate: Fri Jun 5 17:25:59 2020 -0700

[hudi] 02/04: [HUDI-990] Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly. Also ensure timeline gets reloaded after we revert committed transactions

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git commit ae48ecbe232eb55267d1a138baeec13baa1fb249 Author: Balaji Varadarajan AuthorDate: Wed Jun 3 00:35:14 2020 -0700

[hudi] branch master updated: HUDI-515 Resolve API conflict for Hive 2 & Hive 3

2020-06-08 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 7d40f19 HUDI-515 Resolve API conflict for Hive

[hudi] branch release-0.5.3 updated (41fb6c2 -> 5fcc461)

2020-06-08 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch release-0.5.3 in repository https://gitbox.apache.org/repos/asf/hudi.git. discard 41fb6c2 Bumping release candidate number 2 discard d3afcba Making few fixes after cherry picking discard

[jira] [Created] (HUDI-1012) add test for snapshot reads

2020-06-08 Thread satish (Jira)
satish created HUDI-1012: Summary: add test for snapshot reads Key: HUDI-1012 URL: https://issues.apache.org/jira/browse/HUDI-1012 Project: Apache Hudi Issue Type: Test Reporter: satish

[jira] [Closed] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-1011. > Refactor hudi-client unit tests structure > - > >

[jira] [Resolved] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu resolved HUDI-1011. -- Resolution: Duplicate > Refactor hudi-client unit tests structure >

[jira] [Commented] (HUDI-996) Use shared spark session provider

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128497#comment-17128497 ] Raymond Xu commented on HUDI-996: - Notes by [~garyli1019] hudi-client unit tests are the most

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1010: - Parent: HUDI-781 Issue Type: Sub-task (was: Bug) > Fix the memory leak for hudi-client unit

[jira] [Updated] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1011: - Status: Open (was: New) > Refactor hudi-client unit tests structure >

[jira] [Updated] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1011: - Component/s: Testing > Refactor hudi-client unit tests structure >

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Status: Open (was: New) > Fix the memory leak for hudi-client unit tests >

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Component/s: Testing > Fix the memory leak for hudi-client unit tests >

[jira] [Updated] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1011: - Labels: help-wanted (was: ) > Refactor hudi-client unit tests structure >

[jira] [Created] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1011: Summary: Refactor hudi-client unit tests structure Key: HUDI-1011 URL: https://issues.apache.org/jira/browse/HUDI-1011 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Description: hudi-client unit test has a memory leak, which could be some resources are not

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Description: hudi-client unit test has a memory leak, which could be some resources are not

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Labels: help-wanted (was: ) > Fix the memory leak for hudi-client unit tests >

[jira] [Created] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1010: Summary: Fix the memory leak for hudi-client unit tests Key: HUDI-1010 URL: https://issues.apache.org/jira/browse/HUDI-1010 Project: Apache Hudi Issue Type:

[jira] [Commented] (HUDI-1009) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128410#comment-17128410 ] liwei commented on HUDI-1009: - https://issues.apache.org/jira/browse/HUDI-1008 repeated > Handle insert for

[jira] [Closed] (HUDI-1009) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei closed HUDI-1009. --- Resolution: Duplicate > Handle insert for recordkey is not unique > - > >

[jira] [Assigned] (HUDI-1009) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei reassigned HUDI-1009: --- Assignee: liwei > Handle insert for recordkey is not unique > - > >

[jira] [Updated] (HUDI-1009) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1009: Status: Open (was: New) > Handle insert for recordkey is not unique > - >

[jira] [Assigned] (HUDI-1008) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei reassigned HUDI-1008: --- Assignee: liwei > Handle insert for recordkey is not unique > - > >

[jira] [Created] (HUDI-1009) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
liwei created HUDI-1009: --- Summary: Handle insert for recordkey is not unique Key: HUDI-1009 URL: https://issues.apache.org/jira/browse/HUDI-1009 Project: Apache Hudi Issue Type: New Feature

[jira] [Created] (HUDI-1008) Handle insert for recordkey is not unique

2020-06-08 Thread liwei (Jira)
liwei created HUDI-1008: --- Summary: Handle insert for recordkey is not unique Key: HUDI-1008 URL: https://issues.apache.org/jira/browse/HUDI-1008 Project: Apache Hudi Issue Type: New Feature

[jira] [Closed] (HUDI-918) Fix kafkaOffsetGen can not read kafka data bug

2020-06-08 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui closed HUDI-918. -- > Fix kafkaOffsetGen can not read kafka data bug > -- > >

[jira] [Resolved] (HUDI-918) Fix kafkaOffsetGen can not read kafka data bug

2020-06-08 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui resolved HUDI-918. Resolution: Fixed > Fix kafkaOffsetGen can not read kafka data bug >

[jira] [Commented] (HUDI-1007) When earliestOffsets is greater than checkpoint, Hudi will not be able to successfully consume data

2020-06-08 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128400#comment-17128400 ] liujinhui commented on HUDI-1007: - Yes, every run will check the offset of the earliest in the offset
