[jira] [Closed] (HUDI-7703) Clean plan does not need to include partitions with no files to delete

2024-05-05 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-7703.

Resolution: Fixed

> Clean plan does not need to include partitions with no files to delete
> --
>
> Key: HUDI-7703
> URL: https://issues.apache.org/jira/browse/HUDI-7703
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: table-service
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
> Attachments: Screenshot 2024-04-10 at 2.59.57 PM.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (HUDI-6110) Hudi DOAP file error

2024-05-03 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-6110.

Fix Version/s: (was: 0.15.0)
   Resolution: Not A Problem

> Hudi DOAP file error
> 
>
> Key: HUDI-6110
> URL: https://issues.apache.org/jira/browse/HUDI-6110
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Claude Warren
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: documentation
>
> The DOAP file [1] as listed in [2] has the error:
> [line: 44, col: 16] \{E201} Multiple children of property element
> [1] 
> https://gitbox.apache.org/repos/asf?p=hudi.git;a=blob_plain;f=doap_HUDI.rdf;hb=HEAD
> [2] 
> https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml





[jira] [Closed] (HUDI-7362) Athena does not support s3a partition scheme anymore leading to missing data

2024-05-03 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-7362.

Resolution: Fixed

>  Athena does not support s3a partition scheme anymore leading to missing data
> -
>
> Key: HUDI-7362
> URL: https://issues.apache.org/jira/browse/HUDI-7362
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: nicolas paris
>Assignee: nicolas paris
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> see https://github.com/apache/hudi/issues/10595





[jira] [Assigned] (HUDI-7362) Athena does not support s3a partition scheme anymore leading to missing data

2024-05-03 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-7362:


Assignee: nicolas paris

>  Athena does not support s3a partition scheme anymore leading to missing data
> -
>
> Key: HUDI-7362
> URL: https://issues.apache.org/jira/browse/HUDI-7362
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: nicolas paris
>Assignee: nicolas paris
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> see https://github.com/apache/hudi/issues/10595





[jira] [Updated] (HUDI-7362) Athena does not support s3a partition scheme anymore leading to missing data

2024-05-03 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7362:
-
Priority: Blocker  (was: Major)

>  Athena does not support s3a partition scheme anymore leading to missing data
> -
>
> Key: HUDI-7362
> URL: https://issues.apache.org/jira/browse/HUDI-7362
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: nicolas paris
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> see https://github.com/apache/hudi/issues/10595





[jira] [Updated] (HUDI-7362) Athena does not support s3a partition scheme anymore leading to missing data

2024-05-03 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7362:
-
Fix Version/s: 0.15.0

>  Athena does not support s3a partition scheme anymore leading to missing data
> -
>
> Key: HUDI-7362
> URL: https://issues.apache.org/jira/browse/HUDI-7362
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: nicolas paris
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> see https://github.com/apache/hudi/issues/10595





[jira] [Commented] (HUDI-6110) Hudi DOAP file error

2024-05-03 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843127#comment-17843127
 ] 

Raymond Xu commented on HUDI-6110:
--

[~claude] can you point out exactly what the problem is with the DOAP file? The 
error message is unclear to me.

> Hudi DOAP file error
> 
>
> Key: HUDI-6110
> URL: https://issues.apache.org/jira/browse/HUDI-6110
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Claude Warren
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: documentation
> Fix For: 0.15.0
>
>
> The DOAP file [1] as listed in [2] has the error:
> [line: 44, col: 16] \{E201} Multiple children of property element
> [1] 
> https://gitbox.apache.org/repos/asf?p=hudi.git;a=blob_plain;f=doap_HUDI.rdf;hb=HEAD
> [2] 
> https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml





[jira] [Updated] (HUDI-7708) Support cleaning archived commits

2024-05-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7708:
-
Description: 
Currently, archived commits in 0.x can take up a lot of storage space, and users 
are not sure whether deleting them manually would cause any issues. We should 
develop a mechanism to help optimize the storage used by archived commits.

 

related issues

[https://github.com/apache/hudi/issues/7246]
[https://github.com/apache/hudi/issues/7734]

> Support cleaning archived commits
> -
>
> Key: HUDI-7708
> URL: https://issues.apache.org/jira/browse/HUDI-7708
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, table-service
>Reporter: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> Currently, archived commits in 0.x can take up a lot of storage space, and 
> users are not sure whether deleting them manually would cause any issues. We 
> should develop a mechanism to help optimize the storage used by archived commits.
>  
> related issues
> [https://github.com/apache/hudi/issues/7246]
> [https://github.com/apache/hudi/issues/7734]





[jira] [Created] (HUDI-7708) Support cleaning archived commits

2024-05-02 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-7708:


 Summary: Support cleaning archived commits
 Key: HUDI-7708
 URL: https://issues.apache.org/jira/browse/HUDI-7708
 Project: Apache Hudi
  Issue Type: Improvement
  Components: archiving, table-service
Reporter: Raymond Xu
 Fix For: 0.15.0








[jira] [Assigned] (HUDI-7703) Clean plan does not need to include partitions with no files to delete

2024-05-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-7703:


Assignee: Raymond Xu

> Clean plan does not need to include partitions with no files to delete
> --
>
> Key: HUDI-7703
> URL: https://issues.apache.org/jira/browse/HUDI-7703
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: table-service
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
> Attachments: Screenshot 2024-04-10 at 2.59.57 PM.png
>
>






[jira] [Created] (HUDI-7703) Clean plan does not need to include partitions with no files to delete

2024-05-01 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-7703:


 Summary: Clean plan does not need to include partitions with no 
files to delete
 Key: HUDI-7703
 URL: https://issues.apache.org/jira/browse/HUDI-7703
 Project: Apache Hudi
  Issue Type: Improvement
  Components: table-service
Reporter: Raymond Xu
 Fix For: 0.15.0, 1.0.0
 Attachments: Screenshot 2024-04-10 at 2.59.57 PM.png







[jira] [Closed] (HUDI-3939) Website Contributing code to the project (newbie JIRAs) links wrong.

2024-04-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-3939.

Resolution: Fixed

https://github.com/apache/hudi/pull/11087

> Website Contributing code to the project (newbie JIRAs) links wrong.
> 
>
> Key: HUDI-3939
> URL: https://issues.apache.org/jira/browse/HUDI-3939
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: CaoYu
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>
> [https://hudi.apache.org/contribute/how-to-contribute]
>  * Contributing code to the project ([newbie 
> JIRAs|https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie])
> newbie JIRAs: 
> https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie
>  
> The newbie JIRAs link does not return the expected results. Following it 
> from the official Hudi website shows:
> h2. No issues were found to match your search
>  
>  





[jira] [Comment Edited] (HUDI-3939) Website Contributing code to the project (newbie JIRAs) links wrong.

2024-04-24 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840609#comment-17840609
 ] 

Raymond Xu edited comment on HUDI-3939 at 4/25/24 1:12 AM:
---

[https://github.com/apache/hudi/pull/11087]

https://github.com/apache/hudi/pull/11088


was (Author: xushiyan):
https://github.com/apache/hudi/pull/11087

> Website Contributing code to the project (newbie JIRAs) links wrong.
> 
>
> Key: HUDI-3939
> URL: https://issues.apache.org/jira/browse/HUDI-3939
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: CaoYu
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>
> [https://hudi.apache.org/contribute/how-to-contribute]
>  * Contributing code to the project ([newbie 
> JIRAs|https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie])
> newbie JIRAs: 
> https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie
>  
> The newbie JIRAs link does not return the expected results. Following it 
> from the official Hudi website shows:
> h2. No issues were found to match your search
>  
>  





[jira] [Closed] (HUDI-5180) Get Involved on the website has broken links

2024-04-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-5180.

Resolution: Fixed

> Get Involved on the website has broken links
> 
>
> Key: HUDI-5180
> URL: https://issues.apache.org/jira/browse/HUDI-5180
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Jonathan Vexler
>Assignee: Raymond Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> The page [https://hudi.apache.org/community/get-involved] has two links 
> labeled "#here" that link to 
> [https://hudi.apache.org/community/get-involved#accounts],
> which doesn't exist anymore. I'm not exactly sure what it's supposed to link 
> to because the file got moved, so git history isn't helpful.





[jira] [Assigned] (HUDI-6330) Update user document to introduce this feature

2024-04-08 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-6330:


Assignee: Jing Zhang

> Update user document to introduce this feature
> --
>
> Key: HUDI-6330
> URL: https://issues.apache.org/jira/browse/HUDI-6330
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: docs, flink
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>






[jira] [Commented] (HUDI-6330) Update user document to introduce this feature

2024-04-08 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834990#comment-17834990
 ] 

Raymond Xu commented on HUDI-6330:
--

[~jingzhang] thanks and merged!

> Update user document to introduce this feature
> --
>
> Key: HUDI-6330
> URL: https://issues.apache.org/jira/browse/HUDI-6330
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: docs, flink
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>






[jira] [Closed] (HUDI-6330) Update user document to introduce this feature

2024-04-08 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-6330.

Resolution: Fixed

> Update user document to introduce this feature
> --
>
> Key: HUDI-6330
> URL: https://issues.apache.org/jira/browse/HUDI-6330
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: docs, flink
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>






[jira] [Updated] (HUDI-6331) Update user doc of partial update for MERGE INTO

2024-04-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-6331:
-
Fix Version/s: 0.15.0

> Update user doc of partial update for MERGE INTO
> 
>
> Key: HUDI-6331
> URL: https://issues.apache.org/jira/browse/HUDI-6331
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: docs
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>






[jira] [Closed] (HUDI-6331) Update user doc of partial update for MERGE INTO

2024-04-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-6331.

Resolution: Fixed

> Update user doc of partial update for MERGE INTO
> 
>
> Key: HUDI-6331
> URL: https://issues.apache.org/jira/browse/HUDI-6331
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: docs
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (HUDI-3431) Certify Hudi against Spark3 Hive3 Hadoop3

2024-04-01 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3431:
-
Fix Version/s: 0.15.0

> Certify Hudi against Spark3 Hive3 Hadoop3
> -
>
> Key: HUDI-3431
> URL: https://issues.apache.org/jira/browse/HUDI-3431
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: dependencies
>Reporter: Raymond Xu
>Assignee: Rahil Chertara
>Priority: Blocker
> Fix For: 0.15.0
>
>






[jira] [Closed] (HUDI-5724) Test MOR table w/ global index w/ update partition path to true

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-5724.

Fix Version/s: 0.14.0
   (was: 1.1.0)
   Resolution: Fixed

> Test MOR table w/ global index w/ update partition path to true
> ---
>
> Key: HUDI-5724
> URL: https://issues.apache.org/jira/browse/HUDI-5724
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.14.0
>
>
> Use a global index along with a MOR table. 
> Set update partition path = true. 
> After the commit goes through, from the base file standpoint, a record could 
> belong to two different file groups. Until compaction kicks in, I am not sure 
> how subsequent upserts work. We need to test this flow. 
>  





[jira] [Updated] (HUDI-1210) Update doc to clarify that start timestamp is exclusive for incremental queries

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1210:
-
Fix Version/s: (was: 0.14.2)

> Update doc to clarify that start timestamp is exclusive for incremental 
> queries
> ---
>
> Key: HUDI-1210
> URL: https://issues.apache.org/jira/browse/HUDI-1210
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: Balaji Varadarajan
>Assignee: Raymond Xu
>Priority: Major
>  Labels: user-support-issues
> Fix For: 0.15.0
>
>
> [https://github.com/apache/hudi/issues/1973#issuecomment-675087028]
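
As a rough illustration of the exclusive start semantics named in the issue title, the sketch below filters a list of commit times so that the start timestamp itself is left out. The class and method names are hypothetical and not a Hudi API, and treating commit times as fixed-width timestamp strings that compare lexicographically is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; this is not a Hudi API.
final class IncrementalRangeSketch {

    // Returns the commit times strictly greater than beginExclusive.
    // Assumes fixed-width timestamp strings, so string order matches time order.
    static List<String> commitsAfter(List<String> commitTimes, String beginExclusive) {
        List<String> result = new ArrayList<>();
        for (String commitTime : commitTimes) {
            if (commitTime.compareTo(beginExclusive) > 0) {
                result.add(commitTime);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> commits =
            Arrays.asList("20240101000000", "20240102000000", "20240103000000");
        // The commit equal to the start timestamp is not part of the result.
        System.out.println(commitsAfter(commits, "20240102000000"));
    }
}
```

To also read the commit made at a given instant, a caller under these semantics would pass a start timestamp earlier than that instant.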





[jira] [Updated] (HUDI-3390) Update cleaner blog with KEEP_LATEST_BY_HOURS policy

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3390:
-
Fix Version/s: (was: 0.14.2)

> Update cleaner blog with KEEP_LATEST_BY_HOURS policy
> 
>
> Key: HUDI-3390
> URL: https://issues.apache.org/jira/browse/HUDI-3390
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: Pratyaksh Sharma
>Assignee: Pratyaksh Sharma
>Priority: Major
> Fix For: 0.15.0
>
>
> Add the new policy introduced in PR (https://github.com/apache/hudi/pull/3646) 
> to the cleaner blog - 
> https://hudi.apache.org/blog/2021/06/10/employing-right-configurations-for-hudi-cleaner





[jira] [Updated] (HUDI-1306) Write documentation/blog about SchemaProvider and subclasses

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1306:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Write documentation/blog about SchemaProvider and subclasses
> 
>
> Key: HUDI-1306
> URL: https://issues.apache.org/jira/browse/HUDI-1306
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: documentation, user-support-issues
> Fix For: 0.15.0
>
>
> To introduce usages of built-in subclasses of 
> org.apache.hudi.utilities.schema.SchemaProvider to help new users adopt 
> DeltaStreamer faster.





[jira] [Updated] (HUDI-3939) Website Contributing code to the project (newbie JIRAs) links wrong.

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3939:
-
Fix Version/s: (was: 0.14.2)

> Website Contributing code to the project (newbie JIRAs) links wrong.
> 
>
> Key: HUDI-3939
> URL: https://issues.apache.org/jira/browse/HUDI-3939
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: CaoYu
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>
> [https://hudi.apache.org/contribute/how-to-contribute]
>  * Contributing code to the project ([newbie 
> JIRAs|https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie])
> newbie JIRAs: 
> https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie
>  
> The newbie JIRAs link does not return the expected results. Following it 
> from the official Hudi website shows:
> h2. No issues were found to match your search
>  
>  





[jira] [Updated] (HUDI-7238) Ensure ExternalSpillableMaps are properly closed

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7238:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Ensure ExternalSpillableMaps are properly closed 
> -
>
> Key: HUDI-7238
> URL: https://issues.apache.org/jira/browse/HUDI-7238
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Timothy Brown
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> There are a few places where ExternalSpillableMap is used but its close 
> method is not called. There are also cases where we create the underlying 
> BitMap even when we have no need for it yet.
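
The two points above (always closing the map, and not allocating the backing store until it is needed) can be sketched as follows. `SpillableMapSketch` is a hypothetical stand-in written for this example, not Hudi's ExternalSpillableMap, and a plain in-memory map stands in for the disk-backed structure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; this is not Hudi's ExternalSpillableMap.
final class SpillableMapSketch implements AutoCloseable {
    private final Map<String, String> inMemory = new HashMap<>();
    private final int maxInMemoryEntries;
    private Map<String, String> spilled; // stands in for a disk-backed store, created lazily
    private boolean closed = false;

    SpillableMapSketch(int maxInMemoryEntries) {
        this.maxInMemoryEntries = maxInMemoryEntries;
    }

    void put(String key, String value) {
        if (spilled != null && spilled.containsKey(key)) {
            spilled.put(key, value); // key already lives in the spill store
        } else if (inMemory.containsKey(key) || inMemory.size() < maxInMemoryEntries) {
            inMemory.put(key, value);
        } else {
            if (spilled == null) {
                spilled = new HashMap<>(); // allocate only when the first spill happens
            }
            spilled.put(key, value);
        }
    }

    int size() {
        return inMemory.size() + (spilled == null ? 0 : spilled.size());
    }

    boolean spillAllocated() {
        return spilled != null;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        inMemory.clear();
        if (spilled != null) {
            spilled.clear(); // a real implementation would release files here
        }
        closed = true;
    }
}

final class SpillableMapUsage {
    // try-with-resources guarantees close() runs even if the body throws.
    static int countDistinct(SpillableMapSketch map, String... keys) {
        try (SpillableMapSketch m = map) {
            for (String key : keys) {
                m.put(key, "value-" + key);
            }
            return m.size();
        }
    }

    public static void main(String[] args) {
        SpillableMapSketch map = new SpillableMapSketch(2);
        System.out.println(countDistinct(map, "a", "b", "a"));
        System.out.println(map.isClosed());
    }
}
```

The sketch relies on try-with-resources so that the caller cannot forget the close call; whether the actual fix uses that construct or explicit close calls is not stated in the issue.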





[jira] [Updated] (HUDI-4045) DynamoDB billing_mode property is incorrectly documented

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4045:
-
Fix Version/s: (was: 0.14.2)

> DynamoDB billing_mode property is incorrectly documented
> 
>
> Key: HUDI-4045
> URL: https://issues.apache.org/jira/browse/HUDI-4045
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Atharva Inamdar
>Assignee: Raymond Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
> Attachments: image-2022-05-05-12-13-35-294.png
>
>
> Documented feature: 
> [https://hudi.apache.org/docs/next/configurations/#hoodiewritelockdynamodbbilling_mode]
> This specifies that the `billing_mode` property is optional with a default 
> value. However, when running Hudi 0.10.1, the implementation throws an error 
> if the property is not specified in the application.
> !image-2022-05-05-12-13-35-294.png|width=973,height=125!
>  
> The second document, which is incomplete, is this guide: 
> [https://hudi.apache.org/docs/concurrency_control/#enabling-multi-writing] . 
> It is missing `billing_mode` and `endpoint_url`.
> It would be good to update the documentation so that it is correct.
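
A minimal sketch of setting the two properties explicitly, as the report suggests is needed in practice. The full key names are assumed from the `hoodie.write.lock.dynamodb.*` pattern in the linked configuration page and should be checked against the Hudi version in use; the values are placeholders, and `missingKeys` is a hypothetical pre-flight helper, not part of Hudi.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Illustrative only: key names are assumed from the hoodie.write.lock.dynamodb.*
// pattern in the linked docs and should be verified for the Hudi version in use.
final class DynamoDbLockPropsSketch {
    static final String BILLING_MODE = "hoodie.write.lock.dynamodb.billing_mode";
    static final String ENDPOINT_URL = "hoodie.write.lock.dynamodb.endpoint_url";

    // Builds writer properties with both settings given explicitly (placeholder values).
    static Properties lockProperties() {
        Properties props = new Properties();
        props.setProperty(BILLING_MODE, "PAY_PER_REQUEST");
        props.setProperty(ENDPOINT_URL, "dynamodb.us-east-1.amazonaws.com");
        return props;
    }

    // Hypothetical pre-flight check: lists which of the two keys are absent.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : new String[] {BILLING_MODE, ENDPOINT_URL}) {
            if (props.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(missingKeys(lockProperties()));
        System.out.println(missingKeys(new Properties()));
    }
}
```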





[jira] [Updated] (HUDI-6110) Hudi DOAP file error

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-6110:
-
Fix Version/s: (was: 0.14.2)

> Hudi DOAP file error
> 
>
> Key: HUDI-6110
> URL: https://issues.apache.org/jira/browse/HUDI-6110
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Claude Warren
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: documentation
> Fix For: 0.15.0
>
>
> The DOAP file [1] as listed in [2] has the error:
> [line: 44, col: 16] \{E201} Multiple children of property element
> [1] 
> https://gitbox.apache.org/repos/asf?p=hudi.git;a=blob_plain;f=doap_HUDI.rdf;hb=HEAD
> [2] 
> https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml





[jira] [Updated] (HUDI-7308) LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7308:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> LockManager::unlock should not call updateLockHeldTimerMetrics if 
> lockDurationTimer has not been started
> 
>
> Key: HUDI-7308
> URL: https://issues.apache.org/jira/browse/HUDI-7308
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: metrics, multi-writer
>Affects Versions: 1.0.0-beta1
>Reporter: Krishen Bhan
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> If an exception is thrown in 
> org.apache.hudi.client.transaction.lock.LockManager#lock, it is possible for 
> lockDurationTimer in HoodieLockMetrics to be closed by the user before it is 
> started, which throws and bubbles up a `HoodieException("Timer was not 
> started")` exception (rather than the actual exception that occurred when 
> trying to acquire the lock). Specifically, this can happen in the following 
> scenario:
>  # The BaseHoodieTableServiceClient calls 
> `org.apache.hudi.client.BaseHoodieTableServiceClient#completeClustering` , 
> which in turn calls 
> `org.apache.hudi.client.transaction.TransactionManager#beginTransaction` 
> within a `try` block
>  # During 
> `org.apache.hudi.client.transaction.TransactionManager#beginTransaction` the 
> LockManager lock API 
> `org.apache.hudi.client.transaction.lock.LockManager#lock` is called
>  # Inside `org.apache.hudi.client.transaction.lock.LockManager#lock`, the 
> `java.util.concurrent.locks.Lock#tryLock(long, 
> java.util.concurrent.TimeUnit)` call throws some exception. Because of this 
> exception, the statement
> `metrics.updateLockAcquiredMetric();` is not executed. This means that the 
> `org.apache.hudi.common.util.HoodieTimer#startTimer` method was never called 
> for the HoodieLockMetrics timer member variable 
> `org.apache.hudi.client.transaction.lock.metrics.HoodieLockMetrics#lockDurationTimer`
>  
>  # The exception in (3) bubbles up back to 
> `org.apache.hudi.client.BaseHoodieTableServiceClient#completeClustering`. 
> Since this is in a `try` block, the `catch` and `finally` blocks are 
> executed. When `finally` is executed though, 
> `org.apache.hudi.client.transaction.TransactionManager#endTransaction` is 
> called
>  # During 
> `org.apache.hudi.client.transaction.TransactionManager#endTransaction` the 
> LockManager unlock API 
> `org.apache.hudi.client.transaction.lock.LockManager#unlock` is called. 
> During the execution of `metrics.updateLockHeldTimerMetrics();`, the 
> method `org.apache.hudi.common.util.HoodieTimer#endTimer` is called for 
> `org.apache.hudi.client.transaction.lock.metrics.HoodieLockMetrics#lockDurationTimer`. 
> This throws a `HoodieException("Timer was not started")` exception because 
> the corresponding 
> `org.apache.hudi.common.util.HoodieTimer#startTimer` method was never called.
> The issue here is that the caller (`BaseHoodieTableServiceClient` in this 
> case) should have ended up re-throwing the exception thrown in (3) while 
> trying to start the transaction in 
> `org.apache.hudi.client.transaction.TransactionManager#beginTransaction`. 
> Instead, because the caller safely "cleaned up" by calling 
> `org.apache.hudi.client.transaction.TransactionManager#endTransaction` (in a 
> `finally`), the `HoodieException("Timer was not started")` exception was 
> raised, suppressing the exception from (3), which is the actual root 
> cause. The call to 
> `org.apache.hudi.client.transaction.TransactionManager#endTransaction` should 
> have completed without throwing this additional exception, which would have 
> led the caller to throw the exception in (3) before exiting.
> Although resolving this would not prevent the overall operation from failing, 
> it would provide better observability of the actual root cause exception (the 
> one from (3)). 
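
The scenario above can be sketched in isolation. The classes below are hypothetical stand-ins, not Hudi's LockManager, HoodieTimer, or HoodieLockMetrics, and the guard shown (ending the timer only if it was started) is one possible way to avoid masking the root cause, not necessarily the fix that was merged.

```java
import java.util.Optional;

// Hypothetical stand-ins for illustration; these are not Hudi classes.
final class SketchTimer {
    private Long startNanos; // null until start() is called

    void start() {
        startNanos = System.nanoTime();
    }

    // Returns the elapsed time if the timer was started; otherwise empty. Never throws.
    Optional<Long> tryEnd() {
        if (startNanos == null) {
            return Optional.empty();
        }
        long elapsed = System.nanoTime() - startNanos;
        startNanos = null;
        return Optional.of(elapsed);
    }
}

final class SketchLockManager {
    private final SketchTimer lockDurationTimer = new SketchTimer();
    private final boolean failOnAcquire;
    int heldMetricUpdates = 0;

    SketchLockManager(boolean failOnAcquire) {
        this.failOnAcquire = failOnAcquire;
    }

    void lock() {
        if (failOnAcquire) {
            // Corresponds to step 3: acquisition fails before the timer is started.
            throw new IllegalStateException("could not acquire lock");
        }
        lockDurationTimer.start();
    }

    void unlock() {
        // Report the lock-held duration only when the timer was actually started.
        lockDurationTimer.tryEnd().ifPresent(nanos -> heldMetricUpdates++);
    }
}

final class LockTimerSketch {
    // Mirrors the try/finally shape in steps 1 to 5: unlock always runs.
    static void runTransaction(SketchLockManager manager) {
        try {
            manager.lock();
        } finally {
            manager.unlock();
        }
    }

    public static void main(String[] args) {
        try {
            runTransaction(new SketchLockManager(true));
        } catch (IllegalStateException e) {
            // The root-cause exception surfaces instead of a timer error.
            System.out.println(e.getMessage());
        }
    }
}
```

Because `unlock` no longer throws when the timer was never started, the exception from the failed acquisition is the one the caller sees.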





[jira] [Updated] (HUDI-7398) clarify clustering strategy for java client

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7398:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> clarify clustering strategy for java client
> ---
>
> Key: HUDI-7398
> URL: https://issues.apache.org/jira/browse/HUDI-7398
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> The Java client only does linear sort:
> org.apache.hudi.client.clustering.run.strategy.JavaExecutionStrategy#getPartitioner
> In fact, it can be extended to perform space-filling curve sorting; I guess 
> it’s just not implemented yet. If you’re interested, feel free to attempt it 
> with a PR.





[jira] [Updated] (HUDI-7435) Remove Shaded of codahale metrics in flink bundle

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7435:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Remove Shaded of codahale metrics in flink bundle 
> --
>
> Key: HUDI-7435
> URL: https://issues.apache.org/jira/browse/HUDI-7435
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: flink
>Reporter: Qijun Fu
>Assignee: Qijun Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> see https://github.com/apache/hudi/pull/9118#issuecomment-1926124497





[jira] [Updated] (HUDI-5677) [DOCS] Update AWS libs version

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5677:
-
Fix Version/s: (was: 0.14.2)

> [DOCS] Update AWS libs version
> --
>
> Key: HUDI-5677
> URL: https://issues.apache.org/jira/browse/HUDI-5677
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Sagar Sumit
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> Update AWS libs version in https://hudi.apache.org/docs/s3_hoodie/





[jira] [Updated] (HUDI-6774) Prefix HiveConf properties to Hoodie catalog properties map with '.hadoop'

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-6774:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Prefix HiveConf properties to Hoodie catalog properties map with '.hadoop'
> --
>
> Key: HUDI-6774
> URL: https://issues.apache.org/jira/browse/HUDI-6774
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: meta-sync
>Reporter: Aditya Goenka
>Assignee: Vova Kolmakov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Github Issue - [https://github.com/apache/hudi/issues/9269]
> Prefix HiveConf properties to Hoodie catalog properties map with '.hadoop', 
> so that properties defined in hive-site.xml get used.
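
A small sketch of the key-prefixing idea described above. The helper name is hypothetical and the prefix is passed in as a parameter; the example call uses `hadoop.` as an assumed spelling, since the issue writes the prefix as '.hadoop', and the property value is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the code from the linked pull request.
final class PrefixPropsSketch {

    // Returns a copy of the input where every key carries the given prefix.
    // Keys that already start with the prefix are left unchanged.
    static Map<String, String> withPrefix(Map<String, String> props, String prefix) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            String key = entry.getKey();
            out.put(key.startsWith(prefix) ? key : prefix + key, entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> hiveProps = new LinkedHashMap<>();
        hiveProps.put("hive.metastore.uris", "thrift://localhost:9083"); // placeholder value
        // "hadoop." is an assumed prefix spelling for this example.
        System.out.println(withPrefix(hiveProps, "hadoop."));
    }
}
```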





[jira] [Updated] (HUDI-5180) Get Involved on the website has broken links

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5180:
-
Fix Version/s: (was: 0.14.2)

> Get Involved on the website has broken links
> 
>
> Key: HUDI-5180
> URL: https://issues.apache.org/jira/browse/HUDI-5180
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: docs
>Reporter: Jonathan Vexler
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> The page [https://hudi.apache.org/community/get-involved] has two links 
> labeled "#here" that point to 
> [https://hudi.apache.org/community/get-involved#accounts],
> an anchor that no longer exists. I'm not sure what they are supposed to link 
> to; the file was moved, so the git history isn't helpful.





[jira] [Updated] (HUDI-5719) Add docs for hudi-cli "show restores" feature

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5719:
-
Fix Version/s: (was: 0.14.2)

> Add docs for hudi-cli "show restores" feature
> -
>
> Key: HUDI-5719
> URL: https://issues.apache.org/jira/browse/HUDI-5719
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Pramod Biligiri
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> Once the hudi-cli "show restores" feature is accepted 
> (https://issues.apache.org/jira/browse/HUDI-1593 is considered done), add 
> documentation for the same to the website and wherever else required.





[jira] [Updated] (HUDI-5462) Spark-sql certain commands are only allowed with v2 tables

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5462:
-
Fix Version/s: (was: 0.14.2)

> Spark-sql certain commands are only allowed with v2 tables
> --
>
> Key: HUDI-5462
> URL: https://issues.apache.org/jira/browse/HUDI-5462
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs, spark-sql
>Reporter: Jonathan Vexler
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> Certain commands such as DROP COLUMNS, RENAME COLUMN are mentioned in [spark 
> documentation|https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-alter-table.html]
>  but not in our documentation. We should add them to our documentation.





[jira] [Updated] (HUDI-7402) Align MDT cleaner configs with the data table

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7402:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Align MDT cleaner configs with the data table
> -
>
> Key: HUDI-7402
> URL: https://issues.apache.org/jira/browse/HUDI-7402
> Project: Apache Hudi
>  Issue Type: Task
>  Components: metadata
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Metadata table should retain at least as much history as data table. Follow 
> similar policy as data table and set retention to 1.2x for metadata table.
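
The 1.2x rule above amounts to a small calculation. The following is a minimal sketch, not Hudi's implementation: the function name `mdt_retention` and the decision to round up are assumptions made for illustration; the ticket only states the 1.2x factor and that the metadata table should retain at least as much history as the data table.

```python
def mdt_retention(data_table_retention: int) -> int:
    """Hypothetical helper: metadata-table retention as 1.2x the data table's."""
    if data_table_retention < 0:
        raise ValueError("retention must be non-negative")
    # ceil(n * 6 / 5) in integer arithmetic, which avoids floating-point rounding.
    # Rounding up is an assumption; it guarantees the result is never below n.
    return (data_table_retention * 6 + 4) // 5
```

For example, a data table that retains 10 commits would give a metadata-table retention of 12 under this sketch.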





[jira] [Updated] (HUDI-7432) Fix excessive object creation in KeyGenUtils

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7432:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Fix excessive object creation in KeyGenUtils
> 
>
> Key: HUDI-7432
> URL: https://issues.apache.org/jira/browse/HUDI-7432
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Vova Kolmakov
>Assignee: Vova Kolmakov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Some key generators create excessive objects during 
> recordKey/partitionPath computation in order to cut off the last character (the 
> methods getRecordKey and getRecordPartitionPath in KeyGenUtils use 
> deleteCharAt).
> The same fix was already applied to CustomKeyGenerator (HUDI-6916).





[jira] [Updated] (HUDI-7411) Meta sync does not consider clean commits while syncing partitions

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7411:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Meta sync does not consider clean commits while syncing partitions
> --
>
> Key: HUDI-7411
> URL: https://issues.apache.org/jira/browse/HUDI-7411
> Project: Apache Hudi
>  Issue Type: Task
>  Components: meta-sync
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> The cleaner can delete partitions, but meta sync fails to drop the partition in 
> that case. This could cause queries from engines that depend on the catalog to 
> fail.





[jira] [Updated] (HUDI-7379) hudi-aws-bundle included jackson-module-afterburner without relocating jackson-databind

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7379:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> hudi-aws-bundle included jackson-module-afterburner without relocating 
> jackson-databind
> ---
>
> Key: HUDI-7379
> URL: https://issues.apache.org/jira/browse/HUDI-7379
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: flink
>Affects Versions: 0.14.1
>Reporter: Prabhu Joseph
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Hudi Flink (1.18.1) Write fails when using hudi-aws-bundle on Hudi 0.14.1. It 
> works fine on Hudi 0.14.0.
>  
> *Error*
> {code:java}
> Caused by: java.lang.VerifyError: Bad type on operand stack
> Exception Details:
>   Location:
> org/apache/hudi/timeline/service/RequestHandler.<clinit>()V @14: 
> invokevirtual
>   Reason:
> Type 
> 'org/apache/hudi/com/fasterxml/jackson/module/afterburner/AfterburnerModule' 
> (current frame, stack[1]) is not assignable to 
> 'org/apache/hudi/com/fasterxml/jackson/databind/Module'
>   Current Frame:
> bci: @14
> flags: { }
> locals: { }
> stack: { 'org/apache/hudi/com/fasterxml/jackson/databind/ObjectMapper', 
> 'org/apache/hudi/com/fasterxml/jackson/module/afterburner/AfterburnerModule' }
>   Bytecode:
> 0x000: bb00 8c59 b704 9dbb 049f 59b7 04a0 b604
> 0x010: a4b3 0157 1202 b804 aab3 010d b1
> at 
> org.apache.hudi.timeline.service.TimelineService.startService(TimelineService.java:358)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineService.startServer(EmbeddedTimelineService.java:180)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineService.createAndStartService(EmbeddedTimelineService.java:121)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineService.getOrStartEmbeddedTimelineService(EmbeddedTimelineService.java:107)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineService.getOrStartEmbeddedTimelineService(EmbeddedTimelineService.java:92)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.embedded.EmbeddedTimelineServerHelper.createEmbeddedTimelineService(EmbeddedTimelineServerHelper.java:44)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.BaseHoodieClient.startEmbeddedServerView(BaseHoodieClient.java:133)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.BaseHoodieClient.<init>(BaseHoodieClient.java:98) 
> ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.BaseHoodieWriteClient.<init>(BaseHoodieWriteClient.java:164)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.BaseHoodieWriteClient.<init>(BaseHoodieWriteClient.java:149)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.client.HoodieFlinkWriteClient.<init>(HoodieFlinkWriteClient.java:88)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.util.FlinkWriteClients.createWriteClient(FlinkWriteClients.java:71)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.hudi.sink.StreamWriteOperatorCoordinator.start(StreamWriteOperatorCoordinator.java:191)
>  ~[hudi-flink1.18-bundle-0.14.1-amzn-0-SNAPSHOT.jar:0.14.1-amzn-0-SNAPSHOT]
> at 
> org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder.start(OperatorCoordinatorHolder.java:185)
>  ~[flink-dist-1.18.1-amzn-0-SNAPSHOT.jar:1.18.1-amzn-0-SNAPSHOT]
> at 
> org.apache.flink.runtime.scheduler.DefaultOperatorCoordinatorHandler.startOperatorCoordinators(DefaultOperatorCoordinatorHandler.java:165)
>  ~[flink-dist-1.18.1-amzn-0-SNAPSHOT.jar:1.18.1-amzn-0-SNAPSHOT]
> at 
> org.apache.flink.runtime.scheduler.DefaultOperatorCoordinatorHandler.startAllOperatorCoordinators(DefaultOperatorCoordinatorHandler.java:82)
>  ~[flink-dist-1.18.1-amzn-0-SNAPSHOT.jar:1.18.1-amzn-0-SNAPSHOT]
> at 
> org.apache.flink.runtime.scheduler.SchedulerBase.startScheduling(SchedulerBase.java:627)
>  ~[flink-dist-1.18.1-amzn-0-SNAPSHOT.jar:1.18.1-amzn-0-SNAPSHOT]
> at 

[jira] [Updated] (HUDI-7482) Update schema evolution docs to explicitly state allowed type promotions

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7482:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Update schema evolution docs to explicitly state allowed type promotions
> 
>
> Key: HUDI-7482
> URL: https://issues.apache.org/jira/browse/HUDI-7482
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark, spark-sql
>Reporter: Jonathan Vexler
>Assignee: Jonathan Vexler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>
> Documentation currently only describes schema type correction





[jira] [Updated] (HUDI-7502) Reorganize content in the "how to" sections in the side bar

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7502:
-
Fix Version/s: (was: 0.14.2)

> Reorganize content in the "how to" sections in the side bar
> ---
>
> Key: HUDI-7502
> URL: https://issues.apache.org/jira/browse/HUDI-7502
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>






[jira] [Updated] (HUDI-7550) Add Daft usage in docs

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7550:
-
Fix Version/s: (was: 0.14.2)

> Add Daft usage in docs
> --
>
> Key: HUDI-7550
> URL: https://issues.apache.org/jira/browse/HUDI-7550
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>






[jira] [Updated] (HUDI-7443) Improve Compatibility for Legacy Decimal Types with Bytes as Actual Data Representation

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7443:
-
Fix Version/s: (was: 0.14.2)

> Improve Compatibility for Legacy Decimal Types with Bytes as Actual Data 
> Representation
> ---
>
> Key: HUDI-7443
> URL: https://issues.apache.org/jira/browse/HUDI-7443
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: hive
>Reporter: Qijun Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Improve compatibility for legacy decimal types that use bytes as the actual 
> data representation. 
> Decimals declared with the following type currently cannot be read: 
> ```json
> {
>   "name": "decimal_value",
>   "type": {
>     "type": "bytes",
>     "logicalType": "decimal",
>     "precision": 10,
>     "scale": 4
>   }
> }
> ```
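
For context on the schema above: per the Avro specification, a `decimal` logical type backed by `bytes` stores the two's-complement, big-endian representation of the unscaled integer, and the scale comes from the schema. The sketch below decodes such a value with the Python standard library only. It is an illustration, not Hudi's reader code, and the function name `decode_bytes_decimal` is hypothetical.

```python
from decimal import Decimal


def decode_bytes_decimal(raw: bytes, scale: int, precision: int) -> Decimal:
    """Hypothetical helper: decode an Avro bytes-backed decimal value."""
    # The payload is the unscaled value as a signed big-endian integer.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Reject values with more digits than the schema's declared precision.
    if len(str(abs(unscaled))) > precision:
        raise ValueError("value exceeds declared precision")
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# 0x01E240 is 123456; with scale 4 this is 12.3456.
value = decode_bytes_decimal(b"\x01\xe2\x40", scale=4, precision=10)
```

The precision check is a design choice of this sketch; a reader could equally accept the value and leave validation to the writer.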





[jira] [Updated] (HUDI-7529) Resolve hotspots in stream read

2024-03-28 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7529:
-
Fix Version/s: 0.15.0
   (was: 0.14.2)

> Resolve hotspots in stream read
> ---
>
> Key: HUDI-7529
> URL: https://issues.apache.org/jira/browse/HUDI-7529
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: reader-core
>Reporter: zhuanshenbsj1
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>






[jira] [Closed] (HUDI-7226) Clean by hour does not respect lastVersionBeforeEarliestCommitToRetain

2024-03-26 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-7226.

Fix Version/s: 0.14.1
   (was: 1.1.0)
 Assignee: Timothy Brown  (was: Raymond Xu)
   Resolution: Fixed

fixed in [https://github.com/apache/hudi/pull/10307]

> Clean by hour does not respect lastVersionBeforeEarliestCommitToRetain
> --
>
> Key: HUDI-7226
> URL: https://issues.apache.org/jira/browse/HUDI-7226
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: cleaning
>Reporter: Raymond Xu
>Assignee: Timothy Brown
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>
> org.apache.hudi.table.action.clean.CleanPlanner#getFilesToCleanKeepingLatestCommits(java.lang.String,
>  int, org.apache.hudi.common.model.HoodieCleaningPolicy)
> lastVersionBeforeEarliestCommitToRetain is not honored by the 
> KEEP_LATEST_BY_HOURS policy. As a result, the cleaner removes a file slice 
> as soon as it is no longer the latest. This can fail long-running queries 
> through the following race condition:
> # the timeline contains t0.deltacommit (not cleaned because it is the latest)
> # a snapshot query starts and keeps running
> # compaction runs and creates t1.commit
> # the cleaner runs and removes t0 (because t1.commit is now the latest)
> # the query fails because a log file belonging to t0.deltacommit is not found
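
The race in the steps above can be reproduced with a simplified model of a single file group. This is a sketch under stated assumptions, not the CleanPlanner code: instants are plain integers, the function name `slices_to_clean` and its parameters are hypothetical, and the only rule modeled is that slices older than the earliest commit to retain are cleaned, except the latest slice and, optionally, the last version before that commit.

```python
from typing import List


def slices_to_clean(slice_instants: List[int],
                    earliest_to_retain: int,
                    keep_last_before_earliest: bool) -> List[int]:
    """Hypothetical model: which file slices of one file group get cleaned."""
    ordered = sorted(slice_instants)
    if not ordered:
        return []
    # Slices written before the earliest commit to retain are cleaning candidates.
    before = [t for t in ordered if t < earliest_to_retain]
    # The latest slice is never cleaned.
    protected = {ordered[-1]}
    if keep_last_before_earliest and before:
        # A query that started at the earliest retained commit may still read this slice.
        protected.add(before[-1])
    return [t for t in before if t not in protected]


# t0 = 0 (deltacommit), t1 = 10 (compaction commit), retention boundary at 5.
without_fix = slices_to_clean([0, 10], earliest_to_retain=5, keep_last_before_earliest=False)
with_fix = slices_to_clean([0, 10], earliest_to_retain=5, keep_last_before_earliest=True)
```

In this model, `without_fix` contains t0, which is the slice the running snapshot query still needs, while `with_fix` is empty.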





[jira] [Created] (HUDI-7550) Add Daft usage in docs

2024-03-25 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-7550:


 Summary: Add Daft usage in docs
 Key: HUDI-7550
 URL: https://issues.apache.org/jira/browse/HUDI-7550
 Project: Apache Hudi
  Issue Type: Task
  Components: docs
Reporter: Raymond Xu
Assignee: Raymond Xu
 Fix For: 0.15.0, 0.14.2








[jira] [Updated] (HUDI-3309) Integrate quickstart examples into integration tests

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3309:
-
Epic Link: HUDI-7536  (was: HUDI-2224)

> Integrate quickstart examples into integration tests
> 
>
> Key: HUDI-3309
> URL: https://issues.apache.org/jira/browse/HUDI-3309
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs, tests-ci
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>
> - create integration test suite for quickstart examples
> - make the code examples on website pages generated from the code





[jira] [Updated] (HUDI-3585) Docs for (consistent) hashing index

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3585:
-
Issue Type: Task  (was: New Feature)

> Docs for (consistent) hashing index
> ---
>
> Key: HUDI-3585
> URL: https://issues.apache.org/jira/browse/HUDI-3585
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Yuwei Xiao
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> User documentation for the (consistent) hashing index, covering the 
> following content:
> - configs to enable bucket index and tuning parameters
> - use cases and demos
> - limitations and restrictions





[jira] [Closed] (HUDI-7301) Update hudi docs/websites with documentation for the new spark TVF

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-7301.

  Assignee: (was: Vinaykumar Bhat)
Resolution: Fixed

https://hudi.apache.org/docs/0.14.0/quick-start-guide#incremental-query

> Update hudi docs/websites with documentation for the new spark TVF
> --
>
> Key: HUDI-7301
> URL: https://issues.apache.org/jira/browse/HUDI-7301
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Vinaykumar Bhat
>Priority: Major
> Fix For: 0.14.0
>
>
> The Hudi documentation and website need to be updated to reflect support for the 
> new Spark SQL table-valued functions.





[jira] [Updated] (HUDI-7301) Update hudi docs/websites with documentation for the new spark TVF

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-7301:
-
Fix Version/s: 0.14.0
   (was: 0.15.0)

> Update hudi docs/websites with documentation for the new spark TVF
> --
>
> Key: HUDI-7301
> URL: https://issues.apache.org/jira/browse/HUDI-7301
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
> Fix For: 0.14.0
>
>
> The Hudi documentation and website need to be updated to reflect support for the 
> new Spark SQL table-valued functions.





[jira] [Updated] (HUDI-1916) Create a matrix of datatypes across spark, hive, presto, Avro, parquet.

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1916:
-
Fix Version/s: 0.15.0
   (was: 1.1.0)

> Create a matrix of datatypes across spark, hive, presto, Avro, parquet. 
> 
>
> Key: HUDI-1916
> URL: https://issues.apache.org/jira/browse/HUDI-1916
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 0.15.0
>
>
> Create a matrix of datatypes across spark, hive, presto, Avro, parquet.
> Follow up with Flink. 





[jira] [Assigned] (HUDI-1916) Create a matrix of datatypes across spark, hive, presto, Avro, parquet.

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-1916:


Assignee: Raymond Xu  (was: Nishith Agarwal)

> Create a matrix of datatypes across spark, hive, presto, Avro, parquet. 
> 
>
> Key: HUDI-1916
> URL: https://issues.apache.org/jira/browse/HUDI-1916
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Major
> Fix For: 1.1.0
>
>
> Create a matrix of datatypes across spark, hive, presto, Avro, parquet.
> Follow up with Flink. 





[jira] [Closed] (HUDI-533) Update Setup docs with lastest checkstyle

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-533.
---
  Assignee: Raymond Xu
Resolution: Fixed

https://hudi.apache.org/contribute/developer-setup/

> Update Setup docs with lastest checkstyle
> -
>
> Key: HUDI-533
> URL: https://issues.apache.org/jira/browse/HUDI-533
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: leesf
>Assignee: Raymond Xu
>Priority: Minor
>
> more context at 
> https://github.com/apache/incubator-hudi/pull/1208#issuecomment-574024869
> docs here https://hudi.apache.org/contributing.html





[jira] [Assigned] (HUDI-3) Add a guide for running Hudi pipelines using AWS Glue

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-3:
-

Assignee: Raymond Xu  (was: Vinoth Chandar)

> Add a guide for running Hudi pipelines using AWS Glue
> -
>
> Key: HUDI-3
> URL: https://issues.apache.org/jira/browse/HUDI-3
> Project: Apache Hudi
>  Issue Type: Wish
>  Components: docs, Usability
>Reporter: Vinoth Chandar
>Assignee: Raymond Xu
>Priority: Minor
>






[jira] [Updated] (HUDI-691) hoodie.*.consume.* should be set whitelist in hive-site.xml

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-691:

Fix Version/s: 0.15.0
   (was: 1.1.0)

> hoodie.*.consume.* should be set whitelist in hive-site.xml
> ---
>
> Key: HUDI-691
> URL: https://issues.apache.org/jira/browse/HUDI-691
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: Bhavani Sudha
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: new-to-hudi, query-eng, sev:high, user-support-issues
> Fix For: 0.15.0
>
>
> More details in this GH issue - 
> https://github.com/apache/incubator-hudi/issues/910





[jira] [Updated] (HUDI-3) Add a guide for running Hudi pipelines using AWS Glue

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3:
--
Fix Version/s: 0.15.0

> Add a guide for running Hudi pipelines using AWS Glue
> -
>
> Key: HUDI-3
> URL: https://issues.apache.org/jira/browse/HUDI-3
> Project: Apache Hudi
>  Issue Type: Wish
>  Components: docs, Usability
>Reporter: Vinoth Chandar
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>






[jira] [Assigned] (HUDI-691) hoodie.*.consume.* should be set whitelist in hive-site.xml

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-691:
---

Assignee: Raymond Xu  (was: GarudaGuo)

> hoodie.*.consume.* should be set whitelist in hive-site.xml
> ---
>
> Key: HUDI-691
> URL: https://issues.apache.org/jira/browse/HUDI-691
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: Bhavani Sudha
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: new-to-hudi, query-eng, sev:high, user-support-issues
> Fix For: 1.1.0
>
>
> More details in this GH issue - 
> https://github.com/apache/incubator-hudi/issues/910





[jira] [Closed] (HUDI-1972) Document set hive.conversion=none when using clause like LIMIT

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-1972.

Resolution: Abandoned

> Document set hive.conversion=none when using clause like LIMIT
> --
>
> Key: HUDI-1972
> URL: https://issues.apache.org/jira/browse/HUDI-1972
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Nishith Agarwal
>Assignee: Nishith Agarwal
>Priority: Minor
>






[jira] [Assigned] (HUDI-1605) Add more documentation around archival process and configs

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-1605:


Assignee: Raymond Xu  (was: Kyle Weller)

> Add more documentation around archival process and configs
> --
>
> Key: HUDI-1605
> URL: https://issues.apache.org/jira/browse/HUDI-1605
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: user-support-issues
>
> Reference:
> What is the trade-off in lowering {{hoodie.keep.max.commits}} and 
> {{hoodie.keep.min.commits}}?
> https://github.com/apache/hudi/issues/2408#issuecomment-758360941





[jira] [Updated] (HUDI-1964) Update guide around hive metastore and hive sync for hudi tables

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1964:
-
Summary: Update guide around hive metastore and hive sync for hudi tables  
(was: Add a blog around hive metastore and hive sync for hudi tables)

> Update guide around hive metastore and hive sync for hudi tables
> 
>
> Key: HUDI-1964
> URL: https://issues.apache.org/jira/browse/HUDI-1964
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Nishith Agarwal
>Assignee: Nishith Agarwal
>Priority: Minor
>






[jira] [Updated] (HUDI-1964) Update guide around hive metastore and hive sync for hudi tables

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1964:
-
Fix Version/s: 0.15.0

> Update guide around hive metastore and hive sync for hudi tables
> 
>
> Key: HUDI-1964
> URL: https://issues.apache.org/jira/browse/HUDI-1964
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Nishith Agarwal
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>






[jira] [Updated] (HUDI-1605) Add more documentation around archival process and configs

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1605:
-
Fix Version/s: 0.15.0

> Add more documentation around archival process and configs
> --
>
> Key: HUDI-1605
> URL: https://issues.apache.org/jira/browse/HUDI-1605
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: user-support-issues
> Fix For: 0.15.0
>
>
> Reference:
> What is the trade-off in lowering {{hoodie.keep.max.commits}} and 
> {{hoodie.keep.min.commits}}?
> https://github.com/apache/hudi/issues/2408#issuecomment-758360941





[jira] [Updated] (HUDI-2609) Clarify small file configs in config page

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2609:
-
Fix Version/s: 0.15.0

> Clarify small file configs in config page
> -
>
> Key: HUDI-2609
> URL: https://issues.apache.org/jira/browse/HUDI-2609
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: docs, user-support-issues
> Fix For: 0.15.0
>
>
> The knowledge should be preserved in docs close to the related config keys
> https://github.com/apache/hudi/issues/3676#issuecomment-922508543





[jira] [Assigned] (HUDI-1964) Update guide around hive metastore and hive sync for hudi tables

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-1964:


Assignee: Raymond Xu  (was: Nishith Agarwal)

> Update guide around hive metastore and hive sync for hudi tables
> 
>
> Key: HUDI-1964
> URL: https://issues.apache.org/jira/browse/HUDI-1964
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Nishith Agarwal
>Assignee: Raymond Xu
>Priority: Minor
>






[jira] [Closed] (HUDI-2035) Create document for PrometheusReporter

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-2035.

Resolution: Fixed

https://hudi.apache.org/docs/0.14.0/metrics#prometheusmetricsreporter

> Create document for PrometheusReporter
> --
>
> Key: HUDI-2035
> URL: https://issues.apache.org/jira/browse/HUDI-2035
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Vinay
>Priority: Major
>
> Although PrometheusReporter has been released, there is no documentation for 
> it.





[jira] [Assigned] (HUDI-2609) Clarify small file configs in config page

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-2609:


Assignee: Raymond Xu  (was: Kyle Weller)

> Clarify small file configs in config page
> -
>
> Key: HUDI-2609
> URL: https://issues.apache.org/jira/browse/HUDI-2609
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: docs, user-support-issues
>
> The knowledge should be preserved in docs close to the related config keys
> https://github.com/apache/hudi/issues/3676#issuecomment-922508543





[jira] [Assigned] (HUDI-3074) Docs for Z-order

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-3074:


Assignee: Raymond Xu  (was: Kyle Weller)

> Docs for Z-order
> 
>
> Key: HUDI-3074
> URL: https://issues.apache.org/jira/browse/HUDI-3074
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
>






[jira] [Updated] (HUDI-3074) Docs for Z-order

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3074:
-
Fix Version/s: 0.15.0

> Docs for Z-order
> 
>
> Key: HUDI-3074
> URL: https://issues.apache.org/jira/browse/HUDI-3074
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>






[jira] [Assigned] (HUDI-3075) Docs for Debezium source

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-3075:


Assignee: Raymond Xu  (was: Kyle Weller)

> Docs for Debezium source
> 
>
> Key: HUDI-3075
> URL: https://issues.apache.org/jira/browse/HUDI-3075
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Updated] (HUDI-3075) Docs for Debezium source

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3075:
-
Fix Version/s: 0.15.0

> Docs for Debezium source
> 
>
> Key: HUDI-3075
> URL: https://issues.apache.org/jira/browse/HUDI-3075
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>






[jira] [Updated] (HUDI-3079) Docs for Flink 0.10.0 new features

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3079:
-
Fix Version/s: (was: 0.15.0)

> Docs for Flink 0.10.0 new features
> --
>
> Key: HUDI-3079
> URL: https://issues.apache.org/jira/browse/HUDI-3079
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
>






[jira] [Closed] (HUDI-3079) Docs for Flink 0.10.0 new features

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-3079.

  Assignee: (was: Raymond Xu)
Resolution: Fixed

https://hudi.apache.org/docs/0.14.0/flink-quick-start-guide

> Docs for Flink 0.10.0 new features
> --
>
> Key: HUDI-3079
> URL: https://issues.apache.org/jira/browse/HUDI-3079
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Priority: Minor
>






[jira] [Assigned] (HUDI-3079) Docs for Flink 0.10.0 new features

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-3079:


Assignee: Raymond Xu  (was: Kyle Weller)

> Docs for Flink 0.10.0 new features
> --
>
> Key: HUDI-3079
> URL: https://issues.apache.org/jira/browse/HUDI-3079
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
>






[jira] [Updated] (HUDI-3079) Docs for Flink 0.10.0 new features

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3079:
-
Fix Version/s: 0.15.0

> Docs for Flink 0.10.0 new features
> --
>
> Key: HUDI-3079
> URL: https://issues.apache.org/jira/browse/HUDI-3079
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Kyle Weller
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>






[jira] [Updated] (HUDI-2151) Make performant out-of-box configs

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2151:
-
Component/s: (was: docs)

> Make performant out-of-box configs
> --
>
> Key: HUDI-2151
> URL: https://issues.apache.org/jira/browse/HUDI-2151
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: code-quality, writer-core
>Reporter: Vinoth Chandar
>Assignee: sivabalan narayanan
>Priority: Blocker
>  Labels: pull-request-available
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> We have quite a few configs which deliver better performance or usability, 
> but are guarded by flags. 
>  This is to identify them, change them, test (functionally, perf) and make 
> them default
>  
> Need to ensure we also capture all the backwards compatibility issues that 
> can arise





[jira] [Closed] (HUDI-1600) Fix document to reflect Hudi supports MOR for spark datasource for incremental queries

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-1600.

  Assignee: (was: Gary Li)
Resolution: Fixed

https://hudi.apache.org/docs/0.14.0/quick-start-guide#incremental-query

> Fix document to reflect Hudi supports MOR for spark datasource for 
> incremental queries
> --
>
> Key: HUDI-1600
> URL: https://issues.apache.org/jira/browse/HUDI-1600
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: Bhavani Sudha
>Priority: Minor
>
> The document should be updated to reflect the same 
> [https://hudi.apache.org/docs/querying_data.html#merge-on-read-tables]





[jira] [Comment Edited] (HUDI-1111) Highlight Hudi guarantees in documentation section of website

2024-03-24 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830270#comment-17830270
 ] 

Raymond Xu edited comment on HUDI-1111 at 3/24/24 4:26 PM:
---

[https://hudi.apache.org/docs/next/hudi_stack#transactional-database-layer|https://hudi.apache.org/docs/next/concurrency_control]


was (Author: xushiyan):
https://hudi.apache.org/docs/next/concurrency_control

> Highlight Hudi guarantees in documentation section of website 
> --
>
> Key: HUDI-1111
> URL: https://issues.apache.org/jira/browse/HUDI-1111
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: Balaji Varadarajan
>Assignee: leesf
>Priority: Major
>  Labels: user-support-issues
>
> [https://github.com/apache/hudi/issues/1795]
>  





[jira] [Assigned] (HUDI-1569) Add Flink examples to QuickStartUtils and Docker demo page

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-1569:


Assignee: Raymond Xu  (was: Danny Chen)

> Add Flink examples to QuickStartUtils and Docker demo page
> --
>
> Key: HUDI-1569
> URL: https://issues.apache.org/jira/browse/HUDI-1569
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Major
>  Labels: user-support-issues
>
> Add Flink examples to QuickStartUtils and Docker demo page. 





[jira] [Updated] (HUDI-1569) Add Flink examples to QuickStartUtils and Docker demo page

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1569:
-
Fix Version/s: 0.15.0

> Add Flink examples to QuickStartUtils and Docker demo page
> --
>
> Key: HUDI-1569
> URL: https://issues.apache.org/jira/browse/HUDI-1569
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Major
>  Labels: user-support-issues
> Fix For: 0.15.0
>
>
> Add Flink examples to QuickStartUtils and Docker demo page. 





[jira] [Updated] (HUDI-1081) Document AWS Hudi integration

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1081:
-
Fix Version/s: 0.15.0

> Document AWS Hudi integration
> -
>
> Key: HUDI-1081
> URL: https://issues.apache.org/jira/browse/HUDI-1081
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs, Usability
>Affects Versions: 0.9.0
>Reporter: Bhavani Sudha
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: documentation, user-support-issues
> Fix For: 0.15.0
>
>
> Oftentimes AWS Hudi users seek documentation on setting up Hudi and 
> integrating Hive metastore and Glue configurations. This has been one of the 
> popular threads in Slack. It would serve well if documented.
>   





[jira] [Assigned] (HUDI-1081) Document AWS Hudi integration

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-1081:


Assignee: Raymond Xu  (was: Bhavani Sudha)

> Document AWS Hudi integration
> -
>
> Key: HUDI-1081
> URL: https://issues.apache.org/jira/browse/HUDI-1081
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs, Usability
>Affects Versions: 0.9.0
>Reporter: Bhavani Sudha
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: documentation, user-support-issues
>
> Oftentimes AWS Hudi users seek documentation on setting up Hudi and 
> integrating Hive metastore and Glue configurations. This has been one of the 
> popular threads in Slack. It would serve well if documented.
>   





[jira] [Closed] (HUDI-1111) Highlight Hudi guarantees in documentation section of website

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-1111.

Resolution: Fixed

https://hudi.apache.org/docs/next/concurrency_control

> Highlight Hudi guarantees in documentation section of website 
> --
>
> Key: HUDI-1111
> URL: https://issues.apache.org/jira/browse/HUDI-1111
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 0.9.0
>Reporter: Balaji Varadarajan
>Assignee: leesf
>Priority: Major
>  Labels: user-support-issues
>
> [https://github.com/apache/hudi/issues/1795]
>  





[jira] [Assigned] (HUDI-851) Add Documentation on partitioning data with examples and details on how to sync to Hive

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-851:
---

Assignee: Raymond Xu  (was: Sagar Sumit)

> Add Documentation on partitioning data with examples and details on how to 
> sync to Hive
> ---
>
> Key: HUDI-851
> URL: https://issues.apache.org/jira/browse/HUDI-851
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: Bhavani Sudha
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: query-eng, user-support-issues
> Fix For: 0.15.0
>
>






[jira] [Commented] (HUDI-336) Improve page related content on Hudi website

2024-03-24 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830268#comment-17830268
 ] 

Raymond Xu commented on HUDI-336:
-

website landing page updated

> Improve page related content on Hudi website
> 
>
> Key: HUDI-336
> URL: https://issues.apache.org/jira/browse/HUDI-336
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: Ethan Guo
>Priority: Major
>
> The landing page of [hudi.apache.org|https://hudi.apache.org/] shows detailed 
> information about Hudi.  It would be good to show highlights of Hudi at the 
> high level and other useful information (powered-by, slack group, etc) 
> directly on the landing page.
>  
> h2. Redesigning landing page
> The landing page should serve the following purposes, with as few words as 
> possible
>  # Answer what is Hudi
>  # Provide insights into why someone should use Hudi (some answers in 
> [FAQ|http://cwiki.apache.org/confluence/display/HUDI/FAQ])
>  # Highlight key features
>  # Have the link to the comparison page if the user would like to read more
>  # Direct links/buttons to join the community
> Here are the proposed changes corresponding to each item:
>  # Simplify the existing content to 2-3 sentences, with a diagram showing the 
> interoperability with existing data ecosystem
>  # 3-4 sentences on why someone should use Hudi, with a link to the 
> [FAQ|http://cwiki.apache.org/confluence/display/HUDI/FAQ]
>  # Highlight a few key features, each one elaborated with the technical 
> details in 2-3 sentences, something like the following (from this [set of 
> slides|https://docs.google.com/presentation/d/1FHhsvh70ZP6xXlHdVsAI0g__B_6Mpto5KQFlZ0b8-mM/edit#slide=id.g47544a2471_0_0]
>  by Vinoth and Balaji)
> E.g., "*Near real-time data ingestion to Cloud storage/DFS*: By carefully 
> managing how data is laid out in storage & how it’s exposed to queries, Hudi 
> is able to power a rich data ecosystem where external sources can be ingested 
> in near real-time and made available for interactive SQL Engines like 
> [Presto|https://prestodb.io/] & [Spark|https://spark.apache.org/sql/];
> !http://cwiki.apache.org/confluence/download/attachments/135860808/Screen%20Shot%202019-11-13%20at%204.20.26%20PM.png?version=1=1573691647000=v2|height=250!!http://cwiki.apache.org/confluence/download/attachments/135860808/Screen%20Shot%202019-11-13%20at%204.20.32%20PM.png?version=1=1573691647000=v2|height=250!
>  # Have a few sentences on the comparison against other solutions.  Have a 
> link to the detailed comparison page.
>  # Direct links for joining the slack group and mailing list.  Right now 
> these links are in the community page and require additional clicks and 
> reading.
> For the detailed comparison page, we could have an interactive table to 
> compare Hudi with other solutions regarding different aspects (each aspect 
> can be clickable to show detailed explanation):
> !https://cdn.qubole.com/wp-content/uploads/2019/09/Hive-ACID-selection-table.png|height=250!
> The top menu bar on the landing page is still kept.





[jira] [Closed] (HUDI-336) Improve page related content on Hudi website

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-336.
---
Resolution: Done

> Improve page related content on Hudi website
> 
>
> Key: HUDI-336
> URL: https://issues.apache.org/jira/browse/HUDI-336
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: Ethan Guo
>Priority: Major
>
> The landing page of [hudi.apache.org|https://hudi.apache.org/] shows detailed 
> information about Hudi.  It would be good to show highlights of Hudi at the 
> high level and other useful information (powered-by, slack group, etc) 
> directly on the landing page.
>  
> h2. Redesigning landing page
> The landing page should serve the following purposes, with as few words as 
> possible
>  # Answer what is Hudi
>  # Provide insights into why someone should use Hudi (some answers in 
> [FAQ|http://cwiki.apache.org/confluence/display/HUDI/FAQ])
>  # Highlight key features
>  # Have the link to the comparison page if the user would like to read more
>  # Direct links/buttons to join the community
> Here are the proposed changes corresponding to each item:
>  # Simplify the existing content to 2-3 sentences, with a diagram showing the 
> interoperability with existing data ecosystem
>  # 3-4 sentences on why someone should use Hudi, with a link to the 
> [FAQ|http://cwiki.apache.org/confluence/display/HUDI/FAQ]
>  # Highlight a few key features, each one elaborated with the technical 
> details in 2-3 sentences, something like the following (from this [set of 
> slides|https://docs.google.com/presentation/d/1FHhsvh70ZP6xXlHdVsAI0g__B_6Mpto5KQFlZ0b8-mM/edit#slide=id.g47544a2471_0_0]
>  by Vinoth and Balaji)
> E.g., "*Near real-time data ingestion to Cloud storage/DFS*: By carefully 
> managing how data is laid out in storage & how it’s exposed to queries, Hudi 
> is able to power a rich data ecosystem where external sources can be ingested 
> in near real-time and made available for interactive SQL Engines like 
> [Presto|https://prestodb.io/] & [Spark|https://spark.apache.org/sql/];
> !http://cwiki.apache.org/confluence/download/attachments/135860808/Screen%20Shot%202019-11-13%20at%204.20.26%20PM.png?version=1=1573691647000=v2|height=250!!http://cwiki.apache.org/confluence/download/attachments/135860808/Screen%20Shot%202019-11-13%20at%204.20.32%20PM.png?version=1=1573691647000=v2|height=250!
>  # Have a few sentences on the comparison against other solutions.  Have a 
> link to the detailed comparison page.
>  # Direct links for joining the slack group and mailing list.  Right now 
> these links are in the community page and require additional clicks and 
> reading.
> For the detailed comparison page, we could have an interactive table to 
> compare Hudi with other solutions regarding different aspects (each aspect 
> can be clickable to show detailed explanation):
> !https://cdn.qubole.com/wp-content/uploads/2019/09/Hive-ACID-selection-table.png|height=250!
> The top menu bar on the landing page is still kept.





[jira] [Closed] (HUDI-295) Do one-time cleanup of Hudi git history

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-295.
---
Resolution: Abandoned

> Do one-time cleanup of Hudi git history
> ---
>
> Key: HUDI-295
> URL: https://issues.apache.org/jira/browse/HUDI-295
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Vinoth Chandar
>Priority: Major
>
> https://lists.apache.org/thread.html/dc6eb516e248088dac1a2b5c9690383dfe2eb3912f76bbe9dd763c2b@





[jira] [Assigned] (HUDI-17) Better documentation for paths passed to incr and ro views from Spark datasource

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-17:
--

Assignee: Raymond Xu

> Better documentation for paths passed to incr and ro views from Spark 
> datasource
> 
>
> Key: HUDI-17
> URL: https://issues.apache.org/jira/browse/HUDI-17
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs, spark
>Reporter: Vinoth Chandar
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: user-support-issues
>
> https://github.com/uber/hudi/issues/524





[jira] [Updated] (HUDI-17) Better documentation for paths passed to incr and ro views from Spark datasource

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-17:
---
Fix Version/s: 0.15.0

> Better documentation for paths passed to incr and ro views from Spark 
> datasource
> 
>
> Key: HUDI-17
> URL: https://issues.apache.org/jira/browse/HUDI-17
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs, spark
>Reporter: Vinoth Chandar
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: user-support-issues
> Fix For: 0.15.0
>
>
> https://github.com/uber/hudi/issues/524





[jira] [Updated] (HUDI-4859) Adding a blog on how to run Hudi on Serverless Platforms (AWS Glue)

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4859:
-
Component/s: (was: docs)

> Adding a blog on how to run Hudi on Serverless Platforms (AWS Glue) 
> 
>
> Key: HUDI-4859
> URL: https://issues.apache.org/jira/browse/HUDI-4859
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Angel Conde 
>Assignee: Angel Conde 
>Priority: Trivial
>  Labels: Blog, docs
> Fix For: 0.15.0
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> Hi, 
> After a small contribution about Delta Streamer 
> [https://github.com/apache/hudi/pull/5630#event-7377963882]
> I got the suggestion on writing some docs/blog on how to use Hudi on 
> Serverless platforms. 
>  
> This ticket is to follow this work. The idea is to publish a new blog with a 
> full example of running Hudi on AWS Glue (one of many "serverless" Spark 
> Platforms). 
>  
> Further in the future I would like to contribute with integration with AWS 
> Glue Registry for Delta Streamer :). 
>  
> Thanks!





[jira] [Updated] (HUDI-1564) Blog: Dfs -> Hudi followed by Kafka to Hudi

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1564:
-
Component/s: (was: docs)

> Blog: Dfs -> Hudi followed by Kafka to Hudi
> ---
>
> Key: HUDI-1564
> URL: https://issues.apache.org/jira/browse/HUDI-1564
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: sivabalan narayanan
>Assignee: Pratyaksh Sharma
>Priority: Major
>  Labels: user-support-issues
>
> We need a blog to talk about this use case. 
> Let's say content from Kafka is already dumped to dfs. 
> So, bootstrapping will be done from dfs to Hudi. 
> Once complete, users might want to hook in Kafka Source in delta streamer -> 
> Hudi. 
>  





[jira] [Updated] (HUDI-4859) Adding a blog on how to run Hudi on Serverless Platforms (AWS Glue)

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4859:
-
Fix Version/s: (was: 0.15.0)

> Adding a blog on how to run Hudi on Serverless Platforms (AWS Glue) 
> 
>
> Key: HUDI-4859
> URL: https://issues.apache.org/jira/browse/HUDI-4859
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Angel Conde 
>Assignee: Angel Conde 
>Priority: Trivial
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> Hi, 
> After a small contribution about Delta Streamer 
> [https://github.com/apache/hudi/pull/5630#event-7377963882]
> I got the suggestion on writing some docs/blog on how to use Hudi on 
> Serverless platforms. 
>  
> This ticket is to follow this work. The idea is to publish a new blog with a 
> full example of running Hudi on AWS Glue (one of many "serverless" Spark 
> Platforms). 
>  
> Further in the future I would like to contribute with integration with AWS 
> Glue Registry for Delta Streamer :). 
>  
> Thanks!





[jira] [Updated] (HUDI-4859) Adding a blog on how to run Hudi on Serverless Platforms (AWS Glue)

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4859:
-
Labels:   (was: Blog docs)

> Adding a blog on how to run Hudi on Serverless Platforms (AWS Glue) 
> 
>
> Key: HUDI-4859
> URL: https://issues.apache.org/jira/browse/HUDI-4859
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Angel Conde 
>Assignee: Angel Conde 
>Priority: Trivial
> Fix For: 0.15.0
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> Hi, 
> After a small contribution about Delta Streamer 
> [https://github.com/apache/hudi/pull/5630#event-7377963882]
> I got the suggestion on writing some docs/blog on how to use Hudi on 
> Serverless platforms. 
>  
> This ticket is to follow this work. The idea is to publish a new blog with a 
> full example of running Hudi on AWS Glue (one of many "serverless" Spark 
> Platforms). 
>  
> Further in the future I would like to contribute with integration with AWS 
> Glue Registry for Delta Streamer :). 
>  
> Thanks!





[jira] [Closed] (HUDI-1022) Document examples for Spark structured streaming writing into Hudi

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-1022.

Resolution: Fixed

https://hudi.apache.org/docs/next/writing_tables_streaming_writes

> Document examples for Spark structured streaming writing into Hudi
> --
>
> Key: HUDI-1022
> URL: https://issues.apache.org/jira/browse/HUDI-1022
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs, Usability
>Reporter: Bhavani Sudha
>Assignee: Sagar Sumit
>Priority: Minor
>  Labels: docs, sev:normal, user-support-issues
>






[jira] [Updated] (HUDI-1112) Blog on Tracking Hudi Data along transaction time and business time

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-1112:
-
Component/s: blog
 (was: docs)

> Blog on Tracking Hudi Data along transaction time and business time
> ---
>
> Key: HUDI-1112
> URL: https://issues.apache.org/jira/browse/HUDI-1112
> Project: Apache Hudi
>  Issue Type: Task
>  Components: blog
>Affects Versions: 0.9.0
>Reporter: Vinoth Chandar
>Assignee: Sandeep Maji
>Priority: Major
>
> https://github.com/apache/hudi/issues/1705



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-2341) Publish a blog on immutable data lakes

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-2341:
-
Component/s: (was: docs)

> Publish a blog on immutable data lakes
> --
>
> Key: HUDI-2341
> URL: https://issues.apache.org/jira/browse/HUDI-2341
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Updated] (HUDI-4310) Add docs around spark scheduler configs for compaction and clustering

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-4310:
-
Fix Version/s: 0.15.0

> Add docs around spark scheduler configs for compaction and clustering
> -
>
> Key: HUDI-4310
> URL: https://issues.apache.org/jira/browse/HUDI-4310
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Minor
> Fix For: 0.15.0
>
>






[jira] [Assigned] (HUDI-4310) Add docs around spark scheduler configs for compaction and clustering

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-4310:


Assignee: Raymond Xu

> Add docs around spark scheduler configs for compaction and clustering
> -
>
> Key: HUDI-4310
> URL: https://issues.apache.org/jira/browse/HUDI-4310
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: sivabalan narayanan
>Assignee: Raymond Xu
>Priority: Minor
>






[jira] [Closed] (HUDI-1565) Document all maven commands

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-1565.

Resolution: Fixed

In readme

> Document all maven commands
> ---
>
> Key: HUDI-1565
> URL: https://issues.apache.org/jira/browse/HUDI-1565
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: user-support-issues
>
> Document all maven commands





[jira] [Closed] (HUDI-1966) Add a CLI command to capture .hoodie output for debugging

2024-03-24 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-1966.

Resolution: Abandoned

> Add a CLI command to capture .hoodie output for debugging
> -
>
> Key: HUDI-1966
> URL: https://issues.apache.org/jira/browse/HUDI-1966
> Project: Apache Hudi
>  Issue Type: Task
>  Components: docs
>Reporter: Nishith Agarwal
>Assignee: Nishith Agarwal
>Priority: Minor
>





