[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * 4949d3827d33e725b2bbba13b9ce15df160d56ee Azure:

[jira] [Updated] (HUDI-2472) Tests failure follow up when metadata is enabled by default

2021-10-06 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-2472: - Description: We plan to enable metadata by default, but some tests fail with it.

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * 5f1b6ee1f819a5fa8398417877003ac7c477ca37 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * 4949d3827d33e725b2bbba13b9ce15df160d56ee Azure:

[GitHub] [hudi] nsivabalan commented on issue #3605: [SUPPORT]Hudi Inserts and Upserts for MoR and CoW tables are taking very long time.

2021-10-06 Thread GitBox
nsivabalan commented on issue #3605: URL: https://github.com/apache/hudi/issues/3605#issuecomment-937436336 Also, can you post your Spark stages UI so that we can see some metrics on data skew and how much parallelism we are hitting? -- This is an automated message from the

[GitHub] [hudi] nsivabalan edited a comment on issue #3605: [SUPPORT]Hudi Inserts and Upserts for MoR and CoW tables are taking very long time.

2021-10-06 Thread GitBox
nsivabalan edited a comment on issue #3605: URL: https://github.com/apache/hudi/issues/3605#issuecomment-937435741 Sorry, what's the shuffle parallelism you are setting for these writes? In your original description, I see you are setting it to 2. That would definitely give you bad performance.

[GitHub] [hudi] nsivabalan commented on issue #3605: [SUPPORT]Hudi Inserts and Upserts for MoR and CoW tables are taking very long time.

2021-10-06 Thread GitBox
nsivabalan commented on issue #3605: URL: https://github.com/apache/hudi/issues/3605#issuecomment-937435741 Sorry, what's the shuffle parallelism you are setting for these writes? In your original description, I see you are setting it to 2. That would definitely give you bad performance. Try to
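
For readers following the shuffle-parallelism point in this thread, here is a minimal sketch of how the relevant writer options might be assembled. The three `hoodie.*.shuffle.parallelism` keys and `hoodie.table.name` are Hudi writer configs; the helper name `hudi_write_options`, the table name, and the value 200 are illustrative assumptions only, not values recommended anywhere in the thread.

```python
def hudi_write_options(table_name, parallelism=200):
    """Build Hudi writer options with an explicit shuffle parallelism.

    The default of 200 is a placeholder assumption; a suitable value
    depends on input size and cluster resources.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be a positive integer")
    value = str(parallelism)  # Spark datasource options are passed as strings
    return {
        "hoodie.table.name": table_name,
        "hoodie.insert.shuffle.parallelism": value,
        "hoodie.upsert.shuffle.parallelism": value,
        "hoodie.bulkinsert.shuffle.parallelism": value,
    }


# Hypothetical table name, for illustration only.
options = hudi_write_options("example_table", parallelism=200)
```

With PySpark, a dict like this would typically be passed along as `df.write.format("hudi").options(**options)`, together with the record key and other settings the table needs.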

[GitHub] [hudi] nsivabalan commented on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
nsivabalan commented on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-937434037 I am not aware of any easier option, or whether hudi-cli has any option for this. @vinothchandar @bhasudha @bvaradar @n3nash: any suggestions here? Here is the question: if

[GitHub] [hudi] nsivabalan commented on issue #2265: Arrays with nulls in them result in broken parquet files

2021-10-06 Thread GitBox
nsivabalan commented on issue #2265: URL: https://github.com/apache/hudi/issues/2265#issuecomment-937432922 Nope, it was a very old comment. removed it. 080 should have it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [hudi] nsivabalan commented on issue #3676: MOR table rolls out new parquet files at 10MB for new inserts - even though max file size set as 128MB

2021-10-06 Thread GitBox
nsivabalan commented on issue #3676: URL: https://github.com/apache/hudi/issues/3676#issuecomment-937427922 I guess we don't have clear documentation around this. I had to dig through the code and try it myself before confirming some of the nuanced behaviors.

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * 5f1b6ee1f819a5fa8398417877003ac7c477ca37 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * 4949d3827d33e725b2bbba13b9ce15df160d56ee Azure:

[GitHub] [hudi] nsivabalan closed pull request #3411: [HUDI-2276] Enable metadata table by default for readers and writers

2021-10-06 Thread GitBox
nsivabalan closed pull request #3411: URL: https://github.com/apache/hudi/pull/3411

[GitHub] [hudi] nsivabalan commented on pull request #3411: [HUDI-2276] Enable metadata table by default for readers and writers

2021-10-06 Thread GitBox
nsivabalan commented on pull request #3411: URL: https://github.com/apache/hudi/pull/3411#issuecomment-937353116 fixed it along with https://github.com/apache/hudi/pull/3590

[GitHub] [hudi] nsivabalan merged pull request #3753: [HUDI-2510] Added a quickstart redirect page to fix broken external links in GCP docs

2021-10-06 Thread GitBox
nsivabalan merged pull request #3753: URL: https://github.com/apache/hudi/pull/3753

[hudi] branch asf-site updated: [HUDI-2510] Added a quickstart redirect page to fix broken external links in GCP docs (#3753)

2021-10-06 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new c6512ea [HUDI-2510] Added a quickstart

[GitHub] [hudi] nsivabalan commented on a change in pull request #3740: [HUDI-2496] Insert duplicate keys when precombined is deactivated

2021-10-06 Thread GitBox
nsivabalan commented on a change in pull request #3740: URL: https://github.com/apache/hudi/pull/3740#discussion_r723770333 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSourceConsistentInserts.scala ## @@ -0,0 +1,86 @@

[GitHub] [hudi] nsivabalan merged pull request #3738: [MINOR] Fix typo,'properites' corrected to 'properties'

2021-10-06 Thread GitBox
nsivabalan merged pull request #3738: URL: https://github.com/apache/hudi/pull/3738

[GitHub] [hudi] nsivabalan commented on pull request #3722: HUDI-2491 hoodie.datasource.hive_sync.mode=hms mode is supported in s…

2021-10-06 Thread GitBox
nsivabalan commented on pull request #3722: URL: https://github.com/apache/hudi/pull/3722#issuecomment-937348693 @codope : Can you take up this review please.

[hudi] branch master updated (2e15217 -> 10e3a9a)

2021-10-06 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 2e15217 [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module (#3743) add 10e3a9a

[GitHub] [hudi] nsivabalan commented on pull request #3700: [HUDI-2471] Add support ignoring case when column name matches in merge into

2021-10-06 Thread GitBox
nsivabalan commented on pull request #3700: URL: https://github.com/apache/hudi/pull/3700#issuecomment-937348065 @pengzhiwei2018 @xushiyan : can either of you take care of reviewing this please.

[hudi] branch master updated: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module (#3743)

2021-10-06 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 2e15217 [HUDI-2513] Refactor table upgrade

[GitHub] [hudi] nsivabalan merged pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
nsivabalan merged pull request #3743: URL: https://github.com/apache/hudi/pull/3743

[GitHub] [hudi] bgt-cdedels commented on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
bgt-cdedels commented on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-937333824 Yes, my question is: if our configuration was trimming commits from timeline but files were not deleted by cleaner, is there a way to remove (manually clean) those files without

[jira] [Commented] (HUDI-860) Ability to do small file handling without need for caching

2021-10-06 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425284#comment-17425284 ] Ethan Guo commented on HUDI-860: Cool, I'll take a look. > Ability to do small file handling without need

[GitHub] [hudi] govorunov commented on issue #3756: [SUPPORT] Can we use Hudi to build Temporal Datastore?

2021-10-06 Thread GitBox
govorunov commented on issue #3756: URL: https://github.com/apache/hudi/issues/3756#issuecomment-937267572 I think I need to elaborate a little further: 1. If we are to write all database backups into Hudi table in their historical order, then do the live database snapshot and only

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * 4949d3827d33e725b2bbba13b9ce15df160d56ee Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * ca2568692241b16928162899d91241a43de870a9 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * b1c19180582fa6f0139b1a897aba36834a5b408f Azure:

[GitHub] [hudi] helanto commented on a change in pull request #3740: [HUDI-2496] Insert duplicate keys when precombined is deactivated

2021-10-06 Thread GitBox
helanto commented on a change in pull request #3740: URL: https://github.com/apache/hudi/pull/3740#discussion_r723551743 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSourceConsistentInserts.scala ## @@ -0,0 +1,86 @@ +/* +

[GitHub] [hudi] nsivabalan commented on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
nsivabalan commented on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-936736174 Or is your question: if, due to misconfiguration, archival trimmed some commits from the timeline that the cleaner did not get a chance to clean, is there a way to go about cleaning them

[GitHub] [hudi] nsivabalan commented on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
nsivabalan commented on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-936731947 To clarify, archival touches only the timeline and the cleaner touches only the data files.

[GitHub] [hudi] nsivabalan edited a comment on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
nsivabalan edited a comment on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-936729727 Let me illustrate with an example. Archival works with the timeline, whereas the cleaner deals with data files. This difference is important for understanding the interplay here.

[GitHub] [hudi] nsivabalan commented on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
nsivabalan commented on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-936729727 Let me illustrate with an example. Archival works with the timeline, whereas the cleaner deals with data files. This difference is important for understanding the interplay here.
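
The archival/cleaner interplay discussed in this thread can be sketched as a small sanity check on the retention settings. This is an illustrative helper, not Hudi code: the function name `retention_problems` is hypothetical, and the two ordering rules it encodes (archival should keep more commits than the cleaner retains, and the archival maximum should exceed the minimum) are stated here as an assumption about how `hoodie.cleaner.commits.retained`, `hoodie.keep.min.commits`, and `hoodie.keep.max.commits` relate.

```python
def retention_problems(cleaner_commits_retained, keep_min_commits, keep_max_commits):
    """Return a list of likely misconfigurations between cleaner and archival.

    Assumption: archival (timeline) must retain more commits than the
    cleaner (data files) needs, otherwise commits can leave the active
    timeline before the cleaner has processed them.
    """
    problems = []
    if keep_min_commits <= cleaner_commits_retained:
        problems.append(
            "hoodie.keep.min.commits should be greater than "
            "hoodie.cleaner.commits.retained"
        )
    if keep_max_commits <= keep_min_commits:
        problems.append(
            "hoodie.keep.max.commits should be greater than "
            "hoodie.keep.min.commits"
        )
    return problems


# Example values are made up for illustration.
bad = retention_problems(cleaner_commits_retained=10, keep_min_commits=5, keep_max_commits=8)
good = retention_problems(cleaner_commits_retained=3, keep_min_commits=5, keep_max_commits=10)
```

A check like this only flags the ordering of the three settings; it does not touch the timeline or any data files.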

[jira] [Assigned] (HUDI-1370) Scoping work needed to support bootstrap and RFC-15 together

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1370: - Assignee: Vinoth Chandar > Scoping work needed to support bootstrap and RFC-15

[jira] [Resolved] (HUDI-2276) Enable Metadata Table by default for both writers and readers

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan resolved HUDI-2276. --- Resolution: Fixed > Enable Metadata Table by default for both writers and readers >

[jira] [Commented] (HUDI-2276) Enable Metadata Table by default for both writers and readers

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425100#comment-17425100 ] sivabalan narayanan commented on HUDI-2276: --- Enabled metadata by default with

[jira] [Resolved] (HUDI-2476) Fix retried compaction commit in datatable fails when applied to metadata w/ sync updates

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan resolved HUDI-2476. --- Resolution: Fixed > Fix retried compaction commit in datatable fails when applied to

[jira] [Closed] (HUDI-2436) rollback in cloud stores w/o append, wrt collecting failed log files to be deleted/logged

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-2436. - Resolution: Invalid > rollback in cloud stores w/o append, wrt collecting failed log

[jira] [Resolved] (HUDI-2285) Metadata Table Synchronous Design

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan resolved HUDI-2285. --- Fix Version/s: 0.10.0 Resolution: Fixed > Metadata Table Synchronous Design >

[jira] [Commented] (HUDI-2285) Metadata Table Synchronous Design

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425097#comment-17425097 ] sivabalan narayanan commented on HUDI-2285: --- [~pwason]: I am closing this ticket out. Let me

[GitHub] [hudi] bgt-cdedels commented on issue #3739: Hoodie clean is not deleting old files

2021-10-06 Thread GitBox
bgt-cdedels commented on issue #3739: URL: https://github.com/apache/hudi/issues/3739#issuecomment-936650190 @nsivabalan - thanks for the help. If we increase hoodie.keep.max.commits to 10, will that also delete any old commits from the archival timeline when the cleaner is run the next

[jira] [Commented] (HUDI-2285) Metadata Table Synchronous Design

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425069#comment-17425069 ] sivabalan narayanan commented on HUDI-2285: --- The synchronous metadata patch has landed.

[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425066#comment-17425066 ] sivabalan narayanan commented on HUDI-2159: --- With synchronous metadata table design, we can now

[jira] [Resolved] (HUDI-2159) Supporting Clustering and Metadata Table together

2021-10-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan resolved HUDI-2159. --- Resolution: Fixed > Supporting Clustering and Metadata Table together >

[GitHub] [hudi] hudi-bot edited a comment on pull request #3757: [HUDI-2005][WIP] Avoiding direct fs calls in HoodieLogFileReader and AbstractTableFileSystemView

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3757: URL: https://github.com/apache/hudi/pull/3757#issuecomment-936434081 ## CI report: * 7262974f6f78070fbdda2dc2d588894f5c7ca2ef Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3757: [HUDI-2005][WIP] Avoiding direct fs calls in HoodieLogFileReader and AbstractTableFileSystemView

2021-10-06 Thread GitBox
hudi-bot commented on pull request #3757: URL: https://github.com/apache/hudi/pull/3757#issuecomment-936434081 ## CI report: * 7262974f6f78070fbdda2dc2d588894f5c7ca2ef UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[jira] [Updated] (HUDI-2005) Audit and remove references of fs.listStatus() and fs.getFileStatus() or fs.exists()

2021-10-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-2005: - Labels: pull-request-available (was: ) > Audit and remove references of fs.listStatus() and

[GitHub] [hudi] nsivabalan opened a new pull request #3757: [HUDI-2005][WIP] Avoiding fs.getFileStatus call in HoodieLogFileReader

2021-10-06 Thread GitBox
nsivabalan opened a new pull request #3757: URL: https://github.com/apache/hudi/pull/3757 ## What is the purpose of the pull request - Fixing a couple of direct fs calls. - a. HoodieLogFileReader was using fs.getFileStatus on the log file path. Fixed to avoid the direct fs call - b.

[jira] [Updated] (HUDI-2491) hoodie.datasource.hive_sync.mode=hms mode is supported in spark writer option

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2491: - Labels: pull-request-available sev:high (was: pull-request-available) >

[jira] [Updated] (HUDI-2438) [Umbrella] [RFC-34] Implement BigQuerySyncTool for BigQuery Sync

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2438: - Priority: Blocker (was: Major) > [Umbrella] [RFC-34] Implement BigQuerySyncTool for BigQuery

[jira] [Updated] (HUDI-2484) Hive sync not working in HMS mode with DeltaStreamer

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2484: - Labels: pull-request-available sev:critical user-support-issues (was: pull-request-available) >

[jira] [Updated] (HUDI-2319) Integrate hudi with dbt (data build tool)

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2319: - Priority: Blocker (was: Major) > Integrate hudi with dbt (data build tool) >

[jira] [Updated] (HUDI-1748) Read operation will possibility fail on mor table rt view when a write operations is concurrency running

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1748: - Labels: pull-request-available user-support-issues (was: pull-request-available) > Read

[jira] [Resolved] (HUDI-1097) Integration test for prestosql queries

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1097. -- Fix Version/s: (was: 0.10.0) Resolution: Invalid > Integration test for prestosql

[jira] [Updated] (HUDI-1097) Integration test for prestosql queries

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1097: - Status: Open (was: New) > Integration test for prestosql queries >

[jira] [Resolved] (HUDI-1095) Add documentation for prestosql support

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1095. -- Fix Version/s: (was: 0.10.0) Resolution: Invalid > Add documentation for prestosql

[jira] [Resolved] (HUDI-1096) MOR queries support from Prestosql

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1096. -- Resolution: Invalid > MOR queries support from Prestosql > -- >

[jira] [Resolved] (HUDI-1094) Docker demo integration of Prestosql queries

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1094. -- Fix Version/s: (was: 0.10.0) Resolution: Invalid > Docker demo integration of

[jira] [Updated] (HUDI-1096) MOR queries support from Prestosql

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1096: - Fix Version/s: (was: 0.10.0) > MOR queries support from Prestosql >

[jira] [Resolved] (HUDI-1093) Add support for COW tables from Prestosql

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1093. -- Fix Version/s: (was: 0.10.0) Resolution: Invalid > Add support for COW tables from

[jira] [Resolved] (HUDI-1092) Hudi support from prestosql

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1092. -- Fix Version/s: (was: 0.10.0) Resolution: Duplicate We already have a separate one

[jira] [Updated] (HUDI-868) [UMBRELLA] Insert Overwrite API

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-868: Fix Version/s: (was: 0.10.0) > [UMBRELLA] Insert Overwrite API > ---

[jira] [Updated] (HUDI-1042) [Umbrella] Support clustering on filegroups

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1042: - Fix Version/s: (was: 0.10.0) > [Umbrella] Support clustering on filegroups >

[jira] [Updated] (HUDI-1294) Implement inlining of HFile Data Blocks in metadata table log

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1294: - Priority: Blocker (was: Major) > Implement inlining of HFile Data Blocks in metadata table log >

[jira] [Updated] (HUDI-512) Decouple logical partitioning from physical one.

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-512: Parent: HUDI-1822 Issue Type: Sub-task (was: Improvement) > Decouple logical partitioning

[jira] [Updated] (HUDI-52) Implement Savepoints for Merge On Read table #88

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-52: --- Fix Version/s: (was: 0.10.0) 0.11.0 > Implement Savepoints for Merge On Read

[jira] [Commented] (HUDI-52) Implement Savepoints for Merge On Read table #88

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425007#comment-17425007 ] Vinoth Chandar commented on HUDI-52: No, we just need to think it through a bit and get it done. The design

[jira] [Commented] (HUDI-1683) When using hudi on flink write data to the HDFS ClassCastException: scala.Tuple2 always be cast to org.apache.hudi.common.util.collection.Pair

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425006#comment-17425006 ] Vinoth Chandar commented on HUDI-1683: -- [~MengYao] is this still a valid issue? > When using hudi on

[jira] [Commented] (HUDI-860) Ability to do small file handling without need for caching

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425003#comment-17425003 ] Vinoth Chandar commented on HUDI-860: - [~guoyihua] this is a good one to get started on the

[jira] [Assigned] (HUDI-860) Ability to do small file handling without need for caching

2021-10-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-860: --- Assignee: Ethan Guo (was: Vinoth Chandar) > Ability to do small file handling without need

[jira] [Assigned] (HUDI-2390) KeyGenerator discrepancy between DataFrame writer and SQL

2021-10-06 Thread Yann Byron (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yann Byron reassigned HUDI-2390: Assignee: Yann Byron > KeyGenerator discrepancy between DataFrame writer and SQL >

[GitHub] [hudi] Ambarish-Giri commented on issue #3605: [SUPPORT]Hudi Inserts and Upserts for MoR and CoW tables are taking very long time.

2021-10-06 Thread GitBox
Ambarish-Giri commented on issue #3605: URL: https://github.com/apache/hudi/issues/3605#issuecomment-936162315 Hi @nsivabalan, I analysed the Hudi code as well to check whether there is any room for improvement but couldn't find much. Let me know if there are any updates from your end.

[GitHub] [hudi] sannidhiteredesai commented on issue #2265: Arrays with nulls in them result in broken parquet files

2021-10-06 Thread GitBox
sannidhiteredesai commented on issue #2265: URL: https://github.com/apache/hudi/issues/2265#issuecomment-936123669 @n3nash, @nsivabalan, is this fix available in Hudi 0.9.0? Because as per [faq

[GitHub] [hudi] govorunov opened a new issue #3756: [SUPPORT] Can we use Hudi to build Temporal Datastore?

2021-10-06 Thread GitBox
govorunov opened a new issue #3756: URL: https://github.com/apache/hudi/issues/3756 Hi, I read all the documentation and FAQ and got a feeling Hudi is (almost) the right tool for what I'm trying to build, still unable to design the right solution: We need to build a

[GitHub] [hudi] fengjian428 commented on issue #3755: [Delta Streamer] file name mismatch with meta when compaction running

2021-10-06 Thread GitBox
fengjian428 commented on issue #3755: URL: https://github.com/apache/hudi/issues/3755#issuecomment-935750802 This table was created by Delta Streamer's SqlSource from another table, but when ingesting real-time data from Kafka with kafkasource, the compaction does not work; I need to shut down

[GitHub] [hudi] hudi-bot edited a comment on pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3743: URL: https://github.com/apache/hudi/pull/3743#issuecomment-932639881 ## CI report: * eec99609d3670d390be3bb8e1da6b6aacec10168 UNKNOWN * 21f5296fc6ec0e4b2de6e22762f3dd189c878e01 UNKNOWN *

[GitHub] [hudi] rmahindra123 commented on pull request #3753: [HUDI-2510] Added a quickstart redirect page to fix broken external links in GCP docs

2021-10-06 Thread GitBox
rmahindra123 commented on pull request #3753: URL: https://github.com/apache/hudi/pull/3753#issuecomment-935700221 lgtm. Thanks @vingov for the quick turnaround

[GitHub] [hudi] leesf merged pull request #3693: [HUDI-2456] support 'show partitions' sql

2021-10-06 Thread GitBox
leesf merged pull request #3693: URL: https://github.com/apache/hudi/pull/3693 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[hudi] branch master updated: [HUDI-2456] support 'show partitions' sql (#3693)

2021-10-06 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new e91e611 [HUDI-2456] support 'show partitions' sql

[GitHub] [hudi] hudi-bot edited a comment on pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3743: URL: https://github.com/apache/hudi/pull/3743#issuecomment-932639881 ## CI report: * 3464963ae47ca4bddc4c57f5c1f9e14c2d87b318 Azure:

[GitHub] [hudi] MikeBuh commented on issue #3751: [SUPPORT] Slow Write Speeds to Hudi

2021-10-06 Thread GitBox
MikeBuh commented on issue #3751: URL: https://github.com/apache/hudi/issues/3751#issuecomment-935643291 The reason we opted for COW is that other tools we are using (namely Athena) have much better support for COW tables than for MOR tables. -- This is an automated message from the

[GitHub] [hudi] hudi-bot edited a comment on pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3743: URL: https://github.com/apache/hudi/pull/3743#issuecomment-932639881 ## CI report: * 3464963ae47ca4bddc4c57f5c1f9e14c2d87b318 Azure:

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722944964 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/UpgradeDowngrade.java ## @@ -0,0 +1,179 @@ +/* + * Licensed to the

[GitHub] [hudi] hudi-bot edited a comment on pull request #3416: [HUDI-2362] Add external config file support

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3416: URL: https://github.com/apache/hudi/pull/3416#issuecomment-893712830 ## CI report: * b1c19180582fa6f0139b1a897aba36834a5b408f Azure:

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722939258 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/UpgradeDowngrade.java ## @@ -0,0 +1,179 @@ +/* + * Licensed to the

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722938371 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/FlinkTaskContextSupplier.java ## @@ -62,4 +64,9 @@ public RuntimeContext

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722937716 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/UpgradeDowngrade.java ## @@ -0,0 +1,179 @@ +/* + * Licensed to the

[GitHub] [hudi] hudi-bot edited a comment on pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3741: URL: https://github.com/apache/hudi/pull/3741#issuecomment-931660346 ## CI report: * afa26cb34e05fd49056b2e072457b3d92bacaa91 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
hudi-bot edited a comment on pull request #3743: URL: https://github.com/apache/hudi/pull/3743#issuecomment-932639881 ## CI report: * 3464963ae47ca4bddc4c57f5c1f9e14c2d87b318 Azure:

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722936410 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackUtils.java ## @@ -120,21 +120,24 @@ static

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722936242 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/marker/WriteMarkersFactory.java ## @@ -34,24 +36,28 @@ private static

[GitHub] [hudi] yihua commented on a change in pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-06 Thread GitBox
yihua commented on a change in pull request #3743: URL: https://github.com/apache/hudi/pull/3743#discussion_r722935917 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/ListingBasedRollbackStrategy.java ## @@ -63,7 +63,8 @@

[GitHub] [hudi] fengjian428 opened a new issue #3755: [Delta Streamer] file name mismatch with meta when compaction running

2021-10-06 Thread GitBox
fengjian428 opened a new issue #3755: URL: https://github.com/apache/hudi/issues/3755 Environment: Hudi 0.9, HBase 1.4.12. When I run Delta Streamer (version 0.9) to ingest data from Kafka into an HBase-indexed MOR table, after a few commits I hit this error while compaction was running
