[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* 5a724c6c859d67980473db571c9a90b8babcf710 UNKNOWN
* 22664f2385572b82dbfc4f1316a063308e647735 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2492) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2493)
* 5f444fa98c3f1dbeac6fa6f9c1af98adc81f UNKNOWN

## Bot commands

@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [hudi] xushiyan commented on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
xushiyan commented on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932869164

@hudi-bot run azure
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* 5a724c6c859d67980473db571c9a90b8babcf710 UNKNOWN
* 22664f2385572b82dbfc4f1316a063308e647735 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2492)
* 5f444fa98c3f1dbeac6fa6f9c1af98adc81f UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* 5a724c6c859d67980473db571c9a90b8babcf710 UNKNOWN
* 22664f2385572b82dbfc4f1316a063308e647735 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2492)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* be214ea66c7cdb5a5a0aee320db05bea336c39d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2491)
* 5a724c6c859d67980473db571c9a90b8babcf710 UNKNOWN
* 22664f2385572b82dbfc4f1316a063308e647735 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2492)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* be214ea66c7cdb5a5a0aee320db05bea336c39d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2491)
* 5a724c6c859d67980473db571c9a90b8babcf710 UNKNOWN
* 22664f2385572b82dbfc4f1316a063308e647735 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* be214ea66c7cdb5a5a0aee320db05bea336c39d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2491)
* 5a724c6c859d67980473db571c9a90b8babcf710 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* be214ea66c7cdb5a5a0aee320db05bea336c39d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2491)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3745: [HUDI-2514] Add default hiveTableSerdeProperties for Spark SQL when sync Hive
hudi-bot edited a comment on pull request #3745:
URL: https://github.com/apache/hudi/pull/3745#issuecomment-932858096

## CI report:

* b84020e26d7990ce07fb7f6d821801806084e833 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2490)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* f4d1c821d7f4540c52e457734143b320455af802 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2489)
* be214ea66c7cdb5a5a0aee320db05bea336c39d0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2491)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* f4d1c821d7f4540c52e457734143b320455af802 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2489)
* be214ea66c7cdb5a5a0aee320db05bea336c39d0 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3745: [HUDI-2514] Add default hiveTableSerdeProperties for Spark SQL when sync Hive
hudi-bot edited a comment on pull request #3745:
URL: https://github.com/apache/hudi/pull/3745#issuecomment-932858096

## CI report:

* b84020e26d7990ce07fb7f6d821801806084e833 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2490)
[GitHub] [hudi] hudi-bot commented on pull request #3745: [HUDI-2514] Add default hiveTableSerdeProperties for Spark SQL when sync Hive
hudi-bot commented on pull request #3745:
URL: https://github.com/apache/hudi/pull/3745#issuecomment-932858096

## CI report:

* b84020e26d7990ce07fb7f6d821801806084e833 UNKNOWN
[jira] [Updated] (HUDI-2514) Add default hiveTableSerdeProperties for Spark SQL when sync Hive
[ https://issues.apache.org/jira/browse/HUDI-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-2514:
---------------------------------
    Labels: pull-request-available  (was: )

> Add default hiveTableSerdeProperties for Spark SQL when sync Hive
> -----------------------------------------------------------------
>
>                 Key: HUDI-2514
>                 URL: https://issues.apache.org/jira/browse/HUDI-2514
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] dongkelun opened a new pull request #3745: [HUDI-2514] Add default hiveTableSerdeProperties for Spark SQL when sync Hive
dongkelun opened a new pull request #3745:
URL: https://github.com/apache/hudi/pull/3745

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*

## What is the purpose of the pull request

*Add default hiveTableSerdeProperties for Spark SQL when sync Hive*

## Brief change log

- *Add default hiveTableSerdeProperties for Spark SQL when sync Hive*
- *Code optimization/code formatting*

## Verify this pull request

This pull request is a trivial rework / code cleanup without any test coverage.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[jira] [Created] (HUDI-2514) Add default hiveTableSerdeProperties for Spark SQL when sync Hive
董可伦 created HUDI-2514:
--------------------------

             Summary: Add default hiveTableSerdeProperties for Spark SQL when sync Hive
                 Key: HUDI-2514
                 URL: https://issues.apache.org/jira/browse/HUDI-2514
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Spark Integration
            Reporter: 董可伦
            Assignee: 董可伦
             Fix For: 0.10.0
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* f4d1c821d7f4540c52e457734143b320455af802 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2489)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot edited a comment on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* f4d1c821d7f4540c52e457734143b320455af802 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2489)
[GitHub] [hudi] hudi-bot commented on pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
hudi-bot commented on pull request #3744:
URL: https://github.com/apache/hudi/pull/3744#issuecomment-932845780

## CI report:

* f4d1c821d7f4540c52e457734143b320455af802 UNKNOWN
[jira] [Updated] (HUDI-2108) Flaky test: TestHoodieBackedMetadata.testOnlyValidPartitionsAdded:210
[ https://issues.apache.org/jira/browse/HUDI-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-2108:
---------------------------------
    Labels: pull-request-available  (was: )

> Flaky test: TestHoodieBackedMetadata.testOnlyValidPartitionsAdded:210
> ---------------------------------------------------------------------
>
>                 Key: HUDI-2108
>                 URL: https://issues.apache.org/jira/browse/HUDI-2108
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Vinoth Chandar
>            Assignee: Raymond Xu
>            Priority: Major
>              Labels: pull-request-available
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=357=logs=864947d5-8fca-5138-8394-999ccb212a1e=552b4d2f-26d5-5f2f-1d5d-e8229058b632
[GitHub] [hudi] xushiyan opened a new pull request #3744: [HUDI-2108] Fix flakiness in TestHoodieBackedMetadata
xushiyan opened a new pull request #3744:
URL: https://github.com/apache/hudi/pull/3744

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[jira] [Updated] (HUDI-2108) Flaky test: TestHoodieBackedMetadata.testOnlyValidPartitionsAdded:210
[ https://issues.apache.org/jira/browse/HUDI-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-2108:
-----------------------------
    Status: In Progress  (was: Open)

> Flaky test: TestHoodieBackedMetadata.testOnlyValidPartitionsAdded:210
> ---------------------------------------------------------------------
>
>                 Key: HUDI-2108
>                 URL: https://issues.apache.org/jira/browse/HUDI-2108
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Vinoth Chandar
>            Assignee: Raymond Xu
>            Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=357=logs=864947d5-8fca-5138-8394-999ccb212a1e=552b4d2f-26d5-5f2f-1d5d-e8229058b632
[jira] [Assigned] (HUDI-2108) Flaky test: TestHoodieBackedMetadata.testOnlyValidPartitionsAdded:210
[ https://issues.apache.org/jira/browse/HUDI-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu reassigned HUDI-2108:
--------------------------------
    Assignee: Raymond Xu  (was: Vinoth Chandar)

> Flaky test: TestHoodieBackedMetadata.testOnlyValidPartitionsAdded:210
> ---------------------------------------------------------------------
>
>                 Key: HUDI-2108
>                 URL: https://issues.apache.org/jira/browse/HUDI-2108
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Vinoth Chandar
>            Assignee: Raymond Xu
>            Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=357=logs=864947d5-8fca-5138-8394-999ccb212a1e=552b4d2f-26d5-5f2f-1d5d-e8229058b632
[GitHub] [hudi] vingov removed a comment on issue #2934: [SUPPORT] Parquet file does not exist when trying to read hudi table incrementally
vingov removed a comment on issue #2934:
URL: https://github.com/apache/hudi/issues/2934#issuecomment-932829205

@t0il3ts0ap - sure, I will confirm in 2 days after checking with @jsbali. He already has the changes in our internal codebase, he just needs to upstream it.
[GitHub] [hudi] vingov commented on issue #2934: [SUPPORT] Parquet file does not exist when trying to read hudi table incrementally
vingov commented on issue #2934:
URL: https://github.com/apache/hudi/issues/2934#issuecomment-932829205

@t0il3ts0ap - sure, I will confirm in 2 days after checking with @jsbali. He already has the changes in our internal codebase, he just needs to upstream it.
[jira] [Resolved] (HUDI-1362) Make deltastreamer support insert_overwrite
[ https://issues.apache.org/jira/browse/HUDI-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar resolved HUDI-1362.
----------------------------------
    Resolution: Implemented

Closing since this is already fixed.

> Make deltastreamer support insert_overwrite
> -------------------------------------------
>
>                 Key: HUDI-1362
>                 URL: https://issues.apache.org/jira/browse/HUDI-1362
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: liujinhui
>            Priority: Major
[jira] [Updated] (HUDI-1362) Make deltastreamer support insert_overwrite
[ https://issues.apache.org/jira/browse/HUDI-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1362:
---------------------------------
    Status: Open  (was: New)

> Make deltastreamer support insert_overwrite
> -------------------------------------------
>
>                 Key: HUDI-1362
>                 URL: https://issues.apache.org/jira/browse/HUDI-1362
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: liujinhui
>            Priority: Major
[jira] [Updated] (HUDI-1355) Allowing multipleSourceOrdering fields for doing the preCombine on payload
[ https://issues.apache.org/jira/browse/HUDI-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1355:
---------------------------------
    Status: Open  (was: New)

> Allowing multipleSourceOrdering fields for doing the preCombine on payload
> --------------------------------------------------------------------------
>
>                 Key: HUDI-1355
>                 URL: https://issues.apache.org/jira/browse/HUDI-1355
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Common Core, Utilities
>    Affects Versions: 0.9.0
>            Reporter: Bala Mahesh Jampani
>            Priority: Major
>              Labels: new-to-hudi, patch, starter
>             Fix For: 0.10.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi,
> I have come across a use case where some of the incoming events have the same timestamp for the insert and the update event. In this case I want to depend on another field for ordering. In simple terms, if the primary sort ties, I want to do a secondary sort based on another field; if that too ties, go to the next field, and so on. It would be good if Hudi had this functionality.
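The request above amounts to a lexicographic tie-break across several ordering fields during preCombine. A minimal Java sketch of the idea follows; the record shape and field names (`ts`, `seqNo`) are hypothetical illustrations, not the actual Hudi payload API:

```java
import java.util.Comparator;

public class MultiFieldOrdering {
    // Hypothetical event record: a primary timestamp plus a secondary sequence number.
    static final class Event {
        final String key;
        final long ts;     // primary ordering field
        final long seqNo;  // secondary ordering field, consulted only when ts ties
        Event(String key, long ts, long seqNo) {
            this.key = key;
            this.ts = ts;
            this.seqNo = seqNo;
        }
    }

    // Lexicographic ordering: compare ts first, fall back to seqNo on a tie.
    static final Comparator<Event> ORDERING =
            Comparator.<Event>comparingLong(e -> e.ts)
                      .thenComparingLong(e -> e.seqNo);

    // preCombine-style pick: keep the "latest" of two events for the same key.
    static Event preCombine(Event a, Event b) {
        return ORDERING.compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        Event insert = new Event("k1", 1000L, 1L);
        Event update = new Event("k1", 1000L, 2L); // same timestamp, later sequence number
        System.out.println(preCombine(insert, update).seqNo); // prints 2: the update wins the tie
    }
}
```

Extending to more fields is just another `thenComparing` step, which is what makes the "if that too ties, go to the next field" behavior cheap to express.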
[jira] [Updated] (HUDI-1079) Cannot upsert on schema with Array of Record with single field
[ https://issues.apache.org/jira/browse/HUDI-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1079:
---------------------------------
    Status: Open  (was: New)

> Cannot upsert on schema with Array of Record with single field
> --------------------------------------------------------------
>
>                 Key: HUDI-1079
>                 URL: https://issues.apache.org/jira/browse/HUDI-1079
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Spark Integration
>    Affects Versions: 0.9.0
>         Environment: spark 2.4.4, local
>            Reporter: Adrian Tanase
>            Priority: Critical
>              Labels: schema, sev:critical, user-support-issues
>             Fix For: 0.10.0
>
> I am trying to trigger upserts on a table that has an array field with records of just one field.
> Here is the code to reproduce:
> {code:scala}
> val spark = SparkSession.builder()
>   .master("local[1]")
>   .appName("SparkByExamples.com")
>   .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>   .getOrCreate()
>
> // https://sparkbyexamples.com/spark/spark-dataframe-array-of-struct/
> val arrayStructData = Seq(
>   Row("James", List(Row("Java", "XX", 120), Row("Scala", "XA", 300))),
>   Row("Michael", List(Row("Java", "XY", 200), Row("Scala", "XB", 500))),
>   Row("Robert", List(Row("Java", "XZ", 400), Row("Scala", "XC", 250))),
>   Row("Washington", null)
> )
>
> val arrayStructSchema = new StructType()
>   .add("name", StringType)
>   .add("booksIntersted", ArrayType(
>     new StructType()
>       .add("bookName", StringType)
>       // .add("author", StringType)
>       // .add("pages", IntegerType)
>   ))
>
> val df = spark.createDataFrame(spark.sparkContext.parallelize(arrayStructData), arrayStructSchema)
> {code}
> Running insert followed by upsert will fail:
> {code:scala}
> df.write
>   .format("hudi")
>   .options(getQuickstartWriteConfigs)
>   .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "name")
>   .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "name")
>   .option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, "COPY_ON_WRITE")
>   .option(HoodieWriteConfig.TABLE_NAME, tableName)
>   .mode(Overwrite)
>   .save(basePath)
>
> df.write
>   .format("hudi")
>   .options(getQuickstartWriteConfigs)
>   .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "name")
>   .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "name")
>   .option(HoodieWriteConfig.TABLE_NAME, tableName)
>   .mode(Append)
>   .save(basePath)
> {code}
> If I create the books record with all the fields (at least 2), it works as expected.
> The relevant part of the exception is this:
> {noformat}
> Caused by: java.lang.ClassCastException: required binary bookName (UTF8) is not a group
> 	at org.apache.parquet.schema.Type.asGroupType(Type.java:207)
> 	at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:279)
> 	at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:232)
> 	at org.apache.parquet.avro.AvroRecordConverter.access$100(AvroRecordConverter.java:78)
> 	at org.apache.parquet.avro.AvroRecordConverter$AvroCollectionConverter$ElementConverter.<init>(AvroRecordConverter.java:536)
> 	at org.apache.parquet.avro.AvroRecordConverter$AvroCollectionConverter.<init>(AvroRecordConverter.java:486)
> 	at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:289)
> 	at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:141)
> 	at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:95)
> 	at org.apache.parquet.avro.AvroRecordMaterializer.<init>(AvroRecordMaterializer.java:33)
> 	at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:138)
> 	at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:183)
> 	at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
> 	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
> 	at org.apache.hudi.client.utils.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:49)
> 	at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:45)
> 	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:92)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	... 4 more
> {noformat}
> Another way to test is by changing the generated data in the tips to just the amount, by dropping the currency on the tips_history field, tests will start
[jira] [Updated] (HUDI-893) Add spark datasource V2 reader support for Hudi tables
[ https://issues.apache.org/jira/browse/HUDI-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-893:
--------------------------------
    Status: Open  (was: New)

> Add spark datasource V2 reader support for Hudi tables
> ------------------------------------------------------
>
>                 Key: HUDI-893
>                 URL: https://issues.apache.org/jira/browse/HUDI-893
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Spark Integration
>            Reporter: Nishith Agarwal
>            Assignee: Nan Zhu
>            Priority: Major
[jira] [Updated] (HUDI-1341) hudi cli command such as rollback 、bootstrap support spark sql implement
[ https://issues.apache.org/jira/browse/HUDI-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1341:
---------------------------------
    Status: Open  (was: New)

> hudi cli command such as rollback 、bootstrap support spark sql implement
> -------------------------------------------------------------------------
>
>                 Key: HUDI-1341
>                 URL: https://issues.apache.org/jira/browse/HUDI-1341
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Spark Integration
>            Reporter: liwei
>            Assignee: liwei
>            Priority: Major
>
> Right now, commands such as rollback and bootstrap need to use the Hudi CLI. Some users would rather use Spark SQL or the Spark code API.
[jira] [Updated] (HUDI-1237) [UMBRELLA] Checkstyle, formatting, warnings, spotless
[ https://issues.apache.org/jira/browse/HUDI-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1237:
---------------------------------
    Priority: Major  (was: Blocker)

> [UMBRELLA] Checkstyle, formatting, warnings, spotless
> -----------------------------------------------------
>
>                 Key: HUDI-1237
>                 URL: https://issues.apache.org/jira/browse/HUDI-1237
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Code Cleanup
>            Reporter: sivabalan narayanan
>            Assignee: leesf
>            Priority: Major
>              Labels: gsoc, gsoc2021, hudi-umbrellas, mentor
>
> Umbrella ticket to track all tickets related to checkstyle, spotless, warnings etc.
[jira] [Updated] (HUDI-1237) [UMBRELLA] Checkstyle, formatting, warnings, spotless
[ https://issues.apache.org/jira/browse/HUDI-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1237:
---------------------------------
    Fix Version/s:     (was: 0.10.0)

> [UMBRELLA] Checkstyle, formatting, warnings, spotless
> -----------------------------------------------------
>
>                 Key: HUDI-1237
>                 URL: https://issues.apache.org/jira/browse/HUDI-1237
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Code Cleanup
>            Reporter: sivabalan narayanan
>            Assignee: leesf
>            Priority: Blocker
>              Labels: gsoc, gsoc2021, hudi-umbrellas, mentor
>
> Umbrella ticket to track all tickets related to checkstyle, spotless, warnings etc.
[jira] [Updated] (HUDI-1237) [UMBRELLA] Checkstyle, formatting, warnings, spotless
[ https://issues.apache.org/jira/browse/HUDI-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1237:
---------------------------------
    Status: Open  (was: New)
[jira] [Updated] (HUDI-1500) Support incrementally reading clustering commit via Spark Datasource/DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1500:
---------------------------------
    Status: Open  (was: New)

> Support incrementally reading clustering commit via Spark Datasource/DeltaStreamer
> ----------------------------------------------------------------------------------
>
>                 Key: HUDI-1500
>                 URL: https://issues.apache.org/jira/browse/HUDI-1500
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: DeltaStreamer, Spark Integration
>            Reporter: liwei
>            Assignee: satish
>            Priority: Blocker
>             Fix For: 0.10.0
>
> Currently, DeltaSync.readFromSource() cannot read the last instant when it is a replace commit, such as one produced by clustering.
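The gap described above can be sketched with a toy model: an incremental reader that filters the timeline by action type must accept a trailing "replacecommit" (e.g. from clustering) just like a regular commit, or that batch silently disappears from the incremental view. This is an illustrative model only, not Hudi's actual timeline API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class IncrementalTimeline {

  // An instant is modeled as "timestamp:action", e.g. "003:replacecommit".
  // Returns instants after the checkpoint whose action is readable.
  public static List<String> readableSince(List<String> timeline, String checkpoint) {
    List<String> out = new ArrayList<>();
    for (String instant : timeline) {
      String ts = instant.split(":")[0];
      String action = instant.split(":")[1];
      boolean readable = action.equals("commit")
          || action.equals("deltacommit")
          || action.equals("replacecommit"); // must be included, per HUDI-1500
      if (ts.compareTo(checkpoint) > 0 && readable) {
        out.add(instant);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> timeline = List.of("001:commit", "002:deltacommit", "003:replacecommit");
    // With checkpoint "001", the clustering instant "003" must be visible.
    System.out.println(readableSince(timeline, "001"));
  }
}
```

A filter that only accepted "commit"/"deltacommit" would drop "003:replacecommit" here, which is the symptom the ticket reports.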
[jira] [Updated] (HUDI-864) parquet schema conflict: optional binary (UTF8) is not a group
[ https://issues.apache.org/jira/browse/HUDI-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-864:
--------------------------------
    Status: Open  (was: New)

> parquet schema conflict: optional binary (UTF8) is not a group
> --------------------------------------------------------------
>
>                 Key: HUDI-864
>                 URL: https://issues.apache.org/jira/browse/HUDI-864
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core, Spark Integration
>    Affects Versions: 0.5.2, 0.6.0, 0.5.3, 0.7.0, 0.8.0, 0.9.0
>            Reporter: Roland Johann
>            Priority: Blocker
>              Labels: sev:critical, user-support-issues
>             Fix For: 0.10.0
>
> When dealing with struct types like this
> {code:json}
> {
>   "type": "struct",
>   "fields": [
>     {
>       "name": "categoryResults",
>       "type": {
>         "type": "array",
>         "elementType": {
>           "type": "struct",
>           "fields": [
>             {
>               "name": "categoryId",
>               "type": "string",
>               "nullable": true,
>               "metadata": {}
>             }
>           ]
>         },
>         "containsNull": true
>       },
>       "nullable": true,
>       "metadata": {}
>     }
>   ]
> }
> {code}
> the second ingest batch throws this exception:
> {code}
> ERROR [Executor task launch worker for task 15] commit.BaseCommitActionExecutor (BaseCommitActionExecutor.java:264) - Error upserting bucketType UPDATE for partition :0
> org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: operation has failed
>   at org.apache.hudi.table.action.commit.CommitActionExecutor.handleUpdateInternal(CommitActionExecutor.java:100)
>   at org.apache.hudi.table.action.commit.CommitActionExecutor.handleUpdate(CommitActionExecutor.java:76)
>   at org.apache.hudi.table.action.deltacommit.DeltaCommitActionExecutor.handleUpdate(DeltaCommitActionExecutor.java:73)
>   at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.handleUpsertPartition(BaseCommitActionExecutor.java:258)
>   at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.handleInsertPartition(BaseCommitActionExecutor.java:271)
>   at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.lambda$execute$caffe4c4$1(BaseCommitActionExecutor.java:104)
>   at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
>   at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$25.apply(RDD.scala:853)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
>   at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
>   at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
>   at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
>   at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
>   at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
>   at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
>   at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>   at org.apache.spark.scheduler.Task.run(Task.scala:123)
>   at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at
[jira] [Updated] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1015:
---------------------------------
    Status: Open  (was: New)

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> -------------------------------------------------------------------
>
>                 Key: HUDI-1015
>                 URL: https://issues.apache.org/jira/browse/HUDI-1015
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Common Core, Writer Core
>            Reporter: Vinoth Chandar
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.10.0
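The concern behind this ticket — keeping a full partition listing off the hot path — can be illustrated with a minimal memoization sketch. This is a hypothetical model, not Hudi's internals: the class and method names below are invented for illustration, and the listing is a stub standing in for the expensive recursive FileSystem walk that getAllPartitionPaths() performs.

```java
import java.util.List;

public class PartitionPathCache {
  private List<String> cached;   // populated lazily, once
  private int listCalls = 0;     // instrumentation for the example

  // Stub for the expensive storage walk done by getAllPartitionPaths().
  private List<String> expensiveListing() {
    listCalls++;
    return List.of("2021/10/01", "2021/10/02", "2021/10/03");
  }

  // Callers on the fast path hit the cache; only the first call pays
  // for the listing.
  public synchronized List<String> getAllPartitionPaths() {
    if (cached == null) {
      cached = expensiveListing();
    }
    return cached;
  }

  public int listingCallCount() {
    return listCalls;
  }

  public static void main(String[] args) {
    PartitionPathCache cache = new PartitionPathCache();
    cache.getAllPartitionPaths();
    cache.getAllPartitionPaths(); // served from cache; no second listing
    System.out.println(cache.listingCallCount()); // prints 1
  }
}
```

The audit the ticket asks for amounts to finding every call site that pays for `expensiveListing()` on each invocation and routing it through a cached view like this instead.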
[jira] [Updated] (HUDI-1492) Handle DeltaWriteStat correctly for storage schemes that support appends
[ https://issues.apache.org/jira/browse/HUDI-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1492:
---------------------------------
    Status: Open  (was: New)

> Handle DeltaWriteStat correctly for storage schemes that support appends
> ------------------------------------------------------------------------
>
>                 Key: HUDI-1492
>                 URL: https://issues.apache.org/jira/browse/HUDI-1492
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Vinoth Chandar
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.10.0
>
> The current implementation simply uses
> {code:java}
> String pathWithPartition = hoodieWriteStat.getPath();
> {code}
> to write the metadata table. This is problematic if the delta write was merely an append, and can technically add duplicate files into the metadata table.
> (Not sure if this is a problem per se, but filing a Jira to track and either close/fix.)
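The duplicate-file concern above can be sketched in isolation: on an append-capable store, two write stats may point at the same log file, and naively collecting every stat path would register that file twice in the metadata table. A minimal fix is to de-duplicate by path before registering. This is an illustrative sketch, not Hudi's actual metadata-table code; the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MetadataFileList {

  // De-duplicate write-stat paths; LinkedHashSet keeps first-seen order
  // while dropping repeats, so each physical file is listed exactly once.
  public static Set<String> uniqueFiles(List<String> writeStatPaths) {
    return new LinkedHashSet<>(writeStatPaths);
  }

  public static void main(String[] args) {
    List<String> statPaths = List.of(
        "2021/10/01/.f1_001.log.1",  // first append to the log file
        "2021/10/01/.f1_001.log.1",  // second append, same file
        "2021/10/01/f2_001.parquet");
    // Two distinct files survive, even though three stats were reported.
    System.out.println(uniqueFiles(statPaths));
  }
}
```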
[jira] [Updated] (HUDI-1180) Upgrade HBase to 2.3.3
[ https://issues.apache.org/jira/browse/HUDI-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1180:
---------------------------------
    Labels: sev:critical  (was: )

> Upgrade HBase to 2.3.3
> ----------------------
>
>                 Key: HUDI-1180
>                 URL: https://issues.apache.org/jira/browse/HUDI-1180
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Writer Core
>    Affects Versions: 0.9.0
>            Reporter: Wenning Ding
>            Priority: Blocker
>              Labels: sev:critical
>             Fix For: 0.10.0
>
> Trying to upgrade HBase to 2.3.3 but ran into several issues.
> According to the Hadoop version support matrix ([http://hbase.apache.org/book.html#hadoop]), we also need to upgrade Hadoop to 2.8.5+.
>
> There are several API conflicts between HBase 2.2.3 and HBase 1.2.3, which we need to resolve first. After resolving the conflicts, I am able to compile, but I then ran into a tricky Jetty version issue during testing:
> {code:java}
> [ERROR] TestHBaseIndex.testDelete()  Time elapsed: 4.705 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] TestHBaseIndex.testSimpleTagLocationAndUpdate()  Time elapsed: 0.174 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] TestHBaseIndex.testSimpleTagLocationAndUpdateWithRollback()  Time elapsed: 0.076 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] TestHBaseIndex.testSmallBatchSize()  Time elapsed: 0.122 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] TestHBaseIndex.testTagLocationAndDuplicateUpdate()  Time elapsed: 0.16 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] TestHBaseIndex.testTotalGetsBatching()  Time elapsed: 1.771 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] TestHBaseIndex.testTotalPutsBatching()  Time elapsed: 0.082 s  <<< ERROR!
> java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> 34206 [Thread-260] WARN org.apache.hadoop.hdfs.server.datanode.DirectoryScanner - DirectoryScanner: shutdown has been called
> 34240 [BP-1058834949-10.0.0.2-1597189606506 heartbeating to localhost/127.0.0.1:55924] WARN org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager - IncrementalBlockReportManager interrupted
> 34240 [BP-1058834949-10.0.0.2-1597189606506 heartbeating to localhost/127.0.0.1:55924] WARN org.apache.hadoop.hdfs.server.datanode.DataNode - Ending block pool service for: Block pool BP-1058834949-10.0.0.2-1597189606506 (Datanode Uuid cb7bd8aa-5d79-4955-b1ec-bdaf7f1b6431) service to localhost/127.0.0.1:55924
> 34246 [refreshUsed-/private/var/folders/98/mxq3vc_n6l5728rf1wmcwrqs52lpwg/T/temp1791820148926982977/dfs/data/data1/current/BP-1058834949-10.0.0.2-1597189606506] WARN org.apache.hadoop.fs.CachingGetSpaceUsed - Thread Interrupted waiting to refresh disk information: sleep interrupted
> 34247 [refreshUsed-/private/var/folders/98/mxq3vc_n6l5728rf1wmcwrqs52lpwg/T/temp1791820148926982977/dfs/data/data2/current/BP-1058834949-10.0.0.2-1597189606506] WARN org.apache.hadoop.fs.CachingGetSpaceUsed - Thread Interrupted waiting to refresh disk information: sleep interrupted
> 37192 [HBase-Metrics2-1] WARN org.apache.hadoop.metrics2.impl.MetricsConfig - Cannot locate configuration: tried hadoop-metrics2-datanode.properties,hadoop-metrics2.properties
> 43904 [master/iad1-ws-cor-r12:0:becomeActiveMaster-SendThread(localhost:58768)] WARN org.apache.zookeeper.ClientCnxn - Session 0x173dfeb0c8b0004 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> [INFO]
> [INFO] Results:
> [INFO]
> [ERROR] Errors:
> [ERROR] org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR] org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
> [ERROR]
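A NoSuchMethodError like the SessionHandler.setHttpOnly(Z)V failure above typically means two incompatible Jetty versions ended up on the test classpath (here, HBase's Jetty shadowing the one Hudi expects). One common Maven remedy is to exclude Jetty from the conflicting dependency so a single version wins. The fragment below is a hypothetical sketch only: the artifact chosen, the wildcard exclusion, and the version are assumptions for illustration, not the fix actually adopted for this ticket.

```xml
<!-- Hypothetical sketch: keep HBase's bundled Jetty off the test classpath
     so it cannot shadow the Jetty version the rest of the build expects.
     Coordinates and version here are illustrative assumptions. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>2.3.3</version>
  <scope>test</scope>
  <exclusions>
    <exclusion>
      <groupId>org.eclipse.jetty</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree -Dincludes=org.eclipse.jetty` before and after such a change is a standard way to confirm only one Jetty version remains.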
[jira] [Updated] (HUDI-1180) Upgrade HBase to 2.3.3
[ https://issues.apache.org/jira/browse/HUDI-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1180:
---------------------------------
    Status: Open  (was: New)
[jira] [Updated] (HUDI-1180) Upgrade HBase to 2.3.3
[ https://issues.apache.org/jira/browse/HUDI-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1180:
---------------------------------
    Component/s: Writer Core
[jira] [Assigned] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar reassigned HUDI-1015:
------------------------------------
    Assignee: sivabalan narayanan  (was: Vinoth Chandar)
[jira] [Updated] (HUDI-864) parquet schema conflict: optional binary (UTF8) is not a group
[ https://issues.apache.org/jira/browse/HUDI-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-864:
--------------------------------
    Fix Version/s: 0.10.0
[jira] [Updated] (HUDI-1204) NoClassDefFoundError with AbstractSyncTool while running HoodieTestSuiteJob
[ https://issues.apache.org/jira/browse/HUDI-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1204:
---------------------------------
    Status: Open  (was: New)

> NoClassDefFoundError with AbstractSyncTool while running HoodieTestSuiteJob
> ---------------------------------------------------------------------------
>
>                 Key: HUDI-1204
>                 URL: https://issues.apache.org/jira/browse/HUDI-1204
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Testing
>    Affects Versions: 0.8.0
>            Reporter: sivabalan narayanan
>            Assignee: Nishith Agarwal
>            Priority: Major
>         Attachments: complex-dag-cow-2.yaml
>
> I was trying to run HoodieTestSuiteJob in my local Docker setup and ran into a dependency issue.
>
> {code}
> spark-submit --master local \
>   --class org.apache.hudi.integ.testsuite.HoodieTestSuiteJob \
>   --packages com.databricks:spark-avro_2.11:4.0.0 \
>   /opt/hudi-integ-test-bundle-0.6.0-rc1.jar \
>   --source-ordering-field timestamp \
>   --target-base-path /user/hive/warehouse/hudi-test-suite/output \
>   --input-base-path /user/hive/warehouse/hudi-test-suite/input \
>   --target-table test_table \
>   --props [file:///opt/test-source.properties] \
>   --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
>   --source-class org.apache.hudi.utilities.sources.AvroDFSSource \
>   --input-file-size 12582912 \
>   --workload-yaml-path /var/hoodie/ws/docker/demo/config/test-suite/complex-dag-cow.yaml \
>   --table-type COPY_ON_WRITE \
>   --workload-generator-classname yaml
> {code}
>
> {code:java}
> 20/08/19 21:42:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hudi/sync/common/AbstractSyncTool
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$Config.<init>(HoodieDeltaStreamer.java:279)
>   at org.apache.hudi.integ.testsuite.HoodieTestSuiteJob$HoodieTestSuiteConfig.<init>(HoodieTestSuiteJob.java:153)
>   at org.apache.hudi.integ.testsuite.HoodieTestSuiteJob.main(HoodieTestSuiteJob.java:114)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>   at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>   at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>   at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.ClassNotFoundException: org.apache.hudi.sync.common.AbstractSyncTool
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 26 more
> {code}
> I tried adding hudi-sync-common as a dependency to hudi-utilities, but that didn't fix the issue.