[jira] [Updated] (HUDI-1790) Add SqlSource for DeltaStreamer to support backfill use cases
     [ https://issues.apache.org/jira/browse/HUDI-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Govindarajan updated HUDI-1790:
--------------------------------------
    Status: Patch Available  (was: In Progress)

> Add SqlSource for DeltaStreamer to support backfill use cases
> -------------------------------------------------------------
>
>                 Key: HUDI-1790
>                 URL: https://issues.apache.org/jira/browse/HUDI-1790
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: DeltaStreamer
>            Reporter: Vinoth Govindarajan
>            Assignee: Vinoth Govindarajan
>            Priority: Major
>              Labels: pull-request-available
>
> DeltaStreamer is great for incremental workloads, but we also need to support backfills — for example, adding a new column and backfilling only that column for the last 6 months, or reprocessing a couple of older partitions after a bug in our transformation logic.
>
> If we add a SqlSource as one of the input sources for DeltaStreamer, we can pass any custom Spark SQL query that selects the specific partitions to backfill.
>
> When we run a backfill, we must not advance the last processed commit checkpoint; instead, the checkpoint recorded before the backfill should be copied over to the backfill commit.
>
> cc [~nishith29]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
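The checkpoint rule described in the issue — a backfill commit must carry the previous checkpoint forward rather than advance it — can be sketched as follows. This is a minimal Python illustration, not Hudi's actual Java implementation; the dict-based timeline and the metadata key name are assumptions made for the example.

```python
# Sketch of checkpoint carry-over for backfill commits. The timeline is
# modeled as a list of commit-metadata dicts, oldest first; the key name
# only resembles what DeltaStreamer stores in commit extra-metadata.

CHECKPOINT_KEY = "deltastreamer.checkpoint.key"  # assumed key name

def latest_checkpoint(timeline):
    """Walk the timeline newest-first and return the last recorded checkpoint."""
    for commit in reversed(timeline):
        checkpoint = commit.get("extra_metadata", {}).get(CHECKPOINT_KEY)
        if checkpoint is not None:
            return checkpoint
    return None  # no checkpointed commit yet

def make_backfill_commit(timeline, instant_time):
    """Build backfill commit metadata that copies the old checkpoint forward
    instead of writing a new one, so incremental runs resume correctly."""
    commit = {"instant": instant_time, "extra_metadata": {}}
    checkpoint = latest_checkpoint(timeline)
    if checkpoint is not None:
        commit["extra_metadata"][CHECKPOINT_KEY] = checkpoint
    return commit

timeline = [
    {"instant": "001", "extra_metadata": {CHECKPOINT_KEY: "topic,0:42"}},
    {"instant": "002", "extra_metadata": {}},  # e.g. a commit without a checkpoint
]
backfill = make_backfill_commit(timeline, "003")
print(backfill["extra_metadata"][CHECKPOINT_KEY])  # → topic,0:42
```

After the backfill, the next incremental run still resumes from `topic,0:42`, exactly as if the backfill commit had never happened.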
[jira] [Updated] (HUDI-1790) Add SqlSource for DeltaStreamer to support backfill use cases
     [ https://issues.apache.org/jira/browse/HUDI-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1790:
---------------------------------
    Labels: pull-request-available  (was: )

> Add SqlSource for DeltaStreamer to support backfill use cases
> -------------------------------------------------------------

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] vingov opened a new pull request #2896: [HUDI-1790] Added SqlSource to fetch data from any partitions for backfill use case
vingov opened a new pull request #2896:
URL: https://github.com/apache/hudi/pull/2896

   ## What is the purpose of the pull request

   This pull request adds a new source to DeltaStreamer that performs snapshot queries, mainly for backfilling historical partitions.

   ## Brief change log

   - Added a new SqlSource to DeltaStreamer to handle backfills through snapshot queries over any specific date range.

   ## Verify this pull request

   This change added tests and can be verified as follows:

   - Added TestSqlSource to verify the change.
   - Manually verified the change by running a job locally.

   ## Committer checklist

   - [x] Has a corresponding JIRA in PR title & commit
   - [x] Commit message is descriptive of the change
   - [ ] CI is green
   - [ ] Necessary doc changes done or have another open PR
   - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
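A backfill run with the new source might be wired up roughly as below. This is an illustrative sketch only: the source class name follows the PR title, but the exact property key for the SQL query, the paths, and the table names are assumptions and should be confirmed against the merged code.

```sh
# Hypothetical DeltaStreamer backfill invocation using SqlSource.
# Property key and paths are illustrative, not taken verbatim from the PR.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --target-base-path /data/hudi/orders \
  --target-table orders \
  --source-class org.apache.hudi.utilities.sources.SqlSource \
  --hoodie-conf 'hoodie.deltastreamer.source.sql.sql.query=SELECT * FROM warehouse.orders WHERE ds BETWEEN "2020-11-01" AND "2021-04-30"'
```

The query's partition predicate (`ds BETWEEN ...`) is what scopes the backfill to the historical range being reprocessed.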
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892:
URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2895: [HUDI-1867] Streaming read for Flink COW table
codecov-commenter edited a comment on pull request #2895:
URL: https://github.com/apache/hudi/pull/2895#issuecomment-828943047

   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2895) Report
   > Merging [#2895](https://codecov.io/gh/apache/hudi/pull/2895) (b8f0f53) into [master](https://codecov.io/gh/apache/hudi/commit/c9bcb5e33f7f9f97af0e8429a88d95f58ee48f13) (c9bcb5e) will **increase** coverage by `5.04%`.
   > The diff coverage is `10.41%`.

   ```diff
   @@             Coverage Diff              @@
   ##             master    #2895      +/-   ##
   ============================================
   + Coverage     47.90%   52.94%    +5.04%
   - Complexity     3421     3748      +327
     Files           488      488
     Lines         23529    23572       +43
     Branches       2501     2507        +6
   + Hits          11271    12480     +1209
   + Misses        11277     9991     -1286
   - Partials        981     1101      +120
   ```

   | Flag | Coverage Δ | Complexity Δ |
   |---|---|---|
   | hudicli | `39.53% <ø> (ø)` | `220.00 <ø> (ø)` |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
   | hudicommon | `50.38% <ø> (ø)` | `1975.00 <ø> (ø)` |
   | hudiflink | `59.07% <10.41%> (-0.60%)` | `537.00 <0.00> (ø)` |
   | hudihadoopmr | `33.33% <ø> (ø)` | `198.00 <ø> (ø)` |
   | hudisparkdatasource | `73.33% <ø> (ø)` | `237.00 <ø> (ø)` |
   | hudisync | `46.73% <ø> (ø)` | `144.00 <ø> (ø)` |
   | huditimelineservice | `64.36% <ø> (ø)` | `62.00 <ø> (ø)` |
   | hudiutilities | `69.75% <ø> (+60.39%)` | `375.00 <ø> (+327.00)` |

   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2895) | Coverage Δ | Complexity Δ |
   |---|---|---|
   | .../java/org/apache/hudi/table/HoodieTableSource.java | `59.56% <0.00%> (-3.81%)` | `26.00 <0.00> (ø)` |
   | .../hudi/table/format/mor/MergeOnReadInputFormat.java | `66.52% <3.44%> (-8.96%)` | `18.00 <0.00> (ø)` |
   | ...ache/hudi/source/StreamReadMonitoringFunction.java | `76.22% <80.00%> (-0.04%)` | `35.00 <0.00> (ø)` |
   | .../apache/hudi/utilities/HoodieSnapshotExporter.java | `88.79% <0.00%> (+5.17%)` | `28.00% <0.00%> (ø%)` |
   | ...e/hudi/utilities/transform/ChainedTransformer.java | `100.00% <0.00%> (+11.11%)` | `4.00% <0.00%> (+1.00%)` |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2895: [HUDI-1867] Streaming read for Flink COW table
codecov-commenter edited a comment on pull request #2895:
URL: https://github.com/apache/hudi/pull/2895#issuecomment-828943047

   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2895) Report
   > Merging [#2895](https://codecov.io/gh/apache/hudi/pull/2895) (b8f0f53) into [master](https://codecov.io/gh/apache/hudi/commit/c9bcb5e33f7f9f97af0e8429a88d95f58ee48f13) (c9bcb5e) will **increase** coverage by `3.31%`.
   > The diff coverage is `10.41%`.

   ```diff
   @@             Coverage Diff              @@
   ##             master    #2895      +/-   ##
   ============================================
   + Coverage     47.90%   51.21%    +3.31%
   + Complexity     3421     3305      -116
     Files           488      425       -63
     Lines         23529    20095     -3434
     Branches       2501     2089      -412
   + Hits          11271    10292      -979
   + Misses        11277     8947     -2330
   + Partials        981      856      -125
   ```

   | Flag | Coverage Δ | Complexity Δ |
   |---|---|---|
   | hudicli | `39.53% <ø> (ø)` | `220.00 <ø> (ø)` |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
   | hudicommon | `50.38% <ø> (ø)` | `1975.00 <ø> (ø)` |
   | hudiflink | `59.07% <10.41%> (-0.60%)` | `537.00 <0.00> (ø)` |
   | hudihadoopmr | `33.33% <ø> (ø)` | `198.00 <ø> (ø)` |
   | hudisparkdatasource | `?` | `?` |
   | hudisync | `?` | `?` |
   | huditimelineservice | `?` | `?` |
   | hudiutilities | `69.75% <ø> (+60.39%)` | `375.00 <ø> (+327.00)` |

   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2895) | Coverage Δ | Complexity Δ |
   |---|---|---|
   | .../java/org/apache/hudi/table/HoodieTableSource.java | `59.56% <0.00%> (-3.81%)` | `26.00 <0.00> (ø)` |
   | .../hudi/table/format/mor/MergeOnReadInputFormat.java | `66.52% <3.44%> (-8.96%)` | `18.00 <0.00> (ø)` |
   | ...ache/hudi/source/StreamReadMonitoringFunction.java | `76.22% <80.00%> (-0.04%)` | `35.00 <0.00> (ø)` |
   | .../main/scala/org/apache/hudi/HoodieSparkUtils.scala | | |
   | ...nal/HoodieBulkInsertDataInternalWriterFactory.java | | |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892:
URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052

   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2892) Report
   > Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892) (e2d0335) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d) (3867676) will **decrease** coverage by `16.73%`.
   > The diff coverage is `n/a`.

   ```diff
   @@              Coverage Diff              @@
   ##             master    #2892       +/-   ##
   =============================================
   - Coverage     69.75%   53.01%    -16.74%
   - Complexity      375     3746     +3371
     Files            54      488      +434
     Lines          1997    23527    +21530
     Branches        236     2501     +2265
   + Hits           1393    12474    +11081
   - Misses          473     9953     +9480
   - Partials        131     1100      +969
   ```

   | Flag | Coverage Δ | Complexity Δ |
   |---|---|---|
   | hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
   | hudicommon | `50.41% <ø> (?)` | `1976.00 <ø> (?)` |
   | hudiflink | `59.67% <ø> (?)` | `537.00 <ø> (?)` |
   | hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` |
   | hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` |
   | hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` |
   | huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` |
   | hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` |

   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892) | Coverage Δ | Complexity Δ |
   |---|---|---|
   | ...apache/hudi/utilities/deltastreamer/DeltaSync.java | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` |
   | ...sioning/clean/CleanMetadataV1MigrationHandler.java | `10.00% <0.00%> (ø)` | `3.00% <0.00%> (?%)` |
   | ...apache/hudi/common/engine/TaskContextSupplier.java | `100.00% <0.00%> (ø)` | `1.00% <0.00%> (?%)` |
   | ...udi/common/table/log/block/HoodieCommandBlock.java | `100.00% <0.00%> (ø)` | `6.00% <0.00%> (?%)` |
   | ...he/hudi/common/model/EmptyHoodieRecordPayload.java | `0.00% <0.00%>
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2895: [HUDI-1867] Streaming read for Flink COW table
codecov-commenter edited a comment on pull request #2895:
URL: https://github.com/apache/hudi/pull/2895#issuecomment-828943047

   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2895) Report
   > Merging [#2895](https://codecov.io/gh/apache/hudi/pull/2895) (b8f0f53) into [master](https://codecov.io/gh/apache/hudi/commit/c9bcb5e33f7f9f97af0e8429a88d95f58ee48f13) (c9bcb5e) will **increase** coverage by `21.85%`.
   > The diff coverage is `n/a`.

   ```diff
   @@              Coverage Diff              @@
   ##             master    #2895       +/-   ##
   =============================================
   + Coverage     47.90%   69.75%    +21.85%
   + Complexity     3421      375      -3046
     Files           488       54       -434
     Lines         23529     1997     -21532
     Branches       2501      236      -2265
   - Hits          11271     1393      -9878
   + Misses        11277      473     -10804
   + Partials        981      131       -850
   ```

   | Flag | Coverage Δ | Complexity Δ |
   |---|---|---|
   | hudicli | `?` | `?` |
   | hudiclient | `?` | `?` |
   | hudicommon | `?` | `?` |
   | hudiflink | `?` | `?` |
   | hudihadoopmr | `?` | `?` |
   | hudisparkdatasource | `?` | `?` |
   | hudisync | `?` | `?` |
   | huditimelineservice | `?` | `?` |
   | hudiutilities | `69.75% <ø> (+60.39%)` | `375.00 <ø> (+327.00)` |

   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2895) | Coverage Δ | Complexity Δ |
   |---|---|---|
   | ...he/hudi/metadata/HoodieMetadataFileSystemView.java | | |
   | ...g/apache/hudi/common/util/RocksDBSchemaHelper.java | | |
   | ...di/common/table/timeline/HoodieActiveTimeline.java | | |
   | ...in/java/org/apache/hudi/common/model/BaseFile.java | | |
   | ...on/table/log/block/HoodieAvroDataBlockVersion.java | | |
   | ...a/org/apache/hudi/common/util/CollectionUtils.java | | |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892:
URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052

   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2892) Report
   > Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892) (e2d0335) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d) (3867676) will **decrease** coverage by `16.73%`.
   > The diff coverage is `n/a`.

   ```diff
   @@              Coverage Diff              @@
   ##             master    #2892       +/-   ##
   =============================================
   - Coverage     69.75%   53.01%    -16.74%
   - Complexity      375     3746     +3371
     Files            54      488      +434
     Lines          1997    23527    +21530
     Branches        236     2501     +2265
   + Hits           1393    12474    +11081
   - Misses          473     9953     +9480
   - Partials        131     1100      +969
   ```

   | Flag | Coverage Δ | Complexity Δ |
   |---|---|---|
   | hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
   | hudicommon | `50.41% <ø> (?)` | `1976.00 <ø> (?)` |
   | hudiflink | `59.67% <ø> (?)` | `537.00 <ø> (?)` |
   | hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` |
   | hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` |
   | hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` |
   | huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` |
   | hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` |

   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892) | Coverage Δ | Complexity Δ |
   |---|---|---|
   | ...apache/hudi/utilities/deltastreamer/DeltaSync.java | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` |
   | .../hudi/common/table/view/FileSystemViewManager.java | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` |
   | ...a/org/apache/hudi/common/bloom/InternalFilter.java | `46.34% <0.00%> (ø)` | `4.00% <0.00%> (?%)` |
   | .../org/apache/hudi/hive/NonPartitionedExtractor.java | `100.00% <0.00%> (ø)` | `2.00% <0.00%> (?%)` |
   | ...mmon/table/log/HoodieUnMergedLogRecordScanner.java | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` |
[GitHub] [hudi] codecov-commenter commented on pull request #2895: [HUDI-1867] Streaming read for Flink COW table
codecov-commenter commented on pull request #2895:
URL: https://github.com/apache/hudi/pull/2895#issuecomment-828943047

   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2895) Report
   > Merging [#2895](https://codecov.io/gh/apache/hudi/pull/2895) (b8f0f53) into [master](https://codecov.io/gh/apache/hudi/commit/c9bcb5e33f7f9f97af0e8429a88d95f58ee48f13) (c9bcb5e) will **decrease** coverage by `38.53%`.
   > The diff coverage is `n/a`.

   ```diff
   @@              Coverage Diff              @@
   ##             master    #2895       +/-   ##
   =============================================
   - Coverage     47.90%    9.36%    -38.54%
   + Complexity     3421       48      -3373
     Files           488       54       -434
     Lines         23529     1997     -21532
     Branches       2501      236      -2265
   - Hits          11271      187     -11084
   + Misses        11277     1797      -9480
   + Partials        981       13       -968
   ```

   | Flag | Coverage Δ | Complexity Δ |
   |---|---|---|
   | hudicli | `?` | `?` |
   | hudiclient | `?` | `?` |
   | hudicommon | `?` | `?` |
   | hudiflink | `?` | `?` |
   | hudihadoopmr | `?` | `?` |
   | hudisparkdatasource | `?` | `?` |
   | hudisync | `?` | `?` |
   | huditimelineservice | `?` | `?` |
   | hudiutilities | `9.36% <ø> (ø)` | `48.00 <ø> (ø)` |

   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2895) | Coverage Δ | Complexity Δ |
   |---|---|---|
   | .../main/java/org/apache/hudi/util/AvroConvertor.java | | |
   | ...java/org/apache/hudi/sink/StreamWriteOperator.java | | |
   | ...va/org/apache/hudi/metadata/BaseTableMetadata.java | | |
   | .../org/apache/hudi/metadata/HoodieTableMetadata.java | | |
   | .../org/apache/hudi/common/metrics/LocalRegistry.java | | |
   | ...3/internal/HoodieDataSourceInternalBatchWrite.java | | |
[GitHub] [hudi] danny0405 closed pull request #2892: [HUDI-1865] Make embedded time line service singleton
danny0405 closed pull request #2892: URL: https://github.com/apache/hudi/pull/2892 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] danny0405 commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
danny0405 commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828941274 Fine, let's keep them for a time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1867) Streaming read for Flink COW table
[ https://issues.apache.org/jira/browse/HUDI-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1867: - Labels: pull-request-available (was: ) > Streaming read for Flink COW table > -- > > Key: HUDI-1867 > URL: https://issues.apache.org/jira/browse/HUDI-1867 > Project: Apache Hudi > Issue Type: Improvement > Components: Flink Integration >Reporter: Danny Chen >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available > Fix For: 0.9.0 > > > Supports streaming read for Copy On Write table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] danny0405 opened a new pull request #2895: [HUDI-1867] Streaming read for Flink COW table
danny0405 opened a new pull request #2895: URL: https://github.com/apache/hudi/pull/2895 Supports streaming read for Copy On Write table. ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the pull request *(For example: This pull request adds quick-start document.)* ## Brief change log *(for example:)* - *Modify AnnotationLocation checkstyle rule in checkstyle.xml* ## Verify this pull request *(Please pick either of the following options)* This pull request is a trivial rework / code cleanup without any test coverage. *(or)* This pull request is already covered by existing tests, such as *(please describe tests)*. (or) This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end.* - *Added HoodieClientWriteTest to verify the change.* - *Manually verified the change by running a job locally.* ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
pengzhiwei2018 commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r622719034 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/UuidKeyGenerator.java ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hudi.keygen; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.UUID; +import java.util.stream.Collectors; +import org.apache.avro.generic.GenericRecord; +import org.apache.hudi.common.config.TypedProperties; +import org.apache.hudi.keygen.constant.KeyGeneratorOptions; + +/** + * A KeyGenerator which use the uuid as the record key. + */ +public class UuidKeyGenerator extends BuiltinKeyGenerator { Review comment: That's great, I will try it in HUDI-1840. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
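The file under review assigns a random UUID as each record's key. As a minimal standalone illustration of that idea (deliberately not extending Hudi's `BuiltinKeyGenerator`, whose exact contract is not shown in this excerpt):

```java
import java.util.UUID;

/** Minimal sketch: every record gets a fresh random (version 4) UUID as its key. */
public class UuidKeySketch {

    /** Returns a new random UUID string to use as a record key. */
    public static String newRecordKey() {
        return UUID.randomUUID().toString();
    }
}
```

Random UUIDs guarantee uniqueness without coordination, but they carry no ordering, which is what the time-ordered alternative raised later in this thread addresses.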
[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
pengzhiwei2018 commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r622716410 ## File path: pom.xml ## @@ -112,6 +112,7 @@ 3.0.0 3 +hudi-spark2 Review comment: ok ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala ## @@ -0,0 +1,230 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hudi + +import scala.collection.JavaConverters._ +import org.apache.hudi.common.model.HoodieRecord +import org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.types.{DoubleType, IntegerType, LongType, StringType, StructField} + +class TestCreateTable extends TestHoodieSqlBase { + + test("Test Create Managed Hoodie Table") { +val tableName = generateTableName +// Create a managed table +spark.sql( + s""" + | create table $tableName ( Review comment: +1 for this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] vinothchandar commented on pull request #2833: [HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo
vinothchandar commented on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-828914347 @zhedoubushishi please ping me when this is ready to go and we got all the configs covered. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] yanghua commented on pull request #2892: [HUDI-1865] Make embedded time line service singleton
yanghua commented on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828909873 @vinothchandar Can you join in and review this PR? The change is outside my area of expertise. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2894: [HUDI-1620] Fix Metrics UTs and remove maven profile for azure tests
codecov-commenter edited a comment on pull request #2894: URL: https://github.com/apache/hudi/pull/2894#issuecomment-828895522

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2894](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b91aef6) into [master](https://codecov.io/gh/apache/hudi/commit/3ca90302562580a7c5c69fd3f11ab376cfac1f0b?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3ca9030) will **decrease** coverage by `16.74%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2894/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #2894       +/-   ##
=============================================
- Coverage     69.75%   53.00%    -16.75%
- Complexity      375     3745      +3370
=============================================
  Files            54      488       +434
  Lines          1997    23527     +21530
  Branches        236     2501      +2265
=============================================
+ Hits           1393    12471     +11078
- Misses          473     9956      +9483
- Partials        131     1100       +969
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.38% <ø> (?)` | `1975.00 <ø> (?)` | |
| hudiflink | `59.67% <ø> (?)` | `537.00 <ø> (?)` | |
| hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` | |
| hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` | |
| hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` | |
| huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` | |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` | |

Flags with
carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/io/storage/HoodieHFileReader.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vc3RvcmFnZS9Ib29kaWVIRmlsZVJlYWRlci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | | | [...org/apache/hudi/common/table/log/AppendResult.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9BcHBlbmRSZXN1bHQuamF2YQ==) | `100.00% <0.00%> (ø)` | `4.00% <0.00%> (?%)` | | | 
[...mmon/table/log/AbstractHoodieLogRecordScanner.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9BYnN0cmFjdEhvb2RpZUxvZ1JlY29yZFNjYW5uZXIuamF2YQ==) | `80.00% <0.00%> (ø)` | `34.00% <0.00%> (?%)` | | | [.../common/util/queue/IteratorBasedQueueProducer.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvcXVldWUvSXRlcmF0b3JCYXNlZFF1ZXVlUHJvZHVjZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | | |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2894: [HUDI-1620] Fix Metrics UTs and remove maven profile for azure tests
codecov-commenter edited a comment on pull request #2894: URL: https://github.com/apache/hudi/pull/2894#issuecomment-828895522 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2894: [HUDI-1620] Fix Metrics UTs and remove maven profile for azure tests
codecov-commenter edited a comment on pull request #2894: URL: https://github.com/apache/hudi/pull/2894#issuecomment-828895522

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2894](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b91aef6) into [master](https://codecov.io/gh/apache/hudi/commit/3ca90302562580a7c5c69fd3f11ab376cfac1f0b?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3ca9030) will **decrease** coverage by `16.14%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2894/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #2894      +/-   ##
============================================
- Coverage     69.75%   53.60%   -16.15%
- Complexity      375      594      +219
============================================
  Files            54       94       +40
  Lines          1997     4281     +2284
  Branches        236      496      +260
============================================
+ Hits           1393     2295      +902
- Misses          473     1781     +1308
- Partials        131      205       +74
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | | [...n/java/org/apache/hudi/cli/HoodieSplashScreen.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL0hvb2RpZVNwbGFzaFNjcmVlbi5qYXZh) | `42.85% <0.00%> (ø)` | `2.00% <0.00%> (?%)` | | | [...i-cli/src/main/java/org/apache/hudi/cli/Table.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL1RhYmxlLmphdmE=) | `60.78% <0.00%> (ø)` | `12.00% <0.00%> (?%)` | | | [...ain/scala/org/apache/hudi/cli/DedupeSparkJob.scala](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9EZWR1cGVTcGFya0pvYi5zY2FsYQ==) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | | | 
[...rg/apache/hudi/cli/commands/CompactionCommand.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0NvbXBhY3Rpb25Db21tYW5kLmphdmE=) | `30.18% <0.00%> (ø)` | `22.00% <0.00%> (?%)` | | | [...rc/main/scala/org/apache/hudi/cli/DeDupeType.scala](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9EZUR1cGVUeXBlLnNjYWxh) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | | |
[GitHub] [hudi] yanghua commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
yanghua commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828903309 > > I still insist that we need to include kafka-related dependencies. If you look back at the HoodieFlinkStreamerV2 class. What is it in essence? It is just a program written using Flink DataStream API, which is specific (Kafka -> Hudi) > > No, no one says that they don't know how to add a connector jar, and actually few people use the `HoodieFlinkStreamerV2` tool. "no one says that they don't know how to add a connector jar" -> I recommend we package it into the bundle for users. It's not that users can't, but that users should not, or may not, need to be aware of these details. This is a question of user experience. By that logic, why would users not use FlinkWriteClient directly? Why should we guide users to use Flink SQL? Can't users write the FlinkStreamer class themselves? All of this is to shield users from details as much as possible, to let the framework provide out-of-the-box capabilities, and to give as good an experience as possible. Isn't it? "actually few people use the `HoodieFlinkStreamerV2` tool" -> Actually, there are still few users of the Flink write client, because it was not yet production-ready in 0.8, as you know. IMO, we do not have enough samples to support that conclusion. I have never understood why we cannot include the kafka connector to provide convenience to the users who do not use SQL. And it should provide a consistent experience with the Spark-based DeltaStreamer. Otherwise, don't call it "FlinkStreamerXXX". -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2894: [HUDI-1620] Fix Metrics UTs and remove maven profile for azure tests
codecov-commenter edited a comment on pull request #2894: URL: https://github.com/apache/hudi/pull/2894#issuecomment-828895522

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2894](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b91aef6) into [master](https://codecov.io/gh/apache/hudi/commit/3ca90302562580a7c5c69fd3f11ab376cfac1f0b?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3ca9030) will **decrease** coverage by `0.05%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2894/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@            Coverage Diff             @@
##           master    #2894     +/-   ##
=========================================
- Coverage   69.75%   69.70%   -0.06%
+ Complexity    375      374       -1
=========================================
  Files          54       54
  Lines        1997     1997
  Branches      236      236
=========================================
- Hits         1393     1392       -1
  Misses        473      473
- Partials      131      132       +1
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudiclient | `?` | `?` | |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] vinothchandar commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
vinothchandar commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r622689426 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestCreateTable.scala ## @@ -0,0 +1,230 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hudi + +import scala.collection.JavaConverters._ +import org.apache.hudi.common.model.HoodieRecord +import org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.types.{DoubleType, IntegerType, LongType, StringType, StructField} + +class TestCreateTable extends TestHoodieSqlBase { + + test("Test Create Managed Hoodie Table") { +val tableName = generateTableName +// Create a managed table +spark.sql( + s""" + | create table $tableName ( Review comment: Can we file a JIRA to write our own DFS based catalog? We can also extend it to work with a metaserver down the line? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] vinothchandar commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
vinothchandar commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r622686509 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/UuidKeyGenerator.java ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hudi.keygen; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.UUID; +import java.util.stream.Collectors; +import org.apache.avro.generic.GenericRecord; +import org.apache.hudi.common.config.TypedProperties; +import org.apache.hudi.keygen.constant.KeyGeneratorOptions; + +/** + * A KeyGenerator which use the uuid as the record key. + */ +public class UuidKeyGenerator extends BuiltinKeyGenerator { Review comment: I have a better suggestion. Could you try and explore time-ordered UUIDs instead? https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/ https://github.com/f4b6a3/uuid-creator This will do I think. We need not make changes to pass in commit time per se. I was using that as an example. 
It would be good to do this in the first go itself, so that users don't have to regenerate/rewrite datasets ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/MultiPartKeysValueExtractor.java ## @@ -31,6 +32,11 @@ @Override public List extractPartitionValuesInPath(String partitionPath) { +// If the partitionPath is empty string( which means none-partition table), the partition values Review comment: 2876 looks good. merged. ## File path: pom.xml ## @@ -112,6 +112,7 @@ 3.0.0 3 +hudi-spark2 Review comment: rename this to `hudi.spark.module` ? There is a typo. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
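The time-ordered UUID idea suggested in this review can be sketched without pulling in any library: a fixed-width hex timestamp prefix makes lexicographic key order match generation time, while random bits keep keys unique. This is only an illustration of the concept, not the `uuid-creator` library's actual API:

```java
import java.security.SecureRandom;

/** Sketch of a time-ordered key: hex epoch-millis prefix plus random suffix. */
public class TimeOrderedKeySketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Zero-padding keeps a fixed width, so string order matches time order. */
    static String format(long epochMillis, long randomBits) {
        return String.format("%016x-%016x", epochMillis, randomBits);
    }

    /** Keys generated later sort lexicographically after earlier ones. */
    public static String newKey() {
        return format(System.currentTimeMillis(), RANDOM.nextLong());
    }
}
```

Because keys generated close together in time share a common prefix, they also cluster in storage, which is the property the linked Percona post argues for.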
[hudi] branch master updated (3ca9030 -> c9bcb5e)
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 3ca9030 [HUDI-1858] Fix cannot create table due to jar conflict (#2886) add c9bcb5e [HUDI-1845] Exception Throws When Sync Non-Partitioned Table To Hive With MultiPartKeysValueExtractor (#2876) No new revisions were added by this update. Summary of changes: .../hudi/hive/MultiPartKeysValueExtractor.java | 6 .../hudi/hive/TestMultiPartKeysValueExtractor.java | 39 +- 2 files changed, 21 insertions(+), 24 deletions(-) copy hudi-common/src/test/java/org/apache/hudi/common/model/TestHoodieDeltaWriteStat.java => hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/TestMultiPartKeysValueExtractor.java (53%)
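The HUDI-1845 fix merged above makes MultiPartKeysValueExtractor tolerate the empty partition path of a non-partitioned table. As a standalone sketch of that behavior (hypothetical class and method body, not Hudi's actual implementation):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch of multi-part partition value extraction that returns an
 *  empty list for the empty path of a non-partitioned table. */
public class MultiPartValuesSketch {

    public static List<String> extractPartitionValuesInPath(String partitionPath) {
        // Non-partitioned tables sync with an empty partition path: no values.
        if (partitionPath.isEmpty()) {
            return Collections.emptyList();
        }
        List<String> values = new ArrayList<>();
        for (String part : partitionPath.split("/")) {
            // Accept either "value" or "key=value" styles of partition folders.
            int eq = part.indexOf('=');
            values.add(eq >= 0 ? part.substring(eq + 1) : part);
        }
        return values;
    }
}
```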
[GitHub] [hudi] vinothchandar merged pull request #2876: [HUDI-1845] Exception Throws When Sync Non-Partitioned Table To Hive …
vinothchandar merged pull request #2876: URL: https://github.com/apache/hudi/pull/2876 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter commented on pull request #2894: [HUDI-1620] Fix Metrics UTs and remove maven profile for azure tests
codecov-commenter commented on pull request #2894: URL: https://github.com/apache/hudi/pull/2894#issuecomment-828895522

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2894](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b91aef6) into [master](https://codecov.io/gh/apache/hudi/commit/3ca90302562580a7c5c69fd3f11ab376cfac1f0b?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3ca9030) will **decrease** coverage by `60.39%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2894/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff             @@
##           master    #2894      +/-   ##
==========================================
- Coverage   69.75%    9.36%   -60.40%
+ Complexity    375       48      -327
==========================================
  Files          54       54
  Lines        1997     1997
  Branches      236      236
==========================================
- Hits         1393      187     -1206
- Misses        473     1797     +1324
+ Partials      131       13      -118
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudiclient | `?` | `?` | |
| hudiutilities | `9.36% <ø> (-60.40%)` | `48.00 <ø> (-327.00)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2894?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2894/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | |
[GitHub] [hudi] pengzhiwei2018 edited a comment on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
pengzhiwei2018 edited a comment on pull request #2645: URL: https://github.com/apache/hudi/pull/2645#issuecomment-828893245 > @pengzhiwei2018 could we make the spark-shell experience better? I think we need the extensions added by default when the jar is pulled in? > > ```scala > $ spark-shell --jars $HUDI_SPARK_BUNDLE --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' > > scala> spark.sql("create table t1 (id int, name string, price double, ts long) using hudi options(primaryKey= 'id', preCombineField = 'ts')").show > t, returning NoSuchObjectException > org.apache.hudi.exception.HoodieException: 'path' or 'hoodie.datasource.read.paths' or both must be specified. > at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:77) > at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:337) > at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:78) > at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229) > at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3616) > at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100) > at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160) > at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87) > at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763) > at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3614) > at 
org.apache.spark.sql.Dataset.<init>(Dataset.scala:229) > at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100) > at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97) > at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:606) > at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:601) > ```

Hi @vinothchandar, you can test this with the following commands:

- Using spark-sql

  > spark-sql --jars $HUDI_SPARK_BUNDLE \
  >   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  >   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'

- Using spark-shell

  > spark-shell --jars $HUDI_SPARK_BUNDLE \
  >   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  >   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'

Just set `spark.sql.extensions` to `org.apache.spark.sql.hudi.HoodieSparkSessionExtension`. IMO this conf is just like `spark.serializer`: it must be specified when the `SparkSession` is created, so it is hard to set it automatically when the hudi jar is installed. Thanks~

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
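For anyone building the session in code rather than passing CLI flags, the same two settings can be applied through `SparkSession.builder()` (a minimal sketch, not from the PR; it assumes the hudi-spark bundle is already on the driver classpath, and the app name is illustrative):

```scala
import org.apache.spark.sql.SparkSession

// The extension injects Hudi's parser and resolution rules, so it must be
// registered before the SparkSession is instantiated -- setting it on an
// already-created session has no effect, which is why the jar cannot
// enable it automatically after the fact.
val spark = SparkSession.builder()
  .appName("hudi-sql-demo") // hypothetical app name
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.sql.extensions",
    "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
  .getOrCreate()
```

This is equivalent to the `--conf` flags in the commands above; whichever form is used, the extension setting has to be present at session-construction time.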
[jira] [Created] (HUDI-1867) Streaming read for Flink COW table
Danny Chen created HUDI-1867: Summary: Streaming read for Flink COW table Key: HUDI-1867 URL: https://issues.apache.org/jira/browse/HUDI-1867 Project: Apache Hudi Issue Type: Improvement Components: Flink Integration Reporter: Danny Chen Assignee: Danny Chen Fix For: 0.9.0 Supports streaming read for Copy On Write table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] vinothchandar commented on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
vinothchandar commented on pull request #2645: URL: https://github.com/apache/hudi/pull/2645#issuecomment-82958

@pengzhiwei2018 could we make the spark-shell experience better? I think we need the extensions added by default when the jar is pulled in?

```scala
$ spark-shell --jars $HUDI_SPARK_BUNDLE --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'

scala> spark.sql("create table t1 (id int, name string, price double, ts long) using hudi options(primaryKey= 'id', preCombineField = 'ts')").show
t, returning NoSuchObjectException
org.apache.hudi.exception.HoodieException: 'path' or 'hoodie.datasource.read.paths' or both must be specified.
  at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:77)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:337)
  at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:78)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3616)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3614)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
  at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:606)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:601)
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #2643: DO NOT MERGE (Azure CI) test branch ci
hudi-bot edited a comment on pull request #2643: URL: https://github.com/apache/hudi/pull/2643#issuecomment-792368481 ## CI report: * 9831a6c50e9f49f8a71c02fc6ac50ae1446f7c1f UNKNOWN * a569dbe9409910fbb83b3764b300574c0e52612e Azure: [FAILURE](https://dev.azure.com/XUSH0012/0ef433cc-d4b4-47cc-b6a1-03d032ef546c/_build/results?buildId=142) * e6e9f1f1554a1474dd6c20338215030cad23a2e0 UNKNOWN * 2a6690a256c8cd8efe9ed2b1984b896fb27ef077 UNKNOWN * d8b7cca55e057a52a2e229d81e8cb52b60dc275f UNKNOWN * 3bce301333cc78194d13a702598b46e04fe9f85f UNKNOWN * f07f345baa450f3fec7eab59caa76b0fbda1e132 UNKNOWN * 869d2ce3fad330af93c1bb3b576824f519c6e68b UNKNOWN * fa86907f7522bc8dbe512d48b5a87e4a6b13f035 UNKNOWN * 4ebe53016ce3e0648992dbe14d04f71a92f116e6 UNKNOWN * 682ae9985f591f6d0c30ee2ef9b159403c1e46de UNKNOWN * d80397fcfeaa2996ab550bcdab4524be7420a364 UNKNOWN * bfe3a803e19540578b94f778f7ba7551db0f86f1 UNKNOWN * a632e58390eb94fcc7e757bd7580780cf184f9a8 UNKNOWN * 2e413d601c80b123269c2fc3fc6aa9a8bd0d746a UNKNOWN * e797ee47aa319df3c3c40bdc4acab4f592d70ffe UNKNOWN * acb06df73c1c2a0ef1590f66e8b41e173d2a7a7b UNKNOWN * f7f78ee22a0a75c5fb866c4e9cdda01482fbcb59 UNKNOWN * 3a7227993309e8dd37f2aef693cb3fed69a2043c UNKNOWN * 8f7a8e7f4989c9e20b936123c0f6e324898471d2 UNKNOWN * 6824c4917ad812c5938fe5346344a4aef9b7a72e UNKNOWN * 252364017f5dee1dcdfa061cc3070dac518d4047 UNKNOWN * b1691e583f3c23ee83fcb7ee0245eed826624cc0 UNKNOWN * ba970bda569f0312c77cd5c139f9dec4ad2759b0 UNKNOWN * 4370d21d4983e5e79d1f4bafba51ae26dd29f9a0 UNKNOWN * 21ea9ccef8ab9d78f9c201fa58a22e3e59caaa6b UNKNOWN * b17028e8a232ff3015c18b8f7de5435241800bfe UNKNOWN * 11974f4994838cca929d1f55214a132f2dbccd60 UNKNOWN * c4f92e29cc2affbd3da1e02c87e99c2076d3c410 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1620) TestPushGateWayReporter failed when run separately
[ https://issues.apache.org/jira/browse/HUDI-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1620: - Labels: pull-request-available (was: ) > TestPushGateWayReporter failed when run separately > -- > > Key: HUDI-1620 > URL: https://issues.apache.org/jira/browse/HUDI-1620 > Project: Apache Hudi > Issue Type: Improvement > Components: Testing >Reporter: Raymond Xu >Assignee: Raymond Xu >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > org.apache.hudi.metrics.prometheus.TestPushGateWayReporter#testRegisterGauge > when run separately, it failed with > {quote}org.apache.hudi.exception.HoodieException: > java.lang.IllegalArgumentException > at org.apache.hudi.metrics.Metrics.init(Metrics.java:100) > at org.apache.hudi.metrics.HoodieMetrics.<init>(HoodieMetrics.java:59) > at > org.apache.hudi.metrics.prometheus.TestPushGateWayReporter.testRegisterGauge(TestPushGateWayReporter.java:45){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] xushiyan opened a new pull request #2894: [HUDI-1620] Fix Metrics UT
xushiyan opened a new pull request #2894: URL: https://github.com/apache/hudi/pull/2894 Make sure to shut down Metrics between unit test cases to ensure isolation. ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
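The isolation problem this PR fixes can be sketched generically: a process-wide metrics singleton that one test initializes will make the next test's `init` fail unless it is shut down in teardown. The sketch below is illustrative only (it is not Hudi's actual `Metrics` API; names and behavior are assumptions):

```scala
// Hypothetical stand-in for a process-wide metrics singleton. Like the
// real class, init() refuses to run twice, so state leaking from one test
// into the next makes the second test's init() throw -- the failure mode
// seen when TestPushGateWayReporter runs after another metrics test.
object Metrics {
  private var initialized = false
  def isInitialized: Boolean = synchronized { initialized }
  def init(): Unit = synchronized {
    require(!initialized, "Metrics already initialized")
    initialized = true
  }
  def shutdown(): Unit = synchronized { initialized = false }
}

// Per-test pattern: init in setup, shutdown in teardown, so each test
// case starts from a clean slate.
Metrics.init()
// ... test body for case 1 ...
Metrics.shutdown()

Metrics.init() // case 2 can now initialize cleanly instead of throwing
Metrics.shutdown()
```

Without the `shutdown()` calls between cases, the second `init()` would throw, which matches the `IllegalArgumentException` described in HUDI-1620.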
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2892) Report
> Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892) (e2d0335) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d) (3867676) will **decrease** coverage by `16.75%`.
> The diff coverage is `n/a`.

```diff
@@              Coverage Diff              @@
##            master    #2892       +/-   ##
============================================
- Coverage    69.75%   52.99%   -16.76%
- Complexity     375     3745     +3370
============================================
  Files           54      488      +434
  Lines         1997    23527    +21530
  Branches       236     2501     +2265
============================================
+ Hits          1393    12469    +11076
- Misses         473     9957     +9484
- Partials       131     1101      +970
```

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
| hudicommon | `50.37% <ø> (?)` | `1975.00 <ø> (?)` |
| hudiflink | `59.67% <ø> (?)` | `537.00 <ø> (?)` |
| hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` |
| hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` |
| hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` |
| huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| `...apache/hudi/utilities/deltastreamer/DeltaSync.java` | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` |
| `...i/src/main/java/org/apache/hudi/cli/HoodieCLI.java` | `89.18% <0.00%> (ø)` | `18.00% <0.00%> (?%)` |
| `...org/apache/hudi/common/util/collection/Triple.java` | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` |
| `...che/hudi/common/util/collection/ImmutablePair.java` | `75.00% <0.00%> (ø)` | `3.00% <0.00%> (?%)` |
| `...hudi/hadoop/hive/HoodieCombineHiveInputFormat.java` | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2892) Report
> Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892) (e2d0335) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d) (3867676) will **decrease** coverage by `16.75%`.
> The diff coverage is `n/a`.

```diff
@@              Coverage Diff              @@
##            master    #2892       +/-   ##
============================================
- Coverage    69.75%   52.99%   -16.76%
- Complexity     375     3745     +3370
============================================
  Files           54      488      +434
  Lines         1997    23527    +21530
  Branches       236     2501     +2265
============================================
+ Hits          1393    12469    +11076
- Misses         473     9957     +9484
- Partials       131     1101      +970
```

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
| hudicommon | `50.37% <ø> (?)` | `1975.00 <ø> (?)` |
| hudiflink | `59.67% <ø> (?)` | `537.00 <ø> (?)` |
| hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` |
| hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` |
| hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` |
| huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| `...apache/hudi/utilities/deltastreamer/DeltaSync.java` | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` |
| `...ache/hudi/common/table/timeline/TimelineUtils.java` | `62.71% <0.00%> (ø)` | `21.00% <0.00%> (?%)` |
| `...apache/hudi/sink/event/BatchWriteSuccessEvent.java` | `92.30% <0.00%> (ø)` | `7.00% <0.00%> (?%)` |
| `.../org/apache/hudi/io/storage/HoodieHFileReader.java` | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` |
| `...pache/hudi/common/model/HoodieMetadataWrapper.java` | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` |
[GitHub] [hudi] danny0405 closed pull request #2892: [HUDI-1865] Make embedded time line service singleton
danny0405 closed pull request #2892: URL: https://github.com/apache/hudi/pull/2892 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2893: [HUDI-1371] Support metadata based listing for Spark DataSource and Spark SQL
codecov-commenter edited a comment on pull request #2893: URL: https://github.com/apache/hudi/pull/2893#issuecomment-828848333

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2893) Report
> Merging [#2893](https://codecov.io/gh/apache/hudi/pull/2893) (1ce0f37) into [master](https://codecov.io/gh/apache/hudi/commit/e4fd195d9fd0cc1128b8c6797d88e56402b166bd) (e4fd195) will **increase** coverage by `0.01%`.
> The diff coverage is `50.00%`.

```diff
@@             Coverage Diff              @@
##            master    #2893      +/-   ##
===========================================
+ Coverage    52.99%   53.01%   +0.01%
- Complexity    3745     3749       +4
===========================================
  Files          488      488
  Lines        23527    23550      +23
  Branches      2501     2503       +2
===========================================
+ Hits         12469    12484      +15
- Misses        9957     9967      +10
+ Partials      1101     1099       -2
```

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| hudicli | `39.53% <ø> (ø)` | `220.00 <ø> (ø)` |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
| hudicommon | `50.38% <26.08%> (+0.01%)` | `1978.00 <2.00> (+3.00)` |
| hudiflink | `59.67% <ø> (ø)` | `537.00 <ø> (ø)` |
| hudihadoopmr | `33.33% <ø> (ø)` | `198.00 <ø> (ø)` |
| hudisparkdatasource | `73.34% <92.30%> (+<0.01%)` | `237.00 <0.00> (ø)` |
| hudisync | `46.39% <ø> (ø)` | `142.00 <ø> (ø)` |
| huditimelineservice | `64.36% <ø> (ø)` | `62.00 <ø> (ø)` |
| hudiutilities | `69.75% <ø> (+0.05%)` | `375.00 <ø> (+1.00)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| `...c/main/java/org/apache/hudi/common/fs/FSUtils.java` | `47.34% <0.00%> (ø)` | `57.00 <0.00> (ø)` |
| `...va/org/apache/hudi/metadata/BaseTableMetadata.java` | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` |
| `.../org/apache/hudi/metadata/HoodieTableMetadata.java` | `0.00% <ø> (ø)` | `0.00 <0.00> (ø)` |
| `...c/main/scala/org/apache/hudi/HoodieFileIndex.scala` | `78.98% <92.30%> (-0.11%)` | `24.00 <0.00> (ø)` |
| `...e/hudi/metadata/FileSystemBackedTableMetadata.java` | `93.18% <100.00%> (+1.07%)` | `15.00 <2.00> (+2.00)` |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2893: [HUDI-1371] Support metadata based listing for Spark DataSource and Spark SQL
codecov-commenter edited a comment on pull request #2893: URL: https://github.com/apache/hudi/pull/2893#issuecomment-828848333 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2893](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (1ce0f37) into [master](https://codecov.io/gh/apache/hudi/commit/e4fd195d9fd0cc1128b8c6797d88e56402b166bd?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (e4fd195) will **decrease** coverage by `1.69%`. > The diff coverage is `26.08%`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2893/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master #2893 +/- ## - Coverage 52.99% 51.30% -1.70% + Complexity 3745 3308 -437 Files 488 425 -63 Lines 23527 20071 -3456 Branches 2501 2085 -416 - Hits 12469 10298 -2171 + Misses 9957 8919 -1038 + Partials 1101 854 -247 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `39.53% <ø> (ø)` | `220.00 <ø> (ø)` | | | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | | | hudicommon | `50.38% <26.08%> (+0.01%)` | `1978.00 <2.00> (+3.00)` | | | hudiflink | `59.67% <ø> (ø)` | `537.00 <ø> (ø)` | | | hudihadoopmr | `33.33% <ø> (ø)` | `198.00 <ø> (ø)` | | | hudisparkdatasource | `?` | `?` | | | hudisync | `?` | `?` | | | huditimelineservice | `?` | `?` | | | hudiutilities | `69.75% <ø> (+0.05%)` | `375.00 <ø> (+1.00)` | | Flags with carried forward coverage won't be shown. 
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...c/main/java/org/apache/hudi/common/fs/FSUtils.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0ZTVXRpbHMuamF2YQ==) | `47.34% <0.00%> (ø)` | `57.00 <0.00> (ø)` | | | [...va/org/apache/hudi/metadata/BaseTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvQmFzZVRhYmxlTWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | | | [.../org/apache/hudi/metadata/HoodieTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllVGFibGVNZXRhZGF0YS5qYXZh) | `0.00% <ø> (ø)` | `0.00 <0.00> (ø)` | | | [...e/hudi/metadata/FileSystemBackedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvRmlsZVN5c3RlbUJhY2tlZFRhYmxlTWV0YWRhdGEuamF2YQ==) | `93.18% <100.00%> (+1.07%)` | `15.00 <2.00> (+2.00)` | | | 
[.../apache/hudi/hive/MultiPartKeysValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTXVsdGlQYXJ0S2V5c1ZhbHVlRXh0cmFjdG9yLmphdmE=) | | | | |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2893: [HUDI-1371] Support metadata based listing for Spark DataSource and Spark SQL
codecov-commenter edited a comment on pull request #2893: URL: https://github.com/apache/hudi/pull/2893#issuecomment-828848333 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2893](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (1ce0f37) into [master](https://codecov.io/gh/apache/hudi/commit/e4fd195d9fd0cc1128b8c6797d88e56402b166bd?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (e4fd195) will **increase** coverage by `16.75%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2893/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master #2893 +/- ## = + Coverage 52.99% 69.75% +16.75% + Complexity 3745 375 -3370 = Files 488 54 -434 Lines 23527 1997 -21530 Branches 2501 236 -2265 = - Hits 12469 1393 -11076 + Misses 9957 473 -9484 + Partials 1101 131 -970 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `?` | `?` | | | hudiclient | `?` | `?` | | | hudicommon | `?` | `?` | | | hudiflink | `?` | `?` | | | hudihadoopmr | `?` | `?` | | | hudisparkdatasource | `?` | `?` | | | hudisync | `?` | `?` | | | huditimelineservice | `?` | `?` | | | hudiutilities | `69.75% <ø> (+0.05%)` | `375.00 <ø> (+1.00)` | | Flags with carried forward coverage won't be shown. 
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...java/org/apache/hudi/common/fs/StorageSchemes.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL1N0b3JhZ2VTY2hlbWVzLmphdmE=) | | | | | [...e/hudi/exception/HoodieDeltaStreamerException.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZURlbHRhU3RyZWFtZXJFeGNlcHRpb24uamF2YQ==) | | | | | [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | | | | | [.../java/org/apache/hudi/common/util/CommitUtils.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ29tbWl0VXRpbHMuamF2YQ==) | | | | | 
[...e/timeline/versioning/clean/CleanPlanMigrator.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY2xlYW4vQ2xlYW5QbGFuTWlncmF0b3IuamF2YQ==) | | | | | [...rg/apache/hudi/cli/commands/SavepointsCommand.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NhdmVwb2ludHNDb21tYW5kLmphdmE=) | | | | |
[GitHub] [hudi] codecov-commenter commented on pull request #2893: [HUDI-1371] Support metadata based listing for Spark DataSource and Spark SQL
codecov-commenter commented on pull request #2893: URL: https://github.com/apache/hudi/pull/2893#issuecomment-828848333 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2893](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (1ce0f37) into [master](https://codecov.io/gh/apache/hudi/commit/e4fd195d9fd0cc1128b8c6797d88e56402b166bd?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (e4fd195) will **decrease** coverage by `43.63%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2893/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master #2893 +/- ## - Coverage 52.99% 9.36% -43.64% + Complexity 3745 48 -3697 Files 488 54 -434 Lines 23527 1997 -21530 Branches 2501 236 -2265 - Hits 12469 187 -12282 + Misses 9957 1797 -8160 + Partials 1101 13 -1088 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `?` | `?` | | | hudiclient | `?` | `?` | | | hudicommon | `?` | `?` | | | hudiflink | `?` | `?` | | | hudihadoopmr | `?` | `?` | | | hudisparkdatasource | `?` | `?` | | | hudisync | `?` | `?` | | | huditimelineservice | `?` | `?` | | | hudiutilities | `9.36% <ø> (-60.35%)` | `48.00 <ø> (-326.00)` | | Flags with carried forward coverage won't be shown. 
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2893?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> 
(-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2893/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | |
[jira] [Commented] (HUDI-1847) Add ability to decouple configs for scheduling inline and running async
[ https://issues.apache.org/jira/browse/HUDI-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335061#comment-17335061 ] Nishith Agarwal commented on HUDI-1847: --- Steps to contribute this PR:
# Start by adding a config to SCHEDULE compaction inline or not, so that inline compaction execution can be turned off while compaction is still scheduled inline. This can be added here -> [https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java]
# Next, this config needs to be added to [https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java] so it's part of the getters.
# This config then needs to be honored in all the places compaction is scheduled; good places to look are AbstractHoodieWriteClient, DeltaSync/HoodieDeltaStreamer, and HoodieSparkSqlWriter.scala.
# Once this config is honored, you should be able to write test cases for each of these parts of the code to verify the feature.
> Add ability to decouple configs for scheduling inline and running async > --- > > Key: HUDI-1847 > URL: https://issues.apache.org/jira/browse/HUDI-1847 > Project: Apache Hudi > Issue Type: Improvement > Components: Compaction >Reporter: Nishith Agarwal >Priority: Major > Labels: sev:high > > Currently, there are 2 ways to enable compaction: > > # Inline - This will schedule compaction inline and execute inline > # Async - This option is only available for HoodieDeltaStreamer based jobs. > This turns on scheduling inline and running async as part of the same spark > job. > > Users need a config to schedule compaction inline only, while retaining the > ability to execute it in their own Spark job -- This message was sent by Atlassian Jira (v8.3.4#803005)
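The config-plus-getter pattern the steps above describe can be sketched in plain Java. The key name and default below are illustrative assumptions; the real property would be defined in HoodieCompactionConfig and exposed through HoodieWriteConfig, not in a standalone class like this:

```java
import java.util.Properties;

// Minimal sketch of steps 1-2: a "schedule compaction inline" flag backed by
// a Properties map, exposed through a typed getter. Names are hypothetical.
class CompactionConfigSketch {
    static final String SCHEDULE_INLINE_COMPACT = "hoodie.compact.schedule.inline"; // hypothetical key
    static final String DEFAULT_SCHEDULE_INLINE_COMPACT = "false";

    private final Properties props;

    CompactionConfigSketch(Properties props) {
        this.props = props;
    }

    // Typed getter, analogous to what the write config would expose (step 2).
    boolean scheduleInlineCompaction() {
        return Boolean.parseBoolean(
                props.getProperty(SCHEDULE_INLINE_COMPACT, DEFAULT_SCHEDULE_INLINE_COMPACT));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(new CompactionConfigSketch(props).scheduleInlineCompaction()); // false (default)

        props.setProperty(SCHEDULE_INLINE_COMPACT, "true");
        System.out.println(new CompactionConfigSketch(props).scheduleInlineCompaction()); // true
    }
}
```

Call sites that currently schedule compaction unconditionally (step 3) would then consult this getter before scheduling.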
[GitHub] [hudi] diogodilcl commented on issue #1679: [HUDI-1609] How to disable Hive JDBC and enable metastore
diogodilcl commented on issue #1679: URL: https://github.com/apache/hudi/issues/1679#issuecomment-828828874 Hudi version: 0.7.0 Emr : 6.2 Hi, when I use: `"hoodie.datasource.hive_sync.use_jdbc":"false"` I have the following exception: ``` 21/04/28 22:19:49 ERROR HiveSyncTool: Got runtime exception when hive syncing org.apache.hudi.hive.HoodieHiveSyncException: Failed in executing SQL at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLs(HoodieHiveClient.java:406) at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLUsingHiveDriver(HoodieHiveClient.java:384) at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:374) at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:263) at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:181) at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:136) at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:94) at org.apache.hudi.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:355) at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$4(HoodieSparkSqlWriter.scala:403) at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$4$adapted(HoodieSparkSqlWriter.scala:399) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:399) at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:460) at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:218) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:134) at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90) at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180) at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:124) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:123) at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:963) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:104) at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:227) at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:107) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:132) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:104) at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:227) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:132) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:248) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:131) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68) at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:963) at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:415) at 
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:399) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:288) at sun.reflect.GeneratedMethodAccessor222.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at
[jira] [Updated] (HUDI-1371) Support file listing using metadata for Spark DataSource and Spark SQL queries
[ https://issues.apache.org/jira/browse/HUDI-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-1371: Summary: Support file listing using metadata for Spark DataSource and Spark SQL queries (was: Implement Spark datasource by fetching file listing from metadata table) > Support file listing using metadata for Spark DataSource and Spark SQL queries > -- > > Key: HUDI-1371 > URL: https://issues.apache.org/jira/browse/HUDI-1371 > Project: Apache Hudi > Issue Type: Sub-task > Components: Spark Integration >Affects Versions: 0.9.0 >Reporter: Vinoth Chandar >Assignee: Udit Mehrotra >Priority: Blocker > Labels: pull-request-available > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-1371) Implement Spark datasource by fetching file listing from metadata table
[ https://issues.apache.org/jira/browse/HUDI-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1371: - Labels: pull-request-available (was: ) > Implement Spark datasource by fetching file listing from metadata table > --- > > Key: HUDI-1371 > URL: https://issues.apache.org/jira/browse/HUDI-1371 > Project: Apache Hudi > Issue Type: Sub-task > Components: Spark Integration >Affects Versions: 0.9.0 >Reporter: Vinoth Chandar >Assignee: Udit Mehrotra >Priority: Blocker > Labels: pull-request-available > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] umehrot2 opened a new pull request #2893: [HUDI-1371] Support metadata based listing for Spark DataSource and Spark SQL
umehrot2 opened a new pull request #2893: URL: https://github.com/apache/hudi/pull/2893 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the pull request This PR adds support for metadata based listing for Hudi Spark DataSource and Spark SQL based queries. The detailed design for Spark integration (V2 implementation specifically) can be found at https://cwiki.apache.org/confluence/display/HUDI/RFC+-+15%3A+HUDI+File+Listing+Improvements#RFC15:HUDIFileListingImprovements-Spark. Two parts of the V2 design have already been implemented: - Custom FileIndex for Hudi: https://github.com/apache/hudi/pull/2651 - Registering Hudi tables as DataSource tables in Hive metastore so they are executed via Hudi DataSource instead of Hive InputFormat/Serde. In the process, it will also use the FileIndex implemented in Hudi DataSource: https://github.com/apache/hudi/pull/2283 In this PR we build on top of the FileIndex implementation to get file listing from Hudi's metadata table if the feature is enabled, and otherwise fall back to distributed listing using the Spark context. The metadata table is read just once, reducing O(N) list calls for N partitions to O(1) get calls. We also refactor the Hudi metadata table contract to add a new API that can fetch listings for multiple partitions while opening the reader just once. ## Brief change log ## Verify this pull request - Existing unit tests updated - Ran several performance tests internally on AWS EMR via Spark DataSource and Spark SQL to observe improvements in query planning times ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
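The batched-listing refactor the PR description mentions (one metadata-reader open for N partitions instead of N opens) can be illustrated with a small counter-based sketch. The method names and return types here are made up for illustration and are not Hudi's actual metadata API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Counts simulated metadata-reader opens to contrast per-partition listing
// (one open per partition) with a batched call (one open total).
class BatchedListingSketch {
    static int readerOpens = 0;

    // Per-partition API: every call opens the metadata reader again.
    static List<String> filesInPartition(String partition) {
        readerOpens++;
        return List.of(partition + "/file-1.parquet");
    }

    // Batched API: a single reader open serves all requested partitions.
    static Map<String, List<String>> filesInPartitions(List<String> partitions) {
        readerOpens++;
        Map<String, List<String>> out = new HashMap<>();
        for (String p : partitions) {
            out.put(p, List.of(p + "/file-1.parquet"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> partitions = List.of("2021/04/27", "2021/04/28", "2021/04/29");

        readerOpens = 0;
        for (String p : partitions) {
            filesInPartition(p);
        }
        System.out.println("per-partition opens: " + readerOpens); // 3

        readerOpens = 0;
        filesInPartitions(partitions);
        System.out.println("batched opens: " + readerOpens); // 1
    }
}
```

The same trade-off motivates the O(N)-to-O(1) claim in the description: the cost that shrinks is the number of reader opens, not the number of partitions listed.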
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2833: [HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo
codecov-commenter edited a comment on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-828792354 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2833?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2833](https://codecov.io/gh/apache/hudi/pull/2833?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (98d109a) into [master](https://codecov.io/gh/apache/hudi/commit/3ca90302562580a7c5c69fd3f11ab376cfac1f0b?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3ca9030) will **decrease** coverage by `16.66%`. > The diff coverage is `55.41%`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2833/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2833?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master #2833 +/- ## = - Coverage 69.75% 53.08% -16.67% - Complexity 375 3761 +3386 = Files 54 489 +435 Lines 1997 23792 +21795 Branches 236 2467 +2231 = + Hits 1393 12630 +11237 - Misses 473 10082 +9609 - Partials 131 1080 +949 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `39.58% <37.50%> (?)` | `220.00 <0.00> (?)` | | | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | | | hudicommon | `50.72% <52.65%> (?)` | `1991.00 <41.00> (?)` | | | hudiflink | `59.67% <66.66%> (?)` | `537.00 <0.00> (?)` | | | hudihadoopmr | `33.33% <100.00%> (?)` | `198.00 <0.00> (?)` | | | hudisparkdatasource | `73.33% <79.85%> (?)` | `237.00 <4.00> (?)` | | | hudisync | `46.39% <15.38%> (?)` | `142.00 <0.00> (?)` | | | huditimelineservice | `64.07% <0.00%> (?)` | `62.00 <0.00> (?)` | | | hudiutilities | 
`68.99% <28.26%> (-0.77%)` | `374.00 <0.00> (-1.00)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2833?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...n/java/org/apache/hudi/cli/commands/SparkMain.java](https://codecov.io/gh/apache/hudi/pull/2833/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NwYXJrTWFpbi5qYXZh) | `6.06% <0.00%> (ø)` | `4.00 <0.00> (?)` | | | [.../main/scala/org/apache/hudi/cli/SparkHelpers.scala](https://codecov.io/gh/apache/hudi/pull/2833/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9TcGFya0hlbHBlcnMuc2NhbGE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | [...pache/hudi/common/config/HoodieMetadataConfig.java](https://codecov.io/gh/apache/hudi/pull/2833/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2NvbmZpZy9Ib29kaWVNZXRhZGF0YUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | 
[...g/apache/hudi/common/config/LockConfiguration.java](https://codecov.io/gh/apache/hudi/pull/2833/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2NvbmZpZy9Mb2NrQ29uZmlndXJhdGlvbi5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | [...c/main/java/org/apache/hudi/common/fs/FSUtils.java](https://codecov.io/gh/apache/hudi/pull/2833/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0ZTVXRpbHMuamF2YQ==) | `46.66% <0.00%> (ø)` | `57.00 <0.00> (?)` | | |
[GitHub] [hudi] codecov-commenter commented on pull request #2833: [HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo
codecov-commenter commented on pull request #2833:
URL: https://github.com/apache/hudi/pull/2833#issuecomment-828792354

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2833) Report
> Merging [#2833](https://codecov.io/gh/apache/hudi/pull/2833) (98d109a) into [master](https://codecov.io/gh/apache/hudi/commit/3ca90302562580a7c5c69fd3f11ab376cfac1f0b) (3ca9030) will **decrease** coverage by `0.76%`.
> The diff coverage is `28.26%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2833/graphs/tree.svg)](https://codecov.io/gh/apache/hudi/pull/2833)

```diff
@@             Coverage Diff              @@
##             master    #2833      +/-   ##
============================================
- Coverage     69.75%   68.99%   -0.77%
+ Complexity      375      374       -1
============================================
  Files            54       54
  Lines          1997     2019      +22
  Branches        236      235       -1
============================================
  Hits           1393     1393
- Misses          473      494      +21
- Partials        131      132       +1
```

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
| hudiutilities | `68.99% <28.26%> (-0.77%)` | `374.00 <0.00> (-1.00)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2833) | Coverage Δ | Complexity Δ |
|---|---|---|
| [...callback/kafka/HoodieWriteCommitKafkaCallback.java](https://codecov.io/gh/apache/hudi/pull/2833/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NhbGxiYWNrL2thZmthL0hvb2RpZVdyaXRlQ29tbWl0S2Fma2FDYWxsYmFjay5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` |
| [...ck/kafka/HoodieWriteCommitKafkaCallbackConfig.java](https://codecov.io/gh/apache/hudi/pull/2833/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NhbGxiYWNrL2thZmthL0hvb2RpZVdyaXRlQ29tbWl0S2Fma2FDYWxsYmFja0NvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/2833/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `78.52% <0.00%> (ø)` | `19.00 <0.00> (ø)` |
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2833/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` |
| [...udi/utilities/deltastreamer/BootstrapExecutor.java](https://codecov.io/gh/apache/hudi/pull/2833/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvQm9vdHN0cmFwRXhlY3V0b3IuamF2YQ==) | `82.69% <100.00%> (+0.33%)` | `6.00 <0.00> (ø)` |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2833/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.08%
[jira] [Created] (HUDI-1866) Investigate if hive-sync works as expected in a quickstart environment for 0.8
Nishith Agarwal created HUDI-1866: - Summary: Investigate if hive-sync works as expected in a quickstart environment for 0.8 Key: HUDI-1866 URL: https://issues.apache.org/jira/browse/HUDI-1866 Project: Apache Hudi Issue Type: Bug Components: Hive Integration Reporter: Nishith Agarwal Assignee: Nishith Agarwal Hive-Sync seems to be failing for a few users, as reported on Slack; see an example here -> [https://apache-hudi.slack.com/archives/C4D716NPQ/p161950993803] We need to investigate if this is a real issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-1866) Investigate if hive-sync works as expected in a quickstart environment for 0.8
[ https://issues.apache.org/jira/browse/HUDI-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1866: -- Labels: sev:critical (was: ) > Investigate if hive-sync works as expected in a quickstart environment for 0.8 > -- > > Key: HUDI-1866 > URL: https://issues.apache.org/jira/browse/HUDI-1866 > Project: Apache Hudi > Issue Type: Bug > Components: Hive Integration >Reporter: Nishith Agarwal >Assignee: Nishith Agarwal >Priority: Major > Labels: sev:critical > > Hive-Sync seems to be failing for few users as reported on slack, see an > example here -> > [https://apache-hudi.slack.com/archives/C4D716NPQ/p161950993803] > > We need to investigate if this is a real issue -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] abhijeetkushe edited a comment on issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode
abhijeetkushe edited a comment on issue #2850: URL: https://github.com/apache/hudi/issues/2850#issuecomment-824917902 @xushiyan Thanks for your prompt reply. I agree that the issue I am facing is somewhat related to [HUDI-1723](https://issues.apache.org/jira/browse/HUDI-1723). It is great that the Hudi team is actively working on addressing this issue. We have come up with the below interim solution to address our issue: - We are using INSERT while writing our data, as that is both memory- and time-efficient, so using UPSERT just to handle missing files will not work for us. - The solution you proposed, overriding the DFSPathSelector, will work for us. We are planning to override the [below line](https://github.com/apache/hudi/blob/release-0.6.0/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DFSPathSelector.java#L92) with `f.getModificationTime() <= Long.valueOf(lastCheckpointStr.get()).longValue() || f.getModificationTime() > (System.currentTimeMillis() - 3)`. We are using Hudi version 0.6.0. This will result in a 30-second lag while writing records, which is acceptable to us and will address the missing-file problem completely. The 30-second lag will be configurable via an environment variable. HoodieDeltaStreamer takes --source-class as an argument, where we will be providing our custom JsonDFSSource, which delegates to our custom DFSPathSelector. - Can you please validate whether HoodieDeltaStreamer will be able to record the correct checkpoint with the change I am proposing to make above? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
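The file-selection rule proposed in the comment above can be sketched as follows. This is an illustrative Python sketch only (the actual change would go into Hudi's Java `DFSPathSelector`), and it reads the quoted expression as a "skip this file" condition; `lag_ms` is a hypothetical stand-in for the environment-configurable 30-second lag.

```python
# Sketch of the proposed eligibility rule for a custom DFSPathSelector.
# A file is skipped if it is at or before the last checkpoint (already
# processed) OR newer than `now - lag_ms` (the S3 listing may not show it
# consistently yet, so it is deferred to the next DeltaStreamer round
# instead of being missed forever).

def should_skip(mod_time_ms: int, last_checkpoint_ms: int,
                now_ms: int, lag_ms: int = 30_000) -> bool:
    """True if the file should NOT be ingested in this round."""
    return mod_time_ms <= last_checkpoint_ms or mod_time_ms > now_ms - lag_ms

now = 1_000_000_000_000
print(should_skip(now - 10_000, 0, now))  # True: written 10 s ago, deferred
print(should_skip(now - 60_000, 0, now))  # False: written 60 s ago, ingested
```

Because deferred files stay newer than the recorded checkpoint, they remain eligible on the next round, which is what makes the approach checkpoint-safe in principle.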
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892:
URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2892) Report
> Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892) (e2d0335) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d) (3867676) will **decrease** coverage by `21.87%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2892/graphs/tree.svg)](https://codecov.io/gh/apache/hudi/pull/2892)

```diff
@@              Coverage Diff              @@
##             master    #2892       +/-   ##
=============================================
- Coverage     69.75%   47.87%   -21.88%
- Complexity      375     3419     +3044
=============================================
  Files            54      488      +434
  Lines          1997    23527    +21530
  Branches        236     2501     +2265
=============================================
+ Hits           1393    11264     +9871
- Misses          473    11281    +10808
- Partials        131      982      +851
```

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` |
| hudicommon | `50.37% <ø> (?)` | `1975.00 <ø> (?)` |
| hudiflink | `59.67% <ø> (?)` | `537.00 <ø> (?)` |
| hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` |
| hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` |
| hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` |
| huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` |
| hudiutilities | `9.36% <ø> (-60.40%)` | `48.00 <ø> (-327.00)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892) | Coverage Δ | Complexity Δ |
|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2892/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2892/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` |
[jira] [Commented] (HUDI-1607) Decimal handling bug in SparkAvroPostProcessor
[ https://issues.apache.org/jira/browse/HUDI-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334823#comment-17334823 ] sivabalan narayanan commented on HUDI-1607: --- https://issues.apache.org/jira/browse/HUDI-1343?focusedCommentId=17325964=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17325964 > Decimal handling bug in SparkAvroPostProcessor > --- > > Key: HUDI-1607 > URL: https://issues.apache.org/jira/browse/HUDI-1607 > Project: Apache Hudi > Issue Type: Bug >Reporter: Jingwei Zhang >Priority: Major > Labels: sev:critical, user-support-issues > > This issue is related to > [HUDI-1343|https://github.com/apache/hudi/pull/2192]. > I think the purpose of HUDI-1343 was to bridge the difference between Avro > 1.8.2 (used by Hudi) and Avro 1.9.2 (used by the upstream system) through an internal > Struct type; in particular, the incompatible forms of expressing a nullable type > between those two versions. > It was all good until I hit the Decimal type. Since it can be either FIXED or > BYTES, if an Avro schema contains a decimal type with BYTES as its literal > type, after this two-way conversion its literal type becomes FIXED instead. > This will cause an exception to be thrown in AvroConversionHelper, as the data > underneath is a HeapByteBuffer rather than a GenericFixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-1864) Support for java.time.LocalDate in TimestampBasedAvroKeyGenerator
[ https://issues.apache.org/jira/browse/HUDI-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1864: -- Labels: sev:high (was: ) > Support for java.time.LocalDate in TimestampBasedAvroKeyGenerator > - > > Key: HUDI-1864 > URL: https://issues.apache.org/jira/browse/HUDI-1864 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Vaibhav Sinha >Priority: Major > Labels: sev:high > > When we read data from MySQL which has a column of type {{Date}}, Spark > represents it as an instance of {{java.time.LocalDate}}. If I try and use > this column for partitioning while doing a write to Hudi, I get the following > exception > > {code:java} > Caused by: org.apache.hudi.exception.HoodieKeyGeneratorException: Unable to > parse input partition field :2021-04-21 > at > org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator.getPartitionPath(TimestampBasedAvroKeyGenerator.java:136) > ~[hudi-spark3-bundle_2.12-0.8.0.jar:0.8.0] > at > org.apache.hudi.keygen.CustomAvroKeyGenerator.getPartitionPath(CustomAvroKeyGenerator.java:89) > ~[hudi-spark3-bundle_2.12-0.8.0.jar:0.8.0] > at > org.apache.hudi.keygen.CustomKeyGenerator.getPartitionPath(CustomKeyGenerator.java:64) > ~[hudi-spark3-bundle_2.12-0.8.0.jar:0.8.0] > at > org.apache.hudi.keygen.BaseKeyGenerator.getKey(BaseKeyGenerator.java:62) > ~[hudi-spark3-bundle_2.12-0.8.0.jar:0.8.0] > at > org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$2(HoodieSparkSqlWriter.scala:160) > ~[hudi-spark3-bundle_2.12-0.8.0.jar:0.8.0] > at scala.collection.Iterator$$anon$10.next(Iterator.scala:459) > ~[scala-library-2.12.10.jar:?] > at scala.collection.Iterator$SliceIterator.next(Iterator.scala:271) > ~[scala-library-2.12.10.jar:?] > at scala.collection.Iterator.foreach(Iterator.scala:941) > ~[scala-library-2.12.10.jar:?] > at scala.collection.Iterator.foreach$(Iterator.scala:941) > ~[scala-library-2.12.10.jar:?] 
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) > ~[scala-library-2.12.10.jar:?] > at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62) > ~[scala-library-2.12.10.jar:?] > at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53) > ~[scala-library-2.12.10.jar:?] > at > scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105) > ~[scala-library-2.12.10.jar:?] > at > scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49) > ~[scala-library-2.12.10.jar:?] > at scala.collection.TraversableOnce.to(TraversableOnce.scala:315) > ~[scala-library-2.12.10.jar:?] > at scala.collection.TraversableOnce.to$(TraversableOnce.scala:313) > ~[scala-library-2.12.10.jar:?] > at scala.collection.AbstractIterator.to(Iterator.scala:1429) > ~[scala-library-2.12.10.jar:?] > at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:307) > ~[scala-library-2.12.10.jar:?] > at > scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:307) > ~[scala-library-2.12.10.jar:?] > at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1429) > ~[scala-library-2.12.10.jar:?] > at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:294) > ~[scala-library-2.12.10.jar:?] > at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:288) > ~[scala-library-2.12.10.jar:?] > at scala.collection.AbstractIterator.toArray(Iterator.scala:1429) > ~[scala-library-2.12.10.jar:?] 
> at org.apache.spark.rdd.RDD.$anonfun$take$2(RDD.scala:1449) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at > org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2242) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at org.apache.spark.scheduler.Task.run(Task.scala:131) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500) > ~[spark-core_2.12-3.1.1.jar:3.1.1] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[?:1.8.0_171] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ~[?:1.8.0_171] > at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_171] > Caused by: org.apache.hudi.exception.HoodieNotSupportedException: Unexpected > type for partition field: java.time.LocalDate
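Until the key generators understand `java.time.LocalDate`, a common workaround for the exception quoted above is to convert the date column into a representation TimestampBasedAvroKeyGenerator already handles (a formatted string or an epoch value) before the Hudi write. Below is a minimal sketch of the epoch-milliseconds conversion, in Python for illustration; using `EPOCHMILLISECONDS` as the key generator's input type is an assumption about the configuration, not something stated in the issue.

```python
from datetime import date, datetime, timezone

# Pre-write conversion sketch: turn a calendar date (what Spark surfaces as
# java.time.LocalDate for a MySQL Date column) into UTC epoch milliseconds,
# which an epoch-based timestamp key generator input type can consume.
# In Spark itself the equivalent fix is usually a column cast before the
# write, e.g. df.withColumn("d", col("d").cast("timestamp")) -- "d" being a
# hypothetical column name.

def date_to_epoch_millis(d: date) -> int:
    """Midnight UTC of the given calendar date, in epoch milliseconds."""
    return int(datetime(d.year, d.month, d.day, tzinfo=timezone.utc).timestamp() * 1000)

# The partition value from the stack trace:
print(date_to_epoch_millis(date(2021, 4, 21)))
```

The key point is that the conversion must happen before the row reaches `BaseKeyGenerator.getKey`, i.e. in the DataFrame, not in Hudi configuration alone.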
[jira] [Commented] (HUDI-1739) insert_overwrite_table and insert_overwrite create empty replacecommit.requested file which breaks archival
[ https://issues.apache.org/jira/browse/HUDI-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334816#comment-17334816 ] satish commented on HUDI-1739: -- [~shivnarayan] https://github.com/apache/hudi/pull/2784 we already have a PR for this. Looks like this is also a dup of HUDI-1740 > insert_overwrite_table and insert_overwrite create empty > replacecommit.requested file which breaks archival > --- > > Key: HUDI-1739 > URL: https://issues.apache.org/jira/browse/HUDI-1739 > Project: Apache Hudi > Issue Type: Bug >Reporter: Jagmeet Bali >Assignee: Susu Dong >Priority: Minor > Labels: sev:high > > Fixes can be to: > # Ignore empty replacecommit.requested files. > # Standardize the replacecommit.requested format across all invocations, be > it from clustering or this use case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] satishkotha commented on pull request #2784: [HUDI-1740] Fix insert-overwrite API archival
satishkotha commented on pull request #2784: URL: https://github.com/apache/hudi/pull/2784#issuecomment-828560911 > I will take a pass on this and land! @vinothchandar could you please review this since it's been waiting for some time? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] danny0405 closed pull request #2892: [HUDI-1865] Make embedded time line service singleton
danny0405 closed pull request #2892: URL: https://github.com/apache/hudi/pull/2892 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (HUDI-1063) Save in Google Cloud Storage not working
[ https://issues.apache.org/jira/browse/HUDI-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334773#comment-17334773 ] sivabalan narayanan edited comment on HUDI-1063 at 4/28/21, 2:51 PM: - [~WaterKnight]: I could not reproduce the issue w/ the latest master; things are working fine. I followed [this|https://holowczak.com/getting-started-with-apache-spark-on-google-cloud-platform-using-dataproc/] link to set up my cluster. Command I used to launch spark-shell: ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` [Link|https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] to gist for the steps I tried out. Not sure if Hadoop 3+ was tried w/ 0.5.3. Hudi has had a few more releases after 0.5.0, with the latest being 0.8.0, which is tested for Hadoop 3. If you want to try out Hudi 0.5.3, I would recommend trying out Hadoop 2.7 maybe. was (Author: shivnarayan): [~WaterKnight]: I could not reproduce the issue w/ latest master. things are working fine. Command I used to launch spark shell ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` [Link|https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] to gist for steps I tried out. Not sure if Hadoop 3+ was tried w/ 0.5.3. Hudi has few more releases after 0.5.0 with latest as 0.8.0 which is tested for hadoop3. If you want to try out hudi 0.5.3, would recommend trying out hadoop2.7 may be. 
> Save in Google Cloud Storage not working > > > Key: HUDI-1063 > URL: https://issues.apache.org/jira/browse/HUDI-1063 > Project: Apache Hudi > Issue Type: Bug > Components: Spark Integration >Affects Versions: 0.9.0 >Reporter: David Lacalle Castillo >Priority: Critical > Labels: sev:critical, user-support-issues > Fix For: 0.9.0 > > > I added to spark submit the following properties: > {{--packages > org.apache.hudi:hudi-spark-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4 > \ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'}} > Spark version 2.4.5 and Hadoop version 3.2.1 > > I am trying to save a Dataframe as follows in Google Cloud Storage as follows: > tableName = "forecasts" > basePath = "gs://hudi-datalake/" + tableName > hudi_options = { > 'hoodie.table.name': tableName, > 'hoodie.datasource.write.recordkey.field': 'uuid', > 'hoodie.datasource.write.partitionpath.field': 'partitionpath', > 'hoodie.datasource.write.table.name': tableName, > 'hoodie.datasource.write.operation': 'insert', > 'hoodie.datasource.write.precombine.field': 'ts', > 'hoodie.upsert.shuffle.parallelism': 2, > 'hoodie.insert.shuffle.parallelism': 2 > } > results = results.selectExpr( > "ds as date", > "store", > "item", > "y as sales", > "yhat as sales_predicted", > "yhat_upper as sales_predicted_upper", > "yhat_lower as sales_predicted_lower", > "training_date") > results.write.format("hudi"). \ > options(**hudi_options). \ > mode("overwrite"). \ > save(basePath) > I am getting the following error: > Py4JJavaError: An error occurred while calling o312.save. 
: > java.lang.NoSuchMethodError: > org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V at > io.javalin.core.util.JettyServerUtil.defaultSessionHandler(JettyServerUtil.kt:50) > at io.javalin.Javalin.<init>(Javalin.java:94) at > io.javalin.Javalin.create(Javalin.java:107) at > org.apache.hudi.timeline.service.TimelineService.startService(TimelineService.java:102) > at > org.apache.hudi.client.embedded.EmbeddedTimelineService.startServer(EmbeddedTimelineService.java:74) > at > org.apache.hudi.client.AbstractHoodieClient.startEmbeddedServerView(AbstractHoodieClient.java:102) > at > org.apache.hudi.client.AbstractHoodieClient.<init>(AbstractHoodieClient.java:69) > at > org.apache.hudi.client.AbstractHoodieWriteClient.<init>(AbstractHoodieWriteClient.java:83) > at > org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:137) > at > org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:124) > at > org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:120) > at > org.apache.hudi.DataSourceUtils.createHoodieClient(DataSourceUtils.java:195) > at > org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:135) > at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:108) at >
[jira] [Updated] (HUDI-1063) Save in Google Cloud Storage not working
[ https://issues.apache.org/jira/browse/HUDI-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1063: -- Labels: sev:critical sev:triage user-support-issues (was: sev:critical user-support-issues) > Save in Google Cloud Storage not working > > > Key: HUDI-1063 > URL: https://issues.apache.org/jira/browse/HUDI-1063 > Project: Apache Hudi > Issue Type: Bug > Components: Spark Integration >Affects Versions: 0.9.0 >Reporter: David Lacalle Castillo >Priority: Critical > Labels: sev:critical, sev:triage, user-support-issues > Fix For: 0.9.0 > > > I added to spark submit the following properties: > {{--packages > org.apache.hudi:hudi-spark-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4 > \ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'}} > Spark version 2.4.5 and Hadoop version 3.2.1 > > I am trying to save a Dataframe as follows in Google Cloud Storage as follows: > tableName = "forecasts" > basePath = "gs://hudi-datalake/" + tableName > hudi_options = { > 'hoodie.table.name': tableName, > 'hoodie.datasource.write.recordkey.field': 'uuid', > 'hoodie.datasource.write.partitionpath.field': 'partitionpath', > 'hoodie.datasource.write.table.name': tableName, > 'hoodie.datasource.write.operation': 'insert', > 'hoodie.datasource.write.precombine.field': 'ts', > 'hoodie.upsert.shuffle.parallelism': 2, > 'hoodie.insert.shuffle.parallelism': 2 > } > results = results.selectExpr( > "ds as date", > "store", > "item", > "y as sales", > "yhat as sales_predicted", > "yhat_upper as sales_predicted_upper", > "yhat_lower as sales_predicted_lower", > "training_date") > results.write.format("hudi"). \ > options(**hudi_options). \ > mode("overwrite"). \ > save(basePath) > I am getting the following error: > Py4JJavaError: An error occurred while calling o312.save. 
: > java.lang.NoSuchMethodError: > org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V at > io.javalin.core.util.JettyServerUtil.defaultSessionHandler(JettyServerUtil.kt:50) > at io.javalin.Javalin.(Javalin.java:94) at > io.javalin.Javalin.create(Javalin.java:107) at > org.apache.hudi.timeline.service.TimelineService.startService(TimelineService.java:102) > at > org.apache.hudi.client.embedded.EmbeddedTimelineService.startServer(EmbeddedTimelineService.java:74) > at > org.apache.hudi.client.AbstractHoodieClient.startEmbeddedServerView(AbstractHoodieClient.java:102) > at > org.apache.hudi.client.AbstractHoodieClient.(AbstractHoodieClient.java:69) > at > org.apache.hudi.client.AbstractHoodieWriteClient.(AbstractHoodieWriteClient.java:83) > at > org.apache.hudi.client.HoodieWriteClient.(HoodieWriteClient.java:137) > at > org.apache.hudi.client.HoodieWriteClient.(HoodieWriteClient.java:124) > at > org.apache.hudi.client.HoodieWriteClient.(HoodieWriteClient.java:120) > at > org.apache.hudi.DataSourceUtils.createHoodieClient(DataSourceUtils.java:195) > at > org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:135) > at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:108) at > org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83) > at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81) > at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > at >
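Stripped of the mail-archive mangling, the reporter's PySpark write setup quoted above boils down to the sketch below. The actual `results.write` call needs a live Spark session with the hudi-spark bundle on the classpath, so it is left as a comment; note also that a `NoSuchMethodError` on `SessionHandler.setHttpOnly` is typically a Jetty version clash between the Hadoop 3.x classpath and the Javalin-based embedded timeline server (an interpretation, not stated in the issue), rather than a problem in these options.

```python
# Reporter's Hudi write configuration from the issue body, reassembled into
# one runnable fragment. Values mirror the issue text verbatim.
table_name = "forecasts"
base_path = "gs://hudi-datalake/" + table_name

hudi_options = {
    "hoodie.table.name": table_name,
    "hoodie.datasource.write.recordkey.field": "uuid",
    "hoodie.datasource.write.partitionpath.field": "partitionpath",
    "hoodie.datasource.write.table.name": table_name,
    "hoodie.datasource.write.operation": "insert",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.upsert.shuffle.parallelism": 2,
    "hoodie.insert.shuffle.parallelism": 2,
}

# With a SparkSession and the hudi-spark bundle available:
# results.write.format("hudi").options(**hudi_options) \
#     .mode("overwrite").save(base_path)
```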
[jira] [Comment Edited] (HUDI-1063) Save in Google Cloud Storage not working
[ https://issues.apache.org/jira/browse/HUDI-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334773#comment-17334773 ] sivabalan narayanan edited comment on HUDI-1063 at 4/28/21, 2:50 PM: - [~WaterKnight]: I could not reproduce the issue w/ latest master. things are working fine. Command I used to launch spark shell ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` [Link|https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] to gist for steps I tried out. Not sure if Hadoop 3+ was tried w/ 0.5.3. Hudi has few more releases after 0.5.0 with latest as 0.8.0 which is tested for hadoop3. If you want to try out hudi 0.5.3, would recommend trying out hadoop2.7 may be. was (Author: shivnarayan): [~WaterKnight]: I could not reproduce the issue w/ latest master. things are working fine. Command I used to launch spark shell ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` [Link|https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] to gist for steps I tried out. 
> Save in Google Cloud Storage not working > > > Key: HUDI-1063 > URL: https://issues.apache.org/jira/browse/HUDI-1063 > Project: Apache Hudi > Issue Type: Bug > Components: Spark Integration >Affects Versions: 0.9.0 >Reporter: David Lacalle Castillo >Priority: Critical > Labels: sev:critical, user-support-issues > Fix For: 0.9.0 > > > I added to spark submit the following properties: > {{--packages > org.apache.hudi:hudi-spark-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4 > \ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'}} > Spark version 2.4.5 and Hadoop version 3.2.1 > > I am trying to save a Dataframe as follows in Google Cloud Storage as follows: > tableName = "forecasts" > basePath = "gs://hudi-datalake/" + tableName > hudi_options = { > 'hoodie.table.name': tableName, > 'hoodie.datasource.write.recordkey.field': 'uuid', > 'hoodie.datasource.write.partitionpath.field': 'partitionpath', > 'hoodie.datasource.write.table.name': tableName, > 'hoodie.datasource.write.operation': 'insert', > 'hoodie.datasource.write.precombine.field': 'ts', > 'hoodie.upsert.shuffle.parallelism': 2, > 'hoodie.insert.shuffle.parallelism': 2 > } > results = results.selectExpr( > "ds as date", > "store", > "item", > "y as sales", > "yhat as sales_predicted", > "yhat_upper as sales_predicted_upper", > "yhat_lower as sales_predicted_lower", > "training_date") > results.write.format("hudi"). \ > options(**hudi_options). \ > mode("overwrite"). \ > save(basePath) > I am getting the following error: > Py4JJavaError: An error occurred while calling o312.save. 
> : java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
>   at io.javalin.core.util.JettyServerUtil.defaultSessionHandler(JettyServerUtil.kt:50)
>   at io.javalin.Javalin.<init>(Javalin.java:94)
>   at io.javalin.Javalin.create(Javalin.java:107)
>   at org.apache.hudi.timeline.service.TimelineService.startService(TimelineService.java:102)
>   at org.apache.hudi.client.embedded.EmbeddedTimelineService.startServer(EmbeddedTimelineService.java:74)
>   at org.apache.hudi.client.AbstractHoodieClient.startEmbeddedServerView(AbstractHoodieClient.java:102)
>   at org.apache.hudi.client.AbstractHoodieClient.<init>(AbstractHoodieClient.java:69)
>   at org.apache.hudi.client.AbstractHoodieWriteClient.<init>(AbstractHoodieWriteClient.java:83)
>   at org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:137)
>   at org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:124)
>   at org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:120)
>   at org.apache.hudi.DataSourceUtils.createHoodieClient(DataSourceUtils.java:195)
>   at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:135)
>   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:108)
>   at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
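[Editor's aside] The PySpark snippet in the report above is flattened by the mail archive. The sketch below reconstructs it for readers following along: the option keys, table name, and base path are copied verbatim from the issue text, while the helper function name is made up here. The actual write call requires a live SparkSession with the Hudi bundle on the classpath, so it is shown only as a comment.

```python
# Reconstruction of the Hudi writer options from the issue report.
# Option keys/values are copied from the report; `make_hudi_options`
# is a hypothetical helper introduced for this sketch.

def make_hudi_options(table_name: str) -> dict:
    """Build the writer options used in the reported failing job."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": "uuid",
        "hoodie.datasource.write.partitionpath.field": "partitionpath",
        "hoodie.datasource.write.table.name": table_name,
        "hoodie.datasource.write.operation": "insert",
        "hoodie.datasource.write.precombine.field": "ts",
        "hoodie.upsert.shuffle.parallelism": 2,
        "hoodie.insert.shuffle.parallelism": 2,
    }

table_name = "forecasts"
base_path = "gs://hudi-datalake/" + table_name
hudi_options = make_hudi_options(table_name)

# With a live SparkSession the write from the report would look like:
# results.write.format("hudi").options(**hudi_options) \
#     .mode("overwrite").save(base_path)
```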
[jira] [Commented] (HUDI-1063) Save in Google Cloud Storage not working
[ https://issues.apache.org/jira/browse/HUDI-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334773#comment-17334773 ] sivabalan narayanan commented on HUDI-1063: --- [~WaterKnight]: I could not reproduce the issue w/ latest master. things are working fine. Command I used to launch spark shell ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` [Link|https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] to gist for steps I tried out. > Save in Google Cloud Storage not working > > > Key: HUDI-1063 > URL: https://issues.apache.org/jira/browse/HUDI-1063 > Project: Apache Hudi > Issue Type: Bug > Components: Spark Integration >Affects Versions: 0.9.0 >Reporter: David Lacalle Castillo >Priority: Critical > Labels: sev:critical, user-support-issues > Fix For: 0.9.0 > > > I added to spark submit the following properties: > {{--packages > org.apache.hudi:hudi-spark-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4 > \ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'}} > Spark version 2.4.5 and Hadoop version 3.2.1 > > I am trying to save a Dataframe as follows in Google Cloud Storage as follows: > tableName = "forecasts" > basePath = "gs://hudi-datalake/" + tableName > hudi_options = { > 'hoodie.table.name': tableName, > 'hoodie.datasource.write.recordkey.field': 'uuid', > 'hoodie.datasource.write.partitionpath.field': 'partitionpath', > 'hoodie.datasource.write.table.name': tableName, > 'hoodie.datasource.write.operation': 'insert', > 'hoodie.datasource.write.precombine.field': 'ts', > 'hoodie.upsert.shuffle.parallelism': 2, > 'hoodie.insert.shuffle.parallelism': 2 > } > results = results.selectExpr( > "ds as date", > "store", > "item", > "y as sales", > "yhat as sales_predicted", > "yhat_upper as sales_predicted_upper", > "yhat_lower 
as sales_predicted_lower", > "training_date") > results.write.format("hudi"). \ > options(**hudi_options). \ > mode("overwrite"). \ > save(basePath) > I am getting the following error: > Py4JJavaError: An error occurred while calling o312.save.
> : java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V
>   at io.javalin.core.util.JettyServerUtil.defaultSessionHandler(JettyServerUtil.kt:50)
>   at io.javalin.Javalin.<init>(Javalin.java:94)
>   at io.javalin.Javalin.create(Javalin.java:107)
>   at org.apache.hudi.timeline.service.TimelineService.startService(TimelineService.java:102)
>   at org.apache.hudi.client.embedded.EmbeddedTimelineService.startServer(EmbeddedTimelineService.java:74)
>   at org.apache.hudi.client.AbstractHoodieClient.startEmbeddedServerView(AbstractHoodieClient.java:102)
>   at org.apache.hudi.client.AbstractHoodieClient.<init>(AbstractHoodieClient.java:69)
>   at org.apache.hudi.client.AbstractHoodieWriteClient.<init>(AbstractHoodieWriteClient.java:83)
>   at org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:137)
>   at org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:124)
>   at org.apache.hudi.client.HoodieWriteClient.<init>(HoodieWriteClient.java:120)
>   at org.apache.hudi.DataSourceUtils.createHoodieClient(DataSourceUtils.java:195)
>   at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:135)
>   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:108)
>   at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
>   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
[jira] [Comment Edited] (HUDI-1854) Corrupt blocks in GCS log files
[ https://issues.apache.org/jira/browse/HUDI-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334771#comment-17334771 ] sivabalan narayanan edited comment on HUDI-1854 at 4/28/21, 2:47 PM: - For me, things are working. not able to reproduce. I tried w/ latest master fyi. Followed this [link|https://holowczak.com/getting-started-with-apache-spark-on-google-cloud-platform-using-dataproc/] to set up my cluster. Launch command: ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` Gist link for commands I ran. [https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] I verified via console, that log files were > 16Mb. // Check attached screenshot. was (Author: shivnarayan): For me, things are working. not able to reproduce. I tried w/ latest master fyi. Followed this [link|https://holowczak.com/getting-started-with-apache-spark-on-google-cloud-platform-using-dataproc/] to set up my cluster. Launch command: ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` Gist link for commands I ran. [https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] I verified via console, that log files were > 16Mb. 
> Corrupt blocks in GCS log files > --- > > Key: HUDI-1854 > URL: https://issues.apache.org/jira/browse/HUDI-1854 > Project: Apache Hudi > Issue Type: Bug > Components: Common Core >Reporter: Nishith Agarwal >Priority: Major > Labels: sev:critical, sev:triage > Attachments: Screen Shot 2021-04-28 at 10.42.50 AM.png > > > Details on how to reproduce this can be found here -> > [https://github.com/apache/hudi/issues/2692] > > We need a GCS, google data proc environment to reproduce this. > > [~vburenin] Would you be able to help try out hudi 0.7 and follow the steps > mentioned in this ticket to help reproduce this issue and find the root cause > ? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-1854) Corrupt blocks in GCS log files
[ https://issues.apache.org/jira/browse/HUDI-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1854: -- Attachment: Screen Shot 2021-04-28 at 10.42.50 AM.png > Corrupt blocks in GCS log files > --- > > Key: HUDI-1854 > URL: https://issues.apache.org/jira/browse/HUDI-1854 > Project: Apache Hudi > Issue Type: Bug > Components: Common Core >Reporter: Nishith Agarwal >Priority: Major > Labels: sev:critical, sev:triage > Attachments: Screen Shot 2021-04-28 at 10.42.50 AM.png > > > Details on how to reproduce this can be found here -> > [https://github.com/apache/hudi/issues/2692] > > We need a GCS, google data proc environment to reproduce this. > > [~vburenin] Would you be able to help try out hudi 0.7 and follow the steps > mentioned in this ticket to help reproduce this issue and find the root cause > ? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1854) Corrupt blocks in GCS log files
[ https://issues.apache.org/jira/browse/HUDI-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334771#comment-17334771 ] sivabalan narayanan commented on HUDI-1854: --- For me, things are working. not able to reproduce. I tried w/ latest master fyi. Followed this [link|https://holowczak.com/getting-started-with-apache-spark-on-google-cloud-platform-using-dataproc/] to set up my cluster. Launch command: ``` /usr/lib/spark/bin/spark-shell --packages org.apache.spark:spark-avro_2.12:3.0.0 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --jars /home/n_siva_b/hudi-spark3-bundle_2.12-0.9.0-SNAPSHOT.jar ``` Gist link for commands I ran. [https://gist.github.com/nsivabalan/03736cda20c10781957b83a89e2f6650] I verified via console, that log files were > 16Mb. > Corrupt blocks in GCS log files > --- > > Key: HUDI-1854 > URL: https://issues.apache.org/jira/browse/HUDI-1854 > Project: Apache Hudi > Issue Type: Bug > Components: Common Core >Reporter: Nishith Agarwal >Priority: Major > Labels: sev:critical, sev:triage > Attachments: Screen Shot 2021-04-28 at 10.42.50 AM.png > > > Details on how to reproduce this can be found here -> > [https://github.com/apache/hudi/issues/2692] > > We need a GCS, google data proc environment to reproduce this. > > [~vburenin] Would you be able to help try out hudi 0.7 and follow the steps > mentioned in this ticket to help reproduce this issue and find the root cause > ? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (e2d0335) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3867676) will **decrease** coverage by `60.39%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2892/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #2892       +/-  ##
============================================
- Coverage     69.75%    9.36%    -60.40%
+ Complexity      375       48       -327
============================================
  Files            54       54
  Lines          1997     1997
  Branches        236      236
============================================
- Hits           1393      187      -1206
- Misses          473     1797      +1324
+ Partials        131       13       -118
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudiclient | `?` | `?` | |
| hudiutilities | `9.36% <ø> (-60.40%)` | `48.00 <ø> (-327.00)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | |
[GitHub] [hudi] danny0405 commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
danny0405 commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828420402 > I still insist that we need to include kafka-related dependencies. If you look back at the HoodieFlinkStreamerV2 class. What is it in essence? It is just a program written using Flink DataStream API, which is specific (Kafka -> Hudi) No, no one says that they don't know how to add a connector jar, and actually few people use the `HoodieFlinkStreamerV2` tool. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (d06be43) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3867676) will **decrease** coverage by `0.05%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2892/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #2892      +/-   ##
============================================
- Coverage     69.75%   69.70%     -0.06%
+ Complexity      375      374         -1
============================================
  Files            54       54
  Lines          1997     1997
  Branches        236      236
============================================
- Hits           1393     1392         -1
  Misses          473      473
- Partials        131      132         +1
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] danny0405 closed pull request #2892: [HUDI-1865] Make embedded time line service singleton
danny0405 closed pull request #2892: URL: https://github.com/apache/hudi/pull/2892 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] yanghua commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
yanghua commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828408491 > > I have two questions: > > > > 1. The lowest Flink version we supported is 1.12.x? > > 2. Can we provide an e2e demo and documentation to show the usage of the flink streamer via jar-mode, just like delta-streamer, it should be out of the box; > > > > I tried it, but missed the dependencies of the Kafka connector. Can we make the new flink streamer peer to the delta streamer? > > Yes, people would only use flink 1.12.x code, the code to remove is not because of flink version, it's because the logic is totally redundant. Remove to avoid confusion, because i found some people use the legacy code with poor performance. Although I know that many users are currently testing based on 1.12, the threshold we set for many users of older versions is very high. Pray that they are willing to upgrade the Flink version in order to use hudi. In fact, I personally think that the biggest improvement of the new implementation lies in the bucket assigner. As for other points, we could have found a solution (although it does not seem very elegant). Well, I don't have to worry about the Flink version anymore, and I don't have time to pay attention to the old implementation. > I still think we should not include a kafka connector into the delta streamer, no one complains about the missing of it, based on the users i see. I still insist that we need to include kafka-related dependencies. If you look back at the HoodieFlinkStreamerV2 class. What is it in essence? It is just a program written using Flink DataStream API, which is specific (Kafka -> Hudi), not plug-in-oriented or abstract-oriented. For a specific Flink program, we should provide users with an Uber(fat) Jar. Instead of letting users pay attention to details and pay additional costs. Otherwise, why don't we make the source universal? -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (HUDI-1343) Add standard schema postprocessor which would rewrite the schema using spark-avro conversion
[ https://issues.apache.org/jira/browse/HUDI-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325964#comment-17325964 ] sivabalan narayanan edited comment on HUDI-1343 at 4/28/21, 11:05 AM: -- [~liujinhui] [~vbalaji] [~nishith29] : Do you folks think if this is still required after this fix [https://github.com/apache/hudi/pull/2765] . This fixes AvroConvertionUtils.convertStructTypeToAvroSchema() to ensure null is first entry in union and default value is set to null if a field is nullable in spark structtype. I mean, we have enabled the post schema processor by default. so wanted to double check if it's still applicable. was (Author: shivnarayan): [~liujinhui] [~vbalaji]: Do you folks think if this is still required after this fix [https://github.com/apache/hudi/pull/2765] . This fixes AvroConvertionUtils.convertStructTypeToAvroSchema() to ensure null is first entry in union and default value is set to null if a field is nullable in spark structtype. I mean, we have enabled the post schema processor by default. so wanted to double check if it's still applicable. > Add standard schema postprocessor which would rewrite the schema using > spark-avro conversion > > > Key: HUDI-1343 > URL: https://issues.apache.org/jira/browse/HUDI-1343 > Project: Apache Hudi > Issue Type: Improvement > Components: DeltaStreamer >Reporter: Balaji Varadarajan >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > Fix For: 0.7.0 > > > When we use Transformer, the final Schema which we use to convert avro record > to bytes is auto generated by spark. This could be different (due to the way > Avro treats it) from the target schema that is being used to write (as the > target schema could be coming from Schema Registry). 
> For example: Schema generated by spark-avro when converting Row to avro:
> {
>   "type" : "record",
>   "name" : "hoodie_source",
>   "namespace" : "hoodie.source",
>   "fields" : [
>     { "name" : "_ts_ms", "type" : [ "long", "null" ] },
>     { "name" : "_op", "type" : "string" },
>     { "name" : "inc_id", "type" : "int" },
>     { "name" : "year", "type" : [ "int", "null" ] },
>     { "name" : "violation_desc", "type" : [ "string", "null" ] },
>     { "name" : "violation_code", "type" : [ "string", "null" ] },
>     { "name" : "case_individual_id", "type" : [ "int", "null" ] },
>     { "name" : "flag", "type" : [ "string", "null" ] },
>     { "name" : "last_modified_ts", "type" : "long" }
>   ]
> }
> is not compatible with the Avro schema:
> {
>   "type" : "record",
>   "name" : "formatted_debezium_payload",
>   "fields" : [
>     { "name" : "_ts_ms", "type" : [ "null", "long" ], "default" : null },
>     { "name" : "_op", "type" : "string", "default" : null },
>     { "name" : "inc_id", "type" : "int", "default" : null },
>     { "name" : "year", "type" : [ "null", "int" ], "default" : null },
>     { "name" : "violation_desc", "type" : [ "null", "string" ], "default" : null },
>     { "name" : "violation_code", "type" : [ "null", "string" ], "default" : null },
>     { "name" : "case_individual_id", "type" : [ "null", "int" ], "default" : null },
>     { "name" : "flag", "type" : [ "null", "string" ], "default" : null },
>     { "name" : "last_modified_ts", "type" : "long", "default" : null }
>   ]
> }
> Note that the union order is different for individual fields: "type" : [ "null", "string" ] vs "type" : [ "string", "null" ]. Unexpectedly, Avro decoding fails when bytes written with the first schema are read using the second schema.
> One way to fix this is to use the configured target schema when generating record bytes, but this is not easy without breaking the record payload constructor API used by DeltaStreamer. 
> The other option is to apply a post-processor on the target schema to make it consistent with the Transformer-generated records. > > This ticket is to use the latter approach of creating a standard schema post-processor and adding it by default when a Transformer is used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
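[Editor's aside] The incompatibility described in HUDI-1343 comes down to union branch order and missing null defaults. As an illustration of the post-processor idea — not Hudi's actual schema post-processor, just a hypothetical sketch over plain dicts — rewriting each nullable union so that "null" comes first and adding a null default makes the two shapes line up:

```python
# Illustrative sketch of the schema post-processing idea discussed above:
# move "null" to the front of every nullable union and give the field a
# null default. This is NOT Hudi's real implementation.

def normalize_nullable_unions(schema: dict) -> dict:
    """Return a copy of an Avro record schema with null-first unions."""
    fields = []
    for f in schema.get("fields", []):
        f = dict(f)  # shallow copy so the input schema is untouched
        t = f["type"]
        if isinstance(t, list) and "null" in t:
            # "null" first, other branches keep their relative order.
            f["type"] = ["null"] + [b for b in t if b != "null"]
            f.setdefault("default", None)
        fields.append(f)
    out = dict(schema)
    out["fields"] = fields
    return out

# Trimmed version of the spark-avro generated schema from the ticket.
spark_generated = {
    "type": "record",
    "name": "hoodie_source",
    "fields": [
        {"name": "_ts_ms", "type": ["long", "null"]},
        {"name": "year", "type": ["int", "null"]},
        {"name": "last_modified_ts", "type": "long"},
    ],
}

fixed = normalize_nullable_unions(spark_generated)
```

With this transformation, the writer-side schema matches the null-first, null-default convention of the registry schema shown in the ticket, which is exactly the mismatch that made Avro decoding fail.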
[GitHub] [hudi] nsivabalan commented on issue #2887: [SUPPORT]
nsivabalan commented on issue #2887: URL: https://github.com/apache/hudi/issues/2887#issuecomment-828363941 while you update the ticket w/ more info, curious to know if you had set partition path to empty intentionally? ``` .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), StringUtils.EMPTY) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1739) insert_overwrite_table and insert_overwrite create empty replacecommit.requested file which breaks archival
[ https://issues.apache.org/jira/browse/HUDI-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334645#comment-17334645 ] sivabalan narayanan commented on HUDI-1739: --- CC [~satishkotha] > insert_overwrite_table and insert_overwrite create empty > replacecommit.requested file which breaks archival > --- > > Key: HUDI-1739 > URL: https://issues.apache.org/jira/browse/HUDI-1739 > Project: Apache Hudi > Issue Type: Bug >Reporter: Jagmeet Bali >Assignee: Susu Dong >Priority: Minor > Labels: sev:high > > Fixes can be to > # Ignore empty replacecommit.requested files. > # Standardise the replacecommit.requested format across all invocations be > it from clustering or this use case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
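[Editor's aside] Of the two fixes proposed in HUDI-1739, the first — ignoring empty replacecommit.requested files — can be sketched as a simple filter over timeline files. The directory layout and function name below are illustrative only, not Hudi's actual timeline or archival code:

```python
import os
import tempfile

# Sketch of fix #1 above: skip zero-byte *.replacecommit.requested files
# when scanning the timeline, so archival does not choke on an empty plan.
# File naming and the helper name are hypothetical.

def non_empty_requested_files(timeline_dir: str) -> list:
    keep = []
    for name in sorted(os.listdir(timeline_dir)):
        path = os.path.join(timeline_dir, name)
        if name.endswith(".replacecommit.requested") and os.path.getsize(path) == 0:
            continue  # empty plan file: ignore it instead of failing
        keep.append(name)
    return keep

with tempfile.TemporaryDirectory() as d:
    # One empty (broken) plan file and one non-empty (valid) plan file.
    open(os.path.join(d, "001.replacecommit.requested"), "w").close()
    with open(os.path.join(d, "002.replacecommit.requested"), "w") as f:
        f.write("{}")
    survivors = non_empty_requested_files(d)
```

The second proposed fix — standardising the replacecommit.requested format across clustering and insert_overwrite — would remove the need for such filtering entirely.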
[jira] [Updated] (HUDI-1739) insert_overwrite_table and insert_overwrite create empty replacecommit.requested file which breaks archival
[ https://issues.apache.org/jira/browse/HUDI-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1739: -- Labels: sev:high (was: sev:critical) > insert_overwrite_table and insert_overwrite create empty > replacecommit.requested file which breaks archival > --- > > Key: HUDI-1739 > URL: https://issues.apache.org/jira/browse/HUDI-1739 > Project: Apache Hudi > Issue Type: Bug >Reporter: Jagmeet Bali >Assignee: Susu Dong >Priority: Minor > Labels: sev:high > > Fixes can be to > # Ignore empty replacecommit.requested files. > # Standardise the replacecommit.requested format across all invocations be > it from clustering or this use case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter edited a comment on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (d06be43) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3867676) will **decrease** coverage by `0.05%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2892/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #2892      +/-   ##
============================================
- Coverage     69.75%   69.70%     -0.06%
+ Complexity      375      374         -1
============================================
  Files            54       54
  Lines          1997     1997
  Branches        236      236
============================================
- Hits           1393     1392         -1
  Misses          473      473
- Partials        131      132         +1
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudiclient | `?` | `?` | |
| hudiutilities | `69.70% <ø> (-0.06%)` | `374.00 <ø> (-1.00)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.08% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] danny0405 edited a comment on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
danny0405 edited a comment on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828336262 > I have two questions: > > 1. The lowest Flink version we supported is 1.12.x? > 2. Can we provide an e2e demo and documentation to show the usage of the flink streamer via jar-mode, just like delta-streamer, it should be out of the box; > > I tried it, but missed the dependencies of the Kafka connector. Can we make the new flink streamer peer to the delta streamer? Yes, people would only use the Flink 1.12.x code; the code is being removed not because of the Flink version, but because the logic is totally redundant. Removing it avoids confusion, since I found some people using the legacy code with poor performance. I still think we should not include a Kafka connector in the delta streamer; no one has complained about it missing, based on the users I see. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] danny0405 commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
danny0405 commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828336262 > I have two questions: > > 1. The lowest Flink version we supported is 1.12.x? > 2. Can we provide an e2e demo and documentation to show the usage of the flink streamer via jar-mode, just like delta-streamer, it should be out of the box; > > I tried it, but missed the dependencies of the Kafka connector. Can we make the new flink streamer peer to the delta streamer? Yes, people would only use the Flink 1.12.x code; the code is being removed not because of the Flink version, but because the logic is totally redundant. I still think we should not include a Kafka connector in the delta streamer; no one has complained about it missing, based on the users I see. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] yanghua commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
yanghua commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828323317 I have two questions: 1) The lowest Flink version we supported is 1.12.x? 2) Can we provide an e2e demo and documentation to show the usage of the flink streamer via jar-mode, just like delta-streamer, it should be out of the box; I tried it, but missed the dependencies of the Kafka connector. Can we make the new flink streamer peer to the delta streamer? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter commented on pull request #2892: [HUDI-1865] Make embedded time line service singleton
codecov-commenter commented on pull request #2892: URL: https://github.com/apache/hudi/pull/2892#issuecomment-828296052 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2892](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (ab94864) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3867676) will **decrease** coverage by `60.39%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2892/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master #2892 +/- ## - Coverage 69.75% 9.36% -60.40% + Complexity 375 48 -327 Files54 54 Lines 19971997 Branches236 236 - Hits 1393 187 -1206 - Misses 4731797 +1324 + Partials131 13 -118 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudiclient | `?` | `?` | | | hudiutilities | `9.36% <ø> (-60.40%)` | `48.00 <ø> (-327.00)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2892?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2892/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | |
[GitHub] [hudi] danny0405 commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
danny0405 commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828261491 Hi, @yanghua can you take a look, thanks ~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] MyLanPangzi removed a comment on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
MyLanPangzi removed a comment on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828259208 @yanghua Hi, I triggered the CI and it passed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] MyLanPangzi commented on pull request #2868: [HUDI-1821] Remove legacy code for Flink writer
MyLanPangzi commented on pull request #2868: URL: https://github.com/apache/hudi/pull/2868#issuecomment-828259208 @yanghua Hi, I triggered the CI and it passed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1865) Make embedded time line service singleton
[ https://issues.apache.org/jira/browse/HUDI-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1865: - Labels: pull-request-available (was: ) > Make embedded time line service singleton > - > > Key: HUDI-1865 > URL: https://issues.apache.org/jira/browse/HUDI-1865 > Project: Apache Hudi > Issue Type: Improvement > Components: Flink Integration >Reporter: Danny Chen >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available > Fix For: 0.9.0 > > > The filesystem view takes too much memory, make it process singleton. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] danny0405 opened a new pull request #2892: [HUDI-1865] Make embedded time line service singleton
danny0405 opened a new pull request #2892: URL: https://github.com/apache/hudi/pull/2892 The filesystem view takes too much memory, make it process singleton. ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the pull request *(For example: This pull request adds quick-start document.)* ## Brief change log *(for example:)* - *Modify AnnotationLocation checkstyle rule in checkstyle.xml* ## Verify this pull request *(Please pick either of the following options)* This pull request is a trivial rework / code cleanup without any test coverage. *(or)* This pull request is already covered by existing tests, such as *(please describe tests)*. (or) This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end.* - *Added HoodieClientWriteTest to verify the change.* - *Manually verified the change by running a job locally.* ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1865) Make embedded time line service singleton
[ https://issues.apache.org/jira/browse/HUDI-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-1865: - Summary: Make embedded time line service singleton (was: Make write client of flink pipeline singleton) > Make embedded time line service singleton > - > > Key: HUDI-1865 > URL: https://issues.apache.org/jira/browse/HUDI-1865 > Project: Apache Hudi > Issue Type: Improvement > Components: Flink Integration >Reporter: Danny Chen >Assignee: Danny Chen >Priority: Major > Fix For: 0.9.0 > > > The filesystem view takes too much memory, make it process singleton. -- This message was sent by Atlassian Jira (v8.3.4#803005)
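The HUDI-1865 change above makes the embedded timeline service a per-process singleton so that all writer tasks in the same JVM share one filesystem view instead of each building its own copy of it in memory. The thread-safe, lazily initialized process-singleton pattern involved can be sketched as follows; this is an illustrative Python sketch with hypothetical names, not the actual Hudi Java implementation:

```python
import threading

class EmbeddedTimelineService:
    """Stand-in for the memory-heavy timeline / filesystem-view service."""
    _instance = None
    _lock = threading.Lock()

    def __init__(self):
        # Placeholder for the expensive filesystem view that HUDI-1865
        # wants to build and hold only once per process.
        self.view = object()

    @classmethod
    def get_or_create(cls):
        # Double-checked locking: the fast path skips the lock once the
        # instance exists; the lock prevents two threads racing to build it.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance

# Every task in the process obtains the same shared instance.
a = EmbeddedTimelineService.get_or_create()
b = EmbeddedTimelineService.get_or_create()
```

With this shape, memory usage no longer scales with the number of write tasks per JVM, which is the stated motivation ("The filesystem view takes too much memory, make it process singleton").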
[GitHub] [hudi] hudi-bot edited a comment on pull request #2643: DO NOT MERGE (Azure CI) test branch ci
hudi-bot edited a comment on pull request #2643: URL: https://github.com/apache/hudi/pull/2643#issuecomment-792368481 ## CI report: * 9831a6c50e9f49f8a71c02fc6ac50ae1446f7c1f UNKNOWN * a569dbe9409910fbb83b3764b300574c0e52612e Azure: [FAILURE](https://dev.azure.com/XUSH0012/0ef433cc-d4b4-47cc-b6a1-03d032ef546c/_build/results?buildId=142) * e6e9f1f1554a1474dd6c20338215030cad23a2e0 UNKNOWN * 2a6690a256c8cd8efe9ed2b1984b896fb27ef077 UNKNOWN * d8b7cca55e057a52a2e229d81e8cb52b60dc275f UNKNOWN * 3bce301333cc78194d13a702598b46e04fe9f85f UNKNOWN * f07f345baa450f3fec7eab59caa76b0fbda1e132 UNKNOWN * 869d2ce3fad330af93c1bb3b576824f519c6e68b UNKNOWN * fa86907f7522bc8dbe512d48b5a87e4a6b13f035 UNKNOWN * 4ebe53016ce3e0648992dbe14d04f71a92f116e6 UNKNOWN * 682ae9985f591f6d0c30ee2ef9b159403c1e46de UNKNOWN * d80397fcfeaa2996ab550bcdab4524be7420a364 UNKNOWN * bfe3a803e19540578b94f778f7ba7551db0f86f1 UNKNOWN * a632e58390eb94fcc7e757bd7580780cf184f9a8 UNKNOWN * 2e413d601c80b123269c2fc3fc6aa9a8bd0d746a UNKNOWN * e797ee47aa319df3c3c40bdc4acab4f592d70ffe UNKNOWN * acb06df73c1c2a0ef1590f66e8b41e173d2a7a7b UNKNOWN * f7f78ee22a0a75c5fb866c4e9cdda01482fbcb59 UNKNOWN * 3a7227993309e8dd37f2aef693cb3fed69a2043c UNKNOWN * 8f7a8e7f4989c9e20b936123c0f6e324898471d2 UNKNOWN * 6824c4917ad812c5938fe5346344a4aef9b7a72e UNKNOWN * 252364017f5dee1dcdfa061cc3070dac518d4047 UNKNOWN * b1691e583f3c23ee83fcb7ee0245eed826624cc0 UNKNOWN * ba970bda569f0312c77cd5c139f9dec4ad2759b0 UNKNOWN * 4370d21d4983e5e79d1f4bafba51ae26dd29f9a0 UNKNOWN * 21ea9ccef8ab9d78f9c201fa58a22e3e59caaa6b UNKNOWN * b17028e8a232ff3015c18b8f7de5435241800bfe UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] xushiyan edited a comment on pull request #2889: [HUDI-1810] Fix azure setting for integ tests (Azure CI)
xushiyan edited a comment on pull request #2889: URL: https://github.com/apache/hudi/pull/2889#issuecomment-828166779 @vinothchandar integ tests were misconfigured previously. This change makes the [integ tests pass](https://dev.azure.com/xushiyan/apache-hudi-ci/_build/results?buildId=76=logs=d5c42908-5572-5ce6-e4a8-5e2053b947e8) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2891: [HUDI-1863] Add rate limiter to Flink writer to avoid OOM for bootstrap
codecov-commenter edited a comment on pull request #2891: URL: https://github.com/apache/hudi/pull/2891#issuecomment-828184944 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2891](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (08576e3) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3867676) will **decrease** coverage by `16.74%`. > The diff coverage is `93.47%`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2891/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master#2891 +/- ## = - Coverage 69.75% 53.00% -16.75% - Complexity 375 3747 +3372 = Files54 488 +434 Lines 199723521+21524 Branches236 2502 +2266 = + Hits 139312468+11075 - Misses 473 9954 +9481 - Partials131 1099 +968 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `39.53% <ø> (?)` | `220.00 <ø> (?)` | | | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | | | hudicommon | `50.38% <ø> (?)` | `1975.00 <ø> (?)` | | | hudiflink | `59.66% <93.47%> (?)` | `538.00 <4.00> (?)` | | | hudihadoopmr | `33.33% <ø> (?)` | `198.00 <ø> (?)` | | | hudisparkdatasource | `73.33% <ø> (?)` | `237.00 <ø> (?)` | | | hudisync | `46.39% <ø> (?)` | `142.00 <ø> (?)` | | | huditimelineservice | `64.36% <ø> (?)` | `62.00 <ø> (?)` | | | hudiutilities | `69.75% <ø> (ø)` | `375.00 <ø> (ø)` | | Flags with 
carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh) | `85.71% <89.28%> (ø)` | `8.00 <3.00> (?)` | | | [...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh) | `90.48% <100.00%> (ø)` | `11.00 <0.00> (?)` | | | [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `77.77% <100.00%> (ø)` | `22.00 <1.00> (?)` | | | [...ava/org/apache/hudi/source/StreamReadOperator.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zb3VyY2UvU3RyZWFtUmVhZE9wZXJhdG9yLmphdmE=) | `90.66% <100.00%> (ø)` | 
`15.00 <0.00> (?)` | | | [...udi/common/table/timeline/dto/FSPermissionDTO.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL2R0by9GU1Blcm1pc3Npb25EVE8uamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | | |
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2891: [HUDI-1863] Add rate limiter to Flink writer to avoid OOM for bootstrap
codecov-commenter edited a comment on pull request #2891: URL: https://github.com/apache/hudi/pull/2891#issuecomment-828184944 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
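HUDI-1863 above adds a rate limiter to the Flink writer so that during bootstrap the source cannot push records into the write buffers faster than they are flushed, which is what causes the OOM. A token-bucket limiter is one common way to impose such back-pressure; the following is an illustrative Python sketch of the idea with hypothetical names, not the actual implementation in PR #2891:

```python
import time

class TokenBucketRateLimiter:
    """Allow at most `rate` permits/sec; callers block when the bucket is empty."""

    def __init__(self, rate: float, burst: float = None):
        self.rate = rate                      # steady-state permits per second
        self.capacity = burst if burst is not None else rate
        self.tokens = self.capacity           # start with a full bucket
        self.last = time.monotonic()

    def acquire(self, permits: float = 1.0) -> float:
        """Take `permits` tokens, sleeping if needed; returns seconds waited."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < permits:
            waited = (permits - self.tokens) / self.rate
            time.sleep(waited)   # back-pressure: caller slows to the configured rate
            self.tokens = 0.0
            return waited
        self.tokens -= permits
        return 0.0

# A writer would call acquire() once per record (or batch) before buffering it,
# bounding how much unflushed data can accumulate in memory during bootstrap.
limiter = TokenBucketRateLimiter(rate=10000)  # e.g. cap at 10k records/sec
```

The key design point is that the limiter converts an unbounded memory problem (buffering everything the source emits) into a bounded latency cost (the writer sleeps when it gets ahead of the flush rate).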
[jira] [Created] (HUDI-1865) Make write client of flink pipeline singleton
Danny Chen created HUDI-1865: Summary: Make write client of flink pipeline singleton Key: HUDI-1865 URL: https://issues.apache.org/jira/browse/HUDI-1865 Project: Apache Hudi Issue Type: Improvement Components: Flink Integration Reporter: Danny Chen Assignee: Danny Chen Fix For: 0.9.0 The filesystem view takes too much memory, make it process singleton. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch asf-site updated: Travis CI build asf-site
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new dff8c32 Travis CI build asf-site dff8c32 is described below commit dff8c3207cd64a87e5e4bddcd32b90857f89c9a4 Author: CI AuthorDate: Wed Apr 28 06:56:01 2021 + Travis CI build asf-site --- content/docs/flink-quick-start-guide.html | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/content/docs/flink-quick-start-guide.html b/content/docs/flink-quick-start-guide.html index 03d52fe..21ef68b 100644 --- a/content/docs/flink-quick-start-guide.html +++ b/content/docs/flink-quick-start-guide.html @@ -390,13 +390,7 @@ quick start tool for SQL users. The hudi-flink-bundle jar is archived with scala 2.11, so it’s recommended to use flink 1.12.x bundled with scala 2.11. Step.2 start flink cluster -Start a standalone flink cluster within hadoop environment. -Before you start up the cluster, we suggest to config the cluster as follows: - - - in $FLINK_HOME/conf/flink-conf.yaml, add config option taskmanager.numberOfTaskSlots: 4 - in $FLINK_HOME/conf/workers, add item localhost as 4 lines so that there are 4 workers on the local cluster - +Start a standalone flink cluster within hadoop environment. Now starts the cluster: @@ -449,6 +443,8 @@ The SQL CLI only executes the SQL line by line. 
WITH ( 'connector' = 'hudi', 'path' = 'table_base_path', + 'write.tasks' = '1', -- default is 4 ,required more resource + 'compaction.tasks' = '1', -- default is 10 ,required more resource 'table.type' = 'MERGE_ON_READ' -- this creates a MERGE_ON_READ table, by default is COPY_ON_WRITE ); @@ -504,6 +500,7 @@ We do not need to specify endTime, if we want all changes after the given commit 'connector' = 'hudi', 'path' = 'table_base_path', 'table.type' = 'MERGE_ON_READ', + 'read.tasks' = '1', -- default is 4 ,required more resource 'read.streaming.enabled' = 'true', -- this option enable the streaming read 'read.streaming.start-commit' = '20210316134557', -- specifies the start commit instant time 'read.streaming.check-interval' = '4' -- specifies the check interval for finding new source commits, default 60s.
[GitHub] [hudi] yanghua merged pull request #2890: [MINOR] minimize flink quick start resource
yanghua merged pull request #2890: URL: https://github.com/apache/hudi/pull/2890 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #2889: [HUDI-1810] Fix azure setting for integ tests (Azure CI)
hudi-bot edited a comment on pull request #2889: URL: https://github.com/apache/hudi/pull/2889#issuecomment-828163773 ## CI report: * 3f35042fee2ab77100af7cddbc1b5914808ef7d1 Travis: [FAILURE](https://travis-ci.com/github/apachehudi-ci/hudi-branch-ci/builds/224346756) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2889: [HUDI-1810] Fix azure setting for integ tests (Azure CI)
codecov-commenter edited a comment on pull request #2889: URL: https://github.com/apache/hudi/pull/2889#issuecomment-828165841 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter commented on pull request #2891: [HUDI-1863] Add rate limiter to Flink writer to avoid OOM for bootstrap
codecov-commenter commented on pull request #2891: URL: https://github.com/apache/hudi/pull/2891#issuecomment-828184944 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report > Merging [#2891](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (08576e3) into [master](https://codecov.io/gh/apache/hudi/commit/386767693d46e7419c4fb0fa292ccb7ab7f7098d?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3867676) will **decrease** coverage by `60.39%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2891/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) ```diff @@ Coverage Diff @@ ## master #2891 +/- ## - Coverage 69.75% 9.36% -60.40% + Complexity 375 48 -327 Files54 54 Lines 19971997 Branches236 236 - Hits 1393 187 -1206 - Misses 4731797 +1324 + Partials131 13 -118 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudiclient | `?` | `?` | | | hudiutilities | `9.36% <ø> (-60.40%)` | `48.00 <ø> (-327.00)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2891?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2891/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | |