[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
codecov-io edited a comment on pull request #2651: URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1737) Extract common method in HoodieCreateHandle & FlinkCreateHandle
[ https://issues.apache.org/jira/browse/HUDI-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roc Marshal updated HUDI-1737:
------------------------------
    Summary: Extract common method in HoodieCreateHandle & FlinkCreateHandle  (was: Extract public method in HoodieCreateHandle & FlinkCreateHandle)

> Extract common method in HoodieCreateHandle & FlinkCreateHandle
> ---------------------------------------------------------------
>
>                 Key: HUDI-1737
>                 URL: https://issues.apache.org/jira/browse/HUDI-1737
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Code Cleanup
>            Reporter: Roc Marshal
>            Assignee: Roc Marshal
>            Priority: Minor
>              Labels: hudi-client
>
> {code:java}
> // HoodieCreateHandle.java
> // ...
> @Override
> public List close() {
>   LOG.info("Closing the file " + writeStatus.getFileId() + " as we are done with all the records " + recordsWritten);
>   try {
>     fileWriter.close();
>     HoodieWriteStat stat = new HoodieWriteStat();
>     stat.setPartitionPath(writeStatus.getPartitionPath());
>     stat.setNumWrites(recordsWritten);
>     stat.setNumDeletes(recordsDeleted);
>     stat.setNumInserts(insertRecordsWritten);
>     stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
>     stat.setFileId(writeStatus.getFileId());
>     stat.setPath(new Path(config.getBasePath()), path);
>     long fileSizeInBytes = FSUtils.getFileSize(fs, path);
>     stat.setTotalWriteBytes(fileSizeInBytes);
>     stat.setFileSizeInBytes(fileSizeInBytes);
>     stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());
>     RuntimeStats runtimeStats = new RuntimeStats();
>     runtimeStats.setTotalCreateTime(timer.endTimer());
>     stat.setRuntimeStats(runtimeStats);
>     writeStatus.setStat(stat);
>     LOG.info(String.format("CreateHandle for partitionPath %s fileID %s, took %d ms.",
>         stat.getPartitionPath(), stat.getFileId(), runtimeStats.getTotalCreateTime()));
>     return Collections.singletonList(writeStatus);
>   } catch (IOException e) {
>     throw new HoodieInsertException("Failed to close the Insert Handle for path " + path, e);
>   }
> }
>
> // FlinkCreateHandle.java
> private void setUpWriteStatus() throws IOException {
>   long fileSizeInBytes = fileWriter.getBytesWritten();
>   long incFileSizeInBytes = fileSizeInBytes - lastFileSize;
>   this.lastFileSize = fileSizeInBytes;
>   HoodieWriteStat stat = new HoodieWriteStat();
>   stat.setPartitionPath(writeStatus.getPartitionPath());
>   stat.setNumWrites(recordsWritten);
>   stat.setNumDeletes(recordsDeleted);
>   stat.setNumInserts(insertRecordsWritten);
>   stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
>   stat.setFileId(writeStatus.getFileId());
>   stat.setPath(new Path(config.getBasePath()), path);
>   stat.setTotalWriteBytes(incFileSizeInBytes);
>   stat.setFileSizeInBytes(fileSizeInBytes);
>   stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());
>   HoodieWriteStat.RuntimeStats runtimeStats = new HoodieWriteStat.RuntimeStats();
>   runtimeStats.setTotalCreateTime(timer.endTimer());
>   stat.setRuntimeStats(runtimeStats);
>   writeStatus.setStat(stat);
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
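The two methods quoted above differ only in how the byte counts are obtained, which is what makes the extraction proposed here straightforward. Below is a minimal hedged sketch of that refactoring; the `WriteStat` stub and the `buildWriteStat` helper name are simplified stand-ins for illustration, not the actual Hudi classes or the committed fix.

```java
// Simplified stand-in for org.apache.hudi.common.model.HoodieWriteStat.
class WriteStat {
    long numWrites, numDeletes, numInserts, totalWriteBytes, fileSizeInBytes;
}

public class ExtractCommonMethodSketch {

    // The extracted common method: callers pass the byte counts they compute
    // differently (full file size for HoodieCreateHandle, incremental delta
    // for FlinkCreateHandle); all the shared setter calls live in one place.
    static WriteStat buildWriteStat(long recordsWritten, long recordsDeleted,
                                    long insertRecordsWritten,
                                    long totalWriteBytes, long fileSizeInBytes) {
        WriteStat stat = new WriteStat();
        stat.numWrites = recordsWritten;
        stat.numDeletes = recordsDeleted;
        stat.numInserts = insertRecordsWritten;
        stat.totalWriteBytes = totalWriteBytes;  // the only value that differs per handle
        stat.fileSizeInBytes = fileSizeInBytes;
        return stat;
    }

    public static void main(String[] args) {
        // HoodieCreateHandle-style call: total bytes equals the file size.
        WriteStat spark = buildWriteStat(100, 0, 100, 4096, 4096);
        // FlinkCreateHandle-style call: total bytes is the incremental delta.
        WriteStat flink = buildWriteStat(100, 0, 100, 1024, 4096);
        System.out.println(spark.totalWriteBytes + " " + flink.totalWriteBytes
            + " " + flink.fileSizeInBytes);  // prints "4096 1024 4096"
    }
}
```

Each handle would then keep only its format-specific byte accounting and delegate the rest, which is the usual shape of an extract-method cleanup like this one.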
[jira] [Created] (HUDI-1737) Extract public method in HoodieCreateHandle & FlinkCreateHandle
Roc Marshal created HUDI-1737:
---------------------------------

             Summary: Extract public method in HoodieCreateHandle & FlinkCreateHandle
                 Key: HUDI-1737
                 URL: https://issues.apache.org/jira/browse/HUDI-1737
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Code Cleanup
            Reporter: Roc Marshal
            Assignee: Roc Marshal

{code:java}
// HoodieCreateHandle.java
// ...
@Override
public List close() {
  LOG.info("Closing the file " + writeStatus.getFileId() + " as we are done with all the records " + recordsWritten);
  try {
    fileWriter.close();
    HoodieWriteStat stat = new HoodieWriteStat();
    stat.setPartitionPath(writeStatus.getPartitionPath());
    stat.setNumWrites(recordsWritten);
    stat.setNumDeletes(recordsDeleted);
    stat.setNumInserts(insertRecordsWritten);
    stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
    stat.setFileId(writeStatus.getFileId());
    stat.setPath(new Path(config.getBasePath()), path);
    long fileSizeInBytes = FSUtils.getFileSize(fs, path);
    stat.setTotalWriteBytes(fileSizeInBytes);
    stat.setFileSizeInBytes(fileSizeInBytes);
    stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());
    RuntimeStats runtimeStats = new RuntimeStats();
    runtimeStats.setTotalCreateTime(timer.endTimer());
    stat.setRuntimeStats(runtimeStats);
    writeStatus.setStat(stat);
    LOG.info(String.format("CreateHandle for partitionPath %s fileID %s, took %d ms.",
        stat.getPartitionPath(), stat.getFileId(), runtimeStats.getTotalCreateTime()));
    return Collections.singletonList(writeStatus);
  } catch (IOException e) {
    throw new HoodieInsertException("Failed to close the Insert Handle for path " + path, e);
  }
}

// FlinkCreateHandle.java
private void setUpWriteStatus() throws IOException {
  long fileSizeInBytes = fileWriter.getBytesWritten();
  long incFileSizeInBytes = fileSizeInBytes - lastFileSize;
  this.lastFileSize = fileSizeInBytes;
  HoodieWriteStat stat = new HoodieWriteStat();
  stat.setPartitionPath(writeStatus.getPartitionPath());
  stat.setNumWrites(recordsWritten);
  stat.setNumDeletes(recordsDeleted);
  stat.setNumInserts(insertRecordsWritten);
  stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
  stat.setFileId(writeStatus.getFileId());
  stat.setPath(new Path(config.getBasePath()), path);
  stat.setTotalWriteBytes(incFileSizeInBytes);
  stat.setFileSizeInBytes(fileSizeInBytes);
  stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());
  HoodieWriteStat.RuntimeStats runtimeStats = new HoodieWriteStat.RuntimeStats();
  runtimeStats.setTotalCreateTime(timer.endTimer());
  stat.setRuntimeStats(runtimeStats);
  writeStatus.setStat(stat);
}
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
> Merging [#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (e0824f4) into [master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc) (ce3e8ec) will **decrease** coverage by `42.39%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master   #2651       +/-   ##
============================================
- Coverage     51.76%   9.37%     -42.40%
+ Complexity     3602      48      -3554
============================================
  Files           476      54       -422
  Lines         22579    1995     -20584
  Branches       2408     236      -2172
============================================
- Hits          11688     187     -11501
+ Misses         9874    1795      -8079
+ Partials       1017      13      -1004
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.37% <0.00%> (-60.42%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.73%)` | `0.00 <0.00> (-56.00)` | | | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` |
[GitHub] [hudi] codecov-io edited a comment on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
codecov-io edited a comment on pull request #2645:
URL: https://github.com/apache/hudi/pull/2645#issuecomment-792430670

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2645?src=pr=h1) Report
> Merging [#2645](https://codecov.io/gh/apache/hudi/pull/2645?src=pr=desc) (50c335d) into [master](https://codecov.io/gh/apache/hudi/commit/050626ad6cb8bbd06d138456ccc00dddcff2a860?el=desc) (050626a) will **decrease** coverage by `42.64%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2645/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2645?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master   #2645       +/-   ##
============================================
- Coverage     52.04%   9.40%     -42.65%
+ Complexity     3625      48      -3577
============================================
  Files           479      54       -425
  Lines         22804    1989     -20815
  Branches       2415      236     -2179
============================================
- Hits          11868     187     -11681
+ Misses         9911    1789      -8122
+ Partials       1025      13      -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2645?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | | | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2645/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[GitHub] [hudi] codecov-io commented on pull request #2740: [HUDI-1055] Remove hardcoded parquet in tests
codecov-io commented on pull request #2740:
URL: https://github.com/apache/hudi/pull/2740#issuecomment-809855336

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2740?src=pr=h1) Report
> Merging [#2740](https://codecov.io/gh/apache/hudi/pull/2740?src=pr=desc) (ddebd6e) into [master](https://codecov.io/gh/apache/hudi/commit/050626ad6cb8bbd06d138456ccc00dddcff2a860?el=desc) (050626a) will **increase** coverage by `17.69%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2740/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2740?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2740       +/-   ##
=============================================
+ Coverage     52.04%   69.73%     +17.69%
+ Complexity     3625      371      -3254
=============================================
  Files           479       54       -425
  Lines         22804     1989     -20815
  Branches       2415      236      -2179
=============================================
- Hits          11868     1387     -10481
+ Misses         9911      471      -9440
+ Partials       1025      131       -894
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `69.73% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2740?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.37% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | | [...mmon/table/log/block/HoodieDeleteBlockVersion.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9ibG9jay9Ib29kaWVEZWxldGVCbG9ja1ZlcnNpb24uamF2YQ==) | | | | | [...va/org/apache/hudi/common/model/CleanFileInfo.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0NsZWFuRmlsZUluZm8uamF2YQ==) | | | | | [...n/java/org/apache/hudi/common/HoodieCleanStat.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL0hvb2RpZUNsZWFuU3RhdC5qYXZh) | | | | | [.../org/apache/hudi/metadata/HoodieTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllVGFibGVNZXRhZGF0YS5qYXZh) | | | | | [.../apache/hudi/sink/partitioner/BucketAssigners.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL0J1Y2tldEFzc2lnbmVycy5qYXZh) | | | | | [...pache/hudi/common/model/HoodieArchivedLogFile.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUFyY2hpdmVkTG9nRmlsZS5qYXZh) | | | | | 
[...e/hudi/table/format/mor/MergeOnReadTableState.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvbW9yL01lcmdlT25SZWFkVGFibGVTdGF0ZS5qYXZh) | | | | | [...g/apache/hudi/timeline/service/RequestHandler.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvUmVxdWVzdEhhbmRsZXIuamF2YQ==) | | | | | [...a/org/apache/hudi/avro/HoodieAvroWriteSupport.java](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvV3JpdGVTdXBwb3J0LmphdmE=) | | | | | ... and [411 more](https://codecov.io/gh/apache/hudi/pull/2740/diff?src=pr=tree-more) | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1055) Ensure hardcoded storage type ".parquet" is removed from tests
[ https://issues.apache.org/jira/browse/HUDI-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1055:
---------------------------------
    Labels: pull-request-available  (was: )

> Ensure hardcoded storage type ".parquet" is removed from tests
> --------------------------------------------------------------
>
>                 Key: HUDI-1055
>                 URL: https://issues.apache.org/jira/browse/HUDI-1055
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Testing
>    Affects Versions: 0.9.0
>            Reporter: Balaji Varadarajan
>            Assignee: Prashant Wason
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
> Follow up : https://github.com/apache/hudi/pull/1687#issuecomment-649754943

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] TeRS-K opened a new pull request #2740: [HUDI-1055] Remove hardcoded parquet in tests
TeRS-K opened a new pull request #2740:
URL: https://github.com/apache/hudi/pull/2740

## What is the purpose of the pull request

This pull request removes the hardcoded file format `parquet` from tests, which helps to integrate different base file formats (e.g. ORC) in the future.

## Brief change log

- Use `HoodieTableConfig.DEFAULT_BASE_FILE_FORMAT` as the base file format for creating file paths and for initializing HoodieTableMetaClient
- Read the file format from a HoodieTable or a HoodieTableMetaClient wherever possible
- Introduce a layer of abstraction for data file utility functions to remove hardcoded `ParquetUtils.*` (useful for future integration with ORC)

## Verify this pull request

All existing tests should work and pass.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
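The "layer of abstraction" bullet above can be sketched as a small format-keyed registry. This is an illustrative stand-in only: the enum, interface, and method names below (`BaseFileFormat`, `BaseFileUtils`, `makeDataFileName`) are hypothetical, not the actual classes the PR introduces.

```java
import java.util.EnumMap;
import java.util.Map;

public class BaseFileUtilsSketch {
    enum BaseFileFormat { PARQUET, ORC }

    // Format-specific utilities that hide direct ParquetUtils.* calls.
    interface BaseFileUtils {
        String extension();
    }

    // Registry keyed by the table's base file format; tests resolve the
    // utilities here instead of hardcoding ".parquet" in file paths.
    static final Map<BaseFileFormat, BaseFileUtils> REGISTRY =
        new EnumMap<>(BaseFileFormat.class);
    static {
        REGISTRY.put(BaseFileFormat.PARQUET, () -> ".parquet");
        REGISTRY.put(BaseFileFormat.ORC, () -> ".orc");
    }

    // A test helper builds data file names from the configured format.
    static String makeDataFileName(BaseFileFormat format, String fileId) {
        return fileId + REGISTRY.get(format).extension();
    }

    public static void main(String[] args) {
        System.out.println(makeDataFileName(BaseFileFormat.PARQUET, "file-1"));
        System.out.println(makeDataFileName(BaseFileFormat.ORC, "file-1"));
    }
}
```

With this shape, adding a new base file format means registering one more implementation rather than touching every test that builds a file path.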
[GitHub] [hudi] nsivabalan closed pull request #2738: [Minor] Fixing Key generators blog for phrases
nsivabalan closed pull request #2738: URL: https://github.com/apache/hudi/pull/2738 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan opened a new pull request #2739: [MINOR] Fixing key generators blog content
nsivabalan opened a new pull request #2739:
URL: https://github.com/apache/hudi/pull/2739

## What is the purpose of the pull request

*Fixing key generators blog content*

## Brief change log

- *Modify contents of key generators blog*

## Verify this pull request

N/A. Only text changes.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
svn commit: r46794 - /release/hudi/KEYS
Author: sivabalan Date: Mon Mar 29 16:57:59 2021 New Revision: 46794 Log: Adding gary's gpg keys to KEYS file Modified: release/hudi/KEYS Modified: release/hudi/KEYS == --- release/hudi/KEYS (original) +++ release/hudi/KEYS Mon Mar 29 16:57:59 2021 @@ -599,3 +599,63 @@ ECppJfvmGRuNapsZ+KCXiY2wjnM9/EopD5Nsr3E7 9ELkv7No+gWT7/64sox1Zo03duuWYR8bGpCJIcd6Qn99dPZSr59o8TGkrPU= =gJ2E -END PGP PUBLIC KEY BLOCK- +pub rsa4096 2021-03-22 [SC] [expires: 2025-03-22] + E2A9714E0FBA3A087BDEE655E72873D765D6C406 +uid [ultimate] YanJia Li +sig 3E72873D765D6C406 2021-03-22 YanJia Li +sub rsa4096 2021-03-22 [E] [expires: 2025-03-22] +sig E72873D765D6C406 2021-03-22 YanJia Li + +-BEGIN PGP PUBLIC KEY BLOCK- + +mQINBGBYnFQBEADJfxkjdOufvAOu7yP1Q1wiM+FGQIcaFb7mydFc3/PpQwqxAPoS +GlcorwkTMCdqKSxR4+p5B9xnfux9qXOydoKof0srhMLudD7lxZa6xAn0OeC2jeqk +mFhXhw2/r+iuon9x7Rzts0HY7XvM3juQpTNa1cOi2jTsALpOyo2qDhPwNc7MNasC +0OKuE0UwGfcDpd9TILIvOlssTNyHcYumavcDZBW9eZMpGF4jASPQzQ0iXnEAHyEr +I55z9q760qNfAW72SO6vKBJZZVWUoCepGzOaB9VaX3fcYdfuOEm4bfKi4qEEEUaF +aOeAo5jMbu+fhSDPBqfvthRyJitmit4rq49ijXJlwU8++mAEDUcLZ7SNMfnMht/N +NazDmz5wXjFcbyKmaYAkQ/Q+7M161QsGLFq3WGmFej1Yv/nCo3tfM3j3aEc74jzR +ylUQQQE+alJwVdN4CJ5SkyBtjBWMTbSHHagRlFoxnLUSktCOTM31vGVIoi/DrSdD +Opxy6BatTIcUcrEW+XRkqeApmiBS6Oss6H0I0qBQJZL+o5F0wT8lrrwioy/qEzR1 +pgmtccHm4TBfa21CJDyNp8+VqM99fteM57dxBwHerR7vGlRfBjNY/s9SeUwKiNZw +L7pmyQfhWXAN3m88xutpKoGpKwSL5S1rnvJl8N0dqeThSzZOB4i6zjUhQQARAQAB +tB1ZYW5KaWEgTGkgPGdhcnlsaUBhcGFjaGUub3JnPokCVAQTAQgAPhYhBOKpcU4P +ujoIe97mVecoc9dl1sQGBQJgWJxUAhsDBQkHhh8LBQsJCAcCBhUKCQgLAgQWAgMB +Ah4BAheAAAoJEOcoc9dl1sQG9BIQAMJCp6lS5ycQXDE83XL/VaVO8iPIWiZySd7P +Hf/XKab/kFIsXbAPrR5pPkcL8DzlarvklY7tTWfgzgY3yhh5L42eAdgH10Na1JWg +x/JbBGea4I89v8lRMqAcslSmts9TyCZv4aRwwV9bwf9Y7b3WGXrd4gv8fd2XZtfH +7pNNPg/B5XiWfTOQkV0S6I5lnpvgrNed3+BRJn+jYZrLLIlhPck4vShLtCnjm7TR +XNrDilRxpSzs0d8Fzgp+paWuMX+W47CzKnRZGyISQ/KJfBlacEirNEyDy+j4P3er +Pyn77QSFoBVM3SbM4wY40P+SW6bTblY+3ntO4Shb/2USb3J+w8jmwzkUXwmljgMD 
+ojfvQa3rO5rPfPItdaRRtEH9YQvcYdZtnG7NwRRCc8SoqeJfsqYYEo6Iw8JVJqw9 ++CIBQKie5z7/iS2/DEG4lQx57VzMURdZOoFUOvw6MEdqBqlMmwJyqG3caIXW9f4i +T4TSQCr9M0ziJELCZSHBcJ5W6fB+bhYeRsZer32tcIQTnODuNrgZic7gngLsTC4E +nl1Z6lVNczb0aQ0oGRBVb5dNROdKdSrk8hCyP4MQa2rJ4KenwrL0eyhUiIC8Bg5Q +lLGErxnNP/cTPufTgzMcvVQ0PYliyOlGEUndOw0pFORg/xi1RzCNJQIi/k8bNoEj +6u/gRkIRuQINBGBYnFQBEACzmYLb2UhAnG+Q059H1iTQbLSWektYx1WQY5Q9YjCG +hwwimY1D2ePqVI2OfSwY/aAyM14t70LOeZFG3JtjE7wzuIltGTIPBiVIJRDdIeJv +kWImZw4vbN+kMBhBnQwr7U3KNdwuD70MxHjCuQ2LFmP+Jb/Sv/6/kRr+s42PQAbL +qH0FAA8Qcv99gg/dEC0uOOHB4UE7jEEDqIbedkk6GUy9YtgplbDlk+L1I0sqh7vu +UI2bO42C1jDrNqgKJ4yQNNswriU3iri/i+0kwEFA8/oNIxdVpGiZrfBuwxmNTE4X +A9dCrDpPGIs/gKS/vaykEqHdgi33D1DGtUHuCbUNalb7Er/22PhPbeZkK0sqpuKl +u7AYdJpE1PNIiR6qMdBpQlY9F76BAwNq6gWx3eSrsgXWJ5ar5pLlhbF5j6sPjF/i +DKwBKxY0fSLBoCv2abEgzvkjUwYxUsTXkKrEx0rYUbia2WuP95VjiTrrPs0LQrgp +KG3Zn1FEOHqEN6kY3GeeBIMAnepYiCNS3WaD7RIlRKAkzVC82pYG71tP1/meDXTV +LsGOjQrIDQpLDxtiG1rwRin6tzjKcUDBfIi4y1czAHjzEnx9uHCWvbEZ6KlTI35I +SWFdLoFf2QDkmA6dgC4i0+emP4bZHRMLd7JRo+1ozTZ+hDq+z3QJJZacIH4a02vl +eQARAQABiQI8BBgBCAAmFiEE4qlxTg+6Ogh73uZV5yhz12XWxAYFAmBYnFQCGwwF +CQeGHwsACgkQ5yhz12XWxAbb3w//ai7iGR7WL2Wh6OvXICtS2WxAnXHu8XOsl91f +tf0gx6oTWI0u2VbSqJDKJG5rbUPXyCmJbG32eq3PjTYWS0jT2kQFqkWQ5wX6AqZp +lVNkT0GmmBuHRA71sp1PUHK2DaVDmHaTDncSvcdzDra8d0+/ANZ8licZlXF8D9rz +9zGnxU/mbZ98xUJcVK3w8yea98bTV2cQLlTgYjLfmFoA/a8zyeuIotTUCELIA0Wq +sAs8b0ORVm9Hk4G1q5eBem1FY8CzQQvVngrMUTOZdj1f0KXmM1Vii+T8eU8ukT82 +bScU7YRmO/XdMNaijmrsHmdP1ybW2KuP16m3ZIxXUu/mD6HIYCIFrIuin425E2kT +hSh7xyZQMGRyJ+HlzUKm4d8Mg05SmErDaA+4APN5F6lP47ED0kT8RkRGmGBWWaZU +sHbsjj6WYsEAVUcxErn+DelSS31j9P+8sCyI4Yi9/1IAr5VvYQXrrH3veCRHVjQZ +KK/zA7lkrn26yldsuZXq4DArTmFUhCwRSNDEQgcfh/HOpmT8r7WZEGRBb99xXLyY +HJzrVHbpIPxUpvBFld41Eyepuoij+pY7zyb/mCk5KMPEVK4XyYG9PpuPdER7EFzo +K5pVT2a4wL0e0/ekCsGfbEn+2xubSrfWZ+M3YoIlX6uVykQrKH+NjoUlLuqv7PVF +wV4zPZQ= +=0iU3 +-END PGP PUBLIC KEY BLOCK- +
[GitHub] [hudi] nsivabalan commented on issue #2513: [SUPPORT]Hive-Cli set hive.input.format=org.apache.hudi.hadoop.HoodieParquetInputFormat and query error
nsivabalan commented on issue #2513: URL: https://github.com/apache/hudi/issues/2513#issuecomment-809541159 @n3nash : user is awaiting your response. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (HUDI-1728) Addressing ClassNotFound when using HiveMetastoreBasedLockProvider
[ https://issues.apache.org/jira/browse/HUDI-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Li resolved HUDI-1728.
---------------------------
    Resolution: Fixed

> Addressing ClassNotFound when using HiveMetastoreBasedLockProvider
> ------------------------------------------------------------------
>
>                 Key: HUDI-1728
>                 URL: https://issues.apache.org/jira/browse/HUDI-1728
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.8.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Closed] (HUDI-1728) Addressing ClassNotFound when using HiveMetastoreBasedLockProvider
[ https://issues.apache.org/jira/browse/HUDI-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Li closed HUDI-1728.
-------------------------

> Addressing ClassNotFound when using HiveMetastoreBasedLockProvider
> ------------------------------------------------------------------
>
>                 Key: HUDI-1728
>                 URL: https://issues.apache.org/jira/browse/HUDI-1728
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.8.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-1728) Addressing ClassNotFound when using HiveMetastoreBasedLockProvider
[ https://issues.apache.org/jira/browse/HUDI-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Li updated HUDI-1728:
--------------------------
    Fix Version/s: 0.8.0

> Addressing ClassNotFound when using HiveMetastoreBasedLockProvider
> ------------------------------------------------------------------
>
>                 Key: HUDI-1728
>                 URL: https://issues.apache.org/jira/browse/HUDI-1728
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.8.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-1736) failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled
[ https://issues.apache.org/jira/browse/HUDI-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1736:
--------------------------------------
    Labels: sev:triage user-support-issues  (was: )

> failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled
> --------------------------------------------------------------------------------------------
>
>                 Key: HUDI-1736
>                 URL: https://issues.apache.org/jira/browse/HUDI-1736
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: sev:triage, user-support-issues
>
> sql("select * from employee_rt").show(false)
>
> {code}
> 174262 [Executor task launch worker for task 12017] ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 31.0 (TID 12017)
> java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.getPrimitiveJavaObject(WritableTimestampObjectInspector.java:39)
>   at org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$14(TableReader.scala:468)
>   at org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$14$adapted(TableReader.scala:467)
>   at org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$18(TableReader.scala:493)
>   at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
>   at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
>   at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
>   at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340)
>   at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872)
>   at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>   at org.apache.spark.scheduler.Task.run(Task.scala:127)
>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>
> Steps to reproduce:
>
> {code:java}
> import org.apache.spark.sql._
> import org.apache.hudi.QuickstartUtils._
> import scala.collection.JavaConversions._
> import org.apache.spark.sql.SaveMode._
> import org.apache.hudi.DataSourceReadOptions._
> import org.apache.hudi.DataSourceWriteOptions._
> import org.apache.hudi.config.HoodieWriteConfig._
> import org.apache.spark.sql.functions._
> import org.apache.hudi.QuickstartUtils._
> import scala.collection.JavaConversions._
> import org.apache.spark.sql.SaveMode._
> import org.apache.hudi.DataSourceReadOptions
> import org.apache.hudi.DataSourceWriteOptions
> import org.apache.hudi.config.HoodieWriteConfig
> import org.apache.hudi.hive.MultiPartKeysValueExtractor
> import org.apache.spark.sql.functions._
> import org.apache.hudi.keygen._
> import org.apache.spark.sql.streaming._
>
> case class Person(firstname: String, age: Int, gender: Int)
> val personDF = List(Person("tom",45,1), Person("iris",44,0)).toDF.withColumn("ts",unix_timestamp).withColumn("insert_time",current_timestamp)
> //val personDF2 = List(Person("peng",56,1), Person("iris",51,0), Person("jacky",25,1)).toDF.withColumn("ts",unix_timestamp).withColumn("insert_time",current_timestamp)
> //personDF.write.mode(SaveMode.Overwrite).format("hudi").saveAsTable("employee")
> val tableName = "employee"
> val hudiCommonOptions = Map(
>   "hoodie.compact.inline" -> "true",
>   "hoodie.compact.inline.max.delta.commits" -> "5",
>   "hoodie.base.path" -> s"/tmp/$tableName",
>   "hoodie.table.name" -> tableName,
>   "hoodie.datasource.write.table.type" -> "MERGE_ON_READ",
>   "hoodie.datasource.write.operation" -> "upsert",
>   "hoodie.clean.async" -> "true"
> )
> val hudiHiveOptions = Map(
svn commit: r46793 - in /dev/hudi/hudi-0.8.0-rc1: ./ hudi-0.8.0-rc1.src.tgz hudi-0.8.0-rc1.src.tgz.asc hudi-0.8.0-rc1.src.tgz.sha512
Author: garyli Date: Mon Mar 29 16:18:18 2021 New Revision: 46793 Log: Add Hudi 0.8.0 rc1 release Added: dev/hudi/hudi-0.8.0-rc1/ dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz (with props) dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.asc dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.sha512 Added: dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz == Binary file - no diff available. Propchange: dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz -- svn:mime-type = application/octet-stream Added: dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.asc == --- dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.asc (added) +++ dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.asc Mon Mar 29 16:18:18 2021 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIzBAABCAAdFiEE4qlxTg+6Ogh73uZV5yhz12XWxAYFAmBh6yYACgkQ5yhz12XW +xAanBQ/6AtZkA5/ZrvvdDUsZUF6694K/cSKj4bRvFJeFohJGAmm5ghb+5d6ZTVCe +V2b+x5Lp4/N6ybRNSaCzibFrak7xgBP0ClTRDA+0fXF1+VaFt0Yw8CADDcJeo6qu +28HtZP4zHp4TEhirF3nb1Qf7a+7taCys4xbhFoLgNPFbZhgTx4aGEYpAzTdhOYKO +ewoNdXz+gcTAwclXa8Gi6DkSpr1WjYgEohyP5Ajy2f7pu3K2YEQ5Vv+dtaMZ1Sg/ +03I5X9jj+CYIq0UD/RO2FOH7J6kG0e8fnWUa1CN4IKvZN0gACQnflvoCDZPloPC5 +N0a2Jah6eyGR7pa/FjflsKWep7ahGsezMUeLl2hQKgulOQ+6ukmtWBZ1h62FdJmN +bk+SheHmhH8xfW1Cw1ZEsGhAJAiO/nPsG1o5OIAz3ZPrM/AM2VbFw/0/a43U41kk +zZju72smZdNOMWViGNY90ecX0MCFNQdgUJN8pmLkTyW29PMiVDFTpd5xqHinpl0n +EqKks3mumDMbx0qR3RlJJe6H4zr98viELwsMc+O5H0pzWB4rb8NGj21geDar5EWm +pxLT1mwSJICTno5Q8hOCvCVkVmuFFFatec48usyIT2tADI6ksoP1UcTlhcpvi1i3 +gSQtUFy5ajncQkr0b6NLHDlljD2nt+Q1p/ZhnZgUbBXrV+k5mb0= +=sCLO +-END PGP SIGNATURE- Added: dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.sha512 == --- dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.sha512 (added) +++ dev/hudi/hudi-0.8.0-rc1/hudi-0.8.0-rc1.src.tgz.sha512 Mon Mar 29 16:18:18 2021 @@ -0,0 +1 @@ +58cc6bc3463c0eb845c70ad7047128e4cecea29de5ad3fa901af6061c09886e4c1d3d886ab72f8cf83535c2d145d608fc039e172a86aafd60953666876617309 hudi-0.8.0-rc1.src.tgz
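For anyone validating these staged artifacts, the published `.sha512` file is checked with `sha512sum -c` (and the `.asc` signature with `gpg --verify` after importing the release KEYS). A self-contained sketch of the checksum step, using a generated sample file in place of the real tarball:

```shell
# Sketch of the sha512 verification used for release candidates.
# 'artifact.src.tgz' stands in for hudi-0.8.0-rc1.src.tgz.
set -e
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample release payload' > artifact.src.tgz
# The release manager publishes a digest file like this alongside the tarball.
sha512sum artifact.src.tgz > artifact.src.tgz.sha512
# Verifiers run the -c check; GNU coreutils prints "artifact.src.tgz: OK" on success.
sha512sum -c artifact.src.tgz.sha512
```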
svn commit: r46792 - /dev/hudi/hudi-0.8.0-rc1/
Author: garyli Date: Mon Mar 29 16:16:09 2021 New Revision: 46792 Log: Remove 0.8.0-rc1 Removed: dev/hudi/hudi-0.8.0-rc1/
[GitHub] [hudi] nsivabalan commented on issue #2566: [SUPPORT] Unable to read Hudi MOR data set in a test on 0.7
nsivabalan commented on issue #2566: URL: https://github.com/apache/hudi/issues/2566#issuecomment-809511285 @jtmzheng : we fixed the bundling issues in the latest master, and we have an upcoming release. If you are doing a POC, you can build from the latest master and give it a try. If not, you can wait for the release to complete and use the new Maven artifacts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on issue #2564: Hoodie clean is not deleting old files
nsivabalan commented on issue #2564: URL: https://github.com/apache/hudi/issues/2564#issuecomment-809504232 @n3nash : a gentle reminder to follow up.
[GitHub] [hudi] nsivabalan commented on issue #2498: [SUPPORT] Hudi MERGE_ON_READ load to dataframe fails for the versions [0.6.0],[0.7.0] and runs for [0.5.3]
nsivabalan commented on issue #2498: URL: https://github.com/apache/hudi/issues/2498#issuecomment-809502029 @zafer-sahin @Magicbeanbuyer : Can you folks try with Spark 2 and let us know if you still encounter the same issue?
[GitHub] [hudi] nsivabalan commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled
nsivabalan commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-809498561 Have filed a [tracking](https://issues.apache.org/jira/browse/HUDI-1736) ticket.
[jira] [Created] (HUDI-1736) failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled
sivabalan narayanan created HUDI-1736: - Summary: failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled Key: HUDI-1736 URL: https://issues.apache.org/jira/browse/HUDI-1736 Project: Apache Hudi Issue Type: Bug Reporter: sivabalan narayanan sql("select * from employee_rt").show(false) {{174262 [Executor task launch worker for task 12017] ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 31.0 (TID 12017) java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.getPrimitiveJavaObject(WritableTimestampObjectInspector.java:39) at org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$14(TableReader.scala:468) at org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$14$adapted(TableReader.scala:467) at org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$18(TableReader.scala:493) at scala.collection.Iterator$$anon$10.next(Iterator.scala:459) at scala.collection.Iterator$$anon$10.next(Iterator.scala:459) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:127) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)}} {{}} {{}} {{Steps to reproduce}} {{}} {code:java} import org.apache.spark.sql._ import org.apache.hudi.QuickstartUtils._ import scala.collection.JavaConversions._ import org.apache.spark.sql.SaveMode._ import org.apache.hudi.DataSourceReadOptions._ import org.apache.hudi.DataSourceWriteOptions._ import org.apache.hudi.config.HoodieWriteConfig._ import org.apache.spark.sql.functions._ import org.apache.hudi.QuickstartUtils._ import scala.collection.JavaConversions._ import org.apache.spark.sql.SaveMode._ import org.apache.hudi.DataSourceReadOptions import org.apache.hudi.DataSourceWriteOptions import org.apache.hudi.config.HoodieWriteConfig import org.apache.hudi.hive.MultiPartKeysValueExtractor import org.apache.spark.sql.functions._ import org.apache.hudi.keygen._ import org.apache.spark.sql.streaming._ case class Person(firstname:String, age:Int, gender:Int) val personDF = List(Person("tom",45,1), Person("iris",44,0)).toDF.withColumn("ts",unix_timestamp).withColumn("insert_time",current_timestamp) //val personDF2 = List(Person("peng",56,1), Person("iris",51,0),Person("jacky",25,1)).toDF.withColumn("ts",unix_timestamp).withColumn("insert_time",current_timestamp) //personDF.write.mode(SaveMode.Overwrite).format("hudi").saveAsTable("employee") val tableName = "employee" val hudiCommonOptions = Map( "hoodie.compact.inline" -> "true", "hoodie.compact.inline.max.delta.commits" ->"5", "hoodie.base.path" -> 
s"/tmp/$tableName", "hoodie.table.name" -> tableName, "hoodie.datasource.write.table.type"->"MERGE_ON_READ", "hoodie.datasource.write.operation" -> "upsert", "hoodie.clean.async" -> "true" ) val hudiHiveOptions = Map( DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY -> "true", DataSourceWriteOptions.HIVE_URL_OPT_KEY -> "jdbc:hive2://localhost:1", DataSourceWriteOptions.HIVE_PARTITION_FIELDS_OPT_KEY -> "gender", DataSourceWriteOptions.HIVE_STYLE_PARTITIONING_OPT_KEY -> "true", "hoodie.datasource.hive_sync.support_timestamp"->"true", DataSourceWriteOptions.HIVE_TABLE_OPT_KEY -> tableName, DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY ->
[GitHub] [hudi] nsivabalan commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled
nsivabalan commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-809495394 @cdmikechen : what would your advice be on how to go about this? It would be nice to get this fixed for Hudi users.
[GitHub] [hudi] nsivabalan edited a comment on issue #2537: [SUPPORT] delta streamer sql transformer does not work for my kafka data
nsivabalan edited a comment on issue #2537: URL: https://github.com/apache/hudi/issues/2537#issuecomment-809492266 @jiangjiguang : let us know once you have some Avro data. We are trying to iron out all issues related to schemas and Kafka; it would be good to get this addressed as well.
[GitHub] [hudi] nsivabalan commented on issue #2537: [SUPPORT] delta streamer sql transformer does not work for my kafka data
nsivabalan commented on issue #2537: URL: https://github.com/apache/hudi/issues/2537#issuecomment-809492266 @jiangjiguang : let us know once you have some Avro data. We are trying to iron out all issues related to schemas and Kafka; it would be good to get this addressed as well.
[GitHub] [hudi] nsivabalan commented on issue #2557: [SUPPORT]Container exited with a non-zero exit code 137
nsivabalan commented on issue #2557: URL: https://github.com/apache/hudi/issues/2557#issuecomment-809490684 @kingkongpoon : sorry for the late follow-up. May I know why you use GLOBAL_BLOOM? Our default recommendation is the regular "BLOOM" index unless you have a specific requirement for the global version; GLOBAL_BLOOM is expected to be slower, depending on the characteristics of your record keys. If your record keys are completely random (no ordering or timestamp in them), we would recommend the "SIMPLE" index, which should perform better than "BLOOM".
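For reference, the index choice discussed above is controlled by a single Hudi write config, `hoodie.index.type`. A minimal config sketch (the key name is from Hudi's write configs; the commented alternatives are the variants mentioned in the comment):

```properties
# Default: compares incoming record keys against bloom filters in base files,
# scoped to each record's partition path.
hoodie.index.type=BLOOM

# Global variant: enforces key uniqueness across all partitions,
# hence the extra cost noted above.
# hoodie.index.type=GLOBAL_BLOOM

# Join-based index; can outperform BLOOM when record keys are effectively
# random (no timestamp/ordering prefix for bloom-filter pruning to exploit).
# hoodie.index.type=SIMPLE
```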
[GitHub] [hudi] nsivabalan commented on issue #2592: [SUPPORT] Does latest versions of Hudi (0.7.0, 0.6.0) work with Spark 2.3.0 when reading orc files?
nsivabalan commented on issue #2592: URL: https://github.com/apache/hudi/issues/2592#issuecomment-809487669 @bvaradar @umehrot2 : reminder to follow up on this issue. Also, once resolved, can you please add an FAQ entry without fail.
[GitHub] [hudi] nsivabalan commented on issue #2589: [SUPPORT] Issue with adding column while running deltastreamer with kafka source.
nsivabalan commented on issue #2589: URL: https://github.com/apache/hudi/issues/2589#issuecomment-809485189 Hi @t0il3ts0ap. On a related note, with the upcoming release we have added support for a [custom deserializer](https://github.com/apache/hudi/pull/2619) when using AvroKafkaSource; it will fetch the latest schema from your schema registry and use it to deserialize Kafka records.
[GitHub] [hudi] codecov-io commented on pull request #2737: [HUDI-1735] Add hive-exec dependency for hudi-examples
codecov-io commented on pull request #2737: URL: https://github.com/apache/hudi/pull/2737#issuecomment-809464352

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2737?src=pr=h1) Report
> Merging [#2737](https://codecov.io/gh/apache/hudi/pull/2737?src=pr=desc) (643256f) into [master](https://codecov.io/gh/apache/hudi/commit/d415d45416707ca4d5b1dbad65dc80e6fccfa378?el=desc) (d415d45) will **increase** coverage by `0.00%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2737/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2737?src=pr=tree)

```diff
@@            Coverage Diff            @@
##             master    #2737   +/-  ##
=========================================
  Coverage     52.04%   52.04%
  Complexity     3624     3624
=========================================
  Files           479      479
  Lines         22804    22804
  Branches       2415     2415
=========================================
+ Hits          11868    11869      +1
+ Misses         9911     9910      -1
  Partials       1025     1025
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.94% <ø> (+<0.01%)` | `0.00 <ø> (ø)` | |
| hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.73% <ø> (ø)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2737?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2737/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `79.68% <0.00%> (+0.78%)` | `26.00% <0.00%> (ø%)` | |
[hudi] annotated tag release-0.8.0-rc1 updated (a90125a -> 377eba2)
This is an automated email from the ASF dual-hosted git repository. garyli pushed a change to annotated tag release-0.8.0-rc1 in repository https://gitbox.apache.org/repos/asf/hudi.git. *** WARNING: tag release-0.8.0-rc1 was modified! *** from a90125a (commit) to 377eba2 (tag) tagging a90125a324921cac7e62306ec925246a4c9e8a4b (commit) replaces hoodie-0.4.7 by garyli1019 on Mon Mar 29 23:03:47 2021 +0800 - Log - 0.8.0 -BEGIN PGP SIGNATURE- iQIzBAABCAAdFiEE4qlxTg+6Ogh73uZV5yhz12XWxAYFAmBh7FMACgkQ5yhz12XW xAZs7RAAiaKmyHfQEaGt0rg1akciWov48A4jjjYbtR6A63Rwv15uaZ69DaKVdQu5 bcWkwsd//WEDhoF7lS+JYyM9HA8fYmeiRv3Zqq7Sz3kBdHhj5gICAmthxWGTBSLZ DwLuLx7Y4mE+IXxhnXCGLANUbkgFieRojzaqyd/ysEeQtJOkqePYqIR2jCKf2hWW oiJTw7XcGNUnq1oV02VBZjBVd1G3hdyAAKIYPzzgmRlgkOHoBqFn/GrjNLw8rj9e RSFjApEdxLNNNnjh9S5r3qvP+ft0Cs4oRz6Rhiu3svE3aPgbzzOhbn59N2jvKAxQ 39IUPEAjCl3emo3Nezmdz7WVxADqVVtcD2vWileihnHNvjDZVCrpppJBtkndvCVK D43+jUzyOynZA8kss/XYLSHhm7m8g6iCKkvRSvdBSiGWwmlpVUI1Ah2cU2ILUoTX ayypXc+Ylt6kKUaXTYZyVB8jlCk6vMEZtAfQLNmv26rSly1sAPzqypBLgA6wWiCm 5MDM206S0GWcjGaansTgbjgqykFh8j8bWzVGceYES5U8YoTo+63bo25gABZL/iwW 7Rr+NCvF58a16ZP7U/5xNTmravzQ9Zq8xzvxhw1dCMdD9D3xLymRrAx5q8fQ0Uag 6RrbRKiBh2OCs0lUpNby7oTjY1vKY/e0Vq1Qpt8A9ckwZG+dbSU= =ceFm -END PGP SIGNATURE- --- No new revisions were added by this update. Summary of changes:
[GitHub] [hudi] yanghua closed pull request #2737: [HUDI-1735] Add hive-exec dependency for hudi-examples
yanghua closed pull request #2737: URL: https://github.com/apache/hudi/pull/2737
[GitHub] [hudi] yanghua commented on pull request #2737: [HUDI-1735] Add hive-exec dependency for hudi-examples
yanghua commented on pull request #2737: URL: https://github.com/apache/hudi/pull/2737#issuecomment-809439067 @hudi-bot run travis
[GitHub] [hudi] pengzhiwei2018 commented on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support
pengzhiwei2018 commented on pull request #2645: URL: https://github.com/apache/hudi/pull/2645#issuecomment-809432020 Hi @vinothchandar @xiarixiaoyao The PR has been updated with the following changes:
- Introduce `SparkSqlAdapter` to fit Spark 2 and Spark 3; the project can now be compiled with -Pspark3.
- Support Update & Delete on Hoodie tables.
- Rename the test classes.
[jira] [Updated] (HUDI-1659) Basic implementation Of Spark Sql Support
[ https://issues.apache.org/jira/browse/HUDI-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengzhiwei updated HUDI-1659: - Description: The Basic Implement include the follow things based on DataSource V1: 1、CREATE TABLE FOR HOODIE 2、CTAS 3、INSERT Hoodie Table 4、MergeInto with the RowKey constraint. 5、Update Hoodie Table. 6、Delete From Hoodie Table was: The Basic Implement include the follow things based on DataSource V1: 1、CREATE TABLE FOR HOODIE 2、CTAS 3、INSERT Hoodie Table 4、MergeInto with the RowKey constraint. > Basic implementation Of Spark Sql Support > - > > Key: HUDI-1659 > URL: https://issues.apache.org/jira/browse/HUDI-1659 > Project: Apache Hudi > Issue Type: Sub-task > Components: Spark Integration >Reporter: pengzhiwei >Assignee: pengzhiwei >Priority: Major > Labels: pull-request-available > > The Basic Implement include the follow things based on DataSource V1: > 1、CREATE TABLE FOR HOODIE > 2、CTAS > 3、INSERT Hoodie Table > 4、MergeInto with the RowKey constraint. > 5、Update Hoodie Table. > 6、Delete From Hoodie Table > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch release-0.8.0 updated: [MINOR] Add Missing Apache License to test files (#2736)
This is an automated email from the ASF dual-hosted git repository. garyli pushed a commit to branch release-0.8.0 in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/release-0.8.0 by this push: new a90125a [MINOR] Add Missing Apache License to test files (#2736) a90125a is described below commit a90125a324921cac7e62306ec925246a4c9e8a4b Author: Gary Li AuthorDate: Mon Mar 29 07:17:23 2021 -0700 [MINOR] Add Missing Apache License to test files (#2736) --- docker/demo/config/test-suite/test.properties | 16 scripts/release/validate_staged_release.sh| 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/docker/demo/config/test-suite/test.properties b/docker/demo/config/test-suite/test.properties index 9dfb465..b4f69d9 100644 --- a/docker/demo/config/test-suite/test.properties +++ b/docker/demo/config/test-suite/test.properties @@ -1,4 +1,20 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ hoodie.insert.shuffle.parallelism=100 hoodie.upsert.shuffle.parallelism=100 hoodie.bulkinsert.shuffle.parallelism=100 diff --git a/scripts/release/validate_staged_release.sh b/scripts/release/validate_staged_release.sh index 75c4baa..5139e34 100755 --- a/scripts/release/validate_staged_release.sh +++ b/scripts/release/validate_staged_release.sh @@ -156,10 +156,10 @@ echo -e "\t\tNotice file exists ? [OK]\n" ### Licensing Check echo "Performing custom Licensing Check " -numfilesWithNoLicense=`find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data'|grep -v '.commit' | grep -v DISCLAIMER | grep -v KEYS | grep -v '.mailmap' | grep -v '.sqltemplate' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" | wc -l` +numfilesWithNoLicense=`find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data'| grep -v '.commit' | grep -v DISCLAIMER | grep -v KEYS | grep -v '.mailmap' | grep -v '.sqltemplate' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" | wc -l` if [ "$numfilesWithNoLicense" -gt "0" ]; then echo "There were some source files that did not have Apache License" - find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data' | grep -v DISCLAIMER | grep -v '.sqltemplate' | grep -v KEYS | grep -v '.mailmap' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" + find . 
-iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data' | grep -v '.commit' | grep -v DISCLAIMER | grep -v '.sqltemplate' | grep -v KEYS | grep -v '.mailmap' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" exit -1 fi echo -e "\t\tLicensing Check Passed [OK]\n"
[hudi] branch master updated: [MINOR] Add Missing Apache License to test files (#2736)
This is an automated email from the ASF dual-hosted git repository. garyli pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 050626a [MINOR] Add Missing Apache License to test files (#2736) 050626a is described below commit 050626ad6cb8bbd06d138456ccc00dddcff2a860 Author: Gary Li AuthorDate: Mon Mar 29 07:17:23 2021 -0700 [MINOR] Add Missing Apache License to test files (#2736) --- docker/demo/config/test-suite/test.properties | 16 scripts/release/validate_staged_release.sh| 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/docker/demo/config/test-suite/test.properties b/docker/demo/config/test-suite/test.properties index 9dfb465..b4f69d9 100644 --- a/docker/demo/config/test-suite/test.properties +++ b/docker/demo/config/test-suite/test.properties @@ -1,4 +1,20 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ hoodie.insert.shuffle.parallelism=100 hoodie.upsert.shuffle.parallelism=100 hoodie.bulkinsert.shuffle.parallelism=100 diff --git a/scripts/release/validate_staged_release.sh b/scripts/release/validate_staged_release.sh index 75c4baa..5139e34 100755 --- a/scripts/release/validate_staged_release.sh +++ b/scripts/release/validate_staged_release.sh @@ -156,10 +156,10 @@ echo -e "\t\tNotice file exists ? [OK]\n" ### Licensing Check echo "Performing custom Licensing Check " -numfilesWithNoLicense=`find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data'|grep -v '.commit' | grep -v DISCLAIMER | grep -v KEYS | grep -v '.mailmap' | grep -v '.sqltemplate' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" | wc -l` +numfilesWithNoLicense=`find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data'| grep -v '.commit' | grep -v DISCLAIMER | grep -v KEYS | grep -v '.mailmap' | grep -v '.sqltemplate' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" | wc -l` if [ "$numfilesWithNoLicense" -gt "0" ]; then echo "There were some source files that did not have Apache License" - find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data' | grep -v DISCLAIMER | grep -v '.sqltemplate' | grep -v KEYS | grep -v '.mailmap' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" + find . 
-iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data' | grep -v '.commit' | grep -v DISCLAIMER | grep -v '.sqltemplate' | grep -v KEYS | grep -v '.mailmap' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)" exit -1 fi echo -e "\t\tLicensing Check Passed [OK]\n"
[GitHub] [hudi] garyli1019 merged pull request #2736: [MINOR] Add Missing Apache License to test files
garyli1019 merged pull request #2736: URL: https://github.com/apache/hudi/pull/2736
[GitHub] [hudi] rubenssoto commented on issue #2688: [SUPPORT] Sync to Hive using Metastore
rubenssoto commented on issue #2688: URL: https://github.com/apache/hudi/issues/2688#issuecomment-809381115 Hello, could somebody help? @n3nash
[hudi] 03/03: [HOTFIX] fix deploy staging jars script
This is an automated email from the ASF dual-hosted git repository. garyli pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git commit e069b64e10124de76e0d54d0f24d4ec95890e82d Author: garyli1019 AuthorDate: Sun Mar 28 22:10:00 2021 +0800 [HOTFIX] fix deploy staging jars script --- scripts/release/deploy_staging_jars.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/release/deploy_staging_jars.sh b/scripts/release/deploy_staging_jars.sh index b2c5bf3..4bd9158 100755 --- a/scripts/release/deploy_staging_jars.sh +++ b/scripts/release/deploy_staging_jars.sh @@ -45,7 +45,7 @@ else if [[ $param =~ --scala_version\=(2\.1[1-2]) ]]; then SCALA_VERSION=${BASH_REMATCH[1]} elif [[ $param =~ --spark_version\=([2-3]) ]]; then - SPARK_VERSION=${BASH_REMATCH[0]} + SPARK_VERSION=${BASH_REMATCH[1]} fi done fi
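The one-character hotfix above matters because in bash `BASH_REMATCH[0]` holds the entire matched string while `BASH_REMATCH[1]` holds the first capture group, so the old code assigned the whole flag (e.g. `--spark_version=3`) to SPARK_VERSION instead of the version digit. A quick illustration with the same pattern:

```shell
# Reproduce the regex from deploy_staging_jars.sh against a sample flag.
param="--spark_version=3"
if [[ $param =~ --spark_version\=([2-3]) ]]; then
  echo "BASH_REMATCH[0] = ${BASH_REMATCH[0]}"  # full match: --spark_version=3 (the old, buggy value)
  echo "BASH_REMATCH[1] = ${BASH_REMATCH[1]}"  # capture group: 3 (what the script needs)
fi
```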
[hudi] 01/03: [HOTFIX] close spark session in functional test suite and disable spark3 test for spark2 (#2727)
This is an automated email from the ASF dual-hosted git repository.

garyli pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git

commit 452f5e2d661ee667cdd1348c0f580fcc650318b7
Author: Gary Li
AuthorDate: Fri Mar 26 20:58:29 2021 -0700

    [HOTFIX] close spark session in functional test suite and disable spark3 test for spark2 (#2727)
---
 .../org/apache/hudi/testutils/FunctionalTestHarness.java | 16
 hudi-spark-datasource/hudi-spark2/pom.xml                |  2 +-
 hudi-spark-datasource/hudi-spark3/pom.xml                |  7 +++
 .../hudi/utilities/testutils/UtilitiesTestBase.java      |  9 +
 pom.xml                                                  |  2 ++
 5 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/FunctionalTestHarness.java b/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/FunctionalTestHarness.java
index fc02e6d..e391abf 100644
--- a/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/FunctionalTestHarness.java
+++ b/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/FunctionalTestHarness.java
@@ -152,6 +152,8 @@ public class FunctionalTestHarness implements SparkProvider, DFSProvider, Hoodie
   hdfsTestService.stop();
   hdfsTestService = null;
+  jsc.close();
+  jsc = null;
   spark.stop();
   spark = null;
 }));
@@ -166,5 +168,19 @@ public class FunctionalTestHarness implements SparkProvider, DFSProvider, Hoodie
 for (FileStatus f : fileStatuses) {
   fs.delete(f.getPath(), true);
 }
+if (hdfsTestService != null) {
+  hdfsTestService.stop();
+  hdfsTestService = null;
+}
+if (spark != null) {
+  spark.stop();
+  spark = null;
+}
+if (jsc != null) {
+  jsc.close();
+  jsc = null;
+}
+sqlContext = null;
+context = null;
   }
 }

diff --git a/hudi-spark-datasource/hudi-spark2/pom.xml b/hudi-spark-datasource/hudi-spark2/pom.xml
index 9a232d1..c27bb40 100644
--- a/hudi-spark-datasource/hudi-spark2/pom.xml
+++ b/hudi-spark-datasource/hudi-spark2/pom.xml
@@ -151,7 +151,7 @@
     org.scala-lang
     scala-library
-    ${scala11.version}
+    ${scala.version}

diff --git a/hudi-spark-datasource/hudi-spark3/pom.xml b/hudi-spark-datasource/hudi-spark3/pom.xml
index d47e90e..f3c25a8 100644
--- a/hudi-spark-datasource/hudi-spark3/pom.xml
+++ b/hudi-spark-datasource/hudi-spark3/pom.xml
@@ -125,6 +125,13 @@
+    org.apache.maven.plugins
+    maven-surefire-plugin
+      ${skip.hudi-spark3.unit.tests}
+
     org.apache.rat
     apache-rat-plugin

diff --git a/hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java b/hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
index 6efd468..0adef52 100644
--- a/hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
+++ b/hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java
@@ -124,15 +124,19 @@ public class UtilitiesTestBase {
   public static void cleanupClass() {
     if (hdfsTestService != null) {
       hdfsTestService.stop();
+      hdfsTestService = null;
     }
     if (hiveServer != null) {
       hiveServer.stop();
+      hiveServer = null;
     }
     if (hiveTestService != null) {
       hiveTestService.stop();
+      hiveTestService = null;
     }
     if (zookeeperTestService != null) {
       zookeeperTestService.stop();
+      zookeeperTestService = null;
     }
   }
@@ -150,6 +154,11 @@
     TestDataSource.resetDataGen();
     if (jsc != null) {
       jsc.stop();
+      jsc = null;
     }
+    if (sparkSession != null) {
+      sparkSession.close();
+      sparkSession = null;
+    }
     if (context != null) {
       context = null;

diff --git a/pom.xml b/pom.xml
index 4c950fe..61e3ac8 100644
--- a/pom.xml
+++ b/pom.xml
@@ -133,6 +133,7 @@
     ${skipTests}
     ${skipTests}
     ${skipTests}
+    ${skipTests}
     UTF-8
     ${project.basedir}
     provided
@@ -1424,6 +1425,7 @@
     ${scala12.version}
     2.12
+    true
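The hotfix above converts the test-harness teardown into a guarded, idempotent cleanup: stop a service only if its handle is non-null, then null the handle so a second invocation (shutdown hook plus class-level cleanup) is a safe no-op. A minimal standalone sketch of that pattern, using a hypothetical `FakeService` in place of the HDFS/Spark test services:

```java
// Guarded-teardown pattern from the hotfix above: stop only if still live,
// then null the static handle so repeated cleanup calls are no-ops.
// FakeService is a stand-in for hdfsTestService / spark / jsc.
public class TeardownDemo {

  static class FakeService {
    int stopCalls = 0;
    void stop() {
      stopCalls++;
    }
  }

  static FakeService service = new FakeService();
  static int observedStopCalls = 0;

  // Mirrors the shape of the cleanup added to FunctionalTestHarness/UtilitiesTestBase.
  static void cleanup() {
    if (service != null) {
      service.stop();
      observedStopCalls = service.stopCalls;
      service = null; // prevents a double-stop on the next call
    }
  }

  public static void main(String[] args) {
    cleanup();
    cleanup(); // second call is a no-op because the handle was nulled
    if (observedStopCalls != 1) {
      throw new AssertionError("expected exactly one stop, got " + observedStopCalls);
    }
    System.out.println("stop was called exactly once");
  }
}
```

The same guard is why the commit nulls `jsc`, `spark`, and `hdfsTestService` after stopping them: the shutdown hook and the per-class cleanup can both fire without stopping a service twice.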
[hudi] 02/03: [HOTFIX] Disable ITs for Spark3 and scala2.12 (#2733)
This is an automated email from the ASF dual-hosted git repository.

garyli pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git

commit 4db970dc8afd79964bbfa85097954750cb4f1ab2
Author: Gary Li
AuthorDate: Sun Mar 28 01:07:57 2021 -0700

    [HOTFIX] Disable ITs for Spark3 and scala2.12 (#2733)
---
 pom.xml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/pom.xml b/pom.xml
index 61e3ac8..7f371dd 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1426,6 +1426,7 @@
     ${scala12.version}
     2.12
     true
+    true
@@ -1473,6 +1474,7 @@
     ${fasterxml.spark3.version}
     ${fasterxml.spark3.version}
     true
+    true
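Both hotfixes above wire module-level test skipping through Maven properties (the first adds a `${skip.hudi-spark3.unit.tests}` surefire switch to the hudi-spark3 pom, the second flips further skip flags in the scala-2.12 and spark3 profiles). The fragment below is only a sketch of that wiring, not the exact hunks; the property default and the plugin layout are assumptions reconstructed from the diffstat:

```xml
<!-- Hypothetical sketch of the skip-switch pattern used by the hotfixes above.
     Only the property name skip.hudi-spark3.unit.tests appears in the diff;
     the default value and surrounding layout here are assumptions. -->
<properties>
  <skip.hudi-spark3.unit.tests>${skipTests}</skip.hudi-spark3.unit.tests>
</properties>

<!-- In the module that must be skippable (hudi-spark3), surefire honors the flag: -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <skipTests>${skip.hudi-spark3.unit.tests}</skipTests>
  </configuration>
</plugin>

<!-- A profile (e.g. scala-2.12) can then force the flag on: -->
<profile>
  <properties>
    <skip.hudi-spark3.unit.tests>true</skip.hudi-spark3.unit.tests>
  </properties>
</profile>
```

This keeps the decision of which modules run tests in the parent pom, so a build profile can disable an incompatible module's tests without touching that module.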
[hudi] branch master updated (d415d45 -> e069b64)
This is an automated email from the ASF dual-hosted git repository.

garyli pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git.

 from d415d45 [HUDI-1729] Asynchronous Hive sync and commits cleaning for Flink writer (#2732)
  new 452f5e2 [HOTFIX] close spark session in functional test suite and disable spark3 test for spark2 (#2727)
  new 4db970d [HOTFIX] Disable ITs for Spark3 and scala2.12 (#2733)
  new e069b64 [HOTFIX] fix deploy staging jars script

The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.

Summary of changes:
 .../org/apache/hudi/testutils/FunctionalTestHarness.java | 16
 hudi-spark-datasource/hudi-spark2/pom.xml                |  2 +-
 hudi-spark-datasource/hudi-spark3/pom.xml                |  7 +++
 .../hudi/utilities/testutils/UtilitiesTestBase.java      |  9 +
 pom.xml                                                  |  4
 scripts/release/deploy_staging_jars.sh                   |  2 +-
 6 files changed, 38 insertions(+), 2 deletions(-)
[GitHub] [hudi] garyli1019 merged pull request #2735: [HOTFIX] Cherry pick 0.8.0 release fix to master
garyli1019 merged pull request #2735: URL: https://github.com/apache/hudi/pull/2735 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan edited a comment on pull request #2736: [MINOR] Add Missing Apache License to test files
nsivabalan edited a comment on pull request #2736: URL: https://github.com/apache/hudi/pull/2736#issuecomment-809334249

Regarding old-version.commit, maybe you can fix only the validate_staged_release.sh script:

```
find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data' | grep -v DISCLAIMER | grep -v '.sqltemplate' | grep -v KEYS | grep -v '.mailmap' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)"
```

Add ".commit" to this list. Also, wherever we have such filtering, you might need to add ".commit" there as well.
[GitHub] [hudi] nsivabalan commented on pull request #2736: [MINOR] Add Missing Apache License to test files
nsivabalan commented on pull request #2736: URL: https://github.com/apache/hudi/pull/2736#issuecomment-809334249

Regarding old-version.commit, maybe you can fix only the validate_staged_release.sh script:

```
find . -iname '*' -type f | grep -v NOTICE | grep -v LICENSE | grep -v '.json' | grep -v '.data' | grep -v DISCLAIMER | grep -v '.sqltemplate' | grep -v KEYS | grep -v '.mailmap' | grep -v 'ObjectSizeCalculator.java' | grep -v 'AvroConversionHelper.scala' | xargs grep -L "Licensed to the Apache Software Foundation (ASF)"
```

Add ".commit" to this list.
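The shell pipeline above excludes known unlicensed file types before grepping for the ASF header. A rough Java equivalent of that filter, with the proposed ".commit" exclusion added, might look like the sketch below; the exclusion substrings are copied from the command above, while the class name and helper are illustrative, not part of Hudi:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LicenseCheck {

  // Substrings excluded by the grep -v chain above, plus the proposed ".commit".
  static final List<String> EXCLUDES = Arrays.asList(
      "NOTICE", "LICENSE", ".json", ".data", "DISCLAIMER", ".sqltemplate",
      "KEYS", ".mailmap", "ObjectSizeCalculator.java",
      "AvroConversionHelper.scala", ".commit");

  static final String HEADER = "Licensed to the Apache Software Foundation (ASF)";

  // Returns regular files under root that are not excluded and lack the ASF
  // header, mirroring `xargs grep -L` in the pipeline above.
  static List<Path> filesMissingHeader(Path root) throws IOException {
    try (Stream<Path> walk = Files.walk(root)) {
      return walk.filter(Files::isRegularFile)
          .filter(p -> EXCLUDES.stream().noneMatch(e -> p.toString().contains(e)))
          .filter(p -> {
            try {
              return !new String(Files.readAllBytes(p), StandardCharsets.UTF_8).contains(HEADER);
            } catch (IOException ex) {
              return true; // unreadable files are flagged for manual review
            }
          })
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempDirectory("hdr-check");
    Files.write(tmp.resolve("Ok.java"),
        ("// " + HEADER + "\nclass Ok {}\n").getBytes(StandardCharsets.UTF_8));
    Files.write(tmp.resolve("Bad.java"), "class Bad {}\n".getBytes(StandardCharsets.UTF_8));
    Files.write(tmp.resolve("old-version.commit"), "no header".getBytes(StandardCharsets.UTF_8));

    List<Path> missing = filesMissingHeader(tmp);
    // Only Bad.java should be flagged: Ok.java carries the header and
    // old-version.commit is skipped by the ".commit" exclusion.
    if (missing.size() != 1 || !missing.get(0).getFileName().toString().equals("Bad.java")) {
      throw new AssertionError("unexpected result: " + missing);
    }
    System.out.println("flagged: " + missing.get(0).getFileName());
  }
}
```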
[GitHub] [hudi] nsivabalan opened a new pull request #2738: [Minor] Fixing Key generators blog for phrases
nsivabalan opened a new pull request #2738: URL: https://github.com/apache/hudi/pull/2738

## What is the purpose of the pull request

*Fixing Key generators blog for phrases*

## Brief change log

- *Modify Key generators blog*

## Verify this pull request

- Fix is very trivial. Only fixed some texts w/o any images or tables.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[GitHub] [hudi] yanghua commented on pull request #2734: [HUDI-1731] Rename UpsertPartitioner in both hudi-java-client and hud…
yanghua commented on pull request #2734: URL: https://github.com/apache/hudi/pull/2734#issuecomment-809311188

> > Hi @garyli1019 I suggest that we include this PR in 0.8.0. Currently, we have three classes with the same fully-qualified name (`UpsertPartitioner`) in the spark/java/flink modules. It is extremely confusing.
>
> Hi @yanghua , we have almost done all the pre-release verification work. If adding more commits, we need to start over. So let's keep this for the next release cuz this doesn't look like a blocker.

OK, it will cause an error when users load multiple `UpsertPartitioner`s. As the RM, you have the right to make the decision.
[GitHub] [hudi] garyli1019 commented on pull request #2734: [HUDI-1731] Rename UpsertPartitioner in both hudi-java-client and hud…
garyli1019 commented on pull request #2734: URL: https://github.com/apache/hudi/pull/2734#issuecomment-809308540

> Hi @garyli1019 I suggest that we include this PR in 0.8.0. Currently, we have three classes with the same fully-qualified name (`UpsertPartitioner`) in the spark/java/flink modules. It is extremely confusing.

Hi @yanghua , we have almost done all the pre-release verification work. If adding more commits, we need to start over. So let's keep this for the next release cuz this doesn't look like a blocker.
[jira] [Updated] (HUDI-1735) hudi-examples missed MapredParquetInputFormat class
[ https://issues.apache.org/jira/browse/HUDI-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1735: - Labels: pull-request-available (was: )

> hudi-examples missed MapredParquetInputFormat class
> ---
>
> Key: HUDI-1735
> URL: https://issues.apache.org/jira/browse/HUDI-1735
> Project: Apache Hudi
> Issue Type: Bug
> Components: Usability
> Reporter: vinoyang
> Assignee: vinoyang
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
[GitHub] [hudi] yanghua opened a new pull request #2737: [HUDI-1735] Add hive-exec dependency for hudi-examples
yanghua opened a new pull request #2737: URL: https://github.com/apache/hudi/pull/2737

## What is the purpose of the pull request

*hudi-examples missed MapredParquetInputFormat class*

## Brief change log

- *Add hive-exec dependency for hudi-examples*

## Verify this pull request

This pull request is a trivial rework / code cleanup without any test coverage.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
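The fix described above adds a hive-exec dependency so that `MapredParquetInputFormat` is on the classpath when running the examples. The PR body does not show the actual hunk, so the snippet below is only a plausible shape for it; the `${hive.version}` property and the `provided` scope are assumptions, not copied from the patch:

```xml
<!-- Hypothetical sketch of the dependency #2737 adds to hudi-examples/pom.xml;
     the actual coordinates, version property, and scope in the PR may differ. -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive.version}</version>
  <scope>provided</scope>
</dependency>
```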
[jira] [Created] (HUDI-1735) hudi-examples missed MapredParquetInputFormat class
vinoyang created HUDI-1735: -- Summary: hudi-examples missed MapredParquetInputFormat class Key: HUDI-1735 URL: https://issues.apache.org/jira/browse/HUDI-1735 Project: Apache Hudi Issue Type: Bug Components: Usability Reporter: vinoyang Assignee: vinoyang Fix For: 0.9.0 When running {{HoodieDataSourceExample}}, it throws this exception: {code:java} Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormatException in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:763) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) at java.net.URLClassLoader.access$100(URLClassLoader.java:74) at java.net.URLClassLoader$1.run(URLClassLoader.java:369) at java.net.URLClassLoader$1.run(URLClassLoader.java:363) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:362) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:763) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) at java.net.URLClassLoader.access$100(URLClassLoader.java:74) at java.net.URLClassLoader$1.run(URLClassLoader.java:369) at java.net.URLClassLoader$1.run(URLClassLoader.java:363) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:362) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at org.apache.hudi.hadoop.HoodieROTablePathFilter.accept(HoodieROTablePathFilter.java:179) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$$anonfun$15.apply(InMemoryFileIndex.scala:294) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$$anonfun$15.apply(InMemoryFileIndex.scala:294) at scala.collection.TraversableLike$$anonfun$filterImpl$1.apply(TraversableLike.scala:248) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) at scala.collection.TraversableLike$class.filterImpl(TraversableLike.scala:247) at scala.collection.TraversableLike$class.filter(TraversableLike.scala:259) at scala.collection.mutable.ArrayOps$ofRef.filter(ArrayOps.scala:186) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.org$apache$spark$sql$execution$datasources$InMemoryFileIndex$$listLeafFiles(InMemoryFileIndex.scala:294) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$$anonfun$bulkListLeafFiles$1.apply(InMemoryFileIndex.scala:174) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$$anonfun$bulkListLeafFiles$1.apply(InMemoryFileIndex.scala:173) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.bulkListLeafFiles(InMemoryFileIndex.scala:173) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex.listLeafFiles(InMemoryFileIndex.scala:126) at 
org.apache.spark.sql.execution.datasources.InMemoryFileIndex.refresh0(InMemoryFileIndex.scala:91) at org.apache.spark.sql.execution.datasources.InMemoryFileIndex.(InMemoryFileIndex.scala:67) at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$createInMemoryFileIndex(DataSource.scala:533) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:371) at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:193) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:102) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:63) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318) at
[GitHub] [hudi] codecov-io edited a comment on pull request #2736: [MINOR] Add Missing Apache License to test files
codecov-io edited a comment on pull request #2736: URL: https://github.com/apache/hudi/pull/2736#issuecomment-809134018

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=h1) Report

> Merging [#2736](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=desc) (013c0df) into [master](https://codecov.io/gh/apache/hudi/commit/d415d45416707ca4d5b1dbad65dc80e6fccfa378?el=desc) (d415d45) will **increase** coverage by `17.69%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2736/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=tree)

```diff
@@             Coverage Diff              @@
##           master    #2736       +/-   ##
===========================================
+ Coverage   52.04%   69.73%   +17.69%
+ Complexity   3624      371     -3253
===========================================
  Files         479       54      -425
  Lines       22804     1989    -20815
  Branches     2415      236     -2179
===========================================
- Hits        11868     1387    -10481
+ Misses       9911      471     -9440
+ Partials     1025      131      -894
```

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| hudicli | `?` | `?` |
| hudiclient | `?` | `?` |
| hudicommon | `?` | `?` |
| hudiflink | `?` | `?` |
| hudihadoopmr | `?` | `?` |
| hudisparkdatasource | `?` | `?` |
| hudisync | `?` | `?` |
| huditimelineservice | `?` | `?` |
| hudiutilities | `69.73% <ø> (ø)` | `0.00 <ø> (ø)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
[GitHub] [hudi] yanghua commented on pull request #2734: [HUDI-1731] Rename UpsertPartitioner in both hudi-java-client and hud…
yanghua commented on pull request #2734: URL: https://github.com/apache/hudi/pull/2734#issuecomment-809244541

Hi @garyli1019 I suggest that we include this PR in 0.8.0. Currently, we have three classes with the same fully-qualified name (`UpsertPartitioner`) in the spark/java/flink modules. It is extremely confusing.
[jira] [Commented] (HUDI-1731) Rename UpsertPartitioner in both hudi-java-client and hudi-spark-client to differentiate them from each other
[ https://issues.apache.org/jira/browse/HUDI-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310530#comment-17310530 ] vinoyang commented on HUDI-1731: [~pandaman1984] Thanks for reporting this issue. I have given you Jira contributor permission. > Rename UpsertPartitioner in both hudi-java-client and hudi-spark-client to > differentiate them from each other > - > > Key: HUDI-1731 > URL: https://issues.apache.org/jira/browse/HUDI-1731 > Project: Apache Hudi > Issue Type: Improvement > Components: Code Cleanup >Reporter: Leo Zhu >Assignee: Leo Zhu >Priority: Minor > Labels: pull-request-available > Original Estimate: 24h > Remaining Estimate: 24h > > There's same fully-qualified name class - > "org.apache.hudi.table.action.commit.UpsertPartitioner" in both > hudi-spark-client and hudi-java-client module. When both jars are included in > classpath, one would override another, and below error would happen, > Exception in thread "main" java.lang.VerifyError: Bad return typeException in > thread "main" java.lang.VerifyError: Bad return typeException Details: > Location: > org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.getUpsertPartitioner(Lorg/apache/hudi/table/WorkloadProfile;)Lorg/apache/spark/Partitioner; > @34: areturn Reason: Type > 'org/apache/hudi/table/action/commit/UpsertPartitioner' (current frame, > stack[0]) is not assignable to 'org/apache/spark/Partitioner' (from method > signature) Current Frame: bci: @34 flags: \{ } locals: \{ > 'org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor', > 'org/apache/hudi/table/WorkloadProfile' } stack: \{ > 'org/apache/hudi/table/action/commit/UpsertPartitioner' } Bytecode: > 0x000: 2bc7 000d bb00 9859 12bc b700 9bbf bb00 0x010: 8d59 2b2a > b400 0f2a b400 052a b400 03b7 0x020: 00bd b0 > Stackmap Table: same_frame(@14) > at > org.apache.hudi.table.HoodieSparkCopyOnWriteTable.insert(HoodieSparkCopyOnWriteTable.java:97) > at > 
org.apache.hudi.table.HoodieSparkCopyOnWriteTable.insert(HoodieSparkCopyOnWriteTable.java:82) > at > org.apache.hudi.client.SparkRDDWriteClient.insert(SparkRDDWriteClient.java:169) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-1731) Rename UpsertPartitioner in both hudi-java-client and hudi-spark-client to differentiate them from each other
[ https://issues.apache.org/jira/browse/HUDI-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-1731: -- Assignee: Leo Zhu

> Rename UpsertPartitioner in both hudi-java-client and hudi-spark-client to
> differentiate them from each other
> ---

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] yanghua commented on pull request #2734: [HUDI-1731] Rename UpsertPartitioner in both hudi-java-client and hud…
yanghua commented on pull request #2734: URL: https://github.com/apache/hudi/pull/2734#issuecomment-809224318

@leo-Iamok Thanks for your report. Would you also please consider the flink module? cc @danny0405
[jira] [Assigned] (HUDI-1716) rt view w/ MOR tables fails after schema evolution
[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aditya Tiwari reassigned HUDI-1716: --- Assignee: Aditya Tiwari > rt view w/ MOR tables fails after schema evolution > -- > > Key: HUDI-1716 > URL: https://issues.apache.org/jira/browse/HUDI-1716 > Project: Apache Hudi > Issue Type: Bug > Components: Storage Management >Reporter: sivabalan narayanan >Assignee: Aditya Tiwari >Priority: Major > Labels: sev:critical, user-support-issues > Fix For: 0.9.0 > > > Looks like realtime view w/ MOR table fails if schema present in existing log > file is evolved to add a new field. no issues w/ writing. but reading fails > More info: [https://github.com/apache/hudi/issues/2675] > > gist of the stack trace: > Caused by: org.apache.avro.AvroTypeException: Found > hoodie.hudi_trips_cow.hudi_trips_cow_record, expecting > hoodie.hudi_trips_cow.hudi_trips_cow_record, missing required field > evolvedFieldCaused by: org.apache.avro.AvroTypeException: Found > hoodie.hudi_trips_cow.hudi_trips_cow_record, expecting > hoodie.hudi_trips_cow.hudi_trips_cow_record, missing required field > evolvedField at > org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292) at > org.apache.avro.io.parsing.Parser.advance(Parser.java:88) at > org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:130) > at > org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:215) > at > org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175) > at > org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) > at > org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145) > at > org.apache.hudi.common.table.log.block.HoodieAvroDataBlock.deserializeRecords(HoodieAvroDataBlock.java:165) > at > org.apache.hudi.common.table.log.block.HoodieDataBlock.createRecordsFromContentBytes(HoodieDataBlock.java:128) > at > 
org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecords(HoodieDataBlock.java:106)
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processDataBlock(AbstractHoodieLogRecordScanner.java:289)
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:324)
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:252)
	... 24 more
21/03/25 11:27:03 WARN TaskSetManager: Lost task 0.0 in stage 83.0 (TID 667, sivabala-c02xg219jgh6.attlocal.net, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:261)
	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:100)
	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:93)
	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:75)
	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:230)
	at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:328)
	at org.apache.hudi.HoodieMergeOnReadRDD$$anon$3.<init>(HoodieMergeOnReadRDD.scala:210)
	at org.apache.hudi.HoodieMergeOnReadRDD.payloadCombineFileIterator(HoodieMergeOnReadRDD.scala:200)
	at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:77)

Logs from local run: https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198
Diff with which the above logs were generated: https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec

Steps to reproduce in spark shell:
# Create a MOR table with schema1.
# Ingest (with schema1) until log files are created; verify via hudi-cli. It took me 2 batches of updates to see a log file.
# Create a new schema2 with one additional field. Ingest a batch with schema2 that updates existing records.
# Read the entire dataset.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
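The Jira item above reproduces a read failure after schema evolution: log blocks written with schema1 are scanned under an evolved reader schema2 that carries one extra field. As a rough conceptual sketch only (this is not Hudi's reader code; the class, method, and field names below are hypothetical), the projection a reader has to perform on an old record looks like this, and a new field with no usable default is exactly where a scan can only fail:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: project a record written under an old schema onto an
// evolved reader schema. Not Hudi code; it only illustrates the failure mode.
public class SchemaEvolutionSketch {

    public static Map<String, Object> projectToReaderSchema(Map<String, Object> oldRecord,
                                                            String[] readerFields,
                                                            Map<String, Object> defaults) {
        Map<String, Object> out = new HashMap<>();
        for (String field : readerFields) {
            if (oldRecord.containsKey(field)) {
                out.put(field, oldRecord.get(field));   // field existed in schema1
            } else if (defaults.containsKey(field)) {
                out.put(field, defaults.get(field));    // new schema2 field: fill its default
            } else {
                // No value and no default: analogous to the
                // "Exception when reading log file" seen in the scan above.
                throw new IllegalStateException("Field has no value and no default: " + field);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> schema1Record = new HashMap<>();    // written before evolution
        schema1Record.put("key", "r1");
        schema1Record.put("value", 10);

        String[] schema2Fields = {"key", "value", "new_field"}; // schema2 adds new_field
        Map<String, Object> defaults = new HashMap<>();
        defaults.put("new_field", null);                        // nullable, defaults to null

        Map<String, Object> merged = projectToReaderSchema(schema1Record, schema2Fields, defaults);
        System.out.println(merged.containsKey("new_field"));    // prints true
    }
}
```

In this sketch, declaring the added field as nullable (or giving it a default) is what keeps schema1-written records readable under schema2; without that, the projection can only throw.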
[GitHub] [hudi] codecov-io commented on pull request #2735: [HOTFIX] Cherry pick 0.8.0 release fix to master
codecov-io commented on pull request #2735: URL: https://github.com/apache/hudi/pull/2735#issuecomment-809134222

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2735?src=pr=h1) Report

> Merging [#2735](https://codecov.io/gh/apache/hudi/pull/2735?src=pr=desc) (a8962e9) into [master](https://codecov.io/gh/apache/hudi/commit/29b79c99b02d66ef9b087b56223e74c0d1f99e94?el=desc) (29b79c9) will **decrease** coverage by `42.33%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2735/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2735?src=pr=tree)

```diff
@@             Coverage Diff             @@
##           master    #2735       +/-   ##
- Coverage   51.73%    9.40%   -42.34%
+ Complexity   3601       48     -3553
  Files         476       54      -422
  Lines       22595     1989    -20606
  Branches     2409      236     -2173
- Hits        11689      187    -11502
+ Misses       9888     1789     -8099
+ Partials     1018       13     -1005
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.34%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2735?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2735/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[GitHub] [hudi] codecov-io commented on pull request #2736: [MINOR] Add Missing Apache License to test files
codecov-io commented on pull request #2736: URL: https://github.com/apache/hudi/pull/2736#issuecomment-809134018

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=h1) Report

> Merging [#2736](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=desc) (82f4cac) into [master](https://codecov.io/gh/apache/hudi/commit/d415d45416707ca4d5b1dbad65dc80e6fccfa378?el=desc) (d415d45) will **decrease** coverage by `42.64%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2736/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=tree)

```diff
@@             Coverage Diff             @@
##           master    #2736       +/-   ##
- Coverage   52.04%    9.40%   -42.65%
+ Complexity   3624       48     -3576
  Files         479       54      -425
  Lines       22804     1989    -20815
  Branches     2415      236     -2179
- Hits        11868      187    -11681
+ Misses       9911     1789     -8122
+ Partials     1025       13     -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.34%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2736?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2736/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[GitHub] [hudi] garyli1019 commented on pull request #2735: [HOTFIX] Cherry pick 0.8.0 release fix to master
garyli1019 commented on pull request #2735: URL: https://github.com/apache/hudi/pull/2735#issuecomment-809123736 These 3 commits already exist in the release-0.8.0 branch. So I guess I shouldn't squash this PR? WDYT? @n3nash @nsivabalan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] garyli1019 opened a new pull request #2736: [MINOR] Add Missing Apache License to test files
garyli1019 opened a new pull request #2736: URL: https://github.com/apache/hudi/pull/2736

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

*(For example: This pull request adds quick-start document.)*

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

*(Please pick either of the following options)*

This pull request is a trivial rework / code cleanup without any test coverage.

*(or)*

This pull request is already covered by existing tests, such as *(please describe tests)*.

(or)

This change added tests and can be verified as follows:

*(example:)*
- *Added integration tests for end-to-end.*
- *Added HoodieClientWriteTest to verify the change.*
- *Manually verified the change by running a job locally.*

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] garyli1019 opened a new pull request #2735: [HOTFIX] Cherry pick 0.8.0 release fix to master
garyli1019 opened a new pull request #2735: URL: https://github.com/apache/hudi/pull/2735

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

*(For example: This pull request adds quick-start document.)*

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

*(Please pick either of the following options)*

This pull request is a trivial rework / code cleanup without any test coverage.

*(or)*

This pull request is already covered by existing tests, such as *(please describe tests)*.

(or)

This change added tests and can be verified as follows:

*(example:)*
- *Added integration tests for end-to-end.*
- *Added HoodieClientWriteTest to verify the change.*
- *Manually verified the change by running a job locally.*

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org