[
https://issues.apache.org/jira/browse/HUDI-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377004#comment-17377004
]
ASF GitHub Bot commented on HUDI-1763:
--------------------------------------
vinothchandar commented on a change in pull request #2977:
URL: https://github.com/apache/hudi/pull/2977#discussion_r665841138
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/cluster/SparkExecuteClusteringCommitActionExecutor.java
##########
@@ -209,7 +209,7 @@ protected String getCommitActionType() {
.build();
recordIterators.add(HoodieFileSliceReader.getFileSliceReader(baseFileReader,
scanner, readerSchema,
- table.getMetaClient().getTableConfig().getPayloadClass()));
+ table.getMetaClient().getTableConfig().getPayloadClass(), table.getMetaClient().getTableConfig().getPreCombineField()));
Review comment:
Pull out `table.getMetaClient().getTableConfig()` into a variable, for ease of reading?
##########
File path:
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java
##########
@@ -36,11 +36,11 @@
private Iterator<HoodieRecord<? extends HoodieRecordPayload>> recordsIterator;
public static <R extends IndexedRecord, T extends HoodieRecordPayload> HoodieFileSliceReader getFileSliceReader(
- HoodieFileReader<R> baseFileReader, HoodieMergedLogRecordScanner scanner, Schema schema, String payloadClass) throws IOException {
+ HoodieFileReader<R> baseFileReader, HoodieMergedLogRecordScanner scanner, Schema schema, String payloadClass, String preCombineField) throws IOException {
Review comment:
I am kind of skeptical about leaking the `preCombineField` this deep.
> DefaultHoodieRecordPayload does not honor ordering value when records within
> multiple log files are merged
> ----------------------------------------------------------------------------------------------------------
>
> Key: HUDI-1763
> URL: https://issues.apache.org/jira/browse/HUDI-1763
> Project: Apache Hudi
> Issue Type: Bug
> Components: Writer Core
> Affects Versions: 0.8.0
> Reporter: sivabalan narayanan
> Priority: Major
> Labels: pull-request-available, sev:critical
>
> While creating HoodieRecordPayloads from log files for MOR tables, the
> payloads are created without any orderingVal (even if one was specified while
> writing data). As a result, the precombine function could return any payload,
> irrespective of its orderingVal.
> Attaching a sample script to reproduce the issue.
> In this example, for key "key1", the first insert has ts=1000. We then update
> with ts=2000, and then update again with ts=500. Ideally, if we run a
> snapshot query on the table after the last update, we should get key1 with
> ts=2000 (since our ordering field is ts). However, it shows the entry with
> ts=1000, because from the logs it ignores ts=2000 and only picks up ts=500.
> Also, AFAIU, the same flow is used during compaction, in which case we might
> lose data permanently.
>
> More info: https://github.com/apache/hudi/issues/2756
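
The scenario in the issue description above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hudi's code: the `Rec` class and the methods `mergeLastWins`, `mergeByOrderingValue`, and `mergeWithBase` are hypothetical names invented for this example. It assumes that log records are merged per key first and that the result is then compared against the base file record using the ordering field (ts).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderingMergeSketch {
    // Hypothetical stand-in for a record payload; not a Hudi class.
    static final class Rec {
        final String key;
        final long ts; // plays the role of the ordering value
        Rec(String key, long ts) { this.key = key; this.ts = ts; }
    }

    // Log merge that ignores the ordering value: the last record seen per key wins.
    static Map<String, Rec> mergeLastWins(List<Rec> logRecords) {
        Map<String, Rec> merged = new HashMap<>();
        for (Rec r : logRecords) {
            merged.put(r.key, r);
        }
        return merged;
    }

    // Log merge that honors the ordering value: keep the record with the
    // larger ts; on a tie, the later record wins.
    static Map<String, Rec> mergeByOrderingValue(List<Rec> logRecords) {
        Map<String, Rec> merged = new HashMap<>();
        for (Rec r : logRecords) {
            merged.merge(r.key, r, (oldRec, newRec) -> newRec.ts >= oldRec.ts ? newRec : oldRec);
        }
        return merged;
    }

    // Final step (assumed): compare the merged log record with the base file
    // record and keep the one with the larger ts.
    static Rec mergeWithBase(Rec base, Rec fromLogs) {
        return fromLogs.ts >= base.ts ? fromLogs : base;
    }

    public static void main(String[] args) {
        Rec base = new Rec("key1", 1000L);                      // first insert
        List<Rec> logs = Arrays.asList(new Rec("key1", 2000L),  // first update
                                       new Rec("key1", 500L));  // second update

        // Ordering ignored in the log merge: logs yield ts=500, so the base record wins.
        System.out.println(mergeWithBase(base, mergeLastWins(logs).get("key1")).ts);        // prints 1000

        // Ordering honored in the log merge: logs yield ts=2000, which beats the base record.
        System.out.println(mergeWithBase(base, mergeByOrderingValue(logs).get("key1")).ts); // prints 2000
    }
}
```

In this model the ts=2000 update is dropped during the log merge itself, so no later comparison against the base record can bring it back.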
--
This message was sent by Atlassian Jira
(v8.3.4#803005)