alexeykudinkin commented on code in PR #7642:
URL: https://github.com/apache/hudi/pull/7642#discussion_r1087234130
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java:
##########
@@ -807,16 +809,17 @@ private ClosableIteratorWithSchema<HoodieRecord> getRecordsIterator(
Option<Pair<Function<HoodieRecord, HoodieRecord>, Schema>> schemaEvolutionTransformerOpt =
composeEvolvedSchemaTransformer(dataBlock);
+
// In case when schema has been evolved original persisted records will have to be
// transformed to adhere to the new schema
- if (schemaEvolutionTransformerOpt.isPresent()) {
- return ClosableIteratorWithSchema.newInstance(
- new CloseableMappingIterator<>(blockRecordsIterator,
- schemaEvolutionTransformerOpt.get().getLeft()),
- schemaEvolutionTransformerOpt.get().getRight());
- } else {
- return ClosableIteratorWithSchema.newInstance(blockRecordsIterator, dataBlock.getSchema());
- }
+ Function<HoodieRecord, HoodieRecord> transformer =
Review Comment:
Good catch! Not sure how this change actually crept in here. Will clean
it up.
##########
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:
##########
@@ -73,11 +73,12 @@ public abstract class BaseTableMetadata implements HoodieTableMetadata {
private static final Logger LOG = LogManager.getLogger(BaseTableMetadata.class);
- public static final long MAX_MEMORY_SIZE_IN_BYTES = 1024 * 1024 * 1024;
- public static final int BUFFER_SIZE = 10 * 1024 * 1024;
+ protected static final long MAX_MEMORY_SIZE_IN_BYTES = 1024 * 1024 * 1024;
+ // NOTE: Buffer-size is deliberately set pretty low, since MT internally is relying
+ // on HFile (serving as persisted binary key-value mapping) to do caching
+ protected static final int BUFFER_SIZE = 10 * 1024; // 10Kb
Review Comment:
Yeah, this patch has been quite heavily benchmarked by now.
You can get more context on this one here:
https://github.com/apache/hudi/pull/6815/files#r1083144222