voonhous commented on code in PR #17781:
URL: https://github.com/apache/hudi/pull/17781#discussion_r3338888220
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupReader.java:
##########
@@ -81,25 +85,113 @@ public final class HoodieFileGroupReader<T> implements
Closeable {
private HoodieFileGroupRecordBuffer<T> recordBuffer;
private ClosableIterator<T> baseFileIterator;
private final Option<UnaryOperator<T>> outputConverter;
+ @Getter
private final HoodieReadStats readStats;
// Callback to run custom logic on updates to the base files for the file
group
private final Option<BaseFileUpdateCallback<T>> fileGroupUpdateCallback;
// The list of instant times read from the log blocks, this value is used by
the log-compaction to allow optimized log-block scans
+ @Getter
private List<String> validBlockInstants = Collections.emptyList();
private BufferedRecordConverter<T> bufferedRecordConverter;
- private HoodieFileGroupReader(HoodieReaderContext<T> readerContext,
HoodieStorage storage, String tablePath,
- String latestCommitTime, HoodieSchema
dataSchema, HoodieSchema requestedSchema,
- Option<InternalSchema> internalSchemaOpt,
HoodieTableMetaClient hoodieTableMetaClient, TypedProperties props,
- ReaderParameters readerParameters, InputSplit
inputSplit, Option<BaseFileUpdateCallback<T>> updateCallback,
- FileGroupRecordBufferLoader<T>
recordBufferLoader) {
+ @Builder(setterPrefix = "with")
+ private HoodieFileGroupReader(
+ HoodieReaderContext<T> readerContext,
+ String latestCommitTime,
+ HoodieSchema dataSchema,
+ HoodieSchema requestedSchema,
+ Option<InternalSchema> internalSchemaOpt,
+ HoodieTableMetaClient hoodieTableMetaClient,
+ TypedProperties props,
+ Option<HoodieBaseFile> baseFileOption,
+ Stream<HoodieLogFile> logFiles,
+ String partitionPath,
+ Long start,
+ Long length,
+ Iterator<? extends HoodieRecord> recordIterator,
+ Boolean shouldUseRecordPosition,
+ Boolean allowInflightInstants,
+ Boolean emitDelete,
+ Boolean sortOutput,
+ Option<BaseFileUpdateCallback<T>> fileGroupUpdateCallback,
+ FileGroupRecordBufferLoader<T> recordBufferLoader) {
+
+ // Validations
+ ValidationUtils.checkArgument(readerContext != null, "Reader context is
required");
+ ValidationUtils.checkArgument(hoodieTableMetaClient != null, "Hoodie table
meta client is required");
+ ValidationUtils.checkArgument(latestCommitTime != null, "Latest commit
time is required");
+ ValidationUtils.checkArgument(dataSchema != null, "Data schema is
required");
+ ValidationUtils.checkArgument(requestedSchema != null, "Requested schema
is required");
+ ValidationUtils.checkArgument(props != null, "Props is required");
+ ValidationUtils.checkArgument(partitionPath != null, "Partition path is
required");
+
+ // Handle defaults
Review Comment:
Possible, but it will blow up this refactoring.
We will first need to add `@Builder` to the class. Even so, most of these
parameters aren't stored 1:1 as fields. They're transformed in the constructor
body:
1. `start` / `length` is folded into `InputSplit`
2. `internalSchemaOpt` is passed to the schema handler
3. `shouldUseRecordPosition` / `emitDelete` / `sortOutput` /
`allowInflightInstants` are folded into `ReaderParameters`
So there are no `start`/`length`/`shouldUseRecordPosition` fields to hang a
`@Builder.Default` on.
To use it you will have to restructure into class-level `@Builder` with real
defaulted fields and move all the validation/derivation logic out of the
constructor (e.g. into the composite builders or a factory), all these will
blow up the refactoring as a much bigger change than the null checks buy back.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]