the-other-tim-brown commented on code in PR #10578:
URL: https://github.com/apache/hudi/pull/10578#discussion_r1486455598
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##########
@@ -178,27 +173,26 @@ public static <R> HoodieRecord<R> tagRecord(HoodieRecord<R> record, HoodieRecord
* @param candidateRecordKeys - Candidate keys to filter
* @return List of pairs of candidate keys and positions that are available in the file
*/
- public static List<Pair<String, Long>> filterKeysFromFile(Path filePath, List<String> candidateRecordKeys,
- Configuration configuration) throws HoodieIndexException {
+ public static Collection<Pair<String, Long>> filterKeysFromFile(Path filePath, Set<String> candidateRecordKeys,
+ Configuration configuration) throws HoodieIndexException {
+ if (candidateRecordKeys.isEmpty()) {
+ return Collections.emptyList();
+ }
checkArgument(FSUtils.isBaseFile(filePath));
- List<Pair<String, Long>> foundRecordKeys = new ArrayList<>();
try (HoodieFileReader fileReader = HoodieFileReaderFactory.getReaderFactory(HoodieRecordType.AVRO)
.getFileReader(DEFAULT_HUDI_CONFIG_FOR_READER, configuration, filePath)) {
// Load all rowKeys from the file, to double-confirm
- if (!candidateRecordKeys.isEmpty()) {
- HoodieTimer timer = HoodieTimer.start();
- Set<Pair<String, Long>> fileRowKeys = fileReader.filterRowKeys(candidateRecordKeys.stream().collect(Collectors.toSet()));
- foundRecordKeys.addAll(fileRowKeys);
Review Comment:
Previously we copied this set into a new list before returning it. It seems
we can return the result as a `Collection` directly, which avoids allocating
a separate data structure just to forward the results of this step.
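
To illustrate the idea, here is a minimal, simplified sketch rather than the actual Hudi code. It uses plain `String` keys instead of `Pair<String, Long>`, and the `lookup` helper is a hypothetical stand-in for `fileReader.filterRowKeys`; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CollectionForwardingSketch {

  // Hypothetical stand-in for the reader lookup: returns the candidates
  // that are present in the file, as a Set.
  static Set<String> lookup(Set<String> fileKeys, Set<String> candidates) {
    Set<String> found = new HashSet<>(candidates);
    found.retainAll(fileKeys);
    return found;
  }

  // Before: the Set returned by the lookup is copied into a new List
  // only so that the method can return a List.
  static List<String> filterWithCopy(Set<String> fileKeys, List<String> candidates) {
    List<String> found = new ArrayList<>();
    if (!candidates.isEmpty()) {
      found.addAll(lookup(fileKeys, new HashSet<>(candidates)));
    }
    return found;
  }

  // After: return early on empty input and forward the Set as a Collection,
  // so no intermediate List is allocated.
  static Collection<String> filterWithoutCopy(Set<String> fileKeys, Set<String> candidates) {
    if (candidates.isEmpty()) {
      return Collections.emptyList();
    }
    return lookup(fileKeys, candidates);
  }
}
```

Callers that only iterate over the result or check its size are unaffected by the wider return type. Callers that relied on `List` operations such as `get(int)` would need to be adjusted.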