jihoonson commented on a change in pull request #9360: Create splits of multiple files for parallel indexing
URL: https://github.com/apache/druid/pull/9360#discussion_r383479776
 
 

 ##########
 File path: indexing-service/src/main/java/org/apache/druid/indexing/input/DruidInputSource.java
 ##########
 @@ -228,13 +232,15 @@ protected InputSourceReader fixedFormatReader(InputRowSchema inputRowSchema, @Nu
     // segmentIds is supposed to be specified by the supervisor task during the parallel indexing.
     // If it's not null, segments are already split by the supervisor task and further split won't happen.
     if (segmentIds == null) {
-      return createSplits(
-          coordinatorClient,
-          retryPolicyFactory,
-          dataSource,
-          interval,
-          splitHintSpec == null ? new SegmentsSplitHintSpec(null) : splitHintSpec
-      ).stream();
+      return Streams.sequentialStreamFrom(
+          createSplits(
+              coordinatorClient,
+              retryPolicyFactory,
+              dataSource,
+              interval,
+              splitHintSpec == null ? new SegmentsSplitHintSpec(null) : splitHintSpec
 
 Review comment:
   Changed to create `MaxSizeSplitHintSpec` directly.
   
   > Does this also mean SegmentsSplitHintSpec is deprecated?
   
   Good question. `MaxSizeSplitHintSpec` and `SegmentsSplitHintSpec` currently behave exactly the same, but I think `SegmentsSplitHintSpec` can be further optimized in the future. I added a comment about the future improvement.
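
   For illustration only, here is a minimal sketch of size-based splitting, assuming a hint spec caps the total input size of each split. `SizeBasedSplitter`, `InputFile`, and `maxSplitSize` are hypothetical names made up for this example; this is not the code in the PR and not Druid's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Druid.
class SizeBasedSplitter
{
  // Stand-in for an input (a file or a segment) with a known size in bytes.
  static final class InputFile
  {
    final String name;
    final long size;

    InputFile(String name, long size)
    {
      this.name = name;
      this.size = size;
    }
  }

  // Greedily packs inputs, in order, into splits whose total size stays
  // within maxSplitSize. An input larger than maxSplitSize gets its own split.
  static List<List<InputFile>> split(List<InputFile> inputs, long maxSplitSize)
  {
    List<List<InputFile>> splits = new ArrayList<>();
    List<InputFile> current = new ArrayList<>();
    long currentSize = 0;
    for (InputFile input : inputs) {
      // Close the current split if adding this input would exceed the limit.
      if (!current.isEmpty() && currentSize + input.size > maxSplitSize) {
        splits.add(current);
        current = new ArrayList<>();
        currentSize = 0;
      }
      current.add(input);
      currentSize += input.size;
    }
    if (!current.isEmpty()) {
      splits.add(current);
    }
    return splits;
  }
}
```

   With a limit of 100 bytes, inputs of 40, 50, 30, 200, and 10 bytes would end up in four splits under this sketch: the first two inputs share one split and each remaining input gets its own.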
