clintropolis commented on a change in pull request #8950: Support orc format for native batch ingestion
URL: https://github.com/apache/incubator-druid/pull/8950#discussion_r351070716
 
 

 ##########
 File path: core/src/main/java/org/apache/druid/data/input/IntermediateRowParsingReader.java
 ##########
 @@ -39,25 +40,57 @@
   @Override
   public CloseableIterator<InputRow> read() throws IOException
   {
-    return intermediateRowIterator().flatMap(row -> {
-      try {
-        // since parseInputRows() returns a list, the below line always iterates over the list,
-        // which means it calls Iterator.hasNext() and Iterator.next() at least once per row.
-        // This could be unnecessary if the row wouldn't be exploded into multiple inputRows.
-        // If this line turned out to be a performance bottleneck, perhaps parseInputRows() interface might not be a
-        // good idea. Subclasses could implement read() with some duplicate codes to avoid unnecessary iteration on
-        // a singleton list.
-        return CloseableIterators.withEmptyBaggage(parseInputRows(row).iterator());
+    final CloseableIterator<T> intermediateRowIterator = intermediateRowIterator();
+
+    return new CloseableIterator<InputRow>()
+    {
+      // since parseInputRows() returns a list, the below line always iterates over the list,
+      // which means it calls Iterator.hasNext() and Iterator.next() at least once per row.
+      // This could be unnecessary if the row wouldn't be exploded into multiple inputRows.
+      // If this line turned out to be a performance bottleneck, perhaps parseInputRows() interface might not be a
+      // good idea. Subclasses could implement read() with some duplicate codes to avoid unnecessary iteration on
+      // a singleton list.
+      Iterator<InputRow> rows = null;
+
+      @Override
+      public boolean hasNext()
+      {
+        if (rows == null || !rows.hasNext()) {
+          if (!intermediateRowIterator.hasNext()) {
+            return false;
+          }
+          final T row = intermediateRowIterator.next();
+          try {
+            rows = parseInputRows(row).iterator();
+          }
+          catch (IOException e) {
+            throw new ParseException(e, "Unable to parse row [%s]", row);
+          }
+        }
+
+        return rows.hasNext() || intermediateRowIterator.hasNext();
 
 Review comment:
   I think `rows` will always have a next element at this point in the code.
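
   For illustration only, here is a minimal sketch of the flattening pattern that `hasNext()` implements. It is not the Druid code: `FlatteningIteratorSketch` and its `parser` argument are hypothetical stand-ins for the anonymous `CloseableIterator` and `parseInputRows()`. It also assumes that a parsed row might produce an empty list, which the diff does not say. Under that assumption, a loop that keeps advancing the outer iterator guarantees the inner iterator has a next element whenever `hasNext()` returns true, so no trailing `||` check is needed.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical stand-in for the anonymous CloseableIterator in the diff; not Druid code.
final class FlatteningIteratorSketch<T, R> implements Iterator<R>
{
  private final Iterator<T> outer;
  // Plays the role of parseInputRows(): one intermediate row in, zero or more rows out.
  private final Function<T, List<R>> parser;
  private Iterator<R> rows = null;

  FlatteningIteratorSketch(Iterator<T> outer, Function<T, List<R>> parser)
  {
    this.outer = outer;
    this.parser = parser;
  }

  @Override
  public boolean hasNext()
  {
    // Keep pulling intermediate rows until one yields a non-empty list
    // or the outer iterator is exhausted.
    while (rows == null || !rows.hasNext()) {
      if (!outer.hasNext()) {
        return false;
      }
      rows = parser.apply(outer.next()).iterator();
    }
    // The loop only exits here when rows.hasNext() is true.
    return true;
  }

  @Override
  public R next()
  {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return rows.next();
  }
}
```

   With this shape, an intermediate row that parses to an empty list is skipped rather than ending iteration early or making `next()` fail.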
