bersprockets commented on a change in pull request #30221:
URL: https://github.com/apache/spark/pull/30221#discussion_r516073775



##########
File path: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroUtils.scala
##########
@@ -171,9 +171,15 @@ private[sql] object AvroUtils extends Logging {
     protected val stopPosition: Long
 
     private[this] var completed = false
+    private[this] var interveningNext = true
+    private[this] var prevHasNextRow = false
     private[this] var currentRow: Option[InternalRow] = None
 
     def hasNextRow: Boolean = {
+      if (!interveningNext) {
+        // until a row is consumed, return previous result of hasNextRow
+        return prevHasNextRow
+      }

Review comment:
       Yeah, this approach is better, because it also handles the case where 
nextRow is called multiple times (which previously, would return the same row 
over and over).

##########
File path: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroUtils.scala
##########
@@ -171,9 +171,15 @@ private[sql] object AvroUtils extends Logging {
     protected val stopPosition: Long
 
     private[this] var completed = false
+    private[this] var interveningNext = true
+    private[this] var prevHasNextRow = false
     private[this] var currentRow: Option[InternalRow] = None
 
     def hasNextRow: Boolean = {
+      if (!interveningNext) {
+        // until a row is consumed, return previous result of hasNextRow
+        return prevHasNextRow
+      }

Review comment:
       Yeah, this approach is better, because it also handles the case where 
nextRow is called multiple times (which previously, would return the same row 
over and over, which, as @HyukjinKwon pointed out would keep the status quo, 
but it probably wouldn't be correct, since the users of this code are Iterator 
implementations).

##########
File path: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroUtils.scala
##########
@@ -171,9 +171,15 @@ private[sql] object AvroUtils extends Logging {
     protected val stopPosition: Long
 
     private[this] var completed = false
+    private[this] var interveningNext = true
+    private[this] var prevHasNextRow = false
     private[this] var currentRow: Option[InternalRow] = None
 
     def hasNextRow: Boolean = {
+      if (!interveningNext) {
+        // until a row is consumed, return previous result of hasNextRow
+        return prevHasNextRow
+      }

Review comment:
       >But it sounds weird that nextRow will be called with the same row.
   
   Yes, I agree. As I mentioned above, leaving it that way would keep the 
status quo, but probably wouldn't be correct since the users of this RowReader 
are Iterator implementations.

##########
File path: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroUtils.scala
##########
@@ -171,9 +171,15 @@ private[sql] object AvroUtils extends Logging {
     protected val stopPosition: Long
 
     private[this] var completed = false
+    private[this] var interveningNext = true
+    private[this] var prevHasNextRow = false
     private[this] var currentRow: Option[InternalRow] = None
 
     def hasNextRow: Boolean = {
+      if (!interveningNext) {
+        // until a row is consumed, return previous result of hasNextRow
+        return prevHasNextRow
+      }

Review comment:
       Follow up to my previous comment: As @HyukjinKwon pointed out, I was 
earlier trying to fix the stated bug without changing other behavior. But I 
should probably fix nextRow while I am at it, which makes RowReader follow a 
more recognized pattern and makes the actual Iterators that use it more 
correct. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to