[GitHub] spark pull request #16252: [SPARK-18827][Core] Fix cannot read broadcast on ...

viirya Wed, 14 Dec 2016 05:56:51 -0800

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16252#discussion_r92396781
  
    --- Diff: 
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
    @@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T](
       }
     
       override def next(): T = {
    -    if (unrolled == null) {
    +    if (unrolled == null || !unrolled.hasNext) {
    --- End diff --
    
    oh, yes, before this fixing, calling `next` without the preceding `hasNext` 
will fail.
    
    After rethinking about it, I think it is not guaranteed to be valid to call 
`next` without a preceding `hasNext`.
    
    From the Scala's Iterator API, the `next` is defined as:
    
        abstract def next(): A
          Produces the next element of this iterator.
          returns: the next element of this iterator, if hasNext is true, 
undefined behavior otherwise.
    
    The returned result of `next` is only defined if `hasNext` is true.
    
    We have an example `BufferedRowIterator` in Spark SQL although it is not 
inherited from Iterator but it behaves as iterator. The `hasNext` on 
`BufferedRowIterator` will load data into the iterator and calling `next` can 
return the data then.
    
    This is just some thoughts about the usage of the iterator. The current 
change still looks good to me as we already check `hasNext` in 
`TorrentBroadcast` now.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] spark pull request #16252: [SPARK-18827][Core] Fix cannot read broadcast on ...

Reply via email to