Shixiong Zhu created SPARK-4824:
-----------------------------------

             Summary: Join should use `Iterator` rather than `Iterable`
                 Key: SPARK-4824
                 URL: https://issues.apache.org/jira/browse/SPARK-4824
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Shixiong Zhu


In Scala, `map` and `flatMap` on a strict `Iterable` (such as a `Seq`) run eagerly and 
copy every result into a new collection, whereas on an `Iterator` they are lazy. For example,
{code}
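  // Seq.map is strict: the function runs on every element right away.
  val iterable = Seq(1, 2, 3).map(v => {
    println(v)
    v
  })
  println("Iterable map done")

  // Iterator.map is lazy: nothing runs until the iterator is consumed.
  val iterator = Seq(1, 2, 3).iterator.map(v => {
    println(v)
    v
  })
  println("Iterator map done")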
{code}
prints
{code}
1
2
3
Iterable map done
Iterator map done
{code}
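So join should use `iterator` rather than `Iterable`: the results are then produced lazily instead of being copied into an intermediate collection, which reduces the memory consumed by join.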
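
To illustrate the difference for a join-like operation, here is a minimal sketch in plain Scala with no Spark dependency. The object `JoinSketch` and the helpers `pairsStrict` and `pairsLazy` are hypothetical names made up for this example and are not Spark's actual code; they only pair up the values that two sides hold for a single key.

```scala
object JoinSketch {
  // Hypothetical helper: pairs every left value with every right value.
  // On strict collections the for-comprehension builds the whole result
  // collection in memory before returning.
  def pairsStrict[V, W](left: Iterable[V], right: Iterable[W]): Iterable[(V, W)] =
    for (v <- left; w <- right) yield (v, w)

  // Hypothetical helper: same pairing, but over iterators, so each pair
  // is computed only when the caller asks for it.
  def pairsLazy[V, W](left: Iterable[V], right: Iterable[W]): Iterator[(V, W)] =
    for (v <- left.iterator; w <- right.iterator) yield (v, w)

  def main(args: Array[String]): Unit = {
    val left = Seq(1, 2, 3)
    val right = Seq("a", "b")

    // All 6 pairs exist in memory at this point.
    println(pairsStrict(left, right).size) // prints 6

    // Only the first pair has been computed here.
    val it = pairsLazy(left, right)
    println(it.next()) // prints (1,a)
  }
}
```

With many values per key, the strict version holds the full cross product for that key at once, while the lazy version holds only the pair currently being consumed.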

Found by [~johannes.simon]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
