GitHub user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21183#discussion_r184753197
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala ---
    @@ -473,7 +475,8 @@ final class OnlineLDAOptimizer extends LDAOptimizer with Logging {
                                             None
                                           }
     
    -    val stats: RDD[(BDM[Double], Option[BDV[Double]], Long)] = batch.mapPartitions { docs =>
    +    val stats: RDD[(BDM[Double], Option[BDV[Double]], Long)] = batch.mapPartitionsWithIndexInternal
    --- End diff --
    
    Let's not use mapPartitionsWithIndexInternal, which skips closure cleaning; I don't think closure cleaning is expensive enough for us to worry about here. Use mapPartitionsWithIndex instead.
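
For illustration, a minimal sketch of what the public call could look like. It assumes the partition index is wanted to derive a per-partition random seed; that is an assumption, since the diff context above is cut off before the closure body. The names `perPartitionDraws` and `seed`, and the toy `Int` batch, are hypothetical and not taken from the PR.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.util.Random

object MapPartitionsWithIndexSketch {
  // Hypothetical helper: pairs each element with a pseudo-random draw.
  // The generator is seeded per partition (seed + partition index), so
  // repeated runs over the same partitioning give the same draws.
  def perPartitionDraws(batch: RDD[Int], seed: Long): RDD[(Int, Double)] =
    // Public API: Spark cleans the closure before shipping it to executors.
    batch.mapPartitionsWithIndex { (index, docs) =>
      val rng = new Random(seed + index)
      docs.map(doc => (doc, rng.nextDouble()))
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val batch = sc.parallelize(1 to 8, numSlices = 2)
      val first = perPartitionDraws(batch, seed = 42L).collect()
      val second = perPartitionDraws(batch, seed = 42L).collect()
      // Same seed and same partitioning should reproduce the same draws.
      assert(first.sameElements(second))
    } finally {
      sc.stop()
    }
  }
}
```

The closure here captures only a `Long`, so closure cleaning has very little to do, which is in line with the point that its cost is unlikely to matter at this call site.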

