Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20472#discussion_r165343004
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
    @@ -917,11 +916,15 @@ private[spark] object RandomForest extends Logging {
           // being spun up that will definitely do no work.
           val numPartitions = math.min(continuousFeatures.length, input.partitions.length)
     
    +      val numInput = input.count()
    --- End diff --
    
    We can get this from the `metadata.numExamples * fraction` operation in the calling method, which avoids running another job just to perform the count.
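
    A minimal sketch of the idea, not the actual patch: `Metadata`, `SampleSizeSketch` and `estimatedNumInput` are hypothetical names, with only `numExamples` and `fraction` taken from the comment above. It assumes `fraction` is the sampling fraction already computed in the calling method. The result is the expected sample size, not an exact count, so it may differ slightly from what `input.count()` would return.

    ```scala
    // Hypothetical stand-in for the metadata object; only the field named in
    // the review comment is modelled here.
    final case class Metadata(numExamples: Long)

    object SampleSizeSketch {
      // Estimate how many rows a sample with the given fraction contains,
      // instead of launching a job to count the sampled data.
      def estimatedNumInput(metadata: Metadata, fraction: Double): Long = {
        require(fraction >= 0.0 && fraction <= 1.0, "fraction must be in [0, 1]")
        math.round(metadata.numExamples * fraction)
      }
    }
    ```

    The caller would compute this value once and pass it down as an argument, so the callee never needs to call `count()` on the sampled input.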


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
