GitHub user harsha2010 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6403#discussion_r31079050
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
    @@ -176,7 +176,7 @@ final class OneVsRest(override val uid: String)
         }
     
         // create k columns, one for each binary classifier.
    -    val models = Range(0, numClasses).par.map { index =>
    +    val models = Range(0, numClasses).map { index =>
    --- End diff --
    
    @mengxr, I removed `par` because the underlying classifier may cache a
portion of the dataset, so running the fits in parallel could create multiple
copies of the dataset in the intermediate stages. I wasn't sure this would be
an issue, since I am already caching the multiclass-labeled dataset, but the
caching behavior of the underlying classifiers in the meta-learner scenario is
still a bit unclear to me, so I decided the sequential version is the
lower-risk option.
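
    To make the concern concrete, here is a minimal sketch in plain Scala, not
Spark code. `trainOne` is a hypothetical stand-in for fitting one binary
classifier that holds its own cached copy of the data while it trains; the
object and its counters are made up for illustration only. It assumes a Scala
version where `.par` is available on standard collections (built in up to
2.12; on 2.13 it needs the separate scala-parallel-collections module).

```scala
object ParVsSequential {
  // Number of "cached copies" alive right now, and the highest value seen.
  private var live = 0
  private var peak = 0

  // Hypothetical stand-in for fitting one binary classifier: it holds one
  // cached copy of the dataset for as long as training runs.
  def trainOne(index: Int): Int = {
    synchronized { live += 1; if (live > peak) peak = live } // "cache"
    Thread.sleep(50)                                         // "train"
    synchronized { live -= 1 }                               // "unpersist"
    index
  }

  // Resets the counters, runs the given training loop, and returns the peak
  // number of copies that were alive at the same time.
  def peakCopies(run: => Any): Int = {
    synchronized { live = 0; peak = 0 }
    run
    synchronized { peak }
  }

  def main(args: Array[String]): Unit = {
    val numClasses = 4
    // Sequential map: one classifier at a time, so at most one copy is alive.
    val seqPeak = peakCopies(Range(0, numClasses).map(trainOne))
    // Parallel map: several classifiers may train at once, so several copies
    // may be alive together; the exact number depends on the available cores.
    val parPeak = peakCopies(Range(0, numClasses).par.map(trainOne))
    println(s"sequential peak copies: $seqPeak")
    println(s"parallel peak copies:   $parPeak")
  }
}
```

    With the sequential `map` the peak is always one copy; with `par` it can
grow up to `numClasses`, which is the extra memory pressure I wanted to avoid.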

