[ https://issues.apache.org/jira/browse/SPARK-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-14283:
------------------------------------

    Assignee:     (was: Apache Spark)

> Avoid sort in randomSplit when possible
> ---------------------------------------
>
>                 Key: SPARK-14283
>                 URL: https://issues.apache.org/jira/browse/SPARK-14283
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Joseph K. Bradley
>
> Dataset.randomSplit sorts each partition in order to guarantee an ordering
> and make randomSplit deterministic given the seed. Since randomSplit is used
> a fair amount in ML, it would be great to avoid the sort when possible.
> Are there cases when it could be avoided?
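For readers unfamiliar with why a sort would make a seeded split deterministic, here is a minimal sketch in plain Python. It is a toy model, not Spark's code. It assumes each split is produced by a separate pass over the partition, that every pass replays the same seeded random sequence, and that a row is kept when its draw lands in that split's weight range. The names split_bounds, sample_cell, random_split and scan are hypothetical and exist only for this illustration.

```python
import random
from itertools import accumulate


def split_bounds(weights):
    """Turn weights into consecutive [lower, upper) ranges covering [0, 1)."""
    total = sum(weights)
    cum = [0.0] + list(accumulate(w / total for w in weights))
    cum[-1] = 1.0  # guard against floating-point drift in the last bound
    return list(zip(cum[:-1], cum[1:]))


def sample_cell(rows, lower, upper, seed):
    """Keep the rows whose seeded draw falls inside [lower, upper)."""
    rng = random.Random(seed)  # the same seed replays the same draw sequence
    return [row for row in rows if lower <= rng.random() < upper]


def random_split(scan, weights, seed, sort_first=True):
    """Build one split per weight; each split re-reads the partition.

    `scan` is a hypothetical zero-argument callable that returns the rows of
    one partition, possibly in a different order on every call.
    """
    splits = []
    for lower, upper in split_bounds(weights):
        rows = scan()
        if sort_first:
            # Sorting pins the row order, so the i-th draw always belongs to
            # the same row in every pass and each row lands in exactly one split.
            rows = sorted(rows)
        splits.append(sample_cell(rows, lower, upper, seed))
    return splits
```

In this toy model the sort only matters when scan() can return rows in a different order between passes. If the order were already stable, passing sort_first=False would produce splits that are just as disjoint and repeatable, which is the kind of case the issue asks about.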