[jira] [Commented] (SPARK-13458) Datasets cannot be sorted

Rishabh Bhardwaj (JIRA) Sat, 19 Mar 2016 11:17:59 -0700

    [ https://issues.apache.org/jira/browse/SPARK-13458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199324#comment-15199324 ]


Rishabh Bhardwaj commented on SPARK-13458:
------------------------------------------

[~obeattie] In the master branch, Dataset.scala has these methods. Is there 
something else you are looking for?
{code}
scala> val ds = sqlContext.createDataFrame(Seq((1,2),(2,3),(34,2),(4,45),(56,444))).as[(Int,Int)]
ds: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int, _2: int]

scala> ds.sort("_1")
res5: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int, _2: int]

scala> res5.show
+---+---+
| _1| _2|
+---+---+
|  1|  2|
|  2|  3|
|  4| 45|
| 34|  2|
| 56|444|
+---+---+
{code}
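
The issue description also mentions orderBy and sortWithinPartitions. Below is a minimal standalone sketch of all three calls on a typed Dataset. It assumes a Spark 2.x build, where SparkSession and spark.implicits are available; the session above used sqlContext instead. The object name, app name, and local master setting are only for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name; assumes Spark 2.x on the classpath.
object DatasetSortSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")               // local mode, for illustration only
      .appName("dataset-sort-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the shell session above.
    val ds = Seq((1, 2), (2, 3), (34, 2), (4, 45), (56, 444)).toDS()

    // Global sort, ascending on the first tuple field.
    val byFirst = ds.sort("_1")

    // orderBy is an alias for sort; here descending on the second field.
    val bySecondDesc = ds.orderBy($"_2".desc)

    // Sorts rows inside each partition only; there is no global ordering.
    val withinPartitions = ds.repartition(2).sortWithinPartitions("_1")

    byFirst.show()
    bySecondDesc.show()
    withinPartitions.show()

    spark.stop()
  }
}
```

Each call returns a Dataset[(Int, Int)], so the result stays typed and there is no need to go through a DataFrame first.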

> Datasets cannot be sorted
> -------------------------
>
>                 Key: SPARK-13458
>                 URL: https://issues.apache.org/jira/browse/SPARK-13458
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Oliver Beattie
>
> There doesn't appear to be any way to sort a {{Dataset}} at present, without 
> first converting it to a {{DataFrame}}.
> Methods like {{orderBy}}, {{sort}}, and {{sortWithinPartitions}} which are 
> present on {{DataFrame}}, or {{sortBy}} which is present on {{RDD}}, are 
> absent.



