Re: how to fix the order of data

2017-02-14 Thread 萝卜丝炒饭
It works well now, thanks.

---Original---
From: "Sam Elamin"
Date: 2017/2/14 19:54:36
To: "萝卜丝炒饭"<1427357...@qq.com>;
Cc: "user";
Subject: Re: how to fix the order of data


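It's because you are printing from inside the RDD's foreach, which runs on the partitions in parallel, so the print order is not guaranteed.

You can sort the DataFrame like below:

input.toDF("value").sort("value").collect()

Or, if you do not want to convert to a DataFrame, you can sort the RDD directly with
sortBy(f, [ascending], [numPartitions]). For key-value RDDs the equivalent is
sortByKey([ascending], [numTasks]).

Regards

Sam

On Tue, Feb 14, 2017 at 11:41 AM, 萝卜丝炒饭 <1427357...@qq.com> wrote: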
Hi all,
Below is my test code. I found that the output of val input is different on each run. How do I fix the order, please?

scala> val input = sc.parallelize(Array(1,2,3))
input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[13] at parallelize at <console>:24

scala> input.foreach(print)
132
scala> input.foreach(print)
213
scala> input.foreach(print)
312

Re: how to fix the order of data

2017-02-14 Thread Sam Elamin
It's because you are printing from inside the RDD's foreach, which runs on the partitions in parallel, so the print order is not guaranteed.
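
To illustrate the point about printing on the RDD, here is a minimal sketch, assuming Spark 2.x and a local master. The object name ForeachOrder and the helper inInputOrder are made up for this example. It contrasts printing inside foreach with collecting to the driver first, which returns the elements in partition order.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the original thread.
object ForeachOrder {

  // collect() gathers the partitions on the driver in partition-index order,
  // which for parallelize matches the order of the input collection.
  def inInputOrder(spark: SparkSession, data: Seq[Int]): Array[Int] =
    spark.sparkContext.parallelize(data).collect()

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell, `spark` and `sc` already exist.
    val spark = SparkSession.builder()
      .appName("foreach-order")
      .master("local[*]")
      .getOrCreate()

    val input = spark.sparkContext.parallelize(Array(1, 2, 3))

    // Runs inside the tasks, one per partition, possibly concurrently,
    // so the printed order can change from run to run.
    input.foreach(print)
    println()

    // Printing on the driver after collect() is deterministic.
    input.collect().foreach(print)
    println()

    spark.stop()
  }
}
```

Collecting is only reasonable when the data fits in driver memory, as it does for a three-element test like this one.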

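You can sort the DataFrame like below:

 input.toDF("value").sort("value").collect()


Or, if you do not want to convert to a DataFrame, you can sort the RDD directly with
*sortBy*(*f*, [*ascending*], [*numPartitions*]). For key-value RDDs the equivalent is
*sortByKey*([*ascending*], [*numTasks*]).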
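
The two routes above could look roughly like this. It is a sketch, assuming Spark 2.x with a local master. The object name SortedOutput and its helper methods are invented for illustration. The RDD route uses sortBy because input is an RDD[Int] and sortByKey is only defined for key-value RDDs. The column name "value" is set explicitly in toDF so the sort column is unambiguous.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the original thread.
object SortedOutput {

  // RDD route: sortBy yields a globally sorted RDD, and collect()
  // returns its partitions in order on the driver.
  def sortedViaRdd(spark: SparkSession, data: Seq[Int]): Array[Int] =
    spark.sparkContext.parallelize(data).sortBy(x => x).collect()

  // DataFrame route: name the single column, sort on it, read it back as Int.
  def sortedViaDataFrame(spark: SparkSession, data: Seq[Int]): Array[Int] = {
    import spark.implicits._
    spark.sparkContext.parallelize(data)
      .toDF("value")
      .sort("value")
      .as[Int]
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell, `spark` and `sc` already exist.
    val spark = SparkSession.builder()
      .appName("sorted-output")
      .master("local[*]")
      .getOrCreate()

    println(sortedViaRdd(spark, Seq(3, 1, 2)).mkString(","))       // prints 1,2,3
    println(sortedViaDataFrame(spark, Seq(3, 1, 2)).mkString(",")) // prints 1,2,3

    spark.stop()
  }
}
```

Both helpers do the sorting in Spark and only then bring the result to the driver, so the printed order no longer depends on task scheduling.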


Regards

Sam





On Tue, Feb 14, 2017 at 11:41 AM, 萝卜丝炒饭 <1427357...@qq.com> wrote:

> Hi all,
> Below is my test code. I found that the output of val input is different
> on each run. How do I fix the order, please?
>
> scala> val input = sc.parallelize(Array(1,2,3))
> input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[13] at
> parallelize at <console>:24
>
> scala> input.foreach(print)
> 132
> scala> input.foreach(print)
> 213
> scala> input.foreach(print)
> 312


how to fix the order of data

2017-02-14 Thread 萝卜丝炒饭
Hi all,
Below is my test code. I found that the output of val input is different on each run. How 
do I fix the order, please?

scala> val input = sc.parallelize(Array(1,2,3))
input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[13] at parallelize 
at <console>:24

scala> input.foreach(print)
132
scala> input.foreach(print)
213
scala> input.foreach(print)
312