Different partition number of GroupByKey leads different result

2015-10-09 Thread Devin Huang
ber-of-GroupByKey-leads-different-result-tp24989.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Different partition number of GroupByKey leads different result

2015-10-09 Thread Devin Huang
tion and uid is a filter id for result
>>> comparison.
>>> TagsWritable implements WritableComparable and Serializable.
>>>
>>> I used GroupByKey on text file, the result was right.
>>>
>>> Thanks,
>>> Dev

Re: Different partition number of GroupByKey leads different result

2015-10-09 Thread Sean Owen
>>>> s._2)).groupByKey(num).filter(_._1 == uid)
>>>>
>>>> num is the number of partition and uid is a filter id for result
>>>> comparison.
>>>> TagsWritable implements WritableComparable and Serializable.
>>

Re: Different partition number of GroupByKey leads different result

2015-10-09 Thread Devin Huang
may be mismatched on the shuffle stage.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Different-partition-number-of-GroupByKey-leads-different-result-tp24989p24990.html
Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Different partition number of GroupByKey leads different result

2015-10-09 Thread Sean Owen
>>
>> num is the number of partition and uid is a filter id for result
>> comparison.
>> TagsWritable implements WritableComparable and Serializable.
>>
>> I used GroupByKey on text file, the result was right.
>>
>> Thanks,
>> Devin Huang
>>
>>
>>

Re: Different partition number of GroupByKey leads different result

2015-10-09 Thread Sean Owen
bleComparable and Serializable.
>
> I used GroupByKey on text file, the result was right.
>
> Thanks,
> Devin Huang
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Different-partition-number-of-GroupByKey-leads-d