Re: Why we get 0 when the key is null?

2016-09-16 Thread Sean Owen
"null" is a valid value in an RDD, so it has to be partitionable. On Fri, Sep 16, 2016 at 2:26 AM, WangJianfei wrote: > When the key is not in the RDD, I can still get a value; I just find it a > little strange.

Re: What's the meaning when the partitions is zero?

2016-09-16 Thread Sean Owen
There are almost no cases in which you'd want a zero-partition RDD. The only one I can think of is an empty RDD, where the number of partitions is irrelevant. Still, I would not be surprised if other parts of the code assume at least 1 partition. Maybe this check could be tightened. It would be

Re: Spark 2.0.1 release?

2016-09-16 Thread Sean Owen
There are still blockers for 2.0.1, though only two. For example, https://issues.apache.org/jira/browse/SPARK-17418 must be resolved before another release. On Fri, Sep 16, 2016 at 7:23 PM, Reynold Xin wrote: > 2.0.1 is definitely coming soon. Was going to tag an RC yesterday

Re: Spark 1.x/2.x qualifiers in downstream artifact names

2016-09-16 Thread Michael Heuer
On Wed, Aug 24, 2016 at 12:12 PM, Sean Owen wrote: > If you're just varying versions (or things that can be controlled by a > profile, which is most everything including dependencies), you don't > need and probably don't want multiple POM files. Even that wouldn't > mean you

Doubt about ExternalSorter.spillMemoryIteratorToDisk

2016-09-16 Thread WangJianfei
We can see that when the number of objects written equals serializerBatchSize, flush() is called. But what happens if the objects written exceed the default buffer size? In that case, will flush() be called automatically? private[this] def
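The buffer question above can be explored in isolation with plain java.io classes. The sketch below is not Spark code: it assumes the writer ultimately sits on something like a java.io.BufferedOutputStream, which is an assumption made for illustration, and the names BufferFlushSketch and bytesBeforeAndAfterFlush are hypothetical. It only shows that a BufferedOutputStream pushes data to the underlying stream once its internal buffer fills, without an explicit flush() call.

```scala
import java.io.{BufferedOutputStream, ByteArrayOutputStream}

object BufferFlushSketch {
  // Writes `total` single bytes through a BufferedOutputStream with a small
  // buffer and returns (bytes in the sink before flush(), bytes after flush()).
  def bytesBeforeAndAfterFlush(bufferSize: Int, total: Int): (Int, Int) = {
    val sink = new ByteArrayOutputStream()
    val buffered = new BufferedOutputStream(sink, bufferSize)
    (0 until total).foreach(i => buffered.write(i))
    val before = sink.size() // data that overflowed the buffer is already here
    buffered.flush()         // pushes whatever is still held in the buffer
    (before, sink.size())
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = bytesBeforeAndAfterFlush(bufferSize = 8, total = 20)
    println(s"before flush: $before, after flush: $after")
  }
}
```

With a buffer smaller than the data written, some bytes reach the sink before flush() is called, and the explicit flush() delivers the remainder. Whether the same holds for the serialization stream used in spillMemoryIteratorToDisk depends on how that stream is layered, which this sketch does not model.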

Re: What's the meaning when the partitions is zero?

2016-09-16 Thread Mridul Muralidharan
When numPartitions is 0, there is no data in the RDD, so getPartition is never invoked. - Mridul On Friday, September 16, 2016, WangJianfei wrote: > If so, we will get an exception when numPartitions is 0. > def getPartition(key: Any): Int = key match { >

Re: What's the meaning when the partitions is zero?

2016-09-16 Thread WangJianfei
If so, we will get an exception when numPartitions is 0.

def getPartition(key: Any): Int = key match {
  case null => 0
  //case None => 0
  case _ => Utils.nonNegativeMod(key.hashCode, numPartitions)
}
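The behaviour discussed in this thread can be reproduced without Spark. The following is a minimal sketch, not the actual HashPartitioner: PartitionSketch is a hypothetical name, numPartitions is passed as an argument rather than held as a field, and nonNegativeMod is a stand-in written here on the assumption that Utils.nonNegativeMod computes a modulo shifted into the non-negative range.

```scala
object PartitionSketch {
  // Stand-in for Utils.nonNegativeMod: for mod > 0 the result is in [0, mod).
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod // throws ArithmeticException when mod == 0
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // Same shape as the quoted getPartition, with numPartitions as a parameter.
  def getPartition(key: Any, numPartitions: Int): Int = key match {
    case null => 0
    case _ => nonNegativeMod(key.hashCode, numPartitions)
  }

  def main(args: Array[String]): Unit = {
    println(getPartition(null, 4))    // a null key always maps to partition 0
    println(getPartition("spark", 4)) // some partition index in [0, 4)
    // With zero partitions, a non-null key fails on the modulo by zero.
    println(scala.util.Try(getPartition("spark", 0)).isFailure)
  }
}
```

In this sketch a null key returns 0 even when numPartitions is 0, while any non-null key raises an ArithmeticException. That matches the concern raised above, and it is harmless only as long as getPartition is never called on an RDD with zero partitions.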

Re: Compatibility of 1.6 spark.eventLog with a 2.0 History Server

2016-09-16 Thread Parth Brahmbhatt
The problem is that we backported the SQL tab UI changes from 2.0 into our 1.6.1. Those changes renamed a parameter in SQLMetricInfo. So the community version is still compatible; ours is not. > On Sep 15, 2016, at 11:08 AM, Mario Ds Briggs wrote: > > I had checked in 1.6.2 and it

Re: Spark 2.0.1 release?

2016-09-16 Thread Ewan Leith
That's great news. Since it's that close, I'll get started on building and testing the branch myself. Thanks, Ewan On 16 Sep 2016 19:23, Reynold Xin wrote: 2.0.1 is definitely coming soon. Was going to tag an RC yesterday but ran into some issue. I will try to do it early

Spark 2.0.1 release?

2016-09-16 Thread Ewan Leith
Hi all, Apologies if I've missed anything, but are we likely to see a 2.0.1 bug-fix release, or does a jump to 2.1.0 with additional features seem more probable? The issues for 2.0.1 seem pretty much done here

Re: Spark 2.0.1 release?

2016-09-16 Thread Reynold Xin
2.0.1 is definitely coming soon. Was going to tag an RC yesterday but ran into some issue. I will try to tag the RC early next week. On Fri, Sep 16, 2016 at 11:16 AM, Ewan Leith wrote: > Hi all, > > Apologies if I've missed anything, but are we likely to see a