Re: Force Partitioner to use entire entry of PairRDD as key

Takeshi Yamamuro Mon, 22 Feb 2016 18:01:56 -0800

You're correct; reduceByKey is just an example. The map to (d, 0) is the part
that makes the whole entry the key.
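
To make the effect concrete, here is a minimal sketch in plain Scala with no Spark dependency. It assumes a hash partitioner assigns a key to the partition given by the key's hash code modulo the number of partitions, made non-negative. The names PartitionSketch, partitionOf, byKey and byWholeEntry are hypothetical and only for illustration; the data is the sample from the original question.

```scala
// Sketch only: models hash partitioning in plain Scala, not Spark itself.
object PartitionSketch {
  // Non-negative remainder, so a negative hash code still maps to a valid index.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Assumed partitioning rule: hash code modulo partition count (null goes to 0).
  def partitionOf(key: Any, numPartitions: Int): Int =
    if (key == null) 0 else nonNegativeMod(key.hashCode, numPartitions)

  // The sample data from the original question.
  val values: Seq[(Int, Int)] = Seq((1, 4), (1, 3), (2, 3), (2, 5), (2, 10))

  // Key-only partitioning: entries with the same first element share a partition.
  def byKey(n: Int): Map[Int, Seq[(Int, Int)]] =
    values.groupBy { case (k, _) => partitionOf(k, n) }

  // After values.map(d => (d, 0)) the whole tuple d is the key being hashed.
  def byWholeEntry(n: Int): Map[Int, Seq[(Int, Int)]] =
    values.map(d => (d, 0))
      .groupBy { case (d, _) => partitionOf(d, n) }
      .map { case (p, rows) => p -> rows.map(_._1) }

  def main(args: Array[String]): Unit = {
    println(s"by key:         ${byKey(4)}")
    println(s"by whole entry: ${byWholeEntry(4)}")
  }
}
```

With key-only partitioning, every entry whose first element is 2 ends up in the same partition. With the whole tuple as the key, the placement depends on the tuple's hash code, so entries that share a first element may land in different partitions. How evenly they spread depends on the data and the number of partitions.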

On Tue, Feb 23, 2016 at 10:57 AM, Jay Luan <jaylu...@gmail.com> wrote:


> Could you elaborate on how this would work?
>
> So from what I can tell, this maps each entry to a tuple that always has 0
> as the second element. From there the hash changes widely because we now
> hash something like ((1,4), 0) and ((1,3), 0). Thus this mapping would
> create more even partitions. Why reduce by key afterwards? Is that just an
> example of an operation that can be done, or does it provide some kind of
> real value to the operation?
>
>
>
> On Mon, Feb 22, 2016 at 5:48 PM, Takeshi Yamamuro <linguin....@gmail.com>
> wrote:
>
>> Hi,
>>
>> How about adding dummy values?
>> values.map(d => (d, 0)).reduceByKey(_ + _)
>>
>> On Tue, Feb 23, 2016 at 10:15 AM, jluan <jaylu...@gmail.com> wrote:
>>
>>> I was wondering, is there a way to force something like the hash
>>> partitioner to use the entire entry of a PairRDD as the hash key rather
>>> than just the key?
>>>
>>> For example, suppose we have a PairRDD with values [(1, 4), (1, 3),
>>> (2, 3), (2, 5), (2, 10)]. Rather than using keys 1 and 2, can we force the
>>> partitioner to hash the entire tuple, such as (1, 4)?
>>>
>>>
>>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
>


-- 
---
Takeshi Yamamuro
