Could you elaborate on how this would work? From what I can tell, this maps each entry to a tuple whose second element is always 0, so the partitioner now hashes something like ((1, 4), 0) and ((1, 3), 0) rather than just the keys 1 and 2, and the hashes vary widely. Mapping this way would therefore create more even partitions. But why reduceByKey afterward? Is that just an example of an operation that could follow, or does it provide some real value to the operation?
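For what it's worth, the partition-spreading effect can be sketched in plain Scala without Spark. This is only an illustration, not code from the thread: it assumes Spark's HashPartitioner behavior (partition = non-negative key.hashCode mod numPartitions) and compares hashing the key alone against hashing the whole (pair, 0) tuple.

```scala
// Minimal sketch (plain Scala, no Spark cluster needed).
// Assumption: Spark's HashPartitioner assigns
//   partition = nonNegativeMod(key.hashCode, numPartitions).
object FullTupleHashDemo {
  // Mirrors HashPartitioner.getPartition
  def getPartition(key: Any, numPartitions: Int): Int = {
    val mod = key.hashCode % numPartitions
    if (mod < 0) mod + numPartitions else mod
  }

  def main(args: Array[String]): Unit = {
    val pairs = Seq((1, 4), (1, 3), (2, 3), (2, 5), (2, 10))
    val numPartitions = 4

    // Hashing the key alone: only keys 1 and 2 exist, so at most
    // two partitions can ever receive data.
    val byKey = pairs.map(p => getPartition(p._1, numPartitions)).toSet

    // The dummy-value trick: the whole pair becomes the key, so
    // distinct entries can hash to distinct partitions.
    val byTuple = pairs.map(p => getPartition((p, 0), numPartitions)).toSet

    println(s"partitions used, key only:   $byKey")
    println(s"partitions used, full tuple: $byTuple")
  }
}
```

Note the reduceByKey in the suggestion is (as I read it) just one example of a shuffle operation that would actually apply the partitioner to the new composite key; any key-based shuffle would do.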
On Mon, Feb 22, 2016 at 5:48 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote:

> Hi,
>
> How about adding dummy values?
> values.map(d => (d, 0)).reduceByKey(_ + _)
>
> On Tue, Feb 23, 2016 at 10:15 AM, jluan <jaylu...@gmail.com> wrote:
>
>> I was wondering, is there a way to force something like the hash
>> partitioner to use the entire entry of a PairRDD as a hash rather than
>> just the key?
>>
>> For example, if we have an RDD with values: PairRDD = [(1,4), (1,3),
>> (2,3), (2,5), (2,10)]. Rather than using keys 1 and 2, can we force the
>> partitioner to hash the entire tuple, such as (1,4)?
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Force-Partitioner-to-use-entire-entry-of-PairRDD-as-key-tp26299.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>
> --
> ---
> Takeshi Yamamuro