You're correct, reduceByKey is just an example.

On Tue, Feb 23, 2016 at 10:57 AM, Jay Luan <jaylu...@gmail.com> wrote:

> Could you elaborate on how this would work?
>
> From what I can tell, this maps each entry to a pair whose second element
> is always 0. The hash then varies widely because the keys are now the full
> tuples, as in ((1,4), 0) and ((1,3), 0), so the mapping should give more
> even partitions. Why reduce by key afterwards? Is that just an example of
> an operation that can be done, or does it add some real value to the
> operation?
>
> On Mon, Feb 22, 2016 at 5:48 PM, Takeshi Yamamuro <linguin....@gmail.com>
> wrote:
>
>> Hi,
>>
>> How about adding dummy values?
>> values.map(d => (d, 0)).reduceByKey(_ + _)
>>
>> On Tue, Feb 23, 2016 at 10:15 AM, jluan <jaylu...@gmail.com> wrote:
>>
>>> I was wondering, is there a way to force something like the hash
>>> partitioner to use the entire entry of a PairRDD as the hash input
>>> rather than just the key?
>>>
>>> For example, if we have an RDD with values:
>>> PairRDD = [(1,4), (1,3), (2,3), (2,5), (2,10)].
>>> Rather than using keys 1 and 2, can we force the partitioner to hash
>>> the entire tuple, such as (1,4)?
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Force-Partitioner-to-use-entire-entry-of-PairRDD-as-key-tp26299.html
>>
>> --
>> ---
>> Takeshi Yamamuro

--
---
Takeshi Yamamuro
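
P.S. To make the dummy-value idea from the thread concrete, here is a minimal sketch, not a definitive implementation. It assumes spark-core is on the classpath and a local master. The object name WholeEntryPartitioning, the helper partitionByWholeEntry, and the choice of 4 partitions are hypothetical, chosen only for illustration. It swaps reduceByKey for partitionBy with a HashPartitioner, since the goal here is only to control placement.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object WholeEntryPartitioning {
  // Hypothetical helper: make the whole entry the key of a new pair so that
  // HashPartitioner hashes the full tuple, then drop the dummy value again.
  def partitionByWholeEntry(pairs: RDD[(Int, Int)], numPartitions: Int): RDD[(Int, Int)] =
    pairs
      .map(d => (d, 0))                                // ((1,4), 0), ((1,3), 0), ...
      .partitionBy(new HashPartitioner(numPartitions)) // hashes the key, i.e. (1,4)
      .keys                                            // back to the original entries

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, only for trying this out.
    val conf = new SparkConf().setAppName("whole-entry-partitioning").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val pairs = sc.parallelize(Seq((1, 4), (1, 3), (2, 3), (2, 5), (2, 10)))
      val repartitioned = partitionByWholeEntry(pairs, 4)
      // glom() turns each partition into an array so the layout can be inspected.
      repartitioned.glom().collect().zipWithIndex.foreach { case (part, i) =>
        println(s"partition $i: ${part.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

One difference worth noting: reduceByKey(_ + _) on the dummy pairs merges identical entries into a single record, whereas partitionBy keeps duplicates. Which one fits depends on whether repeated entries matter for the job.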