RE: ReduceByKey with a byte array as the key

2015-06-11 Thread Aaron Davidson
> the overhead between this and using `String` would be
> similar enough to warrant just using `String`.
>
> Mark
>
> *From:* Sonal Goyal [mailto:sonalgoy...@gmail.com]
> *Sent:* June-11-15 12:58 PM
> *To:* Mark Tse
> *Cc:* user@spark.apache.org
> *Subjec

RE: ReduceByKey with a byte array as the key

2015-06-11 Thread Mark Tse
Subject: Re: ReduceByKey with a byte array as the key
I think if you wrap the byte[] in an object and implement the `equals` and `hashCode` methods, you may be able to do this. There will be the overhead of an extra object, but conceptually it should work unless I am missing something.
Best Regards,
Sonal

Re: ReduceByKey with a byte array as the key

2015-06-11 Thread Sonal Goyal
I think if you wrap the byte[] in an object and implement the `equals` and `hashCode` methods, you may be able to do this. There will be the overhead of an extra object, but conceptually it should work unless I am missing something.
Best Regards,
Sonal
Founder, Nube Technologies Ch
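A minimal sketch of the wrapper idea suggested above, in plain Java. The names `ByteArrayKeyDemo`, `ByteArrayKey`, and `sumByKey` are hypothetical and do not come from the thread, and a `HashMap` stands in for the keyed reduction so the example runs without Spark. The wrapper is marked `Serializable` on the assumption that keys would be shipped between nodes with default Java serialization.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class ByteArrayKeyDemo {

    // Hypothetical wrapper: compares and hashes by array contents, not by reference.
    static final class ByteArrayKey implements Serializable {
        private final byte[] bytes;

        ByteArrayKey(byte[] bytes) {
            // Defensive copy so later changes to the caller's array cannot alter the key.
            this.bytes = bytes.clone();
        }

        byte[] toBytes() {
            return bytes.clone();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ByteArrayKey
                    && Arrays.equals(bytes, ((ByteArrayKey) other).bytes);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(bytes);
        }
    }

    // Plain-Java stand-in for a reduce-by-key: sums the values that share a key.
    static Map<ByteArrayKey, Integer> sumByKey(byte[][] keys, int[] values) {
        Map<ByteArrayKey, Integer> totals = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            totals.merge(new ByteArrayKey(keys[i]), values[i], Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        byte[][] keys = { {1, 2}, {1, 2}, {3} };
        int[] values = { 10, 5, 7 };
        Map<ByteArrayKey, Integer> totals = sumByKey(keys, values);
        System.out.println(totals.size());                                    // prints 2
        System.out.println(totals.get(new ByteArrayKey(new byte[] {1, 2})));  // prints 15
    }
}
```

The two `{1, 2}` arrays are distinct objects, yet they collapse into one key because `equals` and `hashCode` delegate to `Arrays.equals` and `Arrays.hashCode`. The cost is one extra object and one array copy per key, which is the overhead mentioned above.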

ReduceByKey with a byte array as the key

2015-06-11 Thread Mark Tse
I would like to work with RDD pairs of Tuple2, but byte[]s with the same contents are treated as different keys because their references differ. I didn't see any way to pass in a custom comparer. I could convert the byte[] into a String with an explicit charset, but I'm wondering
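For illustration, a small plain-Java sketch of both points in the question: arrays compare by reference, and a `String` built with an explicit charset can serve as a content-based key. The class and method names (`ByteArrayAsStringKey`, `toKey`, `fromKey`) are hypothetical, and the choice of ISO-8859-1 is an assumption made here, not something stated in the thread. It is used because it maps every byte value to exactly one character, so the conversion round-trips, whereas decoding arbitrary bytes as UTF-8 can replace malformed sequences and make different arrays collide.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class ByteArrayAsStringKey {

    // ISO-8859-1 maps each byte 0..255 to one char, so no information is lost.
    static String toKey(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Inverse of toKey: recovers the original bytes from the key.
    static byte[] fromKey(String key) {
        return key.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, (byte) 0xFF};
        byte[] b = {1, 2, (byte) 0xFF};

        // Arrays inherit Object.equals, which compares references.
        System.out.println(a.equals(b));          // prints false
        System.out.println(Arrays.equals(a, b));  // prints true

        // With String keys, equal contents land on the same map entry.
        Map<String, Integer> counts = new HashMap<>();
        counts.merge(toKey(a), 1, Integer::sum);
        counts.merge(toKey(b), 1, Integer::sum);
        System.out.println(counts.get(toKey(a)));                 // prints 2
        System.out.println(Arrays.equals(a, fromKey(toKey(a))));  // prints true
    }
}
```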