Re: Arrays values in keyBy
I've filed a JIRA for this issue: https://issues.apache.org/jira/browse/FLINK-5874

On Wed, Jul 20, 2016 at 4:32 PM, Stephan Ewen wrote:
> I think we can simply add this behavior when we use the TypeComparator in
> the keyBy() function. It can implement hashCode() as a deepHashCode on
> array types.
>
> On Mon, Jun 13, 2016 at 12:30 PM, Ufuk Celebi wrote:
>
>> Would make sense to update the Javadocs for the next release.
>>
>> On Mon, Jun 13, 2016 at 11:19 AM, Aljoscha Krettek wrote:
>> > Yes, this is correct. Right now we're basically using .hashCode() for
>> > keying. (Which can be problematic in some cases.)
>> >
>> > Beam, for example, clearly specifies that the encoded form of a value
>> > should be used for all comparisons/hashing. This is better defined but
>> > can lead to slow performance in some cases.
>> >
>> > On Sat, 11 Jun 2016 at 00:04 Elias Levy wrote:
>> >>
>> >> It would be useful if the documentation warned what type of equality it
>> >> expected of values used as keys in keyBy. I just got bit in the ass by
>> >> converting a field from a string to a byte array. All of a sudden the
>> >> windows were no longer aggregating. So it seems Flink is not doing a
>> >> deep compare of arrays when comparing keys.
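For anyone reading this thread later, here is a rough idea of what the deepHashCode suggestion amounts to in plain Java. This is only a sketch: `DeepKeyHash`, `keyHash` and `keyEquals` are made-up names for illustration, not Flink's TypeComparator code and not necessarily what the JIRA ends up implementing. It hashes and compares arrays by content and falls back to the normal hashCode()/equals() for everything else.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical helper, NOT part of Flink: content-based hashing and
// equality for keys that may be arrays (primitive, object, or nested).
public class DeepKeyHash {

    // Arrays are hashed by content; other values use their own hashCode().
    static int keyHash(Object key) {
        if (key != null && key.getClass().isArray()) {
            // Wrapping in Object[] lets Arrays.deepHashCode handle
            // primitive arrays (byte[], int[], ...) and nested arrays alike.
            return Arrays.deepHashCode(new Object[] {key});
        }
        return Objects.hashCode(key); // 0 for null
    }

    // Objects.deepEquals compares arrays element by element and
    // uses equals() for non-array values.
    static boolean keyEquals(Object a, Object b) {
        return Objects.deepEquals(a, b);
    }

    public static void main(String[] args) {
        byte[] k1 = {1, 2, 3};
        byte[] k2 = {1, 2, 3};
        System.out.println(k1.hashCode() == k2.hashCode()); // identity-based, usually false
        System.out.println(keyHash(k1) == keyHash(k2));     // prints true
        System.out.println(keyEquals(k1, k2));              // prints true
    }
}
```

The trade-off is the one already mentioned for Beam: hashing by content walks the whole array on every record, so large array keys get more expensive.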
Re: Arrays values in keyBy
Would make sense to update the Javadocs for the next release.

On Mon, Jun 13, 2016 at 11:19 AM, Aljoscha Krettek wrote:
> Yes, this is correct. Right now we're basically using .hashCode() for
> keying. (Which can be problematic in some cases.)
>
> Beam, for example, clearly specifies that the encoded form of a value
> should be used for all comparisons/hashing. This is better defined but
> can lead to slow performance in some cases.
>
> On Sat, 11 Jun 2016 at 00:04 Elias Levy wrote:
>>
>> It would be useful if the documentation warned what type of equality it
>> expected of values used as keys in keyBy. I just got bit in the ass by
>> converting a field from a string to a byte array. All of a sudden the
>> windows were no longer aggregating. So it seems Flink is not doing a
>> deep compare of arrays when comparing keys.
Re: Arrays values in keyBy
Yes, this is correct. Right now we're basically using .hashCode() for keying. (Which can be problematic in some cases.)

Beam, for example, clearly specifies that the encoded form of a value should be used for all comparisons/hashing. This is better defined but can lead to slow performance in some cases.

On Sat, 11 Jun 2016 at 00:04 Elias Levy wrote:
> It would be useful if the documentation warned what type of equality it
> expected of values used as keys in keyBy. I just got bit in the ass by
> converting a field from a string to a byte array. All of a sudden the
> windows were no longer aggregating. So it seems Flink is not doing a
> deep compare of arrays when comparing keys.
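To make the .hashCode() point concrete, here is a small plain-Java illustration. It uses a HashMap as a stand-in for anything that groups by hashCode()/equals(); it is not Flink code, and the class and method names are invented for the example. Strings hash by content, while a byte[] inherits the identity-based hashCode()/equals() from Object, so two arrays with identical bytes count as two different keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustration only: a HashMap stands in for hashCode()-based keying.
public class ArrayKeyPitfall {

    // Groups by String keys; equal strings collapse into one entry.
    static int distinctStringKeys(String[] keys) {
        Map<String, Integer> counts = new HashMap<>();
        for (String k : keys) {
            counts.merge(k, 1, Integer::sum);
        }
        return counts.size();
    }

    // Groups by raw byte[] keys; each array object is its own key,
    // because byte[] uses Object's identity-based equals()/hashCode().
    static int distinctKeysWithRawArrays(byte[][] keys) {
        Map<byte[], Integer> counts = new HashMap<>();
        for (byte[] k : keys) {
            counts.merge(k, 1, Integer::sum);
        }
        return counts.size();
    }

    public static void main(String[] args) {
        String[] asStrings = {"user-1", "user-1"};
        byte[][] asBytes = {
            "user-1".getBytes(StandardCharsets.UTF_8),
            "user-1".getBytes(StandardCharsets.UTF_8)
        };
        System.out.println(distinctStringKeys(asStrings));      // prints 1
        System.out.println(distinctKeysWithRawArrays(asBytes)); // prints 2
    }
}
```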
Arrays values in keyBy
It would be useful if the documentation warned what type of equality it expected of values used as keys in keyBy. I just got bit in the ass by converting a field from a string to a byte array. All of a sudden the windows were no longer aggregating. So it seems Flink is not doing a deep compare of arrays when comparing keys.
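One possible user-side workaround, sketched in plain Java: wrap the bytes in a small key type whose equals() and hashCode() look at the contents. `BytesKey` is a hypothetical class written for this example, not something Flink ships, and how Flink serializes or accepts such a type as a key may depend on the version, so treat it as a starting point rather than a tested fix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper giving a byte[] content-based equality and hashing.
public final class BytesKey {
    private final byte[] bytes;

    public BytesKey(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy so the key cannot change later
    }

    public byte[] toByteArray() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes); // depends only on the contents
    }

    public static void main(String[] args) {
        Set<BytesKey> keys = new HashSet<>();
        keys.add(new BytesKey("user-1".getBytes(StandardCharsets.UTF_8)));
        keys.add(new BytesKey("user-1".getBytes(StandardCharsets.UTF_8)));
        System.out.println(keys.size()); // prints 1
    }
}
```

Keeping the field as a string, as it was before the change, avoids the problem as well, since String already hashes and compares by content.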