Re: Arrays values in keyBy

2017-02-21 Thread Robert Metzger
I've filed a JIRA for this issue:
https://issues.apache.org/jira/browse/FLINK-5874

On Wed, Jul 20, 2016 at 4:32 PM, Stephan Ewen  wrote:

> I thing we can simply add this behavior when we use the TypeComparator in
> the keyBy() function. It can implement the hashCode() as a deepHashCode on
> array types.
>
> On Mon, Jun 13, 2016 at 12:30 PM, Ufuk Celebi  wrote:
>
>> Would make sense to update the Javadocs for the next release.
>>
>> On Mon, Jun 13, 2016 at 11:19 AM, Aljoscha Krettek 
>> wrote:
>> > Yes, this is correct. Right now we're basically using .hashCode()
>> for
>> > keying. (Which can be problematic in some cases.)
>> >
>> > Beam, for example, clearly specifies that the encoded form of a value
>> should
>> > be used for all comparisons/hashing. This is more well defined but can
>> lead
>> > to slow performance in some cases.
>> >
>> > On Sat, 11 Jun 2016 at 00:04 Elias Levy 
>> wrote:
>> >>
>> >> I would be useful if the documentation warned what type of equality it
>> >> expected of values used as keys in keyBy.  I just got bit in the ass by
>> >> converting a field from a string to a byte array.  All of the sudden
>> the
>> >> windows were no longer aggregating.  So it seems Flink is not doing a
>> deep
>> >> compare of arrays when comparing keys.
>>
>
>


Re: Arrays values in keyBy

2016-06-13 Thread Ufuk Celebi
Would make sense to update the Javadocs for the next release.

On Mon, Jun 13, 2016 at 11:19 AM, Aljoscha Krettek  wrote:
> Yes, this is correct. Right now we're basically using .hashCode() for
> keying. (Which can be problematic in some cases.)
>
> Beam, for example, clearly specifies that the encoded form of a value should
> be used for all comparisons/hashing. This is more well defined but can lead
> to slow performance in some cases.
>
> On Sat, 11 Jun 2016 at 00:04 Elias Levy  wrote:
>>
>> I would be useful if the documentation warned what type of equality it
>> expected of values used as keys in keyBy.  I just got bit in the ass by
>> converting a field from a string to a byte array.  All of the sudden the
>> windows were no longer aggregating.  So it seems Flink is not doing a deep
>> compare of arrays when comparing keys.


Re: Arrays values in keyBy

2016-06-13 Thread Aljoscha Krettek
Yes, this is correct. Right now we're basically using .hashCode() for
keying. (Which can be problematic in some cases.)

Beam, for example, clearly specifies that the encoded form of a value
should be used for all comparisons/hashing. This is more well defined but
can lead to slow performance in some cases.

On Sat, 11 Jun 2016 at 00:04 Elias Levy  wrote:

> I would be useful if the documentation warned what type of equality it
> expected of values used as keys in keyBy.  I just got bit in the ass by
> converting a field from a string to a byte array.  All of the sudden the
> windows were no longer aggregating.  So it seems Flink is not doing a deep
> compare of arrays when comparing keys.
>


Arrays values in keyBy

2016-06-10 Thread Elias Levy
I would be useful if the documentation warned what type of equality it
expected of values used as keys in keyBy.  I just got bit in the ass by
converting a field from a string to a byte array.  All of the sudden the
windows were no longer aggregating.  So it seems Flink is not doing a deep
compare of arrays when comparing keys.