[
https://issues.apache.org/jira/browse/FLINK-18629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dawid Wysakowicz updated FLINK-18629:
-------------------------------------
Description:
Following test fails:
{code}
@Test
public void testKeyedConnectedStreamsType() {
StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
DataStreamSource<Integer> stream1 = env.fromElements(1, 2);
DataStreamSource<Integer> stream2 = env.fromElements(1, 2);
ConnectedStreams<Integer, Integer> connectedStreams =
stream1.connect(stream2)
.keyBy(v -> v, v -> v);
KeyedStream<?, ?> firstKeyedInput = (KeyedStream<?, ?>)
connectedStreams.getFirstInput();
KeyedStream<?, ?> secondKeyedInput = (KeyedStream<?, ?>)
connectedStreams.getSecondInput();
assertThat(firstKeyedInput.getKeyType(), equalTo(Types.INT));
assertThat(secondKeyedInput.getKeyType(), equalTo(Types.INT));
}
{code}
The problem is that the wildcard type is evaluated as {{Object}} for lambdas,
which in turn produces {{GenericTypeInfo<Object>}} for any KeySelector provided
as lambda.
I suggest changing the method signature to:
{code}
public <K1, K2> ConnectedStreams<IN1, IN2> keyBy(
KeySelector<IN1, K1> keySelector1,
KeySelector<IN2, K2> keySelector2)
{code}
This would be a code compatible change. Might break the compatibility of state
backend (would change derived key type info).
Still there would be a workaround to use the second method for old programs:
{code}
public <KEY> ConnectedStreams<IN1, IN2> keyBy(
KeySelector<IN1, KEY> keySelector1,
KeySelector<IN2, KEY> keySelector2,
TypeInformation<KEY> keyType)
{code}
was:
Following test fails:
{code}
@Test
public void testKeyedConnectedStreamsType() {
StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
DataStreamSource<Integer> stream1 = env.fromElements(1, 2);
DataStreamSource<Integer> stream2 = env.fromElements(1, 2);
ConnectedStreams<Integer, Integer> connectedStreams =
stream1.connect(stream2)
.keyBy(v -> v, v -> v);
KeyedStream<?, ?> firstKeyedInput = (KeyedStream<?, ?>)
connectedStreams.getFirstInput();
KeyedStream<?, ?> secondKeyedInput = (KeyedStream<?, ?>)
connectedStreams.getSecondInput();
assertThat(firstKeyedInput.getKeyType(), equalTo(Types.INT));
assertThat(secondKeyedInput.getKeyType(), equalTo(Types.INT));
}
{code}
The problem is that the wildcard type is evaluated as {{Object}} for lambdas,
which in turn produces {{GenericTypeInfo<Object>}} for any KeySelector provided
as lambda.
I suggest changing the method signature to:
{code}
public <K1, K2> ConnectedStreams<IN1, IN2> keyBy(
KeySelector<IN1, K1> keySelector1,
KeySelector<IN2, K2> keySelector2)
{code}
This would be a code compatible change. Might break the compatibility of state
backend (would change derived key type info). Nevertheless there is a
workaround to use:
{code}
public <KEY> ConnectedStreams<IN1, IN2> keyBy(
KeySelector<IN1, KEY> keySelector1,
KeySelector<IN2, KEY> keySelector2,
TypeInformation<KEY> keyType)
{code}
> ConnectedStreams#keyBy can not derive key TypeInformation for lambda
> KeySelectors
> ---------------------------------------------------------------------------------
>
> Key: FLINK-18629
> URL: https://issues.apache.org/jira/browse/FLINK-18629
> Project: Flink
> Issue Type: Bug
> Components: API / DataStream
> Affects Versions: 1.10.0, 1.11.0, 1.12.0
> Reporter: Dawid Wysakowicz
> Assignee: Dawid Wysakowicz
> Priority: Critical
> Fix For: 1.10.2, 1.12.0, 1.11.2
>
>
> Following test fails:
> {code}
> @Test
> public void testKeyedConnectedStreamsType() {
> StreamExecutionEnvironment env =
> StreamExecutionEnvironment.getExecutionEnvironment();
> DataStreamSource<Integer> stream1 = env.fromElements(1, 2);
> DataStreamSource<Integer> stream2 = env.fromElements(1, 2);
> ConnectedStreams<Integer, Integer> connectedStreams =
> stream1.connect(stream2)
> .keyBy(v -> v, v -> v);
> KeyedStream<?, ?> firstKeyedInput = (KeyedStream<?, ?>)
> connectedStreams.getFirstInput();
> KeyedStream<?, ?> secondKeyedInput = (KeyedStream<?, ?>)
> connectedStreams.getSecondInput();
> assertThat(firstKeyedInput.getKeyType(), equalTo(Types.INT));
> assertThat(secondKeyedInput.getKeyType(), equalTo(Types.INT));
> }
> {code}
> The problem is that the wildcard type is evaluated as {{Object}} for lambdas,
> which in turn produces {{GenericTypeInfo<Object>}} for any KeySelector
> provided as lambda.
> I suggest changing the method signature to:
> {code}
> public <K1, K2> ConnectedStreams<IN1, IN2> keyBy(
> KeySelector<IN1, K1> keySelector1,
> KeySelector<IN2, K2> keySelector2)
> {code}
> This would be a code compatible change. Might break the compatibility of
> state backend (would change derived key type info).
> Still there would be a workaround to use the second method for old programs:
> {code}
> public <KEY> ConnectedStreams<IN1, IN2> keyBy(
> KeySelector<IN1, KEY> keySelector1,
> KeySelector<IN2, KEY> keySelector2,
> TypeInformation<KEY> keyType)
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)