the issue.
>
> Thanks for reporting the issue.
>
> Best, Fabian
>
> [1] https://issues.apache.org/jira/browse/FLINK-4640
>
> 2016-09-20 10:35 GMT+02:00 Yukun Guo <gyk@gmail.com>:
>
>> Some detail: if running the FoldFunction on a KeyedStream, everything
>> works
return type can be deduced from the input type(s)". This
is true for flatMap(Tuple2<Long, T>, Collector<Tuple2<T, String>>), but if
the signature is changed to void flatMap(SortedMap<T, Long>,
Collector<Tuple2<T, Long>>), type inference fails.
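If the caller knows the concrete element type, an explicit type hint at the call site is often enough to satisfy Flink; a minimal sketch, assuming Flink 1.1+ (the `input` stream and the concrete `String` instantiation are illustrative, not from the thread):

```java
// Sketch: tell Flink the output type explicitly instead of relying on
// inference over the erased generic signature (assumes Flink 1.1+).
DataStream<Tuple2<String, Long>> result = input
        .flatMap(new GenericFlatMapper<String>())
        .returns(new TypeHint<Tuple2<String, Long>>() {});
```

The type hint only fixes the declared output type; the flatMap logic itself is unchanged.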
>
> O
Hi,
When I run the code implementing a generic FlatMapFunction, Flink
complained about InvalidTypesException:
public class GenericFlatMapper<T> implements
FlatMapFunction<SortedMap<T, Long>, Tuple2<T, Long>> {
@Override
public void flatMap(SortedMap<T, Long> m, Collector<Tuple2<T, Long>>
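Because `T` is erased at runtime, Flink cannot infer this class's output type on its own. One workaround (a sketch, not from the original thread) is to implement Flink's `ResultTypeQueryable` and pass the concrete `TypeInformation` in at construction time; the constructor and field names below are illustrative:

```java
import java.util.Map;
import java.util.SortedMap;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
import org.apache.flink.util.Collector;

// Sketch: carry the output TypeInformation explicitly so Flink does not
// have to infer it from the erased generic signature.
public class GenericFlatMapper<T> implements
        FlatMapFunction<SortedMap<T, Long>, Tuple2<T, Long>>,
        ResultTypeQueryable<Tuple2<T, Long>> {

    private final TypeInformation<Tuple2<T, Long>> resultType;

    public GenericFlatMapper(TypeInformation<Tuple2<T, Long>> resultType) {
        this.resultType = resultType;
    }

    @Override
    public void flatMap(SortedMap<T, Long> m, Collector<Tuple2<T, Long>> out) {
        // Emit each (key, count) entry of the map as a Tuple2.
        for (Map.Entry<T, Long> e : m.entrySet()) {
            out.collect(Tuple2.of(e.getKey(), e.getValue()));
        }
    }

    @Override
    public TypeInformation<Tuple2<T, Long>> getProducedType() {
        return resultType;
    }
}
```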
> buffer for a later window that will be emitted at a future time.
>
> On Tue, 5 Jul 2016 at 04:35 Yukun Guo <gyk@gmail.com> wrote:
>
>> The output is the timestamps of events in string. (For convenience, the
>> payload of each event is exactly the timestamp.)
> Could you elaborate a bit on what exactly the output means and how
> you derive that events are leaking into the previous window?
>
> On Mon, 4 Jul 2016 at 13:20 Yukun Guo <gyk@gmail.com> wrote:
>
>> Thanks for the information. Strange enough, after I set the time
>> characteristic
> Flink will use processing time
> if you don't specify a time characteristic. You can enforce using an
> event-time window using this:
>
> stream.window(TumblingEventTimeWindows.of(Time.seconds(10)))
>
> Cheers,
> Aljoscha
>
>
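Putting the advice above into a fuller, self-contained sketch, assuming the Flink 1.x DataStream API (the sample elements, key position, and job name are illustrative; a real event-time job also needs timestamps and watermarks assigned):

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Flink 1.x defaults to processing time; switch the job to event time.
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<Tuple2<String, Long>> stream =
                env.fromElements(Tuple2.of("a", 1L), Tuple2.of("b", 1L));

        // Event-time windows also require timestamps/watermarks, e.g. via
        // stream.assignTimestampsAndWatermarks(...), omitted here for brevity.
        stream.keyBy(0)
              .window(TumblingEventTimeWindows.of(Time.seconds(10)))
              .sum(1)
              .print();

        env.execute("event-time tumbling window sketch");
    }
}
```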
> On Mon, 4 Jul 2016 at 06:00 Yukun Guo <gyk@gmail.com> wrote:
t if invoked multiple times.
> The documentation is not discussing this aspect and should be extended.
>
> Thanks for pointing out this issue.
>
> Cheers,
> Fabian
>
>
> 2016-06-09 13:19 GMT+02:00 Yukun Guo <gyk@gmail.com>:
>
>> I’m playing with the
I’m playing with the (Window)WordCount example from Flink QuickStart. I
generate a DataStream consisting of 1000 Strings of random digits, which is
windowed with a tumbling count window of 50 elements:
import org.apache.flink.api.common.functions.FlatMapFunction;
import
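The original message's code is truncated here; below is a self-contained sketch of the setup it describes, assuming the Flink 1.x DataStream API and the QuickStart-style per-key count window (class names and the random-digit source are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CountWindowWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // 1000 strings of random digits, as in the description above.
        List<String> data = new ArrayList<>();
        Random rnd = new Random();
        for (int i = 0; i < 1000; i++) {
            data.add(Integer.toString(rnd.nextInt(10)));
        }

        DataStream<String> digits = env.fromCollection(data);

        digits.map(new MapFunction<String, Tuple2<String, Integer>>() {
                  @Override
                  public Tuple2<String, Integer> map(String s) {
                      return Tuple2.of(s, 1);
                  }
              })
              .keyBy(0)
              .countWindow(50) // tumbling count window of 50 elements per key
              .sum(1)
              .print();

        env.execute("count-window word count sketch");
    }
}
```

Note that `countWindow(50)` on a keyed stream fires once 50 elements have arrived *for that key*, which differs slightly from a 50-element window over the whole stream.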
My algorithm is roughly like this taking top-K words problem as an example
(the purpose of computing local “word count” is to deal with data
imbalance):
DataStream of words ->
timeWindow of 1h ->
converted to DataSet of words ->
random partitioning by rebalance ->
local “word count” using
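The local-then-global step in the pipeline above reduces to merging per-partition partial counts and keeping the K most frequent words. A plain-Java sketch of that merge logic, independent of the Flink API (the class and method names are mine, not from the thread):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopKMerge {
    // Merge per-partition partial word counts, then return the k most
    // frequent (word, total) entries, highest count first.
    public static List<Map.Entry<String, Long>> topK(List<Map<String, Long>> partials, int k) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Long::sum));
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(total.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));
        return entries.subList(0, Math.min(k, entries.size()));
    }
}
```

Each hourly window would produce one partial map per parallel subtask; taking the final top-K only after merging keeps per-subtask state small even when word frequencies are skewed, which is the data-imbalance point made above.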
Hi,
I'm working on a project which uses Flink to compute hourly log statistics
like top-K. The logs are fetched from Kafka by a FlinkKafkaConsumer and
packed into a DataStream.
The problem is, I find the computation quite challenging to express with
Flink's DataStream API:
1. If I use something