Fwd: A WordCount job using DataStream API but behave like the batch WordCount example

Luke Xiong Fri, 28 Apr 2023 22:57:35 -0700

Dear experts,

Is it possible to write a WordCount job that uses the DataStream API but
behaves like the batch version of the WordCount example?

More specifically, I would like the job to produce a DataStream that contains
only the final (word, count) records when it is fed a text file.

For example, given a text file:
```input.txt
hello hello world hello world
hello world world world hello world
```

In the Flink WordCount examples, the batch version outputs:
```batch.version.output
hello 5
world 6
```

while the stream version outputs:
```stream.version.output
(hello,1)
(hello,2)
(world,1)
(hello,3)
(world,2)
(hello,4)
(world,3)
(world,4)
(world,5)
(hello,5)
(world,6)
```
Is it possible to have a DataStream that contains only two elements: (hello, 5)
and (world, 6)?
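
To make the difference concrete, here is a minimal plain-Java sketch of the two
behaviours described above. It does not use Flink at all: the class and method
names (WordCountSemantics, runningCounts, finalCounts) are hypothetical and
exist only for illustration. The first method emits an updated record for
every incoming word, as the stream version does; the second consumes all of the
input before emitting one record per word, which is the result I am after.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain Java, no Flink APIs. All names are hypothetical.
class WordCountSemantics {

    // Stream-like behaviour: emit an updated (word,count) record per input word.
    static List<String> runningCounts(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // skip blank lines
                }
                int current = counts.merge(word, 1, Integer::sum);
                out.add("(" + word + "," + current + ")");
            }
        }
        return out;
    }

    // Batch-like behaviour: read all input first, then emit one record per word.
    static List<String> finalCounts(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // skip blank lines
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.add("(" + e.getKey() + "," + e.getValue() + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "hello hello world hello world",
                "hello world world world hello world");
        runningCounts(input).forEach(System.out::println); // eleven records
        System.out.println("---");
        finalCounts(input).forEach(System.out::println);   // two records
    }
}
```

With the input file above, runningCounts reproduces the eleven records of the
stream output, while finalCounts returns only (hello,5) and (world,6).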

Regards,
Luke
