Hi Aljoscha,
I went to the Flink hackathon at Buzzwords yesterday, where Fabian and Robert
helped me with this issue. Apparently I was assuming that the file would be
handled in a single thread, but I was using a ParallelSourceFunction, so it was
creating 4 different threads and thus reading the same file four times, which
explains the duplicate records.
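For reference, a minimal sketch of one way to keep such a source parallel
without each subtask re-reading the whole file; the class and field names are
made up for illustration and are not code from this thread:

import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

import java.io.BufferedReader;
import java.io.FileReader;

// Each parallel instance of a ParallelSourceFunction runs run() on its own,
// so a naive file-reading source emits the whole file once per subtask.
// One way to keep the source parallel without duplicates is to let each
// subtask emit only "its" lines, chosen by line index modulo the parallelism.
public class DedupedFileSource extends RichParallelSourceFunction<String> {

    private final String path;
    private volatile boolean running = true;

    public DedupedFileSource(String path) {
        this.path = path;
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        int subtask = getRuntimeContext().getIndexOfThisSubtask();
        int parallelism = getRuntimeContext().getNumberOfParallelSubtasks();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            long index = 0;
            while (running && (line = reader.readLine()) != null) {
                if (index++ % parallelism == subtask) {
                    ctx.collect(line); // each line is emitted by exactly one subtask
                }
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

The simpler fix, of course, is to implement a plain (non-parallel)
SourceFunction, or to call setParallelism(1) on the source, so that only one
instance reads the file.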
Hi,
could you try pulling the problem apart, i.e. determining at which point in
the pipeline you have duplicate data? Is it after the sources, in the
CoFlatMap, or in the Map after the reduce, for example?
Cheers,
Aljoscha
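As an illustration of that kind of check, here is a minimal, self-contained
sketch (with placeholder data rather than the actual pipeline) that prints the
stream after each stage, so the counts can be compared to see where duplicates
first appear:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class WhereAreTheDuplicates {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> data = env.fromElements("a", "b", "c");
        DataStream<String> control = env.fromElements("x");

        data.print(); // stage 1: records straight out of the source

        DataStream<String> joined = data.connect(control)
            .flatMap(new CoFlatMapFunction<String, String, String>() {
                @Override
                public void flatMap1(String value, Collector<String> out) {
                    out.collect(value); // data side, forwarded unchanged
                }

                @Override
                public void flatMap2(String value, Collector<String> out) {
                    // control side, nothing emitted in this sketch
                }
            });

        joined.print(); // stage 2: records after the CoFlatMap

        env.execute("duplicate-hunt");
    }
}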
On Wed, 1 Jun 2016 at 17:11 Biplob Biswas wrote:
Hi,
Before giving the method you described above a try, I tried attaching the
timestamp to my data directly at the stream source.
Following is my stream source:
http://pastebin.com/AsXiStMC
and I am using the stream source as follows:
DataStream tuples = env.addSource(new DataStreamGenerator(file
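For context, a minimal sketch (a hypothetical class, not the pastebin code) of
how a source can attach a timestamp to each record through the SourceContext
and emit matching watermarks:

import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.api.watermark.Watermark;

// Source that emits each record together with an event-time timestamp
// and a watermark, so downstream operators can work on event time.
public class TimestampedGenerator implements SourceFunction<String> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        long i = 0;
        while (running && i < 100) {
            long timestamp = System.currentTimeMillis();
            // emit the record with its timestamp attached
            ctx.collectWithTimestamp("record-" + i, timestamp);
            // signal that no earlier timestamps will follow
            ctx.emitWatermark(new Watermark(timestamp));
            i++;
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

For those timestamps to be picked up downstream, the job also needs event time
enabled, e.g. env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime).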
Hi,
right now the only way of getting at the timestamps is writing a custom
operator and using that with DataStream.transform(). Take a look at
StreamMap, which is the operator implementation that executes MapFunctions.
I think the links in the doc should point to
https://ci.apache.org/projects/fl
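Sketched against the operator API of that time, a pass-through operator
modeled on StreamMap could look roughly like the following (the class name is
made up); processElement() receives the StreamRecord wrapper, which carries
the element's timestamp:

import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

// Pass-through operator modeled on StreamMap: it reads each record's
// timestamp and forwards the record and watermarks unchanged.
public class TimestampPeekOperator<T>
        extends AbstractStreamOperator<T>
        implements OneInputStreamOperator<T, T> {

    @Override
    public void processElement(StreamRecord<T> element) throws Exception {
        long timestamp = element.getTimestamp(); // the record's event-time timestamp
        System.out.println("element " + element.getValue() + " @ " + timestamp);
        output.collect(element); // forward the record unchanged
    }

    @Override
    public void processWatermark(Watermark mark) throws Exception {
        output.emitWatermark(mark); // forward watermarks unchanged
    }
}

It would then be applied with something like
tuples.transform("timestamp-peek", tuples.getType(), new TimestampPeekOperator<>())
on the stream built earlier in the thread.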