Re: Extracting Timestamp in MapFunction

2016-06-09 Thread Biplob Biswas
Hi Aljoscha, I went to the Flink hackathon by Buzzwords yesterday where Fabian and Robert helped me with this issue. Apparently I was assuming that the file would be handled in a single thread but I was using parallelsourcefunction and it was creating 4 different threads and thus reading the same

Re: Extracting Timestamp in MapFunction

2016-06-09 Thread Aljoscha Krettek
Hi, could you try pulling the problem apart, i.e. determine at which point in the pipeline you have duplicate data. Is it after the sources or in the CoFlatMap or the Map after the reduce, for example? Cheers, Aljoscha On Wed, 1 Jun 2016 at 17:11 Biplob Biswas wrote: > Hi, > > Before giving the

Re: Extracting Timestamp in MapFunction

2016-06-01 Thread Biplob Biswas
Hi, Before giving the method u described above a try, i tried adding the timestamp with my data directly at the stream source. Following is my stream source: http://pastebin.com/AsXiStMC and I am using the stream source as follows: DataStream tuples = env.addSource(new DataStreamGenerator(file

Re: Extracting Timestamp in MapFunction

2016-05-30 Thread Aljoscha Krettek
Hi, right now the only way of getting at the timestamps is writing a custom operator and using that with DataStream.transform(). Take a look at StreamMap, which is the operator implementation that executes MapFunctions. I think the links in the doc should point to https://ci.apache.org/projects/fl