Hi Pramod,
I removed transient, but the result is the same: https://github.com/Matzz/apex-example/blob/master/src/main/java/aptest/Aggregator.java
Creating aggregator 2018-06-18T10:42:50.582
Failing aggregator! 2018-06-18T10:42:50.707
Creating FileOutput 2018-06-18T10:42:50.848
[1.0]
[1.0, 2.0]
[1.0, 2.0, 3.0]
[1.0, 2.0, 3.0, 4.0]
[1.0, 2.0, 3.0, 4.0, 5.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
Creating aggregator 2018-06-18T10:42:59.683
Failing aggregator! 2018-06-18T10:42:59.794
Creating FileOutput 2018-06-18T10:42:59.926
Creating aggregator 2018-06-18T10:43:08.810
Failing aggregator! 2018-06-18T10:43:08.918
Creating FileOutput 2018-06-18T10:43:08.988
[1.0]
[1.0, 2.0]
[1.0, 2.0, 3.0]
Creating FileOutput 2018-06-18T10:43:18.059
Creating aggregator 2018-06-18T10:43:18.142
Failing aggregator! 2018-06-18T10:43:18.227
[1.0]
[1.0, 2.0]
[1.0, 2.0, 3.0]
[1.0, 2.0, 3.0, 4.0]
Creating FileOutput 2018-06-18T10:43:27.130
Creating aggregator 2018-06-18T10:43:27.135
Failing aggregator! 2018-06-18T10:43:27.228
[1.0]
[1.0, 2.0]
[1.0, 2.0, 3.0]
[1.0, 2.0, 3.0, 4.0]
[1.0, 2.0, 3.0, 4.0, 5.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

On Mon, 18 Jun 2018 at 00:16, Pramod Immaneni <pramod.imman...@gmail.com> wrote:

> Hi Mateusz,
>
> It is because you have defined the list as transient in the Aggregator.
> Transient fields are not serialized, so they are not included when the
> checkpoint is created.
>
> Thanks
>
> On Sun, Jun 17, 2018 at 2:35 PM Mateusz Zakarczemny <m.zakarcze...@gmail.com> wrote:
>
>> Hi all,
>> I created a simple app to test Apex fault tolerance.
>> It is built from three main operators:
>> - sequence generator - an operator which generates increasing numbers, one
>> per time window
>> - aggregator - just adds each incoming number to the list and emits the whole
>> list downstream
>> - file output - an operator which writes incoming messages to a file
>> To make it faulty, the aggregator operator throws an exception for 10% of
>> messages. Source code is here: https://github.com/Matzz/apex-example
>>
>> I'm running it on the sandbox Docker image. I thought that even if the
>> aggregation operator is faulty, the application would checkpoint its state,
>> so over time the output list should get longer and longer.
>> Unfortunately, it looks like on each failure the app is resetting its state
>> to the beginning. Sample output:
>>
>> tail -f -n 100 /tmp/stream.out
>>
>> Creating FileOutput 2018-06-16T22:07:01.033
>> Creating aggreagator 2018-06-16T22:07:01.040
>> Creating FileOutput 2018-06-16T22:07:01.041
>> Creating FileOutput 2018-06-16T22:07:02.719
>> Creating aggreagator 2018-06-16T22:07:02.722
>> Creating FileOutput 2018-06-16T22:07:02.723
>> Creating FileOutput 2018-06-16T22:08:48.178
>> Creating aggreagator 2018-06-16T22:08:48.185
>> Creating FileOutput 2018-06-16T22:08:48.186
>> Creating FileOutput 2018-06-16T22:08:49.847
>> Creating aggreagator 2018-06-16T22:08:49.850
>> Creating FileOutput 2018-06-16T22:08:49.852
>> Creating FileOutput 2018-06-16T22:08:56.736
>> Creating aggreagator 2018-06-16T22:08:56.740
>> Creating FileOutput 2018-06-16T22:08:56.743
>> Creating FileOutput 2018-06-16T22:08:57.899
>> Creating aggreagator 2018-06-16T22:08:57.899
>> Creating FileOutput 2018-06-16T22:08:57.899
>> Creating FileOutput 2018-06-16T22:09:10.951
>> Creating FileOutput 2018-06-16T22:09:10.986
>> Creating aggreagator 2018-06-16T22:09:11.001
>> Failing sequence generator!2018-06-16T22:09:11.029
>> Creating FileOutput 2018-06-16T22:09:19.484
>> Creating FileOutput 2018-06-16T22:09:19.506
>> Creating aggreagator 2018-06-16T22:09:19.518
>> Failing sequence generator!2018-06-16T22:09:19.542
>> Creating FileOutput 2018-06-16T22:09:28.646
>> Creating FileOutput 2018-06-16T22:09:28.668
>> Creating aggreagator 2018-06-16T22:09:28.680
>> Failing sequence generator!2018-06-16T22:09:28.704
>> [1.0]
>> Creating FileOutput 2018-06-16T22:09:37.864
>> Creating FileOutput 2018-06-16T22:09:37.886
>> Creating aggreagator 2018-06-16T22:09:37.897
>> Failing sequence generator!2018-06-16T22:09:37.924
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
>> Creating FileOutput 2018-06-16T22:09:46.921
>> Creating FileOutput 2018-06-16T22:09:46.944
>> Creating aggreagator 2018-06-16T22:09:46.956
>> Failing sequence generator!2018-06-16T22:09:46.980
>> [1.0, 2.0, 3.0, 4.0]
>> [1.0, 2.0, 3.0, 4.0]
>> [1.0, 2.0, 3.0, 4.0]
>> [1.0, 2.0, 3.0, 4.0]
>> Creating FileOutput 2018-06-16T22:09:56.049
>> Creating FileOutput 2018-06-16T22:09:56.070
>> Creating aggreagator 2018-06-16T22:09:56.081
>> Failing sequence generator!2018-06-16T22:09:56.112
>> [1.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>> Creating FileOutput 2018-06-16T22:10:05.213
>> Creating FileOutput 2018-06-16T22:10:05.232
>> Creating aggreagator 2018-06-16T22:10:05.241
>> Failing sequence generator!2018-06-16T22:10:05.266
>> [1.0, 2.0]
>> [1.0, 2.0]
>>
>> Could I ask for some explanation of what I'm doing wrong?
>>
>> Regards,
>> Mateusz Zakarczemny
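
As a side note on the explanation quoted above (transient fields are left out of the checkpoint): the effect can be reproduced outside Apex. The sketch below is only an illustration. It assumes plain java.io serialization as a stand-in for the checkpoint mechanism, and the names AggregatorState, checkpoint and restore are hypothetical; they are not taken from the linked repository or from the Apex API. Apex uses its own serializer for checkpoints, so details such as what a skipped field looks like after a restore may differ there.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TransientCheckpointSketch {

    // Hypothetical stand-in for operator state; not the Aggregator from the repository.
    static class AggregatorState implements Serializable {
        private static final long serialVersionUID = 1L;

        // Regular field: written out when the state is saved.
        List<Double> kept = new ArrayList<>();

        // Transient field: skipped when the state is saved.
        transient List<Double> dropped = new ArrayList<>();

        void add(double value) {
            kept.add(value);
            dropped.add(value);
        }
    }

    // "Checkpoint" here just means: serialize the state to a byte array (an assumption for this sketch).
    static byte[] checkpoint(AggregatorState state) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        return bytes.toByteArray();
    }

    // "Restore" rebuilds the state from the saved bytes, as a recovery after a failure would.
    static AggregatorState restore(byte[] saved) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            return (AggregatorState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        AggregatorState state = new AggregatorState();
        state.add(1.0);
        state.add(2.0);
        state.add(3.0);

        AggregatorState restored = restore(checkpoint(state));

        System.out.println("checkpointed list: " + restored.kept);    // [1.0, 2.0, 3.0]
        System.out.println("transient list:    " + restored.dropped); // null
    }
}
```

With java.io serialization the transient list comes back as null, while the regular list keeps its contents. Any state that has to survive a restart therefore needs to live in a non-transient field of a type the serializer can handle.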