Thanks for the feedback.

When you write "cannot pass", what do you mean? Is the exception that you 
reported logged and the program then continues? Or something else?

Besides, the standard tests that we run for the release pass and show that 
checkpointing works. The problem might be related to the speed of 
checkpointing and of sending events. Note that it might not be necessary to 
checkpoint for every single event: checkpointing every n events (n 
relatively small) and losing at worst n-1 events per PE in case of failure 
might be ok. 
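
To illustrate the "every n events" idea, here is a minimal sketch. It is 
only an assumption about how such a counter could look: EveryNEventsTrigger 
is a hypothetical helper, not an S4 class, and how the checkpoint itself is 
requested depends on your S4 setup.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of the S4 API): decides when a PE should
 * request a checkpoint, based only on the number of events it has processed.
 */
final class EveryNEventsTrigger {
    private final int n;
    private final AtomicLong eventsSeen = new AtomicLong();

    EveryNEventsTrigger(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.n = n;
    }

    /** Call once per processed event; returns true on every n-th event. */
    boolean onEvent() {
        return eventsSeen.incrementAndGet() % n == 0;
    }

    /**
     * Number of events processed since the last trigger. This is the number
     * of events that would be lost if the node failed now; it never exceeds
     * n - 1.
     */
    long eventsSinceLastTrigger() {
        return eventsSeen.get() % n;
    }
}
```

With n = 3, onEvent() returns true on the 3rd, 6th, 9th... event, so at most 
2 events per PE are not covered by a checkpoint at any time.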

It would be good to know under which conditions exactly you encounter the 
issue, i.e. the frequency of checkpointing and the frequency of events 
sent/received. Filing a bug on our JIRA system would be the best way to 
follow up on that conversation.

Thanks and regards,

Matthieu 




On Mar 26, 2013, at 08:56 , Dingyu Yang wrote:

> Hi Matthieu,
> I debugged the program and still have this problem. 
> While debugging, I found that the problem occurs at: 
> SaveStateTask.run-----futureSerializedState.get(1000, TimeUnit.MILLISECONDS).
> It cannot pass here. I don't know what the problem is, even though I have 
> just one PE instance. Is the problem in my program or in S4?
> Are you able to checkpoint?
> 
> Waiting for your answer!
> 
> 
> 2013/3/26 Matthieu Morel <mmo...@apache.org>
> This looks like a bug, from a race condition in the serializer.
> 
> Can you file a bug? Also, are you able to reproduce it systematically?
> 
> Thanks,
> 
> Matthieu
> 
> On Mar 23, 2013, at 07:33 , Dingyu Yang wrote:
> 
> > Hi all,
> > I ran a checkpoint example and ran into some problems.
> > The version is S4 0.6 RC3.
> > ./s4 deploy -a=example.wordcountApp -c=testCluster1 -appName=wordApp 
> > -p=s4.checkpointing.filesystem.storageRootPath=/home/tmp/s4checkpoint 
> > -emc=org.apache.s4.core.ft.FileSystemBackendCheckpointingModule
> >
> > Then I get this error:
> > 14:21:50.251 [Checkpointing-storage-0] WARN  
> > org.apache.s4.core.ft.SaveStateTask - Cannot save checkpoint : 
> > [PROTO_ID];[KEY] --> [example.WordSumPE];[./s4]
> > java.util.concurrent.ExecutionException: 
> > com.esotericsoftware.kryo.KryoException: 
> > java.util.ConcurrentModificationException
> > Serialization trace:
> > classes (sun.misc.Launcher$AppClassLoader)
> > contextClassLoader (java.lang.Thread)
> > thread (java.util.concurrent.ThreadPoolExecutor$Worker)
> > workers (java.util.concurrent.ThreadPoolExecutor)
> > fetchingThreadPool (org.apache.s4.core.ft.SafeKeeper)
> > checkpointingFramework (example.wordcountApp)
> > app (org.apache.s4.core.Stream)
> > downStream (example.WordSumPE)
> >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232) 
> > ~[na:1.6.0_22]
> >     at java.util.concurrent.FutureTask.get(FutureTask.java:91) 
> > ~[na:1.6.0_22]
> >     at org.apache.s4.core.ft.SaveStateTask.run(SaveStateTask.java:66) 
> > ~[bin/:na]
> >     at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >  [na:1.6.0_22]
> >     at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >  [na:1.6.0_22]
> >     at java.lang.Thread.run(Thread.java:662) [na:1.6.0_22]
> > Caused by: com.esotericsoftware.kryo.KryoException: 
> > java.util.ConcurrentModificationException
> > Serialization trace:
> > classes (sun.misc.Launcher$AppClassLoader)
> > contextClassLoader (java.lang.Thread)
> > thread (java.util.concurrent.ThreadPoolExecutor$Worker)
> > workers (java.util.concurrent.ThreadPoolExecutor)
> > fetchingThreadPool (org.apache.s4.core.ft.SafeKeeper)
> > checkpointingFramework (example.wordcountApp)
> > app (org.apache.s4.core.Stream)
> > downStream (example.WordSumPE)
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:585)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:552) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:68)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:571) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > org.apache.s4.comm.serialize.KryoSerDeser.serialize(KryoSerDeser.java:91) 
> > ~[bin/:na]
> >     at 
> > org.apache.s4.core.ProcessingElement.serializeState(ProcessingElement.java:802)
> >  ~[bin/:na]
> >     at org.apache.s4.core.ft.SerializeTask.call(SerializeTask.java:42) 
> > ~[bin/:na]
> >     at org.apache.s4.core.ft.SerializeTask.call(SerializeTask.java:1) 
> > ~[bin/:na]
> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) 
> > ~[na:1.6.0_22]
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138) 
> > ~[na:1.6.0_22]
> >     ... 3 common frames omitted
> > Caused by: java.util.ConcurrentModificationException: null
> >     at 
> > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) 
> > ~[na:1.6.0_22]
> >     at java.util.AbstractList$Itr.next(AbstractList.java:343) ~[na:1.6.0_22]
> >     at 
> > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:74)
> >  ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18)
> >  ~[kryo-2.20.jar:na]
> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) 
> > ~[kryo-2.20.jar:na]
> >     at 
> > com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
> >  ~[kryo-2.20.jar:na]
> >     ... 35 common frames omitted
> >
> >
> 
> 
