Hi Richard and all,

Thank you for your answer. This is still only a partial solution, as:

1. The JCas is referenced from inside a Document object, and by your
suggestion, I must serialize the two separately. For instance, I could write
them alternating: <Document, JCas, Document, JCas, ...>, or implement a
custom writeObject() in Document (the java.io.Serializable hook) and call
ObjectOutputStream.defaultWriteObject() for the other fields. However, I am
looking for a way to have the serializer of the Document just go through its
default writeObject() implementation, and only when it encounters the JCas
field would some special treatment be triggered.

2. More importantly - my Sentence object (referenced by a Document object)
has a reference to a Sentence Annotation. This Annotation cannot be
serialized by the method you suggest, as it only takes a full CAS. Of course
I could implement something that, when deserializing, iterates through the
CAS, finds each sentence's annotation and manually puts its reference in the
Sentence object. But this is pretty complicated, and would make
deserialization very slow. So I am looking for a way for the
SentenceAnnotation references to "survive" the serialization/deserialization.

Do you have any ideas?

Thank you,
Ofer

On Mon, May 19, 2014 at 12:19 PM, Richard Eckart de Castilho
<[email protected]> wrote:

> Hello Ofer,
>
> the CAS cannot be serialized immediately, but there is a helper class
> which is serializable.
>
> To write:
>
> ObjectOutputStream docOS = ...
> CASCompleteSerializer serializer =
> Serialization.serializeCASComplete(aJCas.getCasImpl());
> docOS.writeObject(serializer);
>
> To read:
>
> ObjectInputStream is = ...
> CASCompleteSerializer serializer = (CASCompleteSerializer) is.readObject();
> Serialization.deserializeCASComplete(serializer, (CASImpl) aCAS);
>
> However, there are newer and more efficient binary formats that you might
> want to use [1].
>
> If you want to dig into the topic or if you want to use a ready-made pair
> of readers/writers for the binary formats, you could consider taking a
> look at the BinaryCasReader/Writer in the DKPro Core [2,3] (non-ASF).
>
> Cheers,
>
> -- Richard
>
> [1]
> http://uima.apache.org/d/uimaj-2.6.0/tutorials_and_users_guides.html#ugr.tug.type_filtering.compressed_file
> [2]
> https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.io.bincas-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/bincas/BinaryCasReader.java
> [3]
> https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.io.bincas-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/bincas/BinaryCasWriter.java
>
> On 19.05.2014, at 11:03, Ofer Bronstein <[email protected]> wrote:
>
> > Hi Guys,
> >
> > I am an Israeli Master's Student, and have been happily working with UIMA
> > for the past two years.
> > I hope this is the right place for my question -
> >
> > I have a Document object I created, which has a JCas member with
> > annotations over a document.
> > I also have a Sentence object, with a member referencing its Sentence
> > Annotation in the corresponding JCas. Each Document object references all
> > of its Sentence objects.
> > I would like to dump each Document object as a file on disk, using the
> > default Java serialization. Later they would also be deserialized back
> > into the Java objects. I understand I would need some special treatment
> > for the JCases and the Sentence Annotations as they are not serializable
> > (now I get NotSerializableException). Hopefully the treatment could be as
> > minimal as possible.
> >
> > How do you suggest to do this, regarding serialization of JCas and
> > combining it with Java serialization?
> >
> > I am working on Windows, with Java 1.6 and UIMA 2.4.0. I am using the
> > same type system and the same 3 views for all JCases and annotations.
> >
> > Thank you,
> > Ofer Bronstein
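
For illustration, the custom writeObject() idea from point 1 of the reply at
the top can be sketched with plain Java serialization. This is only a sketch
under assumptions, not UIMA code: FakeCas is a hypothetical, non-serializable
stand-in for a JCas, and Document, Sentence and SerializationSketch are
illustrative classes. The transient CAS field is skipped by
defaultWriteObject() and written by hand, and each Sentence stores an index
instead of a direct annotation reference, so the link can be rebuilt after
reading. With a real JCas, the hand-written part could presumably write the
CASCompleteSerializer from Richard's message instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JCas: deliberately NOT Serializable.
class FakeCas {
    final String text;
    final List<int[]> sentenceSpans = new ArrayList<int[]>(); // {begin, end}
    FakeCas(String text) { this.text = text; }
}

// Keeps an index into the CAS instead of a direct annotation reference.
class Sentence implements Serializable {
    private static final long serialVersionUID = 1L;
    final int spanIndex;
    transient FakeCas cas; // re-attached by Document.readObject()
    Sentence(FakeCas cas, int spanIndex) { this.cas = cas; this.spanIndex = spanIndex; }
    String coveredText() {
        int[] span = cas.sentenceSpans.get(spanIndex);
        return cas.text.substring(span[0], span[1]);
    }
}

class Document implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final List<Sentence> sentences = new ArrayList<Sentence>();
    transient FakeCas cas; // skipped by default serialization

    Document(String name, FakeCas cas) { this.name = name; this.cas = cas; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes name and sentences as usual
        // Special treatment for the CAS only (hand-written here).
        out.writeObject(cas.text);
        out.writeInt(cas.sentenceSpans.size());
        for (int[] span : cas.sentenceSpans) {
            out.writeInt(span[0]);
            out.writeInt(span[1]);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores name and sentences
        cas = new FakeCas((String) in.readObject());
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            int begin = in.readInt();
            int end = in.readInt();
            cas.sentenceSpans.add(new int[] { begin, end });
        }
        for (Sentence s : sentences) {
            s.cas = cas; // re-link every Sentence to the restored CAS
        }
    }
}

class SerializationSketch {
    static Document sample() {
        FakeCas cas = new FakeCas("First one. Second one.");
        cas.sentenceSpans.add(new int[] { 0, 10 });
        cas.sentenceSpans.add(new int[] { 11, 22 });
        Document doc = new Document("demo", cas);
        doc.sentences.add(new Sentence(cas, 0));
        doc.sentences.add(new Sentence(cas, 1));
        return doc;
    }

    // Serializes to a byte array and reads the object back.
    static Document roundTrip(Document doc) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(doc);
        out.close();
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        try {
            return (Document) in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Document copy = roundTrip(sample());
        System.out.println(copy.sentences.get(1).coveredText());
    }
}
```

The callers only ever use writeObject(doc) and readObject(); the CAS handling
stays inside Document. Whether an index (or begin/end offsets) is enough to
find the right Sentence Annotation again in a real CAS is an assumption of
this sketch.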
