Hi Richard and all,

Thank you for your answer. This is still only a partial solution, as:

1. The JCas is referenced from inside a Document object, so with your
suggestion I would have to serialize the two separately: for instance, write
them alternately (<Document, JCas, Document, JCas, ...>), or implement a
private writeObject() method in Document that calls
ObjectOutputStream.defaultWriteObject() for the other fields. What I am
looking for is a way to let the serializer of the document go through its
default writeObject() implementation, and trigger some special treatment
only when it encounters the JCas field.

2. More importantly, my Sentence object (referenced by a Document object)
has a reference to a Sentence Annotation. This Annotation cannot be
serialized by the method you suggest, as that method only takes a full CAS.
Of course, I could write something that, during deserialization, iterates
through the CAS, finds each sentence's annotation and manually puts its
reference back into the Sentence object. But this is pretty complicated and
would make deserialization very slow. So I am looking for a way for the
SentenceAnnotation references to "survive" the
serialization/deserialization.
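
To make the two points concrete, here is a minimal plain-Java sketch of the
behaviour I have in mind. It is not UIMA code: FakeCas and FakeAnnotation are
hypothetical stand-ins for the JCas and the Sentence Annotation, and in real
code the CAS part would presumably be written with the CASCompleteSerializer
from your reply. The fields that cannot be serialized are marked transient,
Document calls defaultWriteObject() for everything else, and each Sentence
keeps its begin/end offsets as a key so that its annotation reference can be
restored from an index built in one pass. The sketch assumes that no two
sentences share the same offsets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Sentence annotation: deliberately NOT Serializable.
final class FakeAnnotation {
    final int begin;
    final int end;
    FakeAnnotation(int begin, int end) { this.begin = begin; this.end = end; }
}

// Hypothetical stand-in for a JCas: NOT Serializable, owns the annotations.
final class FakeCas {
    final String text;
    final List<FakeAnnotation> annotations = new ArrayList<FakeAnnotation>();
    FakeCas(String text) { this.text = text; }
    FakeAnnotation add(int begin, int end) {
        FakeAnnotation a = new FakeAnnotation(begin, end);
        annotations.add(a);
        return a;
    }
}

final class Sentence implements Serializable {
    private static final long serialVersionUID = 1L;
    final int begin;                      // serializable key for the annotation
    final int end;
    transient FakeAnnotation annotation;  // skipped by default serialization
    Sentence(FakeAnnotation annotation) {
        this.begin = annotation.begin;
        this.end = annotation.end;
        this.annotation = annotation;
    }
}

final class Document implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final List<Sentence> sentences = new ArrayList<Sentence>();
    transient FakeCas cas;                // skipped by default serialization

    Document(String name, FakeCas cas) { this.name = name; this.cas = cas; }

    // Packs a (begin, end) pair into one long so it can serve as a map key.
    static long key(int begin, int end) { return ((long) begin << 32) | (end & 0xffffffffL); }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();         // name and sentences, the default way
        out.writeObject(cas.text);        // special treatment for the CAS only
        int[] offsets = new int[cas.annotations.size() * 2];
        for (int i = 0; i < cas.annotations.size(); i++) {
            offsets[2 * i] = cas.annotations.get(i).begin;
            offsets[2 * i + 1] = cas.annotations.get(i).end;
        }
        out.writeObject(offsets);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        cas = new FakeCas((String) in.readObject());
        int[] offsets = (int[]) in.readObject();
        // One pass over the restored annotations builds the lookup index.
        Map<Long, FakeAnnotation> index = new HashMap<Long, FakeAnnotation>();
        for (int i = 0; i < offsets.length; i += 2) {
            FakeAnnotation a = cas.add(offsets[i], offsets[i + 1]);
            index.put(key(a.begin, a.end), a);
        }
        // Re-attach each sentence to its annotation in the restored CAS.
        for (Sentence s : sentences) {
            s.annotation = index.get(key(s.begin, s.end));
        }
    }
}

final class SerializationSketch {
    // Serializes a document to memory and reads it back.
    static Document roundTrip(Document doc) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(doc);
            out.close();
            ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return (Document) in.readObject();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        FakeCas cas = new FakeCas("One. Two.");
        Document doc = new Document("doc1", cas);
        doc.sentences.add(new Sentence(cas.add(0, 4)));
        doc.sentences.add(new Sentence(cas.add(5, 9)));
        Document copy = roundTrip(doc);
        Sentence s = copy.sentences.get(1);
        // prints "Two."
        System.out.println(copy.cas.text.substring(s.annotation.begin, s.annotation.end));
    }
}
```

With this shape, the caller only ever calls writeObject(document); the CAS
handling stays hidden inside Document. Whether the same trick works cleanly
with real JCas and Annotation objects is exactly what I am unsure about.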

Do you have any ideas?

Thank you,
Ofer


On Mon, May 19, 2014 at 12:19 PM, Richard Eckart de Castilho <[email protected]> wrote:

> Hello Ofer,
>
> the CAS cannot be serialized directly, but there is a helper class
> which is serializable.
>
> To write:
>
> ObjectOutputStream docOS = ...
> CASCompleteSerializer serializer =
> Serialization.serializeCASComplete(aJCas.getCasImpl());
> docOS.writeObject(serializer);
>
> To read:
>
> ObjectInputStream is = ...
> CASCompleteSerializer serializer = (CASCompleteSerializer) is.readObject();
> Serialization.deserializeCASComplete(serializer, (CASImpl) aCAS);
>
> However, there are newer and more efficient binary formats that you might
> want to use [1].
>
> If you want to dig into the topic or if you want to use a ready-made pair of
> readers/writers for the binary formats, you could consider taking a look at
> the BinaryCasReader/Writer in the DKPro Core [2,3] (non-ASF).
>
> Cheers,
>
> -- Richard
>
> [1]
> http://uima.apache.org/d/uimaj-2.6.0/tutorials_and_users_guides.html#ugr.tug.type_filtering.compressed_file
> [2]
> https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.io.bincas-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/bincas/BinaryCasReader.java
> [3]
> https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.io.bincas-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/bincas/BinaryCasWriter.java
>
> On 19.05.2014, at 11:03, Ofer Bronstein <[email protected]> wrote:
>
> > Hi Guys,
> >
> > I am an Israeli Master's Student, and have been happily working with UIMA
> > for the past two years.
> > I hope this is the right place for my question -
> >
> > I have a Document object I created, which has a JCas member with
> > annotations over a document.
> > I also have a Sentence object, with a member referencing its Sentence
> > Annotation in the corresponding JCas. Each Document object references all
> > of its Sentence objects.
> > I would like to dump each Document object as a file on disk, using the
> > default Java serialization. Later they would also be deserialized back into
> > the Java objects. I understand I would need some special treatment for the
> > JCases and the Sentence Annotations as they are not serializable (now I get
> > NotSerializableException). Hopefully the treatment could be as minimal as
> > possible.
> >
> > How would you suggest doing this, that is, serializing the JCas and
> > combining it with Java serialization?
> >
> > I am working on Windows, with Java 1.6 and UIMA 2.4.0. I am using the same
> > type system and the same 3 views for all JCases and annotations.
> >
> > Thank you,
> > Ofer Bronstein
>
>
