Hi Richard and all,

Thank you for the idea. I tried using your idea with ll_getFSForRef(), but
I get a NullPointerException:
In CASImpl.ll_getFSForRef(int fsRef), in the last line of the method (line
3117), the expression this.svd.localFsGenerators[getHeap().heap[fsRef]]
evaluates to null. The full statement
is this.svd.localFsGenerators[getHeap().heap[fsRef]].createFS(fsRef, this),
so calling createFS(fsRef, this) on that null throws the
NullPointerException.
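
To make the failure mode concrete, here is a toy model of the lookup shape in that expression. It is only a sketch: FsLookupSketch, Generator, and the String return type are hypothetical stand-ins and not UIMA classes. The one thing taken from the statement above is the shape "generators[heap[fsRef]].createFS(fsRef)", where the heap slot at the address selects an entry in a generator table.

```java
// Toy model only: these are NOT UIMA classes. The sketch mirrors the shape
// generators[heap[fsRef]].createFS(fsRef) from the statement quoted above.
final class FsLookupSketch {
    /** Hypothetical stand-in for a per-type feature structure generator. */
    interface Generator {
        String createFS(int fsRef);
    }

    private final int[] heap;             // heap[fsRef] holds a type code
    private final Generator[] generators; // indexed by type code; entries may be null

    FsLookupSketch(int[] heap, Generator[] generators) {
        this.heap = heap;
        this.generators = generators;
    }

    /** Unguarded lookup: throws NullPointerException if the table entry is null. */
    String resolveUnchecked(int fsRef) {
        return generators[heap[fsRef]].createFS(fsRef);
    }

    /** Guarded lookup: reports which type code had no generator registered. */
    String resolveChecked(int fsRef) {
        int typeCode = heap[fsRef];
        Generator generator = generators[typeCode];
        if (generator == null) {
            throw new IllegalStateException(
                "No generator for type code " + typeCode + " at address " + fsRef);
        }
        return generator.createFS(fsRef);
    }

    public static void main(String[] args) {
        int[] heap = {0, 1, 0, 2};
        Generator[] generators = new Generator[3];
        generators[1] = ref -> "Sentence@" + ref; // only type code 1 is registered
        FsLookupSketch lookup = new FsLookupSketch(heap, generators);

        System.out.println(lookup.resolveChecked(1));
        try {
            lookup.resolveUnchecked(3); // type code 2 has no generator
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for address 3");
        }
    }
}
```

In this model a null table entry means that the value stored at the address does not select a registered generator. Whether that is what happens inside CASImpl in my case is a guess.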

The address I am using is definitely on a Sentence Annotation that exists
in the CAS, in the _InitialView, and I got the address by calling
getAddress() on it and saving the Integer.
Can you think of any reason why this happens? Or, should I do something
special to make the address valid, or have the FeatureStructure retrievable
from it?
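
For reference, the pattern under discussion looks roughly like the sketch below. It is a simplified stand-in, not the actual classes: SentenceSketch is a hypothetical name, the annotation is typed as Object, and the resolver argument stands in for the aJCas.getLowLevelCas().ll_getFSForRef(address) call so that the snippet compiles without UIMA on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.IntFunction;

// Hypothetical stand-in for the Sentence class; no UIMA types are used here.
final class SentenceSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int address;           // value saved from getAddress()
    private transient Object annotation; // skipped by default Java serialization

    SentenceSketch(int address, Object annotation) {
        this.address = address;
        this.annotation = annotation;
    }

    int getAddress() {
        return address;
    }

    /** Returns the cached annotation, resolving it from the address on first use. */
    Object getAnnotation(IntFunction<Object> resolver) {
        if (annotation == null) {
            annotation = resolver.apply(address);
        }
        return annotation;
    }

    /** Round-trips a sentence through default Java serialization in memory. */
    static SentenceSketch roundTrip(SentenceSketch sentence)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(sentence);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SentenceSketch) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SentenceSketch copy = roundTrip(new SentenceSketch(42, new Object()));
        // The transient field is null after deserialization, so the resolver runs.
        System.out.println(copy.getAnnotation(addr -> "annotation at " + addr));
    }
}
```

Only the int address travels through Java serialization. The annotation itself has to be looked up again in a CAS that was restored separately, which is the step that currently fails for me.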

Thank you,
Ofer


On Mon, May 19, 2014 at 1:24 PM, Richard Eckart de Castilho
<[email protected]> wrote:

> Hi Ofer,
>
> I'm not an expert on Java Serialization but here goes nothing ;)
>
> 1) I suppose you could override the default Java Serialization process for
> your Document class and handle the de/serialization of the CAS via
> the CASCompleteSerializer - that would basically be the special treatment.
>
> 2) I do not think that you can make JCas objects (like SentenceAnnotation)
> "survive" the serialization process because they are not serializable.
> If you manage to de/serialize the CAS using CASCompleteSerializer, then
> you can make use of the CAS addresses in each annotation. Your Sentence
> object can maintain a reference to the address of each SentenceAnnotation.
> When you want to access the SentenceAnnotation through your Sentence,
> you do so by resolving the address against the loaded JCas:
>
> (Store this address in your Sentence)
>   int address = sentenceAnnotation.getAddress()
>
> (Use it later after deserialization to fetch the SentenceAnnotation from
> the JCas)
>   (SentenceAnnotation) aJCas.getLowLevelCas().ll_getFSForRef(address)
>
> Btw. this is as fast as it gets - JCas wrappers use such code internally.
>
> I'd say what you plan to do should work but it verges on the border of
> black magic! But then again, I've done similar stuff ;)
>
> Cheers,
>
> -- Richard
>
> In your Document object, make the CAS a
>
> On 19.05.2014, at 12:04, Ofer Bronstein <[email protected]> wrote:
>
> > Hi Richard and all,
> >
> > Thank you for your answer. This is still only a partial solution, as:
> >
> > 1. The JCas is referenced from inside a Document object, and by your
> > suggestion, I must serialize the two separately. For instance, write
> > them alternately: <Document, JCas, Document, JCas, ...>, or implement
> > Serializable.writeObject() and call
> > ObjectOutputStream.defaultWriteObject() for the other fields. However, I am
> > looking for a way to have the serializer of the document just go through
> > its default writeObject() implementation, and only when it encounters the
> > JCas field would some special treatment be triggered.
> >
> > 2. More importantly - my Sentence object (referenced by a Document object)
> > has a reference to a Sentence Annotation. This Annotation cannot be
> > serialized by the method you suggest, as it only takes a full CAS. Of
> > course I could implement something here so that, when deserializing, I
> > iterate through the CAS, find each sentence's annotation and manually
> > put its reference in the Sentence object. But this is pretty complicated,
> > and would be a very lengthy process during deserialization. So I am looking
> > for a way for the SentenceAnnotation references to "survive" the
> > serialization/deserialization.
> >
> > Do you have any ideas?
> >
> > Thank you,
> > Ofer
> >
> >
> > On Mon, May 19, 2014 at 12:19 PM, Richard Eckart de Castilho
> > <[email protected]> wrote:
> >
> >> Hello Ofer,
> >>
> >> the CAS cannot be serialized directly, but there is a helper class
> >> which is serializable.
> >>
> >> To write:
> >>
> >> ObjectOutputStream docOS = ...
> >> CASCompleteSerializer serializer =
> >> Serialization.serializeCASComplete(aJCas.getCasImpl());
> >> docOS.writeObject(serializer);
> >>
> >> To read:
> >>
> >> ObjectInputStream is = ...
> >> CASCompleteSerializer serializer = (CASCompleteSerializer) is.readObject();
> >> Serialization.deserializeCASComplete(serializer, (CASImpl) aCAS);
> >>
> >> However, there are newer and more efficient binary formats that you might
> >> want to use [1].
> >>
> >> If you want to dig into the topic or if you want to use a ready-made pair
> >> of readers/writers for the binary formats, you could consider taking a look
> >> at the BinaryCasReader/Writer in the DKPro Core [2,3] (non-ASF).
> >>
> >> Cheers,
> >>
> >> -- Richard
> >>
> >> [1] http://uima.apache.org/d/uimaj-2.6.0/tutorials_and_users_guides.html#ugr.tug.type_filtering.compressed_file
> >> [2] https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.io.bincas-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/bincas/BinaryCasReader.java
> >> [3] https://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.io.bincas-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/bincas/BinaryCasWriter.java
> >>
> >> On 19.05.2014, at 11:03, Ofer Bronstein <[email protected]> wrote:
> >>
> >>> Hi Guys,
> >>>
> >>> I am an Israeli Master's Student, and have been happily working with UIMA
> >>> for the past two years.
> >>> I hope this is the right place for my question -
> >>>
> >>> I have a Document object I created, which has a JCas member with
> >>> annotations over a document.
> >>> I also have a Sentence object, with a member referencing its Sentence
> >>> Annotation in the corresponding JCas. Each Document object references all
> >>> of its Sentence objects.
> >>> I would like to dump each Document object as a file on disk, using the
> >>> default Java serialization. Later they would also be deserialized back into
> >>> the Java objects. I understand I would need some special treatment for the
> >>> JCases and the Sentence Annotations as they are not serializable (now I get
> >>> NotSerializableException). Hopefully the treatment could be as minimal as
> >>> possible.
> >>>
> >>> How would you suggest doing this, regarding serialization of the JCas
> >>> and combining it with Java serialization?
> >>>
> >>> I am working on Windows, with Java 1.6 and UIMA 2.4.0. I am using the same
> >>> type system and the same 3 views for all JCases and annotations.
> >>>
> >>> Thank you,
> >>> Ofer Bronstein
>
>
