Re: Enhancement to XMI deserializer as a foundation for remote parallel processing

Michael Baessler Thu, 22 Feb 2007 07:00:36 -0800

Adam Lally wrote:

On 2/22/07, Michael Baessler <[EMAIL PROTECTED]> wrote:

I currently don't understand how the xmi:id is generated. For example we
have a CAS in step 1 that is serialized with a max xmi:id of 100.
Now two other remote services get this XMI document and do their
processing on it and serialize the CAS again. First service adds for
example FS with ids from 101 to 150, right?
And what does the second service do? I think it also starts numbering at
101.


Yes, that's correct.  The IDs used by the two services for new FS they
create will not be unique.

So how are the XMI documents merged in the XMI deserializer? Is it
necessary to merge the xmi:id attributes before creating the CAS, or is it
sufficient to just read the additional xmi:ids (greater than 100) and
add them to the CAS?


Good questions...

It actually should work without having to merge (if you mean to make
unique) the IDs produced by the services.  Take your example, when the
original max ID is 100 (I call 100 the "merge point"), and the two
responses "xmi1" and "xmi2" both have appended FS with xmi:id=101:

The deserialization of xmi1 works as normal - all FS including the one
with ID 101 are added to the CAS.  When the deserialization of xmi2 is
done, the deserializer code is told that the "merge point" is 100.
The effect of this is that all of the FS with xmi:id <= 100 are
ignored.  The FS with xmi:id = 101 will be added to the CAS as a new
FeatureStructure.  It doesn't matter that xmi1 also had an FS with the
same id - the deserializer will know that because 101 is greater than
the mergePoint, a new FS should be created.
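
The merge-point rule described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual UIMA deserializer code: the function name `deserialize`, the list-based CAS, and the `(xmi_id, type_name)` document format are all assumptions made for the example.

```python
def deserialize(cas, doc, merge_point=0):
    """Add feature structures (FS) from one XMI-like document to the CAS.

    cas: a plain list standing in for the CAS (hypothetical model).
    doc: list of (xmi_id, type_name) pairs standing in for an XMI document.
    FS with xmi_id <= merge_point are assumed to be in the CAS already
    and are ignored; every FS above the merge point becomes a new FS.
    """
    added = []
    for xmi_id, type_name in doc:
        if xmi_id <= merge_point:
            continue  # pre-existing FS, do not add it a second time
        # The new FS gets its own CAS-internal address, so a repeated
        # xmi:id in another document cannot collide with it.
        fs = {"addr": len(cas), "type": type_name, "xmi_id": xmi_id}
        cas.append(fs)
        added.append(fs)
    return added


# Original CAS content, serialized with a max xmi:id of 100.
original = [(1, "Sofa"), (100, "Token")]
# Both services append a new FS and both number it 101.
xmi1 = original + [(101, "Person")]
xmi2 = original + [(101, "Place")]

cas = []
deserialize(cas, xmi1)                   # normal deserialization
deserialize(cas, xmi2, merge_point=100)  # only FS above 100 are added
```

After both calls the CAS holds the two original FS once each, plus one new FS from each service, even though both new FS carried xmi:id 101.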

As for references between FS: during the deserialization of xmi2, the
deserializer knows that any reference to an xmi:id with value <= 100
is a reference to an FS that pre-existed in the CAS, while any
reference to an xmi:id with value > 100 is a reference to an FS that
is part of the xmi2 document that's currently being deserialized.  So
all the information is there to do this correctly.  If xmi2 contained
a reference to id 101, the deserializer would know that this was
supposed to refer to id 101 in xmi2, NOT in xmi1.
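
A small sketch of that reference rule, again in Python and under assumptions: `resolve_reference` and the two lookup tables are hypothetical names for this example, not part of the UIMA API. One table holds the FS that existed before the merge point, and each response document gets its own table for the FS it added.

```python
def resolve_reference(ref_id, merge_point, shared_fs, local_fs):
    """Return the FS that a reference to `ref_id` points at.

    shared_fs: xmi:id -> FS for everything at or below the merge point.
    local_fs:  xmi:id -> FS created from the document being deserialized.
    """
    if ref_id <= merge_point:
        return shared_fs[ref_id]  # FS that pre-existed in the CAS
    return local_fs[ref_id]       # FS from the current document only


shared = {100: "Token (original CAS)"}
xmi1_new = {101: "Person (from xmi1)"}
xmi2_new = {101: "Place (from xmi2)"}

# While deserializing xmi2, a reference to 101 means xmi2's own FS.
target_in_xmi2 = resolve_reference(101, 100, shared, xmi2_new)
# A reference to 100 means the shared, pre-existing FS.
target_shared = resolve_reference(100, 100, shared, xmi2_new)
```

Keeping the per-document table separate is what makes the duplicate id 101 harmless: the xmi1 table is never consulted while xmi2 is being read.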

I think when xmi:ids must be merged before the CAS is created, the XMI
deserializer has to take care of the references with special offset
values. That is easy when using

<myproj:Baz xmi:id="2">
   <myFoo href="#1"/>
</myproj:Baz>

but more difficult when using

<myproj:Baz xmi:id="2" myFoo="1"/>


The deserializer knows the TypeSystem of the CAS, so will know that
myFoo is defined as a reference and not an integer.  This is in fact
needed for "normal" deserialization to work, even without merging.
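
To illustrate how a type system lets the attribute form be read unambiguously, here is a minimal Python sketch. The `TYPE_SYSTEM` dictionary, the `count` feature, and the `myproj` namespace URI are assumptions invented for this example; the real deserializer consults the CAS TypeSystem instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical type system: for each type, the range kind of each feature.
TYPE_SYSTEM = {"Baz": {"myFoo": "reference", "count": "integer"}}


def read_features(xml_text):
    """Classify each plain attribute as an FS reference or an integer."""
    elem = ET.fromstring(xml_text)
    type_name = elem.tag.split("}")[-1]  # drop the "{namespace}" prefix
    features = {}
    for name, raw in elem.attrib.items():
        if name.startswith("{"):
            continue  # namespaced attribute such as xmi:id, not a feature
        kind = TYPE_SYSTEM[type_name][name]
        if kind == "reference":
            features[name] = ("ref", int(raw))  # xmi:id of another FS
        else:
            features[name] = ("int", int(raw))  # ordinary integer value
    return features


doc = ('<myproj:Baz xmlns:xmi="http://www.omg.org/XMI" '
       'xmlns:myproj="http:///myproj.ecore" '
       'xmi:id="2" myFoo="1" count="1"/>')
result = read_features(doc)
```

Both attributes carry the text "1", but only `myFoo` is treated as a reference, because the type system says so.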

Thanks!

So the xmi:ids that are not unique do not need to be merged; they can just be
added to the CAS one XMI document after the other.


-- Michael
