preserving comments when parsing xml uima descriptors support checked in

Marshall Schor Fri, 20 Jan 2012 11:47:52 -0800

Here's a brief description of this.

The XML parser methods already take a parameter, an instance of ParsingOptions. This was augmented to have one additional boolean, preserveComments (defaults to false).

If it is not set, the parser works as before. No lexical handler is installed, so it should operate as fast as before. There *is* one extra slot in the Java object representation corresponding to some of the elements in the XML (not all elements have their own Java object class); this slot is set to null in this case.
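To illustrate why skipping the lexical handler keeps the default path unchanged, here is a minimal sketch using only the JDK's SAX API. It is not the UIMA code: the class name CommentCollector and its parse method are made up for this example, and the boolean argument merely stands in for the preserveComments option. A SAX reader reports comments only to a registered LexicalHandler, so when none is installed the parser never delivers them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper for illustration only; not part of UIMA.
class CommentCollector {

    /** Returns the comments seen while parsing; empty when preserveComments is false. */
    static List<String> parse(String xml, boolean preserveComments) {
        List<String> comments = new ArrayList<>();
        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                comments.add(new String(ch, start, length));
            }
        };
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            if (preserveComments) {
                // Comments are reported only if a lexical handler is registered.
                reader.setProperty(
                    "http://xml.org/sax/properties/lexical-handler", handler);
            }
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parse failed", e);
        }
        return comments;
    }
}
```

Calling CommentCollector.parse(xml, false) yields an empty list for any input, while passing true returns the comment text in document order, including a comment placed before the root element.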

When preserveComments is true, the slot is set to be a reference to the DOM Element node object corresponding to that object. As a result, the DOM, which previously was a *temporary* object, is retained for as long as the Java objects corresponding to it are retained. This will increase the "footprint" for a parsed UIMA Descriptor, of course.

The *toXML* method was modified to check this slot, and if it is not null, the DOM in the vicinity of the element is scanned for comment and whitespace nodes, and the appropriate ones are used. An attempt is made to stay heuristically close to the original, even in the presence of some editing (adding / deleting nodes). See the bottom of the MetaDataObject_impl class for some details of this.
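As a rough picture of what scanning the DOM around an element can look like, here is a minimal sketch using only the JDK's DOM API. It is an assumption-laden illustration, not the heuristic in MetaDataObject_impl: the class LeadingTrivia and its methods are invented here, and it only walks backwards over the comment and whitespace-only text nodes that directly precede an element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of UIMA.
class LeadingTrivia {

    /** Parses XML into a DOM; the default factory keeps comments and whitespace. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parse failed", e);
        }
    }

    /**
     * Returns the comments and whitespace-only text that sit directly
     * before the element, serialized in document order.
     */
    static String before(Element element) {
        StringBuilder out = new StringBuilder();
        for (Node n = element.getPreviousSibling(); n != null;
                n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.COMMENT_NODE) {
                out.insert(0, "<!--" + n.getNodeValue() + "-->");
            } else if (n.getNodeType() == Node.TEXT_NODE
                    && n.getNodeValue().trim().isEmpty()) {
                out.insert(0, n.getNodeValue());
            } else {
                break; // stop at the first sibling that is real content
            }
        }
        return out.toString();
    }
}
```

A serializer could emit the result of LeadingTrivia.before(element) ahead of the element's start tag. Handling nodes that were added or deleted after parsing needs further heuristics, which this sketch does not attempt.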

The Component Descriptor Editor is modified to preserve comments (only for those XML pieces which it is editing and might be writing out).

So, the good news is, if you edit a descriptor with the CDE and it has an Apache license header at the top, it will no longer be deleted... :-)

All the test cases pass, and I did some amount of manual editing / testing; more testing welcome.


-Marshall
