Dear Dave and all,
Thanks for providing such a useful package. I have been using it for over a
year on a number of projects! Recently, however, I stumbled upon the following
problem. In a schema that I am using, I have a complex type with simple content
described as follows:
<xs:complexType name="sciSpeciesType">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="ncbiTaxId" type="xs:integer"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
I had a user who had filled in an XML file with the following data:
<sciSpeciesStrain>Armache &amp; Anger (AA) PBMCs</sciSpeciesStrain>
I used generateDS to generate a class to read this and to write it out and the
output was as follows:
<sciSpeciesStrain>Armache & Anger (AA) PBMCs</sciSpeciesStrain>
The '&' symbol being illegal in XML in this form caused downstream issues when
the file was re-read.
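The re-read failure can be reproduced with any conforming parser. Below is a minimal sketch that uses Python's standard xml.etree.ElementTree instead of the generateDS-generated classes; the helper name parses_ok is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def parses_ok(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise.

    Hypothetical helper for illustration only; not part of generateDS.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The element as the user entered it (ampersand escaped).
escaped = "<sciSpeciesStrain>Armache &amp; Anger (AA) PBMCs</sciSpeciesStrain>"
# The element as it was written back out (bare ampersand).
unescaped = "<sciSpeciesStrain>Armache & Anger (AA) PBMCs</sciSpeciesStrain>"

print(parses_ok(escaped))    # True
print(parses_ok(unescaped))  # False
```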
I tested a simple type (string or token) containing '&', and this was written
out as I expected, as '&amp;'.
Looking through the generateDS code I found that when a simple string or token
is being written out, the value is escaped using the quote_xml function, but
this was not the case for simple content of a complex type.
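To illustrate the behaviour I was expecting for simple content, here is a rough sketch. It uses xml.sax.saxutils.escape from the standard library as a stand-in for quote_xml (I am assuming the two behave alike for '&', '<' and '>'), and export_simple_content is a hypothetical name, not a generateDS function.

```python
from xml.sax.saxutils import escape


def export_simple_content(value):
    """Escape string values for use as XML character data.

    Hypothetical helper: escape() stands in for generateDS's quote_xml.
    Non-string values (e.g. integers) are simply converted with str().
    """
    if isinstance(value, str):
        return escape(value)
    return str(value)


print(export_simple_content("Armache & Anger (AA) PBMCs"))
# Armache &amp; Anger (AA) PBMCs
print(export_simple_content(9606))
# 9606
```

Escaping only string values mirrors the type check in my change, so numeric content still goes through str() untouched.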
So I changed line 2725 of generateDS.py (2.17a0) from:
    wrt("            outfile.write(str(self.valueOf_).encode("
        "ExternalEncoding))\n")
to:
    wrt("            outfile.write((quote_xml(self.valueOf_) if "
        "type(self.valueOf_) is str else str(self.valueOf_)).encode("
        "ExternalEncoding))\n")
The change sits in this part of generateExportFn (the changed statement is
marked with a '# >>' comment):
def generateExportFn(wrt, prefix, element, namespace, nameSpacesDef):
    childCount = countChildren(element, 0)
    name = element.getName()
    base = element.getBase()
    wrt("    def export(self, outfile, level, namespace_='%s', "
        "name_='%s', namespacedef_='%s', pretty_print=True):\n" %
        (namespace, name, nameSpacesDef))
    wrt('        if pretty_print:\n')
    wrt("            eol_ = '\\n'\n")
    wrt('        else:\n')
    wrt("            eol_ = ''\n")
    # We need to be able to export the original tag name.
    wrt("        if self.original_tagname_ is not None:\n")
    wrt("            name_ = self.original_tagname_\n")
    wrt('        showIndent(outfile, level, pretty_print)\n')
    wrt("        outfile.write('<%s%s%s' % (namespace_, name_, "
        "namespacedef_ and ' ' + namespacedef_ or '', ))\n")
    wrt("        already_processed = set()\n")
    wrt("        self.exportAttributes(outfile, level, "
        "already_processed, namespace_, name_='%s')\n" %
        (name, ))
    # fix_abstract
    if base and base in ElementDict:
        base_element = ElementDict[base]
        # fix_derived
        if base_element.isAbstract():
            pass
    if childCount == 0 and element.isMixed():
        wrt("        outfile.write('>')\n")
        wrt("        self.exportChildren(outfile, level + 1, "
            "namespace_, name_, pretty_print=pretty_print)\n")
        wrt("        outfile.write('</%s%s>%s' % (namespace_, name_, eol_))\n")
    else:
        wrt("        if self.hasContent_():\n")
        # Added to keep value on the same line as the tag no children.
        if element.getSimpleContent():
            wrt("            outfile.write('>')\n")
            if not element.isMixed():
                # >> changed: escape string content with quote_xml
                wrt("            outfile.write((quote_xml(self.valueOf_) if "
                    "type(self.valueOf_) is str else str(self.valueOf_)).encode("
                    "ExternalEncoding))\n")
        else:
            ….
This is a quick hack and I am sure there are better ways of doing this. It
solved my problem but I would appreciate your feedback.
Many thanks and best wishes