Re: [Generateds-users] Problem with ampersand in string simple content of complex type

Dave Kuhlman Sun, 15 Nov 2015 16:38:24 -0800

On Fri, Nov 13, 2015 at 05:39:21PM +0000, Ardan Patwardhan wrote:
> Dear Dave and all
> 
> Thanks for providing such a useful package - I have been using it for
> over a year on a number of projects! However, recently I stumbled
> upon the following problem. In a schema that I am using I have a
> complex type with simple content described as follows:


Ardan,

Thanks for reporting this.  I've added your fix to generateDS.py.
I'll do some more testing.  I've got a couple of other fixes to work
into the code, and then I'll upload the fixed version to the
repository.

Thanks for your help with fixing this.

Dave


> 
> <xs:complexType name="sciSpeciesType">
>     <xs:simpleContent>
>       <xs:extension base="xs:string">
>         <xs:attribute name="ncbiTaxId" type="xs:integer"/>
>       </xs:extension>
>     </xs:simpleContent>
>   </xs:complexType>
> 
> I had a user who had filled in an XML with the following data:
> 
> <sciSpeciesStrain>Armache &amp; Anger (AA) PBMCs</sciSpeciesStrain>
> 
> I used generateDS to generate a class to read this and to write it
> out and the output was as follows:
> 
> <sciSpeciesStrain>Armache & Anger (AA) PBMCs</sciSpeciesStrain>
> 
> The '&' symbol being illegal in XML in this form caused downstream
> issues when the file was re-read.
> 
> I tested a simple type string or token with '&amp;' and this was
> written out as I expected as '&amp;'
> 
> Looking through the generateDS code I found that when a simple
> string or token is being written out, the value is escaped using the
> quote_xml function, but this was not the case for simple content of
> a complex type.
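
A minimal sketch of the failure described above, not generateDS's own
code. It assumes the standard library's xml.sax.saxutils.escape as a
stand-in for quote_xml (the generated quote_xml may differ in detail),
and write_element and can_reparse are hypothetical helpers standing in
for the generated export method and the downstream re-read:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing the user's document yields the unescaped text value.
src = '<sciSpeciesStrain>Armache &amp; Anger (AA) PBMCs</sciSpeciesStrain>'
value = ET.fromstring(src).text


def write_element(tag, text, do_escape):
    # Hypothetical writer: stand-in for the generated export method.
    body = escape(text) if do_escape else text
    return '<%s>%s</%s>' % (tag, body, tag)


def can_reparse(xml_text):
    # True if the written XML can be read back in.
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Without escaping, the bare '&' makes the output ill-formed.
bad = write_element('sciSpeciesStrain', value, do_escape=False)
# With escaping, the output round-trips to the original document.
good = write_element('sciSpeciesStrain', value, do_escape=True)
```

Here can_reparse(bad) is False and can_reparse(good) is True, which
matches the downstream failure reported when the file was re-read.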
> 
> So I modified the line 2725 in generateDS.py (2.17a0):
> 
>  wrt("            outfile.write(str(self.valueOf_).encode("
>      "ExternalEncoding))\n")
> 
> To >>
> 
>  wrt("            outfile.write((quote_xml(self.valueOf_) if "
>      "type(self.valueOf_) is str else str(self.valueOf_)).encode("
>      "ExternalEncoding))\n")
> 
> In
> 
> def generateExportFn(wrt, prefix, element, namespace, nameSpacesDef):
>     childCount = countChildren(element, 0)
>     name = element.getName()
>     base = element.getBase()
>     
>     wrt("    def export(self, outfile, level, namespace_='%s', "
>         "name_='%s', namespacedef_='%s', pretty_print=True):\n" %
>         (namespace, name, nameSpacesDef))
>     wrt('        if pretty_print:\n')
>     wrt("            eol_ = '\\n'\n")
>     wrt('        else:\n')
>     wrt("            eol_ = ''\n")
>     # We need to be able to export the original tag name.
>     wrt("        if self.original_tagname_ is not None:\n")
>     wrt("            name_ = self.original_tagname_\n")
>     wrt('        showIndent(outfile, level, pretty_print)\n')
>     wrt("        outfile.write('<%s%s%s' % (namespace_, name_, "
>         "namespacedef_ and ' ' + namespacedef_ or '', ))\n")
>     wrt("        already_processed = set()\n")
>     wrt("        self.exportAttributes(outfile, level, "
>         "already_processed, namespace_, name_='%s')\n" %
>         (name, ))
>     # fix_abstract
>     if base and base in ElementDict:
>         base_element = ElementDict[base]
>         # fix_derived
>         if base_element.isAbstract():
>             pass
>     if childCount == 0 and element.isMixed():
>         wrt("        outfile.write('>')\n")
>         wrt("        self.exportChildren(outfile, level + 1, "
>             "namespace_, name_, pretty_print=pretty_print)\n")
>         wrt("        outfile.write('</%s%s>%s' % (namespace_, name_, "
>             "eol_))\n")
>     else:
>         wrt("        if self.hasContent_():\n")
>         # Added to keep value on the same line as the tag no children.
>         if element.getSimpleContent():
>             wrt("            outfile.write('>')\n")
>             if not element.isMixed():
> >>                wrt("            outfile.write((quote_xml(self.valueOf_) if "
> >>                    "type(self.valueOf_) is str else str(self.valueOf_)).encode("
> >>                    "ExternalEncoding))\n")
>         else:
> 
> 
> This is a quick hack and I am sure there are better ways of doing this.
> It solved my problem but I would appreciate your feedback.
> 
> Many thanks and best wishes
> 
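
One possible way to express the conditional in the patch above as a
small helper, offered only as a sketch under assumptions: export_value
is a hypothetical name that is not part of generateDS, escape from
xml.sax.saxutils stands in for quote_xml, and isinstance replaces the
"type(...) is str" test so that str subclasses are escaped as well.

```python
from xml.sax.saxutils import escape


def export_value(value):
    # Hypothetical helper, not part of generateDS.
    # Strings are escaped so '&', '<' and '>' stay legal in XML text;
    # any other value (for example an integer) is converted with str().
    if isinstance(value, str):
        return escape(value)
    return str(value)
```

For example, export_value('Armache & Anger (AA) PBMCs') gives
'Armache &amp; Anger (AA) PBMCs', while export_value(42) gives '42'.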

-- 

Dave Kuhlman
http://www.davekuhlman.org
