On Fri, Dec 23, 2016 at 04:00:56PM -0800, Clint Pitzak wrote:

Clint,

I believe I see what the problem is. That restriction contains a vertical bar. generateDS.py generates code that turns that restriction value into a regular expression, and then uses the Python re (regular expression) module to test for a match. But a vertical bar means something special in a regular expression.

I'll have to give this a little thought. Why did I use the re module if we just want to test for equality? That seems like overkill when a simple test for equality using the "==" operator would accomplish what we really want. Or, was there some reason for using the re module that I do not remember now? Strict equality is the test that we want, right?

No. Wait. There must be *some* reason why I implemented this with regular expression matching. I looked at the XML Schema documentation again. Read this from https://www.w3.org/TR/2004/REC-xmlschema-0-20041028/#CreatDt:

    The purchase order schema contains another, more elaborate,
    example of a simple type definition. A new simple type called SKU
    is derived (by restriction) from the simple type string.
    Furthermore, we constrain the values of SKU using a facet called
    pattern in conjunction with the regular expression
    "\d{3}-[A-Z]{2}" that is read "three digits followed by a hyphen
    followed by two upper-case ASCII letters":

    Example
    Defining the Simple Type "SKU"

    <xsd:simpleType name="SKU">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="\d{3}-[A-Z]{2}"/>
      </xsd:restriction>
    </xsd:simpleType>

That suggests that the pattern really is a regular expression. Also see https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html#rf-pattern, where it says:

    4.3.4.4 pattern Validation Rules

    Validation Rule: pattern valid
    A literal in a ·lexical space· is facet-valid with respect to
    ·pattern· if:
    1 the literal is among the set of character sequences denoted by
      the ·regular expression· specified in {value}.

I tried testing with the pattern that generateDS.py generates, and it confirms your report.

For example (in the ipython interactive Python shell):

    In [20]: print re.search('^Electronic Materials|ELECTRONIC MATERIALS$', 'Electronic Materialsz')
    <_sre.SRE_Match object at 0x7f95a87fd4a8>

    In [21]: print re.search('^Electronic Materials|ELECTRONIC MATERIALS$', 'Electronic Materials')
    <_sre.SRE_Match object at 0x7f95a87fd4a8>

    In [22]: print re.search('^Electronic Materials|ELECTRONIC MATERIALS$', 'Electronic Material')
    None

    In [23]: print re.search('^Electronic Materials|ELECTRONIC MATERIALS$', 'Electronic Materials|ELECTRONIC MATERIALS')
    <_sre.SRE_Match object at 0x7f95a87fd4a8>

    In [24]: print re.search('^Electronic Materials|ELECTRONIC MATERIALS$', 'Electronic Materials|ELECTRONIC MATERIALSx')
    <_sre.SRE_Match object at 0x7f95a87fd4a8>

    In [25]: print re.search('^Electronic Materials|ELECTRONIC MATERIALS$', 'Electronic Materials|ELECTRONIC MATERIAL')
    <_sre.SRE_Match object at 0x7f95a87fd4a8>

When re.search returns a match object (_sre.SRE_Match), that means that it successfully matched.

So, if we take the XML Schema documentation seriously, perhaps that pattern in the schema should actually be the following (note the added backslash escape):

    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <!-- ADVANCED ELECTRONICS -->
        <xsd:pattern value="Electronic Materials\|ELECTRONIC MATERIALS" />

Then we'd get these results:

    In [26]: print re.search('^Electronic Materials\|ELECTRONIC MATERIALS$', 'Electronic Materials|ELECTRONIC MATERIALS')
    <_sre.SRE_Match object at 0x7f95a87fd4a8>

    In [27]: print re.search('^Electronic Materials\|ELECTRONIC MATERIALS$', 'Electronic Materials|ELECTRONIC MATERIALSx')
    None

    In [28]: print re.search('^Electronic Materials\|ELECTRONIC MATERIALS$', 'Electronic Materials|ELECTRONIC MATERIAL')
    None

    In [29]: print re.search('^Electronic Materials\|ELECTRONIC MATERIALS$', 'Electronic Materials')
    None

    In [30]: print re.search('^Electronic Materials\|ELECTRONIC MATERIALS$', 'Electronic Materialss')
    None

What do you think?
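
One other possible reading of the results above, offered only as a sketch: in Python's re module the vertical bar has the lowest precedence, so '^A|B$' parses as "(^A) or (B$)" rather than "^(A or B)$". If the schema author intended the bar as ordinary alternation between two allowed values, then wrapping the schema value in a non-capturing group before adding the anchors might give the expected behavior without any change to the schema. The helper names below are hypothetical, and the grouped form is an assumption, not what generateDS.py currently generates.

```python
import re

# Pattern value exactly as it appears in the schema from the report.
SCHEMA_VALUE = 'Electronic Materials|ELECTRONIC MATERIALS'


def matches_ungrouped(value, text):
    # Anchors concatenated directly onto the value, as in the session above.
    # This parses as: (^Electronic Materials) OR (ELECTRONIC MATERIALS$)
    return re.search('^' + value + '$', text) is not None


def matches_grouped(value, text):
    # Hypothetical alternative: group the value first so that both anchors
    # apply to the whole alternation: ^(?:A|B)$
    return re.search('^(?:' + value + ')$', text) is not None


for text in ('Electronic Materials',
             'ELECTRONIC MATERIALS',
             'Electronic Materialsz',
             'Electronic Materialss',
             'Electronic Material'):
    print('%r: ungrouped=%s grouped=%s' % (
        text,
        matches_ungrouped(SCHEMA_VALUE, text),
        matches_grouped(SCHEMA_VALUE, text)))
```

With the grouped form, the two exact values are accepted and "Electronic Materialsz" and "Electronic Materialss" are rejected, while the ungrouped form accepts anything that merely starts with "Electronic Materials". Note that XML Schema regular expressions are not identical to Python's, so this only covers the anchoring question.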

Could the schema actually be wrong? Not likely, but ...

Dave

> Hi Dave,
>
> I wanted to first say amazing python module. Thank you very much.
>
> I found a bug in the validation that was produced. Specifically the xsd I'm
> using has the restriction:
>
> <xsd:pattern value="Electronic Materials|ELECTRONIC MATERIALS"/>
>
> and generateDS allows "Electronic Materialss"
>
> To reproduce the error you can use generateDS to generate the api from:
> http://www.dtic.mil/dtic/pdf/ird_xml_data_submission.xml
>
> And then for COISubArea provide the text "Electronic Materialss" which is
> invalid according to the xsd from
> http://www.dtic.mil/dtic/pdf/ird_xml_data_submission.xml
>
> However, generateDS doesn't warn that it's invalid in the validator.
>
> If you change it to "Electronic Materialsz" it will also not warn.
>
> If you change it to "Electronic Material" it will warn.
>
> Thanks again,
>
> Clint

-- 
Dave Kuhlman
http://www.davekuhlman.org

_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users