On Fri, Feb 06, 2015 at 01:05:42AM +0000, George David wrote:
> Hi Dave,
>
> I created an xsd that has an element called script. The intent is to allow
> users to send us javascript that is encoded with CDATA tags.
>
> In the attached files you can see that I set the script variable as follows:
> cdataObj = Cdata()
>
> script='''<![CDATA[
> var x, text;
>
> // Get the value of the input field with id="numb"
> x = document.getElementById("numb one").value;
>
> // If x is Not a Number or less than one or greater than 10
> if (isNaN(x) || x < 1 || x > 10) {
>     text = "Input not valid";
> } else {
>     text = "Input OK";
> }
> document.getElementById("demo").innerHTML = text;
> ]]>'''
> cdataObj.set_script(script)
>
> I exported it:
>
> cdataObj.export(sys.stdout, 0, name_='cdata')
>
> And got the following:
>
> <cdata:cdata xmlns:cdata="urn:cdata">
>     <cdata:script>&lt;![CDATA[
> var x, text;
>
> // Get the value of the input field with id="numb"
> x = document.getElementById("numb one").value;
>
> // If x is Not a Number or less than one or greater than 10
> if (isNaN(x) || x &lt; 1 || x &gt; 10) {
>     text = "Input not valid";
> } else {
>     text = "Input OK";
> }
> document.getElementById("demo").innerHTML = text;
> ]]&gt;</cdata:script>
> </cdata:cdata>
>
> Note that the CDATA wrappers have been encoded: <![CDATA[ has been changed
> to &lt;![CDATA[ and ]]> has been changed to ]]&gt;
George,

Good to hear from you again.

One solution to the above is to use a more intelligent replacement.
The attached patch uses the re module and two regular expressions
to replace (escape) "<" and ">" without replacing "<![CDATA[" and
"]]>".

>
> Also notice that the < and > signs in the java script have also been
> encoded. I believe there should be code to check for the CDATA tags and not
> xml encode it if they exist. I'll try to track this down in the code but I
> wanted to make sure this wasn't done on purpose.
>
> There is another problem with CDATA. If I create an xml string with CDATA
> and parse it like this:

Re: the missing CDATA wrappers:

The problem is that when the generated code uses lxml to parse an
XML instance doc, lxml strips away the "<![CDATA[" and "]]>". I
don't believe that we can even tell that they were there in the
first place. The attached script (cdata_demo.py) attempts to
demonstrate this.

So, after that XML instance doc has been parsed, there is no way to
tell that the CDATA tags were there in the first place.

Wait ... I did one more Web search ...

It's even the case that lxml has a special provision for this
issue. I found this: http://lxml.de/api.html#cdata

(It's incredible what kind of hidden information you can find with
a Web search engine. You should try one sometime. But, seriously,
...)

However, when you use ``element.text`` to capture the text data,
the CDATA tags are still missing, even though when you use
``etree.tostring(some_element)`` they are there.

I haven't figured out how to deal with this, yet. I'll think a bit
more on it. If you can think of a work-around for this, please let
me know.

On an unrelated subject -- generateDS.py does not handle multiple
namespaces in the same XML schema, in particular when
``<xs:import ...>`` is used. I've had several reports about this.
If I recall correctly, you contributed the code that implements
--one-file-per-xsd. I'm wondering if that might be helpful in some
of these situations.
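
Coming back to the escaping fix for a moment: the attached patch is
not reproduced in this message, so the following is only a minimal
sketch of the general idea, and it takes a slightly different route
(one regular expression that splits out whole CDATA sections, rather
than two that protect the wrappers). The helper name
``escape_outside_cdata`` is made up for illustration and is not part
of generateDS.py.

```python
import re
from xml.sax.saxutils import escape

# Matches one complete CDATA section, wrappers included.
# The capturing group makes re.split() keep the sections in its result.
CDATA_SECTION = re.compile(r'(<!\[CDATA\[.*?\]\]>)', re.DOTALL)


def escape_outside_cdata(text):
    """Escape &, < and > in text, but copy CDATA sections through unchanged.

    Hypothetical helper for illustration; not part of generateDS.py.
    """
    pieces = CDATA_SECTION.split(text)
    # With a single capturing group, the odd-numbered pieces are the
    # CDATA sections and the even-numbered pieces are ordinary text.
    return ''.join(
        piece if index % 2 else escape(piece)
        for index, piece in enumerate(pieces)
    )


if __name__ == '__main__':
    sample = 'a < b <![CDATA[ if (x < 1 || x > 10) { } ]]> c > d'
    print(escape_outside_cdata(sample))
```

Splitting on whole sections also leaves any "<" or ">" inside the
CDATA body unescaped, which is the reason for using the wrappers in
the first place.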
If you have any comments or suggestions about this namespace issue,
I'd be interested in hearing them.

And, have you had any experience with lxml.objectify?
(http://lxml.de/objectify.html) I'm wondering whether it might
solve some of these problems (in particular the namespaces and
CDATA issues) better than generateDS.py does. Maybe we can learn
something from it.

More later.

Dave

>
> xml='''
> <cdata:cdata xmlns:cdata="urn:cdata">
>     <cdata:script><![CDATA[
> var x, text;
>
> // Get the value of the input field with id="numb"
> x = document.getElementById("numb one").value;
>
> // If x is Not a Number or less than one or greater than 10
> if (isNaN(x) || x < 1 || x > 10) {
>     text = "Input not valid";
> } else {
>     text = "Input OK";
> }
> document.getElementById("demo").innerHTML = text;
> ]]></cdata:script>
> </cdata:cdata>
> '''
> cdata.parseString(xml)
>
> It incorrectly strips out the CDATA tags:
>
> parseString spits out xml with the CDATA tags removed:
>
> <?xml version="1.0" ?>
> <cdata:cdata xmlns:cdata="urn:cdata">
>     <cdata:script>
> var x, text;
>
> // Get the value of the input field with id="numb"
> x = document.getElementById("numb one").value;
>
> // If x is Not a Number or less than one or greater than 10
> if (isNaN(x) || x &lt; 1 || x &gt; 10) {
>     text = "Input not valid";
> } else {
>     text = "Input OK";
> }
> document.getElementById("demo").innerHTML = text;
> </cdata:script>
> </cdata:cdata>
>
>
> And on printing the script specifically, I also don't have CDATA tags
> anymore and the < and > are xml encoded.
>
> print cdataObj.get_script()
>
> var x, text;
>
> // Get the value of the input field with id="numb"
> x = document.getElementById("numb one").value;
>
> // If x is Not a Number or less than one or greater than 10
> if (isNaN(x) || x < 1 || x > 10) {
>     text = "Input not valid";
> } else {
>     text = "Input OK";
> }
> document.getElementById("demo").innerHTML = text;
>
>
> I'll see if I can track this down also.
> If you could give me a hint of
> where to look that would be helpful.
>
> Thanks,
> George

-- 

Dave Kuhlman
http://www.davekuhlman.org

_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users
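
For reference, the parse-and-export round trip described in the
quoted message can be reproduced in a few lines. This sketch uses
the standard library's ``xml.etree.ElementTree`` as a stand-in for
lxml (an assumption, made so that it runs without extra packages);
lxml's default parser drops the wrappers in the same way, and the
lxml page linked above describes a ``strip_cdata=False`` parser
option and an ``etree.CDATA`` wrapper for keeping them. The element
names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A small instance document with a CDATA-wrapped script body.
xml_text = (
    '<root><script>'
    '<![CDATA[if (x < 1 || x > 10) { }]]>'
    '</script></root>'
)

root = ET.fromstring(xml_text)

# The parser hands back the character data only; nothing records
# that the text was originally wrapped in a CDATA section.
script_text = root.find('script').text
print(repr(script_text))

# Serializing again escapes the markup characters instead of
# restoring the CDATA wrappers.
serialized = ET.tostring(root, encoding='unicode')
print(serialized)
```

After parsing, the text is ordinary character data, so any code
that wants to write CDATA back out has to decide that for itself
rather than recover it from the parsed tree.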