Well, I do not think it is unfortunate at this point.

As a new user I would rather have accuracy than speed (though I
respect that others may be in a different situation).

-Tim


On Wed, 2009-12-02 at 06:33 -0700, Peter A. Bigot wrote:
> Unfortunately, performance took a back seat to validation for the 
> current implementation.  The example in examples/tmsxtvd has been the 
> sole performance benchmark so far.  On my machine, it shows:
> 
>   vmfed9[26]$ python dumpsample.py
>   Generating binding from tmsdatadirect_sample.xml with minidom
>   minidom first callSign at None
>   Generating binding from tmsdatadirect_sample.xml with SAXDOM
>   SAXDOM first callSign at tmsdatadirect_sample.xml[5:0]
>   Generating binding from tmsdatadirect_sample.xml with SAX
>   SAXER first callSign at tmsdatadirect_sample.xml[5:0]
>   DOM-based read 0.000962, parse 0.391175, bind 10.292386, total 10.683561
>   SAXDOM-based parse 1.658077, bind 10.178704, total 11.836781
>   SAX-based read 0.000112, parse and bind 10.605082, total 10.605194
> 
> These are using three different XML back ends to parse the document, but 
> the same generated bindings and runtime support.  As you can see, the 
> bulk of the time is in checking all the content and putting the values 
> into Python objects.  The test document here is 205 KB and takes about 
> 10 seconds, so a 6 MB document in 90 seconds is faster than I'd thought 
> it might be.
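
The comparison in the paragraph above can be checked with a little
arithmetic. This is only a sketch: it assumes 1 KB = 1024 bytes, treats
the quoted sizes as exact, and uses the SAX-based total as the sample
timing. All variable names are illustrative.

```python
# Throughput implied by the figures quoted above (sizes are approximate).
KB = 1024

sample_bytes = 205 * KB        # the "205 KB" test document
sample_seconds = 10.605194     # SAX-based total from the run above

large_bytes = 6 * 1024 * KB    # the "6 MB" document from the original report
large_seconds = 90.0           # "about 90 seconds"

sample_rate = sample_bytes / sample_seconds / KB   # KB per second
large_rate = large_bytes / large_seconds / KB      # KB per second

print(f"sample document: {sample_rate:.1f} KB/s")
print(f"large document:  {large_rate:.1f} KB/s")
print(f"ratio:           {large_rate / sample_rate:.1f}x")
```

On these numbers the sample works out to roughly 19 KB/s and the 6 MB
document to roughly 68 KB/s, so the larger file is processed about 3.5
times faster per byte.
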
> 
> However, performance is unacceptable for certain applications.  There 
> are a couple of approaches.  One that I have in mind specifically is to 
> implement an optimized back end that stores values like integers and strings 
> in native Python form rather than in the subclasses that support 
> validation.  In that case, validation would become a second, optional, 
> step that you'd have to invoke specifically on each object.  The 
> following is the same program, same bindings, but with:
> 
>   pyxb.RequireValidWhenParsing(False)
> 
> set at the top of the script.  That option provides an extremely crude 
> validation bypass, and I can't say it will work correctly in all 
> situations.  However, the results are promising (and better than I'd 
> expected):
> 
>   Generating binding from tmsdatadirect_sample.xml with minidom
>   minidom first callSign at None
>   Generating binding from tmsdatadirect_sample.xml with SAXDOM
>   SAXDOM first callSign at tmsdatadirect_sample.xml[5:0]
>   Generating binding from tmsdatadirect_sample.xml with SAX
>   SAXER first callSign at tmsdatadirect_sample.xml[5:0]
>   DOM-based read 0.001482, parse 0.398322, bind 2.947036, total 3.345358
>   SAXDOM-based parse 1.677429, bind 2.689278, total 4.366707
>   SAX-based read 0.000217, parse and bind 3.052327, total 3.052544
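
The idea of keeping values in native Python form and making validation
a separate, optional step can be illustrated with a toy class. This is
not PyXB code and says nothing about how PyXB is implemented; the class
name DeferredInt and its minimum/maximum parameters are hypothetical.

```python
class DeferredInt:
    """Toy element: holds a native int, checks its range only on request."""

    def __init__(self, text, minimum=None, maximum=None):
        # Only the lexical conversion happens here; no range checks.
        self.value = int(text)
        self.minimum = minimum
        self.maximum = maximum

    def validate(self):
        """Second, optional pass: raise ValueError if a bound is violated."""
        if self.minimum is not None and self.value < self.minimum:
            raise ValueError(f"{self.value} is below minimum {self.minimum}")
        if self.maximum is not None and self.value > self.maximum:
            raise ValueError(f"{self.value} is above maximum {self.maximum}")
        return True


# Fast path: build every object without range checking.
channels = [DeferredInt(t, minimum=1, maximum=999) for t in ("7", "42", "1000")]

# Slow path, run only when wanted: validate each object explicitly.
failures = []
for channel in channels:
    try:
        channel.validate()
    except ValueError as exc:
        failures.append(str(exc))

print(failures)
```

The point of the split is that the cost of checking is paid only by the
callers who ask for it, at the price of holding possibly invalid objects
until they do.
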
> 
> The separate validation step would be something like:
> 
>   pyxb.RequireValidWhenParsing(True)
>   dom_instance.validateBinding()
> 
> (You must reset the RequireValidWhenParsing flag, or the validateBinding 
> method will immediately succeed.)  With this, I get the following 
> additional time for validation:
> 
>   DOM-based validate 1.676465
>   SAXDOM-based validate 1.699026
>   SAX-based validate 1.710580
> 
> The fact that generation plus validation is half the time of generation 
> with validation leaves me skeptical that this is working correctly.
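
The "half the time" observation can be reproduced from the totals quoted
above. A minimal sketch, with the numbers copied from the three runs and
dictionary names chosen only for illustration:

```python
# Totals in seconds, copied from the runs quoted above.
one_pass = {"DOM": 10.683561, "SAXDOM": 11.836781, "SAX": 10.605194}
bypassed = {"DOM": 3.345358, "SAXDOM": 4.366707, "SAX": 3.052544}
validate_only = {"DOM": 1.676465, "SAXDOM": 1.699026, "SAX": 1.710580}

fractions = {}
for backend in one_pass:
    # Generation without validation, plus the separate validation pass.
    two_pass = bypassed[backend] + validate_only[backend]
    fractions[backend] = two_pass / one_pass[backend]
    print(f"{backend}: two-pass {two_pass:.3f} s, "
          f"{fractions[backend]:.0%} of the validating run")
```

For these figures the two-pass total comes to between roughly 45 and 51
percent of the single validating run, depending on the back end.
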
> 
> However, if that option meets your immediate performance needs, and you 
> can live with either no validation or a second-pass, possibly incorrect, 
> validation, that's the best solution I have right now.  If you try it, 
> please let us know how it affected the speed; and if it breaks please 
> file a ticket on: http://sourceforge.net/apps/trac/pyxb/
> 
> I have hopes that a proper optimized back end, with or without 
> validation, will be available in about three months, but I need to see 
> whether the folks I originally developed this for are interested in 
> funding it.
> 
> Peter
> 
> Romain CHANU wrote:
> > Hi,
> >
> > Regarding my last email to the mailing list, I was trying to decide 
> > whether to use PyXB or generateDS.
> >
> > As a matter of fact, generateDS does not perform any validation 
> > against XML schema and had some issues in the creation of the bindings 
> > for complex schemas.
> >
> > I am now facing a performance issue with PyXB: I parse and validate a 
> > 6 MB file containing XML data. This step takes about 90 seconds...
> >
> > Is this normal? Any hints to improve this?
> >
> > Thank you.
> >
> > Romain Chanu
> > ------------------------------------------------------------------------
> >
> > ------------------------------------------------------------------------------
> > Join us December 9, 2009 for the Red Hat Virtual Experience,
> > a free event focused on virtualization and cloud computing. 
> > Attend in-depth sessions from your desk. Your couch. Anywhere.
> > http://p.sf.net/sfu/redhat-sfdev2dev
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > pyxb-users mailing list
> > pyxb-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/pyxb-users
> >   
> 
> 
> 


-- 
***************************************************************
Timothy Cook, MSc

LinkedIn Profile: http://www.linkedin.com/in/timothywaynecook
Skype ID == (upon request)
Academia.edu Profile: http://uff.academia.edu/TimothyCook

You may get my Public GPG key from popular keyservers or
from this link http://timothywayne.cook.googlepages.com/home

