Re: [xml] Exponential running time in RELAX NG matching of optional attributes
On Mon, Jul 22, 2019 at 11:59:42AM +0200, Jan Pokorný via xml wrote: > On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote: > > The following RELAX NG schema requires exponential running time when > > matching attributes (15 attributes is where my computer begins to > > show the symptoms, this may vary with your set-up): > > > > a.rng: > > > > > > http://relaxng.org/ns/structure/1.0;> > > > > > > a > > > > > ns="">a > > > ns="">b > > > ns="">c > > > ns="">d > > > ns="">e > > > ns="">f > > > ns="">g > > > ns="">h > > > ns="">i > > > ns="">j > > > ns="">k > > > ns="">l > > > ns="">m > > > ns="">n > > > ns="">o > > > > > > > > > > > > a.xml: > > > > > > > l="12" m="13" n="14" o="15"/> > > > > % time xmllint --noout --relaxng a.rng a.xml > > real: 3.175, user: 3.147, system: 0.014 (99%) > > > > Changing the group to an interleave, that is, > > > > b.rng: > > > > > > http://relaxng.org/ns/structure/1.0;> > > > > > > a > > > > > ns="">a > > > ns="">b > > > ns="">c > > > ns="">d > > > ns="">e > > > ns="">f > > > ns="">g > > > ns="">h > > > ns="">i > > > ns="">j > > > ns="">k > > > ns="">l > > > ns="">m > > > ns="">n > > > ns="">o > > > > > > > > > > > > gives > > > > % time xmllint --noout --relaxng b.rng a.xml > > real: 0.008, user: 0.003, system: 0.002 (54%) > > > > Making the attributes required instead of optional also fixes the issue: > > > > c.rng: > > > > > > http://relaxng.org/ns/structure/1.0;> > > > > > > a > > > > a > > b > > c > > d > > e > > f > > g > > h > > i > > j > > k > > l > > m > > n > > o > > > > > > > > > > > > % time xmllint --noout --relaxng c.rng a.xml > > real: 0.008, user: 0.003, system: 0.002 (55%) > > > > The problem seems to be that xmlRelaxNGValidateDefinition keeps > > adding states that I feel don’t need checking, but I can’t figure > > out how to avoid this from happening. Hum, I'm way behind sorry yes that sounds like there is a graph generation issue, best is to compile the regexp support with things like DEBUG_REGEXP_GRAPH enabled and trace down with pencil and paper what kind of graph it generates. That's how most of that code ended up being debugged > > I can only nod to your analysis, ran into similar observations > in the past, and they were likewise XML attributes (that parameters to > fence device instances boil down to) related: > https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7 > > Callgrind proved to be a useful tool to get better quantitative numbers. yup but that's way down > > Daniel, do you have any input on this? I’d gladly work on fixing > > this, but I feel that I need some guidance as to what’s going wrong > > and perhaps also how to fix it. > > Would be nice to get libxml's RelaxNG validation support closer > to parity with jing (the only other freely available validator I am > aware of that can be directly used for a peer comparison), for sure. well jing use an orthogonal approach, derivating the document graph directly while libxml2 uses validation structure derivation instead (when it can't compile down to a regexp). In that case it should be regexp based. The two will performs massively differently on different problems. Daniel > ___ > xml mailing list, project page http://xmlsoft.org/ > xml@gnome.org > https://mail.gnome.org/mailman/listinfo/xml -- Daniel Veillard | Red Hat Developers Tools http://developer.redhat.com/ veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | virtualization library http://libvirt.org/ ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
Re: [xml] Exponential running time in RELAX NG matching of optional attributes
On 22.07.19 11:59, Jan Pokorný via xml wrote: Would be nice to get libxml's RelaxNG validation support closer to parity with jing (the only other freely available validator I am aware of that can be directly used for a peer comparison), for sure. there's also the old Sun Multi-Schema Validator, which we use in the ODF Toolkit validator; it's Java based of course but for exponential algorithms you can still compare i guess. https://github.com/kohsuke/msv ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
Re: [xml] Exponential running time in RELAX NG matching of optional attributes
On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote: > The following RELAX NG schema requires exponential running time when > matching attributes (15 attributes is where my computer begins to > show the symptoms, this may vary with your set-up): > > a.rng: > > > http://relaxng.org/ns/structure/1.0;> > > > a > > ns="">a > ns="">b > ns="">c > ns="">d > ns="">e > ns="">f > ns="">g > ns="">h > ns="">i > ns="">j > ns="">k > ns="">l > ns="">m > ns="">n > ns="">o > > > > > > a.xml: > > > m="13" n="14" o="15"/> > > % time xmllint --noout --relaxng a.rng a.xml > real: 3.175, user: 3.147, system: 0.014 (99%) > > Changing the group to an interleave, that is, > > b.rng: > > > http://relaxng.org/ns/structure/1.0;> > > > a > > ns="">a > ns="">b > ns="">c > ns="">d > ns="">e > ns="">f > ns="">g > ns="">h > ns="">i > ns="">j > ns="">k > ns="">l > ns="">m > ns="">n > ns="">o > > > > > > gives > > % time xmllint --noout --relaxng b.rng a.xml > real: 0.008, user: 0.003, system: 0.002 (54%) > > Making the attributes required instead of optional also fixes the issue: > > c.rng: > > > http://relaxng.org/ns/structure/1.0;> > > > a > > a > b > c > d > e > f > g > h > i > j > k > l > m > n > o > > > > > > % time xmllint --noout --relaxng c.rng a.xml > real: 0.008, user: 0.003, system: 0.002 (55%) > > The problem seems to be that xmlRelaxNGValidateDefinition keeps > adding states that I feel don’t need checking, but I can’t figure > out how to avoid this from happening. I can only nod to your analysis, ran into similar observations in the past, and they were likewise XML attributes (that parameters to fence device instances boil down to) related: https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7 Callgrind proved to be a useful tool to get better quantitative numbers. > Daniel, do you have any input on this? I’d gladly work on fixing > this, but I feel that I need some guidance as to what’s going wrong > and perhaps also how to fix it. Would be nice to get libxml's RelaxNG validation support closer to parity with jing (the only other freely available validator I am aware of that can be directly used for a peer comparison), for sure. -- Jan (Poki) pgp5HF3mWGh93.pgp Description: PGP signature ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[xml] Exponential running time in RELAX NG matching of optional attributes
Hi! The following RELAX NG schema requires exponential running time when matching attributes (15 attributes is where my computer begins to show the symptoms, this may vary with your set-up): a.rng: http://relaxng.org/ns/structure/1.0;> a a b c d e f g h i j k l m n o a.xml: % time xmllint --noout --relaxng a.rng a.xml real: 3.175, user: 3.147, system: 0.014 (99%) Changing the group to an interleave, that is, b.rng: http://relaxng.org/ns/structure/1.0;> a a b c d e f g h i j k l m n o gives % time xmllint --noout --relaxng b.rng a.xml real: 0.008, user: 0.003, system: 0.002 (54%) Making the attributes required instead of optional also fixes the issue: c.rng: http://relaxng.org/ns/structure/1.0;> a a b c d e f g h i j k l m n o % time xmllint --noout --relaxng c.rng a.xml real: 0.008, user: 0.003, system: 0.002 (55%) The problem seems to be that xmlRelaxNGValidateDefinition keeps adding states that I feel don’t need checking, but I can’t figure out how to avoid this from happening. Daniel, do you have any input on this? I’d gladly work on fixing this, but I feel that I need some guidance as to what’s going wrong and perhaps also how to fix it. Thank you, Nikolai ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml