Re: [xml] Exponential running time in RELAX NG matching of optional attributes

2019-08-23 Thread Daniel Veillard via xml
On Mon, Jul 22, 2019 at 11:59:42AM +0200, Jan Pokorný via xml wrote:
> On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote:
> > The following RELAX NG schema requires exponential running time when
> > matching attributes (15 attributes is where my computer begins to
> > show the symptoms, this may vary with your set-up):
> > 
> > a.rng:
> > 
> > 
> > http://relaxng.org/ns/structure/1.0;>
> >   
> > 
> >   a
> >   
> >  > ns="">a
> >  > ns="">b
> >  > ns="">c
> >  > ns="">d
> >  > ns="">e
> >  > ns="">f
> >  > ns="">g
> >  > ns="">h
> >  > ns="">i
> >  > ns="">j
> >  > ns="">k
> >  > ns="">l
> >  > ns="">m
> >  > ns="">n
> >  > ns="">o
> >   
> > 
> >   
> > 
> > 
> > a.xml:
> > 
> > 
> >  > l="12" m="13" n="14" o="15"/>
> > 
> > % time xmllint --noout --relaxng a.rng a.xml
> > real: 3.175, user: 3.147, system: 0.014 (99%)
> > 
> > Changing the group to an interleave, that is,
> > 
> > b.rng:
> > 
> > 
> > http://relaxng.org/ns/structure/1.0;>
> >   
> > 
> >   a
> >   
> >  > ns="">a
> >  > ns="">b
> >  > ns="">c
> >  > ns="">d
> >  > ns="">e
> >  > ns="">f
> >  > ns="">g
> >  > ns="">h
> >  > ns="">i
> >  > ns="">j
> >  > ns="">k
> >  > ns="">l
> >  > ns="">m
> >  > ns="">n
> >  > ns="">o
> >   
> > 
> >   
> > 
> > 
> > gives
> > 
> > % time xmllint --noout --relaxng b.rng a.xml
> > real: 0.008, user: 0.003, system: 0.002 (54%)
> > 
> > Making the attributes required instead of optional also fixes the issue:
> > 
> > c.rng:
> > 
> > 
> > http://relaxng.org/ns/structure/1.0;>
> >   
> > 
> >   a
> >   
> > a
> > b
> > c
> > d
> > e
> > f
> > g
> > h
> > i
> > j
> > k
> > l
> > m
> > n
> > o
> >   
> > 
> >   
> > 
> > 
> > % time xmllint --noout --relaxng c.rng a.xml
> > real: 0.008, user: 0.003, system: 0.002 (55%)
> > 
> > The problem seems to be that xmlRelaxNGValidateDefinition keeps
> > adding states that I feel don’t need checking, but I can’t figure
> > out how to avoid this from happening.

  Hum, I'm way behind  sorry

yes that sounds like there is a graph generation issue, best is to compile
the regexp support with things like DEBUG_REGEXP_GRAPH enabled and trace down
with pencil and paper what kind of graph it generates. That's how most of that 
code
ended up being debugged 

> 
> I can only nod to your analysis, ran into similar observations
> in the past, and they were likewise XML attributes (that parameters to
> fence device instances boil down to) related:

> https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7
> 
> Callgrind proved to be a useful tool to get better quantitative numbers.

  yup but that's way down

> > Daniel, do you have any input on this?  I’d gladly work on fixing
> > this, but I feel that I need some guidance as to what’s going wrong
> > and perhaps also how to fix it.
> 
> Would be nice to get libxml's RelaxNG validation support closer
> to parity with jing (the only other freely available validator I am
> aware of that can be directly used for a peer comparison), for sure.

  well jing use an orthogonal approach, derivating the document graph directly
while libxml2 uses validation structure derivation instead (when it can't 
compile
down to a regexp). In that case it should be regexp based.
  The two will performs massively differently on different problems.

Daniel

> ___
> xml mailing list, project page  http://xmlsoft.org/
> xml@gnome.org
> https://mail.gnome.org/mailman/listinfo/xml


-- 
Daniel Veillard  | Red Hat Developers Tools http://developer.redhat.com/
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml


Re: [xml] Exponential running time in RELAX NG matching of optional attributes

2019-07-23 Thread Michael Stahl

On 22.07.19 11:59, Jan Pokorný via xml wrote:

Would be nice to get libxml's RelaxNG validation support closer
to parity with jing (the only other freely available validator I am
aware of that can be directly used for a peer comparison), for sure.


there's also the old Sun Multi-Schema Validator, which we use in the ODF 
Toolkit validator; it's Java based of course but for exponential 
algorithms you can still compare i guess.


https://github.com/kohsuke/msv

___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml


Re: [xml] Exponential running time in RELAX NG matching of optional attributes

2019-07-22 Thread Jan Pokorný via xml
On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote:
> The following RELAX NG schema requires exponential running time when
> matching attributes (15 attributes is where my computer begins to
> show the symptoms, this may vary with your set-up):
> 
> a.rng:
> 
> 
> http://relaxng.org/ns/structure/1.0;>
>   
> 
>   a
>   
>  ns="">a
>  ns="">b
>  ns="">c
>  ns="">d
>  ns="">e
>  ns="">f
>  ns="">g
>  ns="">h
>  ns="">i
>  ns="">j
>  ns="">k
>  ns="">l
>  ns="">m
>  ns="">n
>  ns="">o
>   
> 
>   
> 
> 
> a.xml:
> 
> 
>  m="13" n="14" o="15"/>
> 
> % time xmllint --noout --relaxng a.rng a.xml
> real: 3.175, user: 3.147, system: 0.014 (99%)
> 
> Changing the group to an interleave, that is,
> 
> b.rng:
> 
> 
> http://relaxng.org/ns/structure/1.0;>
>   
> 
>   a
>   
>  ns="">a
>  ns="">b
>  ns="">c
>  ns="">d
>  ns="">e
>  ns="">f
>  ns="">g
>  ns="">h
>  ns="">i
>  ns="">j
>  ns="">k
>  ns="">l
>  ns="">m
>  ns="">n
>  ns="">o
>   
> 
>   
> 
> 
> gives
> 
> % time xmllint --noout --relaxng b.rng a.xml
> real: 0.008, user: 0.003, system: 0.002 (54%)
> 
> Making the attributes required instead of optional also fixes the issue:
> 
> c.rng:
> 
> 
> http://relaxng.org/ns/structure/1.0;>
>   
> 
>   a
>   
> a
> b
> c
> d
> e
> f
> g
> h
> i
> j
> k
> l
> m
> n
> o
>   
> 
>   
> 
> 
> % time xmllint --noout --relaxng c.rng a.xml
> real: 0.008, user: 0.003, system: 0.002 (55%)
> 
> The problem seems to be that xmlRelaxNGValidateDefinition keeps
> adding states that I feel don’t need checking, but I can’t figure
> out how to avoid this from happening.

I can only nod to your analysis, ran into similar observations
in the past, and they were likewise XML attributes (that parameters to
fence device instances boil down to) related:

https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7

Callgrind proved to be a useful tool to get better quantitative numbers.

> Daniel, do you have any input on this?  I’d gladly work on fixing
> this, but I feel that I need some guidance as to what’s going wrong
> and perhaps also how to fix it.

Would be nice to get libxml's RelaxNG validation support closer
to parity with jing (the only other freely available validator I am
aware of that can be directly used for a peer comparison), for sure.

-- 
Jan (Poki)


pgp5HF3mWGh93.pgp
Description: PGP signature
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml


[xml] Exponential running time in RELAX NG matching of optional attributes

2019-06-27 Thread Nikolai Weibull via xml
Hi!

The following RELAX NG schema requires exponential running time when matching 
attributes (15 attributes is where my computer begins to show the symptoms, 
this may vary with your set-up):

a.rng:


http://relaxng.org/ns/structure/1.0;>
  

  a
  
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
  

  


a.xml:




% time xmllint --noout --relaxng a.rng a.xml
real: 3.175, user: 3.147, system: 0.014 (99%)

Changing the group to an interleave, that is,

b.rng:


http://relaxng.org/ns/structure/1.0;>
  

  a
  
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
  

  


gives

% time xmllint --noout --relaxng b.rng a.xml
real: 0.008, user: 0.003, system: 0.002 (54%)

Making the attributes required instead of optional also fixes the issue:

c.rng:


http://relaxng.org/ns/structure/1.0;>
  

  a
  
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
  

  


% time xmllint --noout --relaxng c.rng a.xml
real: 0.008, user: 0.003, system: 0.002 (55%)

The problem seems to be that xmlRelaxNGValidateDefinition keeps adding states 
that I feel don’t need checking, but I can’t figure out how to avoid this from 
happening.

Daniel, do you have any input on this?  I’d gladly work on fixing this, but I 
feel that I need some guidance as to what’s going wrong and perhaps also how to 
fix it.

Thank you,
  Nikolai
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml