Martin Panter added the comment:

I started looking at the lower Expat-level changes. Here are some thoughts, in 
the order that I thought them. :) But the end result is to investigate a 
different approach to disable entities in existing versions of Expat.

Currently, it looks like max_entity_indirections = 0 is a special value meaning 
no limit. I think it would be better to use some other value such as None for 
this, and then 0 could disable all entity expansion (other than pre-defined 
entities like & &#xNNNN; etc).

What is the benefit of having the indirection limit? I would have thought the 
entity expansion (character) limit on its own would already be effective at 
preventing nested expansion attacks like “billion laughs”. Even if the entity 
expanded to an empty string, all of the intermediate entity references are 
still included in the character count.

I wonder if it would make more sense to have a total character limit instead, 
which would include the characters from custom entity expansions as already 
counted by the patch, but also count characters directly from the XML body. Why 
would you want to avoid 8 million characters from entity expansion, but allow 8 
million characters of plain XML (or gzipped XML)? (I am not an XML expert, so I 
could be missing something obvious here.)

Now I have discovered that it seems you can build Python to use an external 
Expat library, which won’t be affected by Christian’s fix (correct me if I am 
wrong). I think we should find a different solution that will also work with 
existing external Expat versions. Maybe setting EntityDeclHandler to raise an 
error would be good enough:

>>> from xml.parsers import expat
>>> bomb = '<!DOCTYPE bomb [\n<!ENTITY a "" >\n<!ENTITY b "' + '&a;' * 1000 + 
>>> '" >\n<!ENTITY c "' + '&b;' * 1000 + '" >\n]>\n<bomb a="' + '&c;' * 10 + '" 
>>> />\n'
>>> p = expat.ParserCreate()
>>> p.Parse(bomb, True)  # Noticeable delay (DOS) while parsing
1
>>> p = expat.ParserCreate()
>>> def handler(*so_much_argh):
...     raise ValueError("Entity handling disabled")
... 
>>> p.EntityDeclHandler = handler
>>> p.Parse(bomb, True)  # Instant failure (no DOS)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/build/python/src/Python-3.4.3/Modules/pyexpat.c", line 494, in 
EntityDecl
  File "<stdin>", line 2, in handler
ValueError: Entity handling disabled

This solution has been suggested and implemented elsewhere:
* https://bugzilla.redhat.com/show_bug.cgi?id=1000109#c1
* 
http://mail-archives.apache.org/mod_mbox/apr-dev/200906.mbox/%3c20090602162934.ga28...@redhat.com%3E
 (though I suspect the SetDefaultHandler option there is not sufficient)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17239>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to