Martin Panter added the comment:
I started looking at the lower Expat-level changes. Here are some thoughts, in
the order that I thought them. :) But the end result is to investigate a
different approach to disable entities in existing versions of Expat.
Currently, it looks like max_entity_indirections = 0 is a special value meaning
no limit. I think it would be better to use some other value such as None for
this, and then 0 could disable all entity expansion (other than pre-defined
entities like & &#xNNNN; etc).
What is the benefit of having the indirection limit? I would have thought the
entity expansion (character) limit on its own would already be effective at
preventing nested expansion attacks like “billion laughs”. Even if the entity
expanded to an empty string, all of the intermediate entity references are
still included in the character count.
I wonder if it would make more sense to have a total character limit instead,
which would include the characters from custom entity expansions as already
counted by the patch, but also count characters directly from the XML body. Why
would you want to avoid 8 million characters from entity expansion, but allow 8
million characters of plain XML (or gzipped XML)? (I am not an XML expert, so I
could be missing something obvious here.)
Now I have discovered that it seems you can build Python to use an external
Expat library, which won’t be affected by Christian’s fix (correct me if I am
wrong). I think we should find a different solution that will also work with
existing external Expat versions. Maybe setting EntityDeclHandler to raise an
error would be good enough:
>>> from xml.parsers import expat
>>> bomb = '<!DOCTYPE bomb [\n<!ENTITY a "" >\n<!ENTITY b "' + '&a;' * 1000 +
>>> '" >\n<!ENTITY c "' + '&b;' * 1000 + '" >\n]>\n<bomb a="' + '&c;' * 10 + '"
>>> />\n'
>>> p = expat.ParserCreate()
>>> p.Parse(bomb, True) # Noticeable delay (DOS) while parsing
1
>>> p = expat.ParserCreate()
>>> def handler(*so_much_argh):
... raise ValueError("Entity handling disabled")
...
>>> p.EntityDeclHandler = handler
>>> p.Parse(bomb, True) # Instant failure (no DOS)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/build/python/src/Python-3.4.3/Modules/pyexpat.c", line 494, in
EntityDecl
File "<stdin>", line 2, in handler
ValueError: Entity handling disabled
This solution has been suggested and implemented elsewhere:
* https://bugzilla.redhat.com/show_bug.cgi?id=1000109#c1
*
http://mail-archives.apache.org/mod_mbox/apr-dev/200906.mbox/%[email protected]%3E
(though I suspect the SetDefaultHandler option there is not sufficient)
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue17239>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com