Raynard Sandwick added the comment:
I have opened issue #17318 to try to specify the problem better. While I do
think that catalogs are the correct fix for the validation use case (and thus
would like to see something more out-of-the-box in that vein), the real trouble
is that users are often
Brian Visel aeon.descrip...@gmail.com added the comment:
...still an issue.
--
nosy: +Brian.Visel
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2124
___
Martin v. Löwis mar...@v.loewis.de added the comment:
And my position still remains the same: this is not a bug. Applications
affected by this need to use the APIs that are in place precisely to deal with
this issue.
So I propose to close this report as invalid.
--
Brian Visel aeon.descrip...@gmail.com added the comment:
Of course, you can do as you like.
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/
--
Martin v. Löwis mar...@v.loewis.de added the comment:
Well, the issue is clearly underspecified, and different people read different
things into it. I take your citation of the W3C blog entry to mean that you are
asking for caching to be employed. I read the issue entirely differently, namely
that
Paul Boddie p...@boddie.org.uk added the comment:
Note that Python 3 provided a good opportunity for doing the minimal amount of
work here - just stop things from accessing remote DTDs - but I imagine that
even elementary standard library improvements of this kind weren't made (let
alone the
Changes by A.M. Kuchling li...@amk.ca:
--
assignee: akuchling ->
Mark Lawrence breamore...@yahoo.co.uk added the comment:
Does anybody know if users are still experiencing problems with this issue?
--
nosy: +BreamoreBoy
versions: +Python 2.7, Python 3.1, Python 3.2 -Python 2.6
Jean-Paul Calderone exar...@twistedmatrix.com added the comment:
Yes.
--
Damien Neil ne...@misago.org added the comment:
I just ran into this problem. I was very surprised to realize that
every time the code I was working on parsed a docbook file, it generated
several HTTP requests to oasis-open.org to fetch the docbook DTDs.
I attempted to fix the issue by adding
Changes by Jean-Paul Calderone exar...@divmod.com:
--
nosy: +exarkun
Jean-Paul Calderone exar...@divmod.com added the comment:
Though it's inconvenient to do so, you can arrange to have the locator
available from the entity resolver. The content handler's
setDocumentLocator method will be called early on with the locator
object. So you can give your entity
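To make that concrete, here is a rough sketch of the pattern Jean-Paul describes (the class names and wiring are illustrative, not stdlib API):

```python
import xml.sax
from xml.sax.handler import ContentHandler

class LocatorAwareHandler(ContentHandler):
    """Stores the locator the parser passes in early, so other
    components (e.g. an entity resolver) can query it later."""
    def __init__(self):
        super().__init__()
        self.locator = None

    def setDocumentLocator(self, locator):
        # Called by the parser before any document events.
        self.locator = locator

class LocatorAwareResolver(xml.sax.handler.EntityResolver):
    """Entity resolver that can ask the shared handler for the
    current system id (the base URI) while resolving entities."""
    def __init__(self, handler):
        self.handler = handler

    def resolveEntity(self, publicId, systemId):
        base = None
        if self.handler.locator is not None:
            base = self.handler.locator.getSystemId()
        # A real implementation would join `base` and `systemId`
        # here; this sketch returns the system id unchanged.
        return systemId
```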
Martin v. Löwis mar...@v.loewis.de added the comment:
> EntityResolver.resolveEntity() is called with the publicId and systemId as
> arguments. It does not receive a locator.
Sure. But ContentHandler.setDocumentLocator receives it, and you are
supposed to store it for the entire parse, to always
Damien Neil ne...@misago.org added the comment:
On Feb 3, 2009, at 1:42 PM, Martin v. Löwis wrote:
> Sure. But ContentHandler.setDocumentLocator receives it, and you are
> supposed to store it for the entire parse, to always know what entity
> is being processed if you want to.
Where in the
Damien Neil ne...@misago.org added the comment:
I just discovered another really fun wrinkle in this.
Let's say I want to have my entity resolver return a reference to my
local copy of a DTD. I write:
source = xml.sax.InputSource()
source.setPublicId(publicId)
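Damien's snippet is cut off here; a plausible completion of the idea (a sketch only, the mapping and local file path are invented for illustration) returns an InputSource pointing at a local copy of the DTD:

```python
import xml.sax
from xml.sax.xmlreader import InputSource

# Hypothetical mapping from public ids to local DTD copies.
LOCAL_DTDS = {
    "-//OASIS//DTD DocBook XML V4.5//EN":
        "/usr/share/xml/docbook/4.5/docbookx.dtd",
}

class LocalDTDResolver(xml.sax.handler.EntityResolver):
    def resolveEntity(self, publicId, systemId):
        path = LOCAL_DTDS.get(publicId)
        if path is not None:
            source = InputSource()
            source.setPublicId(publicId)
            # The parser opens this local file instead of
            # fetching the remote systemId.
            source.setSystemId(path)
            return source
        # Fall back to the default behaviour: fetch systemId.
        return systemId
```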
Jean-Paul Calderone exar...@divmod.com added the comment:
It's indeed possible to provide that as a third-party module; one
would have to implement an EntityResolver, and applications would
have to use it. If there was a need for such a thing, somebody would
have done it years ago.
I don't
Martin v. Löwis mar...@v.loewis.de added the comment:
> Where in the following sequence am I supposed to receive the document
> locator?
>
> parser = xml.sax.make_parser()
> parser.setEntityResolver(CachingEntityResolver())
> doc = xml.dom.minidom.parse('file.xml', parser)
This is DOM parsing, not
Damien Neil ne...@misago.org added the comment:
On Feb 3, 2009, at 11:23 AM, Martin v. Löwis wrote:
> I don't think this is actually the case. Did you try calling getSystemId
> on the locator?
EntityResolver.resolveEntity() is called with the publicId and systemId as
arguments. It does not
Martin v. Löwis mar...@v.loewis.de added the comment:
> The EntityResolver's resolveEntity() method is not, however, passed the
> base path to resolve the relative systemId from.
> This makes it impossible to properly implement a parser which caches
> fetched DTDs.
I don't think this is actually
Damien Neil ne...@misago.org added the comment:
On Feb 3, 2009, at 3:12 PM, Martin v. Löwis wrote:
> This is DOM parsing, not SAX parsing.
1) The title of this ticket begins with xml.sax and xml.dom
2) I am creating a SAX parser and passing it to xml.dom, which uses it.
So break layers of
A.M. Kuchling added the comment:
The solution of adding caching, If-Modified-Since, etc. is a good one,
but I quail in fear at the prospect of expanding the saxutils resolver
into a fully caching HTTP agent that uses a cache across processes. We
should really be encouraging people to use more
Martin v. Löwis added the comment:
I may have lost track somewhere: what does have urllib* to do with this
issue?
Virgil Dupras added the comment:
-1 on the systematic warnings too, but what I was talking about is a
warning that would say "The server you are trying to fetch your resource
from is refusing the connection. Don't cha think you misbehave?" only on
5xx and 4xx responses, not on every remote
ajaksu added the comment:
Martin, I agree that simply not resolving DTDs is an unreasonable
request (and said so in the blog post). But IMHO there are lots of
possible optimizations, and the most valuable would be those darn easy
for newcomers to understand and use.
In Python, a winning combo
Virgil Dupras added the comment:
The blog page talked about 503 responses. What about issuing a warning
on these responses? Maybe it would be enough to make developers aware of
the problem?
Or what about in-memory caching of the DTDs? Sure, it wouldn't be as
good as a catalog or anything,
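An in-memory cache along those lines could look like this (a sketch; If-Modified-Since handling, errors, and cache expiry are all omitted, and the injectable fetch function is there only to make the sketch testable):

```python
import io
import urllib.request
import xml.sax
from xml.sax.xmlreader import InputSource

class CachingResolver(xml.sax.handler.EntityResolver):
    """Caches fetched DTD bytes per process, so repeated parses
    hit the network at most once per system id."""
    def __init__(self, fetch=None):
        self._cache = {}
        # Default fetch goes over the network; tests can inject a stub.
        self._fetch = fetch or (lambda url: urllib.request.urlopen(url).read())

    def resolveEntity(self, publicId, systemId):
        if systemId not in self._cache:
            self._cache[systemId] = self._fetch(systemId)
        source = InputSource(systemId)
        source.setPublicId(publicId)
        # Hand the parser the cached bytes instead of a URL.
        source.setByteStream(io.BytesIO(self._cache[systemId]))
        return source
```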
Paul Boddie added the comment:
(Andrew, thanks for making a bug, and apologies for not reporting this
in a timely fashion.)
Although an in-memory caching solution might seem to be sufficient, if
one considers things like CGI programs, it's clear that such programs
aren't going to benefit from
A.M. Kuchling added the comment:
What if we just tried to make the remote accesses apparent to the user,
by making a warnings.warn() call in the default implementation that was
deactivated by a setFeature() call. With a warning, code will continue
to run but the user will at least be aware
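Andrew's suggestion might be sketched like this (illustrative only; the real change would live in saxutils and be toggled via setFeature()):

```python
import warnings
import xml.sax

class WarningResolver(xml.sax.handler.EntityResolver):
    """Like the default resolver, but emits a warning whenever an
    external entity would be fetched over the network."""
    def resolveEntity(self, publicId, systemId):
        if systemId and systemId.startswith(("http:", "https:", "ftp:")):
            warnings.warn(
                "remote access to external entity: %s" % systemId,
                stacklevel=2)
        # Behaviour is otherwise unchanged: the parser still
        # resolves (and fetches) the returned system id.
        return systemId
```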
Martin v. Löwis added the comment:
-1 on issuing a warning. I really cannot see much of a problem in this
entire issue. XML was designed to be straightforwardly usable over the
Internet (XML rec., section 1.1), and this issue is a direct
consequence of that design decision. You might just as
New submission from A.M. Kuchling:
The W3C posted an item at
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
describing how their DTDs are being fetched up to 130M times per day.
The Python parsers are part of the problem, as
noted by Paul Boddie on the python-advocacy
Changes by A.M. Kuchling:
--
type: -> resource usage
A.M. Kuchling added the comment:
Here's a simple test to demonstrate the problem:
from xml.sax import make_parser
from xml.sax.saxutils import prepare_input_source
parser = make_parser()
inp = prepare_input_source('file:file.xhtml')
parser.parse(inp)
file.xhtml contains:
<?xml version="1.0"?>
Changes by A.M. Kuchling:
--
priority: -> urgent
Martin v. Löwis added the comment:
On systems that support catalogs, the parsers should be changed to
support public identifiers, using local copies of these DTDs.
However, I see really no way how the library could avoid resolving the
DTDs altogether. The blog is WRONG in claiming that the