Stefan Behnel added the comment:
My latest status is that a decision on the future of the parser argument is
still pending. See #20219.
It's correctly deprecated in the sense that passing any previously existing
parser isn't going to be supported anymore, but passing an XMLPullParser (which
Nick Coghlan added the comment:
Yeah, I'd agree with Stefan here - documented deprecation is reasonable at this
point (so that new users avoid it for now), but we may still undeprecate it
later if we decide it makes sense to do so.
--
___
Python
R. David Murray added the comment:
I think the current status of whatsnew is OK for this, then.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
R. David Murray added the comment:
Someone should add programatic deprecation for the html argument, though, since
people need to switch to keyword-based calls to XMLParser to prepare for that
going away.
--
___
Python tracker
R. David Murray added the comment:
As far as I can tell from scanning this ticket, there was no agreement on
deprecating the parser argument to iterparse, but it has been documented as
deprecated with no documentation about what to replace it with, nor any
DeprecationWarning in the code that
Roundup Robot added the comment:
New changeset 31e6adf5bfba by R David Murray in branch 'default':
whatsnew: deprecation of ElementTree XMLParser *html* and iterparse *parser*.
http://hg.python.org/cpython/rev/31e6adf5bfba
--
___
Python tracker
Stefan Behnel added the comment:
The way the XMLPullParser is implemented in lxml.etree now is that it simply
inherits from XMLParser. This would also make sense for ElementTree, even
before supporting arbitrary targets. The patch in ticket #18990 makes this
simple to do.
For reference, here
Stefan Behnel added the comment:
Looks like we missed the alpha2 release for the close() API fix. I recommend
not letting yet another deadline go by.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
Stefan Behnel added the comment:
Created separate issue18990 to keep this one closed as is.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
Stefan Behnel added the comment:
While refactoring the iterparse() implementation in lxml to support this new
interface, I noticed that the close() method of the XMLPullParser does not
behave like the close() method of the XMLParser. Instead of setting some .root
attribute on the parser
Eli Bendersky added the comment:
I'm closing this issue as fixed because the feature was implemented. As
discussed, opened a new issue (#18902) to discuss re-designing some parts of
the current code that would allow a nicer implementation of this feature.
Anyone interested in the subject -
Roundup Robot added the comment:
New changeset 1faaec66c73d by Eli Bendersky in branch 'default':
Fix XMLPullParser documentation to say non-blocking instead of asynchronous.
http://hg.python.org/cpython/rev/1faaec66c73d
--
___
Python tracker
Roundup Robot added the comment:
New changeset 8fd72b1bb262 by Eli Bendersky in branch 'default':
Issue #17741: Rename IncrementalParser and its methods.
http://hg.python.org/cpython/rev/8fd72b1bb262
--
___
Python tracker rep...@bugs.python.org
Eli Bendersky added the comment:
This issue has become too large and discusses a few different things; hence I'm
inclined to close it as fixed and open a new one as a placeholder for
discussing the new design of the internals.
I'll do this in a couple of days if there are no objections.
Eli Bendersky added the comment:
This patch implements the renaming and updates the documentation.
--
Added file: http://bugs.python.org/file31497/issue17741.api-renames.1.patch
___
Python tracker rep...@bugs.python.org
Stefan Behnel added the comment:
Any comments regarding my naming suggestion?
Calling it a push parser is just too ambiguous.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
Stefan Behnel added the comment:
Erm, pull parser, but you see what I mean.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Stefan Behnel added the comment:
Putting _setevents aside for the moment,
Agreed, obviously.
XMLParser is a clean and simple API. Its output is only push (by calling
callbacks on the target). It doesn't deal with Elements at all.
We already agreed on that, too. Keep things simple.
It
Nick Coghlan added the comment:
The whole point of the new API is not to replace XMLParser, but to
provide a convenience API to set up a particular combination of an
XMLParser with a particular kind of custom target. It just happens
that actually *implementing it* that way doesn't work right
Stefan Behnel added the comment:
The whole point of the new API is not to replace XMLParser, but to provide a
convenience API to set up a particular combination of an XMLParser with a
particular kind of custom target.
Ok, but I'm saying that we don't need that. It's all there, it all comes
Eli Bendersky added the comment:
I'm sorry if I sound offensive, but most of what I read as counter
arguments in this ticket so far was either incomplete or (sometimes) even
plain wrong. Some seemed pretty close to FUD, but maybe that's just me (and
I'm guilty here too, I guess - sorry for
Stefan Behnel added the comment:
Eli, I agree that we've put way more than enough time into the discussion by
now. We all know each other's arguments and failed to convince each other.
Please come up with working code that shows that the approach you are
advocating for the final
Nick Coghlan added the comment:
Stefan, your proposed merged design isn't going to happen. Two alternative
ways of using the one class is far more complicated to explain to users
than exposing a helper that composes the existing API components to address
a particular use case (particularly when
Nick Coghlan added the comment:
Sorry, Stefan, I missed your last comment before posting mine. It appears
you had already reached the same conclusion I had regarding further high
level design discussion being pointless :)
--
___
Python tracker
Eli Bendersky added the comment:
W.r.t. the summary in http://bugs.python.org/msg196177, after some further chat
with Nick, a name that sounds good for us for the class is XMLPullParser. It
has both XML and Parser in it, so the intention is clear. On the other hand,
Pull works well for
Stefan Behnel added the comment:
the push API is inactive and gets redirected to a pull API
Given that this is aimed to become a redundant 'convenience' wrapper around
something else at a point, I assume that you are aware that the above is just
an arbitrary restriction due to the fact that
Stefan Behnel added the comment:
BTW, I also like how short and clean iterparse() becomes when you move this
feature into the parser. It's basically just a convenience function that does
read(), feed(), and yield-from. Plus the usual bit of bolerplate code,
obviously.
--
Nick Coghlan added the comment:
Eli's summary left out an exchange between us that happened after he'd already
written the summary - he pointed out the same problem with the EventParser name
that you noticed: it's really an alternative XMLParser that exposes
read_events(), rather than an
Stefan Behnel added the comment:
iterparse's parser argument will be deprecated
No need to do that. Update the docs, yes, but otherwise keep the possibility to
improve the implementation later on, without going through a deprecation +
dedeprecation cycle. That would just confuse users IMHO.
Nick Coghlan added the comment:
Since parsers don't support changing the target after creation, I think it
makes sense to deprecate passing in a parser *instance*, and instead require
passing in a callback that accepts the target to use and *returns* an
appropriate parser object. The parser
Stefan Behnel added the comment:
I don't see adding one method to XMLParser as a design problem. In fact, it's
even a good design on the technical side, because if ET ever gains an
HTMLParser, then the implementation of this feature would be highly dependent
on the underlying parser, i.e. it
Stefan Behnel added the comment:
... instead require passing in a callback that accepts the target ...
That could be the parser class then, for example, except that there may be
other options to set as well. Plus, it would not actually allow iterparse to
wrap a user provided target. So, in
Stefan Behnel added the comment:
it's really about turning XMLParser's push API for events (where the events
are pushed into the target object by the parser calling the appropriate
methods), into an iterparse style pull API where the events can be retrieved
via calls to read_events().
Stefan Behnel added the comment:
in the long run we want the new class to just be a convenience API for
combining XMLParser and a custom target object, even if it can't be
implemented that way right now.
Just to be clear: I changed my opinion on this one and I no longer think that
it is a
Eli Bendersky added the comment:
On Sun, Aug 25, 2013 at 10:40 PM, Stefan Behnel rep...@bugs.python.orgwrote:
Stefan Behnel added the comment:
Hmm, did you look at my last comment at all? It solves both the technical
issues and the API issues very nicely and avoids any problems of
Stefan Behnel added the comment:
XMLParser knows nothing about Elements, at least in the direct API of today.
The one constructing Elements is the target.
Absolutely. And I'm 100% for keeping that distinction exactly as it is.
The read_events method proposed for the new class (currently
Stefan Behnel added the comment:
Here is a proof-of-concept patch that integrates the functionality of the
IncrementalParser into the XMLParser. I ended up reusing most of Antoines
implementation and test suite. In case he'll look back into this ticket at some
point, I'll put a thank you
Stefan Behnel added the comment:
(I still wonder why I'm the one writing all the patches here when Eli is the
one who actually wants this feature ...)
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
Stefan Behnel added the comment:
BTW, maybe read_events() still isn't the ideal method name to put on a parser.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
Eli Bendersky added the comment:
(I still wonder why I'm the one writing all the patches here when Eli is
the one who actually wants this feature ...)
Well, obviously because you're the only real programmer around here and the
rest of us are just a bunch of hand-wavy morons.
Seriously,
Changes by Ethan Furman et...@stoneleaf.us:
--
nosy: +ethan.furman
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Python-bugs-list
Ethan Furman added the comment:
Stefan Behnel wrote:
... I still wonder why I'm the one writing all the patches ...
I imagine for the same reasons that I offered to write Enum: I had definite
ideas about how it should be, it is sometimes easier to explain with working
code than with prose,
Eli Bendersky added the comment:
On Mon, Aug 26, 2013 at 12:35 PM, Stefan Behnel rep...@bugs.python.orgwrote:
Stefan Behnel added the comment:
Here is a proof-of-concept patch that integrates the functionality of the
IncrementalParser into the XMLParser. I ended up reusing most of Antoines
Stefan Behnel added the comment:
1. Why have the event builder wrap a tree builder? Can't it just be a
separate target?
You need a TreeBuilder in order to build the tree for the events.
If you want to use a different target than a TreeBuilder, you're free to do so.
That's the idea of
Eli Bendersky added the comment:
Looking at this more, the Parser/Builder abstraction in ET leaks badly,
especially when it comes to incremental parsing. This manifests in (at least)
two ways:
1. The parser accepted by iterparse has to use the ET-provided TreeBuilder,
because the C
Stefan Behnel added the comment:
I attached a patch that removes the IncrementalParser class and merges its
functionality into the _IterParseIterator. It thus retains most of the
refactoring without adding new functionality and/or APIs.
I did not take a look if anything else from later
Eli Bendersky added the comment:
Thanks for the effort, Stefan, but I still think IncrementalParser is worth
keeping. It provides functionality that doesn't currently exist in ET, and IMHO
this functionality is both important and useful.
Renaming its methods to feed close kills the API
Stefan Behnel added the comment:
TreeBuilder has to support an explicit API for collecting and reporting events.
XMLParser has to call into this API and either not have _setevents at all or
have something public and documented. Note also that event hookup in the parser
makes sense in a way,
Stefan Behnel added the comment:
I still think IncrementalParser is worth keeping.
If you want to keep it at all cost, I think we should at least hide it behind a
function (as with iterparse()). If it's implemented as a class, chances are
that people will start relying on internals by
Eli Bendersky added the comment:
Stefan Behnel added the comment:
I still think IncrementalParser is worth keeping.
If you want to keep it at all cost, I think we should at least hide it
behind a function (as with iterparse()). If it's implemented as a class,
chances are that people will
Stefan Behnel added the comment:
Actually, let me revise my rpevious comment. I think we should fake the new
interface for now by adding a TreeEventBuilder that requires having its own
TreeBuilder internally, instead of wrapping an arbitrary target. That way, we
can avoid having to clean up
Eli Bendersky added the comment:
Stefan, I don't mind looking at a working patch; however, it's not clear to me
how you intend to solve the iterparse-accepting-a-custom-XMLParser problem. As
for faking the new API, I don't know if that's a good idea because we're not
yet sure what that new
Stefan Behnel added the comment:
fully working patches will be considered
Let me remind you that it's not me who wants this feature so badly.
As for faking the new API, I don't know if that's a good idea because we're
not yet sure what that new API is.
If that's your concern, then I
Eli Bendersky added the comment:
As for faking the new API, I don't know if that's a good idea because
we're not yet sure what that new API is.
If that's your concern, then I suggest not adding the feature at all, as
long as we don't know if we can (or should) keep up the IncrementalParser
Stefan Behnel added the comment:
Also, even if the new approach is implemented in the next release,
IncrementalParser can stay as a simple synonym to
XMLParser(target=EventBuilder(...)).
No it can't. According to your signature, it accepts a parser instance as
input. So it can't just create a
Changes by Jeremy Kloth jeremy.kloth+python-trac...@gmail.com:
--
nosy: +jkloth
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Eli Bendersky added the comment:
I actually have another idea. Since we all agree that passing a custom parser
to iterparse is dubious, IncrementalParser does not have to do that at all.
This will make it a much more future-proof API. So its signature can be:
class
Eli Bendersky added the comment:
I had a chat with Nick about this, and a variation of the proposal in
http://bugs.python.org/msg196133 seems acceptable to us both. To summarize,
here's what goes into 3.4.
* The class will be named EventParser.
Rationale: The Incremental in IncrementalParser
Nick Coghlan added the comment:
Sounds good to me :)
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Python-bugs-list mailing list
Stefan Behnel added the comment:
Hmm, did you look at my last comment at all? It solves both the technical
issues and the API issues very nicely and avoids any problems of potential
future changes. Let me quickly explain why.
The feature in question depends on two existing parts of the API:
Nick Coghlan added the comment:
Using tulip-inspired method names (when tulip hasn't landed) to duplicate
existing data input functionality (feed() and close()) seems a rather dubious
design decision to me.
Given how popular lxml.etree is as an alternative to the standard library's
etree
Nick Coghlan added the comment:
Reopening this as per discussion on python-dev. I haven't reverted anything at
this point, as subsequent changes mean a simple hg backout is no longer
sufficient.
--
resolution: fixed -
stage: committed/rejected - needs patch
status: closed - open
Eli Bendersky added the comment:
I like the idea of having .events() in a special target passed to a XMLParser,
which already has feed() and close(). I read through Stefan's proposal and
there are a couple of issues I wanted to raise:
1. Why have the event builder wrap a tree builder? Can't
Stefan Behnel added the comment:
Ok, so what are we going to do for the next alpha?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Roundup Robot added the comment:
New changeset 815f328a6578 by Antoine Pitrou in branch 'default':
Issue #17741: use composition, rather than inheritance, for
xml.etree.iterparse's result class.
http://hg.python.org/cpython/rev/815f328a6578
--
___
Antoine Pitrou added the comment:
I've committed the inheritance removal patch. I personally don't think there's
anything more to do, since the TreeEventBuilder proposal stumbled on some
implementation issues.
--
status: open - closed
___
Python
Stefan Behnel added the comment:
I was asking for the current implementation to be removed until we have a
working implementation that hurts neither the API nor the module design.
--
___
Python tracker rep...@bugs.python.org
Antoine Pitrou added the comment:
I was asking for the current implementation to be removed until we
have a working implementation that hurts neither the API nor the
module design.
It would help if we could keep the discussion on rational terms.
--
Stefan Behnel added the comment:
I don't think I understand what you mean.
In any case, it's not to late to remove the implementation. There was only one
alpha release so far that included it, so it can't really break any existing
code that relies on it. The longer we wait, the more damage
Antoine Pitrou added the comment:
In any case, it's not to late to remove the implementation.
As far as I can say, there's no reason to remove the implementation
except for your distaste of the method names.
--
___
Python tracker
Antoine Pitrou added the comment:
To wrap things up:
- the feature is desireable
- the API is reasonable
- the implementation is functional and properly tested
Really, if you can't come up with an actual blocker, I'll ask you to leave this
issue at rest.
--
Stefan Behnel added the comment:
Could we please keep the discussion on rational terms?
It's not just the method names. The problem is that you are duplicating an
existing class (the XMLParser) for no good reason, instead of putting the
feature where it belongs: *behind* the XMLParser. Even
Changes by Antoine Pitrou pit...@free.fr:
--
nosy: -pitrou
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Python-bugs-list mailing
Antoine Pitrou added the comment:
The problem is that you are duplicating an existing class (the
XMLParser) for no good reason
No. The logic was already in _IterParseIterator, just wrapped up
differently.
And the proof is that the original commit, while adding a feature,
*removed* code from
Stefan Behnel added the comment:
Given that it seems to be hard to come to a consensus in this ticket, I've
asked for removal of the code on python-dev.
http://mail.python.org/pipermail/python-dev/2013-August/128095.html
--
___
Python tracker
Stefan Behnel added the comment:
I gave the implementation a try and attached an incomplete patch. Some tests
are failing.
It turns out that it's not entirely easy to do this. As Antoine noticed, the
hack in the C implementation of the TreeBuilder makes it tricky to integrate
with other
Stefan Behnel added the comment:
Here is a patch that cleans up the current implementation to avoid making the
result of iterparse() an IncrementalParser (with all of its new API).
Please note that the tulip mailing list is not an appropriate place to discuss
additions to the XML libraries,
Changes by Stefan Behnel sco...@users.sourceforge.net:
--
components: +XML
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Stefan Behnel added the comment:
Copying the discussion between Antoine and me from python-dev:
IMO it should mimic the interface of the TreeBuilder, which
calls the data reception method feed() and the termination method
close(). There is no reason to add yet another set of methods names
Stefan Behnel added the comment:
Actually, let me take that last paragraph back. There is an Obvious Way to do
it, and that's the feed() and close() methods. They are the existing and
established ElementTree interface for incremental parsing. The fact that
close() doesn't clean up all state
Changes by Stefan Behnel sco...@users.sourceforge.net:
Added file: http://bugs.python.org/file31206/iterparse_api_cleanup.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
Stefan Behnel added the comment:
Thinking about the original patch some more - I wonder why it doesn't use a
wrapper for TreeBuilder to process the events. Antoine, is there a reason why
you had to add this _setevents() method to the XMLParser, instead of making the
IncrementalParser an
Stefan Behnel added the comment:
Oh, and could we reopen this ticket, please?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Antoine Pitrou added the comment:
Antoine, is there a reason why you had to add this _setevents() method
to the XMLParser, instead of making the IncrementalParser an
IncrementalTreeBuilder?
The point is not to build a tree of potentially unbounded size (think XMPP).
The point is to yield
Changes by Antoine Pitrou pit...@free.fr:
--
status: closed - open
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
___
___
Python-bugs-list
Antoine Pitrou added the comment:
About the patch: I think changing the API names now that alpha1 has been
released is a bit gratuitous. I would like to keep the encapsulation part but
ditch the naming changes.
--
___
Python tracker
Stefan Behnel added the comment:
The point is not to build a tree of potentially unbounded size (think
XMPP). The point is to yield events in a non-blocking way (iterparse()
is blocking, which makes it useless for non-blocking applications).
Ok, but that's the only difference. Instead of
Stefan Behnel added the comment:
About the patch: I think changing the API names now that alpha1 has been
released is a bit gratuitous.
Sorry for being late, but I can't see it being my fault.
A change in an alpha release is still way better than a change after a final
release. Eventually,
Antoine Pitrou added the comment:
The point is not to build a tree of potentially unbounded size
(think
XMPP). The point is to yield events in a non-blocking way
(iterparse()
is blocking, which makes it useless for non-blocking applications).
Ok, but that's the only difference.
Antoine Pitrou added the comment:
A change in an alpha release is still way better than a change after
a final release.
But worse than no change at all. Arguing about API naming is a bit futile,
*especially* when the ship has sailed.
--
___
Python
Stefan Behnel added the comment:
But worse than no change at all. Arguing about API naming is a bit
futile, *especially* when the ship has sailed.
rantIt's easy to say that as a core developer with commit rights who can
simply hide changes in a low frequented bug tracker without notifying
Stefan Behnel added the comment:
But IncrementalParser uses an XMLParser internally, which in turn uses a
TreeBuilder internally. So what exactly is your point?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17741
Antoine Pitrou added the comment:
But IncrementalParser uses an XMLParser internally, which in turn
uses a TreeBuilder internally. So what exactly is your point?
Well, I would rather like to understand yours. Whatever
IncrementalParser uses internally needn't impact what API it
exposes to the
Stefan Behnel added the comment:
Well, I would rather like to understand yours.
My point is that the IncrementalParser uses a TreeBuilder that builds an XML
tree in the back. So I'm wondering why you are saying that it doesn't build a
tree.
Whatever IncrementalParser uses internally
Antoine Pitrou added the comment:
rantIt's easy to say that as a core developer with commit rights
who can simply hide changes in a low frequented bug tracker without
notifying those who have to know about these changes./rant It's
pure luck I noticed this change during the alpha release
Stefan Behnel added the comment:
TreeBuilder doesn't do parsing, it takes already parsed data: it has a
start() method to open a tag, and a data() method to add raw text
inside that tag.
That is correct. However, the XMLParser has a feed() method that sends new data
into the parser, and a
Antoine Pitrou added the comment:
Well, I would rather like to understand yours.
My point is that the IncrementalParser uses a TreeBuilder that builds
an XML tree in the back. So I'm wondering why you are saying that it
doesn't build a tree.
Unless I'm reading it wrong, when _setevents()
Stefan Behnel added the comment:
Unless I'm reading it wrong, when _setevents() is called, the internal
hooks are rewired to populate the events list, rather than call the
corresponding TreeBuilder methods. So, yes, there's a TreeBuilder
somewhere, but it stands unused.
Yes, you *are*
Eli Bendersky added the comment:
rantIt's easy to say that as a core developer with commit rights who can
simply hide changes in a low frequented bug tracker without
notifying those who have to know about these changes./rant It's pure luck
I noticed this change during the alpha release
Stefan Behnel added the comment:
ask to be added to the experts list for ET
Already done, see the corresponding python-dev thread.
Now back to a productive discussion please...
I think we already are. Keep reading through the rest of the posts.
--
1 - 100 of 116 matches
Mail list logo