Comment #2 on issue 3172 by laurent.laffont: NewInspector really really
slow while parsing XMLNode
http://code.google.com/p/pharo/issues/detail?id=3172
From jaayer:
Hi Laurent.
First, thank you for your screencasts.
I am writing to you concerning this issue:
http://code.google.com/p/pharo/issues/detail?id=3172&q=xml&colspec=ID%20Type%20Status%20Summary%20Milestone%20Difficulty
I only discovered the above last night, as I am not a Pharo core developer
and only check the issue tracker occasionally. I am a maintainer of
XML-Support, however, so feel free to mail me directly with any issues you
have in the future, or at the very least, please post them to the
Pharo-Project, Squeak-dev or Moose-dev lists.
Now, in your video, which was made back in August, you used version .74 of
XML-Parser. If you look at the changelogs of XML-Parser, it is clear that
much has changed since that time. It would not be unreasonable to assume
that one or more of those changes is responsible for the difficulties you
experienced inspecting that XML document. However, if you load version .74,
the same version used in your screencast, into a Pharo 1.1 image, the
problem is still there (and is in fact much worse). Whatever has changed
since August to bring this about, I can say with some confidence that it is
not in XML-Parser.
Using the latest version of XML-Parser and ignoring the question of
causality for a moment, the bottleneck we're observing is in
XMLNode>>printOn:. The proof is that disabling XMLNode>>printOn: makes our
problem go away. Why then is XML generation suddenly such a bottleneck? The
main reason appears to be that the document being inspected is, when
printed, roughly 40K in size, and by inspecting it you are causing it to be
printed over and over again. I tracked sends of #printOn: and found that
inspecting the document and navigating to the "title" element results in
#printOn: being sent 300-400 times. In ~350 of those sends, the receiver was
either the document object or the root "feed" element. By the time you've
navigated to
the "feed" element and clicked the "Elements" entry to display its
children, #printOn: has already been sent ~150 times. That means the
remaining 150-250 sends come when you click on the "title" element or some
other child element of the root, and that single additional click results
in the generation of roughly 6-10 MB of XML (150 * 40K to 250 * 40K).
*That* is why it's so slow.
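To make the arithmetic above concrete, here is a minimal sketch in Python
(not Pharo code). CountingNode and simulate_click are hypothetical names
invented for this illustration; the only figures taken from the message are
the ~40K printed size and the 150-250 sends per click. It assumes 40K means
40,000 bytes and that every send regenerates the full text.

```python
# Hypothetical stand-in for an XML node whose printed form is large.
class CountingNode:
    def __init__(self, printed_size):
        self.printed_size = printed_size  # bytes produced by one print
        self.print_calls = 0              # how many times we were printed
        self.bytes_generated = 0          # total output across all prints

    def __repr__(self):
        # Each call regenerates the whole text, like serializing the
        # document again from scratch.
        self.print_calls += 1
        self.bytes_generated += self.printed_size
        return "x" * self.printed_size


def simulate_click(node, sends):
    """Print the node `sends` times, as a UI refresh might (assumption)."""
    for _ in range(sends):
        repr(node)
    return node.bytes_generated


DOC_SIZE = 40 * 1000  # ~40K per print, the figure quoted in the message

low = simulate_click(CountingNode(DOC_SIZE), 150)
high = simulate_click(CountingNode(DOC_SIZE), 250)
print(low / 1e6, "MB to", high / 1e6, "MB")
```

The cost grows with the number of sends times the printed size, so either
fewer sends or a shorter printed form would reduce it.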
Now, why does NewInspector cause #printOn: to be sent so much? I honestly
don't know. I will forward this to the maintainers of NewInspector to get
their input. Maybe NewInspector can be changed to not print inspected
objects so often, or to provide them with some way to suppress its printing
behavior.
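As one possible shape for "not printing so much", here is a sketch in Python
(not Pharo code, and not how NewInspector works) of a bounded print: the
printing is abandoned as soon as a size limit is reached, so a label for a
huge node costs about as much as a label for a small one. Element,
LimitedWriter, LimitReached and short_description are all hypothetical names
made up for this example.

```python
import io


class LimitReached(Exception):
    """Raised to abandon printing once enough text has been produced."""


class LimitedWriter:
    """Collects at most `limit` characters, then raises LimitReached."""

    def __init__(self, limit):
        self.limit = limit
        self.buffer = io.StringIO()
        self.written = 0

    def write(self, text):
        remaining = self.limit - self.written
        self.buffer.write(text[:remaining])
        self.written += min(len(text), remaining)
        if len(text) > remaining:
            # Some text did not fit: stop the traversal early.
            raise LimitReached()


class Element:
    """Tiny stand-in for an XML element that prints itself recursively."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def print_on(self, writer):
        writer.write("<" + self.name + ">")
        for child in self.children:
            child.print_on(writer)
        writer.write("</" + self.name + ">")


def short_description(node, limit=50):
    """Return at most `limit` characters of the node's printed form."""
    writer = LimitedWriter(limit)
    try:
        node.print_on(writer)
    except LimitReached:
        return writer.buffer.getvalue() + "..."
    return writer.buffer.getvalue()


feed = Element("feed", [Element("entry") for _ in range(1000)])
label = short_description(feed, limit=50)
print(label)
```

With this approach the traversal stops after the first few children instead
of walking all 1000, which is the point of the exercise.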
It must be pointed out that Pharo also comes with an object explorer that
is designed specifically for this use case of browsing deep object
hierarchies. I tried exploring the document with cmd-I (capital i) and
encountered no trouble.