[mwlib] Problems parsing French Wikipedia page

Peter W Thu, 04 Nov 2010 15:07:36 -0700

Hi there,

As mentioned in my other post, I'm having a problem parsing a French
Wikipedia page. The code I've used works well on pages in 40 other
languages, but not on the French one.


Specifically, while it's parsing the French page, the console starts
printing an endless stream of warnings such as:

2010-11-04T21:41:59 advtree.warn >> fixTagNodes, unknowntagnode
TagNode tagname=u'abbr' vlist={u'class': u'abbr', u'title': u'Langue :
anglais'}->u'abbr

as well as messages such as:

2010-11-04T21:42:00 xmlwriter >> SKIPPED
2010-11-04T21:42:00 xmlwriter >> TagNode
2010-11-04T21:42:00 xmlwriter >> ["parent => ArticleLink target=u'1er
janvier' ns=0", "tagname => u'abbr'", "caption => u'abbr'", "type =>
'complex_tag'", "vlist => {u'class': u'abbr', u'title': u'Premier'}"]

The volume of warnings makes the console dump so large that I can't
scroll back to find the exact message, but at one point it even claimed
that the "maximum recursion depth for Python objects" had been
exceeded, so I suspect some sort of infinite loop. I already try to
catch templates that redirect to each other, so I don't think that's
the problem. If you think it is, I'll spend more time testing the code
I use to prevent that.
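
A minimal sketch of the kind of redirect check described above, assuming
the templates are held in a plain dict that maps each name to its
redirect target (or None when it is not a redirect). The function name
`resolve_redirect`, the dict layout, and the `max_hops` limit are all
hypothetical; none of them come from mwlib, and this is not necessarily
the code actually used:

```python
def resolve_redirect(name, redirects, max_hops=50):
    """Follow redirects starting at name and return the final target.

    redirects is a hypothetical dict: template name -> redirect target,
    or None when the template is not a redirect.
    Raises ValueError if the chain loops or exceeds max_hops.
    """
    seen = set()
    current = name
    while redirects.get(current) is not None:
        # Revisiting a name means the redirects form a cycle.
        if current in seen or len(seen) >= max_hops:
            raise ValueError("redirect loop or chain too long at %r" % current)
        seen.add(current)
        current = redirects[current]
    return current
```

Tracking visited names in a set catches mutual redirects (A -> B -> A) as
well as longer cycles, which a check of only the immediate target would
miss.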

Does someone have an idea of what's going on in this case? Is there
any more information I could provide that would be of help?
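
One piece of information that may help is the full traceback, which is
otherwise lost in the flood of warnings. A minimal sketch of saving it
to a file, assuming the parse step can be wrapped in a zero-argument
callable; `run_and_record` and the output path are hypothetical names,
not part of mwlib:

```python
import traceback


def run_and_record(func, path):
    """Call func(); if it fails with a RuntimeError, save the traceback.

    The "maximum recursion depth exceeded" error is a RuntimeError
    (RecursionError, a subclass, on Python 3.5 and later).
    """
    try:
        return func()
    except RuntimeError:
        # Write the traceback to a file so it survives the console flood.
        with open(path, "w") as handle:
            handle.write(traceback.format_exc())
        raise
```

The error is re-raised after the traceback is written, so the behaviour
of the calling script is unchanged apart from the extra file.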

Thanks so much,

Peter
