"Joel Nothman" <[email protected]> writes:

> The current parser also creates excess Nodes:
>
> Article
>     Paragraph tagname='p'
>         u'\n'
>         Style"'''"
>             u'Thomas Cruise Mapother IV'
>         Node
>             u', better known by his '
>             ArticleLink target=u'Stage name' ns=0
>                 u'screen name'
>             u' of '
>         Style"'''"
>             u'Tom Cruise'
>         Node
>             u', is an '
>             ArticleLink target=u'United States' ns=0
>                 u'American'
>             u' actor and '
>             ArticleLink target=u'film producer' ns=0
>             u'. '
>
> I would have thought these Text, ArticleLink, etc., should be directly
> under Paragraph, as they were in the earlier parser.
>
> This can obviously be performed by a postprocessor, but I don't see why
> these intermediate Nodes are necessary in the basic parse.

They are not necessary, but they won't hurt either. They probably just made
the implementation a bit easier (in this case core.py/parse_singlequotes).

Regards,
- Ralf
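
For reference, the postprocessor mentioned in the quoted message could look roughly like the sketch below. It is only an illustration: the Node, Paragraph, Style and ArticleLink classes here are hypothetical stand-ins (plain list subclasses), not mwlib's actual classes, and flatten_plain_nodes is a made-up name. The idea is simply to splice the children of every bare Node into its parent and leave all specialised nodes alone.

```python
class Node(list):
    """Hypothetical stand-in for a parse-tree node: a list of children."""

    def __init__(self, *children):
        super().__init__(children)


# Hypothetical specialised node types; only bare Node instances are removed.
class Paragraph(Node):
    pass


class Style(Node):
    pass


class ArticleLink(Node):
    pass


def flatten_plain_nodes(node):
    """Splice the children of bare Node instances into their parent, in place."""
    result = []
    for child in node:
        if isinstance(child, Node):
            # Flatten the subtree first, so nested bare Nodes collapse too.
            flatten_plain_nodes(child)
            if type(child) is Node:
                # An intermediate Node: keep its children, drop the wrapper.
                result.extend(child)
                continue
        result.append(child)
    node[:] = result
    return node


para = Paragraph(
    Style(u'Tom Cruise'),
    Node(u', is an ', ArticleLink(u'American'), u' actor and '),
)
flatten_plain_nodes(para)
print([type(child).__name__ for child in para])
```

After the call, the text and ArticleLink children sit directly under the Paragraph, next to the Style node, which is the shape asked for in the quoted message.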
