[mwlib] Re: mwlib for NLP, and cleaning up the API

Ralf Schmitt Thu, 13 Aug 2009 03:12:58 -0700

"Joel Nothman" <[email protected]> writes:

> The current parser also creates excess Nodes:
>
> Article
>      Paragraph tagname='p'
>          u'\n'
>          Style"'''"
>              u'Thomas Cruise Mapother IV'
>          Node
>              u', better known by his '
>              ArticleLink target=u'Stage name' ns=0
>                  u'screen name'
>              u' of '
>          Style"'''"
>              u'Tom Cruise'
>          Node
>              u', is an '
>              ArticleLink target=u'United States' ns=0
>                  u'American'
>              u' actor and '
>              ArticleLink target=u'film producer' ns=0
>              u'. '
>
> I would have thought these Text, ArticleLink, etc., should be directly  
> under Paragraph, as they were in the earlier parser.
>
> This can obviously be performed by a postprocessor, but I don't see why  
> these intermediate Nodes are necessary in the basic parse.


They are not necessary, but they don't hurt either. They are probably
just a side effect of making the implementation a bit easier (in this
case core.py/parse_singlequotes).
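
For anyone who wants the flat structure anyway, the postprocessor
mentioned above could look roughly like the sketch below. It is only an
illustration: the Node, Paragraph, Style and ArticleLink classes here are
minimal stand-ins written for this example, not mwlib's real classes, and
flatten() is a hypothetical helper, not part of mwlib. It assumes that
child nodes live in a list attribute called "children" and that a wrapper
is recognisable as a plain Node rather than a subclass.

```python
class Node(object):
    """Stand-in for a parse-tree node (NOT mwlib's actual class)."""

    def __init__(self, *children):
        self.children = list(children)


class Paragraph(Node):
    pass


class Style(Node):
    pass


class ArticleLink(Node):
    pass


def flatten(node):
    """Splice the children of bare Node wrappers into their parent, in place.

    Only instances whose type is exactly Node are treated as wrappers;
    subclasses such as Style or ArticleLink are kept as they are.
    """
    if not isinstance(node, Node):
        return node  # plain text leaf, nothing to do

    flat = []
    for child in node.children:
        flatten(child)  # clean up the subtree first
        if type(child) is Node:
            flat.extend(child.children)  # drop the wrapper, keep its contents
        else:
            flat.append(child)
    node.children = flat
    return node


# Hypothetical tree mirroring the start of the dump quoted above.
para = Paragraph(
    u'\n',
    Style(u'Thomas Cruise Mapother IV'),
    Node(u', better known by his ', ArticleLink(u'screen name'), u' of '),
    Style(u'Tom Cruise'),
)
flatten(para)
```

After the call, the text and ArticleLink that were inside the bare Node
sit directly under the Paragraph, in their original order.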

Regards,
- Ralf

