Where I think parsers fall down is in recognizing common English typing and
spelling errors.
"Hello how are you" would be recognizable by a parser, but the following
constructs, all recognizable by a human,
would only be recognizable to a fuzzy pattern matcher.
"Helohowareyou"
"Hellllllo hwo r u"
"Hell, howwww areyu"
Examining the transcripts of past years' Turing competitions, it is very easy to
see that all of the entries are very
intolerant of fuzzy data and would respond with an obvious bluff when presented
with such inputs.
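
To make the idea of a fuzzy pattern matcher concrete, here is a minimal sketch
in Python. It is not taken from any contest entry or from any parser mentioned
in this thread. The phrase list, the helper names squash and best_match, and
the 0.6 threshold are all assumptions chosen for illustration. It only uses
re and difflib from the standard library.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of phrases the bot "knows"; purely illustrative.
KNOWN_PHRASES = ["Hello how are you", "What is your name", "Goodbye"]


def squash(text):
    """Lowercase, keep only the letters a-z, and collapse runs of a repeated letter."""
    letters = re.sub(r"[^a-z]", "", text.lower())
    return re.sub(r"(.)\1+", r"\1", letters)


def best_match(text, phrases=KNOWN_PHRASES, threshold=0.6):
    """Return the known phrase most similar to text, or None if nothing is close.

    Similarity is difflib's SequenceMatcher ratio on the squashed strings.
    The threshold is an arbitrary assumption, not a tuned value.
    """
    key = squash(text)
    scored = [(SequenceMatcher(None, key, squash(p)).ratio(), p) for p in phrases]
    score, phrase = max(scored)
    return phrase if score >= threshold else None


if __name__ == "__main__":
    for noisy in ["Helohowareyou", "Hellllllo hwo r u", "Hell, howwww areyu"]:
        print(repr(noisy), "->", best_match(noisy))
```

Dropping spaces and collapsing repeated letters before comparing is what lets
the run-together and stretched-out inputs line up with the clean phrase. A
real system would probably need something better than a fixed phrase list,
but the same normalize-then-score idea applies.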
-------------- Original message --------------
From: "Jean-paul Van Belle" <[EMAIL PROTECTED]>
Research Associate: CITANDA
Post-Graduate Section Head
Department of Information Systems
Phone: (+27)-(0)21-6504256
Fax: (+27)-(0)21-6502280
Office: Leslie Commerce 4.21
>>> Linas Vepstas <[EMAIL PROTECTED]> 2007/11/05 23:41 >>>
>Do you have any recommendations for other parsers?
One of the reasons I like Python: It's got NLTK and
MontyLingua:
- MontyTokenizer
  - normalizes punctuation, spacing and contractions, with sensitivity to
    abbreviations
- MontyTagger
  - part-of-speech tagging using the Penn Treebank tagset
  - enriched with "Common Sense" from the Open Mind Common Sense project
- MontyREChunker
  - chunks tagged text into verb, noun, and adjective chunks (VX, NX, and AX
    respectively)
- MontyExtractor
  - extracts verb-argument structures, phrases, and other semantically
    valuable information from sentences and returns sentences as "digests"
- MontyLemmatiser
  - part-of-speech sensitive lemmatisation
  - strips plurals (geese-->goose) and tense (were-->be, had-->have)
  - includes regexps from Humphreys and Carroll's morph.lex, and UPENN's XTAG
    corpus
- MontyNLGenerator
  - generates summaries
  - generates surface form sentences
  - determines and numbers NPs and tenses verbs
  - accounts for sentence_type
Note: It also has chatterbot code.
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&