Hi all,
I am just discovering LT and I am getting interested in its possibilities.
I have been evaluating a piece of correction software for a company that is
looking for style correction.
It is called LELIE and is based on the Dislog language, a layer on top of
Prolog (Commons licence).
It is a more powerful approach than LT, but it has its drawbacks: complexity,
maintenance cost, and the need for formal training to maintain it, since
everything (lexicon, rules, reasoning) is logic programming in Prolog. See:
http://www.irit.fr/~Patrick.Saint-Dizier/publi_fichier/manuelV1.pdf
Linguistically, it relies on rhetorical structures (RST,
http://www.sfu.ca/rst/01intro/intro.html).
It is able to recognize semantic functions such as circumstance, concession,
condition, evaluation, etc.
Its performance in terms of speed is not spectacular (deep parsing, Prolog
backtracking), but it is usable.
Some publications in case you are curious:
http://www.irit.fr/recherches/ILPL/lelie/accueil.html
http://dl.acm.org/citation.cfm?id=2388653
http://anthology.aclweb.org/C/C14/C14-2006.pdf
https://liris.cnrs.fr/inforsid/sites/default/files/2012_6_1-PatrickSaint-Dizier.pdf
The reason for this email is that I am looking for an alternative.
I would like to be able to answer the following questions:
- Is LT able to recognize complex structures, such as passive forms or
structures with a gap in the middle? (I assume so, since it seems able to apply
regex-like patterns over parts of speech.)
- Is LT able to take a provided SKOS (or similar) thesaurus into account in
order to pre-recognize multi-word terms?
- How does LT do part-of-speech tagging (ML models, another approach,
TreeTagger, etc.)? Is it conceivable to plug in one’s own POS tagger (for
instance the Stanford NLP Tools tagger)?
- Is it easily extensible (rule templates for new kinds of error recognition,
complex syntactic patterns that would require their own implementation)?
- Can it cope with structural information (XML tags)? Here is an example:
enumerations. One could say that all items of an enumeration should begin with
the same form (infinitive verb, noun, whatever). To verify this, the
structure of the document must be taken into account. If the document is
available in XML with structure information, is it conceivable for LT to
process such a document (does its architecture allow this, even if it is not
possible yet)?
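
The enumeration check in the last question could be sketched roughly as
follows in Python. This is not LT code and does not use any LT API: the element
names (ul, li), the function names and the tiny lexicon are all hypothetical,
and the lexicon only stands in for a real POS tagger. It just shows why the
list structure has to be visible to the checker.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy lexicon standing in for a real POS tagger.
TOY_LEXICON = {
    "open": "VERB", "check": "VERB", "close": "VERB",
    "valve": "NOUN", "pressure": "NOUN",
}


def first_word_tag(text):
    """Return the (toy) tag of the first word of a list item."""
    words = text.strip().split()
    if not words:
        return "EMPTY"
    return TOY_LEXICON.get(words[0].lower().strip(".,;:"), "UNKNOWN")


def inconsistent_lists(xml_source, list_tag="ul", item_tag="li"):
    """Collect the tag sequences of lists whose items do not all start alike."""
    root = ET.fromstring(xml_source)
    problems = []
    for lst in root.iter(list_tag):
        tags = [first_word_tag("".join(item.itertext()))
                for item in lst.findall(item_tag)]
        if len(set(tags)) > 1:  # more than one starting form in the same list
            problems.append(tags)
    return problems


SAMPLE = """<doc>
  <ul><li>Open the valve</li><li>Check the pressure</li><li>Close the valve</li></ul>
  <ul><li>Open the valve</li><li>Pressure check</li></ul>
</doc>"""

if __name__ == "__main__":
    # Only the second list mixes a verb-initial and a noun-initial item.
    print(inconsistent_lists(SAMPLE))
```

The point is only that the grouping of items comes from the XML, not from the
running text, so a checker that sees plain sentences cannot do this test.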
Another topic:
Do you know BlackLab (based on Lucene)?
https://github.com/INL/BlackLab/wiki/Features
It can look for patterns (like LT rules) in very large amounts of text (thanks
to Lucene) and return almost immediate answers.
It can process annotated text (part of speech and up to 10 or more other types
of linguistic information: semantics, tonality, etc.).
I have been playing with it and I think it could be a good help for computing
statistics on syntactic patterns from a large corpus, in order, maybe, to infer
correction rules from a corpus of incorrect sentences.
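
As a rough illustration of that last idea, here is a minimal Python sketch. It
does not use BlackLab or Lucene at all; it assumes the sentences are already
POS-tagged, and the function names, tag names and thresholds are made up for
the example. It counts POS n-grams and keeps those that recur in the incorrect
corpus but never appear in a correct one, as candidates for correction rules.

```python
from collections import Counter


def pos_ngrams(tagged_sentence, n=3):
    """Return the POS-tag n-grams of a sentence given as (word, tag) pairs."""
    tags = [tag for _word, tag in tagged_sentence]
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def suspicious_patterns(incorrect, correct, n=3, min_count=2):
    """POS n-grams seen at least min_count times in the incorrect corpus
    and never in the correct corpus."""
    bad = Counter(g for s in incorrect for g in pos_ngrams(s, n))
    good = Counter(g for s in correct for g in pos_ngrams(s, n))
    return sorted(g for g, c in bad.items() if c >= min_count and good[g] == 0)


# Hypothetical, hand-tagged toy data.
INCORRECT = [
    [("he", "PRON"), ("have", "VERB_BASE"), ("gone", "VERB_PP")],
    [("she", "PRON"), ("have", "VERB_BASE"), ("left", "VERB_PP")],
]
CORRECT = [
    [("he", "PRON"), ("has", "VERB_3SG"), ("gone", "VERB_PP")],
]

if __name__ == "__main__":
    print(suspicious_patterns(INCORRECT, CORRECT))
```

On a real corpus the counting would be done by the search engine rather than
in memory; the sketch only shows the comparison step.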
Sorry, I have not yet read the full LT documentation, but I thought I could
save some time by asking on the dev mailing list.
Thank you,
Cheers,
Elie Naulleau