Hello,
When LT tags text, the tag <S> shows the start of a sentence, doesn't it?
With one particular disambiguation.xml file, I get unexpected results for the
tagged text. LT gives multiple instances of the sentence start marker <S>, as
shown in this output from the GUI:
<S><S><S><S> testword[</S>testword/TESTPOS]
The first rule in my disambiguation.xml is as follows (Testrules gives no
errors):
<rule id="add_TESTPOS" name="add TESTPOS">
  <pattern>
    <token>testword</token>
  </pattern>
  <disambig action="add"><wd pos="TESTPOS"/></disambig>
</rule>
If I put that rule in the LanguageTool disambiguation.xml file, there is only
one <S> tag, as I expect.
I do not understand:
1. How can there be multiple sentence starts?
2. Something in my disambiguation.xml makes LT show multiple <S>. But this is
the FIRST rule. How can rules that come after the first rule affect the
tagging? (The rules "are applied in the order as they appear in the file"
http://wiki.languagetool.org/developing-a-disambiguator .)
(I think that this is a bug. Probably, I will send more related questions, but
for now, I want to keep things simple and focus only on one thing at a time.)
Regards,
Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm