Hello,

When LT tags text, the tag <S> shows the start of a sentence, doesn't it?

With one particular disambiguation.xml file, I get unexpected results for the 
tagged text. LT gives multiple instances of the sentence start marker <S>, as 
shown in this output from the GUI:
<S><S><S><S> testword[</S>testword/TESTPOS] 

The first rule in my disambiguation.xml is as follows (testrules gives no 
errors):

    <rule id="add_TESTPOS" name="add TESTPOS">
      <pattern>
        <token>testword</token>
      </pattern>
      <disambig action="add"><wd pos="TESTPOS"/></disambig>
    </rule>

If I put that rule in the LanguageTool disambiguation.xml file, there is only 
one <S> tag, as I expect.

I do not understand:
1. How can there be multiple sentence starts?
2. Something in my disambiguation.xml makes LT show multiple <S>. But this is 
the FIRST rule. How can rules that come after the first rule affect the 
tagging? (According to the wiki, the rules "are applied in the order as they 
appear in the file": http://wiki.languagetool.org/developing-a-disambiguator)

(I think that this is a bug. Probably, I will send more related questions, but 
for now, I want to keep things simple and focus only on one thing at a time.)

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 



