Hi,

The UpperCaseSentenceStart rule is common to many languages. The problem
with enumerations and tables is also common to all languages.

We had to strike a balance between too many false positives and some false
negatives. LanguageTool has no information about tables or enumerations, so
we have to make some assumptions. As it stands, the rule doesn't require
upper case when neither the analyzed sentence nor the previous sentence
ends in a full stop (or an equivalent such as !, ?, or ...). In that case,
we assume that we are probably in a table or in an enumeration.

As I said, the matches in Russian don't happen in command-line or JUnit
tests. I think that they appear in the Wikipedia tests because in these
tests we don't have information about the previous sentences.
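
To make the heuristic described above concrete, here is a minimal sketch in
Java. It is not the actual UpperCaseSentenceStart code: the class name, the
method names, and the plain-string sentence representation are all
hypothetical, and treating a missing previous sentence (null) as "not a
list" is only my assumption about why the tests without context might show
extra matches.

```java
// Illustrative sketch only; none of these names come from LanguageTool.
final class UpperCaseHeuristicSketch {

    // Characters treated as "full stop or equivalent" (assumed set).
    private static final String TERMINATORS = ".!?…";

    // True if the trimmed text ends in a full stop or equivalent.
    static boolean endsWithTerminator(String sentence) {
        String s = sentence.trim();
        return !s.isEmpty() && TERMINATORS.indexOf(s.charAt(s.length() - 1)) >= 0;
    }

    // Guess "table or enumeration" when neither sentence ends in a terminator.
    // Assumption: with no previous sentence (null) we cannot make that guess.
    static boolean looksLikeListOrTable(String previous, String current) {
        if (previous == null) {
            return false;
        }
        return !endsWithTerminator(previous) && !endsWithTerminator(current);
    }

    // Report a match for a lower-case start unless the list/table guess applies.
    static boolean shouldFlag(String previous, String current) {
        String s = current.trim();
        if (s.isEmpty()) {
            return false;
        }
        boolean startsLowerCase = Character.isLowerCase(s.codePointAt(0));
        return startsLowerCase && !looksLikeListOrTable(previous, current);
    }

    public static void main(String[] args) {
        System.out.println(shouldFlag("apples", "oranges"));      // list-like context
        System.out.println(shouldFlag(null, "oranges"));          // no context available
        System.out.println(shouldFlag("First item.", "second.")); // ordinary text
    }
}
```

Under these assumptions, shouldFlag("apples", "oranges") gives no match,
while shouldFlag(null, "oranges") does, which would be consistent with
matches showing up only where the previous sentence is unknown.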

Regards,
Jaume Ortolà



2014-12-21 18:35 GMT+01:00 Yakov Reztsov <yakovr...@mail.ru>:

> Hi,
>
>
> Sat, 20 Dec 2014 11:36:13 +0100 from Jaume Ortolà i Font:
>
> Hi,
>
> I have modified the rule UpperCaseSentenceStart so that there is a match
> in sentences starting with quotes like « or “ and a lower case word.
>
> In the nightly tests there are some new matches for different languages.
> Tell me if there is any problem.
>
> In French there are new matches caused by wrong sentence tokenization:
>
> « Ne nous sommes-nous pas rencontrés quelque part avant ? » demanda
> l'étudiant.
>
> The matches in Polish and Russian do not happen in local tests. I don't
> know why they happen in the regression tests.
>
> Regards,
> Jaume Ortolà
>
>
> The matches for Russian are regressions.
> The examples in
> https://www.languagetool.org/regression-tests/20141219/result_ru_20141219.html
> are items of bulleted lists or cells of tables in the document.
> But in plain text these words should be written with a capital letter.
>
>
> --
>
> Yakov Reztsov
>
>
> ------------------------------------------------------------------------------
> Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
> from Actuate! Instantly Supercharge Your Business Reports and Dashboards
> with Interactivity, Sharing, Native Excel Exports, App Integration & more
> Get technology previously reserved for billion-dollar corporations, FREE
>
> http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>
>