Maybe the part after the \. should be in the afterbreak element?
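
As far as I understand SRX, a rule applies at the position between the text
matched by beforebreak and the text matched by afterbreak. With the whole
pattern in beforebreak, the exception would apply after the first letter of
the surname rather than right after the period. Something like this might
work (an untested sketch, assuming the character class stays as in your rule):

<!-- untested sketch: initial and period before the break point, surname letter after it -->
<rule break="no">
<beforebreak>\b[А-ЯІЇЄҐ]\.</beforebreak>
<afterbreak>[А-ЯІЇЄҐ]</afterbreak>
</rule>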

Regards,

Piotr


On Wed, May 1, 2013 at 6:18 PM, Andriy Rysin <ary...@gmail.com> wrote:

> Hi all
>
> I need a bit of help with the SRX sentence tokenizer. I've added this rule to
> prevent a sentence split between an abbreviated first name and a surname, e.g.
> "Т.Шевченко", which occurs often in texts.
> The rule will need to be a bit more complex, but I am trying something
> simple first.
>
> <rule break="no">
> <beforebreak>\b[А-ЯІЇЄҐ]\.[А-ЯІЇЄҐ]</beforebreak>
> <afterbreak></afterbreak>
> </rule>
>
> But my test in UkrainianSRXSentenceTokenizerTest.java fails (it's
> currently commented out in svn):
>
>     testSplit("Наша зустріч з А.Марчуком відбулася в грудні минулого року.");
>
> I tried tweaking the regex a bit, but nothing helps. I've added a couple of
> other rules and they worked OK.
>
> Any help would be greatly appreciated.
>
> Thanks
> Andriy
>
> P.S. BTW, wouldn't it make sense to split segment.srx by language module?
>
>
> ------------------------------------------------------------------------------
> Introducing AppDynamics Lite, a free troubleshooting tool for Java/.NET
> Get 100% visibility into your production application - at no cost.
> Code-level diagnostics for performance bottlenecks with <2% overhead
> Download for free and get started troubleshooting in minutes.
> http://p.sf.net/sfu/appdyn_d2d_ap1
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>
>