On 2014-08-18 17:18, R.J. Baars wrote:
I was able to test, and removed 2 of my additions to make it work.
Thanks, I have committed it. It will be part of the daily builds
tonight:
https://languagetool.org/download/snapshots/?C=M;O=D
Regards
Daniel
I am currently checking the output of all the rules on the 20GB corpus;
some rules are perfect, some less so (and those are hard to tweak).
The result will be a major update, I guess...
Ruud
There is an adjustment to make in the sentence splitter. But where did the
.srx go?
I found a commonly used abbreviation that is currently treated as a
sentence end:
milj.
Could this be added to the Dutch srx rules?
Ruud
The same applies to [0-9]{1,2}[-]pers.
Ruud
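To illustrate what such a no-break exception does, here is a minimal Python sketch. It is not LanguageTool's splitter and not SRX syntax; the function name split_sentences and the list NO_BREAK_BEFORE are hypothetical, and only the two patterns (milj. and [0-9]{1,2}[-]pers.) come from this thread.

```python
import re

# Hypothetical no-break patterns: if the text up to a candidate break
# ends with one of these, the break is suppressed.
NO_BREAK_BEFORE = [
    r"\bmilj\.",               # abbreviation mentioned in the thread
    r"\b[0-9]{1,2}[-]pers\.",  # pattern mentioned in the thread, e.g. "4-pers."
]

# Candidate break (an assumption): final punctuation followed by whitespace.
BREAK = re.compile(r"[.!?]\s+")


def split_sentences(text):
    """Split text at candidate breaks unless a no-break pattern ends there."""
    sentences, start = [], 0
    for m in BREAK.finditer(text):
        punct_end = m.start() + 1          # index just after the punctuation
        before = text[start:punct_end]     # current sentence up to the break
        if any(re.search(p + r"\Z", before) for p in NO_BREAK_BEFORE):
            continue                       # abbreviation: do not split here
        sentences.append(before)
        start = m.end()                    # next sentence starts after the space
    if start < len(text):
        sentences.append(text[start:])     # trailing text without a break
    return sentences


print(split_sentences("Het kost 3 milj. euro. Dat is veel."))
print(split_sentences("Een 4-pers. tent is groot. Koop hem."))
```

Without the two patterns, both example texts would be cut after "milj." and "4-pers." respectively, which is the behaviour reported above.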
On 2014-08-18 16:16, R.J. Baars wrote:
There is an adjustment to make in the sentence splitter. But where did
the .srx go?
It's at
languagetool-core/src/main/resources/org/languagetool/resource/segment.srx
Could this be added to the Dutch srx rules?
Sure, could you send a patch?
Regards
I am not qualified to edit sources; I am just not a programmer.
Unfortunately, the srx is not split per language either.
I found the source on GitHub (which I don't really understand), so I will
be able to adjust it and send it to you.
But how can I test it if it is not in the runtime version?
Ruud
On 2014-04-12 09:55, Daniel Naber wrote:
On 2014-04-12 09:34, Marcin Miłkowski wrote:
SRX file can be easily edited and we will happily accept all patches,
also for languages without complete support in LT. Where's the problem?
Today, you can extend the Language class and have a Regex
On 01.05.2013, 12:18:41 Andriy Rysin wrote:
P.S. BTW wouldn't it make sense to split segment.srx by language
modules?
Absolutely. This isn't very high on my personal TODO list though, so any
help/patches are welcome.
Regards
Daniel
--
http://www.danielnaber.de
Most srx-compliant software uses a single file for all languages, AFAIK.
Regards, Marcin
On 02-05-2013 09:08, Daniel Naber list2...@danielnaber.de wrote:
On 01.05.2013, 12:18:41 Andriy Rysin wrote:
P.S. BTW wouldn't it make sense to split segment.srx by language
modules
Hi all
I need a bit of help with the srx sentence tokenizer. I've added this rule
to prevent a sentence split on name abbreviation + surname, e.g. Т.Шевченко,
which occurs often in texts.
The rule will need to be a bit more complex, but I am trying something
simple first.
rule break=no
beforebreak\b[А-ЯІЇЄҐ
Maybe the part after the \. should be in the afterbreak element?
Regards,
Piotr
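For readers unfamiliar with the two elements: as I understand SRX, beforebreak is matched against the text ending at the candidate position and afterbreak against the text starting there. The Python sketch below models that; the helper no_break_at and the patterns BEFORE and AFTER are hypothetical illustrations, not a reconstruction of the rule quoted above (which is cut off in the archive).

```python
import re


def no_break_at(text, pos, beforebreak, afterbreak):
    """Return True if a no-break rule applies at index pos.

    beforebreak must match text ending exactly at pos,
    afterbreak must match text starting exactly at pos.
    """
    before_ok = re.search("(?:" + beforebreak + r")\Z", text[:pos]) is not None
    after_ok = re.match(afterbreak, text[pos:]) is not None
    return before_ok and after_ok


# Hypothetical patterns: a single capital initial with a dot, then a surname.
BEFORE = r"\b[А-ЯІЇЄҐ]\."
AFTER = r"\s*[А-ЯІЇЄҐ][а-яіїєґ]+"

text = "Це написав Т. Шевченко. Він поет."
pos = text.index("Т.") + 2  # candidate break right after the initial's dot

# Surname pattern in afterbreak: the rule applies, so no split here.
print(no_break_at(text, pos, BEFORE, AFTER))

# Whole pattern in beforebreak: the surname is not yet in the text
# before pos, so the rule cannot match and the split still happens.
print(no_break_at(text, pos, BEFORE + AFTER, ""))
```

The second call shows why moving the part after the \. into afterbreak matters: at the position of the dot, only the initial lies before the break.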
On Wed, May 1, 2013 at 6:18 PM, Andriy Rysin ary...@gmail.com wrote:
Hi all
I need a bit of help with the srx sentence tokenizer. I've added this rule
to prevent a sentence split on name abbreviation + surname, e.g
Thanks, that helped!
Andriy
On 05/01/2013 02:54 PM, Piotr wrote:
Maybe the part after the \. should be in the afterbreak element?
Regards,
Piotr