I think a good addition for LT would be a general rule that acts
only on tokens, a bit like SRX does with letters.
bed : ok
bed english : not ok = bad english
A mechanism that lets the longer token list overrule the shorter one.
This would make it possible to add errors empirically as they are found.
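To make the "longer token list overrules the shorter one" idea concrete, here is a minimal sketch in Python. It is not how LT works and not a proposal for its API; the `ENTRIES` table and the `check` function are hypothetical names, and the entries are the examples from this thread. At each position the longest known token group wins, so a group marked correct can shield a shorter group marked wrong, and the other way round.

```python
# Hypothetical table: token group -> suggestion, or None if the group is correct.
ENTRIES = {
    ("bed",): None,                                # "bed" : ok
    ("bed", "english"): "bad english",            # "bed english" : not ok
    ("hand", "some"): "handsome",                 # "hand some" : wrong
    ("i", "hand", "some", "tools", "to"): None,   # longer group : correct
}
MAX_LEN = max(len(group) for group in ENTRIES)


def check(text):
    """Return (token index, matched group, suggestion) for each error found."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate group first, so it overrules shorter ones.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            group = tuple(tokens[i:i + n])
            if group in ENTRIES:
                suggestion = ENTRIES[group]
                if suggestion is not None:
                    matches.append((i, " ".join(group), suggestion))
                i += n  # skip past the whole group, right or wrong
                break
        else:
            i += 1  # no known group starts here
    return matches
```

With this table, `check("he speaks bed english")` reports "bed english", while `check("I hand some tools to him")` reports nothing because the longer, correct group covers "hand some".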
On 2014-08-18 17:18, R.J. Baars wrote:
I was able to test, and removed 2 of my additions to make it work.
Thanks, I have committed it. It will be part of the daily builds
tonight:
https://languagetool.org/download/snapshots/?C=M;O=D
Regards
Daniel
I am currently checking the output of all the rules on the 20GB corpus;
some rules are perfect, some less (though hard to tweak).
Result will be a major update, I guess...
Ruud
On 2014-08-18 17:18, R.J. Baars wrote:
I was able to test, and removed 2 of my additions to make it work.
Thanks,
This generates an error. How do I add multiple urls to 1 rule?
<url>https://onzetaal.nl/taaladvies/advies/instandhouden-in-stand-houden/</url>
<url>http://taaladvies.net/taal/advies/vraag/412/in_bedrijf_stelling_inbedrijfstelling/</url>
Ruud
On 2014-08-19 12:41, R.J. Baars wrote:
This generates an error. How do I add multiple urls to 1 rule?
Only one URL per rule is currently supported.
Regards
Daniel
On 2014-08-19 09:14, R.J. Baars wrote:
bed : ok
bed english : not ok = bad english
For some types of errors, I think it works better than the current
rule/exception type of check.
I'm not sure I understand: do you suggest a different (more compact) way
to write down simple rules, or do you
This is a limitation; sometimes there are several texts to refer to ...
Is it on the wishlist to change this?
On 2014-08-19 12:41, R.J. Baars wrote:
This generates an error. How do I add multiple urls to 1 rule?
Only one URL per rule is currently supported.
Regards
Daniel
What I mean is just making a list of token groups, good and bad.
I'll try a different example:
hand some; wrong; handsome
I hand some tools to; correct
Another one:
bene; wrong;been
nota bene;correct
It is a very compact way of defining very simple rules.
I encounter rules that work fine,
On 2014-08-19 13:35, R.J. Baars wrote:
This is a limitation; sometimes there are several texts to refer to ...
Is it on the wishlist to change this?
Please open an issue at
https://github.com/languagetool-org/languagetool/issues
Regards
Daniel
On 2014-08-19 13:43, R.J. Baars wrote:
hand some; wrong; handsome
I hand some tools to; correct
It is a very compact way of defining very simple rules.
I see, but the thing is that these rules probably won't stay simple for
long. What if you want to add you hand some tools, we hand some
Hi,
some people will already have noticed it: LT has just added support for
Persian. This is an important step, as Persian is the first
right-to-left language we support. There will probably be some bugs, but
for now it's looking good and there have only been minor issues.
Persian is already
Postags are a challenge. So many words have that many postags that
it will be hard to determine them really well.
I will spend some time on the disambiguator, assigning postags as well
as deleting ambiguous ones where possible.
First we have to establish a better postagging system
Persian or Farsi?
Hi,
some people will already have noticed it: LT has just added support for
Persian. This is an important step, as Persian is the first
right-to-left language we support. There will probably be some bugs, but
for now it's looking good and there have only been minor issues.
https://en.wikipedia.org/wiki/Persian_language
On Tue, Aug 19, 2014 at 5:33 PM, R.J. Baars r.j.ba...@xs4all.nl wrote:
Persian or Farsi?
Hi,
some people will already have noticed it: LT has just added support for
Persian. This is an important step, as Persian is the first
java -jar languagetool.jar
(process:14246): GLib-CRITICAL **: g_slice_set_config: assertion
'sys_page_size == 0' failed
This occurs after clicking on the reference url in the UI; it works though.
Ruud
Okay, if you need sentences to test on, or plain word lists with
frequencies, or word groups and their frequencies, I have them.
Ruud
https://en.wikipedia.org/wiki/Persian_language
On Tue, Aug 19, 2014 at 5:33 PM, R.J. Baars r.j.ba...@xs4all.nl wrote:
Persian or Farsi?
Hi,
some
Thanks,
I want to write a suggestion for the message of this rule:
<rule id="PluralFix" name="ZWNJ for Plural extension">
<pattern>
<token regexp='yes'>[ءآأؤإئابپةتثجحخچدذرزژسشصضطظعغفقكکگلمنهوىیيًٌٍَُِّْ]+</token>
<token regexp='yes'>ها(ی|یی|یم|یت|یش|مان|تان|شان|)</token>
I discovered that the rule below is not working very well.
It looks like 'skip' also skips over sentence boundaries.
Is that intentional? Or is something else wrong?
In case it is intentional, is there an option to forbid that?
Ruud
<rule id="nr738" name="duur kost">
<pattern>
<token skip="4">duur</token>
Hi,
while we're adding support for languages (Tamil, Persian), we're less
successful in finding maintainers for the unmaintained languages we
support. Here's a list of languages that need a maintainer:
Lithuanian
Belarusian
Malayalam
Swedish
Icelandic
Japanese
Danish
Galician
Romanian
Chinese
On 2014-08-19 15:39, R.J. Baars wrote:
I discovered that the rule below is not working very well.
It looks like 'skip' also skips over sentence boundaries.
No, that shouldn't be possible. Maybe sentence detection is broken? What
sentence does this match that it shouldn't?
Regards
Daniel
On 2014-08-19 15:32, Reza engyian wrote:
It should suggest:
first_word+ZWNJ+ها
Would you please tell me how I should write the suggestion?
I haven't tested it, but this should work:
<message>Do you mean <suggestion>\1&#8204;ها</suggestion>?</message>
Regards
Daniel
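For readers unfamiliar with the entity in that suggestion: 8204 is the decimal code point of U+200C, the zero-width non-joiner (ZWNJ). The small Python sketch below only illustrates the string the suggestion is meant to produce; `suggest_plural` is a hypothetical helper, not part of LT, and it assumes `\1` stands for the first matched token.

```python
ZWNJ = chr(8204)  # U+200C ZERO WIDTH NON-JOINER, written &#8204; in XML


def suggest_plural(first_word: str) -> str:
    """Build first_word + ZWNJ + plural suffix, as asked for in the thread."""
    return first_word + ZWNJ + "ها"


# The ZWNJ is invisible, so print the code points to confirm it is there.
print([hex(ord(ch)) for ch in suggest_plural("کتاب")])
```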
R.J. Baars r.j.ba...@xs4all.nl wrote:
I discovered that the rule below is not working very well.
It looks like 'skip' also skips over sentence boundaries.
Is that intentional? Or is something else wrong?
In case it is intentional, is there an option to forbid that?
Ruud
rule id=nr738
Thanks, I will check it out.
The rule is not functioning very well at all. I commented it out and put
it on the list of items to do.
Ruud
R.J. Baars r.j.ba...@xs4all.nl wrote:
I discovered that the rule below is not working very well.
It looks like 'skip' also skips over sentence
Hi all,
I found some regexes that help improve punctuation. If LT does not
check these yet, please add them to the tool.
The second line after each regex is the suggestion.
---
There should be a space between words and numbers:
(\d+)(\w+)
$1 $2
(\w+)(\d+)
$1 $2
space between
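One caveat about the word/number patterns above, sketched in Python: in Python's `re` (and also in Java regexes) `\w` matches digits too, so `(\d+)(\w+)` also splits plain numbers. The variant below restricts the second class to letters. It is only an assumption about what was intended, `LETTER`, `naive_digit_word` and `space_letters_and_digits` are names made up for this sketch, and Python writes the replacement as `\1 \2` where the mail uses `$1 $2`.

```python
import re


def naive_digit_word(text):
    """The first pattern from the mail, applied as written."""
    return re.sub(r"(\d+)(\w+)", r"\1 \2", text)


# A letter is a word character that is neither a digit nor an underscore.
LETTER = r"[^\W\d_]"


def space_letters_and_digits(text):
    """Insert a space only where a digit and a letter are adjacent."""
    text = re.sub(rf"(\d)({LETTER})", r"\1 \2", text)
    return re.sub(rf"({LETTER})(\d)", r"\1 \2", text)


for sample in ["2014", "10kg", "page5"]:
    print(repr(sample), repr(naive_digit_word(sample)),
          repr(space_letters_and_digits(sample)))
```

Whether a space is wanted in every such case (for example in product names) is a separate question that the sketch does not address.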