Signed-off-by: Silvan Jegen s.je...@gmail.com
---
Hi
I had difficulties when creating Japanese rules because the mecab program
I used to determine the tokenization of the example phrases produced
different tokens than the tokenization library used in languagetool.
It took me quite a while to find
On 2014-08-11 01:07, Daniel Naber wrote:
On 2014-08-10 17:37, Silvan Jegen wrote:
If including the analyzed token readings is useful in other assertion
messages as well, it may also be better to refactor the token reading
code into its own function and make it less ad hoc.
What do you
On 2014-08-11 10:18, Daniel Naber wrote:
On 2014-08-11 09:01, Silvan Jegen wrote:
Maybe it would be best to automatically generate a mail to
this dev list whenever a Github issue has been opened...
I agree. Do you know an easy way to set that up, or do we need to
create a fake user
Hi
I realized that the current implementation of the JapaneseWordTokenizer
and JapaneseTagger work in quite an odd way.
Because the tagger library used by them (called 'sen') does the
tokenization and tagging in one step, these two steps cannot be separated
as cleanly as required by the
On 2014-08-25 11:05, Daniel Naber wrote:
On 2014-08-24 14:21, Silvan Jegen wrote:
3. When the JapaneseTagger is called with the above (null/empty)
List<String> as input we ignore the input parameter. Instead we get
the analyzedTokens field directly from the JapaneseWordTokenizer
On Mon, Aug 25, 2014 at 12:47:06PM +0200, Daniel Naber wrote:
On 2014-08-25 12:27, Silvan Jegen wrote:
I agree that it would be about equally confusing (and inelegant) but at
least it would save some unnecessary work for LT.
I don't think we should argue with performance unless there's
On Wed, May 13, 2015 at 05:16:10PM +0900, NOKUBI Takatsugu wrote:
On Wed, 13 May 2015 09:12:14 +0200
Daniel Naber daniel.na...@languagetool.org wrote:
<token regexp="yes">[\u3040-\u309F]+</token>
<token>X</token>
There are some exceptions, like the long consonant character (っ), but it is
not bad to
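To make the character range in the token pattern above concrete, here is a minimal Python sketch, not LanguageTool code. It assumes the pattern has to match the whole token, and the helper name `is_hiragana_token` is made up for this illustration. It shows that っ (U+3063) falls inside the Hiragana block U+3040 to U+309F, while the prolonged sound mark ー (U+30FC) does not.

```python
import re

# Same character class as in the quoted token pattern: the Unicode Hiragana block.
HIRAGANA_TOKEN = re.compile(r"[\u3040-\u309F]+")


def is_hiragana_token(token: str) -> bool:
    """Return True if the whole token consists of Hiragana characters only.

    Hypothetical helper for illustration; it assumes whole-token matching.
    """
    return HIRAGANA_TOKEN.fullmatch(token) is not None


if __name__ == "__main__":
    for sample in ["かっこいい", "っ", "カタカナ", "あー", "ー"]:
        print(sample, is_hiragana_token(sample))
```

A token ending in ー is rejected by this range, so a rule about the prolonged sound mark would have to treat that symbol separately.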
Hi
Thanks for considering writing a grammar rule for Japanese!
On 2015-05-13 07:43, Takatsugu Nokubi wrote:
I am considering writing a grammar rule for Japanese.
ー (prolonged sound mark) is a popular symbol in Japanese.
And the rule itself is simple:
The symbol is placed after Hiragana or
Heyho
On Wed, Jan 20, 2016 at 03:09:29PM -0800, Rick Genter wrote:
> There is a rule in the Japanese grammar.xml that says this:
>
>
>
> かっこいい
>
> 誤変換です。恰好良いの間違いです。
> あの人はかっこいい。
>
>
> A Japanese colleague of mine says that the suggestion is using the wrong
> first character:
Hi
Sadly, my math is weak but I will give it a try. Just make sure to
re-check :)
On Thu, Aug 06, 2015 at 11:29:05AM +0200, Daniel Naber wrote:
> we're using a bit of probability theory to calculate ngram probabilities.
> This way we can decide which word of a homophone pair like there/their is
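
As a rough illustration of the idea in the quoted text, the following Python sketch scores each homophone candidate with bigram probabilities and keeps the more likely one. It is not the LanguageTool implementation. The counts are toy numbers invented for this example, the add-one smoothing is an assumption, and the names `bigram_prob`, `sequence_log_prob` and `choose_homophone` are hypothetical.

```python
import math

# Toy counts invented purely for illustration; not real corpus data.
UNIGRAMS = {"is": 50, "over": 40, "there": 30, "their": 30, "house": 20}
BIGRAMS = {
    ("over", "there"): 12,
    ("over", "their"): 1,
    ("is", "their"): 2,
    ("their", "house"): 9,
}


def bigram_prob(prev: str, word: str) -> float:
    """P(word | prev) with add-one smoothing (an assumption of this sketch)."""
    vocab_size = len(UNIGRAMS)
    return (BIGRAMS.get((prev, word), 0) + 1) / (UNIGRAMS.get(prev, 0) + vocab_size)


def sequence_log_prob(tokens: list[str]) -> float:
    """Sum of log bigram probabilities over adjacent token pairs."""
    return sum(math.log(bigram_prob(p, w)) for p, w in zip(tokens, tokens[1:]))


def choose_homophone(tokens: list[str], index: int, candidates: list[str]) -> str:
    """Return the candidate that makes the token sequence most probable."""
    def score(candidate: str) -> float:
        variant = tokens[:index] + [candidate] + tokens[index + 1:]
        return sequence_log_prob(variant)

    return max(candidates, key=score)


if __name__ == "__main__":
    print(choose_homophone(["over", "their"], 1, ["there", "their"]))
    print(choose_homophone(["is", "there", "house"], 1, ["there", "their"]))
```

Working in log space avoids underflow when many small probabilities are multiplied; the decision only depends on which candidate has the higher total.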