At Sun, 24 Aug 2014 14:21:52 +0200,
Silvan Jegen wrote:
Because the tagger library used by them (called 'sen') does the
tokenization and tagging in one step, these two steps cannot be separated
as cleanly as required by the interfaces used in LT.
Yes, almost Japanese morphological analysis
On 2014-08-24 14:21, Silvan Jegen wrote:
3. When the JapaneseTagger is called with the above (null/empty)
List<String> as input we ignore the input parameter. Instead we get
the analyzedTokens field directly from the JapaneseWordTokenizer
(a reference to which we saved within
On 2014-08-25 11:05, Daniel Naber wrote:
On 2014-08-24 14:21, Silvan Jegen wrote:
3. When the JapaneseTagger is called with the above (null/empty)
List<String> as input we ignore the input parameter. Instead we get
the analyzedTokens field directly from the JapaneseWordTokenizer
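
For readers following along, the arrangement described in step 3 could be
sketched roughly as below. This is only an illustration under stated
assumptions, not the actual LT code: SketchTokenizer, SketchTagger,
TaggedToken and analyze() are hypothetical names, and analyze() is a toy
stand-in for a one-step analyzer (it is not the sen API).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CombinedAnalysisSketch {

    /** One token together with its tag (hypothetical type, not an LT class). */
    static final class TaggedToken {
        final String surface;
        final String posTag;

        TaggedToken(String surface, String posTag) {
            this.surface = surface;
            this.posTag = posTag;
        }
    }

    /**
     * Toy stand-in for an analyzer that tokenizes and tags in one step.
     * It splits on whitespace and tags all-digit tokens "NUM", others "WORD".
     */
    static List<TaggedToken> analyze(String text) {
        List<TaggedToken> result = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (part.isEmpty()) {
                continue;
            }
            String tag = part.chars().allMatch(Character::isDigit) ? "NUM" : "WORD";
            result.add(new TaggedToken(part, tag));
        }
        return result;
    }

    /** Runs the combined analysis once and keeps the result in a field. */
    static final class SketchTokenizer {
        private List<TaggedToken> analyzedTokens = Collections.emptyList();

        List<String> tokenize(String text) {
            analyzedTokens = analyze(text);
            // The tagger ignores this list, so an empty one is handed back.
            return Collections.emptyList();
        }

        List<TaggedToken> getAnalyzedTokens() {
            return analyzedTokens;
        }
    }

    /** Ignores its input and reads the stored result from the tokenizer. */
    static final class SketchTagger {
        private final SketchTokenizer tokenizer;

        SketchTagger(SketchTokenizer tokenizer) {
            this.tokenizer = tokenizer;
        }

        List<TaggedToken> tag(List<String> ignoredTokens) {
            return tokenizer.getAnalyzedTokens();
        }
    }

    public static void main(String[] args) {
        SketchTokenizer tokenizer = new SketchTokenizer();
        SketchTagger tagger = new SketchTagger(tokenizer);
        List<String> tokens = tokenizer.tokenize("LT 2014");
        for (TaggedToken t : tagger.tag(tokens)) {
            System.out.println(t.surface + "/" + t.posTag);
        }
    }
}
```

Note that in this sketch the tagger only returns something meaningful if
tokenize() was called first on the same tokenizer instance, which is the
hidden coupling between the two classes.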
On 2014-08-25 12:27, Silvan Jegen wrote:
I agree that it would be about equally confusing (and inelegant) but at
least it would save some unnecessary work for LT.
I don't think we should argue with performance unless there's a
real-world use case that's actually too slow and we can show that
On Mon, Aug 25, 2014 at 12:47:06PM +0200, Daniel Naber wrote:
On 2014-08-25 12:27, Silvan Jegen wrote:
I agree that it would be about equally confusing (and inelegant) but at
least it would save some unnecessary work for LT.
I don't think we should argue with performance unless there's a
Hi
I realized that the current implementations of the JapaneseWordTokenizer
and JapaneseTagger work in quite an odd way.
Because the tagger library used by them (called 'sen') does the
tokenization and tagging in one step, these two steps cannot be separated
as cleanly as required by the