Makoto Yui created HIVEMALL-248:
-----------------------------------
Summary: UDFs for Kuromoji stoptags
Key: HIVEMALL-248
URL: https://issues.apache.org/jira/browse/HIVEMALL-248
Project: Hivemall
Issue Type: Wish
Affects Versions: 0.5.2
Reporter: Makoto Yui
Assignee: Makoto Yui
Fix For: 0.6.0
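In tokenize_ja, the user needs to provide stoptags; tokens whose part-of-speech
tag matches a stoptag are removed from the token stream. So, stoptags are an
"exclusive" rule.
So, create a UDF for an "inclusive" rule. stoptags_ja_exclude('名詞') returns an
array of tags excluding '名詞' (noun), which can then be passed to tokenize_ja
as stoptags so that noun tokens are kept.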
[https://github.com/apache/lucene-solr/blob/master/lucene/analysis/kuromoji/src/resources/org/apache/lucene/analysis/ja/stoptags.txt]
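
The intended semantics can be sketched in Python as follows. This is only an illustration of the "inclusive" rule, not the UDF implementation: the function name stoptags_exclude, the ALL_TAGS list (a small subset of the Kuromoji part-of-speech tags, not the full stoptags.txt list), and the prefix matching of subcategories (e.g. '名詞' also covering '名詞-一般') are all assumptions.

```python
# Illustrative subset of Kuromoji part-of-speech tags (assumption: NOT the full list).
ALL_TAGS = [
    "名詞", "名詞-一般", "名詞-固有名詞",
    "動詞", "動詞-自立",
    "助詞", "助詞-格助詞",
    "助動詞", "記号",
]


def stoptags_exclude(keep_tags, all_tags=ALL_TAGS):
    """Return every tag in all_tags that is not covered by keep_tags.

    Assumption: a tag such as '名詞' also covers its subcategories,
    e.g. '名詞-一般', so those are excluded from the result as well.
    """
    def covered(tag):
        return any(tag == k or tag.startswith(k + "-") for k in keep_tags)

    return [t for t in all_tags if not covered(t)]


# Stoptags that would leave only noun tokens in the token stream.
stoptags = stoptags_exclude(["名詞"])
```

The resulting list is what would be handed to tokenize_ja as its stoptags argument, turning the "exclusive" stoptag rule into an "inclusive" one.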
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)