Makoto Yui created HIVEMALL-248:
-----------------------------------

             Summary: UDFs for Kuromoji stoptags
                 Key: HIVEMALL-248
                 URL: https://issues.apache.org/jira/browse/HIVEMALL-248
             Project: Hivemall
          Issue Type: Wish
    Affects Versions: 0.5.2
            Reporter: Makoto Yui
            Assignee: Makoto Yui
             Fix For: 0.6.0


In tokenize_ja, the user needs to provide stoptags; tokens whose part-of-speech 
tag matches one of them are removed from the token stream. So stoptags act as an 
"exclusive" rule.

So, create a UDF for an "inclusive" rule: stoptags_ja_exclude('名詞') returns an 
array of all tags except '名詞' (noun). Passing that array as stoptags keeps only 
the tokens tagged '名詞'. 
[https://github.com/apache/lucene-solr/blob/master/lucene/analysis/kuromoji/src/resources/org/apache/lucene/analysis/ja/stoptags.txt]
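
The intended behaviour can be sketched outside Hive as a simple list filter. This 
is only an illustration, not the UDF's implementation: the function name 
stoptags_exclude, the hardcoded ALL_TAGS subset, the list-valued argument, and 
the choice to drop hyphenated subcategories of an excluded tag (e.g. '名詞-一般' 
when '名詞' is excluded) are all assumptions not stated in this issue.

```python
# Illustrative subset of Kuromoji part-of-speech tags.
# Assumption: the real UDF would draw on the full list in stoptags.txt.
ALL_TAGS = [
    "名詞", "名詞-一般", "名詞-固有名詞",  # noun and two subcategories
    "動詞", "動詞-自立",                  # verb
    "助詞", "助詞-格助詞",                # particle
    "助動詞",                             # auxiliary verb
    "記号",                               # symbol
]


def stoptags_exclude(excluded, all_tags=ALL_TAGS):
    """Return every tag in all_tags that is not covered by `excluded`.

    Assumption: excluding a tag also excludes its hyphen-separated
    subcategories; the issue text does not specify this.
    """
    def is_excluded(tag):
        return any(tag == e or tag.startswith(e + "-") for e in excluded)

    return [tag for tag in all_tags if not is_excluded(tag)]


# Stoptags that remove everything except nouns from the token stream.
stoptags = stoptags_exclude(["名詞"])
```

The resulting list would then be passed as the stoptags argument of tokenize_ja, 
so the caller names the tags to keep rather than enumerating the ones to drop.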



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
