[ 
https://issues.apache.org/jira/browse/HIVEMALL-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Makoto Yui updated HIVEMALL-248:
--------------------------------
    Description: 
In tokenize_ja, the user needs to provide stoptags; tokens matching these 
tags are removed from the token stream. So, stoptags act as an "exclusive" rule.

So, create a UDF for an "inclusive" rule. stoptags_ja_exclude('名詞') returns 
an array of tags excluding '名詞' (noun). 
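
The intended behavior can be sketched outside Hive as a simple list complement. The following Python snippet is only an illustration under stated assumptions, not the UDF implementation: ALL_TAGS is a small hand-picked subset of the tags in the stoptags.txt linked below, the function name stoptags_exclude is hypothetical, and treating subcategories such as '名詞-一般' as covered by '名詞' is an assumption this issue does not specify.

```python
# Illustrative subset of Kuromoji part-of-speech tags (assumption: the real
# UDF would use the complete list from Lucene's stoptags.txt).
ALL_TAGS = [
    "名詞", "名詞-一般", "名詞-固有名詞",
    "動詞", "動詞-自立",
    "助詞", "助詞-格助詞",
    "助動詞",
    "記号",
]


def stoptags_exclude(keep_tags, all_tags=ALL_TAGS):
    """Return every tag in all_tags except those the caller wants to keep.

    Assumption: a tag counts as kept when it equals one of keep_tags or is
    a subcategory of it (i.e. starts with the kept tag followed by '-').
    """
    def is_kept(tag):
        return any(tag == k or tag.startswith(k + "-") for k in keep_tags)

    return [tag for tag in all_tags if not is_kept(tag)]


# Keep nouns only: everything else becomes a stoptag.
stoptags = stoptags_exclude(["名詞"])
print(stoptags)
```

The resulting array is what would be passed to tokenize_ja as its stoptags, so that only tokens tagged as nouns remain in the token stream.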

 
[https://github.com/apache/lucene-solr/blob/master/lucene/analysis/kuromoji/src/resources/org/apache/lucene/analysis/ja/stoptags.txt]

  was:
In tokenize_ja, the user needs to provide stoptags; tokens matching these 
tags are removed from the token stream. So, stoptags act as an "exclusive" rule.

So, create a UDF for an "inclusive" rule. stoptags_ja_exclude('名詞') returns 
an array of tags excluding '名詞' (noun). 
[
https://github.com/apache/lucene-solr/blob/master/lucene/analysis/kuromoji/src/resources/org/apache/lucene/analysis/ja/stoptags.txt]


> UDFs for Kuromoji stoptags
> --------------------------
>
>                 Key: HIVEMALL-248
>                 URL: https://issues.apache.org/jira/browse/HIVEMALL-248
>             Project: Hivemall
>          Issue Type: Wish
>    Affects Versions: 0.5.2
>            Reporter: Makoto Yui
>            Assignee: Makoto Yui
>            Priority: Minor
>             Fix For: 0.6.0
>
>
> In tokenize_ja, the user needs to provide stoptags; tokens matching these 
> tags are removed from the token stream. So, stoptags act as an "exclusive" 
> rule.
> So, create a UDF for an "inclusive" rule. stoptags_ja_exclude('名詞') returns 
> an array of tags excluding '名詞' (noun). 
>  
> [https://github.com/apache/lucene-solr/blob/master/lucene/analysis/kuromoji/src/resources/org/apache/lucene/analysis/ja/stoptags.txt]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
