[ 
https://issues.apache.org/jira/browse/LUCENE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801163#action_12801163
 ] 

Simon Willnauer commented on LUCENE-2206:
-----------------------------------------

Robert, the patch looks good except for one thing:
{code}
  public static HashSet<String> getSnowballWordSet(Reader reader)
{code}

It returns a HashSet<String> but should really return a Set<String>. We plan to 
change all return types to the interface instead of the implementation.
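
To illustrate the point, here is a minimal sketch of the difference. The class 
and method names (WordSetReturnTypeSketch, loadAsHashSet, loadAsSet) are 
hypothetical and are not part of the patch or of WordListLoader; they only 
show why an interface return type leaves the implementation free to change.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the patch.
final class WordSetReturnTypeSketch {

  // Concrete return type: callers may start to depend on HashSet itself,
  // so the implementation cannot be swapped later without breaking them.
  static HashSet<String> loadAsHashSet(String... words) {
    HashSet<String> result = new HashSet<String>();
    Collections.addAll(result, words);
    return result;
  }

  // Interface return type: the body may return any Set implementation,
  // here an unmodifiable view over a LinkedHashSet, without an API change.
  static Set<String> loadAsSet(String... words) {
    Set<String> result = new LinkedHashSet<String>();
    Collections.addAll(result, words);
    return Collections.unmodifiableSet(result);
  }

  public static void main(String[] args) {
    Set<String> stopwords = loadAsSet("the", "and", "of");
    System.out.println(stopwords.contains("and"));
  }
}
```

With the Set<String> signature, callers only rely on the Set contract, so a 
later switch to a different implementation would not require them to change.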


> integrate snowball stopword lists
> ---------------------------------
>
>                 Key: LUCENE-2206
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2206
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.1
>
>         Attachments: LUCENE-2206.patch
>
>
> The snowball project creates stopword lists as well as stemmers, for example: 
> http://svn.tartarus.org/snowball/trunk/website/algorithms/english/stop.txt?view=markup
> This patch includes the following:
> * snowball stopword lists for 13 languages in contrib/snowball/resources
> * all stoplists are otherwise unmodified; I only added a license header and 
> converted each one from its original encoding to UTF-8
> * added getSnowballWordSet to WordListLoader, because the format of these 
> files is very different: for example, it supports multiple words per line 
> and embedded comments.
> I did not add any changes to SnowballAnalyzer to actually use these lists 
> automatically yet; I would like us to discuss this in a future issue 
> proposing to integrate snowball with contrib/analyzers.
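
For readers unfamiliar with the file format mentioned in the issue description 
(multiple words per line, embedded comments), the following is a rough sketch 
of how such a list could be parsed. It is not the code from the attached patch. 
It assumes that a '|' character starts a comment running to the end of the 
line, as in the published Snowball stop.txt files, and that words are separated 
by whitespace. The class name SnowballStopwordSketch is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the implementation from LUCENE-2206.patch.
final class SnowballStopwordSketch {

  // Reads a Snowball-style stopword list and returns the words as a Set.
  // Assumption: '|' begins a comment that runs to the end of the line.
  static Set<String> getSnowballWordSet(Reader reader) throws IOException {
    Set<String> result = new HashSet<String>();
    BufferedReader br = new BufferedReader(reader);
    try {
      String line;
      while ((line = br.readLine()) != null) {
        // Drop the comment part of the line, if any.
        int comment = line.indexOf('|');
        if (comment >= 0) {
          line = line.substring(0, comment);
        }
        // A line may hold several whitespace-separated words.
        for (String word : line.split("\\s+")) {
          if (word.length() > 0) {
            result.add(word);
          }
        }
      }
    } finally {
      br.close();
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    String sample = "| a comment line\ni | subject pronoun\nme my myself\n";
    System.out.println(getSnowballWordSet(new StringReader(sample)));
  }
}
```

The method is declared to return Set<String> rather than HashSet<String>, in 
line with the review comment above.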

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


