[
https://issues.apache.org/jira/browse/MAHOUT-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058171#comment-13058171
]
steven zhuang commented on MAHOUT-748:
--------------------------------------
Hi Sean, sorry, I know little about patching. And what is HEAD?
> WikipediaAnalyzer in 0.5 fails because Lucene 3.1's
> CharArraySet.iterator() returns a "char[]" iterator instead of a "String"
> iterator
> -------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-748
> URL: https://issues.apache.org/jira/browse/MAHOUT-748
> Project: Mahout
> Issue Type: Bug
> Components: Examples
> Affects Versions: 0.5
> Reporter: steven zhuang
> Priority: Minor
> Fix For: 0.6
>
> Attachments: WikipediaAnalyzer.java.patch, WikipediaAnalyzer.java_diff
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> In Mahout 0.5, the class org.apache.mahout.analysis.WikipediaAnalyzer
> fails to be constructed.
> The statement around line 38 of WikipediaAnalyzer.java:
>   stopSet = (CharArraySet) StopFilter.makeStopSet(Version.LUCENE_31,
>       StopAnalyzer.ENGLISH_STOP_WORDS_SET.toArray(
>           new String[StopAnalyzer.ENGLISH_STOP_WORDS_SET.size()]));
> raises an ArrayStoreException, thrown by the call
> StopAnalyzer.ENGLISH_STOP_WORDS_SET.toArray(String[]).
> The cause is that in Lucene 3.1, when the set's match version is
> LUCENE_31 or later, CharArraySet.iterator() returns an iterator over
> char[] entries instead of over String entries, and a char[] cannot be
> stored in a String[].
> See the code from CharArraySet.java:
>   @Override @SuppressWarnings("unchecked")
>   public Iterator<Object> iterator() {
>     // use the AbstractSet#keySet()'s iterator (to not produce endless recursion)
>     return map.matchVersion.onOrAfter(Version.LUCENE_31) ?
>         map.originalKeySet().iterator() : (Iterator) stringIterator();
>   }
> So in WikipediaAnalyzer() we may need to convert each char[] entry to a
> String to make it work.
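
A minimal sketch of the char[]-to-String conversion suggested above. It is not the attached patch: the class StopWordStrings and the method toStrings are hypothetical names, and a plain java.util set of char[] entries stands in for the Lucene stop set so the example runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Mahout or Lucene.
final class StopWordStrings {

  // Converts every entry to a String. char[] entries are wrapped with
  // new String(char[]); any other entry goes through String.valueOf(Object).
  static List<String> toStrings(Iterable<?> entries) {
    List<String> words = new ArrayList<>();
    for (Object entry : entries) {
      if (entry instanceof char[]) {
        words.add(new String((char[]) entry));
      } else {
        words.add(String.valueOf(entry));
      }
    }
    return words;
  }

  public static void main(String[] args) {
    // Stand-in (assumption) for a set whose iterator yields char[] entries.
    Set<Object> raw = new LinkedHashSet<>();
    raw.add("the".toCharArray());
    raw.add("and".toCharArray());

    try {
      // Same shape as the failing call: char[] elements copied into a String[].
      raw.toArray(new String[raw.size()]);
    } catch (ArrayStoreException e) {
      System.out.println("toArray(String[]) threw ArrayStoreException");
    }

    // Converting element by element avoids the exception.
    String[] words = toStrings(raw).toArray(new String[0]);
    System.out.println(Arrays.toString(words));
  }
}
```

In WikipediaAnalyzer the resulting String[] could then be passed to StopFilter.makeStopSet in place of the direct toArray(String[]) call, assuming the rest of the constructor stays as it is.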