[
https://issues.apache.org/jira/browse/MAHOUT-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057653#comment-13057653
]
Sean Owen commented on MAHOUT-748:
----------------------------------
The patch is fine, but it does not appear to be against HEAD. Patches need to
be changes that can be applied to code in Subversion as it exists now. In
particular this logic has been refactored into a superclass, where the logic
has changed (maybe it is fixed in a different way).
Can you look at HEAD and see if there needs to be a fix?
> WikipediaAnalyzer in 0.5 fails because Lucene 3.1's
> CharArraySet.iterator() returns a "char[]" iterator instead of a "String"
> iterator
> -------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-748
> URL: https://issues.apache.org/jira/browse/MAHOUT-748
> Project: Mahout
> Issue Type: Bug
> Components: Examples
> Affects Versions: 0.5
> Reporter: steven zhuang
> Priority: Minor
> Fix For: 0.6
>
> Attachments: WikipediaAnalyzer.java.patch, WikipediaAnalyzer.java_diff
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> In Mahout 0.5, the class org.apache.mahout.analysis.WikipediaAnalyzer fails
> to construct.
> The failing statement is around line 38 of WikipediaAnalyzer.java:
>   stopSet = (CharArraySet) StopFilter.makeStopSet(Version.LUCENE_31,
>       StopAnalyzer.ENGLISH_STOP_WORDS_SET.toArray(
>           new String[StopAnalyzer.ENGLISH_STOP_WORDS_SET.size()]));
> This raises an ArrayStoreException, thrown by
> StopAnalyzer.ENGLISH_STOP_WORDS_SET.toArray(String[]).
> The cause is that in Lucene 3.1, when the match version is 3.1 or later,
> CharArraySet.iterator() returns an iterator over char[] elements instead of
> an iterator over Strings.
> See this code from CharArraySet.java:
>   @Override @SuppressWarnings("unchecked")
>   public Iterator<Object> iterator() {
>     // use the AbstractSet#keySet()'s iterator (to not produce endless recursion)
>     return map.matchVersion.onOrAfter(Version.LUCENE_31) ?
>       map.originalKeySet().iterator() : (Iterator) stringIterator();
>   }
> So in the WikipediaAnalyzer constructor we may need to convert each char[]
> to a String to make it work.
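
The failure and the conversion suggested in the description can be reproduced without Lucene. The following is only a sketch under stated assumptions: a plain java.util.List stands in for a set whose iterator yields char[] keys, and the class name StopWordConversion and the helper toStringSet are hypothetical, not part of Mahout or Lucene. It is not the fix that was applied in HEAD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StopWordConversion {

    // Hypothetical helper: copies every entry of a stop-word collection into
    // a String, accepting char[] keys as well as String/CharSequence entries.
    static Set<String> toStringSet(Iterable<?> stopWords) {
        Set<String> result = new LinkedHashSet<>();
        for (Object entry : stopWords) {
            if (entry instanceof char[]) {
                result.add(new String((char[]) entry));
            } else {
                result.add(String.valueOf(entry));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in (assumption) for a collection that hands out char[] keys,
        // as the description reports for CharArraySet under LUCENE_31.
        List<Object> charArrayKeys = new ArrayList<>();
        charArrayKeys.add("the".toCharArray());
        charArrayKeys.add("and".toCharArray());

        try {
            // A char[] element cannot be stored in a String[].
            charArrayKeys.toArray(new String[charArrayKeys.size()]);
        } catch (ArrayStoreException e) {
            System.out.println("toArray(String[]) failed with ArrayStoreException");
        }

        // Converting element by element avoids the exception.
        String[] words = toStringSet(charArrayKeys).toArray(new String[0]);
        System.out.println(Arrays.toString(words));
    }
}
```

In the analyzer, the resulting String[] could then be passed to StopFilter.makeStopSet in place of the direct toArray call, assuming the refactored code in HEAD still builds its stop set that way.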