[ 
https://issues.apache.org/jira/browse/LUCENE-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147192#comment-13147192
 ] 

Robert Muir commented on LUCENE-2564:
-------------------------------------

Patch looks good... I was just referring to Solr's resource loading of
stopwords and stuff.

But we don't have to do that here; IMO we should fix the issues here first.

Maybe the javadocs on getReader should explain that, unlike the Java
default, it creates a reader that will throw an exception if it detects
that the charset is wrong (so this is good for reading configuration
files, as WordListLoader does, but not recommended for, say, documents
crawled from the web or something).
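
For illustration only, here is a rough sketch of the difference using plain
JDK classes. It is not the getReader from the patch: the class name
StrictReaderSketch and the helpers strictReader and readAll are made up for
this example. The assumption is that "throw an exception if the charset is
wrong" corresponds to a CharsetDecoder configured with
CodingErrorAction.REPORT, whereas InputStreamReader built from a Charset
silently substitutes U+FFFD for bad bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictReaderSketch {

    // Hypothetical helper (not the patch's getReader): returns a reader that
    // reports malformed or unmappable input instead of replacing it.
    static Reader strictReader(InputStream in, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        return new InputStreamReader(in, decoder);
    }

    // Drains a reader into a String and closes it.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Latin-1 bytes for a word with a non-ASCII letter: the byte 0xEF
        // followed by 'v' is not a valid UTF-8 sequence.
        byte[] latin1 = "na\u00efve".getBytes(StandardCharsets.ISO_8859_1);

        // Java default: the bad byte is replaced, no error is raised.
        String lenient = readAll(new InputStreamReader(
            new ByteArrayInputStream(latin1), StandardCharsets.UTF_8));
        System.out.println("replacement char present: "
            + (lenient.indexOf('\uFFFD') >= 0));

        // Strict reader: the same bytes cause a decoding exception.
        try {
            readAll(strictReader(
                new ByteArrayInputStream(latin1), StandardCharsets.UTF_8));
            System.out.println("decoded without error");
        } catch (CharacterCodingException e) {
            System.out.println("rejected: " + e.getClass().getSimpleName());
        }
    }
}
```

Failing fast suits a stopword file shipped with the configuration, where a
wrong encoding is a setup mistake worth surfacing. For crawled documents
the lenient behaviour is usually the better trade-off, since one bad byte
should not discard the whole document.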

                
> wordlistloader is inefficient
> -----------------------------
>
>                 Key: LUCENE-2564
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2564
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.5, 4.0
>
>         Attachments: LUCENE-2564.patch
>
>
> WordListLoader is basically used for loading stopword lists, stem
> dictionaries, etc.
> Unfortunately the API returns Set<String>, and sometimes even
> HashSet<String> or HashMap<String,String>.
> I think we should break it and return CharArraySets and CharArrayMaps (but
> leave the declared return types as the generic Set and Map).
> If someone objects to breaking it in 3.1, then we can do this only in 4.0,
> but I think it would be good to fix it in both places.
> The reason is that if someone does new FooAnalyzer() a lot (probably not
> uncommon), I think it's doing a bunch of useless copying.
> I think we should slap @lucene.internal on this API too, since that's
> mostly how it's being used.


        
