[ 
https://issues.apache.org/jira/browse/LUCENE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2247:
----------------------------------

    Attachment: LUCENE-2247.patch

Here is the patch.

To apply, first do:
{code}
svn copy src/java/org/apache/lucene/analysis/CharArraySet.java src/java/org/apache/lucene/analysis/CharArrayMap.java
{code}

Have fun!

> Add CharArrayMap to Lucene and make CharArraySet a proxy on the keySet() of it
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-2247
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2247
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: LUCENE-2247.patch
>
>
> This patch adds a CharArrayMap<V> to Lucene's analysis package as a companion
> to CharArraySet. It supports fast retrieval by char[] keys, as CharArraySet
> does. This is important for some stemmers and other places in Lucene.
> Stemmers generally use CharArrayMap<String>, whose get(char[]) then returns
> a String. Strings are compact and can easily be copied into the
> termBuffer. A Map<String,String> would be slow, because the termBuffer would
> first have to be converted to a String and then looked up. Returning a
> String is perfectly fine, as it can easily be copied into the termBuffer.
> This class borrows lots of code from Solr's counterpart, but has additional
> features and an API that is more consistent with CharArraySet. The key type
> is always <?>, because, as with CharArraySet, anything that has a toString()
> representation can be used as a key (with some overhead, of course). It also
> defines an unmodifiable map and correct iterators (returning the native
> char[]).
> CharArraySet was made consistent and, for matchVersion>=3.1, now also
> returns an iterator over char[]. Almost all of CharArraySet's code was moved
> to CharArrayMap and removed from the Set. CharArraySet is now a simple proxy
> on the keySet().
> In the future we could think about making
> CharArraySet/CharArrayMap/CharArrayCollection interfaces, so the whole API
> would be more consistent with the Java collections API. This would be a
> backwards break, but it would make it possible to use better implementations
> than hashing (such as prefix trees).
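
To illustrate the idea described above, here is a minimal, self-contained sketch. It is not the code from the patch: the class names SimpleCharArrayMap, SimpleCharArraySet and CharArrayMapSketch, the stem() helper and the open-addressing layout are all assumptions made for this example. It only shows why a lookup on a slice of the term buffer avoids creating a String, and how a set can be a thin proxy on the map's keys.

```java
/** Hypothetical, simplified char[]-keyed hash map; NOT Lucene's CharArrayMap. */
final class SimpleCharArrayMap<V> {
    private char[][] keys = new char[16][];   // table size is always a power of two
    private Object[] values = new Object[16];
    private int count;

    /** Hashes a slice of a buffer directly, so no String has to be created. */
    private static int hash(char[] text, int off, int len) {
        int h = 0;
        for (int i = off; i < off + len; i++) h = 31 * h + text[i];
        return h;
    }

    private static boolean matches(char[] key, char[] text, int off, int len) {
        if (key.length != len) return false;
        for (int i = 0; i < len; i++) {
            if (key[i] != text[off + i]) return false;
        }
        return true;
    }

    /** Linear probing: returns the slot holding the key, or the first empty slot. */
    private int slot(char[] text, int off, int len) {
        int mask = keys.length - 1;
        int pos = hash(text, off, len) & mask;
        while (keys[pos] != null && !matches(keys[pos], text, off, len)) {
            pos = (pos + 1) & mask;
        }
        return pos;
    }

    /** Stores a private copy of the key's characters; returns the previous value. */
    V put(CharSequence key, V value) {
        char[] text = key.toString().toCharArray();
        int pos = slot(text, 0, text.length);
        @SuppressWarnings("unchecked")
        V old = (V) values[pos];
        if (keys[pos] == null) {
            keys[pos] = text;
            count++;
        }
        values[pos] = value;
        if (count * 2 > keys.length) rehash();  // keep the table at most half full
        return old;
    }

    private void rehash() {
        char[][] oldKeys = keys;
        Object[] oldValues = values;
        keys = new char[oldKeys.length * 2][];
        values = new Object[oldValues.length * 2];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) {
                int pos = slot(oldKeys[i], 0, oldKeys[i].length);
                keys[pos] = oldKeys[i];
                values[pos] = oldValues[i];
            }
        }
    }

    /** Looks up a slice of a buffer; returns null if the key is absent. */
    @SuppressWarnings("unchecked")
    V get(char[] text, int off, int len) {
        return (V) values[slot(text, off, len)];
    }

    boolean containsKey(char[] text, int off, int len) {
        return keys[slot(text, off, len)] != null;
    }

    int size() { return count; }
}

/** The set keeps no table of its own; it only delegates to the map's keys. */
final class SimpleCharArraySet {
    private static final Object PLACEHOLDER = new Object();
    private final SimpleCharArrayMap<Object> map = new SimpleCharArrayMap<>();

    boolean add(CharSequence text) { return map.put(text, PLACEHOLDER) == null; }
    boolean contains(char[] text, int off, int len) { return map.containsKey(text, off, len); }
    int size() { return map.size(); }
}

final class CharArrayMapSketch {
    /** Overwrites the term in the buffer with its mapped form; returns the new length.
     *  Assumes the buffer is large enough to hold the replacement. */
    static int stem(SimpleCharArrayMap<String> dict, char[] termBuffer, int length) {
        String replacement = dict.get(termBuffer, 0, length);
        if (replacement == null) return length;
        replacement.getChars(0, replacement.length(), termBuffer, 0);
        return replacement.length();
    }

    public static void main(String[] args) {
        SimpleCharArrayMap<String> dict = new SimpleCharArrayMap<>();
        dict.put("running", "run");
        char[] termBuffer = new char[16];
        "running".getChars(0, 7, termBuffer, 0);
        int length = stem(dict, termBuffer, 7);
        System.out.println(new String(termBuffer, 0, length)); // prints "run"
    }
}
```

In this sketch the lookup in stem() works on the buffer itself, and the String value is copied back with String.getChars, which matches the usage pattern the description gives for stemmers. The real class in the patch has a richer API (unmodifiable views, iterators over char[], keys of any type with a toString()), none of which is modelled here.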

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


