On 25/02/2013 11:24, Thomas Matthijs wrote:
On Mon, Feb 25, 2013 at 12:19 PM, Thomas Matthijs <li...@selckin.be> wrote:

    On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs
    <li...@selckin.be> wrote:


        On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor
        <paul_t...@fastmail.fm> wrote:

            On 20/02/2013 11:28, Paul Taylor wrote:

                I'm just updating my codebase from Lucene 3.6 to Lucene 4.1,
                and it seems my tests that use NormalizeCharMap for
                replacing characters in the analyzers are not working.

            Bump. Anybody? I thought a self-contained test case would be
            enough to pique somebody's interest. Am I doing something
            silly? Maybe, but I can't see it.



        Tried to run your test, but it uses MusicbrainzTokenizer



    Well, I made it work. Whether it's a bug that this is required, or
    whether it is documented anywhere, I don't know; it does seem very trappy:



It is documented all the way at the bottom: http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/package-summary.html

So it should be:

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.charfilter.MappingCharFilter;
    import org.apache.lucene.analysis.charfilter.NormalizeCharMap;
    import org.apache.lucene.analysis.core.LowerCaseFilter;
    import org.apache.lucene.analysis.core.WhitespaceTokenizer;
    import org.apache.lucene.util.Version;

    class SimpleAnalyzer extends Analyzer {

        protected NormalizeCharMap charConvertMap;

        public SimpleAnalyzer() {
            NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
            builder.add("&", "and");
            charConvertMap = builder.build();
        }

        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_40, reader);
            TokenStream filter = new LowerCaseFilter(Version.LUCENE_40, source);
            return new TokenStreamComponents(source, filter);
        }

        @Override
        protected Reader initReader(String fieldName, Reader reader) {
            return new MappingCharFilter(charConvertMap, reader);
        }
    }
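
For anyone who wants to see the intended effect without a Lucene dependency, here is a minimal plain-JDK sketch of the same ordering: map characters on the Reader first, then split on whitespace, then lowercase. It is only an analogy, not Lucene's implementation; CharMapSketch, mapChars, tokenize and analyze are names made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CharMapSketch {

    // Reads a Reader fully into a String (helper for this sketch only).
    static String drain(Reader in) {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical stand-in for initReader(): wraps the raw input so that
    // "&" becomes "and" before any tokenizing happens.
    static Reader mapChars(Reader in) {
        return new StringReader(drain(in).replace("&", "and"));
    }

    // Hypothetical stand-in for createComponents(): split on whitespace,
    // then lowercase each token.
    static List<String> tokenize(Reader in) {
        List<String> tokens = new ArrayList<>();
        for (String t : drain(in).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // The tokenizer only ever sees the already-mapped Reader.
    static List<String> analyze(String text) {
        return tokenize(mapChars(new StringReader(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Rock & Roll")); // prints [rock, and, roll]
    }
}
```

In the Lucene version above, the same "&" to "and" replacement happens inside initReader, so WhitespaceTokenizer receives text that has already been mapped.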

Thanks Thomas. For some reason I didn't see your post until now, and I worked it out independently.
