[ https://issues.apache.org/jira/browse/JCR-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525031 ]
Ard Schrijvers commented on JCR-1079:
-------------------------------------

> - There's a System.out in AnalyzerImpl.tokenStream
> - Can you please move AnalyzerImpl to a top level class and maybe rename it
>   to JackrabbitAnalyzer
> - Change the type of SearchIndex.analyzer to match the concrete class, this
>   makes the cast redundant.

Aaarrggh, how stupid of me to have a System.out, again! I am sorry, I'll check, check, and double-check in the future :-)

I am not in the office today, so I will create a patch with your suggestions tomorrow and edit the wiki.

Thanks for reviewing.

> Extend the IndexingConfiguration to allow configuration of reuseable analyzers
> ------------------------------------------------------------------------------
>
>                 Key: JCR-1079
>                 URL: https://issues.apache.org/jira/browse/JCR-1079
>             Project: Jackrabbit
>          Issue Type: New Feature
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1079.patch
>
>
> It should be possible to configure an XML block of analyzers in
> indexing_configuration.xml. In each <index-rule>, an analyzer can be assigned
> to a property, which means that the property will be analyzed with that
> specific analyzer. First and foremost, this enables multilingual indexing.
> Documentation needs to be added explaining the difference between searching
> in the node scope [jcr:contains(.,'foo')] and in a specific property
> [jcr:contains(@myprop,'foo')]. The node scope will always be searched and
> indexed with the default analyzer, which can be configured in the
> workspace.xml in the <SearchIndex> element.
> A possible indexing_configuration.xml snippet is shown below. Also note the
> possible enhancement (I am not sure whether this implementation will have it,
> because it requires a lot of filter factories and is probably out of scope).
> Adding custom filters which do not need a factory might be easier.
> <analyzers>
>   <analyzer name="fr" class="org.apache.lucene.analysis.fr.FrenchAnalyzer"/>
>   <analyzer name="de" class="org.apache.lucene.analysis.de.GermanAnalyzer"/>
>   <analyzer name="compound" class="org.apache.lucene.analysis.SimpleAnalyzer">
>     <filter class="jr.StopFilterFactory" words="stopwords.txt"/>
>     <filter class="jr.EdgeNGramTokenizerFactory" side="front"
>             minGram="1" maxGram="2"/>
>   </analyzer>
> </analyzers>
> <index-rule nodeType="nt:unstructured">
>   <property analyzer="fr">bode_fr</property>
>   <property analyzer="de">bode_de</property>
> </index-rule>
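
To illustrate the lookup rule described in the issue (a property with an assigned analyzer is analyzed with it, while the node scope always uses the default analyzer), here is a minimal, self-contained Java sketch. It is not the code from the attached patch: the class PerPropertyAnalyzerSketch and its methods are hypothetical names, and plain functions stand in for Lucene analyzers so that it runs without any dependencies.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical sketch: picks an analyzer per property, falling back to a default. */
class PerPropertyAnalyzerSketch {

    // Stand-in for a Lucene Analyzer: turns text into a list of tokens.
    private final Function<String, List<String>> defaultAnalyzer;

    // Property name -> analyzer, as an <index-rule> block would assign them.
    private final Map<String, Function<String, List<String>>> byProperty = new HashMap<>();

    PerPropertyAnalyzerSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    void assign(String propertyName, Function<String, List<String>> analyzer) {
        byProperty.put(propertyName, analyzer);
    }

    /** Property scope: use the assigned analyzer if there is one, else the default. */
    List<String> analyzeProperty(String propertyName, String text) {
        return byProperty.getOrDefault(propertyName, defaultAnalyzer).apply(text);
    }

    /** Node scope: always the default analyzer, whatever is assigned per property. */
    List<String> analyzeNodeScope(String text) {
        return defaultAnalyzer.apply(text);
    }

    public static void main(String[] args) {
        // Toy default analyzer: lower-case and split on whitespace.
        PerPropertyAnalyzerSketch sketch = new PerPropertyAnalyzerSketch(
                text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));

        // Toy "fr" analyzer: same as the default, but drops a few French articles.
        Set<String> frenchStopWords = new HashSet<>(Arrays.asList("le", "la", "les"));
        sketch.assign("bode_fr", text ->
                Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\s+"))
                        .filter(token -> !frenchStopWords.contains(token))
                        .collect(Collectors.toList()));

        System.out.println(sketch.analyzeProperty("bode_fr", "Le Chat")); // [chat]
        System.out.println(sketch.analyzeNodeScope("Le Chat"));           // [le, chat]
    }
}
```

The two printed results differ for the same text, which is why the documentation should spell out the difference between jcr:contains(.,'foo') and jcr:contains(@myprop,'foo').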