Hello,

> Feature-downgrading doesn't help me. I have a clear need how nodes must
> be indexed. Since there seems to be no easy way to do this in
> Jackrabbit, so I fall back to my own index.
> And that's ok for me. I just asked (although I looked at the code) to
> not miss something.
>
> Thanks for all your patience and help!

Sorry for dropping in so late, I have been too occupied lately to even follow the list. Hope to have more time in the near future. Anyway, my 2 cents about this:

First of all, I think we have already done exactly what Bernd wants (without hooking in your own index and keeping it in sync: I wouldn't go there).

Also, I would favor support for more indexing tuning within Jackrabbit. I think everybody accustomed to Lucene, or Solr, is used to defining *how* fields should be indexed. Now, we have quite some support for this in Jackrabbit already, see [1], but some tuning is missing (for example, I would like to be able to indicate that Lucene should not index some property (summary) as a single term, as I never want to sort on it, or use equals for it....)

As you can find at the bottom of [1], you can configure a *per* property analyzer. So, if I know I have some comma-separated keywords property, I could configure the property to be indexed with my CommaSeperatedKeyWordAnalyzer. So basically, support for it is there.

Now, what is not currently easy to achieve is, for example, indexing one and the same property in multiple ways. If, for example, I am indexing a title property, I might also want to index a short_title (which is not a JCR property, just an index-only field). Then, if I have 1,000,000 text documents having a title, I could still do a sort on short_title, whereas sorting on the normal title field will result in an instant OOM (I am actually facing this, and will use some kind of 'short_title' strategy... where similar stuff applies for date range queries wrt different granularities).

Anyway, back to Bernd's thing: we provided faceted navigation exposed over JCR as virtual structures. Similarly, I am planning to do the same for taxonomy navigation, tagging navigation, Lucene term space navigation (exposing auto-completion options), similar nodes navigation, a broken link checker, etc.

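A minimal sketch of the short_title idea above, in plain Java and without any Jackrabbit or Lucene calls. The class name, the method name and the 20-character cut-off are assumptions made up for illustration; how the derived value actually ends up in the index depends on your own indexing code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Jackrabbit or Lucene.
final class ShortTitleSketch {

    // Assumed cut-off: long enough for a useful sort order, short enough to keep sort keys small.
    static final int SHORT_TITLE_LENGTH = 20;

    /** Derives two index fields from one title value: the full text and a short sort key. */
    static Map<String, String> deriveFields(String title) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", title);

        // Normalize so that sorting ignores case and surrounding whitespace.
        String normalized = title.trim().toLowerCase(Locale.ROOT);
        String shortTitle = normalized.length() <= SHORT_TITLE_LENGTH
                ? normalized
                : normalized.substring(0, SHORT_TITLE_LENGTH);
        fields.put("short_title", shortTitle);
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields =
                deriveFields("Controlling How Properties Are Indexed in Jackrabbit");
        System.out.println(fields.get("title"));
        System.out.println(fields.get("short_title"));
    }
}
```

The point of the design is only that the sort key is much smaller than the full title, so sorting a large result set needs far less memory. Titles that share the same first 20 characters will sort in arbitrary order relative to each other.
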
But some of them, faceted navigation for example, needed, I think, the very same thing Bernd wants: controlling how properties are being indexed.

As Marcel points out, you only need to extend the SearchIndex and override createDocument. You can then also use your own NodeIndexer impl which indexes all the properties (and indexes a single property in multiple ways) the way you want. The only drawback, which makes it a little more difficult to write, is that for historical reasons (as it was not possible with the Lucene version used at the time the Jackrabbit indexing was written) all properties end up in a single Lucene property field. Hence, you need to do some field value prefixing tricks (not hard, just a little annoying and confusing from time to time :-))

In your repository.xml, in the <SearchIndex> element, change for example:

    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">

to

    <SearchIndex class=",,,..query.lucene.MySearchIndex">

Anyway, you can take a look at [2] and [3] for examples, though they are already somewhat more sophisticated, as we added custom indexing configurations as well for optimizing indexing (for example a date granularity config, which I actually still need to add).

Hope this helps

Regards Ard

[1] http://wiki.apache.org/jackrabbit/IndexingConfiguration
[2] http://svn.onehippo.org/repos/hippo/hippo-ecm/trunk/repository/engine/src/main/java/org/hippoecm/repository/query/lucene/ServicingSearchIndex.java
[3] http://svn.onehippo.org/repos/hippo/hippo-ecm/trunk/repository/engine/src/main/java/org/hippoecm/repository/query/lucene/ServicingNodeIndexer.java

>
> Bernd
>
>>
>> regards
>> marcel
>>
>>> The downside of not being able to do this (controlling Lucene doc
>>> creation) is having another, self-managed index, and (re-)indexing must
>>> be done by hand, using JCR listeners or some other approach.
>>>
>>> Bernd
>>>
>>
>
>
