On 1-9-2011 23:22, Jeroen Reijn wrote:
> On Wed, Aug 31, 2011 at 3:16 PM, Dennis van der Laan
> <[email protected]> wrote:
>> Ian, others,
>>
>> As with many 'bugs' that have a workaround, this bug has been lying
>> around for about a year now. We still have the problem that the
>> cluster-nodes have different lucene indexes. At first, we thought this
>> happened over time. Recently we made a copy of our production database
>> and used it with 4 new cluster nodes (we cleared the journal table and
>> the local revisions table, first). We started them all, completely
>> clean, at which point all nodes started to build the lucene index.
>> Without making any changes to the contents, we see different results for
>> jackrabbit search queries on these 4 cluster nodes. So it seems the
>> lucene indexes might differ more over time, but could differ right from
>> the start.
>>
>> Does anybody have a clue how this could happen? Are we missing something?
> I'm wondering what you mean with the statement: "different results for
> jackrabbit search queries".

When doing a fulltext search (an XPath query with a 'contains' clause),
a document containing the queried text may show up in the results on
some cluster nodes but not on others. When we update such a document so
that it gets indexed again on all cluster nodes (hopefully), it may show
up on all cluster nodes again. I do not have numbers on how many
documents are not indexed on all cluster nodes, but it has happened too
often to call it 'an incident'.

> Could you perhaps show some of those queries? This could also be
> related to your indexing configuration.

I don't quite understand what you mean by 'related to your indexing
configuration'. We roll out our cluster nodes from a single templating
server, so the configuration for all cluster nodes is exactly the same,
except for the cluster id.

An example of a query which might not return the same results on all
cluster nodes:

/jcr:root/cms/documents//element(*, nt:file)/jcr:content[(jcr:contains(cms:searchData/@cms:title, 'academy assistent') or jcr:contains(@jcr:data, 'academy assistent')) and (@cms:type = 'article')]/(@jcr:lastModified|rep:excerpt()|@cms:type) order by @cms:sortfield ascending

> I assume you do not have an index when starting one of the cluster nodes?

Not when we start a fresh cluster node for the first time, no.

> BTW which version of Jackrabbit are you experiencing this with?

We are currently still using Jackrabbit 1.6.1.

Thanks for taking a look at our problem!

Best regards,
Dennis

>> TIA
>> Dennis
>>
>> On 29-9-2010 12:37, Ian Boston wrote:
>>> On 29 Sep 2010, at 11:33, Dennis van der Laan wrote:
>>>
>>>> From your reply I
>>>> understand that this should not be the case with Lucene, is it?
>>> Every JournalRecord should have been replayed on every machine (at some
>>> time later if the JVM was down). That *should* ensure that all documents
>>> are indexed on all machines.
>>> Sounds like this is not happening in your environment.
>>>
>>> Ian
>>>
>>
>> --
>> Dennis van der Laan, MSc
>> Centre for Information Technology
>> University of Groningen

-- 
Dennis van der Laan, MSc
Centre for Information Technology
University of Groningen
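
P.S. One possible way to put numbers on the differences would be to run the query above on every cluster node, collect the result paths per node, and diff them. Below is a minimal sketch in plain Java, assuming the paths have already been collected by some other means. The class IndexDiff, the method missingPerNode, and the node ids and paths in main are hypothetical names for illustration; none of them are part of Jackrabbit.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Jackrabbit: compares the result paths
// that each cluster node returned for the same query.
final class IndexDiff {

    // For every node, returns the paths that at least one node returned
    // but that this particular node did not return.
    static Map<String, Set<String>> missingPerNode(Map<String, Set<String>> pathsByNode) {
        // Union of all paths seen on any node.
        Set<String> union = new TreeSet<>();
        for (Set<String> paths : pathsByNode.values()) {
            union.addAll(paths);
        }

        // Per node: union minus the node's own results.
        Map<String, Set<String>> missing = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : pathsByNode.entrySet()) {
            Set<String> gap = new TreeSet<>(union);
            gap.removeAll(entry.getValue());
            missing.put(entry.getKey(), gap);
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up node ids and paths, for illustration only.
        Map<String, Set<String>> pathsByNode = new LinkedHashMap<>();
        pathsByNode.put("node1", new TreeSet<>(Arrays.asList("/cms/documents/a", "/cms/documents/b")));
        pathsByNode.put("node2", new TreeSet<>(Arrays.asList("/cms/documents/a")));

        for (Map.Entry<String, Set<String>> entry : missingPerNode(pathsByNode).entrySet()) {
            System.out.println(entry.getKey() + " is missing " + entry.getValue());
        }
    }
}
```

A node whose set comes back empty returned everything that any other node returned. A non-empty set lists the documents that would need to be re-indexed on that node.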
