I'm still very new to Lucene.NET, so I don't have any suggestions. I would be very interested in hearing how you end up solving this, if you do find a solution.
For my faceting needs, I used code derived from the post you cited. I haven't had any performance problems, but I have only a single index with approximately 5 GB of data.

On Wed, Jun 10, 2009 at 10:27 PM, Grahem Cuthbertson <[email protected]> wrote:
>
> Hi Again,
>
> I have a situation where I want to facet over a MultiSearcher. I ran into
> a great post (
> http://mail-archives.apache.org/mod_mbox/incubator-lucene-net-user/200712.mbox/%[email protected]%3e)
> which led me to learn a lot more about the inner workings of Lucene, but
> admittedly not enough. So here is my test setup:
>
> Test machine:
>   2 x dual core, 3.4 GHz
>   3 GB mem
>
> Index:
>   12 indexes
>   11 GB combined (please don't ask me why)
>   ~3 million docs
>   for argument's sake, half of these indexes are optimized
>
> My facet is constrained to 90K hits, meaning I run a search for some
> keyword (resulting in 90K of 3 million docs) and am looking for facet
> counts on 9 fields against those results.
>
> Some performance numbers:
>
> 1st search (or warming) = ~1.7 seconds
> Mem consumption ~100 MB
>
> The problem I am facing is A) warming with facet performance and B) memory
> consumption.
>
> Post warming search with 9 facet fields:
>   Mem consumption = ~1.3 GB
>   ~48 seconds (~90K hits)
>
> A second search with facets takes about 15 ms for the search and 200 ms
> for the facets, well within everything I ever wanted out of Lucene (and
> then some, even though mem is stuck at ~1.3 GB).
>
> In a production scenario, however, even if I could get away with the
> warming times and mem consumption, this scenario is what we call a "group"
> and we have hundreds of them (even if only a fraction are running in
> parallel). So I have to get these initial numbers way down, but I am not
> sure how to do it....
>
> In a very crude first-pass attempt to morph Jokin's awesome contribution,
> in its simpler form, I came up with meh:
>
> public static IEnumerable<KeyValuePair<string,int>> Facet(Query query,
>     MultiSearcher s, string Field, int max)
> {
>     Dictionary<string,int> result = new Dictionary<string,int>();
>     for (int q = 0; q < s.GetSearchables().Length; q++)
>     {
>         //TODO: don't assume an IndexSearcher is the basis for searchables
>         StringIndex stringIndex =
>             FieldCache_Fields.DEFAULT.GetStringIndex(((IndexSearcher)s.GetSearchables()[q]).Reader, Field);
>         int[] c = new int[stringIndex.lookup.Length];
>         FacetCollector results = new FacetCollector(c, stringIndex);
>         ((IndexSearcher)s.GetSearchables()[q]).Search(query, results);
>         ....
>         ....
>     }
>     return result;
> }
>
> The "...." refers to basically merging results together and getting top
> hits, but it doesn't affect the numbers given above.
>
> Can anyone shed some light on methods to do faceting across multiple
> indexes? I knew when writing this code this afternoon I was going to need
> to && some bit sets, but I am also still just getting familiar with the
> inner workings of Lucene, and if anyone could point me in the right
> direction I would be grateful!
>
> Thanks!
>
> Grahem
