Ok, some improvement; "Faceting" as an end-user interface feature (or maybe "Filtering"?):

A. Faceting (for further filtering)
   1. Count the "facets"
   2. Sort by count in descending order
   3. Present the top N to the user for possible filtering/narrowing of search results

B. "Simplified Lucene" (with default operator "AND")
   1. For each term, find its DocSet
   2. Calculate the DocSet intersections

If only we could avoid calculating "counts" for facets, and sorting by counts... just a list of related filters to narrow the search results...

P.S.
Faceting on a "country" field with 10 possible values still takes 20-30 seconds for the query id:[* TO *] (100 million docs), although obviously it could use the FilterCache without any calculations!

Fuad Efendi
==================================
http://www.linkedin.com/in/liferay
http://www.tokenizer.org
http://www.casaGURU.com
==================================

-----Original Message-----
From: Fuad Efendi [mailto:f...@efendi.ca]
Sent: August-21-09 11:42 AM
To: solr-user@lucene.apache.org; yo...@lucidimagination.com
Subject: RE: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search Server

>actually a hybrid that goes back to DocSet intersections when it's more efficient

I noticed that too when I played with it; for large query results, DocSet intersections are the de facto standard. But when "faceting" started, CNET had only 400,000 documents :)

Nowadays even a 2-3 second response time is bad... Maybe store all users' queries and execute some tasks in the background (storing "facets" in a database similar to a heavy warehouse, predicting facet counts depending on query terms and domain analysis, etc.)?

On Fri, Aug 21, 2009 at 11:25 AM, Fuad Efendi<f...@efendi.ca> wrote:
> I was joking [off-topic]; "faceting" as a DocSet intersections' replaced by
> trivial term count calcs which is extremely faster in some (if not all) use
> cases, including possibly even NON-tokenized (with standard faceting we can
> use FilterCache)...

One size does not fit all. The enum method is not outdated or deprecated, and still works better in some scenarios.
The new faceting code is actually a hybrid that goes back to DocSet intersections when it's more efficient.

-Yonik
http://www.lucidimagination.com
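
For readers following the thread, here is a rough sketch of the ideas discussed above: an AND query as DocSet intersections, faceting by one intersection per facet value (roughly the "enum" idea), faceting by counting terms over the result docs, and a count-free variant that only lists the values that would narrow the result. It is a toy model in plain Python, not Solr or Lucene code. The corpus, the function names and the dictionaries standing in for the index and the FilterCache are all made up for illustration.

```python
from collections import Counter

# Toy corpus: doc id -> fields. All data here is invented for illustration.
DOCS = {
    0: {"country": "CA", "text": "solr faceting"},
    1: {"country": "US", "text": "solr search"},
    2: {"country": "CA", "text": "lucene search"},
    3: {"country": "DE", "text": "solr lucene faceting"},
}


def build_index(docs, field):
    """Map each term of `field` to the set of doc ids containing it (a toy DocSet)."""
    index = {}
    for doc_id, fields in docs.items():
        for term in fields[field].split():
            index.setdefault(term, set()).add(doc_id)
    return index


TEXT_INDEX = build_index(DOCS, "text")
# Hypothetical stand-in for cached per-value filters on the "country" field.
COUNTRY_FILTERS = build_index(DOCS, "country")


def search_and(terms):
    """'Simplified Lucene' with default AND: intersect the DocSet of every term.

    An empty term list matches every document (similar in spirit to id:[* TO *]).
    """
    result = set(DOCS)
    for term in terms:
        result &= TEXT_INDEX.get(term, set())
    return result


def facet_by_intersection(result, filters, top_n):
    """One intersection per facet value, then sort by count descending."""
    counts = {value: len(result & docset) for value, docset in filters.items()}
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(value, count) for value, count in ranked if count > 0][:top_n]


def facet_by_term_count(result, docs, field, top_n):
    """Walk the result docs and tally the field value of each one."""
    counts = Counter(docs[doc_id][field] for doc_id in result)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def facet_values_only(result, filters):
    """List the values that overlap the result, with no counts and no count sort."""
    return sorted(v for v, docset in filters.items() if not result.isdisjoint(docset))


if __name__ == "__main__":
    hits = search_and(["solr"])
    print(facet_by_intersection(hits, COUNTRY_FILTERS, top_n=2))
    print(facet_by_term_count(hits, DOCS, "country", top_n=2))
    print(facet_values_only(search_and(["lucene"]), COUNTRY_FILTERS))
```

In this toy model, a match-all query makes each facet count equal to the size of the corresponding cached set, so no per-document work is needed; whether a real engine can take that shortcut depends on its implementation. Which counting strategy is cheaper depends on the number of distinct values versus the size of the result, which fits the "one size does not fit all" remark.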