Hi Tri,

Look at this:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3CCAEN8dyX_Am_v4f=5614eu35fnhb5h7dzkmkzdfwvrrm1xpq...@mail.gmail.com%3E

Roman

On 13 Feb 2014 03:39, "Tri Cao" <tm...@me.com> wrote:
> Hi Joel,
>
> Thanks a lot for the suggestion.
>
> After thinking more about this, I think I could skip the facet counts for
> now and just provide a filtering option without displaying how many items
> would be left after filtering. After all, even Google Shopping product
> search doesn't display the facet counts :) Given that, I think the easiest
> way is to add a new PostFilter to the query.
>
> Thanks again,
> Tri
>
> On Feb 12, 2014, at 12:03 PM, Joel Bernstein <joels...@gmail.com> wrote:
>
> Tri,
>
> You will most likely need to implement a custom QParserPlugin to
> efficiently handle what you described. Inside of this QParserPlugin you
> could create the logic that would bring in your outside list of IDs and
> build a DocSet that could be applied to the fq and the facet.query. I
> haven't attempted to use a QParserPlugin with a facet.query, but in theory
> it would work.
>
> With the filter query you also have the option of implementing your Query
> as a PostFilter. PostFilter logic is applied at collect time, so the logic
> only needs to be applied to the documents that match the query. In many
> cases this can be faster, especially when result sets are relatively small
> but the index is large.
>
>
> Joel Bernstein
> Search Engineer at Heliosearch
>
>
> On Wed, Feb 12, 2014 at 2:12 PM, Tri Cao <tm...@me.com> wrote:
>
> > Hi all,
> >
> > I am running a Solr application and I need to implement a feature that
> > requires faceting and filtering on a large list of IDs. The IDs are
> > stored outside of Solr and are specific to the currently logged-in user.
> > An example of this is the articles/tweets the user has read in the last
> > few weeks. Note that the IDs here are the real document IDs and not
> > Lucene internal docids.
> >
> > So the question is: what would be the best way to implement this in Solr?
> > The list could be as large as tens of thousands of IDs.
> > The obvious way of rewriting the Solr query to add the ID list as
> > "facet.query" and "fq" doesn't seem to be the best way because: a) the
> > query would be very long, and b) it would surely exceed the default limit
> > of 1024 Boolean clauses, and I am sure the limit is there for a reason.
> >
> > I had a similar problem before, but back then I was using Lucene directly,
> > and the way I solved it was to use a MultiTermQuery to retrieve the
> > internal docids from the ID list and then apply the resulting DocSet to
> > counting and filtering. It worked reasonably for lists of size ~10K, and
> > with proper caching it was working OK. My current application is so
> > invested in Solr that going back to Lucene is not an option anymore.
> >
> > All advice/suggestions are welcome.
> >
> > Thanks,
> > Tri
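
For reference, here is a rough sketch of the collect-time filtering idea discussed in the thread, written in plain Java and deliberately not against Solr's classes. Everything in it is an assumption made for illustration: `DocCollector`, `IdSetPostFilter`, `search`, and the `uniqueKeys` array (standing in for a lookup from internal docid to the real document ID) are hypothetical names, not Solr or Lucene API. It only shows that the membership check runs once per document matched by the main query, not once per ID in the user's list, and that no Boolean clauses are built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PostFilterSketch {

    /** Hypothetical stand-in for a collector in the search chain (not a Solr type). */
    interface DocCollector {
        void collect(int docId);
    }

    /** Passes a document on to the delegate only if its real ID is in the user's list. */
    static final class IdSetPostFilter implements DocCollector {
        private final String[] uniqueKeys;   // assumed lookup: internal docid -> real document ID
        private final Set<String> userIds;   // ID list loaded from outside the index
        private final DocCollector delegate; // next collector in the chain

        IdSetPostFilter(String[] uniqueKeys, Set<String> userIds, DocCollector delegate) {
            this.uniqueKeys = uniqueKeys;
            this.userIds = userIds;
            this.delegate = delegate;
        }

        @Override
        public void collect(int docId) {
            // One hash lookup per matching document; the size of the ID list
            // does not affect the number of checks.
            if (userIds.contains(uniqueKeys[docId])) {
                delegate.collect(docId);
            }
        }
    }

    /** Feeds the main query's matches through the filter and returns the surviving docids. */
    static List<Integer> search(String[] uniqueKeys, int[] mainQueryMatches, Set<String> userIds) {
        List<Integer> hits = new ArrayList<>();
        DocCollector filter = new IdSetPostFilter(uniqueKeys, userIds, doc -> hits.add(doc));
        for (int doc : mainQueryMatches) {
            filter.collect(doc);
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] uniqueKeys = {"a1", "a2", "a3", "a4", "a5"};
        int[] mainQueryMatches = {0, 2, 4}; // docids matched by the main query
        Set<String> readByUser = new HashSet<>(Arrays.asList("a2", "a3", "a5"));
        System.out.println(search(uniqueKeys, mainQueryMatches, readByUser)); // prints [2, 4]
    }
}
```

In a real Solr plugin the docid-to-ID lookup and the collector chain would come from Solr itself, and the details depend on the Solr version, so treat this only as an outline of the control flow.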