David,
We have a similar query in astrophysics: a user can select an area of the
sky, and there are many stars out there.

I am long overdue in creating a Jira issue, but here is another
efficient mechanism for searching a large number of ids:

https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/java/org/apache/solr/search/BitSetQParserPlugin.java
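
For the bitset1/bitset2 question quoted below, the underlying idea can be sketched with plain java.util.BitSet. This is only an illustration of intersecting two sets of document ids, not the code of the plugin linked above; the class name GroupBitSetDemo, the method intersect and the sample doc ids are all made up for the example.

```java
import java.util.BitSet;

// Hypothetical helper for illustration; not part of Solr or the linked plugin.
class GroupBitSetDemo {

    // Returns a new BitSet holding only the doc ids set in both inputs.
    static BitSet intersect(BitSet queryMatches, BitSet allowedByGroups) {
        BitSet result = (BitSet) queryMatches.clone(); // leave the inputs unmodified
        result.and(allowedByGroups);
        return result;
    }

    public static void main(String[] args) {
        // Doc ids matching the search query (sample values).
        BitSet queryMatches = new BitSet();
        for (int doc : new int[] {1, 4, 7, 9}) {
            queryMatches.set(doc);
        }

        // Doc ids belonging to the groups the user selected (sample values).
        BitSet allowedByGroups = new BitSet();
        for (int doc : new int[] {4, 5, 9, 12}) {
            allowedByGroups.set(doc);
        }

        BitSet visible = intersect(queryMatches, allowedByGroups);
        System.out.println(visible); // prints {4, 9}
    }
}
```

The intersection cost depends on the size of the bitsets rather than on the number of group ids, which is why this tends to scale better than a very long list of OR clauses.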

Roman
On 12 Oct 2013 01:57, "David Philip" <davidphilipshe...@gmail.com> wrote:

> The groups are pharmaceutical research experiments. The user is presented with a graph
> view; he can select a region, and all the groups in that region get
> included. The user can also modify the groups here, so we didn't maintain
> the group information in the same Solr index but externalized it.
> I looked at the post filter article. My understanding is that I simply have
> to extend it as you did and include an implementation of
> "isAllowed(acls[doc], groups)". This will filter the documents in the
> collector, and finally this collector will be returned. Am I right?
>
>   @Override
>       public void collect(int doc) throws IOException {
>         if (isAllowed(acls[doc], user, groups)) super.collect(doc);
>       }
>
>
> Erick, I am interested to know whether I can extend any class that can
> return only the bitset of the documents that match the search query. I
> can then do bitset1.and(bitset2OfGroups) and finally collect only those
> documents to return to the user. How do I try this approach? Any pointers on
> bitsets?
>
> Thanks - David
>
>
>
>
> On Thu, Oct 10, 2013 at 5:25 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> > Well, my first question is why 50K groups is necessary, and
> > whether you can simplify that. How a user can manually
> > choose from among that many groups is "interesting". But
> > assuming they're all necessary, I can think of two things.
> >
> > If the user can only select ranges, just put in filter queries
> > using ranges. Or possibly both ranges and individual entries,
> > as fq=group:[1A TO 10000A] OR group:(2B 45C 98Z) etc.
> > You need to be a little careful how you index these so
> > range queries work properly; in the above you'd miss
> > 2A because it's sorting lexicographically. You'd need to
> > store in some form that sorts like 0000001A 0010000A
> > and so on. You wouldn't need to show that form to the
> > user, just form your fq's in the app to work with
> > that form.
> >
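A minimal sketch of the padding described above, assuming group ids consist of a decimal number followed by a letter suffix and that 7 digits are enough. The class name GroupIdPadding, the method pad and the WIDTH constant are hypothetical and only serve the example.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper for illustration; the id format is an assumption.
class GroupIdPadding {

    static final int WIDTH = 7; // assumed: ids never exceed 9,999,999

    // "2A" -> "0000002A": left-pad the numeric part, keep the suffix as-is.
    // Throws NumberFormatException if the id does not start with a digit.
    static String pad(String groupId) {
        int i = 0;
        while (i < groupId.length() && Character.isDigit(groupId.charAt(i))) {
            i++;
        }
        long number = Long.parseLong(groupId.substring(0, i));
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", number) + groupId.substring(i);
    }

    public static void main(String[] args) {
        String[] padded = {pad("10000A"), pad("2A"), pad("1A")};
        Arrays.sort(padded); // plain lexicographic sort
        System.out.println(Arrays.toString(padded)); // prints [0000001A, 0000002A, 0010000A]

        // The app would build the range filter from the padded form.
        System.out.println("group:[" + pad("1A") + " TO " + pad("10000A") + "]");
    }
}
```

With the padded form, 0000002A sorts between 0000001A and 0010000A, so the range filter no longer skips group 2A.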
> > If that won't work (you wouldn't want this to get huge), think
> > about a "post filter" that would only operate on documents that
> > had made it through the select, although how to convey which
> > groups the user selected to the post filter is an open
> > question.
> >
> > Best,
> > Erick
> >
> > On Wed, Oct 9, 2013 at 12:23 PM, David Philip
> > <davidphilipshe...@gmail.com> wrote:
> > > Hi All,
> > >
> > > >     I have an issue in handling filters for one of our requirements and
> > > > would like to get suggestions on the best approaches.
> > >
> > >
> > > *Use Case:*
> > >
> > > > 1.  We have a list of groups, and the number of groups can increase up to 1
> > > > million. Currently we have almost 90 thousand groups in the Solr search
> > > > system.
> > >
> > > > 2.  Just before the user runs a search, he has the option to select a number
> > > > of groups he wants to retrieve. [The distinct list of these group names for
> > > > display is retrieved from another Solr index that has more information
> > > > about the groups.]
> > >
> > > > *3. User Operation:*
> > > > Say the user selected group 1A - group 10000A and searches for
> > > > key:cancer.
> > >
> > >
> > > > The current approach I was thinking of is: get the search results and apply a
> > > > filter query with the list of group ids selected by the user. But my concern is
> > > > that when this group list grows to >50k unique ids, it can cause a lot of delay
> > > > in getting search results. So I wanted to know whether there are different
> > > > filtering approaches that I can try.
> > >
> > > > I was thinking of one more approach, suggested by my colleague: an
> > > > intersection.
> > > > Get the group ids selected by the user.
> > > > Get the list of group ids from the search results.
> > > > Perform an intersection of both and then get the entire result set for only
> > > > those group ids that intersected. Is this a better way? Can I use any cache
> > > > technique in this case?
> > >
> > >
> > > - David.
> >
>
