That worked, but I seem unable to:

1) run phrase queries, i.e.

    *core1*/select?fl=title&q={!join from=id to=id fromIndex=*core0*} titleNormalized:"*text pdf*"&facet=true&facet.field=tags

or

2) run filters on core0:

    *core1*/select?fl=title&q={!join from=id to=id fromIndex=*core0*} titleNormalized:"*text pdf*"&fq=user:76&facet=true&facet.field=tags

I am thinking a better design is to build a custom SearchComponent on core0
and add it as a last-component of the default search handler on core0 (both
cores are on the same JVM). The custom core-aware component will access core1
as follows:

    // searcher for core1, set once in inform()
    private SolrIndexSearcher core1Searcher;

    // inform of core0
    public void inform(SolrCore core) {
        SolrCore core1 = core.getCoreDescriptor().getCoreContainer().getCore("core1");
        core1Searcher = core1.getNewestSearcher(false).get();
    }

Then I intercept the default search handler:

    public void process(ResponseBuilder rb) throws IOException {
        SolrIndexSearcher core0 = rb.req.getSearcher();
        SolrParams params = rb.req.getParams();
        Iterator<Integer> docIt = rb.getResults().docList.iterator();
        String tagname;
        String id;
        while (docIt.hasNext()) {
            Integer docID = docIt.next();
            // stored id of the matching core0 document
            id = core0.doc(docID).get("id");
            // pseudocode: look up the tag for this id in core1 (via core1Searcher)
            tagname = doc.search(id);
            ..... do faceting on the docs;
        }
    }

On Thu, Jun 4, 2015 at 10:29 AM, Alessandro Benedetti <
benedetti.ale...@gmail.com> wrote:

> Hi Rob,
> according to your use case you have to :
>
> Call the /select from *core1* in this way:
>
> *core1*/select?fl=title&q={!join from=id to=id fromIndex=*core0*} titleNormalized:pdf&facet=true&facet.field=tags
>
> Hope this clarify your problem.
>
> Cheers
>
> 2015-06-04 15:00 GMT+01:00 Robust Links <pey...@robustlinks.com>:
>
> > my requirement is to join core1 onto core0. restating the requirements
> > again.
> > I have 2 cores
> >
> > core0
> > --------
> > field:id
> > field: text
> >
> > core1
> > --------
> > field:id
> > field tag
> >
> > I want to
> >
> > 1) query text field of core0, together with filters
> > 2) use the {id} of matches (which can be >>10K) to retrieve the docs in
> > core 1 with same id and
> > 3) facet on tags in core1
> >
> > so my /select is to run on core0 and facet on tag field of core1
> >
> > thank you Alessandro
> >
> > On Thu, Jun 4, 2015 at 9:28 AM, Alessandro Benedetti <
> > benedetti.ale...@gmail.com> wrote:
> >
> > > Lets try to make clear some point :
> > >
> > > Index TO : is the one you are using to call the select request handler
> > > Index From : Tags
> > > Is titleNormalized present in the "Tags" index ? Because there is where
> > > the query will run.
> > >
> > > The documents in tags satisfying the query will be joined with the
> > > index TO.
> > > The resulting documents can be filtered and faceted.
> > > I did use this approach a lot of times.
> > > And I can tell you it is working in this way.
> > > Maybe you misunderstood the Join feature, or I misunderstood your
> > > requirement.
> > >
> > > Cheers
> > >
> > > 2015-06-04 13:27 GMT+01:00 Robust Links <pey...@robustlinks.com>:
> > >
> > > > try it for yourself and see if it works Alessandro. Not only cant i
> > > > get facets but i even get field errors when i run such join queries
> > > >
> > > > select?fl=title&q={!join from=id to=id fromIndex=Tags}titleNormalized:pdf
> > > >
> > > > <lst name="error">
> > > > <str name="msg">undefined field titleNormalized</str>
> > > > <int name="code">400</int>
> > > > </lst>
> > > >
> > > > On Thu, Jun 4, 2015 at 5:19 AM, Alessandro Benedetti <
> > > > benedetti.ale...@gmail.com> wrote:
> > > >
> > > > > Hi Rob,
> > > > > Reading your use case I can not understand why the Query Time join
> > > > > is not a fit for you !
> > > > > The documents returned by the Query Time Join will be from core1, so
> > > > > faceting and filter querying that core, would definitely be possible !
> > > > > I can not see your problem honestly !
> > > > >
> > > > > Cheers
> > > > >
> > > > > 2015-06-04 1:47 GMT+01:00 Robust Links <pey...@robustlinks.com>:
> > > > >
> > > > > > that doesnt work either, and even if it did, joining is not going
> > > > > > to be a solution since i cant query 1 core and facet on the result
> > > > > > of the other. To sum up, my problem is
> > > > > >
> > > > > > core0
> > > > > > --------
> > > > > > field:id
> > > > > > field: text
> > > > > >
> > > > > > core1
> > > > > > --------
> > > > > > field:id
> > > > > > field tag
> > > > > >
> > > > > > I want to
> > > > > >
> > > > > > 1) query text field of core0,
> > > > > > 2) use the {id} of matches (which can be >>10K) to retrieve the
> > > > > > docs in core 1 with same id and
> > > > > > 3) facet on tags in core1
> > > > > >
> > > > > > Is this possible without denormalizing (which is not an option)?
> > > > > >
> > > > > > thank you
> > > > > >
> > > > > > On Wed, Jun 3, 2015 at 4:24 PM, Jack Krupansky <
> > > > > > jack.krupan...@gmail.com> wrote:
> > > > > >
> > > > > > > Specify the join query parser for the main query. See:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser
> > > > > > >
> > > > > > > -- Jack Krupansky
> > > > > > >
> > > > > > > On Wed, Jun 3, 2015 at 3:32 PM, Robust Links <
> > > > > > > pey...@robustlinks.com> wrote:
> > > > > > >
> > > > > > > > Hi Erick
> > > > > > > >
> > > > > > > > they are on the same JVM. I had already tried the core join
> > > > > > > > strategy but that doesnt solve the faceting problem...
> > > > > > > > i.e if i have 2 cores, core0 and core1, and I run this query
> > > > > > > > on core0
> > > > > > > >
> > > > > > > > /select?q=<QUERY>&fq={!join from=id1 to=id2 fromIndex=core1}&facet=true&facet.field=tag
> > > > > > > >
> > > > > > > > has 2 problems
> > > > > > > > 1) i need to specify the docIDs with the fq (so back to the same
> > > > > > > > fq={!terms} problem), and
> > > > > > > > 2) faceting doesnt work
> > > > > > > >
> > > > > > > > Flattening the data is not possible due to security reasons.
> > > > > > > >
> > > > > > > > Am I using join correctly?
> > > > > > > >
> > > > > > > > thank you Erick
> > > > > > > >
> > > > > > > > Peyman
> > > > > > > >
> > > > > > > > On Wed, Jun 3, 2015 at 2:12 PM, Erick Erickson <
> > > > > > > > erickerick...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Are these indexes on different machines? Because if they're in
> > > > > > > > > the same JVM, you might be able to use cross-core joins. Be
> > > > > > > > > aware, though, that joining on high-cardinality fields (which,
> > > > > > > > > by definition, docID probably is) is where pseudo joins
> > > > > > > > > perform worst.
> > > > > > > > >
> > > > > > > > > Have you considered flattening the data and including whatever
> > > > > > > > > information you have in your "from" index in your main index?
> > > > > > > > > Because < 100ms response is probably not going to be tough if
> > > > > > > > > you have to have two indexes/cores.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Erick
> > > > > > > > >
> > > > > > > > > On Wed, Jun 3, 2015 at 10:58 AM, Joel Bernstein <
> > > > > > > > > joels...@gmail.com> wrote:
> > > > > > > > > > You may have to do something custom to meet your needs.
> > > > > > > > > >
> > > > > > > > > > 10,000 DocID's is not huge but your latency requirements are
> > > > > > > > > > pretty low.
> > > > > > > > > >
> > > > > > > > > > Are your DocID's by any chance integers? This can make custom
> > > > > > > > > > PostFilters run much faster.
> > > > > > > > > >
> > > > > > > > > > You should also be aware of the Streaming API in Solr 5.1
> > > > > > > > > > which will give you fast Map/Reduce approaches (
> > > > > > > > > > http://joelsolr.blogspot.com/2015/04/the-streaming-api-solrjio-basics.html
> > > > > > > > > > ).
> > > > > > > > > >
> > > > > > > > > > Joel Bernstein
> > > > > > > > > > http://joelsolr.blogspot.com/
> > > > > > > > > >
> > > > > > > > > > On Wed, Jun 3, 2015 at 1:46 PM, Robust Links <
> > > > > > > > > > pey...@robustlinks.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> Hey Joel
> > > > > > > > > >>
> > > > > > > > > >> see below
> > > > > > > > > >>
> > > > > > > > > >> On Wed, Jun 3, 2015 at 1:43 PM, Joel Bernstein <
> > > > > > > > > >> joels...@gmail.com> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > A few questions for you:
> > > > > > > > > >> >
> > > > > > > > > >> > How large can the list of filtering ID's be?
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> >> 10k
> > > > > > > > > >>
> > > > > > > > > >> >
> > > > > > > > > >> > What's your expectation on latency?
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> 10> latency <100
> > > > > > > > > >>
> > > > > > > > > >> >
> > > > > > > > > >> > What version of Solr are you using?
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> 5.0.0
> > > > > > > > > >>
> > > > > > > > > >> >
> > > > > > > > > >> > SolrCloud or not?
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> not
> > > > > > > > > >>
> > > > > > > > > >> >
> > > > > > > > > >> > Joel Bernstein
> > > > > > > > > >> > http://joelsolr.blogspot.com/
> > > > > > > > > >> >
> > > > > > > > > >> > On Wed, Jun 3, 2015 at 1:23 PM, Robust Links <
> > > > > > > > > >> > pey...@robustlinks.com> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > Hi
> > > > > > > > > >> > >
> > > > > > > > > >> > > I have a set of document IDs from one core and i want to
> > > > > > > > > >> > > query another core using the ids retrieved from the first
> > > > > > > > > >> > > core...the constraint is that the size of doc ID set can
> > > > > > > > > >> > > be very large. I want to:
> > > > > > > > > >> > >
> > > > > > > > > >> > > 1) retrieve these docs from the 2nd index
> > > > > > > > > >> > > 2) facet on the results
> > > > > > > > > >> > >
> > > > > > > > > >> > > I can think of 3 solutions:
> > > > > > > > > >> > >
> > > > > > > > > >> > > 1) boolean query
> > > > > > > > > >> > > 2) terms fq
> > > > > > > > > >> > > 3) use a DB rather than Solr
> > > > > > > > > >> > >
> > > > > > > > > >> > > I am trying to keep latencies down so prefer to not use
> > > > > > > > > >> > > (3). The problem with (1) is maxBooleanclauses is
> > > > > > > > > >> > > hardwired and I am not sure when I will hit the
> > > > > > > > > >> > > exception. Option (2) seems to also hit limits.. so if
> > > > > > > > > >> > > I do
> > > > > > > > > >> > >
> > > > > > > > > >> > > select?fl=*&q=*:*&facet=true&facet.field=title&fq={!terms f=id}<LONG_LIST_OF_IDS>
> > > > > > > > > >> > >
> > > > > > > > > >> > > solr just goes blank.
> > > > > > > > > >> > > I have tried adding cost=200 to try to run the query
> > > > > > > > > >> > > first fq={!terms f=id cost=200} but still no good. Paging
> > > > > > > > > >> > > on doc IDs could be a solution but the problem then is
> > > > > > > > > >> > > that the faceting results correspond to the paged IDs and
> > > > > > > > > >> > > not the global set.
> > > > > > > > > >> > >
> > > > > > > > > >> > > My filter cache spec is as follows
> > > > > > > > > >> > >
> > > > > > > > > >> > > <filterCache class="solr.FastLRUCache"
> > > > > > > > > >> > >              size="1000000"
> > > > > > > > > >> > >              initialSize="1000000"
> > > > > > > > > >> > >              autowarmCount="100000"/>
> > > > > > > > > >> > >
> > > > > > > > > >> > > What would be the best way for me to solve this problem?
> > > > > > > > > >> > >
> > > > > > > > > >> > > thank you
> > > > >
> > > > > --
> > > > > --------------------------
> > > > >
> > > > > Benedetti Alessandro
> > > > > Visiting card : http://about.me/alessandro_benedetti
> > > > >
> > > > > "Tyger, tyger burning bright
> > > > > In the forests of the night,
> > > > > What immortal hand or eye
> > > > > Could frame thy fearful symmetry?"
> > > > >
> > > > > William Blake - Songs of Experience -1794 England
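
For reference, the three-step requirement restated in this thread (query core0, carry the matching ids over to core1, facet on the tags found there) can be sketched in plain Java. This is only an illustration under assumptions: in-memory collections stand in for both cores, and the class name, method name, and sample data are hypothetical. No Solr API is involved, so it says nothing about how a join or a custom SearchComponent would perform.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory sketch: plain collections stand in for core0 and core1. */
public class CrossCoreFacetSketch {

    /**
     * Counts tags (the core1 side) for every id matched on the core0 side.
     *
     * @param matchedIds ids returned by the core0 query and its filters
     * @param tagsById   hypothetical stand-in for core1: id -> tags
     * @return tag -> number of matched ids carrying that tag, sorted by tag
     */
    public static Map<String, Integer> facetTags(Collection<String> matchedIds,
                                                 Map<String, List<String>> tagsById) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String id : matchedIds) {
            // Ids with no document in "core1" contribute nothing.
            List<String> tags = tagsById.getOrDefault(id, Collections.<String>emptyList());
            for (String tag : tags) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> core1 = new HashMap<>();
        core1.put("1", Arrays.asList("pdf", "report"));
        core1.put("2", Arrays.asList("pdf"));
        core1.put("3", Arrays.asList("video"));

        // Pretend ids 1 and 2 matched the phrase query and the user filter on core0.
        List<String> matchedOnCore0 = Arrays.asList("1", "2");

        System.out.println(facetTags(matchedOnCore0, core1)); // prints {pdf=2, report=1}
    }
}
```

The point of the sketch is only that the text query and the filters apply to the id side, while the counting applies to the tag side; whichever mechanism is used in Solr has to preserve that split.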