@Walter: Think of it this way.

1. Separate data-solr from merger-solr. So merger-solr has more cpu, while
   data-solr has more ram (very simplistic). This way, you can also scale
   them separately. (es has something like search-only-node)
2. Second step is to join client-app with merger-solr. So you do 1 less hop.
   This node doesn't have to lose the global-idf, query-cache or whatever
   merger-only-solr currently does. If the client-app is just a
   frontend/proxy for solr, then it should be better.
3. The whole point is to have fewer, more powerful machines. And each
   client-app should be able to saturate its own embedded-solr.

Makes sense ?

On Wed, Apr 19, 2017 at 6:29 PM, Walter Underwood <[email protected]> wrote:

> That is exactly what I thought you meant. Adds complexity with no benefit.
>
> Now the merger needs to keep caches for global IDF. But those caches don’t
> get the benefit of other requests to the same cluster.
>
> I’m not sure if query result caches cache the results of distributed
> queries, but if they do, then this approach loses that benefit too.
>
> wunder
> Walter Underwood
> [email protected]
> http://observer.wunderwood.org/ (my blog)
>
>
> On Apr 19, 2017, at 9:01 AM, Dorian Hoxha <[email protected]> wrote:
>
> @Walter
>
> Usually you have: client-app --> random-solr-node (merger) --> each other
> node that has a shard
> While what I want: client-app (merger is in same jvm) --> each other
> node that has a shard
>
> Makes sense ?
>
> On Wed, Apr 19, 2017 at 4:50 PM, Walter Underwood <[email protected]>
> wrote:
>
>> Does not make sense to me. It would do more queries from the client to
>> the cluster, not fewer. And those HTTP requests would probably be slower
>> than the intra-cluster requests.
>>
>> I expect the distributed portion of the query load is small compared to
>> other CPU usage.
>>
>> It adds complexity for no gain in performance. Maybe a slight loss.
>>
>> wunder
>> Walter Underwood
>> [email protected]
>> http://observer.wunderwood.org/ (my blog)
>>
>>
>> On Apr 19, 2017, at 6:32 AM, Mikhail Khludnev <[email protected]> wrote:
>>
>> Hello, Dorian.
>> I'm not sure about 1. But you can create EmbeddedSolrServer and add
>> "collection" parameter. It's what's done in
>> org.apache.solr.response.transform.SubQueryAugmenter
>> [subquery]
>>
>> On Wed, Apr 19, 2017 at 3:53 PM, Dorian Hoxha <[email protected]>
>> wrote:
>>
>>> Hi friends,
>>>
>>> Has anybody done this ? Reasons being: 1 less http-request when doing
>>> distributed search. But also not storing data itself (like a
>>> search-only-node). And the other nodes not caring about search-nodes.
>>>
>>> Makes sense ?
>>>
>>> Regards,
>>> Dorian
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
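
For readers following the thread, the "merger" role being argued about (the node that sends the query to every shard and combines the partial results) can be sketched roughly as below. This is only an illustration under loud assumptions: the shards are hypothetical in-memory dictionaries, and `query_shard` and `merged_top_k` are made-up names, not Solr or SolrJ APIs. Real distributed search also has to deal with global IDF, caching and a second fetch phase, which is exactly what the replies above discuss.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for shards: doc id -> score for one fixed query.
SHARDS = [
    {"a": 0.9, "b": 0.4},
    {"c": 0.7, "d": 0.2},
    {"e": 0.8},
]

def query_shard(shard, k):
    """Return one shard's local top-k as (score, doc_id) pairs."""
    return heapq.nlargest(k, ((score, doc) for doc, score in shard.items()))

def merged_top_k(shards, k):
    """Scatter the query to every shard in parallel, then merge the partial lists."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: query_shard(s, k), shards))
    # Gather step: keep the k best hits across all shards.
    merged = heapq.nlargest(k, (hit for part in partials for hit in part))
    return [doc for _, doc in merged]
```

Whether this function runs on a random Solr node or inside the client JVM, it makes the same number of per-shard requests; the proposal only removes the hop between the client and the merging node.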
