I know exactly how this works. MarkLogic can be configured with separate
execute and data nodes. But in MarkLogic, the execute nodes can do a lot of
work. The query may be a mix of indexed searching and “table scan” searching,
all expressed in an XQuery program.
It does not make sense for
@Walter:
Think of it this way.
1. Separate data-solr from merger-solr, so that merger-solr gets more CPU while
data-solr gets more RAM (very simplistic).
This way, you can also scale them separately. (Elasticsearch has something
like a search-only node.)
2. The second step is to join client-app with merger-solr.
That is exactly what I thought you meant. Adds complexity with no benefit.
Now the merger needs to keep caches for global IDF. But those caches don’t get
the benefit of other requests to the same cluster.
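To illustrate why the merger needs global statistics at all: each shard only knows its own document frequencies, so scores from different shards are only comparable if the per-shard counts are summed before IDF is computed. The following is a minimal sketch, assuming a classic TF-IDF style formula, idf = 1 + ln(N / (df + 1)). The function names and the shard numbers are hypothetical, and this is not Solr's actual stats-cache code.

```python
import math

def local_idf(doc_count, doc_freq):
    """IDF as a single shard would compute it from its own counts."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))

def global_idf(shard_stats):
    """Combine per-shard (doc_count, doc_freq) pairs into one cluster-wide IDF."""
    total_docs = sum(doc_count for doc_count, _ in shard_stats)
    total_df = sum(doc_freq for _, doc_freq in shard_stats)
    return 1.0 + math.log(total_docs / (total_df + 1))

# Hypothetical term that is rare on shard A and common on shard B.
stats = [(1000, 9), (1000, 89)]
cluster_idf = global_idf(stats)
```

With these made-up numbers the two shards disagree noticeably on the local IDF of the term, and the cluster-wide value falls between them. Whoever does the merging has to fetch or cache those summed counts.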
I’m not sure if query result caches cache the results of distributed queries,
but if they
@Walter
Usually you have: client-app --> random-solr-node (merger) --> each other
node that has a shard
What I want instead: client-app (merger is in the same JVM) --> each other node
that has a shard
Does that make sense?
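The merge step that would move into the client can be sketched roughly as follows. This is only an illustration in Python, assuming each shard returns its hits as (score, doc_id) tuples already sorted by descending score. The function name and the sample hits are hypothetical, and a real merger would also have to deal with things like global statistics, facets, and fetching stored fields, which this sketch ignores.

```python
import heapq
from itertools import islice

def merge_shard_results(shard_results, rows):
    """Merge per-shard hit lists (each sorted by descending score)
    into a single global top-`rows` list."""
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, rows))

# Hypothetical per-shard responses: (score, doc_id), best hit first.
shard_a = [(9.1, "a3"), (4.0, "a7")]
shard_b = [(7.5, "b1"), (6.2, "b9"), (1.0, "b2")]

top3 = merge_shard_results([shard_a, shard_b], rows=3)
```

Here `top3` holds the three best hits across both shards. The merge itself is cheap; the cost under discussion is the extra client-to-shard requests needed to collect the lists.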
On Wed, Apr 19, 2017 at 4:50 PM, Walter Underwood wrote:
Does not make sense to me. It would do more queries from the client to the
cluster, not fewer. And those HTTP requests would probably be slower than the
intra-cluster requests.
I expect the distributed portion of the query load is small compared to other
CPU usage.
It adds complexity for no
Hello, Dorian.
I'm not sure about 1. But you can create an EmbeddedSolrServer and add a
"collection" parameter. That is what is done in
org.apache.solr.response.transform.SubQueryAugmenter [subquery]
On Wed, Apr 19, 2017 at 3:53 PM, Dorian Hoxha wrote:
> Hi friends,
>
> Anybody
Hi friends,
Has anybody done this? The reasons: one fewer HTTP request when doing a
distributed search, the merging node not storing data itself (like a
search-only node), and the other nodes not needing to care about search nodes.
Does that make sense?
Regards,
Dorian