Re: Separating Search and Indexing in SolrCloud

2016-12-18 Thread Иван Иванов
Stop On Dec 16, 2016, 3:31 PM, "Jaroslaw Rozanski" < m...@jarekrozanski.com> wrote: > Hi all, > > According to the documentation, in normal operation (not recovery) in a Solr > Cloud configuration the leader sends updates it receives to all the > replicas. > > This means all nodes in

Re: Separating Search and Indexing in SolrCloud

2016-12-18 Thread Erick Erickson
Analyzed documents. The transaction log stores the raw input. On Sun, Dec 18, 2016 at 5:32 AM, Jaroslaw Rozanski wrote: > Hi Erick, > > > Not talking about separation any more. I merely summarized the message from > Pushkar. As I said, it was clear that it was not possible. >

Re: Separating Search and Indexing in SolrCloud

2016-12-18 Thread Jaroslaw Rozanski
Hi Erick, Not talking about separation any more. I merely summarized the message from Pushkar. As I said, it was clear that it was not possible. About the RAMBufferSizeMB, getting back to my original question: is this buffer for storing update requests or ready-to-index, analyzed documents?

Re: Separating Search and Indexing in SolrCloud

2016-12-17 Thread Erick Erickson
Yes, indexing is adding stress. No, you can't separate the two in SolrCloud. End of story, why beat it to death? You'll have to figure out the sharding strategy that meets your indexing and querying needs and live within that framework. I'd advise setting up a small cluster and driving it to its

Re: Separating Search and Indexing in SolrCloud

2016-12-17 Thread Jaroslaw Rozanski
Hi Erick, So what does this buffer represent? What does it actually store? Raw update requests or analyzed documents? The documentation suggests that it stores actual update requests. Obviously an analyzed document can and will occupy much more space than a raw one. Also analysis will create a lot of

Re: Separating Search and Indexing in SolrCloud

2016-12-17 Thread Erick Erickson
bq: I am more concerned with indexing memory requirements at volume By and large this isn't much of a problem. RAMBufferSizeMB in solrconfig.xml governs how much memory is consumed in Solr for indexing. When that limit is exceeded, the buffer is flushed to disk. I've rarely heard of indexing
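
A toy sketch of the flush-on-threshold behaviour described above, for illustration only. This is not Lucene's or Solr's actual implementation; the class `RamBuffer`, its byte-count accounting, and the in-memory `flushed_segments` list are all assumptions made up for this example. Per the later reply in this thread, the buffer is taken to hold analyzed documents, which are modelled here as plain byte strings.

```python
class RamBuffer:
    """Hypothetical model of a size-bounded indexing buffer."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes   # plays the role of RAMBufferSizeMB
        self.pending = []                # analyzed docs not yet flushed
        self.used_bytes = 0
        self.flushed_segments = []       # stands in for segments on disk

    def add(self, analyzed_doc):
        """Buffer one analyzed document; flush once the limit is exceeded."""
        self.pending.append(analyzed_doc)
        self.used_bytes += len(analyzed_doc)
        if self.used_bytes > self.limit_bytes:
            self.flush()

    def flush(self):
        """Move everything buffered into a new 'segment' and reset usage."""
        if self.pending:
            self.flushed_segments.append(self.pending)
        self.pending = []
        self.used_bytes = 0


buf = RamBuffer(limit_bytes=10)
for doc in (b"aaaa", b"bbbb", b"cccc"):
    buf.add(doc)
```

In this model, memory used for indexing is bounded by roughly the limit plus one document, regardless of how many documents are indexed in total.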

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Jaroslaw Rozanski
Thanks, that issue looks interesting! On 16/12/16 16:38, Pushkar Raste wrote: > This kind of separation is not supported yet. There is, however, some work > going on; you can read about it at > https://issues.apache.org/jira/browse/SOLR-9835 > > This unfortunately would not support soft commits and

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Jaroslaw Rozanski
Thanks, On 16/12/16 20:56, Shawn Heisey wrote: > On 12/16/2016 5:43 AM, Jaroslaw Rozanski wrote: >> The leader is responsible for distributing update requests to replicas. So >> eventually all replicas have the same state as the leader. Not a problem. It >> is more about the performance of such. If I gather

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Shawn Heisey
On 12/16/2016 5:43 AM, Jaroslaw Rozanski wrote: > The leader is responsible for distributing update requests to replicas. So > eventually all replicas have the same state as the leader. Not a problem. It > is more about the performance of such. If I gather correctly, normal > replication happens by standard

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Pushkar Raste
This kind of separation is not supported yet. There is, however, some work going on; you can read about it at https://issues.apache.org/jira/browse/SOLR-9835 This unfortunately would not support soft commits and hence would not be a good solution for near real time indexing. On Dec 16, 2016 7:44

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Dorian Hoxha
Makes more sense, but I think the master should do the write before it can be redirected to other replicas. So I'm not sure if that can be done. In Elasticsearch you can have data nodes and coordinator nodes:

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Jaroslaw Rozanski
Sorry, not what I meant. The leader is responsible for distributing update requests to replicas. So eventually all replicas have the same state as the leader. Not a problem. It is more about the performance of such. If I gather correctly, normal replication happens by standard update request. Not by, say,

Re: Separating Search and Indexing in SolrCloud

2016-12-16 Thread Dorian Hoxha
The leader is the source of truth. You expect to make the replica the source of truth or something? That doesn't make sense. What people do is send writes to the leader/master and reads to replicas/slaves in other Solr/other DBs. On Fri, Dec 16, 2016 at 1:31 PM, Jaroslaw Rozanski

Separating Search and Indexing in SolrCloud

2016-12-16 Thread Jaroslaw Rozanski
Hi all, According to the documentation, in normal operation (not recovery) in a Solr Cloud configuration the leader sends updates it receives to all the replicas. This means all nodes in the shard perform the same effort to index a single document. Correct? Is there then a benefit to *not* to send