Solr Cloud: Zookeeper failure modes

2019-01-02 Thread Pavel Micka
Hi,
We are currently implementing SolrCloud, and as part of this effort we are 
investigating which failure modes can occur between Solr and ZooKeeper.

We have found quite a lot of articles describing the "happy path" failure, where 
ZK stops (loses its majority) and the Solr cluster ceases to serve write 
requests (while reads continue to work as expected). Once the ZK cluster is 
reconciled and a majority is achieved again, everything continues working as 
expected.

What we have not been able to find is what happens when the ZK cluster fails 
catastrophically and loses its data, either completely (scenario A) or by being 
restarted from a backup (scenario B).

So now the questions:

1)  Scenario A - Is an existing Solr Cloud cluster able to start against a 
clean ZooKeeper and reconstruct all the ZK data from its internal state (using 
some kind of emergency recovery; it may take a long time)?

2)  Scenario B - What is the worst case backup/restore scenario? For 
example when

a.   ZK is backed up

b.   Cluster performs some transition between states "X -> Y" (such as 
commit shard, elect new leader etc.)

c.   ZK fails completely

d.   ZK is restored from backup created in step a

e.   Solr Cloud is in state "Y", while ZK is in state "X"
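
Steps a to e can be sketched as a toy model. This is only an illustration of 
the divergence being asked about, not of how Solr or ZooKeeper behave 
internally: the dictionary fields, the node names and the diverged_keys helper 
are all hypothetical and do not correspond to Solr's real znode layout.

```python
from copy import deepcopy

# Hypothetical stand-in for the cluster state that ZooKeeper holds.
# Field names are illustrative only, not Solr's real znode layout.
zk_state = {"version": 1, "shard1_leader": "node-a"}  # state X

# a. ZK is backed up
backup = deepcopy(zk_state)

# b. Cluster transitions X -> Y (here: a new leader is elected)
zk_state = {"version": 2, "shard1_leader": "node-b"}  # state Y
solr_view = deepcopy(zk_state)  # what the Solr nodes believe

# c. ZK fails completely; d. ZK is restored from the backup taken in step a
zk_state = deepcopy(backup)


def diverged_keys(solr, zk):
    """Return the keys whose values differ between the two views."""
    return sorted(k for k in solr.keys() | zk.keys() if solr.get(k) != zk.get(k))


# e. Solr is in state Y while ZK is back in state X
print(diverged_keys(solr_view, zk_state))
```

In this toy model every key changed after the backup shows up as diverged; the 
open question is how a real cluster reacts to such a mismatch.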

Thanks in advance,

Pavel



Solr Cloud in AWS

2018-11-29 Thread Pavel Micka
Hi,

Is there any well-established vendor offering Solr Cloud as a service in 
different AWS regions? Or what is the easiest way to get Solr Cloud running in 
AWS?

We already have K8s with Helm, but unfortunately it seems that there is no Helm 
Chart available...

Thanks,

Pavel


Solr memory reqs for time-sorted data

2018-09-07 Thread Pavel Micka
Hi,

I found on the wiki (https://wiki.apache.org/solr/SolrPerformanceProblems#RAM) 
that the optimal amount of RAM for Solr is equal to the index size. This is, 
let's say, the ideal case: having everything in memory.

We plan to have a small installation with 2 nodes and 8 shards. The cluster 
will hold 100M documents, and we expect each document to take 5kB in the index. 
With an in-memory index, this would mean that the two nodes together would 
require ~500GB of RAM, i.e. 2x 256GB to have everything in memory. And those 
are really big machines... Is this calculation even correct in new Solr 
versions?

We also have a somewhat restricted problem: our data are time-based logs, and 
searches are generally restricted to the last 3 months, which will match, let's 
say, 10M documents. How will this affect Solr memory requirements? Will we 
still need to have the whole inverted index in memory? Or is there some 
internal optimization that ensures only part of it needs to be in memory?
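
For reference, the arithmetic behind these figures can be written out as a 
short sketch. It assumes 1 kB = 1000 bytes and that each document is stored 
once (no replicas); the function and constant names are made up for 
illustration. It only reproduces the estimate and says nothing about how much 
of the index Solr actually needs to keep cached.

```python
def index_size_gb(num_docs: int, bytes_per_doc: int) -> float:
    """Estimated index size in GB (decimal), assuming a fixed size per document."""
    return num_docs * bytes_per_doc / 1e9


TOTAL_DOCS = 100_000_000   # 100M documents in the cluster
RECENT_DOCS = 10_000_000   # ~10M documents matched by a 3-month search
BYTES_PER_DOC = 5_000      # 5 kB per document (assumed 1 kB = 1000 bytes)
NODES = 2

total_gb = index_size_gb(TOTAL_DOCS, BYTES_PER_DOC)
recent_gb = index_size_gb(RECENT_DOCS, BYTES_PER_DOC)
per_node_gb = total_gb / NODES  # assumes no replication across nodes

print(f"whole index: {total_gb:.0f} GB ({per_node_gb:.0f} GB per node)")
print(f"recent data: {recent_gb:.0f} GB")
```

Under these assumptions the recently accessed part is a tenth of the whole 
index, which is why the answer to the caching question matters so much for 
sizing.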

The questions:

1)  Is the 500GB memory requirement a correct assumption?

2)  Will the fact that we have time-based logs, with the majority of accesses 
going to recent data only, help?

3)  Is there some best practice for reducing the RAM required by Solr?



Thanks in advance!

Pavel


Side note:
We were thinking about DB partitioning based on Time Routed Aliases, but 
unfortunately we need to ensure disaster recovery through a bad network 
connection, and TRA and Cross Data Center Replication are not compatible (CDCR 
requires a static number of cores, while TRA creates cores dynamically).



SolrCloud acceptable latency, when to use CDCR?

2018-07-23 Thread Pavel Micka
Hi,

We are discussing the advantages of SolrCloud replication and Cross Data Center 
Replication (CDCR). The CDCR docs state that
"The SolrCloud architecture is not particularly well suited for situations 
where a single SolrCloud cluster consists of nodes in separated data clusters 
connected by an expensive pipe".

But we have not been able to find what latency is acceptable for SolrCloud/ZK 
and when we should start considering CDCR (master-slave). And what issues would 
arise if we installed SolrCloud on a problematic network?

Thanks in advance,

Pavel


Time Routed Aliases & CDCR

2018-07-20 Thread Pavel Micka
Hello,

We are planning to implement Time Routed Aliases in our solution. But one of 
our requirements is to be able to provide disaster recovery in case one of two 
Data Centers dies. The network between the DCs is potentially unstable and has 
latencies in the hundreds of milliseconds.

CDCR was recommended to us, and it really seems to fit our needs. But after 
reading the docs, I have some questions.


1)  With TRA, we define a single solrconfig.xml; this config is then assigned 
to each new collection when it is automatically created by the TRA logic.

a.   BUT CDCR requires us to specify sourceCollectionName and 
targetCollectionName 
(https://lucene.apache.org/solr/guide/7_4/cdcr-config.html#cdcr-config), and I 
can't specify them, because I have the same solrconfig.xml applied to all 
collections behind the alias. And the creation of collections is not in my 
hands, since it's done automatically. (And I do not get why I need to specify 
the names when the solrconfig.xml file is per collection...)

2)  CDCR docs state that "Configuration files (solrconfig.xml, 
managed-schema, etc.) are not automatically synchronized between the Source and 
Target clusters.". Does this also apply to files stored in ZooKeeper, or only 
to those on disk? If it also applies to those in ZK, we may have a problem: the 
collections are created automatically, so we can't easily detect that we should 
sync ZK to the backup site.
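
The mismatch described in point 1a can be sketched roughly as follows. The 
alias name "logs", the alias_date naming pattern and the shape of the static 
mapping are assumptions made purely for illustration; the actual names TRA 
generates and the actual CDCR configuration keys should be checked against the 
reference guide for the Solr version in use.

```python
from datetime import date, timedelta

ALIAS = "logs"  # hypothetical alias name


def tra_collection_name(alias: str, day: date) -> str:
    """Assumed naming scheme: alias + "_" + start date of the time slice."""
    return f"{alias}_{day.isoformat()}"


# Collections that a daily time-routed alias would create over three days.
start = date(2018, 7, 20)
collections = [
    tra_collection_name(ALIAS, start + timedelta(days=i)) for i in range(3)
]

# A single shared config can name only one source/target pair (hypothetical shape).
static_mapping = {"source": collections[0], "target": collections[0]}

# Every collection created later is not covered by that static pair.
uncovered = [c for c in collections if c != static_mapping["source"]]

print(collections)
print(uncovered)
```

With one config shared by all collections behind the alias, only the first 
collection matches the configured pair in this sketch.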

If there is some smarter way to do disaster recovery (2-node Solr setup) to a 
backup site (over a possibly bad network), please let me know either on this 
mailing list or on Stack Overflow 
(https://stackoverflow.com/questions/51425009/solrcloud-2-nodes-cluster).

Thanks,

Pavel




RE: Solr SQL: standalone mode

2017-09-25 Thread Pavel Micka
Glad to hear that. Btw: where is the limitation (that it's not possible to run 
SQL in standalone mode)? Is it in the distribution algorithm itself, or is Solr 
just missing ZooKeeper storage? I am asking because if it's the second case, we 
can just install a single-node ZK + a single Solr and have a "non-distributed 
cloud" :-)

Thanks,
Pavel

-Original Message-
From: Joel Bernstein [mailto:joels...@gmail.com] 
Sent: Monday, September 25, 2017 3:04 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr SQL: standalone mode

At Alfresco we are working on a version of Solr's SQL that works in non-Solr 
Cloud mode. The plan is to contribute this back to 7x branch.
There will also be improvements to the SQL coverage committed back from 
Alfresco.

Joel Bernstein
http://joelsolr.blogspot.com/

On Sun, Sep 24, 2017 at 6:04 PM, Pavel Micka <pavel.mi...@zoomint.com>
wrote:

> Hi,
>
>
> I read in the documentation that executing Solr SQL is possible only 
> in SolrCloud mode. The thing is that we unfortunately have some 
> installations which simply can't have multiple nodes (too small 
> instances). Is it somehow possible to work around this restriction, or 
> is there at least any plan to lift it?
>
>
> Thanks,
>
>
> Pavel
>


Solr SQL: standalone mode

2017-09-24 Thread Pavel Micka
Hi,


I read in the documentation that executing Solr SQL is possible only in 
SolrCloud mode. The thing is that we unfortunately have some installations 
which simply can't have multiple nodes (too small instances). Is it somehow 
possible to work around this restriction, or is there at least any plan to lift 
it?


Thanks,


Pavel