Bob:

I'd guess you had to fiddle with lock factories and the like, although
you say that master/slave wasn't even available when you put this
system together, so I don't even remember what was available "way back
when" ;).

"If it ain't broke, don't fix it" applies. That said, if I were redoing
the system (or even upgrading) I'd strongly consider either
master/slave or SolrCloud, if for no other reason than that you'd have
to figure out all over again how to signal the search boxes that the
index had changed....
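
For what it's worth, one common way to signal a running Solr that the
index underneath it has changed is the core admin RELOAD call; a
sketch, assuming a core named "mycore" on the default port (the actual
mechanism in Bob's setup may well differ):

    http://localhost:8983/solr/admin/cores?action=RELOAD&core=mycore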

Do note that there has been some talk of "read-only replicas"; see
SOLR-6237. That hasn't been committed, though, and I don't know how
much love that issue will get.

Best,
Erick

On Fri, May 26, 2017 at 9:09 AM, Robert Haschart <rh...@virginia.edu> wrote:
> We have run using this exact scenario for several years.  We have three
> Solr servers sitting behind a load balancer, with all three accessing the
> same Solr index stored on read-only network-attached storage.  A fourth
> machine is used to update the index (typically daily) and to signal the
> three Solr servers when the updated index is available.  Our index is
> primarily bibliographic information; it contains about 8 million documents
> and is about 30GB in size.  We've used this configuration since before
> ZooKeeper, SolrCloud, or even Java-based master/slave replication were
> available.  I cannot say whether this configuration has any benefits over
> the currently accepted way of load balancing, but it has worked well for
> us for several years and we've never had a corrupted index problem.
>
>
> -Bob Haschart
> University of Virginia Library
>
>
>
> On 5/23/2017 10:05 PM, Shawn Heisey wrote:
>>
>> On 5/19/2017 8:33 AM, Ravi Kumar Taminidi wrote:
>>>
>>> Hello.  Scenario: we currently have 2 Solr servers running on 2
>>> different (Linux) machines.  Is there any way we can locate the core
>>> on NAS or a network shared drive so that both Solrs use the same
>>> index?
>>>
>>> Let me know if there are any performance issues; our index is
>>> approximately 1GB in size.
>>
>> I think it's a very bad idea to try to share indexes between multiple
>> Solr instances.  You can override the locking and get it to work, and
>> you may be able to find advice on the Internet about how to do it.  I
>> can tell you that it's outside the design intent for both Lucene and
>> Solr.  Lucene works aggressively to *prevent* multiple processes from
>> sharing an index.
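>>
>> For completeness, the override lives in solrconfig.xml's <indexConfig>
>> section; a sketch (which, again, I do NOT recommend using):
>>
>>    <indexConfig>
>>      <lockType>none</lockType>
>>    </indexConfig>
>>
>> Setting it to "none" disables the lock factory entirely, which is
>> exactly the safeguard Lucene uses to keep multiple writers out of a
>> single index.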
>>
>> In general, network storage is not a good idea for Solr.  There's added
>> latency for accessing any data, and frequently the filesystem won't
>> support the kind of locking that Lucene wants to use, but the biggest
>> potential problem is disk caching.  Solr/Lucene is absolutely reliant on
>> disk caching in the Solr server's local memory for good performance.  If
>> the network filesystem cannot be cached by the client that has mounted
>> the storage, which I believe is the case for most network filesystem
>> types, then you're reliant on disk caching in the network server(s).
>> For VERY large indexes, which is really the only viable use case I can
>> imagine for network storage, it is highly unlikely that the network
>> server(s) will have enough memory to effectively cache the data.
>>
>> Solr has explicit support for HDFS storage, but as I understand it, HDFS
>> includes the ability for a client to allocate memory that gets used
>> exclusively for caching on the client side, which allows HDFS to
>> function like a local filesystem in ways that I don't think NFS can.
>> Getting back to my advice about not sharing indexes -- even with
>> SolrCloud on HDFS, multiple replicas generally do NOT share an index.
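>>
>> If you're curious, the HDFS support is enabled by swapping the
>> directoryFactory in solrconfig.xml; roughly, from memory (the hostname
>> and path here are placeholders):
>>
>>    <directoryFactory name="DirectoryFactory"
>>                      class="solr.HdfsDirectoryFactory">
>>      <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str>
>>      <bool name="solr.hdfs.blockcache.enabled">true</bool>
>>      <int name="solr.hdfs.blockcache.slab.count">1</int>
>>    </directoryFactory>
>>
>> The blockcache settings are that client-side cache memory I mentioned.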
>>
>> A 1GB index is very small, so there's no good reason I can think of to
>> involve network storage.  I would strongly recommend local storage, and
>> you should abandon any attempt to share the same index data among
>> multiple Solr instances.
>>
>> Thanks,
>> Shawn
>>
>
