It really sounds like you're re-inventing SolrCloud, but
you know your requirements best.

Erick

On Wed, Nov 2, 2016 at 8:48 PM, Kent Mu <solr.st...@gmail.com> wrote:
> Thanks Erick!
> Actually, similar to SolrCloud, we split our data into 8 customized shards
> (each with 1 master and 4 slaves), and each shard sits behind one Citrix and
> two Apache web servers to reduce server pressure through load balancing.
> As we are running an e-commerce site, the number of product reviews grows
> very fast, so we take the modulus of the product code to put each review in
> the proper customized Solr shard; that way we keep the index size on each
> Solr instance relatively small.
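> In rough SolrJ terms the routing looks like the sketch below (the shard URL
> pattern and field names are only illustrative, not our exact code):
>
>     import org.apache.solr.client.solrj.SolrServerException;
>     import org.apache.solr.client.solrj.impl.HttpSolrServer;
>     import org.apache.solr.common.SolrInputDocument;
>
>     public class ReviewRouter {
>         private static final int SHARD_COUNT = 8;  // our 8 customized shards
>
>         // Pick the shard master by taking the modulus of the product code.
>         public static void indexReview(long productCode, String reviewText)
>                 throws SolrServerException, java.io.IOException {
>             int shard = (int) (productCode % SHARD_COUNT);
>             // Hypothetical URL pattern; each shard has its own master core.
>             HttpSolrServer master = new HttpSolrServer(
>                     "http://review-shard" + (shard + 1) + "-master/solr/review");
>             SolrInputDocument doc = new SolrInputDocument();
>             doc.addField("productCode", productCode);
>             doc.addField("reviewText", reviewText);
>             master.add(doc);      // slaves later pull this segment via replication
>             master.shutdown();
>         }
>     }
>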
> We will first try to upgrade the physical memory and see what happens. If
> the query performance is still not ideal, we will try to deploy Solr on a
> physical machine, or use SSDs instead.
>
>         “Rome was not built in a day”, so we can explore it step by step.
> Ha ha...
> Best Regards!
> Kent
>
> 2016-11-03 1:10 GMT+08:00 Erick Erickson <erickerick...@gmail.com>:
>
>> You need to move to SolrCloud when it's
>> time to shard ;).....
>>
>> More seriously, at some point simply adding more
>> memory will not be adequate. Either your JVM
>> heap will grow to a point where you start encountering
>> GC pauses, or the time to serve requests will
>> increase unacceptably. "When?" you ask? Well,
>> unfortunately there are no guidelines that can be
>> guaranteed; here's a long blog on the subject:
>>
>> https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>>
>> The short form is you need to stress-test your
>> index and query patterns.
>>
>> Now, I've seen 20M docs strain a 32G Java heap. I've
>> seen 300M docs give very nice response times with
>> 12G of memory. It Depends (tm).
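>>
>> If it helps, here's a bare-bones version of what I mean by stress-testing,
>> purely a sketch: replay some representative queries with SolrJ and record
>> the wall-clock times (the URL and queries below are placeholders):
>>
>>     import java.util.Arrays;
>>     import java.util.List;
>>
>>     import org.apache.solr.client.solrj.SolrQuery;
>>     import org.apache.solr.client.solrj.impl.HttpSolrServer;
>>
>>     public class QueryLatencyProbe {
>>         // Replace with a sample pulled from your real query logs.
>>         private static final List<String> SAMPLE_QUERIES =
>>                 Arrays.asList("productCode:12345", "reviewText:excellent", "*:*");
>>
>>         public static void main(String[] args) throws Exception {
>>             HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
>>             for (String q : SAMPLE_QUERIES) {
>>                 long start = System.currentTimeMillis();
>>                 solr.query(new SolrQuery(q));
>>                 long elapsed = System.currentTimeMillis() - start;
>>                 System.out.println(q + " took " + elapsed + " ms");
>>             }
>>             solr.shutdown();
>>         }
>>     }
>>
>> Run something like that (or a real load tool) against a copy of your index
>> while indexing at production rates, and watch where response times fall over.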
>>
>> Whether to put Solr on bare metal or not: There's
>> inevitably some penalty for a VM. That said, there are lots
>> of places that use VMs successfully. Again, stress
>> testing is the key.
>>
>> And finally, using docValues for any field that sorts,
>> facets, or groups will reduce the JVM requirements
>> significantly, albeit by using OS memory space; see
>> Uwe's excellent blog:
>>
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
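>>
>> Enabling docValues is just a schema.xml attribute plus a reindex; the field
>> name and type here are only an example:
>>
>>     <field name="productCode" type="long" indexed="true" stored="true" docValues="true"/>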
>>
>> Best,
>> Erick
>>
>> On Tue, Nov 1, 2016 at 10:23 PM, Kent Mu <solr.st...@gmail.com> wrote:
>> > Thanks, I got it, Erick!
>> >
>> > The size of our index data is more than 30GB per year now and still
>> > growing, and our Solr currently runs on a virtual machine. So I wonder
>> > whether we need to deploy Solr on a physical machine, or whether I can
>> > just upgrade the physical memory of our virtual machines?
>> >
>> > Best,
>> > Kent
>> >
>> > 2016-11-02 11:33 GMT+08:00 Erick Erickson <erickerick...@gmail.com>:
>> >
>> >> Kent: OK, I see now. Then a minor pedantic point...
>> >>
>> >> It'll avoid confusion if you use master and slaves
>> >> rather than master and replicas when talking about
>> >> non-cloud setups.
>> >>
>> >> The equivalent in SolrCloud is leader and replicas.
>> >>
>> >> No big deal either way, just FYI.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Tue, Nov 1, 2016 at 8:09 PM, Kent Mu <solr.st...@gmail.com> wrote:
>> >> > Thanks a lot for your reply, Shawn!
>> >> >
>> >> > There are no other applications on the server. I agree with you that
>> >> > we need to upgrade the physical memory and allocate a reasonable JVM
>> >> > heap size, so that the operating system has spare memory available to
>> >> > cache the index.
>> >> >
>> >> > Actually, we add nearly 100 million records of data every year now,
>> >> > and it is still growing; our Solr currently runs on a virtual machine.
>> >> > So I wonder if we need to deploy Solr on a physical machine.
>> >> >
>> >> > Best Regards!
>> >> > Kent
>> >> >
>> >> > 2016-11-01 21:18 GMT+08:00 Shawn Heisey <apa...@elyograg.org>:
>> >> >
>> >> >> On 11/1/2016 1:07 AM, Kent Mu wrote:
>> >> >> > Hi friends! We came across an issue when we use SolrJ (4.9.1) to
>> >> >> > connect to the Solr server; our deployment is one master with 10
>> >> >> > replicas. We index data to the master and search data from the
>> >> >> > replicas via load balancing. The error stack is as below:
>> >> >> > org.apache.solr.client.solrj.SolrServerException: Timeout occured
>> >> >> > while waiting response from server at:
>> >> >> > http://review.solrsearch3.cnsuning.com/solr/commodityReview
>> >> >>
>> >> >> This shows that you are connecting to port 80.  It is relatively rare
>> >> >> to run Solr on port 80, though it is possible.  Do you have an
>> >> >> intermediate layer, like a proxy or a load balancer?  If so, you'll
>> >> >> need to ensure that there's not a problem there.  If it works normally
>> >> >> when replication isn't happening, that's probably not a worry.
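>> >> >>
>> >> >> If the requests are merely slow during replication rather than failing
>> >> >> outright, one stopgap (only a sketch; the values below are arbitrary
>> >> >> examples, not recommendations) is to give the SolrJ client more
>> >> >> generous timeouts:
>> >> >>
>> >> >>     import org.apache.solr.client.solrj.impl.HttpSolrServer;
>> >> >>
>> >> >>     public class TimeoutTunedClient {
>> >> >>         public static HttpSolrServer create(String baseUrl) {
>> >> >>             HttpSolrServer solr = new HttpSolrServer(baseUrl);
>> >> >>             solr.setConnectionTimeout(5000); // ms to establish the connection
>> >> >>             solr.setSoTimeout(60000);        // socket read timeout in ms
>> >> >>             return solr;
>> >> >>         }
>> >> >>     }
>> >> >>
>> >> >> That only hides the symptom, though; the underlying slowness is what
>> >> >> needs fixing.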
>> >> >>
>> >> >> > It does not happen often. After analysis, we found it happens only
>> >> >> > when the replicas synchronize data from the master Solr server. It
>> >> >> > seems that the replicas block search requests while synchronizing
>> >> >> > data from the master; is that true?
>> >> >>
>> >> >> Solr should be able to continue serving requests while replication
>> >> >> happens.  I have never heard of this happening before, and I never ran
>> >> >> into it when I was using replication a long time ago on version 1.4.x.
>> >> >> I think it is more likely that you've got a memory issue than a bug.
>> >> >> If it IS a bug, it will *not* be fixed in a 4.x version; you would need
>> >> >> to upgrade to 6.x and see whether it's still a problem.  Version 6.2.1
>> >> >> is the latest at the moment, and release plans are underway for 6.3
>> >> >> right now.
>> >> >>
>> >> >> > I wonder if it is because our Solr server hardware configuration is
>> >> >> > too low? The physical memory is 8G with 4 cores, and the JVM we set
>> >> >> > is Xms512m, Xmx7168m.
>> >> >>
>> >> >> The following assumes that there is no other software on the server,
>> >> >> like a database, an application server, a web server, etc.  If there
>> >> >> is, any issues are likely to be a result of extreme memory starvation,
>> >> >> and possibly swapping.  Additional physical memory is definitely
>> >> >> needed if there is other software on the server beyond basic OS tools.
>> >> >>
>> >> >> If the total index data that is on your server is larger than about
>> >> >> 1.5 to 2GB, chances are excellent that you do not have enough free
>> >> >> memory to cache that data effectively, which can lead to major
>> >> >> performance issues.  You've only left about 1GB of memory in the
>> >> >> system for that purpose, and that memory must also run the entire
>> >> >> operating system, which can take a significant percentage of 1GB.
>> >> >> With a large index, I would strongly recommend adding memory to this
>> >> >> server.
>> >> >>
>> >> >> https://wiki.apache.org/solr/SolrPerformanceProblems
>> >> >>
>> >> >> As mentioned in that wiki page, for good performance Solr absolutely
>> >> >> requires that the operating system have spare memory available to
>> >> >> cache the index.  In general, allocating almost all your memory to the
>> >> >> Java heap is a bad idea with Solr.
>> >> >>
>> >> >> If your index *is* smaller than 1.5 to 2GB, allocating a 7GB heap is
>> >> >> probably not necessary, unless you are doing *incredibly*
>> >> >> memory-hungry queries, such as grouping, faceting, or sorting on many
>> >> >> fields.  If you can reduce the heap size, there would be more memory
>> >> >> available for caching.
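>> >> >>
>> >> >> As a rough back-of-the-envelope illustration (the ~1GB OS figure is
>> >> >> only an estimate):
>> >> >>
>> >> >>     8GB RAM - 7GB heap - ~1GB OS  =  almost nothing left for the page cache
>> >> >>     8GB RAM - 4GB heap - ~1GB OS  =  ~3GB left to cache the index
>> >> >>
>> >> >> If the index is much larger than whatever is left over, more physical
>> >> >> memory is the first thing to look at.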
>> >> >>
>> >> >> Indexing can sometimes cause very large merges to happen, and a full
>> >> >> index optimize would rewrite the entire index.  Replication copies the
>> >> >> changed index files, and if the size of the changes is significant,
>> >> >> additional memory can be required for good performance.  See the
>> >> >> special note on the wiki page above about optimizes.
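>> >> >>
>> >> >> As a concrete (hypothetical) example: with a 20GB index, an optimize
>> >> >> rewrites all 20GB on the master, and each slave then pulls roughly
>> >> >> 20GB of new segment files on its next replication poll, all of which
>> >> >> competes with the live index for the small amount of free memory.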
>> >> >>
>> >> >> Thanks,
>> >> >> Shawn
>> >> >>
>> >> >>
>> >>
>>
