Hi Shawn,

Without going into excessive detail on our design, I can't fully justify an answer to your question of why. Suffice it to say we plan to deploy this indexing for our entire customer base. Because of the size of these document collections and the way they will grow over time, doubling the number of machines is not feasible in our current infrastructure at this time. It may be justified later, but not today. It's less expensive to add more CPUs and RAM than to double the number of physical machines. Additionally, budgetary constraints in our international datacenters prevent us from having identical clusters across the board, thus requiring us to double up instances per machine. We're not talking about 2 or 3 machines here. We're talking about 128 running instances of Solr with 64 clusters and many shards.

However, that doesn't preclude the use of something like Docker or KVM to allow encapsulation of each Solr environment on a virtual machine which is hooked to a fast storage subsystem.

I would also suggest that if the recommendation is not to run two instances side by side, then the documentation on how to set this up should be removed and replaced with a strong statement that running multiple Solr instances on one machine is not a supported configuration. Right now, the documentation does not state this and, in fact, implies that it is perfectly fine to run multiple instances side by side as long as independent disks are used to hold the instances.

Note, this was not my design and I am not a fan of doing this, but I'm not the person making this decision. I am the person tasked with implementing this design choice.

Thanks.

On 2/17/16 10:19 PM, Shawn Heisey wrote:
On 2/17/2016 10:38 PM, Brian Wright wrote:
We have a new project to use Solr. Our Solr instance will use Jetty
rather than Tomcat. We plan to extend the Solr core system by adding
additional classes (jar files) to the
/opt/solr/server/solr-webapp/webapp/WEB-INF/lib directory to extend
features. We also plan to run two instances of Solr on each physical
server preferably from a single installed Solr instance. I've read the
best practices doc on running two Solr instances, and while it's
detailed about how to set up two instances, it doesn't cover our
specific use case.
Why do you want to run multiple instances on one server?  Unless you
have a REALLY good reason to have more than one instance per server,
don't do it.  One instance can handle many indexes with no problem.

The only valid reason I can think of to run more than one instance per
machine is when a single instance requires a VERY large heap.  In that
case, it *might* be better to run two instances that each have a smaller
heap, so that garbage collection times are lower.  I personally would
add more machines, rather than run multiple instances.

Generally the best way to load custom jars (and contrib components like
the dataimport handler) in Solr is to create a "lib" directory in the
solr home (where solr.xml lives) and place all extra jars there.  They
will be loaded once when Solr starts, and all cores will have access to
them.
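
As a rough illustration of the layout described above, the following Python sketch copies extra jars into a "lib" directory under the solr home. The function name and the example paths are hypothetical and not part of Solr; the only thing taken from the text is that "lib" sits next to solr.xml.

```python
import shutil
from pathlib import Path


def stage_shared_jars(solr_home, jar_paths):
    """Copy jars into <solr_home>/lib (hypothetical helper, not a Solr API).

    Assumes solr_home is the directory that contains solr.xml, as
    described in the message above.
    """
    home = Path(solr_home)
    if not (home / "solr.xml").is_file():
        raise FileNotFoundError(f"no solr.xml in {home}; is this the solr home?")

    lib_dir = home / "lib"
    lib_dir.mkdir(exist_ok=True)

    copied = []
    for jar in map(Path, jar_paths):
        if jar.suffix != ".jar":
            continue  # skip anything that is not a jar file
        copied.append(Path(shutil.copy2(jar, lib_dir / jar.name)))
    return copied
```

Something like stage_shared_jars("/var/solr/data", ["/tmp/custom-plugin.jar"]) would be one way to call it; both paths are placeholders. Solr would need a restart afterwards, since the jars are loaded once at startup.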

The rest of your email was concerned with running multiple instances.
If you *REALLY* want to go against advice and do this, here's the
recommended way:

https://cwiki.apache.org/confluence/display/solr/Taking+Solr+to+Production#TakingSolrtoProduction-RunningmultipleSolrnodesperhost

It is very likely possible to run multiple instances out of the same
installation directory, but I am not sure how to do it.
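
One possible approach, offered as an untested sketch rather than a recommendation: the bin/solr script accepts -p (port) and -s (solr home), so a single installation could in principle be started once per instance with a distinct port and a distinct solr home. The Python helper below only builds those command lines. The install path, home directories, and ports are assumptions for illustration, and log and pid locations may also need to be separated per instance.

```python
from pathlib import Path


def build_start_commands(install_dir, homes_and_ports):
    """Build one `bin/solr start` command per instance (hypothetical helper).

    homes_and_ports is a list of (solr_home, port) pairs. Each solr home
    is assumed to contain its own solr.xml.
    """
    solr_bin = str(Path(install_dir) / "bin" / "solr")

    ports = [port for _, port in homes_and_ports]
    if len(set(ports)) != len(ports):
        raise ValueError("each instance needs its own port")

    return [
        [solr_bin, "start", "-p", str(port), "-s", str(home)]
        for home, port in homes_and_ports
    ]


# Example values are placeholders, not paths from this thread.
commands = build_start_commands(
    "/opt/solr",
    [("/data1/solr", 8983), ("/data2/solr", 8984)],
)
```

Each resulting list could then be passed to subprocess.run(cmd, check=True) on the target host.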

Thanks,
Shawn


--

Brian Wright
Sr. Systems Engineer
901 Mariners Island Blvd Suite 200
San Mateo, CA 94404 USA
Email: bri...@marketo.com
Phone: +1.650.539.3530
www.marketo.com
