[ 
https://issues.apache.org/jira/browse/SOLR-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563281#comment-14563281
 ] 

Yonik Seeley commented on SOLR-6743:
------------------------------------

Great job Tim!

> Support deploying SolrCloud on YARN
> -----------------------------------
>
>                 Key: SOLR-6743
>                 URL: https://issues.apache.org/jira/browse/SOLR-6743
>             Project: Solr
>          Issue Type: New Feature
>          Components: Hadoop Integration, SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>
> We're seeing Solr running with Hadoop more and more and YARN allows us to 
> deploy and manage distributed applications across a cluster of machines. This 
> feature will provide support for deploying SolrCloud in YARN. Currently, the 
> code is implemented in an open-source project hosted on Lucidworks github, 
> see: https://github.com/LucidWorks/yarn-proto
> We'd like to submit this to the Apache Solr project as a contrib so it is 
> easier to run Solr on YARN right out-of-the-box. There are a few hurdles to 
> get over though:
> 1) Overall approach: There are various options for supporting YARN, such as 
> Apache Slider, but I opted to just use the YARN client API directly which 
> simply invokes the bin/solr start script under the covers. The YARN specific 
> code is quite simple and most of the code is just handling command line 
> options/parsing. I'm curious what others think about having a simple native 
> solution that ships with Solr (similar to the HdfsDirectoryFactory) vs. 
> something more heavy-weight that requires 3rd party tools to be involved.
> 2) Unit testing - Solr on YARN relies on putting a full Solr bundle into HDFS 
> (which you can see how that might work in the SolrYarnTestIT test case). This 
> obviously has problems in the Solr build as there is no bundle of Solr 
> available during unit testing. I'm thinking about having a mock bundle that 
> simulates starting Solr but that limits what we can verify on the cluster 
> once it's up.
> 3) Shutdown - In order to support an orderly shutdown of Solr when the 
> application is stopped by the ResourceManager, we need a shutdown handler in 
> Jetty/Solr that allows a remote application to request shutdown. The built-in 
> Jetty shutdown handler requires the stop request to come from localhost. To 
> work-around this, I've introduced a custom ShutdownHandler that can be 
> configured using System properties at startup to allow a remote host to 
> request shutdown. When YARN starts Solr nodes, I register the address of the 
> SolrMaster node with a secret key that will allow the SolrMaster to shutdown 
> Solr gracefully. This seems secure since only the SolrMaster can request 
> shutdown using the correct key. Other ideas on how to handle graceful 
> shutdown?
> 4) Additional features: The current implementation is useful for 
> starting/stopping SolrCloud nodes in YARN. My thinking is that you'll 
> provision the cluster using YARN and then just interact with Solr directly 
> using Solr's API , so the YARN layer is quite thin. Other features needed?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to