[ https://issues.apache.org/jira/browse/SOLR-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563281#comment-14563281 ]
Yonik Seeley commented on SOLR-6743:
------------------------------------

Great job Tim!

> Support deploying SolrCloud on YARN
> -----------------------------------
>
>                 Key: SOLR-6743
>                 URL: https://issues.apache.org/jira/browse/SOLR-6743
>             Project: Solr
>          Issue Type: New Feature
>          Components: Hadoop Integration, SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>
> We're seeing Solr running with Hadoop more and more and YARN allows us to
> deploy and manage distributed applications across a cluster of machines. This
> feature will provide support for deploying SolrCloud in YARN. Currently, the
> code is implemented in an open-source project hosted on Lucidworks github,
> see: https://github.com/LucidWorks/yarn-proto
> We'd like to submit this to the Apache Solr project as a contrib so it is
> easier to run Solr on YARN right out-of-the-box. There are a few hurdles to
> get over though:
> 1) Overall approach: There are various options for supporting YARN, such as
> Apache Slider, but I opted to just use the YARN client API directly which
> simply invokes the bin/solr start script under the covers. The YARN specific
> code is quite simple and most of the code is just handling command line
> options/parsing. I'm curious what others think about having a simple native
> solution that ships with Solr (similar to the HdfsDirectoryFactory) vs.
> something more heavy-weight that requires 3rd party tools to be involved.
> 2) Unit testing - Solr on YARN relies on putting a full Solr bundle into HDFS
> (which you can see how that might work in the SolrYarnTestIT test case). This
> obviously has problems in the Solr build as there is no bundle of Solr
> available during unit testing. I'm thinking about having a mock bundle that
> simulates starting Solr but that limits what we can verify on the cluster
> once it's up.
> 3) Shutdown - In order to support an orderly shutdown of Solr when the
> application is stopped by the ResourceManager, we need a shutdown handler in
> Jetty/Solr that allows a remote application to request shutdown. The built-in
> Jetty shutdown handler requires the stop request to come from localhost. To
> work around this, I've introduced a custom ShutdownHandler that can be
> configured using System properties at startup to allow a remote host to
> request shutdown. When YARN starts Solr nodes, I register the address of the
> SolrMaster node with a secret key that will allow the SolrMaster to shut down
> Solr gracefully. This seems secure since only the SolrMaster can request
> shutdown using the correct key. Other ideas on how to handle graceful
> shutdown?
> 4) Additional features: The current implementation is useful for
> starting/stopping SolrCloud nodes in YARN. My thinking is that you'll
> provision the cluster using YARN and then just interact with Solr directly
> using Solr's API, so the YARN layer is quite thin. Other features needed?
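
For readers following the shutdown discussion in item 3 of the quoted description, here is a minimal sketch of the kind of check such a handler might perform: accept a stop request only if it comes from the registered SolrMaster address and carries the matching secret key. This is not the code from the yarn-proto project. The class name RemoteShutdownGuard and the System property names example.shutdown.allowedHost and example.shutdown.key are hypothetical, and the sketch deliberately leaves out the Jetty wiring.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

/**
 * Hypothetical helper (not part of Solr or Jetty): decides whether a remote
 * shutdown request should be honored, based on caller address and secret key.
 */
final class RemoteShutdownGuard {
    private final String allowedHost; // address registered for the SolrMaster
    private final byte[] secretKey;   // shared secret handed to the node at startup

    RemoteShutdownGuard(String allowedHost, String secretKey) {
        if (allowedHost == null || allowedHost.isEmpty()
                || secretKey == null || secretKey.isEmpty()) {
            throw new IllegalArgumentException("allowed host and secret key are required");
        }
        this.allowedHost = allowedHost;
        this.secretKey = secretKey.getBytes(StandardCharsets.UTF_8);
    }

    /** Builds a guard from System properties; the property names are assumptions. */
    static RemoteShutdownGuard fromSystemProperties() {
        return new RemoteShutdownGuard(
                System.getProperty("example.shutdown.allowedHost"),
                System.getProperty("example.shutdown.key"));
    }

    /** Returns true only when both the caller address and the key match. */
    boolean isAuthorized(String remoteAddr, String presentedKey) {
        if (remoteAddr == null || presentedKey == null) {
            return false;
        }
        boolean hostOk = allowedHost.equals(remoteAddr);
        // MessageDigest.isEqual compares digests in a time-constant way on
        // current JDKs, which avoids leaking how much of the key matched.
        boolean keyOk = MessageDigest.isEqual(
                secretKey, presentedKey.getBytes(StandardCharsets.UTF_8));
        return hostOk && keyOk;
    }
}
```

A handler built on this would call isAuthorized with the request's remote address and the key supplied by the caller, and would trigger the orderly stop only when it returns true. Matching on the address alone is weak where addresses can be spoofed or shared, which is why the key is checked as well.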