Re: SolrCloud Tomcat configuration: problems and doubts.
Forwarding to the solr-user mailing list. We forgot to reply to it. :-/

2012/11/5 Luis Cappa Banda luisca...@gmail.com

Hello, Mark!

I've been testing more and more and things are going better. I have tested what you told me about -Dbootstrap_conf=true and it works fine, but the problem is that if I include that parameter in every Tomcat instance, then when I deploy all the Solr servers each one loads all the Solr core configurations into ZooKeeper again. There should be something like a Tomcat master server that is the only one with the following parameters, which define the basic SolrCloud configuration:

    JAVA_OPTS=-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true

Then the other Tomcat servers should have only:

    JAVA_OPTS=-DzkHost=127.0.0.1:9000

However, I think that is not the best way to proceed. We are in 2012, it's the end of the world - God (well, one of them) is angry and attacks my Production environment. Imagine that all servers go down and a Monit service restarts them in random order. Maybe an ordinary Tomcat server finishes its startup faster than the designated Tomcat master server, so those SolrCloud configuration parameters won't be loaded first. That's a problem.

One possibility is to write a simple script, executed on every Tomcat launch, that does something like this:

1.) I'm the first Tomcat and I'm launching! I'll write a solrcloud.config.lock file in a well-known path (or maybe into ZooKeeper) to announce to the other Tomcats that I'll start to load the SolrCloud configuration files into ZooKeeper. I am the Tomcat master server, so I'll load JAVA_OPTS=-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true.

2.) I'm a second Tomcat and I'm launching! First I check whether a solrcloud.config.lock file exists. If it exists, I simply load JAVA_OPTS=-DzkHost=127.0.0.1:9000.

And so on.

I don't like this solution too much because it's not elegant and it's very ad hoc, but it works. What do you think about it?
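The lock-file idea described above could be sketched roughly as follows. This is only an illustration, not a tested production script: the function name solrcloud_java_opts and the lock path are hypothetical, and it assumes all Tomcat instances run on the same machine so that they see the same filesystem. It uses mkdir as the lock because creating a directory fails if it already exists, so only one instance can win.

```shell
#!/bin/sh
# Hypothetical helper for Tomcat's setenv.sh: the first instance to start
# bootstraps the SolrCloud configuration, later ones only point at ZooKeeper.

ZK_HOST="127.0.0.1:9000"   # ZooKeeper address used in this thread
NUM_SHARDS=2               # shard count used in this thread

# Print the JAVA_OPTS for this instance.
# $1: lock directory (hypothetical path shared by all Tomcat instances)
solrcloud_java_opts() {
    lock_dir="$1"
    # mkdir fails when the directory already exists, so only one caller wins.
    if mkdir "$lock_dir" 2>/dev/null; then
        echo "-DzkHost=$ZK_HOST -DnumShards=$NUM_SHARDS -Dbootstrap_conf=true"
    else
        echo "-DzkHost=$ZK_HOST"
    fi
}
```

In setenv.sh this might be used as JAVA_OPTS="$JAVA_OPTS $(solrcloud_java_opts /tmp/solrcloud.config.lock)", where the path is again only an example. The sketch does not decide when the lock should be removed (for instance after a full restart of all servers), which is the part of the approach that stays ad hoc.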
I've just started with SolrCloud four or five days ago, and maybe I am forgetting something that could solve this problem.

Thank you very much, Mark.

Regards,
Luis Cappa.

2012/11/3 Mark Miller markrmil...@gmail.com

> On Fri, Nov 2, 2012 at 9:05 AM, Luis Cappa Banda luisca...@gmail.com wrote:
>
> > Hello, Mark! How are you?
> >
> > Thanks a lot for helping me. You were right about the jetty.host parameter. My final test solr.xml looks like:
> >
> >     <cores adminPath="/admin/cores" defaultCoreName="items_en" host="localhost" hostPort="9080" hostContext="items_en">
> >       <core name="items_en" instanceDir="items_en" />
> >     </cores>
> >
> > I've noticed that the 'hostContext' parameter was also required, so I included it.
>
> It should default to /solr if you don't set it - it is there in case you deploy to a different context though.
>
> > After those corrections the Cloud graph tree looks right, and executing queries doesn't return a 503 error. Phew!
> >
> > However, I saw in the Cloud graph tree that a collection1 also appears, pointing to http://localhost:8983/solr. I will keep testing in case I missed something, but it looks like it is creating another collection with default parameters (collection name, port) on its own.
>
> It should only create what it finds in solr.xml - let me know what you find.
>
> > While using Apache Tomcat I was forced to include the following environment parameters in catalina.sh (or setenv.sh), as I told you before:
> >
> >     JAVA_OPTS=-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en
>
> You should only need -DzkHost= - see below.
>
> > Just three more questions:
> >
> > 1. That's a problem for me, because I would like to deploy more than one Solr server in each Tomcat instance, each with different configuration files (I mean, different configName parameters), so including that JAVA_OPTS forces me to deploy only Solr servers with this one configuration in that Tomcat server. In a production environment I would like to deploy at least four Solr servers in a single Tomcat instance, one for each kind of document that I will index and query.
> > Do you know any way to configure the configName for each Solr server instance? Is it possible to configure it inside the solr.xml file? Also, it makes sense to deploy a multi-core configuration in each Solr server, each core with its own configName stored in ZooKeeper, but again, passing the configuration through JAVA_OPTS parameters like that makes it impossible. :-(
>
> That config name sys prop is not being used here - it's only used when you use -Dbootstrap_confdir=path, and then only the first time you start up. Collections are linked to configuration sets in ZooKeeper. If you use -Dbootstrap_conf=true, a special rule is used that auto links collections and config sets with the same name as the collection. Otherwise, you can use the ZkCLI cmd line tool to link any collection to any config in ZooKeeper.
>
> > 2. The other question is about indexing. What is the best way to plain index (I
Re: SolrCloud Tomcat configuration: problems and doubts.
Hello, Mark! How are you?

Thanks a lot for helping me. You were right about the jetty.host parameter. My final test solr.xml looks like:

    <cores adminPath="/admin/cores" defaultCoreName="items_en" host="localhost" hostPort="9080" hostContext="items_en">
      <core name="items_en" instanceDir="items_en" />
    </cores>

I've noticed that the 'hostContext' parameter was also required, so I included it. After those corrections the Cloud graph tree looks right, and executing queries doesn't return a 503 error. Phew!

However, I saw in the Cloud graph tree that a collection1 also appears, pointing to http://localhost:8983/solr. I will keep testing in case I missed something, but it looks like it is creating another collection with default parameters (collection name, port) on its own.

While using Apache Tomcat I was forced to include the following environment parameters in catalina.sh (or setenv.sh), as I told you before:

    JAVA_OPTS=-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en

Just three more questions:

1. That's a problem for me, because I would like to deploy more than one Solr server in each Tomcat instance, each with different configuration files (I mean, different configName parameters), so including that JAVA_OPTS forces me to deploy only Solr servers with this one configuration in that Tomcat server. In a production environment I would like to deploy at least four Solr servers in a single Tomcat instance, one for each kind of document that I will index and query. Do you know any way to configure the configName for each Solr server instance? Is it possible to configure it inside the solr.xml file? Also, it makes sense to deploy a multi-core configuration in each Solr server, each core with its own configName stored in ZooKeeper, but again, passing the configuration through JAVA_OPTS parameters like that makes it impossible. :-(

2. The other question is about indexing. What is the best way to do plain indexing (I mean, without DIH or similar) in SolrCloud?
Maybe by configuring an LBHttpSolrServer that decides by itself which Solr server instance is best for each indexing request?

3. The following question may sound strange, but... the thing is that I would like to help the Apache Solr project in any way I can by contributing code (bug fixes, new features, etc.). How can I contribute to the community?

Thanks a lot.

Best Regards,
Luis Cappa.

2012/10/31 Mark Miller markrmil...@gmail.com

> A big difference if you are using tomcat is that you still need to specify jetty.port - unless you change the name of that sys prop in solr.xml.
>
> Some more below:
>
> On Wed, Oct 31, 2012 at 2:09 PM, Luis Cappa Banda luisca...@gmail.com wrote:
>
> > Hello! How are you?
> >
> > I followed the SolrCloud Wiki tutorial and noticed that everything worked perfectly with Jetty and with a very basic configuration. My first impression was that SolrCloud is amazing, and I'm interested in deploying a more complex, near-production SolrCloud architecture for testing purposes. I'm using Tomcat as the application server, so I've started testing with it.
> >
> > I've installed the ZooKeeper service on a single machine and started it up with the following configuration:
> >
> > 1.) ~zookeperhome/conf/zoo.cfg
> >
> >     tickTime=2000
> >     initLimit=10
> >     syncLimit=5
> >     dataDir=~zookeperhome/data/
> >     clientPort=9000
> >
> > 2.) I am testing with a single-core Solr server called 'items_en'. The configuration is as follows:
> >
> > Indexes conf/data tree:
> >
> >     /mnt/data-store/solr/
> >         /solr.xml
> >         /zoo.cfg
> >         /items_en/
> >             /conf/
> >                 schema.xml
> >                 solrconfig.xml
> >                 etc.
> >
> > So we have a simple configuration where the conf files and the index data files are in the same path.
> >
> > 3.) OK, so we have the Solr server configured, but I have to save the configuration into ZooKeeper.
> > I do it as follows:
> >
> >     ./bin/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:9000 -confdir /mnt/data-store/solr/items_en/conf -collection items_en -confname items_en
> >
> > And it seems to work perfectly, because if I use the ZooKeeper client and execute the 'ls' command, the files appear:
> >
> >     ./bin/zkCli.sh -server localhost:9000
> >
> >     [zk: localhost:9000(CONNECTED) 1] ls /configs/items_en
> >     [admin-extra.menu-top.html, currency.xml, protwords.txt, mapping-FoldToASCII.txt, solrconfig.xml, lang, spellings.txt, mapping-ISOLatin1Accent.txt, admin-extra.html, xslt, scripts.conf, synonyms.txt, update-script.js, velocity, elevate.xml, zoo.cfg, admin-extra.menu-bottom.html, stopwords_en.txt, schema.xml]
> >
> > 4.) I would like all the Solr servers deployed in that Tomcat instance to point to the ZooKeeper service on port 9000, so I included the following JAVA_OPTS, hoping that they will make that possible: JAVA_OPTS=-DzkHost=127.0.0.1:9000
Re: SolrCloud Tomcat configuration: problems and doubts.
A big difference if you are using tomcat is that you still need to specify jetty.port - unless you change the name of that sys prop in solr.xml.

Some more below:

On Wed, Oct 31, 2012 at 2:09 PM, Luis Cappa Banda luisca...@gmail.com wrote:

> Hello! How are you?
>
> I followed the SolrCloud Wiki tutorial and noticed that everything worked perfectly with Jetty and with a very basic configuration. My first impression was that SolrCloud is amazing, and I'm interested in deploying a more complex, near-production SolrCloud architecture for testing purposes. I'm using Tomcat as the application server, so I've started testing with it.
>
> I've installed the ZooKeeper service on a single machine and started it up with the following configuration:
>
> 1.) ~zookeperhome/conf/zoo.cfg
>
>     tickTime=2000
>     initLimit=10
>     syncLimit=5
>     dataDir=~zookeperhome/data/
>     clientPort=9000
>
> 2.) I am testing with a single-core Solr server called 'items_en'. The configuration is as follows:
>
> Indexes conf/data tree:
>
>     /mnt/data-store/solr/
>         /solr.xml
>         /zoo.cfg
>         /items_en/
>             /conf/
>                 schema.xml
>                 solrconfig.xml
>                 etc.
>
> So we have a simple configuration where the conf files and the index data files are in the same path.
>
> 3.) OK, so we have the Solr server configured, but I have to save the configuration into ZooKeeper. I do it as follows:
>
>     ./bin/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:9000 -confdir /mnt/data-store/solr/items_en/conf -collection items_en -confname items_en
>
> And it seems to work perfectly, because if I use the ZooKeeper client and execute the 'ls' command, the files appear:
>
>     ./bin/zkCli.sh -server localhost:9000
>
>     [zk: localhost:9000(CONNECTED) 1] ls /configs/items_en
>     [admin-extra.menu-top.html, currency.xml, protwords.txt, mapping-FoldToASCII.txt, solrconfig.xml, lang, spellings.txt, mapping-ISOLatin1Accent.txt, admin-extra.html, xslt, scripts.conf, synonyms.txt, update-script.js, velocity, elevate.xml, zoo.cfg, admin-extra.menu-bottom.html, stopwords_en.txt, schema.xml]
>
> 4.)
> I would like all the Solr servers deployed in that Tomcat instance to point to the ZooKeeper service on port 9000, so I included the following JAVA_OPTS, hoping that they will make that possible:
>
>     JAVA_OPTS=-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en -DnumShards=2
>
> Question 1: supposing that the JAVA_OPTS are OK, do you think there is a more flexible, less hard-coded way to tell each Solr server instance which ZooKeeper service is its own?

Your zkHost should actually be a comma sep list of the zk hosts. Yes, we hope to improve this in the future as zookeeper becomes more flexible.

> Question 2: can you increase numShards later, even after indexing? Example: imagine that you have millions of documents and you want to expand from two to four shards and increase the number of Solr servers as well.

You can't change the number of shards yet - there is an open jira issue for this and ongoing work. It's been called shard splitting.

> Question 3: again supposing that JAVA_OPTS is OK (or nearly OK), is it necessary to always include -DnumShards for each Tomcat server? Can't this confuse the ZooKeeper instance?

It depends on how you start your instances. The first one is the only one that matters - it only makes sense to specify for each instance if you plan on starting them all at the same time and are not sure which the first to register in zk will be.

> Question 4: imagine that we have three ZooKeeper instances to manage config files in a production environment. Should the -DzkHost parameter look like this? -DzkHost=host1:port1,host2:port2,host3:port3

Yes.

> 5.) I started Tomcat (port 8080) with a single Solr server and everything seems to be OK: there is a single core set as 'items_en' and the Cloud button is active. The graph is a simple tree with shard1 and shard2. Connected to shard1 is the current instance.
> Also, if I execute any query I just receive a 503 error code: no servers hosting.

Not sure why offhand - if you are not passing jetty.port (or something else if you have renamed it - like tomcat.port), that will be a problem.

> 6.) I started another Solr server in a second Tomcat instance (port 9080). Its Solr home is in the following path:
>
> Indexes conf/data tree:
>
>     /mnt/data-store/solr2/
>         /solr.xml
>         /zoo.cfg
>         /items_en/
>             /conf/
>                 schema.xml
>                 solrconfig.xml
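Two of the answers in this message (zkHost as a comma-separated list, and passing jetty.port when running under Tomcat) could be combined in a small setenv.sh helper along these lines. This is a sketch under assumptions: the function name solr_java_opts and the zk1..zk3 host names are hypothetical, and it assumes solr.xml still reads the port from the jetty.port system property rather than a hard-coded hostPort.

```shell
#!/bin/sh
# Hypothetical helper: build the JAVA_OPTS for one Tomcat instance.
# $1: HTTP port of this Tomcat instance
# remaining arguments: ZooKeeper host:port pairs of the ensemble
solr_java_opts() {
    port="$1"
    shift
    zk_hosts=""
    for hp in "$@"; do
        # Join the ensemble members with commas, the form zkHost expects.
        zk_hosts="${zk_hosts:+$zk_hosts,}$hp"
    done
    echo "-DzkHost=$zk_hosts -Djetty.port=$port"
}
```

For the two instances in this thread, the first Tomcat might call solr_java_opts 8080 127.0.0.1:9000 and the second solr_java_opts 9080 127.0.0.1:9000, while a three-node ensemble would simply pass three host:port arguments.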