Forwarding to the solr-user mailing list. We forgot to reply to it there, :-/

2012/11/5 Luis Cappa Banda <luisca...@gmail.com>
> Hello, Mark!
>
> I've been testing more and more, and things are going better. I tested
> what you told me about "-Dbootstrap_conf=true" and it works fine, but the
> problem is that if I include that parameter in every Tomcat instance,
> then each Solr server uploads all the SolrCore configurations to
> ZooKeeper again when it is deployed.
>
> There should be something like a Tomcat master server, the only one with
> the following parameters, which define the basic SolrCloud configuration:
>
> JAVA_OPTS="-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true"
>
> Then the other Tomcat servers should have only:
>
> JAVA_OPTS="-DzkHost=127.0.0.1:9000"
>
> However, I think that is not the best way to proceed. It's 2012, it's
> the end of the world - God (well, one of them) is angry and attacks my
> production environment. Imagine that all servers go down and a Monit
> service restarts them in random order. Maybe an ordinary Tomcat server
> finishes its startup faster than the designated Tomcat master server, so
> the SolrCloud configuration parameters won't be loaded first. That's a
> problem.
>
> One possibility is to write a simple script, executed on every Tomcat
> launch, that does something like:
>
> "I'm the first Tomcat and I'm launching! I'll write a
> solrcloud.config.lock file in a well-known path (or maybe into ZooKeeper)
> to announce to the other Tomcats that I'll start loading the SolrCloud
> configuration files into ZooKeeper. I am the Tomcat master server, so I'll
> load JAVA_OPTS="-DzkHost=127.0.0.1:9000 -DnumShards=2
> -Dbootstrap_conf=true"."
>
> "I'm a second Tomcat and I'm launching! First I check whether a
> solrcloud.config.lock file exists. If it exists, I simply load
> JAVA_OPTS="-DzkHost=127.0.0.1:9000"."
>
> And so on.
>
> I don't like this solution much because it's not elegant and it's very
> ad hoc, but it works. What do you think about it? I just started with
> SolrCloud four or five days ago, and maybe I'm forgetting something that
> could solve this problem.
>
> Thank you very much, Mark.
>
> Regards,
>
> Luis Cappa.
>
> 2012/11/3 Mark Miller <markrmil...@gmail.com>
>
>> On Fri, Nov 2, 2012 at 9:05 AM, Luis Cappa Banda <luisca...@gmail.com>
>> wrote:
>> > Hello, Mark!
>> >
>> > How are you? Thanks a lot for helping me. You were right about the
>> > jetty.host parameter. My final test solr.xml looks like:
>> >
>> > <cores adminPath="/admin/cores" defaultCoreName="items_en"
>> >        host="localhost" hostPort="9080" hostContext="items_en">
>> >   <core name="items_en" instanceDir="items_en" />
>> > </cores>
>> >
>> > I've noticed that the 'hostContext' parameter was also required, so I
>> > included it.
>>
>> It should default to /solr if you don't set it - it is there in case
>> you deploy to a different context though.
>>
>> > After those corrections the Cloud graph tree looks right, and executing
>> > queries doesn't return a 503 error. Phew! However, I saw in the Cloud
>> > graph tree that a "collection1" appears too, pointing to
>> > http://localhost:8983/solr. I will keep testing in case I missed
>> > something, but it looks like it is creating another collection with
>> > default parameters (collection name, port) without control.
>>
>> It should only create what it finds in solr.xml - let me know what you
>> find.
>>
>> > While using Apache Tomcat I was forced to include the following
>> > environment parameters in catalina.sh (or setenv.sh), as I told you
>> > before:
>> >
>> > JAVA_OPTS="-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en"
>>
>> You should only need -DzkHost= - see below.
>>
>> > Just three more questions:
>> >
>> > 1. That's a problem for me, because I would like to deploy more than
>> > one Solr server in each Tomcat instance, each with different
>> > configuration files (I mean, different configName parameters), so
>> > including that JAVA_OPTS forces me to deploy only Solr servers with
>> > this kind of configuration in that Tomcat server. In a production
>> > environment I would like to deploy at least four Solr servers in a
>> > single Tomcat instance, one for each kind of document that I will
>> > index and query. Do you know any way to configure the configName for
>> > each Solr server instance? Is it possible to configure it inside the
>> > solr.xml file? Also, it makes sense to deploy a multi-core
>> > configuration in each Solr server, each core with its own configName
>> > stored in ZooKeeper, but again, using that kind of JAVA_OPTS
>> > on-the-fly parameter configuration makes it impossible, :-(
>>
>> That config name sys prop is not being used here - it's only used when
>> you use -Dbootstrap_confdir=<path>, and then only the first time you
>> start up.
>>
>> Collections are linked to configuration sets in ZooKeeper. If you use
>> -Dbootstrap_conf=true, a special rule is used that auto links
>> collections and config sets with the same name as the collection.
>> Otherwise, you can use the ZkCLI cmd line tool to link any collection
>> to any config in ZooKeeper.
>>
>> > 2. The other question is about indexing. What is the best way to do
>> > plain indexing (I mean, without DIH or similar) in SolrCloud? Maybe
>> > configuring an LBHttpSolrServer that decides by itself which Solr
>> > server instance is best for each indexing request?
>>
>> CloudSolrServer is probably your best bet. It does load balancing and
>> knows the cluster state from ZooKeeper.
>>
>> > 3. The following question may sound strange, but... the thing is that
>> > I would like to help the Apache Solr project by contributing code
>> > (bug fixes, new features, etc.). How can I contribute to the
>> > community?
>>
>> Create JIRAs in our issue tracking system, participate on the mailing
>> list, update our wiki, etc :)
>>
>> > Thanks a lot.
>> >
>> > Best Regards,
>> >
>> > Luis Cappa.
>> >
>> > 2012/10/31 Mark Miller <markrmil...@gmail.com>
>> >>
>> >> A big difference if you are using tomcat is that you still need to
>> >> specify jetty.port - unless you change the name of that sys prop in
>> >> solr.xml.
>> >>
>> >> Some more below:
>> >>
>> >> On Wed, Oct 31, 2012 at 2:09 PM, Luis Cappa Banda <luisca...@gmail.com>
>> >> wrote:
>> >> > Hello!
>> >> >
>> >> > How are you? I followed the SolrCloud wiki tutorial and noticed that
>> >> > everything worked perfectly with Jetty and a very basic
>> >> > configuration. My first impression was that SolrCloud is amazing,
>> >> > and I'm interested in deploying a more complex, near-production
>> >> > SolrCloud architecture for testing purposes. I'm using Tomcat as the
>> >> > application server, so I've started testing with it.
>> >> >
>> >> > I've installed the ZooKeeper service on a single machine and started
>> >> > it up with the following configuration:
>> >> >
>> >> > 1.)
>> >> >
>> >> > ~zookeperhome/conf/zoo.cfg
>> >> >
>> >> > tickTime=2000
>> >> > initLimit=10
>> >> > syncLimit=5
>> >> > dataDir=~zookeperhome/data/
>> >> > clientPort=9000
>> >> >
>> >> > 2.) I'm testing with a single-core Solr server called 'items_en'.
>> >> > The configuration is as follows:
>> >> >
>> >> > Indexes conf/data tree: /mnt/data-store/solr/
>> >> >     /solr.xml
>> >> >     /zoo.cfg
>> >> >     /items_en/
>> >> >         /conf/
>> >> >             schema.xml
>> >> >             solrconfig.xml
>> >> >             etc.
>> >> >
>> >> > So we have a simple configuration where the conf files and the index
>> >> > data files are in the same path.
>> >> >
>> >> > 3.) OK, so we have the Solr server configured, but I have to save
>> >> > the configuration into ZooKeeper. I do it as follows:
>> >> >
>> >> > ./bin/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:9000 -confdir
>> >> > /mnt/data-store/solr/items_en/conf -collection items_en -confname
>> >> > items_en
>> >> >
>> >> > It seems to work perfectly, because if I use the ZooKeeper client
>> >> > and execute the 'ls' command, the files appear:
>> >> >
>> >> > ./bin/zkCli.sh -server localhost:9000
>> >> >
>> >> > [zk: localhost:9000(CONNECTED) 1] ls /configs/items_en
>> >> > [admin-extra.menu-top.html, currency.xml, protwords.txt,
>> >> > mapping-FoldToASCII.txt, solrconfig.xml, lang, spellings.txt,
>> >> > mapping-ISOLatin1Accent.txt, admin-extra.html, xslt, scripts.conf,
>> >> > synonyms.txt, update-script.js, velocity, elevate.xml, zoo.cfg,
>> >> > admin-extra.menu-bottom.html, stopwords_en.txt, schema.xml]
>> >> >
>> >> > 4.) I would like all the Solr servers deployed in that Tomcat
>> >> > instance to point to the ZooKeeper service on port 9000, so I
>> >> > included the following JAVA_OPTS, hoping that they'll make that
>> >> > possible:
>> >> >
>> >> > JAVA_OPTS="-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en
>> >> > -DnumShards=2"
>> >> >
>> >> > Question 1: supposing that the JAVA_OPTS are OK, do you think there
>> >> > is a more flexible and less fixed way to tell each Solr server
>> >> > instance which ZooKeeper service is its own?
>> >>
>> >> Your zkHost should actually be a comma sep list of the zk hosts. Yes,
>> >> we hope to improve this in the future as zookeeper becomes more
>> >> flexible.
>> >>
>> >> > Question 2: can you increase numShards later, even after indexing?
>> >> > Example: imagine that you have millions of documents and you want
>> >> > to expand from two to four shards and also increase the number of
>> >> > Solr servers.
>> >>
>> >> You can't change the number of shards yet - there is an open jira
>> >> issue for this and ongoing work. It's been called shard splitting.
>> >>
>> >> > Question 3: again supposing that the JAVA_OPTS are OK (or nearly
>> >> > OK), is it necessary to always include -DnumShards for each Tomcat
>> >> > server? Can't this confuse the ZooKeeper instance?
>> >>
>> >> It depends on how you start your instances. The first one is the only
>> >> one that matters - it only makes sense to specify it for each instance
>> >> if you plan on starting them all at the same time and are not sure
>> >> which will be the first to register in zk.
>> >>
>> >> > Question 4: imagine that we have three ZooKeeper instances to manage
>> >> > config files in a production environment. Should the -DzkHost
>> >> > parameter look like this? -DzkHost=host1:port1,host2:port2,host3:port3
>> >>
>> >> Yes.
>> >>
>> >> > 5.) I started Tomcat (port 8080) with a single Solr server and
>> >> > everything seems to be OK: there is a single core set as 'items_en'
>> >> > and the Cloud button is active. The graph is a simple tree with
>> >> > shard1 and shard2. The current instance is connected to shard1.
>> >> > But if I execute any query I just receive a 503 error code: "no
>> >> > servers hosting".
>> >>
>> >> Not sure why offhand - if you are not passing jetty.port (or something
>> >> else if you have renamed it - like tomcat.port), that will be a
>> >> problem.
>> >>
>> >> > 6.) I started another Solr server in a second Tomcat instance (port
>> >> > 9080). Its Solr home is in the following path:
>> >> >
>> >> > Indexes conf/data tree: /mnt/data-store/solr2/
>> >> >     /solr.xml
>> >> >     /zoo.cfg
>> >> >     /items_en/
>> >> >         /conf/
>> >> >             schema.xml
>> >> >             solrconfig.xml
>> >> >             etc.
>> >> >
>> >> > Notice that I have a second Solr home for this second Solr server.
>> >> > Again, when deploying it in Tomcat the Cloud button is active, but
>> >> > when I look at the graph, another empty tree/shard1+shard2 graph
>> >> > appears, where shard1 is the Solr server instance from Tomcat 9080.
>> >> > What I expected is that this second Solr server instance would
>> >> > become shard2, but it doesn't. The most interesting thing is that I
>> >> > was watching both the Tomcat1 and Tomcat2 logs in parallel and they
>> >> > output some "INFO: Updating live nodes" traces, so I thought
>> >> > everything was all right, but it isn't, :-(
>> >> >
>> >> > Question 5: ahem... what am I doing wrong? Can anyone help me? I
>> >> > just want to follow the same example from the SolrCloud wiki, where
>> >> > there are two application server instances, each one with a Solr
>> >> > server deployed in it, and the Solr servers become shard1 and
>> >> > shard2.
>> >>
>> >> I'm guessing it's the jetty.port issue until you tell me otherwise.
>> >>
>> >> > Thank you very much for your help. In the end I promise to write a
>> >> > detailed (and for dummies, like me) step-by-step tutorial about how
>> >> > to configure and deploy SolrCloud in Tomcat, which I hope could help
>> >> > others.
>> >> >
>> >> > Regards,
>> >> >
>> >> > Luis Cappa.
>> >>
>> >> --
>> >> - Mark
>> >
>>
>> --
>> - Mark
>

--
- Luis Cappa
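
A note on the lock-file idea from the 2012/11/5 message at the top of this thread: it could be sketched roughly as below. This is only an illustration of that proposal, not something Solr or Tomcat provides. The function name `choose_java_opts` and the way a launcher would call it are assumptions; the two option strings are copied from the message. The first caller creates the lock file atomically and gets the bootstrap options, and every later caller gets only `-DzkHost`. Stale locks and shared file systems are not handled here.

```python
import os
import tempfile

# Option strings copied from the 2012/11/5 message.
MASTER_OPTS = "-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true"
MEMBER_OPTS = "-DzkHost=127.0.0.1:9000"


def choose_java_opts(lock_path):
    """Return the bootstrap options to the first caller, plain options otherwise.

    Hypothetical helper: the name and the lock location are illustration only.
    """
    try:
        # O_CREAT | O_EXCL makes creation atomic on a local file system:
        # exactly one process succeeds, all others get FileExistsError.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Someone else already claimed the "master" role.
        return MEMBER_OPTS
    with os.fdopen(fd, "w") as lock_file:
        # Record who took the lock, which helps when cleaning up by hand.
        lock_file.write(str(os.getpid()))
    return MASTER_OPTS
```

A launcher could call `choose_java_opts(os.path.join(tempfile.gettempdir(), "solrcloud.config.lock"))` and export the result as JAVA_OPTS before starting Tomcat. Because the lock file stays in place, no later restart bootstraps again until someone removes it, which matches the behaviour described in the message.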