Forwarding to the solr-user mailing list. We forgot to reply to it, :-/

2012/11/5 Luis Cappa Banda <luisca...@gmail.com>

> Hello, Mark!
>
> I've been testing more and more and things are going better. I have tested
> what you told me about "-Dbootstrap_conf=true" and it works fine, but the
> problem is that if I include that parameter in every Tomcat instance, then
> when I deploy all the Solr servers each one uploads all the SolrCore
> configurations to ZooKeeper again.
>
> There should be something like a Tomcat master server, the only one with the
> following parameters, which define the basic SolrCloud configuration:
>
> JAVA_OPTS="-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true"
>
> Then the other Tomcat servers should have only:
>
> JAVA_OPTS="-DzkHost=127.0.0.1:9000"
>
>
> However, I think that is not the best way to proceed. This is 2012,
> it's the end of the world - God (well, one of them) is angry and attacks my
> production environment. Imagine that all servers go down and a Monit
> service restarts them in random order. Maybe an ordinary Tomcat server
> finishes its startup faster than the designated Tomcat master server, so
> those SolrCloud configuration parameters won't be loaded first. That's a
> problem.
>
> One possibility is to write a simple script, executed on every Tomcat
> launch, that does something like:
>
> "I'm the first Tomcat and I'm launching! I'll write a
> solrcloud.config.lock file in a well-known path (or maybe into ZooKeeper)
> to announce to the other Tomcats that I'll start loading the SolrCloud
> configuration files into ZooKeeper. I am the Tomcat master server, so I'll
> load JAVA_OPTS="-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true"."
>
> "I'm a second Tomcat and I'm launching! First I check whether a
> solrcloud.config.lock file exists. If it does, I simply load
> JAVA_OPTS="-DzkHost=127.0.0.1:9000"."
>
>
> And so on.
>
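The lock-file idea above can be sketched roughly as follows. This is only an
illustration in Python, not something that ships with Solr: the function name
java_opts_for_startup and the lock path are hypothetical, and it assumes every
Tomcat can see the filesystem where the lock lives. It relies on atomic
exclusive file creation, so exactly one starter wins whatever order Monit
restarts them in.

```python
import os

ZK_HOST = "127.0.0.1:9000"  # ZooKeeper address used in this thread


def java_opts_for_startup(lock_path, zk_host=ZK_HOST, num_shards=2):
    """Return the JAVA_OPTS string for the Tomcat that is starting now.

    The first caller to create lock_path acts as the "master" and gets the
    bootstrap parameters; every later caller only gets -DzkHost.
    """
    try:
        # O_CREAT | O_EXCL is atomic: it fails if the file already exists,
        # so two Tomcats starting at the same time cannot both succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return f"-DzkHost={zk_host}"
    with os.fdopen(fd, "w") as lock:
        lock.write(str(os.getpid()))  # record who took the master role
    return f"-DzkHost={zk_host} -DnumShards={num_shards} -Dbootstrap_conf=true"
```

A setenv.sh wrapper could call this and export the result as JAVA_OPTS. The
lock has to be removed deliberately if the configuration ever needs to be
uploaded again.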
>
>
> I don't like this solution very much because it's not elegant and it's very
> ad hoc, but it works. What do you think about it? I started with
> SolrCloud just four or five days ago, so maybe I'm missing something that
> could solve this problem.
>
> Thank you very much, Mark.
>
> Regards,
>
>     Luis Cappa.
>
>
>
> 2012/11/3 Mark Miller <markrmil...@gmail.com>
>
>> On Fri, Nov 2, 2012 at 9:05 AM, Luis Cappa Banda <luisca...@gmail.com>
>> wrote:
>> > Hello, Mark!
>> >
>> > How are you? Thanks a lot for helping me. You were right about the
>> > jetty.host parameter. My final test solr.xml looks like:
>> >
>> >   <cores adminPath="/admin/cores" defaultCoreName="items_en"
>> > host="localhost" hostPort="9080" hostContext="items_en">
>> >     <core name="items_en" instanceDir="items_en" />
>> >   </cores>
>> >
>> >
>> > I've noticed that the 'hostContext' parameter was also required, so I
>> > included it.
>>
>> It should default to /solr if you don't set it - it is there in case
>> you deploy to a different context though.
>>
>> > After those corrections the Cloud graph tree looks right, and executing
>> > queries doesn't return a 503 error. Phew! However, I saw in the Cloud
>> > graph tree that a "collection1" also appears, pointing to
>> > http://localhost:8983/solr. I will keep testing in case I missed
>> > something, but it looks like it is creating another collection with
>> > default parameters (collection name, port) outside my control.
>>
>> It should only create what it finds in solr.xml - let me know what you
>> find.
>>
>> >
>> > While using Apache Tomcat I was forced to include in catalina.sh (or
>> > setenv.sh) the following environment parameters, as I told you before:
>> >
>> > JAVA_OPTS="-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en"
>>
>> You should only need -DzkHost= - see below.
>>
>> >
>> >
>> > Just three questions more:
>> >
>> > 1. That's a problem for me, because I would like to deploy more than one
>> > Solr server in each Tomcat instance, each with different configuration
>> > files (I mean, different configName parameters), so including that
>> > JAVA_OPTS forces me to deploy in that Tomcat server only Solr servers
>> > with this one configuration. In a production environment I would like to
>> > deploy at least four Solr servers in a single Tomcat instance, one for
>> > each kind of document that I will index and query. Do you know any way to
>> > configure the configName per Solr server instance? Is it possible to
>> > configure it inside the solr.xml file? Also, it makes sense to deploy a
>> > multi-core configuration in each Solr server, each core with its own
>> > configName stored in ZooKeeper, but again that kind of JAVA_OPTS
>> > "on-fire" parameter configuration makes it impossible, :-(
>>
>> That config name sys prop is not being used here - it's only used when
>> you use -Dbootstrap_confdir=<path>, and then only the first time you
>> start up.
>>
>> Collections are linked to configuration sets in ZooKeeper. If you use
>> -Dbootstrap_conf=true, a special rule is used that auto links
>> collections and config sets with the same name as the collection.
>> Otherwise, you can use the ZkCLI cmd line tool to link any collection
>> to any config in zookeeper.
>>
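For the ZkCLI route mentioned above, a small helper can assemble the command
line. This is a minimal sketch, assuming the linkconfig command of Solr's
zkcli script and the ./bin/zkcli.sh path used for upconfig elsewhere in this
thread. The function name is hypothetical and the script location will differ
per install.

```python
def linkconfig_command(zk_host, collection, config_name,
                       zkcli="./bin/zkcli.sh"):
    """Build the argument list that links a collection to a config set.

    Nothing is executed here; the list can be passed to subprocess.run().
    The zkcli default is the script path quoted in this thread (assumption).
    """
    return [
        zkcli,
        "-cmd", "linkconfig",
        "-zkhost", zk_host,
        "-collection", collection,
        "-confname", config_name,
    ]


# Example: link the items_en collection to the items_en config set.
command = linkconfig_command("127.0.0.1:9000", "items_en", "items_en")
print(" ".join(command))
```

Returning a list instead of a single string avoids shell quoting problems if a
collection or config name ever contains unusual characters.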
>>
>>
>> >
>> > 2. The other question is about indexing. What is the best way to index
>> > directly (I mean, without DIH or similar) in SolrCloud? Maybe
>> > configuring a LBHttpSolrServer that decides by itself which Solr server
>> > instance is best for each indexing request?
>>
>> CloudSolrServer is probably your best bet. It does load balancing and knows
>> the cluster state from zookeeper.
>>
>> >
>> > 3. The following question may sound strange, but... the thing is that I
>> > would like to help the Apache Solr project by contributing code
>> > (bug fixes, new features, etc.). How can I contribute to the
>> > community?
>>
>> Create JIRA issues in our issue tracking system, participate on the mailing
>> list, update our wiki, etc :)
>>
>> >
>> > Thanks a lot.
>> >
>> > Best Regards,
>> >
>> >
>> > Luis Cappa.
>> >
>> >
>> > 2012/10/31 Mark Miller <markrmil...@gmail.com>
>> >>
>> >> A big difference if you are using tomcat is that you still need to
>> >> specify jetty.port - unless you change the name of that sys prop in
>> >> solr.xml.
>> >>
>> >> Some more below:
>> >>
>> >> On Wed, Oct 31, 2012 at 2:09 PM, Luis Cappa Banda <luisca...@gmail.com>
>> >> wrote:
>> >> > Hello!
>> >> >
>> >> > How are you? I followed the SolrCloud wiki tutorial and noticed that
>> >> > everything worked perfectly with Jetty and a very basic configuration.
>> >> > My first impression was that SolrCloud is amazing, and I'm interested
>> >> > in deploying a more complex, near-production SolrCloud architecture
>> >> > for testing purposes. I'm using Tomcat as the application server, so
>> >> > I've started testing with it.
>> >> >
>> >> > I've installed the ZooKeeper service on a single machine and started
>> >> > it up with the following configuration:
>> >> >
>> >> > 1.)
>> >> >
>> >> > ~zookeperhome/conf/zoo.cfg
>> >> >
>> >> > tickTime=2000
>> >> > initLimit=10
>> >> > syncLimit=5
>> >> > dataDir=~zookeperhome/data/
>> >> > clientPort=9000
>> >> >
>> >> > 2.) I'm testing with a single-core Solr server called 'items_en'. The
>> >> > configuration is as follows:
>> >> >
>> >> > Indexes conf/data tree: /mnt/data-store/solr/
>> >> >     solr.xml
>> >> >     zoo.cfg
>> >> >     items_en/
>> >> >         conf/
>> >> >             schema.xml
>> >> >             solrconfig.xml
>> >> >             etc.
>> >> >
>> >> > So we have a simple configuration where the conf files and the index
>> >> > data files are in the same path.
>> >> >
>> >> > 3.) OK, so we have the Solr server configured, but I have to save the
>> >> > configuration into ZooKeeper. I do it as follows:
>> >> >
>> >> > ./bin/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:9000 -confdir /mnt/data-store/solr/items_en/conf -collection items_en -confname items_en
>> >> >
>> >> > And it seems to work perfectly, because if I use the ZooKeeper client
>> >> > and execute the 'ls' command the files appear:
>> >> >
>> >> > ./bin/zkCli.sh -server localhost:9000
>> >> >
>> >> > [zk: localhost:9000(CONNECTED) 1] ls /configs/items_en
>> >> > [admin-extra.menu-top.html, currency.xml, protwords.txt,
>> >> > mapping-FoldToASCII.txt, solrconfig.xml, lang, spellings.txt,
>> >> > mapping-ISOLatin1Accent.txt, admin-extra.html, xslt, scripts.conf,
>> >> > synonyms.txt, update-script.js, velocity, elevate.xml, zoo.cfg,
>> >> > admin-extra.menu-bottom.html, stopwords_en.txt, schema.xml]
>> >> >
>> >> > 4.) I would like all the Solr servers deployed in that Tomcat
>> >> > instance to point to the ZooKeeper service on port 9000, so I included
>> >> > the following JAVA_OPTS hoping that they would make that possible:
>> >> >
>> >> > JAVA_OPTS="-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en -DnumShards=2"
>> >> >
>> >> > Question 1: supposing that the JAVA_OPTS are OK, do you think there is
>> >> > a more flexible and less hard-coded way to tell each Solr server
>> >> > instance which ZooKeeper service is its own?
>> >>
>> >> Your zkHost should actually be a comma sep list of the zk hosts. Yes,
>> >> we hope to improve this in the future as zookeeper becomes more
>> >> flexible.
>> >>
>> >> >
>> >> > Question 2: can you increase numShards later, even after indexing?
>> >> > Example: imagine that you have millions of documents and you want to
>> >> > expand from two to four shards and also increase the number of Solr
>> >> > servers.
>> >>
>> >> You can't change the number of shards yet - there is an open jira
>> >> issue for this and ongoing work. It's been called shard splitting.
>> >>
>> >> >
>> >> > Question 3: again supposing that the JAVA_OPTS are OK (or nearly OK),
>> >> > is it necessary to always include -DnumShards for each Tomcat server?
>> >> > Can't this confuse the ZooKeeper instance?
>> >>
>> >> It depends on how you start your instances. The first one is the only
>> >> one that matters - it only makes sense to specify for each instance if
>> >> you plan on starting them all at the same time and are not sure which
>> >> the first to register in zk will be.
>> >>
>> >> >
>> >> > Question 4: imagine that we have three ZooKeeper instances to manage
>> >> > config files in a production environment. Should the -DzkHost
>> >> > parameter look like this? -DzkHost=host1:port1,host2:port2,host3:port3
>> >>
>> >> Yes.
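
The comma-separated form confirmed above is easy to generate from a list of
servers. A minimal sketch, assuming a hypothetical helper name zk_host_param
and made-up host names and ports (they are placeholders, not real servers):

```python
def zk_host_param(servers):
    """Build the -DzkHost system property from (host, port) pairs.

    servers is a sequence such as [("host1", 2181), ("host2", 2181)];
    the result lists every ZooKeeper node, separated by commas.
    """
    if not servers:
        raise ValueError("at least one ZooKeeper server is required")
    return "-DzkHost=" + ",".join(f"{host}:{port}" for host, port in servers)


# Placeholder ensemble of three ZooKeeper nodes (hypothetical names).
ensemble = [("host1", 2181), ("host2", 2181), ("host3", 2181)]
print(zk_host_param(ensemble))
```

Every Tomcat would get the same full list, so any single ZooKeeper node can be
down at startup without breaking the connection.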
>> >>
>> >> >
>> >> > 5.) I started Tomcat (port 8080) with a single Solr server and
>> >> > everything seems to be OK: there is a single core set up as 'items_en'
>> >> > and the Cloud button is active. The graph is a simple tree with shard1
>> >> > and shard2. The current instance is connected to shard1. Also, if I
>> >> > execute any query I just receive a 503 error code: "no servers
>> >> > hosting".
>> >> >
>> >>
>> >> Not sure why offhand - if you are not passing jetty.port (or something
>> >> else if you have renamed it - like tomcat.port), that will be a
>> >> problem.
>> >>
>> >> >
>> >> > 6.) I started another Solr server in a second Tomcat instance (port
>> >> > 9080). Its Solr home is in the following path:
>> >> >
>> >> > Indexes conf/data tree: /mnt/data-store/solr2/
>> >> >     solr.xml
>> >> >     zoo.cfg
>> >> >     items_en/
>> >> >         conf/
>> >> >             schema.xml
>> >> >             solrconfig.xml
>> >> >             etc.
>> >> >
>> >> > Notice that I have a second Solr home for this second Solr server.
>> >> > Again, when deploying it in Tomcat the Cloud button is active, but
>> >> > when I look at the graph, another empty tree/shard1+shard2 graph
>> >> > appears where shard1 is the Solr server instance from Tomcat 9080.
>> >> > What I expected was that this second Solr server instance would become
>> >> > shard2, but it doesn't. The most interesting thing is that I was
>> >> > watching both the Tomcat1 and Tomcat2 logs in parallel and they output
>> >> > some "INFO: Updating live nodes" traces, so I thought everything was
>> >> > all right, but it isn't, :-(
>> >> >
>> >> > Question 5: ahem... what am I doing wrong? Can anyone help me? I just
>> >> > want to follow the same example from the SolrCloud wiki, where there
>> >> > are two application server instances, each with a Solr server deployed
>> >> > in it, and the Solr servers become shard1 and shard2.
>> >>
>> >> I'm guessing its the jetty.port issue until you tell me otherwise.
>> >>
>> >> >
>> >> > Thank you very much for your help. In the end I promise to write a
>> >> > detailed (and for dummies, like me) step-by-step tutorial about how to
>> >> > configure and deploy SolrCloud in Tomcat, which I hope will help
>> >> > others.
>> >> >
>> >> >
>> >> > Regards,
>> >> >
>> >> >
>> >> >
>> >> > Luis Cappa.
>> >>
>> >>
>> >>
>> >> --
>> >> - Mark
>> >
>> >
>>
>>
>>
>> --
>> - Mark
>>
>
>


-- 

- Luis Cappa
