Re: SolrCloud Tomcat configuration: problems and doubts.

2012-11-06 Thread Luis Cappa Banda
Forwarding to the solr-user mailing list. We forgot to reply to it :-/

2012/11/5 Luis Cappa Banda luisca...@gmail.com

 Hello, Mark!

 I've been testing more and more and things are going better. I tested
 what you told me about -Dbootstrap_conf=true and it works fine, but the
 problem is that if I include that parameter in every Tomcat instance,
 then when I deploy all the Solr servers each one loads all the solrCore
 configurations into ZooKeeper again.

 There should be something like a Tomcat master server, the only one that
 has the following parameters defining the basic SolrCloud configuration:

 JAVA_OPTS=-DzkHost=127.0.0.1:9000 -DnumShards=2 -Dbootstrap_conf=true

 Then the other Tomcat servers should have only:

 JAVA_OPTS=-DzkHost=127.0.0.1:9000


 However, I think that is not the best way to proceed. It's 2012,
 the end of the world - God (well, one of them) is angry and attacks my
 production environment. Imagine that all servers go down and a Monit
 service restarts them in random order. Maybe an ordinary Tomcat server
 finishes its startup faster than the designated Tomcat master server, so
 those SolrCloud configuration parameters won't be loaded first. That's a
 problem.

 One possibility is to write a simple script, executed at every Tomcat
 launch, that does something like:

  I'm the first Tomcat and I'm launching! I'll write a
 solrcloud.config.lock file in a well-known path (or maybe into ZooKeeper)
 to announce to the other Tomcats that I'll start loading the SolrCloud
 configuration files into ZooKeeper. I am the Tomcat master server, so I'll
 load JAVA_OPTS=-DzkHost=127.0.0.1:9000 -DnumShards=2
 -Dbootstrap_conf=true.

  I'm a second Tomcat and I'm launching! First I check whether a
 solrcloud.config.lock file exists. If it exists, I simply load
 JAVA_OPTS=-DzkHost=127.0.0.1:9000.


 And so on.
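
A minimal sketch of such a launch script, under stated assumptions: the function name solrcloud_java_opts and the lock path are hypothetical, and the lock is a directory rather than a file because mkdir either succeeds or fails atomically. A lock on the local filesystem only coordinates Tomcats that share that filesystem.

```shell
#!/bin/sh
# Hypothetical helper meant to be sourced from setenv.sh.
# Usage: solrcloud_java_opts LOCK_DIR ZK_HOST
# Prints the JAVA_OPTS for this Tomcat. The first caller that manages to
# create LOCK_DIR acts as the "master" and bootstraps the configuration.
solrcloud_java_opts() {
  lock_dir="$1"
  zk_host="$2"
  # mkdir creates the directory or fails in one atomic step, so only one
  # Tomcat can win even if several start at the same moment.
  if mkdir "$lock_dir" 2>/dev/null; then
    echo "-DzkHost=$zk_host -DnumShards=2 -Dbootstrap_conf=true"
  else
    echo "-DzkHost=$zk_host"
  fi
}

# Example wiring (the lock path is an assumption):
# JAVA_OPTS="$(solrcloud_java_opts /var/lock/solrcloud.config.lock 127.0.0.1:9000)"
# export JAVA_OPTS
```

The lock is never removed in this sketch, so after a full restart no instance bootstraps again. That is probably fine once the configuration is already in ZooKeeper, but the lock would have to be deleted by hand to upload changed configuration files. The other instances also do not wait for the upload to finish.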



 I don't like this solution much because it's not elegant and it's very
 ad hoc, but it works. What do you think about it? I only started with
 SolrCloud four or five days ago, so maybe I'm missing something that could
 solve this problem.

 Thank you very much, Mark.

 Regards,

 Luis Cappa.



 2012/11/3 Mark Miller markrmil...@gmail.com

 On Fri, Nov 2, 2012 at 9:05 AM, Luis Cappa Banda luisca...@gmail.com
 wrote:
  Hello, Mark!
 
  How are you? Thanks a lot for helping me. You were right about the
  jetty.host parameter. My final test solr.xml looks like:
 
  <cores adminPath="/admin/cores" defaultCoreName="items_en"
         host="localhost" hostPort="9080" hostContext="items_en">
    <core name="items_en" instanceDir="items_en" />
  </cores>
 
 
  I've noticed that the 'hostContext' parameter was also required, so I
  included it.

 It should default to /solr if you don't set it - it is there in case
 you deploy to a different context though.

  After those corrections the Cloud graph tree looks right, and executing
  queries doesn't return a 503 error. Phew! However, I saw in the Cloud
  graph tree that a collection1 also appears, pointing to
  http://localhost:8983/solr. I will keep testing in case I missed something,
  but it looks like it is creating another collection with default parameters
  (collection name, port) without control.

 It should only create what it finds in solr.xml - let me know what you
 find.

 
  While using Apache Tomcat I was forced to include in catalina.sh (or
  setenv.sh) the following environment parameters, as I told you before:
 
  JAVA_OPTS=-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en

 You should only need -DzkHost= - see below.

 
 
  Just three more questions:
  
  1. That's a problem for me, because I would like to deploy in each Tomcat
  instance more than one Solr server with different configuration files (I
  mean, different configName parameters), so including that JAVA_OPTS forces
  me to deploy in that Tomcat server only Solr servers with this one
  configuration. In a production environment I would like to deploy in a
  single Tomcat instance at least four Solr servers, one for each kind of
  document that I will index and query. Do you know any way to configure
  the configName per Solr server instance? Is it possible to configure it
  inside the solr.xml file? Also, it makes sense to deploy in each Solr
  server a multi-core configuration, each core with its own configName
  stored in ZooKeeper, but again that kind of JAVA_OPTS parameter
  configuration makes it impossible :-(

 That config name sys prop is not being used here - it's only used when
 you use -Dbootstrap_confdir=path, and then only the first time you
 start up.

 Collections are linked to configuration sets in ZooKeeper. If you use
 -Dbootstrap_conf=true, a special rule is used that auto links
 collections and config sets with the same name as the collection.
 Otherwise, you can use the ZkCLI cmd line tool to link any collection
 to any config in zookeeper.
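
As a rough illustration of that linking step, the sketch below only prints a zkcli.sh linkconfig invocation so it can be reviewed before anyone runs it. The function name print_linkconfig_cmd and the collection name items_es are made up for the example, and the script path is the one used elsewhere in this thread. The linkconfig command and its options are what the Solr 4.x ZkCLI is believed to accept, so check them against your version.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the command that would link
# a collection to a config set already stored in ZooKeeper.
# Usage: print_linkconfig_cmd ZK_HOST COLLECTION CONF_NAME
print_linkconfig_cmd() {
  zk_host="$1"
  collection="$2"
  conf_name="$3"
  echo "./bin/zkcli.sh -cmd linkconfig -zkhost $zk_host -collection $collection -confname $conf_name"
}

# Link a second (hypothetical) collection to the items_en config set.
print_linkconfig_cmd 127.0.0.1:9000 items_es items_en
```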



 
  2. The other question is about indexing. What is the best way to plain
 index
  (I 

Re: SolrCloud Tomcat configuration: problems and doubts.

2012-11-02 Thread Luis Cappa Banda
Hello, Mark!

How are you? Thanks a lot for helping me. You were right about the jetty.host
parameter. My final test solr.xml looks like:

<cores adminPath="/admin/cores" defaultCoreName="items_en"
       host="localhost" hostPort="9080" hostContext="items_en">
  <core name="items_en" instanceDir="items_en" />
</cores>


I've noticed that the 'hostContext' parameter was also required, so I included
it. After those corrections the Cloud graph tree looks right, and executing
queries doesn't return a 503 error. Phew! However, I saw in the Cloud
graph tree that a collection1 also appears, pointing to
http://localhost:8983/solr. I will keep testing in case I missed something,
but it looks like it is creating another collection with default parameters
(collection name, port) without control.

While using Apache Tomcat I was forced to include in catalina.sh (or
setenv.sh) the following environment parameters, as I told you before:

JAVA_OPTS=-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en


Just three more questions:

1. That's a problem for me, because I would like to deploy in each Tomcat
instance more than one Solr server with different configuration files (I
mean, different configName parameters), so including that JAVA_OPTS forces
me to deploy in that Tomcat server only Solr servers with this one
configuration. In a production environment I would like to deploy in a
single Tomcat instance at least four Solr servers, one for each kind of
document that I will index and query. Do you know any way to configure
the configName per Solr server instance? Is it possible to configure it
inside the solr.xml file? Also, it makes sense to deploy in each Solr server a
multi-core configuration, each core with its own configName stored in
ZooKeeper, but again that kind of JAVA_OPTS parameter
configuration makes it impossible :-(

2. The other question is about indexing. What is the best way to do plain
indexing (I mean, without DIH or similar) in SolrCloud? Maybe configuring an
LBHttpSolrServer that decides by itself which Solr server instance is best
for each indexing operation?

3. The following question may sound strange, but the thing is that
I would like to help the Apache Solr project by contributing code
(bug fixes, new features, etc.). How can I contribute to the
community?

Thanks a lot.

Best Regards,


Luis Cappa.



Re: SolrCloud Tomcat configuration: problems and doubts.

2012-10-31 Thread Mark Miller
A big difference if you are using tomcat is that you still need to
specify jetty.port - unless you change the name of that sys prop in
solr.xml.

Some more below:

On Wed, Oct 31, 2012 at 2:09 PM, Luis Cappa Banda luisca...@gmail.com wrote:
 Hello!

 How are you? I followed the SolrCloud wiki tutorial and everything worked
 perfectly with Jetty and a very basic configuration. My first
 impression was that SolrCloud is amazing, and I'm interested in deploying a
 more complex, near-production SolrCloud architecture for
 testing purposes. I'm using Tomcat as the application server, so I've started
 testing with it.

 I've installed the ZooKeeper service on a single machine and started it with
 the following configuration:

 1.)

 ~zookeperhome/conf/zoo.cfg

 tickTime=2000
 initLimit=10
 syncLimit=5
 dataDir=~zookeperhome/data/
 clientPort=9000

 2.) I am testing with a single-core Solr server called 'items_en'. The
 configuration is as follows:

 Indexes conf/data tree: /mnt/data-store/solr/
     /solr.xml
     /zoo.cfg
     /items_en/
         /conf/
             schema.xml
             solrconfig.xml
             etc.

 So we have a simple configuration where the configuration files and the
 index data files are in the same path.

 3.) OK, so we have the Solr server configured, but I have to save the
 configuration into ZooKeeper. I do it as follows:

 ./bin/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:9000 \
     -confdir /mnt/data-store/solr/items_en/conf \
     -collection items_en -confname items_en

 And it seems to work perfectly, because if I use the ZooKeeper client and
 execute the 'ls' command the files appear:

 ./bin/zkCli.sh -server localhost:9000

 [zk: localhost:9000(CONNECTED) 1] ls /configs/items_en
 [admin-extra.menu-top.html, currency.xml, protwords.txt,
 mapping-FoldToASCII.txt, solrconfig.xml, lang, spellings.txt,
 mapping-ISOLatin1Accent.txt, admin-extra.html, xslt, scripts.conf,
 synonyms.txt, update-script.js, velocity, elevate.xml, zoo.cfg,
 admin-extra.menu-bottom.html, stopwords_en.txt, schema.xml]

 4.) I would like all the Solr servers deployed in that Tomcat
 instance to point to the ZooKeeper service on port 9000, so I included the
 following JAVA_OPTS, hoping that they'll make that possible:

 JAVA_OPTS=-DzkHost=127.0.0.1:9000 -Dcollection.configName=items_en
 -DnumShards=2

 Question 1: supposing that the JAVA_OPTS are OK, do you think there is a
 more flexible and less fixed way to tell each Solr server instance
 which ZooKeeper service is its own?

Your zkHost should actually be a comma sep list of the zk hosts. Yes,
we hope to improve this in the future as zookeeper becomes more
flexible.

 Question 2: can you increase numShards later, even after
 indexing? Example: imagine that you have millions of documents and you
 want to expand from two to four shards and also increase the number of
 Solr servers.

You can't change the number of shards yet - there is an open jira
issue for this and ongoing work. It's been called shard splitting.

 Question 3: again supposing that the JAVA_OPTS are OK (or nearly OK), is
 it necessary to always include -DnumShards for each Tomcat server? Can't
 this confuse the ZooKeeper instance?

It depends on how you start your instances. The first one is the only
one that matters - it only makes sense to specify for each instance if
you plan on starting them all at the same time and are not sure which
the first to register in zk will be.

 Question 4: imagine that we have three ZooKeeper instances to manage
 config files in a production environment. Should the -DzkHost parameter
 look like this? -DzkHost=host1:port1,host2:port2,host3:port3

Yes.
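
For illustration, a small sketch of building that comma-separated list in a startup script. The function name zk_host_opt and the host names in the comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: joins its arguments (host:port pairs) with commas
# and prints the resulting -DzkHost option.
zk_host_opt() {
  joined=""
  for hp in "$@"; do
    if [ -z "$joined" ]; then
      joined="$hp"
    else
      joined="$joined,$hp"
    fi
  done
  echo "-DzkHost=$joined"
}

# Example wiring with made-up host names:
# JAVA_OPTS="$(zk_host_opt zk1.example.com:2181 zk2.example.com:2181 zk3.example.com:2181)"
```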

 5.) I started Tomcat (port 8080) with a single Solr server and
 everything seems to be OK: there is a single core set up as 'items_en' and
 the Cloud button is active. The graph is a simple tree with shard1 and shard2.
 The current instance is connected to shard1. But if I execute any query
 I just receive a 503 error code: no servers hosting.

Not sure why offhand - if you are not passing jetty.port (or something
else if you have renamed it - like tomcat.port), that will be a
problem.
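
To make the jetty.port point concrete, here is a sketch of per-instance options. It assumes solr.xml still reads the port from the jetty.port system property (for example hostPort="${jetty.port:}") and that the value passed matches the HTTP connector port of that Tomcat. The function name tomcat_solr_opts is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: prints JAVA_OPTS telling Solr which port this
# Tomcat listens on, plus the ZooKeeper address.
# Usage: tomcat_solr_opts HTTP_PORT ZK_HOST
tomcat_solr_opts() {
  http_port="$1"
  zk_host="$2"
  echo "-Djetty.port=$http_port -DzkHost=$zk_host"
}

# First Tomcat (port 8080):
# JAVA_OPTS="$(tomcat_solr_opts 8080 127.0.0.1:9000)"
# Second Tomcat (port 9080):
# JAVA_OPTS="$(tomcat_solr_opts 9080 127.0.0.1:9000)"
```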

 6.) I started another Solr server in a second Tomcat instance (port
 9080). Its Solr home is in the following path:

 Indexes conf/data tree: /mnt/data-store/solr2/
     /solr.xml
     /zoo.cfg
     /items_en/
         /conf/
             schema.xml
             solrconfig.xml