Re: Updating clusterstate from the zookeeper

2013-04-19 Thread Nate Fox
I've used ZooKeeper's CLI to do this. I doubt it's the right way, and I have
no idea if it'll work for clusterstate.json, but it seems to work for
certain things.

cd /opt/zookeeper/bin
./zkCli.sh -server 127.0.0.1:2183 set /configs/collection1/schema.xml "`cat /tmp/newschema.xml`"
sleep 10  # give a lil time to get pushed out
curl 'http://localhost:8080/solr/admin/cores?wt=json&action=RELOAD&core=collection1'


This is on zk 3.4.5
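
For the download / modify / upload round trip asked about further down this thread, here is a minimal Python sketch. It assumes the third-party kazoo client library (not mentioned anywhere in this thread) and relies only on its get and set calls; the helper names, the znode path, and the host/port in the usage comment are illustrative. Passing the version returned by get back to set makes the write fail instead of silently overwriting a change someone else made in between. Whether hand-editing clusterstate.json this way is safe on a live cluster is not something this thread confirms.

```python
from urllib.parse import urlencode


def rewrite_znode(client, path, transform):
    """Read a znode, apply transform() to its bytes, and write it back.

    `client` is assumed to behave like kazoo.client.KazooClient:
    get(path) returns (data, stat), and set(path, data, version=...)
    raises if the znode changed since it was read.
    """
    data, stat = client.get(path)
    new_data = transform(data)
    if new_data != data:
        # Only write when something actually changed.
        client.set(path, new_data, version=stat.version)
    return new_data


def reload_url(base_url, core):
    """Build the CoreAdmin RELOAD URL with properly separated parameters."""
    query = urlencode({"wt": "json", "action": "RELOAD", "core": core})
    return f"{base_url}/admin/cores?{query}"


# Hypothetical usage (needs `pip install kazoo` and a reachable ZooKeeper):
#
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="127.0.0.1:2183")
#   zk.start()
#   rewrite_znode(zk, "/configs/collection1/schema.xml",
#                 lambda raw: raw.replace(b"old", b"new"))
#   zk.stop()
#   print(reload_url("http://localhost:8080/solr", "collection1"))
```

The transform runs on raw bytes, so the same helper would work for an XML schema or a JSON state file; parsing and re-serializing is left to the caller.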



--
Nate Fox
Sr Systems Engineer

o: 310.658.5775
m: 714.248.5350

Follow us @NEOGOV http://twitter.com/NEOGOV and on
Facebook http://www.facebook.com/neogov

NEOGOV http://www.neogov.com/ is among the top fastest growing software
companies in the USA, recognized by Inc 500|5000, Deloitte Fast 500, and
the LA Business Journal. We are hiring! http://www.neogov.com/#/company/careers



On Fri, Apr 19, 2013 at 11:30 AM, Mingfeng Yang mfy...@wisewindow.com wrote:

 Right. I am wondering if/how we can download a specific file from the
 zookeeper, modify it and then upload to rewrite it.  Anyone ?

 Thanks,
 Ming


 On Fri, Apr 19, 2013 at 10:53 AM, Michael Della Bitta 
 michael.della.bi...@appinions.com wrote:

  I would like to know the answer to this as well.
 
  Michael Della Bitta
 
  
  Appinions
  18 East 41st Street, 2nd Floor
  New York, NY 10017-6271
 
  www.appinions.com
 
  Where Influence Isn’t a Game
 
 
  On Thu, Apr 18, 2013 at 8:15 PM, Manuel Le Normand
  manuel.lenorm...@gmail.com wrote:
   Hello,
   After creating a distributed collection on several different servers, I
   sometimes have to deal with failing servers (cores appear not available =
   grey) or failing cores (Down / unable to recover = brown / red).
   If I delete this erroneous collection (through the Collections API), only
   the green nodes get erased, leaving a meaningless unavailable
   collection in the clusterstate.json.
  
   Is there any way to edit the clusterstate.json explicitly? If not, how do I
   update it so that the collection above gets deleted?
  
   Cheers,
   Manu
 



Re: Need hook to know when replication backup is actually completed.

2013-04-12 Thread Nate Fox
Tim, thank you for this! I had been looking for this a while back (even
posted something on serverfault) and never got a decent answer. This is
exactly what I was looking for.





On Fri, Apr 12, 2013 at 12:04 PM, Timothy Potter thelabd...@gmail.com wrote:

 Update to this ... did some code scanning and it looks like the backup
 status is available via the details command, e.g.

 <lst name="backup">
   <str name="startTime">Fri Apr 12 17:53:17 UTC 2013</str>
   <int name="fileCount">120</int>
   <str name="status">success</str>
   <str name="snapshotCompletedAt">Fri Apr 12 17:58:22 UTC 2013</str>
 </lst>

 So with a little polling of the details command from my backup script,
 I'm good to go. If anyone knows of a more direct way, let me know; otherwise
 I'm moving ahead with this approach.
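
A rough Python sketch of that polling loop, using only the standard library. It assumes the details response contains a backup list shaped like the snippet above, and it treats the presence of snapshotCompletedAt as the "done" signal; the function names, the poll interval, and the timeout are made up for illustration. An entry left over from an earlier backup would also satisfy the check, so a real script would probably compare startTime against the moment it issued the backup command.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET


def parse_backup_status(details_xml):
    """Return the entries of <lst name="backup"> as a dict, or {} if absent."""
    root = ET.fromstring(details_xml)
    for lst in root.iter("lst"):
        if lst.get("name") == "backup":
            return {child.get("name"): (child.text or "") for child in lst}
    return {}


def wait_for_backup(details_url, poll_seconds=30, timeout_seconds=3600,
                    fetch=None):
    """Poll the details command until the backup reports a completion time."""
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as resp:
                return resp.read().decode("utf-8")
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = parse_backup_status(fetch(details_url))
        if "snapshotCompletedAt" in status:
            return status  # caller should still check status["status"]
        if time.monotonic() >= deadline:
            raise TimeoutError("backup did not complete in time")
        time.sleep(poll_seconds)


# Hypothetical usage, with the same placeholder host as the backup URL:
#   info = wait_for_backup(
#       "http://master_host:port/solr/replication?command=details")
#   if info.get("status") == "success":
#       ...  # copy the snapshot off-site
```

The fetch parameter exists only so the loop can be exercised without a running Solr instance.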

 Cheers,
 Tim


 On Fri, Apr 12, 2013 at 9:31 AM, Timothy Potter thelabd...@gmail.com
 wrote:

  Hi,
 
  I'd like to use the backup command to create a backup of each shard
  leader's index periodically. This is for disaster recovery in case our
  data center goes offline.
 
  We use SolrCloud leader/replica for day-to-day fault-tolerance and it
  works great.
 
  The backup command (
  http://master_host:port/solr/replication?command=backup) works just fine
  but it returns immediately while the actual backup creation runs in the
  background on the shard leader.
 
  Is there any way to know when the actual backup is complete? I need that
  hook to then move the backup to another storage device outside of our
  data center, e.g. S3.
 
  What are others doing for this type of backup process?
 
  Thanks in advance.
  Tim
 



Re: How can I set configuration options?

2013-04-09 Thread Nate Fox
In Ubuntu, I've added it to /etc/default/tomcat7 in the JAVA_OPTS options.

For example, I have:
JAVA_OPTS="-Djava.awt.headless=true -Xmx2048m -XX:+UseConcMarkSweepGC"
JAVA_OPTS="${JAVA_OPTS} -DnumShards=2 -Djetty.port=8080 -DzkHost=zookeeper01.dev.:2181 -Dbootstrap_conf=true"






On Tue, Apr 9, 2013 at 8:55 AM, Edd Grant e...@eddgrant.com wrote:

 Hi all,

 I have been working through the examples on the SolrCloud page:
 http://wiki.apache.org/solr/SolrCloud

 I am now at the point where, rather than firing up Solr through start.jar,
 I'm deploying the Solr war in to Tomcat instances. Taking the following
 command as an example:

 java -Dbootstrap_confdir=./solr/collection1/conf
 -Dcollection.configName=myconf -DzkRun
 -DzkHost=localhost:9983,localhost:8574,localhost:9900 -DnumShards=2
 -jar start.jar

 I can't figure out from the documentation how/ where I set the above
 properties when deploying Solr as a war file. I initially thought these
 might be configurable through solr.xml but can't find anything in the
 documentation to support this.

 Most grateful for any pointers here.

 Cheers,

 Edd
 --
 Web: http://www.eddgrant.com
 Email: e...@eddgrant.com
 Mobile: +44 (0) 7861 394 543



Re: Loadtesting solr/tomcat7 and tomcat stops responding entirely

2013-03-27 Thread Nate Fox
Update: issue resolved!
Cranking up the maxThreads did the trick. Default is 200. I went with 2500
for grins and giggles and things work great. Now, even if I overwhelm the
box with too many requests, when the requests back off the box continues to
respond. And when I slam the server after it's been restarted (without
having warmup queries), it acts as I wanted: queries are slow to respond
(upwards of 30s) for the first couple minutes then they start to all be
under 25ms and normalize at a very fast pace (obviously as the cache is
warmed).
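
A back-of-the-envelope check of why the default thread limit was exhausted, written as a small Python sketch. It applies Little's law (average requests in flight = arrival rate x latency), which is my own framing rather than anything stated in this thread, and it is only a rough guide because a cold server is not in steady state. The fanout=2 factor is an assumption that models the doubling of requests on the receiving node that Hoss describes in his reply quoted below; the 30 s and 25 ms latencies and the default of 200 come from the message above.

```python
def in_flight(queries_per_minute, latency_seconds, fanout=1):
    """Average concurrent requests: arrival rate x latency (Little's law).

    fanout=2 is an assumption: the node receiving a query also serves
    a sub-request for its own shard.
    """
    return queries_per_minute / 60.0 * latency_seconds * fanout


MAX_THREADS_DEFAULT = 200  # Tomcat default mentioned above

for label, latency in (("cold, ~30 s", 30.0), ("warm, ~25 ms", 0.025)):
    load = in_flight(4000, latency, fanout=2)
    verdict = "exceeds" if load > MAX_THREADS_DEFAULT else "fits within"
    print(f"{label}: ~{load:.1f} in flight, "
          f"{verdict} {MAX_THREADS_DEFAULT} threads")
```

Under these assumptions a cold node at 4000 queries/min needs far more than 200 threads, while a warm one needs only a handful, which is consistent with the ramp-up behaviour described in this thread.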

Christopher, I could have sworn I tried upping acceptCount, maxConnections
and maxThreads in my testing, but with your prodding I tried it again - and
that was the solution.

I have a couple quick followup questions:
- What is the downside of setting maxThreads, acceptCount and
maxConnections really high? Obviously the defaults are there for a reason - I'd
like to know what the reasoning is.
- Any reason I shouldn't use Tomcat? I just went with it because I figured
it was extremely mature and was easy to use with apt-get :)

I'll probably toy with the APR as suggested by Michael, as I like the idea
of a non-blocking connector.








On Tue, Mar 26, 2013 at 5:56 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:


 : * When I set solrmeter to run 4000 queries/min, it will handle a few
 : hundred queries and then tomcat will stop responding completely to requests
 : (even though according to lsof -i it is still listening and the java
 : process is still running).

 Have you tried using jstack to generate a thread dump of the
 server to see what it's doing?

 : * When I set solrmeter to run 1000 queries/min it runs fine. I can stop
 : solrmeter after a couple of minutes at that pace and then run at 4000/min
 : without issue.
 :
 : It's as if it needs a ramp up time? Also, I noticed (regardless of ramp up)
 : that my setup cannot handle 8000/min. The reaction at 8k/min is the same as
 : if I were to run 4k/min without the ramp up. Of note, only the shard that
 : solrmeter is pointed to stops responding. The other shard hums along
 : without incident.

 Just to clarify: you're running a 2 node SolrCloud cluster, where each
 node contains a unique shard, and pointing solrmeter at a single node for
 the queries -- correct?

 Here's my hunch: you are probably hitting the limit of the number of
 concurrent connections tomcat will allow (whatever it may be configured
 to in your setup).

 In the 8000/min case, you are probably maxing out that limit with direct
 connections you issue from solrmeter to that single node.

 In the 4000/min case, each request you issue causes that single node to
 fire off multiple requests to each shard, and since each shard exists on
 only one node, you are guaranteeing that you double the number of
 concurrent requests hitting that first node.

 In the case where you start w/ 1000/min, and then later ramp up to
 4000/min, you are probably causing enough of the queries to be warmed up
 that they are in the caches on both nodes, so they can be served really
 fast and return their results before you reach that max number of
 concurrent connections after you ramp up.

 I'm no tomcat expert, but skimming the docs, you may want to look at
 settings like acceptCount, maxConnections, maxThreads, etc...

 -Hoss



Loadtesting solr/tomcat7 and tomcat stops responding entirely

2013-03-26 Thread Nate Fox
I'm new to solr and I'm load testing our setup to see what we can handle.
I'm using solrmeter and my problem is a bit odd:
* When I set solrmeter to run 4000 queries/min, it will handle a few
hundred queries and then tomcat will stop responding completely to requests
(even though according to lsof -i it is still listening and the java
process is still running).
* When I set solrmeter to run 1000 queries/min it runs fine. I can stop
solrmeter after a couple of  minutes at that pace and then run at 4000/min
without issue.

It's as if it needs a ramp up time? Also, I noticed (regardless of ramp up)
that my setup cannot handle 8000/min. The reaction at 8k/min is the same as
if I were to run 4k/min without the ramp up. Of note, only the shard that
solrmeter is pointed to stops responding. The other shard hums along
without incident.

Setup (everything in AWS):
- 2x m1.large (7.5Gb RAM) running tomcat7 + solr 4.2.0
(open-jdk-7-headless) : Ubuntu 12.04
- 1x m1.micro running zookeeper 3.4.5 : Ubuntu 12.04
I have ~30k documents in each node (~300Mb on each node)

The vast majority of my solr/tomcat7 config is default from ubuntu's
packages/solr's example dir. Here are the configs and the end of the
catalina.out file: https://gist.github.com/anonymous/ef8fa79ecc1673d11bc0

My main question is twofold:
1. Is this normal behavior for tomcat (to just stop responding completely)
when it gets overwhelmed? And is the only option to restart it? I guess I
don't know what it looks like when tomcat/solr can't keep up.
2. Why does it cope better when I give it a lower number of queries and
then ramp it up? It concerns me that if I have to restart a server in the
cluster and it gets thrown into the pool of machines, things will blow
up.

As an aside, does this seem like a normal amount of queries (~4k/min) that
this kind of environment should be able to handle?


Re: Loadtesting solr/tomcat7 and tomcat stops responding entirely

2013-03-26 Thread Nate Fox
I was wondering if the warmup stuff was one of the culprits (we don't have
warmups at all - the configs are pretty stock).
As for the system, it seems capable of quite a bit more: memory usage is
~30%, jvm-memory (from the dashboard) is very low (~220Mb out of 3Gb) and
load below 1.00.

The seed data and queries were put together by one of our developers. I've
put all the solrmeter files here:
https://gist.github.com/natefox/ee5cef3d4fbbc73e9bce
Unfortunately I'm quite new to solr (and tomcat) so I'm not entirely sure
which file does which specifically.

Does the system's reaction to a 'fast load' without a warmup sound normal?
I would have expected the first couple hundred queries to be very slow
(500ms) and then the system catch up after a while. But it just dies very
quickly and never recovers.

I'll check out your SPM - I've seen it mentioned before. Thanks!






On Tue, Mar 26, 2013 at 11:12 AM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Hi,

 In short, certain data structures need to load from index in the
 beginning, (for sorting and faceting) caches need to warm up, JVM
 needs to warm up, etc., so going slowly in the beginning makes sense.
 Why things die after that is a different Q.  Maybe it OOMs?  Maybe
 queries are very complex?  What do your queries look like?  I see
 newrelic.jar in the command-line.  May want to try SPM for Solr, it
 has better Solr metrics.

 Otis
 --
 Solr & ElasticSearch Support
 http://sematext.com/








Re: Loadtesting solr/tomcat7 and tomcat stops responding entirely

2013-03-26 Thread Nate Fox
We're not using ELB and I have no idea which connector I'm using - I'm
guessing whatever is default (I'm a total noob). This is from my server.xml:
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="6"
   URIEncoding="UTF-8" redirectPort="8443" />






On Tue, Mar 26, 2013 at 1:02 PM, Michael Della Bitta 
michael.della.bi...@appinions.com wrote:

 Nate,

 We just cleared up a problem similar to this by ditching Elastic Load
 Balancer and switching over to the APR connector in Tomcat. Are you
 using either of those?

 Michael Della Bitta

 


 On Tue, Mar 26, 2013 at 2:58 PM, Otis Gospodnetic
 otis.gospodne...@gmail.com wrote:
  Hi Nate,
 
  Try adding some warmup queries and making sure the setting for using
  the cold searcher in solrconfig.xml is set to false.  Your warmup
  queries should use facets and sorting if your normal queries use them.
   In SPM you'll actually see how much time warming up takes, so you'll
  get a better idea of the cost of that (when you don't do it).
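
Separately from warmup queries configured in solrconfig.xml, one way to act on this advice after a restart is to replay a few representative queries against the node before it rejoins the pool. This is a different technique from what is suggested above, offered only as a sketch in Python. It just builds the request URLs; the field names (category, price), the query string, and the base URL are made-up placeholders, while q, rows, sort, facet and facet.field are standard Solr select parameters.

```python
from urllib.parse import urlencode


def warmup_urls(base_url, queries, facet_fields=(), sort=None):
    """Build /select URLs that use the same facets and sort as real traffic."""
    urls = []
    for q in queries:
        params = [("q", q), ("rows", "10")]
        if sort:
            params.append(("sort", sort))
        if facet_fields:
            params.append(("facet", "true"))
            # One facet.field parameter per field, as Solr expects.
            params.extend(("facet.field", field) for field in facet_fields)
        urls.append(f"{base_url}/select?{urlencode(params)}")
    return urls


# Placeholder values; substitute queries taken from real traffic.
for url in warmup_urls("http://localhost:8080/solr/collection1",
                       ["*:*"], facet_fields=["category"], sort="price asc"):
    print(url)
```

Fetching each URL once (for example with urllib.request.urlopen) before adding the node to the pool would populate the same caches that live queries depend on.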
 
  Otis
  --
  Solr & ElasticSearch Support
  http://sematext.com/
 
 
 
 
 