Or, if you want to go with something older/more stable, go with BDB. :)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Marcus Herou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Thursday, June 12, 2008 3:17:52 PM
> Subject: Re: Num docs
>
> Cacti, Nagios you name it already in use :)
>
> Well I'm the CTO so the one really really interested in estimating perf.
>
> The id's come from a db initially and is later used for retrieval from a
> distributed on disk caching system which I have written.
> I'm in the process of moving from MySQL to HBase or Hypertable.
>
> /M
>
> On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic <
> [EMAIL PROTECTED]> wrote:
>
> > Marcus,
> >
> > It sounds like you may just want to use a good server monitoring package
> > that collects server data and prints out pretty charts. Then you can show
> > them to your IT/budget people when the charts start showing increased query
> > latency times, very little available RAM, swapping, high CPU usage and such.
> > Nagios, Ganglia, any of those things will do.
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> > > From: Marcus Herou
> > > To: solr-user@lucene.apache.org
> > > Sent: Tuesday, June 10, 2008 3:29:40 PM
> > > Subject: Re: Num docs
> > >
> > > Well guys you are right... Still I want to have a clue about how much
> > > each machine stores to predict when we need more machines (measure
> > > performance degradation per new document). But it's harder to collect
> > > that kind of data.
> > > It sure is doable no doubt and is a normal sharding "algo" for MySQL.
> > >
> > > The best approach I think is to have some bg threads run X number of
> > > queries and collect the response times, throw away the n lowest/highest
> > > response times and calc an avg time which is used for in sharding and
> > > query lb'ing.
> > >
> > > Little off topic but interesting....
> > > What would you guys say about a good correlation between the index size
> > > on disk (no stored text content) and available RAM and having good
> > > response times.
> > >
> > > How long is a rope would you perhaps say...but I think some rule of thumb
> > > could be established...
> > >
> > > One of the schemas of concern
> > >
> > > required="true" />
> > > required="true" />
> > > required="false" />
> > > stored="false" required="true" />
> > > required="true" />
> > > required="true" />
> > > required="false" />
> > > required="true" />
> > > required="true" />
> > > required="false" />
> > > required="false" multiValued="true"/>
> > > required="false" />
> > > required="false" />
> > > required="false" />
> > > required="false" />
> > >
> > > and a normal solr query (taken from the log):
> > > /select
> > >
> > > start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
> > >
> > > //Marcus
> > >
> > > On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Exactly. I think I mentioned this once before several months ago. One
> > > > can take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > > > performance numbers, etc. and come up with a number for each server's
> > > > overall capacity.
> > > >
> > > > As a matter of fact, I think this would be useful to have right in
> > > > Solr, primarily for use when allocating and sizing shards for
> > > > Distributed Search. JIRA enhancement/feature issue?
> > > >
> > > > Otis
> > > > --
> > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > >
> > > > ----- Original Message ----
> > > > > From: Alexander Ramos Jardim
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > > > Subject: Re: Num docs
> > > > >
> > > > > I even think that such a decision should be based on the overall
> > > > > machine performance at a given time, and not the index size. Unless
> > > > > you are talking solely about HD space and not having any performance
> > > > > issues.
> > > > >
> > > > > 2008/6/7 Otis Gospodnetic :
> > > > >
> > > > > > Marcus,
> > > > > >
> > > > > > For that you can rely on du, vmstat, iostat, top and such, too. :)
> > > > > >
> > > > > > Otis
> > > > > > --
> > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > >
> > > > > > ----- Original Message ----
> > > > > > > From: Marcus Herou
> > > > > > > To: solr-user@lucene.apache.org
> > > > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > > > Subject: Re: Num docs
> > > > > > >
> > > > > > > Thanks, I wanna ask the indices how much more each shard can
> > > > > > > handle before they're considered "full" and scream for a budget
> > > > > > > to get a new machine :)
> > > > > > >
> > > > > > > /M
> > > > > > >
> > > > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Marcus, check out the Luke request handler. You can get it
> > > > > > > > from its output. It may also be possible to get *just* that
> > > > > > > > number, but I'm not looking at docs/code right now to know
> > > > > > > > for sure.
> > > > > > > >
> > > > > > > > Otis
> > > > > > > > --
> > > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > > > >
> > > > > > > > ----- Original Message ----
> > > > > > > > > From: Marcus Herou
> > > > > > > > > To: solr-user@lucene.apache.org
> > > > > > > > > Sent: Saturday, June 7, 2008 5:09:20 AM
> > > > > > > > > Subject: Num docs
> > > > > > > > >
> > > > > > > > > Hi.
> > > > > > > > >
> > > > > > > > > Is there a way of retrieve IndexWriter.numDocs() in SOLR ?
> > > > > > > > >
> > > > > > > > > Kindly
> > > > > > > > >
> > > > > > > > > //Marcus
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > > > +46702561312
> > > > > > > > > [EMAIL PROTECTED]
> > > > > > > > > http://www.tailsweep.com/
> > > > > > > > > http://blogg.tailsweep.com/
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > +46702561312
> > > > > > > [EMAIL PROTECTED]
> > > > > > > http://www.tailsweep.com/
> > > > > > > http://blogg.tailsweep.com/
> > > > > >
> > > > >
> > > > > --
> > > > > Alexander Ramos Jardim
> > > >
> > >
> > > --
> > > Marcus Herou CTO and co-founder Tailsweep AB
> > > +46702561312
> > > [EMAIL PROTECTED]
> > > http://www.tailsweep.com/
> > > http://blogg.tailsweep.com/
> >
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> [EMAIL PROTECTED]
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
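
For anyone who wants to try the probing idea Marcus describes in the thread (run X queries, throw away the n lowest and n highest response times, average the rest), here is a minimal Python sketch. It is not code from anyone on the list: the names trimmed_mean, probe_shard and run_query are made up for illustration, and how the query actually gets sent to Solr is left to whatever callable you pass in.

```python
import time
from typing import Callable, List


def trimmed_mean(samples: List[float], n: int) -> float:
    """Average the samples after dropping the n lowest and n highest values."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if len(samples) <= 2 * n:
        raise ValueError("need more than 2*n samples")
    ordered = sorted(samples)
    kept = ordered[n:len(ordered) - n]
    return sum(kept) / len(kept)


def probe_shard(run_query: Callable[[], object], queries: int, n: int) -> float:
    """Time `queries` calls of run_query; return the trimmed mean in seconds.

    run_query is a hypothetical zero-argument callable that issues one
    search request against the shard being measured.
    """
    timings = []
    for _ in range(queries):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return trimmed_mean(timings, n)
```

The number this returns per shard could then feed the sharding and query load-balancing decisions mentioned above. Running it from a background thread on a schedule is left to the caller.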