Or, if you want to go with something older/more stable, go with BDB. :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Marcus Herou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Thursday, June 12, 2008 3:17:52 PM
> Subject: Re: Num docs
> 
> Cacti, Nagios you name it already in use :)
> 
> Well I'm the CTO so the one really really interested in estimating perf.
> 
> The ids come from a db initially and are later used for retrieval from a
> distributed on-disk caching system which I have written.
> I'm in the process of moving from MySQL to HBase or Hypertable.
> 
> /M
> 
> On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic <
> [EMAIL PROTECTED]> wrote:
> 
> > Marcus,
> >
> > It sounds like you may just want to use a good server monitoring package
> > that collects server data and prints out pretty charts.  Then you can show
> > them to your IT/budget people when the charts start showing increased query
> > latency times, very little available RAM, swapping, high CPU usage and such.
> >  Nagios, Ganglia, any of those things will do.
> >
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> > ----- Original Message ----
> > > From: Marcus Herou 
> > > To: solr-user@lucene.apache.org
> > > Sent: Tuesday, June 10, 2008 3:29:40 PM
> > > Subject: Re: Num docs
> > >
> > > > Well guys you are right... Still I want to have a clue about how much each
> > > > machine stores to predict when we need more machines (measure performance
> > > > degradation per new document). But it's harder to collect that kind of data.
> > > > It sure is doable no doubt and is a normal sharding "algo" for MySQL.
> > > >
> > > > The best approach I think is to have some bg threads run X number of queries
> > > > and collect the response times, throw away the n lowest/highest response
> > > > times and calc an avg time which is used for sharding and query lb'ing.
> > > >
> > > > Little off topic but interesting....
> > > > What would you guys say is a good correlation between the index size on
> > > > disk (no stored text content), available RAM, and good response times?
> > > >
> > > > How long is a rope would you perhaps say...but I think some rule of thumb
> > > > could be established...
> > >
> > > One of the schemas of concern
> > >
> > >
> > > required="true" />
> > >
> > > required="true" />
> > >
> > > required="false" />
> > >
> > > stored="false" required="true" />
> > >
> > > required="true" />
> > >
> > > required="true" />
> > >
> > > required="false" />
> > >
> > > required="true" />
> > >
> > > required="true" />
> > >
> > > required="false" />
> > >
> > > required="false" multiValued="true"/>
> > >
> > > required="false" />
> > >
> > > required="false" />
> > >
> > > required="false" />
> > >
> > > required="false" />
> > >
> > >
> > > and a normal solr query (taken from the log):
> > > > /select start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
> > >
> > >
> > > //Marcus
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > > Exactly.  I think I mentioned this once before several months ago.  One can
> > > > > take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > > > > performance numbers, etc. and come up with a number for each server's
> > > > > overall capacity.
> > > > >
> > > > > As a matter of fact, I think this would be useful to have right in Solr,
> > > > > primarily for use when allocating and sizing shards for Distributed Search.
> > > > > JIRA enhancement/feature issue?
> > > > >
> > > > > Otis
> > > > --
> > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > >
> > > >
> > > > ----- Original Message ----
> > > > > From: Alexander Ramos Jardim
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > > > Subject: Re: Num docs
> > > > >
> > > > > > I even think that such a decision should be based on the overall machine
> > > > > > performance at a given time, and not the index size. Unless you are talking
> > > > > > solely about HD space and not having any performance issues.
> > > > >
> > > > > 2008/6/7 Otis Gospodnetic :
> > > > >
> > > > > > Marcus,
> > > > > >
> > > > > >
> > > > > > For that you can rely on du, vmstat, iostat, top and such, too. :)
> > > > > >
> > > > > > Otis
> > > > > > --
> > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > >
> > > > > >
> > > > > > ----- Original Message ----
> > > > > > > From: Marcus Herou
> > > > > > > To: solr-user@lucene.apache.org
> > > > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > > > Subject: Re: Num docs
> > > > > > >
> > > > > > > > Thanks, I wanna ask the indices how much more each shard can handle before
> > > > > > > > they're considered "full" and scream for a budget to get a new machine :)
> > > > > > >
> > > > > > > /M
> > > > > > >
> > > > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic
> > > > > > > wrote:
> > > > > > >
> > > > > > > > > Marcus, check out the Luke request handler.  You can get it from its
> > > > > > > > > output.  It may also be possible to get *just* that number, but I'm not
> > > > > > > > > looking at docs/code right now to know for sure.
> > > > > > > >
> > > > > > > >  Otis
> > > > > > > > --
> > > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > > > >
> > > > > > > >
> > > > > > > > ----- Original Message ----
> > > > > > > > > From: Marcus Herou
> > > > > > > > > To: solr-user@lucene.apache.org
> > > > > > > > > Sent: Saturday, June 7, 2008 5:09:20 AM
> > > > > > > > > Subject: Num docs
> > > > > > > > >
> > > > > > > > > Hi.
> > > > > > > > >
> > > > > > > > > > Is there a way to retrieve IndexWriter.numDocs() in Solr?
> > > > > > > > >
> > > > > > > > > Kindly
> > > > > > > > >
> > > > > > > > > //Marcus
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > > > +46702561312
> > > > > > > > > [EMAIL PROTECTED]
> > > > > > > > > http://www.tailsweep.com/
> > > > > > > > > http://blogg.tailsweep.com/
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > +46702561312
> > > > > > > [EMAIL PROTECTED]
> > > > > > > http://www.tailsweep.com/
> > > > > > > http://blogg.tailsweep.com/
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Alexander Ramos Jardim
> > > >
> > > >
> > >
> > >
> > > --
> > > Marcus Herou CTO and co-founder Tailsweep AB
> > > +46702561312
> > > [EMAIL PROTECTED]
> > > http://www.tailsweep.com/
> > > http://blogg.tailsweep.com/
> >
> >
> 
> 
> -- 
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> [EMAIL PROTECTED]
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
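
For the original question in this thread (getting the document count out of Solr), the Luke request handler suggestion can be sketched roughly as below. This is only an illustration under assumptions: the base URL is hypothetical, the handler is assumed to be mapped at /admin/luke, and the JSON response is assumed to carry the count under index.numDocs. The helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def luke_url(base_url):
    """Build a Luke request handler URL (assumes the handler lives at /admin/luke)."""
    # numTerms=0 skips per-field top-term statistics to keep the response small;
    # wt=json asks Solr for a JSON response.
    query = urlencode({"numTerms": 0, "wt": "json"})
    return base_url.rstrip("/") + "/admin/luke?" + query


def num_docs_from_luke(payload):
    """Extract the document count from a parsed Luke response.

    Assumes the count is reported under payload["index"]["numDocs"].
    """
    return int(payload["index"]["numDocs"])


def fetch_num_docs(base_url, timeout=5.0):
    """Query a running Solr instance and return its document count."""
    with urlopen(luke_url(base_url), timeout=timeout) as response:
        return num_docs_from_luke(json.load(response))
```

Against a running instance this would be called as fetch_num_docs("http://localhost:8983/solr"), with the URL adjusted to the actual deployment. Polling it per shard gives the raw numbers, but as noted in the thread, the count alone does not say when a shard is "full".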

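
The response-time idea from the quoted June 10 message (run a batch of queries, throw away the n lowest and highest times, average the rest) is a trimmed mean. A minimal sketch, not anyone's actual implementation: run_query is a hypothetical callable that sends one query to a shard, and scheduling this in background threads is left out.

```python
import time


def trimmed_mean(samples, trim):
    """Average the samples after dropping the `trim` lowest and `trim` highest."""
    if trim < 0:
        raise ValueError("trim must be non-negative")
    if len(samples) <= 2 * trim:
        raise ValueError("need more than 2 * trim samples")
    ordered = sorted(samples)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)


def time_queries(run_query, queries):
    """Run each query once and return the elapsed wall-clock seconds per query.

    `run_query` is a hypothetical callable that executes one query against a shard.
    """
    timings = []
    for query in queries:
        started = time.perf_counter()
        run_query(query)
        timings.append(time.perf_counter() - started)
    return timings
```

Something like trimmed_mean(time_queries(run_query, queries), trim=2) would then give one number per shard that could be tracked over time or fed into load balancing. Dropping the extremes keeps a single cache miss or GC pause from skewing the average.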