Yonik Seeley wrote:
On Fri, Sep 5, 2008 at 10:45 AM, Shalin Shekhar Mangar
<[EMAIL PROTECTED]> wrote:
Cloud computing clusters with hundreds of servers are a niche area
limited to a few (and they already have their systems in place without Solr).
Any integration effort must keep our users and their needs in mind.
Most Solr deployments have around 1-25 servers.
We all scratch our own itch in open-source... good, practical
scalability to high levels is an interest of mine :-)
Many will choose their solution based on its ability to scale to
their most optimistic projections... so they may only use 10 or 20
servers, but if it can't scale to 100 then they might start with
something that easily can. And I think this work would greatly
benefit those with smaller clusters also.
-Yonik
I can't emphasize enough how much I agree with this. People often end up
stuck with a search engine for 4 or 5 years, easily. You don't want to
switch too often. In this day and age, many of the users we might want to
target Solr to (the 'enterprise') are going to want to be able to do
capacity planning over the next 4 or 5 years. The rate at which some of
these players are growing, or will grow, seems to mean we want to be
shooting for 100 servers with Solr, no problem. To put Solr in the class
of FAST et al. (something I am personally interested in), we have to have
almost arbitrary ease in scaling.
When FAST comes in, they give you a formula telling you how many servers
with how much RAM you need to run a collection of size n. Their formula
probably breaks down, but seems to at least hold into the billions. A
company looking at FAST can know they can throw servers at their
great-to-have, super-growing problem at almost any realistic scale. I
think a company should know they can do the same thing with Solr. Part of
my itch anyway.
My guess is that 10-15 million docs per decent machine is optimal today...so
to get to a billion docs that's...or 500 million...
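
A rough sketch of the arithmetic behind that guess, assuming documents are
spread evenly across machines and ignoring replicas (both simplifications,
not something stated above). The function name servers_needed is made up
for illustration:

```python
def servers_needed(total_docs: int, docs_per_server: int) -> int:
    """Smallest server count that holds total_docs at docs_per_server each.

    Assumes an even spread of documents and no replication.
    """
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_docs // docs_per_server)


# The 10-15 million docs per machine range is the guess from the message.
for total in (500_000_000, 1_000_000_000):
    low = servers_needed(total, 15_000_000)   # optimistic end of the guess
    high = servers_needed(total, 10_000_000)  # conservative end of the guess
    print(f"{total:,} docs: roughly {low}-{high} servers")
```

Under those assumptions, 500 million docs comes out at roughly 34-50
machines and a billion docs at roughly 67-100, which lines up with the
100-server target mentioned earlier.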
It might be that few current users have those needs, but think of the
users we will have in the coming years or could have now...and how our
current users will benefit in ways they don't expect now...
Solr should rule the search server roost across the board given enough
time. Why not <g> ?
- Mark