There is already a patch available to address that shortcoming in
distributed search:
http://issues.apache.org/jira/browse/SOLR-1632
On Feb 11, 2010, at 6:56 AM, abhishes wrote:
Thanks, really useful article.
I am wondering about this statement in the article:
"Keep in mind that Solr does not calculate universal term/doc frequencies.
At a large scale, its not likely to matter that tf/idf is calculated at the
shard level - however, if your collection is heavily skewed in its
distribution across servers, you might take issue with the relevance
results. Its probably best to randomly distribute documents to your shards"
So if no universal tf/idf is kept, how does Solr determine the rank of two
documents that came from different shards in a distributed search query?
Regards,
Abhishek
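To make the quoted caveat concrete: in a distributed query each shard scores its own documents using only its local statistics, and the merged list is ordered by those scores. The sketch below is only an illustration, not Solr code. It assumes the classic Lucene-style idf formula, 1 + ln(numDocs / (docFreq + 1)), and the shard names and counts are made up to show a heavily skewed term.

```python
import math


def idf(doc_freq, num_docs):
    # Classic Lucene-style idf (assumed here): 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical statistics for one term:
# shard name -> (documents containing the term, total documents in the shard)
shards = {"shard_a": (900, 1000), "shard_b": (9, 1000)}


def per_shard_idf(stats):
    # What each shard computes on its own, with no knowledge of the others.
    return {name: idf(df, n) for name, (df, n) in stats.items()}


def global_idf(stats):
    # What a single combined index (or a distributed-idf scheme) would compute.
    total_df = sum(df for df, _ in stats.values())
    total_n = sum(n for _, n in stats.values())
    return idf(total_df, total_n)


local = per_shard_idf(shards)
combined = global_idf(shards)

for name, value in local.items():
    print(f"{name}: local idf = {value:.3f}")
print(f"global idf = {combined:.3f}")
```

With these made-up numbers the same term is weighted far more heavily on shard_b than on shard_a, so a match from shard_b can outrank an equally good match from shard_a. Distributing documents randomly keeps the per-shard statistics close to the global ones, which is why the article recommends it.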
Juan Pedro Danculovic-2 wrote:
To scale Solr, take a look at this article:
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
Juan Pedro Danculovic
CTO - www.linebee.com
On Thu, Feb 11, 2010 at 4:12 AM, abhishes <abhis...@gmail.com> wrote:
Suppose I am indexing a very large data set (5 billion rows in a database).
Now I want to use the Solr Core feature to split the index into manageable
chunks.
However, I have two questions:
1. Can cores reside on different physical servers?
2. When a query comes in, will it be answered by the index in one core, or
   will it be sent to all the cores?
My desire is to have a system which from the outside appears as a single
large index, but inside is multiple small indexes running on different
hardware machines.
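For reference, the setup described above is what Solr's distributed search provides: a request sent to any one instance can carry a `shards` parameter listing the indexes to query, and that instance fans the query out and merges the results. A minimal sketch of building such a request URL follows. The host names, port, and query are placeholders, and the helper function is hypothetical, not part of any Solr client.

```python
from urllib.parse import urlencode


def distributed_query_url(coordinator, shards, query, rows=10):
    # `coordinator` is the instance that receives the request and merges results.
    # `shards` lists every index to search, each as host:port/path (no scheme).
    params = {"q": query, "rows": rows, "shards": ",".join(shards)}
    return f"http://{coordinator}/select?{urlencode(params)}"


# Placeholder hosts: two Solr instances on different machines.
url = distributed_query_url(
    "host1:8983/solr",
    ["host1:8983/solr", "host2:8983/solr"],
    "title:solr",
)
print(url)
```

The client only ever talks to one URL, so from the outside the collection looks like a single index even though each shard lives on its own hardware.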
--
View this message in context:
http://old.nabble.com/Question-on-Solr-Scalability-tp27543068p27543068.html
Sent from the Solr - User mailing list archive at Nabble.com.