Dietrich,

I don't think there are established practices out in the open (yet).  You could 
design your application with a site(s) -> shard mapping and then, knowing which 
sites are involved in a query, search only the relevant shards.  That would be 
efficient, but it would require careful management on your part.
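
Very roughly, and only as a sketch (the "site" field, host names, and the 
mapping below are made up), something along these lines, using the 
distributed-search "shards" parameter from SOLR-303:

# Rough sketch, not a tested implementation.  Host names, the
# SITE_TO_SHARD mapping, and the "site" field are all invented here.
from urllib.parse import urlencode

SITE_TO_SHARD = {
    "site-a.example.com": "shard1.internal:8983/solr",
    "site-b.example.com": "shard1.internal:8983/solr",
    "site-c.example.com": "shard2.internal:8983/solr",
}

def search_url(query, sites):
    # Only the shards that actually hold the requested sites take part
    # in the distributed query (SOLR-303 "shards" parameter).
    shards = sorted({SITE_TO_SHARD[s] for s in sites})
    # A filter query keeps results from a shared shard limited to the
    # requested sites.
    fq = "site:(" + " OR ".join('"%s"' % s for s in sites) + ")"
    params = {"q": query, "fq": fq, "shards": ",".join(shards)}
    return "http://any-node.internal:8983/solr/select?" + urlencode(params)

print(search_url("apache solr", ["site-a.example.com", "site-c.example.com"]))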

Putting everything in a single index would just not work with "normal" 
machines, I think.
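
For comparison, the single-index route would boil down to just a filter 
query per request, something like the (again made up) sketch below -- but at 
275 sites * up to 200,000 docs each, the size of that one index is the 
problem, not the query:

# Sketch of the single-index alternative: one big index, every document
# tagged with a "site" field (invented for the example), restricted per
# query with a filter query.
from urllib.parse import urlencode

def single_index_url(query, sites):
    fq = "site:(" + " OR ".join('"%s"' % s for s in sites) + ")"
    return "http://solr.internal:8983/solr/select?" + urlencode({"q": query, "fq": fq})

print(single_index_url("apache solr", ["site-a.example.com", "site-b.example.com"]))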

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Dietrich <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, March 26, 2008 10:47:55 AM
Subject: Re: How to index multiple sites with option of combining results in 
search

I understand that, and that makes sense. But, coming back to the
orginal question:
>  >  When performing searches,
>  >  I need to be able to search against any combination of sites.
>  >  Does anybody have suggestions as to what the best practice for a scenario
>  >  like that would be, considering both indexing and querying
>  >  performance? Put everything into one index and filter when performing
>  >  the queries, or create a separate index for each one and combine
>  >  results when performing the query?

Are there any established best practices for that?

-ds

On Tue, Mar 25, 2008 at 11:25 PM, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Dietrich,
>
>  I pointed to SOLR-303 because 275 * 200,000 looks like too big a number 
> for a single machine to handle.
>
>
>  Otis
>  --
>  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>  ----- Original Message ----
>  From: Dietrich <[EMAIL PROTECTED]>
>  To: solr-user@lucene.apache.org
>
>
> Sent: Tuesday, March 25, 2008 7:00:17 PM
>  Subject: Re: How to index multiple sites with option of combining results in 
> search
>
>  On Tue, Mar 25, 2008 at 6:12 PM, Otis Gospodnetic
>  <[EMAIL PROTECTED]> wrote:
>  > Sounds like SOLR-303 is a must for you.
>  Why? I see the benefits of using a distributed architecture in
>  general, but why do you recommend it specifically for this scenario?
>  > Have you looked at Nutch?
>  I don't want to (or need to) use a crawler. I am using a crawler-based
>  system now, and it does not offer the flexibility I need when it comes
>  to custom schemas and faceting.
>  >
>  >  Otis
>  >  --
>  >  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>  >
>  >
>  >
>  >  ----- Original Message ----
>  >  From: Dietrich <[EMAIL PROTECTED]>
>  >  To: solr-user@lucene.apache.org
>  >  Sent: Tuesday, March 25, 2008 4:15:23 PM
>  >  Subject: How to index multiple sites with option of combining results in 
> search
>  >
>  >  I am planning to index 275+ different sites with Solr, each of which
>  >  might have anywhere up to 200 000 documents. When performing searches,
>  >  I need to be able to search against any combination of sites.
>  >  Does anybody have suggestions as to what the best practice for a scenario
>  >  like that would be, considering both indexing and querying
>  >  performance? Put everything into one index and filter when performing
>  >  the queries, or create a separate index for each one and combine
>  >  results when performing the query?