searching a subset of SOLR index

2011-07-05 Thread Jame Vaalet
Hi,
Let's say I have 10^10 documents in an index, with the unique id being a 
document id assigned to each of them from 1 to 10^10.
Now I want to search for a particular query string in a subset of these 
documents, say document ids 100 to 1000.

The question is: will SOLR be able to search just this set of documents 
rather than the entire index? If yes, what should the query be to limit the 
search to this subset?

Regards,
JAME VAALET
Software Developer
EXT :8108
Capital IQ



Re: searching a subset of SOLR index

2011-07-05 Thread Shashi Kant
Range query
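
For illustration, here is a minimal sketch of what such a range query could look like when built from Python. The Solr URL, the `text` field name and the helper name `range_restricted_query` are assumptions made for this example, not details from the thread. It also assumes `id` is indexed as a numeric field; on a plain string field the range would be compared lexicographically.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port and path to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def range_restricted_query(text, first_id, last_id, field="text", id_field="id"):
    """Build a select URL that matches `text` only within an inclusive id range."""
    # Square brackets make both ends of the range inclusive.
    q = f"{field}:({text}) AND {id_field}:[{first_id} TO {last_id}]"
    return SOLR_SELECT_URL + "?" + urlencode({"q": q})


if __name__ == "__main__":
    print(range_restricted_query("capital", 100, 1000))
```

The range clause is simply ANDed with the user's query, so it narrows the result set but still runs against the whole index.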






RE: searching a subset of SOLR index

2011-07-05 Thread Jame Vaalet
Thanks.
But does this range query limit the universe only logically, or does it have 
a mechanism to limit it physically as well? Do we save any time by using the 
range query?

Regards,
JAME VAALET






RE: searching a subset of SOLR index

2011-07-05 Thread Pierre GOSSE
The limit will always be logical if you have all documents in the same index. 
But filters are very efficient when working with a subset of your index, 
especially if you reuse the same filter for many queries, since there is a 
cache.

If your subsets are always the same, maybe you could use shards. But we would 
need to know more about what you intend to do in order to point you to an 
adequate solution.

Pierre
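
As a sketch of the filter approach described above: Solr's `fq` (filter query) parameter restricts the result set separately from the main `q`, and the document set for each distinct `fq` value is kept in the filter cache, so repeating the same filter across many queries is cheap. The URL and the helper name `filtered_query` below are assumptions for illustration, and `id` is again assumed to be a numeric field.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port and path to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def filtered_query(user_query, first_id, last_id, id_field="id"):
    """Build a select URL whose subset restriction lives in `fq`, not in `q`."""
    params = [
        ("q", user_query),
        # Keeping the filter string identical across requests lets Solr
        # reuse the cached document set for it.
        ("fq", f"{id_field}:[{first_id} TO {last_id}]"),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    for words in ("capital markets", "quarterly earnings"):
        print(filtered_query(words, 100, 1000))
```

Both URLs carry the same `fq` value, which is what allows the cache to help; only `q` changes between requests.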





RE: searching a subset of SOLR index

2011-07-05 Thread Jame Vaalet
I have two applications:

1. website
The website will let any user search the document repository, and the set 
they search is known as "website presentable".
2. windows service
The windows service will search all the documents in the repository for a 
fixed set of keywords and store the results in a database. This set is the 
universal set of documents in the doc repository, including the website 
presentable ones.


The website is a high-priority app that should work smoothly without any 
interference, whereas the windows service should run continuously all day 
long to save results from incoming docs.
The problem is that the website set is predefined, and I don't want the 
windows service's requests to SOLR to slow down the website's requests.

Suppose I segregate the website presentable docs into one core and the rest 
into a different core: will that solve the problem?
I have also read about using multiple ports to listen for requests from 
different apps. Can this be used?



Regards,
JAME VAALET






RE: searching a subset of SOLR index

2011-07-05 Thread Pierre GOSSE
From what you tell us, I guess a separate index for website docs would be 
best. If you fear that requests from the windows service would cripple your 
website's performance, why not have a totally separate index on another 
server, and index your website documents in both indexes?

Pierre





RE: searching a subset of SOLR index

2011-07-05 Thread Jame Vaalet
But in case the website docs make up around 50% of all the docs, why 
recreate the indexes? Don't you think that is redundant?
Can two web apps (Solr instances) share a single index on disk and search it 
without interfering with each other?


Regards,
JAME VAALET
Software Developer 
EXT :8108
Capital IQ






Re: searching a subset of SOLR index

2011-07-05 Thread Erik Hatcher
I wouldn't share the same index across two Solr webapps, as they could step 
on each other's toes.

In this scenario, I think having two Solr instances replicating from the same 
master is the way to go, to allow you to scale your load from each application 
separately.

Erik
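
One way to picture the client side of this setup is a small routing table that sends each application to its own replica. This is only a sketch: the host names and the helper `select_url` are hypothetical, and the replication between the master and the two instances is assumed to be configured separately in Solr itself.

```python
from urllib.parse import urlencode

# Hypothetical hosts: two Solr instances replicating from the same master,
# one reserved for each application.
SEARCH_HOSTS = {
    "website": "http://solr-web-slave:8983/solr",
    "windows_service": "http://solr-batch-slave:8983/solr",
}


def select_url(application, query):
    """Return the select URL on the instance dedicated to `application`."""
    base = SEARCH_HOSTS[application]
    return f"{base}/select?" + urlencode({"q": query})


if __name__ == "__main__":
    print(select_url("website", "capital"))
    print(select_url("windows_service", "capital"))
```

Because each application only ever talks to its own instance, heavy batch traffic from the windows service does not compete with website queries for the same caches or threads.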






RE: searching a subset of SOLR index

2011-07-05 Thread Pierre GOSSE
It is redundant. You have to balance the cost of that redundancy against the 
performance cost of having your web index queried by your windows service. If 
your windows service is not too aggressive in its requests, go for shards.

Pierre
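
A rough sketch of the shards option, assuming the website presentable docs live in one core and the remaining docs in another (the core names `website` and `other` and both helper names are made up for this example). The website queries only its own core, while the windows service asks Solr to search both through the `shards` request parameter of distributed search, so no document is indexed twice.

```python
from urllib.parse import urlencode

# Hypothetical layout: one core for website docs, one for everything else.
# Shard addresses are written as host:port/path, without the scheme.
WEBSITE_CORE = "localhost:8983/solr/website"
OTHER_CORE = "localhost:8983/solr/other"


def website_url(query):
    """The website searches only its own core."""
    return f"http://{WEBSITE_CORE}/select?" + urlencode({"q": query})


def service_url(query):
    """The windows service searches both cores in one distributed request."""
    params = {"q": query, "shards": f"{WEBSITE_CORE},{OTHER_CORE}"}
    return f"http://{OTHER_CORE}/select?" + urlencode(params)


if __name__ == "__main__":
    print(website_url("capital"))
    print(service_url("capital"))
```

The trade-off mentioned above still applies: every distributed request from the windows service also hits the website core, so this only works well if that service is not too aggressive.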
