Re: MatchAllDocsQuery in solr?

2006-11-21 Thread Walter Lewis

Walter Underwood wrote:

> I was thinking something similar, maybe _solr:all. At Infoseek, we
> hardcoded url:http to match all docs.

I suppose that different data would yield different responses, but a
space ( ) works on our data.


the other Walter


MatchAllDocsQuery in solr?

2006-11-21 Thread Tom

Is there a way to do a match all docs query in solr?

I mean is there something I can put in a solr URL that will get 
recognized by the SolrQueryParser as meaning a match all?


Why? Because I'm porting unit tests from our internal Lucene
container to Solr, and the tests usually run such a query upon
completion, to make sure the index is in the expected state (nothing
missing, nothing extra).
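
A minimal sketch of the kind of end-of-test check described above, assuming the test can collect the document IDs returned by a match-all query into a list. The function name index_diff and the sample IDs are hypothetical; nothing here is a Solr or Lucene API.

```python
def index_diff(expected_ids, actual_ids):
    """Return (missing, extra): IDs expected but absent, and IDs present but unexpected."""
    expected, actual = set(expected_ids), set(actual_ids)
    missing = sorted(expected - actual)
    extra = sorted(actual - expected)
    return missing, extra


# Hypothetical sample data: the index lacks doc "1" and holds a stray doc "4".
missing, extra = index_diff(["1", "2", "3"], ["2", "3", "4"])
print(missing, extra)
```

A test would then assert that both lists are empty.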


Yes, I can create a query that will match all my docs; there are a
few fields that have a relatively small range of values. I was just
looking for a standard way to do it first.


Thanks,

Tom




Re: MatchAllDocsQuery in solr?

2006-11-21 Thread Walter Underwood
On 11/21/06 3:19 PM, Chris Hostetter wrote:
>
> : > I've considered *:* but I haven't checked if the JavaCC grammar will
> : > allow that through or if it would need to be modified.
> :
> : I looked into it quick, and it looks like the grammar may need to be
> : modified (i.e., one can't just override a method of QueryParser to do
> : this).
>
> we could add this to the function parser, so _val_:ALL could return a
> MatchAllDocsQuery?

I was thinking something similar, maybe _solr:all. At Infoseek, we
hardcoded url:http to match all docs.

wunder
-- 
Walter Underwood
Search Guru, Netflix





Re: Minimum time between distributions

2006-11-21 Thread Chris Hostetter

: Are there any risks with reducing this window to every one or two
: minutes? With large caches could the autowarming take longer than one
: or two minutes? It isn't a business need to reduce the window but I'm
: just curious about the feasibility and risks.

you can definitely run into problems if autoWarming takes longer than your
snappuller interval, there's nothing in Solr right now that prevents
warming searchers from layering on top of each other, eating resources.
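
As a rough back-of-the-envelope illustration of the overlap described above: if a new searcher starts warming at every snappuller interval and each warm-up takes a fixed time, the number warming at once is bounded by the ratio of the two, rounded up. The function name and the example timings are assumptions for illustration, not measurements from any real index.

```python
def max_overlapping_warmers(warm_seconds, interval_seconds):
    """Upper bound on searchers warming at once, assuming a new snapshot
    arrives at every interval and each warm-up takes warm_seconds."""
    if warm_seconds <= 0:
        return 0
    # Integer ceiling division: ceil(warm_seconds / interval_seconds).
    return -(-warm_seconds // interval_seconds)


print(max_overlapping_warmers(45, 60))   # 1: warm-up fits inside the interval
print(max_overlapping_warmers(150, 60))  # 3: warm-ups pile up
```

Any result above 1 means warm-ups can stack, which is the resource problem in question.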

once upon a time Yonik and i discussed having a maxAutoWarm time option
on the caches, and they would give up if that much time had elapsed -- but
that approach is too simplistic to be of much use -- even if you configure
snappuller to run every minute, there are going to be lots of times when it
doesn't actually have anything new every minute.

: How often do other people run snappuller and snapinstaller?

most of the indexes i deal with are set to 5 minutes ... one that really
REALLY relies on the caches for facet counts is only every 15 ... i think we
have one index snappulling/installing every minute -- but there are almost
never new indexes, that interval is just so that on the rare occasions when
there is a new index, the data shows up ASAP.



-Hoss



Re: Minimum time between distributions

2006-11-21 Thread Yonik Seeley

On 11/21/06, Chris Hostetter wrote:

> once upon a time Yonik and i discussed having a maxAutoWarm time option
> on the caches, and they would give up if that much time had elapsed


At first I thought it would be easy... just autowarm until time is up.
But then I realized that there is a problem: one would want to warm
items from the most important to the least important if you don't know
if you are going to get to them all.  But doing them in that order
with an LRU cache means that they would end up in the wrong order (the
most recently used item would be at the tail of the list and first to
be ejected).

Even if we *did* have access to the internal list of the
LinkedHashMap, reversing the order isn't what is desired either since
generation of one cache entry can touch and thus promote another.
Perhaps the solution is to keep a separate list while warming, and
then when time runs out, clear and reinsert in the correct order.
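
The last idea above (keep a separate list while warming, then clear and reinsert) might look roughly like this. It is a sketch only: Python's OrderedDict stands in for an access-ordered Java LinkedHashMap, and LRUCache, autowarm, regenerate and the injectable clock are hypothetical names, not Solr APIs.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Tiny access-ordered LRU cache (stand-in for an access-ordered LinkedHashMap)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)         # newest entry is most recently used
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # a hit promotes the entry
        return self._items[key]

    def clear(self):
        self._items.clear()

    def keys_lru_to_mru(self):
        return list(self._items)


def autowarm(old_cache, new_cache, regenerate, budget_seconds, clock=time.monotonic):
    """Warm new_cache from old_cache, most important entries first, until time is up."""
    deadline = clock() + budget_seconds
    warmed = []  # separate list, most important (most recently used) first
    for key in reversed(old_cache.keys_lru_to_mru()):
        if clock() >= deadline:
            break
        # regenerate() may touch new_cache as a side effect and disturb its
        # order, which is why the cache is rebuilt from `warmed` afterwards.
        warmed.append((key, regenerate(key)))
    new_cache.clear()
    for key, value in reversed(warmed):      # least important goes in first
        new_cache.put(key, value)
    return len(warmed)
```

Reinserting in reverse leaves the most important warmed entry at the most-recently-used end, so it is the last to be ejected.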


-Yonik


Re: MatchAllDocsQuery in solr?

2006-11-21 Thread Yonik Seeley

On 11/21/06, Tom wrote:

> Is there a way to do a match all docs query in solr?
>
> I mean is there something I can put in a solr URL that will get
> recognized by the SolrQueryParser as meaning a match all?


No, but there should be.

I've considered *:* but I haven't checked if the JavaCC grammar will
allow that through or if it would need to be modified.

-Yonik


Minimum time between distributions

2006-11-21 Thread Kevin Lewandowski

On Discogs I'm running Solr with two slaves and one master, using the
distribution scripts. The slaves pull and install a new snapshot every
five minutes and this is working very well so far.

Are there any risks with reducing this window to every one or two
minutes? With large caches could the autowarming take longer than one
or two minutes? It isn't a business need to reduce the window but I'm
just curious about the feasibility and risks.

How often do other people run snappuller and snapinstaller?

thanks,
Kevin


Re: MatchAllDocsQuery in solr?

2006-11-21 Thread Tom

Thanks for the quick response.

I thought about a range query on the ID, but was wondering what the
implications were for a large range query (e.g. number of docs >
maxBooleanClauses). But this approach will work for me, as my test
indices are generally small.

For a large data set, would it be faster to do that on a field with
fewer values (but the same number of documents)?

e.g. type:[* TO *] where the type field has a small number of values.

Or does that not matter?

Thanks,

Tom

At 02:49 PM 11/21/2006, you wrote:

> : > I mean is there something I can put in a solr URL that will get
> : > recognized by the SolrQueryParser as meaning a match all?
> :
> : No, but there should be.
>
> if you use the uniqueKey feature, then you can do id:[* TO *] ... that
> actually works on any field to find all docs that have a value, but on
> a uniqueKey field it by definition returns all docs since all docs have a
> uniqueKey.
>
> -Hoss
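
For reference, a small sketch of how the open-ended range query suggested above could be placed in a request URL. The base URL (host, port and /solr/select path) and the helper name all_docs_url are assumptions for illustration, and "id" should be whatever the schema declares as its uniqueKey field.

```python
from urllib.parse import urlencode

# Assumed base URL for a local Solr instance; adjust to your deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def all_docs_url(field="id", rows=0, base_url=SOLR_SELECT_URL):
    """Build a select URL whose query is an open-ended range on `field`."""
    query = f"{field}:[* TO *]"
    # urlencode percent-escapes the colon, brackets and asterisks,
    # and turns the spaces into '+'.
    return f"{base_url}?{urlencode({'q': query, 'rows': rows})}"


print(all_docs_url())
# http://localhost:8983/solr/select?q=id%3A%5B%2A+TO+%2A%5D&rows=0
```

Passing field="type" gives the variant asked about above. With rows=0 the response should carry only the match count, not the documents themselves.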




RE: MatchAllDocsQuery in solr?

2006-11-21 Thread Fuad Efendi
I was thinking about the existing Solr admin interface... It already provides
some kind of _solr:all: it can show at least the number of documents and some
other statistics.

If we could do it the XML way, and make it more abstract and generic (facets,
terms, etc.)...



-----Original Message-----
From: Walter Underwood
Sent: Tuesday, November 21, 2006 6:24 PM
To: solr-user@lucene.apache.org
Subject: Re: MatchAllDocsQuery in solr?


On 11/21/06 3:19 PM, Chris Hostetter wrote:
>
> : > I've considered *:* but I haven't checked if the JavaCC grammar will
> : > allow that through or if it would need to be modified.
> :
> : I looked into it quick, and it looks like the grammar may need to be
> : modified (i.e., one can't just override a method of QueryParser to do
> : this).
>
> we could add this to the function parser, so _val_:ALL could return a
> MatchAllDocsQuery?

I was thinking something similar, maybe _solr:all. At Infoseek, we
hardcoded url:http to match all docs.

wunder
-- 
Walter Underwood
Search Guru, Netflix







RE: MatchAllDocsQuery in solr?

2006-11-21 Thread Fuad Efendi
> Is there a way to do a match all docs query in solr?

Why do you need to perform a full index search in order to find all indexed
documents?
We need an additional XML admin API, but that is a different type of 'query in
solr': no need for an analyzer, tokenizer, etc.



Re: MatchAllDocsQuery in solr?

2006-11-21 Thread Yonik Seeley

On 11/21/06, Yonik Seeley wrote:

> I looked into it quick, and it looks like the grammar may need to be
> modified (i.e., one can't just override a method of QueryParser to do
> this).


Done, but not yet committed in Lucene:
http://issues.apache.org/jira/browse/LUCENE-723

-Yonik