Re: MatchAllDocsQuery in solr?
Walter Underwood wrote:
I was thinking something similar, maybe _solr:all. At Infoseek, we hardcoded url:http to match all docs.

I suppose that different data would yield different responses, but a space ( ) works on our data.

the other Walter
MatchAllDocsQuery in solr?
Is there a way to do a match all docs query in solr? I mean is there something I can put in a solr URL that will get recognized by the SolrQueryParser as meaning a match all?

Why? Because I'm porting unit tests from our internal Lucene container to Solr, and the tests usually run such a query, upon completion, to make sure the index is in the expected state (nothing missing, nothing extra).

Yes, I can create a query that will match all my docs, since there are a few fields that have a relatively small range of values. I was just looking for a standard way to do it first.

Thanks,
Tom
Re: MatchAllDocsQuery in solr?
On 11/21/06 3:19 PM, Chris Hostetter [EMAIL PROTECTED] wrote:

: I've considered *:* but I haven't checked if the JavaCC grammar will
: allow that through or if it would need to be modified.
:
: I looked into it quick, and it looks like the grammar may need to be
: modified (i.e., one can't just override a method of QueryParser to do
: this).

we could add this to the function parser, so _val_:ALL could return a MatchAllDocsQuery ?

I was thinking something similar, maybe _solr:all. At Infoseek, we hardcoded url:http to match all docs.

wunder
--
Walter Underwood
Search Guru, Netflix
Re: Minimum time between distributions
: Are there any risks with reducing this window to every one or two
: minutes? With large caches could the autowarming take longer than one
: or two minutes? It isn't a business need to reduce the window but I'm
: just curious about the feasibility and risks.

you can definitely run into problems if autoWarming takes longer than your snappuller interval, there's nothing in Solr right now that prevents warming searchers from layering on top of each other eating resources.

once upon a time Yonik and i discussed having a maxAutoWarm time option on the caches, and they would give up if that much time had elapsed -- but that approach is too simplistic to be of much use -- even if you configure snappuller to run every minute, there are going to be lots of times when it doesn't actually have anything new every minute.

: How often do other people run snappuller and snapinstaller?

most of the indexes i deal with are set to 5 minutes ... 1 that really REALLY relies on the caches for facet counts is only 15 ... i think we have one index snappulling/installing every minute -- but there are almost never new indexes, that interval is just so the rare occasions when there is a new index, the data shows up ASAP.

-Hoss
Re: Minimum time between distributions
On 11/21/06, Chris Hostetter [EMAIL PROTECTED] wrote:
once upon a time Yonik and i discussed having a maxAutoWarm time option on the caches, and they would give up if that much time had elapsed

At first I thought it would be easy... just autowarm until time is up. But then I realized that there is a problem: one would want to warm items from the most important to the least important if you don't know if you are going to get to them all. But doing them in that order with an LRU cache means that they would end up in the wrong order (the most recently used item would be at the tail of the list and first to be ejected).

Even if we *did* have access to the internal list of the LinkedHashMap, reversing the order isn't what is desired either since generation of one cache entry can touch and thus promote another.

Perhaps the solution is to keep a separate list while warming, and then when time runs out, clear and reinsert in the correct order.

-Yonik
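A minimal sketch of the "separate list, then clear and reinsert" idea from the message above, assuming the new cache is an access-ordered java.util.LinkedHashMap. The class name TimedWarmSketch, the warm method, and the regenerate callback are hypothetical illustrations and are not part of Solr; Solr's own cache classes are not used here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; not Solr's actual cache or autowarming code.
class TimedWarmSketch {

    /**
     * Regenerates entries for keysByImportance (most important first) until
     * budgetNanos has elapsed, then rebuilds newCache so that the most
     * important warmed entry is the last one an LRU policy would evict.
     * Returns the number of entries warmed.
     */
    static <K, V> int warm(Map<K, V> newCache, List<K> keysByImportance,
                           Function<K, V> regenerate, long budgetNanos) {
        long deadline = System.nanoTime() + budgetNanos;
        // Separate list: records what was warmed, in importance order.
        List<Map.Entry<K, V>> warmed = new ArrayList<>();
        for (K key : keysByImportance) {
            if (System.nanoTime() - deadline >= 0) {
                break; // time is up; keep whatever was warmed so far
            }
            V value = regenerate.apply(key); // may itself touch newCache
            newCache.put(key, value);
            warmed.add(new SimpleEntry<>(key, value));
        }
        // Clear and reinsert least important first, so the most important
        // entry ends up most recently used. Anything regenerate added as a
        // side effect is dropped here (a simplification of this sketch).
        newCache.clear();
        for (int i = warmed.size() - 1; i >= 0; i--) {
            Map.Entry<K, V> e = warmed.get(i);
            newCache.put(e.getKey(), e.getValue());
        }
        return warmed.size();
    }

    public static void main(String[] args) {
        // accessOrder=true makes iteration run from eldest to most recent.
        Map<String, Integer> cache = new LinkedHashMap<>(16, 0.75f, true);
        warm(cache, Arrays.asList("a", "b", "c"), String::length, 1_000_000_000L);
        System.out.println(cache.keySet()); // eldest (first to be evicted) is listed first
    }
}
```

The reinsert pass is what keeps promotions made during regeneration from deciding the final eviction order; only the recorded importance order does.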
Re: MatchAllDocsQuery in solr?
On 11/21/06, Tom [EMAIL PROTECTED] wrote:
Is there a way to do a match all docs query in solr? I mean is there something I can put in a solr URL that will get recognized by the SolrQueryParser as meaning a match all?

No, but there should be. I've considered *:* but I haven't checked if the JavaCC grammar will allow that through or if it would need to be modified.

-Yonik
Minimum time between distributions
On Discogs I'm running Solr with two slaves and one master, using the distribution scripts. The slaves pull and install a new snapshot every five minutes and this is working very well so far. Are there any risks with reducing this window to every one or two minutes? With large caches could the autowarming take longer than one or two minutes? It isn't a business need to reduce the window but I'm just curious about the feasibility and risks. How often do other people run snappuller and snapinstaller? thanks, Kevin
Re: MatchAllDocsQuery in solr?
Thanks for the quick response.

I thought about a range query on the ID, but was wondering what the implications were for a large range query (e.g. number of docs > maxBooleanClauses). But this approach will work for me, as my test indices are generally small.

For a large data set, would it be faster to do that on a field with fewer values (but the same number of documents), e.g. type:[* TO *], where the type field has a small number of values? Or does that not matter?

Thanks,
Tom

At 02:49 PM 11/21/2006, you wrote:

: I mean is there something I can put in a solr URL that will get
: recognized by the SolrQueryParser as meaning a match all?
:
: No, but there should be.

if you use the uniqueKey feature, then you can do id:[* TO *] ... that actually works on any field to find all docs that have a value, but on a uniqueKey field it by definition returns all docs since all docs have a uniqueKey.

-Hoss
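For use from a unit test, the open-ended range query has to be URL-encoded before it goes into the q parameter. Below is a small sketch of building such a URL with the standard java.net.URLEncoder. The class and method names are hypothetical, and the base URL (http://localhost:8983/solr), the /select path, and the rows=0 parameter (ask only for the match count) are assumptions about a typical setup, not something stated in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; the base URL and handler path are assumptions.
class MatchAllUrlSketch {

    /** Builds a select URL whose q parameter is an open-ended range on field. */
    static String matchAllUrl(String baseUrl, String field) {
        // e.g. id:[* TO *] matches every doc that has a value in the field
        String query = field + ":[* TO *]";
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8") + "&rows=0";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matchAllUrl("http://localhost:8983/solr", "id"));
    }
}
```

Passing the uniqueKey field name (id in the example) gives the match-all behaviour described above; passing another field such as type only matches docs that have a value in that field.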
RE: MatchAllDocsQuery in solr?
I was thinking about the existing SOLR Admin Interface... It already provides some kind of _solr:all: it can show at least the number of documents and some other statistics. If we can do it the XML way, and make it more abstract and generic (facets, terms, etc.)...

-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 21, 2006 6:24 PM
To: solr-user@lucene.apache.org
Subject: Re: MatchAllDocsQuery in solr?

On 11/21/06 3:19 PM, Chris Hostetter [EMAIL PROTECTED] wrote:

: I've considered *:* but I haven't checked if the JavaCC grammar will
: allow that through or if it would need to be modified.
:
: I looked into it quick, and it looks like the grammar may need to be
: modified (i.e., one can't just override a method of QueryParser to do
: this).

we could add this to the function parser, so _val_:ALL could return a MatchAllDocsQuery ?

I was thinking something similar, maybe _solr:all. At Infoseek, we hardcoded url:http to match all docs.

wunder
--
Walter Underwood
Search Guru, Netflix
RE: MatchAllDocsQuery in solr?
Is there a way to do a match all docs query in solr?

Why do you need to perform a full index search in order to find all indexed documents? We need an additional XML-Admin-API, but it is a different type of 'query in solr' - no need for analyzer, tokenizer, etc.
Re: MatchAllDocsQuery in solr?
On 11/21/06, Yonik Seeley [EMAIL PROTECTED] wrote:
I looked into it quick, and it looks like the grammar may need to be modified (i.e., one can't just override a method of QueryParser to do this).

Done, but not yet committed in Lucene:
http://issues.apache.org/jira/browse/LUCENE-723

-Yonik