[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-08-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729536#comment-13729536
 ] 

David Smiley commented on SOLR-5093:


I guess I've changed my mind; the veto arguments are good.

Mikhail, I like your idea on making a sub-clause be filter-cache'able.  But I 
don't think it should be a separate query parser because it's an orthogonal 
issue to how the query is parsed.  Perhaps a special local-param 
filterCache=true.  Your example would become:

{noformat}
  q=bee:blah OR {! filterCache=true}foo:bar OR {! filterCache=true}foo:baz
{noformat}

A key thing to document would be not only that this clause is cached in the 
filter cache, but also that it is constant-scoring.

 Rewrite field:* to use the filter cache
 ---

 Key: SOLR-5093
 URL: https://issues.apache.org/jira/browse/SOLR-5093
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Reporter: David Smiley

 Sometimes people write a query including something like {{field:*}}, which 
 matches all documents that have an indexed value in that field.  That can be 
 particularly expensive for tokenized text, numeric, and spatial fields.  The 
 expert advice is to index a separate boolean field that is used in place of 
 these query clauses, but that's annoying to do and it can take users a while 
 to realize that's what they need to do.
 I propose that Solr's query parser rewrite such queries to return a query 
 backed by Solr's filter cache.  The underlying query runs once (and it's 
 slow that time), and then it's cached, after which it's super-fast to reuse.  
 Unfortunately Solr's filter cache is currently index-global, not per-segment; 
 that's being handled in a separate issue.  
 Related to this, it may be worth considering whether Solr should, behind the 
 scenes, index a field that records which fields have indexed values; it 
 could then use this indexed data to power these queries so they are always 
 fast to execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could 
 similarly use this.
 For an example of how a user bumped into this, see:
 http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html
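
For illustration, the "separate boolean field" workaround mentioned in the description could be sketched roughly as below. This is not Solr code: the helper name add_presence_flags and the has_<field> naming convention are assumptions made up for the example. The idea is only that a cheap term query such as has_pp:true can stand in for the expensive pp:* clause.

```python
def add_presence_flags(doc, fields):
    """Return a copy of doc with a hypothetical has_<field> boolean per tracked field."""
    flagged = dict(doc)
    for name in fields:
        value = doc.get(name)
        # Treat a missing value, an empty string, and an empty list as "no value".
        flagged["has_" + name] = value not in (None, "", [])
    return flagged


# Hypothetical document about to be sent to the indexer.
doc = {"id": "1", "pp": "10,20"}
print(add_presence_flags(doc, ["pp", "title"]))
```

The flags would have to be computed by whatever code feeds documents to Solr, which is exactly the extra work the description calls annoying.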

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-08-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729793#comment-13729793
 ] 

Robert Muir commented on SOLR-5093:
---

I don't think that would work:

{quote}
There may only be one LocalParams prefix per argument, preventing the need for 
any escaping of the original argument.
{quote}

http://wiki.apache.org/solr/LocalParams






[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-08-05 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729873#comment-13729873
 ] 

Mikhail Khludnev commented on SOLR-5093:


[~rcmuir] I think that would be SOLR-4093

Can anyone confess to {! sep=true} which is backed by 
ExtendedQuery.getCacheSep()? Isn't it somehow related to the discussed 
challenge? 




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-08-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729879#comment-13729879
 ] 

Yonik Seeley commented on SOLR-5093:


bq. Can anyone confess to {! sep=true}

It's a placeholder that currently does nothing (and is undocumented)... ignore 
it, or remove it if it bothers people ;-)




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-31 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725094#comment-13725094
 ] 

Mikhail Khludnev commented on SOLR-5093:


I agree with the vetoes. 
But in rare cases users need q=bee:blah *OR* pp:\*. There is also a JIRA issue 
to handle fq disjunction like fq=foo:bar OR foo:baz. We could deliver a simple 
qparser and use it like
q=bee:blah OR _query_:{!fq}foo:bar OR _query_:{!fq}foo:baz
It keeps the syntax crazy enough, which is great. 
Would you accept that?

Afterwards, we can allow BS in Solr to handle filter disjunctions efficiently. 




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724393#comment-13724393
 ] 

Robert Muir commented on SOLR-5093:
---

Err, this user already had this in their fq. So if they had a filterCache, 
they'd be using it.

They should pull that slow piece out into a separate fq so it's cached by 
itself. I don't understand why the query parser needs to do anything else here 
(especially any trappy auto-caching).
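
For illustration, here is a rough client-side sketch of pulling the slow clause out into its own fq. The field names (bee, type, pp) and the build_params helper are placeholders invented for the example; only the q parameter and the repeatable fq parameter are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_params(q, filters):
    """Build a Solr query string with one fq parameter per filter."""
    # Each fq value is looked up in (and stored into) the filterCache on its own.
    pairs = [("q", q)] + [("fq", f) for f in filters]
    return urlencode(pairs)


# One combined filter: only the whole expression gets a cache entry.
combined = build_params("bee:blah", ["type:doc AND pp:*"])
# Slow clause split out: pp:* gets its own reusable cache entry.
split = build_params("bee:blah", ["type:doc", "pp:*"])
print(combined)
print(split)
```

With the split form, any later request that also sends fq=pp:* can reuse the cached set regardless of what other filters accompany it.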




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724405#comment-13724405
 ] 

Jack Krupansky commented on SOLR-5093:
--

Some time ago I had suggested a related approach: LUCENE-4386 - Query parser 
should generate FieldValueFilter for pure wildcard terms to boost query 
performance.

There were objections from the Lucene guys, but now that the Solr query parser 
is divorced from Lucene, maybe it could be reconsidered.

I couldn't testify as to the relative merits of using the filter cache vs. the 
FieldValueFilter.





[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724422#comment-13724422
 ] 

Robert Muir commented on SOLR-5093:
---

Those same Lucene guys are not afraid to object here either.

This user just has to pull the AND pp:* out into another fq of pp:*

{quote}
(Each filter is executed and cached separately. When it's time to use them to 
limit the number of results returned by a query, this is done using set 
intersections.) 
{quote}
http://wiki.apache.org/solr/SolrCaching#filterCache
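
As a toy model of the behavior quoted above (not Solr's actual implementation): each fq string maps to a cached set of matching document ids, and the final result is the intersection of the main query's hits with every filter set. The ToyFilterCache class, the run_query callback, and the tiny index are all invented for this sketch.

```python
class ToyFilterCache:
    """Maps a filter query string to the set of doc ids it matches."""

    def __init__(self, run_query):
        self._run_query = run_query  # callback that executes the query (slow path)
        self._cache = {}
        self.misses = 0

    def doc_set(self, fq):
        # Execute the filter only the first time it is seen.
        if fq not in self._cache:
            self.misses += 1
            self._cache[fq] = frozenset(self._run_query(fq))
        return self._cache[fq]

    def apply(self, main_hits, fqs):
        """Intersect the main query's hits with each cached filter set."""
        result = set(main_hits)
        for fq in fqs:
            result &= self.doc_set(fq)
        return result


# Hypothetical index: filter string -> matching doc ids.
index = {"type:doc": {1, 2, 3, 4}, "pp:*": {2, 4, 6}}
cache = ToyFilterCache(lambda fq: index[fq])
print(cache.apply({1, 2, 4, 5}, ["type:doc", "pp:*"]))
print(cache.apply({2, 3}, ["pp:*"]))  # pp:* is served from the cache here
print(cache.misses)
```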




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724428#comment-13724428
 ] 

David Smiley commented on SOLR-5093:


Rob,
You're right for this particular user's use-case that I mentioned.  I 
overlooked that aspect of his query.  Nonetheless, I don't think that negates 
the usefulness of what I propose in this issue.

If you consider auto-caching trappy, then you probably don't like Solr very 
much at all.




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724430#comment-13724430
 ] 

Jack Krupansky commented on SOLR-5093:
--

bq. This user just has to pull out AND pp:* into another fq of pp:*

Exactly! That's what we (non-Lucene guys) are trying to do - eliminate the need 
for users to have to do that kind of manual optimization.

We want Solr to behave as optimally as possible OOTB.





[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724443#comment-13724443
 ] 

Robert Muir commented on SOLR-5093:
---

Solr today doesn't auto-cache. You can specify that you intend for a query to 
act only as a filter with fqs, control the caching behavior of these fqs, and 
so on.

So there is no need to add any additional auto-caching in the query parser. 
Things like LUCENE-4386 would just cause filter cache insanity where it's 
cached in duplicate places (in FieldCache.docsWithField as well as in fq 
bitsets).

Auto-caching things in the query can easily pollute the cache with stuff that's 
not actually intended to be reused: then it doesn't really work at all.
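
For illustration, the per-fq caching control mentioned above can be expressed with the cache and cost local params on a filter query. The helper fq_with_cache_hint, the field names, and the cost value of 50 are assumptions made up for this sketch; it only builds the request string and does not talk to Solr.

```python
from urllib.parse import urlencode


def fq_with_cache_hint(query, cache=True, cost=None):
    """Prefix a filter query with cache/cost local params when they are needed."""
    if cache and cost is None:
        return query  # default behavior: the fq is cached
    parts = ["cache=" + ("true" if cache else "false")]
    if cost is not None:
        parts.append("cost=%d" % cost)
    return "{!" + " ".join(parts) + "}" + query


params = urlencode([
    ("q", "bee:blah"),
    ("fq", "type:doc"),  # reusable filter, cached as usual
    ("fq", fq_with_cache_hint("pp:*", cache=False, cost=50)),  # one-off, kept out of the cache
])
print(fq_with_cache_hint("pp:*", cache=False, cost=50))
print(params)
```

The point of the sketch is that the decision to cache stays with the person writing the request rather than with the parser.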




[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

2013-07-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724446#comment-13724446
 ] 

Hoss Man commented on SOLR-5093:


I can see the argument for making field:* parse as equivalent to 
field:[* TO *] if the latter is in fact more efficient, but i agree with rob 
that we shouldn't try to make the parser pull out individual clauses and 
construct special query objects that are backed by the filterCache.  If i have 
an fq in my solrconfig that looks like this...

{noformat}
<str name="fq">X AND Y AND Z</str>
{noformat}

...that entire BooleanQuery should be cached as a single entity in the 
filterCache regardless of what X, Y, and Z really are -- because that's what i 
asked for: a single filter query.

it would suck if the Query Parser looked at the specifics of what each of those 
clauses are and said "I'm going to try and be smart and make each of these 
clauses be a special query backed by the filterCache", because now i have 4 
queries in my filterCache instead of just 1, and 3 of them will never be used.
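
For illustration, the "4 entries instead of 1" arithmetic can be reproduced with a toy model. The cache_entries function and its naive split on " AND " are invented for this sketch and say nothing about how Solr's parser actually works.

```python
def cache_entries(fq, per_clause=False):
    """Return the filterCache keys a toy parser would create for one fq."""
    entries = [fq]  # the whole filter is always cached as a single entry
    if per_clause:
        # Hypothetical "smart" parser: also caches every AND-ed clause.
        entries += [clause.strip() for clause in fq.split(" AND ")]
    return entries


print(len(cache_entries("X AND Y AND Z")))
print(len(cache_entries("X AND Y AND Z", per_clause=True)))
```

If nothing else ever asks for X, Y, or Z on their own, the three extra entries only take up cache space.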


