[jira] [Comment Edited] (SOLR-4701) CollectorFilterQParserPlugin support Filter Collector at search with PostFilter

2013-04-21 Thread Linbin Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637502#comment-13637502
 ] 

Linbin Chen edited comment on SOLR-4701 at 4/21/13 8:15 AM:


frange already uses PostFilter, but CollectorFilterQParserPlugin makes it 
possible to create other collector filter queries.

Use cases:

Case 1: IN query

This works like the SQL IN operator: select * from a where user=123 and status in (1,2,3)

Assume the field 'status' can take 10 distinct values (0,1,2,3,4,5,6,7,8,9).

The index has 10 million documents, on average 1 million per 'status' value.

user:123 might match about 2k documents, while status:(1 OR 2 OR 3) matches 3 million.

user:123&fq={!cf name=in}status:(1,2,3) is faster than user:123 AND status:(1 
OR 2 OR 3)

filterCache could be used for the status:(1 OR 2 OR 3) query, but 10 status 
values can be combined in C(n,0)+C(n,1)+...+C(n,n)=pow(2,n) ways; with n=10 
that means 1024 OpenBitSets.

filterCache with 1024 OpenBitSets (each sized for 10 million documents) needs RAM = 1.25G

cf.in needs RAM = 10M*4 = 40M
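
The memory figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the patch: the class and method names are made up for illustration, and it assumes one bit per document for each cached bit set and one 4-byte int per document for the in filter, ignoring object overhead.

```java
/** Back-of-the-envelope memory comparison (hypothetical helper, not part of SOLR-4701.patch). */
final class FilterCacheMemoryEstimate {

    /** Number of distinct subsets of n status values: C(n,0)+...+C(n,n) = 2^n. */
    static long subsetCount(int n) {
        return 1L << n;
    }

    /** Bytes for one bit set that holds one bit per document. */
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    /** Bytes needed if every subset of values were cached as its own bit set. */
    static long filterCacheBytes(int n, long maxDoc) {
        return subsetCount(n) * bitSetBytes(maxDoc);
    }

    /** Bytes for one 4-byte int per document. */
    static long intArrayBytes(long maxDoc) {
        return maxDoc * 4L;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L;
        System.out.println("bit sets:    " + subsetCount(10));
        System.out.println("filterCache: " + filterCacheBytes(10, maxDoc) + " bytes");
        System.out.println("int per doc: " + intArrayBytes(maxDoc) + " bytes");
    }
}
```

For n=10 and 10 million documents this gives 1,280,000,000 bytes for the cached bit sets, in line with the roughly 1.25G quoted above, against 40,000,000 bytes for the per-document int values.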


Case 2: bit query

This is for searching on option flags, such as Linux file attributes R/W/X (R=100, W=010, X=001).

Assume the bit matching logic is (query_bit & field_bit) != 0

search R OR W
{code}
{!cf name=bit}file_attr:(6)
{code}

I have not uploaded the bit query patch yet. It is easy to implement by extending 
CollectorFilterable under CollectorFilterQParserPlugin.

In my approach a long is used to store 54 bit options.
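
The bit query idea can be illustrated with a small standalone check. This is a sketch only, since the bit query patch has not been uploaded: the class and method names are hypothetical, and it assumes the match test is a bitwise AND of the query mask and the stored value being nonzero (a bitwise OR would accept every document for any nonzero mask).

```java
/** Sketch of the per-document test behind a bit query (hypothetical, not from the patch). */
final class BitOptionFilter {
    static final long R = 0b100; // read
    static final long W = 0b010; // write
    static final long X = 0b001; // execute

    /** True when the stored value shares at least one set bit with the query mask. */
    static boolean matchesAny(long fieldBits, long queryBits) {
        return (fieldBits & queryBits) != 0;
    }

    public static void main(String[] args) {
        long query = R | W; // 6, as in file_attr:(6)
        System.out.println(matchesAny(R | X, query)); // stored value has R set
        System.out.println(matchesAny(X, query));     // stored value has neither R nor W
    }
}
```

Because the options are packed into a single long, one AND per collected document is enough to answer an "any of these flags" query.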

 CollectorFilterQParserPlugin support Filter Collector at search with 
 PostFilter
 ---

 Key: SOLR-4701
 URL: https://issues.apache.org/jira/browse/SOLR-4701
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2
Reporter: Linbin Chen
 Fix For: 4.3

 Attachments: SOLR-4701.patch


 example:
  * {code}fq={!cf name=in}status:(-1, 2){code}
  * {code}fq={!cf name=in not=true}status:(3,4){code}
  * {code}fq={!cf name=range}price:[100 TO 500]{code}
  * {code}fq={!cf name=range}log(page_view):[50 TO 120]{code}
 The in filter works like SQL IN and is faster than an OR boolean query.
 In most cases, range is faster than a range query on a TrieField in Lucene.
 How to use it:
 Add to solrconfig.xml:
 {code:xml}
 <queryParser name="cf" class="solr.CollectorFilterQParserPlugin"/>
 {code}
 cf does not use the query cache; it filters in the collector via PostFilter.
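
 The collector-filtering idea can be sketched without any Solr classes. The code below is an illustration under stated assumptions, not the patch itself: DocCollector, FilteringCollector and filterByRange are hypothetical names, and the per-document price array stands in for whatever per-document values the real plugin reads. It shows a wrapper that forwards a document to the wrapped collector only when a per-document range check passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Plain-Java sketch of filtering inside a collector (hypothetical, no Solr dependencies). */
final class PostFilterSketch {

    /** Hypothetical stand-in for a search collector: receives each matching doc id. */
    interface DocCollector {
        void collect(int doc);
    }

    /** Forwards a doc to the wrapped collector only if it passes the per-document check. */
    static final class FilteringCollector implements DocCollector {
        private final DocCollector delegate;
        private final IntPredicate accept;

        FilteringCollector(DocCollector delegate, IntPredicate accept) {
            this.delegate = delegate;
            this.accept = accept;
        }

        @Override
        public void collect(int doc) {
            if (accept.test(doc)) {
                delegate.collect(doc);
            }
        }
    }

    /** Runs docs matched by the main query through an inclusive range check on their prices. */
    static List<Integer> filterByRange(int[] matchedDocs, double[] priceByDoc,
                                       double min, double max) {
        List<Integer> kept = new ArrayList<>();
        DocCollector collector = new FilteringCollector(
                doc -> kept.add(doc),
                doc -> priceByDoc[doc] >= min && priceByDoc[doc] <= max);
        for (int doc : matchedDocs) {
            collector.collect(doc);
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] priceByDoc = {50.0, 100.0, 250.0, 500.0, 900.0};
        int[] matchedDocs = {0, 2, 3, 4}; // docs already matched by the main query
        System.out.println(filterByRange(matchedDocs, priceByDoc, 100.0, 500.0));
    }
}
```

 The check runs only for documents that the main query has already matched, which is why this style of filter pays off when the main query is selective and the filter itself would match a large part of the index.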

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4701) CollectorFilterQParserPlugin support Filter Collector at search with PostFilter

2013-04-21 Thread Linbin Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637502#comment-13637502
 ] 

Linbin Chen edited comment on SOLR-4701 at 4/21/13 8:23 AM:


frange already uses PostFilter, but CollectorFilterQParserPlugin makes it 
possible to create other collector filter queries.

Use cases:

Case 1: IN query

This works like the SQL IN operator: select * from a where user=123 and status in (1,2,3)

Assume the field 'status' can take 10 distinct values (0,1,2,3,4,5,6,7,8,9).

The index has 10 million documents, on average 1 million per 'status' value.

user:123 might match about 2k documents, while status:(1 OR 2 OR 3) matches 3 million.

user:123&fq={!cf name=in}status:(1,2,3) is faster than user:123 AND status:(1 
OR 2 OR 3)

filterCache could be used for the status:(1 OR 2 OR 3) query, but 10 status 
values can be combined in C(n,0)+C(n,1)+...+C(n,n)=pow(2,n) ways; with n=10 
that means 1024 OpenBitSets.

filterCache with 1024 OpenBitSets (each sized for 10 million documents) needs RAM = 1.25G

cf.in uses FieldCache and needs RAM = 10M*4 = 40M

In the near-real-time case, filterCache caches per query, whereas cf.in caches 
per AtomicReader, so its hit ratio will be higher.
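
The difference in cache keys can be made concrete with a small sketch. It is not the patch's implementation: the class name, constructor and negate flag are hypothetical, and the int array stands in for the per-document values a field cache would hold for one reader. The point is that the array depends only on the indexed field, so any list of wanted values can reuse it, while a cached bit set is tied to one specific query.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch of an IN check over per-document values (hypothetical, not from the patch). */
final class InSetFilterSketch {
    private final int[] valueByDoc;     // one int per document: 4 bytes per doc
    private final Set<Integer> wanted;  // values listed in the query, e.g. (1,2,3)
    private final boolean negate;       // illustrative counterpart of not=true

    InSetFilterSketch(int[] valueByDoc, int[] values, boolean negate) {
        this.valueByDoc = valueByDoc;
        this.wanted = new HashSet<>();
        for (int v : values) {
            this.wanted.add(v);
        }
        this.negate = negate;
    }

    /** True when the document's value is in the wanted set (or not in it, when negated). */
    boolean accepts(int doc) {
        return wanted.contains(valueByDoc[doc]) != negate;
    }

    public static void main(String[] args) {
        int[] statusByDoc = {0, 1, 5, 3};
        // The same per-document array serves two different value lists.
        InSetFilterSketch in123 = new InSetFilterSketch(statusByDoc, new int[]{1, 2, 3}, false);
        InSetFilterSketch in05 = new InSetFilterSketch(statusByDoc, new int[]{0, 5}, false);
        System.out.println(in123.accepts(1) + " " + in05.accepts(1));
    }
}
```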


Case 2: bit query

This is for searching on option flags, such as Linux file attributes R/W/X (R=100, W=010, X=001).

Assume the bit matching logic is (query_bit & field_bit) != 0

search R OR W
{code}
{!cf name=bit}file_attr:(6)
{code}

I have not uploaded the bit query patch yet. It is easy to implement by extending 
CollectorFilterable under CollectorFilterQParserPlugin.

In my approach a long is used to store 54 bit options.
