[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-07-29 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167124#comment-17167124
 ] 

Christine Poerschke commented on SOLR-13289:


"increased minExactCount related cache use" ticket created: SOLR-14690

> Support for BlockMax WAND
> -------------------------
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
> Issue Type: New Feature
> Reporter: Ishan Chattopadhyaya
> Assignee: Tomas Eduardo Fernandez Lobbe
> Priority: Major
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
> Time Spent: 5.5h
> Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-26 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116924#comment-17116924
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

I like the idea Christine, thanks! Let's take it in a follow-up Jira issue; no 
need to keep adding to this one, I think. I already created a couple of follow-up 
tasks, maybe also link yours?




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-22 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114398#comment-17114398
 ] 

Christine Poerschke commented on SOLR-13289:


{quote}{quote}would or wouldn't a minExactHits=100 request make use of a 
minExactHits=1000
{quote}
It wouldn't. Right now, it's just using equality. We could improve this for sure; 
that said, I'm wondering how useful that would be in practice, e.g. people 
doing the same request in the same index with different minExactHits. While 
it certainly could happen, I'm not sure how common that is.
{quote}
I agree: {{minExactCount=100}} and {{minExactCount=1000}} on the same index 
might be uncommon, but a query with a {{minExactCount=}} restriction being able 
to use a cache entry from a query without a {{minExactCount=}} restriction 
might be more interesting. Anyhow, I've speculatively opened 
[https://github.com/apache/lucene-solr/pull/1530], though perhaps a new ticket 
would be clearer since this one is now closed.
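
The reuse idea above can be sketched as a small predicate. This is only an 
illustration of the reasoning, not what the pull request implements: the 
function name can_serve_from_cache and the convention that None means "no 
minExactCount restriction" are assumptions made for this example. The sketch 
also assumes the cached entry otherwise matches the request (same query, 
filters and sort).

```python
from typing import Optional


def can_serve_from_cache(cached_min_exact: Optional[int],
                         requested_min_exact: Optional[int]) -> bool:
    """Return True if a cached result could answer the new request.

    None stands for "no minExactCount restriction", i.e. the count was
    (or must be) fully exact. Hypothetical helper, not a Solr API.
    """
    if cached_min_exact is None:
        # An exact cached count satisfies any requested threshold.
        return True
    if requested_min_exact is None:
        # The request wants a fully exact count; a thresholded entry
        # cannot guarantee that.
        return False
    # Counts exact up to a higher threshold are also exact up to a lower one.
    return cached_min_exact >= requested_min_exact
```

The predicate is deliberately conservative: a thresholded entry whose count 
happened to be exact could in principle serve an unrestricted request too, but 
that would need the cached numFoundExact flag, which this sketch ignores.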




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114205#comment-17114205
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 60e5cff87f56d9c4aebc5aeab63e10bd24440087 in lucene-solr's branch 
refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=60e5cff ]

SOLR-13289: Add Support for BlockMax WAND (#1456)

Add support for BlockMax WAND via a minExactHits parameter. Hits will be 
counted accurately at least up to this value; above that, the count will be 
an approximation. In distributed search requests, the threshold applies per 
shard, so the count may be accurate up to numShards * minExactHits. The 
response will include the value numFoundExact, which can be true (the value in 
numFound is exact) or false (the value in numFound is an approximation).
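
As a rough illustration of the per-shard behaviour described in this commit 
message, the sketch below merges shard-level counts on a coordinator. It 
assumes (the message does not confirm this) that the merged numFound is the 
sum of the shard counts and that the merged numFoundExact is true only when 
every shard reported an exact count. ShardCount and merge_shard_counts are 
hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ShardCount:
    num_found: int         # count reported by one shard
    num_found_exact: bool  # False if the shard stopped counting exactly


def merge_shard_counts(shards: Iterable[ShardCount]) -> ShardCount:
    """Combine per-shard counts into one response-level count (assumed rule)."""
    total = 0
    exact = True
    for shard in shards:
        total += shard.num_found
        # One approximate shard makes the merged count approximate.
        exact = exact and shard.num_found_exact
    return ShardCount(total, exact)
```

With minExactHits=100 and three shards, each shard counts exactly up to 100 
hits on its own, which is where the numShards * minExactHits bound of 300 
comes from.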





[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114207#comment-17114207
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit d5f8aab8614f61348782b09de8443c22e7c26bd2 in lucene-solr's branch 
refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d5f8aab ]

SOLR-13289: Rename minExactHits to minExactCount (#1511)





[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114206#comment-17114206
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 62a3476c89afb81b6ab07a2b3dbd6b27a6634fe7 in lucene-solr's branch 
refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=62a3476 ]

SOLR-13289: Use the final collector's scoreMode (#1517)

This is needed in case a PostFilter changes the scoreMode





[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114208#comment-17114208
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit d97e6fe821a8025a66ba6d0f1d558a68d7789aa5 in lucene-solr's branch 
refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d97e6fe ]

SOLR-13289: Add Refguide changes (#1501)






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114209#comment-17114209
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit c8bfe974b26a8963ef20d1fbb283c8a4dddc52b6 in lucene-solr's branch 
refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c8bfe97 ]

SOLR-13289: Add CHANGES entry to 8.x





[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113688#comment-17113688
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 5e9483e7885cab47b7d0e6249cfeb1fc02ffc257 in lucene-solr's branch 
refs/heads/SOLR-14461-fileupload from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5e9483e ]

SOLR-13289: Use the final collector's scoreMode (#1517)

This is needed in case a PostFilter changes the scoreMode




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113690#comment-17113690
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 16a22fcf564c54cf6e05e5e5c117477fb21aaa04 in lucene-solr's branch 
refs/heads/SOLR-14461-fileupload from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=16a22fc ]

SOLR-13289: Add Refguide changes (#1501)






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113689#comment-17113689
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 3ca7628c43747a2f81188b9848a870cc7fc37f63 in lucene-solr's branch 
refs/heads/SOLR-14461-fileupload from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3ca7628 ]

SOLR-13289: Rename minExactHits to minExactCount (#1511)






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113629#comment-17113629
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 16a22fcf564c54cf6e05e5e5c117477fb21aaa04 in lucene-solr's branch 
refs/heads/master from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=16a22fc ]

SOLR-13289: Add Refguide changes (#1501)






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113621#comment-17113621
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 3ca7628c43747a2f81188b9848a870cc7fc37f63 in lucene-solr's branch 
refs/heads/master from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3ca7628 ]

SOLR-13289: Rename minExactHits to minExactCount (#1511)






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113605#comment-17113605
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit 5e9483e7885cab47b7d0e6249cfeb1fc02ffc257 in lucene-solr's branch 
refs/heads/master from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5e9483e ]

SOLR-13289: Use the final collector's scoreMode (#1517)

This is needed in case a PostFilter changes the scoreMode




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-15 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108624#comment-17108624
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq. would or wouldn't a minExactHits=100 request make use of a minExactHits=1000
It wouldn't. Right now, it's just using equality. We could improve this for sure; 
that said, I'm wondering how useful that would be in practice, e.g. people 
doing the same request in the same index with different {{minExactHits}}. 
While it certainly could happen, I'm not sure how common that is.
bq. Is it possible to use WAND with ExternalFileField, as is?
[~greggny3], can you elaborate on how you are using ExternalFileField? Sorting 
by it? Using it in a function query?




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-15 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108517#comment-17108517
 ] 

Christine Poerschke commented on SOLR-13289:


Seeing the inclusion of {{minExactHits}} in the {{QueryResultKey}} made me 
curious about caching effects, e.g. would or wouldn't a minExactHits=100 request 
make use of a minExactHits=1000 cache entry, and/or would use of the cache 
affect the "equal" vs. "greater than or equal" indication in the response? 
https://github.com/cpoerschke/lucene-solr/commit/68042c52fa58dc1e4e61145b101b3877501b708d
 shares what I have so far.
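
To make the caching question concrete, here is a minimal sketch of a result 
cache whose key includes the threshold and is compared by plain equality. The 
class ResultKey and its fields are stand-ins invented for this example; they 
are not the real QueryResultKey, and the cached values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultKey:
    """Hypothetical stand-in for a query result cache key."""
    query: str
    min_exact_hits: int


cache = {}
# Populate the cache with a result computed with minExactHits=1000.
cache[ResultKey("title:solr", 1000)] = {"numFound": 1234, "numFoundExact": False}

# Same query, different threshold: the keys are not equal, so this is a miss.
miss = cache.get(ResultKey("title:solr", 100))
# Same query, same threshold: the keys are equal, so this is a hit.
hit = cache.get(ResultKey("title:solr", 1000))
```

Under equality-only matching, the minExactHits=100 request recomputes the 
result even though the cached entry was counted to a stricter threshold.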




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-14 Thread Gregg Donovan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107615#comment-17107615
 ] 

Gregg Donovan commented on SOLR-13289:
--

We've been using 
[ExternalFileField|https://lucene.apache.org/solr/guide/8_5/working-with-external-files-and-processes.html#the-externalfilefield-type]
 for non-index ranking signals. Is it possible to use WAND with 
ExternalFileField, as is? Or would ExternalFileField need to be changed to 
provide max impacts per block? FeatureField could work, but ExternalFileField 
is quite useful for changing ranking signals without requiring a reindex.
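
For context on why per-block maxima matter here, the following is a 
deliberately simplified, single-term sketch of block-max skipping. It is not 
Lucene's implementation: the data layout (a list of (block_max, postings) 
pairs) and the function name are assumptions for illustration. A block is 
skipped when its stored maximum score cannot beat the current k-th best score, 
which is also why documents in skipped blocks never get counted.

```python
import heapq
from typing import List, Tuple

Posting = Tuple[int, float]           # (doc_id, score)
Block = Tuple[float, List[Posting]]   # (precomputed max score, postings)


def top_k_block_max(blocks: List[Block], k: int) -> Tuple[List[int], int]:
    """Return (doc ids of the top k by score, number of docs actually visited)."""
    if k <= 0:
        return [], 0
    heap: List[Tuple[float, int]] = []  # min-heap of (score, doc_id)
    visited = 0
    for block_max, postings in blocks:
        # Skip the whole block if nothing in it can enter the top k.
        if len(heap) == k and block_max <= heap[0][0]:
            continue
        for doc_id, score in postings:
            visited += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    ranked = sorted(heap, reverse=True)
    return [doc_id for _, doc_id in ranked], visited
```

In this simplified picture, a score source without a stored per-block maximum 
gives the loop nothing to compare against, so every block would have to be 
visited.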




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-13 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106021#comment-17106021
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

[~dsmiley], I'm still debating with myself whether this is the best solution 
for the bug you spotted: 
https://github.com/tflobbe/lucene-solr-1/commit/9beaf4131e53f04651b9e6d9afba60192f348c73
Please take a look. 
bq. I suggest minExactFound instead because it has the word "Found" which is 
perhaps the most significant portion of "numFound".
I understand it's because of {{numFound}}, but it just sounds very weird to me 
that the input parameter for the query carries the word "found".




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105399#comment-17105399
 ] 

David Smiley commented on SOLR-13289:
-

I suggest {{minExactFound}} instead because it has the word "Found" which is 
perhaps the most significant portion of "numFound".




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105038#comment-17105038
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

This is the parameter rename to minExactCount: 
https://github.com/apache/lucene-solr/pull/1511




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Anshum Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104977#comment-17104977
 ] 

Anshum Gupta commented on SOLR-13289:
-

I think {{minExactCount}} sounds the best here. It's explanatory but not super 
verbose.

I'm personally not a huge fan of very long parameter names.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104816#comment-17104816
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq. I agree with Ishan on not enabling this by default.  
Agreed, I intentionally left it to a separate Jira issue. I made a comment in 
the PR.

bq. I know minNumFoundToBeExact is kinda long but it's more clear than the 
counter-proposals and I don't expect people to be typing it often (thus "q" and 
"fq" for example are justifiably super short).
Well, "typing often" likely doesn't happen with any parameter, since this is 
coded somewhere. Still, I think we don't need such verbosity here. My 
personal favorite is still {{minExactCount}}.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104768#comment-17104768
 ] 

David Smiley commented on SOLR-13289:
-

Tomas, while you are modifying getDocListNC, can you consider refactoring this 
method into two?  It has an obvious code smell that it should be split.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104757#comment-17104757
 ] 

David Smiley commented on SOLR-13289:
-

I agree with Ishan on not enabling this by default.  Maybe re-propose that at a 
later time for 9.0 on the dev list (not in some issue like here).  I want to 
see substantial time pass with this to get an impression from users and us.

I know {{minNumFoundToBeExact}} is kinda long but it's more clear than the 
counter-proposals and I don't expect people to be typing it often (thus "q" and 
"fq" for example are justifiably super short).




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104755#comment-17104755
 ] 

Ishan Chattopadhyaya commented on SOLR-13289:
-

Right. Default params can be configured using paramsets. I'm just -1 on 
enabling this feature by default.

I think we could just document the Config API command that lets users set their 
own per-query default for this param, so they don't need to hand-edit any file.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104738#comment-17104738
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq.  I think this should be left as an opt-in feature
The {{solrconfig.xml}} would be where you can opt-in or out for all requests. I 
don't see how this changes things. We can argue what's the default we ship with 
in solrconfig.xml then.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104734#comment-17104734
 ] 

Ishan Chattopadhyaya commented on SOLR-13289:
-

{quote}Maybe we can have something shorter? {{minExactNumFound}}? I also like 
{{minExactCount}}, which I feel sounds a bit better, but more disconnected 
from {{numFound}}
{quote}
{{minExactFound}} or {{numFoundThreshold}}.
{quote}But my plan is to make the default value for {{minFoundHits}} a 
configuration in solrconfig.xml
{quote}
I'm -1 to this. I think this should be left as an opt-in feature for now (i.e. 
users should pass it in for every query). At a later point, we can standardize 
it and make it a default.

 




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-11 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104731#comment-17104731
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq. Should we really add numFoundExact="true" on responses
I thought about this too. But my plan is to make the default value for 
{{minFoundHits}} a configuration in solrconfig.xml (and maybe even lower that 
in master to something like 10k). So the parameter wouldn't be explicitly 
passed every time. I guess one could still choose not to use the feature (by 
providing a negative value or Integer.MAX_VALUE), but it does become a much 
more common thing. Implementation-wise, it would be messy to move a flag 
around to figure out whether or not to include the value in the response.

bq. Wouldn't we want the controlling parameter to use "numFound" likewise 
instead of "hits"?
Yes, makes sense. 
bq. I propose minNumFoundToBeExact
Maybe we can have something shorter? {{minExactNumFound}}? I also like 
{{minExactCount}}, which I feel sounds a bit better, but is more disconnected 
from {{numFound}}.

bq. I spent some time today reviewing what you pushed more closely,
Thanks, I'll take a look today




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-09 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103601#comment-17103601
 ] 

David Smiley commented on SOLR-13289:
-

Should we really add {{numFoundExact="true"}} on responses where the user 
didn't even specify a parameter to control this new feature? I prefer not 
adding the noise.

I like the name {{numFoundExact}} in the response compared to others we 
explored – a last minute change I see. Wouldn't we want the controlling 
parameter to use "numFound" likewise instead of "hits"? I propose 
{{minNumFoundToBeExact}}. The word "hits" isn't particularly widespread in 
Solr, except for cache hits.

I spent some time today reviewing what you pushed more closely, and especially 
testing my theory that there is a problem with interactions with the Collapse 
PostFilter/Collector. +There is, albeit not a big problem.+ Essentially the 
Collapse PostFilter must see and cache all docs before passing those it deems 
appropriate on to the rest of the collectors. TopDocs Collector is downstream 
of it, and TDC tries to tell the Scorer to do approximation stuff but it is in 
vain because by this point, all the docs are already accumulated and cached by 
Collapse. Other than a possible waste in computation, it ultimately results in 
Solr saying that the results weren't exact when they are actually exact. 
I pushed a commit to my fork to demonstrate the problem: 
[https://github.com/dsmiley/lucene-solr/commit/8803db97a5e4deb0ad5f3bdaabd02cd3b302a09f]
Interestingly I see some other test failures there.

I think the solution is in 
{{org.apache.solr.search.SolrIndexSearcher#getDocListNC}}: in the second half of 
the method ({{lastDocRequested <= 0}} i.e. top-X results case), right before 
{{buildTopDocsCollector}} is invoked, set 
{{cmd.setMinExactHits(Integer.MAX_VALUE);}} only if {{pf.postFilter.scoreMode}} 
isn't null and isn't TOP_SCORES, thus it's one of the two COMPLETE options. 
COMPLETE means the Scorer needs to yield all matching docs.
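
A minimal sketch of the guard proposed above, under stated assumptions: the {{ScoreMode}} enum below is a local stand-in that mirrors only the three Lucene values mentioned in this comment, and {{effectiveMinExactHits}} is a hypothetical helper, not a method that exists in {{SolrIndexSearcher}}.

```java
/** Hypothetical, self-contained model of the proposed guard; not Solr code. */
final class MinExactHitsGuard {

  /** Stand-in for the subset of Lucene's ScoreMode values discussed here. */
  enum ScoreMode { COMPLETE, COMPLETE_NO_SCORES, TOP_SCORES }

  /**
   * Returns the threshold to hand to the top-docs collector.
   * postFilterScoreMode is null when the request has no post filter.
   */
  static int effectiveMinExactHits(int requested, ScoreMode postFilterScoreMode) {
    boolean postFilterNeedsAllDocs =
        postFilterScoreMode != null && postFilterScoreMode != ScoreMode.TOP_SCORES;
    // A COMPLETE post filter (e.g. Collapse) sees every match anyway, so
    // approximation cannot save work; request an exact count instead.
    return postFilterNeedsAllDocs ? Integer.MAX_VALUE : requested;
  }

  public static void main(String[] args) {
    System.out.println(effectiveMinExactHits(100, null));
    System.out.println(effectiveMinExactHits(100, ScoreMode.TOP_SCORES));
    System.out.println(effectiveMinExactHits(100, ScoreMode.COMPLETE));
  }
}
```

With this shape, a request without a post filter, or with one that only needs top scores, keeps the caller's threshold, while a COMPLETE post filter forces exact counting so the response is not flagged as approximate when it is in fact exact.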




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-08 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102987#comment-17102987
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

Docs in https://github.com/apache/lucene-solr/pull/1501




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102951#comment-17102951
 ] 

ASF subversion and git services commented on SOLR-13289:


Commit d9f9d6dd47c06f5fe092d43d6bf0c77c5ff2019f in lucene-solr's branch 
refs/heads/master from Tomas Eduardo Fernandez Lobbe
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d9f9d6d ]

SOLR-13289: Add Support for BlockMax WAND (#1456)

Add support for BlockMax WAND via a minExactHits parameter. Hits will be 
counted accurately at least until this value, and above that, the count will be 
an approximation. In distributed search requests, the count will be per shard, 
so potentially the count will be accurately counted until numShards * 
minExactHits. The response will include the value numFoundExact which can be 
true (The value in numFound is exact) or false (the value in numFound is an 
approximation).
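
To illustrate the per-shard behaviour described in this commit message, here is a simplified model, not the Solr implementation. {{ShardCount}}, {{countOnShard}} and {{merge}} are hypothetical names. Two things are assumptions of this sketch: that a shard over the threshold reports exactly the threshold (a worst case), and that the merged count is exact only when every shard's count is exact.

```java
/** Simplified model of merging per-shard counts; all names are hypothetical. */
final class ShardCountMerge {

  /** One shard's reported hit count and whether that count is exact. */
  static final class ShardCount {
    final long numFound;
    final boolean exact;

    ShardCount(long numFound, boolean exact) {
      this.numFound = numFound;
      this.exact = exact;
    }
  }

  /**
   * Models a single shard: counts up to minExactHits are exact. Beyond that
   * the shard may stop counting early, so the reported value is only a lower
   * bound (modelled here, as a worst case, as the threshold itself).
   */
  static ShardCount countOnShard(long actualMatches, long minExactHits) {
    if (actualMatches <= minExactHits) {
      return new ShardCount(actualMatches, true);
    }
    return new ShardCount(minExactHits, false);
  }

  /** Sums shard counts; the total is exact only if every shard was exact. */
  static ShardCount merge(ShardCount... shards) {
    long total = 0;
    boolean exact = true;
    for (ShardCount s : shards) {
      total += s.numFound;
      exact &= s.exact;
    }
    return new ShardCount(total, exact);
  }

  public static void main(String[] args) {
    long minExactHits = 100;
    // Three shards, each exactly at the threshold: the merged count reaches
    // numShards * minExactHits and is still exact.
    ShardCount all = merge(
        countOnShard(100, minExactHits),
        countOnShard(100, minExactHits),
        countOnShard(100, minExactHits));
    System.out.println(all.numFound + " exact=" + all.exact);
  }
}
```

The point of the model is that the threshold applies per shard, so the merged total can stay exact well above minExactHits, and a single shard going over the threshold is enough to make the merged count approximate.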




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-08 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102867#comment-17102867
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

numFoundExact sounds better to me too. I updated. I plan to merge the current 
PR to master and 8.x. I'll do another PR for docs.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-08 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102755#comment-17102755
 ] 

Ishan Chattopadhyaya commented on SOLR-13289:
-

Instead of {{hitCountExact=false}}, I'd prefer {{numFoundExact=false}} 
or {{numFoundExactness=false}}.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-08 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102722#comment-17102722
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

OK, I updated the response to include a boolean {{hitCountExact}}:

json:
{code:javascript}
"response": {
  "numFound": 4,
  "start": 0,
  "hitCountExact": false,
  "docs": Array[2]
...
{code}
XML:
{code:xml}


...
{code}
Java:
{code:java}
  public Boolean getHitCountExact() {
    return hitCountExact;
  }
{code}




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-06 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100516#comment-17100516
 ] 

Adrien Grand commented on SOLR-13289:
-

bq. TwoPhaseIterator

LUCENE-8806 explores supporting two-phase iterators with block-max WAND.

bq. more signals for calculating the score

We have been recommending using FeatureField for this use-case: 
https://lucene.apache.org/core/8_5_0/core/org/apache/lucene/document/FeatureField.html
It encodes float values as term freqs to be able to work with block-max WAND.

bq. If you just need a custom similarity that uses TF/IDF, then you should be 
able to use it.

Indeed, we only require that scores do not decrease when tf increases and do 
not increase when norm decreases.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-05 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100329#comment-17100329
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq. Grouping
Grouping uses a different collector. It’s not included as part of this PR. 
Maybe there are things that can be done to allow WAND in some cases for Grouping
bq. Custom scoring via value sources
bq. Payloads
The optimization only applies if the natural score is the first thing to be 
sorted on, i.e. {{sort=score desc, id asc}}. If there is anything other than 
“score” there, then the optimization doesn’t apply, not OOTB at least.
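
A rough sketch of the condition described above, assuming a hypothetical helper {{scoreIsPrimarySort}} that inspects the raw {{sort}} parameter string. Solr's real check presumably works on parsed sort objects rather than strings; this only illustrates the rule that {{score}} must be the first sort clause. Treating an absent sort as eligible relies on Solr's default of sorting by score descending.

```java
/** Illustrative string-level check; the helper name is hypothetical. */
final class WandSortCheck {

  /** True when the first sort clause is "score desc", or no sort is given. */
  static boolean scoreIsPrimarySort(String sortParam) {
    if (sortParam == null || sortParam.trim().isEmpty()) {
      return true; // no explicit sort: Solr defaults to score descending
    }
    // Only the first clause matters; later clauses are tie-breakers.
    String firstClause = sortParam.split(",", 2)[0].trim();
    String[] parts = firstClause.split("\\s+");
    return parts.length == 2
        && parts[0].equals("score")
        && parts[1].equalsIgnoreCase("desc");
  }

  public static void main(String[] args) {
    System.out.println(scoreIsPrimarySort("score desc, id asc"));
    System.out.println(scoreIsPrimarySort("id asc, score desc"));
  }
}
```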

bq. Is TwoPhaseIterator or custom ranking in collapse/expand mode or in 
response writer the way? or have to implement a custom query overriding the 
createWeight using ScoreMode.TOP_SCORES?
I don’t know exactly what you want to do. In general, block-max WAND requires 
that the DISI can calculate the maximum contribution for a query in a block of 
documents with the indexed {{impacts}}. If you want to have more signals for 
calculating the score, I guess those could be added as Impacts using a custom 
codec and then you could have a custom scorer to use those ([~jpountz], let me 
know if this is wrong). If you just need a custom similarity that uses TF/IDF, 
then you should be able to use it.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-02 Thread Kranti Parisa (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17098238#comment-17098238
 ] 

Kranti Parisa commented on SOLR-13289:
--

Tomas et al., thanks for taking this up. This will be super useful in terms of 
performance gains.

How does this work for:
- Grouping, Sorting 
- Custom scoring via value sources
- Payloads

Is TwoPhaseIterator, or custom ranking in collapse/expand mode or in the 
response writer, the way to go? Or do we have to implement a custom query 
overriding createWeight using ScoreMode.TOP_SCORES?




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-01 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097729#comment-17097729
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

Sounds good. I'll switch to boolean, and we can document that when not exact, 
it's always greater than.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-01 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097689#comment-17097689
 ] 

David Smiley commented on SOLR-13289:
-

Yeah; let's just go with a simple-to-understand boolean.  Even if we do 
ultimately add more tuning knobs or whatever, this boolean would still be 
factually correct.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-01 Thread Mike Drob (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097678#comment-17097678
 ] 

Mike Drob commented on SOLR-13289:
--

Based on my understanding, I favor the boolean approach.

The only knob we have is a threshold, not an error bound, right? As in, if 
there are more than X results, it's ok to return an estimate that may be too 
low. We don't get to specify that we need the estimate to be within 10% or 1% 
or whatever of the actual result, which is what precision would imply.

I like [~sokolov]'s comment about {{hitCountExact}}, that makes sense to me, 
but there are lots of options here.

Specific comments left on the PR.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-01 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097507#comment-17097507
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq. It seems to me what we want to convey is whether the count is exact, or an 
approximation. Are there any other "relations" that can be returned here? I 
think that's it, so if instead we were to use a boolean
That was my initial thought too. See [#comment-17094032]. I could be convinced 
to go back to boolean if we think we don't need the relation. My biggest 
concern was options that could be added in Lucene that would be impossible to 
represent with a Boolean. I guess in that case, if it happens, we can go back 
and do another API change then.

Regardless of what term we use (numFoundPrecision seems to be the most 
popular), I plan to commit this soon (probably early next week) unless there 
are any concerns.





[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-01 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097399#comment-17097399
 ] 

Michael Sokolov commented on SOLR-13289:


It seems to me what we want to convey is whether the count is exact, or an 
approximation. Are there any other "relations" that can be returned here? I 
think that's it, so if instead we were to use a boolean, then the naming could 
be {{isCountExact}} (or {{isCountApproximate}}). I suppose to maintain the enum, 
we could say something like {{countExactitude}} or {{countApproximation}} (or 
hitCount or numFound prefix is fine too), but it's slightly more awkward. 
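
As a small illustration of the boolean alternative: Lucene's {{TotalHits.Relation}} has two values, {{EQUAL_TO}} and {{GREATER_THAN_OR_EQUAL_TO}}, so it collapses to a flag without losing information. The enum below is a local copy kept only so the sketch is self-contained, {{isCountExact}} borrows the name suggested above, and {{display}} is a hypothetical helper showing how a client might render the count.

```java
/** Illustrative only: maps a two-valued relation onto a boolean flag. */
final class CountExactness {

  /** Local mirror of the two values of Lucene's TotalHits.Relation. */
  enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

  /** The boolean carries the same information as the two-valued relation. */
  static boolean isCountExact(Relation relation) {
    return relation == Relation.EQUAL_TO;
  }

  /** Renders a count for display, marking lower bounds with a trailing '+'. */
  static String display(long numFound, Relation relation) {
    return isCountExact(relation) ? Long.toString(numFound) : numFound + "+";
  }

  public static void main(String[] args) {
    System.out.println(display(4, Relation.EQUAL_TO));
    System.out.println(display(4, Relation.GREATER_THAN_OR_EQUAL_TO));
  }
}
```

If Lucene ever added a third relation, the boolean would no longer be enough, which is the trade-off being weighed in this thread.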




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-01 Thread Anshum Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097234#comment-17097234
 ] 

Anshum Gupta commented on SOLR-13289:
-

I'm still reviewing the PR but wanted to weigh in on the use of the term 
{{precision}}. I don't have something I feel strongly about, but 
{{numFoundApproximation}} might be a better term.

If I can think of something better while I review this tomorrow, I'll share in 
the feedback.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095947#comment-17095947
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

Well, that's my point. The value in this attribute is the relation between the 
{{numFound}} and the real number of hits for a query in the index. The first 
time you see the attribute you may ask yourself what it is and refer to the 
docs, but after that it should make sense. I would expect the value of a 
{{numFoundPrecision}} to be a measure of the precision, which we can't provide.




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095935#comment-17095935
 ] 

David Smiley commented on SOLR-13289:
-

I think “precision” in there is easier to understand and does not mandate a 
percentage. 
“Relation” begs the question “relative to what?”




[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095908#comment-17095908
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

I still don't like the word "precision" at all in this. It sounds like the 
value of a "precision" attribute would be something like "10%", or something 
indicating the degree of precision of the result. I still prefer the word 
"relation", as Lucene used. Maybe {{numFoundRelation}} sounds better to you 
than {{hitCountRelation}}? Maybe {{hitFoundRelation}}? Meaning {{actual hits 
>=/== numFound}}? Or {{hitCountRelation}} meaning {{actual hits >=/== count}}?

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095836#comment-17095836
 ] 

Ishan Chattopadhyaya commented on SOLR-13289:
-

+1 on numFoundPrecision. Other idea: numFoundCoverage (i.e. how much of the 
current result covers the exact resultset).

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095832#comment-17095832
 ] 

Andrzej Bialecki commented on SOLR-13289:
-

Yeah, naming things is hard :) I’m not a native speaker either, I just realized 
that the name clashes with the value that it describes and itself is 
meaningless (what is a “relation” here? relation to what? the Lucene enum name 
barely makes sense either, only after you read the javadocs.)

Agreed on the confusion with the ‘precision’ name. Maybe numFoundPrecision? 
Eh... I’ll stop here.

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095757#comment-17095757
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

bq. IMHO the hitCountRelation vs numFound
Yeah, I'm not sure about this either. I was thinking of "numFoundRelation", but 
then I thought "maybe it's clear that this is the relation between "numFound" 
and the actual number of hits". I don't know, me not being a native English 
speaker certainly may be obscuring things for me. I think {{precision}} may be 
confusing in the context of IR.

bq. perhaps use GT_EQ or EQ, short and not too cryptic?
+1. We don't need to be so verbose for every request.

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-29 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095150#comment-17095150
 ] 

Andrzej Bialecki commented on SOLR-13289:
-

IMHO the {{hitCountRelation}} vs {{numFound}} pairing is jarring: at first 
glance the two names look cryptic and unrelated to each other. I understand 
that this reflects the API naming in Lucene, but I think Solr could be much 
more user-friendly here and use a name that is both related to {{numFound}} and 
self-explanatory - perhaps {{numFoundPrecision}} or simply {{precision}}?

After all, Solr doesn't use Lucene's {{totalHits}} name either, right?

The enum name is also very long - in total this element adds 40 characters to 
the response for something that is a simple flag ...

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-28 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094923#comment-17094923
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

This is how the relation comes back when using json response writer:
{code}
  "response": {
    "numFound": 21,
    "start": 0,
    "hitCountRelation": "GREATER_THAN_OR_EQUAL_TO",
    "docs": [
      ...
{code}
and xml
{code:xml}


...
{code}
When using SolrJ the {{SolrDocumentList}} contains the relation, and can be 
accessed via the {{getHitCountRelation()}} method.
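
For illustration only, here is a minimal Python sketch of how a client might 
consume the JSON shape shown above. The field names and values are taken from 
that fragment; the helper {{describe_hit_count}} and the fallback to 
{{EQUAL_TO}} when the field is absent are assumptions made for this example, 
not part of the PR.

```python
import json

# Sample payload mirroring the JSON fragment above; "docs" is left empty here.
RAW = """
{
  "response": {
    "numFound": 21,
    "start": 0,
    "hitCountRelation": "GREATER_THAN_OR_EQUAL_TO",
    "docs": []
  }
}
"""


def describe_hit_count(response: dict) -> str:
    """Render numFound for display, marking it as a lower bound when inexact."""
    num_found = response["numFound"]
    # Assumption: a response without the field carries an exact count.
    relation = response.get("hitCountRelation", "EQUAL_TO")
    if relation == "EQUAL_TO":
        return f"{num_found} results"
    return f"at least {num_found} results"


print(describe_hit_count(json.loads(RAW)["response"]))
```

The point of the sketch is that the count is always read from {{numFound}}, and 
the relation only changes how that number is presented.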

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-27 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094032#comment-17094032
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

The code in [the PR|https://github.com/apache/lucene-solr/pull/1456] is taking 
shape; please take a look if you are interested in the change, and in 
particular at the API. I decided to include the {{hitCountRelation}} in the 
response, similar to what Lucene returns (which can be {{EQUAL_TO}} or 
{{GREATER_THAN_OR_EQUAL_TO}}). I initially thought about just including a 
boolean to say whether the hit count was exact or an approximation, but I 
decided to include the relation:
1) To be more precise
2) To be open to possible changes in Lucene, which could give more values to 
this relation

I had to ignore a Javabin "forward compatibility" test, since the binaries 
change and break the test; however, I believe the changes are compatible 
(both "forward" and "backward"). I left the current default of minExactHits as 
MAX_INT (count all) for now. I plan to open a new issue to have that 
discussion separately.
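
As a rough illustration of the contract described above, the following Python 
toy model counts matches exactly up to a threshold and then reports only a 
lower bound. It is a simplification written for this example and not how 
Lucene or the PR implements it: the function {{count_hits}} is hypothetical, 
and the toy stops at the first hit past the threshold, whereas a real collector 
keeps scoring competitive documents.

```python
from typing import Iterable, Tuple

INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE, the "count all" default


def count_hits(matches: Iterable[int],
               min_exact_hits: int = INT_MAX) -> Tuple[int, str]:
    """Count matches exactly up to min_exact_hits, then report a lower bound."""
    count = 0
    for _ in matches:
        count += 1
        if count > min_exact_hits:
            # Past the threshold we only know there are at least this many hits.
            return count, "GREATER_THAN_OR_EQUAL_TO"
    return count, "EQUAL_TO"


print(count_hits(range(50), 10))  # threshold exceeded: lower bound
print(count_hits(range(10), 10))  # within threshold: exact
print(count_hits(range(50)))      # default threshold: counts everything
```

With the default threshold the relation is always {{EQUAL_TO}}, which is why 
keeping MAX_INT as the default leaves existing behaviour unchanged.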

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-24 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091835#comment-17091835
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

Here is my current progress: https://github.com/apache/lucene-solr/pull/1456

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-24 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091785#comment-17091785
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

Thanks Ishan, I've made some progress too. I can look at your changes and merge 
what's needed.

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-23 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091106#comment-17091106
 ] 

Ishan Chattopadhyaya commented on SOLR-13289:
-

[~tflobbe], I was working on this last month and I'm actually much farther 
along on the patch than what I put here. I'll put together an updated patch by 
next week, and we can collaborate on this from there. WDYT?

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-21 Thread Munendra S N (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089241#comment-17089241
 ] 

Munendra S N commented on SOLR-13289:
-

[~tflobbe]
Sure, please go ahead

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-04-21 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089141#comment-17089141
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

I'd like to take a look at this if you are not working on it, [~munendrasn]

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-01-28 Thread Gregg Donovan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025286#comment-17025286
 ] 

Gregg Donovan commented on SOLR-13289:
--

{quote}This feature currently doesn't work in case of faceting(this is 
expected), grouping.{quote}

Will WAND cause faceting to break entirely? Or will the counts for facets just 
be inexact?

{quote}as same minExactHits is shared across shard. so, actual minExactHits is 
shardCount*minExactHits{quote}
Perhaps it would be worth having an additional parameter for a 
perShardExactHits? E.g. if we're requesting the top 1000 hits across 64 shards, 
we'd likely be fine with WAND getting the top, say, 150 per shard.
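
To make the arithmetic concrete, here is a small Python sketch using the 
numbers from this comment. {{perShardExactHits}} is only the parameter proposed 
above, not an existing option, and the helper name 
{{effective_exact_hits}} is made up for the example.

```python
def effective_exact_hits(shard_count: int, per_shard_threshold: int) -> int:
    """Upper bound on exactly counted hits when every shard applies the threshold."""
    return shard_count * per_shard_threshold


# Top 1000 requested across 64 shards.
shared = effective_exact_hits(64, 1000)    # each shard receives minExactHits=1000
per_shard = effective_exact_hits(64, 150)  # hypothetical perShardExactHits=150

print(shared, per_shard)
```

Under these assumptions the shared setting allows up to 64000 exactly counted 
hits across the collection, against 9600 with the per-shard value.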


> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2019-10-28 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961301#comment-16961301
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

Thanks for taking this, Ishan. I'm also +1 for option 1. One thing I think we 
should do is keep the same element in the response (numFound) as the place 
where the count is set, regardless of whether it's an estimate or exact. 
Otherwise it will be much more difficult for users to know where to fetch the 
value (e.g. I want an exact count for up to 1000 results and I run a query: 
which element in the response do I look at to display in the UI?). We can then 
also include a boolean in the response that indicates whether the value 
displayed is exact or an estimate. It'll also make it easier to change the 
default without breaking compatibility.

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.






[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2019-10-28 Thread Munendra S N (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961295#comment-16961295
 ] 

Munendra S N commented on SOLR-13289:
-

 [^SOLR-13289.patch] 
[~ichattopadhyaya]
As discussed, I have started working on this.
* The shared patch is rebased on the latest master
* Any negative value for {{minExactHits}} is treated as 
{{Integer.MAX_VALUE}}
* Added initial support for distrib mode, but there are some catches
** as same minExactHits is shared across shard. so, actual minExactHits is 
shardCount*minExactHits
** {{maxScore}} won't be returned for non-max_value minExactHits; it could 
come back as {{NaN}}. I have raised SOLR-13839 for this NaN

This feature currently doesn't work in case of faceting(this is expected), 
grouping.
Next, I need to add test cases for all components.
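
The two rules above can be sketched in Python as follows. This is an 
illustration under stated assumptions rather than the patch's code: the 
function names are hypothetical, and the merge rule (sum the shard counts and 
treat the total as exact only if every shard reported an exact count) is one 
plausible way to combine shard responses, not something confirmed by the patch.

```python
from typing import Iterable, Tuple

INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE


def normalize_min_exact_hits(value: int) -> int:
    """Treat any negative minExactHits as 'count everything'."""
    return INT_MAX if value < 0 else value


def merge_shard_counts(shards: Iterable[Tuple[int, bool]]) -> Tuple[int, bool]:
    """Sum per-shard (numFound, exact) pairs; exact only if every shard was."""
    total, exact = 0, True
    for num_found, shard_exact in shards:
        total += num_found
        exact = exact and shard_exact
    return total, exact


print(normalize_min_exact_hits(-1))
print(merge_shard_counts([(120, True), (1000, False), (40, True)]))
```

Summing lower bounds still gives a lower bound, so a single inexact shard is 
enough to make the merged count inexact.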


> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.


