[jira] [Commented] (LUCENE-7997) More sanity testing of similarities

2017-10-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218204#comment-16218204
 ] 

Adrien Grand commented on LUCENE-7997:
--

Thanks Robert!

> More sanity testing of similarities
> ---
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Priority: Minor
> Fix For: master (8.0)
>
> Attachments: LUCENE-7997.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7997) More sanity testing of similarities

2017-10-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218010#comment-16218010
 ] 

ASF subversion and git services commented on LUCENE-7997:
-

Commit 42717d5f4bbed46009f11a86f307541a19fd7fb5 in lucene-solr's branch 
refs/heads/master from [~rcmuir]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=42717d5 ]

LUCENE-7997: More sanity testing of similarities








[jira] [Commented] (LUCENE-7997) More sanity testing of similarities

2017-10-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217005#comment-16217005
 ] 

Robert Muir commented on LUCENE-7997:
-

I would like to commit this soonish; otherwise the patch will quickly conflict 
and go out of date.

I will sanity test to ensure the changes didn't introduce bugs in the formulas. 

I also think we should open a second issue about dealing with bad apple sims, 
and change the AwaitsFix to point to that.







[jira] [Commented] (LUCENE-7997) More sanity testing of similarities

2017-10-21 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213777#comment-16213777
 ] 

Adrien Grand commented on LUCENE-7997:
--

I like where the patch is going!







[jira] [Commented] (LUCENE-7997) More sanity testing of similarities

2017-10-17 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208515#comment-16208515
 ] 

Robert Muir commented on LUCENE-7997:
-

Exactly, we need a picky base sim test for this (like how 
BaseTokenStreamTestCase checks various requirements for analyzers). Currently 
these properties are "scattered" across various parts of the code/tests/issues, 
such as scores not being inf/NaN for some collectors, not being negative, 
monotonic tf, etc., as maxscore requires. Sims that use certain statistics 
should fall back to other things when term frequencies are omitted, etc. It 
would be better to ensure we test all sims for all these things with direct 
tests. We should also try to test all norm values explicitly so that there 
aren't problems with super large documents and so on.
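
To make the properties listed above concrete, here is a minimal, self-contained sketch of such a direct check. It does not use Lucene's Similarity API or the actual test from the patch. The class name SimilaritySanityCheck, the scorers bm25Like and broken, and all constants are hypothetical and exist only for illustration. The check sweeps freq across several document lengths and rejects a scorer whose score is NaN/infinite, negative, or decreasing as freq grows.

```java
import java.util.function.DoubleBinaryOperator;

public class SimilaritySanityCheck {

    /** Hypothetical BM25-style term score; all constants are illustrative assumptions. */
    public static double bm25Like(double freq, double docLen) {
        final double k1 = 1.2, b = 0.75, avgLen = 100.0, idf = 2.0;
        double norm = k1 * (1 - b + b * docLen / avgLen);
        return idf * freq / (freq + norm);
    }

    /** Deliberately broken scorer: it goes negative and decreases as freq grows. */
    public static double broken(double freq, double docLen) {
        return 1.0 - freq / docLen;
    }

    /**
     * Returns true if, for every document length, the score is finite,
     * non-negative, and non-decreasing as freq goes from 1 to maxFreq.
     */
    public static boolean isSane(DoubleBinaryOperator scorer, int maxFreq, double[] docLens) {
        for (double len : docLens) {
            double prev = 0;
            for (int freq = 1; freq <= maxFreq; freq++) {
                double score = scorer.applyAsDouble(freq, len);
                if (Double.isNaN(score) || Double.isInfinite(score)) {
                    return false; // some collectors cannot handle inf/NaN scores
                }
                if (score < 0) {
                    return false; // scores must not be negative
                }
                if (freq > 1 && score < prev) {
                    return false; // score must not drop when freq increases
                }
                prev = score;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Include very short and very long documents to exercise extreme lengths.
        double[] lens = {1, 10, 100, 1e6, Integer.MAX_VALUE};
        System.out.println("bm25Like sane: " + isSane(SimilaritySanityCheck::bm25Like, 1000, lens));
        System.out.println("broken sane: " + isSane(SimilaritySanityCheck::broken, 1000, lens));
    }
}
```

A real base test would presumably drive each similarity through its own scoring path and cover every encodable norm value rather than a handful of lengths. The loop above only illustrates the shape of the assertions.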





