Thanks Mike for having looked into these issues while I was away! Le dim. 13 nov. 2016 à 23:08, Michael McCandless <luc...@mikemccandless.com> a écrit :
> OK thanks Rob, I will revert my changes (that made test coverage of > our similarities wimpier) and pull BooleanSimilarity from the random > rotation instead. > > Mike McCandless > > http://blog.mikemccandless.com > > On Sun, Nov 13, 2016 at 9:16 AM, Robert Muir <rcm...@gmail.com> wrote: > > By the way, the third option, even better, is to replace all the stuff > > these tests are doing with direct ones in the similarity unit tests. > > For example TestSimilarityBase.testCrazyIndexTimeBoosts() > > > > But that is a good deal of work. Without that, its important for > > tests, somewhere, somehow to catch when these scoring systems are not > > well-behaved (e.g. some bug in the formula causing scoring formula to > > go backwards and many other things). > > > > That is why not all Similarities are in RandomSImilarity, because some > > still cannot meet that bar and have crazy behavior, some may be > > unfixable! But the tests found tons of bugs in these formulas, where > > bad things happen when their expected distribution is not met, or > > where features of lucene like boost don't play well with them. It is > > important to have tests, somewhere, somehow, that will find these > > things. > > > > Making all the tests wimpy so that BooleanSimilarity can be in there > > doesn't make any sense to me. > > > > > > On Sun, Nov 13, 2016 at 8:58 AM, Robert Muir <rcm...@gmail.com> wrote: > >> I think these fixes are not the best way to go. > >> > >> Overall, these commits destroy large features of lucene's test suite, > >> for testing that similarities are well-behaved and disables them. > >> Previously, they were randomized across all similarities: This is the > >> whole point of doing that. > >> > >> But why add BooleanSimilarity to the random list at all, when any test > >> failures it causes (because of its own brokenness!), will be disabled > >> to use only BM25???? This hurts all of lucene. > >> > >> Instead, I think all these test changes should be reverted and either: > >> 1) BooleanSimilarity should be fixed to not be so brain-dead: e.g. why > >> does it ignore index-time boost but not query-time boost? This is > >> buggy are some of the reasons these tests fail. > >> > >> or > >> > >> 2) Remove BooleanSimilarity from the randomized testing setup here. > >> > >> > >> > >> > >> On Sat, Nov 12, 2016 at 1:29 PM, <mikemcc...@apache.org> wrote: > >>> Repository: lucene-solr > >>> Updated Branches: > >>> refs/heads/master 4d9451034 -> 2bc1d2761 > >>> > >>> > >>> LUCENE-7555: use BM25Similarity for this test > >>> > >>> > >>> Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo > >>> Commit: > http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/2bc1d276 > >>> Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/2bc1d276 > >>> Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/2bc1d276 > >>> > >>> Branch: refs/heads/master > >>> Commit: 2bc1d2761f593b2ceea19b3195e0cb430318ceaa > >>> Parents: 4d94510 > >>> Author: Mike McCandless <mikemcc...@apache.org> > >>> Authored: Sat Nov 12 13:28:57 2016 -0500 > >>> Committer: Mike McCandless <mikemcc...@apache.org> > >>> Committed: Sat Nov 12 13:28:57 2016 -0500 > >>> > >>> ---------------------------------------------------------------------- > >>> .../src/test/org/apache/solr/uninverting/TestFieldCacheSort.java | 4 > +++- > >>> 1 file changed, 3 insertions(+), 1 deletion(-) > >>> ---------------------------------------------------------------------- > >>> > >>> > >>> > http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/2bc1d276/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java > >>> ---------------------------------------------------------------------- > >>> diff --git > a/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java > b/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java > >>> index d53f610..4755c8f 100644 > >>> --- > a/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java > >>> +++ > b/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java > >>> @@ -50,10 +50,11 @@ import org.apache.lucene.search.Sort; > >>> import org.apache.lucene.search.SortField; > >>> import org.apache.lucene.search.TermQuery; > >>> import org.apache.lucene.search.TopDocs; > >>> +import org.apache.lucene.search.similarities.BM25Similarity; > >>> import org.apache.lucene.store.Directory; > >>> -import org.apache.solr.uninverting.UninvertingReader.Type; > >>> import org.apache.lucene.util.LuceneTestCase; > >>> import org.apache.lucene.util.TestUtil; > >>> +import org.apache.solr.uninverting.UninvertingReader.Type; > >>> > >>> /* > >>> * Tests sorting (but with fieldcache instead of docvalues) > >>> @@ -434,6 +435,7 @@ public class TestFieldCacheSort extends > LuceneTestCase { > >>> writer.close(); > >>> > >>> IndexSearcher searcher = newSearcher(ir); > >>> + searcher.setSimilarity(new BM25Similarity()); > >>> Sort sort = new Sort(new SortField(null, SortField.Type.SCORE, > true)); > >>> > >>> TopDocs actual = searcher.search(new TermQuery(new Term("value", > "foo")), 10, sort); > >>> > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > For additional commands, e-mail: dev-h...@lucene.apache.org > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >