Thanks Mike for having looked into these issues while I was away!

Le dim. 13 nov. 2016 à 23:08, Michael McCandless <luc...@mikemccandless.com>
a écrit :

> OK thanks Rob, I will revert my changes (that made test coverage of
> our similarities wimpier) and pull BooleanSimilarity from the random
> rotation instead.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Sun, Nov 13, 2016 at 9:16 AM, Robert Muir <rcm...@gmail.com> wrote:
> > By the way, the third option, even better, is to replace all the stuff
> > these tests are doing with direct ones in the similarity unit tests.
> > For example TestSimilarityBase.testCrazyIndexTimeBoosts()
> >
> > But that is a good deal of work. Without that, its important for
> > tests, somewhere, somehow to catch when these scoring systems are not
> > well-behaved (e.g. some bug in the formula causing scoring formula to
> > go backwards and many other things).
> >
> > That is why not all Similarities are in RandomSImilarity, because some
> > still cannot meet that bar and have crazy behavior, some may be
> > unfixable! But the tests found tons of bugs in these formulas, where
> > bad things happen when their expected distribution is not met, or
> > where features of lucene like boost don't play well with them. It is
> > important to have tests, somewhere, somehow, that will find these
> > things.
> >
> > Making all the tests wimpy so that BooleanSimilarity can be in there
> > doesn't make any sense to me.
> >
> >
> > On Sun, Nov 13, 2016 at 8:58 AM, Robert Muir <rcm...@gmail.com> wrote:
> >> I think these fixes are not the best way to go.
> >>
> >> Overall, these commits destroy large features of lucene's test suite,
> >> for testing that similarities are well-behaved and disables them.
> >> Previously, they were randomized across all similarities: This is the
> >> whole point of doing that.
> >>
> >> But why add BooleanSimilarity to the random list at all, when any test
> >> failures it causes (because of its own brokenness!), will be disabled
> >> to use only BM25???? This hurts all of lucene.
> >>
> >> Instead, I think all these test changes should be reverted and either:
> >> 1) BooleanSimilarity should be fixed to not be so brain-dead: e.g. why
> >> does it ignore index-time boost but not query-time boost? This is
> >> buggy are some of the reasons these tests fail.
> >>
> >> or
> >>
> >> 2) Remove BooleanSimilarity from the randomized testing setup here.
> >>
> >>
> >>
> >>
> >> On Sat, Nov 12, 2016 at 1:29 PM,  <mikemcc...@apache.org> wrote:
> >>> Repository: lucene-solr
> >>> Updated Branches:
> >>>   refs/heads/master 4d9451034 -> 2bc1d2761
> >>>
> >>>
> >>> LUCENE-7555: use BM25Similarity for this test
> >>>
> >>>
> >>> Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
> >>> Commit:
> http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/2bc1d276
> >>> Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/2bc1d276
> >>> Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/2bc1d276
> >>>
> >>> Branch: refs/heads/master
> >>> Commit: 2bc1d2761f593b2ceea19b3195e0cb430318ceaa
> >>> Parents: 4d94510
> >>> Author: Mike McCandless <mikemcc...@apache.org>
> >>> Authored: Sat Nov 12 13:28:57 2016 -0500
> >>> Committer: Mike McCandless <mikemcc...@apache.org>
> >>> Committed: Sat Nov 12 13:28:57 2016 -0500
> >>>
> >>> ----------------------------------------------------------------------
> >>>  .../src/test/org/apache/solr/uninverting/TestFieldCacheSort.java | 4
> +++-
> >>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>> ----------------------------------------------------------------------
> >>>
> >>>
> >>>
> http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/2bc1d276/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java
> >>> ----------------------------------------------------------------------
> >>> diff --git
> a/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java
> b/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java
> >>> index d53f610..4755c8f 100644
> >>> ---
> a/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java
> >>> +++
> b/solr/core/src/test/org/apache/solr/uninverting/TestFieldCacheSort.java
> >>> @@ -50,10 +50,11 @@ import org.apache.lucene.search.Sort;
> >>>  import org.apache.lucene.search.SortField;
> >>>  import org.apache.lucene.search.TermQuery;
> >>>  import org.apache.lucene.search.TopDocs;
> >>> +import org.apache.lucene.search.similarities.BM25Similarity;
> >>>  import org.apache.lucene.store.Directory;
> >>> -import org.apache.solr.uninverting.UninvertingReader.Type;
> >>>  import org.apache.lucene.util.LuceneTestCase;
> >>>  import org.apache.lucene.util.TestUtil;
> >>> +import org.apache.solr.uninverting.UninvertingReader.Type;
> >>>
> >>>  /*
> >>>   * Tests sorting (but with fieldcache instead of docvalues)
> >>> @@ -434,6 +435,7 @@ public class TestFieldCacheSort extends
> LuceneTestCase {
> >>>      writer.close();
> >>>
> >>>      IndexSearcher searcher = newSearcher(ir);
> >>> +    searcher.setSimilarity(new BM25Similarity());
> >>>      Sort sort = new Sort(new SortField(null, SortField.Type.SCORE,
> true));
> >>>
> >>>      TopDocs actual = searcher.search(new TermQuery(new Term("value",
> "foo")), 10, sort);
> >>>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

Reply via email to