[jira] [Commented] (LUCENE-8129) Support for defining a Unicode set filter when using ICUFoldingFilter

2018-01-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324246#comment-16324246 ] Robert Muir commented on LUCENE-8129: - otherwise patch looks great to me. thanks! >

[jira] [Commented] (LUCENE-8129) Support for defining a Unicode set filter when using ICUFoldingFilter

2018-01-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324172#comment-16324172 ] Robert Muir commented on LUCENE-8129: - Minor nitpick: can we rename it from {{normali

[jira] [Commented] (LUCENE-8129) Support for defining a Unicode set filter when using ICUFoldingFilter

2018-01-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323940#comment-16323940 ] Robert Muir commented on LUCENE-8129: - Need to double-check, but I'm pretty sure thes

[jira] [Commented] (LUCENE-8129) Support for defining a Unicode set filter when using ICUFoldingFilter

2018-01-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323919#comment-16323919 ] Robert Muir commented on LUCENE-8129: - Thanks for the patch. I don't like the change

[jira] [Moved] (LUCENE-8129) Support for defining a Unicode set filter when using ICUFoldingFilter

2018-01-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir moved SOLR-11811 to LUCENE-8129: Security: (was: Public) Component/s: (was: Schema and An

[jira] [Commented] (LUCENE-8119) Remove SimScorer.maxScore(maxFreq)

2018-01-09 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16318384#comment-16318384 ] Robert Muir commented on LUCENE-8119: - +1 > Remove SimScorer.maxScore(maxFreq) > ---

[jira] [Updated] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8125: Attachment: LUCENE-8125.patch I added tests for emoji tag sequences. I also refactored TestICUToken

[jira] [Updated] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8125: Attachment: LUCENE-8125.patch I updated with a middle of the road approach better matching our RBBI

[jira] [Updated] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8125: Attachment: LUCENE-8125.patch Updated patch just with some code comments explaining the logic, in p

[jira] [Updated] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8125: Attachment: LUCENE-8125.patch I updated the patch with support for presentation selectors (http://

[jira] [Commented] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317274#comment-16317274 ] Robert Muir commented on LUCENE-8125: - Note: I think it'd be nice to fix for standard

[jira] [Updated] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8125: Attachment: LUCENE-8125.patch Here's a patch. I did more cleanup of outdated breakiterator stuff wh

[jira] [Created] (LUCENE-8125) emoji sequence support in ICUTokenizer

2018-01-08 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-8125: --- Summary: emoji sequence support in ICUTokenizer Key: LUCENE-8125 URL: https://issues.apache.org/jira/browse/LUCENE-8125 Project: Lucene - Core Issue Type: Impr

[jira] [Resolved] (LUCENE-8122) upgrade icu to 60.2

2018-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-8122. - Resolution: Fixed Fix Version/s: 7.3 trunk > upgrade icu to 60.2 >

[jira] [Resolved] (LUCENE-8123) Question about how to retrieve by TFIDFSimilarity query on lucene

2018-01-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-8123. - Resolution: Invalid Please use the mailing list for questions. > Question about how to retrieve

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314869#comment-16314869 ] Robert Muir commented on LUCENE-8118: - Dawid it is not complicated in this case. It i

[jira] [Updated] (LUCENE-8122) upgrade icu to 60.2

2018-01-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8122: Attachment: LUCENE-8122.patch with regenerated jar checksums too. > upgrade icu to 60.2 >

[jira] [Updated] (LUCENE-8122) upgrade icu to 60.2

2018-01-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8122: Attachment: LUCENE-8122.patch > upgrade icu to 60.2 > --- > > Key:

[jira] [Created] (LUCENE-8122) upgrade icu to 60.2

2018-01-06 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-8122: --- Summary: upgrade icu to 60.2 Key: LUCENE-8122 URL: https://issues.apache.org/jira/browse/LUCENE-8122 Project: Lucene - Core Issue Type: Improvement C

[jira] [Commented] (LUCENE-4198) Allow codecs to index term impacts

2018-01-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314678#comment-16314678 ] Robert Muir commented on LUCENE-4198: - {quote} Current take is to add PostingsEnum.se

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314545#comment-16314545 ] Robert Muir commented on LUCENE-8118: - Well, I think a simple limit can work. For thi

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313744#comment-16313744 ] Robert Muir commented on LUCENE-8118: - the test had to work hard to hit AIOOBE instea

[jira] [Updated] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8118: Attachment: LUCENE-8118_test.patch Here's a really bad test, but it works (takes about 2 minutes).

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313337#comment-16313337 ] Robert Muir commented on LUCENE-8118: - yeah, but we still need to fix the case where

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313322#comment-16313322 ] Robert Muir commented on LUCENE-8118: - whatever we decide to do, we can be sure that

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313318#comment-16313318 ] Robert Muir commented on LUCENE-8118: - Well, I understand the bug, but not sure what

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313299#comment-16313299 ] Robert Muir commented on LUCENE-8118: - It is nothing like that, it is simply a bug.

[jira] [Commented] (LUCENE-8119) Remove SimScorer.maxScore(maxFreq)

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313287#comment-16313287 ] Robert Muir commented on LUCENE-8119: - I think we can see it does what we want via th

[jira] [Commented] (LUCENE-8119) Remove SimScorer.maxScore(maxFreq)

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313280#comment-16313280 ] Robert Muir commented on LUCENE-8119: - Right, I think it's better to look at 0 as just

[jira] [Commented] (LUCENE-8118) ArrayIndexOutOfBoundsException in TermsHashPerField.writeByte during indexing

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313271#comment-16313271 ] Robert Muir commented on LUCENE-8118: - Issuing unnecessary commits is just masking th

[jira] [Commented] (LUCENE-8119) Remove SimScorer.maxScore(maxFreq)

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313193#comment-16313193 ] Robert Muir commented on LUCENE-8119: - I don't understand the 0 stuff. This shouldn't

[jira] [Commented] (LUCENE-8113) Allow terms dictionary lookups to be lazy when scores are not needed

2018-01-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312968#comment-16312968 ] Robert Muir commented on LUCENE-8113: - Wait, I don't think that's right. There's no

[jira] [Commented] (LUCENE-8116) Similarity scores should depend only on freq and norm

2018-01-04 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16311303#comment-16311303 ] Robert Muir commented on LUCENE-8116: - +1, this is a nice cleanup. > Similarity scor

[jira] [Commented] (LUCENE-4198) Allow codecs to index term impacts

2018-01-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310818#comment-16310818 ] Robert Muir commented on LUCENE-4198: - {quote} The similarity API doesn't make it eas

[jira] [Commented] (LUCENE-8113) Allow terms dictionary lookups to be lazy when scores are not needed

2018-01-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307943#comment-16307943 ] Robert Muir commented on LUCENE-8113: - Can we fold this into TermContext directly rat

[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-12-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306850#comment-16306850 ] Robert Muir commented on LUCENE-7976: - {quote} Should we log a warning if someone con

[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2017-12-29 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306201#comment-16306201 ] Robert Muir commented on LUCENE-7976: - {quote} <3b> is my attempt to reconcile the is

[jira] [Commented] (LUCENE-8010) fix or sandbox similarities in core with problems

2017-12-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304100#comment-16304100 ] Robert Muir commented on LUCENE-8010: - {quote} +// Not adding the AxiomaticF3 sim

[jira] [Commented] (LUCENE-8010) fix or sandbox similarities in core with problems

2017-12-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304098#comment-16304098 ] Robert Muir commented on LUCENE-8010: - And the TODOs about randomizing parameters are

[jira] [Commented] (LUCENE-8010) fix or sandbox similarities in core with problems

2017-12-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304092#comment-16304092 ] Robert Muir commented on LUCENE-8010: - Sorry for the long delay, here is a really rou

[jira] [Commented] (LUCENE-8087) Record per-term max term frequencies

2017-12-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304061#comment-16304061 ] Robert Muir commented on LUCENE-8087: - {quote} I guess the only way to avoid recordin

[jira] [Commented] (LUCENE-8087) Record per-term max term frequencies

2017-12-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304054#comment-16304054 ] Robert Muir commented on LUCENE-8087: - Well, it's not just similarity-specific informat

[jira] [Commented] (LUCENE-8087) Record per-term max term frequencies

2017-12-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16303984#comment-16303984 ] Robert Muir commented on LUCENE-8087: - Maybe it doesn't belong in the terms dict? Thin

[jira] [Commented] (LUCENE-8102) CompiledAutomaton performance for determining common suffix

2017-12-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295157#comment-16295157 ] Robert Muir commented on LUCENE-8102: - Also here are few more ideas: * There is a TOD

[jira] [Commented] (LUCENE-8102) CompiledAutomaton performance for determining common suffix

2017-12-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295003#comment-16295003 ] Robert Muir commented on LUCENE-8102: - The optimization speeds up leading wildcard ty

[jira] [Commented] (LUCENE-8012) Improve Explanation class

2017-12-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289225#comment-16289225 ] Robert Muir commented on LUCENE-8012: - {quote} Changing .getValue() to return a Numb

[jira] [Commented] (LUCENE-8093) TrimFilterFactory should implement MultiTermAwareComponent

2017-12-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287885#comment-16287885 ] Robert Muir commented on LUCENE-8093: - Seems like a reasonable argument to me. The ot

[jira] [Commented] (LUCENE-8093) TrimFilterFactory should implement MultiTermAwareComponent

2017-12-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287706#comment-16287706 ] Robert Muir commented on LUCENE-8093: - Well, I don't think it's a truly formal definiti

[jira] [Commented] (LUCENE-8093) TrimFilterFactory should implement MultiTermAwareComponent

2017-12-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287565#comment-16287565 ] Robert Muir commented on LUCENE-8093: - Any stemmer will work "perfectly well" too, un

[jira] [Commented] (LUCENE-8087) Record per-term max term frequencies

2017-12-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283823#comment-16283823 ] Robert Muir commented on LUCENE-8087: - also for omit norms and omit frequencies cases

[jira] [Commented] (LUCENE-8087) Record per-term max term frequencies

2017-12-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283818#comment-16283818 ] Robert Muir commented on LUCENE-8087: - {quote} Ideally we'd need something like the m

[jira] [Commented] (LUCENE-8010) fix or sandbox similarities in core with problems

2017-12-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16282650#comment-16282650 ] Robert Muir commented on LUCENE-8010: - Thanks for hacking on these! I can run some re

[jira] [Commented] (LUCENE-8083) Give similarities better values for maxScore

2017-12-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16282054#comment-16282054 ] Robert Muir commented on LUCENE-8083: - {quote} IBSimilarity with DistributionSPL, Axi

[jira] [Commented] (LUCENE-8015) TestBasicModelIne.testRandomScoring failure

2017-12-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280508#comment-16280508 ] Robert Muir commented on LUCENE-8015: - Took a glance, I am good with this approach, t

[jira] [Commented] (LUCENE-8015) TestBasicModelIne.testRandomScoring failure

2017-12-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278667#comment-16278667 ] Robert Muir commented on LUCENE-8015: - I think I like the proposed solution. Let's dro

[jira] [Commented] (LUCENE-8015) TestBasicModelIne.testRandomScoring failure

2017-12-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278646#comment-16278646 ] Robert Muir commented on LUCENE-8015: - Adrien it should reproduce every time with the

[jira] [Commented] (LUCENE-8015) TestBasicModelIne.testRandomScoring failure

2017-12-04 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16277325#comment-16277325 ] Robert Muir commented on LUCENE-8015: - thanks for the analysis! I still don't even re

[jira] [Commented] (LUCENE-8073) TestBasicModelIn.testRandomScoring failure

2017-12-01 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274649#comment-16274649 ] Robert Muir commented on LUCENE-8073: - Dup of LUCENE-8015 > TestBasicModelIn.testRan

[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274493#comment-16274493 ] Robert Muir commented on LUCENE-8072: - As far as changes to double precision, we shou

[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274490#comment-16274490 ] Robert Muir commented on LUCENE-8072: - Also, I don't see the benefit to relevance. I am

[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

2017-12-01 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274486#comment-16274486 ] Robert Muir commented on LUCENE-8072: - There is a cost to log1p, I'm not sure we shou

[jira] [Commented] (LUCENE-8008) reduce/remove usages of CheckHits.EXPLAIN_TOLERANCE_*

2017-11-29 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16270893#comment-16270893 ] Robert Muir commented on LUCENE-8008: - +1, thank you for looking into this! > reduce

[jira] [Commented] (LUCENE-8069) Allow index sorting by field length

2017-11-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269976#comment-16269976 ] Robert Muir commented on LUCENE-8069: - Sorry, I don't agree with this issue. In my op

[jira] [Commented] (LUCENE-8028) Arabic Stemmer improvement for Better Search Accuracy

2017-11-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266019#comment-16266019 ] Robert Muir commented on LUCENE-8028: - Hello, sorry for the slow response! (holiday t

[jira] [Commented] (LUCENE-8048) Filesystems do not guarantee order of directories updates

2017-11-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263313#comment-16263313 ] Robert Muir commented on LUCENE-8048: - Erick, can you hold off a bit. I don't think i

[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263148#comment-16263148 ] Robert Muir commented on LUCENE-8060: - +1, I think something like that is a better tr

[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263111#comment-16263111 ] Robert Muir commented on LUCENE-8060: - I don't think it should be mandatory, it shoul

[jira] [Commented] (LUCENE-8053) Similarities should round the length up

2017-11-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259354#comment-16259354 ] Robert Muir commented on LUCENE-8053: - Doesn't need to be the same token: discount_ov

[jira] [Commented] (LUCENE-8053) Similarities should round the length up

2017-11-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259283#comment-16259283 ] Robert Muir commented on LUCENE-8053: - frequencies can always be larger than the leng

[jira] [Commented] (LUCENE-8052) Test failure: TestBasicModelG.testRandomScoring (small numeric delta comparison error)

2017-11-17 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257491#comment-16257491 ] Robert Muir commented on LUCENE-8052: - It is described in LUCENE-8015. It's not a nume

[jira] [Commented] (LUCENE-6228) Do not expose full-fledged scorers in LeafCollector.setScorer

2017-11-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255483#comment-16255483 ] Robert Muir commented on LUCENE-6228: - I don't agree, sorry. We don't need to prevent

[jira] [Commented] (LUCENE-6228) Do not expose full-fledged scorers in LeafCollector.setScorer

2017-11-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255444#comment-16255444 ] Robert Muir commented on LUCENE-6228: - {quote} This just moves the existing problem w

[jira] [Commented] (LUCENE-6228) Do not expose full-fledged scorers in LeafCollector.setScorer

2017-11-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255251#comment-16255251 ] Robert Muir commented on LUCENE-6228: - The casts back to Scorer in some of the tests

[jira] [Commented] (LUCENE-8028) Arabic Stemmer improvement for Better Search Accuracy

2017-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253463#comment-16253463 ] Robert Muir commented on LUCENE-8028: - Also if we could avoid using naming such as "h

[jira] [Commented] (LUCENE-8028) Arabic Stemmer improvement for Better Search Accuracy

2017-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253457#comment-16253457 ] Robert Muir commented on LUCENE-8028: - Can we instead factor out this stemmer into it

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252939#comment-16252939 ] Robert Muir commented on LUCENE-8040: - +1 to take the conservative approach and just

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252325#comment-16252325 ] Robert Muir commented on LUCENE-8040: - No, for 7.x you need to handle the -1 case for stat

[jira] [Commented] (LUCENE-8048) Filesystems do not guarantee order of directories updates

2017-11-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248485#comment-16248485 ] Robert Muir commented on LUCENE-8048: - I think any filesystem that behaves in this wa

[jira] [Commented] (LUCENE-8047) Comparison of String objects using == or !=

2017-11-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247638#comment-16247638 ] Robert Muir commented on LUCENE-8047: - These token type instances are intentionally u

[jira] [Commented] (LUCENE-8048) Filesystems do not guarantee order of directories updates

2017-11-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247616#comment-16247616 ] Robert Muir commented on LUCENE-8048: - Hi [~mar-kolya], I like the idea of the patch.

[jira] [Moved] (LUCENE-8048) Filesystems do not guarantee order of directories updates

2017-11-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir moved SOLR-11626 to LUCENE-8048: Security: (was: Public) Lucene Fields: New Key:

[jira] [Commented] (LUCENE-8042) Add SegmentCachable interface

2017-11-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247602#comment-16247602 ] Robert Muir commented on LUCENE-8042: - +1, this really looks a lot better to me. > A

[jira] [Commented] (SOLR-11595) optimize SolrIndexSearcher.localCollectionStatistics to use cached MultiFields

2017-11-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247598#comment-16247598 ] Robert Muir commented on SOLR-11595: I don't think you should add such a cache to solr

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247596#comment-16247596 ] Robert Muir commented on LUCENE-8040: - Can you please upload a proper patch so it can

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-09 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247097#comment-16247097 ] Robert Muir commented on LUCENE-8040: - It's not saving a "lot". We are talking about m

[jira] [Commented] (LUCENE-8042) Add SegmentCachable interface

2017-11-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244028#comment-16244028 ] Robert Muir commented on LUCENE-8042: - Also, I would still like to see if we can make

[jira] [Commented] (LUCENE-8042) Add SegmentCachable interface

2017-11-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244024#comment-16244024 ] Robert Muir commented on LUCENE-8042: - some of the code in this patch still uses Segm

[jira] [Commented] (LUCENE-8038) Decouple payload decoding from Similarity

2017-11-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241900#comment-16241900 ] Robert Muir commented on LUCENE-8038: - They need to be restricted though, that's why w

[jira] [Commented] (LUCENE-8042) Add SegmentCachable interface

2017-11-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241882#comment-16241882 ] Robert Muir commented on LUCENE-8042: - Can we reconsider the latter? This is a bit t

[jira] [Commented] (LUCENE-8041) All Fields.terms(fld) impls should be O(1) not O(log(N))

2017-11-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240977#comment-16240977 ] Robert Muir commented on LUCENE-8041: - It doesn't need to be all *all* fields.terms i

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240965#comment-16240965 ] Robert Muir commented on LUCENE-8040: - I don't really see it as a separate issue. col

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240867#comment-16240867 ] Robert Muir commented on LUCENE-8040: - Also I think as far as lowering the overhead t

[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

2017-11-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240728#comment-16240728 ] Robert Muir commented on LUCENE-8040: - I don't think we should add caching to the ind

[jira] [Updated] (LUCENE-8031) DOCS_ONLY fields set incorrect length norms

2017-11-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8031: Attachment: LUCENE-8031.patch Here is a patch. I didn't yet improve tests and didn't address downgr

[jira] [Commented] (LUCENE-8038) Decouple payload decoding from Similarity

2017-11-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240220#comment-16240220 ] Robert Muir commented on LUCENE-8038: - I want to urge caution about adding more flexi

[jira] [Commented] (LUCENE-8034) SpanNotWeight returns wrong results due to integer overflow

2017-11-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239839#comment-16239839 ] Robert Muir commented on LUCENE-8034: - I think you are right, I was just confused by

[jira] [Commented] (LUCENE-8034) SpanNotWeight returns wrong results due to integer overflow

2017-11-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239792#comment-16239792 ] Robert Muir commented on LUCENE-8034: - It currently causes scores to go negative with

[jira] [Resolved] (LUCENE-8007) Require that codecs always store totalTermFreq, sumDocFreq and sumTotalTermFreq

2017-11-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-8007. - Resolution: Fixed > Require that codecs always store totalTermFreq, sumDocFreq and > sumTotalTer

[jira] [Updated] (LUCENE-8007) Require that codecs always store totalTermFreq, sumDocFreq and sumTotalTermFreq

2017-11-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8007: Attachment: LUCENE-8007.patch I removed the != 0 checks in checkIndex and added additional checks a

[jira] [Commented] (LUCENE-8019) Add a root failure cause to Explanation

2017-11-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235687#comment-16235687 ] Robert Muir commented on LUCENE-8019: - +1 for the DebugBooleanQuery idea. When I look
