[jira] [Commented] (SOLR-14378) factor out a (contrib/ltr) FilterFeatureScorer class
[ https://issues.apache.org/jira/browse/SOLR-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074264#comment-17074264 ]

David Smiley commented on SOLR-14378:
-------------------------------------

+1 LGTM. No CHANGES.txt entry needed IMO.

> factor out a (contrib/ltr) FilterFeatureScorer class
> ----------------------------------------------------
>
>                 Key: SOLR-14378
>                 URL: https://issues.apache.org/jira/browse/SOLR-14378
>             Project: Solr
>          Issue Type: Task
>          Components: contrib - LTR
>            Reporter: Christine Poerschke
>            Assignee: Christine Poerschke
>            Priority: Minor
>         Attachments: SOLR-14378.patch
>
> It looks like a {{FilterFeatureScorer}} class can be factored out from
> {{OriginalScoreScorer}} (and {{SolrFeatureScorer}} after or with [~dsmiley]'s
> SOLR-14364 changes could use it too).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley opened a new pull request #1399: SOLR-14376: optimize SolrIndexSearcher.getDocSet matches everything
dsmiley opened a new pull request #1399: SOLR-14376: optimize SolrIndexSearcher.getDocSet matches everything
URL: https://github.com/apache/lucene-solr/pull/1399

   https://issues.apache.org/jira/browse/SOLR-14376

   * getProcessedFilter now returns a null filter more reliably when the query matches all docs
   * getProcessedFilter is now clearly documented as an internal method
   * getDocSet detects the all-docs case and exits early via getLiveDocs

   The PR depends on #1381 (SOLR-14364, LTR to not call getProcessedFilter) because the current code there incorrectly assumed that a null ProcessedFilter.filter meant "match nothing" when it means the opposite. Since a null filter is now set more reliably, some tests over there started failing. Really, people should avoid this low-level method if they can.

   The PR has improved javadocs and some changes in getProcessedFilter that I added to improve readability (to me, anyway). I also touched getDocSetBits & makeDocSetBits trivially to change where an instanceof check happens because I thought it was clearer.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
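The all-docs early exit described in the PR can be sketched as follows. This is a hypothetical stand-alone illustration of the pattern, not the actual SolrIndexSearcher/getLiveDocs code: when the query is known to match everything, return the live-docs set directly instead of evaluating per document.

```java
import java.util.BitSet;

/** Sketch of the optimization: when the query matches everything,
 *  skip per-doc collection and return the live-docs set directly.
 *  Names here are hypothetical stand-ins for the Solr internals. */
class AllDocsShortCircuit {
  static int collectedDocs = 0;  // instrumentation for the sketch

  static BitSet getDocSet(boolean matchesAllDocs, BitSet liveDocs) {
    if (matchesAllDocs) {
      return liveDocs;  // early exit: no per-doc work needed
    }
    // slow path: evaluate the query document by document
    BitSet result = new BitSet(liveDocs.length());
    for (int doc = liveDocs.nextSetBit(0); doc >= 0; doc = liveDocs.nextSetBit(doc + 1)) {
      collectedDocs++;
      // ... apply query/filter here ...
      result.set(doc);
    }
    return result;
  }
}
```

The fast path returns the existing live-docs structure without copying, which is why detecting the all-docs case early is worthwhile.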
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
dsmiley commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402711624

 ## File path: solr/core/src/java/org/apache/solr/search/CollapsingQParserPlugin.java
 ##
 @@ -524,6 +533,7 @@ public int docID() {
    public OrdScoreCollector(int maxDoc,
                             int segments,
+                            PrimitiveMapFactory mapFactory,

 Review comment:
   I suggest we take inspiration from two similar places in Lucene/Solr: `org.apache.lucene.util.DocIdSetBuilder` and `org.apache.solr.search.DocSetBuilder`, which are doing essentially the same thing. Basically, start with a memory-efficient data structure, and then once we reach a threshold, switch on the fly to a larger but more efficient structure. So start with a hash-based map with a substantial initial capacity (not 4!), then upgrade to an array-based map impl _instead of resizing the map_ if we exceed that capacity. This might be done with a map impl that delegates to the real impl (hppc) with switch/resize logic.
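The two-stage map suggested in the review above might look roughly like this. All names are hypothetical; a real implementation would delegate to a primitive map such as HPPC rather than boxing through `java.util.HashMap`, but the switch-on-threshold logic is the same.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Sketch of a map that starts hash-based for sparse use and switches
 *  to a dense array once enough distinct keys have been seen, instead
 *  of resizing the hash map. Hypothetical names throughout. */
class HybridIntIntMap {
  private final int maxKey;            // keys are ords in [0, maxKey)
  private final int emptyValue;
  private final int upgradeThreshold;
  private Map<Integer, Integer> hashStage = new HashMap<>();
  private int[] arrayStage;            // null until we upgrade

  HybridIntIntMap(int maxKey, int emptyValue, int upgradeThreshold) {
    this.maxKey = maxKey;
    this.emptyValue = emptyValue;
    this.upgradeThreshold = upgradeThreshold;
  }

  void put(int key, int value) {
    if (arrayStage != null) {
      arrayStage[key] = value;
      return;
    }
    hashStage.put(key, value);
    if (hashStage.size() > upgradeThreshold) {
      upgrade();  // switch on the fly instead of resizing the hash map
    }
  }

  int get(int key) {
    if (arrayStage != null) {
      return arrayStage[key];
    }
    return hashStage.getOrDefault(key, emptyValue);
  }

  private void upgrade() {
    arrayStage = new int[maxKey];
    Arrays.fill(arrayStage, emptyValue);
    hashStage.forEach((k, v) -> arrayStage[k] = v);
    hashStage = null;  // let the sparse stage be collected
  }
}
```

Sparse queries stay in the small hash stage; only queries that actually touch many distinct ords pay for the dense array.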
[jira] [Updated] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes
[ https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cao Manh Dat updated SOLR-14356:
--------------------------------
    Fix Version/s: 8.6
                   master (9.0)
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> PeerSync should not fail with SocketTimeoutException from hanging nodes
> -----------------------------------------------------------------------
>
>                 Key: SOLR-14356
>                 URL: https://issues.apache.org/jira/browse/SOLR-14356
>             Project: Solr
>          Issue Type: Improvement
>  Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Cao Manh Dat
>            Priority: Major
>             Fix For: master (9.0), 8.6
>
>         Attachments: SOLR-14356.patch, SOLR-14356.patch
>
> Right now in {{PeerSync}} (during leader election), in case of an exception when
> requesting versions from a node, we will skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process as failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that sometimes a timeout can happen for genuine
> reasons too, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with others in case a connection/timeout exception happens.
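The retry idea from the issue description, obeying a leaderVoteWait-style deadline rather than failing on the first timeout, can be sketched as follows. Names are hypothetical and this is not the actual PeerSync code; it only illustrates retrying until a deadline.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.TimeUnit;

/** Sketch: retry a sync attempt until a deadline instead of failing
 *  outright on the first timeout. Hypothetical names. */
class RetryUntilDeadline {

  interface SyncAttempt {
    boolean tryOnce() throws SocketTimeoutException;
  }

  /** Returns true if any attempt succeeded before the deadline. */
  static boolean syncWithRetry(SyncAttempt attempt, long waitMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitMillis);
    while (System.nanoTime() < deadline) {
      try {
        if (attempt.tryOnce()) {
          return true;
        }
      } catch (SocketTimeoutException e) {
        // the peer may just be slow (e.g. a GC pause): retry until deadline
      }
      Thread.sleep(100);  // back off briefly between attempts
    }
    return false;  // deadline exceeded: give up, as before
  }
}
```

The deadline bounds how long a genuinely hung peer can stall leader election, while transient timeouts (GC pauses) get another chance.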
[jira] [Assigned] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes
[ https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cao Manh Dat reassigned SOLR-14356:
-----------------------------------
    Assignee: Cao Manh Dat

> PeerSync should not fail with SocketTimeoutException from hanging nodes
[jira] [Commented] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes
[ https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074225#comment-17074225 ]

ASF subversion and git services commented on SOLR-14356:
--------------------------------------------------------

Commit e6c7564e41d40cdd7149e0c339fc8259047ac744 in lucene-solr's branch refs/heads/branch_8x from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e6c7564 ]

SOLR-14356: PeerSync should not fail with SocketTimeoutException from hanging nodes

> PeerSync should not fail with SocketTimeoutException from hanging nodes
[jira] [Commented] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes
[ https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074224#comment-17074224 ]

ASF subversion and git services commented on SOLR-14356:
--------------------------------------------------------

Commit 28dea8d32783878b218d035708dbaf4245beacee in lucene-solr's branch refs/heads/master from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=28dea8d ]

SOLR-14356: PeerSync should not fail with SocketTimeoutException from hanging nodes

> PeerSync should not fail with SocketTimeoutException from hanging nodes
[jira] [Updated] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes
[ https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cao Manh Dat updated SOLR-14356:
--------------------------------
    Summary: PeerSync should not fail with SocketTimeoutException from hanging nodes  (was: PeerSync with hanging nodes)

> PeerSync should not fail with SocketTimeoutException from hanging nodes
[GitHub] [lucene-solr] CaoManhDat commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
CaoManhDat commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402691881

 ## File path: solr/core/src/java/org/apache/solr/search/CollapsingQParserPlugin.java
 ##
 @@ -524,6 +533,7 @@ public int docID() {
    public OrdScoreCollector(int maxDoc,
                             int segments,
+                            PrimitiveMapFactory mapFactory,

 Review comment:
   That is what I thought in the first place, and the HPPC primitive map did give a very good result in our performance test on a large collection (the number of unique values in the collapsing field is 1.2 million): around 5x better QPS. But then I wrote a simple single-thread benchmark test (I can add it here in this PR so you can try to tune it):
   - The number of docs is 300k.
   - 95% of queries are sparse ones, so we only do collapsing on 1% of docs.

   There, the array approach actually does better than the map (around 1.5x better). The reasons for that are:
   - We don't know how much capacity we are going to need for the map, so we pay a lot of cost on resizing the map.
   - The map needs 2 arrays for storing keys and values separately, which, combined with the point above, leads to more memory usage.

   I just don't want to make a commit that can potentially slow down users, although switching to the HPPC primitive map is much easier and makes things simpler for me. Any ideas on this, @dsmiley @shalinmangar @joel-bernstein?
[GitHub] [lucene-solr] CaoManhDat commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
CaoManhDat commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402687112

 ## File path: solr/core/src/java/org/apache/solr/util/numeric/IntIntArrayBasedMap.java
 ##
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.numeric;
+
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.function.IntConsumer;
+
+import org.apache.lucene.util.ArrayUtil;
+
+public class IntIntArrayBasedMap implements IntIntMap {
+
+  private int size;
+  private int[] keyValues;
+  private int emptyValue;
+
+  public IntIntArrayBasedMap(int initialSize, int emptyValue) {
+    this.size = initialSize;
+    this.keyValues = new int[initialSize];
+    this.emptyValue = emptyValue;
+    if (emptyValue != 0) {
+      Arrays.fill(keyValues, emptyValue);
+    }
+  }
+
+  @Override
+  public void set(int key, int value) {
+    if (key >= size) {
+      keyValues = ArrayUtil.grow(keyValues);

 Review comment:
   oh, right! :+1
[GitHub] [lucene-solr] CaoManhDat commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
CaoManhDat commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402687019

 ## File path: solr/core/src/java/org/apache/solr/util/numeric/IntIntArrayBasedMap.java
 ##
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.numeric;
+
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.function.IntConsumer;
+
+import org.apache.lucene.util.ArrayUtil;
+
+public class IntIntArrayBasedMap implements IntIntMap {
+
+  private int size;
+  private int[] keyValues;
+  private int emptyValue;
+
+  public IntIntArrayBasedMap(int initialSize, int emptyValue) {
+    this.size = initialSize;
+    this.keyValues = new int[initialSize];
+    this.emptyValue = emptyValue;
+    if (emptyValue != 0) {
+      Arrays.fill(keyValues, emptyValue);
+    }
+  }
+
+  @Override
+  public void set(int key, int value) {
+    if (key >= size) {
+      keyValues = ArrayUtil.grow(keyValues);
+      if (emptyValue != 0) {
+        for (int i = size; i < keyValues.length; i++) {
+          keyValues[i] = emptyValue;
+        }
+      }
+      size = keyValues.length;
+    }
+    keyValues[key] = value;
+  }
+
+  @Override
+  public int get(int key) {
+    if (key >= size) {
+      return emptyValue;
+    }
+    return keyValues[key];
+  }
+
+  @Override
+  public void forEachValue(IntConsumer consumer) {
+    for (int val : keyValues) {
+      if (val != emptyValue) {
+        consumer.accept(val);
+      }
+    }
+  }
+
+  @Override
+  public void remove(int key) {
+    if (key < size) keyValues[key] = emptyValue;
+  }
+
+  @Override
+  public int size() {
+    return keyValues.length;

 Review comment:
   Thanks! This is kind of a debug method and I will delete it.
[jira] [Commented] (LUCENE-9303) There may be can simpler in DefaultIndexingChain
[ https://issues.apache.org/jira/browse/LUCENE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074175#comment-17074175 ]

kkewwei commented on LUCENE-9303:
---------------------------------

[https://github.com/apache/lucene-solr/pull/993] is this one ok?

> There may be can simpler in DefaultIndexingChain
> ------------------------------------------------
>
>                 Key: LUCENE-9303
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9303
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 8.2
>            Reporter: kkewwei
>            Priority: Major
>              Labels: newdev
>
> In DefaultIndexingChain.processField():
> {code:java}
> if (fieldType.stored()) {
>   if (fp == null) {
>     fp = getOrAddField(fieldName, fieldType, false);
>   }
>   if (fieldType.stored()) {
>     String value = field.stringValue();
>     ..
>     try {
>       storedFieldsConsumer.writeField(fp.fieldInfo, field);
>     } catch (Throwable th) {
>       ..
>     }
>   }
> }
> {code}
> There is no need for the second {{if}}, because {{fieldType.stored()}}
> was already checked before.
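The proposed simplification is behavior-preserving because the inner check can never be false once the outer one has passed. A toy illustration of that equivalence (not the actual DefaultIndexingChain code):

```java
/** Toy illustration of the redundant nested check flagged in LUCENE-9303. */
class RedundantCheck {
  // Before: the same condition is tested twice.
  static int writesBefore(boolean stored) {
    int writes = 0;
    if (stored) {
      // ... field setup ...
      if (stored) {        // redundant: already inside the outer branch
        writes++;
      }
    }
    return writes;
  }

  // After: a single check is sufficient; behavior is identical.
  static int writesAfter(boolean stored) {
    int writes = 0;
    if (stored) {
      // ... field setup ...
      writes++;
    }
    return writes;
  }
}
```

Both forms write exactly once for stored fields and never otherwise, so dropping the inner `if` is safe.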
[GitHub] [lucene-solr] ebehrendt opened a new pull request #1398: SOLR-13101: Every BLOB_PULL_STARTED should have a matching BLOB_PULL_FINISHED irrespective of failures
ebehrendt opened a new pull request #1398: SOLR-13101: Every BLOB_PULL_STARTED should have a matching BLOB_PULL_FINISHED irrespective of failures
URL: https://github.com/apache/lucene-solr/pull/1398

   # Description
   If an exception is thrown during a core pull from the blob store, there is no record of the pull finishing. This caused a failure in SharedCoreConcurrencyTest.testIndexingQueriesDeletes() when it identified a BLOB_PULL_STARTED without a matching BLOB_PULL_FINISHED, meaning the pulls are interleaved. We want to record the pull as finished upon completion even if it was unsuccessful.

   # Solution
   Move recording of the pull as finished into a finally block so it will record the pull as finished irrespective of failures.

   # Tests
   Reproduced the failure by throwing an exception in the pull code and verified that the code change fixes the interleaved-pull failure. Ran all tests.
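The fix described in the Solution section follows the standard try/finally pattern: the finish event is recorded even when the pull throws, so starts and finishes always pair up. A minimal sketch with hypothetical recorder and event names:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: every PULL_STARTED gets a matching PULL_FINISHED, even when
 *  the pull throws. The recorder and event names are hypothetical. */
class PullEventRecorder {
  final List<String> events = new ArrayList<>();

  interface Pull { void run() throws Exception; }

  void pullWithRecording(Pull pull) throws Exception {
    events.add("BLOB_PULL_STARTED");
    try {
      pull.run();
    } finally {
      // recorded even if pull.run() threw, so starts and finishes pair up
      events.add("BLOB_PULL_FINISHED");
    }
  }
}
```

The exception still propagates to the caller; only the bookkeeping is guaranteed to complete.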
[jira] [Commented] (SOLR-14380) Repeated words in comments and documentation
[ https://issues.apache.org/jira/browse/SOLR-14380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074120#comment-17074120 ]

Ted Gifford commented on SOLR-14380:
------------------------------------

My super-elegant approach:
{code:sh}
grep --include='*.adoc' -rP '\b([a-zA-Z]\w+)\s+\1\b' . -lZ | xargs -0 perl -i -ple 's#\b([a-zA-Z]\w+)\s+\1\b#$1#g;'
git add -p # to review
{code}
and such.

> Repeated words in comments and documentation
> --------------------------------------------
>
>                 Key: SOLR-14380
>                 URL: https://issues.apache.org/jira/browse/SOLR-14380
>             Project: Solr
>          Issue Type: Task
>  Security Level: Public (Default Security Level. Issues are Public)
>          Components: documentation
>    Affects Versions: master (9.0)
>         Environment: N/A
>            Reporter: Ted Gifford
>            Priority: Trivial
>              Labels: typo
>         Attachments: the_the.patch
>
> I noticed the repeated "the the" in the [query elevation
> documentation|https://lucene.apache.org/solr/guide/8_5/the-query-elevation-component.html].
> {quote}If either one of these parameters is specified at request time, *the
> the*[sic] entire elevation configuration for the query is ignored.
> {quote}
> I have prepared a PR with all similar occurrences in comments and
> documentation replaced, but not blindly. It seems like a particular choice was
> probably made for some of the tests, and maybe that bled into the docs? Maybe
> an inside joke?
> In any case, let me know if you want the PR submitted.
[jira] [Updated] (SOLR-14380) Repeated words in comments and documentation
[ https://issues.apache.org/jira/browse/SOLR-14380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Gifford updated SOLR-14380:
-------------------------------
    Attachment: the_the.patch
        Status: Open  (was: Open)

> Repeated words in comments and documentation
[jira] [Created] (SOLR-14380) Repeated words in comments and documentation
Ted Gifford created SOLR-14380:
----------------------------------

             Summary: Repeated words in comments and documentation
                 Key: SOLR-14380
                 URL: https://issues.apache.org/jira/browse/SOLR-14380
             Project: Solr
          Issue Type: Task
      Security Level: Public (Default Security Level. Issues are Public)
          Components: documentation
    Affects Versions: master (9.0)
         Environment: N/A
            Reporter: Ted Gifford

I noticed the repeated "the the" in the [query elevation documentation|https://lucene.apache.org/solr/guide/8_5/the-query-elevation-component.html].

{quote}If either one of these parameters is specified at request time, *the the*[sic] entire elevation configuration for the query is ignored.
{quote}

I have prepared a PR with all similar occurrences in comments and documentation replaced, but not blindly. It seems like a particular choice was probably made for some of the tests, and maybe that bled into the docs? Maybe an inside joke?

In any case, let me know if you want the PR submitted.
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1397: LUCENE-9304: Refactor DWPTPool to pool DWPT directly
s1monw commented on a change in pull request #1397: LUCENE-9304: Refactor DWPTPool to pool DWPT directly
URL: https://github.com/apache/lucene-solr/pull/1397#discussion_r402588893

 ## File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java
 ##
 @@ -322,35 +316,27 @@ synchronized Closeable lockAndAbortAll() throws IOException {
   }

   /** Returns how many documents were aborted. */
-  private int abortThreadState(final ThreadState perThread) throws IOException {
+  private int abortDocumentsWriterPerThread(final DocumentsWriterPerThread perThread) throws IOException {
     assert perThread.isHeldByCurrentThread();
-    if (perThread.isInitialized()) {
-      try {
-        int abortedDocCount = perThread.dwpt.getNumDocsInRAM();
-        subtractFlushedNumDocs(abortedDocCount);
-        perThread.dwpt.abort();
-        return abortedDocCount;
-      } finally {
-        flushControl.doOnAbort(perThread);
-      }
-    } else {
+    try {
+      int abortedDocCount = perThread.getNumDocsInRAM();
+      subtractFlushedNumDocs(abortedDocCount);
+      perThread.abort();
+      return abortedDocCount;
+    } finally {
       flushControl.doOnAbort(perThread);
-      // This DWPT was never initialized so it has no indexed documents:
-      return 0;
     }
   }

   /** returns the maximum sequence number for all previously completed operations */
   public long getMaxCompletedSequenceNumber() {
-    long value = lastSeqNo;
-    int limit = perThreadPool.getMaxThreadStates();
-    for(int i = 0; i < limit; i++) {
-      ThreadState perThread = perThreadPool.getThreadState(i);
-      value = Math.max(value, perThread.lastSeqNo);
-    }
-    return value;
+    // NOCOMMIT: speak to mikemccandless about this change https://github.com/apache/lucene-solr/commit/5a03216/

 Review comment:
   @mikemccand can you take a look at this please
[GitHub] [lucene-solr] dsmiley commented on issue #1381: SOLR-14364: LTR SolrFeature fq improvements
dsmiley commented on issue #1381: SOLR-14364: LTR SolrFeature fq improvements URL: https://github.com/apache/lucene-solr/pull/1381#issuecomment-608087581 Are you okay with me merging this now or is there anything left to resolve @cpoerschke ?
[GitHub] [lucene-solr] s1monw opened a new pull request #1397: LUCENE-9304: Refactor DWPTPool to pool DWPT directly
s1monw opened a new pull request #1397: LUCENE-9304: Refactor DWPTPool to pool DWPT directly URL: https://github.com/apache/lucene-solr/pull/1397 This change removes the `ThreadState` indirection from DWPTPool and pools DWPT directly. The tracking information and locking semantics are mostly moved into DWPT itself, and the pool semantics have changed slightly: DWPTs need to be _checked out_ of the pool once they need to be flushed or aborted. This automatically grows and shrinks the number of DWPTs in the system as the number of indexing threads grows or shrinks. Access to pooled DWPTs is more straightforward and doesn't require an ordinal; instead, consumers can just iterate over the elements in the pool. This allowed for the removal of indirections in DWPTFlushControl like `BlockedFlush`, the removal of the DWPTPool setter and getter in `IndexWriterConfig`, and the addition of stronger assertions in DWPT and DW.
[jira] [Created] (LUCENE-9304) Clean up DWPTPool
Simon Willnauer created LUCENE-9304: --- Summary: Clean up DWPTPool Key: LUCENE-9304 URL: https://issues.apache.org/jira/browse/LUCENE-9304 Project: Lucene - Core Issue Type: Improvement Affects Versions: master (9.0), 8.6 Reporter: Simon Willnauer DWPTPool currently uses an indirection called ThreadState to hold DWPT instances. This class holds several pieces of information that belong in other places, inherits from ReentrantLock, and is mutable by nature. Instead we could pool the DWPT directly and remove other indirections inside DWPTFlushControl if we move some of the ThreadState properties into DWPT itself. The thread pool also has the problem that it grows its ThreadStates to the number of concurrently indexing threads but never shrinks when that number is reduced. By pooling DWPT directly this limitation could be removed. In summary, this component has seen quite some refactoring and needs some cleanups and documentation changes in order to stand the test of time.
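The grow-and-shrink behaviour described in the PR and issue above is essentially an object pool with check-out semantics. A toy sketch of the idea (in Python for brevity; the class and method names are made up here and this is not the actual DWPTPool code):

```python
import threading

class Pool:
    """Toy object pool with check-out semantics: a writer thread checks an
    object out, uses it, and either checks it back in or simply never
    returns it (e.g. because it had to be flushed or aborted). The pool
    grows on demand and shrinks automatically when objects are dropped,
    instead of keeping a fixed high-water mark of slots."""

    def __init__(self, factory):
        self._factory = factory
        self._free = []                  # objects available for check-out
        self._lock = threading.Lock()

    def check_out(self):
        with self._lock:
            if self._free:
                return self._free.pop()
        return self._factory()           # grow on demand

    def check_in(self, obj):
        with self._lock:
            self._free.append(obj)
```

A flushed object is just never checked back in, so no explicit shrink bookkeeping is needed — which is roughly the limitation-removal the issue describes.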
[jira] [Commented] (SOLR-9101) introduce SolrRequestInfo.doInSuspension(Callable). make SolrRequestInfo final
[ https://issues.apache.org/jira/browse/SOLR-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074065#comment-17074065 ] David Smiley commented on SOLR-9101: Recently my colleague [~matmarie] troubleshot a logging problem, and with my help we found that SolrIndexSearcher.warm was clearing the SolrRequestInfo but never reinstated a possibly pre-existing SolrRequestInfo. Looking at usages of SolrRequestInfo.setRequestInfo, I think this problem is commonplace. I think this suggests SRI should have a better API to make doing the right thing easy. For example, maybe SRI should be layered like a stack. [~mkhl] I found your issue here, which isn't what I suggest but is another type of solution. I propose I file a new issue to turn it into a stack and then close this one as won't-fix. WDYT? BTW MDCLoggingContext.setCore/clear is similar-ish but it always keeps the original context. > introduce SolrRequestInfo.doInSuspension(Callable). make SolrRequestInfo > final > - > > Key: SOLR-9101 > URL: https://issues.apache.org/jira/browse/SOLR-9101 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-9101.patch > > > During work on SOLR-8208 I fortunately hacked this. I propose changes in the > subj which make SolrRequestInfo more powerful and protected. > [~ysee...@gmail.com], I need your opinion regarding it.
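The "layered like a stack" idea proposed in the comment above — setting a new request context pushes rather than overwrites, and clearing restores the previous one — might look roughly like this sketch (an illustration of the proposal only, not Solr's actual SolrRequestInfo API; all names here are hypothetical):

```python
import threading
from contextlib import contextmanager

_local = threading.local()

def current_request_info():
    """Return the innermost request info, or None outside any request."""
    stack = getattr(_local, "stack", None)
    return stack[-1] if stack else None

@contextmanager
def request_info(info):
    """Push info for the duration of the block, then restore whatever was
    there before -- so code like SolrIndexSearcher.warm cannot accidentally
    clear a pre-existing context out from under its caller."""
    stack = getattr(_local, "stack", None)
    if stack is None:
        stack = _local.stack = []
    stack.append(info)
    try:
        yield info
    finally:
        stack.pop()
```

With this shape, "doing the right thing" is the only thing callers can do: the pop in the finally block always reinstates the outer context.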
[jira] [Commented] (SOLR-14378) factor out a (contrib/ltr) FilterFeatureScorer class
[ https://issues.apache.org/jira/browse/SOLR-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074058#comment-17074058 ] Lucene/Solr QA commented on SOLR-14378: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 27s{color} | {color:green} ltr in the patch passed. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 7m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | SOLR-14378 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12998632/SOLR-14378.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene2-us-west.apache.org 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / e609079 | | ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/730/testReport/ | | modules | C: solr/contrib/ltr U: solr/contrib/ltr | | Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/730/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > factor out a (contrib/ltr) FilterFeatureScorer class > > > Key: SOLR-14378 > URL: https://issues.apache.org/jira/browse/SOLR-14378 > Project: Solr > Issue Type: Task > Components: contrib - LTR >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-14378.patch > > > It looks like a {{FilterFeatureScorer}} class can be factored out from > {{OriginalScoreScorer}} (and {{SolrFeatureScorer}} after or with [~dsmiley]'s > SOLR-14364 changes could use it too). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness
[ https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074044#comment-17074044 ] Michael Gibney commented on SOLR-13132: --- Yes, I think I understand the lines along which you're thinking, and that makes sense. Forgive me, I was intending the [earlier comment|https://issues.apache.org/jira/browse/SOLR-13132?focusedCommentId=17073878#comment-17073878] to be narrowly about the "XX temporary!" comment ... really coming to the conclusion that the functionality it was marking (approximately respecting {{cacheDf}}) should in fact be permanent, not temporary (i.e., the comment is misleading/out-of-date and should just be removed). True I misspoke in implying that non-sweep {{SKGSlotAcc}} was just about refinement; it's necessary for any use case that takes a more "a la carte" approach, not requiring facet counts to be calculated over the full domain (refinement, resort, otherAccs, as you say ... maybe others?). In fact, otherAccs and resort, being likely to generate more DocSet lookups than refinement, make it all the more important that SKGSlotAcc respect {{cacheDf}} to control filterCache usage, no? Your other suggestions, concerns about brittleness, API changes etc. definitely resonate with me (your stream-of-consciousness is very intelligible!) – I plan to work through them in the next day or two and address any questions as they come up. > Improve JSON "terms" facet performance when sorted by relatedness > -- > > Key: SOLR-13132 > URL: https://issues.apache.org/jira/browse/SOLR-13132 > Project: Solr > Issue Type: Improvement > Components: Facet Module >Affects Versions: 7.4, master (9.0) >Reporter: Michael Gibney >Priority: Major > Attachments: SOLR-13132-with-cache-01.patch, > SOLR-13132-with-cache.patch, SOLR-13132.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate > {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either > {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain > base docSet, and then uses that initial pass as a pre-filter for a > second-pass, inverted approach of fetching docSets for each relevant term > (i.e., {{count > minCount}}?) and calculating intersection size of those sets > with the domain base docSet. > Over high-cardinality fields, the overhead of per-term docSet creation and > set intersection operations increases request latency to the point where > relatedness sort may not be usable in practice (for my use case, even after > applying the patch for SOLR-13108, for a field with ~220k unique terms per > core, QTime for high-cardinality domain docSets were, e.g.: cardinality > 1816684=9000ms, cardinality 5032902=18000ms). > The attached patch brings the above example QTimes down to a manageable > ~300ms and ~250ms respectively. The approach calculates uninverted facet > counts over domain base, foreground, and background docSets in parallel in a > single pass. This allows us to take advantage of the efficiencies built into > the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids > the per-term docSet creation and set intersection overhead. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
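The single-pass ("sweep") accumulation described in the quoted patch summary — counting terms over the domain base, foreground, and background docSets together instead of building and intersecting a docSet per term — can be illustrated with a toy model (hypothetical flat data layout; not the actual FacetFieldProcessor code):

```python
from collections import defaultdict

def sweep_counts(doc_terms, base, fg, bg):
    """doc_terms: {docid: term}; base/fg/bg: sets of docids.
    A single pass over the union of the three sets accumulates all three
    per-term counts at once, avoiding per-term docSet creation and
    per-term set-intersection work."""
    counts = defaultdict(lambda: [0, 0, 0])  # term -> [base, fg, bg]
    for doc in base | fg | bg:
        term = doc_terms.get(doc)
        if term is None:
            continue
        c = counts[term]
        if doc in base: c[0] += 1
        if doc in fg:   c[1] += 1
        if doc in bg:   c[2] += 1
    return dict(counts)
```

The relatedness score for each term can then be computed directly from its three counts, with no second inverted pass over high-cardinality fields.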
[GitHub] [lucene-solr] mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-608059291 @jpountz What do you think of this design in eeb23c11? 1. `IterableFieldComparator` wraps a `FieldComparator` to provide skipping functionality. All numeric comparators are wrapped in corresponding iterable comparators. 2. `SortField` has a new method `allowSkipNonCompetitveDocs` that, if set, will use a comparator that provides skipping functionality. In this case, we would not need the other classes that I previously introduced, `LongDocValuesPointComparator` and `LongDocValuesPointSortField`.
[jira] [Resolved] (LUCENE-9170) wagon-ssh Maven HTTPS issue
[ https://issues.apache.org/jira/browse/LUCENE-9170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved LUCENE-9170. --- Resolution: Fixed > wagon-ssh Maven HTTPS issue > --- > > Key: LUCENE-9170 > URL: https://issues.apache.org/jira/browse/LUCENE-9170 > Project: Lucene - Core > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Fix For: master (9.0), 8.6 > > Attachments: LUCENE-9170.patch, LUCENE-9170.patch, LUCENE-9170.patch > > > When I do, from lucene/ in branch_8_4: > ant -Dversion=8.4.2 generate-maven-artifacts > I see that wagon-ssh is being resolved from http://repo1.maven.org/maven2 > instead of https equivalent. This is surprising to me, since I can't find the > http URL anywhere. > Here's my log: > https://paste.centos.org/view/be2d3f3f > This is a critical issue since releases won't work without this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14379) nightly smoke fails due to java servlet jar
Mike Drob created SOLR-14379: Summary: nightly smoke fails due to java servlet jar Key: SOLR-14379 URL: https://issues.apache.org/jira/browse/SOLR-14379 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Reporter: Mike Drob Running locally:
{noformat}
[smoker] Traceback (most recent call last):
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1487, in 
[smoker]     main()
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1413, in main
[smoker]     downloadOnly=c.download_only)
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1474, in smokeTest
[smoker]     unpackAndVerify(java, 'solr', tmpDir, artifact, gitRevision, version, testArgs, baseURL)
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 566, in unpackAndVerify
[smoker]     verifyUnpacked(java, project, artifact, unpackPath, gitRevision, version, testArgs, tmpDir, baseURL)
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 716, in verifyUnpacked
[smoker]     checkAllJARs(os.getcwd(), project, gitRevision, version, tmpDir, baseURL)
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 237, in checkAllJARs
[smoker]     noJavaPackageClasses('JAR file "%s"' % fullPath, fullPath)
[smoker]   File "/Users/mdrob/code/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 125, in noJavaPackageClasses
[smoker]     raise RuntimeError('%s contains sheisty class "%s"' % (desc, name2))
[smoker] RuntimeError: JAR file "/Users/mdrob/code/lucene-solr/lucene/build/smokeTestRelease/tmp/unpack/solr-9.0.0/server/build/packaging/lib/javax.servlet-api-3.1.0.jar" contains sheisty class "javax/servlet/annotation/HandlesTypes.class"
{noformat}
There are some checks in there about ignoring javax.servlet and ignoring the server/lib directory, but it's somehow still picking up this file. Did we move paths around at some point?
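The `noJavaPackageClasses` check in the traceback is essentially a scan of each jar's entries for `javax/**` classes. A stripped-down version of that idea (a sketch, not the actual smokeTestRelease.py code, which also carries exclusion lists for known-good entries and directories):

```python
import zipfile

def sheisty_classes(jar):
    """Return class entries under javax/ that should not ship in our jars.
    Accepts a path or a file-like object, as zipfile.ZipFile does."""
    with zipfile.ZipFile(jar) as zf:
        return [name for name in zf.namelist()
                if name.startswith("javax/") and name.endswith(".class")]
```

Running something like this against `javax.servlet-api-3.1.0.jar` would flag every servlet-API class, so the interesting question above is why the exclusion for that path stopped matching.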
[jira] [Resolved] (LUCENE-9266) ant nightly-smoke fails due to presence of build.gradle
[ https://issues.apache.org/jira/browse/LUCENE-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved LUCENE-9266. --- Fix Version/s: master (9.0) Resolution: Fixed > ant nightly-smoke fails due to presence of build.gradle > --- > > Key: LUCENE-9266 > URL: https://issues.apache.org/jira/browse/LUCENE-9266 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: master (9.0) > > Time Spent: 6.5h > Remaining Estimate: 0h > > Seen on Jenkins - > [https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1617/console] > > Reproduced locally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness
[ https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073982#comment-17073982 ] Chris M. Hostetter commented on SOLR-13132: --- {quote}Given that we need to continue to maintain SKGSlotAcc to support refinement requests, it's reasonable to continue to support it also as a configurable option for initial-pass faceting (with sufficient filterCache capacity, it could be more efficient for low-cardinality fields – analogous to enum method terms faceting). {quote} Sure – but to be clear, I don't think it's *just* about refinement. Other cases where numSlots can (currently) be "1" are "otherAccs" – ie: we're only returning the calculated value w/o being involved in sorting, or it might be that {{prelim_sort}} has been used on some other simpler aggregation (or {{"count"}}) but the {{relatedness()}} function is being used as the {{resort}} collector. {quote}The choice to use SKGSlotAcc for slot cardinality of 1 is really just an indirect way of determining whether the SlotAcc may be used to process a refinement request (for which sweep accumulation is not applicable – maybe there's a better/more direct way to determine this?). {quote} This is the main reason I was worried that your current code is brittle against future changes – there are assumptions inside the implementation of {{RelatednessAgg.createSlotAcc(...)}} about how the facet processors work (and how many slots they use for various things) that might change down the road. 
Hence my suggestion that we make the Processors (that support sweeping) be *VERY* explicit about when they want {{"Accumulation Structures To Use When Sweeping"}} – but since there isn't necessarily a 1-to-1 correspondence between a {{SlotAcc}} and these accumulation structures (ie: a single {{relatedness()}} function wants to sweep over both its foreground and background) that's where my suggestion came from to introduce a new method on {{SlotAcc}} that the processors can call if and only if they can sweep. I know I was kind of hand wavy before (largely because it was just stream of consciousness) but I think what I'm suggesting would be a good API would be something like this (pseudo-code, still not 100% thought through, but trying to put "names" on concepts to be able to have a shared vocabulary about what is floating around in my head)...
{code:java}
/** Implemented by some SlotAccs if they are capable of being used for
 * sweep collecting in compatible facet processors */
interface SweepableSlotAcc {
  /**
   * called by processors if they support sweeping
   * @param baseAcc - never null, where the SlotAcc should call "addChild"
   * @return if true then collect methods will never be called on this SlotAcc
   */
  public boolean registerSweepingAccs(SweepingAcc baseSweepingAcc);
}

class CountSlotAcc implements SweepableSlotAcc { // TODO: not sure if it actually needs to implement?
  // in addition to implementing registerSweepingAccs (if needed) ...
  //
  // CountSlotAcc defines a special method for processors to get the "baseSweepingAcc" ...

  /**
   * CountSlotAcc always supports being used for sweeping across the base set
   * @returns never null
   */
  public SweepingAcc getBaseSweepingAcc() {
    return ... // build off of fcontext.base;
  }
  // ...
}

class SKGSlotAcc implements SweepableSlotAcc {
  // ... all existing internal structures and logic, like ...
  private BucketData[] slotvalues;
  // ... 
  /**
   * If called, may register SweepingAccs for fg and bg set based on whether
   * user indicated sweeping should be used (default)
   *
   * @returns true if any SweepingAccs were registered since no other collection is needed for relatedness
   */
  public boolean registerSweepingAccs(SweepingAcc baseSweepingAcc) {
    if (this.agg.useSweep) { // set by param (or maybe by heuristic if no param?)
      // pass 'slotValues' to our custom SweepingAcc impls to update directly as collection happens
      baseSweepingAcc.addChild(new FGSweepingAcc(slotvalues, fgSet));
      baseSweepingAcc.addChild(new BGSweepingAcc(slotvalues, bgSet));
      return true;
    }
    return false;
  }
}

class MultiAcc implements SweepableSlotAcc {
  // ... all existing logic, plus...
  public boolean registerSweepingAccs(SweepingAcc baseSweepingAcc);
  // loops over subAccs
  // if any return "true" it removes them from subAccs
  // if subAccs is empty when it's done, then it returns true; else false
}

/**
 * Abstraction used by processors that support sweeping to decide what to sweep over and how to
 * "collect" when doing the sweep.
 *
 * This class would encapsulate most of the logic/structures currently in
 * SweepCountAccStruct + FilterCtStruct (and maybe CountAccEntry?) and should allow us to
 * simplify the number of new methods in FacetFieldProcessor - they could instead be encapsulated here
 * (where they don't pollute/impact
[jira] [Commented] (LUCENE-9286) FST construction explodes memory in BitTable
[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073959#comment-17073959 ] Dawid Weiss commented on LUCENE-9286: - Hi Bruno. Thank you for looking into it. The problem is not during construction of the FST but later on - when the FST is used. In our algorithms we kept a significant number of some arcs in memory. Previously they were cheap, now they are not: arc.copyOf copies the entire underlying bit table: bq. What was previously fairly cheap (copyOf) has become fairly heavy and blows up memory when you have data structures that require storing intermediate Arcs during processing I didn't look into this but if these bit tables are immutable once the FST is constructed then copyOf could just copy the reference. A side note is that copyOf doesn't really fully reset the state of an arc (clear bit table reference if the copied arc doesn't have the bit table, for example). > FST construction explodes memory in BitTable > > > Key: LUCENE-9286 > URL: https://issues.apache.org/jira/browse/LUCENE-9286 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 8.5 >Reporter: Dawid Weiss >Assignee: Bruno Roustant >Priority: Major > Attachments: screen-[1].png > > > I see a dramatic increase in the amount of memory required for construction > of (arguably large) automata. It currently OOMs with 8GB of memory consumed > for bit tables. I am pretty sure this didn't require so much memory before > (the automaton is ~50MB after construction). > Something bad happened in between. Thoughts, [~broustant], [~sokolov]? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
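The fix suggested in the comment above — share the bit table by reference in copyOf once it is immutable, rather than deep-copying it, and also fully reset the copied arc's state — is the standard share-immutable-state pattern. A tiny sketch of the idea (a hypothetical Arc class, not Lucene's):

```python
class Arc:
    """Toy arc whose bit_table, once the automaton is built, is never mutated."""

    def __init__(self, label=None, bit_table=None):
        self.label = label
        self.bit_table = bit_table  # treated as immutable after construction

    def copy_of(self, other):
        self.label = other.label
        # Immutable, so sharing the reference is safe and O(1); a deep copy
        # here is what blows up memory when many arcs are held at once.
        # Copying the field unconditionally also clears any stale table when
        # the source arc has none -- the "fully reset" concern noted above.
        self.bit_table = other.bit_table
        return self
```

The trade-off is that reference sharing is only safe if the table really is frozen after construction, which is exactly the condition the comment hedges on.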
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402468813
## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ##
@@ -543,27 +543,37 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum
     try {
       // clone FieldInfos so that we can update their dvGen separately from
-      // the reader's infos and write them to a new fieldInfos_gen file
-      FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers);
-      // cannot use builder.add(reader.getFieldInfos()) because it does not
-      // clone FI.attributes as well FI.dvGen
+      // the reader's infos and write them to a new fieldInfos_gen file.
+      int maxFieldNumber = -1;
+      Map perName = new HashMap<>();
Review comment: nit: we call it `byName` elsewhere
[jira] [Commented] (LUCENE-9266) ant nightly-smoke fails due to presence of build.gradle
[ https://issues.apache.org/jira/browse/LUCENE-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073877#comment-17073877 ] Mike Drob commented on LUCENE-9266: --- Sent an email to the dev list highlighting the potential issue for folks, https://lists.apache.org/thread.html/rad256c5711282dd725f34fa2ac89690b65fb9b2b798f3c77241fc22b%40%3Cdev.lucene.apache.org%3E > ant nightly-smoke fails due to presence of build.gradle > --- > > Key: LUCENE-9266 > URL: https://issues.apache.org/jira/browse/LUCENE-9266 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Time Spent: 6.5h > Remaining Estimate: 0h > > Seen on Jenkins - > [https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1617/console] > > Reproduced locally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob closed pull request #1390: LUCENE-9266 remove gradle wrapper jar from source
madrob closed pull request #1390: LUCENE-9266 remove gradle wrapper jar from source URL: https://github.com/apache/lucene-solr/pull/1390
[jira] [Commented] (LUCENE-9266) ant nightly-smoke fails due to presence of build.gradle
[ https://issues.apache.org/jira/browse/LUCENE-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073874#comment-17073874 ] ASF subversion and git services commented on LUCENE-9266: - Commit e25ab4204fc4e2673b23b09d9a99e90dc401b03f in lucene-solr's branch refs/heads/master from Mike Drob [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e25ab42 ] LUCENE-9266 remove gradle wrapper jar from source ASF Release Policy states that we cannot have binary JAR files checked in to our source releases, a few other projects have solved this by modifying their generated gradlew scripts to download a copy of the wrapper jar. We now have a version and checksum file in ./gradle/wrapper directory used for verifying the wrapper jar, and will take advantage of single source java execution to verify and download. The gradle wrapper jar will continue to be available in the git repository, but will be excluded from src tarball generation. This should not change workflows for any users, since we expect the gradlew script to get the jar when it is missing. Co-authored-by: Dawid Weiss > ant nightly-smoke fails due to presence of build.gradle > --- > > Key: LUCENE-9266 > URL: https://issues.apache.org/jira/browse/LUCENE-9266 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > > Seen on Jenkins - > [https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1617/console] > > Reproduced locally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on issue #1390: LUCENE-9266 remove gradle wrapper jar from source
madrob commented on issue #1390: LUCENE-9266 remove gradle wrapper jar from source URL: https://github.com/apache/lucene-solr/pull/1390#issuecomment-607953114 Committed in e25ab4204
[jira] [Commented] (LUCENE-9286) FST construction explodes memory in BitTable
[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073869#comment-17073869 ] Bruno Roustant commented on LUCENE-9286: [~dweiss] I'm able to reproduce the FSTEnum traversal time increase and I started to investigate. But I don't reproduce the memory blowup. Did you share the same FST as your memory issue? I'm able to recompile the FST with an oversizing factor of 1.0 with less than 200M using -Xmx200m. > FST construction explodes memory in BitTable > > > Key: LUCENE-9286 > URL: https://issues.apache.org/jira/browse/LUCENE-9286 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 8.5 >Reporter: Dawid Weiss >Assignee: Bruno Roustant >Priority: Major > Attachments: screen-[1].png > > > I see a dramatic increase in the amount of memory required for construction > of (arguably large) automata. It currently OOMs with 8GB of memory consumed > for bit tables. I am pretty sure this didn't require so much memory before > (the automaton is ~50MB after construction). > Something bad happened in between. Thoughts, [~broustant], [~sokolov]? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14294) Typo in response to ?action=XX on StreamHandler
[ https://issues.apache.org/jira/browse/SOLR-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved SOLR-14294. -- Fix Version/s: 8.6 Assignee: Mike Drob Resolution: Fixed Pushed to 8.x and master, sorry about the delay. > Typo in response to ?action=XX on StreamHandler > --- > > Key: SOLR-14294 > URL: https://issues.apache.org/jira/browse/SOLR-14294 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: streaming expressions >Affects Versions: 8.4.1 >Reporter: David Eric Pugh >Assignee: Mike Drob >Priority: Minor > Fix For: 8.6 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The messages back when interacting with Streaming API have typo in the word > Daemon, they are spelled "Deamon". -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob merged pull request #1302: SOLR-14294 fix typo in message
madrob merged pull request #1302: SOLR-14294 fix typo in message URL: https://github.com/apache/lucene-solr/pull/1302 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14370) Refactor bin/solr to allow external override of Jetty modules
[ https://issues.apache.org/jira/browse/SOLR-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073824#comment-17073824 ] Mike Drob commented on SOLR-14370: -- How does this relate to [https://cwiki.apache.org/confluence/display/SOLR/SIP-6+Solr+should+own+the+bootstrap+process] and if we adopt that, then does this still have a place? Note that this could be a short-lived fix until we overhaul the bootstrap process, but that need not block this change. cc: [~janhoy] > Refactor bin/solr to allow external override of Jetty modules > - > > Key: SOLR-14370 > URL: https://issues.apache.org/jira/browse/SOLR-14370 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: scripts and tools >Reporter: Andy Throgmorton >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > The bin/solr script currently does not allow for externally overriding the > modules passed to Jetty on startup. > This PR adds the ability to override the Jetty modules on startup by setting > {{JETTY_MODULES}} as an environment variable; when passed, bin/solr will pass > through (and not clobber) the string verbatim into {{SOLR_JETTY_CONFIG}}. For > example, you can now run: > {{JETTY_MODULES=--module=foo bin/solr start}} > We've added some custom Jetty modules that can be optionally enabled; this > change allows us to keep our logic (regarding which modules to use) in a > separate script, rather than maintaining a forked bin/solr.
[GitHub] [lucene-solr] jimczi commented on issue #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on issue #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#issuecomment-607916720 Thanks @jpountz and @s1monw. I pushed new commits to address your comments.
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402404730 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). 
+ FieldInfo clone = cloneFieldInfo(fi, fi.number); + newFields.put(clone.name, clone); + maxFieldNumber = Math.max(clone.number, maxFieldNumber); } // create new fields with the right DV type +FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); for (List updates : pendingDVUpdates.values()) { DocValuesFieldUpdates update = updates.get(0); - FieldInfo fieldInfo = builder.getOrAdd(update.field); - fieldInfo.setDocValuesType(update.type); + FieldInfo globalFieldInfo = builder.getOrAdd(update.field); + globalFieldInfo.setDocValuesType(update.type); + FieldInfo segmentFieldInfo = newFields.get(update.field); + if (segmentFieldInfo == null) { +// the field is not present in this segment so we can fallback to the global fields. +if (globalFieldInfo.number <= maxFieldNumber) { + // the global field number is already used in this segment for a different field so we force a new one locally. + globalFieldInfo = cloneFieldInfo(globalFieldInfo, ++maxFieldNumber); Review comment: we should, thanks. I rewrote the logic in my last commits. I thought that `addOrGet` would preserve the dv type but it doesn't. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
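The collision handling discussed in this hunk can be illustrated with a small self-contained model (plain Java, not the actual Lucene FieldInfo/FieldInfos classes; field names and numbers below are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of the logic above: a segment created through
// IndexWriter#addIndexes(Directory) can use local field numbers that differ
// from the global ones, so when a doc-values update introduces a field that
// is absent from the segment, a global number already taken locally must be
// remapped to a fresh local number.
public class FieldNumberRemap {

  /** Pick the local field number for a field receiving a doc-values update. */
  static int localNumberFor(Map<String, Integer> localFields, String field,
                            int globalNumber, int maxLocalNumber) {
    Integer existing = localFields.get(field);
    if (existing != null) {
      return existing;            // field already lives in this segment
    }
    if (globalNumber <= maxLocalNumber) {
      return maxLocalNumber + 1;  // global number clashes locally: force a new one
    }
    return globalNumber;          // safe to reuse the global number
  }

  public static void main(String[] args) {
    Map<String, Integer> local = new HashMap<>();
    local.put("title", 0);  // globally, "title" might be e.g. 3
    local.put("body", 1);
    // an update arrives for "views", globally number 1 -> clashes with "body" locally
    System.out.println(localNumberFor(local, "views", 1, 1)); // prints 2
  }
}
```

This mirrors only the number-assignment decision; the real patch additionally clones the FieldInfo (attributes, dvGen, doc-values type) around that decision.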
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402403738 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same ReadersAndUpdates.java diff hunk as quoted above] Review comment: ++
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402403553 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same ReadersAndUpdates.java diff hunk as quoted above] Review comment: Thanks, I reworked the logic to always clone and reassign the field number.
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402403634 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same ReadersAndUpdates.java diff hunk as quoted above] Review comment: ++, removed
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402403065 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same ReadersAndUpdates.java diff hunk as quoted above] Review comment: ++, https://github.com/apache/lucene-solr/pull/1394/commits/0b5c6c896bcc67f8ec6d9206d5dc9caa5cc5ee81
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402402939 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same ReadersAndUpdates.java diff hunk as quoted above] Review comment: I added an [assertion](https://github.com/apache/lucene-solr/pull/1394/commits/0b5c6c896bcc67f8ec6d9206d5dc9caa5cc5ee81) to check the dv type of the global field but we still need to set the dv type on the cloned field since it is not preserved by the `getOrAdd` call.
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402394140 ## File path: lucene/core/src/test/org/apache/lucene/index/TestNumericDocValuesUpdates.java ## @@ -1483,6 +1484,83 @@ public void testAddIndexes() throws Exception { IOUtils.close(dir1, dir2); } + public void testUpdatesAfterAddIndexes() throws Exception { Review comment: ++, I pushed https://github.com/apache/lucene-solr/pull/1394/commits/00d755888d4744ff6a1beba17e20580f43841f12
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jimczi commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402394008 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -644,6 +656,13 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum return true; } + private FieldInfo cloneFieldInfo(FieldInfo fi, int fieldNumber) { +return new FieldInfo(fi.name, fieldNumber, fi.hasVectors(), fi.omitsNorms(), fi.hasPayloads(), +fi.getIndexOptions(), fi.getDocValuesType(), fi.getDocValuesGen(), new HashMap<>(fi.attributes()), +fi.getPointDimensionCount(), fi.getPointIndexDimensionCount(), fi.getPointNumBytes(), fi.isSoftDeletesField()); + Review comment: ++, https://github.com/apache/lucene-solr/pull/1394/commits/34718cb64e747e56ebc5341f9180d2f12aace029
[GitHub] [lucene-solr] janhoy commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler
janhoy commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler URL: https://github.com/apache/lucene-solr/pull/1387#discussion_r402382040 ## File path: solr/core/src/java/org/apache/solr/handler/admin/HealthCheckHandler.java ## @@ -88,15 +96,46 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw return; } -// Set status to true if this node is in live_nodes -if (clusterState.getLiveNodes().contains(cores.getZkController().getNodeName())) { - rsp.add(STATUS, OK); -} else { +// Fail if not in live_nodes +if (!clusterState.getLiveNodes().contains(cores.getZkController().getNodeName())) { rsp.add(STATUS, FAILURE); rsp.setException(new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, "Host Unavailable: Not in live nodes as per zk")); + return; } -rsp.setHttpCaching(false); +// Optionally require that all cores on this node are active if param 'requireHealthyCores=true' +if (req.getParams().getBool(PARAM_REQUIRE_HEALTHY_CORES, false)) { + List unhealthyCores = findUnhealthyCores(clusterState, + cores.getNodeConfig().getNodeName(), + cores.getAllCoreNames()); + if (unhealthyCores.size() > 0) { + rsp.add(STATUS, FAILURE); + rsp.setException(new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, + "Replica(s) " + unhealthyCores + " are currently initializing or recovering")); + return; + } + rsp.add("message", "All cores are healthy"); +} + +// All lights green, report healthy +rsp.add(STATUS, OK); + } + + /** + * Find replicas DOWN or RECOVERING, or replicas in clusterstate that do not exist on local node + * @param clusterState clusterstate from ZK + * @param nodeName this node name + * @param allCoreNames list of all core names on current node + * @return list of core names that are either DOWN ore RECOVERING on 'nodeName' + */ + static List findUnhealthyCores(ClusterState clusterState, String nodeName, Collection allCoreNames) { +return 
clusterState.getCollectionsMap().values().stream() Review comment: Thanks, this is super valuable. Will switch the loop to only fetch info for collection/replicas that reside on the node!
[GitHub] [lucene-solr] shalinmangar commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler
shalinmangar commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler URL: https://github.com/apache/lucene-solr/pull/1387#discussion_r402372168 ## File path: solr/core/src/java/org/apache/solr/handler/admin/HealthCheckHandler.java ## [same HealthCheckHandler.java diff hunk as quoted above] Review comment: The cluster state is a shell object that holds individual collection states that each live in different znodes. Each node watches only those collection states for which it hosts a replica. The rest of the collections exist as a lazy reference which is populated by a live read. The `getCollectionsMap()` method calls `CollectionRef.get()` for all collections so it will cause a live read to zk for all lazy references. The lazy reference can optionally cache the fetched state for 2 seconds (if you call `CollectionRef.get(true)`) but that too is too short an interval for a health check. > I want to exclude replicas of inactive shards from the check. The only place I could find that info was in Slice inside Clusterstate. It's more code but it is a good idea for sure. Your idea of skipping recovery_failed cores from the health check is also sound. Thanks for taking this up!
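The core-health filtering under review here can be sketched with a stripped-down model (plain Java stand-ins, not Solr's actual ClusterState/Replica classes; the core and node names are made up):

```java
import java.util.ArrayList;
import java.util.List;

// Stripped-down model of the health check discussed above: report cores on
// this node whose replica state is not ACTIVE. Per the review feedback, only
// replicas hosted on the local node are examined.
public class HealthSketch {
  enum State { ACTIVE, DOWN, RECOVERING }

  static class Replica {
    final String coreName, nodeName;
    final State state;
    Replica(String coreName, String nodeName, State state) {
      this.coreName = coreName; this.nodeName = nodeName; this.state = state;
    }
  }

  /** Cores on nodeName that are DOWN or RECOVERING (i.e. not ACTIVE). */
  static List<String> findUnhealthyCores(List<Replica> replicas, String nodeName) {
    List<String> unhealthy = new ArrayList<>();
    for (Replica r : replicas) {
      // skip replicas that live on other nodes; they are not our problem
      if (nodeName.equals(r.nodeName) && r.state != State.ACTIVE) {
        unhealthy.add(r.coreName);
      }
    }
    return unhealthy;
  }

  public static void main(String[] args) {
    List<Replica> replicas = List.of(
        new Replica("c1_shard1_replica_n1", "node1", State.ACTIVE),
        new Replica("c1_shard2_replica_n2", "node1", State.RECOVERING),
        new Replica("c2_shard1_replica_n3", "node2", State.DOWN));
    // only node1's non-ACTIVE cores are reported
    System.out.println(findUnhealthyCores(replicas, "node1")); // prints [c1_shard2_replica_n2]
  }
}
```

The real handler additionally has to avoid `getCollectionsMap()` (which triggers live ZK reads for lazy collection references, as noted above) and consider inactive shards and recovery_failed cores.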
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
s1monw commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402341962 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same ReadersAndUpdates.java diff hunk as quoted above] Review comment: we should set the DV type here too no?
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
s1monw commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402339219 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## [same cloneFieldInfo diff hunk as quoted above] Review comment: extra newline
[jira] [Updated] (SOLR-14378) factor out a (contrib/ltr) FilterFeatureScorer class
[ https://issues.apache.org/jira/browse/SOLR-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-14378: --- Status: Patch Available (was: Open) > factor out a (contrib/ltr) FilterFeatureScorer class > > > Key: SOLR-14378 > URL: https://issues.apache.org/jira/browse/SOLR-14378 > Project: Solr > Issue Type: Task > Components: contrib - LTR >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-14378.patch > > > It looks like a {{FilterFeatureScorer}} class can be factored out from > {{OriginalScoreScorer}} (and {{SolrFeatureScorer}} after or with [~dsmiley]'s > SOLR-14364 changes could use it too). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1381: SOLR-14364: LTR SolrFeature fq improvements
cpoerschke commented on a change in pull request #1381: SOLR-14364: LTR SolrFeature fq improvements URL: https://github.com/apache/lucene-solr/pull/1381#discussion_r402335911

## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java ##

@@ -280,52 +243,10 @@ public float score() throws IOException {
   @Override
   public float getMaxScore(int upTo) throws IOException {
-    return Float.POSITIVE_INFINITY;
-  }
-}
-
-/**
- * An iterator that allows to iterate only on the documents for which a feature has
- * a value.
- **/
-public class SolrFeatureScorerIterator extends DocIdSetIterator {
-
-  final private DocIdSetIterator filterIterator;
-  final private DocIdSetIterator scorerFilter;
-
-  SolrFeatureScorerIterator(DocIdSetIterator filterIterator,
-      DocIdSetIterator scorerFilter) {
-    this.filterIterator = filterIterator;
-    this.scorerFilter = scorerFilter;
-  }
-
-  @Override
-  public int docID() {
-    return filterIterator.docID();
-  }
-
-  @Override
-  public int nextDoc() throws IOException {
-    int docID = filterIterator.nextDoc();
-    scorerFilter.advance(docID);
-    return docID;
-  }
-
-  @Override
-  public int advance(int target) throws IOException {
-    // We use iterator to catch the scorer up since
-    // that checks if the target id is in the query + all the filters
-    int docID = filterIterator.advance(target);
-    scorerFilter.advance(docID);
-    return docID;
-  }
-
-  @Override
-  public long cost() {
-    return filterIterator.cost() + scorerFilter.cost();
+    return solrScorer.getMaxScore(upTo);
   }
+  // TODO delegate more methods?

Review comment: Looks like there could be scope for factoring out a `FilterFeatureScorer` class, I've opened https://issues.apache.org/jira/browse/SOLR-14378 re: that.
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1381: SOLR-14364: LTR SolrFeature fq improvements
cpoerschke commented on a change in pull request #1381: SOLR-14364: LTR SolrFeature fq improvements URL: https://github.com/apache/lucene-solr/pull/1381#discussion_r402333435

## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java ##

@@ -120,67 +123,66 @@ protected void validate() throws FeatureException {
         ": Q or FQ must be provided");
     }
   }
+
 /**
  * Weight for a SolrFeature
  **/
 public class SolrFeatureWeight extends FeatureWeight {
-  final private Weight solrQueryWeight;
-  final private Query query;
-  final private List queryAndFilters;
+  private final Weight solrQueryWeight;

-  public SolrFeatureWeight(IndexSearcher searcher,
-      SolrQueryRequest request, Query originalQuery, Map efi) throws IOException {
+  public SolrFeatureWeight(SolrIndexSearcher searcher,
+      SolrQueryRequest request, Query originalQuery, Map efi) throws IOException {
     super(SolrFeature.this, searcher, request, originalQuery, efi);
     try {
-      String solrQuery = q;
-      final List fqs = fq;
-
-      if ((solrQuery == null) || solrQuery.isEmpty()) {
-        solrQuery = "*:*";
-      }
-
-      solrQuery = macroExpander.expand(solrQuery);
-      if (solrQuery == null) {
-        throw new FeatureException(this.getClass().getSimpleName()+" requires efi parameter that was not passed in request.");
-      }
-
-      final SolrQueryRequest req = makeRequest(request.getCore(), solrQuery,
-          fqs, df);
+      final SolrQueryRequest req = makeRequest(request.getCore(), q, fq, df);
       if (req == null) {
         throw new IOException("ERROR: No parameters provided");
       }
+      // Build the scoring query
+      Query scoreQuery;
+      String qStr = q;
+      if (qStr == null || qStr.isEmpty()) {
+        scoreQuery = null; // ultimately behaves like MatchAllDocs
+      } else {
+        qStr = macroExpander.expand(qStr);
+        if (qStr == null) {
+          throw new FeatureException(this.getClass().getSimpleName() + " requires efi parameter that was not passed in request.");
+        }
+        scoreQuery = QParser.getParser(qStr, req).getQuery();
+        // note: QParser can return a null Query sometimes, such as if the query is a stopword or just symbols
+        if (scoreQuery == null) {
+          scoreQuery = new MatchNoDocsQuery(); // debatable; all or none?

Review comment: > ... The semantics of what's here is no change from the prior logic. ...

Good to know, thanks for explaining!
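The new control flow in the hunk above can be summarised as a small sketch. Everything here is a simplified stand-in (strings instead of `Query` objects; `macroExpand` and `parse` are hypothetical callbacks), but the branching mirrors what the diff describes: an empty `q` behaves like match-all, a failed macro expansion is an error, and a parser returning null is treated as match-none.

```java
import java.util.function.UnaryOperator;

public class ScoreQuerySketch {
  static String buildScoreQuery(String q, UnaryOperator<String> macroExpand,
                                UnaryOperator<String> parse) {
    if (q == null || q.isEmpty()) {
      return null; // ultimately behaves like MatchAllDocsQuery
    }
    String expanded = macroExpand.apply(q);
    if (expanded == null) {
      // an efi macro referenced in q was not supplied with the request
      throw new IllegalArgumentException("requires efi parameter that was not passed in request");
    }
    String parsed = parse.apply(expanded);
    // the parser can return null (e.g. the query was only stopwords or symbols)
    return parsed == null ? "MatchNoDocsQuery" : parsed;
  }

  public static void main(String[] args) {
    // identity expansion and parsing, just to exercise the branches
    System.out.println(buildScoreQuery("", s -> s, s -> s));         // null: match-all behavior
    System.out.println(buildScoreQuery("text:ltr", s -> s, s -> s)); // text:ltr
  }
}
```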
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1381: SOLR-14364: LTR SolrFeature fq improvements
cpoerschke commented on a change in pull request #1381: SOLR-14364: LTR SolrFeature fq improvements URL: https://github.com/apache/lucene-solr/pull/1381#discussion_r402340272

## File path: solr/core/src/java/org/apache/solr/search/grouping/CommandHandler.java ##

@@ -228,12 +226,8 @@ private void searchWithTimeLimiter(Query query,
       collector = MultiCollector.wrap(collector, hitCountCollector);
     }

-    if (filter.filter != null) {
-      query = new BooleanQuery.Builder()
-          .add(query, Occur.MUST)
-          .add(filter.filter, Occur.FILTER)
-          .build();
-    }
+    query = QueryUtils.combineQueryAndFilter(query, filter.filter);

Review comment: Okay, returning to this I think I now understand the `QueryUtils.combineQueryAndFilter` scenarios better:
* There are two inputs and zero, one or two of them could be null.
* scoreQuery non-null, filterQuery null:
  * (both old and new code) just use `scoreQuery`.
* scoreQuery non-null, filterQuery non-null:
  * `BooleanQuery.Builder` is used to combine the two (in both old and new code).
* scoreQuery null, filterQuery null:
  * In the old code in CommandHandler/SolrIndexSearcher/Grouping/ExpandComponent a value of null would have been used for `scoreQuery`; searching on that isn't gonna end well, is it.
  * In the new code `MatchAllDocsQuery` will be used to return everything.
* scoreQuery null, filterQuery non-null:
  * In the old code `BooleanQuery.Builder` is used to combine the two, regardless of `scoreQuery` being null, and so that search is not going to end well either.
  * In the new code the `filterQuery` wrapped up into a `ConstantScoreQuery` is used to search. In terms of search meaning this is equivalent to the `scoreQuery` being a `MatchAllDocsQuery`, i.e. with a score query matching everything only the filter query needs to be searched with.

So in the "scoreQuery null" scenario the new code
* no longer results in a failed search,
* but instead returns something,
* and what is returned is consistent w.r.t. any filter query,
* and is consistent within itself, i.e. a null (score or filter) query is taken to mean 'everything', i.e. `MatchAllDocsQuery`.

So yes, I agree, `MatchAllDocsQuery` is the logical choice.
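The four scenarios enumerated above can be condensed into one small function. This is a hypothetical sketch with strings standing in for Lucene `Query` objects, not the actual `QueryUtils.combineQueryAndFilter` implementation:

```java
public class CombineSketch {
  static String combineQueryAndFilter(String scoreQuery, String filterQuery) {
    if (scoreQuery == null) {
      // a null score query is taken to mean 'match everything'
      return filterQuery == null
          ? "MatchAllDocsQuery"                    // nothing restricts the result: all docs
          : "ConstantScore(" + filterQuery + ")";  // only the filter decides what matches
    }
    return filterQuery == null
        ? scoreQuery                               // no filter: the score query alone
        : "BooleanQuery(+" + scoreQuery + " #" + filterQuery + ")"; // MUST + FILTER clauses
  }

  public static void main(String[] args) {
    System.out.println(combineQueryAndFilter(null, null));                  // MatchAllDocsQuery
    System.out.println(combineQueryAndFilter("text:solr", "inStock:true")); // combined query
  }
}
```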
[jira] [Commented] (SOLR-14378) factor out a (contrib/ltr) FilterFeatureScorer class
[ https://issues.apache.org/jira/browse/SOLR-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073787#comment-17073787 ] Christine Poerschke commented on SOLR-14378:

Attached proposed patch. I'm unsure on whether or not a solr/CHANGES.txt entry for this would be appropriate though:
* on the one hand something like "Factor a FilterFeatureScorer class out from (contrib/ltr) OriginalScoreScorer." under the "Other Changes" section seems too close to the code and an implementation detail.
* on the other hand this is not a literal factoring out, i.e. the factored out class delegates more methods than the OriginalScoreScorer did (but the practical impact of that I'm not yet clear on, other than that the tests continue to pass).
[jira] [Commented] (LUCENE-8811) Add maximum clause count check to IndexSearcher rather than BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073785#comment-17073785 ] Stamatis Zampetakis commented on LUCENE-8811:

Naively, I thought that after this change the number of clauses in a {{TermInSetQuery}} would be the number of terms it contains. Looking closer at the implementation of {{getNumClausesCheckVisitor}} (thanks [~rubenql] for noticing), it seems that the whole {{TermInSetQuery}} counts as only one clause. This behavior seems a bit inconsistent, especially since a {{TermInSetQuery}} is somewhat equivalent to a {{BooleanQuery}}. Why should there be a limit on one and not on the other?

> Add maximum clause count check to IndexSearcher rather than BooleanQuery
> Key: LUCENE-8811
> URL: https://issues.apache.org/jira/browse/LUCENE-8811
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Alan Woodward
> Priority: Minor
> Fix For: master (9.0)
> Attachments: LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch
>
> Currently we only check whether boolean queries have too many clauses. However there are other ways that queries may have too many clauses, for instance if you have boolean queries that have themselves inner boolean queries.
> Could we use the new Query visitor API to move this check from BooleanQuery to IndexSearcher in order to make this check more consistent across queries?
> See for instance LUCENE-8810 where a rewrite rule caused the maximum clause count to be hit even though the total number of leaf queries remained the same.
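The inconsistency being pointed out can be made concrete with a toy counter (hypothetical stand-in logic; the real check walks queries with Lucene's `QueryVisitor`): a `BooleanQuery` of N term clauses contributes one clause per term, while a semantically similar `TermInSetQuery` over the same terms is visited as a single leaf.

```java
import java.util.List;

public class ClauseCountSketch {
  // a BooleanQuery of N term clauses: each term is a visited leaf clause
  static int clausesAsBooleanQuery(List<String> terms) {
    return terms.size();
  }

  // the same terms wrapped in a TermInSetQuery: visited as one leaf
  static int clausesAsTermInSetQuery(List<String> terms) {
    return 1;
  }

  public static void main(String[] args) {
    List<String> terms = List.of("lucene", "solr", "search");
    // the same logical disjunction yields very different counts against a clause limit
    System.out.println(clausesAsBooleanQuery(terms));   // 3
    System.out.println(clausesAsTermInSetQuery(terms)); // 1
  }
}
```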
[jira] [Updated] (SOLR-14378) factor out a (contrib/ltr) FilterFeatureScorer class
[ https://issues.apache.org/jira/browse/SOLR-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-14378: Attachment: SOLR-14378.patch
[jira] [Created] (SOLR-14378) factor out a (contrib/ltr) FilterFeatureScorer class
Christine Poerschke created SOLR-14378:
Summary: factor out a (contrib/ltr) FilterFeatureScorer class
Key: SOLR-14378
URL: https://issues.apache.org/jira/browse/SOLR-14378
Project: Solr
Issue Type: Task
Components: contrib - LTR
Reporter: Christine Poerschke
Assignee: Christine Poerschke

It looks like a {{FilterFeatureScorer}} class can be factored out from {{OriginalScoreScorer}} (and {{SolrFeatureScorer}} after or with [~dsmiley]'s SOLR-14364 changes could use it too).
[jira] [Commented] (LUCENE-9303) There may be can simpler in DefaultIndexingChain
[ https://issues.apache.org/jira/browse/LUCENE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073772#comment-17073772 ] Jan Høydahl commented on LUCENE-9303:

Thanks. I encourage you to open a GitHub pull request for your proposed fix. Make sure to start the title with LUCENE-9303 so it gets linked to this issue. Once you have a patch, one of us will review and merge it.

> There may be can simpler in DefaultIndexingChain
> Key: LUCENE-9303
> URL: https://issues.apache.org/jira/browse/LUCENE-9303
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 8.2
> Reporter: kkewwei
> Priority: Major
>
> In DefaultIndexingChain.processField()
> {code:java}
> if (fieldType.stored()) {
>   if (fp == null) {
>     fp = getOrAddField(fieldName, fieldType, false);
>   }
>   if (fieldType.stored()) {
>     String value = field.stringValue();
>     ..
>     try {
>       storedFieldsConsumer.writeField(fp.fieldInfo, field);
>     } catch (Throwable th) {
>       ..
>     }
>   }
> }
> {code}
> Is there really a need for the second {{if}}, given that {{fieldType.stored()}} was already checked?
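The simplification being proposed is mechanical: once the outer `fieldType.stored()` check has passed, the inner one can never be false. A minimal sketch of the before/after shape (hypothetical simplified code, not the actual DefaultIndexingChain):

```java
public class IndexingChainSketch {
  // before: the inner condition repeats the outer one
  static boolean writesStoredFieldBefore(boolean stored) {
    if (stored) {
      if (stored) {  // always true here, so this check is redundant
        return true; // stands in for storedFieldsConsumer.writeField(...)
      }
    }
    return false;
  }

  // after: the inner if removed; behavior is identical for every input
  static boolean writesStoredFieldAfter(boolean stored) {
    if (stored) {
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    for (boolean b : new boolean[] { true, false }) {
      System.out.println(writesStoredFieldBefore(b) == writesStoredFieldAfter(b)); // true, true
    }
  }
}
```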
[jira] [Updated] (LUCENE-9303) There may be can simpler in DefaultIndexingChain
[ https://issues.apache.org/jira/browse/LUCENE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated LUCENE-9303: Labels: newdev (was: )
[jira] [Resolved] (SOLR-14377) Solr with private SSL certificate not working
[ https://issues.apache.org/jira/browse/SOLR-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved SOLR-14377. Resolution: Invalid Please do not open Jira issues to ask questions. Use the solr-user mailing list. Go to lucene.apache.org/solr to find out how to subscribe. Closing this as invalid. > Solr with private SSL certificate not working > - > > Key: SOLR-14377 > URL: https://issues.apache.org/jira/browse/SOLR-14377 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCLI >Affects Versions: 8.4.1 > Environment: Centos 7 > Solr-8.4.1 > java -version > openjdk version "1.8.0_121" > OpenJDK Runtime Environment (build 1.8.0_121-b13) > OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode) >Reporter: Ravi Prakash >Priority: Major > Labels: SSL > Attachments: image-2020-04-02-18-52-59-662.png, > image-2020-04-02-18-54-16-255.png > > > I installed solr-8.4.1 on centos 7, and tried to add SSL certificate to > bin/solr.in.sh file. > === > #Enables HTTPS. It is implictly true if you set SOLR_SSL_KEY_STORE. Use this > config > # to enable https module with custom jetty configuration. > *SOLR_SSL_ENABLED=true* > # Uncomment to set SSL-related system properties > # Be sure to update the paths to the correct keystore for your environment > *SOLR_SSL_KEY_STORE=/opt/solr/server/solr-ssl.keystore.jks > SOLR_SSL_KEY_STORE_PASSWORD=mypassword > SOLR_SSL_TRUST_STORE=/opt/solr/server/solr-ssl.keystore.jks > SOLR_SSL_TRUST_STORE_PASSWORD=mypassword* > # Require clients to authenticate > *SOLR_SSL_NEED_CLIENT_AUTH=false* > # Enable clients to authenticate (but not require) > *SOLR_SSL_WANT_CLIENT_AUTH=false* > # Verify client's hostname during SSL handshake > *SOLR_SSL_CLIENT_HOSTNAME_VERIFICATION=false* > === > > Then I restart the server : service solr restart > Still all the browser says : > This site can't provide a secure connection localhsot sent an invalid > response. 
> Try running Windows Network Diagnostics. > ERR_SSL_PROTOCOL_ERROR > I checked the logs in /var/solr/logs/solr.log > 2020-04-02 12:58:33.669 INFO (main) [ ] o.e.j.u.log Logging initialized > @1856ms to org.eclipse.jetty.util.log.Slf4jLog > 2020-04-02 12:58:33.870 WARN (main) [ ] o.e.j.s.AbstractConnector Ignoring > deprecated socket close linger time > 2020-04-02 12:58:33.870 WARN (main) [ ] o.e.j.x.XmlConfiguration > Deprecated method public void > org.eclipse.jetty.server.ServerConnector.setSoLingerTime(int) in > file:///opt/solr-8.4.1/server/etc/jetty-http.xml > 2020-04-02 12:58:33.877 INFO (main) [ ] o.e.j.s.Server > jetty-9.4.19.v20190610; built: 2019-06-10T16:30:51.723Z; git: > afcf563148970e98786327af5e07c261fda175d3; jvm 1.8.0_121-b13 > 2020-04-02 12:58:33.907 INFO (main) [ ] o.e.j.d.p.ScanningAppProvider > Deployment monitor [file:///opt/solr-8.4.1/server/contexts/] at interval 0 > 2020-04-02 12:58:34.238 INFO (main) [ ] > o.e.j.w.StandardDescriptorProcessor NO JSP Support for /solr, did not find > org.apache.jasper.servlet.JspServlet > 2020-04-02 12:58:34.251 INFO (main) [ ] o.e.j.s.session > DefaultSessionIdManager workerName=node0 > 2020-04-02 12:58:34.251 INFO (main) [ ] o.e.j.s.session No > SessionScavenger set, using defaults > 2020-04-02 12:58:34.254 INFO (main) [ ] o.e.j.s.session node0 Scavenging > every 66ms > 2020-04-02 12:58:34.362 INFO (main) [ ] o.a.s.s.SolrDispatchFilter Using > logger factory org.apache.logging.slf4j.Log4jLoggerFactory > 2020-04-02 12:58:34.368 INFO (main) [ ] o.a.s.s.SolrDispatchFilter ___ > _ Welcome to Apache Solr™ version 8.4.1 > 2020-04-02 12:58:34.368 INFO (main) [ ] o.a.s.s.SolrDispatchFilter / __| > ___| |_ _ Starting in standalone mode on port 8983 > 2020-04-02 12:58:34.368 INFO (main) [ ] o.a.s.s.SolrDispatchFilter \__ \/ > _ \ | '_| Install dir: /opt/solr > 2020-04-02 12:58:34.369 INFO (main) [ ] o.a.s.s.SolrDispatchFilter > |___/\___/_|_|Start time: 2020-04-02T12:58:34.368Z > 2020-04-02 12:58:34.397 INFO (main) 
[ ] o.a.s.c.SolrResourceLoader Using > system property solr.solr.home: /var/solr/data > 2020-04-02 12:58:34.406 INFO (main) [ ] o.a.s.c.SolrXmlConfig Loading > container configuration from /var/solr/data/solr.xml > 2020-04-02 12:58:34.499 INFO (main) [ ] o.a.s.c.SolrXmlConfig MBean server > found: com.sun.jmx.mbeanserver.JmxMBeanServer@143640d5, but no JMX reporters > were configured - adding default JMX reporter. > 2020-04-02 12:58:35.177 INFO (main) [ ] o.a.s.h.c.HttpShardHandlerFactory > Host whitelist initialized: WhitelistHostChecker [whitelistHosts=null, > whitelistHostCheckingEnabled=true] > 2020-04-02 12:58:35.331 WARN (main) [ ]
[jira] [Assigned] (LUCENE-9286) FST construction explodes memory in BitTable
[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno Roustant reassigned LUCENE-9286: -- Assignee: Bruno Roustant (was: Dawid Weiss) > FST construction explodes memory in BitTable > > > Key: LUCENE-9286 > URL: https://issues.apache.org/jira/browse/LUCENE-9286 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 8.5 >Reporter: Dawid Weiss >Assignee: Bruno Roustant >Priority: Major > Attachments: screen-[1].png > > > I see a dramatic increase in the amount of memory required for construction > of (arguably large) automata. It currently OOMs with 8GB of memory consumed > for bit tables. I am pretty sure this didn't require so much memory before > (the automaton is ~50MB after construction). > Something bad happened in between. Thoughts, [~broustant], [~sokolov]? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9286) FST construction explodes memory in BitTable
[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073766#comment-17073766 ] Bruno Roustant commented on LUCENE-9286:

I'm starting to investigate that.
[GitHub] [lucene-solr] janhoy commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler
janhoy commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler URL: https://github.com/apache/lucene-solr/pull/1387#discussion_r402344025

## File path: solr/core/src/java/org/apache/solr/handler/admin/HealthCheckHandler.java ##

@@ -88,15 +96,46 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw
       return;
     }

-    // Set status to true if this node is in live_nodes
-    if (clusterState.getLiveNodes().contains(cores.getZkController().getNodeName())) {
-      rsp.add(STATUS, OK);
-    } else {
+    // Fail if not in live_nodes
+    if (!clusterState.getLiveNodes().contains(cores.getZkController().getNodeName())) {
       rsp.add(STATUS, FAILURE);
       rsp.setException(new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
           "Host Unavailable: Not in live nodes as per zk"));
+      return;
     }

-    rsp.setHttpCaching(false);
+    // Optionally require that all cores on this node are active if param 'requireHealthyCores=true'
+    if (req.getParams().getBool(PARAM_REQUIRE_HEALTHY_CORES, false)) {
+      List unhealthyCores = findUnhealthyCores(clusterState,
+          cores.getNodeConfig().getNodeName(),
+          cores.getAllCoreNames());
+      if (unhealthyCores.size() > 0) {
+        rsp.add(STATUS, FAILURE);
+        rsp.setException(new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
+            "Replica(s) " + unhealthyCores + " are currently initializing or recovering"));
+        return;
+      }
+      rsp.add("message", "All cores are healthy");
+    }
+
+    // All lights green, report healthy
+    rsp.add(STATUS, OK);
+  }
+
+  /**
+   * Find replicas DOWN or RECOVERING, or replicas in clusterstate that do not exist on local node
+   * @param clusterState clusterstate from ZK
+   * @param nodeName this node name
+   * @param allCoreNames list of all core names on current node
+   * @return list of core names that are either DOWN or RECOVERING on 'nodeName'
+   */
+  static List findUnhealthyCores(ClusterState clusterState, String nodeName, Collection allCoreNames) {
+    return clusterState.getCollectionsMap().values().stream()

Review comment: I assumed the ClusterState object in each node is cached on the node and iterating it will not incur any new ZK calls, but it is updated by watches? If it incurs connections then I agree with you! I want to exclude replicas of inactive shards from the check. The only place I could find that info was in Slice inside ClusterState. Sure, I can iterate each core on the local host, find its Slice ID and then go look up the Slice in clusterstate to find whether it's active; that was my other alternative but more code.
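Although the method body is cut off above, the described check can be sketched with stand-in types (hypothetical; the real implementation streams over Solr's `ClusterState`, `DocCollection`, `Slice` and `Replica` objects): a core is reported unhealthy when cluster state places it on this node but it is not ACTIVE, or when it does not exist locally at all.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

public class HealthSketch {
  // replicasOnNode: core name -> replica state ("ACTIVE", "DOWN", "RECOVERING"),
  // for replicas that cluster state assigns to this node (active shards only)
  static List<String> findUnhealthyCores(Map<String, String> replicasOnNode,
                                         Collection<String> localCoreNames) {
    List<String> unhealthy = new ArrayList<>();
    for (Map.Entry<String, String> e : replicasOnNode.entrySet()) {
      boolean notActive = !"ACTIVE".equals(e.getValue());       // DOWN or RECOVERING
      boolean missingLocally = !localCoreNames.contains(e.getKey()); // in ZK but not on disk
      if (notActive || missingLocally) {
        unhealthy.add(e.getKey());
      }
    }
    return unhealthy;
  }

  public static void main(String[] args) {
    Map<String, String> replicas = Map.of("core1", "ACTIVE", "core2", "RECOVERING");
    System.out.println(findUnhealthyCores(replicas, List.of("core1", "core2"))); // [core2]
  }
}
```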
[jira] [Updated] (LUCENE-9303) There may be can simpler in DefaultIndexingChain
[ https://issues.apache.org/jira/browse/LUCENE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kkewwei updated LUCENE-9303:

Description:
In DefaultIndexingChain.processField()
{code:java}
if (fieldType.stored()) {
  if (fp == null) {
    fp = getOrAddField(fieldName, fieldType, false);
  }
  if (fieldType.stored()) {
    String value = field.stringValue();
    ..
    try {
      storedFieldsConsumer.writeField(fp.fieldInfo, field);
    } catch (Throwable th) {
      ..
    }
  }
}
{code}
Is there really a need for the second {{if}}, given that {{fieldType.stored()}} was already checked?

was:
{code:java}
{code}
In DefaultIndexingChain.processField()
{code:java}
// code placeholder
{code}
[jira] [Created] (LUCENE-9303) There may be can simpler in DefaultIndexingChain
kkewwei created LUCENE-9303: --- Summary: There may be can simpler in DefaultIndexingChain Key: LUCENE-9303 URL: https://issues.apache.org/jira/browse/LUCENE-9303 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 8.2 Reporter: kkewwei {code:java} {code} In DefaultIndexingChain.processField() {code:java} // code placeholder {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14377) Solr with private SSL certificate not working
[ https://issues.apache.org/jira/browse/SOLR-14377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated SOLR-14377: Attachment: image-2020-04-02-18-52-59-662.png image-2020-04-02-18-54-16-255.png
[GitHub] [lucene-solr] shalinmangar commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler
shalinmangar commented on a change in pull request #1387: SOLR-14210: Include replica health in healtcheck handler URL: https://github.com/apache/lucene-solr/pull/1387#discussion_r402303966 ## File path: solr/core/src/java/org/apache/solr/handler/admin/HealthCheckHandler.java ## @@ -88,15 +96,46 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw return; } -// Set status to true if this node is in live_nodes -if (clusterState.getLiveNodes().contains(cores.getZkController().getNodeName())) { - rsp.add(STATUS, OK); -} else { +// Fail if not in live_nodes +if (!clusterState.getLiveNodes().contains(cores.getZkController().getNodeName())) { rsp.add(STATUS, FAILURE); rsp.setException(new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, "Host Unavailable: Not in live nodes as per zk")); + return; } -rsp.setHttpCaching(false); +// Optionally require that all cores on this node are active if param 'requireHealthyCores=true' +if (req.getParams().getBool(PARAM_REQUIRE_HEALTHY_CORES, false)) { + List<String> unhealthyCores = findUnhealthyCores(clusterState, + cores.getNodeConfig().getNodeName(), + cores.getAllCoreNames()); + if (unhealthyCores.size() > 0) { + rsp.add(STATUS, FAILURE); + rsp.setException(new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, + "Replica(s) " + unhealthyCores + " are currently initializing or recovering")); + return; + } + rsp.add("message", "All cores are healthy"); +} + +// All lights green, report healthy +rsp.add(STATUS, OK); + } + + /** + * Find replicas DOWN or RECOVERING, or replicas in clusterstate that do not exist on local node + * @param clusterState clusterstate from ZK + * @param nodeName this node name + * @param allCoreNames list of all core names on current node + * @return list of core names that are either DOWN or RECOVERING on 'nodeName' + */ + static List<String> findUnhealthyCores(ClusterState clusterState, String nodeName, Collection<String> allCoreNames) { +return 
clusterState.getCollectionsMap().values().stream() Review comment: Why does this need to go to cluster state? This becomes a very expensive method when you have a lot of collections because it fetches collection states for all collections from ZK even if those collections have no replicas on the current node. Imagine having 300 collections and doing 299 zk read operations just for a health check if the node hosts replicas for one collection only. I think this method should be rewritten to iterate over the CloudDescriptors of all local cores and check for `hasRegistered == true` and `lastPublished == ACTIVE`. Those two should be sufficient for a health check functionality. We don't even need to consult the cluster state. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
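Shalin's suggested rewrite — iterating only the local cores' cloud descriptors instead of fetching cluster state — can be sketched as below. This is an illustrative sketch only: `CloudDescriptor` here is a hypothetical stand-in for `org.apache.solr.cloud.CloudDescriptor`, reduced to the two pieces of state the comment mentions (`hasRegistered` and `lastPublished`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.solr.cloud.CloudDescriptor,
// keeping only the fields relevant to the health check.
class CloudDescriptor {
  final String coreName;
  final boolean hasRegistered;    // has this replica registered itself in ZK?
  final String lastPublished;     // last published state, e.g. "ACTIVE", "RECOVERING", "DOWN"

  CloudDescriptor(String coreName, boolean hasRegistered, String lastPublished) {
    this.coreName = coreName;
    this.hasRegistered = hasRegistered;
    this.lastPublished = lastPublished;
  }
}

public class LocalHealthCheck {
  // Returns the names of local cores that are not yet registered or not ACTIVE.
  // Cost is O(local cores) with zero ZK reads, unlike the cluster-state variant.
  static List<String> findUnhealthyCores(List<CloudDescriptor> localCores) {
    List<String> unhealthy = new ArrayList<>();
    for (CloudDescriptor cd : localCores) {
      if (!cd.hasRegistered || !"ACTIVE".equals(cd.lastPublished)) {
        unhealthy.add(cd.coreName);
      }
    }
    return unhealthy;
  }

  public static void main(String[] args) {
    List<CloudDescriptor> cores = List.of(
        new CloudDescriptor("shard1_replica1", true, "ACTIVE"),
        new CloudDescriptor("shard2_replica1", true, "RECOVERING"),
        new CloudDescriptor("shard3_replica1", false, "ACTIVE"));
    System.out.println(findUnhealthyCores(cores)); // [shard2_replica1, shard3_replica1]
  }
}
```

The key property is the one the review calls out: the check scales with the number of cores hosted on the node, not with the number of collections in the cluster.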
[jira] [Created] (SOLR-14377) Solr with private SSL certificate not working
Ravi Prakash created SOLR-14377: --- Summary: Solr with private SSL certificate not working Key: SOLR-14377 URL: https://issues.apache.org/jira/browse/SOLR-14377 Project: Solr Issue Type: Test Security Level: Public (Default Security Level. Issues are Public) Components: SolrCLI Affects Versions: 8.4.1 Environment: Centos 7 Solr-8.4.1 java -version openjdk version "1.8.0_121" OpenJDK Runtime Environment (build 1.8.0_121-b13) OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode) Reporter: Ravi Prakash I installed solr-8.4.1 on centos 7, and tried to add SSL certificate to bin/solr.in.sh file. === #Enables HTTPS. It is implicitly true if you set SOLR_SSL_KEY_STORE. Use this config # to enable https module with custom jetty configuration. *SOLR_SSL_ENABLED=true* # Uncomment to set SSL-related system properties # Be sure to update the paths to the correct keystore for your environment *SOLR_SSL_KEY_STORE=/opt/solr/server/solr-ssl.keystore.jks SOLR_SSL_KEY_STORE_PASSWORD=mypassword SOLR_SSL_TRUST_STORE=/opt/solr/server/solr-ssl.keystore.jks SOLR_SSL_TRUST_STORE_PASSWORD=mypassword* # Require clients to authenticate *SOLR_SSL_NEED_CLIENT_AUTH=false* # Enable clients to authenticate (but not require) *SOLR_SSL_WANT_CLIENT_AUTH=false* # Verify client's hostname during SSL handshake *SOLR_SSL_CLIENT_HOSTNAME_VERIFICATION=false* === Then I restart the server : service solr restart Still the browser says : This site can't provide a secure connection localhost sent an invalid response. Try running Windows Network Diagnostics. 
ERR_SSL_PROTOCOL_ERROR I checked the logs in /var/solr/logs/solr.log 2020-04-02 12:58:33.669 INFO (main) [ ] o.e.j.u.log Logging initialized @1856ms to org.eclipse.jetty.util.log.Slf4jLog 2020-04-02 12:58:33.870 WARN (main) [ ] o.e.j.s.AbstractConnector Ignoring deprecated socket close linger time 2020-04-02 12:58:33.870 WARN (main) [ ] o.e.j.x.XmlConfiguration Deprecated method public void org.eclipse.jetty.server.ServerConnector.setSoLingerTime(int) in file:///opt/solr-8.4.1/server/etc/jetty-http.xml 2020-04-02 12:58:33.877 INFO (main) [ ] o.e.j.s.Server jetty-9.4.19.v20190610; built: 2019-06-10T16:30:51.723Z; git: afcf563148970e98786327af5e07c261fda175d3; jvm 1.8.0_121-b13 2020-04-02 12:58:33.907 INFO (main) [ ] o.e.j.d.p.ScanningAppProvider Deployment monitor [file:///opt/solr-8.4.1/server/contexts/] at interval 0 2020-04-02 12:58:34.238 INFO (main) [ ] o.e.j.w.StandardDescriptorProcessor NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet 2020-04-02 12:58:34.251 INFO (main) [ ] o.e.j.s.session DefaultSessionIdManager workerName=node0 2020-04-02 12:58:34.251 INFO (main) [ ] o.e.j.s.session No SessionScavenger set, using defaults 2020-04-02 12:58:34.254 INFO (main) [ ] o.e.j.s.session node0 Scavenging every 66ms 2020-04-02 12:58:34.362 INFO (main) [ ] o.a.s.s.SolrDispatchFilter Using logger factory org.apache.logging.slf4j.Log4jLoggerFactory 2020-04-02 12:58:34.368 INFO (main) [ ] o.a.s.s.SolrDispatchFilter ___ _ Welcome to Apache Solr™ version 8.4.1 2020-04-02 12:58:34.368 INFO (main) [ ] o.a.s.s.SolrDispatchFilter / __| ___| |_ _ Starting in standalone mode on port 8983 2020-04-02 12:58:34.368 INFO (main) [ ] o.a.s.s.SolrDispatchFilter \__ \/ _ \ | '_| Install dir: /opt/solr 2020-04-02 12:58:34.369 INFO (main) [ ] o.a.s.s.SolrDispatchFilter |___/\___/_|_|Start time: 2020-04-02T12:58:34.368Z 2020-04-02 12:58:34.397 INFO (main) [ ] o.a.s.c.SolrResourceLoader Using system property solr.solr.home: /var/solr/data 2020-04-02 12:58:34.406 INFO 
(main) [ ] o.a.s.c.SolrXmlConfig Loading container configuration from /var/solr/data/solr.xml 2020-04-02 12:58:34.499 INFO (main) [ ] o.a.s.c.SolrXmlConfig MBean server found: com.sun.jmx.mbeanserver.JmxMBeanServer@143640d5, but no JMX reporters were configured - adding default JMX reporter. 2020-04-02 12:58:35.177 INFO (main) [ ] o.a.s.h.c.HttpShardHandlerFactory Host whitelist initialized: WhitelistHostChecker [whitelistHosts=null, whitelistHostCheckingEnabled=true] 2020-04-02 12:58:35.331 WARN (main) [ ] o.e.j.u.s.S.config Trusting all certificates configured for Client@57eda880[provider=null,keyStore=null,trustStore=null] 2020-04-02 12:58:35.331 WARN (main) [ ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for Client@57eda880[provider=null,keyStore=null,trustStore=null] 2020-04-02 12:58:35.548 WARN (main) [ ] o.e.j.u.s.S.config Trusting all certificates configured for Client@423e4cbb[provider=null,keyStore=null,trustStore=null] 2020-04-02 12:58:35.548 WARN (main) [ ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for Client@423e4cbb[provider=null,keyStore=null,trustStore=null] 2020-04-02
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402231946 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map<String,FieldInfo> newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry<String,String> e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). 
+ FieldInfo clone = cloneFieldInfo(fi, fi.number); + newFields.put(clone.name, clone); + maxFieldNumber = Math.max(clone.number, maxFieldNumber); } // create new fields with the right DV type +FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); for (List<DocValuesFieldUpdates> updates : pendingDVUpdates.values()) { DocValuesFieldUpdates update = updates.get(0); - FieldInfo fieldInfo = builder.getOrAdd(update.field); - fieldInfo.setDocValuesType(update.type); + FieldInfo globalFieldInfo = builder.getOrAdd(update.field); + globalFieldInfo.setDocValuesType(update.type); + FieldInfo segmentFieldInfo = newFields.get(update.field); + if (segmentFieldInfo == null) { +// the field is not present in this segment so we can fallback to the global fields. +if (globalFieldInfo.number <= maxFieldNumber) { + // the global field number is already used in this segment for a different field so we force a new one locally. + globalFieldInfo = cloneFieldInfo(globalFieldInfo, ++maxFieldNumber); +} +newFields.put(update.field, globalFieldInfo); + } else { +segmentFieldInfo.setDocValuesType(update.type); +newFields.put(update.field, segmentFieldInfo); Review comment: this `put` call should be unnecessary since the field is already in the map?
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402225938 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map<String,FieldInfo> newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry<String,String> e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). 
+ FieldInfo clone = cloneFieldInfo(fi, fi.number); + newFields.put(clone.name, clone); + maxFieldNumber = Math.max(clone.number, maxFieldNumber); } // create new fields with the right DV type +FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); for (List<DocValuesFieldUpdates> updates : pendingDVUpdates.values()) { DocValuesFieldUpdates update = updates.get(0); - FieldInfo fieldInfo = builder.getOrAdd(update.field); - fieldInfo.setDocValuesType(update.type); + FieldInfo globalFieldInfo = builder.getOrAdd(update.field); + globalFieldInfo.setDocValuesType(update.type); Review comment: Sorry I wasn't sure how that worked when doing previous reviews, but the field infos of the writer should be updated when the doc-value update is performed. So the FieldInfo should always exist at this point and the doc-value type should always be correct. So we should be able to assert here instead of calling `setDocValuesType`.
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402233937 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map<String,FieldInfo> newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry<String,String> e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). 
+ FieldInfo clone = cloneFieldInfo(fi, fi.number); + newFields.put(clone.name, clone); + maxFieldNumber = Math.max(clone.number, maxFieldNumber); } // create new fields with the right DV type +FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); for (List<DocValuesFieldUpdates> updates : pendingDVUpdates.values()) { DocValuesFieldUpdates update = updates.get(0); - FieldInfo fieldInfo = builder.getOrAdd(update.field); - fieldInfo.setDocValuesType(update.type); + FieldInfo globalFieldInfo = builder.getOrAdd(update.field); + globalFieldInfo.setDocValuesType(update.type); Review comment: maybe add a comment on the above `getOrAdd` call to mention that it never needs to add the field?
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402231793 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map<String,FieldInfo> newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry<String,String> e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). 
+ FieldInfo clone = cloneFieldInfo(fi, fi.number); + newFields.put(clone.name, clone); + maxFieldNumber = Math.max(clone.number, maxFieldNumber); } // create new fields with the right DV type +FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); for (List<DocValuesFieldUpdates> updates : pendingDVUpdates.values()) { DocValuesFieldUpdates update = updates.get(0); - FieldInfo fieldInfo = builder.getOrAdd(update.field); - fieldInfo.setDocValuesType(update.type); + FieldInfo globalFieldInfo = builder.getOrAdd(update.field); + globalFieldInfo.setDocValuesType(update.type); + FieldInfo segmentFieldInfo = newFields.get(update.field); + if (segmentFieldInfo == null) { +// the field is not present in this segment so we can fallback to the global fields. +if (globalFieldInfo.number <= maxFieldNumber) { + // the global field number is already used in this segment for a different field so we force a new one locally. + globalFieldInfo = cloneFieldInfo(globalFieldInfo, ++maxFieldNumber); +} Review comment: I think we should either have an `else` branch that updates the `maxFieldNumber` to avoid conflicts (in case dv updates introduce multiple fields) or always clone the `FieldInfo` to update the number. I have a preference for the latter since it requires fewer branches and makes the code easier to reason about?
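The "always clone with a fresh local number" alternative jpountz prefers can be sketched with a minimal stand-in for the Lucene types (the `FieldNum` record below is hypothetical; only the name/number pair matters for the remapping logic):

```java
import java.util.HashMap;
import java.util.Map;

public class FieldNumberRemap {
  // Hypothetical stand-in for Lucene's FieldInfo, reduced to name + number.
  record FieldNum(String name, int number) {}

  // Assign a segment-local number to a field coming from the global FieldInfos:
  // unconditionally take the next free local number, so no branch is needed to
  // detect whether the global number collides with an existing local one.
  static FieldNum addGlobalField(Map<String, FieldNum> segmentFields,
                                 FieldNum globalField) {
    int maxLocal = -1;
    for (FieldNum f : segmentFields.values()) {
      maxLocal = Math.max(maxLocal, f.number());
    }
    FieldNum local = new FieldNum(globalField.name(), maxLocal + 1);
    segmentFields.put(local.name(), local);
    return local;
  }

  public static void main(String[] args) {
    Map<String, FieldNum> seg = new HashMap<>();
    seg.put("title", new FieldNum("title", 0));
    seg.put("body", new FieldNum("body", 1));
    // the global "price" field happens to carry number 1, which is taken locally
    FieldNum added = addGlobalField(seg, new FieldNum("price", 1));
    System.out.println(added.number()); // 2 -- never collides with a local number
  }
}
```

Cloning unconditionally trades a tiny allocation for the conflict-detection branch, which is the "easier to reason about" property the comment argues for.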
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402237062 ## File path: lucene/core/src/test/org/apache/lucene/index/TestNumericDocValuesUpdates.java ## @@ -1483,6 +1484,83 @@ public void testAddIndexes() throws Exception { IOUtils.close(dir1, dir2); } + public void testUpdatesAfterAddIndexes() throws Exception { Review comment: It might be better to split into one unit test for the addIndexes case when the updated field already exists in the segment, and another one for the case when the updated field doesn't exist in the segment yet?
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402228065 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map<String,FieldInfo> newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry<String,String> e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). 
+ FieldInfo clone = cloneFieldInfo(fi, fi.number); + newFields.put(clone.name, clone); + maxFieldNumber = Math.max(clone.number, maxFieldNumber); } // create new fields with the right DV type +FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); for (List<DocValuesFieldUpdates> updates : pendingDVUpdates.values()) { DocValuesFieldUpdates update = updates.get(0); - FieldInfo fieldInfo = builder.getOrAdd(update.field); - fieldInfo.setDocValuesType(update.type); + FieldInfo globalFieldInfo = builder.getOrAdd(update.field); + globalFieldInfo.setDocValuesType(update.type); Review comment: Sorry I suggested this on the previous review, but the doc-value type is actually supposed to be set on the updateDocValue call, so we should even be able to assert that the doc-value type is correct at this stage instead of setting it.
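The assert-instead-of-set pattern suggested here can be illustrated with hypothetical stand-ins for the Lucene types (neither `DocValuesType` nor `FieldInfo` below is the real class; the point is only the invariant check):

```java
public class AssertDvType {
  // Hypothetical, cut-down stand-ins for the Lucene types.
  enum DocValuesType { NONE, NUMERIC, BINARY }

  static class FieldInfo {
    final String name;
    DocValuesType dvType;
    FieldInfo(String name, DocValuesType dvType) { this.name = name; this.dvType = dvType; }
  }

  // Under the invariant that updateDocValues already recorded the type,
  // the write path only verifies it instead of mutating shared state.
  static void checkUpdateType(FieldInfo global, DocValuesType updateType) {
    if (global.dvType != updateType) {
      throw new AssertionError("doc-values type for " + global.name
          + " should have been set to " + updateType + " at update time, was " + global.dvType);
    }
  }

  public static void main(String[] args) {
    FieldInfo fi = new FieldInfo("price", DocValuesType.NUMERIC);
    checkUpdateType(fi, DocValuesType.NUMERIC); // passes under the invariant
    System.out.println("ok");
  }
}
```

Turning the setter into a check makes a violated invariant fail loudly at write time rather than silently rewriting the type.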
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update
jpountz commented on a change in pull request #1394: LUCENE-9300: Fix field infos update on doc values update URL: https://github.com/apache/lucene-solr/pull/1394#discussion_r402234114 ## File path: lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java ## @@ -543,27 +543,39 @@ public synchronized boolean writeFieldUpdates(Directory dir, FieldInfos.FieldNum try { // clone FieldInfos so that we can update their dvGen separately from -// the reader's infos and write them to a new fieldInfos_gen file -FieldInfos.Builder builder = new FieldInfos.Builder(fieldNumbers); -// cannot use builder.add(reader.getFieldInfos()) because it does not -// clone FI.attributes as well FI.dvGen +// the reader's infos and write them to a new fieldInfos_gen file. +int maxFieldNumber = -1; +Map<String,FieldInfo> newFields = new HashMap<>(); for (FieldInfo fi : reader.getFieldInfos()) { - FieldInfo clone = builder.add(fi); - // copy the stuff FieldInfos.Builder doesn't copy - for (Entry<String,String> e : fi.attributes().entrySet()) { -clone.putAttribute(e.getKey(), e.getValue()); - } - clone.setDocValuesGen(fi.getDocValuesGen()); + // cannot use builder.add(fi) because it does not preserve + // the local field number. Field numbers can be different from the global ones + // if the segment was created externally (with IndexWriter#addIndexes(Directory)). Review comment: ```suggestion // if the segment was created externally (and added to this index with IndexWriter#addIndexes(Directory)). ```
[jira] [Commented] (LUCENE-9301) Gradle: Jar MANIFEST incomplete
[ https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073599#comment-17073599 ] Jan Høydahl commented on LUCENE-9301: - Built the PR locally and verified that the version info now shows up in Solr Admin UI. Did not try to understand all the changes or check other MANIFEST files. > Gradle: Jar MANIFEST incomplete > --- > > Key: LUCENE-9301 > URL: https://issues.apache.org/jira/browse/LUCENE-9301 > Project: Lucene - Core > Issue Type: Sub-task > Components: general/build >Affects Versions: master (9.0) >Reporter: Jan Høydahl >Assignee: Dawid Weiss >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > After building with gradle, the MANIFEST.MF file for e.g. solr-core.jar > contains > {noformat} > Manifest-Version: 1.0 > {noformat} > While when building with ant, it says > {noformat} > Manifest-Version: 1.0 > Ant-Version: Apache Ant 1.10.7 > Created-By: 11.0.6+10 (AdoptOpenJDK) > Extension-Name: org.apache.solr > Specification-Title: Apache Solr Search Server: solr-core > Specification-Version: 9.0.0 > Specification-Vendor: The Apache Software Foundation > Implementation-Title: org.apache.solr > Implementation-Version: 9.0.0-SNAPSHOT 9b5542ad55da601e0bdfda96bad8c2c > cabbbc397 - janhoy - 2020-04-01 16:24:09 > Implementation-Vendor: The Apache Software Foundation > X-Compile-Source-JDK: 11 > X-Compile-Target-JDK: 11 > {noformat} > In addition, with ant, the META-INF folder also contains LICENSE.txt and > NOTICE.txt files. > There is a macro {{build-manifest}} in common-build.xml that seems to build > the manifest. > The effect of this is e.g. that spec and implementation versions do not show > in Solr Admin UI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique value
bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP) URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402207235 ## File path: solr/core/src/java/org/apache/solr/search/CollapsingQParserPlugin.java ## @@ -524,6 +533,7 @@ public int docID() { public OrdScoreCollector(int maxDoc, int segments, + PrimitiveMapFactory mapFactory, Review comment: I agree that we should use a primitive map instead of an array when the array load is low. The speed of the map put/get is sufficiently fast that I suppose it should not be visible in the overall perf of this CollapsingQParser. But do we need to have a new factory for primitive map to switch between hash map and array? I think we could need this if we often had both cases (high load and low load of the array) AND if we had a way to switch automatically. But I suppose we don't know in advance (so we don't switch auto), and that most often the load is low. So I suggest that we don't add the complexity of these new classes (the new util.numeric package) and we just use HPPC primitive map directly.
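The direct-use alternative suggested above would look roughly like the sketch below. HPPC's `IntIntHashMap` is the class the comment refers to; `java.util.HashMap` stands in here only to keep the sketch dependency-free — the call shape (`put`/`getOrDefault`/`size`) is the same, without the boxing a primitive map avoids:

```java
import java.util.HashMap;
import java.util.Map;

public class SparseOrdMap {
  // "empty" sentinel matching the emptyValue convention of the array-based patch
  static final int EMPTY = -1;

  public static void main(String[] args) {
    // Array version: memory is O(field cardinality) even if few ords are touched.
    // int[] ordToDoc = new int[10_000_000]; Arrays.fill(ordToDoc, EMPTY);

    // Map version: memory scales with the number of groups actually collapsed.
    Map<Integer, Integer> ordToDoc = new HashMap<>();
    ordToDoc.put(42, 7);            // best doc seen for ord 42
    ordToDoc.put(1_000_000, 13);    // best doc seen for ord 1,000,000

    // getOrDefault mirrors the emptyValue lookup of the array version
    System.out.println(ordToDoc.getOrDefault(42, EMPTY)); // 7
    System.out.println(ordToDoc.getOrDefault(5, EMPTY));  // -1
    System.out.println(ordToDoc.size());                  // 2
  }
}
```

The trade-off the thread discusses is exactly this: the map wins when few of the possible ords are seen (low load), while the flat array wins when most are.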
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique value
bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP) URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402198934 ## File path: solr/core/src/java/org/apache/solr/util/numeric/IntIntArrayBasedMap.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.util.numeric; + +import java.util.Arrays; +import java.util.Iterator; +import java.util.function.IntConsumer; + +import org.apache.lucene.util.ArrayUtil; + +public class IntIntArrayBasedMap implements IntIntMap { + + private int size; Review comment: Should be named "capacity" rather than "size".
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique value
bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402200092

File path: solr/core/src/java/org/apache/solr/util/numeric/IntIntArrayBasedMap.java

@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.numeric;
+
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.function.IntConsumer;
+
+import org.apache.lucene.util.ArrayUtil;
+
+public class IntIntArrayBasedMap implements IntIntMap {
+
+  private int size;
+  private int[] keyValues;
+  private int emptyValue;
+
+  public IntIntArrayBasedMap(int initialSize, int emptyValue) {
+    this.size = initialSize;
+    this.keyValues = new int[initialSize];
+    this.emptyValue = emptyValue;
+    if (emptyValue != 0) {
+      Arrays.fill(keyValues, emptyValue);
+    }
+  }
+
+  @Override
+  public void set(int key, int value) {
+    if (key >= size) {
+      keyValues = ArrayUtil.grow(keyValues);
+      if (emptyValue != 0) {
+        for (int i = size; i < keyValues.length; i++) {
+          keyValues[i] = emptyValue;
+        }
+      }
+      size = keyValues.length;
+    }
+    keyValues[key] = value;
+  }
+
+  @Override
+  public int get(int key) {
+    if (key >= size) {
+      return emptyValue;
+    }
+    return keyValues[key];
+  }
+
+  @Override
+  public void forEachValue(IntConsumer consumer) {
+    for (int val : keyValues) {
+      if (val != emptyValue) {
+        consumer.accept(val);
+      }
+    }
+  }
+
+  @Override
+  public void remove(int key) {
+    if (key < size) keyValues[key] = emptyValue;
+  }
+
+  @Override
+  public int size() {
+    return keyValues.length;

Review comment:
keyValues.length is a capacity while size() semantics is more a size: the current number of entries in the map. We should return a counter instead.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
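The fix the reviewer asks for — returning an entry counter rather than the backing array's length — can be sketched as follows. This is illustrative only: the class name is hypothetical, it is not the actual Solr patch, and Lucene's ArrayUtil is replaced by plain Arrays.copyOf so the sketch is self-contained.

```java
import java.util.Arrays;

// Hypothetical sketch: an int->int map backed by a plain array that keeps
// a separate entry counter so size() reports entries, not capacity.
class CountingIntIntMap {
  private int[] values;
  private final int emptyValue;
  private int entryCount; // live entries, distinct from values.length (the capacity)

  CountingIntIntMap(int initialCapacity, int emptyValue) {
    this.values = new int[initialCapacity];
    this.emptyValue = emptyValue;
    Arrays.fill(values, emptyValue);
  }

  void set(int key, int value) {
    if (key >= values.length) {
      // grow to at least key + 1, filling the new tail with emptyValue
      int oldLen = values.length;
      int newLen = Math.max(key + 1, oldLen + (oldLen >> 1));
      values = Arrays.copyOf(values, newLen);
      Arrays.fill(values, oldLen, newLen, emptyValue);
    }
    if (values[key] == emptyValue && value != emptyValue) {
      entryCount++; // a previously empty slot becomes occupied
    }
    values[key] = value;
  }

  void remove(int key) {
    if (key < values.length && values[key] != emptyValue) {
      values[key] = emptyValue;
      entryCount--;
    }
  }

  int size() {
    return entryCount; // entries, not values.length
  }
}
```

With this counter, size() answers "how many entries does the map hold?" regardless of how large the backing array has grown.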
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique value
bruno-roustant commented on a change in pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
URL: https://github.com/apache/lucene-solr/pull/1395#discussion_r402201394

File path: solr/core/src/java/org/apache/solr/util/numeric/IntIntArrayBasedMap.java

@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.numeric;
+
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.function.IntConsumer;
+
+import org.apache.lucene.util.ArrayUtil;
+
+public class IntIntArrayBasedMap implements IntIntMap {
+
+  private int size;
+  private int[] keyValues;
+  private int emptyValue;
+
+  public IntIntArrayBasedMap(int initialSize, int emptyValue) {
+    this.size = initialSize;
+    this.keyValues = new int[initialSize];
+    this.emptyValue = emptyValue;
+    if (emptyValue != 0) {
+      Arrays.fill(keyValues, emptyValue);
+    }
+  }
+
+  @Override
+  public void set(int key, int value) {
+    if (key >= size) {
+      keyValues = ArrayUtil.grow(keyValues);

Review comment:
Here keyValues is enlarged (based on its current capacity), but are we sure it is enlarged enough to hold "key"?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
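The concern is valid: the one-argument ArrayUtil.grow(array) only guarantees room for one more element beyond the current length, so a single call can still be too small for an arbitrary key; the two-argument ArrayUtil.grow(array, minSize) grows to at least minSize. A self-contained sketch of that fix follows, where growToFit is a hypothetical stand-in for the two-argument Lucene overload, not the real API.

```java
// Sketch of the fix the review points at: grow to at least key + 1 so one
// call is guaranteed to make room for the key being set.
class GrowDemo {
  // Hypothetical stand-in for ArrayUtil.grow(array, minSize): grows to at
  // least minSize, oversizing by ~50% to amortize repeated growth.
  static int[] growToFit(int[] array, int minSize) {
    if (array.length >= minSize) return array;
    int newLen = Math.max(minSize, array.length + (array.length >> 1));
    return java.util.Arrays.copyOf(array, newLen);
  }

  // Returns the (possibly replaced) backing array after storing the value.
  static int[] set(int[] keyValues, int key, int value) {
    if (key >= keyValues.length) {
      // Not growToFit(keyValues, keyValues.length + 1): that may fall short of key.
      keyValues = growToFit(keyValues, key + 1);
    }
    keyValues[key] = value;
    return keyValues;
  }
}
```

Passing the minimum required length makes the growth correct for any key, whereas growing relative to the current capacity alone would need a loop to catch up.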
[jira] [Commented] (LUCENE-9301) Gradle: Jar MANIFEST incomplete
[ https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073575#comment-17073575 ]

Dawid Weiss commented on LUCENE-9301:
-

https://github.com/apache/lucene-solr/pull/1396

Suggested patch. I did try to make it similar to what ant produces at the moment, but I didn't verify each and every subproject, so it'd be good if somebody had a look.

> Gradle: Jar MANIFEST incomplete
> ---
>
> Key: LUCENE-9301
> URL: https://issues.apache.org/jira/browse/LUCENE-9301
> Project: Lucene - Core
> Issue Type: Sub-task
> Components: general/build
> Affects Versions: master (9.0)
> Reporter: Jan Høydahl
> Assignee: Dawid Weiss
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> After building with gradle, the MANIFEST.MF file for e.g. solr-core.jar contains
> {noformat}
> Manifest-Version: 1.0
> {noformat}
> While when building with ant, it says
> {noformat}
> Manifest-Version: 1.0
> Ant-Version: Apache Ant 1.10.7
> Created-By: 11.0.6+10 (AdoptOpenJDK)
> Extension-Name: org.apache.solr
> Specification-Title: Apache Solr Search Server: solr-core
> Specification-Version: 9.0.0
> Specification-Vendor: The Apache Software Foundation
> Implementation-Title: org.apache.solr
> Implementation-Version: 9.0.0-SNAPSHOT 9b5542ad55da601e0bdfda96bad8c2ccabbbc397 - janhoy - 2020-04-01 16:24:09
> Implementation-Vendor: The Apache Software Foundation
> X-Compile-Source-JDK: 11
> X-Compile-Target-JDK: 11
> {noformat}
> In addition, with ant, the META-INF folder also contains LICENSE.txt and NOTICE.txt files.
> There is a macro {{build-manifest}} in common-build.xml that seems to build the manifest.
> The effect of this is e.g. that spec and implementation versions do not show in the Solr Admin UI

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss opened a new pull request #1396: LUCENE-9301: add a manifest entries to JARs.
dweiss opened a new pull request #1396: LUCENE-9301: add a manifest entries to JARs. URL: https://github.com/apache/lucene-solr/pull/1396 Suggested patch. I did try to make it similar to what ant produces at the moment but I didn't verify each and every subproject so it'd be good if somebody had a look. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
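For context, the kind of Gradle configuration such a change might involve can be sketched as below. This is a hedged illustration only: the attribute names are modeled on the ant output quoted in LUCENE-9301, and the snippet is an assumption, not the actual content of PR #1396.

```groovy
// Illustrative sketch (Groovy DSL): populate each subproject's jar manifest
// with specification/implementation attributes, similar to what ant emits.
subprojects {
  plugins.withType(JavaPlugin) {
    tasks.withType(Jar) {
      manifest {
        attributes(
          'Extension-Name': project.group,
          'Specification-Title': project.description ?: project.name,
          'Specification-Version': project.version,
          'Specification-Vendor': 'The Apache Software Foundation',
          'Implementation-Title': project.group,
          'Implementation-Version': project.version,
          'Implementation-Vendor': 'The Apache Software Foundation',
          'X-Compile-Source-JDK': '11',
          'X-Compile-Target-JDK': '11'
        )
      }
    }
  }
}
```

A complete port would also need to add LICENSE.txt and NOTICE.txt under META-INF, which ant's jarify macro does as well.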
[jira] [Commented] (SOLR-14371) Zk StatusHandler should know about dynamic zk config
[ https://issues.apache.org/jira/browse/SOLR-14371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073569#comment-17073569 ]

Jan Høydahl commented on SOLR-14371:

I suggest as a first step that we handle this by letting the Admin UI display a clear warning if it detects that the ZK connection string is different from the dynamic list of hosts. That way people are aware of the risks until we fix Solr's zk code to actually be aware of dynamic reconfiguration.

!dynamic-reconfig-warning.png|width=900!

As that effort overlaps/collides with SOLR-14306, let's defer that part to that issue. I'll commit the proposed warning logic to the PR.

> Zk StatusHandler should know about dynamic zk config
>
> Key: SOLR-14371
> URL: https://issues.apache.org/jira/browse/SOLR-14371
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Jan Høydahl
> Assignee: Jan Høydahl
> Priority: Major
> Attachments: dynamic-reconfig-warning.png, dynamic-reconfig.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> With zk 3.5 it supports dynamic reconfig, which is used by the solr-operator for Kubernetes. Then Solr is given a zkHost of one url pointing to a LB (Service) in front of all zookeepers, and the zkclient will then fetch the list of all zookeepers from the special zknode /zookeeper/config and reconfigure itself with connections to all zk nodes listed. So you can then scale the number of zk nodes up/down dynamically without restarting solr.
> However, the Admin UI displays errors since it believes it is connected to only one zk, which contradicts what zk itself reports. We need to make ZookeeperStatusHandler aware of dynamic reconfig so it asks the zkclient what the current zkHost is instead of relying on the ZK_HOST static setting.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
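The comparison the proposed warning needs — the statically configured zkHost versus the dynamic host list — requires deriving client endpoints from ZooKeeper's dynamic config format (lines like `server.1=host:2888:3888:participant;0.0.0.0:2181`, where the client port follows the semicolon). The following is an illustrative, self-contained sketch; the class and method names are hypothetical and this is not Solr's actual ZookeeperStatusHandler code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: derive client endpoints ("host:clientPort") from the
// content of the /zookeeper/config znode, so they can be compared against the
// statically configured zkHost string before showing a warning.
class DynamicZkConfig {
  static List<String> clientEndpoints(String dynamicConfig) {
    List<String> endpoints = new ArrayList<>();
    for (String line : dynamicConfig.split("\n")) {
      line = line.trim();
      if (!line.startsWith("server.")) continue; // skip "version=..." and blanks
      int semi = line.indexOf(';');
      if (semi < 0) continue; // no client address advertised for this server
      // e.g. "server.1=zoo1:2888:3888:participant;0.0.0.0:2181"
      String clientAddr = line.substring(semi + 1);                       // "0.0.0.0:2181"
      String serverHost = line.substring(line.indexOf('=') + 1).split(":")[0]; // "zoo1"
      String clientPort = clientAddr.substring(clientAddr.lastIndexOf(':') + 1); // "2181"
      endpoints.add(serverHost + ":" + clientPort);
    }
    return endpoints;
  }
}
```

The warning itself would then fire when the endpoint set parsed here differs from the hosts in the configured connection string.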
[jira] [Assigned] (SOLR-12845) Add a default cluster policy
[ https://issues.apache.org/jira/browse/SOLR-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki reassigned SOLR-12845:
---

Assignee: Andrzej Bialecki

> Add a default cluster policy
>
> Key: SOLR-12845
> URL: https://issues.apache.org/jira/browse/SOLR-12845
> Project: Solr
> Issue Type: Improvement
> Components: AutoScaling
> Reporter: Shalin Shekhar Mangar
> Assignee: Andrzej Bialecki
> Priority: Major
> Attachments: SOLR-12845.patch, SOLR-12845.patch
>
> [~varunthacker] commented on SOLR-12739:
> bq. We should also ship with some default policies - "Don't allow more than one replica of a shard on the same JVM", "Distribute cores across the cluster evenly", "Distribute replicas per collection across the nodes"
> This issue is about adding these defaults. I propose the following as the default cluster policy:
> {code}
> # Each shard cannot have more than one replica on the same node if possible
> {"replica": "<2", "shard": "#EACH", "node": "#ANY", "strict":false}
> # Each collection's replicas should be equally distributed amongst nodes
> {"replica": "#EQUAL", "node": "#ANY", "strict":false}
> # All cores should be equally distributed amongst nodes
> {"cores": "#EQUAL", "node": "#ANY", "strict":false}
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14371) Zk StatusHandler should know about dynamic zk config
[ https://issues.apache.org/jira/browse/SOLR-14371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-14371: --- Attachment: dynamic-reconfig-warning.png > Zk StatusHandler should know about dynamic zk config > > > Key: SOLR-14371 > URL: https://issues.apache.org/jira/browse/SOLR-14371 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Attachments: dynamic-reconfig-warning.png, dynamic-reconfig.png > > Time Spent: 50m > Remaining Estimate: 0h > > With zk 3.5 it supports dynamic reconfig, which is used by the solr-operator > for Kubernetes. Then Solr is given a zkHost of one url pointing to a LB > (Service) in front of all zookeepers, and the zkclient will then fetch list > of all zookeepers from special zknode /zookeeper/config and reconfigure > itself with connection to all zk nodes listed. So you can then scale up/down > number of zk nodes dynamically without restarting solr. > However, the Admin UI displays errors since it believes it is connected to > only one zk, which is contradictory to what zk itself reports. We need to > make ZookeeperStatusHandler aware of dynamic reconfig so it asks zkclient > what current zkHost is instead of relying on Zk_HOST static setting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (SOLR-12847) Cut over implementation of maxShardsPerNode to a collection policy
[ https://issues.apache.org/jira/browse/SOLR-12847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki reassigned SOLR-12847:
---

Assignee: Andrzej Bialecki

> Cut over implementation of maxShardsPerNode to a collection policy
> --
>
> Key: SOLR-12847
> URL: https://issues.apache.org/jira/browse/SOLR-12847
> Project: Solr
> Issue Type: Bug
> Components: AutoScaling, SolrCloud
> Reporter: Shalin Shekhar Mangar
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: master (9.0), 8.2
>
> We've gone back and forth over handling maxShardsPerNode with autoscaling policies (see SOLR-11005 for history). Now that we've reimplemented support for creating collections with maxShardsPerNode when the autoscaling policy is enabled, we should re-examine how it is implemented.
> I propose that we fold maxShardsPerNode (if specified) into a collection-level policy that overrides the corresponding default in the cluster policy (see SOLR-12845). We'll need to ensure that if maxShardsPerNode is specified then the user sees neither violations nor corresponding suggestions because of the default cluster policy.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-9301) Gradle: Jar MANIFEST incomplete
[ https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss reassigned LUCENE-9301:
---

Assignee: Dawid Weiss

> Gradle: Jar MANIFEST incomplete
> ---
>
> Key: LUCENE-9301
> URL: https://issues.apache.org/jira/browse/LUCENE-9301
> Project: Lucene - Core
> Issue Type: Sub-task
> Components: general/build
> Affects Versions: master (9.0)
> Reporter: Jan Høydahl
> Assignee: Dawid Weiss
> Priority: Major
>
> After building with gradle, the MANIFEST.MF file for e.g. solr-core.jar contains
> {noformat}
> Manifest-Version: 1.0
> {noformat}
> While when building with ant, it says
> {noformat}
> Manifest-Version: 1.0
> Ant-Version: Apache Ant 1.10.7
> Created-By: 11.0.6+10 (AdoptOpenJDK)
> Extension-Name: org.apache.solr
> Specification-Title: Apache Solr Search Server: solr-core
> Specification-Version: 9.0.0
> Specification-Vendor: The Apache Software Foundation
> Implementation-Title: org.apache.solr
> Implementation-Version: 9.0.0-SNAPSHOT 9b5542ad55da601e0bdfda96bad8c2ccabbbc397 - janhoy - 2020-04-01 16:24:09
> Implementation-Vendor: The Apache Software Foundation
> X-Compile-Source-JDK: 11
> X-Compile-Target-JDK: 11
> {noformat}
> In addition, with ant, the META-INF folder also contains LICENSE.txt and NOTICE.txt files.
> There is a macro {{build-manifest}} in common-build.xml that seems to build the manifest.
> The effect of this is e.g. that spec and implementation versions do not show in the Solr Admin UI

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14365) CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values
[ https://issues.apache.org/jira/browse/SOLR-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073479#comment-17073479 ]

ASF subversion and git services commented on SOLR-14365:

Commit 0d82a9f5cdac67c9de92c979e016c1c6b1f8dcf4 in lucene-solr's branch refs/heads/jira/SOLR-14365 from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0d82a9f ]

SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)

> CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values
> --
>
> Key: SOLR-14365
> URL: https://issues.apache.org/jira/browse/SOLR-14365
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 8.4.1
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Major
> Attachments: SOLR-14365.patch
>
> Since Collapsing is a PostFilter, documents that reach Collapsing must match all filters and queries, so the number of documents Collapsing needs to collect/compute scores for is a small fraction of the total number of documents in the index. So why do we always need to consume the memory (for the int[] and float[] arrays) for all unique values of the collapsed field? If the number of unique values of the collapsed field found in the documents that match the queries and filters is 300, then we only need int[] and float[] arrays with a size of 300, not 1.2 million. However, we don't know which values of the collapsed field will show up in the results, so we cannot use a smaller array.
> The easy fix for this problem is to use only as much as we need, via IntIntMap and IntFloatMap implementations that hold primitives and are much more space-efficient than the Java HashMap. These maps can be slower (10x or 20x) than plain int[] and float[] arrays if the number of matched documents is large (almost all documents match the queries and other filters). But our belief is that this does not happen frequently (how frequently do we run collapsing on the entire index?).
> For this issue I propose adding 2 methods for collapsing:
> * array: the current implementation
> * hash: the new approach, which will be the default method
> Later we can add another method {{smart}} which automatically picks a method based on a comparison between {{number of docs matched queries and filters}} and {{number of unique values of the field}}.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
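To make the space tradeoff concrete, a minimal open-addressing int-to-int hash map can be sketched as below. This is illustrative only, not the actual SOLR-14365 implementation: memory scales with the number of entries collected rather than with the field's unique-value count, at the cost of hashing and linear probing on each access.

```java
// Minimal open-addressing int->int map: allocation is proportional to the
// entries actually inserted, not to the total number of unique field values.
class IntIntHashMap {
  private int[] keys;
  private int[] vals;
  private boolean[] used; // occupancy flags, so any int (including 0) is a valid key
  private int count;

  IntIntHashMap(int expectedEntries) {
    // round up to a power of two at <=50% load so probing stays short
    int cap = Integer.highestOneBit(Math.max(4, expectedEntries * 2) - 1) << 1;
    keys = new int[cap];
    vals = new int[cap];
    used = new boolean[cap];
  }

  void put(int key, int value) {
    if (count * 2 >= keys.length) rehash(); // keep load factor <= 0.5
    int i = slot(key, keys, used);
    if (!used[i]) { used[i] = true; keys[i] = key; count++; }
    vals[i] = value;
  }

  int getOrDefault(int key, int dflt) {
    int i = slot(key, keys, used);
    return used[i] ? vals[i] : dflt;
  }

  int size() { return count; }

  // Multiplicative hash plus linear probing; terminates because load <= 0.5.
  private static int slot(int key, int[] ks, boolean[] us) {
    int mask = ks.length - 1;
    int i = (key * 0x9E3779B9) & mask;
    while (us[i] && ks[i] != key) i = (i + 1) & mask;
    return i;
  }

  private void rehash() {
    int[] ok = keys, ov = vals; boolean[] ou = used;
    keys = new int[ok.length * 2]; vals = new int[ok.length * 2]; used = new boolean[ok.length * 2];
    for (int i = 0; i < ok.length; i++) {
      if (ou[i]) { int j = slot(ok[i], keys, used); used[j] = true; keys[j] = ok[i]; vals[j] = ov[i]; }
    }
  }
}
```

Against a field with 1.2 million unique values but only a few hundred collapsed groups in the result set, a structure like this allocates a few hundred slots instead of 1.2 million array entries, which is precisely the tradeoff described above.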
[GitHub] [lucene-solr] CaoManhDat opened a new pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP)
CaoManhDat opened a new pull request #1395: SOLR-14365: CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values (WIP) URL: https://github.com/apache/lucene-solr/pull/1395 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org