[jira] [Commented] (SOLR-14437) Remove/refactor "ApiSupport" interface? (for V2 API)
[ https://issues.apache.org/jira/browse/SOLR-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17096141#comment-17096141 ]

Noble Paul commented on SOLR-14437:
-----------------------------------

We can remove the methods
{code}
default Boolean registerV1() { return Boolean.TRUE; }

default Boolean registerV2() { return Boolean.FALSE; }
{code}
Can you explain a bit more about {{getApis()}}?

> Remove/refactor "ApiSupport" interface? (for V2 API)
> ----------------------------------------------------
>
>                 Key: SOLR-14437
>                 URL: https://issues.apache.org/jira/browse/SOLR-14437
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: v2 API
>    Affects Versions: master (9.0)
>            Reporter: David Smiley
>            Priority: Major
>
> ApiSupport.java is an interface relating to the V2 API that is implemented
> by all request handlers, both those at a core level and others. It's
> essentially this: (comments removed)
> {code:java}
> public interface ApiSupport {
>   Collection<Api> getApis();
>   default Boolean registerV1() { return Boolean.TRUE; }
>   default Boolean registerV2() { return Boolean.FALSE; }
> }
> {code}
> Firstly, let's assume that the handler will always be registered in V2. All
> implementations I've seen explicitly return true here; maybe I'm missing
> something though.
> Secondly, getApis() seems problematic for the ability to lazily load request
> handlers. Can we assume, at least for core-level request handlers, that
> there is exactly one API and, where necessary, rely on the "spec" JSON
> definition -- see org.apache.solr.api.ApiBag#registerLazy?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
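For illustration, the proposal above would collapse the interface to a single method. A minimal stand-alone sketch of that hypothetical shape (Api is stubbed here as a marker class; the real types live in org.apache.solr.api, and V2 registration would simply be assumed):

```java
import java.util.Collection;
import java.util.Collections;

// Stub standing in for org.apache.solr.api.Api.
class Api {}

// Hypothetical shape of ApiSupport once registerV1()/registerV2() are
// removed: registration in V2 is assumed, so only getApis() remains.
interface ApiSupport {
  Collection<Api> getApis();
}

// A handler then only has to say which V2 APIs it exposes.
class DemoHandler implements ApiSupport {
  @Override
  public Collection<Api> getApis() {
    return Collections.singletonList(new Api());
  }
}
```

This is a sketch of the direction discussed, not the actual Solr change; whether getApis() itself survives lazy loading is exactly the open question in the thread.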
[GitHub] [lucene-solr] madrob opened a new pull request #1469: SOLR-14274 More avoid re-registering multiple JVM metrics
madrob opened a new pull request #1469:
URL: https://github.com/apache/lucene-solr/pull/1469

----
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [lucene-solr] madrob commented on pull request #1412: Add MinimalSolrTest for scale testing
madrob commented on pull request #1412:
URL: https://github.com/apache/lucene-solr/pull/1412#issuecomment-621576764

> Test-framework as a stub makes sense

I tried making this work, but it got really messy because I started having to
move the configset across modules and rearrange much more of the project than
I wanted to. Part of the problem is that we don't have something like
`solr/benchmark` matching the Lucene counterpart.
[GitHub] [lucene-solr] madrob commented on pull request #1463: SOLR-14440 Cert Auth plugin
madrob commented on pull request #1463:
URL: https://github.com/apache/lucene-solr/pull/1463#issuecomment-621515099

> But check what it says if you do not have the right ssl cert. Ideally it
> should show the login screen with a helpful message that Cert Auth is used
> and you need to provide the correct cert.

It does this now! Great suggestion.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1467: LUCENE-9350: Don't hold references to large automata on FuzzyQuery
madrob commented on a change in pull request #1467:
URL: https://github.com/apache/lucene-solr/pull/1467#discussion_r417658589

## File path: lucene/core/src/java/org/apache/lucene/search/FuzzyQuery.java
@@ -237,22 +216,9 @@ public boolean equals(Object obj) {
     if (getClass() != obj.getClass()) return false;
     FuzzyQuery other = (FuzzyQuery) obj;
-    // Note that we don't need to compare termLength or automata because they
-    // are entirely determined by the other fields
-    if (maxEdits != other.maxEdits)
-      return false;
-    if (prefixLength != other.prefixLength)
-      return false;
-    if (maxExpansions != other.maxExpansions)
-      return false;
-    if (transpositions != other.transpositions)
-      return false;
-    if (term == null) {
-      if (other.term != null)
-        return false;
-    } else if (!term.equals(other.term))
-      return false;
-    return true;
+    return Objects.equals(maxEdits, other.maxEdits) && Objects.equals(prefixLength, other.prefixLength)

Review comment:
Is there a school of thought that we want to compare the objects most likely
to differ first, competing with the school that advocates comparing primitive
types first because they are faster?

## File path: lucene/core/src/java/org/apache/lucene/search/FuzzyAutomatonBuilder.java
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import org.apache.lucene.util.UnicodeUtil;
+import org.apache.lucene.util.automaton.CompiledAutomaton;
+import org.apache.lucene.util.automaton.LevenshteinAutomata;
+import org.apache.lucene.util.automaton.TooComplexToDeterminizeException;
+
+/**
+ * Builds a set of CompiledAutomaton for fuzzy matching on a given term,
+ * with specified maximum edit distance, fixed prefix and whether or not
+ * to allow transpositions.
+ */
+class FuzzyAutomatonBuilder {
+
+  private final String term;
+  private final int maxEdits;
+  private final LevenshteinAutomata levBuilder;
+  private final String prefix;
+  private final int termLength;
+
+  FuzzyAutomatonBuilder(String term, int maxEdits, int prefixLength, boolean transpositions) {
+    if (maxEdits < 0 || maxEdits > LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE) {
+      throw new IllegalArgumentException("max edits must be 0.."
+          + LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE + ", inclusive; got: " + maxEdits);
+    }
+    if (prefixLength < 0) {
+      throw new IllegalArgumentException("prefixLength cannot be less than 0");
+    }
+    this.term = term;
+    this.maxEdits = maxEdits;
+    int[] codePoints = stringToUTF32(term);
+    this.termLength = codePoints.length;
+    prefixLength = Math.min(prefixLength, codePoints.length);
+    int[] suffix = new int[codePoints.length - prefixLength];
+    System.arraycopy(codePoints, prefixLength, suffix, 0, suffix.length);
+    this.levBuilder = new LevenshteinAutomata(suffix, Character.MAX_CODE_POINT, transpositions);
+    this.prefix = UnicodeUtil.newString(codePoints, 0, prefixLength);
+  }
+
+  CompiledAutomaton[] buildAutomatonSet() {
+    CompiledAutomaton[] compiled = new CompiledAutomaton[maxEdits + 1];
+    for (int i = 0; i <= maxEdits; i++) {
+      try {
+        compiled[i] = new CompiledAutomaton(levBuilder.toAutomaton(i, prefix), true, false);
+      }
+      catch (TooComplexToDeterminizeException e) {
+        throw new FuzzyTermsEnum.FuzzyTermsException(term, e);
+      }
+    }
+    return compiled;
+  }
+
+  CompiledAutomaton buildMaxEditAutomaton() {

Review comment:
I'm confused about the difference between when we would want the full
automaton set and when we want the max-edit automaton. When is one useful but
not the other? Is this a simple optimization to skip building the relatively
inexpensive (exponentially less expensive, even) fewer-edit automata?

----
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
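To make the first review question concrete, here is a stand-alone sketch (not the real Lucene class; the field names mirror the diff and `term` is simplified to a String) of the primitives-first ordering. One point in its favor beyond speed: `Objects.equals` applied to `int` fields autoboxes both operands, while plain `==` does not.

```java
import java.util.Objects;

// Toy key class mirroring FuzzyQuery's equality-relevant fields, used to
// illustrate comparing cheap primitive fields first and leaving the only
// reference field, term, for last.
final class FuzzyKey {
  final int maxEdits, prefixLength, maxExpansions;
  final boolean transpositions;
  final String term; // stands in for org.apache.lucene.index.Term

  FuzzyKey(int maxEdits, int prefixLength, int maxExpansions,
           boolean transpositions, String term) {
    this.maxEdits = maxEdits;
    this.prefixLength = prefixLength;
    this.maxExpansions = maxExpansions;
    this.transpositions = transpositions;
    this.term = term;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (obj == null || getClass() != obj.getClass()) return false;
    FuzzyKey other = (FuzzyKey) obj;
    // Primitive comparisons are branch-cheap and allocation-free;
    // Objects.equals on int fields would autobox each operand.
    return maxEdits == other.maxEdits
        && prefixLength == other.prefixLength
        && maxExpansions == other.maxExpansions
        && transpositions == other.transpositions
        && Objects.equals(term, other.term);
  }

  @Override
  public int hashCode() {
    return Objects.hash(maxEdits, prefixLength, maxExpansions, transpositions, term);
  }
}
```

This is only a sketch of the trade-off under discussion; which ordering wins in practice depends on how often each field actually differs in the workload.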
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095947#comment-17095947 ]

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
------------------------------------------------------

Well, that's my point. The value in this attribute is the relation between
the {{numFound}} and the real number of hits for a query in the index. The
first time you see the attribute you may ask yourself what it is and refer to
the docs, but after that it should make sense. I would expect the value of a
{{numFoundPrecision}} to be a measure of the precision, which we can't
provide.

> Support for BlockMax WAND
> -------------------------
>
>                 Key: SOLR-13289
>                 URL: https://issues.apache.org/jira/browse/SOLR-13289
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Tomas Eduardo Fernandez Lobbe
>            Priority: Major
>         Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to
> expose this via Solr. When enabled, the numFound returned will not be exact.
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095935#comment-17095935 ]

David Smiley commented on SOLR-13289:
-------------------------------------

I think "precision" in there is easier to understand and does not mandate a
percentage. "Relation" begs the question "relative to what?"
[jira] [Updated] (SOLR-14307) "user caches" don't support "enabled" attribute
[ https://issues.apache.org/jira/browse/SOLR-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter updated SOLR-14307:
--------------------------------------
    Fix Version/s: 8.6
                   master (9.0)
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> "user caches" don't support "enabled" attribute
> -----------------------------------------------
>
>                 Key: SOLR-14307
>                 URL: https://issues.apache.org/jira/browse/SOLR-14307
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>             Fix For: master (9.0), 8.6
>
>         Attachments: SOLR-14307.patch
>
> while trying to help write some test cases for SOLR-13807 i discovered that
> the code path used for building the {{List}} of _user_ caches
> (ie: {{}} doesn't respect the idea of an "enabled"
> attribute ... that is only checked for in the code path used for building
> singular CacheConfig options from explicit xpaths (ie: {{ />}} etc...)
> We should fix this, if for no other reason than so it's easy for tests to
> use system properties to enable/disable all caches.
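The fix described in the issue amounts to applying the same "enabled" check in the plural (user-cache) code path that already exists for the singular xpath-based one. A simplified, hypothetical sketch — CacheSpec and the attribute-map representation are stand-ins for Solr's real CacheConfig/XML plumbing, not its actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for Solr's CacheConfig.
class CacheSpec {
  final String name;
  CacheSpec(String name) { this.name = name; }
}

class UserCacheBuilder {
  // Build the list of user caches, honoring an "enabled" attribute whose
  // value may have been resolved from a system property to "true"/"false".
  static List<CacheSpec> build(List<Map<String, String>> cacheNodes) {
    List<CacheSpec> configs = new ArrayList<>();
    for (Map<String, String> attrs : cacheNodes) {
      // The reported bug: this check existed for singular cache configs
      // but was skipped when building the list of user caches.
      if (!Boolean.parseBoolean(attrs.getOrDefault("enabled", "true"))) {
        continue; // cache disabled via attribute (e.g. a sysprop in tests)
      }
      configs.add(new CacheSpec(attrs.get("name")));
    }
    return configs;
  }
}
```

With this shape, a test can flip a single system property feeding the "enabled" attribute and have every user cache drop out of the built list.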
[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness
[ https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095926#comment-17095926 ]

Chris M. Hostetter commented on SOLR-13132:
-------------------------------------------

{quote}Thanks, Hoss! Some initial responses re: some of the nocommit comments
from 8fcd6271b6: ...
{quote}
yeah, sorry for any folks following along on the jira – i thought i posted a
comment about this last week:
* in an offline email with Michael about setting up a branch to more closely
iterate on, he pointed out that the way github PRs are set up i already had
permission to push commits to the branch he was using for his PR
* so i started committing some test work and small edits with
refactorings/comments/questions directly to his branch

{quote}... my initial inclination is to prefer leaving SweepingAcc as a
separate class, because CountSlotAcc currently clearly does one specific
thing, ...
{quote}
Well, except that with the changes we're making, CountSlotAcc is no longer
just doing one specific thing (counting) – it's also managing any sweep
collectors that exist. I think we need something ... to help clean up the API
a bit. Not only are the internals of {{CountSlotAcc.getBaseSweepingAcc(...)}}
really messy and code-smell-ish right now, but as i discovered when i added
in "whitebox" testing that inspects the facet debug output: since
{{getBaseSweepingAcc().setValues(...)}} has to be called from
{{FacetFieldProcessor.fillBucketFromSlot()}}, that means even non-sweep
processors (like "hashdv" for example) wind up initializing {{SweepingAcc}}
instances. more concrete thoughts on possible ways to clean up this API /
separation of concerns as part of my next comment below...

Re: {{CountSlotAcc implements SweepableSlotAcc}}
{quote}... my thinking was: although countAcc is currently the one and only
CountSlotAcc, used to accumulate counts over the base domain DocSet only,
there could be cases where extra CountSlotAccs are used more directly (e.g.
as part of stats collection, analogous to how they're used indirectly for
SKG sweep collection). In such a case, these "non-base" CountSlotAccs would
respond as implemented
{quote}
that's totally fair – if you can envision possible usages for "extra"
{{CountSlotAcc}} instances down the road that act as {{SweepableSlotAcc}}
instances "hanging off" the "main" {{countAcc}}, then I'm fine with that,
since it doesn't really hurt anything and it's a "clean" implementation of
the API: there's really no reason why all {{CountSlotAcc}} impls can't
{{implements SweepableSlotAcc}}.

But taking a step back on the topic of "API cleanliness", i think the
direction i'm leaning at the moment is that {{SweepingAcc}} should really
just be {{SweepingCountSlotAcc extends CountSlotArrAcc}} ... a new concrete
{{CountSlotAcc}} subclass that would be instantiated by processors that
support sweeping for use as {{this.countAcc}}, and would inherently &
automatically handle all the "sweep" related logic currently spread between
{{SweepingAcc}} and {{CountSlotAcc.getBaseSweepingAcc(...)}} (and might
eliminate the need for {{FacetFieldProcessorSlotAccMapper implements
CollectSlotAccMappingAware}} ... ? ...unless there's another use for
{{CollectSlotAccMappingAware}} i'm overlooking?)

This would include making sure that {{countAcc.setValues(...)}} "does the
right thing" when {{countAcc instanceof SweepingCountSlotAcc}} and there are
registered "output" SlotAccs for it to handle...

Which brings me back to 2 related earlier comments/questions you had...
{quote}I think it's probably better to have collectAcc read access mediation
be an "all-or-nothing" thing ... considering for example the MultiAcc case
where some SlotAccs might be modified while others aren't, that separating
them and calling setValues(...) on the modified and non-modified groups
separately would affect the order of output. I think this would only affect
the MultiAcc case, since all other SlotAccs in collectAcc would be
all-or-nothing by nature (i.e., they'd either register a replacement SlotAcc
or not).
{quote}
1) I wouldn't worry about the order of keys – this is already
non-deterministic on master -- not only can changing the "sort" or
"prelim_sort" affect the order of the keys currently (due to how the
"collectAcc" SlotAcc gets treated distinctly from "otherAccs") but even
things like changing "method" or "limit" can cause the keys to come in diff
orders (either because of switching to the "enum" method, or because of using
MultiAcc when doing a singlePass collection)

(This is something i noticed a long time ago and forgot about until it
started biting me in the ass in the randomized test i committed to your
branch – independent of whether sweeping is used ... i have not decided yet
if it seems better/easier to make the test not care about the order of the
keys, or to bite the bullet and enforce a deterministic order in
[jira] [Resolved] (SOLR-14448) Solr 7.5 Autoscaling history API
[ https://issues.apache.org/jira/browse/SOLR-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-14448.
-----------------------------------
    Resolution: Invalid

Please raise questions like this on the user's list; we try to reserve JIRAs
for known bugs/enhancements rather than usage questions. See:
http://lucene.apache.org/solr/community.html#mailing-lists-irc -- there are
links to both the Lucene and Solr mailing lists there. A _lot_ more people
will see your question on that list and may be able to help more quickly.

If it's determined that this really is a code issue or enhancement to Lucene
or Solr and not a configuration/usage problem, we can raise a new JIRA or
reopen this one. But this sounds more like usage...

> Solr 7.5 Autoscaling history API
> --------------------------------
>
>                 Key: SOLR-14448
>                 URL: https://issues.apache.org/jira/browse/SOLR-14448
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Sujith
>            Priority: Major
>
> I am using Solr version 7.5. I tried to capture all the autoscaling events
> of Solr, like the event start time, end time, status, etc. I know this
> information is provided in the autoscaling history API, i.e.
> "http://localhost:8983/solr/admin/autoscaling/history". Though I had to
> manually create a collection named ".system" in Solr to get this API
> functioning. This link
> https://lucene.apache.org/solr/guide/7_5/blob-store-api.html was followed.
> However, there is a weird behavior that I am observing now. This autoscaling
> history API is not updating the events after a certain point. I am no longer
> able to view any of the autoscaling events after a certain period. I tried
> to delete the ".system" collection and recreated it again. The new
> autoscaling events were captured, but again it stopped after a certain
> period. Any idea why this is happening? Advice is greatly appreciated.
[jira] [Commented] (LUCENE-9321) Port documentation task to gradle
[ https://issues.apache.org/jira/browse/LUCENE-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095910#comment-17095910 ]

Uwe Schindler commented on LUCENE-9321:
---------------------------------------

The tasks look fine. I have no idea about the doap file processing for
CHANGES.txt. Is it there to get release dates from JIRA? I think the JIRA
export can be done in JSON, too - it's just another output format of JIRA's
REST API.

From looking at the task it seems like it's no longer using versions from
JIRA, but a doap file instead. I am not sure how this one is generated; the
ant magic looks fine, we can keep it for now (at least until the Ant build is
finally removed).

The markdown and index.html template processing is not yet implemented. I
have a bit of time (after I released forbiddenapis), so I would work on that
one, OK? I want to get rid of the XSL completely, so I will just add some
simple templating for the index.html file, by first creating a markdown file
using a simple groovy multi-line template, traversing the graph of projects
and adding links to the javadocs folders of each. No XSL needed for that.
After that we convert the markdown in the same way. I will also work on the
markdown task, which should be easy! Because of the 1st of May, we have a
long weekend :-)

> Port documentation task to gradle
> ---------------------------------
>
>                 Key: LUCENE-9321
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9321
>             Project: Lucene - Core
>          Issue Type: Sub-task
>          Components: general/build
>            Reporter: Tomoko Uchida
>            Assignee: Tomoko Uchida
>            Priority: Major
>
> This is a placeholder issue for porting the ant "documentation" task to
> gradle. The generated documents should be able to be published on the
> lucene.apache.org web site on an "as-is" basis.
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095908#comment-17095908 ]

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
------------------------------------------------------

I still don't like the word "precision" at all in this. It sounds like the
value of a "precision" attribute would be something like "10%", or something
indicating the degree of precision of the result. I still prefer the word
"relation", as Lucene used. Maybe {{numFoundRelation}} sounds better to you
than {{hitCountRelation}}? Maybe {{hitFoundRelation}}, meaning {{actual hits >=/== numFound}}?
Or {{hitCountRelation}}, meaning {{actual hits >=/== count}}?
[jira] [Commented] (SOLR-14446) Upload configset should use ZkClient.multi()
[ https://issues.apache.org/jira/browse/SOLR-14446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095850#comment-17095850 ]

David Smiley commented on SOLR-14446:
-------------------------------------

Yes, this would be helpful. We shouldn't just address this one spot but also
the other place(s) where the files are uploaded.
https://github.com/apache/lucene-solr/blob/5e6d91eec082d158abcf9338ff0982eb20b4d30b/solr/solrj/src/java/org/apache/solr/common/cloud/ZkMaintenanceUtils.java#L278
(used to bootstrap ZK with the default configset, and perhaps used elsewhere)
There should probably be one utility method to do this that takes a
Collection of the data so that we have one place to maintain.

This will be a nice change, but I think just one piece of a two-piece puzzle.
Above is the write side. On the read side, due to eventual consistency, we
need to add a catch-and-retry mechanism somewhere for when an expected file
is not present. But before retrying, call zookeeper.sync (FYI: SOLR-14425).
This mechanism might be in ZkSolrResourceLoader, perhaps.

> Upload configset should use ZkClient.multi()
> --------------------------------------------
>
>                 Key: SOLR-14446
>                 URL: https://issues.apache.org/jira/browse/SOLR-14446
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Priority: Major
>
> Based on a private discussion with [~dsmiley] and [~dragonsinth] for
> SOLR-14425, it occurred to me that our configset upload is a loop over all
> files in a configset and individual writes.
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java#L184
> It might make sense to use ZkClient.multi() here so that collection creation
> doesn't need to guess whether all files of the configset made it into ZK or
> not (they will either all be there, or none at all).
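The all-or-nothing invariant being asked for can be sketched without real ZooKeeper code. The actual change would build a list of org.apache.zookeeper.Op.create entries and hand them to a single ZooKeeper.multi() call; the class below is only an in-memory model (all names hypothetical) of the batch semantics that matter for configset upload:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of atomic multi-node creation: either every file in the
// configset batch is written, or none are. Real code would use
// ZooKeeper.multi(Iterable<Op>), which fails or succeeds as a unit.
class AtomicConfigStore {
  private final Map<String, byte[]> zk = new LinkedHashMap<>();

  void multiCreate(Map<String, byte[]> files) {
    // Validate the whole batch before touching anything, mirroring how
    // one failing Op causes the entire multi() transaction to fail.
    for (String path : files.keySet()) {
      if (zk.containsKey(path)) {
        throw new IllegalStateException("node exists: " + path);
      }
    }
    zk.putAll(files); // commit: all or nothing
  }

  boolean exists(String path) {
    return zk.containsKey(path);
  }
}
```

With this property, collection creation never observes a half-uploaded configset: if any node in the batch fails, none of the batch's nodes appear.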
[jira] [Created] (SOLR-14448) Solr 7.5 Autoscaling history API
Sujith created SOLR-14448:
------------------------------

             Summary: Solr 7.5 Autoscaling history API
                 Key: SOLR-14448
                 URL: https://issues.apache.org/jira/browse/SOLR-14448
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Sujith
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095836#comment-17095836 ]

Ishan Chattopadhyaya commented on SOLR-13289:
---------------------------------------------

+1 on numFoundPrecision. Other idea: numFoundCoverage (i.e. how much of the
exact result set the current result covers).
[jira] [Created] (SOLR-14447) Replica Types for all Autoscaling events
Sujith created SOLR-14447:
------------------------------

             Summary: Replica Types for all Autoscaling events
                 Key: SOLR-14447
                 URL: https://issues.apache.org/jira/browse/SOLR-14447
             Project: Solr
          Issue Type: New Feature
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Sujith

The replica type attribute is only supported for the "nodeAdded" trigger. All
the autoscaling triggers that support the "ADDREPLICA" operation should be
provided with this attribute option.
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095832#comment-17095832 ]

Andrzej Bialecki commented on SOLR-13289:
-----------------------------------------

Yeah, naming things is hard :) I'm not a native speaker either; I just
realized that the name clashes with the value that it describes and is itself
meaningless (what is a "relation" here? relation to what? the Lucene enum
name barely makes sense either, and only after you read the javadocs.)

Agreed on the confusion with the 'precision' name. Maybe numFoundPrecision?
Eh... I'll stop here.
[jira] [Created] (SOLR-14446) Upload configset should use ZkClient.multi()
Ishan Chattopadhyaya created SOLR-14446:
-------------------------------------------

             Summary: Upload configset should use ZkClient.multi()
                 Key: SOLR-14446
                 URL: https://issues.apache.org/jira/browse/SOLR-14446
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Ishan Chattopadhyaya
[jira] [Commented] (SOLR-14173) Ref Guide Redesign Phase 1: Page Design
[ https://issues.apache.org/jira/browse/SOLR-14173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095778#comment-17095778 ] ASF subversion and git services commented on SOLR-14173: Commit 5e6d91eec082d158abcf9338ff0982eb20b4d30b in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5e6d91e ] SOLR-14173: Don't use JQuery-Slim as it breaks the sidebar sub-menu system. > Ref Guide Redesign Phase 1: Page Design > --- > > Key: SOLR-14173 > URL: https://issues.apache.org/jira/browse/SOLR-14173 > Project: Solr > Issue Type: Improvement > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-14173.patch, SOLR-14173.patch, blue-left-nav.png, > gray-left-nav.png > > > The current design of the Ref Guide was essentially copied from a > Jekyll-based documentation theme > (https://idratherbewriting.com/documentation-theme-jekyll/), which had a > couple important benefits for that time: > * It was well-documented and since I had little experience with Jekyll and > its Liquid templates and since I was the one doing it, I wanted to make it as > easy on myself as possible > * It was designed for documentation specifically so took care of all the > things like inter-page navigation, etc. > * It helped us get from Confluence to our current system quickly > It had some drawbacks, though: > * It wasted a lot of space on the page > * The theme was built for Markdown files, so did not take advantage of the > features of the {{jekyll-asciidoc}} plugin we use (the in-page TOC being one > big example - the plugin could create it at build time, but the theme > included JS to do it as the page loads, so we use the JS) > * It had a lot of JS and overlapping CSS files. 
While it used Bootstrap it > used a customized CSS on top of it for theming that made modifications > complex (it was hard to figure out how exactly a change would behave) > * With all the stuff I'd changed in my bumbling way just to get things to > work back then, I broke a lot of the stuff Bootstrap is supposed to give us > in terms of responsiveness and making the Guide usable even on smaller screen > sizes. > After upgrading the Asciidoctor components in SOLR-12786 and stopping the PDF > (SOLR-13782), I wanted to try to set us up for a more flexible system. We > need it for things like Joel's work on the visual guide for streaming > expressions (SOLR-13105), and in order to implement other ideas we might have > on how to present information in the future. > I view this issue as a phase 1 of an overall redesign that I've already > started in a local branch. I'll explain in a comment the changes I've already > made, and will use this issue to create and push a branch where we can > discuss in more detail. > Phase 1 here will be under-the-hood CSS/JS changes + overall page layout > changes. > Phase 2 (SOLR-1) will be a wholesale re-organization of all the pages of > the Guide. > Phase 3 (issue TBD) will explore moving us from Jekyll to another static site > generator that is better suited for our content format, file types, and build > conventions. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095757#comment-17095757 ] Tomas Eduardo Fernandez Lobbe commented on SOLR-13289: -- bq. IMHO the hitCountRelation vs numFound Yeah, I'm not sure about this either. I was thinking of "numFoundRelation", but then I thought "maybe it's clear that this is the relation between "numFound" and the actual number of hits". I don't know, me not being a native English speaker certainly may be obscuring things for me. I think {{precision}} may be confusing in the context of IR. bq. perhaps use GT_EQ or EQ, short and not too cryptic? +1. We don't need to be so verbose for every request. > Support for BlockMax WAND > - > > Key: SOLR-13289 > URL: https://issues.apache.org/jira/browse/SOLR-13289 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Major > Attachments: SOLR-13289.patch, SOLR-13289.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to > expose this via Solr. When enabled, the numFound returned will not be exact.
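The naming discussion above is about how a response should report whether numFound is exact once BlockMax WAND is enabled. A minimal sketch of the idea being debated (the enum and field names here follow the GT_EQ/EQ suggestion in the comment and are illustrative, not Solr's final API):

```java
/**
 * Sketch of reporting an approximate hit count together with its relation to
 * the true count: with BlockMax WAND enabled, numFound may only be a lower bound.
 */
public class HitCount {
    /** EQ: numFound is exact. GT_EQ: the true hit count is >= numFound. */
    public enum Relation { EQ, GT_EQ }

    public final long numFound;
    public final Relation relation;

    public HitCount(long numFound, Relation relation) {
        this.numFound = numFound;
        this.relation = relation;
    }

    /** How a client might render it, e.g. "1000+" for a lower bound. */
    public String display() {
        return relation == Relation.EQ ? Long.toString(numFound) : numFound + "+";
    }
}
```

The short enum constants keep responses compact, which is the point of the "+1. We don't need to be so verbose for every request" reply.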
[jira] [Commented] (LUCENE-9321) Port documentation task to gradle
[ https://issues.apache.org/jira/browse/LUCENE-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095756#comment-17095756 ] Tomoko Uchida commented on LUCENE-9321: --- [~dweiss] I opened a pull request for subtask LUCENE-9333; this ports "changes-to-html" ant target to gradle. [https://github.com/apache/lucene-solr/pull/1468] Would you take a look at it? > Port documentation task to gradle > - > > Key: LUCENE-9321 > URL: https://issues.apache.org/jira/browse/LUCENE-9321 > Project: Lucene - Core > Issue Type: Sub-task > Components: general/build >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > > This is a placeholder issue for porting ant "documentation" task to gradle. > The generated documents should be able to be published on lucene.apache.org > web site on "as-is" basis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mocobeta opened a new pull request #1468: LUCENE-9333: Add gradle task to compile changes.txt to a html
mocobeta opened a new pull request #1468: URL: https://github.com/apache/lucene-solr/pull/1468 # Description This PR adds a "documentation" gradle task (equivalent of ant's "documentation") and its sub task "changesToHtml" (equivalent of ant's "changes-to-html", which compiles CHANGES.txt into Changes.html). The output directories are `lucene/build/documentation` (for lucene) and `solr/build/documentation` (for solr). Those are not used by ant; ant outputs docs into `lucene/build/docs` and `solr/build/docs`. The "documentation" task is incomplete for now; please see the TODO comment. # Tests I checked the md5 hash of the compiled Changes.html to verify the gradle task generates the exact same file as ant (when the version property is set to `9.0.0`).
```
lucene-solr $ md5sum lucene/build/docs/changes/Changes.html # generated by ant
e5ddb897e191915be0fe9f23bdd0edf0 lucene/build/docs/changes/Changes.html
lucene-solr $ md5sum lucene/build/documentation/changes/Changes.html # generated by gradle
e5ddb897e191915be0fe9f23bdd0edf0 lucene/build/documentation/changes/Changes.html
```
```
lucene-solr $ md5sum solr/build/docs/changes/Changes.html # generated by ant
0076f4eb251c44a0b6effc9d8e958cd9 solr/build/docs/changes/Changes.html
lucene-solr $ md5sum solr/build/documentation/changes/Changes.html # generated by gradle
0076f4eb251c44a0b6effc9d8e958cd9 solr/build/documentation/changes/Changes.html
```
# Note about RDF processing Not fully sure, but there is no equivalent gradle task for `ant.xmlproperties` as far as I know. We could remove the ant call by writing our own XPath processor built on groovy's [XmlSlurper](https://docs.groovy-lang.org/latest/html/api/groovy/xml/XmlSlurper.html). I didn't do so here but just cloned the original Ant task. To me it would be preferable to replace the DOAP RDF with a JSON (or simply a CSV) instead of adding more XML processing code, if we want to completely drop the ant convention.
[jira] [Resolved] (SOLR-14237) Add authenticated user principal in Solr admin UI
[ https://issues.apache.org/jira/browse/SOLR-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya resolved SOLR-14237. - Fix Version/s: 8.6 Resolution: Fixed Thanks [~moshebla], [~noble.paul], [~janhoy]. > Add authenticated user principal in Solr admin UI > - > > Key: SOLR-14237 > URL: https://issues.apache.org/jira/browse/SOLR-14237 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Reporter: mosh >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: AdminUI, kerberos, security > Fix For: 8.6 > > Attachments: SOLR-14237-2.patch, SOLR-14237.patch, Screenshot from > 2020-04-28 12-48-45.png > > > When user is logged in to Solr's admin UI using Kerberos, no authentication > info is displayed. > It would be very useful to see the logged in user principal. This could be > specially crucial when SSO is being used and user not always aware that Solr > is even configured with authentication mechanism. > +Info should include:+ > 1. user principal > 2. mapped role (in case authorization plugin is also configured) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14237) Add authenticated user principal in Solr admin UI
[ https://issues.apache.org/jira/browse/SOLR-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095718#comment-17095718 ] ASF subversion and git services commented on SOLR-14237: Commit f56089f923b7b011fd699f75faace9e336e2ede8 in lucene-solr's branch refs/heads/branch_8x from Ishan Chattopadhyaya [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f56089f ] SOLR-14237: A new panel with security info in admin UI's dashboard > Add authenticated user principal in Solr admin UI > - > > Key: SOLR-14237 > URL: https://issues.apache.org/jira/browse/SOLR-14237 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Reporter: mosh >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: AdminUI, kerberos, security > Attachments: SOLR-14237-2.patch, SOLR-14237.patch, Screenshot from > 2020-04-28 12-48-45.png > > > When user is logged in to Solr's admin UI using Kerberos, no authentication > info is displayed. > It would be very useful to see the logged in user principal. This could be > specially crucial when SSO is being used and user not always aware that Solr > is even configured with authentication mechanism. > +Info should include:+ > 1. user principal > 2. mapped role (in case authorization plugin is also configured) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14237) Add authenticated user principal in Solr admin UI
[ https://issues.apache.org/jira/browse/SOLR-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095717#comment-17095717 ] ASF subversion and git services commented on SOLR-14237: Commit 561e36660a7f04936104cd1a5243a14b9c58235f in lucene-solr's branch refs/heads/master from Ishan Chattopadhyaya [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=561e366 ] SOLR-14237: A new panel with security info in admin UI's dashboard > Add authenticated user principal in Solr admin UI > - > > Key: SOLR-14237 > URL: https://issues.apache.org/jira/browse/SOLR-14237 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Reporter: mosh >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: AdminUI, kerberos, security > Attachments: SOLR-14237-2.patch, SOLR-14237.patch, Screenshot from > 2020-04-28 12-48-45.png > > > When user is logged in to Solr's admin UI using Kerberos, no authentication > info is displayed. > It would be very useful to see the logged in user principal. This could be > specially crucial when SSO is being used and user not always aware that Solr > is even configured with authentication mechanism. > +Info should include:+ > 1. user principal > 2. mapped role (in case authorization plugin is also configured) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14237) Add authenticated user principal in Solr admin UI
[ https://issues.apache.org/jira/browse/SOLR-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095706#comment-17095706 ] ASF subversion and git services commented on SOLR-14237: Commit 66a9bb09ca1d2611ce2047671d797ee681de8932 in lucene-solr's branch refs/heads/branch_8x from Ishan Chattopadhyaya [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=66a9bb0 ] SOLR-14237: A new panel with security info in admin UI's dashboard > Add authenticated user principal in Solr admin UI > - > > Key: SOLR-14237 > URL: https://issues.apache.org/jira/browse/SOLR-14237 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Reporter: mosh >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: AdminUI, kerberos, security > Attachments: SOLR-14237-2.patch, SOLR-14237.patch, Screenshot from > 2020-04-28 12-48-45.png > > > When user is logged in to Solr's admin UI using Kerberos, no authentication > info is displayed. > It would be very useful to see the logged in user principal. This could be > specially crucial when SSO is being used and user not always aware that Solr > is even configured with authentication mechanism. > +Info should include:+ > 1. user principal > 2. mapped role (in case authorization plugin is also configured) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14237) Add authenticated user principal in Solr admin UI
[ https://issues.apache.org/jira/browse/SOLR-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095702#comment-17095702 ] ASF subversion and git services commented on SOLR-14237: Commit 0c682d0e9af34bf75a574beb3d12ff849dcab9eb in lucene-solr's branch refs/heads/master from Ishan Chattopadhyaya [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0c682d0 ] SOLR-14237: A new panel with security info in admin UI's dashboard > Add authenticated user principal in Solr admin UI > - > > Key: SOLR-14237 > URL: https://issues.apache.org/jira/browse/SOLR-14237 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Reporter: mosh >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: AdminUI, kerberos, security > Attachments: SOLR-14237-2.patch, SOLR-14237.patch, Screenshot from > 2020-04-28 12-48-45.png > > > When user is logged in to Solr's admin UI using Kerberos, no authentication > info is displayed. > It would be very useful to see the logged in user principal. This could be > specially crucial when SSO is being used and user not always aware that Solr > is even configured with authentication mechanism. > +Info should include:+ > 1. user principal > 2. mapped role (in case authorization plugin is also configured) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] upendrasoft commented on pull request #807: Remove solr.jetty.https.port when SSL is not used
upendrasoft commented on pull request #807: URL: https://github.com/apache/lucene-solr/pull/807#issuecomment-621340127 Apologies for the delay. Thanks @madrob and @janhoy for taking care of this. Closing this PR as the changes were already added.
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095689#comment-17095689 ] Andrzej Bialecki commented on SOLR-14428: - Alan, seems you're on top of things here, feel free to assign the issue to yourself. :) > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Assignee: Andrzej Bialecki >Priority: Major > Attachments: FuzzyHammer.java, SOLR-14428-WeakReferences.patch, > image-2020-04-23-09-18-06-070.png, image-2020-04-24-20-09-31-179.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > Time Spent: 10m > Remaining Estimate: 0h > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! 
> ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb > With an empty cache, running this query > _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory > allocation > {noformat} > 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 > 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 > {noformat} > ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
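The leak hypothesis above -- FuzzyQuery objects that eagerly build their automata get pinned in the query result cache, inflating ramBytesUsed from ~1.5 KB to ~650 KB per entry -- can be illustrated with a toy sketch. No Lucene classes are used here; the "automaton" is just a large array, and the class names are invented for illustration:

```java
/**
 * Toy illustration of the SOLR-14428 hypothesis: when a query builds its
 * expensive structure in the constructor, every cached query instance retains
 * it for the lifetime of the cache entry.
 */
public class EagerVsLazyQuery {
    static final int AUTOMATON_BYTES = 1 << 20; // 1 MiB stand-in

    /** Pre-LUCENE-9068 style: nothing heavy lives on the query itself. */
    static class LazyQuery {
        private byte[] automaton; // built only when term enumeration needs it

        byte[] automaton() {
            if (automaton == null) {
                automaton = new byte[AUTOMATON_BYTES];
            }
            return automaton;
        }

        boolean holdsAutomaton() {
            return automaton != null;
        }
    }

    /** 8.5-style: the heavy structure is built eagerly in the constructor. */
    static class EagerQuery {
        final byte[] automaton = new byte[AUTOMATON_BYTES];
    }
}
```

The actual fix direction discussed in LUCENE-9350 is to stop holding references to large automata on the query at all; this toy only shows why the eager field makes each cached query entry so much larger.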
[jira] [Commented] (LUCENE-9191) Fix linefiledocs compression or replace in tests
[ https://issues.apache.org/jira/browse/LUCENE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095607#comment-17095607 ] Chris M. Hostetter commented on LUCENE-9191: i'm no expert, but doesn't jenkins just fetch it from [http://svn.apache.org/repos/asf/lucene/test-data] ? isn't that what all the jenkins jobs log when they start? > Fix linefiledocs compression or replace in tests > > > Key: LUCENE-9191 > URL: https://issues.apache.org/jira/browse/LUCENE-9191 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Michael McCandless >Priority: Major > Fix For: 8.6 > > Attachments: LUCENE-9191.patch, LUCENE-9191.patch > > > LineFileDocs(random) is very slow, even to open. It does a very slow "random > skip" through a gzip compressed file. > For the analyzers tests, in LUCENE-9186 I simply removed its usage, since > TestUtil.randomAnalysisString is superior, and fast. But we should address > other tests using it, since LineFileDocs(random) is slow! > I think it is also the case that every lucene test has probably tested every > LineFileDocs line many times now, whereas randomAnalysisString will invent > new ones. > Alternatively, we could "fix" LineFileDocs(random), e.g. special compression > options (in blocks)... deflate supports such stuff. But it would make it even > hairier than it is now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095592#comment-17095592 ] Marcus Eagan commented on SOLR-14414: - looking back at this one from the homie [~uboness]. One for the history books: https://issues.apache.org/jira/browse/SOLR-1163 > New Admin UI > > > Key: SOLR-14414 > URL: https://issues.apache.org/jira/browse/SOLR-14414 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Affects Versions: master (9.0) >Reporter: Marcus Eagan >Priority: Major > Attachments: QueryUX-SolrAdminUIReboot.mov > > > We have had a lengthy discussion in the mailing list about the need to build > a modern UI that is both more secure and does not depend on deprecated, > end-of-life code. In this ticket, I intend to familiarize the community with the > efforts of the community to do just that. We are nearing feature parity, though > not there yet; as many have suggested we could complete this task in iterations, > here is an attempt to get the ball rolling. I have mostly worked on it on > weekend nights, when I could find the time. > Angular is certainly not my specialty, and this is my first attempt at using > TypeScript besides a few brief learning exercises here and there. However, I > will be engaging experts in both of these areas for consultation as our > community tries to pull our UI into another era. > Many of the components here can improve. One or two of them need to be > rewritten, and there are even at least three essential components to the app > missing, along with some tests. A couple of other things missing are the V2 API, > which I found difficult to build with in this context because it is not > documented on the web. I understand that it is "self-documenting," but the > most easy-to-use APIs are still documented on the web. 
Maybe it is entirely > documented on the web, and I had trouble finding it. Forgive me, as that > could be an area of assistance. Another area where I need assistance is > packaging this application as a Solr package. I understand this app is not in > the right place for that today, but it can be. There are still many > improvements to be made in this Jira and certainly in this code. > The project is located in {{lucene-solr/solr/webapp2}}, where there is a > README for information on running the app. > The app can be started from the this directory with {{npm start}} for now. It > can quickly be modified to start as a part of the typical start commands as > it approaches parity. I expect there will be a lot of opinions. I welcome > them, of course. The community input should drive the project's success. > Discussion in mailing list: > https://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAF76exK-EB_tyFx0B4fBiA%3DJj8gH%3Divn2Uo6cWvMwhvzRdA3KA%40mail.gmail.com%3E -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1467: LUCENE-9350: Don't hold references to large automata on FuzzyQuery
dsmiley commented on a change in pull request #1467: URL: https://github.com/apache/lucene-solr/pull/1467#discussion_r417440312 ## File path: lucene/core/src/java/org/apache/lucene/search/FuzzyQuery.java ## @@ -237,22 +216,9 @@ public boolean equals(Object obj) { if (getClass() != obj.getClass()) return false; FuzzyQuery other = (FuzzyQuery) obj; -// Note that we don't need to compare termLength or automata because they -// are entirely determined by the other fields -if (maxEdits != other.maxEdits) - return false; -if (prefixLength != other.prefixLength) - return false; -if (maxExpansions != other.maxExpansions) - return false; -if (transpositions != other.transpositions) - return false; -if (term == null) { - if (other.term != null) -return false; -} else if (!term.equals(other.term)) - return false; -return true; +return Objects.equals(maxEdits, other.maxEdits) && Objects.equals(prefixLength, other.prefixLength) Review comment: nitpick: compare the term first, as it is most likely to return false for different fuzzy This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
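The review nitpick above -- compare the term first, because it is the field most likely to differ -- relies on `&&` short-circuiting. A standalone sketch of the pattern (a generic key class for illustration, not the actual FuzzyQuery code):

```java
import java.util.Objects;

/**
 * Sketch of ordering equals() comparisons by discriminating power: for fuzzy
 * queries the term differs between instances far more often than maxEdits or
 * prefixLength do, so comparing it first lets && short-circuit sooner.
 */
public class FuzzyKey {
    final String term;
    final int maxEdits;
    final int prefixLength;

    FuzzyKey(String term, int maxEdits, int prefixLength) {
        this.term = term;
        this.maxEdits = maxEdits;
        this.prefixLength = prefixLength;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        FuzzyKey other = (FuzzyKey) obj;
        return Objects.equals(term, other.term) // most discriminating field first
                && maxEdits == other.maxEdits
                && prefixLength == other.prefixLength;
    }

    @Override
    public int hashCode() {
        // Must stay consistent with equals(): same fields in both.
        return Objects.hash(term, maxEdits, prefixLength);
    }
}
```

Note also that, as the diff's removed comment said, fields entirely determined by the others (termLength, the automata) need not be compared at all.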
[jira] [Commented] (SOLR-8099) Remove sleep() function / ValueSourceParser
[ https://issues.apache.org/jira/browse/SOLR-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095576#comment-17095576 ] Christine Poerschke commented on SOLR-8099: --- bq. ... Somebody wanted to use the sleep function for some testing they were doing. It took some headscratching and digging to determine that the function requires TWO parameters, and help from Hoss to determine exactly what that second parameter does. ... Thanks for the hint w.r.t. the function taking two parameters! It helped with the testing I did for SOLR-14442 and from observation and quick code inspection -- https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.5.1/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java#L160-L172 --- it appears that the first parameter is the sleep interval in milliseconds and the second parameter is the function's return value. Illustration: {code} $ curl 'http://localhost:8983/solr/techproducts/select?fl=id,popularity,score&defType=func&q=add(popularity,sleep(1234,42))&rows=1' { "responseHeader":{ "status":0, "QTime":1236, "params":{ "q":"add(popularity,sleep(1234,42))", "defType":"func", "fl":"id,popularity,score", "rows":"1"}}, "response":{"numFound":32,"start":0,"maxScore":52.0,"docs":[ { "id":"MA147LL/A", "popularity":10, "score":52.0}] }} {code} > Remove sleep() function / ValueSourceParser > --- > > Key: SOLR-8099 > URL: https://issues.apache.org/jira/browse/SOLR-8099 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: security > Fix For: 5.5 > > Attachments: SOLR-8099.patch, SOLR-8099.patch, SOLR-8099.patch > > > As per Doug Turnbull, the sleep() represents a security risk. > {noformat} > I noticed a while back that "sleep" is a function query. Which I > believe means I can make the current query thread sleep for as long as I > like. 
> I'm guessing an attacker could use this to starve Solr of threads, running > a denial of service attack by running multiple queries with sleeps in them. > Is this a concern? I realize there may be test purposes to sleep a function > query, but I'm trying to think if there's really practical purpose to > having sleep here. > Best, > -Doug > {noformat} > This issue is to remove it, since it is neither documented publicly, nor used > internally very much, apart from one test suite. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
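The observed semantics above -- {{sleep(a, b)}} blocks for {{a}} milliseconds and then evaluates to {{b}}, hence QTime of 1236 and a score of popularity + 42 -- can be written out as a standalone sketch. This is an illustration of the behaviour only, not the ValueSourceParser implementation:

```java
/**
 * Illustrates the observed sleep(interval, value) semantics from SOLR-8099:
 * block the calling thread for the given milliseconds, then return the value.
 * Blocking a request thread on demand is precisely why exposing this function
 * to untrusted queries risks thread starvation.
 */
public class SleepFn {
    public static double sleepFn(long millis, double value) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return value;
    }
}
```

In the curl example, {{add(popularity, sleep(1234, 42))}} therefore scores a document with popularity 10 as 52.0, after a ~1234 ms delay per evaluation.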
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on a change in pull request #1351: URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r417410225 ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringNumericComparator.java ## @@ -0,0 +1,329 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import org.apache.lucene.document.DoublePoint; +import org.apache.lucene.document.FloatPoint; +import org.apache.lucene.document.IntPoint; +import org.apache.lucene.document.LongPoint; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.PointValues; +import org.apache.lucene.util.DocIdSetBuilder; + +import java.io.IOException; +import java.util.Arrays; + +/** + * A wrapper over {@code NumericComparator} that adds a functionality to filter non-competitive docs. 
+ */ +public abstract class FilteringNumericComparator extends FilteringFieldComparator implements LeafFieldComparator { +protected final boolean reverse; +private boolean hasTopValue = false; +private PointValues pointValues; +private final int bytesCount; +private final byte[] minValueAsBytes; +private final byte[] maxValueAsBytes; +private boolean minValueExist = false; +private boolean maxValueExist = false; +private int maxDoc; +private int maxDocVisited; +private int updateCounter = 0; +private final String field; +protected boolean canUpdateIterator = false; // set to true when queue becomes full and hitsThreshold is reached +protected DocIdSetIterator competitiveIterator = null; +private long iteratorCost = 0; + +public FilteringNumericComparator(NumericComparator in, boolean reverse, int bytesCount) { +super(in); +this.field = in.field; +this.bytesCount = bytesCount; +this.reverse = reverse; +minValueAsBytes = new byte[bytesCount]; +maxValueAsBytes = new byte[bytesCount]; +if (reverse) { +minValueExist = true; +} else { +maxValueExist = true; +} +} + +/** + * Returns an iterator over competitive documents + */ +@Override +public DocIdSetIterator competitiveIterator() { +if (competitiveIterator == null) return null; +return new DocIdSetIterator() { +private int doc; +@Override +public int nextDoc() throws IOException { +return doc = competitiveIterator.nextDoc(); +} + +@Override +public int docID() { +return doc; +} + +@Override +public long cost() { +return competitiveIterator.cost(); +} + +@Override +public int advance(int target) throws IOException { +return doc = competitiveIterator.advance(target); +} +}; +} + +@Override +public void setCanUpdateIterator() throws IOException { +this.canUpdateIterator = true; +// for the 1st time queue becomes full and hitsThreshold is reached +// we can start updating competitive iterator +updateCompetitiveIterator(); +} + +@Override +public void setTopValue(T value) { +hasTopValue = true; +if (reverse) { 
+maxValueExist = true; +} else { +minValueExist = true; +} +in.setTopValue(value); +} + +@Override +public void setBottom(int slot) throws IOException { +((NumericComparator) in).setBottom(slot); +updateCompetitiveIterator(); // update an iterator if we set a new bottom +} + +@Override +public int compareBottom(int doc) throws IOException { +return ((NumericComparator) in).compareBottom(doc); +} + +@Override +public int compareTop(int doc) throws IOException { +return ((NumericComparator) in).compareTop(doc); +} + +@Override +public void copy(int slot, int doc) throws IOException { +((NumericComparator) in).copy(slot, doc); +} + +@Override +public void setScorer(Scorable scorer) throws IOException { +
[jira] [Updated] (LUCENE-9352) Add cost function to Scorable
[ https://issues.apache.org/jira/browse/LUCENE-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9352: Description: {{Scorable.cost() function could be useful in optimizations.}} For example, the ability for collectors to skip non-competitive documents introduced in LUCENE-9280 is based on the cost of the corresponding Scorable. was: {{Scorable.cost() }}{{function could be useful in optimizations.}} For example, the ability for collectors to skip non-competitive documents introduced in [LUCENE-9280| https://issues.apache.org/jira/browse/LUCENE-9280 ] is based on the cost of the corresponding Scorable. > Add cost function to Scorable > - > > Key: LUCENE-9352 > URL: https://issues.apache.org/jira/browse/LUCENE-9352 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > > {{Scorable.cost() function could be useful in optimizations.}} > For example, the ability for collectors to skip non-competitive documents > introduced in LUCENE-9280 is based on the cost of the corresponding Scorable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on pull request #1365: SOLR-14347: Autoscaling placement wrong when concurrent replica placements are calculated
sigram commented on pull request #1365: URL: https://github.com/apache/lucene-solr/pull/1365#issuecomment-621289167 Merged. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9352) Add cost function to Scorable
Mayya Sharipova created LUCENE-9352: --- Summary: Add cost function to Scorable Key: LUCENE-9352 URL: https://issues.apache.org/jira/browse/LUCENE-9352 Project: Lucene - Core Issue Type: Improvement Reporter: Mayya Sharipova {{Scorable.cost()}} function could be useful in optimizations. For example, the ability for collectors to skip non-competitive documents introduced in [LUCENE-9280|https://issues.apache.org/jira/browse/LUCENE-9280] is based on the cost of the corresponding Scorable.
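As a rough stand-alone illustration of the proposal (hypothetical simplified types, not the actual Lucene `Scorable` API): a `cost()` estimate, analogous to `DocIdSetIterator.cost()`, would let a collector judge whether enabling non-competitive-doc skipping is worthwhile for a given scorer.

```java
// Hypothetical sketch of the LUCENE-9352 idea; the real Lucene Scorable differs.
public class CostSketch {

    // Stand-in for a Scorable that also exposes a cost() estimate,
    // mirroring DocIdSetIterator.cost().
    interface CostedScorable {
        float score();
        long cost(); // estimated number of documents the scorer may match
    }

    // A collector might only enable skipping (as in LUCENE-9280) when the
    // scorer's candidate set is large enough for the pruning to pay off.
    static boolean worthSkipping(long scorerCost, long updateThreshold) {
        return scorerCost > updateThreshold;
    }

    public static void main(String[] args) {
        CostedScorable cheap = new CostedScorable() {
            @Override public float score() { return 1.0f; }
            @Override public long cost() { return 100; }
        };
        // Small scorer: not worth the iterator-update overhead.
        System.out.println(worthSkipping(cheap.cost(), 1_000));
    }
}
```

The threshold value here is an arbitrary placeholder; the point is only that the decision needs a cost signal from the scorer, which the current `Scorable` does not expose.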
[jira] [Commented] (SOLR-14442) bin/solr to attempt jstack before killing hung Solr instance
[ https://issues.apache.org/jira/browse/SOLR-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095548#comment-17095548 ] Christine Poerschke commented on SOLR-14442: Uploaded patch which also includes {{bin/solr.cmd}} change for Windows, though it's not a Solr platform I'm familiar with. Would appreciate review and/or test input from anyone who is. Thanks! > bin/solr to attempt jstack before killing hung Solr instance > > > Key: SOLR-14442 > URL: https://issues.apache.org/jira/browse/SOLR-14442 > Project: Solr > Issue Type: Improvement >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-14442.patch, SOLR-14442.patch > > > If a Solr instance did not respond to the 'stop' command in a timely manner > then the {{bin/solr}} script will attempt to forcefully kill it: > [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.5.1/solr/bin/solr#L859] > Gathering of information (e.g. a jstack of the java process) before the kill > command may be helpful in determining why the instance did not stop as > expected. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on a change in pull request #1351: URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r417402090 ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringFieldComparator.java ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import java.io.IOException; + +/** + * Decorates a wrapped FieldComparator to add a functionality to skip over non-competitive docs. + * FilteringFieldComparator provides two additional functions for a FieldComparator: + * 1) {@code competitiveIterator()} that returns an iterator over + * competitive docs that are stronger than already collected docs. + * 2) {@code setCanUpdateIterator()} that notifies the comparator when it is ok to start updating its internal iterator. + * This method is called from a collector to inform the comparator to start updating its iterator. + */ +public abstract class FilteringFieldComparator extends FieldComparator { +final FieldComparator in; Review comment: addressed in b8e138c This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
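The decorator pattern described in the `FilteringFieldComparator` javadoc can be reduced to a small stand-alone sketch (hypothetical simplified types; the real class wraps `FieldComparator<T>` and deals in `DocIdSetIterator`): the wrapper forwards ordinary comparisons to the inner comparator and layers the skip-related bookkeeping on top.

```java
// Simplified, hypothetical sketch of the decorator idea from the PR; not the
// actual Lucene FilteringFieldComparator.
public class DecoratorSketch {

    interface Comparator {
        int compare(int slotA, int slotB);
    }

    // Wraps an existing comparator, adding skipping-related state without
    // touching the wrapped implementation.
    static class Filtering implements Comparator {
        final Comparator in;
        boolean canUpdateIterator = false; // set once the queue is full and hitsThreshold is reached

        Filtering(Comparator in) {
            this.in = in;
        }

        @Override
        public int compare(int slotA, int slotB) {
            return in.compare(slotA, slotB); // delegate ordinary comparisons
        }

        void setCanUpdateIterator() {
            canUpdateIterator = true; // collector signals it is safe to start updating the iterator
        }
    }
}
```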
[jira] [Created] (LUCENE-9351) Port releaseWizard to 7_7 branch
Jan Høydahl created LUCENE-9351: --- Summary: Port releaseWizard to 7_7 branch Key: LUCENE-9351 URL: https://issues.apache.org/jira/browse/LUCENE-9351 Project: Lucene - Core Issue Type: Task Components: general/tools Reporter: Jan Høydahl The releaseWizard tool can be used for 7.7.x releases. It must be back-ported from master/branch8x. Note that changes in buildAndPushRelease.py, poll-mirrors.py, and other tools may also be necessary.
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on a change in pull request #1351: URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r417400609 ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringNumericComparator.java ## @@ -0,0 +1,329 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import org.apache.lucene.document.DoublePoint; +import org.apache.lucene.document.FloatPoint; +import org.apache.lucene.document.IntPoint; +import org.apache.lucene.document.LongPoint; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.PointValues; +import org.apache.lucene.util.DocIdSetBuilder; + +import java.io.IOException; +import java.util.Arrays; + +/** + * A wrapper over {@code NumericComparator} that adds a functionality to filter non-competitive docs. 
+ */ +public abstract class FilteringNumericComparator extends FilteringFieldComparator implements LeafFieldComparator { +protected final boolean reverse; +private boolean hasTopValue = false; +private PointValues pointValues; +private final int bytesCount; +private final byte[] minValueAsBytes; +private final byte[] maxValueAsBytes; +private boolean minValueExist = false; +private boolean maxValueExist = false; +private int maxDoc; +private int maxDocVisited; +private int updateCounter = 0; +private final String field; +protected boolean canUpdateIterator = false; // set to true when queue becomes full and hitsThreshold is reached +protected DocIdSetIterator competitiveIterator = null; +private long iteratorCost = 0; + +public FilteringNumericComparator(NumericComparator in, boolean reverse, int bytesCount) { +super(in); +this.field = in.field; +this.bytesCount = bytesCount; +this.reverse = reverse; +minValueAsBytes = new byte[bytesCount]; +maxValueAsBytes = new byte[bytesCount]; +if (reverse) { +minValueExist = true; +} else { +maxValueExist = true; +} +} + +/** + * Returns an iterator over competitive documents + */ +@Override +public DocIdSetIterator competitiveIterator() { +if (competitiveIterator == null) return null; +return new DocIdSetIterator() { +private int doc; +@Override +public int nextDoc() throws IOException { +return doc = competitiveIterator.nextDoc(); +} + +@Override +public int docID() { +return doc; +} + +@Override +public long cost() { +return competitiveIterator.cost(); +} + +@Override +public int advance(int target) throws IOException { +return doc = competitiveIterator.advance(target); +} +}; +} + +@Override +public void setCanUpdateIterator() throws IOException { +this.canUpdateIterator = true; +// for the 1st time queue becomes full and hitsThreshold is reached +// we can start updating competitive iterator +updateCompetitiveIterator(); +} + +@Override +public void setTopValue(T value) { +hasTopValue = true; +if (reverse) { 
+maxValueExist = true; +} else { +minValueExist = true; +} +in.setTopValue(value); +} + +@Override +public void setBottom(int slot) throws IOException { +((NumericComparator) in).setBottom(slot); +updateCompetitiveIterator(); // update an iterator if we set a new bottom +} + +@Override +public int compareBottom(int doc) throws IOException { +return ((NumericComparator) in).compareBottom(doc); +} + +@Override +public int compareTop(int doc) throws IOException { +return ((NumericComparator) in).compareTop(doc); +} + +@Override +public void copy(int slot, int doc) throws IOException { +((NumericComparator) in).copy(slot, doc); +} + +@Override +public void setScorer(Scorable scorer) throws IOException { +
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on a change in pull request #1351: URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r417400706 ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringNumericComparator.java ## @@ -0,0 +1,329 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import org.apache.lucene.document.DoublePoint; +import org.apache.lucene.document.FloatPoint; +import org.apache.lucene.document.IntPoint; +import org.apache.lucene.document.LongPoint; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.PointValues; +import org.apache.lucene.util.DocIdSetBuilder; + +import java.io.IOException; +import java.util.Arrays; + +/** + * A wrapper over {@code NumericComparator} that adds a functionality to filter non-competitive docs. 
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1467: LUCENE-9350: Don't hold references to large automata on FuzzyQuery
romseygeek commented on a change in pull request #1467: URL: https://github.com/apache/lucene-solr/pull/1467#discussion_r417400710 ## File path: lucene/core/src/java/org/apache/lucene/search/FuzzyTermsEnum.java ## @@ -364,4 +325,60 @@ public BytesRef term() throws IOException { } } + /** + * Used for sharing automata between segments + * + * Levenshtein automata are large and expensive to build; we don't want to build + * them directly on the query because this can blow up caches that use queries + * as keys; we also don't want to rebuild them for every segment. This attribute + * allows the FuzzyTermsEnum to build the automata once for its first segment + * and then share them for subsequent segment calls. + */ + private interface AutomatonAttribute extends Attribute { +CompiledAutomaton[] getAutomata(); +int getTermLength(); +void init(Supplier builder); + } + + private static class AutomatonAttributeImpl extends AttributeImpl implements AutomatonAttribute { + +private CompiledAutomaton[] automata; +private int termLength; + +@Override +public CompiledAutomaton[] getAutomata() { + return automata; +} + +@Override +public int getTermLength() { + return termLength; +} + +@Override +public void init(Supplier supplier) { + if (automata != null) { +return; + } + FuzzyAutomatonBuilder builder = supplier.get(); + this.termLength = builder.getTermLength(); + this.automata = builder.buildAutomatonSet(); +} + +@Override +public void clear() { + this.automata = null; +} + +@Override +public void reflectWith(AttributeReflector reflector) { Review comment: It's a private implementation and only gets used internally to FuzzyTermsEnum, so I don't see how the analysis page or luke would get presented with one of these? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on a change in pull request #1351: URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r417400122 ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringFieldComparator.java ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import java.io.IOException; + +/** + * Decorates a wrapped FieldComparator to add a functionality to skip over non-competitive docs. + * FilteringFieldComparator provides two additional functions for a FieldComparator: + * 1) {@code competitiveIterator()} that returns an iterator over + * competitive docs that are stronger than already collected docs. + * 2) {@code setCanUpdateIterator()} that notifies the comparator when it is ok to start updating its internal iterator. + * This method is called from a collector to inform the comparator to start updating its iterator. + */ +public abstract class FilteringFieldComparator extends FieldComparator { +final FieldComparator in; + +public FilteringFieldComparator(FieldComparator in) { +this.in = in; +} + +protected abstract DocIdSetIterator competitiveIterator(); Review comment: Thanks @jpountz, indeed the suggested structure is much better. 
addressed in 7120424ff ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringFieldComparator.java ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import java.io.IOException; + +/** + * Decorates a wrapped FieldComparator to add a functionality to skip over non-competitive docs. + * FilteringFieldComparator provides two additional functions for a FieldComparator: + * 1) {@code competitiveIterator()} that returns an iterator over + * competitive docs that are stronger than already collected docs. + * 2) {@code setCanUpdateIterator()} that notifies the comparator when it is ok to start updating its internal iterator. + * This method is called from a collector to inform the comparator to start updating its iterator. + */ +public abstract class FilteringFieldComparator extends FieldComparator { +final FieldComparator in; + +public FilteringFieldComparator(FieldComparator in) { +this.in = in; +} + +protected abstract DocIdSetIterator competitiveIterator(); + +protected abstract void setCanUpdateIterator() throws IOException; Review comment: addressed in 7120424ff This is an automated message from the Apache Git Service. 
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on a change in pull request #1351: URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r417399428 ## File path: lucene/core/src/java/org/apache/lucene/search/FilterLeafCollector.java ## @@ -53,4 +53,8 @@ public String toString() { return name + "(" + in + ")"; } + @Override + public DocIdSetIterator competitiveIterator() { +return in.competitiveIterator(); + } Review comment: addressed in 7120424ff
[jira] [Updated] (SOLR-14442) bin/solr to attempt jstack before killing hung Solr instance
[ https://issues.apache.org/jira/browse/SOLR-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-14442: --- Attachment: SOLR-14442.patch > bin/solr to attempt jstack before killing hung Solr instance > > > Key: SOLR-14442 > URL: https://issues.apache.org/jira/browse/SOLR-14442 > Project: Solr > Issue Type: Improvement >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-14442.patch, SOLR-14442.patch > > > If a Solr instance did not respond to the 'stop' command in a timely manner > then the {{bin/solr}} script will attempt to forcefully kill it: > [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.5.1/solr/bin/solr#L859] > Gathering of information (e.g. a jstack of the java process) before the kill > command may be helpful in determining why the instance did not stop as > expected.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1467: LUCENE-9350: Don't hold references to large automata on FuzzyQuery
dsmiley commented on a change in pull request #1467: URL: https://github.com/apache/lucene-solr/pull/1467#discussion_r417397488 ## File path: lucene/core/src/java/org/apache/lucene/search/FuzzyTermsEnum.java ## @@ -364,4 +325,60 @@ public BytesRef term() throws IOException { } } + /** + * Used for sharing automata between segments + * + * Levenshtein automata are large and expensive to build; we don't want to build + * them directly on the query because this can blow up caches that use queries + * as keys; we also don't want to rebuild them for every segment. This attribute + * allows the FuzzyTermsEnum to build the automata once for its first segment + * and then share them for subsequent segment calls. + */ + private interface AutomatonAttribute extends Attribute { +CompiledAutomaton[] getAutomata(); +int getTermLength(); +void init(Supplier builder); + } + + private static class AutomatonAttributeImpl extends AttributeImpl implements AutomatonAttribute { + +private CompiledAutomaton[] automata; +private int termLength; + +@Override +public CompiledAutomaton[] getAutomata() { + return automata; +} + +@Override +public int getTermLength() { + return termLength; +} + +@Override +public void init(Supplier supplier) { + if (automata != null) { +return; + } + FuzzyAutomatonBuilder builder = supplier.get(); + this.termLength = builder.getTermLength(); + this.automata = builder.buildAutomatonSet(); +} + +@Override +public void clear() { + this.automata = null; +} + +@Override +public void reflectWith(AttributeReflector reflector) { Review comment: Why not implement these like they are done on 8x? I suspect introspection tools like "luke" / Solr analysis page may choke if this isn't done right. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
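The build-once-and-share behaviour discussed in this review thread can be sketched independently of Lucene (hypothetical names; the real `AutomatonAttributeImpl` stores `CompiledAutomaton[]`): `init` only invokes the supplier when nothing is cached, so the expensive automaton build happens for the first segment and later segments reuse it.

```java
import java.util.function.Supplier;

// Minimal sketch (hypothetical types) of the init-once sharing pattern used by
// the AutomatonAttribute: build the expensive object on first use, reuse after.
public class ShareOnceSketch {

    private Object[] automata; // stands in for CompiledAutomaton[]

    // Invokes the supplier only when nothing is cached yet, so subsequent
    // segments reuse the automata built for the first segment.
    public void init(Supplier<Object[]> builder) {
        if (automata != null) {
            return; // already built for an earlier segment
        }
        automata = builder.get();
    }

    public Object[] getAutomata() {
        return automata;
    }
}
```

Calling `init` once per segment with the same (expensive) supplier thus results in exactly one build per query, which is the point of stashing the automata on an attribute rather than on the query itself.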
[jira] [Commented] (SOLR-14442) bin/solr to attempt jstack before killing hung Solr instance
[ https://issues.apache.org/jira/browse/SOLR-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095540#comment-17095540 ] Christine Poerschke commented on SOLR-14442: Here's the commands I used to test the {{bin/solr}} changes. * First terminal: {code:java} cd solr ant dist server bin/solr start -e techproducts -noprompt bin/solr stop {code} * Second terminal: {code:java} curl 'http://localhost:8983/solr/techproducts/select?fl=id,popularity,score=func=add(popularity,0)' curl 'http://localhost:8983/solr/techproducts/select?fl=id,popularity,score=func=add(popularity,sleep(1000,0))' curl 'http://localhost:8983/solr/techproducts/select?fl=id,popularity,score=func=add(popularity,sleep(234000,0))' {code} The last query with the sleep command needs to be run before the stop command is issued and the sleep needs to be long enough to make the instance stopping untimely i.e. the script waits for SOLR_STOP_WAIT of 180s and the first sleep parameter looks to be milliseconds i.e. 234000 is 234s which is plenty over the 180s limit. > bin/solr to attempt jstack before killing hung Solr instance > > > Key: SOLR-14442 > URL: https://issues.apache.org/jira/browse/SOLR-14442 > Project: Solr > Issue Type: Improvement >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: SOLR-14442.patch > > > If a Solr instance did not respond to the 'stop' command in a timely manner > then the {{bin/solr}} script will attempt to forcefully kill it: > [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.5.1/solr/bin/solr#L859] > Gathering of information (e.g. a jstack of the java process) before the kill > command may be helpful in determining why the instance did not stop as > expected. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
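The patch's idea can be sketched as a small shell function (the function and file names are assumptions for illustration, not the actual bin/solr code): try a thread dump before resorting to `kill -9`, and tolerate `jstack` being absent or failing.

```shell
#!/bin/sh
# Sketch only: capture a jstack (if available) before force-killing a hung PID.
# Function and dump-file names are assumptions, not the actual bin/solr script.
capture_then_kill() {
  pid="$1"
  dump="/tmp/solr-${pid}-threaddump.txt"
  if command -v jstack >/dev/null 2>&1; then
    # jstack can fail on a non-Java or already-exited process; don't abort on error
    jstack "$pid" > "$dump" 2>&1 || true
  fi
  kill -9 "$pid" 2>/dev/null
}
```

In the real script such a step would sit just before the existing forceful-kill branch, i.e. after the SOLR_STOP_WAIT of 180s has expired.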
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095528#comment-17095528 ] ASF subversion and git services commented on LUCENE-7788: - Commit 1995e4dcd2d7b92f21d9e2c1e235b9d923396cad in lucene-solr's branch refs/heads/branch_8x from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1995e4d ] LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095526#comment-17095526 ] ASF subversion and git services commented on LUCENE-7788: - Commit 6e96d01efc579b4df40fb02e6158b05ee2aeff7f in lucene-solr's branch refs/heads/master from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6e96d01 ] LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095479#comment-17095479 ] Erick Erickson commented on LUCENE-7788: Well, nobody is checking it in gradle files currently that I know of ;) I'll be pushing the next batch in a bit, I'll change it in a few. > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use.
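The wasted-work concern behind this precommit check can be shown with a self-contained stand-in (a toy logger, not SLF4J itself): with eager string building the expensive argument is formatted even when the level is disabled, whereas SLF4J-style {} parameterisation defers that work until the logger knows the message will actually be emitted.

```java
// Toy illustration of why unparameterised log calls waste work; this is a
// stand-in, not the SLF4J API that the precommit check actually targets.
public class LogParamSketch {

    static int formatCount = 0; // counts how often the expensive argument is built

    static String expensiveState() {
        formatCount++;
        return "big-state-dump";
    }

    // Unparameterised style: the message is concatenated before the call,
    // paying the formatting cost even when trace logging is off.
    static void traceEager(boolean traceEnabled) {
        String msg = "state=" + expensiveState();
        if (traceEnabled) {
            System.out.println(msg);
        }
    }

    // Parameterised style (roughly what log.trace("state={}", obj) achieves):
    // the argument is only formatted once the level check has passed.
    static void traceLazy(boolean traceEnabled, java.util.function.Supplier<String> arg) {
        if (traceEnabled) {
            System.out.println("state=" + arg.get());
        }
    }
}
```

With trace disabled, `traceEager` still builds the string while `traceLazy` never calls the supplier, which is the wasted work/objects this ticket's precommit rule is meant to catch.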
[jira] [Commented] (SOLR-14173) Ref Guide Redesign Phase 1: Page Design
[ https://issues.apache.org/jira/browse/SOLR-14173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095477#comment-17095477 ] ASF subversion and git services commented on SOLR-14173: Commit 28e747950ffeb70218cb3cf17ba7b6b4e69cffe0 in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=28e7479 ] SOLR-14173: Change left nav item highlighting to fix menu jumpiness when hovering/selecting > Ref Guide Redesign Phase 1: Page Design > --- > > Key: SOLR-14173 > URL: https://issues.apache.org/jira/browse/SOLR-14173 > Project: Solr > Issue Type: Improvement > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-14173.patch, SOLR-14173.patch, blue-left-nav.png, > gray-left-nav.png > > > The current design of the Ref Guide was essentially copied from a > Jekyll-based documentation theme > (https://idratherbewriting.com/documentation-theme-jekyll/), which had a > couple important benefits for that time: > * It was well-documented and since I had little experience with Jekyll and > its Liquid templates and since I was the one doing it, I wanted to make it as > easy on myself as possible > * It was designed for documentation specifically so took care of all the > things like inter-page navigation, etc. > * It helped us get from Confluence to our current system quickly > It had some drawbacks, though: > * It wasted a lot of space on the page > * The theme was built for Markdown files, so did not take advantage of the > features of the {{jekyll-asciidoc}} plugin we use (the in-page TOC being one > big example - the plugin could create it at build time, but the theme > included JS to do it as the page loads, so we use the JS) > * It had a lot of JS and overlapping CSS files. 
While it used Bootstrap it > used a customized CSS on top of it for theming that made modifications > complex (it was hard to figure out how exactly a change would behave) > * With all the stuff I'd changed in my bumbling way just to get things to > work back then, I broke a lot of the stuff Bootstrap is supposed to give us > in terms of responsiveness and making the Guide usable even on smaller screen > sizes. > After upgrading the Asciidoctor components in SOLR-12786 and stopping the PDF > (SOLR-13782), I wanted to try to set us up for a more flexible system. We > need it for things like Joel's work on the visual guide for streaming > expressions (SOLR-13105), and in order to implement other ideas we might have > on how to present information in the future. > I view this issue as a phase 1 of an overall redesign that I've already > started in a local branch. I'll explain in a comment the changes I've already > made, and will use this issue to create and push a branch where we can > discuss in more detail. > Phase 1 here will be under-the-hood CSS/JS changes + overall page layout > changes. > Phase 2 (SOLR-1) will be a wholesale re-organization of all the pages of > the Guide. > Phase 3 (issue TBD) will explore moving us from Jekyll to another static site > generator that is better suited for our content format, file types, and build > conventions. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095474#comment-17095474 ] Dawid Weiss commented on LUCENE-7788: - It defeats the purpose of the nocommit marker though? Can it be marked as "todo" or something? > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9349) Avoid parsing all terms in TermInSetQuery.visit()
[ https://issues.apache.org/jira/browse/LUCENE-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward resolved LUCENE-9349. --- Fix Version/s: 8.6 Resolution: Fixed > Avoid parsing all terms in TermInSetQuery.visit() > - > > Key: LUCENE-9349 > URL: https://issues.apache.org/jira/browse/LUCENE-9349 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Major > Fix For: 8.6 > > Time Spent: 0.5h > Remaining Estimate: 0h > > TermInSetQuery currently iterates through all its prefix-encoded terms in > order to build an array to pass back to its visitor when visit() is called. > This seems like a waste, particularly when the visitor is not actually > consuming the terms (for example, when doing a clause-count check before > executing a search). Instead TermInSetQuery should use > consumeTermsMatching(), and we should change the signature of this method so > that it takes a BytesRunAutomaton supplier to allow for lazy instantiation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
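[Editor's note: the lazy-instantiation idea in the description — hand the visitor a supplier instead of eagerly decoding every term — can be sketched in plain Java. This is an illustration of the pattern only; the interface and method names below are invented and are not the actual Lucene visit()/consumeTermsMatching() signatures.]

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyVisitDemo {
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Stand-in for decoding all prefix-encoded terms / building an automaton.
    static Supplier<String[]> termDecoder(String packed) {
        return () -> {
            BUILDS.incrementAndGet();   // the expensive step
            return packed.split(",");
        };
    }

    interface Visitor {
        void consumeTerms(Supplier<String[]> lazyTerms);
    }

    public static void main(String[] args) {
        Supplier<String[]> lazy = termDecoder("a,b,c");

        // A clause-count check never touches the supplier, so nothing is decoded.
        Visitor clauseCounter = terms -> { };
        clauseCounter.consumeTerms(lazy);
        System.out.println(BUILDS.get());   // prints 0

        // A visitor that really needs the terms triggers the build on demand.
        Visitor collector = terms -> System.out.println(terms.get().length); // prints 3
        collector.consumeTerms(lazy);
        System.out.println(BUILDS.get());   // prints 1
    }
}
```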
[jira] [Updated] (SOLR-14445) Add documentation about the DIH Entity Caching
[ https://issues.apache.org/jira/browse/SOLR-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tobias Kässmann updated SOLR-14445: --- Description: This will add a hint that would've saved me some hours of debugging. When you use the DIH Entity Caching the cache keys must have the same types. The only hint in solr is a `ClassCastException`. Pullrequest on Github: [https://github.com/apache/lucene-solr/pull/1466] was: This will add a hint that would've saved me some hours of debugging. When you use the DIH Entity Caching the cache keys must have the same types. The only hint in solr is a `ClassCastException`. > Add documentation about the DIH Entity Caching > -- > > Key: SOLR-14445 > URL: https://issues.apache.org/jira/browse/SOLR-14445 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Affects Versions: 8.5.1 >Reporter: Tobias Kässmann >Priority: Trivial > Labels: DIH, documentation > > This will add a hint that would've saved me some hours of debugging. > When you use the DIH Entity Caching the cache keys must have the same types. > The only hint in solr is a `ClassCastException`. > Pullrequest on Github: [https://github.com/apache/lucene-solr/pull/1466] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
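[Editor's note: the symptom the documentation change describes is easy to reproduce in isolation. The snippet below is an illustration of the mechanism, not DIH code: a sorted map compares its keys, and when differently-typed keys meet, the comparison surfaces only as a ClassCastException — exactly the unhelpful hint the reporter mentions.]

```java
import java.util.TreeMap;

public class MixedKeyDemo {
    // Returns true when inserting a differently-typed key throws.
    static boolean mixedKeysFail() {
        TreeMap<Object, String> cache = new TreeMap<>();
        cache.put(42, "int-keyed row");
        try {
            // A String key arrives where Integer keys were cached; the sorted
            // map's key comparison throws ClassCastException.
            cache.put("42", "string-keyed row");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mixedKeysFail()); // prints true
    }
}
```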
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095452#comment-17095452 ] Alan Woodward commented on SOLR-14428: -- I opened https://github.com/apache/lucene-solr/pull/1467 - we need to restore the weird and hacky 'cache automata on a shared attribute source' behaviour. I'm not brilliantly happy with this, but the mechanics of MultiTermQuery don't really make it feasible to share the automata anywhere else. > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Assignee: Andrzej Bialecki >Priority: Major > Attachments: FuzzyHammer.java, SOLR-14428-WeakReferences.patch, > image-2020-04-23-09-18-06-070.png, image-2020-04-24-20-09-31-179.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > Time Spent: 10m > Remaining Estimate: 0h > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png!
> I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb > With an empty cache, running this query > _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory > allocation > {noformat} > 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 > 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 > {noformat} > ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
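[Editor's note: the per-entry ramBytesUsed figures quoted above (1520 bytes on 8.3.1 vs 648855 bytes on 8.5.1) make the scale of the regression easy to model. The sketch below is a toy accounting model, not Solr's queryResultCache: it just shows how a cache that holds a few hundred entries blows up once each entry drags its automata along.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryCacheSizeDemo {
    // Sum the estimated sizes of all cache entries, as a ramBytesUsed-style
    // metric would.
    static long ramBytesUsed(Map<String, Long> cache) {
        return cache.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Long> light = new LinkedHashMap<>();
        Map<String, Long> heavy = new LinkedHashMap<>();
        for (int i = 0; i < 100; i++) {
            light.put("q" + i, 1_520L);   // 8.3.1: query object only
            heavy.put("q" + i, 648_855L); // 8.5.1: query + cached automata
        }
        System.out.println(ramBytesUsed(light)); // prints 152000  (~148 KB)
        System.out.println(ramBytesUsed(heavy)); // prints 64885500 (~62 MB)
    }
}
```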
[GitHub] [lucene-solr] romseygeek opened a new pull request #1467: LUCENE-9350: Don't hold references to large automata on FuzzyQuery
romseygeek opened a new pull request #1467: URL: https://github.com/apache/lucene-solr/pull/1467 LUCENE-9068 moved fuzzy automata construction into FuzzyQuery itself. However, this has the nasty side-effect of blowing up query caches that expect queries to be fairly small. This commit restores the previous behaviour of caching the large automata on an AttributeSource shared between segments, while making the construction a bit clearer by factoring it out into a package-private `FuzzyAutomatonBuilder`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9350) Don't cache automata on FuzzyQuery
Alan Woodward created LUCENE-9350: - Summary: Don't cache automata on FuzzyQuery Key: LUCENE-9350 URL: https://issues.apache.org/jira/browse/LUCENE-9350 Project: Lucene - Core Issue Type: Improvement Reporter: Alan Woodward Assignee: Alan Woodward LUCENE-9068 moved construction of FuzzyQuery's automaton directly onto the query itself. However, as SOLR-14428 demonstrates, this ends up blowing up query caches that assume query objects are fairly small when calculating their memory usage. We should move automaton construction back into FuzzyTermsEnum, while keeping as much of the nice refactoring of LUCENE-9068 as possible. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095433#comment-17095433 ] Erick Erickson edited comment on LUCENE-7788 at 4/29/20, 1:02 PM: -- Dawid: Yep, bunches. If it's OK I'd rather leave them until the first pass is done, there's some extra checks that'll be obsolete, as well as a bunch of cruft. I'm making good progress on the cleaning all the log calls up, I'm hoping to be done sometime this coming weekend. I certainly expect to remove them all and generally clean up that file, it's very messy right now. There's a bunch of cruft in the validation file that I'll remove before I'm done, for instance: - a check to catch "()" rather than "{}" which I've inadvertently typed in a few times - a bunch of specific checks for individual files etc. things that won't be flagged after I've gotten it all done - commented out hacks to run against 8x by pathing to someplace completely different. - etc. I can make them some other string if anyone's trying to integrate nocommit checks for gradle build files, otherwise I'll ask that you just ignore them for another week, let me know. was (Author: erickerickson): Yep, bunches. If it's OK I'd rather leave them until the first pass is done, there's some extra checks that'll be obsolete, as well as a bunch of cruft. I'm making good progress on the cleaning all the log calls up, I'm hoping to be done sometime this coming weekend. I certainly expect to remove them all and generally clean up that file, it's very messy right now. There's a bunch of cruft in the validation file that I'll remove before I'm done, for instance: - a check to catch "()" rather than "{}" which I've inadvertently typed in a few times - a bunch of specific checks for individual files etc. things that won't be flagged after I've gotten it all done - commented out hacks to run against 8x by pathing to someplace completely different. - etc. 
I can make them some other string if anyone's trying to integrate nocommit checks for gradle build files, otherwise I'll ask that you just ignore them for another week, let me know. > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095433#comment-17095433 ] Erick Erickson commented on LUCENE-7788: Yep, bunches. If it's OK I'd rather leave them until the first pass is done, there's some extra checks that'll be obsolete, as well as a bunch of cruft. I'm making good progress on the cleaning all the log calls up, I'm hoping to be done sometime this coming weekend. I certainly expect to remove them all and generally clean up that file, it's very messy right now. There's a bunch of cruft in the validation file that I'll remove before I'm done, for instance: - a check to catch "()" rather than "{}" which I've inadvertently typed in a few times - a bunch of specific checks for individual files etc. things that won't be flagged after I've gotten it all done - commented out hacks to run against 8x by pathing to someplace completely different. - etc. I can make them some other string if anyone's trying to integrate nocommit checks for gradle build files, otherwise I'll ask that you just ignore them for another week, let me know. > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
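[Editor's note: the comment mentions a check that catches "()"-style concatenation rather than "{}" placeholders. A toy version of that kind of scan is sketched below; the regex and method names are illustrative and are not the actual validate-log-calls.gradle logic.]

```java
import java.util.regex.Pattern;

public class LogCallCheckDemo {
    // Flag log calls whose message is built with '+' concatenation
    // instead of "{}" placeholders.
    static final Pattern UNPARAMETERISED =
        Pattern.compile("log\\.(trace|debug|info|warn|error)\\([^;]*\"\\s*\\+");

    static boolean flagged(String line) {
        return UNPARAMETERISED.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(flagged("log.info(\"count=\" + count);"));  // prints true
        System.out.println(flagged("log.info(\"count={}\", count);")); // prints false
    }
}
```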
[jira] [Commented] (LUCENE-9191) Fix linefiledocs compression or replace in tests
[ https://issues.apache.org/jira/browse/LUCENE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095346#comment-17095346 ] Michael McCandless commented on LUCENE-9191: Oh! We need to push new line file docs to jenkins! I posted them at my home.apache.org (links are above) – we need to install both the {{.gz}} and {{.seek}} files to Jenkins. Can someone w/ Jenkins access do that? > Fix linefiledocs compression or replace in tests > > > Key: LUCENE-9191 > URL: https://issues.apache.org/jira/browse/LUCENE-9191 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Michael McCandless >Priority: Major > Fix For: 8.6 > > Attachments: LUCENE-9191.patch, LUCENE-9191.patch > > > LineFileDocs(random) is very slow, even to open. It does a very slow "random > skip" through a gzip compressed file. > For the analyzers tests, in LUCENE-9186 I simply removed its usage, since > TestUtil.randomAnalysisString is superior, and fast. But we should address > other tests using it, since LineFileDocs(random) is slow! > I think it is also the case that every lucene test has probably tested every > LineFileDocs line many times now, whereas randomAnalysisString will invent > new ones. > Alternatively, we could "fix" LineFileDocs(random), e.g. special compression > options (in blocks)... deflate supports such stuff. But it would make it even > hairier than it is now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14445) Add documentation about the DIH Entity Caching
Tobias Kässmann created SOLR-14445: -- Summary: Add documentation about the DIH Entity Caching Key: SOLR-14445 URL: https://issues.apache.org/jira/browse/SOLR-14445 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: documentation Affects Versions: 8.5.1 Reporter: Tobias Kässmann This will add a hint that would've saved me some hours of debugging. When you use the DIH Entity Caching the cache keys must have the same types. The only hint in solr is a `ClassCastException`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9089) FST.Builder with fluent-style constructor
[ https://issues.apache.org/jira/browse/LUCENE-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095303#comment-17095303 ] Tomoko Uchida commented on LUCENE-9089: --- Thanks for the comments. The ASF bot doesn't seem to be working, though; I've committed it to master. https://github.com/apache/lucene-solr/commit/59a8e83520e2b9075f189c5ecf7e9867ea26b937 > FST.Builder with fluent-style constructor > - > > Key: LUCENE-9089 > URL: https://issues.apache.org/jira/browse/LUCENE-9089 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Minor > Fix For: master (9.0) > > Attachments: fix-fst-package-summary.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > A first step in a try to make the FST code easier to read and evolve. This > step is just about the FST Builder constructor. > By making it fluent, the many calls to it are simplified and it becomes easy > to spot the intent and special param tuning. > No functional change. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9089) FST.Builder with fluent-style constructor
[ https://issues.apache.org/jira/browse/LUCENE-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095215#comment-17095215 ] Bruno Roustant commented on LUCENE-9089: Good catch [~tomoko], thanks! > FST.Builder with fluent-style constructor > - > > Key: LUCENE-9089 > URL: https://issues.apache.org/jira/browse/LUCENE-9089 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Minor > Fix For: master (9.0) > > Attachments: fix-fst-package-summary.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > A first step in a try to make the FST code easier to read and evolve. This > step is just about the FST Builder constructor. > By making it fluent, the many calls to it are simplified and it becomes easy > to spot the intent and special param tuning. > No functional change. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
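[Editor's note: the fluent style the issue description argues for — call sites spelling out only the parameters they actually tune — can be sketched generically. The class and setter names below are invented for illustration; this is not the actual FST.Builder API.]

```java
public class FluentBuilderDemo {
    static final class Fst {
        final int minSuffixCount;
        final boolean shareSuffix;
        Fst(int minSuffixCount, boolean shareSuffix) {
            this.minSuffixCount = minSuffixCount;
            this.shareSuffix = shareSuffix;
        }
    }

    static final class Builder {
        private int minSuffixCount = 0;     // defaults make common calls short
        private boolean shareSuffix = true;

        Builder minSuffixCount(int n) { this.minSuffixCount = n; return this; }
        Builder shareSuffix(boolean b) { this.shareSuffix = b; return this; }
        Fst build() { return new Fst(minSuffixCount, shareSuffix); }
    }

    public static void main(String[] args) {
        // Special tuning stands out; everything else takes the default.
        Fst fst = new Builder().minSuffixCount(2).build();
        System.out.println(fst.minSuffixCount + " " + fst.shareSuffix); // prints "2 true"
    }
}
```

Compared with a positional constructor taking every tuning flag, each call site now names only the parameters it overrides, which is the "easy to spot the intent" benefit the description mentions.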
[jira] [Comment Edited] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095150#comment-17095150 ] Andrzej Bialecki edited comment on SOLR-13289 at 4/29/20, 7:29 AM: --- IMHO the {{hitCountRelation}} vs {{numFound}} is jarring and at the first glance looks cryptic and unrelated to each other. I understand that this reflects the API naming in Lucene, but I think Solr could be much more user-friendly here and use a name that is both related to {{numFound}} and self-explanatory - perhaps {{numFoundPrecision}} or simply {{precision}}? After all, Solr doesn't use Lucene's {{totalHits}} name either, right? The enum name is also very long - in total this element adds 40 characters to the response for something that is a simple flag ... perhaps use {{GT_EQ}} or {{EQ}}, short and not too cryptic? was (Author: ab): IMHO the {{hitCountRelation}} vs {{numFound}} is jarring and at the first glance looks cryptic and unrelated to each other. I understand that this reflects the API naming in Lucene, but I think Solr could be much more user-friendly here and use a name that is both related to {{numFound}} and self-explanatory - perhaps {{numFoundPrecision}} or simply {{precision}}? After all, Solr doesn't use Lucene's {{totalHits}} name either, right? The enum name is also very long - in total this element adds 40 characters to the response for something that is a simple flag ... > Support for BlockMax WAND > - > > Key: SOLR-13289 > URL: https://issues.apache.org/jira/browse/SOLR-13289 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Major > Attachments: SOLR-13289.patch, SOLR-13289.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to > expose this via Solr. When enabled, the numFound returned will not be exact. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095150#comment-17095150 ] Andrzej Bialecki commented on SOLR-13289: - IMHO the {{hitCountRelation}} vs {{numFound}} is jarring and at the first glance looks cryptic and unrelated to each other. I understand that this reflects the API naming in Lucene, but I think Solr could be much more user-friendly here and use a name that is both related to {{numFound}} and self-explanatory - perhaps {{numFoundPrecision}} or simply {{precision}}? After all, Solr doesn't use Lucene's {{totalHits}} name either, right? The enum name is also very long - in total this element adds 40 characters to the response for something that is a simple flag ... > Support for BlockMax WAND > - > > Key: SOLR-13289 > URL: https://issues.apache.org/jira/browse/SOLR-13289 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Major > Attachments: SOLR-13289.patch, SOLR-13289.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to > expose this via Solr. When enabled, the numFound returned will not be exact. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
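[Editor's note: the short EQ/GT_EQ names suggested above mirror Lucene's TotalHits.Relation values (EQUAL_TO and GREATER_THAN_OR_EQUAL_TO). The sketch below shows how such a flag might surface next to numFound; the enum and field names are illustrative, not a committed Solr API.]

```java
public class HitCountRelationDemo {
    // Short-form relation flag, as suggested in the comment above.
    enum Relation { EQ, GT_EQ }

    // Render the count the way a Solr response fragment might: a
    // WAND-terminated search only guarantees a lower bound on hits.
    static String render(long numFound, Relation rel) {
        return "\"numFound\":" + numFound + ",\"numFoundExact\":" + (rel == Relation.EQ);
    }

    public static void main(String[] args) {
        System.out.println(render(1024, Relation.GT_EQ));
        // prints "numFound":1024,"numFoundExact":false
    }
}
```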
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095141#comment-17095141 ] Dawid Weiss commented on LUCENE-7788: - Erick there are "nocommit" substrings in validate-log-calls.gradle - these strings should be removed from master, I guess? > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 50m > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org