[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200304#comment-17200304 ]

Gus Heck commented on SOLR-14597:

Right, agreed, the Lucene stuff should also be broken out into Lucene tickets. It is all here initially to keep the donation process simple.

> Advanced Query Parser
>
> Key: SOLR-14597
> URL: https://issues.apache.org/jira/browse/SOLR-14597
> Project: Solr
> Issue Type: New Feature
> Components: query parsers
> Reporter: Mike Nibeck
> Assignee: Gus Heck
> Priority: Major
> Attachments: aqp_patch.patch
>
> This JIRA ticket tracks the progress of SIP-9, the Advanced Query Parser that is being donated by the Library of Congress. A full description of the feature can be found on the SIP page.
> [https://cwiki.apache.org/confluence/display/SOLR/SIP-9+Advanced+Query+Parser]
> Briefly, this parser provides a comprehensive syntax for users who use search on a daily basis. It also reserves a smaller set of punctuators than other parsers. This facilitates easier handling of acronyms and punctuated patterns with meaning (such as C++ or 401(k)). The new syntax opens up some advanced features while also preventing access to arbitrary features via local parameters. This parser will be safe for accepting user queries directly with minimal pre-parsing, but for use cases beyond its established features, alternate query paths (using other parsers) will need to be supplied.
> The code drop is being prepared and will be supplied as soon as we receive guidance from the PMC regarding the proper process. Given that the Library already has a signed CCLA, we need to understand which of these (or other processes) apply:
> [http://incubator.apache.org/ip-clearance/ip-clearance-template.html]
> and
> [https://www.apache.org/licenses/contributor-agreements.html#grants]

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200041#comment-17200041 ]

David Smiley commented on SOLR-14597:

Treat Lucene as a separate project. Bring each change there (separate classes) as their own JIRA that must have merit in its own right, granted motivated by the use-case shown here. Maybe some could be grouped where you think it makes sense, but I'm seeing at least 2 issues, maybe 3, not 1. The BaseTokenStreamTestCase thing could just be a PR without JIRA issue.
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199726#comment-17199726 ]

Gus Heck commented on SOLR-14597:

Looks like LUCENE-9531 has caused a conflict with the patch, and there have been some changes to the gradle files running javacc, so I'm working on updating the patch to work with that, and I'll publish the fix as a pull request for easier review.

Now that there is code to look at, some responses:

[~arafalov]: Point noted about TypeTokenFilter; there is a similarity, though this filters on flags instead of types. It would be attractive to also inherit from FilteringTokenFilter, but it looks like one edge case I ran into isn't handled by the superclass (and it makes me wonder if there's a lurking issue with other FilteringTokenFilter subclasses). The case I ran into is this: the first token in the stream gets assigned a synonym, then in a subsequent step the first token is dropped (this is quite intentional in some use cases we had, where the intent was to entirely prevent matches on the original token but still match on the synonym). When this happens it causes {{java.lang.IllegalArgumentException: first position increment must be > 0 (got 0)}}, despite the fact that this scenario is not actually an error in terms of which tokens we want. Unfortunately there's no good way to know what's going to happen to the next token (which may not have the flags in question), so I came up with a workaround that I'm not very pleased with: dropping in a placeholder token that is unlikely to match anything. Open to suggestions for better options there, and interested in whether other filters that drop tokens can hit the same issue, or if they've handled it in some graceful way I'm not appreciating.

Also, now that the code is available, let me know if you still see similarity between PatternTypingFilterFactory and KeywordMarkerFilterFactory... I think they are quite different.
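To make the edge case above concrete, here is a minimal plain-Java model of it. It deliberately does not use Lucene classes: `Tok`, `DROP_FLAG`, and all method names are hypothetical stand-ins, and the check only mimics the exception message quoted above. The second method shows one conceivable alternative (carrying the dropped token's position increment forward to the next kept token); it is not the placeholder-token workaround described in the comment, and whether it would suit the real filter is not established here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of the edge case; no Lucene classes are involved. */
public class FirstTokenDropDemo {

    /** Hypothetical stand-in for a token: term text, position increment, flag bits. */
    static final class Tok {
        final String term;
        final int posInc;
        final int flags;

        Tok(String term, int posInc, int flags) {
            this.term = term;
            this.posInc = posInc;
            this.flags = flags;
        }

        @Override
        public String toString() {
            return term + "(+" + posInc + ")";
        }
    }

    static final int DROP_FLAG = 1; // hypothetical flag bit marking tokens to drop

    /** Naive drop: removes flagged tokens and leaves all other increments untouched. */
    static List<Tok> dropNaive(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            if ((t.flags & DROP_FLAG) == 0) {
                out.add(t);
            }
        }
        return out;
    }

    /** Alternative: add the increments of dropped tokens to the next kept token. */
    static List<Tok> dropCarryingIncrements(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0;
        for (Tok t : in) {
            if ((t.flags & DROP_FLAG) != 0) {
                skipped += t.posInc;
            } else {
                out.add(new Tok(t.term, t.posInc + skipped, t.flags));
                skipped = 0;
            }
        }
        return out;
    }

    /** Mimics the consumer-side check: the first increment must be > 0. */
    static void checkFirstIncrement(List<Tok> toks) {
        if (!toks.isEmpty() && toks.get(0).posInc <= 0) {
            throw new IllegalArgumentException(
                "first position increment must be > 0 (got " + toks.get(0).posInc + ")");
        }
    }

    public static void main(String[] args) {
        // Original token (flagged for dropping) followed by its synonym at the same position.
        List<Tok> stream = Arrays.asList(
            new Tok("c++", 1, DROP_FLAG),
            new Tok("cplusplus", 0, 0),
            new Tok("compiler", 1, 0));

        try {
            checkFirstIncrement(dropNaive(stream));
        } catch (IllegalArgumentException e) {
            System.out.println("naive drop: " + e.getMessage());
        }

        List<Tok> carried = dropCarryingIncrements(stream);
        checkFirstIncrement(carried); // passes: the synonym now has increment 1
        System.out.println("carried: " + carried);
    }
}
```

In this model the naive drop leaves the synonym first in the stream with an increment of 0, which trips the check, while the carrying variant keeps the synonym at the position the original token occupied.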
[~ichattopadhyaya], [~dsmiley]: While some of this could potentially be broken out into a package, there are also some changes to core and some Lucene-level classes that probably wouldn't want to be in a package, so feel free to put some eyes on it and suggest where the dividing line is (more eyes == better). I'm not against the idea of a 1st party package, but the question is: will this be popular enough to merit default inclusion? Another breaking-new-ground sort of question is "Is it easier to pull it in later or push it out to a package later if we change our minds?" Maybe neither is harder...

Changes to note to classes outside the new org.apache.solr.aqp package (where the meat of the new parser and its .jj file lives):
# TypeAsSynonymFilter is gaining the ability to manage what flags are transmitted from the original token to the synonym when it is created.
# BaseTokenStreamTestCase is gaining the ability to verify the flags on the tokens produced.
# Access to org.apache.solr.cloud.AbstractDistribZkTestBase#copyConfigUp is opened up so that it can be used in a wider array of tests.
# Solr gains TokenAnalyzerFilter, which applies the Analyzer from a specified field type to the individual tokens of the current stream (see javadoc for more detail).
# Operator and SynonymQueryStyle are extracted from the standard parser's base class so they can be re-used. Reuse is necessary because TextField references SynonymQueryStyle directly.
# The above change forces a compile-time API change in TextField, which might force this to not be available until 9.x (though the desire to make AQP available in 8.x is there).
# The change to TextField then broke TestPackages, which failed with a ClassNotFound when it went looking for the old SynonymQueryStyle inner class that had been promoted to a separate class. This forced me to decompile and provide classes and build/rebuild support for the binary jars checked in for TestPackages (as *.jar.bin) (the .java files for the classes loaded by this test had not been checked in). This is the genesis of the o.a.smy.pkg package namespace.

Some of the above (especially #7) might want to be broken into related or sub-tickets.
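As an illustration of the first item in the list, the sketch below shows one way flag transmission from a token to its synonym could be controlled: a bitmask applied to the original token's flags. This is an assumption made for illustration. The comment only says the filter can "manage what flags are transmitted", and the flag names, the mask semantics, and the method name here are all hypothetical.

```java
/** Plain-Java sketch of masking flag bits when a synonym is derived from a token. */
public class SynonymFlagMaskDemo {

    // Hypothetical flag bits; a real analysis chain assigns its own meanings.
    static final int FLAG_DROP_ORIGINAL = 0b001;
    static final int FLAG_PATTERN_MATCHED = 0b010;
    static final int FLAG_INTERNAL = 0b100;

    /** Returns the flags the synonym would carry: only bits set in both arguments. */
    static int synonymFlags(int originalFlags, int mask) {
        return originalFlags & mask;
    }

    public static void main(String[] args) {
        int original = FLAG_DROP_ORIGINAL | FLAG_PATTERN_MATCHED; // 0b011

        // Pass every bit except the "drop" bit, so a later drop step removes
        // the original token but leaves the synonym in the stream.
        int mask = ~FLAG_DROP_ORIGINAL;
        int syn = synonymFlags(original, mask);

        System.out.println(Integer.toBinaryString(syn)); // prints "10"
    }
}
```

Under this assumption, the mask is what lets the "drop the original, keep the synonym" use case work: the original keeps its drop bit while the synonym is created without it.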
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196156#comment-17196156 ]

Mike Nibeck commented on SOLR-14597:

Patch attached to fix Unit Tests.
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185226#comment-17185226 ]

Alexandre Rafalovitch commented on SOLR-14597:

Posted this on the SIP, but it belongs here more: PatternTypingFilterFactory and DropIfFlaggedFilterFactory seem to be quite similar to KeywordMarkerFilterFactory and TypeTokenFilterFactory, to the degree that perhaps the existing classes should be enhanced instead to support the additional functionality. Especially since keyword marking is integrated into other parts of Solr (e.g. not dropping it as a stopword, I think).

Also, TypeTokenFilter can work as both a blacklist and a whitelist. Both types of filtering are useful. E.g. I used it in the book to allow searching only for emails extracted from some text: [https://github.com/arafalov/solr-indexing-book/blob/master/published/text2/conf/schema.xml]
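The blacklist/whitelist behaviour mentioned above can be sketched in plain Java as follows. This models only the idea of filtering tokens by type and is not the Lucene TypeTokenFilter itself: `TypedToken`, `filterByType`, and the type strings used are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Plain-Java model of type-based token filtering in whitelist or blacklist mode. */
public class TypeFilterDemo {

    /** Hypothetical stand-in for a token carrying a type label. */
    static final class TypedToken {
        final String term;
        final String type;

        TypedToken(String term, String type) {
            this.term = term;
            this.type = type;
        }
    }

    /**
     * Whitelist mode keeps only tokens whose type is in the set;
     * blacklist mode keeps only tokens whose type is not in the set.
     */
    static List<String> filterByType(List<TypedToken> tokens, Set<String> types, boolean useWhitelist) {
        return tokens.stream()
            .filter(t -> types.contains(t.type) == useWhitelist)
            .map(t -> t.term)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TypedToken> tokens = Arrays.asList(
            new TypedToken("contact", "<ALPHANUM>"),
            new TypedToken("jane@example.org", "<EMAIL>"),
            new TypedToken("today", "<ALPHANUM>"));
        Set<String> emailOnly = Collections.singleton("<EMAIL>");

        System.out.println(filterByType(tokens, emailOnly, true));  // [jane@example.org]
        System.out.println(filterByType(tokens, emailOnly, false)); // [contact, today]
    }
}
```

The whitelist call corresponds to the "search only for emails" use case described in the comment; the blacklist call is the complementary mode.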
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176494#comment-17176494 ]

Ishan Chattopadhyaya commented on SOLR-14597:

I think we should build a separate package for these, rather than loading them by default.
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155587#comment-17155587 ]

Gus Heck commented on SOLR-14597:

After some work came up with this which omits files that don't have "java" in their name, but should give a decent idea:
{code:java}
NS2-MacBook-Pro:lucene-solr-cdg3 gus$ git diff HEAD..master_head | grep 'diff ..git' | grep java |sed 's#b/#@#' | rev | cut -d'@' -f 1 | rev
gradle/generation/javacc.gradle
lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/DropIfFlaggedFilter.java
lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/DropIfFlaggedFilterFactory.java
lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/PatternTypingFilter.java
lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/PatternTypingFilterFactory.java
lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/TypeAsSynonymFilter.java
lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/TypeAsSynonymFilterFactory.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/minhash/MinHashFilterTest.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestConcatenatingTokenStream.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestDropIfFlaggedFilter.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestDropIfFlaggedFilterFactory.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestPatternTypingFilter.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestPatternTypingFilterFactory.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestTypeAsSynonymFilter.java
lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestTypeAsSynonymFilterFactory.java
lucene/core/src/test/org/apache/lucene/analysis/TestStopFilter.java
lucene/test-framework/src/java/org/apache/lucene/analysis/BaseTokenStreamTestCase.java
solr/core/src/java/org/apache/solr/analysis/TokenAnalyzerFilter.java
solr/core/src/java/org/apache/solr/analysis/TokenAnalyzerFilterFactory.java
solr/core/src/java/org/apache/solr/aqp/AdvToken.java
solr/core/src/java/org/apache/solr/aqp/AdvancedQueryParserBase.java
solr/core/src/java/org/apache/solr/aqp/ParseException.java
solr/core/src/java/org/apache/solr/aqp/QueryParser.java
solr/core/src/java/org/apache/solr/aqp/QueryParser.jj
solr/core/src/java/org/apache/solr/aqp/QueryParserConstants.java
solr/core/src/java/org/apache/solr/aqp/QueryParserTokenManager.java
solr/core/src/java/org/apache/solr/aqp/SpanContext.java
solr/core/src/java/org/apache/solr/aqp/Token.java
solr/core/src/java/org/apache/solr/aqp/TokenMgrError.java
solr/core/src/java/org/apache/solr/aqp/package-info.java
solr/core/src/java/org/apache/solr/parser/Operator.java
solr/core/src/java/org/apache/solr/parser/QueryParser.java
solr/core/src/java/org/apache/solr/parser/QueryParser.jj
solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java
solr/core/src/java/org/apache/solr/parser/SynonymQueryStyle.java
solr/core/src/java/org/apache/solr/schema/IndexSchema.java
solr/core/src/java/org/apache/solr/schema/TextField.java
solr/core/src/java/org/apache/solr/search/AdvancedQParser.java
solr/core/src/java/org/apache/solr/search/AdvancedQParserPlugin.java
solr/core/src/java/org/apache/solr/search/AdvancedQueryParser.java
solr/core/src/java/org/apache/solr/search/ComplexPhraseQParserPlugin.java
solr/core/src/java/org/apache/solr/search/DisMaxQParser.java
solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java
solr/core/src/java/org/apache/solr/search/QParserPlugin.java
solr/core/src/java/org/apache/solr/search/QueryParsing.java
solr/core/src/java/org/apache/solr/search/SimpleQParserPlugin.java
solr/core/src/java/org/apache/solr/util/SolrPluginUtils.java
solr/core/src/test/org/apache/solr/analysis/PatternTypingFilterFactoryTest.java
solr/core/src/test/org/apache/solr/analysis/TokenAnalyzerFilterFactoryTest.java
solr/core/src/test/org/apache/solr/aqp/AbstractAqpTestCase.java
solr/core/src/test/org/apache/solr/aqp/CharacterRangeTest.java
solr/core/src/test/org/apache/solr/aqp/FieldedSearchTest.java
solr/core/src/test/org/apache/solr/aqp/LiteralPhraseTest.java
solr/core/src/test/org/apache/solr/aqp/MustNotTest.java
solr/core/src/test/org/apache/solr/aqp/MustTest.java
solr/core/src/test/org/apache/solr/aqp/NumericSearchTest.java
solr/core/src/test/org/apache/solr/aqp/OrderedDistanceGroupTest.java
solr/core/src/test/org/apache/solr/aqp/PhraseTest.java
solr/core/src/test/org/apache/solr/aqp/ShouldTest.java
solr/core/src/test/org/apache/solr/aqp/SimpleGroupTest.java
solr/core/src/test/org/apache/solr/aqp/SimpleQueryTest.java
solr/core/src/test/org/apache/solr/aqp/TemporalFieldedSearchTest.java
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150443#comment-17150443 ]

David Smiley commented on SOLR-14597:

Can you post a comment on class names you propose to be added to Lucene and likewise Solr? Like do an inventory and list them.
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150438#comment-17150438 ]

Gus Heck commented on SOLR-14597:

This thought has occurred to me, but the coordination of several parts across both the Lucene and Solr layers seems awkward for a package/plugin (a Solr parser, a couple of new Lucene filters, etc.), and I do hope that it is a generally useful parser, as you mention. When we sort out the legalities and get a patch up this will become more clear, but generally it adds another javacc-based parser that was based on, and is able to reuse some bits of, the standard parser (a few of which needed to be extracted or made accessible). There are also a few small tweaks to core classes (which seem justified to me, but of course review and commentary are welcome). So even if a package/plugin is part of the final result, we will likely have some changes to Solr & Lucene directly as well.
[jira] [Commented] (SOLR-14597) Advanced Query Parser
[ https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149787#comment-17149787 ]

David Smiley commented on SOLR-14597:

With the advent of Solr's package management system, I wonder if this should be a "package", either 1st party (i.e. a contrib) or even elsewhere. "elsewhere" isn't a great option as there is no central place to find them yet. Also, I see this plugin introduces schema components and the plugin system is very close but doesn't _quite_ support that yet (I'm guilty on that front). But that'll be addressed soon. I don't have a strong opinion on this but wanted to bring it up. I suppose the package system wouldn't help much here: (A) no dependencies added (B) not a large body of code (C) _seems_ to have zero security risk (e.g. no escaping mechanism to use another query parser). Also, I could see this parser being very useful.