[jira] [Updated] (LUCENE-7712) SimpleQueryString should support auto fuziness
[ https://issues.apache.org/jira/browse/LUCENE-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-7712:
    Attachment: LUCENE-7712.patch

Attached a small patch that adds auto-fuzziness and updates the tests to check it.

> SimpleQueryString should support auto fuziness
>
> Key: LUCENE-7712
> URL: https://issues.apache.org/jira/browse/LUCENE-7712
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Reporter: David Pilato
> Attachments: LUCENE-7712.patch
>
> Apparently the simpleQueryString query does not support auto fuzziness as the
> query string query does.
> So {{foo:bar~1}} works for both simple query string and query string queries.
> But {{foo:bar~}} works for the query string query but not for the simple
> query string query.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7712) SimpleQueryString should support auto fuziness
[ https://issues.apache.org/jira/browse/LUCENE-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1538#comment-1538 ]

Lee Hinman commented on LUCENE-7712:

I am happy to submit a patch to add this; however, I don't know what the auto value should be. The only place I could find it was older (3.x) documentation, which mentioned it may be 0.5. Is that the correct fuzziness value to use when no value is specified?

> SimpleQueryString should support auto fuziness
>
> Key: LUCENE-7712
> URL: https://issues.apache.org/jira/browse/LUCENE-7712
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Reporter: David Pilato
>
> Apparently the simpleQueryString query does not support auto fuzziness as the
> query string query does.
> So {{foo:bar~1}} works for both simple query string and query string queries.
> But {{foo:bar~}} works for the query string query but not for the simple
> query string query.
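To make the two cases from the description concrete, here is a minimal sketch of how a bare `~` could fall back to a default edit distance while `~1` keeps its explicit value. It is not the attached patch and does not use Lucene's classes: `FuzzySuffixSketch`, `parse`, and `ASSUMED_DEFAULT_EDITS` are hypothetical names, and the fallback of 2 is an assumption chosen for illustration (the right default is exactly the open question in this thread).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Lucene or the attached patch. */
final class FuzzySuffixSketch {
    /** Assumed fallback edit distance for a bare "~"; an illustrative choice only. */
    static final int ASSUMED_DEFAULT_EDITS = 2;

    // A term, then "~", then an optional run of digits that ends the token.
    private static final Pattern FUZZY = Pattern.compile("(.+?)~(\\d*)");

    final String term;
    final int edits;

    private FuzzySuffixSketch(String term, int edits) {
        this.term = term;
        this.edits = edits;
    }

    /** Splits a token such as "bar~1" or "bar~" into its term and edit distance. */
    static FuzzySuffixSketch parse(String token) {
        Matcher m = FUZZY.matcher(token);
        if (!m.matches()) {
            // No fuzzy suffix at all: treat the token as an exact term (0 edits).
            return new FuzzySuffixSketch(token, 0);
        }
        String digits = m.group(2);
        // "bar~" has an empty digit group, so the assumed default applies.
        int edits = digits.isEmpty() ? ASSUMED_DEFAULT_EDITS : Integer.parseInt(digits);
        return new FuzzySuffixSketch(m.group(1), edits);
    }
}
```

With this sketch, `parse("bar~1")` keeps the explicit distance, `parse("bar~")` takes the assumed default, and `parse("bar")` is left as an exact term.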
[jira] [Updated] (LUCENE-7490) SimpleQueryParser should parse "*" as MatchAllDocsQuery
[ https://issues.apache.org/jira/browse/LUCENE-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-7490:
    Attachment: 0001-Parse-as-MatchAllDocsQuery-in-SimpleQueryParser.patch

Attaching a patch with a small fix and a unit test.

> SimpleQueryParser should parse "*" as MatchAllDocsQuery
>
> Key: LUCENE-7490
> URL: https://issues.apache.org/jira/browse/LUCENE-7490
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/queryparser
> Affects Versions: 6.2.1
> Reporter: Lee Hinman
> Priority: Minor
> Fix For: 6.x, master (7.0)
> Attachments: 0001-Parse-as-MatchAllDocsQuery-in-SimpleQueryParser.patch
>
> It would be beneficial for SimpleQueryString to parse "*" as a
> MatchAllDocsQuery, rather than a "field:*" query.
> Related discussion on the Elasticsearch project about this:
> https://github.com/elastic/elasticsearch/issues/10632
[jira] [Created] (LUCENE-7490) SimpleQueryParser should parse "*" as MatchAllDocsQuery
Lee Hinman created LUCENE-7490:

Summary: SimpleQueryParser should parse "*" as MatchAllDocsQuery
Key: LUCENE-7490
URL: https://issues.apache.org/jira/browse/LUCENE-7490
Project: Lucene - Core
Issue Type: Improvement
Components: modules/queryparser
Affects Versions: 6.2.1
Reporter: Lee Hinman
Priority: Minor
Fix For: 6.x, master (7.0)

It would be beneficial for SimpleQueryString to parse "*" as a MatchAllDocsQuery, rather than a "field:*" query.

Related discussion on the Elasticsearch project about this: https://github.com/elastic/elasticsearch/issues/10632
[jira] [Commented] (LUCENE-7027) NumericTermAttribute throws IAE after NumericTokenStream is exhausted
[ https://issues.apache.org/jira/browse/LUCENE-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146214#comment-15146214 ]

Lee Hinman commented on LUCENE-7027:

Thanks [~thetaphi]!

> NumericTermAttribute throws IAE after NumericTokenStream is exhausted
>
> Key: LUCENE-7027
> URL: https://issues.apache.org/jira/browse/LUCENE-7027
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 5.5, master, 6.0
> Reporter: Michael McCandless
> Assignee: Uwe Schindler
> Priority: Blocker
> Fix For: 5.5, master, 6.0
> Attachments: LUCENE-7027-master.patch, LUCENE-7027-master.patch,
> LUCENE-7027-master.patch
>
> This small test:
> {noformat}
>   public void testCloneFullPrecisionToken() throws Exception {
>     FieldType fieldType = new FieldType(IntField.TYPE_NOT_STORED);
>     fieldType.setNumericPrecisionStep(Integer.MAX_VALUE);
>     Field field = new IntField("field", 17, fieldType);
>     TokenStream tokenStream = new CachingTokenFilter(field.tokenStream(null, null));
>     assertTrue(tokenStream.incrementToken());
>   }
> {noformat}
> hits this unexpected exception:
> {noformat}
> There was 1 failure:
> 1) testCloneFullPrecisionToken(org.apache.lucene.analysis.TestNumericTokenStream)
> java.lang.IllegalArgumentException: Illegal shift value, must be 0..31; got shift=2147483647
>   at __randomizedtesting.SeedInfo.seed([2E1E93EF810CB5F7:EF1304A849574BC7]:0)
>   at org.apache.lucene.util.NumericUtils.intToPrefixCodedBytes(NumericUtils.java:175)
>   at org.apache.lucene.util.NumericUtils.intToPrefixCoded(NumericUtils.java:133)
>   at org.apache.lucene.analysis.NumericTokenStream$NumericTermAttributeImpl.getBytesRef(NumericTokenStream.java:165)
>   at org.apache.lucene.analysis.NumericTokenStream$NumericTermAttributeImpl.clone(NumericTokenStream.java:217)
>   at org.apache.lucene.analysis.NumericTokenStream$NumericTermAttributeImpl.clone(NumericTokenStream.java:148)
>   at org.apache.lucene.util.AttributeSource$State.clone(AttributeSource.java:55)
>   at org.apache.lucene.util.AttributeSource.captureState(AttributeSource.java:288)
>   at org.apache.lucene.analysis.CachingTokenFilter.fillCache(CachingTokenFilter.java:96)
>   at org.apache.lucene.analysis.CachingTokenFilter.incrementToken(CachingTokenFilter.java:70)
>   at org.apache.lucene.analysis.TestNumericTokenStream.testCloneFullPrecisionToken(TestNumericTokenStream.java:138)
> {noformat}
> because {{CachingTokenFilter}} expects that it can {{captureState}} after
> calling {{end}} but {{NumericTokenStream}} gets angry about this.
[jira] [Updated] (LUCENE-6345) null check all term/fields in queries
[ https://issues.apache.org/jira/browse/LUCENE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6345:
    Attachment: LUCENE-6345.patch

Here's a patch that adds a lot of null checks to queries as well as to things like {{BooleanClause}}. It doesn't add tests for every single query (yet), though I see there are some already for {{FilteredQuery}}. Should I work on adding tests for every query type, or is adding the checks alone sufficient?

> null check all term/fields in queries
>
> Key: LUCENE-6345
> URL: https://issues.apache.org/jira/browse/LUCENE-6345
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-6345.patch
>
> See the mail thread "is this lucene 4.1.0 bug in PerFieldPostingsFormat".
> If anyone seriously thinks adding a null check to ctor will cause measurable
> slowdown to things like regexp or wildcards, they should have their head
> examined.
> All queries should just check this crap in ctor and throw exceptions if
> parameters are invalid.
[jira] [Updated] (LUCENE-6345) null check all term/fields in queries
[ https://issues.apache.org/jira/browse/LUCENE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6345:
    Attachment: LUCENE-6345.patch

Updated patch that re-adds an assert that I removed mistakenly.

> null check all term/fields in queries
>
> Key: LUCENE-6345
> URL: https://issues.apache.org/jira/browse/LUCENE-6345
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-6345.patch, LUCENE-6345.patch
>
> See the mail thread "is this lucene 4.1.0 bug in PerFieldPostingsFormat".
> If anyone seriously thinks adding a null check to ctor will cause measurable
> slowdown to things like regexp or wildcards, they should have their head
> examined.
> All queries should just check this crap in ctor and throw exceptions if
> parameters are invalid.
[jira] [Commented] (LUCENE-6345) null check all term/fields in queries
[ https://issues.apache.org/jira/browse/LUCENE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362714#comment-14362714 ]

Lee Hinman commented on LUCENE-6345:

I'm going to work on this. Looking through the code, I see a mixture of:

{noformat}
Term t = Objects.requireNonNull(term);
{noformat}

As well as:

{noformat}
if (term == null) {
  throw new IllegalArgumentException("Term must not be null");
}
{noformat}

Any particular preference here? I think an explicit message is nicer but I can go either way. If no one has an opinion about it I'll pick one and go with it :)

> null check all term/fields in queries
>
> Key: LUCENE-6345
> URL: https://issues.apache.org/jira/browse/LUCENE-6345
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
>
> See the mail thread "is this lucene 4.1.0 bug in PerFieldPostingsFormat".
> If anyone seriously thinks adding a null check to ctor will cause measurable
> slowdown to things like regexp or wildcards, they should have their head
> examined.
> All queries should just check this crap in ctor and throw exceptions if
> parameters are invalid.
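For comparison, the two null-check styles from the comment can be put side by side in a small sketch. `NullCheckStyles` and its method names are hypothetical, and a plain `String` stands in for Lucene's `Term` so the example stays self-contained. One thing the sketch shows is that the two-argument `Objects.requireNonNull` also carries an explicit message, so the practical difference between the styles is the exception type rather than the message.

```java
import java.util.Objects;

/** Hypothetical illustration of the two styles; String stands in for Term. */
final class NullCheckStyles {
    private static final String MESSAGE = "Term must not be null";

    /** Style 1: JDK helper; throws NullPointerException with the given message. */
    static String viaRequireNonNull(String term) {
        return Objects.requireNonNull(term, MESSAGE);
    }

    /** Style 2: explicit check; throws IllegalArgumentException with the message. */
    static String viaExplicitCheck(String term) {
        if (term == null) {
            throw new IllegalArgumentException(MESSAGE);
        }
        return term;
    }
}
```

Either form fails fast in the constructor; which exception type callers should see is the judgment call raised in the comment.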
[jira] [Updated] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses
[ https://issues.apache.org/jira/browse/LUCENE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6333:
    Attachment: LUCENE-6333-2.patch

bq. I think the test may fail sporadically, if the randomly generated terms happen to cause a hash collision. Maybe the test can just use some hardcoded terms like "apple" and "orange", which will still find the bug.

I agree; here's a new patch that uses "apple" and "orange" as you recommended.

> Clean up overridden .equals and .hashCode methods in Query subclasses
>
> Key: LUCENE-6333
> URL: https://issues.apache.org/jira/browse/LUCENE-6333
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Fix For: Trunk, 5.1
> Attachments: LUCENE-6333-2.patch, LUCENE-6333-2.patch, LUCENE-6333.patch
>
> As a followup to LUCENE-6304, all classes that subclass Query and override
> the {{equals}} and {{hashCode}} methods should call super.equals/hashCode
> and, when possible, not override the methods at all.
> For example, TermQuery.hashCode overrides the Query.hashCode, but will be
> exactly the same code once LUCENE-6304 is merged.
[jira] [Updated] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses
[ https://issues.apache.org/jira/browse/LUCENE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6333:
    Attachment: LUCENE-6333-2.patch

Here is an additional patch (now that the previous LUCENE-6333.patch was applied) that adds a correct {{hashCode}} method for {{TermsQuery}} as well as tests that catch this if it happens in the future.

> Clean up overridden .equals and .hashCode methods in Query subclasses
>
> Key: LUCENE-6333
> URL: https://issues.apache.org/jira/browse/LUCENE-6333
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Fix For: Trunk, 5.1
> Attachments: LUCENE-6333-2.patch, LUCENE-6333.patch
>
> As a followup to LUCENE-6304, all classes that subclass Query and override
> the {{equals}} and {{hashCode}} methods should call super.equals/hashCode
> and, when possible, not override the methods at all.
> For example, TermQuery.hashCode overrides the Query.hashCode, but will be
> exactly the same code once LUCENE-6304 is merged.
[jira] [Updated] (LUCENE-6355) Add verbose IndexWriter logging for writing field infos
[ https://issues.apache.org/jira/browse/LUCENE-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6355:
    Attachment: LUCENE-6355.patch

Here is a very simple patch; it adds the same time logging that is used in the rest of {{SegmentMerger.merge}} to the writing of the field infos.

> Add verbose IndexWriter logging for writing field infos
>
> Key: LUCENE-6355
> URL: https://issues.apache.org/jira/browse/LUCENE-6355
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6355.patch
>
> SegmentMerger should also write the amount of time it takes to write the
> field infos during a merge. This makes it much easier to determine the
> contributing times for the total merge time.
[jira] [Created] (LUCENE-6355) Add verbose IndexWriter logging for writing field infos
Lee Hinman created LUCENE-6355:

Summary: Add verbose IndexWriter logging for writing field infos
Key: LUCENE-6355
URL: https://issues.apache.org/jira/browse/LUCENE-6355
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor

SegmentMerger should also write the amount of time it takes to write the field infos during a merge. This makes it much easier to determine the contributing times for the total merge time.
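The kind of per-step timing log requested here can be sketched as follows. This is not the code from the attached patch or from `SegmentMerger`; `MergeTimingSketch`, `format`, and `timed` are hypothetical names, and the message wording is an assumption made for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of step-level timing output; not Lucene's SegmentMerger. */
final class MergeTimingSketch {
    /** Builds the log line from two System.nanoTime() readings. */
    static String format(String step, long startNanos, long endNanos) {
        long millis = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return millis + " msec to " + step;
    }

    /** Runs one merge step and returns the timing line for it. */
    static String timed(String step, Runnable work) {
        long t0 = System.nanoTime();
        work.run();
        long t1 = System.nanoTime();
        return format(step, t0, t1);
    }
}
```

Emitting one such line per step (stored fields, norms, field infos, and so on) is what lets the individual contributions to the total merge time be read off the verbose output.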
[jira] [Updated] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses
[ https://issues.apache.org/jira/browse/LUCENE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6333:
    Attachment: LUCENE-6333.patch

Here's a patch that cleans up the hashCode and equals methods for most queries. I removed the extra {{getBoost}} comparison because the Query superclass does that comparison. I was also able to remove some overridden methods that only did exactly what the {{Query}} implementation does.

> Clean up overridden .equals and .hashCode methods in Query subclasses
>
> Key: LUCENE-6333
> URL: https://issues.apache.org/jira/browse/LUCENE-6333
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6333.patch
>
> As a followup to LUCENE-6304, all classes that subclass Query and override
> the {{equals}} and {{hashCode}} methods should call super.equals/hashCode
> and, when possible, not override the methods at all.
> For example, TermQuery.hashCode overrides the Query.hashCode, but will be
> exactly the same code once LUCENE-6304 is merged.
[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6304:
    Attachment: LUCENE-6304.patch

Adrien: I agree about having the hashCode. Here is a new patch that doesn't override equals or hashCode and changes Query to use the class in the hashCode method as Adrien suggested.

> Add MatchNoDocsQuery that matches no documents
>
> Key: LUCENE-6304
> URL: https://issues.apache.org/jira/browse/LUCENE-6304
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch
>
> As a followup to LUCENE-6298, it would be nice to have an explicit
> MatchNoDocsQuery to indicate that no documents should be matched.
> This would hopefully be a better indicator than a BooleanQuery with no
> clauses or (even worse) null.
[jira] [Comment Edited] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345602#comment-14345602 ]

Lee Hinman edited comment on LUCENE-6304 at 3/3/15 7:42 PM:

Adrien: I agree about having the hashCode. Here is a new patch that doesn't override equals or hashCode and changes Query to use the class in the hashCode method as Adrien suggested.

was (Author: dakrone):
Adrien: I agree about having the hashCode. Here is a new patch that doesn't override equals or hashCode and changes Query to use use the class in the hashCode method as Adrien suggested.

> Add MatchNoDocsQuery that matches no documents
>
> Key: LUCENE-6304
> URL: https://issues.apache.org/jira/browse/LUCENE-6304
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch
>
> As a followup to LUCENE-6298, it would be nice to have an explicit
> MatchNoDocsQuery to indicate that no documents should be matched.
> This would hopefully be a better indicator than a BooleanQuery with no
> clauses or (even worse) null.
[jira] [Commented] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345820#comment-14345820 ]

Lee Hinman commented on LUCENE-6304:

Robert: +1. I opened LUCENE-6333 for this; I'll work on a patch.

> Add MatchNoDocsQuery that matches no documents
>
> Key: LUCENE-6304
> URL: https://issues.apache.org/jira/browse/LUCENE-6304
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch
>
> As a followup to LUCENE-6298, it would be nice to have an explicit
> MatchNoDocsQuery to indicate that no documents should be matched.
> This would hopefully be a better indicator than a BooleanQuery with no
> clauses or (even worse) null.
[jira] [Created] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses
Lee Hinman created LUCENE-6333:

Summary: Clean up overridden .equals and .hashCode methods in Query subclasses
Key: LUCENE-6333
URL: https://issues.apache.org/jira/browse/LUCENE-6333
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor

As a followup to LUCENE-6304, all classes that subclass Query and override the {{equals}} and {{hashCode}} methods should call super.equals/hashCode and, when possible, not override the methods at all.

For example, TermQuery.hashCode overrides the Query.hashCode, but will be exactly the same code once LUCENE-6304 is merged.
[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6304:
    Attachment: LUCENE-6304.patch

bq. is the hashcode/equals stuff needed here or can the superclass impls in Query be used?

The hashCode override is required at least. Without it, QueryUtils.check(q) fails: MatchNoDocsQuery and the superclass Query have the same hash code, the anonymous WhackyQuery that QueryUtils creates shares that hash code too, and so QueryUtils.checkUnequal() fails.

The .equals() override is not required, though; it can use the superclass implementation. I've attached a new patch that does this.

> Add MatchNoDocsQuery that matches no documents
>
> Key: LUCENE-6304
> URL: https://issues.apache.org/jira/browse/LUCENE-6304
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch
>
> As a followup to LUCENE-6298, it would be nice to have an explicit
> MatchNoDocsQuery to indicate that no documents should be matched.
> This would hopefully be a better indicator than a BooleanQuery with no
> clauses or (even worse) null.
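The collision described in this comment, and the idea of folding the class into the base hash code so that subclasses need no override, can be sketched with stand-in types. Nothing here is Lucene code: `QueryHashSketch`, `BaseQuery`, `NoDocs`, and `AllDocs` are hypothetical, and the only state modeled is a boost value.

```java
/** Hypothetical stand-ins for illustration; these are not Lucene's Query classes. */
final class QueryHashSketch {
    abstract static class BaseQuery {
        private final float boost;

        BaseQuery(float boost) {
            this.boost = boost;
        }

        @Override
        public int hashCode() {
            // Mixing in the runtime class makes it unlikely that two different
            // query types share a hash code just because their boosts match.
            return 31 * Float.floatToIntBits(boost) + getClass().hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            // Exact class comparison: a NoDocs never equals an AllDocs.
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            BaseQuery other = (BaseQuery) obj;
            return Float.floatToIntBits(boost) == Float.floatToIntBits(other.boost);
        }
    }

    /** Needs no equals/hashCode override; the base class already covers it. */
    static final class NoDocs extends BaseQuery {
        NoDocs(float boost) {
            super(boost);
        }
    }

    static final class AllDocs extends BaseQuery {
        AllDocs(float boost) {
            super(boost);
        }
    }
}
```

If the base `hashCode` used only the boost, `new NoDocs(1.0f)` and `new AllDocs(1.0f)` would hash identically, which is the situation the comment describes.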
[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6304:
    Attachment: LUCENE-6304.patch

New patch that changes MatchNoDocsQuery to rewrite to an empty BooleanQuery. Also removes the nocommit as per Adrien's suggestion.

> Add MatchNoDocsQuery that matches no documents
>
> Key: LUCENE-6304
> URL: https://issues.apache.org/jira/browse/LUCENE-6304
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6304.patch, LUCENE-6304.patch
>
> As a followup to LUCENE-6298, it would be nice to have an explicit
> MatchNoDocsQuery to indicate that no documents should be matched.
> This would hopefully be a better indicator than a BooleanQuery with no
> clauses or (even worse) null.
[jira] [Created] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
Lee Hinman created LUCENE-6304:

Summary: Add MatchNoDocsQuery that matches no documents
Key: LUCENE-6304
URL: https://issues.apache.org/jira/browse/LUCENE-6304
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor

As a followup to LUCENE-6298, it would be nice to have an explicit MatchNoDocsQuery to indicate that no documents should be matched.

This would hopefully be a better indicator than a BooleanQuery with no clauses or (even worse) null.
[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6304:
    Attachment: LUCENE-6304.patch

Patch that adds the MatchNoDocsQuery and uses it for empty SimpleQueryParser queries as well as when a BooleanQuery is rewritten and has no clauses.

> Add MatchNoDocsQuery that matches no documents
>
> Key: LUCENE-6304
> URL: https://issues.apache.org/jira/browse/LUCENE-6304
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6304.patch
>
> As a followup to LUCENE-6298, it would be nice to have an explicit
> MatchNoDocsQuery to indicate that no documents should be matched.
> This would hopefully be a better indicator than a BooleanQuery with no
> clauses or (even worse) null.
[jira] [Created] (LUCENE-6298) empty SimpleQueryParser query should return empty BooleanQuery
Lee Hinman created LUCENE-6298:

Summary: empty SimpleQueryParser query should return empty BooleanQuery
Key: LUCENE-6298
URL: https://issues.apache.org/jira/browse/LUCENE-6298
Project: Lucene - Core
Issue Type: Bug
Components: modules/queryparser
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor

In order to be consistent with QueryParser, SimpleQueryParser should return an empty BooleanQuery instead of null when the analyzed query state is null (if the query text is entirely removed during analysis, for instance).

Long term it would also be nice to be able to return a MatchNoDocsQuery (or something like that) instead of using null as a stand-in value for this.
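The motivation for returning an empty query object rather than null can be shown with a small stand-in parser. This is only a sketch under stated assumptions: `EmptyQuerySketch`, its stop-word set, and the use of a `List<String>` in place of a BooleanQuery's clauses are all hypothetical and unrelated to SimpleQueryParser's real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in parser; a List of terms plays the role of query clauses. */
final class EmptyQuerySketch {
    // Assumed stop words, standing in for whatever an analyzer would remove.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "of"));

    /** Returns the surviving terms; an empty list (never null) if none survive. */
    static List<String> parse(String text) {
        List<String> clauses = new ArrayList<>();
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                clauses.add(token);
            }
        }
        // Callers can iterate the result directly, with no null check.
        return Collections.unmodifiableList(clauses);
    }
}
```

A caller that receives the empty list can treat it like any other result, whereas a null return forces every call site to special-case input that analysis removed entirely.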
[jira] [Updated] (LUCENE-6298) empty SimpleQueryParser query should return empty BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6298:
    Attachment: LUCENE-6298.patch

Small patch that changes the SimpleQueryParser to the desired behavior.

> empty SimpleQueryParser query should return empty BooleanQuery
>
> Key: LUCENE-6298
> URL: https://issues.apache.org/jira/browse/LUCENE-6298
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/queryparser
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6298.patch
>
> In order to be consistent with QueryParser, SimpleQueryParser should return
> an empty BooleanQuery instead of null when the analyzed query state is null
> (if the query text is entirely removed during analysis, for instance).
> Long term it would also be nice to be able to return a MatchNoDocsQuery (or
> something like that) instead of using null as a stand-in value for this.
[jira] [Updated] (LUCENE-6298) empty SimpleQueryParser query should return empty BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lee Hinman updated LUCENE-6298:
    Attachment: LUCENE-6298.patch

Better patch without all the stupid IDE import changes.

> empty SimpleQueryParser query should return empty BooleanQuery
>
> Key: LUCENE-6298
> URL: https://issues.apache.org/jira/browse/LUCENE-6298
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/queryparser
> Affects Versions: 5.0
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-6298.patch, LUCENE-6298.patch
>
> In order to be consistent with QueryParser, SimpleQueryParser should return
> an empty BooleanQuery instead of null when the analyzed query state is null
> (if the query text is entirely removed during analysis, for instance).
> Long term it would also be nice to be able to return a MatchNoDocsQuery (or
> something like that) instead of using null as a stand-in value for this.
[jira] [Created] (LUCENE-6046) RegExp.toAutomaton high memory use
Lee Hinman created LUCENE-6046:

Summary: RegExp.toAutomaton high memory use
Key: LUCENE-6046
URL: https://issues.apache.org/jira/browse/LUCENE-6046
Project: Lucene - Core
Issue Type: Bug
Components: core/queryparser
Affects Versions: 4.10.1
Reporter: Lee Hinman
Priority: Minor

When creating an automaton from an org.apache.lucene.util.automaton.RegExp, it's possible for the automaton to use so much memory that it exceeds the maximum array size for Java.

The following caused an OutOfMemoryError with a 32gb heap:

{noformat}
new RegExp("\\[\\[(Datei|File|Bild|Image):[^]]*alt=[^]|}]{50,200}").toAutomaton();
{noformat}

When increased to a 60gb heap, the following exception is thrown:

{noformat}
1 java.lang.IllegalArgumentException: requested array size 2147483624 exceeds maximum array in java (2147483623)
1 __randomizedtesting.SeedInfo.seed([7BE81EF678615C32:95C8057A4ABA5B52]:0)
1 org.apache.lucene.util.ArrayUtil.oversize(ArrayUtil.java:168)
1 org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:295)
1 org.apache.lucene.util.automaton.Automaton$Builder.addTransition(Automaton.java:639)
1 org.apache.lucene.util.automaton.Operations.determinize(Operations.java:741)
1 org.apache.lucene.util.automaton.MinimizationOperations.minimizeHopcroft(MinimizationOperations.java:62)
1 org.apache.lucene.util.automaton.MinimizationOperations.minimize(MinimizationOperations.java:51)
1 org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:477)
1 org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:426)
{noformat}
[jira] [Commented] (LUCENE-6046) RegExp.toAutomaton high memory use
[ https://issues.apache.org/jira/browse/LUCENE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194400#comment-14194400 ]

Lee Hinman commented on LUCENE-6046:

[~mikemccand] I ran it with the following JVM:

{noformat}
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
{noformat}

> RegExp.toAutomaton high memory use
>
> Key: LUCENE-6046
> URL: https://issues.apache.org/jira/browse/LUCENE-6046
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Affects Versions: 4.10.1
> Reporter: Lee Hinman
> Assignee: Michael McCandless
> Priority: Minor
>
> When creating an automaton from an org.apache.lucene.util.automaton.RegExp,
> it's possible for the automaton to use so much memory that it exceeds the
> maximum array size for Java.
> The following caused an OutOfMemoryError with a 32gb heap:
> {noformat}
> new RegExp("\\[\\[(Datei|File|Bild|Image):[^]]*alt=[^]|}]{50,200}").toAutomaton();
> {noformat}
> When increased to a 60gb heap, the following exception is thrown:
> {noformat}
> 1 java.lang.IllegalArgumentException: requested array size 2147483624 exceeds maximum array in java (2147483623)
> 1 __randomizedtesting.SeedInfo.seed([7BE81EF678615C32:95C8057A4ABA5B52]:0)
> 1 org.apache.lucene.util.ArrayUtil.oversize(ArrayUtil.java:168)
> 1 org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:295)
> 1 org.apache.lucene.util.automaton.Automaton$Builder.addTransition(Automaton.java:639)
> 1 org.apache.lucene.util.automaton.Operations.determinize(Operations.java:741)
> 1 org.apache.lucene.util.automaton.MinimizationOperations.minimizeHopcroft(MinimizationOperations.java:62)
> 1 org.apache.lucene.util.automaton.MinimizationOperations.minimize(MinimizationOperations.java:51)
> 1 org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:477)
> 1 org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:426)
> {noformat}
[jira] [Commented] (LUCENE-6046) RegExp.toAutomaton high memory use
[ https://issues.apache.org/jira/browse/LUCENE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194436#comment-14194436 ]
Lee Hinman commented on LUCENE-6046:

bq. I think for this issue we should allow passing in a "how much work are you willing to do" to RegExp.toAutomaton, and it throws an exc when it would exceed that.

For what it's worth, I think this would be a good solution for us, much better than silently (from the user's perspective) freezing and then hitting an OOME.

> RegExp.toAutomaton high memory use
>
> Key: LUCENE-6046
> URL: https://issues.apache.org/jira/browse/LUCENE-6046
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Affects Versions: 4.10.1
> Reporter: Lee Hinman
> Assignee: Michael McCandless
> Priority: Minor
>
> When creating an automaton from an org.apache.lucene.util.automaton.RegExp, it's possible for the automaton to use so much memory that it exceeds the maximum array size for Java.
> The following caused an OutOfMemoryError with a 32 GB heap:
> {noformat}
> new RegExp("\\[\\[(Datei|File|Bild|Image):[^]]*alt=[^]|}]{50,200}").toAutomaton();
> {noformat}
> When increased to a 60 GB heap, the following exception is thrown:
> {noformat}
> java.lang.IllegalArgumentException: requested array size 2147483624 exceeds maximum array in java (2147483623)
>     __randomizedtesting.SeedInfo.seed([7BE81EF678615C32:95C8057A4ABA5B52]:0)
>     org.apache.lucene.util.ArrayUtil.oversize(ArrayUtil.java:168)
>     org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:295)
>     org.apache.lucene.util.automaton.Automaton$Builder.addTransition(Automaton.java:639)
>     org.apache.lucene.util.automaton.Operations.determinize(Operations.java:741)
>     org.apache.lucene.util.automaton.MinimizationOperations.minimizeHopcroft(MinimizationOperations.java:62)
>     org.apache.lucene.util.automaton.MinimizationOperations.minimize(MinimizationOperations.java:51)
>     org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:477)
>     org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:426)
> {noformat}
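The work-limit idea quoted in the comment above can be illustrated with a small, self-contained sketch. This is not Lucene code: the class name, the exception, and the `maxStates` parameter are hypothetical and chosen only for illustration. It runs a textbook subset construction on an NFA for (a|b)*a(a|b){n}, whose equivalent DFA needs 2^(n+1) states, and throws as soon as a caller-supplied state budget would be exceeded instead of running until memory is exhausted.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Lucene's API: subset construction with a state budget. */
final class BoundedDeterminize {

  /** Thrown instead of letting the construction run until the heap is exhausted. */
  static final class TooMuchWorkException extends RuntimeException {
    TooMuchWorkException(String message) {
      super(message);
    }
  }

  /**
   * One NFA step for (a|b)*a(a|b){n}: state 0 loops on both symbols,
   * 'a' also moves 0 -> 1, and states 1..n advance on either symbol.
   * State n+1 is the accepting state and has no outgoing transitions.
   */
  static BitSet step(BitSet states, char symbol, int n) {
    BitSet next = new BitSet(n + 2);
    for (int s = states.nextSetBit(0); s >= 0; s = states.nextSetBit(s + 1)) {
      if (s == 0) {
        next.set(0);
        if (symbol == 'a') {
          next.set(1);
        }
      } else if (s <= n) {
        next.set(s + 1);
      }
    }
    return next;
  }

  /** Returns the number of reachable DFA states, or throws once maxStates would be exceeded. */
  static int determinize(int n, int maxStates) {
    Map<BitSet, Integer> seen = new HashMap<>();
    Deque<BitSet> work = new ArrayDeque<>();
    BitSet start = new BitSet(n + 2);
    start.set(0);
    seen.put(start, 0);
    work.add(start);
    while (!work.isEmpty()) {
      BitSet current = work.remove();
      for (char symbol : new char[] {'a', 'b'}) {
        BitSet next = step(current, symbol, n);
        if (!seen.containsKey(next)) {
          if (seen.size() >= maxStates) {
            throw new TooMuchWorkException(
                "determinizing would need more than " + maxStates + " states");
          }
          seen.put(next, seen.size());
          work.add(next);
        }
      }
    }
    return seen.size();
  }

  public static void main(String[] args) {
    System.out.println(determinize(3, 10_000)); // prints 16, i.e. 2^(3+1)
    try {
      determinize(30, 10_000); // would need 2^31 states
    } catch (TooMuchWorkException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The caller picks the budget, so a pathological pattern fails fast with a descriptive exception rather than freezing and ending in an OOME.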
[jira] [Created] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException
Lee Hinman created LUCENE-5710:

Summary: DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException
Key: LUCENE-5710
URL: https://issues.apache.org/jira/browse/LUCENE-5710
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Affects Versions: 4.8.1
Reporter: Lee Hinman
Priority: Minor

In DefaultIndexingChain, when a MaxBytesLengthExceededException is caught, the original message is discarded. However, that message contains useful information, such as the size that exceeded the limit. Lucene should include this information in the newly thrown IllegalArgumentException.
[jira] [Updated] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException
[ https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lee Hinman updated LUCENE-5710:
Attachment: LUCENE-5710.patch

Attaching a patch that includes the original exception's message in the IllegalArgumentException message.

> DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException
>
> Key: LUCENE-5710
> URL: https://issues.apache.org/jira/browse/LUCENE-5710
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.8.1
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5710.patch
>
> In DefaultIndexingChain, when a MaxBytesLengthExceededException is caught, the original message is discarded. However, that message contains useful information, such as the size that exceeded the limit. Lucene should include this information in the newly thrown IllegalArgumentException.
[jira] [Updated] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException
[ https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lee Hinman updated LUCENE-5710:
Attachment: LUCENE-5710.patch

Patch that passes the exception as the cause parameter of the IllegalArgumentException.

> DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException
>
> Key: LUCENE-5710
> URL: https://issues.apache.org/jira/browse/LUCENE-5710
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.8.1
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5710.patch, LUCENE-5710.patch
>
> In DefaultIndexingChain, when a MaxBytesLengthExceededException is caught, the original message is discarded. However, that message contains useful information, such as the size that exceeded the limit. Lucene should include this information in the newly thrown IllegalArgumentException.
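The two patches above take slightly different routes: the first copies the original message into the new exception, the second passes the original exception along as the cause. A minimal sketch that combines both is shown here. It uses stand-in names throughout (`TermTooLongException`, `checkLength`, `indexTerm`, and the limit value are all hypothetical) and is not the actual DefaultIndexingChain code.

```java
/** Hypothetical sketch: all names here are illustrative stand-ins, not Lucene's code. */
final class PreserveExceptionDetail {

  /** Stand-in for a low-level exception whose message carries the offending size. */
  static final class TermTooLongException extends RuntimeException {
    TermTooLongException(String message) {
      super(message);
    }
  }

  /** Illustrative limit, assumed for this example only. */
  static final int MAX_TERM_BYTES = 32766;

  /** Low-level check that reports the actual length in its message. */
  static void checkLength(int lengthInBytes) {
    if (lengthInBytes > MAX_TERM_BYTES) {
      throw new TermTooLongException(
          "bytes can be at most " + MAX_TERM_BYTES + " in length; got " + lengthInBytes);
    }
  }

  /** Re-throws with field context while keeping the original message and the cause. */
  static void indexTerm(String field, int lengthInBytes) {
    try {
      checkLength(lengthInBytes);
    } catch (TermTooLongException e) {
      throw new IllegalArgumentException(
          "term too long in field \"" + field + "\": " + e.getMessage(), e);
    }
  }

  public static void main(String[] args) {
    try {
      indexTerm("body", 40_000);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
      System.out.println("cause: " + e.getCause().getClass().getSimpleName());
    }
  }
}
```

Keeping the cause preserves the original stack trace for debugging, while repeating the message means the size is visible even when only the top-level message is logged.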
[jira] [Updated] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lee Hinman updated LUCENE-5410:
Attachment: LUCENE-5410.patch

Here's a new version of the patch with these changes:
- Make only minimal changes to {{consumeToken}} and {{consumePhrase}}; all the logic lives in a separate {{parseFuzziness}} function used by both.
- Separate the flags for edit distance and slop; there are now {{FUZZINESS_OPERATOR}} and {{SLOP_OPERATOR}}.
- More tests

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch, LUCENE-5410.patch, LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
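As a rough illustration of the flag split described in this update, the sketch below shows two independent bit flags sharing one suffix parser. Only the names {{FUZZINESS_OPERATOR}}, {{SLOP_OPERATOR}} and {{parseFuzziness}} come from the message; the flag values, the method signatures, and the helper methods are assumptions made for this example and do not reproduce the attached patch.

```java
/** Hypothetical sketch of the flag split; flag values and method shapes are assumptions. */
final class FuzzyFlagsSketch {
  static final int FUZZINESS_OPERATOR = 1 << 0; // "~N" after a term: max edit distance
  static final int SLOP_OPERATOR = 1 << 1;      // "~N" after a phrase: slop

  /** Shared helper: parses "~N" and returns N, or -1 if the suffix is not of that form. */
  static int parseFuzziness(String suffix) {
    if (suffix.length() < 2 || suffix.charAt(0) != '~') {
      return -1;
    }
    int value = 0;
    for (int i = 1; i < suffix.length(); i++) {
      int digit = suffix.charAt(i) - '0';
      if (digit < 0 || digit > 9) {
        return -1;
      }
      value = value * 10 + digit;
      if (value > 1_000_000) {
        return -1; // reject absurd values well before int overflow
      }
    }
    return value;
  }

  /** Edit distance for a term suffix, or -1 when the operator is disabled or malformed. */
  static int termFuzziness(String suffix, int flags) {
    return (flags & FUZZINESS_OPERATOR) != 0 ? parseFuzziness(suffix) : -1;
  }

  /** Slop for a phrase suffix, or -1 when the operator is disabled or malformed. */
  static int phraseSlop(String suffix, int flags) {
    return (flags & SLOP_OPERATOR) != 0 ? parseFuzziness(suffix) : -1;
  }

  public static void main(String[] args) {
    int onlySlop = SLOP_OPERATOR;
    System.out.println(termFuzziness("~2", onlySlop)); // prints -1: fuzziness is off
    System.out.println(phraseSlop("~2", onlySlop));    // prints 2: slop is on
  }
}
```

With separate bits, a caller can enable phrase slop without enabling fuzzy terms, or the reverse, while both code paths still parse the suffix in one place.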
[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880350#comment-13880350 ]
Lee Hinman commented on LUCENE-5410:

Upayavira: That does bring up a good point: how should the extra fuzzy characters be treated if fuzziness is turned off? Currently the patch treats the token {{foo~2}} as a TermQuery for "foo~2" if {{FUZZINESS_OPERATOR}} is disabled. Should it be changed to silently swallow the ~2 even if fuzziness is disabled?

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880364#comment-13880364 ]
Lee Hinman commented on LUCENE-5410:

Gotcha. The next version of the patch will swallow {{~XXX}} if fuzziness is disabled and won't touch QueryBuilder.java, as per Robert's comments.

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Comment Edited] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880384#comment-13880384 ]
Lee Hinman edited comment on LUCENE-5410 at 1/23/14 9:30 PM:

Okay, I misunderstood then. I was thinking "the ~2 syntax has no meaning whatever" meant *no* meaning (i.e., ignore it entirely). I will keep the current behavior of {{foo~2}} being a TermQuery for "foo~2" and {{"foo bar"~2}} being a BooleanQuery of the PhraseQuery "foo bar" and a TermQuery for "~2".

was (Author: dakrone):
Okay, I misunderstood then. I was thinking "the ~2 syntax has no meaning whatever" meant *no* meaning (i.e., ignore it entirely). I will keep the current behavior of {{foo~2}} being a TermQuery for "foo~2" and {{{"foo bar"~2}}} being a BooleanQuery of the PhraseQuery "foo bar" and a TermQuery for "~2".

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880384#comment-13880384 ]
Lee Hinman commented on LUCENE-5410:

Okay, I misunderstood then. I was thinking "the ~2 syntax has no meaning whatever" meant *no* meaning (i.e., ignore it entirely). I will keep the current behavior of {{foo~2}} being a TermQuery for "foo~2" and {{{"foo bar"~2}}} being a BooleanQuery of the PhraseQuery "foo bar" and a TermQuery for "~2".

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
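The disabled-operator behavior described in this comment follows naturally once '~' is treated as an ordinary character. The toy tokenizer below is only an illustration under that assumption; it is not SimpleQueryParser, the class and method names are hypothetical, and it labels clauses with strings rather than building real query objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy tokenizer, not SimpleQueryParser: shows why a disabled "~2" ends up as its own term. */
final class LiteralTildeDemo {

  /**
   * Splits on whitespace and treats a double-quoted run as one phrase.
   * The '~' character gets no special handling, as if both operators were disabled.
   */
  static List<String> clauses(String query) {
    List<String> out = new ArrayList<>();
    int i = 0;
    while (i < query.length()) {
      char c = query.charAt(i);
      if (Character.isWhitespace(c)) {
        i++;
      } else if (c == '"') {
        // The phrase ends at the closing quote; anything after it is a new token.
        int close = query.indexOf('"', i + 1);
        if (close < 0) {
          close = query.length();
        }
        out.add("PhraseQuery(" + query.substring(i + 1, close) + ")");
        i = Math.min(close + 1, query.length());
      } else {
        int end = i;
        while (end < query.length()
            && !Character.isWhitespace(query.charAt(end))
            && query.charAt(end) != '"') {
          end++;
        }
        out.add("TermQuery(" + query.substring(i, end) + ")");
        i = end;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(clauses("foo~2"));           // prints [TermQuery(foo~2)]
    System.out.println(clauses("\"foo bar\"~2"));   // prints [PhraseQuery(foo bar), TermQuery(~2)]
  }
}
```

In the term case the suffix stays attached to the token, while in the phrase case the closing quote ends the phrase, so the leftover "~2" is read as a separate term and the two clauses would be combined in a BooleanQuery.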
[jira] [Updated] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lee Hinman updated LUCENE-5410:
Attachment: LUCENE-5410.patch

New version of the patch with these changes:
- No changes made to QueryBuilder.java
- Only a single method for newPhraseQuery instead of two
- Add ~ to testRandomQueries2

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch, LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880511#comment-13880511 ]
Lee Hinman commented on LUCENE-5410:

Okay, the first is very simple (I'll add a {{SLOP_OPERATOR}} flag and make {{FUZZINESS_OPERATOR}} only work for fuzzy terms).

As for the second, I think this is doable, but it will still require a bit of special logic in consumeTerm and consumePhrase because of the differences in how they consume/increment state.data and state.index. I'll work on another revision doing this.

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch, LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Updated] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lee Hinman updated LUCENE-5410:
Attachment: LUCENE-5410.patch

Attached a patch that supports both edit distance for single terms and slop for phrases. I chose to add only a single flag ({{FUZZINESS_OPERATOR}}) for enabling/disabling both behaviors, but it would be easy to separate them if desired.

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Attachments: LUCENE-5410.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Created] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
Lee Hinman created LUCENE-5410:

Summary: Add fuzziness support to SimpleQueryParser
Key: LUCENE-5410
URL: https://issues.apache.org/jira/browse/LUCENE-5410
Project: Lucene - Core
Issue Type: Improvement
Components: core/queryparser
Affects Versions: 4.7
Reporter: Lee Hinman
Priority: Minor

It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
{{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
and:
{{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.
[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878015#comment-13878015 ]
Lee Hinman commented on LUCENE-5410:

I would like to work on this also, if that's alright.

> Add fuzziness support to SimpleQueryParser
>
> Key: LUCENE-5410
> URL: https://issues.apache.org/jira/browse/LUCENE-5410
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/queryparser
> Affects Versions: 4.7
> Reporter: Lee Hinman
> Priority: Minor
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so that:
> {{foo~2}} generates a {{FuzzyQuery}} with a max edit distance of 2
> and:
> {{"foo bar"~2}} generates a {{PhraseQuery}} with a slop of 2.