[jira] [Updated] (LUCENE-7712) SimpleQueryString should support auto fuziness

2017-03-06 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-7712:
---
Attachment: LUCENE-7712.patch

Attached a small patch that adds auto-fuzziness and updates the tests to check 
it.

> SimpleQueryString should support auto fuziness
> --
>
> Key: LUCENE-7712
> URL: https://issues.apache.org/jira/browse/LUCENE-7712
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: David Pilato
> Attachments: LUCENE-7712.patch
>
>
> Apparently the simpleQueryString query does not support auto fuzziness as the 
> query string query does.
> So {{foo:bar~1}} works for both simple query string and query string queries,
> but {{foo:bar~}} works for the query string query and not for the simple query 
> string query.
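
To make the difference between {{bar~1}} and {{bar~}} concrete, here is a minimal sketch of how a bare {{~}} could fall back to a default edit distance. It is not the attached patch and not Lucene code: the class FuzzyTerm, its parse method, and the defaultEdits parameter are all hypothetical, and the default is left as a parameter because the right value is an open question in this thread.

```java
// Hypothetical helper (not part of Lucene): splits "term~N" into the term
// text and a maximum edit distance.
final class FuzzyTerm {
    final String text;
    final int maxEdits; // 0 means "not fuzzy"

    FuzzyTerm(String text, int maxEdits) {
        this.text = text;
        this.maxEdits = maxEdits;
    }

    static FuzzyTerm parse(String token, int defaultEdits) {
        int tilde = token.lastIndexOf('~');
        if (tilde <= 0) {
            // No fuzzy operator, or nothing in front of it: plain term.
            return new FuzzyTerm(token, 0);
        }
        String text = token.substring(0, tilde);
        String suffix = token.substring(tilde + 1);
        if (suffix.isEmpty()) {
            // Bare "~": the "auto" case, so use the caller-supplied default.
            return new FuzzyTerm(text, defaultEdits);
        }
        try {
            // Explicit "~N": use N, clamped so it is never negative.
            return new FuzzyTerm(text, Math.max(0, Integer.parseInt(suffix)));
        } catch (NumberFormatException e) {
            // "~" followed by a non-number: keep the whole token as a literal.
            return new FuzzyTerm(token, 0);
        }
    }
}
```

With this sketch, FuzzyTerm.parse("bar~1", 2) keeps the explicit distance of 1, while FuzzyTerm.parse("bar~", 2) falls back to the default that was passed in.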



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7712) SimpleQueryString should support auto fuziness

2017-02-28 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1538#comment-1538
 ] 

Lee Hinman commented on LUCENE-7712:


I am happy to submit a patch to add this; however, I don't know what the auto 
value should be. The only mention I could find is in older (3.x) documentation, 
which says it may be 0.5. Is that the correct fuzziness value to use when none 
is specified?

> SimpleQueryString should support auto fuziness
> --
>
> Key: LUCENE-7712
> URL: https://issues.apache.org/jira/browse/LUCENE-7712
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: David Pilato
>
> Apparently the simpleQueryString query does not support auto fuzziness as the 
> query string query does.
> So {{foo:bar~1}} works for both simple query string and query string queries,
> but {{foo:bar~}} works for the query string query and not for the simple query 
> string query.






[jira] [Updated] (LUCENE-7490) SimpleQueryParser should parse "*" as MatchAllDocsQuery

2016-10-11 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-7490:
---
Attachment: 0001-Parse-as-MatchAllDocsQuery-in-SimpleQueryParser.patch

Attaching patch with small fix and unit test

> SimpleQueryParser should parse "*" as MatchAllDocsQuery
> ---
>
> Key: LUCENE-7490
> URL: https://issues.apache.org/jira/browse/LUCENE-7490
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/queryparser
>Affects Versions: 6.2.1
>Reporter: Lee Hinman
>Priority: Minor
> Fix For: 6.x, master (7.0)
>
> Attachments: 
> 0001-Parse-as-MatchAllDocsQuery-in-SimpleQueryParser.patch
>
>
> It would be beneficial for SimpleQueryString to parse "*" as a MatchAllDocsQuery, 
> rather than a "field:*" query.
> Related discussion on the Elasticsearch project about this: 
> https://github.com/elastic/elasticsearch/issues/10632



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Created] (LUCENE-7490) SimpleQueryParser should parse "*" as MatchAllDocsQuery

2016-10-11 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-7490:
--

 Summary: SimpleQueryParser should parse "*" as MatchAllDocsQuery
 Key: LUCENE-7490
 URL: https://issues.apache.org/jira/browse/LUCENE-7490
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/queryparser
Affects Versions: 6.2.1
Reporter: Lee Hinman
Priority: Minor
 Fix For: 6.x, master (7.0)


It would be beneficial for SimpleQueryString to parse "*" as a MatchAllDocsQuery, 
rather than a "field:*" query.

Related discussion on the Elasticsearch project about this: 
https://github.com/elastic/elasticsearch/issues/10632
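
As a rough illustration of the requested behavior, the sketch below classifies raw query text so that a lone "*" is treated as "match everything" instead of falling through to the prefix case. This is only an assumption about the shape of the change, not the attached patch: StarQuerySketch, Kind, and classify are hypothetical names, and a real parser would build query objects rather than return an enum.

```java
// Hypothetical sketch (not Lucene code): decide what kind of query a raw
// token should become.
final class StarQuerySketch {
    enum Kind { MATCH_ALL, PREFIX, TERM }

    static Kind classify(String queryText) {
        String text = queryText.trim();
        if (text.equals("*")) {
            // A bare star has no prefix at all, so match every document.
            return Kind.MATCH_ALL;
        }
        if (text.length() > 1 && text.endsWith("*")) {
            // "foo*" keeps its usual prefix meaning.
            return Kind.PREFIX;
        }
        return Kind.TERM;
    }
}
```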






[jira] [Commented] (LUCENE-7027) NumericTermAttribute throws IAE after NumericTokenStream is exhausted

2016-02-13 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146214#comment-15146214
 ] 

Lee Hinman commented on LUCENE-7027:


Thanks [~thetaphi]!

> NumericTermAttribute throws IAE after NumericTokenStream is exhausted
> -
>
> Key: LUCENE-7027
> URL: https://issues.apache.org/jira/browse/LUCENE-7027
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 5.5, master, 6.0
>Reporter: Michael McCandless
>Assignee: Uwe Schindler
>Priority: Blocker
> Fix For: 5.5, master, 6.0
>
> Attachments: LUCENE-7027-master.patch, LUCENE-7027-master.patch, 
> LUCENE-7027-master.patch
>
>
> This small test:
> {noformat}
>   public void testCloneFullPrecisionToken() throws Exception {
>     FieldType fieldType = new FieldType(IntField.TYPE_NOT_STORED);
>     fieldType.setNumericPrecisionStep(Integer.MAX_VALUE);
>     Field field = new IntField("field", 17, fieldType);
>     TokenStream tokenStream = new CachingTokenFilter(field.tokenStream(null, null));
>     assertTrue(tokenStream.incrementToken());
>   }
> {noformat}
> hits this unexpected exception:
> {noformat}
> There was 1 failure:
> 1) testCloneFullPrecisionToken(org.apache.lucene.analysis.TestNumericTokenStream)
> java.lang.IllegalArgumentException: Illegal shift value, must be 0..31; got shift=2147483647
>   at __randomizedtesting.SeedInfo.seed([2E1E93EF810CB5F7:EF1304A849574BC7]:0)
>   at org.apache.lucene.util.NumericUtils.intToPrefixCodedBytes(NumericUtils.java:175)
>   at org.apache.lucene.util.NumericUtils.intToPrefixCoded(NumericUtils.java:133)
>   at org.apache.lucene.analysis.NumericTokenStream$NumericTermAttributeImpl.getBytesRef(NumericTokenStream.java:165)
>   at org.apache.lucene.analysis.NumericTokenStream$NumericTermAttributeImpl.clone(NumericTokenStream.java:217)
>   at org.apache.lucene.analysis.NumericTokenStream$NumericTermAttributeImpl.clone(NumericTokenStream.java:148)
>   at org.apache.lucene.util.AttributeSource$State.clone(AttributeSource.java:55)
>   at org.apache.lucene.util.AttributeSource.captureState(AttributeSource.java:288)
>   at org.apache.lucene.analysis.CachingTokenFilter.fillCache(CachingTokenFilter.java:96)
>   at org.apache.lucene.analysis.CachingTokenFilter.incrementToken(CachingTokenFilter.java:70)
>   at org.apache.lucene.analysis.TestNumericTokenStream.testCloneFullPrecisionToken(TestNumericTokenStream.java:138)
> {noformat}
> because {{CachingTokenFilter}} expects that it can {{captureState}} after 
> calling {{end}} but {{NumericTokenStream}} gets angry about this.






[jira] [Updated] (LUCENE-6345) null check all term/fields in queries

2015-03-16 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6345:
---
Attachment: LUCENE-6345.patch

Here's a patch that adds a lot of null checks to queries as well as things like 
{{BooleanClause}}.

It doesn't add tests for every single query for this (yet), though I see there 
are some already for {{FilteredQuery}}.

Should I work on adding tests for every query type for this, or is adding the 
checks alone sufficient?

 null check all term/fields in queries
 -

 Key: LUCENE-6345
 URL: https://issues.apache.org/jira/browse/LUCENE-6345
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-6345.patch


 See the mail thread "is this lucene 4.1.0 bug in PerFieldPostingsFormat".
 If anyone seriously thinks adding a null check to ctor will cause measurable 
 slowdown to things like regexp or wildcards, they should have their head 
 examined.
 All queries should just check this crap in ctor and throw exceptions if 
 parameters are invalid.






[jira] [Updated] (LUCENE-6345) null check all term/fields in queries

2015-03-16 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6345:
---
Attachment: LUCENE-6345.patch

Updated patch that re-adds an assert that I removed mistakenly.

 null check all term/fields in queries
 -

 Key: LUCENE-6345
 URL: https://issues.apache.org/jira/browse/LUCENE-6345
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-6345.patch, LUCENE-6345.patch


 See the mail thread "is this lucene 4.1.0 bug in PerFieldPostingsFormat".
 If anyone seriously thinks adding a null check to ctor will cause measurable 
 slowdown to things like regexp or wildcards, they should have their head 
 examined.
 All queries should just check this crap in ctor and throw exceptions if 
 parameters are invalid.






[jira] [Commented] (LUCENE-6345) null check all term/fields in queries

2015-03-15 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362714#comment-14362714
 ] 

Lee Hinman commented on LUCENE-6345:


I'm going to work on this.

Looking through the code, I see a mixture of:

{noformat}
Term t = Objects.requireNonNull(term);
{noformat}

As well as:

{noformat}
if (term == null) {
  throw new IllegalArgumentException("Term must not be null");
}
{noformat}

Any particular preference here? I think an explicit message is nicer but I can 
go either way. If no one has an opinion about it I'll pick one and go with it :)
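
For comparison, here is a small self-contained sketch of the two styles side by side. The class and method names (NullCheckStyles, viaRequireNonNull, viaExplicitCheck) are made up for illustration and a String stands in for Term. One thing it shows: Objects.requireNonNull also has an overload that takes a message, so either style can carry an explicit message; the remaining difference is the exception type (NullPointerException versus IllegalArgumentException).

```java
import java.util.Objects;

// Hypothetical illustration of the two null-check styles discussed above.
final class NullCheckStyles {

    // Style 1: Objects.requireNonNull with a message; throws NullPointerException.
    static String viaRequireNonNull(String term) {
        return Objects.requireNonNull(term, "Term must not be null");
    }

    // Style 2: explicit check; throws IllegalArgumentException.
    static String viaExplicitCheck(String term) {
        if (term == null) {
            throw new IllegalArgumentException("Term must not be null");
        }
        return term;
    }
}
```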

 null check all term/fields in queries
 -

 Key: LUCENE-6345
 URL: https://issues.apache.org/jira/browse/LUCENE-6345
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 See the mail thread "is this lucene 4.1.0 bug in PerFieldPostingsFormat".
 If anyone seriously thinks adding a null check to ctor will cause measurable 
 slowdown to things like regexp or wildcards, they should have their head 
 examined.
 All queries should just check this crap in ctor and throw exceptions if 
 parameters are invalid.






[jira] [Updated] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses

2015-03-13 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6333:
---
Attachment: LUCENE-6333-2.patch

bq. I think the test may fail sporadically, if the randomly generated terms 
happen to cause a hash collision. Maybe the test can just use some hardcoded 
terms like apple and orange, which will still find the bug.

I agree; here's a new patch that uses apple and orange as you recommended.

 Clean up overridden .equals and .hashCode methods in Query subclasses
 -

 Key: LUCENE-6333
 URL: https://issues.apache.org/jira/browse/LUCENE-6333
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6333-2.patch, LUCENE-6333-2.patch, 
 LUCENE-6333.patch


 As a followup to LUCENE-6304, all classes that subclass Query and override 
 the {{equals}} and {{hashCode}} methods should call super.equals/hashCode 
 and, when possible, not override the methods at all.
 For example, TermQuery.hashCode overrides the Query.hashCode, but will be 
 exactly the same code once LUCENE-6304 is merged.






[jira] [Updated] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses

2015-03-10 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6333:
---
Attachment: LUCENE-6333-2.patch

Here is an additional patch (now that the previous LUCENE-6333.patch was 
applied) that adds a correct {{hashCode}} method for {{TermsQuery}} as well as 
tests that catch this if it happens in the future.

 Clean up overridden .equals and .hashCode methods in Query subclasses
 -

 Key: LUCENE-6333
 URL: https://issues.apache.org/jira/browse/LUCENE-6333
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6333-2.patch, LUCENE-6333.patch


 As a followup to LUCENE-6304, all classes that subclass Query and override 
 the {{equals}} and {{hashCode}} methods should call super.equals/hashCode 
 and, when possible, not override the methods at all.
 For example, TermQuery.hashCode overrides the Query.hashCode, but will be 
 exactly the same code once LUCENE-6304 is merged.






[jira] [Updated] (LUCENE-6355) Add verbose IndexWriter logging for writing field infos

2015-03-10 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6355:
---
Attachment: LUCENE-6355.patch

Here is a very simple patch: it adds the same time logging that is used in the 
rest of {{SegmentMerger.merge}} to the writing of the field infos.

 Add verbose IndexWriter logging for writing field infos
 ---

 Key: LUCENE-6355
 URL: https://issues.apache.org/jira/browse/LUCENE-6355
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6355.patch


 SegmentMerger should also write the amount of time it takes to write the 
 field infos during a merge. This makes it much easier to determine the 
 contributing times for the total merge time.






[jira] [Created] (LUCENE-6355) Add verbose IndexWriter logging for writing field infos

2015-03-10 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-6355:
--

 Summary: Add verbose IndexWriter logging for writing field infos
 Key: LUCENE-6355
 URL: https://issues.apache.org/jira/browse/LUCENE-6355
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor


SegmentMerger should also write the amount of time it takes to write the field 
infos during a merge. This makes it much easier to determine the contributing 
times for the total merge time.
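
The kind of per-step timing being asked for can be sketched as below. This is a stand-alone illustration, not the SegmentMerger code: MergeTimer, timeStep, and the log callback are hypothetical, and the "msec to ..." wording only loosely imitates the existing merge messages.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: time one step of a larger operation and log the result.
final class MergeTimer {

    static long timeStep(String name, Runnable step, Consumer<String> log) {
        long start = System.nanoTime();
        step.run();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        // One message per step makes it easy to see where merge time goes.
        log.accept(elapsedMs + " msec to " + name);
        return elapsedMs;
    }
}
```

A call such as MergeTimer.timeStep("write field infos", step, System.out::println) would then print one line for that step alongside the lines for the other merge phases.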






[jira] [Updated] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses

2015-03-04 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6333:
---
Attachment: LUCENE-6333.patch

Here's a patch that cleans up the hashCode and equals methods for most queries.

I removed the extra {{getBoost}} comparison because the Query superclass does 
that comparison. I was also able to remove some overridden methods that only 
did exactly what the {{Query}} implementation does.

 Clean up overridden .equals and .hashCode methods in Query subclasses
 -

 Key: LUCENE-6333
 URL: https://issues.apache.org/jira/browse/LUCENE-6333
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6333.patch


 As a followup to LUCENE-6304, all classes that subclass Query and override 
 the {{equals}} and {{hashCode}} methods should call super.equals/hashCode 
 and, when possible, not override the methods at all.
 For example, TermQuery.hashCode overrides the Query.hashCode, but will be 
 exactly the same code once LUCENE-6304 is merged.






[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-03-03 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6304:
---
Attachment: LUCENE-6304.patch

Adrien: I agree about having the hashCode.

Here is a new patch that doesn't override equals or hashCode and changes Query 
to use the class in the hashCode method as Adrien suggested.

 Add MatchNoDocsQuery that matches no documents
 --

 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch, 
 LUCENE-6304.patch


 As a followup to LUCENE-6298, it would be nice to have an explicit 
 MatchNoDocsQuery to indicate that no documents should be matched. This would 
 hopefully be a better indicator than a BooleanQuery with no clauses or (even 
 worse) null.






[jira] [Comment Edited] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-03-03 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345602#comment-14345602
 ] 

Lee Hinman edited comment on LUCENE-6304 at 3/3/15 7:42 PM:


Adrien: I agree about having the hashCode.

Here is a new patch that doesn't override equals or hashCode and changes Query 
to use the class in the hashCode method as Adrien suggested.


was (Author: dakrone):
Adrien: I agree about having the hashCode.

Here is a new patch that doesn't override equals or hashCode and changes Query 
to use use the class in the hashCode method as Adrien suggested.

 Add MatchNoDocsQuery that matches no documents
 --

 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch, 
 LUCENE-6304.patch


 As a followup to LUCENE-6298, it would be nice to have an explicit 
 MatchNoDocsQuery to indicate that no documents should be matched. This would 
 hopefully be a better indicator than a BooleanQuery with no clauses or (even 
 worse) null.






[jira] [Commented] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-03-03 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345820#comment-14345820
 ] 

Lee Hinman commented on LUCENE-6304:


Robert: +1, I opened LUCENE-6333 for this, I'll work on a patch.

 Add MatchNoDocsQuery that matches no documents
 --

 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch, 
 LUCENE-6304.patch


 As a followup to LUCENE-6298, it would be nice to have an explicit 
 MatchNoDocsQuery to indicate that no documents should be matched. This would 
 hopefully be a better indicator than a BooleanQuery with no clauses or (even 
 worse) null.






[jira] [Created] (LUCENE-6333) Clean up overridden .equals and .hashCode methods in Query subclasses

2015-03-03 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-6333:
--

 Summary: Clean up overridden .equals and .hashCode methods in 
Query subclasses
 Key: LUCENE-6333
 URL: https://issues.apache.org/jira/browse/LUCENE-6333
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor


As a followup to LUCENE-6304, all classes that subclass Query and override the 
{{equals}} and {{hashCode}} methods should call super.equals/hashCode and, when 
possible, not override the methods at all.

For example, TermQuery.hashCode overrides the Query.hashCode, but will be 
exactly the same code once LUCENE-6304 is merged.






[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-02-27 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6304:
---
Attachment: LUCENE-6304.patch

bq. is the hashcode/equals stuff needed here or can the superclass impls in 
Query be used?

The hashCode override is required at least. Without it, QueryUtils.check(q) 
fails: MatchNoDocsQuery would inherit its hash code from the superclass Query, 
and the anonymous WhackyQuery that QueryUtils creates shares that same hash 
code, so QueryUtils.checkUnequal() fails.

The .equals() override is not required, though; it can use the superclass 
implementation. I've attached a new patch that does this.
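
One way to avoid a per-subclass hashCode override is to mix the runtime class into the base implementation, so field-less subclasses can simply inherit it. The sketch below illustrates that idea only; SketchQuery and its two subclasses are hypothetical stand-ins, not the Lucene classes, and the actual Query code may differ.

```java
// Hypothetical base class: equals and hashCode both take the runtime class
// into account, so subclasses without extra state need no overrides.
abstract class SketchQuery {
    private float boost = 1.0f;

    float getBoost() { return boost; }
    void setBoost(float boost) { this.boost = boost; }

    @Override
    public int hashCode() {
        // Combine the boost with the concrete class of this instance.
        return Float.floatToIntBits(boost) ^ getClass().hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        SketchQuery other = (SketchQuery) obj;
        return Float.floatToIntBits(boost) == Float.floatToIntBits(other.boost);
    }
}

// Two field-less subclasses that rely entirely on the inherited methods.
final class SketchMatchNoDocs extends SketchQuery {}
final class SketchMatchAllDocs extends SketchQuery {}
```

Two SketchMatchNoDocs instances with the same boost compare equal and share a hash code, while a SketchMatchNoDocs never equals a SketchMatchAllDocs because the class check fails.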

 Add MatchNoDocsQuery that matches no documents
 --

 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch


 As a followup to LUCENE-6298, it would be nice to have an explicit 
 MatchNoDocsQuery to indicate that no documents should be matched. This would 
 hopefully be a better indicator than a BooleanQuery with no clauses or (even 
 worse) null.






[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-02-27 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6304:
---
Attachment: LUCENE-6304.patch

New patch that changes MatchNoDocsQuery to rewrite to an empty BooleanQuery. 
Also removes the nocommit as per Adrien's suggestion.

 Add MatchNoDocsQuery that matches no documents
 --

 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6304.patch, LUCENE-6304.patch


 As a followup to LUCENE-6298, it would be nice to have an explicit 
 MatchNoDocsQuery to indicate that no documents should be matched. This would 
 hopefully be a better indicator than a BooleanQuery with no clauses or (even 
 worse) null.






[jira] [Created] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-02-26 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-6304:
--

 Summary: Add MatchNoDocsQuery that matches no documents
 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor


As a followup to LUCENE-6298, it would be nice to have an explicit 
MatchNoDocsQuery to indicate that no documents should be matched. This would 
hopefully be a better indicator than a BooleanQuery with no clauses or (even 
worse) null.






[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents

2015-02-26 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6304:
---
Attachment: LUCENE-6304.patch

Patch that adds the MatchNoDocsQuery and uses it for empty SimpleQueryParser 
queries as well as when a BooleanQuery is rewritten and has no clauses.

 Add MatchNoDocsQuery that matches no documents
 --

 Key: LUCENE-6304
 URL: https://issues.apache.org/jira/browse/LUCENE-6304
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6304.patch


 As a followup to LUCENE-6298, it would be nice to have an explicit 
 MatchNoDocsQuery to indicate that no documents should be matched. This would 
 hopefully be a better indicator than a BooleanQuery with no clauses or (even 
 worse) null.






[jira] [Created] (LUCENE-6298) empty SimpleQueryParser query should return empty BooleanQuery

2015-02-25 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-6298:
--

 Summary: empty SimpleQueryParser query should return empty 
BooleanQuery
 Key: LUCENE-6298
 URL: https://issues.apache.org/jira/browse/LUCENE-6298
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor


In order to be consistent with QueryParser, SimpleQueryParser should return an 
empty BooleanQuery instead of null when the analyzed query state is null (if 
the query text is entirely removed during analysis, for instance).

Long term it would also be nice to be able to return a MatchNoDocsQuery (or 
something like that) instead of using null as a stand-in value for this.
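
The case described above, where analysis removes all of the query text, can be sketched without any Lucene types. In this hypothetical example (EmptyQuerySketch, Clauses, and the stop-word "analysis" are all assumptions for illustration), the parser always returns an object, and an empty clause list plays the role of the empty BooleanQuery, so callers never have to handle null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a parser that returns an empty result object instead
// of null when analysis leaves no terms.
final class EmptyQuerySketch {

    /** Stand-in for a boolean query; with no terms it matches nothing. */
    static final class Clauses {
        final List<String> terms;

        Clauses(List<String> terms) { this.terms = terms; }

        boolean matchesNothing() { return terms.isEmpty(); }
    }

    static Clauses parse(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        // Toy "analysis": split on whitespace and drop stop words.
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        // Never null, even if every token was removed.
        return new Clauses(kept);
    }
}
```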






[jira] [Updated] (LUCENE-6298) empty SimpleQueryParser query should return empty BooleanQuery

2015-02-25 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6298:
---
Attachment: LUCENE-6298.patch

Small patch that changes the SimpleQueryParser to the desired behavior

 empty SimpleQueryParser query should return empty BooleanQuery
 --

 Key: LUCENE-6298
 URL: https://issues.apache.org/jira/browse/LUCENE-6298
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6298.patch


 In order to be consistent with QueryParser, SimpleQueryParser should return 
 an empty BooleanQuery instead of null when the analyzed query state is null 
 (if the query text is entirely removed during analysis, for instance).
 Long term it would also be nice to be able to return a MatchNoDocsQuery (or 
 something like that) instead of using null as a stand-in value for this.






[jira] [Updated] (LUCENE-6298) empty SimpleQueryParser query should return empty BooleanQuery

2015-02-25 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-6298:
---
Attachment: LUCENE-6298.patch

Better patch without all the stupid IDE import changes.

 empty SimpleQueryParser query should return empty BooleanQuery
 --

 Key: LUCENE-6298
 URL: https://issues.apache.org/jira/browse/LUCENE-6298
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 5.0
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-6298.patch, LUCENE-6298.patch


 In order to be consistent with QueryParser, SimpleQueryParser should return 
 an empty BooleanQuery instead of null when the analyzed query state is null 
 (if the query text is entirely removed during analysis, for instance).
 Long term it would also be nice to be able to return a MatchNoDocsQuery (or 
 something like that) instead of using null as a stand-in value for this.






[jira] [Created] (LUCENE-6046) RegExp.toAutomaton high memory use

2014-11-03 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-6046:
--

 Summary: RegExp.toAutomaton high memory use
 Key: LUCENE-6046
 URL: https://issues.apache.org/jira/browse/LUCENE-6046
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.10.1
Reporter: Lee Hinman
Priority: Minor


When creating an automaton from an org.apache.lucene.util.automaton.RegExp, 
it's possible for the automaton to use so much memory it exceeds the maximum 
array size for java.

The following caused an OutOfMemoryError with a 32gb heap:

{noformat}
new RegExp("\\[\\[(Datei|File|Bild|Image):[^]]*alt=[^]|}]{50,200}").toAutomaton();
{noformat}

When increased to a 60gb heap, the following exception is thrown:

{noformat}
  1 java.lang.IllegalArgumentException: requested array size 2147483624 exceeds maximum array in java (2147483623)
  1 __randomizedtesting.SeedInfo.seed([7BE81EF678615C32:95C8057A4ABA5B52]:0)
  1 org.apache.lucene.util.ArrayUtil.oversize(ArrayUtil.java:168)
  1 org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:295)
  1 org.apache.lucene.util.automaton.Automaton$Builder.addTransition(Automaton.java:639)
  1 org.apache.lucene.util.automaton.Operations.determinize(Operations.java:741)
  1 org.apache.lucene.util.automaton.MinimizationOperations.minimizeHopcroft(MinimizationOperations.java:62)
  1 org.apache.lucene.util.automaton.MinimizationOperations.minimize(MinimizationOperations.java:51)
  1 org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:477)
  1 org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:426)
{noformat}
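
The limit in that message is a cap on array length rather than on heap size,
which would be consistent with the failure persisting after the heap was
raised. A minimal sketch of that kind of guard follows. It is not the real
ArrayUtil code: the class name OversizeSketch and the 1/8 headroom are
assumptions, and the constant simply reuses the limit printed in the trace.

```java
/** Illustration only: a simplified growth guard, not Lucene's ArrayUtil. */
final class OversizeSketch {
    // Assumed limit, copied from the "maximum array" value in the trace above.
    static final int MAX_ARRAY_LENGTH = 2147483623;

    /** Returns a capacity of at least minTargetSize with some headroom, or throws if it cannot fit. */
    static int oversize(int minTargetSize) {
        if (minTargetSize < 0) {
            throw new IllegalArgumentException("invalid array size " + minTargetSize);
        }
        if (minTargetSize > MAX_ARRAY_LENGTH) {
            // More heap does not help here: the request exceeds what one array can hold.
            throw new IllegalArgumentException("requested array size " + minTargetSize
                + " exceeds maximum array in java (" + MAX_ARRAY_LENGTH + ")");
        }
        // Grow by roughly 1/8 (an assumed factor); long arithmetic avoids int overflow.
        long grown = (long) minTargetSize + (minTargetSize >> 3);
        return (int) Math.min(grown, MAX_ARRAY_LENGTH);
    }
}
```

A caller that keeps appending transitions will eventually ask for one element
past the cap, at which point a guard like this throws instead of allocating.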






[jira] [Commented] (LUCENE-6046) RegExp.toAutomaton high memory use

2014-11-03 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194400#comment-14194400
 ] 

Lee Hinman commented on LUCENE-6046:


[~mikemccand] I ran it with the following JVM:

{noformat}
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
{noformat}

 RegExp.toAutomaton high memory use
 --

 Key: LUCENE-6046
 URL: https://issues.apache.org/jira/browse/LUCENE-6046
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.10.1
Reporter: Lee Hinman
Assignee: Michael McCandless
Priority: Minor




[jira] [Commented] (LUCENE-6046) RegExp.toAutomaton high memory use

2014-11-03 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194436#comment-14194436
 ] 

Lee Hinman commented on LUCENE-6046:


bq. I think for this issue we should allow passing in a "how much work are you 
willing to do" to RegExp.toAutomaton, and it throws an exc when it would exceed 
that.

For what it's worth, I think this would be a good solution for us, much better 
than silently (from the user's perspective) freezing and then hitting an OOME.
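
To make the idea concrete, here is a rough sketch of a work limit applied to a
textbook subset construction. It does not use Lucene's automaton classes:
BoundedDeterminize, TooMuchWorkException, countDfaStates and kthFromEnd are
all hypothetical names, and the NFA encoding (nfa[state][symbol] lists the
target states) is chosen only to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Illustration only: subset construction with a caller-supplied state budget. */
final class BoundedDeterminize {
    /** Hypothetical exception type, named here only for this sketch. */
    static final class TooMuchWorkException extends RuntimeException {
        TooMuchWorkException(int limit) {
            super("determinizing would need more than " + limit + " states");
        }
    }

    /** Counts reachable DFA states from start state 0; throws once maxStates would be exceeded. */
    static int countDfaStates(int[][][] nfa, int alphabetSize, int maxStates) {
        Set<BitSet> seen = new HashSet<>();
        Deque<BitSet> pending = new ArrayDeque<>();
        BitSet start = new BitSet();
        start.set(0);
        seen.add(start);
        pending.add(start);
        while (!pending.isEmpty()) {
            BitSet current = pending.poll();
            for (int symbol = 0; symbol < alphabetSize; symbol++) {
                // Union of the targets of every NFA state in the current subset.
                BitSet next = new BitSet();
                for (int s = current.nextSetBit(0); s >= 0; s = current.nextSetBit(s + 1)) {
                    for (int target : nfa[s][symbol]) {
                        next.set(target);
                    }
                }
                if (next.isEmpty() || seen.contains(next)) {
                    continue;
                }
                if (seen.size() >= maxStates) {
                    // Fail fast with a clear error instead of running until memory is gone.
                    throw new TooMuchWorkException(maxStates);
                }
                seen.add(next);
                pending.add(next);
            }
        }
        return seen.size();
    }

    /** NFA over {a=0, b=1} for "the k-th symbol from the end is a" (k >= 1). */
    static int[][][] kthFromEnd(int k) {
        int[][][] nfa = new int[k + 1][2][];
        nfa[0][0] = new int[] {0, 1};
        nfa[0][1] = new int[] {0};
        for (int s = 1; s < k; s++) {
            nfa[s][0] = new int[] {s + 1};
            nfa[s][1] = new int[] {s + 1};
        }
        nfa[k][0] = new int[0];
        nfa[k][1] = new int[0];
        return nfa;
    }
}
```

The kthFromEnd automaton needs 2^k subset states, so kthFromEnd(3) fits in a
budget of 100 while kthFromEnd(10) trips the limit and throws straight away.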




[jira] [Created] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-28 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-5710:
--

 Summary: DefaultIndexingChain swallows useful information from 
MaxBytesLengthExceededException
 Key: LUCENE-5710
 URL: https://issues.apache.org/jira/browse/LUCENE-5710
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.8.1
Reporter: Lee Hinman
Priority: Minor


In DefaultIndexingChain, when a MaxBytesLengthExceededException is caught, the 
original message is discarded. However, that message contains useful 
information, such as the size that exceeded the limit.

Lucene should include this information in the newly thrown 
IllegalArgumentException.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Updated] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-28 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-5710:
---

Attachment: LUCENE-5710.patch

Attaching patch that includes the original exception's message in the 
IllegalArgumentException message.




[jira] [Updated] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-28 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-5710:
---

Attachment: LUCENE-5710.patch

Patch that includes the exception as the cause parameter for 
IllegalArgumentException
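
For readers who have not seen the pattern, here is a small standalone sketch of
keeping both the original message and the original exception when rethrowing.
It is not the patch itself: ExceptionContextSketch, TermTooLongException,
addTerm, indexField and the 8-byte limit are hypothetical and exist only for
this example.

```java
/** Illustration only: preserving context when translating one exception into another. */
final class ExceptionContextSketch {
    /** Hypothetical stand-in for the low-level exception; not a Lucene class. */
    static final class TermTooLongException extends RuntimeException {
        TermTooLongException(String message) {
            super(message);
        }
    }

    // Arbitrary limit chosen for the example.
    static final int MAX_TERM_BYTES = 8;

    /** Low-level step that knows the exact sizes involved. */
    static void addTerm(byte[] term) {
        if (term.length > MAX_TERM_BYTES) {
            throw new TermTooLongException(
                "term can be at most " + MAX_TERM_BYTES + " bytes, got " + term.length);
        }
    }

    /** Higher-level step that adds field context without losing the original details. */
    static void indexField(String field, byte[] term) {
        try {
            addTerm(term);
        } catch (TermTooLongException e) {
            // Append the original message and pass the exception along as the cause.
            throw new IllegalArgumentException(
                "immense term in field \"" + field + "\": " + e.getMessage(), e);
        }
    }
}
```

Passing the caught exception as the cause keeps its stack trace available to
callers, while appending its message keeps the size visible in plain logs.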




[jira] [Updated] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-24 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-5410:
---

Attachment: LUCENE-5410.patch

Here's a new version of the patch with these changes:

- Make only minimal changes to {{consumeToken}} and {{consumePhrase}}; all the 
logic lives in a separate {{parseFuzziness}} function used by both.
- Separate the flags for edit distance and slop; there are now 
{{FUZZINESS_OPERATOR}} and {{SLOP_OPERATOR}}.
- More tests
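
The shape of these changes can be sketched roughly as follows. This is not the
code from the patch: the class FuzzinessSuffixSketch, the helper methods
termEdits and phraseSlop, and the bit values of the two flags are assumptions
made for illustration. Only the idea of one shared parseFuzziness helper and
two independent flags comes from the notes above.

```java
/** Illustration only: one shared suffix parser, gated by two independent flags. */
final class FuzzinessSuffixSketch {
    // Hypothetical bit values; the real constants may differ.
    static final int FUZZINESS_OPERATOR = 1 << 0;
    static final int SLOP_OPERATOR = 1 << 1;

    /** Returns the number after a trailing '~', or 0 when there is no valid numeric suffix. */
    static int parseFuzziness(String text) {
        int tilde = text.lastIndexOf('~');
        int digits = text.length() - tilde - 1;
        // Reject a missing '~', an empty suffix, or a suffix too long to fit in an int.
        if (tilde < 0 || digits == 0 || digits > 9) {
            return 0;
        }
        int value = 0;
        for (int i = tilde + 1; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                return 0;
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }

    /** Edit distance for a single term; 0 when the fuzziness flag is off. */
    static int termEdits(String token, int flags) {
        return (flags & FUZZINESS_OPERATOR) != 0 ? parseFuzziness(token) : 0;
    }

    /** Slop for a quoted phrase; 0 when the slop flag is off. */
    static int phraseSlop(String phrase, int flags) {
        return (flags & SLOP_OPERATOR) != 0 ? parseFuzziness(phrase) : 0;
    }
}
```

Because the flags are separate bits, a caller could enable slop on phrases
while leaving fuzzy terms off, or the other way around.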

 Add fuzziness support to SimpleQueryParser
 --

 Key: LUCENE-5410
 URL: https://issues.apache.org/jira/browse/LUCENE-5410
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Affects Versions: 4.7
Reporter: Lee Hinman
Priority: Minor
 Attachments: LUCENE-5410.patch, LUCENE-5410.patch, LUCENE-5410.patch

   Original Estimate: 168h
  Remaining Estimate: 168h

 It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so 
 that:
 {{foo~2}}
 generates a {{FuzzyQuery}} with a max edit distance of 2 and:
 {{"foo bar"~2}}
 generates a {{PhraseQuery}} with a slop of 2.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)




[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-23 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880350#comment-13880350
 ] 

Lee Hinman commented on LUCENE-5410:


Upayavira: That does bring up a good point: how should the extra fuzzy 
characters be treated if fuzziness is turned off? Currently the patch treats 
the token {{foo~2}} as a TermQuery for "foo~2" if {{FUZZINESS_OPERATOR}} is 
disabled. Should it be changed to silently swallow the "~2" when fuzziness 
is disabled?
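
To make the current behavior concrete, here is a tiny standalone sketch. It is
not the parser code: DisabledFuzzinessSketch and describe are hypothetical, and
plain strings stand in for the query objects.

```java
/** Illustration only: how a single token is interpreted with the operator on or off. */
final class DisabledFuzzinessSketch {
    /** Describes the query a token would produce; strings stand in for real query objects. */
    static String describe(String token, boolean fuzzinessEnabled) {
        int tilde = token.lastIndexOf('~');
        // A usable suffix is a '~' after at least one character, followed only by ASCII digits.
        boolean hasSuffix = tilde > 0
            && tilde < token.length() - 1
            && token.substring(tilde + 1).chars().allMatch(c -> c >= '0' && c <= '9');
        if (fuzzinessEnabled && hasSuffix) {
            return "FuzzyQuery(" + token.substring(0, tilde)
                + ", maxEdits=" + token.substring(tilde + 1) + ")";
        }
        // With the operator disabled, the whole token (including "~2") stays a literal term.
        return "TermQuery(" + token + ")";
    }
}
```

The alternative raised in the question would strip the suffix in the disabled
case, so that the token became a plain term for "foo".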




[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-23 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880364#comment-13880364
 ] 

Lee Hinman commented on LUCENE-5410:


Gotcha. The next version of the patch will swallow {{~XXX}} if fuzziness is 
disabled, and won't touch QueryBuilder.java, as per Robert's comments.




[jira] [Comment Edited] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-23 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880384#comment-13880384
 ] 

Lee Hinman edited comment on LUCENE-5410 at 1/23/14 9:30 PM:
-

Okay, I misunderstood then. I was thinking "the ~2 syntax has no meaning 
whatever" meant *no* meaning (i.e., ignore it entirely). I will keep the current 
behavior of {{foo~2}} being a TermQuery for "foo~2" and {{"foo bar"~2}} being a 
BooleanQuery of the PhraseQuery "foo bar" and a TermQuery for "~2".


was (Author: dakrone):
Okay, I misunderstood then. I was thinking "the ~2 syntax has no meaning 
whatever" meant *no* meaning (i.e., ignore it entirely). I will keep the current 
behavior of {{foo~2}} being a TermQuery for "foo~2" and {{{"foo bar"~2}}} being 
a BooleanQuery of the PhraseQuery "foo bar" and a TermQuery for "~2".




[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-23 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880384#comment-13880384
 ] 

Lee Hinman commented on LUCENE-5410:


Okay, I misunderstood then. I was thinking "the ~2 syntax has no meaning 
whatever" meant *no* meaning (i.e., ignore it entirely). I will keep the current 
behavior of {{foo~2}} being a TermQuery for "foo~2" and {{{"foo bar"~2}}} being 
a BooleanQuery of the PhraseQuery "foo bar" and a TermQuery for "~2".




[jira] [Updated] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-23 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-5410:
---

Attachment: LUCENE-5410.patch

New version of the patch with these changes:

- No changes made to QueryBuilder.java
- Only a single method for newPhraseQuery instead of two
- Add ~ to testRandomQueries2




[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-23 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880511#comment-13880511
 ] 

Lee Hinman commented on LUCENE-5410:


Okay, the first is very simple (I'll add a {{SLOP_OPERATOR}} flag and make 
{{FUZZINESS_OPERATOR}} only work for fuzzy terms).

As for the second, I think this is doable, but I think it will still require a 
bit of special logic in consumeTerm and consumePhrase based on the differences 
in how they consume/increment state.data and state.index. I'll work on another 
revision doing this.




[jira] [Updated] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-22 Thread Lee Hinman (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Hinman updated LUCENE-5410:
---

Attachment: LUCENE-5410.patch

Attached patch that supports both edit distance for single terms and slop for 
phrases.

I chose to add only a single flag ({{FUZZINESS_OPERATOR}}) for 
enabling/disabling both behaviors, but it would be easy to separate them if 
desired.




[jira] [Created] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-21 Thread Lee Hinman (JIRA)
Lee Hinman created LUCENE-5410:
--

 Summary: Add fuzziness support to SimpleQueryParser
 Key: LUCENE-5410
 URL: https://issues.apache.org/jira/browse/LUCENE-5410
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Affects Versions: 4.7
Reporter: Lee Hinman
Priority: Minor


It would be nice to add fuzzy query support to the {{SimpleQueryParser}} so 
that:

{{foo~2}}

generates a {{FuzzyQuery}} with a max edit distance of 2 and:

{{"foo bar"~2}}

generates a {{PhraseQuery}} with a slop of 2.






[jira] [Commented] (LUCENE-5410) Add fuzziness support to SimpleQueryParser

2014-01-21 Thread Lee Hinman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878015#comment-13878015
 ] 

Lee Hinman commented on LUCENE-5410:


I would like to work on this also, if that's alright.
