[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-13 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934612#comment-13934612
 ] 

Nikhil Chhaochharia commented on LUCENE-5205:
-

We tried reducing all stop words to an impossible token. Compared to using the 
StopFilter, this increased both our indexing time and our index size by about 
10%. We used a SynonymFilter with expand=false to map all stop words to the 
impossible token. Initial tests show that the functionality is as expected and 
that PhraseQuery / SpanQuery handle stop words properly. We will be running 
more tests to check for unexpected side effects, but this looks like a better 
option than a StopFilter, which sometimes leads to false matches.
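The effect of collapsing every stop word into one placeholder can be sketched without Lucene. The following is only a toy model in plain Java, not the SynonymFilter configuration described above: the class name, the stop list and the `#stop#` token are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of "map every stop word to one placeholder token"; not Lucene code.
class StopPlaceholderDemo {
    // Hypothetical stop list, chosen only for this example.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("for", "the", "of"));
    // Hypothetical placeholder; it only has to be a token that never occurs in real text.
    static final String PLACEHOLDER = "#stop#";

    // Whitespace tokenizer that replaces each stop word with the placeholder,
    // so every token keeps its position and no gap is left behind.
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            out.add(STOP.contains(t) ? PLACEHOLDER : t);
        }
        return out;
    }

    // Exact phrase match: the analyzed phrase must occur as a contiguous run
    // of tokens in the analyzed document.
    static boolean phraseMatches(String doc, String phrase) {
        return Collections.indexOfSubList(analyze(doc), analyze(phrase)) >= 0;
    }

    public static void main(String[] args) {
        String query = "calculator for evaluating";
        System.out.println(phraseMatches("calculator for evaluating", query)); // true
        System.out.println(phraseMatches("calculator xyz evaluating", query)); // false
        System.out.println(phraseMatches("calculator of evaluating", query));  // true
    }
}
```

In this model a non-stop word such as "xyz" can no longer stand in for a stop word, but any stop word still matches any other stop word, which is one side effect worth keeping in mind.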

 [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
 classic QueryParser
 ---

 Key: LUCENE-5205
 URL: https://issues.apache.org/jira/browse/LUCENE-5205
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Reporter: Tim Allison
  Labels: patch
 Fix For: 4.8

 Attachments: LUCENE-5205-cleanup-tests.patch, 
 LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, 
 LUCENE-5205_dateTestReInitPkgPrvt.patch, 
 LUCENE-5205_improve_stop_word_handling.patch, 
 LUCENE-5205_smallTestMods.patch, LUCENE_5205.patch, 
 SpanQueryParser_v1.patch.gz, patch.txt


 This parser extends QueryParserBase and includes functionality from:
 * Classic QueryParser: most of its syntax
 * SurroundQueryParser: recursive parsing for near and not clauses.
 * ComplexPhraseQueryParser: can handle near queries that include multiterms 
 (wildcard, fuzzy, regex, prefix),
 * AnalyzingQueryParser: has an option to analyze multiterms.
 At a high level, there's a first pass BooleanQuery/field parser and then a 
 span query parser handles all terminal nodes and phrases.
 Same as classic syntax:
 * term: test 
 * fuzzy: roam~0.8, roam~2
 * wildcard: te?t, test*, t*st
 * regex: /[mb]oat/
 * phrase: "jakarta apache"
 * phrase with slop: "jakarta apache"~3
 * default "or" clause: jakarta apache
 * grouping "or" clause: (jakarta apache)
 * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta
 * multiple fields: title:lucene author:hatcher
  
 Main additions in SpanQueryParser syntax vs. classic syntax:
 * Can require "in order" for phrases with slop with the ~> operator: 
 "jakarta apache"~>3
 * Can specify "not near": "fever bieber"!~3,10 ::
 find "fever" but not if "bieber" appears within 3 words before or 10 
 words after it.
 * Fully recursive phrasal queries with [ and ]; as in: [[jakarta 
 apache]~3 lucene]~>4 :: 
 find "jakarta" within 3 words of "apache", and that hit has to be within 
 four words before "lucene"
 * Can also use [] for single level phrasal queries instead of " as in: 
 [jakarta apache]
 * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"~3 
 :: find "apache" and then either "lucene" or "solr" within three words.
 * Can use multiterms in phrasal queries: "jakarta~1 ap*che"~2
 * Did I mention full recursion: [[jakarta~1 ap*che]~2 (solr~ 
 /l[ou]+[cs][en]+/)]~10 :: Find something like "jakarta" within two 
 words of "ap*che" and that hit has to be within ten words of something like 
 "solr" or that "lucene" regex.
 * Can require at least x number of hits at boolean level: apache AND (lucene 
 solr tika)~2
 * Can use negative only query: -jakarta :: Find all docs that don't contain 
 "jakarta"
 * Can use an edit distance > 2 for fuzzy query via SlowFuzzyQuery (beware of 
 potential performance issues!).
 Trivial additions:
 * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance = 1, 
 prefix = 2)
 * Can specify Optimal String Alignment (OSA) vs Levenshtein for distance 
 <= 2: jakarta~1 (OSA) vs jakarta~>1 (Levenshtein)
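For readers unfamiliar with the difference between the two distance measures named in the last bullet, here is a small illustrative sketch. It is a textbook dynamic-programming implementation in plain Java, not the code Lucene uses for fuzzy queries; the class and method names are made up for this example. OSA counts a swap of two adjacent characters as one edit, whereas plain Levenshtein needs two.

```java
// Textbook edit-distance sketch; not Lucene's implementation.
class EditDistanceDemo {
    // Classic DP table. When allowTransposition is true, a swap of two
    // adjacent characters costs 1 (Optimal String Alignment); otherwise
    // only insert, delete and substitute are allowed (Levenshtein).
    static int distance(String s, String t, boolean allowTransposition) {
        int[][] d = new int[s.length() + 1][t.length() + 1];
        for (int i = 0; i <= s.length(); i++) d[i][0] = i;
        for (int j = 0; j <= t.length(); j++) d[0][j] = j;
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                if (allowTransposition && i > 1 && j > 1
                        && s.charAt(i - 1) == t.charAt(j - 2)
                        && s.charAt(i - 2) == t.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[s.length()][t.length()];
    }

    static int levenshtein(String s, String t) { return distance(s, t, false); }

    static int osa(String s, String t) { return distance(s, t, true); }

    public static void main(String[] args) {
        // "jakrata" is "jakarta" with the adjacent letters "ar" swapped.
        System.out.println(levenshtein("jakarta", "jakrata")); // 2
        System.out.println(osa("jakarta", "jakrata"));         // 1
    }
}
```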
 This parser can be very useful for concordance tasks (see also LUCENE-5317 
 and LUCENE-5318) and for analytical search.  
 Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
 Most of the documentation is in the javadoc for SpanQueryParser.
 Any and all feedback is welcome.  Thank you.
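The "not near" operator in the list above (fever, but not if bieber appears within 3 words before or 10 words after it) can be illustrated over a plain token array. This is a sketch of the described behaviour only, not the parser's implementation: the class and method names are hypothetical, and treating both window bounds as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the "not near" semantics; not the SpanQueryParser code.
class NotNearDemo {
    // Returns the positions of `include` that have no `exclude` token within
    // `pre` positions before or `post` positions after them (bounds inclusive,
    // which is an assumption made for this sketch).
    static List<Integer> notNear(String[] tokens, String include, String exclude,
                                 int pre, int post) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(include)) continue;
            int from = Math.max(0, i - pre);
            int to = Math.min(tokens.length - 1, i + post);
            boolean blocked = false;
            for (int j = from; j <= to && !blocked; j++) {
                if (j != i && tokens[j].equals(exclude)) blocked = true;
            }
            if (!blocked) hits.add(i);
        }
        return hits;
    }

    public static void main(String[] args) {
        // "bieber" is 4 positions before "fever": outside the window of 3.
        System.out.println(notNear("bieber a b c fever".split(" "), "fever", "bieber", 3, 10)); // [4]
        // "bieber" is 3 positions before "fever": inside the window, so no hit.
        System.out.println(notNear("bieber a b fever".split(" "), "fever", "bieber", 3, 10));   // []
    }
}
```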



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-12 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931580#comment-13931580
 ] 

Nikhil Chhaochharia commented on LUCENE-5205:
-

We will try reducing the stop words to some impossible token and report back in 
a few days.

We need the user fields and a few other features of the edismax parser, hence 
we have modified it to send only 'phrase' queries to SpanQueryParser. It's a 
huge hack, but we would like to include this functionality without the overhead 
of building our own parser from scratch.




[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-11 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930253#comment-13930253
 ] 

Nikhil Chhaochharia commented on LUCENE-5205:
-

Looks good - we will be testing this over the next few days and will report 
back if we find any issues.

With StopFilter removed, the index size increased by 20% and there was no 
appreciable increase in the indexing time.
With StopFilter replaced by a SynonymFilter (all stopwords as synonyms), the 
index size almost doubled and the indexing time more than tripled. We will 
probably not be going forward with this option. (In my earlier comment, I had 
mistakenly quoted the stats for an index with the StopFilter removed.)




[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-07 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924671#comment-13924671
 ] 

Nikhil Chhaochharia commented on LUCENE-5205:
-

PhraseQuery does not guarantee that a false hit will be a stop word - if the 
data contains 'calculator xyz evaluating' and we search for "calculator for 
evaluating", then it will match.
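To make the false hit concrete, here is a small self-contained sketch in plain Java. It is a toy model, not Lucene code: it assumes that stop words are dropped while their positions are kept as gaps, and the class name and stop list are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of phrase matching after stop-word removal; not Lucene code.
class StopGapDemo {
    // Hypothetical stop list, chosen only for this example.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("for", "the", "a"));

    // Whitespace tokenizer. A removed stop word becomes null so that the
    // tokens after it keep their original positions (a position gap).
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            out.add(STOP.contains(t) ? null : t);
        }
        return out;
    }

    // Exact phrase match: each non-gap query token must sit at the same
    // relative position in the document; a gap in the query matches anything.
    static boolean phraseMatches(String doc, String phrase) {
        List<String> d = analyze(doc);
        List<String> q = analyze(phrase);
        for (int start = 0; start + q.size() <= d.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < q.size() && ok; i++) {
                if (q.get(i) != null && !q.get(i).equals(d.get(start + i))) ok = false;
            }
            if (ok) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String query = "calculator for evaluating";
        System.out.println(phraseMatches("calculator for evaluating", query)); // true (intended hit)
        System.out.println(phraseMatches("calculator xyz evaluating", query)); // true (false hit)
        System.out.println(phraseMatches("calculator evaluating", query));     // false
    }
}
```

The second call is the problem case: the gap left by 'for' in the query lines up with 'xyz' in the document, so the phrase matches even though 'xyz' is not a stop word.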

I think that replacing the StopFilter with a SynonymFilter where all the stop 
words are synonyms of each other may work well. Initial tests by my team 
indicate that it behaves as expected. The index size does increase by about 
20%, but we can live with that. Are there any side effects that we are 
missing?




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-03-03 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917848#comment-13917848
 ] 

Nikhil Chhaochharia commented on LUCENE-1486:
-

It looks like there is a problem with stopwords also - a query like "A for B" 
where 'for' is a stopword is parsed as "A B" and does not match documents 
containing "A for B".

 Wildcards, ORs etc inside Phrase queries
 

 Key: LUCENE-1486
 URL: https://issues.apache.org/jira/browse/LUCENE-1486
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Affects Versions: 2.4
Reporter: Mark Harwood
Priority: Minor
 Fix For: 4.7

 Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, 
 LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
 LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, 
 TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, 
 junit_complex_phrase_qp_07_22_2009.patch


 An extension to the default QueryParser that overrides the parsing of 
 PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
 The implementation feels a little hacky - this is arguably better handled in 
 QueryParser itself. This works as a proof of concept for much of the query 
 parser syntax. Examples from the Junit test include:
   checkMatches("\"j*   smyth~\"", "1,2"); // wildcards and fuzzies are OK in phrases
   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works
   
   checkBadQuery("\"jo*  id:1 smith\""); // mixing fields in a phrase is bad
   checkBadQuery("\"jo* \"smith\" \""); // phrases inside phrases is bad
   checkBadQuery("\"jo* [sma TO smZ]\" \""); // range queries inside phrases not supported
 Code plus Junit test to follow...






[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-02-19 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905074#comment-13905074
 ] 

Nikhil Chhaochharia commented on LUCENE-1486:
-

LUCENE-5205 is very interesting, thanks for pointing me to it. 

However, we should still try to get LUCENE-1486 closed - most of the work has 
already been done and it may be useful in certain cases where the full power of 
LUCENE-5205 is not required.




[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2014-02-17 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903081#comment-13903081
 ] 

Nikhil Chhaochharia commented on LUCENE-1486:
-

The patch posted by Ahmet Arslan on 8th Feb 2012 looks good to me. I have been 
using it in production for some time and did not find any issues.

I will request a committer to kindly look into this and help get this included 
into Solr 4.7.  If any further work is required, then I will be happy to give 
it a shot.




[jira] Created: (SOLR-734) NPE in SolrCore

2008-08-28 Thread Nikhil Chhaochharia (JIRA)
NPE in SolrCore
---

 Key: SOLR-734
 URL: https://issues.apache.org/jira/browse/SOLR-734
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Nikhil Chhaochharia
 Fix For: 1.3



SolrCore.getStatistics() calls 
getCoreDescriptor().getCoreContainer().getCoreNames(this) without checking if 
the CoreDescriptor is null.
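A minimal sketch of the kind of null guard the report implies is shown below. It does not use the real Solr classes: `CoreStub`, `CoreDescriptorStub`, `CoreContainerStub` and the method `coreNamesForStatistics` are hypothetical stand-ins that only mirror the call chain named above, and returning an empty list when no descriptor is present is an assumption, not the fix that was committed.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the Solr types named in the report; the real
// classes have different constructors and many more members.
class CoreContainerStub {
    List<String> getCoreNames(Object core) {
        return Collections.singletonList("core0");
    }
}

class CoreDescriptorStub {
    private final CoreContainerStub container;

    CoreDescriptorStub(CoreContainerStub container) {
        this.container = container;
    }

    CoreContainerStub getCoreContainer() {
        return container;
    }
}

class CoreStub {
    private final CoreDescriptorStub descriptor; // may be null, as in the report

    CoreStub(CoreDescriptorStub descriptor) {
        this.descriptor = descriptor;
    }

    CoreDescriptorStub getCoreDescriptor() {
        return descriptor;
    }

    // Null-safe version of the chained call: check each link before
    // dereferencing it and fall back to an empty list.
    List<String> coreNamesForStatistics() {
        CoreDescriptorStub cd = getCoreDescriptor();
        if (cd == null || cd.getCoreContainer() == null) {
            return Collections.emptyList();
        }
        return cd.getCoreContainer().getCoreNames(this);
    }
}

class NullGuardDemo {
    public static void main(String[] args) {
        // A core created without a descriptor no longer throws.
        System.out.println(new CoreStub(null).coreNamesForStatistics()); // []
        CoreStub withDescriptor = new CoreStub(new CoreDescriptorStub(new CoreContainerStub()));
        System.out.println(withDescriptor.coreNamesForStatistics()); // [core0]
    }
}
```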


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-734) NPE in SolrCore

2008-08-28 Thread Nikhil Chhaochharia (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12626537#action_12626537
 ] 

Nikhil Chhaochharia commented on SOLR-734:
--


I have a SolrConfig object and an IndexSchema object.  I was using them to 
create an instance of SolrCore.  Passing null as the CoreDescriptor worked 
at least until the 14th-Aug nightly.

I want to get an instance of SolrCore and am slightly confused by the 
CoreDescriptor, CoreContainer etc. that have been recently introduced.  The 
best thing for me would be a code snippet which shows how to create a SolrCore 
if I have a SolrConfig object and an IndexSchema object.

BTW, I have also posted this issue on the mailing list, and it is being 
discussed there as well.

