Sorry, I made a mistake when copy-pasting. Let me correct my previous mail.

> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> STATES".

1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES"

----
As far as I can tell, this query correctly finds the indexed document
(so I have no idea what is wrong with the fuzzy query).

+contentDFLT:mains~2 +contentDFLT:"nashua" +contentDFLT:"new-hampshire" +contentDFLT:"united states"

I am
- using Lucene 8.1.
- using the standard analyzer for both indexing and searching.
- using the classic query parser for parsing.

On Thu, Jun 13, 2019 at 23:18, <baris.ka...@oracle.com> wrote:
>
> However, the index does not have MAINS but MAIN for the expected entry.
>
> Best regards
>
>
> On 6/13/19 10:33 AM, baris.ka...@oracle.com wrote:
> > does it consider it as like plural word? :) :) :)
> > That makes sense.
> >
> > Best regards
> >
> >
> > On 6/13/19 10:31 AM, baris.ka...@oracle.com wrote:
> >> Erick,
> >>
> >> Cool, could You give a simple example with my example please?
> >>
> >> Best regards
> >>
> >>
> >>
> >> On 6/13/19 10:12 AM, Erick Erickson wrote:
> >>> Shot in the dark: stemming. Whenever I see a problem with something
> >>> ending in “s” (or “er” or “ing” or….) my first suspect is that
> >>> stemming is turned on. In that case the token in the index that’s
> >>> actually searched on is somewhat different than you expect.
> >>>
> >>> The test is easy, just ensure your fieldType contains no stemmers.
> >>> PorterStemmer is particularly aggressive, but for this case to test
> >>> I’d just remove all stemming, re-index and see if the results differ.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>>> On Jun 13, 2019, at 7:26 AM, baris.ka...@oracle.com wrote:
> >>>>
> >>>> Tomoko,-
> >>>>
> >>>> That is strange indeed.
> >>>>
> >>>> Something is wrong when i use mains but maink, mainl, mainr, mainq,
> >>>> maint all work ok; any consonant at the end except s works in this
> >>>> case.
> >>>>
> >>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
> >>>>
> >>>> i am using fuzzy query with ~ from Query.builder and that is not
> >>>> PhraseQuery.
> >>>>
> >>>> Similarly FuzzyQuery with input "mains" (it has to be lowercase
> >>>> since it does not go through StandardAnalyzer) is also not
> >>>> PhraseQuery.
> >>>>
> >>>> can there be a clearer sample case for ComplexPhraseQuery please in
> >>>> the docs?
> >>>>
> >>>> did You also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
> >>>> STATES" the expected output in this case?
> >>>>
> >>>> Thanks for spending time on this, i would like to thank everyone.
> >>>>
> >>>> Best regards
> >>>>
> >>>>
> >>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
> >>>>> Hi,
> >>>>>
> >>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>> It looks strange to me. I did some test locally.
> >>>>>
> >>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
> >>>>> UNITED STATES".
> >>>>>
> >>>>> 2a. This query string (just copied from your Case #3) worked correctly
> >>>>> for me as far as I can see.
> >>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
> >>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
> >>>>>
> >>>>> 2b. However this query string got no results.
> >>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
> >>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
> >>>>> It is an expected behaviour because the classic query parser does not
> >>>>> support fuzzy query inside phrase query (as far as I know).
> >>>>>
> >>>>> I suspect you use fuzzy query operator (~) inside phrase query ("), as
> >>>>> the 2b case.
> >>>>>
> >>>>> FYI: there is a special parser for such complex phrase query.
> >>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_complexPhrase_ComplexPhraseQueryParser.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=ZcXpaSlwS5DegX76mHTb_6DH3P7noan1eeMXc-Vh5M8&s=FoIMlcjDO2b7Gut9XRx-NIBWiBQWItsj8IlylJC7Wkc&e=
> >>>>>
> >>>>>
> >>>>> Tomoko
> >>>>>
> >>>>> On Thu, Jun 13, 2019 at 6:16, <baris.ka...@oracle.com> wrote:
> >>>>>> Ok, i think only this very specific only "mains" has an issue.
> >>>>>>
> >>>>>> all i knew about Lucene was fine :) Great...
> >>>>>>
> >>>>>> i have one more question:
> >>>>>>
> >>>>>> which one is advised to use: FuzzyQuery or the Query.parser with
> >>>>>> search string~ appended?
> >>>>>>
> >>>>>> The second one will go through analyzer and make search string
> >>>>>> lowercase.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/12/19 1:03 PM, baris.ka...@oracle.com wrote:
> >>>>>>
> >>>>>> Hi again,-
> >>>>>>
> >>>>>> this is really interesting and i hope i am missing something.
> >>>>>> The index lowercases all entries, so case sensitivity is not an issue
> >>>>>> i think.
> >>>>>> > >>>>>> Case #1: > >>>>>> > >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new > >>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, > >>>>>> phraseAnalyzer) ; > >>>>>> Query q1 = null; > >>>>>> try { > >>>>>> q1 = parser.parse("Main"); > >>>>>> } catch (ParseException e) { > >>>>>> e.printStackTrace(); > >>>>>> } > >>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "NASHUA"), BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); > >>>>>> > >>>>>> > >>>>>> This brings with this: > >>>>>> > >>>>>> query plan: > >>>>>> > >>>>>> [+contentDFLT:main, +contentDFLT:"nashua", > >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] > >>>>>> > >>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after > >>>>>> exec finished) > >>>>>> > >>>>>> Number of results: 12 > >>>>>> Name: Main Dunstable Rd > >>>>>> Score: 41.204945 > >>>>>> ID: 12677400 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.72631, -71.50269 > >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE > >>>>>> UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 12681980 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.76416, -71.46681 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 12681973 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.75045, -71.4607 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 12681974 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.76019, -71.465 > >>>>>> Search Key: 
MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main Dunstable Rd > >>>>>> Score: 41.204945 > >>>>>> ID: 12677399 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.74641, -71.48943 > >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE > >>>>>> UNITED STATES > >>>>>> > >>>>>> Name: S Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 11893215 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.73412, -71.44797 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 12681978 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.73492, -71.44951 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: S Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 11893214 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.73958, -71.45895 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 12681979 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.76416, -71.46681 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.204945 > >>>>>> ID: 12681977 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.747, -71.45957 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> > >>>>>> > >>>>>> Case #2 > >>>>>> > >>>>>> When i did this it also worked by adding ~ to make it Fuzzy query > >>>>>> to Main word: > >>>>>> > >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new > >>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, > >>>>>> phraseAnalyzer) ; > >>>>>> Query q1 = null; > >>>>>> try { > >>>>>> q1 = parser.parse("Main~"); > >>>>>> } catch (ParseException e) { > >>>>>> e.printStackTrace(); > >>>>>> } > >>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST); > >>>>>> 
booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "NASHUA"), BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); > >>>>>> > >>>>>> > >>>>>> query plan: > >>>>>> > >>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua", > >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] > >>>>>> > >>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging > >>>>>> stops) > >>>>>> Number of results: 12 > >>>>>> Name: Main Dunstable Rd > >>>>>> Score: 41.06405 > >>>>>> ID: 12677400 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.72631, -71.50269 > >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE > >>>>>> UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 12681980 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.76416, -71.46681 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 12681973 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.75045, -71.4607 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 12681974 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.76019, -71.465 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main Dunstable Rd > >>>>>> Score: 41.06405 > >>>>>> ID: 12677399 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.74641, -71.48943 > >>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE > >>>>>> UNITED STATES > >>>>>> > >>>>>> Name: S Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 11893215 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.73412, -71.44797 > >>>>>> Search Key: MAIN NASHUA 
HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 12681978 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.73492, -71.44951 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: S Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 11893214 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.73958, -71.45895 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 12681979 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.76416, -71.46681 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Main St > >>>>>> Score: 41.06405 > >>>>>> ID: 12681977 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.747, -71.45957 > >>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> Case #3 > >>>>>> > >>>>>> But why does this not work with fuzzy mode and i misspelled a bit > >>>>>> (1 edit away) and as You saw the data is there with Main spelling: > >>>>>> > >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new > >>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, > >>>>>> phraseAnalyzer) ; > >>>>>> > >>>>>> Query q1 = null; > >>>>>> try { > >>>>>> q1 = parser.parse("Mains~"); // 1 edit away > >>>>>> } catch (ParseException e) { > >>>>>> e.printStackTrace(); > >>>>>> } > >>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "NASHUA"), BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); > >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, > >>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); > >>>>>> > >>>>>> query plan: > >>>>>> > >>>>>> [+contentDFLT:mains~2, 
+contentDFLT:"nashua", > >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] > >>>>>> > >>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging > >>>>>> stops) > >>>>>> > >>>>>> Number of results: 0 > >>>>>> > >>>>>> > >>>>>> > >>>>>> Case #4 > >>>>>> > >>>>>> Then i changed q1 to SHOULD from MUST above: and i think fuzzy > >>>>>> query is ignored here since there is no MAIN in the first 468 > >>>>>> resuls: > >>>>>> > >>>>>> there is no boost for Mains term here. > >>>>>> > >>>>>> query plan: > >>>>>> > >>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua", > >>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] > >>>>>> > >>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging > >>>>>> stops) > >>>>>> Number of results: 1794 > >>>>>> Name: Nashua Dr > >>>>>> Score: 34.186226 > >>>>>> ID: 4974936 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.7636, -71.46063 > >>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Nashua River Rail Trl > >>>>>> Score: 34.186226 > >>>>>> ID: 4975508 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.7062, -71.53962 > >>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE > >>>>>> UNITED STATES > >>>>>> > >>>>>> Name: Nashua Rd > >>>>>> Score: 33.84896 > >>>>>> ID: 4975388 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.78746, -71.92823 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: NASHUA > >>>>>> Score: 33.84896 > >>>>>> ID: 21014865 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.75873, -71.46438 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: NASHUA > >>>>>> Score: 33.84896 > >>>>>> ID: 21014865 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.75873, -71.46438 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: NASHUA > >>>>>> Score: 33.84896 > >>>>>> ID: 21014865 > >>>>>> 
Country Code: US > >>>>>> Coordinates: 42.75873, -71.46438 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: NASHUA > >>>>>> Score: 33.84896 > >>>>>> ID: 21014865 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.75873, -71.46438 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: NASHUA > >>>>>> Score: 33.84896 > >>>>>> ID: 21014865 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.75873, -71.46438 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Nashua St > >>>>>> Score: 33.84896 > >>>>>> ID: 4975671 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.88471, -70.81687 > >>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> Name: Nashua Rd > >>>>>> Score: 33.84896 > >>>>>> ID: 4975400 > >>>>>> Country Code: US > >>>>>> Coordinates: 42.79014, -71.92364 > >>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES > >>>>>> > >>>>>> > >>>>>> Why is the fuzzy query ignored? > >>>>>> Even if i have separate fields for street, city,region, country, > >>>>>> this fuzzy query issue will come into place for words with > >>>>>> multiple parts like main dunstable etc., right? > >>>>>> > >>>>>> Best regards > >>>>>> > >>>>>> On 6/12/19 11:36 AM, baris.ka...@oracle.com wrote: > >>>>>> > >>>>>> Tomoko,- > >>>>>> > >>>>>> Thank You for Your suggestions. i am trying to understand it > >>>>>> and i thought i did :) > >>>>>> > >>>>>> but it does not work with FuzzyQuery when i used with a *single* > >>>>>> large TextField like street=...value... city=...value... > >>>>>> region=...value... country=...value... (with or without quotes > >>>>>> for the values) > >>>>>> > >>>>>> What i knew about Lucene fuzzy queries are not holding now with > >>>>>> this Textfield form. That is why i suspected of a bug. > >>>>>> > >>>>>> 1. Yes, i saw and have a solid proof on that now. > >>>>>> > >>>>>> 2. 
yes but FuzzyQuery takes quotes as they are, as they are
> >>>>>> escaped and it is not analyzed.
> >>>>>>
> >>>>>> Stuffing into one textfield vs having separate fields should only
> >>>>>> affect probably the performance but not the outcome in my case.
> >>>>>> But, i have been thinking about this and maybe it is the way to
> >>>>>> go in this case.
> >>>>>>
> >>>>>> My CONTENT field has street names in mixed case and city, region,
> >>>>>> country names in UPPERCASE. Can this be a problem?
> >>>>>> i thought index stored them in lowercase since i am using
> >>>>>> StandardAnalyzer.
> >>>>>>
> >>>>>> CONTENT field also has full textfield string with street=...
> >>>>>> city=... region=... country=... (here all values are UPPERCASE).
> >>>>>>
> >>>>>> Why can't the index find the names via FuzzyQuery? i tried both
> >>>>>> FuzzyQuery and Query builder as i showed before.
> >>>>>>
> >>>>>> The last advice in Your previous email would nicely go outside
> >>>>>> the parentheses since it might be very critical :) :) :)
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
> >>>>>>
> >>>>>> I'd suggest to correctly understand the way a software works before
> >>>>>> suspecting its bug :-)
> >>>>>>
> >>>>>> I guess you may miss two points:
> >>>>>>
> >>>>>> 1. the standard analyzer (standard tokenizer) breaks words by double
> >>>>>> quote (U+0022) so quotes are not indexed or searched at all if you are
> >>>>>> using standard analyzer. (That is the reason you have same results
> >>>>>> with or without quotes.)
> >>>>>> See:
> >>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
> >>>>>> and
> >>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
> >>>>>>
> >>>>>> 2. double quote has special meaning (it's interpreted as phrase query)
> >>>>>> with the built-in query parser so you need to escape it if you want to
> >>>>>> search double quotes itself.
> >>>>>> See:
> >>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
> >>>>>>
> >>>>>> (My advice would be to create separate fields for each key value pairs
> >>>>>> instead of stuffing all pairs into one text field, if you need to
> >>>>>> search them separately.)
> >>>>>>
> >>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.ka...@oracle.com> wrote:
> >>>>>>
> >>>>>> i can say that quotes is not the issue with index as it still results in
> >>>>>> same results with quotes or without quotes.
> >>>>>>
> >>>>>> i am starting to feel that this might be a bug maybe??
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 2:46 PM, baris.ka...@oracle.com wrote:
> >>>>>>
> >>>>>> Somehow " is causing an issue as this should return street with
> >>>>>> MAIN:
> >>>>>>
> >>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
> >>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
> >>>>>> states"] -> this was with fuzzyquery on MAINS
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 2:24 PM, baris.ka...@oracle.com wrote:
> >>>>>>
> >>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>> contentDFLT:mains]
> >>>>>>
> >>>>>> QueryParser chops it into two pieces from
> >>>>>> parser.parse("street=\"MAINS\"");
> >>>>>>
> >>>>>> Index has a TextField named contentDFLT with the following data:
> >>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
> >>>>>> HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>>
> >>>>>> When i set street=\"MAINS~\" with parser:
> >>>>>> i get the following
> >>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
> >>>>>> +contentDFLT:"country united states", contentDFLT:street
> >>>>>> contentDFLT:mains]
> >>>>>>
> >>>>>> probably " quotations are messing this up as You were saying...
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> >>>>>>
> >>>>>> Or, " (double quotation) in your query string may affect query
> >>>>>> parsing.
> >>>>>>
> >>>>>> When I parse this string by classic query parser (lucene 8.1),
> >>>>>> street="MAINS~"
> >>>>>> parsed (raw) query is
> >>>>>> text:street text:mains
> >>>>>> (I set the default search field to "text", so text:xxxx is appeared
> >>>>>> here.)
> >>>>>>
> >>>>>> Query parsing is a complex process, so it would be good to check
> >>>>>> parsed raw query string especially when you have (reserved) special
> >>>>>> characters in your query...
> >>>>>>
> >>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I noticed one small thing in your previous mail.
> >>>>>>
> >>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> >>>>>>
> >>>>>> which is good.
> >>>>>>
> >>>>>> To specify a search field, ":" (colon) should be used instead of "=".
> >>>>>> See the query parser documentation:
> >>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I'm not sure this is related to your problem.
> >>>>>>
> >>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.ka...@oracle.com> wrote:
> >>>>>>
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
> >>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
> >>>>>> phraseAnalyzer);
> >>>>>> Query q1 = null;
> >>>>>> try {
> >>>>>>     q1 = parser.parse("MAIN");
> >>>>>> } catch (ParseException e) {
> >>>>>>     e.printStackTrace();
> >>>>>> }
> >>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
> >>>>>>
> >>>>>> testQuerySearch2 Time to compute: 0 seconds
> >>>>>> Number of results: 1775
> >>>>>> Name: Main St
> >>>>>> Score: 37.20959
> >>>>>> ID: 12681979
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.76416, -71.46681
> >>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 37.20959
> >>>>>> ID: 12681977
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.747, -71.45957
> >>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>> Name: Main St
> >>>>>> Score: 37.20959
> >>>>>> ID: 12681978
> >>>>>> Country Code: US
> >>>>>> Coordinates: 42.73492, -71.44951
> >>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
> >>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
> >>>>>>
> >>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
> >>>>>> which is good.
> >>>>>>
> >>>>>> But when i switch to MAINS~ then fuzzy query does not work.
> >>>>>>
> >>>>>>
> >>>>>> i need to say something with the q1 only in the booleanquery:
> >>>>>> it tries to match the MAIN in street, city, region and country which are
> >>>>>> in a single TextField field.
> >>>>>> But i don't want this. that is why i need to street="..." etc when
> >>>>>> searching.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> just for the basic verification, can you find the document without
> >>>>>> fuzzy query? I mean, does this query work for you?
> >>>>>>
> >>>>>> Query query = parser.parse("MAIN");
> >>>>>>
> >>>>>> Tomoko
> >>>>>>
> >>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.ka...@oracle.com> wrote:
> >>>>>>
> >>>>>> why does the second set not work at all?
> >>>>>>
> >>>>>> it is indexed as Textfield like street="..." city="..." etc.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 11:23 AM, baris.ka...@oracle.com wrote:
> >>>>>>
> >>>>>> i don't know how to use Fuzzyquery with queryparser but probably You
> >>>>>> are suggesting
> >>>>>>
> >>>>>> QueryParser parser = new QueryParser(field, analyzer);
> >>>>>> Query query = parser.parse("MAINS~2");
> >>>>>>
> >>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
> >>>>>>
> >>>>>> am i right?
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
> >>>>>>
> >>>>>> I would suggest using a QueryParser for your fuzzy query before
> >>>>>> adding it to the Boolean query. This should weed out any case
> >>>>>> issues.
> >>>>>>
> >>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.ka...@oracle.com> wrote:
> >>>>>>
> >>>>>> BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
> >>>>>>
> >>>>>> // First set
> >>>>>>
> >>>>>> booleanQuery.add(new FuzzyQuery(new
> >>>>>> org.apache.lucene.index.Term(field, "MAINS")),
> >>>>>> BooleanClause.Occur.SHOULD);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>> "NASHUA"), BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
> >>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
> >>>>>> "UNITED STATES"), BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> // Second set
> >>>>>> //booleanQuery.add(new FuzzyQuery(new
> >>>>>> //org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
> >>>>>> //BooleanClause.Occur.SHOULD);
> >>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>> //field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
> >>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>> //field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
> >>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
> >>>>>> //field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
> >>>>>>
> >>>>>> The first set brings also street with Nashua name. (NASHUA).
> >>>>>>
> >>>>>> so, to prevent that and since i also indexed with street="..."
> >>>>>> city="..." i did the second set but it does not bring anything.
> >>>>>>
> >>>>>> createPhraseQuery builds a Phrasequery with one term equal to the
> >>>>>> string in the call.
> >>>>>>
> >>>>>> Best regards
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 6/10/19 10:47 AM, baris.ka...@oracle.com wrote:
> >>>>>> > How do i check how it is indexed? lowercase or uppercase?
> >>>>>> >
> >>>>>> > the only way now is by testing.
> >>>>>> >
> >>>>>> > i am using standardanalyzer.
> >>>>>> >
> >>>>>> > Best regards
> >>>>>> >
> >>>>>> >
> >>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote:
> >>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
> >>>>>> >> <tomoko.uchida.1...@gmail.com> wrote:
> >>>>>> >>> Hi,
> >>>>>> >>>
> >>>>>> >>> What analyzer do you use for the text field? Is the term "Main"
> >>>>>> >>> correctly indexed?
> >>>>>> >> Agreed. Also, it would be good if you could post your actual code.
> >>>>>> >>
> >>>>>> >> What analyzer are you using? If you are using StandardAnalyzer, then
> >>>>>> >> all of your terms while indexing will be lowercased, AFAIK, but your
> >>>>>> >> query will not be analyzed until you run a QueryParser on it.
> >>>>>> >>
> >>>>>> >>
> >>>>>> >> Atri
> >>>>>> >>
> >>>>>> >
> >>>>>> > ---------------------------------------------------------------------
> >>>>>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >>>>>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
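
P.S. For reference, the "1 edit away" claim from Case #3 can be checked without Lucene. What follows is only a minimal sketch under stated assumptions, not Lucene's implementation: EditDistanceCheck, distance and withinMaxEdits are hypothetical names made up for illustration, and the code computes the plain Levenshtein distance. By default FuzzyQuery also counts a swap of two adjacent characters as one edit, which this sketch does not model. It also assumes the terms are already lowercased, as StandardAnalyzer indexes them.

```java
// Hypothetical helper for illustration only; this is not part of Lucene.
final class EditDistanceCheck {

    /** Plain Levenshtein distance: insertions, deletions and substitutions. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building b[0..j) from "" takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing a[0..i) to "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if indexedTerm would be in range for a query like queryTerm~maxEdits. */
    static boolean withinMaxEdits(String queryTerm, String indexedTerm, int maxEdits) {
        return distance(queryTerm, indexedTerm) <= maxEdits;
    }

    public static void main(String[] args) {
        // Lowercased terms of the indexed text from this mail.
        String[] indexedTerms = {
            "main", "dunstable", "nashua", "hillsborough",
            "new", "hampshire", "united", "states"
        };
        for (String term : indexedTerms) {
            System.out.println("mains vs " + term + ": distance "
                    + distance("mains", term)
                    + ", within 2 edits: " + withinMaxEdits("mains", term, 2));
        }
    }
}
```

The distance between "mains" and "main" is 1, so an indexed term "main" is in range for mains~2. That agrees with what I see locally, and it suggests the zero hits in Case #3 come from somewhere other than the edit distance itself.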