Hello,

Erick explained how to disable stemming in Solr, but I am using plain Lucene. I am researching how to disable it in Lucene as well, but if you already have instructions for doing so, I would appreciate it if you could share them here.

Best regards
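For reference, here is a minimal sketch in plain Java (no Lucene dependency) of the edit-distance arithmetic behind the expectation in Case #3 of the quoted thread: that "mains" is one edit away from the indexed term "main", and so within the ~2 shown in the query plan. The class and method names (EditDistance, levenshtein) are only illustrative and are not part of Lucene. This is classic Levenshtein distance; as far as I know, FuzzyQuery by default also counts a transposition of two adjacent characters as a single edit, which makes no difference for these words. It only confirms the expectation and does not explain the zero results.

```java
public class EditDistance {

    // Classic Levenshtein distance (insert, delete, substitute),
    // computed with two rolling rows of the dynamic-programming table.
    // Hypothetical helper for illustration; not a Lucene API.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // lowercased term, as in the query plans
        for (String q : new String[] {"main", "mains", "maint", "mainstay"}) {
            int d = levenshtein(q, indexed);
            System.out.println(q + " -> " + indexed + ": " + d
                    + (d <= 2 ? " (within 2 edits)" : " (more than 2 edits)"));
        }
    }
}
```

Since "mains" and "maint" are both one edit from "main", edit distance alone does not distinguish the case that fails from the ones that work.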
----- Original Message -----
From: baris.ka...@oracle.com
To: java-user@lucene.apache.org, tomoko.uchida.1...@gmail.com, erickerick...@gmail.com, a...@linux.com, baris.ka...@oracle.com, luc...@mikemccandless.com
Sent: Thursday, June 13, 2019 10:48:47 AM GMT -05:00 US/Canada Eastern
Subject: Re: FuzzyQuery- why is it ignored?

I see. I am using an older version, 6.6, and we should switch to your 8.1 version, or at least 7.x.

Tomoko, I think I understood: you meant MAIN NASHUA .... for the string :)

Again, I really appreciate all the answers.

Another question: how do we disable or enable stemming while indexing? :)

Best regards

On 6/13/19 10:40 AM, Tomoko Uchida wrote:
> Sorry, I made a mistake when copy-pasting. Let me just correct my previous
> mail.
>
>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>> STATES".
> 1. Indexed this text: "MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW
> HAMPSHIRE UNITED STATES"
>
> ----
> As far as I can say, this query correctly finds the indexed document
> (so I have no idea what is wrong with the fuzzy query).
> +contentDFLT:mains~2 +contentDFLT:"nashua"
> +contentDFLT:"new-hampshire" +contentDFLT:"united states"
>
> I am
> - using lucene 8.1.
> - using standard analyzer for both indexing and searching.
> - using classic query parser for parsing.
>
>
>
> On Thu, Jun 13, 2019 at 23:18, <baris.ka...@oracle.com> wrote:
>> However, the index does not have MAINS but MAIN for the expected entry.
>>
>> Best regards
>>
>>
>>
>> On 6/13/19 10:33 AM, baris.ka...@oracle.com wrote:
>>> Does it treat it as a plural word? :) :) :)
>>> That makes sense.
>>>
>>> Best regards
>>>
>>>
>>> On 6/13/19 10:31 AM, baris.ka...@oracle.com wrote:
>>>> Erick,
>>>>
>>>> Cool, could you give a simple example using my example, please?
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/13/19 10:12 AM, Erick Erickson wrote:
>>>>> Shot in the dark: stemming. Whenever I see a problem with something
>>>>> ending in “s” (or “er” or “ing” or….) my first suspect is that
>>>>> stemming is turned on. In that case the token in the index that’s
>>>>> actually searched on is somewhat different than you expect.
>>>>>
>>>>> The test is easy, just ensure your fieldType contains no stemmers.
>>>>> PorterStemmer is particularly aggressive, but for this case to test
>>>>> I’d just remove all stemming, re-index and see if the results differ.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>> On Jun 13, 2019, at 7:26 AM, baris.ka...@oracle.com wrote:
>>>>>>
>>>>>> Tomoko,
>>>>>>
>>>>>> That is strange indeed.
>>>>>>
>>>>>> Something is wrong when I use mains, but maink, mainl, mainr, mainq,
>>>>>> and maint all work OK. Any consonant at the end except s works in
>>>>>> this case.
>>>>>>
>>>>>> Case #3 had +contentDFLT:mains~2 but not +contentDFLT:"mains~2".
>>>>>>
>>>>>> I am using a fuzzy query with ~ from Query.builder and that is not
>>>>>> a PhraseQuery.
>>>>>>
>>>>>> Similarly, FuzzyQuery with input "mains" (it has to be lowercase
>>>>>> since it does not go through StandardAnalyzer) is also not a
>>>>>> PhraseQuery.
>>>>>>
>>>>>> Could there be a clearer sample case for ComplexPhraseQuery in
>>>>>> the docs, please?
>>>>>>
>>>>>> Did you also index "MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED
>>>>>> STATES", the expected output in this case?
>>>>>>
>>>>>> Thanks for spending time on this; I would like to thank everyone.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/13/19 12:13 AM, Tomoko Uchida wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> Ok, i think only this very specific only "mains" has an issue.
>>>>>>> It looks strange to me. I did some tests locally.
>>>>>>>
>>>>>>> 1. Indexed this text: "NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE
>>>>>>> UNITED STATES".
>>>>>>>
>>>>>>> 2a. This query string (just copied from your Case #3) worked
>>>>>>> correctly for me as far as I can see.
>>>>>>> +contentDFLT:mains~2 +contentDFLT:"nashua",
>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united state"
>>>>>>>
>>>>>>> 2b. However, this query string got no results.
>>>>>>> +contentDFLT:"mains~2", +contentDFLT:"nashua",
>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"
>>>>>>> This is expected behaviour because the classic query parser does not
>>>>>>> support a fuzzy query inside a phrase query (as far as I know).
>>>>>>>
>>>>>>> I suspect you use the fuzzy query operator (~) inside a phrase query
>>>>>>> ("), as in the 2b case.
>>>>>>>
>>>>>>> FYI: there is a special parser for such complex phrase queries.
>>>>>>> https://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html
>>>>>>>
>>>>>>>
>>>>>>> Tomoko
>>>>>>>
>>>>>>> On Thu, Jun 13, 2019 at 6:16, <baris.ka...@oracle.com> wrote:
>>>>>>>> Ok, I think only this very specific "mains" has an issue.
>>>>>>>>
>>>>>>>> All I knew about Lucene was fine :) Great...
>>>>>>>>
>>>>>>>> I have one more question:
>>>>>>>>
>>>>>>>> Which one is advised: FuzzyQuery, or the Query.parser with
>>>>>>>> search string~ appended?
>>>>>>>>
>>>>>>>> The second one will go through the analyzer and make the search
>>>>>>>> string lowercase.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/19 1:03 PM, baris.ka...@oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi again,
>>>>>>>>
>>>>>>>> This is really interesting and I hope I am missing something.
>>>>>>>> The index lowercases all entries, so case sensitivity is not an
>>>>>>>> issue, I think.
>>>>>>>> >>>>>>>> Case #1: >>>>>>>> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, >>>>>>>> phraseAnalyzer) ; >>>>>>>> Query q1 = null; >>>>>>>> try { >>>>>>>> q1 = parser.parse("Main"); >>>>>>>> } catch (ParseException e) { >>>>>>>> e.printStackTrace(); >>>>>>>> } >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); >>>>>>>> >>>>>>>> >>>>>>>> This brings with this: >>>>>>>> >>>>>>>> query plan: >>>>>>>> >>>>>>>> [+contentDFLT:main, +contentDFLT:"nashua", >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] >>>>>>>> >>>>>>>> testQuerySearch1 Time to compute: 0 seconds (copied answer after >>>>>>>> exec finished) >>>>>>>> >>>>>>>> Number of results: 12 >>>>>>>> Name: Main Dunstable Rd >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12677400 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.72631, -71.50269 >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE >>>>>>>> UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12681980 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76416, -71.46681 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12681973 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.75045, -71.4607 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12681974 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76019, -71.465 >>>>>>>> Search Key: 
MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main Dunstable Rd >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12677399 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.74641, -71.48943 >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE >>>>>>>> UNITED STATES >>>>>>>> >>>>>>>> Name: S Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 11893215 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73412, -71.44797 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12681978 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73492, -71.44951 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: S Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 11893214 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73958, -71.45895 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12681979 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76416, -71.46681 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.204945 >>>>>>>> ID: 12681977 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.747, -71.45957 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Case #2 >>>>>>>> >>>>>>>> When i did this it also worked by adding ~ to make it Fuzzy query >>>>>>>> to Main word: >>>>>>>> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, >>>>>>>> phraseAnalyzer) ; >>>>>>>> Query q1 = null; >>>>>>>> try { >>>>>>>> q1 = parser.parse("Main~"); >>>>>>>> } catch (ParseException e) { >>>>>>>> e.printStackTrace(); >>>>>>>> } >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST); >>>>>>>> 
booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); >>>>>>>> >>>>>>>> >>>>>>>> query plan: >>>>>>>> >>>>>>>> [+contentDFLT:main~2, +contentDFLT:"nashua", >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] >>>>>>>> >>>>>>>> testQuerySearch1 Time to compute: 24 seconds (due to debugging >>>>>>>> stops) >>>>>>>> Number of results: 12 >>>>>>>> Name: Main Dunstable Rd >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12677400 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.72631, -71.50269 >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE >>>>>>>> UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12681980 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76416, -71.46681 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12681973 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.75045, -71.4607 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12681974 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76019, -71.465 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main Dunstable Rd >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12677399 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.74641, -71.48943 >>>>>>>> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE >>>>>>>> UNITED STATES >>>>>>>> >>>>>>>> Name: S Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 11893215 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73412, -71.44797 >>>>>>>> Search Key: MAIN NASHUA 
HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12681978 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73492, -71.44951 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: S Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 11893214 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73958, -71.45895 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12681979 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76416, -71.46681 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 41.06405 >>>>>>>> ID: 12681977 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.747, -71.45957 >>>>>>>> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Case #3 >>>>>>>> >>>>>>>> But why does this not work with fuzzy mode and i misspelled a bit >>>>>>>> (1 edit away) and as You saw the data is there with Main spelling: >>>>>>>> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, >>>>>>>> phraseAnalyzer) ; >>>>>>>> >>>>>>>> Query q1 = null; >>>>>>>> try { >>>>>>>> q1 = parser.parse("Mains~"); // 1 edit away >>>>>>>> } catch (ParseException e) { >>>>>>>> e.printStackTrace(); >>>>>>>> } >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); >>>>>>>> >>>>>>>> query plan: >>>>>>>> >>>>>>>> [+contentDFLT:mains~2, 
+contentDFLT:"nashua", >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] >>>>>>>> >>>>>>>> testQuerySearch1 Time to compute: 23 seconds (due to debugging >>>>>>>> stops) >>>>>>>> >>>>>>>> Number of results: 0 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Case #4 >>>>>>>> >>>>>>>> Then i changed q1 to SHOULD from MUST above: and i think fuzzy >>>>>>>> query is ignored here since there is no MAIN in the first 468 >>>>>>>> resuls: >>>>>>>> >>>>>>>> there is no boost for Mains term here. >>>>>>>> >>>>>>>> query plan: >>>>>>>> >>>>>>>> [contentDFLT:mains~2, +contentDFLT:"nashua", >>>>>>>> +contentDFLT:"new-hampshire", +contentDFLT:"united states"] >>>>>>>> >>>>>>>> testQuerySearch1 Time to compute: 125 seconds (due to debugging >>>>>>>> stops) >>>>>>>> Number of results: 1794 >>>>>>>> Name: Nashua Dr >>>>>>>> Score: 34.186226 >>>>>>>> ID: 4974936 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.7636, -71.46063 >>>>>>>> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Nashua River Rail Trl >>>>>>>> Score: 34.186226 >>>>>>>> ID: 4975508 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.7062, -71.53962 >>>>>>>> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE >>>>>>>> UNITED STATES >>>>>>>> >>>>>>>> Name: Nashua Rd >>>>>>>> Score: 33.84896 >>>>>>>> ID: 4975388 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.78746, -71.92823 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: NASHUA >>>>>>>> Score: 33.84896 >>>>>>>> ID: 21014865 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.75873, -71.46438 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: NASHUA >>>>>>>> Score: 33.84896 >>>>>>>> ID: 21014865 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.75873, -71.46438 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: NASHUA >>>>>>>> Score: 33.84896 >>>>>>>> ID: 21014865 >>>>>>>> 
Country Code: US >>>>>>>> Coordinates: 42.75873, -71.46438 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: NASHUA >>>>>>>> Score: 33.84896 >>>>>>>> ID: 21014865 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.75873, -71.46438 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: NASHUA >>>>>>>> Score: 33.84896 >>>>>>>> ID: 21014865 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.75873, -71.46438 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Nashua St >>>>>>>> Score: 33.84896 >>>>>>>> ID: 4975671 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.88471, -70.81687 >>>>>>>> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> Name: Nashua Rd >>>>>>>> Score: 33.84896 >>>>>>>> ID: 4975400 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.79014, -71.92364 >>>>>>>> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES >>>>>>>> >>>>>>>> >>>>>>>> Why is the fuzzy query ignored? >>>>>>>> Even if i have separate fields for street, city,region, country, >>>>>>>> this fuzzy query issue will come into place for words with >>>>>>>> multiple parts like main dunstable etc., right? >>>>>>>> >>>>>>>> Best regards >>>>>>>> >>>>>>>> On 6/12/19 11:36 AM, baris.ka...@oracle.com wrote: >>>>>>>> >>>>>>>> Tomoko,- >>>>>>>> >>>>>>>> Thank You for Your suggestions. i am trying to understand it >>>>>>>> and i thought i did :) >>>>>>>> >>>>>>>> but it does not work with FuzzyQuery when i used with a *single* >>>>>>>> large TextField like street=...value... city=...value... >>>>>>>> region=...value... country=...value... (with or without quotes >>>>>>>> for the values) >>>>>>>> >>>>>>>> What i knew about Lucene fuzzy queries are not holding now with >>>>>>>> this Textfield form. That is why i suspected of a bug. >>>>>>>> >>>>>>>> 1. Yes, i saw and have a solid proof on that now. >>>>>>>> >>>>>>>> 2. 
Yes, but FuzzyQuery takes quotes as they are, because they are
>>>>>>>> escaped and the term is not analyzed.
>>>>>>>>
>>>>>>>> Stuffing everything into one text field vs. having separate fields
>>>>>>>> should probably affect only the performance, not the outcome, in
>>>>>>>> my case. But I have been thinking about this, and maybe it is the
>>>>>>>> way to go in this case.
>>>>>>>>
>>>>>>>> My CONTENT field has street names in mixed case and city, region,
>>>>>>>> and country names in UPPERCASE. Can this be a problem?
>>>>>>>> I thought the index stored them in lowercase since I am using
>>>>>>>> StandardAnalyzer.
>>>>>>>>
>>>>>>>> The CONTENT field also has the full text field string with street=...
>>>>>>>> city=... region=... country=... (here all values are UPPERCASE).
>>>>>>>>
>>>>>>>> Why can't the index find the names via FuzzyQuery? I tried both
>>>>>>>> FuzzyQuery and Query builder as I showed before.
>>>>>>>>
>>>>>>>> The last piece of advice in your previous email would be better
>>>>>>>> placed outside the parentheses, since it might be very critical
>>>>>>>> :) :) :)
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> I'd suggest understanding correctly how the software works before
>>>>>>>> suspecting a bug :-)
>>>>>>>>
>>>>>>>> I guess you may be missing two points:
>>>>>>>>
>>>>>>>> 1. The standard analyzer (standard tokenizer) breaks words at double
>>>>>>>> quotes (U+0022), so quotes are not indexed or searched at all if
>>>>>>>> you are using the standard analyzer. (That is the reason you get
>>>>>>>> the same results with or without quotes.)
>>>>>>>> See:
>>>>>>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>>>>>>>> and
>>>>>>>> http://unicode.org/reports/tr29/
>>>>>>>>
>>>>>>>> 2. The double quote has a special meaning (it is interpreted as a
>>>>>>>> phrase query) with the built-in query parser, so you need to
>>>>>>>> escape it if you want to search for the double quote itself.
>>>>>>>> See:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>>>>>>>
>>>>>>>> (My advice would be to create a separate field for each key-value
>>>>>>>> pair instead of stuffing all pairs into one text field, if you
>>>>>>>> need to search them separately.)
>>>>>>>>
>>>>>>>> On Wed, Jun 12, 2019 at 2:39, <baris.ka...@oracle.com> wrote:
>>>>>>>>
>>>>>>>> I can say that quotes are not the issue with the index, as it
>>>>>>>> still gives the same results with or without quotes.
>>>>>>>>
>>>>>>>> I am starting to feel that this might be a bug, maybe??
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 2:46 PM, baris.ka...@oracle.com wrote:
>>>>>>>>
>>>>>>>> Somehow " is causing an issue, as this should return the street
>>>>>>>> with MAIN:
>>>>>>>>
>>>>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>>>>> states"] -> this was with FuzzyQuery on MAINS
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 2:24 PM, baris.ka...@oracle.com wrote:
>>>>>>>>
>>>>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>> contentDFLT:mains]
>>>>>>>>
>>>>>>>> QueryParser chops it into two pieces from
>>>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>>>
>>>>>>>> The index has a TextField named contentDFLT with the following data:
>>>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>>
>>>>>>>> When I set street=\"MAINS~\" with the parser,
>>>>>>>> I get the following:
>>>>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>>>> contentDFLT:mains]
>>>>>>>>
>>>>>>>> Probably the " quotation marks are messing this up, as you were
>>>>>>>> saying...
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>>>
>>>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>>>> parsing.
>>>>>>>>
>>>>>>>> When I parse this string with the classic query parser (lucene 8.1),
>>>>>>>> street="MAINS~"
>>>>>>>> the parsed (raw) query is
>>>>>>>> text:street text:mains
>>>>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>>>>> here.)
>>>>>>>>
>>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>>> the parsed raw query string, especially when you have (reserved)
>>>>>>>> special characters in your query...
>>>>>>>>
>>>>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>
>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>>>>>>
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> To specify a search field, ":" (colon) should be used instead of
>>>>>>>> "=".
>>>>>>>> See the query parser documentation:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>> >>>>>>>> 2019年6月11日(火) 0:51 <baris.ka...@oracle.com>: >>>>>>>> >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST); >>>>>>>> >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new >>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field, >>>>>>>> phraseAnalyzer) ; >>>>>>>> Query q1 = null; >>>>>>>> try { >>>>>>>> q1 = parser.parse("MAIN"); >>>>>>>> } catch (ParseException e) { >>>>>>>> >>>>>>>> e.printStackTrace(); >>>>>>>> } >>>>>>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD); >>>>>>>> >>>>>>>> testQuerySearch2 Time to compute: 0 seconds >>>>>>>> Number of results: 1775 >>>>>>>> Name: Main St >>>>>>>> Score: 37.20959 >>>>>>>> ID: 12681979 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.76416, -71.46681 >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES" >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 37.20959 >>>>>>>> ID: 12681977 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.747, -71.45957 >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES" >>>>>>>> >>>>>>>> Name: Main St >>>>>>>> Score: 37.20959 >>>>>>>> ID: 12681978 >>>>>>>> Country Code: US >>>>>>>> Coordinates: 42.73492, -71.44951 >>>>>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" >>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES" >>>>>>>> >>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same >>>>>>>> results >>>>>>>> which is good. >>>>>>>> >>>>>>>> But when i switch to MAINS~ then fuzzy query does not work. 
>>>>>>>> >>>>>>>> >>>>>>>> i need to say something with the q1 only in the booleanquery: >>>>>>>> it tries to match the MAIN in street, city, region and country >>>>>>>> which are >>>>>>>> in a single TextField field. >>>>>>>> But i dont want this. that is why i need to street="..." etc when >>>>>>>> searching. >>>>>>>> >>>>>>>> Best regards >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> just for the basic verification, can you find the document without >>>>>>>> fuzzy query? I mean, does this query work for you? >>>>>>>> >>>>>>>> Query query = parser.parse("MAIN"); >>>>>>>> >>>>>>>> Tomoko >>>>>>>> >>>>>>>> 2019年6月11日(火) 0:22 <baris.ka...@oracle.com>: >>>>>>>> >>>>>>>> why cant the second set not work at all? >>>>>>>> >>>>>>>> it is indexed as Textfield like street="..." city="..." etc. >>>>>>>> >>>>>>>> Best regards >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 6/10/19 11:23 AM, baris.ka...@oracle.com wrote: >>>>>>>> >>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably >>>>>>>> You >>>>>>>> are suggesting >>>>>>>> >>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ; >>>>>>>> Query query = parser.parse("MAINS~2"); >>>>>>>> >>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD); >>>>>>>> >>>>>>>> am i right? >>>>>>>> Best regards >>>>>>>> >>>>>>>> >>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote: >>>>>>>> >>>>>>>> I would suggest using a QueryParser for your fuzzy query before >>>>>>>> adding it to the Boolean query. This should weed out any case >>>>>>>> issues. 
>>>>>>>> >>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.ka...@oracle.com >>>>>>>> <mailto:baris.ka...@oracle.com>> wrote: >>>>>>>> >>>>>>>> BooleanQuery.Builder booleanQuery = new >>>>>>>> BooleanQuery.Builder(); >>>>>>>> >>>>>>>> //First set >>>>>>>> >>>>>>>> booleanQuery.add(new FuzzyQuery(new >>>>>>>> org.apache.lucene.index.Term(field, "MAINS")), >>>>>>>> BooleanClause.Occur.SHOULD); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NASHUA"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "NEW HAMPSHIRE"), BooleanClause.Occur.MUST); >>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, >>>>>>>> "UNITED STATES"), BooleanClause.Occur.MUST); >>>>>>>> >>>>>>>> // Second set >>>>>>>> //booleanQuery.add(new FuzzyQuery(new >>>>>>>> org.apache.lucene.index.Term(field, "street=\"MAINS\"")), >>>>>>>> BooleanClause.Occur.SHOULD); >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, >>>>>>>> >>>>>>>> field, "city=\"NASHUA\""), BooleanClause.Occur.MUST); >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, >>>>>>>> >>>>>>>> field, "region=\"NEW HAMPSHIRE\""), >>>>>>>> BooleanClause.Occur.MUST); >>>>>>>> //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, >>>>>>>> >>>>>>>> field, "country=\"UNITED STATES\""), >>>>>>>> BooleanClause.Occur.MUST); >>>>>>>> >>>>>>>> The first set brings also street with Nashua name. >>>>>>>> (NASHUA). >>>>>>>> >>>>>>>> so, to prevent that and since i also indexed with >>>>>>>> street="..." >>>>>>>> city="..." i did the second set but it does not bring >>>>>>>> anything. >>>>>>>> >>>>>>>> createPhraseQuery builds a Phrasequery with one term >>>>>>>> equal to the >>>>>>>> string >>>>>>>> in the call. 
>>>>>>>> >>>>>>>> Best regards >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 6/10/19 10:47 AM, baris.ka...@oracle.com >>>>>>>> <mailto:baris.ka...@oracle.com> wrote: >>>>>>>> > How do i check how it is indexed? lowecase or uppercase? >>>>>>>> > >>>>>>>> > only way is now to by testing. >>>>>>>> > >>>>>>>> > i am using standardanalyzer. >>>>>>>> > >>>>>>>> > Best regards >>>>>>>> > >>>>>>>> > >>>>>>>> > On 6/9/19 11:57 AM, Atri Sharma wrote: >>>>>>>> >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida >>>>>>>> >> <tomoko.uchida.1...@gmail.com >>>>>>>> <mailto:tomoko.uchida.1...@gmail.com>> wrote: >>>>>>>> >>> Hi, >>>>>>>> >>> >>>>>>>> >>> What analyzer do you use for the text field? Is the >>>>>>>> term "Main" >>>>>>>> >>> correctly indexed? >>>>>>>> >> Agreed. Also, it would be good if you could post your >>>>>>>> actual >>>>>>>> code. >>>>>>>> >> >>>>>>>> >> What analyzer are you using? If you are using >>>>>>>> StandardAnalyzer, >>>>>>>> then >>>>>>>> >> all of your terms while indexing will be lowercased, >>>>>>>> AFAIK, but >>>>>>>> your >>>>>>>> >> query will not be analyzed until you run a >>>>>>>> QueryParser on it. 
>>>>>>>> >> >>>>>>>> >> >>>>>>>> >> Atri >>>>>>>> >> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> --------------------------------------------------------------------- >>>>>>>> >>>>>>>> >>>>>>>> > To unsubscribe, e-mail: >>>>>>>> java-user-unsubscr...@lucene.apache.org >>>>>>>> <mailto:java-user-unsubscr...@lucene.apache.org> >>>>>>>> > For additional commands, e-mail: >>>>>>>> java-user-h...@lucene.apache.org >>>>>>>> <mailto:java-user-h...@lucene.apache.org> >>>>>>>> > >>>>>>>> >>>>>>>> --------------------------------------------------------------------- >>>>>>>> >>>>>>>> >>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>>>> >>>>>>>> --------------------------------------------------------------------- >>>>>>>> >>>>>>>> >>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>>>> >>>>>>>> --------------------------------------------------------------------- >>>>>>>> >>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>>>> >>>>>>>> --------------------------------------------------------------------- >>>>>>>> >>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> --------------------------------------------------------------------- >>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>>> >>>>>> --------------------------------------------------------------------- >>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>> >>>>> 