Hoss,

Thanks. You've answered my question. To clarify, what I should have asked for instead of 'exact' was 'not fuzzy'. For some reason it didn't occur to me that I didn't need n-grams to use the wildcard. Your asking me to clarify what I meant made me realize that the n-grams are the source of all my current problems. :)

Thanks!

Devon Baumgarten

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, December 29, 2011 7:00 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr, SQL Server's LIKE

: Thanks. I know I'll be able to utilize some of Solr's free text
: searching capabilities in other search types in this project. The
: product manager wants this particular search to exactly mimic LIKE%.
...
: Ex: If I search "Albatross" I want "Albert" to be excluded completely,
: rather than having a low score.

please be specific about the types of queries you want. ie: we need more than one example of the type of input you want to provide, the type of matches you want to see for that input, and the type of matches you want to get back.

in your first message you said you need to match company titles "pretty exactly" but then seem to contradict yourself by saying the SQL's LIKE command fits the bill -- even though the SQL LIKE command exists specifically for inexact matches on field values.

Based on your one example above of Albatross, you don't need anything special: don't use ngrams, don't use stemming, don't use fuzzy anything -- just search for "Albatross" and it will match "Albatross" but not "Albert". if you want "Albatross" to match "Albatross Road" use some basic tokenization.

If all you really care about is prefix searching (which seems suggested by your "LIKE%" comment above, which i'm guessing is shorthand for something similar to "LIKE 'ABC%'"), so that queries like "abc" and "abcd" both match "abcdef" and "abcdzzzz" but neither of them match "xxxxabcdyyyy" then just use prefix queries (ie: "abcd*") -- they should be plenty efficient for your purposes.

you only need to worry about ngrams when you want to efficiently match in the middle of a string. (ie: "TITLE LIKE %ABC%")

-Hoss
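
For anyone reading this thread later, here is a rough sketch of the prefix vs. middle-of-string distinction described above, using plain Python string operations rather than Solr itself. The function names and the "title" field are made up for illustration, and the query builder assumes the term contains no characters that would need escaping in Solr query syntax.

```python
def prefix_match(term, values):
    """Mimic LIKE 'term%': keep only values that start with term."""
    return [v for v in values if v.startswith(term)]


def infix_match(term, values):
    """Mimic LIKE '%term%': keep values that contain term anywhere."""
    return [v for v in values if term in v]


def solr_prefix_query(field, term):
    """Build a prefix query string such as title:abcd*.

    Hypothetical helper: 'field' is whatever your schema calls the field,
    and 'term' is assumed to need no escaping.
    """
    return f"{field}:{term}*"


# The example values from the message above.
values = ["abcdef", "abcdzzzz", "xxxxabcdyyyy"]

print(prefix_match("abc", values))    # prefix search on "abc"
print(prefix_match("abcd", values))   # prefix search on "abcd"
print(infix_match("abcd", values))    # the case where ngrams become relevant
print(solr_prefix_query("title", "abcd"))
```

Both prefix searches keep "abcdef" and "abcdzzzz" and drop "xxxxabcdyyyy"; only the infix search picks up the third value, which is the case where ngrams start to matter.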