Re: Apostrophes in fields
Show us your full field type with analyzer. I suspect that the problem is that one of the index-time filters is turning "dev's" into "devs" (WordDelimiterFilter does that), but at query time there is no filter that removes a trailing apostrophe.

Use the Solr Admin UI Analysis page to see how "dev's" gets indexed and how "dev'" gets analyzed at query time.

-- Jack Krupansky

-----Original Message-----
From: devendra W
Sent: Tuesday, September 03, 2013 5:59 AM
To: solr-user@lucene.apache.org
Subject: Re: Apostrophes in fields

In my case, the fields with an apostrophe are not returned in the results.

When I search for -- dev
it shows me the following results:
dev
dev's
devendra

but when I search for -- dev' (dev with the apostrophe only)
nothing comes back as a result.

What could be the workaround?

Thanks
Devendra

--
View this message in context: http://lucene.472066.n3.nabble.com/Apostrophes-in-fields-tp475058p4087910.html
Sent from the Solr - User mailing list archive at Nabble.com.
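A minimal Python sketch of the mismatch suspected above. It is not Solr code. It assumes that the index-time chain drops apostrophes (so "dev's" becomes "devs"), that the query side leaves "dev'" untouched, and that matching is by prefix (suggested only by "dev" returning "devendra"). All function names are hypothetical.

```python
def index_analyze(value: str) -> str:
    # Assumed index-time behaviour: lowercase and drop apostrophes ("dev's" -> "devs").
    return value.lower().replace("'", "")


def query_analyze_raw(query: str) -> str:
    # Assumed current query-time behaviour: lowercase only, apostrophe kept.
    return query.lower()


def query_analyze_stripped(query: str) -> str:
    # Possible workaround: also remove a trailing apostrophe at query time.
    return query.lower().rstrip("'")


# Terms as they would sit in the (toy) index.
indexed_terms = {index_analyze(v) for v in ["dev", "dev's", "devendra"]}


def prefix_search(term: str) -> list[str]:
    # Toy prefix match against the indexed terms.
    return sorted(t for t in indexed_terms if t.startswith(term))


print(prefix_search(query_analyze_raw("dev'")))       # no indexed term starts with "dev'"
print(prefix_search(query_analyze_stripped("dev'")))  # same hits as searching for "dev"
```

The point of the sketch is only that both sides have to normalize the apostrophe the same way; the Analysis page shows what the real analyzers do.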
Re: Apostrophes in fields
On 9/3/2013 3:59 AM, devendra W wrote:
> in my case - the fields with apostrophe not returned in results

Don't use special characters in field names. If it wouldn't work as a variable name, function name (or other identifier) in a typical programming language (Java, C, Perl), then it will probably cause you problems as a field name.

This basically means: 7-bit ASCII only. Starts with a letter, contains only letters, numbers, and the underscore. Most punctuation other than the underscore has a special meaning to Solr.

Using extended characters (UTF-8, or those beyond 7-bit ASCII) *might* work, but it's fairly easy to screw that up and use the wrong character set, so it's better if you just don't do it.

Thanks,
Shawn
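The naming rule above can be written as a small check. This is a sketch of the advice as stated (7-bit ASCII, a leading letter, then letters, digits, and underscores), not a rule enforced by Solr itself; the function name is hypothetical.

```python
import re

# Pattern taken from the advice above: one ASCII letter, then ASCII letters,
# digits, or underscores. Explicit ranges keep it 7-bit ASCII only.
SAFE_FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_safe_field_name(name: str) -> bool:
    # fullmatch ensures the whole name follows the pattern, not just a prefix.
    return SAFE_FIELD_NAME.fullmatch(name) is not None


for candidate in ["author_name", "dev's", "1st_author", "naïve"]:
    print(candidate, is_safe_field_name(candidate))
```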
Re: Apostrophes in fields
In my case, the fields with an apostrophe are not returned in the results.

When I search for -- dev
it shows me the following results:
dev
dev's
devendra

but when I search for -- dev' (dev with the apostrophe only)
nothing comes back as a result.

What could be the workaround?

Thanks
Devendra
Re: Apostrophes in fields
Using the fuzzy searching fixed the problem - I will have a play with the analyzers and see if I can get it working nicely. Thanks again, much appreciated.

On 1/17/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: This problem is why some sloppiness is recommended when dealing with
: WordDelimiterFilter.

particularly when using the generate___Parts="true" options

Nick: if you want simpler matching like this, you might want to consider simplifying your definition of "text" ... if you look at the "textTight" fieldtype in the example schema (used by the field "sku") you'll see a simpler usage of WordDelimiterFilter ... alternately you may just want to use lucene's basic StandardAnalyzer ... i believe it strips apostrophes.

as a real last resort, you could use the recently added PatternReplaceFilter to strip out apostrophes prior to WordDelimiterFilter (if you like everything WordDelim does for you except splitting on apostrophes)

: - optionally index ohara at *both* "o" and "hara"

then searching for "Shelley ohara memorial" fails unless you have slop .. if you need slop, you might as well not index it twice (not to mention it throws off the tf/idf calculations)

: - pick the "alignment" based on the token position in the stream...
: right-justify the catenations if it's the first token, otherwise
: left-justify. One could try to identify proper names and do the
: justification correctly too (blech).

oh for the love of god please no.

-Hoss

--
- Nick
Re: Apostrophes in fields
: This problem is why some sloppiness is recommended when dealing with
: WordDelimiterFilter.

particularly when using the generate___Parts="true" options

Nick: if you want simpler matching like this, you might want to consider simplifying your definition of "text" ... if you look at the "textTight" fieldtype in the example schema (used by the field "sku") you'll see a simpler usage of WordDelimiterFilter ... alternately you may just want to use lucene's basic StandardAnalyzer ... i believe it strips apostrophes.

as a real last resort, you could use the recently added PatternReplaceFilter to strip out apostrophes prior to WordDelimiterFilter (if you like everything WordDelim does for you except splitting on apostrophes)

: - optionally index ohara at *both* "o" and "hara"

then searching for "Shelley ohara memorial" fails unless you have slop .. if you need slop, you might as well not index it twice (not to mention it throws off the tf/idf calculations)

: - pick the "alignment" based on the token position in the stream...
: right-justify the catenations if it's the first token, otherwise
: left-justify. One could try to identify proper names and do the
: justification correctly too (blech).

oh for the love of god please no.

-Hoss
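A rough Python sketch of the "strip apostrophes before splitting" idea mentioned above. It only imitates the effect; it is not PatternReplaceFilter or WordDelimiterFilter, and the splitting rule (break on any non-alphanumeric character, then lowercase) is a simplifying assumption. All function names are hypothetical.

```python
import re


def split_on_delimiters(token: str) -> list[str]:
    # Stand-in for delimiter splitting: break on any run of non-alphanumerics.
    return [part.lower() for part in re.split(r"[^A-Za-z0-9]+", token) if part]


def strip_apostrophes(token: str) -> str:
    # Stand-in for a pattern-replace step that removes apostrophes.
    return token.replace("'", "")


def analyze(text: str, strip_first: bool) -> list[str]:
    tokens = []
    for raw in text.split():  # whitespace tokenization
        if strip_first:
            raw = strip_apostrophes(raw)
        tokens.extend(split_on_delimiters(raw))
    return tokens


print(analyze("Shelley O'Hara", strip_first=False))  # splits O'Hara into two tokens
print(analyze("Shelley O'Hara", strip_first=True))   # keeps it as the single token "ohara"
```

With the apostrophe removed first, "O'Hara" and "Ohara" produce the same single token, so an exact phrase query no longer depends on where a catenated token is positioned.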
Re: Apostrophes in fields
On 1/16/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
> It appears to be matching author:"Shelley Ohara" but when I do this
> search no results are returned, searches like author:"Shelley O hara",
> author:"Shelley O'hara" work as expected. Any ideas?

This problem is why some sloppiness is recommended when dealing with WordDelimiterFilter. "Shelley Ohara"~1 should work.

Hmm, shouldn't "ohara" be generated at the same position as "o", not "hara"? It looks like it is failing to do exact phrase matching because the index contains "shelley o (ohara|hara)"

The problem is, if you do it one way, the other way breaks. If you index "ohara" with "o", then a field like "O'hara Shelley" wouldn't match a query like "ohara shelley".

There are a few possible options:
- optionally index ohara at *both* "o" and "hara"
- pick the "alignment" based on the token position in the stream... right-justify the catenations if it's the first token, otherwise left-justify. One could try to identify proper names and do the justification correctly too (blech).

-Yonik
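The position problem above can be modelled in a few lines of Python. This is a simplified sketch, not Lucene's phrase scorer: it treats slop as the largest spread between each term's actual and expected position, which is enough to show both failure cases. The token layouts are the ones described in this thread; the function names are hypothetical.

```python
from itertools import product


def index_positions(tokens_by_position):
    """Map each term to the list of positions where it was indexed."""
    positions = {}
    for pos, terms in enumerate(tokens_by_position):
        for term in terms:
            positions.setdefault(term, []).append(pos)
    return positions


def phrase_matches(positions, query_terms, slop=0):
    """Simplified phrase check: terms must line up within `slop` positions."""
    if not query_terms:
        return False
    candidates = [positions.get(term, []) for term in query_terms]
    if not all(candidates):
        return False  # some query term is not indexed at all
    for combo in product(*candidates):
        # Offset of each term from where an exact phrase would place it.
        offsets = [pos - i for i, pos in enumerate(combo)]
        if max(offsets) - min(offsets) <= slop:
            return True
    return False


# "Shelley O'hara" with "ohara" catenated at the position of "hara".
right_aligned = index_positions([{"shelley"}, {"o"}, {"hara", "ohara"}])
print(phrase_matches(right_aligned, ["shelley", "ohara"]))          # exact phrase
print(phrase_matches(right_aligned, ["shelley", "ohara"], slop=1))  # "Shelley Ohara"~1

# "O'hara Shelley" with "ohara" catenated at the position of "o" instead.
left_aligned = index_positions([{"o", "ohara"}, {"hara"}, {"shelley"}])
print(phrase_matches(left_aligned, ["ohara", "shelley"]))           # the other way breaks
```

In this model the exact phrase fails in both layouts because the catenated token leaves a one-position gap on one side or the other, and a slop of 1 is enough to bridge it.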
Re: Apostrophes in fields
On 1/16/07, Nick Jenkin <[EMAIL PROTECTED]> wrote:

Hi Jeff, Bertrand

Thanks for your help. The analyzers I am using are the same as in the example schema.xml

Author field: analysis result: http://nickjenkin.com/misc/solr.jpg

It appears to be matching author:"Shelley Ohara" but when I do this search no results are returned, searches like author:"Shelley O hara", author:"Shelley O'hara" work as expected. Any ideas?

Hmm, shouldn't "ohara" be generated at the same position as "o", not "hara"? It looks like it is failing to do exact phrase matching because the index contains "shelley o (ohara|hara)"

-Mike
Re: Apostrophes in fields
Hi Jeff, Bertrand

Thanks for your help. The analyzers I am using are the same as in the example schema.xml

Author field: analysis result: http://nickjenkin.com/misc/solr.jpg

It appears to be matching author:"Shelley Ohara" but when I do this search no results are returned, searches like author:"Shelley O hara", author:"Shelley O'hara" work as expected. Any ideas?

Thanks
-Nick

On 1/16/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
On 1/16/07, Jeff Rodenburg <[EMAIL PROTECTED]> wrote:
> Nick - this depends on the analyzer used to index the field as well as the
> analyzer used in your search query

Note that the Solr "analysis" page, in the admin interface, allows you to see exactly how your field's content is converted for indexing.

There's an example at http://www.xml.com/lpt/a/1668 in the "Content Analysis" part of the article.

-Bertrand

--
- Nick
Re: Apostrophes in fields
On 1/16/07, Jeff Rodenburg <[EMAIL PROTECTED]> wrote:

Nick - this depends on the analyzer used to index the field as well as the analyzer used in your search query

Note that the Solr "analysis" page, in the admin interface, allows you to see exactly how your field's content is converted for indexing.

There's an example at http://www.xml.com/lpt/a/1668 in the "Content Analysis" part of the article.

-Bertrand
Re: Apostrophes in fields
Nick - this depends on the analyzer used to index the field as well as the analyzer used in your search query. This gets handled in solr with the fieldtype and requesthandler.

Referencing the sample schema.xml off the wiki site, I would start with fieldtype="text" and go from there. If it doesn't address apostrophes (it splits on non-alpha chars) you can easily extend it through configuration to reference the necessary filter factory class.

Hope this helps.

-- j

On 1/15/07, Nick Jenkin <[EMAIL PROTECTED]> wrote:

Hi

This is probably more of a lucene question, but: I have an author field.

If I query author:"Shelley Ohara" - no results are returned
If I query author:"Shelley O'hara" - many results are returned

Is it possible to get solr to ignore apostrophes in queries like the one above?

e.g. doc Shelley O'Hara true long descirption 9780764559747 Paperback IDGP Kierkegaard Within Your Grasp 2004

Thanks

--
- Nick