Activity report on *[JIRA] New Feature SKER4949 - Solr SearchCommand implementation*
Scarab Link: http://sesat.no/scarab/issues/id/SKER4949
Module: Sesat > Kernel
Activity generated by Glenn-Erik Sandbakken ([EMAIL PROTECTED]) at 08/21/2008 10:52

*Reasons for the changes*

*Comments*

- By Glenn-Erik Sandbakken - 08/21/2008 10:52
---
"Some more regarding the default behavior:

Solr removes certain (stop) words. If you search for "name:as" you get 0 hits.
This also affects phrase searches: for name:"sesam as", documents with name=sesam are given the same score as those with "sesam as". Documents with name="sesame" etc. are also given the same score (probably because they are all considered stemmed).
We probably don't want to stem the content in our lists, especially because much of our content is not in English.

name:"air crash" returns only documents with "air" followed by "crash" (because both words are indexed).
name:air crash returns all documents with "air" OR "crash" (and documents with stemmed versions).
If you need both to be included, use: name:air AND crash

=> To be able to do exact matches we need to change the schema.xml and reindex.

After the refeed yesterday there are "only" 6.905.780 documents in the index. Part of the reason may be that documents containing only stop words are removed.
I have seen "much" content in our lists that is only a few strange characters (i.e. "²", "³", "m²", "m³" etc.).
I've also seen entries that Solr most likely considers the same, such as "meter" and "-meter" and "meter*", "meter-", "--meter" etc.

And don't forget the note about character encoding in my previous post. When I search (also through the Solr admin) for words containing characters other than a-z0-9, they appear to be sent to Solr as iso, while Solr works in utf8. I therefore have to urlencode the search.
You can quickly find the urlencoded version of strings and characters here:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm (at the bottom)

Next we need to meet to determine how I should configure Solr to work for our lists' purpose.
There's a variety of options to tweak, and we may not know which ones we want to use before we actually know the options that Solr gives.
"
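
The encoding workaround mentioned in the comment above can be illustrated with a short sketch. It percent-encodes a query as UTF-8 before it is appended to a Solr select URL, so that a character such as "²" goes over the wire as %C2%B2 instead of the single-byte ISO form %B2. The endpoint URL and the helper name build_select_url are assumptions made for illustration only; they are not taken from the Sesat code base, and the real host and path depend on the installation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint (assumption): adjust host, port and path to the real setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(query: str, base_url: str = SOLR_SELECT_URL) -> str:
    """Return a select URL whose q parameter is percent-encoded as UTF-8."""
    # urlencode escapes reserved characters (':' and '"') and turns spaces into '+'.
    return base_url + "?" + urlencode({"q": query}, encoding="utf-8")


if __name__ == "__main__":
    # The query forms discussed in the comment, plus one non-ASCII example.
    for q in ('name:"m²"', 'name:"air crash"', "name:air AND crash"):
        print(build_select_url(q))
```

Encoding on the client side like this avoids looking up escape sequences by hand, and it keeps the choice of UTF-8 explicit in one place rather than relying on whatever charset the browser or HTTP client defaults to.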
_______________________________________________ Kernel-issues mailing list [email protected] http://sesat.no/mailman/listinfo/kernel-issues
