Re: converting over from sphinx
: way. In particular, I'm doing phrase searching into a corpus of
: descriptions, such as "I need help with a foo" where I have a bunch of "foo:
: a foo is a subset of a bar often used to create briznatzes", etc.
:
: With Sphinx, I could convert "I need help with a foo" into *need* *help*
: *with* *foo* and get pretty nice matches. With Solr, my understanding is
: that you can only do wildcard matches on the suffix. In addition, stemming
: only happens on non-wildcard terms. So, my first thought would be to convert
: "I need help with a foo" into need need* help help* with with* foo foo*.

First off, we need to make sure we have all our terminology in sync. I'm not very familiar with Sphinx, so I'm not sure what vernacular is used there to describe various things, but in Solr/Lucene you have options regarding how you want text to be analyzed when it's indexed. This analysis is what converts an arbitrary stream of characters into the Terms that get indexed. At query time, it's very easy to match on terms, on boolean combinations of terms, and on sequential phrases of terms. You only need wildcard-type functionality if you want to provide a wildcard expression that could match more than one individual term.

In your specific example, if you just configured a basic whitespace tokenizer when you indexed your documents (ie: "foo: a foo is a subset of a bar often used to create briznatzes") then at query time any of the individual words (foo, bar, etc...) would match that document. Likewise, a phrase query like "need help with foo" would match that text if you defined some stop words (like "need" and "with") and specified a small amount of slop on your phrase queries.

The point is: there are a lot of different ways to use Solr, and the terminology you are used to with Sphinx may not map exactly to some of the terminology you'll see in the Solr docs/configs, so please feel free to ask.

-Hoss
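To make the point about analysis concrete, here is a rough Python sketch of the idea, not Solr's actual implementation. It assumes a whitespace split, lowercasing, trimming of surrounding punctuation, and a made-up stop word list; the names `analyze`, `matching_terms`, and `STOP_WORDS` are hypothetical and exist only for this illustration. In Solr the equivalent steps would be configured on the field type rather than written by hand.

```python
import string

# Hypothetical stop word list for illustration only; in Solr this would
# be part of the analyzer configuration for the field.
STOP_WORDS = {"i", "a", "need", "with"}


def analyze(text, stop_words=STOP_WORDS):
    """Turn raw text into a list of terms.

    Splits on whitespace, trims surrounding punctuation, lowercases,
    and drops stop words. This only approximates what an analyzer does.
    """
    terms = []
    for token in text.split():
        term = token.strip(string.punctuation).lower()
        if term and term not in stop_words:
            terms.append(term)
    return terms


def matching_terms(query, document):
    """Return the query terms that also occur in the analyzed document."""
    doc_terms = set(analyze(document))
    return [term for term in analyze(query) if term in doc_terms]


doc = "foo: a foo is a subset of a bar often used to create briznatzes"
print(analyze("I need help with a foo"))
print(matching_terms("I need help with a foo", doc))
```

Because the same analysis runs on both the document and the query, the plain term foo matches without any wildcard in the query.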
Re: converting over from sphinx
Something doesn't sound right here. Why do you need wildcards for queries in the first place? Are you finding that with stopword removal and stemming you are not matching some docs that you think should be matched? If so, we may be able to help if you provide a few examples.

Otis

----- Original Message -----
From: Cory Ondrejka cory.ondre...@gmail.com
To: solr-user@lucene.apache.org
Sent: Sat, November 14, 2009 12:57:56 PM
Subject: converting over from sphinx

I've been using Sphinx for full text search, but since I want to move my project over to Heroku, need to switch to Solr. Everything's up and running using the acts_as_solr plugin, but I'm curious if I'm using Solr the right way. In particular, I'm doing phrase searching into a corpus of descriptions, such as "I need help with a foo" where I have a bunch of "foo: a foo is a subset of a bar often used to create briznatzes", etc.

With Sphinx, I could convert "I need help with a foo" into *need* *help* *with* *foo* and get pretty nice matches. With Solr, my understanding is that you can only do wildcard matches on the suffix. In addition, stemming only happens on non-wildcard terms. So, my first thought would be to convert "I need help with a foo" into need need* help help* with with* foo foo*.

Thanks in advance for any help.

--
Cory Ondrejka
cory.ondre...@gmail.com
http://ondrejka.net/
converting over from sphinx
I've been using Sphinx for full text search, but since I want to move my project over to Heroku, I need to switch to Solr. Everything's up and running using the acts_as_solr plugin, but I'm curious if I'm using Solr the right way. In particular, I'm doing phrase searching into a corpus of descriptions, such as "I need help with a foo" where I have a bunch of "foo: a foo is a subset of a bar often used to create briznatzes", etc.

With Sphinx, I could convert "I need help with a foo" into *need* *help* *with* *foo* and get pretty nice matches. With Solr, my understanding is that you can only do wildcard matches on the suffix. In addition, stemming only happens on non-wildcard terms. So, my first thought would be to convert "I need help with a foo" into need need* help help* with with* foo foo*.

Thanks in advance for any help.

--
Cory Ondrejka
cory.ondre...@gmail.com
http://ondrejka.net/
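For reference, the rewrite described in this message could be sketched in Python roughly as follows. This is only an illustration of the string transformation, not a recommendation and not part of acts_as_solr; the function name `expand_terms` and the `SKIPPED` word set are hypothetical, and dropping "I" and "a" is an assumption inferred from the example, where those two words do not appear in the converted query.

```python
import re

# Assumption: the example drops "I" and "a", so they are skipped here.
SKIPPED = {"i", "a"}


def expand_terms(text, style="solr"):
    """Rewrite a phrase into the wildcard form described in the message.

    style="sphinx" wraps each word as *word*.
    style="solr" emits each word followed by its suffix-wildcard form.
    """
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in SKIPPED]
    if style == "sphinx":
        return " ".join("*{}*".format(w) for w in words)
    return " ".join("{0} {0}*".format(w) for w in words)


print(expand_terms("I need help with a foo", style="sphinx"))
print(expand_terms("I need help with a foo"))
```

The output is a plain query string; whether sending it to Solr is a good idea is exactly what the replies in this thread question.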