Re: converting over from sphinx

2009-11-25 Thread Chris Hostetter

: way.  In particular, I'm doing phrase searching into a corpus of
: descriptions, such as "I need help with a foo" where I have a bunch of "foo:
: a foo is a subset of a bar often used to create briznatzes", etc.
: 
: With Sphinx, I could convert "I need help with a foo" into "*need* *help*
: *with* *foo*" and get pretty nice matches. With Solr, my understanding is
: that you can only do wildcard matches on the suffix. In addition, stemming
: only happens on non-wildcard terms. So, my first thought would be to convert
: "I need help with a foo" into "need need* help help* with with* foo foo*".

First off, we need to make sure we have all our terminology in sync -- I'm 
not very familiar with Sphinx, so I'm not sure what vernacular is used 
there to describe various things, but in Solr/Lucene you have options 
regarding how you want text to be analyzed when it's indexed -- this 
analysis is what converts an arbitrary stream of characters into Terms 
that get indexed.  At query time, it's very easy to match on terms, 
boolean combinations of terms, and sequential phrases of terms -- you 
only need wildcard-type functionality if you want to provide a wildcard 
expression that could match more than one individual term.
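
To make the idea of analysis concrete, here is a toy sketch in Python. It is not Lucene's actual implementation: the stop word list, the function names, and the in-memory index are all made up for illustration. It only shows that once text has been broken into terms, plain term matching finds the document with no wildcards involved.

```python
import re

# Hypothetical stop word list, for illustration only;
# Solr reads its own list from its configuration.
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "i", "with"}

def analyze(text):
    """Lowercase, split on non-alphanumeric characters, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query):
    """Return ids of documents matching ANY analyzed query term (boolean OR)."""
    hits = set()
    for term in analyze(query):
        hits |= index.get(term, set())
    return hits

docs = {1: "foo: a foo is a subset of a bar often used to create briznatzes"}
index = build_index(docs)

print(search(index, "I need help with a foo"))  # {1}
```

The query matches because the same analysis is applied at index time and at query time, so "foo" in the query lines up with the indexed term "foo".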

In your specific example, if you just configured a basic whitespace 
tokenizer when you indexed your documents (ie: "foo: a foo is a subset of 
a bar often used to create briznatzes") then at query time any of the 
individual words (foo, bar, etc...) would match that document.  
Likewise, a phrase query like "need help with foo" would match that text if 
you defined some stop words (like "need" and "with") and specified a small 
amount of slop on your phrase queries.
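
The slop goes into the query string itself: the Lucene query parser syntax puts "~N" after a quoted phrase. Here is a small Python sketch of building such a query. The field name "description", the slop value of 3, and the localhost URL are assumptions for illustration, not anything taken from your setup.

```python
from urllib.parse import urlencode

def sloppy_phrase_query(field, phrase, slop):
    """Build a Lucene-syntax phrase query with slop, e.g. field:"a b"~2."""
    # Escape backslashes and double quotes so the phrase stays one quoted unit.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"~{slop}'

# "description" is a hypothetical field name; 3 is an arbitrary slop value.
q = sloppy_phrase_query("description", "need help with foo", 3)
print(q)  # description:"need help with foo"~3

# Assumed local Solr URL; adjust host, port and path for a real install.
params = urlencode({"q": q})
print("http://localhost:8983/solr/select?" + params)
```

How much slop is enough depends on how many words may sit between the query terms in your descriptions, so treat the number as something to tune.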


The point is: there are a lot of different ways to use Solr, and the 
terminology you are used to with Sphinx may not map exactly to some of the 
terminology you'll see in the Solr docs/configs -- so please feel free to 
ask.

-Hoss



Re: converting over from sphinx

2009-11-15 Thread Otis Gospodnetic
Something doesn't sound right here.  Why do you need wildcards for queries in 
the first place?
Are you finding that with stopword removal and stemming you are not matching 
some docs that you think should be matched?  If so, we may be able to help if 
you provide a few examples.

Otis






converting over from sphinx

2009-11-14 Thread Cory Ondrejka
I've been using Sphinx for full text search, but since I want to move my
project over to Heroku, I need to switch to Solr. Everything's up and running
using the acts_as_solr plugin, but I'm curious if I'm using Solr the right
way.  In particular, I'm doing phrase searching into a corpus of
descriptions, such as "I need help with a foo" where I have a bunch of "foo:
a foo is a subset of a bar often used to create briznatzes", etc.

With Sphinx, I could convert "I need help with a foo" into "*need* *help*
*with* *foo*" and get pretty nice matches. With Solr, my understanding is
that you can only do wildcard matches on the suffix. In addition, stemming
only happens on non-wildcard terms. So, my first thought would be to convert
"I need help with a foo" into "need need* help help* with with* foo foo*".
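
A rough sketch of that conversion in Python, assuming "I" and "a" are simply dropped before expanding (the stop word list and the function name are made up for illustration):

```python
# Hypothetical list of words to drop before expanding; illustration only.
STOP_WORDS = {"i", "a"}

def expand_with_prefixes(query):
    """Turn each remaining word into the pair 'word word*'."""
    terms = [w for w in query.lower().split() if w not in STOP_WORDS]
    return " ".join(f"{t} {t}*" for t in terms)

print(expand_with_prefixes("I need help with a foo"))
# need need* help help* with with* foo foo*
```

The plain term would go through stemming and the starred copy would act as the suffix wildcard.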

Thanks in advance for any help.

-- 
Cory Ondrejka
cory.ondre...@gmail.com
http://ondrejka.net/