Re: camel-casing and dismax troubles

2009-05-13 Thread Geoffrey Young
On Wed, May 13, 2009 at 6:23 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young ge...@modperlcookbook.org wrote: hi all :) I'm having trouble with camel-cased query strings and the dismax handler. a user query LeAnn Rimes isn't matching

camel-casing and dismax troubles

2009-05-12 Thread Geoffrey Young
hi all :) I'm having trouble with camel-cased query strings and the dismax handler. a user query LeAnn Rimes isn't matching the indexed term Leann Rimes even though both are lower-cased in the end. furthermore, the analysis tool shows a match. the debug query looks like

dismax and WordDelimiterFilterFactory+PreserveOriginal

2009-03-16 Thread Geoffrey Young
hi all :) I have two filters combined with dismax on the query side: WordDelimiterFilterFactory { preserveOriginal=1, generateNumberParts=1, catenateWords=0, generateWordParts=1, catenateAll=0, catenateNumbers=0} followed by lowercase filter factory. the analyzer shows the phrase gUYS and

filtering on blank OR specific range

2008-11-19 Thread Geoffrey Young
hi all :) I'm having difficulty filtering my documents when a field is either blank or set to a specific value. I would have thought this would work fq=-Type:[* TO *] OR Type:blue which I would expect to find all documents where either Type is undefined or Type is blue. my actual result set

Re: filtering on blank OR specific range

2008-11-19 Thread Geoffrey Young
Lance Norskog wrote: Try: Type:blue OR -Type:[* TO *] You can't have a negative clause at the beginning. Yes, Lucene should barf about this. I did try that, before and again now, and still no luck. anything else? --Geoff
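
A commonly suggested workaround for this kind of filter, sketched here as an illustration rather than as the resolution reached in the thread: in Lucene query syntax a purely negative clause nested inside a boolean OR matches nothing on its own, so it is usually anchored to the match-all query *:*. The helper name blank_or_value and the URL-encoding step are assumptions of this sketch.

```python
from urllib.parse import urlencode


def blank_or_value(field: str, value: str) -> str:
    """Build a filter for documents where `field` is missing OR equals `value`.

    The negative range clause is paired with *:* so that it subtracts from
    "all documents" instead of standing alone inside the OR.
    """
    return f"(*:* -{field}:[* TO *]) OR {field}:{value}"


# Hypothetical usage: the field and value are taken from the question above.
fq = blank_or_value("Type", "blue")
params = urlencode({"q": "*:*", "fq": fq})
print(fq)
print(params)
```

Whether this resolves the poster's exact case depends on the schema and the Solr version in use, which the snippet does not show.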

Re: solr 1.3 snapshooter doesn't work, commit never ending

2008-10-15 Thread Geoffrey Young
sunnyfr wrote: I tried last evening before leaving and this morning time elapsed was very important like you can notice above and no snapshot, no error in the logs. I'm actually having a similar trouble. I've enabled postCommit and postOptimize hooks with an absolute path to snapshooter.

Re: using DataImportHandler instead of POST?

2008-10-03 Thread Geoffrey Young
Chris Hostetter wrote: : I chugg away at 1.5 million records in a single file, but solr never : commits. specifically, it ignores my autocommit settings. (I can : commit separately at the end, of course :) the way the autocommit settings work is something i always get confused by --

Re: using DataImportHandler instead of POST?

2008-10-01 Thread Geoffrey Young
Geoffrey Young wrote: Chris Hostetter wrote: : I have a well-formed xml file, suitable for POSTting to solr. that : works just fine. it's very large, though, and using curl in production : is so very lame. is there a very simple config that will let solr just : slurp up the file via

Re: using DataImportHandler instead of POST?

2008-09-29 Thread Geoffrey Young
Chris Hostetter wrote: : I have a well-formed xml file, suitable for POSTting to solr. that : works just fine. it's very large, though, and using curl in production : is so very lame. is there a very simple config that will let solr just : slurp up the file via the DataImportHandler?

using DataImportHandler instead of POST?

2008-09-28 Thread Geoffrey Young
hi all :) I'm sorry I need to ask this, but after reading and re-reading the wiki I don't see a clear path... I have a well-formed xml file, suitable for POSTting to solr. that works just fine. it's very large, though, and using curl in production is so very lame. is there a very simple

Re: spellchecker problems (bugs)

2008-07-25 Thread Geoffrey Young
This issue has been fixed in the trunk. Can you please use the latest trunk code and try? current trunk looks good. thanks! --Geoff

Re: Multiple search components in one handler - ie spellchecker

2008-07-25 Thread Geoffrey Young
Andrew Nagy wrote: Hello - I am attempting to add the spellCheck component in my search requesthandler so when a user does a search, they get the results and spelling corrections all in one query just like the way the facets work. I am having some trouble accomplishing this - can anyone

Re: Multiple search components in one handler - ie spellchecker

2008-07-25 Thread Geoffrey Young
Andrew Nagy wrote: Thanks for getting back to me Geoff. Although, that is pretty much what I have. Maybe if I show my solrconfig someone might be able to point out what I have incorrect? The problem is that nothing related to the spelling options is shown in the results, just the normal

Re: spell-checker and faceting

2008-07-23 Thread Geoffrey Young
dudes dudes wrote: Hi, I'm trying to couple spell-checking mechanism with faceting in one url statement.. I can get the spell check right, but the facet doesn't work when it's combined with spell-checker...

Re: spellchecker problems (bugs)

2008-07-23 Thread Geoffrey Young
Jonathan Lee wrote: I don't see the patch attached to my original email either -- does solr-user not allow attachments? This is ugly, but here's the patch inline: issue created in jira: https://issues.apache.org/jira/browse/SOLR-648 --Geoff

Re: spellchecker problems (bugs)

2008-07-22 Thread Geoffrey Young
Shalin Shekhar Mangar wrote: The problems you described in the spellchecker are noted in https://issues.apache.org/jira/browse/SOLR-622 -- I shall create an issue to synchronize spellcheck.build so that the index is not corrupted. I'd like to discuss this a little... I'm not sure that I

Re: problems with SpellCheckComponent

2008-07-08 Thread Geoffrey Young
When I made: http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck.q=ruck&spellcheck=true I have this exception: Estado HTTP 500 - null java.lang.NullPointerException at org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:217) I see this all the

Re: problems with SpellCheckComponent

2008-07-08 Thread Geoffrey Young
Shalin Shekhar Mangar wrote: Hi Geoff, I can't find anything in the code which would give this exception when both q and spellcheck.q is specified. Though, this exception is certainly possible when you restart solr. Anyways, I'll look into it more deeply. great, thanks. There are a few

Re: SpellCheckerRequestHandler qt parameter

2008-06-27 Thread Geoffrey Young
I had null pointer exceptions left and right while composing this email... then I added spellcheck.build=true to one and they went away. do you need to rebuild the spelling index every time you alter (certain parts) of solrconfig.xml? it was very consistent as reported below, but after

Re: SpellCheckerRequestHandler qt parameter

2008-06-26 Thread Geoffrey Young
Norberto Meijome wrote: Hi there, Short and sweet : Is SCRH intended to honour qt= ? longer... I'm testing the newest SCRH ( SOLR-572), using last night's nightly build. I have defined a 'dismax' request handler which searches across a number of fields. When I use the SCRH in a query,

Re: SpellCheckerRequestHandler qt parameter

2008-06-26 Thread Geoffrey Young
Grant Ingersoll wrote: On Jun 26, 2008, at 5:25 PM, Geoffrey Young wrote: well *almost* - it works most excellently with q=$term but when I add spellchecker.q=$term things implode: HTTP Status 500 - null java.lang.NullPointerException at org.apache.solr.handler

Re: missing document count?

2008-06-18 Thread Geoffrey Young
Chris Hostetter wrote: : not hard, but useful information to have handy without additional : manipulations on my part. : our pages are the results of multiple queries. so, given a max number of : records per page (or total), the rows asked of query2 is max - query1, or in the common case,

Re: searching only within allowed documents

2008-06-11 Thread Geoffrey Young
Solr allows you to specify filters in separate parameters that are applied to the main query, but cached separately. q=the user query&fq=folder:f13&fq=folder:f24 I've been wanting more explanation around this for a while, so maybe now is a good time to ask :) the "cached separately" verbiage
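
As a rough illustration of the request shape quoted above, the following Python sketch builds a query string that repeats the fq parameter once per filter. The function name build_query is hypothetical; only the q and fq parameter names and the example values come from the message.

```python
from urllib.parse import urlencode


def build_query(user_query: str, filters: list[str]) -> str:
    """Return a URL query string with one q parameter and one fq per filter."""
    # A list of pairs (rather than a dict) lets the same key appear repeatedly.
    params = [("q", user_query)] + [("fq", f) for f in filters]
    return urlencode(params)


qs = build_query("the user query", ["folder:f13", "folder:f24"])
print(qs)
```

Keeping each restriction in its own fq is what allows each filter to be cached on its own, as the quoted text describes; combining them into one fq string would produce a single cache entry for the combination instead.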

adding expand=true to WordDelimiterFilter

2008-05-19 Thread Geoffrey Young
hi :) I'm having an interesting problem with my data. in general, I want the results of the WordDelimiterFilter for better matching, but there are times when it's just too aggressive. for example boys2men = boys 2 men (good) p!nk = pnk (maybe) !!! = (nothing - bad) there's

Re: adding expand=true to WordDelimiterFilter

2008-05-19 Thread Geoffrey Young
Chris Hostetter wrote: by expand=true it sounds like you mean you are looking for a way to preserve the original term without any characters removed. yes, that's it. This sounds like SOLR-14 ... you might want to take a look at it, and see if the patch is still useable, and if not see
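
To make the "preserve the original term" idea concrete, here is a toy Python approximation. It is not the WordDelimiterFilter itself, which has more rules (case-change splitting, catenation options and so on); the function delimiter_tokens and its preserve_original flag are assumptions made for illustration only.

```python
import re


def delimiter_tokens(term: str, preserve_original: bool = False) -> list[str]:
    """Split a term into lower-cased runs of letters or digits.

    Characters that are neither letters nor digits are dropped, so a term
    made only of punctuation yields no tokens unless the original is kept.
    """
    tokens = [part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", term)]
    original = term.lower()
    if preserve_original and original not in tokens:
        tokens.append(original)
    return tokens


for term in ["boys2men", "p!nk", "!!!"]:
    print(term, delimiter_tokens(term), delimiter_tokens(term, preserve_original=True))
```

In this sketch the all-punctuation term survives only when the original is kept alongside the split parts, which is the behaviour the poster is asking for.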

Re: token concat filter?

2008-05-08 Thread Geoffrey Young
Otis Gospodnetic wrote: Geoff, Whether synonyms are applied at index time or query time is controlled via schema.xml - it depends on where you put the synonym factory, whether in the index-time or query-time section of a fieldType. Synonyms are read once on start, I believe. It might be

Re: token concat filter?

2008-05-08 Thread Geoffrey Young
Otis Gospodnetic wrote: There is actually a Wiki page explaining this pretty well... have you seen it? I guess not. I've been reading the wiki, but the trouble with wikis always seems to be (for me) finding stuff. can you point it out? Index-time expansion means larger indices and

Re: Sort results on a field not ordered

2008-05-02 Thread Geoffrey Young
Erik Hatcher wrote: What field type is chapterTitle? I'm betting it is an analyzed field with multiple values (tokens/terms) per document. To successfully sort, you'll need to have a single value per document - using copyField can help with this to have both a searchable field and a

token concat filter?

2008-05-01 Thread Geoffrey Young
hi :) I'm looking for a filter that will compress all tokens into a single token. the WordDelimiterFilterFactory does it for tokens it finds itself, but not ones passed to it. basically, I'm trying to match Radiohead in the index with radio head in the query. if it were spelled

Re: token concat filter?

2008-05-01 Thread Geoffrey Young
Yonik Seeley wrote: If there are only a few such cases, it might be better to use synonyms to correct them. unfortunately, there are too many to handle this way. Off the top of my head there's no concatenating token filter, but it wouldn't be hard to make one. hmm, ok. I'm not a java
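
The concatenation idea discussed in this thread can be sketched outside Solr in a few lines of Python. This is only an illustration of the matching logic, not a Lucene TokenFilter; concat_tokens and matches are hypothetical names, and whitespace tokenization is an assumption.

```python
def concat_tokens(tokens: list[str]) -> str:
    """Collapse a token stream into one lower-cased token."""
    return "".join(token.lower() for token in tokens)


def matches(indexed_text: str, query_text: str) -> bool:
    """Compare two texts after whitespace tokenization and concatenation."""
    return concat_tokens(indexed_text.split()) == concat_tokens(query_text.split())


print(matches("Radiohead", "radio head"))
print(matches("Radiohead", "radio heads"))
```

Applying the same concatenation on both the index side and the query side is what lets the two spellings meet; the trade-off is that word boundaries are lost entirely, so it suits short fields such as names better than long text.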

Re: token concat filter?

2008-05-01 Thread Geoffrey Young
Walter Underwood wrote: I've been doing it with synonyms and I have several hundred of them. I'm dealing mostly with proper names, so I expect more like 80k of them for our data :) Concatenating bi-word groups is pretty useful for English. We have a habit of gluing words together.

Re: token concat filter?

2008-05-01 Thread Geoffrey Young
Walter Underwood wrote: I doubt it would be that many. I recommend tracking the searches and the clicks, and working on queries with low clickthrough. the trouble is I'm in a dynamic biz - last week's popular clicks are very different from this week's, so by the time I analyze last week's

Re: token concat filter?

2008-05-01 Thread Geoffrey Young
Otis Gospodnetic wrote: Geoff, Whether synonyms are applied at index time or query time is controlled via schema.xml - it depends on where you put the synonym factory, whether in the index-time or query-time section of a fieldType. Synonyms are read once on start, I believe. It might be

Re: Got parseException when search keyword AND on a text field

2008-04-24 Thread Geoffrey Young
Otis Gospodnetic wrote: Not in one place and documented. The place to look are query parsers, but things like AND OR NOT TO are the ones to look out for. this seems like something solr ought to handle gracefully on the backend for me - if I need to write logic to make sure a malicious

another spellchecker question

2008-04-23 Thread Geoffrey Young
hi :) I've noticed that (with solr 1.2) the returned order (as well as the actual matched set) is affected by the number of matches you ask for: q=hanna&suggestionCount=1 suggestions:[Yanna] q=hanna&suggestionCount=2 suggestions:[Manna, Yanna] q=hanna&suggestionCount=5

Re: another spellchecker question

2008-04-23 Thread Geoffrey Young
Shalin Shekhar Mangar wrote: Hi Geoffrey, Yes, this is a caveat in the lucene contrib spellchecker which Solr uses. From the lucene spell checker javadocs: As the Lucene similarity that is used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy

Re: config for very frequent solr updates

2008-04-18 Thread Geoffrey Young
found the distributed search docs from there and will keep that in mind as I move forward. --Geoff Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Geoffrey Young [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Thursday, April 17, 2008

config for very frequent solr updates

2008-04-17 Thread Geoffrey Young
hi all :) I didn't see any documentation on this, so I was wondering what the experience here was with updating solr with a small but constant trickle of daemon-style updates. unfortunately, it's a business requirement that backend db updates make it to search as the changes roll in (5

Re: schema help

2008-03-12 Thread Geoffrey Young
Rachel McConnell wrote: Our Solr use consists of several rather different data types, some of which have one-to-many relationships with other types. We don't need to do any searching of quite the kind you describe, but I have an idea about it, depending on what you need to do with the book

Re: schema help

2008-03-12 Thread Geoffrey Young
the trouble I'm having is one of dimension. an author has many, many attributes (name, birthdate, biography in $language, etc). as does each book (title in $language, summary in $language, genre, etc). as does each library (name, address, directions in $language, etc). so an author with N

schema help

2008-03-11 Thread Geoffrey Young
hi :) I'm trying to work out a schema for our widgets. more than just coming up with something I'd like something idiomatic in solr terms. any help is much appreciated. here's a similar problem space to what I'm working with... let's say we're talking books. books are written by authors

Re: schema help

2008-03-11 Thread Geoffrey Young
. --Geoff Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Geoffrey Young [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Tuesday, March 11, 2008 12:17:32 PM Subject: schema help hi :) I'm trying to work out a schema for our widgets

multiple things in a document

2008-02-22 Thread Geoffrey Young
hi all :) I'm just getting up to speed with solr (and lucene, for that matter) for a new project. after reading through the available docs I'm not finding an answer to my most basic (newbie, certainly) question. please feel free to just point me to the proper doc :) this isn't my actual