Re: Performing a like query

2006-10-10 Thread Rahil
Hi Erick, I think you've made some really important observations. Steven has provided a good regular expression to help with words and non-words. For the moment I have reverted to my analyser and am going to try doing some clever pattern matching later. Also, I'll try using a different analyser

Re: Performing a like query

2006-10-09 Thread Rahil
Hi Steve, thanks for your response. I was just wondering whether there is a difference between the regular expression you sent me, i.e. (i) \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* and (ii) \\b, as they lead to the same output. For example, the string "search testing a-new string=3/4"

Re: Performing a like query

2006-10-09 Thread Steven Rowe
Hi Rahil, Rahil wrote: "I was just wondering whether there is a difference between the regular expression you sent me, i.e. (i) \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* and (ii) \\b, as they lead to the same output. For example, the string "search testing a-new string=3/4" results in
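A minimal sketch of how the two expressions could be compared with plain java.util.regex, outside Lucene. It assumes expression (i) uses lookbehinds, (?<=\S) and (?<=\s), because the archived text appears to have lost some characters; the class and method names are hypothetical. One possible reading of "the same output" is that the pieces only differ in empty or whitespace-only strings: splitting on \b alone keeps the whitespace between words as separate pieces, while (i) consumes it and can leave empty pieces instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitComparison {
    // Expression (i), assumed to use lookbehinds (see the note above).
    public static final Pattern BOUNDARY_AND_SPACE =
        Pattern.compile("\\s*(?:\\b|(?<=\\S)(?=\\s)|(?<=\\s)(?=\\S))\\s*");

    // Expression (ii): a bare word boundary.
    public static final Pattern WORD_BOUNDARY = Pattern.compile("\\b");

    // Split the input and discard empty or whitespace-only pieces.
    public static List<String> nonBlankTokens(Pattern delimiter, String input) {
        List<String> tokens = new ArrayList<>();
        for (String piece : delimiter.split(input)) {
            if (!piece.trim().isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "search testing a-new string=3/4";
        // Raw pieces, including any empty or whitespace-only ones.
        System.out.println(Arrays.toString(WORD_BOUNDARY.split(input)));
        System.out.println(Arrays.toString(BOUNDARY_AND_SPACE.split(input)));
        // Pieces after dropping the blank ones.
        System.out.println(nonBlankTokens(WORD_BOUNDARY, input));
        System.out.println(nonBlankTokens(BOUNDARY_AND_SPACE, input));
    }
}
```

After the blank pieces are dropped, both expressions yield the same list for this input, so an analyzer that trims or skips blank tokens would hide the difference.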

Re: Performing a like query

2006-10-06 Thread Rahil
Hi Erick, I'm having trouble writing a good regular expression for the PatternAnalyzer to deal with word and non-word characters. I couldn't figure out a valid regular expression to pass to Pattern.compile(String regex) that can tokenise a string such as O/E - visual acuity R-eye=6/24

Re: Performing a like query

2006-10-06 Thread Steven Rowe
Hi Rahil, Rahil wrote: "I couldn't figure out a valid regular expression to pass to Pattern.compile(String regex) that can tokenise a string such as O/E - visual acuity R-eye=6/24 into O, /, E, -, visual, acuity, R, -, eye, =, 6, /, 24." The following regular expression should match
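The expression Steven refers to is cut off in this preview, so it is not reproduced here. As a stand-in illustration of the tokenisation Rahil describes, here is a rough sketch that matches the tokens themselves (a run of word characters, or a single non-word, non-whitespace character) rather than the separators between them. That makes it a different technique from the split-style expression discussed in the thread; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordAndSymbolTokens {
    // Either a run of word characters, or one character that is
    // neither a word character nor whitespace.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    // Collect every match in order; whitespace is skipped implicitly
    // because neither alternative can match it.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("O/E - visual acuity R-eye=6/24"));
        // prints [O, /, E, -, visual, acuity, R, -, eye, =, 6, /, 24]
    }
}
```

Matching tokens directly avoids the empty strings that zero-width delimiters can produce when splitting, at the cost of not being usable where a delimiter pattern is expected.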

Re: Performing a like query

2006-10-02 Thread Chris Hostetter
: I have a custom-built Analyzer where I tokenize all non-whitespace : characters as well available in the field TERM (which is the only : field being tokenised). : If I now query my index file for a term 6/12 for instance, I get back : only ONE result : instead of TWO. There is another token in

Performing a like query

2006-10-01 Thread Rahil
Hi, I have a custom-built Analyzer where I tokenize all non-whitespace characters as well available in the field TERM (which is the only field being tokenised). If I now query my index file for a term 6/12 for instance, I get back only ONE result SCORE DESCRIPTION STATUS CONCEPTID

Re: Performing a like query

2006-10-01 Thread Erick Erickson
Most often, from what I've seen on this e-mail list, unexpected results are because you're not indexing on the tokens you *think* you're indexing. Or not searching on them. By that I mean that the analyzers you're using are behaving in ways you don't expect. That said, I think you're getting

Re: Performing a like query

2006-10-01 Thread Rahil
Hi Erick, thanks for your response. There's a lot to chew on in your reply and I'm looking at the suggestions you've made. Yeah, I have Luke installed and have queried my index, but I'm not getting any great explanation out of it. A query for 6/12 is sent as TERM:6/12 which is quite

Re: Performing a like query

2006-10-01 Thread Erick Erickson
Well, I'm not the greatest expert, but a quick look doesn't show me anything obvious. But I have to ask, wouldn't WhitespaceAnalyzer work for you? Although I don't remember whether WhitespaceAnalyzer lowercases or not. It sure looks like you're getting reasonable results given how you're