Yes, surely the analyzer used at indexing time governs what's in the index and thus what can be searched. I'd surmise that your index doesn't contain any tokens with < or > if StandardAnalyzer was used, so no matter what analyzer you use at query time, you won't be able to find tokens that aren't in your index.
There is one exception. If the fields were indexed UN_TOKENIZED then no tokenization occurs at index time. But that means that the input isn't broken into words, so if you indexed "this is a sentence", you wouldn't match unless you searched on "this is a sentence", stop words and all. So, you have to go back and re-index with different analyzers if you really need to search on tokens like < and >. You might get some benefit from PerFieldAnalyzerWrapper, and you might have to index the same data in more than one field, say dataTokenized and dataUnTokenized to get the behavior you need.... Best Erick On 1/30/07, poeta simbolista <[EMAIL PROTECTED]> wrote:
Thanks for your reply. I am working with an index which is created separately. It is created with a StandardTokenizer. I have read there should be used the same tokenizer which the index was created with. Anyway, I have tried other tokenizers while consulting, such as the WhitespaceTokenizer, but the same results happen. So, does it boil down to the tokenizer which the index was created with? Thanks a lot. Diego Ps. Yes, I was using Luke which works great. But anyway, things like <* dont work in any analyser -- don give me any results, although searching in the db gives me loads -- and also i can get those results (via searching other fields) on luke and i can see the text is there, the < and > are there. Erick Erickson wrote: > > Sure, your problem is probably that the query goes through an analyzer and > its associated tokenizer. Probably something like StandardAnalyzer which > "massages" the input and strips out most non-alphabetic characters,.... > except some. It tries to be smart about URLs, e-mail addresses, etc. > > If you're new, I recommend you use WhitespaceAnalyzer until you get > familiar > with what analyzers do for you. Be aware that WhitespaceAnalyzer does NOT > automatically lower-case your input, so "Which" won't match "which". It's > easy enough to make your own analyzer by subclassing of the standard ones. > The book "Lucene in Action " is valuable for this, although you should be > aware that it was written to the 1.4 codebase, so there are a few > differences. > > It is important that the analyzer you use at *index* time is compatible > with > the one you use at *query* time. Until you're more familiar with this, I > simply recommend you use the *same* analyzer at index time that you use at > search time. That'll give you more intuitive results. You'll want to > refine > the use of analyzers later... > > I also recommend that you get a copy of luke (google lucene luke). It will > allow you to examine your index, parse queries through the GUI, examine > the > effects of different analyzers on input etc. It's a great tool and one > that'll make your life much easier. > > Best > Erick > > > On 1/29/07, poeta simbolista <[EMAIL PROTECTED]> wrote: >> >> >> Hi there, this is my very first post at this forum... please be >> considerate >> :) >> >> Well, i have a problem when sending a query such as: >> >> +description:< >> >> Once the query is parsed, it returns me the empty String, which means the >> String "<" that i want to search for on the field description is ignored. >> If i use normal words then it is taken. Do you know why this could be? >> Thanks. >> -- >> View this message in context: >> http://www.nabble.com/Problem-with-lucene.-tf3137405.html#a8694565 >> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> >> > > -- View this message in context: http://www.nabble.com/Problems-with-some-characters-tf3137405.html#a8707691 Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]