From what I can see in the debugger, the analyzer chain is implemented
as a stack, with the last filter at the bottom and the first filter at the top.
An analyzer query chain of:
charFilter: MappingCharFilterFactory
tokenizer : WhitespaceTokenizerFactory
filter : PatternReplaceFilterFactory
filter : LowerCaseFilterFactory
filter : ShingleFilterFactory
filter : SynonymFilterFactory
has a chain of:
this.input(SynonymFilter) --> input(ShingleFilter) -->
input(LowerCaseFilter) --> input(PatternReplaceFilter) -->
input(WhitespaceTokenizer) --> input(MappingCharFilter) -->
input(CharReader) --> input(StringReader).str
So I can always "see" the input of StringReader, but can I access it?
Bernd
On 26.10.2011 09:37, Chris Male wrote:
We've also lost the full query string by the time the QP creates its
TokenStream, right? Because the QP tokenizes on whitespace.
On Wed, Oct 26, 2011 at 8:32 PM, Uwe Schindler<[email protected]> wrote:
Hi Simon,
The problem is the exchanged consumer/producer role. Once the TokenStream
calls clearAttributes() the attributes are gone, but the query parser can
only set the attribute *before* calling incrementToken(), so you have no
chance to get them, as the Tokenizer clears it before any filter can read
it (unless we use an attribute whose clear() is a no-op, which would fail
lots of tests, as it's a hack).
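To make the ordering problem concrete, here is a minimal sketch. It is a toy
model only: ToyAttributes, ToyTokenizer and ToyFilter are hypothetical names
and not Lucene classes. It just mimics the sequence described above, where
the consumer sets a value first, the tokenizer clears all attributes at the
start of incrementToken(), and the filter only gets to look afterwards.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the shared attribute state; NOT a Lucene class. */
final class ToyAttributes {
    final Map<String, String> values = new HashMap<>();

    void clearAttributes() {
        values.clear();
    }
}

/** Toy tokenizer: clears all attributes before producing each token. */
final class ToyTokenizer {
    private final ToyAttributes atts;
    private final String[] words;
    private int pos = 0;

    ToyTokenizer(ToyAttributes atts, String text) {
        this.atts = atts;
        this.words = text.trim().split("\\s+");
    }

    boolean incrementToken() {
        // Wipes everything, including values the consumer set beforehand.
        atts.clearAttributes();
        if (pos >= words.length) {
            return false;
        }
        atts.values.put("term", words[pos++]);
        return true;
    }
}

/** Toy filter: can only read attributes after its input has advanced. */
final class ToyFilter {
    private final ToyAttributes atts;
    private final ToyTokenizer input;
    String seenQuery = "unset";

    ToyFilter(ToyAttributes atts, ToyTokenizer input) {
        this.atts = atts;
        this.input = input;
    }

    boolean incrementToken() {
        if (!input.incrementToken()) {
            return false;
        }
        // By now the consumer-set "query" entry has already been cleared.
        seenQuery = atts.values.get("query");
        return true;
    }
}
```

If the consumer puts a "query" entry into the shared state and then calls
incrementToken() on the filter, the filter sees null for "query" while the
"term" entry is present.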
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]
-----Original Message-----
From: Simon Willnauer [mailto:[email protected]]
Sent: Wednesday, October 26, 2011 9:21 AM
To: [email protected]
Subject: Re: accessing the query string from inside TokenFilter
What Uwe says is correct, though. What we possibly could do is add a
query attribute that is set in the query parser (you can do that yourself,
though).
Not sure if it is worth it and if we should do it.
simon
On Wed, Oct 26, 2011 at 8:58 AM, Uwe Schindler<[email protected]> wrote:
Hi,
QueryParser and TokenStreams are clearly separated; there is no way to
get the query string from inside a TokenStream (and there cannot be,
because the QP is a consumer of the TS, which is used not only for query
parsing). The only chance you have is to use a ThreadLocal that you
set before the query is parsed and then read in the TokenFilter.
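A minimal sketch of that ThreadLocal idea, using only the JDK. The names
QueryContext, withQuery, current and TokenCheck are hypothetical and not part
of Lucene or Solr. The Supplier stands in for whatever call actually parses
the query, and isWholeQuery stands in for the comparison a custom TokenFilter
might do inside incrementToken().

```java
import java.util.function.Supplier;

/** Hypothetical per-thread holder for the raw query string. */
final class QueryContext {
    private static final ThreadLocal<String> QUERY = new ThreadLocal<>();

    private QueryContext() {}

    /** Runs the parse step with the query string visible to this thread only. */
    static <T> T withQuery(String query, Supplier<T> parseStep) {
        QUERY.set(query);
        try {
            return parseStep.get();
        } finally {
            // Always remove, so the value does not leak into pooled threads.
            QUERY.remove();
        }
    }

    /** Returns the query string for this thread, or null outside withQuery(). */
    static String current() {
        return QUERY.get();
    }
}

/** Stand-in for the check a custom TokenFilter would perform per token. */
final class TokenCheck {
    private TokenCheck() {}

    static boolean isWholeQuery(String term) {
        String query = QueryContext.current();
        return query != null && query.equalsIgnoreCase(term);
    }
}
```

This only works if analysis runs on the same thread that set the value, and
the filter has to cope with null when it is used outside query parsing, for
example at index time.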
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]
-----Original Message-----
From: Bernd Fehling [mailto:[email protected]]
Sent: Wednesday, October 26, 2011 8:33 AM
To: [email protected]
Subject: accessing the query string from inside TokenFilter
Dear list,
while writing a TokenFilter for my analyzer chain I need access to the
query string from inside my TokenFilter for some comparison, but the
filters work with a TokenStream and only get separate tokens.
Currently I couldn't get any access to the query string.
It would be great to have such functionality in Lucene/Solr.
Should I open a JIRA issue for it, or is there a wish list somewhere?
Best regards
Bernd
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected] For
additional commands, e-mail: [email protected]
--
*************************************************************
Bernd Fehling Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH) Universitätsstr. 25
Tel. +49 521 106-4060 Fax. +49 521 106-4052
[email protected] 33615 Bielefeld
BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************