Hi Ahmed!

2011/12/1 Ahmed Saidi <ci7nu...@gmail.com>:
> Hi Veit,
> I want to build a filter that splits tokens containing both letters
> [a-zA-Z] and digits into two or more tokens.
> For example, if this filter gets a token like "test123", it will split it
> into two tokens, "test" and "123", and it will split "ci7nucha" into
> "ci", "7" and "nucha".
> My implementation does that, but rather than converting the split tokens
> into TermQueries, it converts them into a PhraseQuery.
> I want to build a filter like Solr's WordDelimiterFilter.
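[Editor's note: the splitting rule Ahmed describes can be sketched as a plain C++ helper. This is an illustration of the rule only, not CLucene filter code; the function name `splitAlphaNum` is hypothetical.]

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// True if c is an ASCII digit (cast avoids UB for negative char values).
static bool isDigitChar(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: break a token at every letter<->digit boundary,
// so "test123" becomes {"test", "123"} and "ci7nucha" becomes
// {"ci", "7", "nucha"}.
std::vector<std::string> splitAlphaNum(const std::string& token) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : token) {
        // Start a new part whenever the character class flips.
        if (!current.empty() && isDigitChar(c) != isDigitChar(current.back())) {
            parts.push_back(current);
            current.clear();
        }
        current += c;
    }
    if (!current.empty())
        parts.push_back(current);
    return parts;
}
```

A real token filter would apply this per incoming token and emit one output token per part, which is where the position-increment question below comes in.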
But then, I think, you have to extend/change the QueryParser. When a
field query is parsed, the result depends on the tokenization of the
text. If the tokenization yields a single term, the outcome of parsing
the field query is a TermQuery. If not, as in your case, the outcome is
either a BooleanQuery consisting of several TermQueries or a PhraseQuery
(cf. QueryParser::getFieldQuery(const TCHAR* _field, TCHAR* queryText)
in QueryParser.cpp). I think you need the BooleanQuery as the outcome,
right? This seems to depend on the position increments returned by the
filter, but I haven't understood this completely yet.

Kind regards,
Veit

_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers
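[Editor's note: the position-increment dependence Veit points at can be sketched as a simplified model. This is not CLucene's actual getFieldQuery code; the `Token` struct and `chooseQueryType` function are hypothetical, but the rule they encode matches the classic Lucene behaviour: tokens that all share one position (increments of 0 after the first) produce a BooleanQuery of TermQueries, while tokens at distinct positions produce a PhraseQuery.]

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an analyzed token: its text plus the
// position increment the filter emitted for it.
struct Token {
    std::string text;
    int posIncr;
};

// Simplified model of the decision in QueryParser::getFieldQuery:
//   one token                         -> TermQuery
//   several tokens, one position      -> BooleanQuery of TermQueries
//   several tokens, several positions -> PhraseQuery
std::string chooseQueryType(const std::vector<Token>& tokens) {
    if (tokens.size() == 1)
        return "TermQuery";
    int positions = 0;
    for (const Token& t : tokens)
        positions += t.posIncr; // sum of increments = number of distinct positions
    return positions <= 1 ? "BooleanQuery" : "PhraseQuery";
}
```

So, under this model, a splitting filter that emits "test" with increment 1 and "123" with increment 0 would get a BooleanQuery from the parser, while emitting both with increment 1 would get a PhraseQuery.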