[SOLR] RFC - Contributing a FrequentSearchTerm component ...

Siegfried Goeschl Fri, 09 Nov 2012 05:37:50 -0800

Hi folks,

I'm now finishing a SOLR project for one of my customers (replacing Microsoft FAST server with SOLR) and got permission to contribute our improvements.

The most interesting thing is a "FrequentSearchTerm" component which allows us to analyze the user-supplied search queries in real time:

+) it keeps track of the last queries per core using a LIFO buffer (so we have an upper limit on memory consumption)

+) per query entry we keep track of the number of invocations, the average number of result documents and the average execution time

+) we allow for custom searches across the frequent search terms using the MVEL expression language (see http://mvel.codehaus.org)

++) find all queries which did not yield any results - 'meanHits==0'

++) find all "iPhone" queries - 'searchTerm.contains("iphone") || searchTerm.contains("i-phone")'

++) find all long-running "iPhone" queries - '(searchTerm.contains("iphone") || searchTerm.contains("i-phone")) && meanTime>50'


+) GUI: we have a JSP page which gives access to the frequent search terms

+) there is also an XML/CSV export we use to display the 50 most frequently used search queries in real time
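
For illustration, the bookkeeping described above could be sketched in plain Java roughly as follows. This is not the actual component code. The class name FrequentSearchTerms, the eviction policy (drop the oldest tracked term once the limit is reached), the lower-casing of search terms and the millisecond unit are all assumptions made for the example, and java.util.function.Predicate lambdas stand in for the MVEL expressions so that the sketch needs no external library.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch of a bounded per-core search term tracker. */
final class FrequentSearchTerms {

    /** Running statistics for one distinct search term. */
    static final class Entry {
        final String searchTerm;
        long count;       // number of invocations
        double meanHits;  // average number of result documents
        double meanTime;  // average execution time (assumed: milliseconds)

        Entry(String searchTerm) {
            this.searchTerm = searchTerm;
        }

        /** Updates the running means incrementally, without storing samples. */
        void record(long hits, long timeMs) {
            count++;
            meanHits += (hits - meanHits) / count;
            meanTime += (timeMs - meanTime) / count;
        }
    }

    private final int capacity;
    // Insertion-ordered map: the first key is the oldest tracked term.
    private final Map<String, Entry> entries = new LinkedHashMap<>();

    FrequentSearchTerms(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    /** Records one executed query; evicts the oldest term when full (assumed policy). */
    synchronized void record(String searchTerm, long hits, long timeMs) {
        String key = searchTerm.trim().toLowerCase(Locale.ROOT); // assumed normalization
        Entry entry = entries.get(key);
        if (entry == null) {
            if (entries.size() >= capacity) {
                Iterator<String> oldest = entries.keySet().iterator();
                oldest.next();
                oldest.remove();
            }
            entry = new Entry(key);
            entries.put(key, entry);
        }
        entry.record(hits, timeMs);
    }

    /** Returns the matching entries, most frequently used first, at most 'limit' of them. */
    synchronized List<Entry> find(Predicate<Entry> filter, int limit) {
        return entries.values().stream()
                .filter(filter)
                .sorted(Comparator.comparingLong((Entry e) -> e.count).reversed())
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        FrequentSearchTerms terms = new FrequentSearchTerms(1000);
        terms.record("iPhone 5", 120, 80);
        terms.record("iphone 5", 100, 40);
        terms.record("i-phone case", 0, 10);

        // Java stand-ins for the MVEL filters quoted in the list above.
        Predicate<Entry> noHits = e -> e.meanHits == 0;
        Predicate<Entry> iphone =
                e -> e.searchTerm.contains("iphone") || e.searchTerm.contains("i-phone");
        Predicate<Entry> slowIphone = iphone.and(e -> e.meanTime > 50);

        for (Entry e : terms.find(slowIphone, 50)) {
            System.out.println(e.searchTerm + " count=" + e.count
                    + " meanHits=" + e.meanHits + " meanTime=" + e.meanTime);
        }
        System.out.println("queries without results: " + terms.find(noHits, 50).size());
    }
}
```

The running means are updated incrementally, so each entry stays a fixed size no matter how often a query is repeated, and the capacity limit is what bounds the memory per core. In the real component the filter would be an MVEL expression evaluated against each entry rather than a compiled lambda.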


We use this component

+) to get input for QA regarding frequently used search terms

+) to find strange queries, e.g. queries returning no or too many results, for instance caused by the WordDelimiterFilter

+) to keep our management happy ... :-)

So the question is: is the community interested in such a contribution? If yes, then I need to spend some time improving the code from "industrial quality" to "open source quality", including documentation ... you know what I mean ... :-)


Thanks in advance,

Siegfried Goeschl

PS: Not sure if the name "Frequent Search Term Component" is perfectly suitable as it was taken from FAST - suggestions welcome

