Thank you for the links.

The book is really useful. I will definitely have to spend some time
reformatting the logs to get access to the number of results found, the
session id and much more.

I'm also quite happy that my test cases produce results similar to the
precision reports shown at the beginning of the book.
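
As a rough illustration of the log reformatting mentioned above, the sketch
below aggregates search log records per query and flags queries that are
searched repeatedly but rarely clicked. Everything about the record layout is
an assumption made for the example: the field names (session_id, query,
num_found, clicked), the sample rows and the thresholds are hypothetical and
would have to be adapted to whatever the real logs contain.

```python
from collections import defaultdict

# Hypothetical, already-parsed log records; the field names are assumptions.
LOG = [
    {"session_id": "s1", "query": "nike sport watch", "num_found": 12, "clicked": True},
    {"session_id": "s2", "query": "nike sport watch", "num_found": 12, "clicked": False},
    {"session_id": "s3", "query": "tomtom strap", "num_found": 0, "clicked": False},
    {"session_id": "s4", "query": "tomtom strap", "num_found": 0, "clicked": False},
]


def query_stats(records):
    """Aggregate per-query counts of searches, zero-result searches and clicks."""
    stats = defaultdict(lambda: {"searches": 0, "zero_results": 0, "clicks": 0})
    for rec in records:
        # Normalise the query text so trivially different spellings are merged.
        s = stats[rec["query"].strip().lower()]
        s["searches"] += 1
        s["zero_results"] += 1 if rec["num_found"] == 0 else 0
        s["clicks"] += 1 if rec["clicked"] else 0
    return dict(stats)


def suspicious_queries(stats, min_searches=2, max_click_rate=0.1):
    """Queries searched at least min_searches times with a low click rate."""
    return sorted(
        q for q, s in stats.items()
        if s["searches"] >= min_searches
        and s["clicks"] / s["searches"] <= max_click_rate
    )


if __name__ == "__main__":
    stats = query_stats(LOG)
    for q in suspicious_queries(stats):
        print(q, stats[q])
```

The min_searches threshold only filters out queries seen too rarely to say
anything about; it does not solve the problem of queries that are unique to
one user.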

Giovanni


2014-04-09 12:59 GMT+02:00 Ahmet Arslan <iori...@yahoo.com>:

> Hi Giovanni,
>
> Here are some relevant pointers :
>
>
> http://www.lucenerevolution.org/2013/Test-Driven-Relevancy-How-to-Work-with-Content-Experts-to-Optimize-and-Maintain-Search-Relevancy
>
>
> http://rosenfeldmedia.com/books/search-analytics/
>
> http://www.sematext.com/search-analytics/index.html
>
>
> Ahmet
>
>
> On Wednesday, April 9, 2014 12:17 PM, Giovanni Bricconi <
> giovanni.bricc...@banzai.it> wrote:
> I have been working on an e-commerce site for about a year, and
> unfortunately I have no "information retrieval" background, so I am
> probably missing some important practices about relevance tuning and
> search engines.
> During this period I had to fix many "bugs" about bad search results, which
> I solved sometimes by tuning edismax weights and sometimes by creating ad
> hoc query filters or query boosts; but I still cannot figure out what the
> correct process for improving search result relevance should be.
>
> These are the practices I am following. I would really appreciate any
> comments about them and any hints about the practices you follow in your
> projects:
>
> - In order to have a measure of search quality I have written many test
> cases such as "if the user searches for <<nike sport watch>> the search
> result should display at least four <<tom tom>> products with the words
> <<nike>> and <<sportwatch>> in the title". I have written a tool that reads
> such tests from json files, applies them to my applications, and then
> counts the number of results that do not match the criteria stated in
> the test cases. (For those interested, this tool is available at
> https://github.com/gibri/kelvin but it is still very much a prototype.)
>
> - I use this count as a quality index. I have tried several times to change
> the edismax weights to lower the total number of errors, or to add new
> filters/boosts to the application to try to decrease the error count.
>
> - The advantage of this is that at least you have a number to look at, and
> that you have a quick way of checking the impact of a modification.
>
> - The downside is that you have to maintain the test cases: I now have
> about 800 tests and my product catalogue changes often, which means that
> some products leave the catalogue and some test cases can no longer pass.
>
> - I am populating the test cases using errors reported by users, and I
> feel that this is driving the test cases too much toward pathological
> cases. Moreover, I don't have many tests for cases that are working well
> now.
>
> I would like to use search logs as drivers to generate tests, but I feel I
> haven't picked the right path. Using top queries, manually reviewing
> results, and then writing tests is a slow process; moreover many top
> queries are ambiguous or are driven by site ads.
>
> Many queries are unique to a single user. How should these cases be handled?
>
> How are you using your logs to find test cases to fix? Are you looking
> for queries where the user is not "opening" any of the returned results?
> Which KPI have you chosen to find queries that are not providing good
> results? And what are you using as a KPI for the whole search, besides the
> conversion rate?
>
> Can you suggest any other practices you are using in your projects?
>
> Thank you very much in advance
>
> Giovanni
>
>
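
The test-case approach described in the quoted message (check each query's
results against stated criteria and use the total error count as a quality
index) can be sketched roughly as follows. This is a minimal illustration and
not the kelvin tool itself: the test-case fields (query, min_matches, brand,
title_words), the result fields (title, brand) and the search callable are
hypothetical names chosen for the example. It also assumes that the error for
a test case is the number of expected matches that are missing, which is only
one possible way of counting.

```python
from typing import Callable, Dict, List

# Hypothetical test case: "nike sport watch" should return at least four
# tom tom products whose title contains "nike" and "sportwatch".
TEST_CASES = [
    {
        "query": "nike sport watch",
        "min_matches": 4,
        "brand": "tom tom",
        "title_words": ["nike", "sportwatch"],
    },
]


def matches(doc: Dict[str, str], case: dict) -> bool:
    """True if a result document satisfies the test case's criteria."""
    title = doc.get("title", "").lower()
    brand_ok = doc.get("brand", "").lower() == case["brand"]
    return brand_ok and all(word in title for word in case["title_words"])


def error_count(cases: List[dict], search: Callable[[str], List[dict]]) -> int:
    """Sum, over all test cases, how many expected matches are missing."""
    errors = 0
    for case in cases:
        found = sum(1 for doc in search(case["query"]) if matches(doc, case))
        errors += max(0, case["min_matches"] - found)
    return errors


def fake_search(query: str) -> List[dict]:
    """Stand-in for the real search call; returns canned results."""
    return [
        {"title": "Nike+ SportWatch GPS", "brand": "Tom Tom"},
        {"title": "Nike+ SportWatch GPS blue", "brand": "Tom Tom"},
        {"title": "Nike running shoes", "brand": "Nike"},
    ]


if __name__ == "__main__":
    # Only two of the four expected products are found, so two errors.
    print(error_count(TEST_CASES, fake_search))
```

Passing the search function in as a parameter keeps the counting logic
independent of the search backend, so the same test cases can be run against a
canned result set or against a live index after each change to the weights.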
