Hey Audrey,

I assume MRR here is about the ranking of the intended suggestion. For this,
no human judgement is required. We track position selection, i.e. the
position (1-10) of the selected suggestion. For example, these are our recent
numbers:

Position 1 Selected  (B3)   107,699
Position 2 Selected  (B4)    58,736
Position 3 Selected  (B5)    23,507
Position 4 Selected  (B6)    12,250
Position 5 Selected  (B7)     7,980
Position 6 Selected  (B8)     5,653
Position 7 Selected  (B9)     4,193
Position 8 Selected  (B10)    3,511
Position 9 Selected  (B11)    2,997
Position 10 Selected (B12)    2,428
Total Selections     (B13)  228,954

MRR = (B3 + B4/2 + B5/3 + B6/4 + B7/5 + B8/6 + B9/7 + B10/8 + B11/9 + B12/10) / B13
    ≈ 0.6645 (66.45%)
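If it helps, here is a minimal Python sketch of the same calculation (the
position counts are hard-coded from the table above):

selections = {
    1: 107_699, 2: 58_736, 3: 23_507, 4: 12_250, 5: 7_980,
    6: 5_653, 7: 4_193, 8: 3_511, 9: 2_997, 10: 2_428,
}

# MRR is the count-weighted mean of 1/position over all selections.
total = sum(selections.values())                       # 228,954 (B13)
mrr = sum(count / pos for pos, count in selections.items()) / total
print(f"Total selections: {total}")
print(f"MRR: {mrr:.4f} ({mrr:.2%})")                   # ~0.6645 (66.45%)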

Refer here for how MRR is calculated in the Auto-Suggest context:
https://medium.com/@dtunkelang/evaluating-search-measuring-searcher-behavior-5f8347619eb0

"In practice, this is inverted to obtain the reciprocal rank, e.g., if the
searcher clicks on the 4th result, the reciprocal rank is 0.25. The average
of these reciprocal ranks is called the mean reciprocal rank (MRR)."

nDCG, on the other hand, may require human judgement. Please let me know if I
have not understood your question properly. :)



On Mon, 24 Feb 2020 at 20:49, Audrey Lorberfeld - audrey.lorberf...@ibm.com
<audrey.lorberf...@ibm.com> wrote:

> Hi Paras,
>
> This is SO helpful, thank you. Quick question about your MRR metric -- do
> you have binary human judgements for your suggestions? If not, how do you
> label suggestions as successful or not?
>
> Best,
> Audrey
>
> On 2/24/20, 2:27 AM, "Paras Lehana" <paras.leh...@indiamart.com> wrote:
>
>     Hi Audrey,
>
>     I work on Auto-Suggest at IndiaMART. Although we don't use the Suggester
>     component, I think you need evaluation metrics for Auto-Suggest as a
>     business product and not specifically for the Solr Suggester, which is
>     the backend. We use the edismax parser with EdgeNGram tokenization.
>
>     Every week, as the property owner, I report around 500 metrics. I would
>     like to mention a few of those:
>
>        1. MRR (Mean Reciprocal Rank): How high the user's selection ranked
>        among the returned results. Ranges from 0 to 1; the higher, the better.
>        2. APL (Average Prefix Length): The prefix is the query typed by the
>        user; the lower, the better. This reports how little an average user
>        has to type to get the intended suggestion.
>        3. Acceptance Rate or Selection: How many of the total searches are
>        served from Auto-Suggest. We are at around 50%.
>        4. Selection to Display Ratio: Did you make the user click any of the
>        suggestions when they were displayed?
>        5. Response Time: How fast you serve your average query.
>
>
>     The Selection and Response Time are our main KPIs. We track a lot about
>     Auto-Suggest usage on our platform, which becomes apparent if you observe
>     the URL after clicking a suggestion on dir.indiamart.com. However, not
>     everything would benefit you. Do let me know if you have any related
>     query or need an explanation. Hope this helps. :)
>
>     On Fri, 14 Feb 2020 at 21:23, Audrey Lorberfeld -
> audrey.lorberf...@ibm.com
>     <audrey.lorberf...@ibm.com> wrote:
>
>     > Hi all,
>     >
>     > How do you all evaluate the success of your query autocomplete (i.e.
>     > suggester) component if you use it?
>     >
>     > We cannot use MRR for various reasons (I can go into them if you're
>     > interested), so we're thinking of using nDCG since we already use
> that for
>     > relevance eval of our system as a whole. I am also interested in the
> metric
>     > "success at top-k," but I can't find any research papers that
> explicitly
>     > define "success" -- I am assuming it's a suggestion (or suggestions)
>     > labeled "relevant," but maybe it could also simply be the suggestion
> that
>     > receives a click from the user?
>     >
>     > Would love to hear from the hive mind!
>     >
>     > Best,
>     > Audrey
>     >
>     > --
>     >
>     >
>     >
>

-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, *Auto-Suggest*,
IndiaMART InterMESH Ltd,

11th Floor, Tower 2, Assotech Business Cresterra,
Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305

Mob.: +91-9560911996
Work: 0120-4056700 | Extn:
*11096*

