Good hits on page two:

There are a few cases where good results could exist only on page two.

One case is when incorrectly searching for a homophone or other
misspelling. E.g., "their red hot" instead of "they're red hot" (expected
result <https://en.wikipedia.org/wiki/They%27re_Red_Hot> -- wikipedia
<https://en.wikipedia.org/w/index.php?search=their+red+hot&title=Special%3ASearch>
(pos 22), google
<https://www.google.com/search?q=their+red+hot&oq=their+red+hot> (pos 1),
bing <https://www.bing.com/search?q=their+red+hot> (pos 2), ddg
<https://duckduckgo.com/?q=their+red+hot> (pos 2)).

Another case is when you get an exact string match on incorrect pages, but
only a non-exact string match on the correct page. E.g., "Cities in the San
Francisco Bay Area" (expected result
<https://en.wikipedia.org/wiki/List_of_cities_and_towns_in_the_San_Francisco_Bay_Area>
-- wikipedia
<https://en.wikipedia.org/w/index.php?title=Special:Search&search=Cities+in+the+San+Francisco+Bay+Area>
(pos 122), google
<https://www.google.com/search?q=Cities+in+the+San+Francisco+Bay+Area>
(pos 1), bing
<https://www.bing.com/search?q=Cities+in+the+San+Francisco+Bay+Area>
(pos 1), ddg <https://duckduckgo.com/?q=Cities+in+the+San+Francisco+Bay+Area>
(pos 1)).
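
One way to spot these page-two cases systematically is to script the
position check; a minimal sketch, assuming you've already fetched an
ordered list of result titles (e.g. from the MediaWiki search API's
list=search module -- the helper itself is pure, so no network is needed):

```python
# Hypothetical helper: find the 1-based rank of an expected page title in
# an ordered result list fetched from a search engine.
def rank_of(expected_title, result_titles):
    """Return the 1-based position of expected_title, or None if absent."""
    for i, title in enumerate(result_titles, start=1):
        if title == expected_title:
            return i
    return None

# Made-up result list for illustration.
results = ["Some other list", "Another page",
           "List of cities and towns in the San Francisco Bay Area"]
print(rank_of("List of cities and towns in the San Francisco Bay Area",
              results))  # -> 3
```

Run over a set of (query, expected URL) pairs, anything with a rank of 11+
(or None) is a candidate page-two case.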

This pattern occurs mostly for navigational queries (where there is only
one correct result). For exploratory queries, odds are that at least one
relevant result will land on page 1.

There are a couple of less direct cases. For instance, if/once you
integrate a popularity score, freshness score, importance score, page query
score, or personalization (e.g. ranking by physical distance from the user
or by the user's interests), you'll find examples where incorrect results
are unhelpfully boosted.
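
A toy illustration of how such a blend can misfire -- the signals, weights,
and numbers below are all made up:

```python
# Hypothetical linear blend of text relevance and popularity; a popular but
# less-relevant page can be boosted past the correct result.
def blended_score(text_score, popularity, w_text=0.7, w_pop=0.3):
    """Toy ranking function; weights and signals are illustrative only."""
    return w_text * text_score + w_pop * popularity

# Correct page: strong text match, modest popularity.
correct = blended_score(0.9, 0.2)     # ~0.69
# Incorrect page: weaker text match, very popular.
incorrect = blended_score(0.6, 0.95)  # ~0.705 -- edges past the correct page
```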

Investigating queries which lead to clicks on page two may surface
interesting patterns.

--

Knowing the SAT/DSAT-click-rate-vs.-position curve will tell you whether
good clicks often occur beyond position 10. Then running an experiment of
10 SERP results vs. 20 SERP results, watching a session-success-rate metric
(and maybe a time-to-success metric), may give interesting insights --
i.e., checking whether a click on position 11+ is ever useful, or just
leads to a requery or abandonment. If you run result-size experiments, you
can normalize for query-latency effects by generating 20 results and
displaying 10.
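
A quick sketch of the click-rate-by-position computation, assuming
impressions are logged as (position, was_sat_click) pairs (the event shape
is an assumption about the instrumentation):

```python
from collections import defaultdict

# Hypothetical events: (position, was_sat_click) per shown result.
def sat_rate_by_position(events):
    """SAT-click rate per result position."""
    shown = defaultdict(int)
    sat = defaultdict(int)
    for pos, is_sat in events:
        shown[pos] += 1
        sat[pos] += int(is_sat)
    return {pos: sat[pos] / shown[pos] for pos in shown}

# Made-up sample: two impressions each at positions 1 and 11.
events = [(1, True), (1, False), (2, True), (11, False), (11, True)]
rates = sat_rate_by_position(events)
```

Comparing the curve's tail (positions 11-20) between the 10-result and
20-result arms is then straightforward.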

The need to scroll can cause a faster fall-off in the click rates listed.
In my browser, as it's currently sized, there are only three results above
the fold (my open advanced-facet block takes a lot of space; scrolling is
required for result 4+). Knowing how much (or whether) the click rate drops
for results below the fold will also help optimize the number of results to
display, snippet length, and UI design. You could instrument the number of
results above the fold.
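
That instrumentation could be as simple as counting results whose bottom
edge fits in the viewport; the pixel offsets below are invented, and in
practice they'd come from client-side telemetry:

```python
# Hypothetical above-the-fold counter: given each result's bottom-edge
# y-offset (px) and the viewport height, count fully visible results.
def results_above_fold(result_bottom_offsets, viewport_height):
    """Count results whose bottom edge is within the viewport."""
    return sum(1 for bottom in result_bottom_offsets
               if bottom <= viewport_height)

# Five results; only the first three fit in an 800px-tall viewport.
offsets = [250, 480, 760, 1040, 1320]
print(results_above_fold(offsets, 800))  # -> 3
```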

--

Side note: possible bug, I can't find the page "List of New York University
alumni <https://en.wikipedia.org/wiki/List_of_New_York_University_alumni>"
when querying "New York University alumni
<https://en.wikipedia.org/w/index.php?search=New+York+University+alumni&title=Special%3ASearch&go=Go>"
(screenshot <https://imgur.com/SymW9tv>).

--justin

On Wed, Feb 10, 2016 at 12:04 PM, billinghurst <[email protected]>
wrote:

> It would also be interesting to know the type of page that becomes their
> destination
>
> Person
> Object
> News
> Concept
> Etc.
>
> Some are easier to describe and predict, aligns with a search length too.
>
> -- billinghurst
>
>
> On Wed, 10 Feb 2016 22:01 Justin Ormont <[email protected]> wrote:
>
>> This is great. Do you have any categories tracked that could be
>> interesting to break the position click-rates down by? eg: navigational vs.
>> explorative queries, SAT clicks (satisfied user's query intent) vs. DSAT
>> clicks (not satisfied), requery rate (how many times a user reformulates a
>> new query in a session), time-to-first-click, search session duration,
>> user's country/default language/# edits, length of query (# of query
>> tokens), # of query results, popular vs. uncommon queries, high scoring
>> SERP vs. low scoring SERP (or a proxy like Max BM25F of the top result),
>> speller was clicked vs. not clicked, category of page clicked on, popular
>> pages vs. rarely visited pages, etc.
>>
>> This experiment running on Special:Search is nice as that page doesn't
>> automatically redirect when the query exactly matches a page.
>>
>> You can measure the positional importance by setting up an A/B test where
>> you flip position 2 & 3. Also, a slowdown experiment would tell you the
>> impact of latency, and help focus engineering efforts towards precision, or
>> latency improvements.
>>
>> --justin
>>
>>
>> On Tue, Feb 9, 2016 at 9:53 AM, Erik Bernhardson <
>> [email protected]> wrote:
>>
>>> 2 |  34214 | 14.26%
>>
>>
>>
>> _______________________________________________
>> discovery mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/discovery
>>
>