Re: [WikimediaMobile] Similar articles feature performance in CirrusSearch for apps and mobile web

Dmitry Brant Mon, 15 Feb 2016 11:27:12 -0800

Just a quick note that our latest production release (just published)
contains this A/B test, in addition to the other updates.
Looking forward to seeing the numbers from this!


-Dmitry


On Sun, Jan 31, 2016 at 9:35 PM, Dmitry Brant <[email protected]> wrote:

> Roger that! I think we could squeeze it in -- the change would be pretty
> straightforward. We'll be able to release a Beta with this A/B test in
> short order, but it will probably be a couple weeks until our next
> production release. I hope that's all right.
>
>
> On Sat, Jan 30, 2016 at 1:02 PM, Gabriel Wicke <[email protected]>
> wrote:
>
>> We are also happy to add cached entry points for high-traffic end
>> points in the REST API. I commented to that effect at
>> https://phabricator.wikimedia.org/T124216#1984206. Let us know if you
>> think this would be useful for this use case.
>>
>> On Sat, Jan 30, 2016 at 8:11 AM, Adam Baso <[email protected]> wrote:
>> > Okay. As per https://phabricator.wikimedia.org/T124225#1984080 I think
>> if
>> > we're doing near term experimentation with a controlled A/B test the
>> Android
>> > app is the only logical place to start. Dmitry, can that work for you?
>> It's
>> > not required, but I think it would be neat to see if we can move the
>> needle
>> > even more. Of course your quarterly goals take top priority...but what
>> do
>> > you think?
>> >
>> > On Sat, Jan 23, 2016 at 5:58 AM, Adam Baso <[email protected]> wrote:
>> >>
>> >> Hey all, am planning to look at Phabricator tasks and provide a reply
>> >> during the upcoming weekdays. Just wanted to acknowledge I saw your
>> replies!
>> >>
>> >>
>> >> On Friday, January 22, 2016, Erik Bernhardson <
>> [email protected]>
>> >> wrote:
>> >>>
>> >>> On Thu, Jan 21, 2016 at 1:29 AM, Joaquin Oltra Hernandez
>> >>> <[email protected]> wrote:
>> >>>>
>> >>>> Regarding the caching, apps and web would need to agree on the URL and
>> >>>> smaxage parameter, as Adam noted, so that the URLs are exactly the same.
>> >>>> That avoids bloating Varnish and lets us reuse the same cached objects
>> >>>> across platforms.
>> >>>>
>> >>>> It is an extremely ad hoc and brittle solution, but it seems like it
>> >>>> would be the greatest win.
>> >>>>
>> >>>> 20% of the search traffic, while being only in Android and web beta,
>> >>>> seems like a lot to me, and we should work on reducing it. Otherwise,
>> >>>> when it hits web stable we're going to crush the servers, so caching
>> >>>> seems the highest priority.
>> >>>>
>> >>> To clarify, it's 20% of the load, as opposed to 20% of the traffic. But
>> >>> same difference :)
>> >>>
>> >>>>
>> >>>> Let's chime in https://phabricator.wikimedia.org/T124216 and
>> continue
>> >>>> the cache discussion there.
>> >>>>
>> >>>> Regarding the validity of results with opening text only, how should
>> we
>> >>>> proceed? Adam?
>> >>>>
>> >>> I've put together https://phabricator.wikimedia.org/T124258 to track
>> >>> putting together an AB test that measures the difference in click
>> through
>> >>> rates for the two approaches.
>> >>>
>> >>>
>> >>>>
>> >>>> On Wed, Jan 20, 2016 at 9:34 PM, David Causse <[email protected]
>> >
>> >>>> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> Yes we can combine many factors, from templates (quality but also
>> >>>>> disambiguation/stubs), size and others.
>> >>>>> Today Cirrus uses mostly the number of incoming links, which (imho) is
>> >>>>> not very good for morelike.
>> >>>>> On enwiki, results will also be scored according to the weights defined in
>> >>>>> https://en.wikipedia.org/wiki/MediaWiki:Cirrussearch-boost-templates.
>> >>>>>
>> >>>>> I wrote a small bash script to compare results:
>> >>>>> https://gist.github.com/nomoa/93c5097e3c3cb3b6ebad
>> >>>>> Here are some random results from the list (sometimes better, sometimes
>> >>>>> worse):
>> >>>>>
>> >>>>> $ sh morelike.sh Revolution_Muslim
>> >>>>> Defaults
>> >>>>>         "title": "Chess",
>> >>>>>         "title": "Suicide attack",
>> >>>>>         "title": "Zachary Adam Chesser",
>> >>>>> =======
>> >>>>> Opening text no boost links
>> >>>>>         "title": "Hungarian Revolution of 1956",
>> >>>>>         "title": "Muslims for America",
>> >>>>>         "title": "Salafist Front",
>> >>>>>
>> >>>>> $ sh morelike.sh Chesser
>> >>>>> Defaults
>> >>>>>         "title": "Chess",
>> >>>>>         "title": "Edinburgh",
>> >>>>>         "title": "Edinburgh Corn Exchange",
>> >>>>> =======
>> >>>>> Opening text no boost links
>> >>>>>         "title": "Dreghorn Barracks",
>> >>>>>         "title": "Edinburgh Chess Club",
>> >>>>>         "title": "Threipmuir Reservoir",
>> >>>>>
>> >>>>> $ sh morelike.sh Time_%28disambiguation%29
>> >>>>> Defaults
>> >>>>>         "title": "Atlantis: The Lost Empire",
>> >>>>>         "title": "Stargate",
>> >>>>>         "title": "Stargate SG-1",
>> >>>>> =======
>> >>>>> Opening text no boost links
>> >>>>>         "title": "Father Time (disambiguation)",
>> >>>>>         "title": "The Last Time",
>> >>>>>         "title": "Time After Time",
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On 20/01/2016 19:34, Jon Robson wrote:
>> >>>>>>
>> >>>>>> I'm actually interested to see whether this yields better results in
>> >>>>>> certain examples where the algorithm is lacking [1]. If it's done as
>> >>>>>> an A/B test we could even measure things such as click-throughs in the
>> >>>>>> related article feature (whether they go up or not).
>> >>>>>>
>> >>>>>> Out of interest, is it also possible to take article size and type into
>> >>>>>> account and not return any morelike results for things like
>> >>>>>> disambiguation pages and stubs?
>> >>>>>>
>> >>>>>> [1] https://www.mediawiki.org/wiki/Topic:Swsjajvdll3pf8ya
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Jan 20, 2016 at 9:47 AM, Adam Baso <[email protected]>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>> One thing we could do regarding the quality of the output is check
>> >>>>>>> results
>> >>>>>>> against a random sample of popular articles (example approach to
>> find
>> >>>>>>> some
>> >>>>>>> articles) on mdot Wikipedia. Presuming that improves the quality
>> of
>> >>>>>>> the
>> >>>>>>> recommendations or at least does not degrade them, we should
>> consider
>> >>>>>>> adding
>> >>>>>>> the enhancement task to a future sprint, with further
>> instrumentation
>> >>>>>>> and
>> >>>>>>> A/B testing / timeboxed beta test, etc.
>> >>>>>>>
>> >>>>>>> Joaquin, smaxage (e.g., 24 hour cached responses) does seem a good
>> >>>>>>> fix for
>> >>>>>>> now for further reduction of client perceived wait, at least for
>> >>>>>>> non-cold
>> >>>>>>> cache requests, even if we stop beating up the backend. Does
>> anyone
>> >>>>>>> know of
>> >>>>>>> a compelling reason to not do that for the time being? The main
>> thing
>> >>>>>>> that
>> >>>>>>> comes to mind as always is growing the Varnish cache object pool -
>> >>>>>>> probably
>> >>>>>>> not a huge deal while the thing is only in beta, but on the stable
>> >>>>>>> channel
>> >>>>>>> maybe noteworthy because it would run on probably most pages (but
>> >>>>>>> that's
>> >>>>>>> what edge caches are for, after all).
>> >>>>>>>
>> >>>>>>> Erik, from your perspective does use of smaxage relieve the
>> backend
>> >>>>>>> sufficiently?
>> >>>>>>>
>> >>>>>>> If we do smaxage, then Web, Android, iOS should standardize their
>> >>>>>>> URLs so we
>> >>>>>>> get more cache hits at the edge across all clients. Here's the
>> URL I
>> >>>>>>> see
>> >>>>>>> being used on the web today from mobile web beta:
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> https://en.m.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&prop=pageimages%7Cpageterms&piprop=thumbnail&pithumbsize=80&wbptterms=description&pilimit=3&generator=search&gsrsearch=morelike%3ACome_Share_My_Love&gsrnamespace=0&gsrlimit=3
>> >>>>>>>
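To make the "standardize their URLs" point concrete, here is a rough Python sketch of how each client could build a byte-identical morelike URL so the edge cache sees a single object per title. The base parameters are copied from the mobile web beta URL quoted above. The sorted parameter order, the 24-hour default and the helper name `morelike_url` are assumptions for illustration only, not something the teams have agreed on.

```python
from urllib.parse import urlencode

API_BASE = "https://en.m.wikipedia.org/w/api.php"

# Parameters taken from the mobile web beta URL quoted in this thread.
BASE_PARAMS = {
    "action": "query",
    "format": "json",
    "formatversion": "2",
    "prop": "pageimages|pageterms",
    "piprop": "thumbnail",
    "pithumbsize": "80",
    "wbptterms": "description",
    "pilimit": "3",
    "generator": "search",
    "gsrnamespace": "0",
    "gsrlimit": "3",
}


def morelike_url(title, smaxage_seconds=24 * 60 * 60):
    """Build a canonical morelike URL (hypothetical helper).

    Assumption: sorting the parameters by name is just one possible
    convention; any fixed order shared by all clients would do.
    """
    params = dict(BASE_PARAMS)
    params["gsrsearch"] = "morelike:" + title
    # smaxage/maxage are the API's cache-lifetime parameters, in seconds.
    params["smaxage"] = str(smaxage_seconds)
    params["maxage"] = str(smaxage_seconds)
    return API_BASE + "?" + urlencode(sorted(params.items()))


if __name__ == "__main__":
    print(morelike_url("Come_Share_My_Love"))
```

Any client that emits the parameters in a different order, or with a different thumbnail size or limit, would create a separate cache object, so the exact values would need to be agreed on as well.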
>> >>>>>>>
>> >>>>>>> -Adam
>> >>>>>>>
>> >>>>>>> On Wed, Jan 20, 2016 at 7:45 AM, Joaquin Oltra Hernandez
>> >>>>>>> <[email protected]> wrote:
>> >>>>>>>>
>> >>>>>>>> I'd be up for it if we manage to cram it into a following sprint and
>> >>>>>>>> it is worth it.
>> >>>>>>>>
>> >>>>>>>> We could run a controlled test against production with a long
>> batch
>> >>>>>>>> of
>> >>>>>>>> articles and check median/percentiles response time with repeated
>> >>>>>>>> runs and
>> >>>>>>>> highlight the different results for human inspection regarding
>> >>>>>>>> quality.
>> >>>>>>>>
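As a rough sketch of the controlled test described above, the following Python summarizes response-time samples as median/percentiles and lists the titles that differ between two result sets for human inspection. It assumes the timings and titles have already been collected by some other script. The function names are hypothetical, and the sample titles are the "Chesser" results quoted elsewhere in this thread.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return median, p95 and p99 for a list of response times in ms."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points between percentiles.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {
        "median": statistics.median(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
    }


def result_diff(default_titles, variant_titles):
    """List the titles that appear in only one of the two result sets."""
    default_set, variant_set = set(default_titles), set(variant_titles)
    return {
        "only_default": sorted(default_set - variant_set),
        "only_variant": sorted(variant_set - default_set),
    }


if __name__ == "__main__":
    print(summarize_latencies([120.0, 95.0, 310.0, 101.0, 88.0, 450.0]))
    print(result_diff(
        ["Chess", "Edinburgh", "Edinburgh Corn Exchange"],
        ["Dreghorn Barracks", "Edinburgh Chess Club", "Threipmuir Reservoir"],
    ))
```

Repeated runs over the same batch of articles would each yield one such summary, which makes cold-cache and warm-cache behaviour easier to compare.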
>> >>>>>>>> It's been noted previously that the results are far from ideal
>> >>>>>>>> (which they
>> >>>>>>>> are because it is just morelike), and I think it would be a great
>> >>>>>>>> idea to
>> >>>>>>>> change the endpoint to a specific one that is smarter and has
>> some
>> >>>>>>>> cache (we
>> >>>>>>>> could do much more to get relevant results besides text
>> similarity,
>> >>>>>>>> take
>> >>>>>>>> into account links, or see also links if there are, etc...).
>> >>>>>>>>
>> >>>>>>>> As a note, in mobile web the related articles extension allows
>> >>>>>>>> editors to
>> >>>>>>>> specify articles to show in the section, which would avoid
>> queries
>> >>>>>>>> to
>> >>>>>>>> cirrussearch if it was more used (once rolled into stable I
>> guess).
>> >>>>>>>>
>> >>>>>>>> I remember that the performance related task was closed as
>> resolved
>> >>>>>>>> (https://phabricator.wikimedia.org/T121254#1907192), should we
>> >>>>>>>> reopen it or
>> >>>>>>>> create a new one?
>> >>>>>>>>
>> >>>>>>>> I'm not sure if we ended up adding the smaxage parameter (I think we
>> >>>>>>>> didn't). Should we? To me it seems a no-brainer that we should be caching
>> >>>>>>>> these results in Varnish, since they don't need to be completely up to
>> >>>>>>>> date for this use case.
>> >>>>>>>>
>> >>>>>>>> On Tue, Jan 19, 2016 at 11:54 PM, Erik Bernhardson
>> >>>>>>>> <[email protected]> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Both mobile apps and web are using CirrusSearch's morelike: feature, which
>> >>>>>>>>> is showing some performance issues on our end. We would like to make a
>> >>>>>>>>> performance optimization to it, but first we would prefer to run an A/B
>> >>>>>>>>> test to see if the results are still "about as good" as they are
>> >>>>>>>>> currently.
>> >>>>>>>>>
>> >>>>>>>>> The optimization is basically this: currently, more like this takes the
>> >>>>>>>>> entire article into account; we would like to change this to take only the
>> >>>>>>>>> opening text of an article into account. This should reduce the amount of
>> >>>>>>>>> work we have to do on the backend, reducing both server load and the
>> >>>>>>>>> latency the user sees when running the query.
>> >>>>>>>>>
>> >>>>>>>>> This can be triggered by adding these two query parameters to
>> the
>> >>>>>>>>> search
>> >>>>>>>>> api request that is being performed:
>> >>>>>>>>>
>> >>>>>>>>> cirrusMltUseFields=yes&cirrusMltFields=opening_text
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> The API will give a warning that these parameters do not exist, but the
>> >>>>>>>>> warning is safe to ignore. Would any of you be willing to run this test?
>> >>>>>>>>> We would basically want to look at user-perceived latency along with
>> >>>>>>>>> click-through rates for the current default setup and for the restricted
>> >>>>>>>>> setup using only opening_text.
>> >>>>>>>>>
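A minimal sketch of how a client might split users between the two setups, in Python for brevity. The two `cirrusMlt*` parameters are the ones given above. The hash-based bucketing, the 50/50 split, and the names `bucket_for`, `morelike_params` and `install_id` are assumptions for illustration and do not describe how any of the apps actually assign buckets.

```python
import hashlib

# The two query parameters from this thread that restrict morelike to
# the opening text of the article.
OPENING_TEXT_PARAMS = {
    "cirrusMltUseFields": "yes",
    "cirrusMltFields": "opening_text",
}


def bucket_for(install_id):
    """Deterministically assign an install to a bucket (assumed 50/50)."""
    digest = hashlib.sha256(install_id.encode("utf-8")).digest()
    return "opening_text" if digest[0] % 2 else "default"


def morelike_params(title, install_id):
    """Build search API parameters for the bucket this install falls in."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "search",
        "gsrsearch": "morelike:" + title,
        "gsrnamespace": "0",
        "gsrlimit": "3",
    }
    if bucket_for(install_id) == "opening_text":
        params.update(OPENING_TEXT_PARAMS)
    return params


if __name__ == "__main__":
    for install in ("install-a", "install-b", "install-c"):
        print(install, bucket_for(install), morelike_params("Chess", install))
```

Hashing a stable identifier keeps each install in the same bucket across sessions, so click-through and latency events can be attributed to one setup. Logging the bucket name with each event would be needed for the comparison.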
>> >>>>>>>>> Erik B.
>> >>>>>>>>>
>> >>>>>>>>> _______________________________________________
>> >>>>>>>>> Mobile-l mailing list
>> >>>>>>>>> [email protected]
>> >>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>> >>>>>>>>>
>> >>>
>> >
>> >
>> >
>>
>>
>>
>> --
>> Gabriel Wicke
>> Principal Engineer, Wikimedia Foundation
>>
>>
>
>
>
> --
> Dmitry Brant
> Mobile Apps Team (Android)
> Wikimedia Foundation
> https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
>
>


-- 
Dmitry Brant
Mobile Apps Team (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering

_______________________________________________
Mobile-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mobile-l
