Thanks everyone for the research on this! I'll go ahead and create a card for implementing sampling on the high-throughput WikiGrok events.
Kaldari

On Wed, Jan 7, 2015 at 5:20 PM, Nuria Ruiz <[email protected]> wrote:

> Sorry, I sent it too soon, trying again:
>
> >We're talking about a total of ~170 events per sec for these pages.
> This is too high to log at a 1:1 rate; we would need to do 1:10. At this
> time most events on EL log at a much lower rate. The events over 1 per sec
> are the following; as you can see, mobile & media viewer are the majority
> of the throughput.
>
> My preference would be to be less than 400 events per sec until we have
> done some perf testing to make sure we can handle it (we might be able to,
> as we have done many improvements since we set these thresholds).
>
> MobileWebClickTracking 41.35% (114.15/sec)
> MediaViewer 21.66% (59.78/sec)
> MobileWikiAppToCInteraction 12.44% (34.35/sec)
> PageContentSaveComplete 3.39% (9.35/sec)
> EchoInteraction 2.69% (7.42/sec)
> NavigationTiming 2.51% (6.93/sec)
> MultimediaViewerNetworkPerformance 1.84% (5.07/sec)
> SaveTiming 1.58% (4.37/sec)
> Edit 1.39% (3.83/sec)
> PersonalBar 1.24% (3.43/sec)
> TimingData 0.83% (2.28/sec)
> MobileWebUIClickTracking 0.73% (2.02/sec)
> Popups 0.68% (1.87/sec)
> MobileWikiAppOnboarding 0.62% (1.70/sec)
> MultimediaViewerDimensions 0.61% (1.68/sec)
> UniversalLanguageSelector 0.50% (1.37/sec)
> PageCreation 0.50% (1.37/sec)
> MultimediaViewerDuration 0.47% (1.30/sec)
> MobileWebEditing 0.45% (1.25/sec)
> MobileWikiAppSearch 0.41% (1.13/sec)
> CentralAuth 0.40% (1.12/sec)
>
> On Wed, Jan 7, 2015 at 5:12 PM, Nuria Ruiz <[email protected]> wrote:
>
>> >We're talking about a total of ~170 events per sec for these pages.
>> This is too high to log at a 1:1 rate; we would need to do 1:10.
>>
>> On Wed, Jan 7, 2015 at 4:10 PM, Leila Zia <[email protected]> wrote:
>>
>>> Thanks everyone for chiming in. Your comments were very helpful. :-)
>>>
>>> Nuria, I checked the per-second pageview count for the pages wikigrok
>>> will be live on, for 3 hours on 2015-01-07 (as a sample). We're talking
>>> about a total of ~170 events per sec for these pages. Of course major
>>> events can affect this number. This number added to the current 270
>>> events per sec you mentioned will send us over the 350 events per sec
>>> limit (if it's a hard limit). What do you think?
>>>
>>> Leila
>>>
>>> On Wed, Jan 7, 2015 at 10:13 AM, Nuria Ruiz <[email protected]> wrote:
>>>
>>>> >Given that information, do you have any idea if we are in danger of
>>>> overloading EventLogging?
>>>> Logging broad events (such as a page load) 1 to 1 might cause
>>>> problems, as our traffic is high enough that even events logged at
>>>> 1/1000 still occur in very large numbers.
>>>>
>>>> Some numbers (oversimplifying and rounding):
>>>>
>>>> We have about 200 million visits per day for the enwiki mobile site.
>>>> This means about 2300 pageviews per sec; if we are sending 1 load
>>>> event per pageview, EL will (sadly) die, most likely.
>>>>
>>>> If we assume EL handles up to 350 events per second (and now we are at
>>>> 270 events per sec) I would think that sending 10 events per sec in
>>>> your case would be pretty safe. That would be sampling about 1/200 for
>>>> a load event per pageview. This seems like a good upper bound.
>>>>
>>>> Now, since there are no constraints as to how long you keep your
>>>> experiment running, you can try a lower sampling ratio, say, 1/1000,
>>>> and keep the experiment running for longer.
>>>>
>>>> On Tue, Jan 6, 2015 at 5:50 PM, Ryan Kaldari <[email protected]>
>>>> wrote:
>>>>
>>>>> The highest volume events we are going to log will be:
>>>>> 1. For each of the 166,000 articles, one event when the page loads
>>>>> 2. For each of the 166,000 articles, one event when the WikiGrok
>>>>> widget enters the viewport (about half as often as #1)
>>>>>
>>>>> These will be active for all mobile users, logged in and logged out,
>>>>> including many high pageview articles.
>>>>>
>>>>> Given that information, do you have any idea if we are in danger of
>>>>> overloading EventLogging? If so, do you have recommendations on
>>>>> sampling? So far, everyone has said not to worry about it, but it
>>>>> would be good to get a sanity check for this test specifically.
>>>>>
>>>>> Kaldari
>>>>>
>>>>> On Tue, Jan 6, 2015 at 4:57 PM, Nuria Ruiz <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> (cc-ing mobile-tech)
>>>>>>
>>>>>> Since we do not know the details of how wikigrok is used and its
>>>>>> throughput of requests, we cannot "estimate" sampling ourselves. I
>>>>>> imagine wikigrok is being deployed to a number of users, and it is
>>>>>> with that usage that the mobile team could estimate the total
>>>>>> throughput expected; with this throughput we can recommend sampling
>>>>>> ratios.
>>>>>>
>>>>>> Thanks for asking about this before deploying!
>>>>>>
>>>>>> On Tue, Jan 6, 2015 at 4:55 PM, Ryan Kaldari <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I can elaborate on this after I finish the SWAT deployment....
>>>>>>> Gimme 30 minutes or so.
>>>>>>>
>>>>>>> On Tue, Jan 6, 2015 at 4:51 PM, Leila Zia <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> The mobile team is planning to switch WikiGrok on for non-logged
>>>>>>>> in users next week (2015-01-12). The widget will be enabled on
>>>>>>>> 166,029 article pages in enwiki. There are two EventLogging
>>>>>>>> schemas that may collect data heavily and we want to make sure EL
>>>>>>>> can handle the influx of data.
>>>>>>>>
>>>>>>>> The two schemas collecting data are:
>>>>>>>> https://meta.wikimedia.org/wiki/Schema:MobileWebWikiGrok
>>>>>>>> https://meta.wikimedia.org/wiki/Schema:MobileWebWikiGrokError
>>>>>>>> and the list of pages affected is in:
>>>>>>>> wgq_page in enwiki.wikigrok_questions.
>>>>>>>>
>>>>>>>> It would be great if someone from the dev side let us know
>>>>>>>> whether we will need sampling.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Leila
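
For the card, here is a minimal sketch of the arithmetic from this thread plus a simple 1-in-N sampler. Python is used only for illustration. All function names are hypothetical and not part of EventLogging, and the 350 events/sec ceiling is the assumed limit mentioned in the thread, not a measured one.

```python
import math
import random

SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the thread (rounded by their authors).
CURRENT_EL_RATE = 270.0    # events/sec EL already receives
ASSUMED_EL_LIMIT = 350.0   # events/sec, assumed ceiling (may not be a hard limit)
WIKIGROK_RAW_RATE = 170.0  # events/sec for WikiGrok pages if logged 1:1


def per_second(events_per_day):
    """Convert a daily count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


def sampled_rate(raw_rate, one_in):
    """Expected event rate after keeping 1 out of every `one_in` events."""
    return raw_rate / one_in


def min_sampling_factor(raw_rate, budget):
    """Smallest integer N such that raw_rate / N fits within `budget`."""
    return max(1, math.ceil(raw_rate / budget))


def should_log(one_in, rng=random):
    """Return True for roughly 1 in `one_in` calls (independent random draw)."""
    return rng.random() < 1.0 / one_in


if __name__ == "__main__":
    # ~200 million visits/day is roughly 2300 pageviews/sec.
    print("enwiki mobile pageviews/sec:", round(per_second(200_000_000)))

    # Headroom left under the assumed limit, and the least sampling needed.
    headroom = ASSUMED_EL_LIMIT - CURRENT_EL_RATE
    print("minimum 1-in-N to fit headroom:",
          min_sampling_factor(WIKIGROK_RAW_RATE, headroom))

    # The 1:10 sampling suggested in the thread.
    total = CURRENT_EL_RATE + sampled_rate(WIKIGROK_RAW_RATE, 10)
    print("total events/sec at 1:10:", total)
```

At 1:10 the WikiGrok events add about 17 events/sec, which keeps the total under both the 350 and 400 events/sec figures discussed above. Random per-event sampling is only one option; sampling per session or per user token would keep all events from a sampled user together, at the cost of a bit more bookkeeping.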
_______________________________________________ Analytics mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/analytics
