Thanks Marcel! Indeed I saw https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Hadoop a while ago and asked on #wikimedia-analytics whether this approach might speed up queries for (the previous version of) this schema; the response was a bit ambiguous. Nevertheless, I'm really interested in trying this out for speed purposes alone. If you have a moment at the summit this week to answer a question or two about the Hive setup, that would be great.

I think we should reduce the sample rate in any case; I will check with the mobile web team before filing a task.

On Mon, Jan 4, 2016 at 6:41 AM, Marcel Ruiz Forns <[email protected]> wrote:
> Thanks Tilman,
>
> It makes sense to reduce the sampling rate of the schema for
> "Datensparsamkeit and faster queries". However, if you don't specifically
> need MySQL, and are fine querying through Hive, we could continue storing
> all events at the current 1% rate in Hadoop.
>
> On Mon, Jan 4, 2016 at 11:28 AM, Tilman Bayer <[email protected]> wrote:
>>
>> Hi Marcel,
>>
>> yes, this is to be expected, because the schema is now logging more
>> kinds of events than before. However, we could reduce the sampling
>> rate considerably, as JonR and I had already envisaged
>> (https://phabricator.wikimedia.org/T120292#1854136 ; this got lost a
>> bit among the other schema changes, cf.
>> https://phabricator.wikimedia.org/T120292#1864549 ).
>>
>> On Sun, Jan 3, 2016 at 12:30 PM, Marcel Ruiz Forns <[email protected]> wrote:
>> > BTW, MobileWebSectionUsage schema is sending a lot of events since
>> > Dec 18, 2015.
>> > It normally would send around 40 events per second, and it's sending
>> > around 120 events per second now. It's now the highest throughput
>> > schema in EL by far. Is that expected?
>> >
>> > Sorry for using this same thread. If this needs to be taken care of,
>> > I will create a new task.
>> > Thanks!
>> >
>> > On Tue, Dec 29, 2015 at 8:41 PM, Nuria Ruiz <[email protected]> wrote:
>> >>
>> >> Sorry, I missed this, but it has always sent events at a really high
>> >> volume.
>> >>
>> >> On Tue, Dec 22, 2015 at 10:25 AM, Jon Katz <[email protected]> wrote:
>> >>>
>> >>> + Dmitry
>> >>>
>> >>> Hi Nuria,
>> >>> I will ask Dmitry to confirm, but I think a pause is fine for the
>> >>> next couple of days as long as we are given the timestamps for the
>> >>> outage and can note it on the schema wiki page.
>> >>> Is this a sudden increase or has it always been sending too high of
>> >>> a volume? Regardless, I imagine a higher sampling rate can probably
>> >>> be applied.
>> >>> -J
>> >>>
>> >>> On Tue, Dec 22, 2015 at 9:58 AM, Nuria Ruiz <[email protected]> wrote:
>> >>>>
>> >>>> Team:
>> >>>>
>> >>>> This schema MobileWikiAppShareAFact is sending a lot of events;
>> >>>> maybe it is worth thinking about whether we need that many. It is
>> >>>> again a case where tables are becoming huge and hard to query fast.
>> >>>>
>> >>>> cc-ing Jon as schema owner.
>> >>>>
>> >>>> Can this data be sampled at a higher sampling rate? I have filed a
>> >>>> ticket to this effect:
>> >>>> https://phabricator.wikimedia.org/T122224
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Nuria
>> >>>>
>> >>>> On Tue, Dec 22, 2015 at 8:35 AM, Adam Baso <[email protected]> wrote:
>> >>>>>
>> >>>>> Replacing mobile-tech with mobile-l (internal mobile-tech list
>> >>>>> discontinued).
>> >>>>>
>> >>>>> On Tuesday, December 22, 2015, Nuria Ruiz <[email protected]> wrote:
>> >>>>>>
>> >>>>>> Team:
>> >>>>>>
>> >>>>>> As part of our effort of converting the eventlogging mysql
>> >>>>>> database to the tokudb engine, we need to stop eventlogging
>> >>>>>> events from flowing into the MobileWikiAppShareAFact table. We
>> >>>>>> are using this one table to see how long the conversion will
>> >>>>>> take in order to plan for a larger outage window.
>> >>>>>>
>> >>>>>> Let us know if data should be backfilled, as it can be. We
>> >>>>>> anticipate events will not flow into the table for the better
>> >>>>>> part of one day.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>> Nuria
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Mobile-l mailing list
>> >>>>> [email protected]
>> >>>>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>> >>
>> >> _______________________________________________
>> >> Analytics mailing list
>> >> [email protected]
>> >> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >
>> > --
>> > Marcel Ruiz Forns
>> > Analytics Developer
>> > Wikimedia Foundation
>>
>> --
>> Tilman Bayer
>> Senior Analyst
>> Wikimedia Foundation
>> IRC (Freenode): HaeB
>
> --
> Marcel Ruiz Forns
> Analytics Developer
> Wikimedia Foundation

--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
