Thanks Marcel! Indeed I saw
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#Access_data_in_Hadoop
a while ago and asked on #wikimedia-analytics whether this approach
might speed up queries for (the previous version of) this schema; the
response was a bit ambiguous. Nevertheless I'm really interested in
trying this out for speed purposes alone - if you have a moment at the
summit this week to answer a question or two about the Hive setup,
that would be great.

I think we should reduce the sample rate in any case; will check with
the mobile web team before filing a task.
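
For a rough sense of scale, here is a back-of-the-envelope sketch. It
assumes that throughput scales linearly with the sampling rate (an
assumption, not something measured), and it uses the 1% rate and the
figures of roughly 40 and 120 events per second quoted further down
this thread. The function name is purely illustrative.

```python
def rate_for_target(current_rate, current_eps, target_eps):
    """Sampling rate needed to reach target_eps events per second.

    Assumes throughput is proportional to the sampling rate, which is
    a simplifying assumption rather than a measured property.
    """
    if current_eps <= 0:
        raise ValueError("current_eps must be positive")
    return current_rate * target_eps / current_eps


# Figures quoted in this thread: 1% sampling, ~120 events/s now, ~40 before.
new_rate = rate_for_target(0.01, 120, 40)
print(f"sampling rate for ~40 events/s: {new_rate:.4%}")
```

Under that assumption, getting back to the earlier volume would mean
sampling at about a third of the current rate.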


On Mon, Jan 4, 2016 at 6:41 AM, Marcel Ruiz Forns <[email protected]> wrote:
> Thanks Tilman,
>
> It makes sense to reduce the sampling rate of the schema for
> "Datensparsamkeit and faster queries". However, if you don't specifically
> need MySQL, and are fine querying through Hive, we could continue storing
> all events at the current 1% rate in Hadoop.
>
> On Mon, Jan 4, 2016 at 11:28 AM, Tilman Bayer <[email protected]> wrote:
>>
>> Hi Marcel,
>>
>> yes, this is to be expected, because the schema is now logging more
>> kinds of events than before. However, we could reduce the sampling
>> rate considerably, as JonR and I had already envisaged
>> (https://phabricator.wikimedia.org/T120292#1854136 ; this got lost a
>> bit among the other schema changes, cf.
>> https://phabricator.wikimedia.org/T120292#1864549 ).
>>
>> On Sun, Jan 3, 2016 at 12:30 PM, Marcel Ruiz Forns <[email protected]>
>> wrote:
>> > BTW, the MobileWebSectionUsage schema has been sending a lot of
>> > events since Dec 18, 2015.
>> > It would normally send around 40 events per second, and it's sending
>> > around 120 events per second now. It's now the highest-throughput
>> > schema in EL by far. Is that expected?
>> >
>> > Sorry for using this same thread. If this needs to be taken care of,
>> > I will create a new task.
>> > Thanks!
>> >
>> >
>> > On Tue, Dec 29, 2015 at 8:41 PM, Nuria Ruiz <[email protected]> wrote:
>> >>
>> >> Sorry I missed this, but it has always sent events at a really high
>> >> volume.
>> >>
>> >> On Tue, Dec 22, 2015 at 10:25 AM, Jon Katz <[email protected]> wrote:
>> >>>
>> >>> + Dmitry
>> >>>
>> >>> Hi Nuria,
>> >>> I will ask Dmitry to confirm, but I think a pause is fine for the next
>> >>> couple of days, as long as we are given the timestamps for the outage
>> >>> and can note it on the schema wiki page.  Is this a sudden increase, or
>> >>> has it always been sending too high a volume?  Regardless, I imagine a
>> >>> higher sampling rate can probably be applied.
>> >>> -J
>> >>>
>> >>> On Tue, Dec 22, 2015 at 9:58 AM, Nuria Ruiz <[email protected]>
>> >>> wrote:
>> >>>>
>> >>>> Team:
>> >>>>
>> >>>> The MobileWikiAppShareAFact schema is sending a lot of events; it
>> >>>> may be worth considering whether we need that many. It is again a
>> >>>> case where tables are becoming huge and hard to query fast.
>> >>>>
>> >>>> cc-ing Jon as schema owner.
>> >>>>
>> >>>> Can this data be sampled at a higher sampling rate? I have filed a
>> >>>> ticket to this effect:
>> >>>> https://phabricator.wikimedia.org/T122224
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Nuria
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Dec 22, 2015 at 8:35 AM, Adam Baso <[email protected]>
>> >>>> wrote:
>> >>>>>
>> >>>>> Replacing mobile-tech with mobile-l (internal mobile-tech list
>> >>>>> discontinued).
>> >>>>>
>> >>>>>
>> >>>>> On Tuesday, December 22, 2015, Nuria Ruiz <[email protected]>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Team:
>> >>>>>>
>> >>>>>> As part of our effort to convert the EventLogging MySQL database to
>> >>>>>> the TokuDB engine, we need to stop EventLogging events from flowing
>> >>>>>> into the MobileWikiAppShareAFact table. We are using this one table
>> >>>>>> to see how long the conversion will take, in order to plan for a
>> >>>>>> larger outage window.
>> >>>>>>
>> >>>>>>
>> >>>>>> Let us know if data should be backfilled, as it can be. We
>> >>>>>> anticipate events will not flow into the table for the better part
>> >>>>>> of one day.
>> >>>>>>
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>> Nuria
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Mobile-l mailing list
>> >>>>> [email protected]
>> >>>>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >> _______________________________________________
>> >> Analytics mailing list
>> >> [email protected]
>> >> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>
>> >
>> >
>> >
>> > --
>> > Marcel Ruiz Forns
>> > Analytics Developer
>> > Wikimedia Foundation
>> >
>> >
>>
>>
>>
>> --
>> Tilman Bayer
>> Senior Analyst
>> Wikimedia Foundation
>> IRC (Freenode): HaeB
>>
>
>
>
>
> --
> Marcel Ruiz Forns
> Analytics Developer
> Wikimedia Foundation
>
>



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB

