Hi Gilles -- why won't the page view logs work by themselves for this
purpose? EL can be configured to write into Hadoop, which is probably the
best way to get the throughput you need, but it seems overcomplicated.

-Toby

On Tue, Jan 6, 2015 at 9:41 AM, Gilles Dubuc <[email protected]> wrote:

> This depends on [1], so we're not going to need it immediately, but in
> order to help Erik Zachte with his RfC [2] to track unique media views in
> Media Viewer, I'm going to need to use something almost exactly like
> EventLogging. The main difference is that it should skip writing to the
> database and write to a log file instead.
>
> That's because we'll be recording around 20-25M image views per day, which
> would needlessly overload EventLogging for little purpose since the data
> will be used for offline stats generation and doesn't need to be made
> available in a relational database. Of course if storage space and
> EventLogging capacity were no object, we could just use EL and keep the
> ever-growing table forever, but I have the impression that we want to be
> reasonable here and only write to a log, since that's what Erik needs.
>
> So here's the question: for a specific schema, can EventLogging work the
> way it does but only record hits to a log file (maybe it already does that
> before hitting the DB?) and not write to the DB? If not, how difficult
> would it be to make EL capable of doing that?
>
> [1] https://phabricator.wikimedia.org/T44815
> [2]
> https://www.mediawiki.org/wiki/Requests_for_comment/Media_file_request_counts
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
