Hi all!  I haven't heard any objections, so today I will stop the jobs
that generate these datasets.  I won't delete the existing data until
further notice.

On Mon, Dec 14, 2015 at 1:34 PM, Andrew Otto <[email protected]> wrote:

> If we don’t hear any objections by Dec 30th, we will move forward with the
> plan to no longer generate this data.
>
>
> On Dec 11, 2015, at 12:40, Andrew Otto <[email protected]> wrote:
>
> Hi all,
>
> Soon, we will be merging the mobile web cache requests with the text cache
> requests.  The text caches will then serve requests for mobile web[1].
>
> This means that the webrequest_source='mobile' partition in the webrequest
> table in Hive will soon be empty, and all data that was previously in it
> will be found in the webrequest_source='text' partition.
>
> Only 3 datasets currently depend solely on the
> webrequest_source='mobile' partition:
>
> - /a/log/webrequest/archive/mobile
> - /a/log/webrequest/archive/5xx-mobile
> - /a/log/webrequest/archive/zero
>
> (These are paths on stat1002, but they also exist in HDFS.)
>
> These datasets originally came from udp2log, but since early last year
> they have been generated from Hadoop.  With the upcoming cache merge, these
> jobs will have to parse through all text requests, which will make Hadoop
> busier.
>
> Do we know if these are being used?  Would anyone be upset if we no longer
> generated these datasets?
>
> Thanks!
> -Andrew
>
> [1] https://phabricator.wikimedia.org/T109286
>
>
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
