Not an answer to the question, but a question of my own: will the nature of the content being served still be present as /some/ field? FWIW, I've found it very helpful to be able to use webrequest_source to trivially distinguish mobile and desktop requests.

On 11 December 2015 at 12:40, Andrew Otto <[email protected]> wrote:
> Hi all,
>
> Soon, we will be merging the mobile web cache requests with the text cache
> requests. text caches will now serve requests for mobile web[1].
>
> This means that the webrequest_source=‘mobile’ partition in the webrequest
> table in Hive will soon be empty, and all data that was previously in it
> will be found in the webrequest_source=‘text’ partition.
>
> There are only 3 datasets that currently only use the
> webrequest_source=‘mobile’ partition:
>
> - /a/log/webrequest/archive/mobile
> - /a/log/webrequest/archive/5xx-mobile
> - /a/log/webrequest/archive/zero
>
> (These are paths on stat1002, but they also exist in HDFS.)
>
> These datasets originally came from udp2log, but since early last year they
> have been generated from Hadoop. With the upcoming cache merge, these jobs
> will have to parse through all text requests, which will make Hadoop busier.
>
> Do we know if these are being used? Would anyone be upset if we no longer
> generated these datasets?
>
> Thanks!
> -Andrew
>
> [1] https://phabricator.wikimedia.org/T109286
>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>

-- 
Oliver Keyes
Count Logula
Wikimedia Foundation
