> we will be taking up more Wikimedia bandwidth

Please note that, from the operations side of things (disclaimer: I am
*not* a netops), my understanding is that pure bandwidth usage is
currently a non-issue (it is mostly a fixed cost rather than a variable
one). Repeatedly hitting a server is far more "costly" (all things
considered, such as server purchase and maintenance) than a one-time dump
download. To all dump users: use as much as you need (without wasting it)
to meet your goals, and do not worry too much about bandwidth.

> Also want to say that we're very thankful for the work you all are doing
> publishing this dataset, it's enormously useful for entity popularity in
> our search engine for publishers <https://graphiq.com/search>.

My personal opinion is that, indeed, Analytics' work is very important for
our mission (spreading free knowledge), and they are doing a great job. I
do not know if that is said enough.

On Tue, Aug 16, 2016 at 11:06 PM, Dylan Wenzlau <[email protected]> wrote:

> Thank you for the update. No one from our team is on the mailing list, and
> we have not viewed the /other/analytics page before (only the
> pagecounts-all-sites page
> <https://wikitech.wikimedia.org/wiki/Analytics/Data/Pagecounts-all-sites>
> and pages linked from there), which explains why we didn't know about this.
> I do see you recently added a link to Phabricator issue though, which is
> helpful!
>
> I am currently rewriting our scripts to utilize the new pagecounts-ez
> format, although I think that this new format means that we will be taking
> up more Wikimedia bandwidth than we did previously, since we will have to
> re-download this merged daily file once per hour in order to utilize the
> hourly stats. Previously, we only had to download ~100MB per hour, and now
> it seems we'll be downloading ~350MB per hour. Please correct me if I'm
> missing something obvious here!
>
> Also want to say that we're very thankful for the work you all are doing
> publishing this dataset, it's enormously useful for entity popularity in
> our search engine for publishers <https://graphiq.com/search>.
>
> On Tue, Aug 16, 2016 at 1:48 PM, Dan Andreescu <[email protected]>
> wrote:
>
>> Dylan, there's also been a deprecation message on the page that links to
>> these datasets, since last winter:
>> https://dumps.wikimedia.org/other/analytics/
>>
>> If you know of other places that these datasets are referenced, I'd be
>> happy to update the docs and add links to the email threads. We usually
>> publish information about this kind of deprecation on this list well in
>> advance, but are open to reaching out in other ways.
>>
>> On Tue, Aug 16, 2016 at 4:13 PM, Nuria Ruiz <[email protected]> wrote:
>>
>>> Dylan,
>>>
>>> (cc-ing analytics@ public list)
>>>
>>> Please see announcement about deprecation of datasets:
>>> https://lists.wikimedia.org/pipermail/analytics/2016-August/005339.html
>>>
>>> Thanks,
>>>
>>> Nuria
>>>
>>> On Tue, Aug 16, 2016 at 12:53 PM, Dylan Wenzlau <[email protected]>
>>> wrote:
>>>
>>>> It seems the pagecounts-all-sites dumps have completely stopped
>>>> updating, and I don't see any warning or message about why this is the case
>>>> or whether it's currently being resolved. Our company relies pretty heavily
>>>> on this data, as I imagine other projects & companies do as well, so I
>>>> think it would be useful to at least display a big warning message on the
>>>> documentation pages explaining why these are no longer updating.
>>>>
>>>> Thanks,
>>>>
>>>> --
>>>> *Dylan Wenzlau* | Director of Engineering |
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>
> --
> *Dylan Wenzlau* | Director of Engineering |

--
Jaime Crespo
<http://wikimedia.org>
