I agree with Kevin's understanding of the problem. One approach could be:

- Parse the current version of the Italian Wikipedia dump (the revision history is not needed; the current version should be enough) and extract the id and title of every page that contains GPS information. I don't know how GPS coordinates are represented in wiki pages, so I can't really help on that side.
- Once the list of (page title, GPS point) pairs is built, depending on the size of the list, either query the pageview API or ask the analytics team to extract the data for those pages over a given time period.

Cheers,
Joseph
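The second step above could be sketched roughly as follows. This is only an illustration, assuming the (page title, GPS point) list has already been built and that the per-article endpoint of the Wikimedia Pageviews REST API is used. The helper names (`pageviews_url`, `daily_views`, `fetch_views`, `join_points`), the article title "Fontana di Trevi", and the User-Agent string are hypothetical choices for this example. Coordinate extraction from the dump is deliberately left out, since the coordinate markup is not settled in this thread.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Per-article endpoint of the Wikimedia Pageviews REST API.
API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start, end, project="it.wikipedia"):
    """Build a daily per-article URL; start and end are YYYYMMDD strings."""
    article = quote(title.replace(" ", "_"), safe="")
    return f"{API}/{project}/all-access/user/{article}/daily/{start}/{end}"


def daily_views(payload):
    """Reduce the API's JSON payload to {timestamp: views}."""
    return {item["timestamp"]: item["views"] for item in payload.get("items", [])}


def fetch_views(title, start, end):
    """Network call (not run here); the User-Agent value is a placeholder."""
    req = Request(pageviews_url(title, start, end),
                  headers={"User-Agent": "geo-pageviews-sketch/0.1"})
    with urlopen(req) as resp:
        return daily_views(json.load(resp))


def join_points(points, views_by_title):
    """Join {title: (lat, lon)} with {title: {timestamp: views}} into rows."""
    return [(title, lat, lon, ts, views)
            for title, (lat, lon) in points.items()
            for ts, views in sorted(views_by_title.get(title, {}).items())]


# Coordinates quoted by Kevin; the article title is an assumption.
points = {"Fontana di Trevi": (41.902773, 12.485952)}
```

Calling `fetch_views` for each title in `points` and passing the results to `join_points` would give one row per page per day, which is the shape a heat map or time-series database would need. For a very large list of pages, asking the analytics team for an extraction is probably kinder to the API than looping over it.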
On Tue, Apr 12, 2016 at 1:21 AM, Kevin Leduc <[email protected]> wrote:

> I think Nima was referring to articles of monuments / places of interest
> that have GPS coordinates in them. For example, the Trevi Fountain is at
> these coordinates: 41.902773°N 12.485952°E
>
> by joining pageviews and coordinate data, you could create heat maps that
> may correlate with actual tourist traffic.
>
>
> [1] https://it.wikipedia.org/wiki/Trevi_(rione_di_Roma)
>
>
> On Tue, Apr 5, 2016 at 4:07 AM, Nima Dashtban <[email protected]>
> wrote:
>
>> Hi there,
>>
>> Hope my email finds you well. My name is Nima Dashtban and I'm a student
>> of computer science in Ca'foscari University of Venice / Italy.
>>
>> I am investigating these access logs of wikipedia pages:
>> https://dumps.wikimedia.org/other/pagecounts-raw/
>>
>> In particular I would like to build up an DB of the time series of
>> accesses to (Italian) pages of wikipedia that have a GPS position, i.e.
>> wikipedia page that refer to geographical point of interests. I think that
>> such data could be useful as predictive signal of interest of potential
>> visitors of such geographical places.
>>
>> Any help of you whether you say it is possible or not would be huge for
>> me.
>>
>> Sincerely and Regards,
>> Nima Dashtban
>>
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>

-- 
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal
