Thanks for your answers, guys! :) I'm looking forward to the improvements on geodata.
Cheers, Marc

2015-03-02 23:52 GMT+01:00 <[email protected]>:

> Send Labs-l mailing list submissions to
>     [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://lists.wikimedia.org/mailman/listinfo/labs-l
> or, via email, send a message with subject or body 'help' to
>     [email protected]
>
> You can reach the person managing the list at
>     [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Labs-l digest..."
>
> Today's Topics:
>
>    1. doubt on GeoData / how to obtain articles with coords (Marc Miquel)
>    2. Re: [Analytics] doubt on GeoData / how to obtain articles
>       with coords (Marc Miquel)
>    3. Re: [Analytics] doubt on GeoData / how to obtain articles
>       with coords (Gerard Meijssen)
>    4. Re: [Analytics] doubt on GeoData / how to obtain articles
>       with coords (Bryan White)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 2 Mar 2015 23:33:18 +0100
> From: Marc Miquel <[email protected]>
> To: [email protected], [email protected]
> Subject: [Labs-l] doubt on GeoData / how to obtain articles with coords
>
> Hi guys,
>
> I am doing some research, and I am struggling a bit to obtain geolocated
> articles in several languages. I was told that the best tool to obtain the
> geolocation of each article is the GeoData API, but it looks like I need
> to pass in each article name, and I don't know if that is the best way.
>
> For big Wikipedias like French or German, for instance, I might need to
> make a million queries just to find the articles that have coords. Also, I
> would like to obtain the region according to ISO 3166-2, which seems to be
> available there.
>
> My objective is to obtain different lists of articles related to countries
> and regions.
>
> I don't know whether using Wikidata with Python would be a better option,
> but I see that the region isn't there. Maybe I could combine Wikidata with
> some other tool that gives me the region. Could anyone help me?
>
> Thanks a lot.
>
> Marc Miquel
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <https://lists.wikimedia.org/pipermail/labs-l/attachments/20150302/9ed9822c/attachment-0001.html>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 2 Mar 2015 23:42:44 +0100
> From: Marc Miquel <[email protected]>
> To: "A mailing list for the Analytics Team at WMF and everybody who has
>     an interest in Wikipedia and analytics." <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: [Labs-l] [Analytics] doubt on GeoData / how to obtain
>     articles with coords
>
> Hi Max and Oliver,
>
> Thanks for your answers. The geo_tags table seems quite incomplete. I just
> checked some random articles; in the Nepali Wikipedia, for instance, its
> capital Kathmandu has coords in the actual article, but they don't appear
> in geo_tags. So it doesn't seem like an option.
>
> Marc
>
> 2015-03-02 23:38 GMT+01:00 Oliver Keyes <[email protected]>:
>
> > Max's idea is an improvement, but it is still a lot of requests. We
> > really need to start generating these dumps :(.
> >
> > Until the dumps are available, the fastest way to do it is probably
> > Quarry (http://quarry.wmflabs.org/), an open MySQL client to our public
> > database tables. So, you want the geo_tags table; getting all the
> > coordinate sets on the English-language Wikipedia would be something like:
> >
> > SELECT * FROM enwiki_p.geo_tags;
> >
> > This should be available for all of our production wikis (SHOW DATABASES
> > is your friend): you want [project]_p rather than [project]. Hope that
> > helps!
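One refinement on the Quarry route, for Marc's region requirement: the geo_tags table also carries gt_country and gt_region columns, which (where the coordinate tag supplied them) hold the ISO 3166-1 country and ISO 3166-2 region codes, so something like `SELECT gt_country, gt_region, page_title FROM enwiki_p.geo_tags JOIN enwiki_p.page ON gt_page_id = page_id WHERE gt_primary = 1;` could pull everything in one export. The JOIN against `page` for titles is my assumption about the replica layout; check the schema in Quarry first. Grouping the exported rows into per-country, per-region article lists is then a few lines of Python. A sketch, assuming rows arrive as dicts keyed by those column names:

```python
from collections import defaultdict

def group_articles(rows):
    """Build {country: {region: [titles]}} from geo_tags-style rows.

    Sketch only: column names are taken from the GeoData geo_tags schema;
    many rows have NULL country/region, so those are collected under ''
    rather than silently dropped.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        country = row.get("gt_country") or ""
        region = row.get("gt_region") or ""
        grouped[country][region].append(row["page_title"])
    return grouped
```

Fed the CSV export of one Quarry run per wiki, this yields exactly the country/region article lists Marc describes, without any per-article API queries.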
> >
> > On 2 March 2015 at 17:35, Max Semenik <[email protected]> wrote:
> >
> >> Use generators:
> >> api.php?action=query&generator=allpages&gapnamespace=0&prop=coordinates&gaplimit=max&colimit=max
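Max's generator trick can be driven from a short script: keep re-issuing the request, folding each response's `continue` block back into the parameters until it disappears. A minimal sketch (the continuation loop follows documented MediaWiki API behaviour, but the code is untested against the live API; the HTTP call is left to an injected `fetch` callable, so nothing here assumes a particular HTTP library):

```python
def harvest_coordinates(fetch):
    """Collect {title: coordinates} for all main-namespace pages.

    fetch(params) -> decoded JSON for one api.php request; injecting it
    keeps the pagination logic testable without network access.
    """
    params = {
        "action": "query",
        "generator": "allpages",
        "gapnamespace": 0,
        "prop": "coordinates",
        "gaplimit": "max",
        "colimit": "max",
        "format": "json",
    }
    results = {}
    while True:
        data = fetch(params)
        # Pages without coordinates simply lack the 'coordinates' key.
        for page in data.get("query", {}).get("pages", {}).values():
            if "coordinates" in page:
                results[page["title"]] = page["coordinates"]
        if "continue" not in data:
            return results
        # Merge the continuation tokens into the next request.
        params = {**params, **data["continue"]}
```

With `fetch` wired to an HTTP client pointed at a wiki's api.php, one run per language would replace the per-article queries Marc was worried about, though it is still many requests on a large wiki.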
> >>
> >> --
> >> Best regards,
> >> Max Semenik ([[User:MaxSem]])
> >
> > --
> > Oliver Keyes
> > Research Analyst
> > Wikimedia Foundation
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
> ------------------------------
>
> Message: 3
> Date: Mon, 2 Mar 2015 23:47:08 +0100
> From: Gerard Meijssen <[email protected]>
> To: Wikimedia Labs <[email protected]>
> Subject: Re: [Labs-l] [Analytics] doubt on GeoData / how to obtain
>     articles with coords
>
> Hoi,
> What is the point? Harvest jobs have been run on many Wikipedias, and the
> results ended up in Wikidata. Is this enough, or does the data need to be
> in the text of each language edition as well?
>
> When you run a job querying Wikipedias, have the results end up in
> Wikidata as well. It allows people to stand on the shoulders of giants.
> Thanks,
> GerardM
>
> On 2 March 2015 at 23:42, Marc Miquel <[email protected]> wrote:
>
> > Hi Max and Oliver,
> >
> > Thanks for your answers. The geo_tags table seems quite incomplete. I
> > just checked some random articles; in the Nepali Wikipedia, for
> > instance, its capital Kathmandu has coords in the actual article, but
> > they don't appear in geo_tags. So it doesn't seem like an option.
> >
> > Marc
>
> ------------------------------
>
> Message: 4
> Date: Mon, 2 Mar 2015 15:52:00 -0700
> From: Bryan White <[email protected]>
> To: Wikimedia Labs <[email protected]>
> Subject: Re: [Labs-l] [Analytics] doubt on GeoData / how to obtain
>     articles with coords
>
> Marc,
>
> If anybody would know, it would be Kolossos. He is one of the people
> responsible for GeoHack, the integration with OpenStreetMap, and other
> geographical referencing doohickeys.
> He is more active on the German site; see
> https://de.wikipedia.org/wiki/Benutzer:Kolossos. His email link is there,
> or you can search through this mailing list for it.
>
> Bryan
>
> ------------------------------
>
> _______________________________________________
> Labs-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
> End of Labs-l Digest, Vol 39, Issue 1
> *************************************
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
