Re: [Wikidata-l] DBpedia-based RDF dumps for Wikidata
Hi Dimitris and everyone on the team,

Congratulations, great job. It will certainly be useful.

Gio

On Fri, May 15, 2015 at 11:28 AM, Dimitris Kontokostas kontokos...@informatik.uni-leipzig.de wrote:
> [...]

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
[Wikidata-l] DBpedia-based RDF dumps for Wikidata
Dear all,

Following up on the early prototype we announced [1], we are happy to announce a consolidated Wikidata RDF dump based on DBpedia. (Disclaimer: this work is not related to or affiliated with the official Wikidata RDF dumps.)

We provide:
* sample data for preview: http://wikidata.dbpedia.org/downloads/sample/
* a complete dump with over 1 billion triples: http://wikidata.dbpedia.org/downloads/20150330/
* a SPARQL endpoint: http://wikidata.dbpedia.org/sparql
* a Linked Data interface: http://wikidata.dbpedia.org/resource/Q586

Using the Wikidata dump from March, we were able to retrieve more than 1B triples and 8.5M typed things according to the DBpedia ontology, along with 48M transitive types, 6.4M coordinates and 1.5M depictions. A complete report on this effort can be found here: http://svn.aksw.org/papers/2015/ISWC_Wikidata2DBpedia/public.pdf

The extraction code is now fully integrated in the DBpedia Information Extraction Framework.

We are eagerly waiting for your feedback and your help in improving the DBpedia to Wikidata mapping coverage: http://mappings.dbpedia.org/server/ontology/wikidata/missing/

Best,
Ali Ismayilov, Dimitris Kontokostas, Sören Auer, Jens Lehmann, Sebastian Hellmann

[1] http://www.mail-archive.com/dbpedia-discussion%40lists.sourceforge.net/msg06936.html

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
DBpedia Association
Projects: http://dbpedia.org, http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: http://aksw.org
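For readers who want to try the SPARQL endpoint listed in the announcement, here is a minimal sketch of how a query request could be put together with the Python standard library. It assumes the endpoint accepts the standard SPARQL protocol `query` parameter over GET and honours an `Accept` header for JSON results; neither is confirmed in the announcement, and the endpoint may no longer be online. The function name `build_sparql_request` and the example query are illustrative only.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL as given in the announcement.
ENDPOINT = "http://wikidata.dbpedia.org/sparql"


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a GET request carrying a SPARQL query.

    Assumption: the server follows the standard SPARQL protocol, where the
    query text is passed in a URL parameter named "query".
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Ask for a few triples about the resource used as the Linked Data example.
QUERY = """
SELECT ?p ?o WHERE {
  <http://wikidata.dbpedia.org/resource/Q586> ?p ?o .
} LIMIT 10
"""

request = build_sparql_request(QUERY)
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch stops before that step so it runs without network access.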
Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.
Thanks so much for the feedback. That was useful. We'll rework the page so that it no longer times out, as it currently does on larger wikis. We'll achieve this by sorting the pages by page ID rather than by page title. That should also make it relatively easy to find the newest or oldest pages, depending on which you're working on. We'll also provide a filter for the namespace, but that might come a bit later.

Cheers
Lydia

--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. (Society for the Promotion of Free Knowledge). Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.
...and one additional thing: this page is used on the client side, not the server side. That implies that the communities on the client side should be asked, not the community on the server side. Which means that this list is the wrong forum.

John

On Fri, May 15, 2015 at 2:58 PM, John Erling Blad jeb...@gmail.com wrote:
> That would effectively make the output quasirandom, make paging confusing, and make lookup of specific pages and the pages following them impossible. In all I think this makes the page useless for anybody except those projects that have managed to clean up the remaining unconnected pages, and those that have fewer than 5000 (I think that is the limit) pages in the list.
>
> I would rather suggest that an optional category should be added, and that it should be mandatory if the namespace page count indicates that the number of pages goes above some limit.
>
> I think the start was implemented as a prefix search originally, but I wonder if that is still the case. It could be wise to check it out. It could be an idea to use a default prefix search if none is added. If neither a prefix pattern nor a prefix search is used, then the first N hits sorted by page ID can be returned. It should also be possible to switch between oldest first and newest first.
>
> The apparent caching behavior is probably the update arriving late, but it could be an indication of other issues. The same bug was fixed several years ago and disappeared, so this may be a new bug.
>
> It is interesting that there are now real needs for caching and an API; the page itself wasn't much appreciated and was deemed unnecessary when it was created. It has this in common with a lot of the maintenance pages: editors use them even if they are really crappy. Other special pages use old reports, often several days old, as their source. Maintenance reports should be up to date and actionable, not outdated and questionable.
>
> /rant
>
> John
>
> On Fri, May 15, 2015 at 2:07 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote:
> > [...]
Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.
Thanks! What you are saying is that you don't want any feedback.

On Fri, May 15, 2015 at 3:07 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote:
> John, I got the answers I needed from the people who actually use the page and we're making it more useful for them based on this feedback within the technical constraints we have right now.
>
> Cheers
> Lydia
Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.
On Fri, May 15, 2015 at 3:55 PM, John Erling Blad jeb...@gmail.com wrote:
> Thanks! What you are saying is that you don't want any feedback.

I am sure we do (as always). I think we have already got (imho) enough feedback and need to proceed as best as possible (within our constraints) to satisfy what folks want.

What I hear is that people like that the results in the special page are live (vs. cached) and that they use the iwlinks filter. I think we can adjust how the query works to make it fast enough to not require caching (except maybe squid caching for a minute).

Also, some people are interested in having results sorted by how old the page is (so people can find recently created pages, or the other way around). For the most part, page_id provides this while also being a better choice performance-wise. It is also consistent with how Special:PagesWithProps works. (Special:PagesWithProps is also pretty fast, so it doesn't need to be a cached special page.)

I think we can also extend QueryPage so that we can make this information available in the API. (Sorting by page title is not very suitable for QueryPage.) And if for some reason we still need caching, now or in the future, that would be easier to do if we extend QueryPage.

Cheers,
Katie

--
Katie Filbert
Wikidata Developer

Wikimedia Germany e.V. | Tempelhofer Ufer 23-24, 10963 Berlin
Phone (030) 219 158 26-0
http://wikimedia.de

Wikimedia Germany - Society for the Promotion of free knowledge eV. Entered in the register of Amtsgericht Berlin-Charlottenburg under the number 23 855 as recognized as charitable by the Inland Revenue for corporations I Berlin, tax number 27/681/51985.
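The page_id ordering discussed above can be illustrated with a small sketch of paging by ID ("keyset" paging): each request continues from the last ID seen instead of sorting by title and skipping rows. This is not the actual Wikibase or QueryPage query. The in-memory SQLite table, its sample rows and the function names `make_demo_db` and `fetch_page` are assumptions made up for illustration; only the idea of ordering by page_id comes from the thread.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a toy stand-in for a page table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO page (page_id, page_title) VALUES (?, ?)",
        [(1, "Zebra"), (2, "Apple"), (3, "Mango"), (4, "Berlin"), (5, "Quartz")],
    )
    return conn


def fetch_page(conn, limit, after_id=None, newest_first=False):
    """Return up to `limit` rows ordered by page_id.

    `after_id` is the last page_id of the previous batch; the next batch
    continues strictly past it, in whichever direction is requested.
    """
    op, order = ("<", "DESC") if newest_first else (">", "ASC")
    sql = "SELECT page_id, page_title FROM page"
    params = []
    if after_id is not None:
        sql += f" WHERE page_id {op} ?"
        params.append(after_id)
    sql += f" ORDER BY page_id {order} LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


conn = make_demo_db()
first = fetch_page(conn, limit=2)                           # oldest pages first
second = fetch_page(conn, limit=2, after_id=first[-1][0])   # continue after last ID
newest = fetch_page(conn, limit=2, newest_first=True)       # newest pages first
```

Because page IDs are assigned in increasing order, ascending order roughly means oldest first and descending order newest first, which matches the "for the most part" caveat in the message.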
Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.
On Fri, May 15, 2015 at 3:01 PM, John Erling Blad jeb...@gmail.com wrote:
> ...and one additional thing: this page is used on the client side, not the server side. That implies that the communities on the client side should be asked, not the community on the server side. Which means that this list is the wrong forum.

John, I got the answers I needed from the people who actually use the page, and we're making it more useful for them based on this feedback, within the technical constraints we have right now.

Cheers
Lydia