Re: [Wikidata-l] DBpedia-based RDF dumps for Wikidata

2015-05-15 Thread Giovanni Tummarello
Hi Dimitris and everyone on the team, congratulations, great job. It will
certainly be useful.

Gio

On Fri, May 15, 2015 at 11:28 AM, Dimitris Kontokostas 
kontokos...@informatik.uni-leipzig.de wrote:

 Dear all,

 Following up on the early prototype we announced earlier [1] we are happy
 to announce a consolidated Wikidata RDF dump based on DBpedia.
 (Disclaimer: this work is not related or affiliated with the official
 Wikidata RDF dumps)

 We provide:
  * sample data for preview http://wikidata.dbpedia.org/downloads/sample/
  * a complete dump with over 1 Billion triples:
 http://wikidata.dbpedia.org/downloads/20150330/
  * a  SPARQL endpoint: http://wikidata.dbpedia.org/sparql
  * a Linked Data interface: http://wikidata.dbpedia.org/resource/Q586

 Using the Wikidata dump from March we were able to retrieve more than 1B
 triples, 8.5M typed things according to the DBpedia ontology along with 48M
 transitive types, 6.4M coordinates and 1.5M depictions. A complete report
 for this effort can be found here:
 http://svn.aksw.org/papers/2015/ISWC_Wikidata2DBpedia/public.pdf

 The extraction code is now fully integrated in the DBpedia Information
 Extraction Framework.

 We are eagerly waiting for your feedback and your help in improving the
 DBpedia to Wikidata mapping coverage
 http://mappings.dbpedia.org/server/ontology/wikidata/missing/

 Best,

 Ali Ismayilov, Dimitris Kontokostas, Sören Auer, Jens Lehmann, Sebastian
 Hellmann

 [1]
 http://www.mail-archive.com/dbpedia-discussion%40lists.sourceforge.net/msg06936.html

 --
 Dimitris Kontokostas
 Department of Computer Science, University of Leipzig & DBpedia
 Association
 Projects: http://dbpedia.org, http://aligned-project.eu
 Homepage: http://aksw.org/DimitrisKontokostas
 Research Group: http://aksw.org


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] DBpedia-based RDF dumps for Wikidata

2015-05-15 Thread Dimitris Kontokostas
Dear all,

Following up on the early prototype we announced earlier [1] we are happy
to announce a consolidated Wikidata RDF dump based on DBpedia.
(Disclaimer: this work is not related or affiliated with the official
Wikidata RDF dumps)

We provide:
 * sample data for preview http://wikidata.dbpedia.org/downloads/sample/
 * a complete dump with over 1 Billion triples:
http://wikidata.dbpedia.org/downloads/20150330/
 * a  SPARQL endpoint: http://wikidata.dbpedia.org/sparql
 * a Linked Data interface: http://wikidata.dbpedia.org/resource/Q586
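
As a rough illustration of how the SPARQL endpoint listed above might be queried, here is a minimal Python sketch. It assumes the endpoint accepts the standard SPARQL protocol `query` parameter over HTTP GET and can return JSON results. The helper name `build_request` and the example query are made up for illustration and are not part of the announcement.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://wikidata.dbpedia.org/sparql"  # URL taken from the announcement

# Illustrative query (an assumption): list some types of the example resource Q586.
QUERY = """
SELECT ?type WHERE {
  <http://wikidata.dbpedia.org/resource/Q586> a ?type .
} LIMIT 10
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_request(ENDPOINT, QUERY)

# To actually run it (needs network access; endpoint availability is assumed):
# import json
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

The request is only constructed here, so the sketch can be inspected without contacting the server.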

Using the Wikidata dump from March we were able to retrieve more than 1B
triples, 8.5M typed things according to the DBpedia ontology along with 48M
transitive types, 6.4M coordinates and 1.5M depictions. A complete report
for this effort can be found here:
http://svn.aksw.org/papers/2015/ISWC_Wikidata2DBpedia/public.pdf

The extraction code is now fully integrated in the DBpedia Information
Extraction Framework.

We are eagerly waiting for your feedback and your help in improving the
DBpedia to Wikidata mapping coverage
http://mappings.dbpedia.org/server/ontology/wikidata/missing/

Best,

Ali Ismayilov, Dimitris Kontokostas, Sören Auer, Jens Lehmann, Sebastian
Hellmann

[1]
http://www.mail-archive.com/dbpedia-discussion%40lists.sourceforge.net/msg06936.html

-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: http://aksw.org


Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.

2015-05-15 Thread Lydia Pintscher
Thanks so much for the feedback. That was useful. We'll rework the
page so that it no longer times out, as it currently does on larger wikis.
We'll achieve this by sorting the pages by page-id rather than
page-title. That should also make it relatively easy to
find the newest or oldest pages depending on which you're working on.
We'll also provide a filter for the namespace but that might come a
bit later.
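
To make the page-id idea above concrete, here is a small Python sketch using an in-memory SQLite table. The table layout, the sample rows, and the function names are assumptions for illustration only and do not reflect the actual MediaWiki schema or the planned implementation. It shows why ordering by an integer primary key makes paging simple: each request can continue from the last id seen, and reversing the order gives the newest pages.

```python
import sqlite3

# Hypothetical miniature of a wiki's page table; real schemas differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)")
conn.executemany(
    "INSERT INTO page VALUES (?, ?)",
    [(1, "Zebra"), (2, "Apple"), (3, "Mango"), (4, "Berlin"), (5, "Oslo")],
)


def fetch_page(conn, after_id=0, limit=2):
    """Return the next `limit` rows whose page_id is greater than `after_id`."""
    return conn.execute(
        "SELECT page_id, page_title FROM page "
        "WHERE page_id > ? ORDER BY page_id LIMIT ?",
        (after_id, limit),
    ).fetchall()


def fetch_newest(conn, limit=2):
    """Return the `limit` rows with the highest page_id."""
    return conn.execute(
        "SELECT page_id, page_title FROM page ORDER BY page_id DESC LIMIT ?",
        (limit,),
    ).fetchall()


first = fetch_page(conn)                           # oldest pages first
second = fetch_page(conn, after_id=first[-1][0])   # continue after the last id seen
newest = fetch_newest(conn)                        # highest ids first
```

Because ids are assigned in creation order, a higher page_id is treated here as a proxy for a more recently created page.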


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.



Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.

2015-05-15 Thread John Erling Blad
...and one additional thing: use of this page is client side, not
server side. That implies that the communities on the client side should
be asked, not the community on the server side. Which means that this
list is the wrong forum.

John

On Fri, May 15, 2015 at 2:58 PM, John Erling Blad jeb...@gmail.com wrote:
 That would effectively make the output quasirandom, make paging
 confusing, and make lookup of specific pages and the pages following
 them impossible. All in all, I think this makes the page useless for
 anybody except those projects that have managed to clean up the
 remaining unconnected pages, and those that have fewer than 5000 (I
 think that is the limit) pages in the list.

 I would rather suggest that an optional category should be added, and
 that it should be mandatory if the namespace page count indicates that
 the number of pages goes above some limit.

 I think the start was implemented as a prefix search originally, but I
 wonder if that is still the case. It could be wise to check it out. It
 could be an idea to use a default prefix search if none is added.

 If neither a prefix pattern nor a prefix search is used, then a
 page-id sorted first N hits can be returned. It should also be
 possible to switch between oldest first and newest first.
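
The fallback described in the paragraph above can be sketched in Python as follows. Everything here is hypothetical: the `Page` type, the function name `list_unconnected`, and the sample titles are invented for illustration, and the sketch works on an in-memory list rather than a database.

```python
from typing import NamedTuple, Optional


class Page(NamedTuple):
    page_id: int
    title: str


def list_unconnected(pages, prefix: Optional[str] = None,
                     limit: int = 3, newest_first: bool = False):
    """Hypothetical selection logic for an unconnected-pages listing."""
    if prefix is not None:
        # Prefix search: titles starting with the prefix, in title order.
        hits = sorted((p for p in pages if p.title.startswith(prefix)),
                      key=lambda p: p.title)
    else:
        # No prefix given: fall back to page-id order, oldest or newest first.
        hits = sorted(pages, key=lambda p: p.page_id, reverse=newest_first)
    return hits[:limit]


pages = [Page(10, "Bergen"), Page(11, "Alta"), Page(12, "Bodø"), Page(13, "Oslo")]
by_prefix = [p.title for p in list_unconnected(pages, prefix="B")]
oldest = [p.page_id for p in list_unconnected(pages, limit=2)]
newest = [p.page_id for p in list_unconnected(pages, limit=2, newest_first=True)]
```

With a prefix the listing stays in title order and specific pages can still be looked up; without one, the first N hits come back in page-id order in either direction.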

 The apparent caching behavior is probably due to the update arriving
 late, but it could be an indication of other issues. The same bug was
 fixed several years ago, and disappeared, so this may be a new bug.

 It is interesting that there is now a real need for caching and an
 API; the page itself wasn't much appreciated and was deemed unnecessary
 when it was created. It has this in common with a lot of the
 maintenance pages: the editors use them even if they are really
 crappy. Other special pages use old reports, often several days old,
 as their source. Maintenance reports should be up to date and
 actionable, not outdated and questionable.

 /rant

 John

 On Fri, May 15, 2015 at 2:07 PM, Lydia Pintscher
 lydia.pintsc...@wikimedia.de wrote:
 Thanks so much for the feedback. That was useful. We'll rework the
 page so that it no longer times out, as it currently does on larger wikis.
 We'll achieve this by sorting the pages by page-id rather than
 page-title. That should also make it relatively easy to
 find the newest or oldest pages depending on which you're working on.
 We'll also provide a filter for the namespace but that might come a
 bit later.


 Cheers
 Lydia



Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.

2015-05-15 Thread John Erling Blad
Thanks! What you are saying is that you don't want any feedback.

On Fri, May 15, 2015 at 3:07 PM, Lydia Pintscher
lydia.pintsc...@wikimedia.de wrote:
 On Fri, May 15, 2015 at 3:01 PM, John Erling Blad jeb...@gmail.com wrote:
 ...and one additional thing: use of this page is client side, not
 server side. That implies that the communities on the client side should
 be asked, not the community on the server side. Which means that this
 list is the wrong forum.

 John, I got the answers I needed from the people who actually use the
 page and we're making it more useful for them based on this feedback
 within the technical constraints we have right now.


 Cheers
 Lydia



Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.

2015-05-15 Thread Katie Filbert
On Fri, May 15, 2015 at 3:55 PM, John Erling Blad jeb...@gmail.com wrote:

 Thanks! What you are saying is that you don't want any feedback.


I am sure we do (as always). I think we already have (imho) enough feedback
and need to proceed as best we can (within our constraints) to satisfy
what folks want.

What I hear is that people like that the results in the special page are
live (vs. cached) and they use the iwlinks filter. I think we can adjust
how the query works to make it fast enough to not require caching (except
maybe squid caching for a minute).

Also, some people are interested in having results sorted by how old the
page is (so people can find recently created pages, or the other way around).
For the most part, page_id provides this while also being a better choice
performance-wise. It is also consistent with how Special:PagesWithProps
works. (Special:PagesWithProps is also pretty fast, thus it doesn't need to
be a cached special page)
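
For comparison, the MediaWiki action API has a `list=pageswithprop` query module that corresponds to Special:PagesWithProps. The Python sketch below only builds such a request URL. The wiki URL, the `wikibase_item` property name, and the helper name `pages_with_prop_url` are assumptions chosen for illustration, and the module lists pages that have a property, so it is not a replacement for the unconnected-pages listing.

```python
import urllib.parse


def pages_with_prop_url(api_url: str, prop: str, limit: int = 10,
                        newest_first: bool = False) -> str:
    """Build a list=pageswithprop query URL (the request is not sent)."""
    params = {
        "action": "query",
        "list": "pageswithprop",
        "pwppropname": prop,
        "pwplimit": limit,
        # Sort direction; the listing is assumed to be ordered by page id.
        "pwpdir": "descending" if newest_first else "ascending",
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


# Example wiki and property name are assumptions for illustration.
url = pages_with_prop_url(
    "https://en.wikipedia.org/w/api.php", "wikibase_item", newest_first=True
)
```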

I think we can also extend QueryPage so that we can make this information
available in the API (sorting by page title is not very suitable for
QueryPage). If for some reason we still need caching now or in the
future, it would be easier to add if we extend QueryPage.

Cheers,
Katie



 On Fri, May 15, 2015 at 3:07 PM, Lydia Pintscher
 lydia.pintsc...@wikimedia.de wrote:
  On Fri, May 15, 2015 at 3:01 PM, John Erling Blad jeb...@gmail.com
 wrote:
  ...and one additional thing: use of this page is client side, not
  server side. That implies that the communities on the client side should
  be asked, not the community on the server side. Which means that this
  list is the wrong forum.
 
  John, I got the answers I needed from the people who actually use the
  page and we're making it more useful for them based on this feedback
  within the technical constraints we have right now.
 
 
  Cheers
  Lydia
 




-- 
Katie Filbert
Wikidata Developer

Wikimedia Germany e.V. | Tempelhofer Ufer 23-24, 10963 Berlin
Phone (030) 219 158 26-0

http://wikimedia.de

Wikimedia Germany - Society for the Promotion of free knowledge eV Entered
in the register of Amtsgericht Berlin-Charlottenburg under the number 23
855 as recognized as charitable by the Inland Revenue for corporations I
Berlin, tax number 27/681/51985.


Re: [Wikidata-l] Using Special:Unconnected Pages? Please read.

2015-05-15 Thread Lydia Pintscher
On Fri, May 15, 2015 at 3:01 PM, John Erling Blad jeb...@gmail.com wrote:
 ...and one additional thing: use of this page is client side, not
 server side. That implies that the communities on the client side should
 be asked, not the community on the server side. Which means that this
 list is the wrong forum.

John, I got the answers I needed from the people who actually use the
page and we're making it more useful for them based on this feedback
within the technical constraints we have right now.


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
