Hi!

I'd like to announce that the category tree of certain wikis is now
available as an RDF dump and in the Wikidata Query Service.

More documentation is at:
https://www.mediawiki.org/wiki/Wikidata_query_service/Categories
which I will summarize briefly below.

The dumps are located at
https://dumps.wikimedia.org/other/categoriesrdf/. You can use these
dumps any way you wish; the data format is described at the link above[1].

The same dump is loaded into the "categories" namespace in WDQS, which can
be queried at
https://query.wikidata.org/bigdata/namespace/categories/sparql?query=SPARQL
(where SPARQL is your URL-encoded query). Sorry, there is no GUI support
yet (it will probably come later). See the example in the docs[2].
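
As a rough illustration of calling the endpoint above from a script, here
is a minimal Python sketch using only the standard library. The helper
names, the User-Agent string and the generic sample query are my own
assumptions for illustration; the Accept header asks for the standard
SPARQL JSON results format, which I am assuming the endpoint honors. For
queries that use the actual category vocabulary, see the docs[2].

```python
import json
import urllib.parse
import urllib.request

# Endpoint URL taken from the announcement.
ENDPOINT = "https://query.wikidata.org/bigdata/namespace/categories/sparql"

# Generic sample query (an assumption, not from the docs): it only lists a
# few arbitrary triples and does not rely on the category data format.
EXAMPLE_QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"


def build_query_url(sparql, endpoint=ENDPOINT):
    """Return the endpoint URL with the query URL-encoded as ?query=..."""
    return endpoint + "?" + urllib.parse.urlencode({"query": sparql})


def run_query(sparql, timeout=30.0):
    """Send the query over HTTP GET and return the parsed JSON response."""
    request = urllib.request.Request(
        build_query_url(sparql),
        headers={
            # Assumed: the endpoint returns SPARQL JSON results on request.
            "Accept": "application/sparql-results+json",
            # Hypothetical client name; use something that identifies you.
            "User-Agent": "categories-sparql-example/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only the URL is printed here; call run_query(EXAMPLE_QUERY) to
    # actually contact the service.
    print(build_query_url(EXAMPLE_QUERY))
```

Building the URL is kept separate from the network call so the encoding
can be checked without hitting the service.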

These datasets are not updated automatically yet, so they are current
only as of roughly the date of the latest dump. Hopefully this will soon
be automated, and then the datasets will be updated daily.

The list of currently supported wikis is here:
https://noc.wikimedia.org/conf/categories-rdf.dblist - these are
basically all 1M+ wikis and a couple more that I added for various
reasons. If you have a good candidate wiki to add, please tell me or
write on the talk page for the document above.
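
If you want to check from a script whether a wiki is on the list, a
small sketch like the following might help. It assumes the dblist is
plain text with one database name per line and that anything after a "#"
is a comment; that is my reading of the format, not something stated
above. The function name and the sample names in the usage line are
hypothetical.

```python
from typing import List


def parse_dblist(text: str) -> List[str]:
    """Return the database names found in the text of a dblist file.

    Assumed format: one name per line, blank lines ignored, and anything
    after a '#' treated as a comment.
    """
    names = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.append(entry)
    return names
```

For example, "examplewiki" in parse_dblist(downloaded_text) tells you
whether that database name is listed, where downloaded_text is the body
of the file fetched from the URL above.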

Please note this is only the first step for the project, so there might
still be some rough edges. I am announcing it early since I think it
would be useful for people to look at the dumps and SPARQL endpoint and
see if something is missing or does not work properly, and share ideas
on how it can be used.

We plan to eventually use it for search improvement[3]; this work is
still in progress.

As always, we welcome any comments and suggestions.

[1]
https://www.mediawiki.org/wiki/Wikidata_query_service/Categories#Data_format
[2]
https://www.mediawiki.org/wiki/Wikidata_query_service/Categories#Accessing_the_data
[3] https://phabricator.wikimedia.org/T165982

Thanks,
-- 
Stas Malyshev
smalys...@wikimedia.org

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
