Smalyshev added a comment.
Yes, we track the category tree, but not category membership for individual pages
(a much bigger data set, obviously).
TASK DETAIL
https://phabricator.wikimedia.org/T157676
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
Lucas_Werkmeister_WMDE added a comment.
We have the category tree in Blazegraph (in a separate namespace), but I
think this task was asking for the membership of individual pages in categories
(i.e. not just subcategories), which we don’t have yet.
ArielGlenn added a comment.
I somehow had the impression that this was complete. Is that wrong?
Smalyshev added a comment.
This is partially done; see https://www.mediawiki.org/wiki/Wikidata_query_service/Categories. The SPARQL engine now has the category tree, and it can be queried.
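As a rough illustration of what querying the category tree could look like, here is a minimal sketch that builds a SPARQL property-path query for all direct and indirect subcategories of one category. The endpoint URL, the `mediawiki:` ontology prefix, and the `mediawiki:isInCategory` predicate are assumptions based on the documentation page linked above and should be checked there; the function names and the example category IRI are hypothetical.

```python
from urllib.parse import urlencode

# Assumed values, not confirmed in this thread; verify against the
# Wikidata_query_service/Categories documentation page before use.
ENDPOINT = "https://query.wikidata.org/bigdata/namespace/categories/sparql"
MEDIAWIKI_PREFIX = "https://www.mediawiki.org/ontology#"


def subcategory_query(category_iri: str, limit: int = 100) -> str:
    """Build a SPARQL query for direct and indirect subcategories.

    The `+` property path follows the (assumed) isInCategory edge one or
    more times, so nested subcategories are included.
    """
    return (
        f"PREFIX mediawiki: <{MEDIAWIKI_PREFIX}>\n"
        "SELECT ?sub WHERE {\n"
        f"  ?sub mediawiki:isInCategory+ <{category_iri}> .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def request_url(query: str) -> str:
    """Encode the query as a GET URL against the assumed endpoint."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    # Hypothetical category IRI, used only for illustration.
    q = subcategory_query("https://en.wikipedia.org/wiki/Category:Physics")
    print(q)
    print(request_url(q))
```

Note that this only walks the category tree; as discussed in this task, membership of individual pages in categories is not loaded.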
gerritbot added a comment.
Change 379838 merged by jenkins-bot:
[wikidata/query/rdf@master] Add stored query for category traversal
https://gerrit.wikimedia.org/r/379838
gerritbot added a comment.
Change 379838 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[wikidata/query/rdf@master] Add stored query for category traversal
https://gerrit.wikimedia.org/r/379838
gerritbot added a comment.
Change 377372 merged by jenkins-bot:
[wikidata/query/rdf@master] Add script for loading category data
https://gerrit.wikimedia.org/r/377372
gerritbot added a comment.
Change 377372 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[wikidata/query/rdf@master] Add script for loading category data
https://gerrit.wikimedia.org/r/377372
gerritbot added a comment.
Change 373404 merged by Gehel:
[operations/puppet@production] Enable access to arbitrary namespaces for WDQS
https://gerrit.wikimedia.org/r/373404
gerritbot added a comment.
Change 327862 merged by jenkins-bot:
[mediawiki/core@master] Produce RDF dump of all categories and subcategories in a wiki.
https://gerrit.wikimedia.org/r/327862
gerritbot added a comment.
Change 373404 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[operations/puppet@production] Enable access to arbitrary namespaces for WDQS
https://gerrit.wikimedia.org/r/373404
gerritbot added a comment.
Change 373395 merged by jenkins-bot:
[wikidata/query/rdf@master] Add script for creating new namespaces
https://gerrit.wikimedia.org/r/373395
gerritbot added a comment.
Change 373395 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[wikidata/query/rdf@master] Add script for creating new namespaces
https://gerrit.wikimedia.org/r/373395
gerritbot added a comment.
Change 373354 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[operations/puppet@production] [WIP] Add RDF dumps for categories
https://gerrit.wikimedia.org/r/373354
gerritbot added a comment.
Change 327862 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/core@master] Produce RDF dump of all categories and subcategories in a wiki.
https://gerrit.wikimedia.org/r/327862
gerritbot added a comment.
Change 359055 merged by jenkins-bot:
[mediawiki/vendor@master] Add Purtle library for RDF generation
https://gerrit.wikimedia.org/r/359055
gerritbot added a comment.
Change 359055 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/vendor@master] Add Purtle library for RDF generation
https://gerrit.wikimedia.org/r/359055
Smalyshev added a comment.
I do not think we plan to represent all MediaWiki database contents in RDF just yet. Categories may make sense, since categories are a graph-like structure anyway and may be useful for Structured Commons. Anything else would require much more planning. And, probably, a
Bugreporter added a comment.
If you want to export category metadata as RDF in MediaWiki core, there is much more that can be exposed: size, number of links, last edits and whether they are flagged as/made by a bot, the redirect/disambig status, and even the pages linked to or transcluded. All are supported in
Smalyshev added a comment.
There are two ways to approach it:
Use MW API (T148245)
Export categories as a graph and load it into WDQS (see https://gerrit.wikimedia.org/r/#/c/327862/ for a preview of how it could look)
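To make the second approach a bit more concrete, here is a small sketch of treating exported category data as a graph and collecting every subcategory of a root by breadth-first traversal. The `(child, parent)` edge format, the function names, and the sample categories are all hypothetical; the real dump is RDF and would be queried in WDQS rather than in Python.

```python
from collections import defaultdict, deque


def build_children_index(edges):
    """Index (child, parent) edges as parent -> set of direct subcategories."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    return children


def all_subcategories(children, root):
    """Collect direct and indirect subcategories of `root`.

    Category graphs can contain cycles, so visited nodes are tracked and
    the root itself is never reported as its own subcategory.
    """
    seen = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child != root and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Made-up sample data, including a cycle (Science is placed under Optics).
EDGES = [
    ("Physics", "Science"),
    ("Chemistry", "Science"),
    ("Optics", "Physics"),
    ("Science", "Optics"),
]

if __name__ == "__main__":
    index = build_children_index(EDGES)
    print(sorted(all_subcategories(index, "Science")))
```

The cycle handling matters in practice because wiki category structures are not guaranteed to be strict trees.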