Re: [Wikidata-l] Reasonator ignores "of" qualifier
Derric,

Good luck with your edit-a-thon today [1], and it would be awesome if you could introduce Reasonator and Wikidata while discussing Wikipedia. I think it is very interesting that you discovered Magnus through Reasonator instead of the other way around, i.e. discovering more of Magnus through his Wikipedia tools. So few people venture outside of the text-oriented Wikipedia projects; it's always refreshing to see someone in a public-facing capacity getting involved in Wikidata.

[1] https://en.wikipedia.org/wiki/User:Zellfaze/Edit-a-thon_Planning

Jane

On Fri, Jun 13, 2014 at 5:27 PM, Derric Atzrott <datzr...@alizeepathology.com> wrote:

>> FYI, I maintain Reasonator, as a click on Other/About Reasonator would have revealed...
>
> Oh, so it does. I have no idea how I missed that. I promise I did try to look around.
>
>> Absolutely. Do I take it that you volunteer? The code is at: https://bitbucket.org/magnusmanske/reasonator I can add you to the codebase there, and/or make you co-maintainer on the tool.
>
> I'll volunteer at the very least. I'll take a look and see if I can find what needs to be modified and then possibly submit a patch. This seems like one of those things that ought to be a pretty simple fix. No need to make me co-maintainer; I'm not looking for that kind of responsibility.
>
>> Because I sure don't have the time at the moment to fiddle with some edge cases in a secondary component of one of my 50+ tools.
>
> Sorry if I came off rude or demanding. That was not my intention. With that many tools, I'm sure there are more important bugs. I just wanted to add this to the list. I'll take a look at the source and see if I can find where the issue is and how to fix it.
>
> Thank you,
> Derric Atzrott

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Reasonator ignores "of" qualifier
Hi Derrick,

I think it's pointless to ask whether some concept deserves an item; that question does not get you very far. There is much more value in regularity in how we express the same kind of data: it makes it much simpler to develop tools, and to help newbies learn how to do things, if we decide on one way of expressing that somebody is the mayor of somewhere and stick to it.

2014-06-13 16:57 GMT+02:00 Derric Atzrott <datzr...@alizeepathology.com>:

>> It does crop up in many places.. What you see is not the Reasonator per se; it is the script used to generate a text. Compare it with the results in most other languages: you will not see this text.
>
> That's fair. In that case the script to generate the text should be modified.
>
>> Arguably Reasonator expects a different pattern. You make him "mayor", while Reasonator expects him to have "office held: Mayor of Frederick". Compare Ronald Reagan, where you find qualifiers for start and end date: http://tools.wmflabs.org/reasonator/?q=9960
>
> For a while I was actually using "Mayor of Frederick" as the position held. You can find that item as Q17167581, but I nominated it for deletion once I discovered the "of" qualifier. The position itself is not really noteworthy enough to merit its own item (the way President of the United States or Mayor of New York City are), and because we have the "of" qualifier I thought, why not use it?
>
> Where does one submit bugs for Reasonator, or is this list a good spot to do so?
>
> Thank you,
> Derric Atzrott

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
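[Editor's note: the two ways of modelling the mayorship compared in the thread above can be written out as toy statement structures. This is an illustrative sketch only; the field names are simplified stand-ins, not the real Wikibase data model.]

```python
# Alternative 1: a generic position plus an "of" qualifier
# (the pattern Derric used, which Reasonator's text script ignores).
with_of_qualifier = {
    "property": "position held",
    "value": "mayor",
    "qualifiers": {"of": "Frederick"},
}

# Alternative 2: a dedicated item for the position itself
# (the pattern Reasonator expects, as with Ronald Reagan / Q9960;
# "Mayor of Frederick" briefly existed as item Q17167581).
with_dedicated_item = {
    "property": "position held",
    "value": "Mayor of Frederick",
    "qualifiers": {"start time": "...", "end time": "..."},
}

# A text generator that only inspects the main value sees "mayor"
# in alternative 1 and never reaches the "of" qualifier:
print(with_of_qualifier["value"])          # -> mayor
print(with_of_qualifier["qualifiers"]["of"])  # -> Frederick
```

The regularity argument above is essentially about picking one of these two shapes and sticking to it, so that every tool knows where to look.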
Re: [Wikidata-l] Wikidata RDF exports
Eric,

Two general remarks first:

(1) Protege is for small and medium ontologies, but not really for such large datasets. To get SPARQL support for the whole data, you could install Virtuoso. It also comes with a simple Web query UI. Virtuoso does not do much reasoning, but you can use SPARQL 1.1 transitive closure in queries (using * after properties), so you can find all subclasses there too. (You could also try this in Protege ...)

(2) If you want to explore the class hierarchy, you can also try our new class browser: http://tools.wmflabs.org/wikidata-exports/miga/?classes It has the whole class hierarchy, but without the leaves (= instances of classes, plus subclasses that have no own subclasses/instances). For example, it tells you that lepton has 5 direct subclasses, but shows only one: http://tools.wmflabs.org/wikidata-exports/miga/?classes#_item=3338 On the other hand, it includes relationships of classes and properties that are not part of the RDF (we extract these from the data by considering co-occurrence). Example: classes that have no superclasses but at least 10 instances, and which are often used with the property 'sex or gender': http://tools.wmflabs.org/wikidata-exports/miga/?classes#_cat=Classes/Direct%20superclasses=__null/Number%20of%20direct%20instances=10%20-%202/Related%20properties=sex%20or%20gender I have already added superclasses for some of those in Wikidata; data in the browser is updated with some delay based on dump files.

More answers below:

On 14/06/14 05:52, emw wrote:

> Markus, Thank you very much for this. Translating Wikidata into the language of the Semantic Web is important. Being able to explore the Wikidata taxonomy [1] by doing SPARQL queries in Protege [2] (even primitive queries) is really neat, e.g.
>
>     SELECT ?subject WHERE { ?subject rdfs:subClassOf <http://www.wikidata.org/entity/Q82586> . }
>
> This is more of an issue of my ignorance of Protege, but I notice that the above query returns only the direct subclasses of Q82586. The full set of subclasses for Q82586 (lepton) is visible at http://tools.wmflabs.org/wikidata-todo/tree.html?q=Q82586&rp=279&lang=en -- a few of the 2nd-level subclasses (muon neutrino, tau neutrino, electron neutrino) are shown there but not returned by that SPARQL query. It seems rdfs:subClassOf isn't being treated as a transitive property in Protege. Any ideas?

You need a reasoner to compute this properly. For a plain class hierarchy as in our case, ELK should be a good choice [1]. You can install the ELK Protege plugin and use it to classify the ontology [2]. Protege will then show the computed class hierarchy in the browser; I am not sure what happens to the SPARQL queries (it's quite possible that they don't use the reasoner).

[1] https://code.google.com/p/elk-reasoner/
[2] https://code.google.com/p/elk-reasoner/wiki/ElkProtege

> Do you know when the taxonomy data in OWL will have labels available?

We had not thought of this as a use case. A challenge is that the label data is quite big because of the many languages. Should we maybe create an English label file for the classes? Descriptions too, or just labels?

> Also, regarding the complete dumps, would it be possible to export a smaller subset of the faithful data? The files under Complete Data Dumps in http://tools.wmflabs.org/wikidata-exports/rdf/exports/20140526/ look too big to load into Protege on most personal computers, and would likely require adjusting JVM settings on higher-end computers to load. If it's feasible to somehow prune those files -- and maybe even combine them into one file that could be easily loaded into Protege -- that would be especially nice.

What kind of pruning do you have in mind? You can of course take a subset of the data, but then some of the data will be missing.

A general remark on mixing and matching RDF files: we use N3 format, where every line in the ontology is self-contained (no multi-line constructs, no header, no namespaces). Therefore, any subset of the lines of any of our files is still a valid file. So if you want to have only a slice of the data (maybe to experiment with), then you could simply do something like:

    gunzip -c wikidata-statements.nt.gz | head -1 > partial-data.nt

head simply selects the first line here. You could also use grep to select specific triples instead, such as:

    zgrep "http://www.w3.org/2000/01/rdf-schema#label" wikidata-terms.nt.gz | grep "@en ." > en-labels.nt

This selects all English labels. I am using zgrep here for a change; you can also use gunzip as above. Similar methods can also be used to count things in the ontology (use grep -c to count lines = triples).

Finally, you can combine multiple files into one by simply concatenating them in any order:

    cat partial-data-1.nt >> mydata.nt
    cat partial-data-2.nt >> mydata.nt
    ...

Maybe you can experiment a bit and let us know if there is any export that would be particularly
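[Editor's note: the gunzip/zgrep pipeline for slicing the dumps translates directly into a few lines of Python, which may be handier on systems without the GNU tools. This is an illustrative sketch, not part of the export tooling; the file names are hypothetical.]

```python
import gzip

def extract_english_labels(dump_path, out_path):
    """Copy every rdfs:label triple with an @en language tag to out_path.

    Working line by line is safe because each N-Triples line is a
    self-contained triple (no multi-line constructs, no header).
    """
    label_iri = "http://www.w3.org/2000/01/rdf-schema#label"
    with gzip.open(dump_path, "rt", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Mirrors: zgrep '...#label' dump.nt.gz | grep '@en .'
            if label_iri in line and "@en ." in line:
                dst.write(line)
```

The same line-by-line approach covers the other shell examples too: slicing is `itertools.islice` over the open file, counting triples is summing matching lines, and combining files is appending lines to one output file in any order.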
[Wikidata-l] RFC about part of
This RFC was pending after several discussions: how should the property "part of" be clarified?

https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Refining_%22part_of%22

Cheers,
Micru

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Wikidata RDF exports
Markus,

Thanks for the thorough reply!

> you can use SPARQL 1.1 transitive closure in queries (using * after properties), so you can find all subclasses there too. (You could also try this in Protege ...)

I had a feeling I was missing something basic. (I'm also new to SPARQL.) Using * after the property got me what I was looking for by default in Protege. That is,

    SELECT ?subject WHERE { ?subject rdfs:subClassOf* <http://www.wikidata.org/entity/Q82586> . }

-- with an asterisk after rdfs:subClassOf -- got me the transitive closure and returned all subclasses of Q82586 / lepton.

> Should we maybe create an English label file for the classes? Descriptions too or just labels?

A file with English labels and descriptions for classes would be great and would, I think, address this use case. Per your note, I suppose one would simply concatenate that English terms file and wikidata-taxonomy.nt into a new .nt file, then import that into Protege to explore the class hierarchy. (Having every line in the ontology be self-contained in N3 is very convenient!)

Regarding the pruned subset, I think the command-line approach in your examples is enough for me to get started making my own. I won't have time to experiment with these things for a few weeks, but I will return to this then and let you know any interesting findings.

Cheers,
Eric

On Sat, Jun 14, 2014 at 4:41 AM, Markus Krötzsch <mar...@semantic-mediawiki.org> wrote:

> Eric, Two general remarks first: (1) Protege is for small and medium ontologies, but not really for such large datasets. To get SPARQL support for the whole data, you could install Virtuoso. It also comes with a simple Web query UI. Virtuoso does not do much reasoning, but you can use SPARQL 1.1 transitive closure in queries (using * after properties), so you can find all subclasses there too. (You could also try this in Protege ...)
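[Editor's note: the effect of the `*` property path discussed in this thread can be illustrated without any SPARQL engine. The following sketch uses a toy hierarchy, not the real Wikidata subclass data (lepton actually has 5 direct subclasses), to show what the transitive closure adds over a direct-subclass query.]

```python
from collections import deque

def all_subclasses(root, direct_subclasses):
    """Return root plus every direct or indirect subclass of it.

    This is the set that `?s rdfs:subClassOf* <root>` matches in
    SPARQL 1.1: `*` means "zero or more subClassOf steps", so the
    root class itself is included in the result.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        cls = queue.popleft()
        for sub in direct_subclasses.get(cls, ()):
            if sub not in seen:
                seen.add(sub)
                queue.append(sub)
    return seen

# Toy fragment of the lepton hierarchy (class -> direct subclasses);
# illustrative shape only, not the actual taxonomy dump.
hierarchy = {
    "lepton": ["electron", "muon", "neutrino"],
    "neutrino": ["electron neutrino", "muon neutrino", "tau neutrino"],
}
```

With this toy data, `all_subclasses("lepton", hierarchy)` also returns the three second-level neutrinos that a plain (non-transitive) `rdfs:subClassOf` query would miss.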
[Wikidata-l] Weekly Summary #113
Here is the latest summary of what has been happening around Wikidata! As always, feedback is appreciated.

Discussions

- Wikidata participation in Wiki Loves Pride: https://www.wikidata.org/wiki/Wikidata:Project_chat#Wikidata_participation_in_Wiki_Loves_Pride
- Discussion: "Delete" as a new user group: https://www.wikidata.org/wiki/Wikidata:Project_chat#Delete_as_own_usergroup
- RfC: Refining "part of": https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Refining_%22part_of%22
- Open RfAs: Calak (https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Administrator/Calak), Andre Engels (https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Administrator/Andre_Engels)
- Closed RfAs: Taketa (https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Administrator/Taketa, successful), Jianhui67 (https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Administrator/Jianhui67, successful), 555 (https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Administrator/555, successful)

Other Noteworthy Stuff

- The preview gadget (https://www.wikidata.org/wiki/MediaWiki:Gadget-Preview.js) has been rewritten and now comes with a completely new user interface. You can enable it at https://www.wikidata.org/wiki/Special:Preferences#mw-prefsection-gadgets .
- Wikiquote received access to the data on Wikidata (aka phase 2) on Tuesday.
- Try out the new Wikidata Property Browser: https://lists.wikimedia.org/pipermail/wikidata-l/2014-June/004013.html
- The Merge gadget (https://www.wikidata.org/wiki/MediaWiki:Gadget-Merge.js) has been updated. It is now faster and performs only one edit per merge.

Did you know?
- Newest properties: original spelling (https://www.wikidata.org/wiki/Property:P1353), ranking (https://www.wikidata.org/wiki/Property:P1352), number of points/goals scored (https://www.wikidata.org/wiki/Property:P1351), number of matches played (https://www.wikidata.org/wiki/Property:P1350), ploidy (https://www.wikidata.org/wiki/Property:P1349), AlgaeBase URL (https://www.wikidata.org/wiki/Property:P1348), military casualty classification (https://www.wikidata.org/wiki/Property:P1347), winner (https://www.wikidata.org/wiki/Property:P1346), number of victims (https://www.wikidata.org/wiki/Property:P1345), participated in (https://www.wikidata.org/wiki/Property:P1344), described by source (https://www.wikidata.org/wiki/Property:P1343), number of seats (https://www.wikidata.org/wiki/Property:P1342)
- Newest task forces (https://www.wikidata.org/wiki/Wikidata:Task_forces): Sports task forces (https://www.wikidata.org/wiki/Wikidata:Sports_task_forces)

Development

- Continued with the implementation of redirects. There are a lot of corner cases to work out...
- Created new data value handlers for quantities and geo coordinates, bringing queries closer.
- Moved browser tests to a new repository at https://github.com/wmde/WikidataBrowserTests to make the main Wikibase repo smaller and cleaner.
- Fixed a bunch of issues with the Monobook skin.
- Made some more jQuery 1.9 compatibility fixes.
- More cleanup for the coming switch to WikibaseDataModel 1.0.
- Icinga dispatch-lag monitoring scripts, including an IRC notifier bot, have been tested and are ready for Ops implementation. This should give us quicker notifications in case the notifications to Wikipedia and co. about changes on Wikidata are slow again.
- Enabled data access for Wikiquote.

See current sprint items (https://bugzilla.wikimedia.org/buglist.cgi?list_id=218716&resolution=---&resolution=LATER&resolution=DUPLICATE&emailtype1=substring&emailassigned_to1=1&query_format=advanced&bug_status=ASSIGNED&email1=wikidata) for what we're working on next.
You can view the commits currently in review here https://gerrit.wikimedia.org/r/#/q/(+project:mediawiki/extensions/Wikibase+OR+project:mediawiki/extensions/Diff+OR+project:mediawiki/extensions/DataValues+OR+project:mediawiki/extensions/WikibaseSolr+OR+project:mediawiki/extensions/Ask+OR+project:mediawiki/extensions/WikibaseQuery+OR+project:mediawiki/extensions/WikibaseDatabase+OR+project:mediawiki/extensions/WikibaseQueryEngine+OR+project:mediawiki/extensions/WikibaseDataModel+)+status:open,n,z and the ones that have been merged here https://gerrit.wikimedia.org/r/#/q/(+project:mediawiki/extensions/Wikibase+OR+project:mediawiki/extensions/Diff+OR+project:mediawiki/extensions/DataValues+OR+project:mediawiki/extensions/WikibaseSolr+OR+project:mediawiki/extensions/Ask+OR+project:mediawiki/extensions/WikibaseQuery+OR+project:mediawiki/extensions/WikibaseDatabase+OR+project:mediawiki/extensions/WikibaseQueryEngine+OR+project:mediawiki/extensions/WikibaseDataModel+)+status:merged,n,z . You can see all open bugs related to Wikidata here
[Wikidata-l] New template for Wikipedia articles about Wikidata properties
I have created {{Wikidata property}} [1] (example on [2]), for suitable en.Wikipedia articles about subjects for which we have a property in Wikidata. Please help to improve and apply it (can anyone generate a list of relevant articles?), and to migrate it to other-language Wikipedias. [1] https://en.wikipedia.org/wiki/Template:Wikidata_property [2] https://en.wikipedia.org/wiki/ORCID -- Andy Mabbett @pigsonthewing http://pigsonthewing.org.uk ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l