Re: [Pywikipedia-l] Wikidata and Pywikipedia

Amir Ladsgroup Sun, 03 Mar 2013 10:13:40 -0800

You can count on me for writing the code :) Tell me if you need more hands.


On Sat, Mar 2, 2013 at 7:48 PM, Yuri Astrakhan <[email protected]> wrote:

> Of course, the goal is to keep (and even enhance) the current
> functionality. And yes, there will be a significant overlap period during
> which both APIs work. The main reason for posting it now is so that whoever
> decides to implement it knows the Wikidata API roadmap and can plan
> accordingly. Plus I hope for a ton of good suggestions :)
>
>
> On Sat, Mar 2, 2013 at 11:13 AM, Maarten Dammers <[email protected]> wrote:
>
>>  Hi Yurik,
>>
>> On 2-3-2013 17:00, Yuri Astrakhan wrote:
>>
>> I would also like to bring up the pending Wikidata API RFC at
>> http://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API
>>
>> Thanks for pointing that out.
>>
>>
>>  In that RFC I plan to unify the Wikidata API so that it integrates more
>> seamlessly with the core query API. Any feedback from the pywiki community
>> is welcome.
>>
>> If we can still functionally do the same things I described below, you
>> probably won't get any complaints from this side. Is that the case?
>> Can you please include a period of overlap between the old-style and
>> new-style API so we have time to update the framework?
>>
>> Maarten
>>
>>
>>
>>  On Sat, Mar 2, 2013 at 10:49 AM, Maarten Dammers <[email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>> As you might know, phase 1 of Wikidata (interwiki links) is live on a lot
>>> of Wikipedias and will soon be turned on for all Wikipedias. Phase 2 is
>>> next; that's basically about infobox data. We are going to need a lot of
>>> clever bots to fill Wikidata. To make that possible, Pywikipedia should
>>> (properly) implement Wikidata. That way bot authors don't have to worry or
>>> care about the inner workings of the Wikidata API; they just talk to the
>>> framework. At the moment trunk has a first implementation that isn't very
>>> clean, and in the rewrite it's still missing.
>>>
>>> Legoktm and I talked about this on IRC. We need to have a proper data
>>> model in Pywikipedia. Based on
>>> https://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer :
>>> * WikibasePage is a subclass of Page and has some basic shared functions
>>> for labels, descriptions and aliases
>>> * ItemPage is a subclass of WikibasePage with some item specific
>>> functions like claims and sitelinks (example
>>> https://www.wikidata.org/wiki/Q256638)
>>> * PropertyPage is a subclass of WikibasePage with some property specific
>>> functions for the datatype (example
>>> https://www.wikidata.org/wiki/Property:P22)
>>> * QueryPage is a subclass of WikibasePage for the future query type
>>> * Claim is a subclass of object for claims. Simplified: it's a property
>>> (P22, father) attached to an item (Q256638, the princess) linking to another
>>> item (Q380949, Willem IV)
>>>
>>> You can get these pages like a normal page (site object + title), but
>>> you probably also want to get them based on a Wikipedia page. For that
>>> there is
>>> https://www.wikidata.org/wiki/Special:ItemByTitle/enwiki/Princess%20Carolina%20of%20Orange-Nassau.
>>>  We should have a staticmethod itemByPage(Page) in which Page is
>>> https://en.wikipedia.org/wiki/Princess_Carolina_of_Orange-Nassau and it
>>> will give you the ItemPage object for
>>> https://www.wikidata.org/wiki/Q256638. Currently in trunk the DataPage
>>> object has a constructor where you can give a Page object and you'll get
>>> the corresponding DataPage. I don't think that's the way to do it, because
>>> it violates the data model and will get us into a lot of trouble later on
>>> when other sites (like Commons) might implement the Wikibase extension.
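
A rough sketch of what the lookup behind itemByPage could look like. It assumes the sites/titles parameters of wbgetentities are used instead of Special:ItemByTitle, and that an unknown title comes back as an entry carrying a "missing" key; both helper names are made up for illustration, and no request is actually sent here.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def item_lookup_url(site_id, title):
    """Build a wbgetentities URL that looks an item up by site id and title."""
    params = {
        "action": "wbgetentities",
        "sites": site_id,   # e.g. "enwiki"
        "titles": title,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def item_id_from_response(response):
    """Return the item id from a decoded response, or None if nothing matched.

    Assumption: entries for unknown titles carry a "missing" key.
    """
    for entity_id, entity in response.get("entities", {}).items():
        if "missing" not in entity:
            return entity_id
    return None


url = item_lookup_url("enwiki", "Princess Carolina of Orange-Nassau")
```

A static itemByPage(Page) could then derive the site id and title from the Page, fetch this URL and wrap the returned id in an ItemPage.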
>>>
>>> A WikibasePage should work the same as a normal page when it comes to
>>> fetching data. It should start out in an initial state (just a title, no
>>> content), and once you use a function that needs data (or you force it), it
>>> should fetch all the data from Wikibase and cache it.
>>> * For an item the data looks like
>>> https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q256638&format=json
>>> * For a property the data looks like
>>> https://www.wikidata.org/w/api.php?action=wbgetentities&ids=P22&format=json
>>> Parts of the data (descriptions, aliases and labels) should be processed
>>> in the get function of WikibasePage, other parts in ItemPage / PropertyPage.
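
The fetch-on-first-use behaviour described above could be sketched roughly like this. The class name, the force flag and the injected fetch callable are assumptions for illustration; the fake fetcher stands in for a real wbgetentities call and only mimics the shape of its "labels" section.

```python
class LazyWikibasePage:
    """Holds only a title until data is needed, then fetches once and caches."""

    def __init__(self, title, fetch):
        self.title = title
        self._fetch = fetch    # callable: title -> entity dict
        self._content = None   # nothing loaded yet

    def get(self, force=False):
        """Return the entity data, fetching it on first use or when forced."""
        if self._content is None or force:
            self._content = self._fetch(self.title)
        return self._content

    @property
    def labels(self):
        """Labels as {language: text}, processed from the cached data."""
        raw = self.get().get("labels", {})
        return {lang: entry["value"] for lang, entry in raw.items()}


calls = []


def fake_fetch(title):
    """Stand-in for the API call; records each request it receives."""
    calls.append(title)
    return {"id": title, "labels": {"en": {"language": "en", "value": "father"}}}


page = LazyWikibasePage("P22", fake_fetch)
first = page.labels     # triggers the first fetch
second = page.labels    # served from the cache
page.get(force=True)    # forces a second fetch
```

Passing the fetcher in keeps the caching logic testable without network access.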
>>>
>>> Based on the API we should probably have some generators:
>>> * One or more generators that use wbgetentities to (pre-)fetch objects
>>> * A search generator that uses wbsearchentities
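
For the pre-fetching generator, a small sketch of how ids could be grouped into wbgetentities requests. The batch size of 50 is an assumption about the per-request limit for ordinary accounts, and the function names are made up; the generator only yields request parameters and leaves sending them to the framework.

```python
from itertools import islice


def batched_ids(ids, size=50):
    """Yield lists of at most `size` ids from any iterable of ids."""
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def wbgetentities_requests(ids, size=50):
    """Yield one wbgetentities parameter dict per batch of ids."""
    for batch in batched_ids(ids, size):
        yield {
            "action": "wbgetentities",
            "ids": "|".join(batch),   # ids are pipe-separated in the API
            "format": "json",
        }


requests = list(wbgetentities_requests("Q%d" % n for n in range(1, 121)))
```

Here 120 ids turn into three requests (50, 50 and 20 ids), so a bot walking a long list of items makes far fewer round trips than one request per item.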
>>>
>>> WikibasePage:
>>> * Set/add/delete label (@property?)
>>> * Set/add/delete description (@property?)
>>> * Set/add/delete alias (@property?)
>>>
>>> ItemPage
>>> * Set/add/delete sitelink (@property?)
>>>
>>> Claim logic
>>>
>>> Not sure how we can use wbeditentity and wblinktitles
>>>
>>> We took some notes on
>>> https://www.mediawiki.org/wiki/Manual:Pywikipediabot/Wikidata/Rewrite_proposal.
>>>
>>> What do you think? Is this the right direction? Feedback is appreciated.
>>>
>>> Maarten
>>>
>>>
>>> _______________________________________________
>>> Pywikipedia-l mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>>>


-- 
Amir

