[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2019-01-16 Thread Jheald
Jheald added a comment.
As a postscript to my comment two posts above, note that in such a scenario a Commons category page might well be associated with both an item on Wikidata (via a sitelink equivalence) and a local item on the Commons wikibase.

TASK DETAIL
https://phabricator.wikimedia.org/T211800

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Jheald
Cc: Jheald, matthiasmullie, Smalyshev, Lydia_Pintscher, Addshore, MarkTraceur, WMDE-leszek, Cparle, Jdforrester-WMF, Abit, EBjune, Ramsey-WMF, Aklapper, daniel, Nandana, JKSTNK, Lahi, PDrouin-WMF, Gq86, E1presidente, Anooprao, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, LawExplorer, Silverfish, Poyekhali, _jensen, D3r1ck01, Susannaanas, Wong128hk, Jane023, Wikidata-bugs, Base, aude, El_Grafo, Dinoguy1000, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2019-01-15 Thread Abit
Abit added a comment.
@WMDE-leszek have you had a chance to think about how WMDE might implement this?  Are you able to share any timeframe?


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2019-01-12 Thread Jheald
Jheald added a comment.
In the context of this thread, it's worth recalling the ongoing wish from Commons users for Commons categories to be able to have their own local items on the Commons wikibase.

At the moment only items on Wikidata itself are available to store structured data for Commons categories.  This is okay so far as it goes -- currently there are about 2.1 million Wikidata items linked to Commons categories and supporting "wikidata infobox", which have been very well received.

But that is still under 30% of all Commons categories, and the Wikidata community has some very considerable reservations about extending it further -- see for example this discussion recently at Project Chat.

It would be highly desirable to be able to store structured data for all 7.3 million Commons categories -- and in particular for categories for complex intersections of topics, and for "non-notable" people, both of which are rather unwelcome on Wikidata.

Being able to document, by wikibase statements, what these categories relate to would be hugely helpful, to

- support wikibase-derived infoboxes, to explain the meaning of the category internationally and multilingually
- allow internationalised labels and descriptions to be added -- a long-time Commons request
- allow volunteers to work together to build up a structured understanding of the meaning of Commons categories
- gather the understanding of the meaning of categories needed to translate a file's membership of a category into appropriate Commons wikibase statements for the file -- hugely important if we want Commons wikibase to get populated
- identify gaps on Wikidata -- i.e. 'simple' things (people, places, ideas etc) not currently represented on Wikidata, but present in the Commons ontology
- in the reverse direction, make it possible for wikibase statements on files to be used to verify, extend or refine the categorisation of those files -- making categories more systematic and thorough and complete


All of this becomes possible for volunteers to work on, if local Commons wikibase items can be available for Commons categories.

Distinguishing these local items via a prefix, eg c:Q1234, as opposed to plain Q2345 on Wikidata, would seem a very acceptable way to allow both them and Wikidata items to exist and each be referenced.
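The prefix convention suggested in this comment can be sketched as a small parser. This is only an illustration: the `c` prefix and the reading of an unprefixed ID as a Wikidata ID come from the example above, while `EntityRef`, `PREFIX_TO_REPO` and `parse_entity_id` are hypothetical names, not Wikibase code.

```python
import re
from typing import NamedTuple


class EntityRef(NamedTuple):
    repo: str      # which wikibase the ID belongs to
    local_id: str  # the ID as it is known on that wikibase


# Hypothetical prefix table; "" stands for "no prefix given".
PREFIX_TO_REPO = {"": "wikidata", "c": "commons"}

# Optional lowercase prefix plus colon, then one capital letter and a number.
_ID_PATTERN = re.compile(r"^(?:(?P<prefix>[a-z]+):)?(?P<id>[A-Z][1-9][0-9]*)$")


def parse_entity_id(text: str) -> EntityRef:
    """Split an optionally prefixed entity ID into (repo, local ID)."""
    match = _ID_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not an entity ID: {text!r}")
    prefix = match.group("prefix") or ""
    if prefix not in PREFIX_TO_REPO:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return EntityRef(PREFIX_TO_REPO[prefix], match.group("id"))
```

With this table, `parse_entity_id("c:Q1234")` resolves to the Commons repo and `parse_entity_id("Q2345")` to Wikidata, so both kinds of item can be referenced side by side.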


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2019-01-08 Thread Addshore
Addshore added a comment.
It all sounds sane to me.

It would allow Commons to define MediaInfo without needing to change any existing pages, templates etc on Commons which refer to P123 and Q456 already.

This is one of the more annoying points here, but also the point that makes me lean toward getting rid of the "" prefix assumption for local entities.
Having P123 and wikidata:P123 on commonswiki would be a right pain...

Though it also raises some other thoughts:


So when and if commons will be able to define properties on commonswiki, they will have to prefix them with common: or some other prefix. IMO this sucks a bit. But there is no reasonable way to get around this right now.
The code loading the P123 without a prefix on commons right now is in my head separate to the repo code that is going to do our federation, although maybe this is yet another point for bringing the 2 closer together.


Changing any prefixes or lack of prefixes in the future will be a massive migration and not something that we want to do, but if we pull in the fact that we already refer to entities from wikidata on commons then we are already at that stage.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-21 Thread Smalyshev
Smalyshev added a comment.
To me, having per-item-type mapping where some types can be imported from Wikidata (Q, P) and some local to Commons (M) sounds the most promising (i.e. option 3). We also must note that P is a special type that is unlike any other, as it denotes predicates, not subjects/objects. We should either agree that control over creating properties still lies in Wikidata (Commons people may be unhappy about this) or create some process where Commons can decide to create Wikidata properties (may be tricky community-wise) or somehow have Commons-only properties (technically challenging I presume, even with prefixes since people would be confused about them). Another solution would be to have P in Commons and duplicate necessary properties (not ideal either for many reasons).

For RDF, Commons should definitely know prefixes for Q and P (either from config or from some kind of federation API) since otherwise it's impossible to generate proper statements involving any of these. The outcome should be that Wikidata items on Commons have the same URIs as on Wikidata proper, while things particular to Commons (including Commons properties, if any, and Commons M-items) should have distinct prefixes.
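A minimal sketch of the RDF requirement described in this comment, assuming a per-type mapping to repos. The Wikidata base URI is the usual concept URI base; the Commons base URI is a placeholder, and `TYPE_TO_REPO`, `ENTITY_BASE_URI` and `entity_uri` are hypothetical names chosen for illustration.

```python
# Which repo each entity type comes from (per-type mapping, as in option 3).
TYPE_TO_REPO = {"Q": "wikidata", "P": "wikidata", "M": "commons"}

# Base URIs per repo. The Commons value is a placeholder, not a real endpoint.
ENTITY_BASE_URI = {
    "wikidata": "http://www.wikidata.org/entity/",
    "commons": "https://commons.example/entity/",
}


def entity_uri(entity_id: str) -> str:
    """Return the URI for an unprefixed entity ID such as Q42 or M7."""
    repo = TYPE_TO_REPO.get(entity_id[:1])
    if repo is None:
        raise ValueError(f"unknown entity type in {entity_id!r}")
    return ENTITY_BASE_URI[repo] + entity_id
```

The point of the sketch is that a Q or P ID used on Commons yields exactly the URI it has on Wikidata, while M IDs get a base URI of their own.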


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-21 Thread daniel
daniel added a comment.
As far as I can see, the situation is this: Option 1 is ruled out already, since data access from wikitext doesn't work with that approach. Option 2 would need community consensus. Option 3 (default repo per entity type, with prefixes used internally) is still on the table. Option 4, proposed by Leszek (federation directly based on entity type, with no prefixes used internally) is made unattractive by Lydia's statement that we cannot guarantee that we will not need prefixes in the future, which would put us back into the position we are in now, giving us the choice between option 2 and 3.

Cost estimate for option 3a: I can implement this, with Adam reviewing, in one sprint (two weeks). Ironing out issues that come up later may add another couple of weeks. However, note that my January and February are already quite full, and I'd have to coordinate this with my duties for the core platform team. The other person most familiar with the code in question is Leszek.

The cost for option 4 is probably a bit more than this, but not much.
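To make option 3 from this comment concrete, here is a rough sketch of "default repo per entity type, with prefixes used internally". It is not the Wikibase implementation: `DEFAULT_REPO_FOR_TYPE` and `to_internal_id` are hypothetical, and the empty string standing for the local repo is an assumption of this sketch.

```python
# Default repo per entity type; "" means the local repo (an assumption here).
DEFAULT_REPO_FOR_TYPE = {"Q": "wikidata", "P": "wikidata", "M": ""}


def to_internal_id(user_input: str) -> str:
    """Turn what a user typed into the prefixed form used internally."""
    if ":" in user_input:
        # An explicit prefix is kept as given.
        return user_input
    repo = DEFAULT_REPO_FOR_TYPE.get(user_input[:1])
    if repo is None:
        raise ValueError(f"no default repo for {user_input!r}")
    # Local entities stay unprefixed; foreign ones gain their repo prefix.
    return f"{repo}:{user_input}" if repo else user_input
```

Under these assumptions existing wikitext that says `P123` keeps working, because the prefix is added behind the scenes rather than typed by the user.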


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-21 Thread MarkTraceur
MarkTraceur added a comment.
With Lydia's response, I'm a little confused - it seems to conflict with Leszek's assertion that WMDE wants to avoid that particular future. If WMDE is looking for a tie-breaking vote, it seems like the entire SDC team (including myself) is on board with "option 3", so...should that be the path forward?

I am fine with us thinking about this over the holiday break, as well - no need to rush it as we aren't blocked by this until (at least) late January, as far as I understand.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-21 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.
After some discussion with Daniel: I've been advocating for internal federation to only ever take the same entity type from a single repository. I will continue to do so because I believe it is the right thing to do. But we can't say with enough confidence that this assumption will hold true forever.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-19 Thread daniel
daniel added a comment.

In T211800#4831696, @WMDE-leszek wrote:
> As I am gone for two more weeks, it would probably be best that I briefly answer here: to me it seems both kinds of federation are completely separate and possibly even mutually exclusive. So I would rather not mix these, neither conceptually nor in terms of code. So depending on how you look at it, the difference might be either "a lot" or "not at all".

As far as I can see, that means duplicating a lot of code. Internally, these things look very, very similar.

Also, I'd rather not have them look different in the JSON or the API. If they do, clients have to know about the difference, and implement both as well.

>> and what options a wikibase instance may have to go from one model to the other. We'd probably have to provide conversion scripts.
>
> Per what I said above, it is not clear to me in what cases such conversions would be needed. What situations do you have in mind?

Mostly the thing we had on commons:

"Oh hey, let's use data from wikidata! No prefixes needed." A year later: "You know what would be cool? Having our own data items!"

> Second thoughts: is the idea to potentially use the prefixed federation for now on Commons, although it might be "wrong", and then switch to the non-prefixed one once it has been implemented? With this switch I imagine one would want to do some conversion.

This is the status quo, really - federation with prefixes is in the staging pipeline for deployment on commons. With client and repo code configured for different prefixes.

> I understand that Commons does not want any prefixes. Current functionality of Wikibase is not what Commons needs and can use. That said, WMDE is not able to commit any resources to implement the needed functionality on the Wikibase side this year. We would be able to tell more in January once everyone is back from holidays, and we know what exactly it is that we need to build for SDOC.
> I am sorry I cannot offer more at this point, but the calendar has no mercy, and it is too big of a thing to ad-hoc quickfix it.

Well, implementing a type-to-repo configuration for use by PrefixMappingEntityIdParser would not be a lot of work, and would fix the problem. I understand that you don't like this option conceptually, but I don't see a problem with it from the perspective of code.

This seems a quick win, and since SDC is supposed to deliver in January, may be the only option.

> Maybe in the next week or so you'd be able to answer those open questions (prefixes or not in JSON, RDF etc), and generally define what the desired functionality would be. That would help us better answer how big a task we are talking about.

We may get away with an initial deployment in "split brain" mode, but for statements, we need to properly solve this.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-18 Thread WMDE-leszek
WMDE-leszek added a comment.
> I tend to agree, but I'd like to further discuss with you how much of a difference this really makes in the code,

As I am gone for two more weeks, it would probably be best that I briefly answer here: to me it seems both kinds of federation are completely separate and possibly even mutually exclusive. So I would rather not mix these, neither conceptually nor in terms of code. So depending on how you look at it, the difference might be either "a lot" or "not at all".

> and what options a wikibase instance may have to go from one model to the other. We'd probably have to provide conversion scripts.

Per what I said above, it is not clear to me in what cases such conversions would be needed. What situations do you have in mind?
Second thoughts: is the idea to potentially use the prefixed federation for now on Commons, although it might be "wrong", and then switch to the non-prefixed one once it has been implemented? With this switch I imagine one would want to do some conversion.

I understand that Commons does not want any prefixes. Current functionality of Wikibase is not what Commons needs and can use. That said, WMDE is not able to commit any resources to implement the needed functionality on the Wikibase side this year. We would be able to tell more in January once everyone is back from holidays, and we know what exactly it is that we need to build for SDOC.
I am sorry I cannot offer more at this point, but the calendar has no mercy, and it is too big of a thing to ad-hoc quickfix it.

Maybe in the next week or so you'd be able to answer those open questions (prefixes or not in JSON, RDF etc), and generally define what the desired functionality would be. That would help us better answer how big a task we are talking about.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-18 Thread daniel
daniel added a comment.
A detail to consider when going for "federation without prefixes": does this mean no prefixes just for user input, or also in the JSON serialization? Using no prefixes there either may seem intuitive, but it makes the data more brittle, and harder to exchange between instances. Other repos consuming that data will then need to replicate the data-type mapping instead of a prefix mapping. That seems error prone to me.

Also, how about the RDF output? At least for that, we will need different prefixes for entities from different sources.
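The brittleness concern raised in this comment can be illustrated with a toy consumer. The JSON below is a deliberately simplified stand-in, not the real Wikibase serialization, and every name in the sketch (`PREFIX_MAP`, `PRODUCER_TYPE_MAP`, `CONSUMER_TYPE_MAP`, the two helper functions) is hypothetical.

```python
import json

# The same statement serialized with and without prefixes (simplified shape).
prefixed = json.loads('{"property": "wikidata:P123", "value": "wikidata:Q456"}')
unprefixed = json.loads('{"property": "P123", "value": "Q456"}')

# With prefixes, a consumer only needs to know what each prefix means.
PREFIX_MAP = {"wikidata": "wikidata", "": "local"}

# Without prefixes, a consumer must replicate the producer's type mapping.
PRODUCER_TYPE_MAP = {"P": "wikidata", "Q": "wikidata", "M": "local"}
# A consumer that has its own items and properties maps types differently.
CONSUMER_TYPE_MAP = {"P": "local", "Q": "local"}


def repo_from_prefixed(entity_id: str, prefix_map: dict) -> str:
    """Resolve the source repo from the ID's own prefix."""
    prefix, _, _local_id = entity_id.rpartition(":")
    return prefix_map[prefix]


def repo_from_unprefixed(entity_id: str, type_map: dict) -> str:
    """Resolve the source repo from the entity type letter alone."""
    return type_map[entity_id[0]]
```

Here the prefixed value resolves to Wikidata regardless of who reads it, whereas the unprefixed `Q456` resolves to Wikidata only if the reader happens to use the producer's type mapping; with `CONSUMER_TYPE_MAP` it is silently taken for a local item.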


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-18 Thread daniel
daniel added a comment.

In T211800#4830605, @WMDE-leszek wrote:
> Having two not necessarily identical configs flying around, with parts of the code arbitrarily picking one config and other parts picking the other, seems like a bug/unfinished implementation to me. I am surprised it only surfaces now (it was 99% me who messed that up), but it should be fixed. If the current "broken" state actually makes commons work, the fix can of course be postponed :)

That's indeed the current situation, yes. It's a consequence of the fact that client and repo are independent extensions, neither of which depends on the other. But yes, both should share config for the DataAccess component, but doing so would be incompatible with the requirements and data we have.

> I don't claim to have a thorough understanding of the Commons issues now, but it seems to me that those two kinds of federation, i.e. one intending to have different entity types in different repos, possibly having e.g. items from multiple repos, and the one where it is clear some entity types are coming from repo A, and some from repo B, are actually separate things; they're not really overlapping. The former requires and is based on the concept of prefixes (to be able to distinguish between different sources of items), whereas the latter could actually do without having prefixes at all. Both make sense as separate approaches (the former for non-Wikimedia Wikibases, the latter for Commons, for instance). I am not aware of any practical or planned instance where mixing both concepts would actually be needed. Therefore I would strongly encourage to NOT mix both approaches in the implementation and to NOT create a super generic federation where all the things could be done using some config magic. The existing stuff is already overly complicated. Let's at least not make it worse.

I tend to agree, but I'd like to further discuss with you how much of a difference this really makes in the code, how confusing a "mixed" config is, and what options a wikibase instance may have to go from one model to the other. We'd probably have to provide conversion scripts.

> Coming back to the particular Commons topic: from WMDE perspective option 3 is really something we would not like to see added as the feature etc. I do understand that converting millions of Commons pages to just change Qxyz to wd:Qxyz is going to be a costly migration. We're happy to help with coming up with some temporary/intermediate solution that would allow Commons running while the migration is on-going.

I don't think we can target that solution without consulting the community. And in the light of what you said above, commons seems to fit the "no prefix" case, where we could just map entity types to repos.

> That said, I am wondering whether from the Commons perspective using prefixes is something that's intended? Or actually the opposite? I am not aware (as in: I simply don't know) whether prefixing Wikidata items on Commons has ever been discussed in the previous 2 years. Or has it been simply assumed "we need to add prefixes because this is what software requires"? Or did we in the first place implement the wrong feature a few years ago?

Using prefixes to access wikidata has not been proposed to the community, and was never the plan. To my knowledge, "option 2" in this ticket is the first time it has been proposed. And yes, the only reason to do it is because that's how the software is currently designed.

This was recognized as an issue when we first designed federation, and we discussed the idea of mapping entity types to repos, but this was put off for later, and then forgotten...

> Finally, I have to admit that being away from IRC for a while I don't feel like I am fully up-to-date with the status of federation between Commons and Wikidata. My assumption was that Daniel's recent change to the beta commons config allowed beta wikidata items to be used on Commons and all seems fine for now? Is this correct, or is it all still completely broken?

beta-wikidata items and properties can be used on beta-commons now, in both repo (with prefix) and client (without prefix) code. But accessing MediaInfo from wikitext is not possible there. It's a bit tricky to test, since the UI for adding statements is currently disabled.

> Regarding the most recent report above. @daniel could you please elaborate more what exactly does not work once you've verified whether it does or not? Having more information would make it easier for us (at least for me) to reason about the problem.

See my comments above.

[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-18 Thread daniel
daniel added a comment.
One source of confusion is the fact that the CachingPropertyInfoLookup used by client code and repo code are hitting the same cache entry. That cache entry ends up containing both prefixed (wikidata:P1) and unprefixed (just P1) property IDs, side by side. This in turn causes API modules, which use the repo service instances, to also accept unprefixed property IDs, which causes data corruption: MediaInfo entities can have some Statements that use prefixed property IDs, and some that use unprefixed property IDs, and these are treated as different and incompatible (correctly - according to the configuration, they come from different repos). Repo side code (in "split brain" operation) should not accept unprefixed property IDs or unprefixed item IDs, since there are no properties or items on the local repo.
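The cache collision described in this comment can be modelled in a few lines. This is a toy model only: `PropertyInfoCache` is a hypothetical stand-in and does not reproduce the real CachingPropertyInfoLookup, and giving each side its own cache key is just one conceivable way to keep the entries apart, not something the thread settled on.

```python
class PropertyInfoCache:
    """Toy property-info cache backed by a shared key-value store."""

    def __init__(self, store: dict, cache_key: str):
        self._store = store
        self._key = cache_key

    def remember(self, property_id: str, data_type: str) -> None:
        # All property info for this cache lives under one cache entry.
        self._store.setdefault(self._key, {})[property_id] = data_type

    def known_ids(self) -> set:
        return set(self._store.get(self._key, {}))


shared_store = {}

# Client and repo code use the same cache key, so they share one entry.
client = PropertyInfoCache(shared_store, "property-info")
repo = PropertyInfoCache(shared_store, "property-info")
client.remember("P1", "string")          # client side: no prefix
repo.remember("wikidata:P1", "string")   # repo side: prefixed
mixed = repo.known_ids()                 # both spellings, side by side

# With a key of its own, the repo side only sees what it stored itself.
repo_separate = PropertyInfoCache(shared_store, "property-info:repo")
repo_separate.remember("wikidata:P1", "string")
separate = repo_separate.known_ids()
```

In the shared case the repo-side lookup ends up knowing `P1` as well as `wikidata:P1`, which is the condition under which unprefixed IDs would be accepted where only prefixed ones should be.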


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-18 Thread WMDE-leszek
WMDE-leszek added a comment.
Trying to take a step back and think about how the functionality in question was implemented/outlined two years ago, here are my thoughts:


Having two not necessarily identical configs flying around, with parts of the code arbitrarily picking one config and other parts picking the other, seems like a bug/unfinished implementation to me. I am surprised it only surfaces now (it was 99% me who messed that up), but it should be fixed. If the current "broken" state actually makes commons work, the fix can of course be postponed :)
It looks to me that Commons and Wikidata federation is a bit of a special case of federation as it has been envisioned as a general concept. There is no need to have local and wikidata items (like most non-Wikimedia Wikibase instances request in the context of federation). "Funnily" enough, the "typical" federation is nowhere in use due to the current implementation's limitations, so the only real use case is this special/reverse one.
I don't claim to have a thorough understanding of the Commons issues now, but it seems to me that those two kinds of federation, i.e. one intending to have different entity types in different repos, possibly having e.g. items from multiple repos, and the one where it is clear some entity types are coming from repo A, and some from repo B, are actually separate things; they're not really overlapping. The former requires and is based on the concept of prefixes (to be able to distinguish between different sources of items), whereas the latter could actually do without having prefixes at all. Both make sense as separate approaches (the former for non-Wikimedia Wikibases, the latter for Commons, for instance). I am not aware of any practical or planned instance where mixing both concepts would actually be needed. Therefore I would strongly encourage to NOT mix both approaches in the implementation and to NOT create a super generic federation where all the things could be done using some config magic. The existing stuff is already overly complicated. Let's at least not make it worse.


Coming back to the particular Commons topic: from WMDE perspective option 3 is really something we would not like to see added as the feature etc. I do understand that converting millions of Commons pages to just change Qxyz to wd:Qxyz is going to be a costly migration. We're happy to help with coming up with some temporary/intermediate solution that would allow Commons running while the migration is on-going.

That said, I am wondering whether from the Commons perspective using prefixes is something that's intended? Or actually the opposite? I am not aware (as in: I simply don't know) whether prefixing Wikidata items on Commons has ever been discussed in the previous 2 years. Or has it been simply assumed "we need to add prefixes because this is what software requires"? Or did we in the first place implement the wrong feature a few years ago?

Finally, I have to admit that being away from IRC for a while I don't feel like I am fully up-to-date with the status of federation between Commons and Wikidata. My assumption was the recent Daniel's change to beta commons config allowed to use beta wikidata items on Commons and all seems fine for now? Is this correct, or is it all still completely broken?
Regarding the most recent report above: @daniel, could you please elaborate on what exactly does not work, once you've verified whether it does or not? Having more information would make it easier for us (at least for me) to reason about the problem.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-17 Thread daniel
daniel added a comment.
I just realized that this probably blocks access to MediaInfo from wikitext on Commons. I have not confirmed this, but if my mental model is right, this needs to be fixed before we'll be able to access MediaInfo from wikitext.


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-14 Thread daniel
daniel added a comment.
For option three, we'd have to decide whether the per-entity-type defaults should be defined separately from the prefix mapping, or whether they should use some kind of special syntax. I'd suggest the hybrid approach we have also been using for including the target slot in EntityNamespaceLookup: use separate arrays internally (cleaner), but use some special syntax in the config (simpler). So for Commons, you would have a mapping like [ '@item' => 'wikidata', '@property' => 'wikidata' ], with the @ indicating that the mapping is not for prefixes, but for entity types.
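
As a rough illustration of that hybrid approach, here is a minimal Python sketch that takes a single config mapping using the `@` syntax and splits it into two separate internal mappings. The function name `split_repo_mapping` and the `wd` prefix entry are hypothetical and exist only for this example. The real implementation would live in the PHP Wikibase code and may look quite different.

```python
from typing import Dict, Tuple


def split_repo_mapping(config: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one config mapping into (prefix -> repo, entity type -> repo).

    Keys starting with '@' name entity types; all other keys are prefixes.
    """
    prefix_map: Dict[str, str] = {}
    type_map: Dict[str, str] = {}
    for key, repo in config.items():
        if key.startswith("@"):
            entity_type = key[1:]
            if not entity_type:
                raise ValueError("'@' must be followed by an entity type name")
            type_map[entity_type] = repo
        else:
            prefix_map[key] = repo
    return prefix_map, type_map


# Example config for Commons; the 'wd' prefix entry is an assumption
# added to show that both kinds of key can coexist in one array.
commons_config = {"@item": "wikidata", "@property": "wikidata", "wd": "wikidata"}
prefix_map, type_map = split_repo_mapping(commons_config)
```

The config stays a single flat array, which keeps it simple to write, while the code that consumes it only ever sees two cleanly separated mappings.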


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-14 Thread Cparle
Cparle added a comment.
+1


[Wikidata-bugs] [Maniphest] [Commented On] T211800: Resolve ambiguity of entity ID prefixes used on Commons.

2018-12-13 Thread Jdforrester-WMF
Jdforrester-WMF added a comment.
Option 3 sounds sanest in terms of fixing/avoiding the issue without a costly migration.