[Wikidata-bugs] [Maniphest] [Commented On] T209880: [Investigation - 8h] technical overview of current db based Wikibase federation & blockers to get to an API based federation

2019-01-07 Thread Addshore
Addshore added a comment.
Ping @Lydia_Pintscher

TASK DETAIL
https://phabricator.wikimedia.org/T209880

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Michael, Addshore
Cc: Addshore, Lucas_Werkmeister_WMDE, WMDE-leszek, _jensen, johl, Aklapper, Lydia_Pintscher, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, D3r1ck01, Jonas, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


2018-12-18 Thread Addshore
Addshore added a comment.

In T209880#4805698, @Michael wrote:

api url

With this line, I just wanted to point out that we currently have only the entity url configured e.g. https://wikidata.beta.wmflabs.org/entity/, but we may want to explicitly configure the url for the api instead of "recalculating" it, since they may be significantly different.

I think there essentially needs to be an API endpoint for a repo to generate a federation config that can then be used on another repo, instead of people manually cobbling together various URLs, configs, etc.
The repo knows all of the settings needed to connect to it, so just have it do it :)
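
A minimal sketch of that idea, in Python for brevity rather than Wikibase's PHP: the repo assembles a federation config from settings it already knows. The function name, the setting keys and the `apiUrl` field are hypothetical; only `baseUri` and `supportedEntityTypes` correspond to option names mentioned in this thread, and the API URL value is an assumed example.

```python
import json


def build_federation_config(repo_settings):
    """Collect what another Wikibase would need to federate with this repo.

    Hypothetical helper: it only repackages values the repo already has.
    """
    return {
        "baseUri": repo_settings["concept_base_uri"],
        "apiUrl": repo_settings["api_url"],  # hypothetical option, not an existing setting
        "supportedEntityTypes": sorted(repo_settings["entity_types"]),
    }


# Assumed example values for a repo describing itself.
settings = {
    "concept_base_uri": "https://wikidata.beta.wmflabs.org/entity/",
    "api_url": "https://wikidata.beta.wmflabs.org/w/api.php",
    "entity_types": {"item", "property"},
}

config = build_federation_config(settings)
print(json.dumps(config, indent=2))
```

An endpoint on the repo could serve this JSON, and the federating wiki would paste or fetch it instead of assembling the values by hand.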

the way I currently envision this, API-based federation would use the current Wikibase APIs (mostly wbgetentities, I assume). Is that also what you have in mind, or do you want to create some dedicated API or use something else?

Honestly, I'm not sure and I think that depends on how we model the other components. If we can make use of all the information wbgetentities provides and it scales well, then that might be enough.

Right now we probably want to use Special:EntityData instead of wbgetentities, as the API currently has no Varnish-level caching, but Special:EntityData does.
We may also want a new API. I guess this depends on the granularity of the data updates.
If we want to provide more granularity, we add complexity; retrieving the whole entity (or whatever thing has been updated) is always going to be the easiest option.
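
For comparison, a sketch of the two request shapes. `Special:EntityData/<id>.json` and `action=wbgetentities` are existing Wikibase interfaces; the helper names are made up here, and the `/wiki/` article path is an assumption that holds for Wikimedia-style setups and may differ on other installs.

```python
from urllib.parse import urlencode


def entity_data_url(site_base, entity_id):
    """URL of the cacheable Special:EntityData JSON for one entity."""
    return f"{site_base.rstrip('/')}/wiki/Special:EntityData/{entity_id}.json"


def wbgetentities_url(api_url, entity_ids):
    """URL of a wbgetentities API request for one or more entities."""
    query = urlencode({
        "action": "wbgetentities",
        "ids": "|".join(entity_ids),  # the API takes pipe-separated IDs
        "format": "json",
    })
    return f"{api_url}?{query}"


print(entity_data_url("https://www.wikidata.org", "Q42"))
print(wbgetentities_url("https://www.wikidata.org/w/api.php", ["Q1", "Q2"]))
```

The first form fetches exactly one whole entity per URL, which is what makes it easy to cache; the second can batch several IDs into one request.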

In general, I get the feeling that building a prototype for this and noting all the ways in which it fails might be a good way to get a reliable list of the things we have to build right. Maybe let's take a small team and a few weeks, or a trailblaze, for this?

This is on the roadmap for next year as far as I know, so something like this will probably happen.


In T209880#4803792, @Lucas_Werkmeister_WMDE wrote:
how to handle multiple Wikibase installations containing the same type of entities - e.g. Items or Properties?

Not in scope for this task, if I understand correctly (see also T209880#4778278).


Indeed, the initial version will not allow multiple repos for the same entity type.

api keys? user/pass?

It’s probably enough to make all requests anonymously (I hope). (Authenticating the requests would make tracking easier, though.)

In terms of tracking, I guess sensible UAs (User-Agent strings) would do.
However, for federation between other third-party Wikibases, the operators might want some extra level of control that Wikimedia itself does not need.
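
As an illustration of tracking by User-Agent, a small sketch that builds (but does not send) an anonymous request identifying the federating wiki. The UA format, the product name `WikibaseFederation/0.1` and the example hosts are all made up for this sketch.

```python
from urllib.request import Request


def build_federation_request(url, local_wiki, contact):
    """Build an anonymous GET request that names the requesting wiki in its User-Agent."""
    user_agent = f"WikibaseFederation/0.1 ({local_wiki}; {contact})"
    return Request(url, headers={"User-Agent": user_agent})


req = build_federation_request(
    "https://www.wikidata.org/wiki/Special:EntityData/Q42.json",
    "https://wikibase.example.org",
    "admin@example.org",
)
# urllib normalises stored header names, hence the "User-agent" spelling here.
print(req.get_header("User-agent"))
```

A target repo could then group or throttle traffic per federating wiki without requiring accounts or API keys.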



Query 25 federated instances on every keystroke that expects auto-complete?

Currently, we would only query one federated instance, because any entity type can only be provided by a single repository (the local one or a federated one). That will change in the future, but even then, I don’t think many installs will have more than two or three repos for the same entity type (local item, Wikidata item, perhaps something in between?).

I think that is probably true; we will probably only be talking about a couple of federated repos.
Anyway, if a repo does want to federate with 20 repos and cause JS calls to all 20, that's up to them, but I guess we won't design it to make that super nice initially.
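
To make the cost of the multi-repo case concrete, a sketch of an auto-complete fan-out: one search call per configured repo, run in parallel, with a failing repo contributing no suggestions. The searchers are plain callables standing in for HTTP calls; all names here are illustrative, not Wikibase code.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all_repos(term, searchers):
    """Run every repo's search for `term` in parallel; failures yield an empty list."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(searchers))) as pool:
        futures = {name: pool.submit(search, term) for name, search in searchers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                # One unreachable repo should not break the whole suggestion list.
                results[name] = []
    return results


def broken_repo(term):
    raise ConnectionError("repo unreachable")


searchers = {
    "local": lambda term: [term + " (local)"],
    "wikidata": lambda term: [term + " (wikidata)"],
    "broken": broken_repo,
}

results = search_all_repos("Dou", searchers)
print(results)
```

Every keystroke costs one request per repo, so the total latency is bounded by the slowest repo; real HTTP calls would need their own timeouts.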

Decide:

I would vote for option 1 (HTTP variants of services), but I’m not sure what the decision process for this is going to be anyways.

I would also vote for HTTP variants of services.


2018-12-07 Thread Michael
Michael added a comment.
Currently, we would only query one federated instance, because any entity type can only be provided by a single repository (the local one or a federated one). That will change in the future, but even then, I don’t think many installs will have more than two or three repos for the same entity type (local item, Wikidata item, perhaps something in between?).

That depends on how fractured the ecosystem is going to be. If many GLAM organizations start hosting their own Wikibase but somewhat agree on a common item type (say Artifact:A12345), then we might face a lot of instances. That might be especially true if we make spinning up instances very easy, e.g. by creating a "Wikibase hub" or something similar to the wordpress.com model for Wikibase with GLAM defaults preconfigured. Also, third parties might create something like wikia.com.

api url

With this line, I just wanted to point out that we currently have only the entity url configured e.g. https://wikidata.beta.wmflabs.org/entity/, but we may want to explicitly configure the url for the api instead of "recalculating" it, since they may be significantly different.
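
A sketch of why "recalculating" is fragile: the fallback below can only guess the API location from the entity URL, whereas an explicit setting is unambiguous. The `apiUrl` key is hypothetical, and the `/w/api.php` path is the Wikimedia convention, assumed here and not guaranteed for other installs.

```python
from urllib.parse import urlsplit


def resolve_api_url(repo_config):
    """Prefer an explicitly configured API URL; otherwise guess one from baseUri."""
    explicit = repo_config.get("apiUrl")  # hypothetical option
    if explicit:
        return explicit
    parts = urlsplit(repo_config["baseUri"])
    # Guess: same host, Wikimedia-style script path. Wrong for many third-party setups.
    return f"{parts.scheme}://{parts.netloc}/w/api.php"


guessed = resolve_api_url({"baseUri": "https://wikidata.beta.wmflabs.org/entity/"})
explicit = resolve_api_url({
    "baseUri": "https://data.example.org/entity/",
    "apiUrl": "https://wiki.example.org/mediawiki/api.php",
})
print(guessed)
print(explicit)
```

In the second case the entity URLs and the API live on different hosts and paths, which no derivation rule could recover.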

the way I currently envision this, API-based federation would use the current Wikibase APIs (mostly wbgetentities, I assume). Is that also what you have in mind, or do you want to create some dedicated API or use something else?

Honestly, I'm not sure and I think that depends on how we model the other components. If we can make use of all the information wbgetentities provides and it scales well, then that might be enough.



In general, I get the feeling that building a prototype for this and noting all the ways in which it fails might be a good way to get a reliable list of the things we have to build right. Maybe let's take a small team and a few weeks, or a trailblaze, for this?


2018-12-06 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.
Hm, I don’t think that’s something we need to deal with at the Wikibase level… I feel like this is something that’s best dealt with at the HTTP proxy level. Apart from the other repo’s IP address (that the request is coming from), the current instance doesn’t necessarily know anything about the other repo anyways, does it?

Some more responses, then:

how to handle multiple Wikibase installations containing the same type of entities - e.g. Items or Properties?

Not in scope for this task, if I understand correctly (see also T209880#4778278).

api url

To clarify: the way I currently envision this, API-based federation would use the current Wikibase APIs (mostly wbgetentities, I assume). Is that also what you have in mind, or do you want to create some dedicated API or use something else?

api keys? user/pass?

It’s probably enough to make all requests anonymously (I hope). (Authenticating the requests would make tracking easier, though.)

Query 25 federated instances on every keystroke that expects auto-complete?

Currently, we would only query one federated instance, because any entity type can only be provided by a single repository (the local one or a federated one). That will change in the future, but even then, I don’t think many installs will have more than two or three repos for the same entity type (local item, Wikidata item, perhaps something in between?).

Decide:

I would vote for option 1 (HTTP variants of services), but I’m not sure what the decision process for this is going to be anyways.


2018-12-05 Thread Michael
Michael added a comment.

In T209880#4800645, @Lucas_Werkmeister_WMDE wrote:
[...] Or is this a blacklist/whitelist on the target repo, so that e. g. Wikidata would refuse to be a foreign repo for Metapedia?


Yes, that was the usage I thought of. How do we handle badly configured instances that target our current instance? Rate limiting might be another possible way to go here. OTOH, it might be enough to handle this with the existing anti-DOS measures.
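
If rate limiting were the chosen route, a per-client token bucket is one common shape for it. This is a generic sketch, not an existing Wikibase or MediaWiki feature; the class name and parameters are illustrative, and the clock is injectable so the behaviour can be shown without waiting.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now
        self._tokens = float(capacity)
        self._last = now()

    def allow(self):
        """Return True and spend a token if the request may proceed."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Refill for the time that has passed, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# Fake clock so the example is deterministic.
clock = [0.0]
bucket = TokenBucket(capacity=2, refill_per_second=1, now=lambda: clock[0])
decisions = [bucket.allow(), bucket.allow(), bucket.allow()]
clock[0] = 1.0  # one second later, one token has been refilled
decisions.append(bucket.allow())
print(decisions)
```

A target repo would keep one bucket per requesting instance (keyed by IP or User-Agent), so a single misconfigured wiki gets throttled without affecting the others.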


2018-12-05 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.
I don’t understand this part:


blocking/blacklisting Wikibase installations that want to federate
  option to whitelist them instead?



The federation setup is part of the site configuration (I assume), so where would a blacklist or whitelist apply? If the site admin doesn’t want federation with a particular Wikibase instance, they just shouldn’t configure it. Or is this a blacklist/whitelist on the target repo, so that e. g. Wikidata would refuse to be a foreign repo for Metapedia?


2018-12-05 Thread Michael
Michael added a comment.
What is needed to make MultipleRepositoryAwareWikibaseServices use an API?

Case Study: PropertyInfoLookup


Interface: \Wikibase\Lib\Store\PropertyInfoLookup
3 relevant implementations:
  CachingPropertyInfoLookup
  DispatchingPropertyInfoLookup
  PropertyInfoTable
Only PropertyInfoTable actually looks up data; the other two each require a lookup themselves
MultiRepositoryWiring.php creates a DispatchingPropertyInfoLookup and supplies it with an array mapping repo names to service instances
=> possible general approach: add HttpPropertyInfoLookup or something similar for remote Wikibase instances
  there needs to be a database call available for that service to call
  how do we handle a service that is unreachable?
  think about caching
  "push notifications" for changed data?

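The layering described in the case study can be sketched as follows, in Python rather than Wikibase's PHP. The three class names mirror the ones listed above, but the method name, the `fetch` callable (a stand-in for an HTTP request), the `repo:ID` prefix convention and the `LookupUnavailable` error are assumptions made for illustration.

```python
class LookupUnavailable(Exception):
    """Raised when a remote repository cannot be reached."""


class HttpPropertyInfoLookup:
    """Hypothetical remote counterpart of PropertyInfoTable."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(property_id) -> dict or None

    def get_property_info(self, property_id):
        try:
            return self._fetch(property_id)
        except OSError as error:  # covers ConnectionError and timeouts
            raise LookupUnavailable(str(error)) from error


class CachingPropertyInfoLookup:
    """Wraps another lookup and remembers its answers, including misses."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}

    def get_property_info(self, property_id):
        if property_id not in self._cache:
            self._cache[property_id] = self._inner.get_property_info(property_id)
        return self._cache[property_id]


class DispatchingPropertyInfoLookup:
    """Routes 'repo:ID' to the lookup registered for that repo ('' means local)."""

    def __init__(self, lookups):
        self._lookups = lookups

    def get_property_info(self, property_id):
        repo, _, local_id = property_id.rpartition(":")
        return self._lookups[repo].get_property_info(local_id)


calls = []


def fake_fetch(property_id):
    calls.append(property_id)
    return {"type": "wikibase-item"} if property_id == "P31" else None


def unreachable_fetch(property_id):
    raise ConnectionError("remote repo did not respond")


lookup = DispatchingPropertyInfoLookup(
    {"wd": CachingPropertyInfoLookup(HttpPropertyInfoLookup(fake_fetch))}
)
first = lookup.get_property_info("wd:P31")
second = lookup.get_property_info("wd:P31")
print(first, len(calls))
```

Because failures are raised rather than cached, an unreachable service is retried on the next call; whether that is the right policy is one of the open questions in the list above.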


What is already configured?


wmgWikibaseForeignRepositories at InitialiseSettings.php line 18949 -- but this fails to load on GitHub
wmgWikibaseForeignRepositories at InitialiseSettings-labs.php
Config is read in \Wikibase\Repo\WikibaseRepo::getRepositoryDefinitionsFromSettings
federation concept doc
doc for 80% of the foreign repos options (with broken markup rendering by GitHub)
  repoDatabase
  baseUri
  prefixMapping
  entityNamespaces
  supportedEntityTypes (not documented)
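
A sketch of how such a config could be checked against the current restriction that each entity type is provided by exactly one repository. The option names come from the list above; the values, the repo name `wd` and the helper function are assumed for illustration, and `prefixMapping` and `entityNamespaces` are left out because their exact shape is not described in this thread.

```python
def entity_type_owners(foreign_repos, local_entity_types):
    """Map each entity type to the repo providing it ('' is the local repo).

    Raises ValueError if two repos claim the same entity type.
    """
    owners = {entity_type: "" for entity_type in local_entity_types}
    for repo_name, repo_config in foreign_repos.items():
        for entity_type in repo_config.get("supportedEntityTypes", []):
            if entity_type in owners:
                raise ValueError(
                    f"entity type {entity_type!r} is provided by both "
                    f"{owners[entity_type]!r} and {repo_name!r}"
                )
            owners[entity_type] = repo_name
    return owners


# Assumed example values, not copied from InitialiseSettings.php.
foreign_repos = {
    "wd": {
        "repoDatabase": "wikidatawiki",
        "baseUri": "https://wikidata.beta.wmflabs.org/entity/",
        "supportedEntityTypes": ["item", "property"],
    }
}

owners = entity_type_owners(foreign_repos, local_entity_types=["mediainfo"])
print(owners)
```

Passing `local_entity_types=["item"]` with the same foreign repo raises a `ValueError`, which is the situation the initial version of federation does not support.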



what further configuration may be necessary?


api url
api keys? user/pass?


Further considerations


blocking/blacklisting Wikibase installations that want to federate
  option to whitelist them instead?

how to handle multiple Wikibase installations containing the same type of entities - e.g. Items or Properties?


2018-11-27 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.
Yeah. It's definitely something that will come up. But let's take it one step at a time. As the next step, not mixing entities from several repositories for the same entity type seems good enough.


2018-11-27 Thread WMDE-leszek
WMDE-leszek added a comment.
As far as I’m aware, our current federation is not only restricted by the requirement of database access, but also by the fact that each entity type is bound to one repository, and it is not possible to mix entities of the same type from multiple repositories.

Correct.

I suspect many external users will also require that restriction to be removed or relaxed (typically to use local and Wikidata items together); is that part of this task as well?

Hmm, arguably this is a separate question, and also one of a different kind. I mean that how to deal with, say, items from three different places might also become a more "fundamental" conceptual question. The limitation of the current (admittedly limited) version of federation, which requires shared DB access with Wikidata production, seems to be "only" a technical problem. So I would see arguments to keep those questions separate.

That said, to be clear, I think the other question seems to be more relevant for the non-Wikimedia federation uses I could imagine or have heard about. So it is without a doubt an important question. It might deserve its own investigation, or a series of them.


2018-11-27 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.
As far as I’m aware, our current federation is not only restricted by the requirement of database access, but also by the fact that each entity type is bound to one repository, and it is not possible to mix entities of the same type from multiple repositories. I suspect many external users will also require that restriction to be removed or relaxed (typically to use local and Wikidata items together); is that part of this task as well?