[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2019-01-04 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.
I assume this can be closed now?

TASK DETAIL
https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: brion, Lydia_Pintscher
Cc: Bianjiang, Nirmos, CCicalese_WMF, PokestarFan, Rical, Ayack, -jem-, Deskana, SBisson, Izno, Alsee, Florian, Liuxinyu970226, WMDE-leszek, Mholloway, Scott_WUaS, Niharika, MGChecker, LikeLifer, Elitre, Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, EvanProdromou, Nandana, kostajh, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, Agabi10, D3r1ck01, Wikidata-bugs, aude, jayvdb, fbstj, santhosh, Mbch331, Rxy, Jay8g, bd808
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2017-01-11 Thread cscott
cscott added a comment.
If we use MCR for annotation storage, it would be useful to have a canonical URL for the contents of a specific slot. That might be an API URL, like https://en.wikipedia.org/api/rest_v1/page/html/Main_Page/749836961/ or else a user-visible URL like https://en.wikipedia.org/wiki/Main_Page/ or https://en.wikipedia.org/wiki/:Main_Page or even a quasi-API URL like https://en.wikipedia.org/wiki/Special:redirect/slot//. Thoughts?


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2017-01-10 Thread Jdforrester-WMF
Jdforrester-WMF added a comment.
Notes for the session right now: https://etherpad.wikimedia.org/p/devsummit17-multi-content-revisions


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-11-23 Thread Jdforrester-WMF
Jdforrester-WMF added a comment.
Just for clarity, as I've worked on this task but not actually commented, we in Editing see MCR as very important to our long-term plans. The use cases laid out at Multi-Content Revisions#Use Cases cover a lot, but I'll just pull out the four that we see as most vital:


The structured media info work, as almost goes without saying;
Rejigging templates to have dedicated template, styling, data, and documentation slots, with UI to match;
Rejigging files to have a fused history for the blob and the description, removing UI confusion; and
Moving to a structured-data approach for categories.


Lots of others are also important, but those are the most useful.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-11-19 Thread daniel
daniel added a comment.

In T107595#2791142, @TomT0m wrote:
Ok, I got confused. Does that mean that the documentation will not have its wikipage address anymore ?


Yes, the documentation would be part of the template page proper, and would not have a separate title.

Would it then be possible to have a special type of "reference" slot which would hold a pointer to another page revision? I guess the parser could be modified to maintain those reference slots when pages are saved.

That would be theoretically possible, but there are currently no plans to do this. I'm also not sure this would be the best way to tie a page revision to template revisions. So far, slots are intended to be editable, not derived. I have been thinking about derived slots, but the use cases for that idea all seem a bit contrived, and would perhaps be better served by a more specialized solution, like a dedicated database table.

For example, the parser computes a new version of the page when its content is modified, and when it expands a template, a hook triggers the slot manager to store the revision number of the template in those "reference" slots. I guess this kind of hook or something similar exists, since we get a list of the used templates when previewing a page.

This could be done with a DB table that associates a revision ID of the "transcluder" with a revision ID of the "transcluded" in each row. Simple enough to do, and it would be stable against the template being moved or renamed, etc. It's going to be a big table, though. And quite a change in how things work. As Tgr pointed out, there is the Memento extension that does this with some limitations. It's a feature that has been discussed time and time again, but never gained enough traction to be properly implemented.
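
A minimal sketch of the kind of table described above, using SQLite for illustration only. The table name `transclusion_revisions`, its column names, and the helper functions are hypothetical; nothing here reflects an existing MediaWiki schema or API. Each row ties one revision of the transcluding page to one revision of a transcluded template, so renaming or moving the template does not affect the link.

```python
import sqlite3


def create_schema(conn):
    """Create the hypothetical transcluder -> transcluded revision table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS transclusion_revisions (
            transcluder_rev_id INTEGER NOT NULL,  -- revision of the page being rendered
            transcluded_rev_id INTEGER NOT NULL,  -- revision of a template it used
            PRIMARY KEY (transcluder_rev_id, transcluded_rev_id)
        )
        """
    )


def record_transclusions(conn, transcluder_rev_id, transcluded_rev_ids):
    """Store which template revisions were used when a page revision was saved."""
    conn.executemany(
        "INSERT OR IGNORE INTO transclusion_revisions VALUES (?, ?)",
        [(transcluder_rev_id, rev_id) for rev_id in transcluded_rev_ids],
    )


def transcluded_revisions(conn, transcluder_rev_id):
    """Return the template revision IDs recorded for one page revision."""
    rows = conn.execute(
        "SELECT transcluded_rev_id FROM transclusion_revisions "
        "WHERE transcluder_rev_id = ? ORDER BY transcluded_rev_id",
        (transcluder_rev_id,),
    )
    return [row[0] for row in rows]
```

Because the rows reference revision IDs rather than titles, the lookup is unaffected by later renames; the cost, as noted above, is one row per (page revision, template revision) pair.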


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-11-17 Thread Tgr
Tgr added a comment.

In T107595#2791067, @TomT0m wrote:
If I understand correctly, this feature will potentially allow to view an article with the versions of the templates that existed at the time the wikitext was edited.


You might be thinking of Memento (which is not related to this in any way).


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-11-13 Thread TomT0m
TomT0m added a comment.

In T107595#279, @daniel wrote:
@TomT0m No, Multi-Content-Revisions does not help with consistent display of old template revisions. Well, it does in cases where the use of templates is replaced by the use of slots - if e.g. template documentation was stored in a slot instead of a subpage, you would always see the correct version of the documentation for old versions of the template. But that would be because it would no longer use the template mechanism.


Ok, I got confused. Does that mean that the documentation will not have its wikipage address anymore ?

Would it then be possible to have a special type of "reference" slot which would hold a pointer to another page revision? I guess the parser could be modified to maintain those reference slots when pages are saved.

For example, the parser computes a new version of the page when its content is modified, and when it expands a template, a hook triggers the slot manager to store the revision number of the template in those "reference" slots. I guess this kind of hook or something similar exists, since we get a list of the used templates when previewing a page.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-11-13 Thread daniel
daniel added a comment.
@TomT0m No, Multi-Content-Revisions does not help with consistent display of old template revisions. Well, it does in cases where the use of templates is replaced by the use of slots - if e.g. template documentation was stored in a slot instead of a subpage, you would always see the correct version of the documentation for old versions of the template. But that would be because it would no longer use the template mechanism.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-11-13 Thread TomT0m
TomT0m added a comment.
Question : History of old articles

If I understand correctly, this feature will potentially allow to view an article with the versions of the templates that existed at the time the wikitext was edited. Two questions arise then :


will that also work for deleted templates?
will we be able to restore revisions that predate the multi-content revision deployment, say a 2005 revision of some article?


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-10-01 Thread daniel
daniel added a comment.

In T107595#2678497, @Alsee wrote:

In T107595#2667512, @daniel wrote:
What I take away from @Alsee's comment is that we should provide a more comprehensive and detailed overview of the use cases.


So the answer is no, no thought of investigating whether the editing community wants this.


That's not what I said. To the contrary, I know that such investigation was done at least for the use case that is my job to take care of, namely structured media info; I also know that separating categories out of the wikitext has been requested and discussed numerous times. What I said is that the status of these investigations and discussions needs to be better documented and linked to from the technical proposal.

The idea of pulling categories, templates, and other things out of the wikitext is a pretty radical change. I understand you have use-case-proposals and the reasons you think they're good ideas. I'm not here to directly debate that. I'm here to alert you to the fact that this is a Big Deal. I am here to alert you that the Community may have a very different perspective, that this may be highly controversial. The proposed use cases may start evaporating if the community considers them unwanted or disruptive.

I agree that it would be a Big Deal to, e.g., move Wikipedia infoboxes out of the wikitext. But please note that this RFC does not propose doing that. It proposes a change to the platform that would allow us to do that -- and more importantly, it would allow other sites to manage infoboxes outside the wikitext.

Of course, if none of the use cases was endorsed by the community (which community?), the proposed change to the platform would be pointless. And you are correct that we need to take care to have the community in the loop when discussing use cases and requirements.

I fear I missed an important point when listing the use cases: I did not make a clear distinction between use cases for which we have consensus for implementing them and use cases for which we see potential, or have had repeated requests, but which have not yet been fully investigated or discussed broadly. That's why I said that  we need a more comprehensive and detailed overview of the use cases.

PS: One side note about discussing changes to the editing interface with the community of editors: the editors who are active on the site today are the ones who like (or at least got used to) the current interface - the ones who find the current way to edit unusable have given up after a few tries. We would like to change this, and open the editing experience to people who do not want to fiddle with complex syntax; this may mean changes that some people who have become experts at fiddling with wikitext don't like. We'll need to find the right balance, but we cannot find it if we listen only to the people who are active editors now. But that has nothing to do with the MCR proposal, it's just a general observation about discussing new features with "the" community.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-30 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.
It's not true that we have not asked the community. Structured data for Commons has been asked for many, many times. People are very happy with the progress we have made so far, as can be seen for example here: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data#It.27s_alive.21 or here: https://blog.wikimedia.org/2016/08/23/wikidata-glam/
For the Wikidata team, Multi Content Revisions is an essential part of making structured data on Commons happen. All the other use cases are potentials at this point. Their teams will be responsible for doing the community consultations on these as they start working on them. Whether they go ahead with those or not is independent of our need to have it for structured data on Commons. It is however important to bring them up now to make the case for why Multi Content Revisions are important to have in the long term.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-29 Thread Alsee
Alsee added a comment.

In T107595#2666114, @RobLa-WMF wrote:

In T107595#2666094, @Alsee wrote:
Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this?


Each of the use cases have had quite a bit of discussion, and has had quite a bit of investigation by the people proposing it.


So the answer is no, no thought of investigating whether the editing community wants this.


In T107595#2667512, @daniel wrote:
What I take away from @Alsee's comment is that we should provide a more comprehensive and detailed overview of the use cases.


So the answer is no, no thought of investigating whether the editing community wants this.

You've got two editors who stumbled across (*)  this project, both waving red flags that there may be a problem here.

The WMF has been working on a Technical Collaboration Guideline as part of the Software Development Process. In part, "establishing best practices for inviting community involvement in the product development and deployment cycle". Most development goes smoothly and everyone is happy with a lot of what the WMF develops, but there is a long history of occasional projects that result in conflict. There have been cases where the WMF believed something was obviously a good idea, but where editors had a very different perspective. The editing community may weigh the pros and cons very differently than you have.

The idea of pulling categories, templates, and other things out of the wikitext is a pretty radical change. I understand you have use-case-proposals and the reasons you think they're good ideas. I'm not here to directly debate that. I'm here to alert you to the fact that this is a Big Deal. I am here to alert you that the Community may have a very different perspective, that this may be highly controversial. The proposed use cases may start evaporating if the community considers them unwanted or disruptive.

I'm saying it would be a good idea to post the template-use-case and/or category-use-case and/or others at EnWiki Village Pump to find out how it will be received. (EnWiki is nearly half the global community, you can certainly post elsewhere as well if you feel broader input is needed.)

The response could range from "we love it", to identifying must-have design requirements to support various workflows, to "hell no". Whichever way it goes, the time to get that information is before something is built.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread daniel
daniel added a comment.

In T107595#2675167, @Tgr wrote:
It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:


I'm missing the group "currently embedded in wikitext and would benefit from separate storage, editing, diffing, etc", e.g. page assessment, media info, categories, ...


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.

In T107595#2675028, @RobLa-WMF wrote:
I may update the description of this task and of the RFC on mediawiki.org to say this.  This answer isn't etched in stone, but when someone asks me "what is the MVP for Multi-Content Revisions", I'll say "structured media info".  I'm not sure which URL I'll point them to, but I'm sure I'll find something.


https://commons.wikimedia.org/wiki/Commons:Structured_data is the best we have atm.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.

In T107595#2675167, @Tgr wrote:
It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:


data that would otherwise be stored on separate pages but could be bundled into a single page for better UX: media info, doc subpages, {{/header}} and similar templates, maps JSON blobs etc. This is mostly "nice to have" territory although in the case of media info (some of which will have to be manually migrated from description page templates) the UX degradation would be pretty jarring so that might be closer to must have.



I would argue it is a must have. We can technically do it in several pages but the chance of getting it accepted by the community with the degraded usability and features is close to 0.


data that is currently stored on multiple pages but needs atomic updates to ensure consistency (gadget CSS/JS, template styles, template/module test pages). MCR is needed to make those behave correctly.
supplementary data that is used by some tool (editor, mobile app etc) and not really intended for direct manual editing: lead image focus, structured categories, page assessments, maps. These would have to be stored somewhere else, which would be a major loss of efficiency for developers as they would have to rebuild fundamental infrastructure from scratch for each one.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.

In T107595#2675028, @RobLa-WMF wrote:
Thanks for reminding us of this.  You're obviously the primary contact from WMDE for this, but who is the product manager from WMDE whose work would be blocked if this is delayed?  Is that @Lydia_Pintscher or someone else?


Yes it's mine.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread Tgr
Tgr added a comment.
It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:


data that would otherwise be stored on separate pages but could be bundled into a single page for better UX: media info, doc subpages, {{/header}} and similar templates, maps JSON blobs etc. This is mostly "nice to have" territory although in the case of media info (some of which will have to be manually migrated from description page templates) the UX degradation would be pretty jarring so that might be closer to must have.
data that is currently stored on multiple pages but needs atomic updates to ensure consistency (gadget CSS/JS, template styles, template/module test pages). MCR is needed to make those behave correctly.
supplementary data that is used by some tool (editor, mobile app etc) and not really intended for direct manual editing: lead image focus, structured categories, page assessments, maps. These would have to be stored somewhere else, which would be a major loss of efficiency for developers as they would have to rebuild fundamental infrastructure from scratch for each one.
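
The second group above (content that currently lives on several pages but needs atomic updates) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Revision` and `Page` classes and the slot names `main`, `styles`, and `documentation` are hypothetical and do not correspond to MediaWiki's actual classes or storage layer. The point it shows is that one save produces one revision covering all slots, so an old revision always pairs each slot with the content that was current at that time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Revision:
    """One revision of a page; holds the content of every slot at that point."""
    rev_id: int
    slots: Dict[str, str]  # slot name -> content (hypothetical slot names)


@dataclass
class Page:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def save(self, changed_slots: Dict[str, str]) -> Revision:
        """Create one new revision; unchanged slots are carried over as-is."""
        slots = dict(self.revisions[-1].slots) if self.revisions else {}
        slots.update(changed_slots)  # all changes land in the same revision
        revision = Revision(rev_id=len(self.revisions) + 1, slots=slots)
        self.revisions.append(revision)
        return revision

    def slot_at(self, rev_id: int, slot: str) -> Optional[str]:
        """Return a slot's content as of a given revision, or None if absent."""
        return self.revisions[rev_id - 1].slots.get(slot)
```

With separate pages, the template and its styles could be saved at different times and briefly disagree; in this model a reader of revision N never sees a template body from N combined with styles from N+1.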


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread RobLa-WMF
RobLa-WMF added a comment.

In T107595#2674782, @daniel wrote:
To me as a Wikidata developer, the "killer use case" is structured media info, but e.g. James, Mark, or Kaldari may have other priorities. The Wikidata team will provide a brief summary of the requirement and rationale for structured media info soon, but to get it right, we want to coordinate with the WMF multimedia team first.


I may update the description of this task and of the RFC on mediawiki.org to say this.  This answer isn't etched in stone, but when someone asks me "what is the MVP for Multi-Content Revisions", I'll say "structured media info".  I'm not sure which URL I'll point them to, but I'm sure I'll find something.

(Please note that I'm out of office until October 24; I'll be working some of the time, but I will be traveling and attending a conference)

Thanks for reminding us of this. You're obviously the primary contact from WMDE for this, but who is the product manager from WMDE whose work would be blocked if this is delayed? Is that @Lydia_Pintscher or someone else?


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread daniel
daniel added a comment.

In T107595#2674606, @RobLa-WMF wrote:
Well, the "lot clearer" assertion remains to be seen.  I think the current proposal still seems like an enormous change.  I'm starting to wrap my head around it, but I can't fault many skeptics for questioning whether this represents a "minimum viable product".  I realize there are many use cases, but what single use case would you consider your single must-have use case for an MVP?


You are right that it is a big change, both conceptually and technically. I'm doing my best to minimize the cost, but it's not trivial.

To me, the cost seems justified because MCR would address the needs of several use cases. For a single use case, it would perhaps not be justified, and a more specialized solution would be sufficient. But a specialized solution for each use case would be a lot more expensive, and would introduce a lot more complexity. The idea is that adding a layer of abstraction, MCR, will allow such use cases to be implemented with a minimum of extra code. It's about scalability of the platform when adding features.

To allow this supposed benefit to be assessed and verified, we should define the requirements for the MVP for each must-have use case. If we find significant overlap in the platform needs of several use cases, a generalized solution like MCR is justified. The requirements for that generalized solution can then be derived from the platform needs of MVPs.

I have done the above informally in conversations with WMF product owners and developers over the last year, but I admit that this is not documented sufficiently. We (Lydia and I) are in the process of reaching out to WMF product owners, asking them to provide more detailed requirements, rationales, and priorities for their use cases, and we plan to document them on a subpage of https://www.mediawiki.org/wiki/Multi-Content_Revisions.

To me as a Wikidata developer, the "killer use case" is structured media info, but e.g. James, Mark, or Kaldari may have other priorities. The Wikidata team will provide a brief summary of the requirement and rationale for structured media info soon, but to get it right, we want to coordinate with the WMF multimedia team first.

(Please note that I'm out of office until October 24; I'll be working some of the time, but I will be traveling and attending a conference)


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-28 Thread RobLa-WMF
RobLa-WMF added a comment.

In T107595#2671264, @daniel wrote:

In T107595#2668520, @RobLa-WMF wrote:
The risk: the more that our data formats become a complex mystery that is only understood by a handful of people, the fewer people that will trust the systems we produce.


Ah, yes, I agree. The structure of our content should be clearly defined and easy to grasp for interested people. That structure will become slightly more complex with MCR, since we add a level of indirection. On the plus side, the data formats used to represent things like categories or page assessments or license information will become a lot clearer and easier to understand and re-use.


Well, the "lot clearer" assertion remains to be seen.  I think the current proposal still seems like an enormous change.  I'm starting to wrap my head around it, but I can't fault many skeptics for questioning whether this represents a "minimum viable product".  I realize there are many use cases, but what single use case would you consider your single must-have use case for an MVP?


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-27 Thread daniel
daniel added a comment.

In T107595#2668520, @RobLa-WMF wrote:
The risk: the more that our data formats become a complex mystery that is only understood by a handful of people, the fewer people that will trust the systems we produce.


Ah, yes, I agree. The structure of our content should be clearly defined and easy to grasp for interested people. That structure will become slightly more complex with MCR, since we add a level of indirection. On the plus side, the data formats used to represent things like categories or page assessments or license information will become a lot clearer and easier to understand and re-use.

However, the more we try to hide the underlying storage format, the less that the most active editors will trust the systems we produce.  Let's make sure that we come up with a system that is easy to explain what a revision is at the byte level.

Yes, right - the system needs to remain transparent, and that's how MCR is designed. My point was that it should not be necessary for editing to know about this. People add tags on flickr without having to think about the underlying storage structure, or learn arcane syntax. It should be the same with MediaWiki.


In T107595#2669600, @Tgr wrote:
That probably works for editors but not for patrollers. I.e., we can keep the editing interface as it is (there would have to be a non-JS fallback with a textfield for each slot, but it does not have to be the default, even for non-JS users), but history will need some changes (it has to expose edits which do not change the main content, and probably add some filtering tools to handle that) and the diff view will have to expose the slots. That might be worth a discussion.


Yes, at least in diffs, slots will be a visible concept. For history, watchlist, recentchanges, etc, filtering by slot may be useful, but otherwise I don't think it's necessary to expose the concept of slots there.
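
The slot-aware filtering mentioned above could look roughly like the sketch below. It is only an illustration: `HistoryEntry`, `changed_slots`, and the slot names `main` and `categories` are hypothetical and are not taken from MediaWiki or the RFC. The point is that a history or watchlist view can stay slot-agnostic by default and only narrow the list when a patroller asks for one slot.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class HistoryEntry:
    """One row of a page history (hypothetical shape)."""
    rev_id: int
    changed_slots: FrozenSet[str]  # names of the slots this revision modified


def filter_history(entries: Iterable[HistoryEntry],
                   slot: Optional[str] = None) -> List[HistoryEntry]:
    """Return all entries, or only those that changed the given slot."""
    if slot is None:
        # Default view: slots are not exposed at all.
        return list(entries)
    return [entry for entry in entries if slot in entry.changed_slots]


history = [
    HistoryEntry(1, frozenset({"main"})),
    HistoryEntry(2, frozenset({"categories"})),          # main content untouched
    HistoryEntry(3, frozenset({"main", "categories"})),
]

# Revisions that touched the main content only.
main_only = [entry.rev_id for entry in filter_history(history, "main")]
```

Revision 2 is the kind of edit Tgr describes: it does not change the main content, so it still has to appear in the unfiltered history even though a "main" filter hides it.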

You are right that this aspect could use some more thought and discussion. The best place for this is the talk page of https://www.mediawiki.org/wiki/Multi-Content_Revisions/Views I think.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-26 Thread Tgr
Tgr added a comment.

In T107595#2667512, @daniel wrote:
I think little of that complexity should be exposed to users. We probably don't want editors to freely mix and match slots - rather, we want an integrated experience for editing and display.  Ideally editors should neither know nor care about slots.


That probably works for editors but not for patrollers. I.e., we can keep the editing interface as it is (there would have to be a non-JS fallback with a textfield for each slot, but it does not have to be the default, even for non-JS users), but history will need some changes (it has to expose edits which do not change the main content, and probably add some filtering tools to handle that) and the diff view will have to expose the slots. That might be worth a discussion.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-26 Thread RobLa-WMF
RobLa-WMF added a comment.

In T107595#2667512, @daniel wrote:
I think little of that complexity should be exposed to users. We probably don't want editors to freely mix and match slots - rather, we want an integrated experience for editing and display.  Ideally editors should neither know nor care about slots.


I think I agree with you, but you say this in a way that sounds dangerous.

The risk: the more that our data formats become a complex mystery that is only understood by a handful of people, the fewer people that will trust the systems we produce.

It's true that ideally, editors should not need to understand the underlying formats.  We should create systems that are easy for both humans and computers to understand and manipulate.  If we do this, we'll provide the ability to create user interfaces that behave intuitively.  Advanced editors will learn the underlying model, and will be able to intuitively grasp the nature of the inevitable problems we'll have with the systems we build.  They will also understand how to explain those problems to less advanced editors.

However, the more we try to hide the underlying storage format, the less that the most active editors will trust the systems we produce.  Let's make sure that we come up with a system that is easy to explain what a revision is at the byte level.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-26 Thread daniel
daniel added a comment.
@RobLa-WMF  wrote

Now, it would seem as though you are bringing this point up now because you're worried about making the system more complicated.  Yes, that seems like a reasonable fear.  A multi-slot "revision" seems similar to a file system fork, and will inevitably come with the same complexity.  I'm eager to see how we make the case that this complexity is worth it.

I think little of that complexity should be exposed to users. We probably don't want editors to freely mix and match slots - rather, we want an integrated experience for editing and display.  Ideally editors should neither know nor care about slots.

What I take away from @Alsee's comment is that we should provide a more comprehensive and detailed overview of the use cases. It is however important to recognize that even if we implement MCR, this only gives us the *option* to manage things a different way. MCR itself changes nothing about how categories are stored - it just provides a sensible place outside the wikitext where they can be stored. MCR is designed to add a degree of freedom to MediaWiki which allows us to implement new features, and (perhaps more importantly) allows some features that were hacked in in the past to work much more efficiently, smoothly, and in a more user-friendly way.

One more thing about the simplicity of wikitext: wikitext isn't simple at all. It's versatile and powerful, but if you use it for anything but formatting text, it becomes rather complex and scary. The idea behind MCR is to use wikitext for formatting text, and move other data elsewhere, where it can be stored, edited, diffed, and rendered more efficiently and nicely.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-25 Thread RobLa-WMF
RobLa-WMF added a comment.

In T107595#2666094, @Alsee wrote:
Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this?


Each of the use cases has had quite a bit of discussion, and quite a bit of investigation by the people proposing it.  For example, changing the way that licensing and media metadata is stored is documented here: https://commons.wikimedia.org/wiki/Commons:Structured_data

Many of the other use cases have similarly deep analysis with similarly long histories.

In general, the work that Daniel and many other people are doing at WMDE is work that affects core infrastructure, so there's not one single editing community we can ask.  It also affects people outside the Wikimedia editing community.  It has the same sort of complexity as interlanguage links had before Wikidata came along.  Discussing it in the context of MediaWiki.org seems like a sensible place to put an RFC, and thus that's why we're discussing it here now.

Now, it would seem as though you are bringing this point up now because you're worried about making the system more complicated.  Yes, that seems like a reasonable fear.  A multi-slot "revision" seems similar to a file system fork, and will inevitably come with the same complexity.  I'm eager to see how we make the case that this complexity is worth it.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-25 Thread Alsee
Alsee added a comment.
My apologies, my intent wasn't to try to prove a case against MCR here. (Although I do understand why replies focused in that direction). Perhaps it would help if I shortened my previous comment:

Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this?

Is there anyone who believes that point requires debate?


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-25 Thread brion
brion added a comment.
I wrote up some quick thoughts at https://www.mediawiki.org/wiki/User:Brion_VIBBER/MCR_alternative_thoughts

Mainly exploring along two lines:


- what if we did a model with separate data tables for each new 'slot' instead of a common content-blob interface (possibly more in line with Jaime's thoughts, possibly different)
- what if we went all in on using subpages, what would it take to support that?


The first would be in some ways similar to the MCR model, but with stricter typing, possible benefits in storage and schema consistency, etc but without the conveniences of the common interface for Content blobs. The second might be a much easier transition, but needs better high-level tooling and some new versioning concepts.
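
To make the trade-off in the first alternative more concrete, here is a small sketch that stores the same category data both ways in an in-memory SQLite database. Everything in it is an assumption made for illustration: the table names `slot_blob` and `rev_category`, the column names, and the JSON encoding are hypothetical and are not MediaWiki's schema or the schema proposed in the RFC.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Shape A: one generic table, every slot is an opaque blob behind a common interface.
conn.execute(
    "CREATE TABLE slot_blob ("
    " rev_id INTEGER, slot TEXT, model TEXT, blob TEXT,"
    " PRIMARY KEY (rev_id, slot))"
)

# Shape B: a dedicated table per kind of data, with typed columns and constraints.
conn.execute(
    "CREATE TABLE rev_category ("
    " rev_id INTEGER, category TEXT NOT NULL,"
    " PRIMARY KEY (rev_id, category))"
)

categories = ["Physics", "Optics"]
conn.execute(
    "INSERT INTO slot_blob VALUES (?, ?, ?, ?)",
    (1, "categories", "json", json.dumps(categories)),
)
conn.executemany(
    "INSERT INTO rev_category VALUES (?, ?)",
    [(1, name) for name in categories],
)


def categories_from_blob(rev_id: int) -> list:
    """Shape A: fetch the blob, then decode it in application code."""
    row = conn.execute(
        "SELECT blob FROM slot_blob WHERE rev_id = ? AND slot = 'categories'",
        (rev_id,),
    ).fetchone()
    return sorted(json.loads(row[0])) if row else []


def categories_from_table(rev_id: int) -> list:
    """Shape B: the database itself understands the structure."""
    rows = conn.execute(
        "SELECT category FROM rev_category WHERE rev_id = ? ORDER BY category",
        (rev_id,),
    )
    return [row[0] for row in rows]
```

Shape A needs no schema change when a new slot type appears, which is the convenience of the common interface. Shape B lets the database enforce and index the structure, at the cost of a new table for every kind of slot.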

May be worth fleshing these out or combining some ideas just to brainstorm a bit.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-24 Thread Tgr
Tgr added a comment.

In T107595#2664438, @Pppery wrote:

Structured Media Data: What exactly is this separating license information from? This proposed change seems like it would lose some of the flexibility in file licenses



Flexibility means it is impossible to build assumptions into workflows and software. For very much the same reason that wikis use license and info templates to protect patrollers from the "flexibility" of hand-written, unstructured information, storing most of the file description page content as structured data would protect programmers' sanity when that information needs to be reused.


Template documentation: What is wrong with the convention of having a /doc page?


Mainly that you are unable to preview changes. Smaller annoyances include that they do not get included in exports, they make the syntax more unforgiving (MediaWiki removes end-of-page newlines but does not remove newlines between the actual template and the documentation), the edit mechanism is unintuitive (click on the "edit documentation" link, fix a typo, and suddenly you are on a different page).


Modules can have /doc subpages just like templates


More importantly, they tend to have unit test subpages, and currently we offer no help to module editors in testing changes *before* accidentally breaking half the wiki.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-24 Thread Pppery
Pppery added a comment.
Agree, Alsee. I don't find any of the use cases for this very compelling. Refutations of some of the use cases:


- Structured Media Data: What exactly is this separating license information from? This proposed change seems like it would lose some of the flexibility in file licenses.
- Page Assessments: The PageAssessments extension linked seems to be being built without MCR, so I don't see why this is necessary for it.
- Infobox Data: Already handled via Wikidata.
- Template styles: That's T483.
- Template documentation: What is wrong with the convention of having a /doc page?
- Categories can already be managed as structured data via tools like HotCat.
- Modules can have /doc subpages just like templates, and I don't see any advantage in integrating this further.




[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-24 Thread TomT0m
TomT0m added a comment.

In T107595#2664347, @Alsee wrote:
A page is simply a text file.


A page is definitely not a simple text page. It's a text page written in a programming language (wikitext and templates) that happens to have a textual representation. It also includes references to external content such as included images, and metadata such as categories. The wikitext representation of a page is also not the only one: Parsoid has its own, which may become the main representation for the convenience of developers. This representation can also be copied and pasted.

To be rendered, it must be processed by a compiler that turns it into HTML content and does pretty complex things like managing the wiki categorisation index, including the templates, and so on.

This proposal, as far as I understand, won't change the textual representation of the content one bit, nor the fact that you can copy/paste it into another page.

This is proposing turning Wikipedia into a gigantic complex app.

Open your eyes: MediaWiki is an app with decades of continuous development; new features are constantly added and new use cases emerge. It's already a gigantic complex app. The only valuable question is "how do we manage such complexity", and this proposal is one of the answers.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-24 Thread Alsee
Alsee added a comment.
Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this? Ripping categories and templates and other stuff entirely out of the page?

Wiki operates on an extremely simple, powerful, and flexible paradigm. A page is simply a text file.

We can trivially copy-paste a page into any text editor, possibly close the browser or go offline, do anything and everything, then just paste it here and save.

This is proposing turning Wikipedia into a gigantic complex app.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-16 Thread daniel
daniel added a comment.
@Pppery I'm referring to this mess: https://commons.wikimedia.org/w/index.php?title=File:L%C3%ADneas_de_Nazca,_Nazca,_Per%C3%BA,_2015-07-29,_DD_46.JPG.

Here's an overview of the use cases for MCR: https://www.mediawiki.org/wiki/Multi-Content_Revisions#Use_Cases.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-16 Thread Pppery
Pppery added a comment.
It's really just a gut feeling that this is a needlessly complicating change. There is already a separate TemplateData editor that can be accessed when you click the edit link for a template. And I'm not sure what series of nested templates for file license you are referring to.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-09-15 Thread Pppery
Pppery added a comment.
My concern is that this is just part of a general trend of making things more complicated than they need to be.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-07-19 Thread RobLa-WMF
RobLa-WMF added a comment.
@daniel , I've added a stub at https://www.mediawiki.org/wiki/Requests_for_comment/Multi-content_revisions

Could you port the bulk of the prose of this RFC over there?


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-07-14 Thread daniel
daniel added a comment.

In T107595#2456127, @RobLa-WMF wrote:
I'd like to discuss the state of this RFC in our planning meeting tomorrow (E227)


Sorry, couldn't make it to the meeting. Let's talk about it next week.


[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-06-06 Thread daniel
daniel added a comment.


  Quick summary of a chat with @GWicke:
  
  RevisionContentLookup, RESTBase, and Parsoid:
  
  - There should be an implementation of RevisionContentLookup based on RESTBase
  - RESTBase could provide Parsoid HTML "renders" as a "virtual slot".
  - Each revision may have several Parsoid renders, e.g. when templates change.
  - Each Parsoid "render" may encompass multiple (virtual) slots.
  - To keep slot data consistent, we need to supply the "render id" (a 
time-based UUID) to getRevisionContent()
  - Note: WYSIWYG editing is based on a rendering, not primary content! 
Especially important when editing template parameters.
  - For this reason, parsoid has "primary" and "generated" parts in the same 
HTML DOM. These perspectives would be exposed as separate slots in this model. 
One important use case for this is diffing. Another is to have separate URIs 
for primary and expanded content, for dependency tracking.
  
  MCR Recap:
  
  - primary slots: constitute revisions;  may have conflicts; user edited.
  - derived slots: derived, but persisted as blobs. Can be updated.
  - virtual slots don't use blob store (e.g. parsoid "renderings"), can be 
generated on the fly, or fetched from a remote service
  - Dependency tracking: primary content never depends on anything; renderings 
can depend on primary data, as well as other renderings.
  
  Open questions:
  
  - Do we need to record the (primary) slots in the database, so we can 
enumerate them reliably? Alternatively, we could ask all relevant services for 
all possible slots to get an enumeration.
  - Can primary content be stored outside the blob-store model? Related: should 
the slot table really have blob URLs?
  - Multiple "events" per revision may be useful: each "like", each "view", 
each "comment", etc. "sub-revisions"?
  - Versioning for content models (wikibase model 0.1, 0.2, 1.0, 2.0; parsoid 
html 1.0, 1.1, 1.2...)
  - Drop content formats?
  - Slot content: use Content interface? More basic RevisionData interface? Or 
allow specialized interfaces, such as ParserOutput objects?
  - Optional meta-data associated with slots (etag, etc?)? with revisions 
(rev-props)? with blobs (hash, size)?
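
The three slot kinds in the recap could be sketched in PHP roughly as follows. This is a minimal illustration, not an API from the RFC: `SlotKind`, `usesBlobStore()` and `mayHaveDependencies()` are hypothetical names, and treating derived slots like renderings for dependency purposes is an assumption, since the recap only states the rule for primary content and renderings.

```php
<?php
// Hypothetical names throughout; a sketch of the recap, not MediaWiki code.
final class SlotKind {
	const PRIMARY = 'primary'; // user edited, constitutes the revision
	const DERIVED = 'derived'; // computed, but persisted as a blob
	const VIRTUAL = 'virtual'; // generated on the fly or fetched remotely
}

/** Primary and derived slots are persisted as blobs; virtual slots are not. */
function usesBlobStore( string $kind ): bool {
	return $kind === SlotKind::PRIMARY || $kind === SlotKind::DERIVED;
}

/**
 * Primary content never depends on anything.
 * Assumption: derived slots may have dependencies, like renderings.
 */
function mayHaveDependencies( string $kind ): bool {
	return $kind !== SlotKind::PRIMARY;
}
```

With these helpers, `usesBlobStore( SlotKind::VIRTUAL )` is false, and `mayHaveDependencies( SlotKind::PRIMARY )` is false.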



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-13 Thread GWicke
GWicke added a comment.


  > As I understand it, restbase is a front-end caching proxy store, exposed to 
the public internet.
  
  For most use cases (including HTML), it is actually *storing*, and not just 
caching. It is the equivalent of ExternalStore and most of the text table, 
including revision deletions. Longer term, we are looking into replacing 
ExternalStore for wikitext.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  In https://phabricator.wikimedia.org/T107595#2266131, @GWicke wrote:
  
  > The use case for providing metadata is so that we can use stores like 
RESTBase, which already provide an API keyed on title, revision & render ID. It 
also already deals with the complexities you mention.
  >
  > Basically, if we don't have a way to provide this key information to the 
backend store, then we can't access all the multi-content revision data that's 
already out there through this interface.
  
  
  As I understand it, restbase is a front-end caching proxy store, exposed to 
the public internet. Meanwhile the blob store is the equivalent of MediaWiki's 
existing text table and external storage backing system, entirely internal and 
containing data that is sometimes private (eg deleted or revdel'd page content).
  
  A front-end restbase could proxy access to MediaWiki revisions, backed by 
MediaWiki and the blob store. This would mean that slot data, metadata, titles, 
revision ids etc are all preserved and exposed because you'd be hooking up to 
the level of MediaWiki that has that information.
  
  Is that what you mean, or do you have something else in mind?



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2266131, @GWicke wrote:
  
  > Basically, if we don't have a way to provide this key information to the 
backend store, then we can't access all the multi-content revision data that's 
already out there through this interface.
  
  
  I agree we should find an appropriate abstraction that allows us to use the 
information that is already available via RESTBase. It seems to me that this 
would be an abstraction layer at the level of slots, between RevisionBuilder 
and BlobStore. It seems pretty clear to me that the BlobStore interface doesn't 
fit: it represents something more low level than what RESTBase is currently 
used for.
  
  We should keep this in mind, but perhaps we can postpone the details until we 
implement derived (dynamic) slots. We will want those to be purely 
programmatic, and not to be forced to rely on slot entries in the database. The 
`RevisionContentLookup` interface from the original proposal would be the 
reading side of this.
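
A purely programmatic derived slot behind a lookup interface might look roughly like this. It is a sketch under stated assumptions: only the name `RevisionContentLookup` comes from the proposal, while the method signature, the `main` and `wordcount` slot names, and the `ArrayLookup` and `WordCountLookup` classes are hypothetical.

```php
<?php
// Hypothetical signature; only the interface name appears in this thread.
interface RevisionContentLookup {
	/** Returns the slot content, or null if this lookup does not know the slot. */
	public function getRevisionContent( int $revId, string $slot ): ?string;
}

/** Stand-in for stored primary content, indexed as [revision id][slot name]. */
final class ArrayLookup implements RevisionContentLookup {
	private $rows;

	public function __construct( array $rows ) {
		$this->rows = $rows;
	}

	public function getRevisionContent( int $revId, string $slot ): ?string {
		return isset( $this->rows[$revId][$slot] ) ? $this->rows[$revId][$slot] : null;
	}
}

/** Computes a derived slot on demand; no slot rows in the database are needed. */
final class WordCountLookup implements RevisionContentLookup {
	private $primary;

	public function __construct( RevisionContentLookup $primary ) {
		$this->primary = $primary;
	}

	public function getRevisionContent( int $revId, string $slot ): ?string {
		if ( $slot !== 'wordcount' ) {
			return null;
		}
		$text = $this->primary->getRevisionContent( $revId, 'main' );
		return $text === null ? null : (string)str_word_count( $text );
	}
}

$lookup = new WordCountLookup( new ArrayLookup( [ 7 => [ 'main' => 'Hello wiki world' ] ] ) );
$count = $lookup->getRevisionContent( 7, 'wordcount' ); // string "3"
```

A RESTBase-backed implementation would presumably sit behind the same interface and fetch content remotely instead of computing it.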



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  The use case for providing metadata is so that we can use stores like 
RESTBase, which already provide an API keyed on title, revision & render ID. It 
also already deals with the complexities you mention.
  
  Basically, if we don't have a way to provide this key information to the 
backend store, then we can't access all the multi-content revision data that's 
already out there through this interface.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  If I understand, the case for passing more metadata to the blob store is as a 
hint for cross-blob data compression.
  
  For this I think we mainly want to pass the identifier of a related blob: the 
blob with the data from the same slot in the previous revision. If the related 
blob is in the same store, then the blob store can potentially optimize its 
actual backing storage (with diff-based storage, or by gzipping adjacent blob 
contents together with a window size larger than the blobs, etc).
  
  It might also be useful to specify a type for 'hey this is precompressed 
binary data, don't bother trying to recompress it or diff it'.
  
  But I would strongly recommend against being too clever. Revision metadata 
may change (yes, change -- revdel etc) and blobs are explicitly reused across 
multiple revisions. Revision histories can be rewritten (yes, rewritten -- 
import/export and delete/undelete can change ordering & adjacency, etc). And 
definitely don't include things like titles that are completely arbitrary and 
may change at any time.
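
The hints described above could be passed roughly like this. The sketch assumes a hypothetical `BlobHints` value object and an in-memory store; the "group" bookkeeping is only a stand-in for whatever diff-based or shared-window compression a real store would apply, and no titles or revision metadata are passed.

```php
<?php
// Hypothetical sketch; BlobHints and InMemoryBlobStore are illustrative names.
final class BlobHints {
	/** @var string|null address of the blob for the same slot in the previous revision */
	public $relatedBlob;
	/** @var bool true if the data is already compressed binary */
	public $precompressed;

	public function __construct( ?string $relatedBlob = null, bool $precompressed = false ) {
		$this->relatedBlob = $relatedBlob;
		$this->precompressed = $precompressed;
	}
}

final class InMemoryBlobStore {
	private $blobs = [];
	private $groups = []; // address => group id, standing in for "stored adjacent"
	private $nextId = 1;

	public function saveBlob( string $data, ?BlobHints $hints = null ): string {
		$address = 'mem:' . $this->nextId++;
		$this->blobs[$address] = $data;
		$related = $hints ? $hints->relatedBlob : null;
		if ( $hints && $hints->precompressed ) {
			// Stored alone: no point diffing or recompressing.
			$this->groups[$address] = null;
		} elseif ( $related !== null && isset( $this->groups[$related] ) ) {
			// Related blob lives in this store: keep both in one group.
			$this->groups[$address] = $this->groups[$related];
		} else {
			// No usable hint: start a new group.
			$this->groups[$address] = $address;
		}
		return $address;
	}

	public function loadBlob( string $address ): string {
		return $this->blobs[$address];
	}

	public function groupOf( string $address ): ?string {
		return $this->groups[$address];
	}
}

$store = new InMemoryBlobStore();
$a = $store->saveBlob( 'rev 1 text' );
$b = $store->saveBlob( 'rev 2 text', new BlobHints( $a ) );
$c = $store->saveBlob( "\x1f\x8b...", new BlobHints( $a, true ) );
```

Because the hint is only a blob address, the store never sees data that can be rewritten later, such as titles or revision ordering.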



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2265186, @GWicke wrote:
  
  > >> In addition to title and revision (which I assume remains an integer), 
we'd need an optional v1 UUID parameter to retrieve specific renders, in both 
the request & response interfaces.
  >
  >
  >
  > > I have thought about this, too. My solution is to encode this in the slot 
name. So you could have an html.canonical (sub)slot, and a 
html.29e68f78-8765-49f8-86d5-dfc438d459fe, or html.en, or whatever.
  >
  > Hmmm, this sounds like a rather ugly hack. I thought the 'slot' is 
identifying the kind of content, and is not some general-purpose string that is 
used to append otherwise missing parameters, and differs with each render.
  
  
  The slot name defines the function or disposition of the content. I think 
"english html representation" fits that bill.
  
  Since we want to associate some meta-info with slot names (content model, 
primary or derived, blob store to use, etc), we'd have a stable prefix plus an 
optional dynamic suffix.
  
  I think this is conceptually sound, and allows for nice things like finding 
all html.canonical slots in the database easily. Putting hashes or uuids in 
there is less pretty, I agree, but still not horrible. When we start encoding 
JSON in there, then something went wrong indeed...
  
  > I would argue that it is a case of finding an abstraction at the right 
level. A simple blob store is a very low-level abstraction, and severely limits 
the backend's abilities to optimize storage, distribution & consistency. It 
also limits the backend's usefulness as an API in its own right.
  
  A narrow and dumb interface for the BlobStore is quite intentional. I want to 
be able to *easily* implement a BlobStore based on the file system or CDB or 
whatever. Yes, it's fairly low level, but I think we should have it. And I 
think we should have such a low level BlobStore that is backed by RESTBase. But 
perhaps RESTBase can also implement a more high level interface, for managing 
slot information. But I think that should be kept separate from the 
intentionally low level BlobStore interface.
  
  > Instead, I think we should clearly define the API for each slot to provide 
/ consume
  > 
  > - page id,
  > - page title,
  > - revision id, and
  > - a UUID / hash / etag.
  
  Every //slot//, yes. Every //blob//, no. The BlobStore should not know or 
care about these things. I guess that is where we disagree.
  
  > This makes sure that backends can continue to implement higher-level 
functionality & important optimizations. This should be part of the API, and 
not a case of a "leak". That said, backends *can* choose to ignore all of this 
(but the UUID / hash).
  
  Yes, that's the kind of high level functionality a RevisionStore would 
provide. If you want to do that in RESTBase, then you'd have to implement a 
RESTBase RevisionStore (or RevisionSlotStore, if we introduce an intermediate 
layer of abstraction).
  
  My point is that we shouldn't have all the meta-info in the //lowest// layer 
of abstraction of storage. And BlobStore is the lowest in this context. And I 
believe we should indeed have such a low level abstraction layer, if only to 
encapsulate what's there already: the text table, and external store.
  
  > A minimum set of metadata (like the versioned content-type) should always 
be provided. It would be nice to model this in a way that's compatible with 
normal HTTP headers, as stored & returned by services like RESTBase.
  
  Our baseline implementation is just what's in the text table. The next level 
implementation is just what ExternalStore has. Neither of them can handle or 
provide meta-data. Our interface needs to be compatible with that baseline.
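
The "stable prefix plus optional dynamic suffix" naming scheme could be handled along these lines. This is a sketch only: `parseSlotName()`, `slotInfo()`, the `main` slot and every value in the registry are hypothetical and are not taken from the proposal.

```php
<?php
// Hypothetical helper: splits "html.canonical" into a stable prefix and a dynamic suffix.
function parseSlotName( string $name ): array {
	$parts = explode( '.', $name, 2 );
	return [
		'prefix' => $parts[0],
		'suffix' => isset( $parts[1] ) ? $parts[1] : null,
	];
}

// Hypothetical registry: meta-info is attached to the stable prefix only.
$slotTypes = [
	'main' => [ 'model' => 'wikitext', 'kind' => 'primary', 'store' => 'text-table' ],
	'html' => [ 'model' => 'html', 'kind' => 'derived', 'store' => 'restbase' ],
];

/** Looks up the meta-info for a full slot name, or null for an unknown prefix. */
function slotInfo( array $slotTypes, string $name ): ?array {
	$parsed = parseSlotName( $name );
	return isset( $slotTypes[$parsed['prefix']] ) ? $slotTypes[$parsed['prefix']] : null;
}

$canonical = parseSlotName( 'html.canonical' );
$render = parseSlotName( 'html.29e68f78-8765-49f8-86d5-dfc438d459fe' );
$info = slotInfo( $slotTypes, 'html.en' );
```

Both `html.canonical` and the UUID-suffixed name resolve to the same `html` entry, which is what makes the prefix usable for lookups and database queries.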



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  @GWicke Perhaps some confusion is caused by us thinking of the storage 
backend in different terms. For me, RESTBase is a BlobStore. A BlobStore deals 
with binary data, which it stores and assigns URLs to, and which it can 
retrieve given such a URL. That's it.
  
  It seems to me that you think of RESTBase more on the level of managing slot 
meta-data. I don't really see how that would work. Though I do see how we can 
expose some of that meta-data to RESTBase (and BlobStores in general) so 
optimization can be applied.
  
  In my mind, a higher level RESTBase interface, e.g. one dealing with entire 
revisions, would be based on the new functionality of RevisionStore and 
friends, not vice versa.
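
A deliberately narrow blob store of the kind described here could be sketched as below. The method name `saveBlob()` is the one used in the transaction snippets elsewhere in this thread; `loadBlob()`, the `array:` URL scheme and the `ArrayBlobStore` class are assumptions made for illustration.

```php
<?php
// Hypothetical interface: store bytes, hand back a URL, retrieve bytes by URL.
interface BlobStore {
	public function saveBlob( string $data ): string;

	public function loadBlob( string $url ): string;
}

/** Trivial in-memory implementation; a file system or CDB backend would look similar. */
final class ArrayBlobStore implements BlobStore {
	private $blobs = [];

	public function saveBlob( string $data ): string {
		// Content-addressable: identical data yields the identical URL.
		$url = 'array:' . sha1( $data );
		$this->blobs[$url] = $data;
		return $url;
	}

	public function loadBlob( string $url ): string {
		if ( !isset( $this->blobs[$url] ) ) {
			throw new OutOfBoundsException( "No such blob: $url" );
		}
		return $this->blobs[$url];
	}
}

$blobStore = new ArrayBlobStore();
$url1 = $blobStore->saveBlob( 'hello' );
$url2 = $blobStore->saveBlob( 'hello' );
```

The interface knows nothing about pages, revisions or slots, so any of that information would have to live in a higher layer.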



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  > In any case, the PageUpdater / WikiPage code needs to trigger notifications 
(produce events). I don't care what mechanism it uses for that. Or rather: I'm 
very happy if we get a generalized mechanism. We'll have to agree on some kind 
of schema for revisions, slots, and blobs, but that should be easy enough.
  
  Makes sense. Thanks for the clarification!
  
  >> In addition to title and revision (which I assume remains an integer), 
we'd need an optional v1 UUID parameter to retrieve specific renders, in both 
the request & response interfaces.
  
  
  
  > I have thought about this, too. My solution is to encode this in the slot 
name. So you could have an html.canonical (sub)slot, and a 
html.29e68f78-8765-49f8-86d5-dfc438d459fe, or html.en, or whatever.
  
  Hmmm, this sounds like a rather ugly hack. I thought the 'slot' is 
identifying the kind of content, and is not some general-purpose string that is 
used to append otherwise missing parameters, and differs with each render.
  
  >> How would a dumb blob store figure out which content belongs to the same 
page (and is thus similar), if all it has is the content & some metadata, but 
not the page id, title, revision & render UUID? This is the same design issue 
that plagues ExternalStore, and something we addressed in RESTBase. With 
large-window compression algorithms like brotli, we are getting down to 2-3% of 
the input HTML size (see https://phabricator.wikimedia.org/T122028). Without 
this locality information, you are likely to use an order of magnitude more 
storage as you are foregoing efficient delta compression.
  > 
  > This is a good point. Once again, we want our abstraction to be a bit 
leaky, to allow for optimizations.
  
  I would argue that it is a case of finding an abstraction at the right level. 
A simple blob store is a very low-level abstraction, and severely limits the 
backend's abilities to optimize storage, distribution & consistency. It also 
limits the backend's usefulness as an API in its own right.
  
  Instead, I think we should clearly define the API for each slot to provide / 
consume
  
  - page id,
  - page title,
  - revision id, and
  - a UUID / hash / etag.
  
  This makes sure that backends can continue to implement higher-level 
functionality & important optimizations. This should be part of the API, and 
not a case of a "leak". That said, backends *can* choose to ignore all of this 
(but the UUID / hash).
  
  > I havn't thought this through yet, but my inclanation is that we could 
associate a metadata array (k/v set) with the blob, which could include things 
like a hash and the page title. A BlobStore would be free to use this or not, 
to store it or not, and to make it retrievable or not.
  
  A minimum set of metadata (like the versioned content-type) should always be 
provided. It would be nice to model this in a way that's compatible with normal 
HTTP headers, as stored & returned by services like RESTBase.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  There is no redirection to maintain. The blob url from the old revision
  
  In https://phabricator.wikimedia.org/T107595#2265137, @GWicke wrote:
  
  > Makes sense, some of these fields won't change between revisions. Depending 
on the constraints, it might still make sense to store unchanged content & rely 
on compression to encode it efficiently, rather than introduce & maintain a 
redirection.
  
  
  There's no redirection. When a slot is not edited between revisions, the 
respective rows in the slots table would simply contain the same blob ID. Or to 
put it differently: when creating a new revision, all (primary) slots are 
copied to the new revision, except for the ones explicitly changed for the new 
revision. So if rev 1 has a slot record for slot X that specified the blob url 
restbase:uri:abc (1,X,restbase:uri:abc), and rev 2 doesn't edit slot X, then 
rev 2 will have (2,X,restbase:uri:abc). No aliasing, no redirection. Just the 
same blob data url.
  
  > In any case, as long as you ask the backend for content for a specific 
title / page id, revision & UUID, backends are free to use whatever performs 
best.
  
  As the code is currently designed, the storage backend would never be called 
for slots that remain the same for the new revision. Why would it? We'd have to 
load all the data and then send it back to the backend, for nothing.
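
The copy-forward behaviour described above can be sketched as follows, using the same (revision, slot, blob URL) tuples as the example. `buildSlotRows()` is a hypothetical helper, and the second slot `Y` with its URLs is invented for illustration.

```php
<?php
/**
 * Hypothetical sketch: builds the slot rows for a new revision.
 * Each row is a [ revision id, slot name, blob URL ] tuple.
 */
function buildSlotRows( int $newRevId, array $parentRows, array $changedBlobs ): array {
	$rows = [];
	foreach ( $parentRows as $row ) {
		list( , $slot, $blobUrl ) = $row;
		// Edited slots get the new blob URL; untouched slots keep the parent's URL.
		$url = isset( $changedBlobs[$slot] ) ? $changedBlobs[$slot] : $blobUrl;
		$rows[$slot] = [ $newRevId, $slot, $url ];
	}
	foreach ( $changedBlobs as $slot => $blobUrl ) {
		// Slots that did not exist in the parent revision are added.
		if ( !isset( $rows[$slot] ) ) {
			$rows[$slot] = [ $newRevId, $slot, $blobUrl ];
		}
	}
	return array_values( $rows );
}

$rev1 = [ [ 1, 'X', 'restbase:uri:abc' ], [ 1, 'Y', 'restbase:uri:def' ] ];
$rev2 = buildSlotRows( 2, $rev1, [ 'Y' => 'restbase:uri:ghi' ] );
// $rev2 is [ [ 2, 'X', 'restbase:uri:abc' ], [ 2, 'Y', 'restbase:uri:ghi' ] ]
```

Only the blob for slot `Y` would ever have been sent to the storage backend; slot `X` is carried over by copying its URL.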



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  > Blobs would typically be shared by different revisions of the same page. 
This happens every time one primary slot is edited, but another is not changed. 
E.g. the free wikitext description of a file is edited, but the structured data 
isn't (or vice versa). Or the quality assessment data of an article is updated, 
but the article text isn't edited. In both cases, one of the blobs would be 
re-used by the new revision. I think this will actually be more common than 
editing all primary streams at once.
  
  Makes sense, some of these fields won't change between revisions. Depending 
on the constraints, it might still make sense to store unchanged content & rely 
on compression to encode it efficiently. This is likely what we'll continue to 
do in RESTBase, as this makes sure that access by revision continues to perform 
predictably.
  
  In any case, as long as you ask the backend for content for a specific title 
/ page id, revision & UUID, backends are free to use whatever performs best.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2264799, @GWicke wrote:
  
  > It is not entirely clear to me whether PageUpdater (and RevisionUpdater) 
are meant to only handle synchronous low-level updates, or whether they are 
meant to orchestrate asynchronous change propagation as well. I would suggest 
focusing PageUpdater and RevisionUpdater on synchronous / low-level updates 
only, and leave asynchronous change propagation to EventBus / the change 
propagation service.
  
  
  RevisionUpdater/RevisionBuilder operates on the same level as Revision: no 
secondary data, no notifications. Just storage.
  
  PageUpdater would operate on the same level as WikiPage, but I think we 
should first get RevisionBuilder working, and leave PageUpdater as it is, for 
now.
  
  In any case, the PageUpdater / WikiPage code needs to trigger notifications 
(produce events). I don't care what mechanism it uses for that. Or rather: I'm 
very happy if we get a generalized mechanism. We'll have to agree on some kind 
of schema for revisions, slots, and blobs, but that should be easy enough.
  
  >> The bob-store is (potentially) content-adressable, so the same blob may be 
used for different revisions of different pages.
  > 
  > Blob sharing would complicate your storage significantly, as you'd either 
have to forgo deleting content forever (very expensive for something like HTML 
renders), 
  >  or incur significant complexity of implementing an atomic reference 
counting scheme.
  
  I have pushed the //derived slots// back in my mind until we have the 
//primary// slots working. I agree that for "volatile" data, we'd not want to 
use content-addressable blobs, for the reason you mentioned.
  
  > For textual content, I am pretty certain that sharing is rare, and the 
complexity would overall be a loss in performance and reliability.
  
  Sharing between different pages is probably rare, but:
  
  >> Even for blobs that have an incremental ID (e.g. using the current text 
table storage mechanism), the same blob would frequently be used for multiple 
revisions of the same page.
  
  Blobs would typically be shared by different revisions of the //same// page. 
This happens every time one primary slot is edited, but another is not changed. 
E.g. the free wikitext description of a file is edited, but the structured data 
isn't (or vice versa). Or the quality assessment data of an article is updated, 
but the article text isn't edited. In both cases, one of the blobs would be 
re-used by the new revision. I think this will actually be more common than 
editing all primary streams at once.
  
  > How would a dumb blob store figure out which content belongs to the same 
page (and is thus similar), if all it has is the content & some metadata, but 
not the page id, title, revision & render UUID? This is the same design issue 
that plagues ExternalStore, and something we addressed in RESTBase. With 
large-window compression algorithms like brotli, we are getting down to 2-3% of 
the input HTML size (see https://phabricator.wikimedia.org/T122028). Without 
this locality information, you are likely to use an order of magnitude more 
storage as you are foregoing efficient delta compression.
  
  This is a good point. Once again, we want our abstraction to be a bit leaky, 
to allow for optimizations.
  
  I haven't thought this through yet, but my inclination is that we could 
associate a metadata array (k/v set) with the blob, which could include things 
like a hash and the page title. A BlobStore would be free to use this or not, 
to store it or not, and to make it retrievable or not.
  
  > I am generally trying to work out how RevisionContentLookup would work for 
use cases like fetching HTML from RESTBase. Some notes / questions:
  > 
  > - In addition to title and revision (which I assume remains an integer), 
we'd need an optional v1 UUID parameter to retrieve specific renders, in both 
the request & response interfaces.
  
  I have thought about this, too. My solution is to encode this in the slot 
name. So you could have an html.canonical (sub)slot, and a 
html.29e68f78-8765-49f8-86d5-dfc438d459fe, or html.en, or whatever.
  
  > - Will getTouched() return the UUID timestamp of a specific render 
(last-modified, essentially), or is this about page_touched? Also, should we 
expose UUIDs to make sure that we have a unique ID with a high-resolution 
timestamp?
  
  getTouched() will return the touch date of the slot. For primary slots, this 
will always be the revision (edit) timestamp. For derived slots, it would be 
the time that slot was last updated [I'd love to use a logical clock for this, 
instead of wall clock time...].
  
  I'd expose URLs. Their format would be left to the blob store. Could be a 
UUID.
  
  > - For content from RESTBase, read restrictions are always enforced as part 
of the API request. No information about the applied restrictions is returned. 
In this context, 

[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  ///me notes that we are getting sidetracked here, and this could turn into a 
separate RFC//
  
  I'd rather have the Transaction object know about the database, than the 
other way around. Why should the database be in charge of transactions (other 
than transactions inside the database)? So I'd turn this around a little:
  
$trx = new Transaction();
$trx->addDatabaseConnection( $dbw );

$trx->run(function($trx) use ($blobStore, $data) {
  // ...
  $url = $blobStore->saveBlob( $data, $trx );
  // ...
});
// if we reach this far, the transaction successfully committed.
// otherwise it'll have thrown an exception
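
A `Transaction` object along these lines might be sketched as follows. Only `run()` and the idea of rollback callbacks come from this thread; the `onCommit()` and `onRollback()` method names are hypothetical, and `addDatabaseConnection()` is left out (it would presumably register the database's own commit and rollback as callbacks).

```php
<?php
// Hypothetical sketch of a storage-agnostic transaction with callbacks.
final class Transaction {
	private $commitCallbacks = [];
	private $rollbackCallbacks = [];

	public function onCommit( callable $callback ) {
		$this->commitCallbacks[] = $callback;
	}

	public function onRollback( callable $callback ) {
		$this->rollbackCallbacks[] = $callback;
	}

	/** Runs $work; commits on normal return, rolls back and re-throws on exception. */
	public function run( callable $work ) {
		try {
			$result = $work( $this );
		} catch ( Throwable $e ) {
			// Undo in reverse order of registration.
			foreach ( array_reverse( $this->rollbackCallbacks ) as $callback ) {
				$callback();
			}
			throw $e;
		}
		foreach ( $this->commitCallbacks as $callback ) {
			$callback();
		}
		return $result;
	}
}

$blobs = [];
$caught = null;
$trx = new Transaction();
try {
	$trx->run( function ( Transaction $trx ) use ( &$blobs ) {
		$blobs['mem:1'] = 'data'; // the blob is written first
		$trx->onRollback( function () use ( &$blobs ) {
			unset( $blobs['mem:1'] ); // remove the orphaned blob
		} );
		throw new RuntimeException( 'database write failed' );
	} );
} catch ( RuntimeException $e ) {
	$caught = $e->getMessage();
}
```

After the failed run, `$blobs` is empty again, because the rollback callback registered by the (simulated) blob store has removed the orphaned blob.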



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  > This assumes the BlobStore will actually talk to the (same) database. I 
would like to have Transaction separate from the DB stuff, so it can be used 
just as well with files, or Cassandra, or Swift, or whatever we come up with to 
store blobs. We shouldn't assume that it knows about SQL at all.
  
  Quite so...
  
  Let's try instead:
  
$dbw->transact( function ( $trx ) use ( $dbw, $blobStore, $data ) {
  // $trx is a Transaction obj managed by the Database object, which will
  // have its commit or rollback callbacks triggered when Database\transact
  // reaches its end state

  // ...
  $url = $blobStore->saveBlob( $data, $trx );
  // ...
} );
// if we reach this far, the transaction successfully committed.
// otherwise it'll have thrown an exception



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2264511, @brion wrote:
  
  > and internally in the BlobStore's save method, we add the rollback callback 
straight onto the db object:
  >
  > That avoids having transaction state live separately in both the connection 
and a Transaction object. Good model? Bad model?
  
  
  This assumes the BlobStore will actually talk to the (same) database. I would 
like to have Transaction separate from the DB stuff, so it can be used just as 
well with files, or Cassandra, or Swift, or whatever we come up with to store 
blobs. We shouldn't assume that it knows about SQL at all.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  In https://phabricator.wikimedia.org/T107595#2264334, @daniel wrote:
  
  > We could (optionally?) provide a transaction context to the blob store like 
this:
  
  
  I kinda like that, yeah. Maybe extend Database with a transactional interface 
that takes a callback:
  
$dbw->transaction( function () use ( $dbw ) {
  // blah blah blah
} );
// if we reach this far, the transaction successfully committed.
// otherwise it'll have thrown an exception
  
  and internally in the BlobStore's save method, we add the rollback callback 
straight onto the db object:
  
$dbw->addRollbackCallback( function () use ( $url ) { $this->delete( $url ); } );
  
  That avoids having transaction state live separately in both the connection 
and a Transaction object. Good model? Bad model?



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2264302, @brion wrote:
  
  > The remaining questions are
  >
  > - whether we want to pass the $dbw parameter through (do we always go 
through load balancer in which case it'll be the same connection anyway? or are 
there cases where a separate connection may be used to insert revs for some 
reason?) and
  
  
  I think the RevisionBuilder can hold a single Database reference for the time 
between initialization and apply(). At the end of apply(), it should release 
the Database object via LoadBalancer::reuseConnection(). A LoadBalancer 
instance should be injected into the RevisionBuilder from the RevisionStore.
  
  > - whether there's any nested-transaction problems if someone tries to 
insert multiple revs in an explicitly larger transaction
  
  Do we need to support this? Do we have any code that does this? It seems like 
even during import, a transaction shouldn't span multiple revisions. The new 
code should basically expect a transaction bracket exactly where there is one 
now.
  
  Hm... that probably means that we can't have begin/commit/rollback inside 
apply(). This needs to be done by the caller, and the caller would need to call 
either apply() (aka commit) or rollback() on the RevisionBuilder.
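
  A minimal sketch of what a caller-owned transaction bracket could look like. 
FakeDatabase and this stripped-down RevisionBuilder are hypothetical stand-ins 
that only record what they were asked to do; they are not the real Database 
class or a proposed final interface.

```php
<?php
// Hypothetical stand-ins; not MediaWiki's Database or a real RevisionBuilder.
class FakeDatabase {
    public $log = [];

    public function begin() {
        $this->log[] = 'begin';
    }

    public function commit() {
        $this->log[] = 'commit';
    }

    public function rollback() {
        $this->log[] = 'rollback';
    }

    public function insert( $table ) {
        $this->log[] = "insert $table";
    }
}

class RevisionBuilder {
    private $dbw;
    private $slots = [];

    public function __construct( FakeDatabase $dbw ) {
        $this->dbw = $dbw;
    }

    public function updateSlot( $name, $content ) {
        $this->slots[$name] = $content;
    }

    // Writes rows but never calls begin() or commit() itself:
    // the transaction bracket belongs to the caller.
    public function apply() {
        $this->dbw->insert( 'revision' );
        foreach ( $this->slots as $name => $content ) {
            $this->dbw->insert( "slot:$name" );
        }
    }

    // Discards pending state; the caller still rolls back the database.
    public function rollback() {
        $this->slots = [];
    }
}

$dbw = new FakeDatabase();
$rb = new RevisionBuilder( $dbw );
$rb->updateSlot( 'main', 'some text' );

// The caller decides where the bracket starts and ends.
$dbw->begin();
try {
    $rb->apply();
    $dbw->commit();
} catch ( Exception $ex ) {
    $rb->rollback();
    $dbw->rollback();
    throw $ex;
}
```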



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  We could (optionally?) provide a transaction context to the blob store like 
this:
  
$trx = new Transaction();
$trx->addDBConnection( $dbw );
$trx->start();
try {
  foreach ( $something as $thing ) {
    $url = $blobStore->saveBlob( $data, $trx );
    // ...
  }

  $trx->commit();
} catch ( Exception $ex ) {
  $trx->rollback();
  throw $ex;
}
  
  Now we no longer have to worry about whether the data urls are unique or 
content based. The blob store itself knows how to do the cleanup right. Inside 
`saveBlob`, TRX support could look something like this:
  
$url = $this->write( $data );
$trx->addRollbackCallback( function () use ( $url ) { $this->delete( $url ); } );
return $url;
  
  or perhaps:
  
$url = $this->makeUrl( $data );
$trx->addCommitCallback( function () use ( $url, $data ) { $this->write( $url, $data ); } );
return $url;
  
  I'd like to bake this ability into the design from the start, or at least 
keep it in mind so it's not too hard to add later. That doesn't mean that the 
blob store has to actually use the trx context. Or that we initially need a trx 
object at all.
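
  The two saveBlob variants above can be sketched side by side. Everything 
here is an assumption for illustration: the Transaction class, the two store 
classes, and in particular the choice of a SHA-1 based URL for the deferred 
variant, which is just one way makeUrl() could derive a URL before the data is 
written.

```php
<?php
// Hypothetical sketch; Transaction and both stores are illustrative only.
class Transaction {
    private $onCommit = [];
    private $onRollback = [];

    public function addCommitCallback( callable $cb ) {
        $this->onCommit[] = $cb;
    }

    public function addRollbackCallback( callable $cb ) {
        $this->onRollback[] = $cb;
    }

    public function commit() {
        foreach ( $this->onCommit as $cb ) {
            $cb();
        }
    }

    public function rollback() {
        foreach ( array_reverse( $this->onRollback ) as $cb ) {
            $cb();
        }
    }
}

// Variant 1: write immediately, delete again on rollback.
class EagerBlobStore {
    public $blobs = [];
    private $nextId = 1;

    public function saveBlob( $data, Transaction $trx ) {
        $url = 'blob:' . $this->nextId++;
        $this->blobs[$url] = $data;
        $trx->addRollbackCallback( function () use ( $url ) {
            unset( $this->blobs[$url] );
        } );
        return $url;
    }
}

// Variant 2: derive the URL up front, write only on commit.
class DeferredBlobStore {
    public $blobs = [];

    public function saveBlob( $data, Transaction $trx ) {
        $url = 'sha1:' . sha1( $data ); // assumed URL scheme
        $trx->addCommitCallback( function () use ( $url, $data ) {
            $this->blobs[$url] = $data;
        } );
        return $url;
    }
}

$eager = new EagerBlobStore();
$deferred = new DeferredBlobStore();

$trx = new Transaction();
$eager->saveBlob( 'a', $trx );
$deferred->saveBlob( 'b', $trx );
$trx->rollback(); // neither store keeps anything
```

  The deferred variant never has to delete anything, which sidesteps the 
question of whether a content-based URL is shared with another revision.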



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  (if RevisionBuilder takes a $dbw param via constructor/factory, then the 
question of the connection is easier)



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  > The above code would replace much of what is in the Revision class now, in 
particular insertOn(). We can keep Revision around, but I'm not sure we can 
provide b/c for insertOn().
  
  b/c here looks relatively straightforward to me; it creates a new revision 
with an updated default slot from the text content & metadata in the Revision 
object. This should be implementable by calling through to RevisionBuilder.
  
  The remaining questions are
  
  - whether we want to pass the $dbw parameter through (do we always go through 
load balancer in which case it'll be the same connection anyway? or are there 
cases where a separate connection may be used to insert revs for some reason?) 
and
  - whether there's any nested-transaction problems if someone tries to insert 
multiple revs in an explicitly larger transaction
  
  (Revision\insertOn doesn't try to manage transactions itself, and will leak 
external storage blobs if its text & revision updates get rolled back.)



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  re this:
  
$bs->deleteBlob( $dataUrl ); // dk: this goes wrong if the URL is content/hash based!
  
  I think the return from this:
  
$dataUrl = $bs->saveBlob( $content->serialize() );
  
  needs to signal whether a blob was created or whether an existing blob was 
reused. This means the blob store itself needs a transactional concept at least 
within the boundaries of 'does this blob exist?' followed by 'store this blob'. 
If two processes conflict (adding the same content at around the same time), 
then the second one needs to be able to detect the conflict and return the 
'reference to existing' signal instead of the 'inserted new' signal.
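
  A minimal sketch of a content-addressed store whose saveBlob() reports 
whether it created the blob. The class name, the `[ $url, $created ]` return 
shape and the SHA-1 URL scheme are assumptions for illustration. The in-memory 
check below is not safe against concurrent writers; a real backend would need 
an atomic check-and-insert, as described above.

```php
<?php
// Hypothetical sketch: saveBlob() returns [ $url, $created ].
class HashBlobStore {
    private $blobs = [];

    public function saveBlob( $data ) {
        $url = 'sha1:' . sha1( $data );
        // A real store would make this check-and-insert atomic.
        $created = !array_key_exists( $url, $this->blobs );
        if ( $created ) {
            $this->blobs[$url] = $data;
        }
        return [ $url, $created ];
    }

    public function deleteBlob( $url ) {
        unset( $this->blobs[$url] );
    }

    public function hasBlob( $url ) {
        return array_key_exists( $url, $this->blobs );
    }
}

$bs = new HashBlobStore();
list( $url1, $created1 ) = $bs->saveBlob( 'same text' ); // inserted new
list( $url2, $created2 ) = $bs->saveBlob( 'same text' ); // reference to existing

// On failure, a writer cleans up only the blobs it created itself.
if ( $created2 ) {
    $bs->deleteBlob( $url2 );
}
```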



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  Pseudo-code for `saveRevisionRecord()`
  
// assume we are in a db transaction
// protect against race condition
$this->checkIsCurrentRevision( $this->baseRevision );

$model = $slots['main']->getModel(); // "main" model must always be there
$length = SUM( $slots[*]->getSize() ); // total size of all slots
// special case for b/c: a single-slot revision keeps that slot's hash
$hash = count( $slots ) < 2 ? $slots['main']->getHash() : HASH( $slots[*]->getHash() );

$revId = $this->insertRevision( $user, $comment, $timestamp, $model,
  $length, $hash, $parentRev, ... );

foreach ( $slots as $name => $slot ) {
  $this->insertSlot( $revId, $name, $slot );
}

// make the new revision the current revision.
$this->updatePage( $this->title, $revId );
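
  The length and hash aggregation in the pseudo-code can be sketched in 
runnable form. The Slot value object and both helper functions are 
hypothetical. How the per-slot hashes get combined is not specified above; 
sorting by slot name and hashing the concatenated slot hashes is only one 
possible rule, chosen here so the result does not depend on slot order.

```php
<?php
// Hypothetical Slot value object; field names are assumptions.
class Slot {
    private $model;
    private $size;
    private $hash;

    public function __construct( $model, $size, $hash ) {
        $this->model = $model;
        $this->size = $size;
        $this->hash = $hash;
    }

    public function getModel() {
        return $this->model;
    }

    public function getSize() {
        return $this->size;
    }

    public function getHash() {
        return $this->hash;
    }
}

// Total length of a revision: the sum of all slot sizes.
function revisionLength( array $slots ) {
    return array_sum( array_map( function ( Slot $slot ) {
        return $slot->getSize();
    }, $slots ) );
}

// Revision hash. Expects at least the 'main' slot to be present.
function revisionHash( array $slots ) {
    if ( count( $slots ) < 2 ) {
        // b/c special case: a single slot keeps its own hash.
        return reset( $slots )->getHash();
    }
    ksort( $slots ); // assumed rule: combine in slot-name order
    $combined = '';
    foreach ( $slots as $slot ) {
        $combined .= $slot->getHash();
    }
    return sha1( $combined );
}

$main = new Slot( 'wikitext', 11, sha1( 'hello world' ) );
$script = new Slot( 'javascript', 5, sha1( 'x = 1' ) );

$single = [ 'main' => $main ];
$multi = [ 'script' => $script, 'main' => $main ];
```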



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread daniel
daniel added a comment.


  Thanks for moving this forward, Brion!
  
  Your code is pretty close to what I had in mind. I have repeated it below 
with some changes marked `// dk`
  
  In https://phabricator.wikimedia.org/T107595#2263968, @brion wrote:
  
  > In MediaWiki in general we're pretty lax about allowing unused data to 
accumulate in places like that, as long as its presence isn't harmful. Not the 
best practice but it has plenty of precedent. :)
  
  
  Indeed, and I don't think we should try to implement ACID semantics here. But 
I would like to design the interface in a way that provides a natural place for 
a transaction bracket to be implemented, if we ever have the need.
  
  > So an update pseudocode might look like:
  
$plu = $something->getPageLookupService();
$pr = $plu->getPageRecord( $title ); // dk: $pr is a "dumb" record, not a store
$initialRevId = $pr->getCurrentRevisionId();

$rs = $ps->getRevisionStore();

// dk: the $baseRevId may not be the same as the $initialRevId
$rb = $rs->getRevisionBuilder( $initialRevId, $baseRevId );
$rb->setUser($user); // dk: let's avoid RequestContext
$rb->setComment("awesome sauce");
// dk: $updatedTextContent and $updatedScriptData are Content objects
$rb->updateSlot('main', $updatedTextContent);
$rb->updateSlot('script', $updatedScriptData);

$rs->apply( $rb ); // dk: or just save()?
  
  > where inside the RevisionStore\apply method there's something like:
  
$dbw->start();
$bs = $this->blobStore();
$addedBlobs = [];
if ($previousRevision) {
  $oldSlots = $previousRevision->getSlots();
} else {
  $oldSlots = [];
}
try {
  foreach( $this->slotUpdates as $name => $content ) { // dk: slot name => Content object
    // dk: The blob store knows nothing about revisions or slots or content models.
    // The idea is that the same service interface can be used for all kinds of blobs.
    // We could add optional support for meta-data there, but it's not needed.
    $dataUrl = $bs->saveBlob( $content->serialize() );
    $addedBlobs[] = $dataUrl;
    // dk: the slot table associates $revId + $name to $dataUrl. We'll also want to
    // store content model, size and hash from the Content object.
    $slots[$name] = new Slot( $dataUrl, $content->getModel(), $content->getSize(), ... );
  }
  // dk: saveRevisionRecord() has to do a lot of things: insert into the revision table,
  // insert into the slot table for each slot (using the new revision id), update the
  // page table. rev_len and rev_sha1 need to be calculated from the slots, rev_content_model
  // and page_content_model should be set to the main slot's model for b/c.
  // And it needs to check against $baseRevisionId to detect race conditions.
  $this->saveRevisionRecord( $blah1, $blah2, $slots );
  $dbw->commit();
} catch (Exception $e) {
  // If update failed, clean up any newly added backing blobs, which
  // may be in external databases, filesystems, or services.
  try {
    foreach( $addedBlobs as $dataUrl ) {
      $bs->deleteBlob( $dataUrl ); // dk: this goes wrong if the URL is content/hash based!
    }
  } catch (Exception $e2) {
    // if we can't get in to delete them, let them leak. they're safe.
  }
  try {
    // dk: this should roll back all changes to the page table, the revision
    // table, and the slot table.
    $dbw->rollback();
  } catch (Exception $e3) {
    // that probably means the db connection died.
  }
  throw $e;
}
  
  The above code would replace much of what is in the Revision class now, in 
particular insertOn(). We can keep Revision around, but I'm not sure we can 
provide b/c for insertOn(). 
  I suppose the update logic outlined at the top would live in WikiPage for 
now. It would be called pretty much in the places where we now call 
Revision::insertOn().
  
  Of course, WikiPage should also be refactored, but trying to do this at the 
same time as we introduce multi-content revisions is probably a bad idea. In 
fact, it's what got me stuck when I first tried to implement this.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread brion
brion added a comment.


  Regarding transactional nature:
  
  Assuming the backing blob storage continues to work on the model of the 
current `text` table blobs with external storage backing, the "easy way" is to 
allow extra backend blobs to leak in case of transaction rollback, and let them 
be garbage-collected.
  
  If you want to get *fancy* you can do explicit cleanup after a rollback that 
happens in PHP-land (say after catching an exception, aborting the transaction, 
and then re-throwing the exception). But this will fail in the case of a fatal 
error that can't be caught, or the process being killed, leading to leaks again.
  
  In MediaWiki in general we're pretty lax about allowing unused data to 
accumulate in places like that, as long as its presence isn't harmful. Not the 
best practice but it has plenty of precedent. :)
  
  So an update pseudocode might look like:
  
$plu = $something->getPageLookupService();
$ps = $plu->getPageStore($title);
$initialRevId = $ps->getCurrentRevisionId();

$rs = $ps->getRevisionStore();

$rb = $rs->getRevisionBuilder( $initialRevId );
$rb->setUser($context->getUser());
$rb->setComment("awesome sauce");
$rb->updateSlot('main', $updatedTextContent);
$rb->updateSlot('script', $updatedScriptData);

$rs->apply( $rb );
  
  where inside the RevisionStore\apply method there's something like:
  
$dbw->start();
$bs = $this->blobStore();
$addedBlobs = [];
if ($previousRevision) {
  $slots = $previousRevision->getSlots();
} else {
  $slots = [];
}
try {
  foreach( $this->slotUpdates as $su ) {
    $blob = $bs->saveDataBlob( $su->getData() );
    $addedBlobs[] = $blob;
    $slots[$su->getName()] = $this->saveRevisionSlot( $blob );
  }
  $this->saveRevisionRecord( $blah1, $blah2, $slots );
  $dbw->commit();
} catch (Exception $e) {
  // If update failed, clean up any newly added backing blobs, which
  // may be in external databases, filesystems, or services.
  try {
    foreach( $addedBlobs as $blob ) {
      $blob->delete();
    }
  } catch (Exception $e2) {
    // if we can't get in to delete them, let them leak. they're safe.
  }
  try {
    $dbw->rollback();
  } catch (Exception $e3) {
    // that probably means the db connection died.
  }
  throw $e;
}



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-29 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2250053, @GWicke wrote:
  
  > Some notes:
  >
  > - PageUpdater aims to provide similar functionality as the change 
propagation service (using EventBus) & the job queue. Could you clarify why we 
need another mechanism for change propagation?
  
  
  Where do I propose another mechanism for change propagation? The PageUpdater 
would do exactly what Revision does now: schedule DataUpdates. This can easily 
be changed to use EventBus at some point. PageUpdater (and possibly 
RevisionUpdater) are the "interactors" that bind updates to the storage layer 
to notifications on the event bus. They don't implement a new mechanism. For 
now, they would do exactly what Revision already does.
  
  > - The blob store does not provide any locality information (title or page 
id, revision, render id / time-uuid), which means that it is incompatible with 
existing storage systems like RESTBase. Since locality information is critical 
for consistency and decent compression, I would suggest always providing at 
least these keys.
  
  The blob store is (potentially) content-addressable, so the same blob may be 
used for different revisions of different pages. Even for blobs that have an 
incremental ID (e.g. using the current text table storage mechanism), the same 
blob would frequently be used for multiple revisions of the same page.
  
  Information like hash, timestamp, revision, etc would be associated with the 
revision_slot entry. The "slot" is the glue between the revision and the blob, 
and holds all relevant information for using the blob, including the content 
model and serialization format.
  
  I have been thinking about a mechanism that allows at least some 
meta-information to be associated with blobs, such as the length, hash, model, 
and format (but never the revision, since there may be any number of 
revisions). The problem is that not all storage mechanisms support it (the text 
table only has an id, the blob, and some flags), so this would have to be 
optional.
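
  To illustrate the slot as the glue between revision and blob, here is a 
minimal sketch. The SlotRecord class, its field names and the SHA-1 based blob 
URL are assumptions for illustration only, not a proposed schema.

```php
<?php
// Hypothetical record shape; field names are assumptions, not a schema.
class SlotRecord {
    public $revId;
    public $name;
    public $blobUrl;
    public $model;
    public $format;
    public $size;
    public $hash;

    public function __construct( $revId, $name, $data, $model, $format ) {
        $this->revId = $revId;
        $this->name = $name;
        // Content-addressed URL: identical data yields an identical URL.
        $this->blobUrl = 'sha1:' . sha1( $data );
        $this->model = $model;
        $this->format = $format;
        $this->size = strlen( $data );
        $this->hash = sha1( $data );
    }
}

// Revision 103 reverts the page to the text of revision 101.
$r101 = new SlotRecord( 101, 'main', 'original text', 'wikitext', 'text/x-wiki' );
$r102 = new SlotRecord( 102, 'main', 'vandalized', 'wikitext', 'text/x-wiki' );
$r103 = new SlotRecord( 103, 'main', 'original text', 'wikitext', 'text/x-wiki' );
```

  Revisions 101 and 103 end up pointing at the same blob, which is why the 
blob itself cannot carry a single revision id, while each slot record can.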



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-28 Thread GWicke
GWicke added a comment.


  Some notes:
  
  - PageUpdater aims to provide similar functionality as the change propagation 
service (using EventBus) & the job queue. Could you clarify why we need another 
mechanism for change propagation?
  - The blob store does not provide any locality information (title or page id, 
revision, render id / time-uuid), which means that it is incompatible with 
existing storage systems like RESTBase. Since locality information is critical 
for consistency and decent compression, I would suggest always providing at 
least these keys.



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-28 Thread daniel
daniel added a comment.


  Addendum to my brain dump in 
https://phabricator.wikimedia.org/T107595#2235538:
  
  One question I got stuck on was: How do we provide a transactional context to 
the blob stores? We can have a RevisionBuilder with begin/commit, but when 
that interacts with the storage layer below, how is the transactional context 
maintained? Do we create builders/updaters/promises on all levels (SlotBuilder, 
BlobPromise, etc) that are created based on the current transactional context 
object? Or do we make the trx object an optional argument to all methods in 
the storage layer interfaces?



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-25 Thread daniel
daniel added a comment.


  In https://phabricator.wikimedia.org/T107595#2235621, @brion wrote:
  
  > Thoughts:
  >
  > would RevisionContentLookup need both title and revision id in the same 
lookup, or should we rely on database integrity for ids, and have a separate 
lookup method as a shortcut for 'latest revision of page with this page id or 
title'?
  
  
  Still undecided. In theory, the revision ID is enough, at least with our 
database scheme. But for a different implementation, knowing the title may be 
useful, and having to look it up again would be annoying.
  
  > Slot name uniqueness/registration? Are we good enough with ad-hoc names? 
Hopefully extensions won't conflict, etc. we pretty much have tons of cases 
where conflicts are possible though so this doesn't really increase our 
conflict surface.
  
  I agree: something to keep in mind, but not worse than content model ids or 
namespaces.
  
  > On the updater interfaces I'm not sure we need an explicit begin, versus 
implicitly beginning in the constructor (RAII)... Or are nested transactions 
still not safe to rely on? H
  
  Nested transactions are not safe to rely on. As far as I can tell, we really 
only use transactions for batching. There are no ACID guarantees.
  
  I would not want a constructor to start any kind of process. Beginning the 
update may grab locks, so it should be explicit (and possibly return a magic 
scoped variable that rewinds if it goes out of scope before a commit).
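
  A minimal sketch of such a "magic scoped variable", assuming PHP's 
destructor semantics (the destructor runs as soon as the last reference to the 
object goes away). UpdateScope and PageUpdater are hypothetical classes that 
only record what happens; they are not a proposed interface.

```php
<?php
// Hypothetical sketch of a scoped update guard.
class UpdateScope {
    private $onRewind;
    private $done = false;

    public function __construct( callable $onRewind ) {
        $this->onRewind = $onRewind;
    }

    public function commit() {
        $this->done = true;
    }

    // Rewind if the scope is dropped without a commit.
    public function __destruct() {
        if ( !$this->done ) {
            call_user_func( $this->onRewind );
        }
    }
}

class PageUpdater {
    public $log = [];

    // Explicit begin: may grab locks, so it is not hidden in the constructor.
    public function begin() {
        $this->log[] = 'begin';
        return new UpdateScope( function () {
            $this->log[] = 'rewind';
        } );
    }
}

function abandonedUpdate( PageUpdater $updater ) {
    $scope = $updater->begin();
    // Returning without $scope->commit() triggers the rewind.
}

function completedUpdate( PageUpdater $updater ) {
    $scope = $updater->begin();
    $scope->commit();
}

$updater = new PageUpdater();
abandonedUpdate( $updater );
completedUpdate( $updater );
```

  This covers early returns and exceptions, but like any PHP-land cleanup it 
cannot help if the process is killed.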
  
  > Derived and virtual slots still make me go "hmmm" as well though I can see 
some benefits. Need to think more as we flesh stuff out, get some good usage 
examples.
  
  The main benefit is: give extensions a place to put their stuff, so they 
don't come up with 20 solutions, all a bit different and crazy. Provide a 
standard mechanism for storing derived content along with a page revision.
  
  And if we do that, allowing virtual slots is dictated by the caller not 
wanting to know whether a slot is virtual or not.
  
  > I'll think about the xml dump format (do we have any notes on that I 
missed?) but it should be straightforward to add new elements alongside <text> 
to store the non-primary slots.
  
  Slight confusion here: "primary" to me means "user generated, not derived". 
Maybe I should start saying "original". 
  So yea, we want to put other original content into additional <text> tags. 
But those text tags will not only have to have a slot name, we also need to 
record the length, hash, content model and serialization format for each <text> 
blob. Do we want to put them into attributes, or change the structure around 
some more?
  
  > What to do with derived and virtual slots on dump? I would tend to not 
include them I guess.
  
  Indeed, yes. The purpose of the dumps is to back up, import and export the 
content that users generated. And if we want derived content in the dumps 
after all, it's easy enough to add that.
  
  Some more questions:
  
  - do we need "slot handlers", or can all handling be done based on the 
content model?
  - do we need some kind of configurable orchestrator for combining the view, 
edit interface, diffs, etc of the available slots? Would that be configured per 
namespace?



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-25 Thread brion
brion added a comment.


  Ok in that case... I will trust nothing ;)



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-25 Thread daniel
daniel added a comment.


  @brion beware that the patches are old, stale, incomplete, and include dead 
ends. And possibly some other dead things, in dark corners...



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-25 Thread brion
brion added a comment.


  Aaa and now I see the bits in gerrit. I'll review all this tomorrow when I'm 
a little bit rested. Hehehe



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-25 Thread brion
brion added a comment.


  Ah great, that was mostly written before your post. ;) sounding good so 
far... Do you have code fleshed out enough to share or should we take that 
class structure and write fresh?



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-25 Thread daniel
daniel added a comment.


  @brion: y! I have been thinking about this a lot lately. I have done 
some code experiments I would like to share and document. I'm pretty busy, but 
I'll do my best to squeeze this in. Keep poking me :)
  
  Veeery quick overview (mostly for my own good):
  
  - PageStore -> create/update/delete pages. Uses RevisionStore. Does all the 
secondary data update stuff.
  - RevisionStore -> returns RevisionBuilder; caller adds RevisionSlots and 
meta-data to RevisionBuilder
    - RevisionBuilder maintains transactional context. Needs to be aware of 
base rev id for "late" conflict detection!
    - later, add support for RevisionUpdater, for updating persistent derived 
revision data
  - RevisionLookup returns RevisionRecord objects; LazyRevisionRecord for lazy 
loading?
  - RevisionRecord can enumerate RevisionSlots for primary content. 
LazyRevisionSlot for lazy loading of content.
  - RevisionSlot has Content and meta-data (size, hash, content model, change 
date, etc.); do we need a RevisionSlotLookup/RevisionSlotStore?
  - Primary content implements Content. Derived content implements Data(?!); 
Content extends Data.
  - RevisionStore/RevisionLookup is based on BlobStoreMultiplexer. Read/write 
is routed based on a prefix in the blob id.
  - BlobStoreMultiplexer manages multiple BlobStores
  - RevisionStore turns blobs into ContentObjects and creates RevisionSlot and 
RevisionRecord objects from them (or creates a LazyRevisionRecord that loads 
data on demand)
  
  Note: primary (user generated) content slots must be enumerable. Which 
revision has which primary slots is recorded in the database. Secondary 
(derived) content slots may also be persistent in the database, but can just as 
well be purely virtual. As a case in point, we'd want a) a ParserCache 
implementation based on persistent derived slots as well as b) a virtual slot 
implementation based on the existing ParserCache.
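  
  The prefix-based routing in the overview above can be sketched roughly as 
follows. This is a minimal illustration in Python rather than MediaWiki's PHP; 
the method names (`store`, `load`), the `prefix:local_id` id format, and the 
`InMemoryBlobStore` backend are assumptions made for the example, not part of 
the proposal.

```python
from typing import Dict, Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical minimal blob store interface (not MediaWiki's actual API)."""

    def store(self, data: bytes) -> str: ...

    def load(self, local_id: str) -> bytes: ...


class InMemoryBlobStore:
    """Toy backend that keeps blobs in a dict, keyed by a running counter."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        local_id = str(len(self._blobs) + 1)
        self._blobs[local_id] = data
        return local_id

    def load(self, local_id: str) -> bytes:
        return self._blobs[local_id]


class BlobStoreMultiplexer:
    """Routes reads and writes to a backend chosen by the blob id prefix."""

    def __init__(self, stores: Dict[str, BlobStore], default_prefix: str) -> None:
        if default_prefix not in stores:
            raise ValueError(f"unknown default prefix {default_prefix!r}")
        self._stores = stores
        self._default_prefix = default_prefix

    def store(self, data: bytes, prefix: Optional[str] = None) -> str:
        # The returned id embeds the prefix so later reads can be routed.
        prefix = prefix or self._default_prefix
        local_id = self._stores[prefix].store(data)
        return f"{prefix}:{local_id}"

    def load(self, blob_id: str) -> bytes:
        # Split once on the first colon: "<prefix>:<backend-local id>".
        prefix, sep, local_id = blob_id.partition(":")
        if not sep or prefix not in self._stores:
            raise KeyError(f"no blob store registered for id {blob_id!r}")
        return self._stores[prefix].load(local_id)


mux = BlobStoreMultiplexer(
    {"db": InMemoryBlobStore(), "ext": InMemoryBlobStore()},
    default_prefix="db",
)
wikitext_id = mux.store(b"Sheep are ruminants.")
html_id = mux.store(b"<p>Sheep are ruminants.</p>", prefix="ext")
```

  Because the prefix travels with the id, a caller holding `wikitext_id` or 
`html_id` does not need to know which backend holds the blob.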



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-06 Thread daniel
daniel added a comment.


  @Yurik that was actually what I had in mind originally. We called it 
multi-part content, like the MIME encoding for emails. The problem is that it 
is not backwards compatible. It would break everything that expects to be able 
to edit text via the action=editpage interface, or find text in the <text> tags 
in an XML dump. It's not transparent - all code processing content has to know 
about the new multi-part model.
  
  That would probably be acceptable for newly introduced types of pages, but we 
cannot use the multi-part approach with article pages, or talk pages, or file 
descriptions - all the client code would break. When discussing this issue with 
James (and I think Roan) in Lyon, we came up with the multi-content revision 
approach. Multi-content revisions are not only backwards compatible, they are 
also a more powerful concept. For example, they support updating derived 
content slots.
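  
  To illustrate the compatibility argument, here is a rough Python sketch. The 
`Revision` class, the slot name `main`, and the helper functions are all 
hypothetical; the point is only that a legacy reader keeps receiving plain text 
from a multi-content revision, whereas a multi-part blob hands it a wrapper 
format it does not understand.

```python
import json
from dataclasses import dataclass, field
from typing import Dict

MAIN_SLOT = "main"  # assumed name of the slot that legacy code sees


@dataclass
class Revision:
    """Hypothetical multi-content revision: named slots, each holding text."""

    rev_id: int
    slots: Dict[str, str] = field(default_factory=dict)

    def get_text(self) -> str:
        """Legacy-style accessor: returns the main slot only."""
        return self.slots[MAIN_SLOT]

    def get_slot(self, name: str) -> str:
        """Slot-aware accessor for new code."""
        return self.slots[name]


def multipart_encode(parts: Dict[str, str]) -> str:
    """The multi-part alternative: every part packed into one text blob."""
    return json.dumps(parts, sort_keys=True)


def legacy_word_count(text: str) -> int:
    """Stand-in for old client code that assumes it is handed plain wikitext."""
    return len(text.split())


rev = Revision(1, {MAIN_SLOT: "Sheep are ruminants.", "categories": "Livestock"})

# Multi-content revision: the legacy client sees exactly the article text.
mcr_count = legacy_word_count(rev.get_text())

# Multi-part content: the same client is handed the JSON wrapper instead,
# so its result no longer describes the article text.
multipart_count = legacy_word_count(multipart_encode(rev.slots))
```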



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-05 Thread RobLa-WMF
RobLa-WMF added a comment.


  In https://phabricator.wikimedia.org/T107595#2180530, @Yurik wrote:
  
  > Can we solve some of the proposed use cases by simply wrapping "content" into 
a higher level structure, e.g. json, to store multiple streams?  For example, 
for a hypothetical "tabular data", we could have
  >
  >   {
  >     "license": "...",
  >     "headers": [...],
  >     "rows": [ [...], [...], ... ]
  >   }
  >
  
  
  I think this is a really good point; and generally, many of us probably need 
to spend some quality time reading through what @daniel wrote (I confess I haven't 
done that yet).  We'll almost certainly need to do something like what Yuri is 
suggesting to have a sane import/export format.  Daniel has clearly thought 
about the dump format problem, so I wouldn't be surprised if the answer is 
already spelled out above.
  
  @Daniel, I suspect the bulk of the prose for this RFC should migrate to 
mediawiki.org as we more seriously consider it.  Is this something you are 
planning on, or do you need help/buy-in/gnome-help in order to accomplish?



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-05 Thread Yurik
Yurik added a comment.


  Can we solve some of the proposed use cases by simply wrapping "content" into 
a higher level structure, e.g. json, to store multiple streams?  For example, 
for a hypothetical "tabular data", we could have
  
    {
      "license": "...",
      "headers": [...],
      "rows": [ [...], [...], ... ]
    }
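  
  As a rough illustration of what reading such a wrapped document involves, 
here is a small Python sketch. The function name, the sample values (including 
the `CC0-1.0` license string), and the rule that every row must match the 
header width are assumptions for the example, not part of the suggestion.

```python
import json
from typing import Any, Dict

REQUIRED_KEYS = ("license", "headers", "rows")


def parse_tabular(blob: str) -> Dict[str, Any]:
    """Parse a wrapped "tabular data" document and check its basic shape."""
    doc = json.loads(blob)
    if not isinstance(doc, dict):
        raise ValueError("top-level value must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in doc]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    # Assumed consistency rule: each row has one cell per header.
    width = len(doc["headers"])
    for index, row in enumerate(doc["rows"]):
        if len(row) != width:
            raise ValueError(f"row {index} has {len(row)} cells, expected {width}")
    return doc


example = json.dumps({
    "license": "CC0-1.0",
    "headers": ["animal", "legs"],
    "rows": [["sheep", 4], ["hen", 2]],
})
table = parse_tabular(example)
```

  Note that any code reading such a page must know the wrapper format before it 
can reach an individual stream.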



[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-01-20 Thread Aklapper
Aklapper added a comment.

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. 
**If the session in this task took place**, please make sure 1) that the 
session Etherpad notes are linked from this task, 2) that followup tasks for 
any actions identified have been created and linked from this task, 3) to 
change the status of this task to "resolved". **If this session did not take 
place**, change the task status to "declined". **If this task** itself has 
become a well-defined action which **is not finished yet**, drag and drop this 
task into the "Work continues after Summit" column on the project workboard. 
Thank you for your help!




[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2015-10-14 Thread Spage
Spage added a comment.

You mention

- categories etc. maintained as structured, user editable data outside the 
wikitext

(please spell out "etc." :-) ).

So a page's categories would be in an additional primary slot. But categories 
are currently markup in the wikitext. If you ask for the wikitext of the Sheep 
article, will it still contain `[[Category:Domesticated animals]]`? If you 
edit wikitext and add `[[Category:Livestock]]`, will that be detected, added to 
the category primary slot, and removed from the wikitext?

Templates do tricks like categorizing themselves within `<noinclude>` while 
categorizing the page that includes them within `<includeonly>`. How would this 
work in the new world?




[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2015-10-14 Thread daniel
daniel added a comment.

@spage if we have multiple content slots, we //can// store categories 
separately. We can store them in a primary slot and edit them directly, or in a 
derived slot (extracted from wikitext). Or we can leave things as they are. Or 
we could allow people to enter categories in the wikitext, and then move them 
to the second primary content slot in a pre-save transformation.
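
The last option, moving categories out of the wikitext on save, might look 
roughly like this. It is a simplified Python sketch: the function name and the 
regular expression are assumptions for illustration, and real wikitext parsing 
(templates, `<nowiki>`, localized namespace names) is considerably more 
involved.

```python
import re
from typing import List, Tuple

# Simplified pattern: matches [[Category:Name]] and [[Category:Name|sort key]],
# plus one trailing newline so no blank line is left behind.
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]\n?",
    re.IGNORECASE,
)


def split_categories(wikitext: str) -> Tuple[str, List[str]]:
    """Return (wikitext without category links, category names in order)."""
    categories: List[str] = []
    for match in CATEGORY_LINK.finditer(wikitext):
        name = match.group(1)
        if name not in categories:  # keep each category once
            categories.append(name)
    return CATEGORY_LINK.sub("", wikitext), categories


text = (
    "Sheep are ruminants.\n"
    "[[Category:Domesticated animals]]\n"
    "[[Category:Livestock|Sheep]]\n"
)
# main_slot would go to the wikitext slot, category_slot to the second slot.
main_slot, category_slot = split_categories(text)
```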

Similarly, template documentation (the typical use case for `<noinclude>`) 
//could// then be managed in a second primary content slot, as well as the 
"template data" schema, but we don't have to do that.

Also, html and diffs could go into a derived slot. That would be a nice way to 
consolidate the different storage and caching solutions we have.

I'm proposing a storage layer that is flexible enough to accommodate these use 
cases. Whether we actually do all that remains an open question. But from the 
conversations we had, it seems like a storage layer like this will come in 
handy at least for quite a few of these and similar cases.




[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2015-10-03 Thread Qgil
Qgil added a subscriber: Qgil.
Qgil added a comment.

Congratulations! This is one of the 52 proposals that made it through the first 
deadline of the 
https://phabricator.wikimedia.org/tag/wikimedia-developer-summit-2016/ 
selection process. Please pay attention to the next one:

> By **6 Nov 2015**, all Summit proposals must have active discussions and a 
> Summit plan documented in the description. Proposals not reaching this 
> critical mass can continue at their own path out of the Summit.




[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2015-09-30 Thread daniel
daniel added a comment.

In https://phabricator.wikimedia.org/T107595#1676115, @GWicke wrote:

> @daniel, your revised version seems to focus even more on implementing 
> storage systems, change propagation etc, rather than defining a data access 
> interface for MediaWiki, which can be backed by services.


I defined several interfaces that can be backed by whatever service you like. 
None of them is bound to an SQL backend.

In Level II I propose a DB schema that could be used as a default storage 
mechanism for multiple Content objects per revision. This could serve as a 
baseline implementation, but the rest of the proposal does not at all depend on 
this being done in SQL.

> Could you clarify how you see this relate to ongoing efforts with similar 
> goals and use cases like

>  a) RESTBase offering a lot of the storage & API functionality (beyond blob 
> storage),




- The RevisionContentLookup interfaces could be implemented on top of RESTBase, 
to provide high level access to Content objects for a given revision. My 
understanding of RESTBase is a bit blurry, but I suppose that would fit with 
the "pagecontent" concept in RESTBase.
- The BlobStore interface could be implemented on top of RESTBase, to provide a 
low level storage mechanism for content blobs.
- A RESTBase request handler could be implemented on top of a 
RevisionContentLookup service
- A RESTBase request handler could be implemented on top of a BlobStore service

I'm not sure which of these we would want. RESTBase access to 
RevisionContentLookup sounds pretty good. A BlobStore backed by RESTBase also 
seems an obvious win.

> b) the event bus (https://phabricator.wikimedia.org/tag/eventbus/), 
> dependency tracking and change propagation work in 
> https://phabricator.wikimedia.org/T102476 and friends, and


No direct relationship. The current architecture calls for each blob to be 
identified by a URL; for dependency tracking, we would also want URLs for high 
level revision slots, especially for derived content (since it depends on 
whatever it was derived from).

RevisionUpdater is intended to allow derived content to be updated when 
dependencies change. How the notification or re-calculation works is not part 
of this proposal.

> c) the Virtual Rest Service abstraction layer 
> (https://phabricator.wikimedia.org/T112553)?


Same as for RESTBase, really:

- The RevisionContentLookup interfaces could be implemented on top of VRS, to 
provide high level access to Content objects for a given revision.
- The BlobStore interface could be implemented on top of VRS, to provide a low 
level storage mechanism for content blobs.
- A VRS backend could be implemented on top of a RevisionContentLookup service
- A VRS backend could be implemented on top of a BlobStore service

Again, I'm not sure which one we would want, and I'm not yet convinced VRS is a 
good idea at all.




[Wikidata-bugs] [Maniphest] [Commented On] T107595: RFC: Multi-Content Revisions

2015-09-29 Thread Tobi_WMDE_SW
Tobi_WMDE_SW added a subscriber: Tobi_WMDE_SW.
Tobi_WMDE_SW added a comment.

@daniel will do in the 
https://phabricator.wikimedia.org/tag/wikidata-sprint-2015-09-29/:
formulate concrete questions to be discussed in the RfC meeting and do some 
experimental coding.




[Wikidata-bugs] [Maniphest] [Commented On] T107595: RFC: Multi-Content Revisions

2015-09-25 Thread GWicke
GWicke added a comment.

@daniel, your revised version seems to focus even more on implementing storage 
systems, change propagation etc, rather than defining a data access interface 
for MediaWiki, which can be backed by services.

Could you clarify how you see this relate to ongoing efforts with similar goals 
and use cases like

a) RESTBase offering a lot of the storage & API functionality (beyond blob 
storage), and
b) the event bus, dependency tracking and change propagation work in 
https://phabricator.wikimedia.org/T102476 and friends?

