Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-06-23 Thread Adam Wight
On Fri, Jun 6, 2014 at 4:08 PM, Aaron Schulz aschulz4...@gmail.com wrote:

 I suppose that naming scheme is reasonable.

 $contentsRevId sounds awkward, maybe $sourceRevId or $originRevId is
 better.


What about rollbackRevId?  I want the variable name to make its purpose
very clear.

-Adam




 --
 View this message in context:
 http://wikimedia.7.x6.nabble.com/Unclear-Meaning-of-baseRevId-in-WikiPage-doEditContent-tp5028661p5029674.html
 Sent from the Wikipedia Developers mailing list archive at Nabble.com.

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-06-06 Thread Aaron Schulz
I suppose that naming scheme is reasonable.

$contentsRevId sounds awkward, maybe $sourceRevId or $originRevId is better.



--
View this message in context: 
http://wikimedia.7.x6.nabble.com/Unclear-Meaning-of-baseRevId-in-WikiPage-doEditContent-tp5028661p5029674.html
Sent from the Wikipedia Developers mailing list archive at Nabble.com.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-06-03 Thread Adam Wight
It looks like we should leave the existing hook parameters values alone for
the moment, but it would improve the situation if we renamed variables
which seem to be overloaded or unclear, in MediaWiki core and in
FlaggedRevs.  What do you think of the following conventions,

'oldid' (index.php parameter) -- keep this name only to preserve interface
compatibility.  This refers to a historical revision when used in the
action=view case, and to the latest revision ID of the page at the time an
edit session begins.

$oldid -- keep as-is in the action=view codepath, rename to $parentRevId in
action=edit

$parentRevId -- latest available revision ID at the time an edit session
begins.  Used to detect conflicts, and identify the parent revision record
upon save.  This is updated during successful automatic rebase.  I don't
see a good use case for preserving what Daniel calls the reference
revision, the parentRevId before rebase.

$baseRevId and $baseId -- rename everywhere to $contentsRevId, but
examining the code contexts for the smell of confounding with $parentRevId.

$contentsRevId -- revision ID of the source text to copy when performing
undo or rollback.  We will probably want to supplement hooks that only
passed $contentsRevId, such as NewRevisionFromEditComplete, with
$parentRevId as an additional parameter.

A refactor along these lines would keep me from losing already scant
marbles as I attempt to fix related issues in core:
https://gerrit.wikimedia.org/r/#/c/94584/ , I see now that I've already
begun to introduce mistakes caused by the difficult common-sense
interpretation of current variable naming.

-Adam


On Mon, Jun 2, 2014 at 1:22 AM, Daniel Kinzler dan...@brightbyte.de wrote:

 Am 30.05.2014 15:38, schrieb Brad Jorsch (Anomie):
  I think you need to look again into how FlaggedRevs uses it, without the
  preconceptions you're bringing in from the way you first interpreted the
  name of the variable. The current behavior makes perfect sense for that
  specific use case. Neither of your proposals would work for FlaggedRevs.

 As far as I understand the rather complex FlaggedRevs.hooks.php code, it
 assumes
 that

 a) if $newRev === $baseRevId, it's a null edit. As far as I can see, this
 does
 not work, since $baseRevId will be null for a null edit (and all other
 regular
 edits).

 b) if $newRev !== $baseRevId but the new rev's hash is the same as the base
 rev's hash, it's a rollback. This works with the current implementation of
 commitRollback(), but does not for manual reverts or trivial undos.

 So, FlaggedRevs assumes that EditPage resp WikiPage set $baseRevId to the
 edits
 logical parent (basically, the revision the user loaded when starting to
 edit).
 That's what I described as option (3) in my earlier mail, except for the
 rollback case; It would be fined with me to use the target rev as the base
 for
 rollbacks, as is currently done.

 FlaggedRevs.hooks.php also injects a baseRevId form field and uses it in
 some
 cases, adding to the confusion.

 In order to handle manual reverts and null edits consistently, EditPage
 should
 probably have a base revision as a form field, and pass it on to
 doEditContent.
 As far as I can tell, this would work with the current code in FlaggedRevs.

  As for the EditPage code path, note that it has already done edit
 conflict
  resolution so base revision = current revision of the page. Which is
  probably the intended meaning of false.

 Right. If that's the case though, WikiPage::doEditContent should probably
 set
 $baseRevId = $oldid, before passing it to the hooks.

 Without changing core, it seems that there is no way to implement a
 late/strict
 conflict check based on the base rev id. That would need an additional
 anchor
 revision for checking.

 The easiest solution for the current situation is to simply drop the strict
 conflict check in Wikibase and accept a race condition that may cause a
 revision
 to be silently overwritten, as is currently the case in core.

 -- daniel


 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-06-02 Thread Daniel Kinzler
Am 30.05.2014 15:38, schrieb Brad Jorsch (Anomie):
 I think you need to look again into how FlaggedRevs uses it, without the
 preconceptions you're bringing in from the way you first interpreted the
 name of the variable. The current behavior makes perfect sense for that
 specific use case. Neither of your proposals would work for FlaggedRevs.

As far as I understand the rather complex FlaggedRevs.hooks.php code, it assumes
that

a) if $newRev === $baseRevId, it's a null edit. As far as I can see, this does
not work, since $baseRevId will be null for a null edit (and all other regular
edits).

b) if $newRev !== $baseRevId but the new rev's hash is the same as the base
rev's hash, it's a rollback. This works with the current implementation of
commitRollback(), but does not for manual reverts or trivial undos.

So, FlaggedRevs assumes that EditPage resp WikiPage set $baseRevId to the edits
logical parent (basically, the revision the user loaded when starting to edit).
That's what I described as option (3) in my earlier mail, except for the
rollback case; It would be fined with me to use the target rev as the base for
rollbacks, as is currently done.

FlaggedRevs.hooks.php also injects a baseRevId form field and uses it in some
cases, adding to the confusion.

In order to handle manual reverts and null edits consistently, EditPage should
probably have a base revision as a form field, and pass it on to doEditContent.
As far as I can tell, this would work with the current code in FlaggedRevs.

 As for the EditPage code path, note that it has already done edit conflict
 resolution so base revision = current revision of the page. Which is
 probably the intended meaning of false.

Right. If that's the case though, WikiPage::doEditContent should probably set
$baseRevId = $oldid, before passing it to the hooks.

Without changing core, it seems that there is no way to implement a late/strict
conflict check based on the base rev id. That would need an additional anchor
revision for checking.

The easiest solution for the current situation is to simply drop the strict
conflict check in Wikibase and accept a race condition that may cause a revision
to be silently overwritten, as is currently the case in core.

-- daniel


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-05-30 Thread Daniel Kinzler
Am 29.05.2014 21:07, schrieb Aaron Schulz:
 Yes it was for auto-reviewing new revisions. New revisions are seen as a
 combination of (base revision, changes). 

But EditPage in core sets $baseRevId to false. The info isn't there for the
standard case. In fact, the ONLY thing in core that sets it to anything but
false is commitRollback() , and that sets it to a value that to me doesn't make
much sense to me - the revision we revert to, instead of either the revision we
revert *from* (base/physical parent), or at least the *parent* of the revision
we revert to (logical parent).

Also, if you want (base revision, changes), you would use $oldid in
doEditContent, not $baseRevId. Perhaps it's just WRONG to pass $baseRevId to the
hooks called by doEditCOntent, and it should have been $oldid all along? $oldid
is what you need if you want to diff against the previous revision - so
presumably, that's NOT what $baseRevId is.

 If baseRevId is always set to the revision the user started from it would
 cause problems for that extension for the cases where it was previously
 false.

false means don't check, I suppose - or there is no base, but that could
be identified by the EDIT_NEW flag.

I'm not proposing to change the cases where baseRevId is false. They can stay as
they are. I'm proposing to set baseRevId to the revision the user started with,
OR false, so we can detect conflicts safely  sanely.

 It would indeed be useful to have a casRevId value that was the current
 revision at the time of editing just for CAS style conflict detection.

Indeed - but changing the method signature would be painful, and the existing
$baseRevId parameter does not seem to be used at all - or at least, it's used in
such an inconsistent way as to be useless, of not misleading and harmful.

For now, I propose to just have commitRollback call doEditContent with
$baseRevId = false, like the rest of core does. Since core itself doesn't use
this value anywhere, and sets it to false everywhere, that seems consistent. We
could then just clarify the documentation. This way, Wikibase could use the
$baseRevId value for conflict detection - actually, core could, and should, do
just that in doEditContent; this wouldn't do anything in core until the
$baseRevId is supplied at least by EditPage.

Of course, we need to check FlaggedRevs and other extensions, but seeing how
this argument is essentially unused, I can't imagine how this change could break
anything for extensions.

-- daniel



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-05-30 Thread Brad Jorsch (Anomie)
On Fri, May 30, 2014 at 4:06 AM, Daniel Kinzler dan...@brightbyte.de
wrote:

 Am 29.05.2014 21:07, schrieb Aaron Schulz:
  Yes it was for auto-reviewing new revisions. New revisions are seen as a
  combination of (base revision, changes).

 But EditPage in core sets $baseRevId to false. The info isn't there for the
 standard case. In fact, the ONLY thing in core that sets it to anything
 but false is commitRollback() , and that sets it to a value that to me
 doesn't make much sense to me - the revision we revert to, instead of
 either the revision we revert *from* (base/physical parent), or at least
 the *parent* of the revision we revert to (logical parent).


I think you need to look again into how FlaggedRevs uses it, without the
preconceptions you're bringing in from the way you first interpreted the
name of the variable. The current behavior makes perfect sense for that
specific use case. Neither of your proposals would work for FlaggedRevs.

As for the EditPage code path, note that it has already done edit conflict
resolution so base revision = current revision of the page. Which is
probably the intended meaning of false.


 Of course, we need to check FlaggedRevs and other extensions, but seeing
 how this argument is essentially unused, I can't imagine how this change
 could break anything for extensions.


Except FlaggedRevs.


-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-05-30 Thread Aaron Schulz
FlaggedRevs uses the NewRevisionFromEditComplete hook. Grepping for that, I
see reasonable values set in the callers at a quick glance. This cover
various null edit scenarios too. The $baseRevId in WikiPage is just one of
the cases of that value passed to the hook, and is fine there (being mostly
false). false indeed means not determined and that behavior is needed
for the hook values. The values given in that hook variable make sense and
are more or less consistent.

As I said before, if the NewRevisionFromEditComplete hook is given the same
base revision ID values for all cases, then I don't care to much what
happens to the $baseRevId value semantics in doEditContent(). As long as
everything is changed to keep that part consistent then it won't effect
anything. However, just naively change the $baseRevId values for the
non-false cases will break the extension using it.

As as side note, FlaggedRevs doesn't just end up using $oldid. It only uses
that as the last resort after picking other values in difference scenarios
it detects.



--
View this message in context: 
http://wikimedia.7.x6.nabble.com/Unclear-Meaning-of-baseRevId-in-WikiPage-doEditContent-tp5028661p5029028.html
Sent from the Wikipedia Developers mailing list archive at Nabble.com.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-05-29 Thread Aaron Schulz
Yes it was for auto-reviewing new revisions. New revisions are seen as a
combination of (base revision, changes). If the base revision was reviewed
and the user is trusted, then so is the new revision. MW core had the
obvious cases of rollback and null edits, which are (base revision, no
changes). Their is a lot more base revision detection in FlaggedRevs for
the remaining cases, some less obvious (user supplied baseRevId, X-top edit
undo, fall back to prior edit).

If baseRevId is always set to the revision the user started from it would
cause problems for that extension for the cases where it was previously
false.

It would indeed be useful to have a casRevId value that was the current
revision at the time of editing just for CAS style conflict detection.



--
View this message in context: 
http://wikimedia.7.x6.nabble.com/Unclear-Meaning-of-baseRevId-in-WikiPage-doEditContent-tp5028661p5028902.html
Sent from the Wikipedia Developers mailing list archive at Nabble.com.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-05-28 Thread Daniel Kinzler
Hi all.

We (the Wikidata team) ran into an issue recently with the value that gets
passed as $baseRevId to Content::prepareSave(), see Bug 67831 [1]. This comes
from WikiPage::doEditContent(), and, for core, is nearly always set to false
(e.g. by EditPage).

We interpreted this rev ID to be the revision that is the nominal base revision
of the edit, and implemented an edit conflict check based on it. Which works
with the way we use doEditContent() for wikibase on wikidata, and with most
stuff in core (which generally has $baseRevId = false). But as it turns out, it
does not work with rollbacks: WikiPage::commitRollback sets $baseRevId to the ID
of the revision we revert *to*.

Now, is that correct, or is it a bug? What does base revision mean?

The documentation of WikiPage::doEditContent() is unclear about this (yes, I
wrote this method when introducing the Content class - but I copied the
interface WikiPage::doEdit(), and mostly kept the code as it was). And in the
code, $baseRevId is not used at all except for passing it to hooks and to
Content::prepareSave - which doesn't do anything with it for any of the Content
implementations in core - only in Wikibase we tried to implement a conflict
check here, which should really be in WikiPage, I think.

So, what *does* $baseRevId mean? If you happen to know when and why $baseRevId
was introduced, please enlighten me. I can think of three possibilities:

1) It's the edit's reference revision, used to detect edit conflicts (this is
how we use this in Wikibase). That is, an edit is done with respect to a
specific revision, and that revision is passed back to WikiPage when saving, so
a check for edit conflicts can be done as close to the actual edit as possible
(ideally, in the same DB transaction). Compare bug 56849 [2].

2) The edit's physical parent: that would be the same as (1), unless there is
a conflict that was detected early and automatic resolved by rebasing the edit.
E.g. if an edit is performed based on revision 11, but revision 12 was added
since, and the edit was successfully rebased, the parent would be 12, not 11.
This is what WikiPage::doEditContent() calls $oldid, and what gets saved in
rev_parent_id. Since WikiPage::doEditContent() makes the distinction between
$oldid and $baseRevId, this is probably not what  $baseRevId was intended to be.

3) It could be the logical parent: this would be identical to (2), except for
a rollback: if I revert revision 15 and 14 back to revision 13, the new
revision's logical parent would be rev 13's parent. The idea is that you are
restoring rev 13 as it was, with the same parent rev 13 had. Something like this
seems to be the intention of what commitRollback() currently does, but the way
it is now, the new revision would have rev 13 as its logical parent (which, for
a rollback, would have identical content).

So at present, what commitRollback currently does is none of the above, and I
can't see how what it does makes sense.

I suggest we fix it, define $baseRevId to mean what I explained under (1), and
implement a late conflict check right in the DB transaction that updates the
revision (or page) table. This might confuse some extensions though, we should
double check AbuseFilter, if nothing else.

Is that a good approach? Please let me know.

-- daniel

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=65831
[2] https://bugzilla.wikimedia.org/show_bug.cgi?id=56849

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unclear Meaning of $baseRevId in WikiPage::doEditContent

2014-05-28 Thread Brad Jorsch (Anomie)
On Wed, May 28, 2014 at 7:25 AM, Daniel Kinzler dan...@brightbyte.dewrote:

 If you happen to know when and why $baseRevId
 was introduced, please enlighten me.


When is easy enough:
http://mediawiki.org/wiki/Special:Code/MediaWiki/34987

If I had to guess at the why, I'd guess it was probably for FlaggedRevs
and auto-reviewing.


-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l