[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-12-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

Anja Jentzsch anja.jentz...@wikimedia.de changed:

           What    |Removed                     |Added

            Summary|Secondary and primary       |Secondary and primary
                   |storage go out of sync on   |storage go out of sync on
                   |constraint violation in     |constraint violation in
                   |secondary storage (8)       |secondary storage
         Whiteboard|                            |storypoints: 8

-- 
You are receiving this mail because:
You are watching all bug changes.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage (8)

2012-11-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

Anja Jentzsch anja.jentz...@wikimedia.de changed:

   What|Removed |Added

 Status|RESOLVED|VERIFIED

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage (8)

2012-11-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #10 from Anja Jentzsch anja.jentz...@wikimedia.de 2012-11-29 
12:37:07 UTC ---
Verified in Wikidata demo time for sprint 8



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage (8)

2012-08-24 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

Bug 36431 depends on bug 36519, which changed state.

Bug 36519 Summary: Validate data structure
https://bugzilla.wikimedia.org/show_bug.cgi?id=36519

   What|Old Value   |New Value

 Status|NEW |RESOLVED
 Resolution||FIXED



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage (8)

2012-06-28 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

denny vrandecic denny.vrande...@wikimedia.de changed:

           What    |Removed                     |Added

            Summary|Secondary and primary       |Secondary and primary
                   |storage go out of sync on   |storage go out of sync on
                   |constraint violation in     |constraint violation in
                   |secondary storage           |secondary storage (8)



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-06-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

Daniel Kinzler daniel.kinz...@wikimedia.de changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #9 from Daniel Kinzler daniel.kinz...@wikimedia.de 2012-06-27 
20:33:37 UTC ---
Fixed (for sitelinks) in https://gerrit.wikimedia.org/r/13119



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-06-21 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

denny vrandecic denny.vrande...@wikimedia.de changed:

   What|Removed |Added

 Status|NEW |ASSIGNED



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-06-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

denny vrandecic denny.vrande...@wikimedia.de changed:

   What|Removed |Added

   Priority|High|Highest



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-06-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #8 from jeb...@gmail.com 2012-06-20 22:44:52 UTC ---
There was an erroneous additional save that led to the race condition. Seems
like it is fixed.



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-06-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #7 from jeb...@gmail.com 2012-06-12 20:19:52 UTC ---
Constraint violation in secondary storage, is that solved now? That is, is it
possible to close this bug or should it be merged with bug 36519 - Validate
data structure?



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-06-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

jeb...@gmail.com changed:

           What    |Removed                     |Added

           See Also|                            |https://bugzilla.wikimedia.org/show_bug.cgi?id=36519



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

denny vrandecic denny.vrande...@wikimedia.de changed:

   What|Removed |Added

 Depends on||36519



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #4 from Daniel Kinzler daniel.kinz...@wikimedia.de 2012-05-07 
12:50:23 UTC ---
I agree that we need a Content::isValid() method (or maybe better: getErrors(),
returning a Status object or false). Checking this before save is something
that would need to be done in core, I guess. (By the way - being able to cache
the result of this check would be a reason to have non-mutable Content objects.
Caching would be useful since the check may be called multiple times during a
save, on different levels of processing)

I think doing this kind of check before saving is the first thing we need to
do, no matter how we try to enforce database consistency. By itself, it
introduces a race condition of course. But one that would rarely be hit. While
we should try to avoid inconsistent database states, it would be fine to just
show a database error message or some such in case we hit a conflict. 
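
A minimal sketch of such a pre-save check, written in Python for illustration rather than as MediaWiki core code. The names `Status`, `ItemContent`, `get_errors` and `save` are assumptions made for this example; only the idea of a validity check that returns a status object comes from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Status:
    """Hypothetical result object: a list of error messages, empty when OK."""
    errors: list = field(default_factory=list)

    def is_ok(self):
        return not self.errors


class ItemContent:
    """Hypothetical stand-in for a Content object holding site links."""

    def __init__(self, sitelinks):
        self.sitelinks = dict(sitelinks)  # site id -> page title

    def get_errors(self, links_in_use):
        """Report site links that another item already uses."""
        status = Status()
        for site, title in self.sitelinks.items():
            if (site, title) in links_in_use:
                status.errors.append(f"duplicate site link: {site}:{title}")
        return status


def save(content, links_in_use, storage):
    """Run the check first; write nothing if it fails."""
    status = content.get_errors(links_in_use)
    if status.is_ok():
        # A conflicting write could still land between the check and this
        # line; that is the rarely hit race condition mentioned above.
        storage.append(content)
    return status
```

Returning the status instead of raising lets the caller decide whether to abort the save or show the collected errors, for example in a preview.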

As to keeping the database consistent, here are the key points:

* checking validity, saving the primary blob and storing secondary data must
all happen in the same transaction.

* if everything should be consistent, there's no distinction (in this context)
between primary and secondary data. Indeed, it would be handy to perform the
primary update the same way the secondary updates are performed: using a list
of objects representing update jobs.

* we should consider the case of having secondary data on several different
database systems.

* full ACID conformance always bears the risk of operations blocking for a long
time (because of retries in case of communication failure in the commit phase).


Eventually, I think it should work this way (in core):

* The initial validity check, writing the primary blob and updating the
secondary data stores should all be modeled as update objects.

* When a page is to be saved, a list of update objects is constructed. Each
update object can open, commit, and abort a transaction, and perform the actual
update.

* The update is then performed in 3 stages: open all transactions, do all
updates, then commit all transactions.
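
The three stages can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the design that was implemented: the `Update` class, its method names and the `run_updates` helper are hypothetical, and the "transactions" only append to a log.

```python
class Update:
    """Hypothetical update object: owns one transaction and one change."""

    def __init__(self, name, log, fail=False):
        self.name, self.log, self.fail = name, log, fail

    def begin(self):
        self.log.append(f"begin {self.name}")

    def apply(self):
        if self.fail:
            raise ValueError(f"{self.name}: constraint violation")
        self.log.append(f"apply {self.name}")

    def commit(self):
        self.log.append(f"commit {self.name}")

    def abort(self):
        self.log.append(f"abort {self.name}")


def run_updates(updates):
    """Open all transactions, apply all updates, then commit all of them."""
    opened = []
    try:
        for u in updates:
            u.begin()
            opened.append(u)
        for u in updates:
            u.apply()
    except Exception:
        # Any failure before the commit stage aborts every open transaction.
        for u in reversed(opened):
            u.abort()
        return False
    # Not atomic: a failure between two commits still leaves stores out of sync.
    for u in updates:
        u.commit()
    return True
```

Because nothing is committed until every update has been applied, a constraint violation in one store aborts the writes to all of them. The commit loop itself remains the weak point.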

It would be very hard to make the commit phase truly atomic. I believe we will
have to live with the risk of inconsistencies introduced by connections failing
in the middle of a transaction.


Anyway, for now, I think it's sufficient to implement a pre-save check without
any transactional logic. We should keep the transaction stuff for later. Maybe
in a separate ticket.



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #5 from Jeroen De Dauw jeroen_ded...@yahoo.com 2012-05-07 
13:07:30 UTC ---
Yeah, +1, that's basically what I said, but broken down better :)

> using a list of objects representing update jobs

As long as the jobs get run immediately - it would be very bad to have them end
up in the job queue and not see the changes made right after save.

> By the way - being able to cache the result of this check would be a reason
> to have non-mutable Content objects.

Disagree. You can easily cache the validity and set it to unknown when a change
that can impact it is made to the object. This would be even more effective
since you can ignore changes that don't impact it (you'd need to pass $isValid
to the constructor in the immutable case, which seems evil), and you don't have
all the overhead of constantly creating new instances. ... And it would not
work in the first place since you need to check right before doing the save,
unless you don't mind making the transaction significantly less atomic :)
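
A minimal sketch of the mutable variant described here, in Python for illustration. The class `MutableItem`, its setters and the validity rule (no blank page titles) are assumptions chosen for the example; a real check would also read the database.

```python
class MutableItem:
    """Hypothetical mutable content object with a cached validity flag."""

    def __init__(self, label, sitelinks):
        self.label = label
        self._sitelinks = dict(sitelinks)
        self._valid = None  # None means "unknown"
        self.checks_run = 0  # counts how often the real check executes

    def set_label(self, label):
        # Assumed not to affect validity, so the cached result is kept.
        self.label = label

    def set_sitelink(self, site, title):
        self._sitelinks[site] = title
        self._valid = None  # this change can affect validity

    def is_valid(self):
        if self._valid is None:
            self.checks_run += 1
            self._valid = all(title.strip() for title in self._sitelinks.values())
        return self._valid
```

The cost of this approach is that every mutator has to decide correctly whether it resets the cached flag.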



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #6 from Daniel Kinzler daniel.kinz...@wikimedia.de 2012-05-07 
13:18:35 UTC ---
(In reply to comment #5)
> Yeah, +1, that's basically what I said, but broken down better :)
>
> > using a list of objects representing update jobs
>
> As long as the jobs get run immediately - would be very bad to have them end
> up in the job queue and not seeing the changes made right after save.

Yes, of course. Such job objects may also be used for deferred updates, but it
must be very clear which jobs have to run when.

> > By the way - being able to cache the result of this check would be a reason
> > to have non-mutable Content objects.
>
> Disagree. You can easily cache the validity and set it to unknown when a
> change that can impact it is made to the object. This would be even more
> effective since you can ignore changes that don't impact it

Yes, but it's much more error prone and harder to maintain. But I don't insist
on immutable objects :)

> (you'd need to pass $isValid
> to the constructor in the immutable case, which seems evil),

no - immutable only says that isValid() (and all other getters) will always
return the same value. Which can be cached internally, but may be initialized
lazily.

That's of course problematic in cases like this, when isValid() depends on
external state (the database).
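
A minimal sketch of the immutable variant with a lazily initialized cache, in Python for illustration. `FrozenItem`, `with_sitelink` and the validity rule are hypothetical; the point is only that the cached value is computed on first use and never passed to the constructor.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class FrozenItem:
    """Hypothetical immutable content object; fields cannot be reassigned."""
    sitelinks: tuple  # ((site id, page title), ...)

    @cached_property
    def is_valid(self):
        # Computed on first access, then stored on the instance.
        return all(title.strip() for _, title in self.sitelinks)

    def with_sitelink(self, site, title):
        """Return a new object instead of mutating this one."""
        return FrozenItem(self.sitelinks + ((site, title),))
```

This sketch only inspects the object's own fields. If the check also depended on the database, the cached answer could go stale, which is the problem noted above.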

> and you don't have
> all the overhead of constantly creating new instances.

We could have a mutator object, analogous to a StringBuffer in Java. But
whatever.

> ... And it would not
> work in the first place since you need to check right before doing the save,
> unless you don't mind making the transaction significantly less atomic :)

Well, if you want to avoid a race condition, yes. Then you can't cache the
value anyway. But as I said, I think the race condition is acceptable, as long
as it doesn't lead to permanent data corruption or loss.



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

Jeroen De Dauw jeroen_ded...@yahoo.com changed:

   What|Removed |Added

 CC||jeroen_ded...@yahoo.com

--- Comment #2 from Jeroen De Dauw jeroen_ded...@yahoo.com 2012-05-06 
15:04:33 UTC ---
Two different issues here:

* We need to not save invalid data to the page table if it was rejected from
the secondary tables. Might be better to have a Content::isValid() method that
is called before save (but can also be used at other places, such as preview or
whatever) and does all the needed checks (in case of WikibaseItem this includes
some reads to the db to see if no duplicate stuff is entered). This method can
then remove the invalid data and continue the save or abort it, and also return
the encountered issues to the caller. One thing to keep in mind is that any
changes to this kind of info should always be written to both the page table
and the secondary tables, preferably in the same transaction.

* We need some way of dealing with race conditions. If a save happens and we do
a read, and it turns out all is well, we must still be sure that no write that
creates a previously absent conflict happens before the save completes.
Not sure how to do this in MediaWiki. Probably needs some investigation.

Let's not put all the storage related stuff into a single bug, so I suggest
splitting up the second issue into its own bug, so this one can be closed once
the first one has been addressed.

> check one database and then continue, leaving the remaining databases

What databases are you talking about? AFAIK we will only have a single master
db for the Wikidata (repo) wiki. This db will hold both the page table and the
secondary tables needed for duplication checks, so we won't need to care about
any additional stuff we set up for answering queries or other stuff I think.



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

--- Comment #3 from jeb...@gmail.com 2012-05-06 15:33:17 UTC ---
If secondary storage is within the same database in the repo there should be no
problem to use transactions.
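
A minimal sketch of that single-database case, using Python's `sqlite3` module for illustration. The `page` and `sitelink` tables and the `save_item` helper are a hypothetical schema invented for this example, not the Wikibase one.

```python
import sqlite3


def make_db():
    # Hypothetical schema: "page" is primary storage, "sitelink" is secondary.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, blob TEXT)")
    conn.execute(
        "CREATE TABLE sitelink (site TEXT, title TEXT, item_id INTEGER, "
        "UNIQUE (site, title))"
    )
    return conn


def save_item(conn, item_id, blob, sitelinks):
    """Write the primary blob and the secondary rows in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT OR REPLACE INTO page (id, blob) VALUES (?, ?)",
                (item_id, blob),
            )
            conn.executemany(
                "INSERT INTO sitelink (site, title, item_id) VALUES (?, ?, ?)",
                [(site, title, item_id) for site, title in sitelinks],
            )
        return True
    except sqlite3.IntegrityError:
        # The constraint violation in secondary storage also undoes the
        # primary write, so the two cannot go out of sync.
        return False
```

With both writes inside one transaction, the unique constraint on the secondary table is enough to reject a duplicate site link without leaving a stray row in the primary table.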



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

jeb...@gmail.com changed:

   What|Removed |Added

 CC||jeb...@gmail.com

--- Comment #1 from jeb...@gmail.com 2012-05-06 02:45:39 UTC ---
Unless the secondary storage is under direct control by the primary storage it
would be hard to avoid any race conflict or getting a large delay. A working
solution could be to check one database and then continue, leaving the
remaining databases to use rollback for sorting out remaining conflicts. Those
will normally be edit conflicts within a short time frame.



[Bug 36431] Secondary and primary storage go out of sync on constraint violation in secondary storage

2012-05-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=36431

Mark A. Hershberger m...@everybody.org changed:

   What|Removed |Added

   Priority|Unprioritized   |High
 CC||m...@everybody.org
