[google-appengine] Re: Serious problem: Rollback of data on HRD

Tom Phillips Sat, 20 Aug 2011 10:38:17 -0700

X and Y used only get-by-key, and the read policy was unset in
jdoconfig.xml, so they used the default (strong). These transactions
were idempotent.
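
To make the idempotency point concrete: a transaction that sets an
absolute value can run twice without harm, while one that applies a
delta cannot. A minimal sketch in plain Python, with a dict standing in
for the entity (set_height and grow_height are hypothetical names, not
App Engine APIs):

```python
# A plain dict stands in for the datastore entity; nothing here is an
# App Engine API.
def set_height(entity, value):
    """Idempotent: running it twice leaves the same state as running it once."""
    entity["height"] = value


def grow_height(entity, delta):
    """Not idempotent: a silent second run applies the delta again."""
    entity["height"] += delta


bob = {"height": 70}
set_height(bob, 75)
set_height(bob, 75)   # simulated duplicate run of the same transaction
absolute_result = bob["height"]

bob = {"height": 70}
grow_height(bob, 5)
grow_height(bob, 5)   # simulated duplicate run
delta_result = bob["height"]

print(absolute_result, delta_result)
```

Only the delta version ends up with a different value when the
transaction body is executed a second time.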


BUT... I scoured the code again and, sure enough, there is a third entry
point I missed in my HRD prep that reads the entity state via... you
guessed it... a query across entity groups.

So there is a third transaction Z that can collide with Y, and it's
the one writing the stale state it sees from the query (the state as of
X is what it overwrote with).

I thought I had covered all of my HR migration tweaks - but missed
this one entry point at least. Easily fixed now.
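
For anyone hitting the same thing, here is a rough sketch of the failure
and of one common way to avoid it: use the query only to find keys, then
re-read each entity by key inside the transaction before writing. It is
plain Python with an in-memory stand-in, not the App Engine API.
FakeStore, apply_index, run_scenario and the "visits" property are all
made up for illustration, and this is not necessarily the exact fix
applied here.

```python
import copy


class FakeStore:
    """In-memory stand-in for the datastore (hypothetical, not a real API)."""

    def __init__(self):
        self.entities = {}   # strongly consistent view, read by key
        self.index = {}      # query view, which may lag behind

    def put(self, key, entity):
        self.entities[key] = copy.deepcopy(entity)

    def get(self, key):
        return copy.deepcopy(self.entities[key])

    def apply_index(self):
        """Simulate the query view catching up with committed writes."""
        self.index = copy.deepcopy(self.entities)

    def query(self):
        """Eventually consistent read: returns whatever the index last saw."""
        return copy.deepcopy(self.index)


def run_scenario(reread_by_key):
    store = FakeStore()
    store.put("bob", {"height": 70, "visits": 0})
    store.put("bob", {"height": 75, "visits": 0})   # transaction X
    store.apply_index()                             # query view now sees X
    store.put("bob", {"height": 80, "visits": 0})   # transaction Y; index lags

    # Transaction Z: finds bob via a query, bumps an unrelated property,
    # and writes the whole entity back.
    for key, stale in store.query().items():
        entity = store.get(key) if reread_by_key else stale
        entity["visits"] += 1
        store.put(key, entity)
    return store.get("bob")


broken = run_scenario(reread_by_key=False)
fixed = run_scenario(reread_by_key=True)
print(broken)
print(fixed)
```

In the first run Z writes back the height it saw in the lagging query
result, which undoes Y. In the second run the keyed read picks up Y's
value before Z writes.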

Thanks, (and thanks for zigzag merge join BTW, it rocks)
Tom

On Aug 19, 9:34 pm, Alfred Fuller <[email protected]>
wrote:
> Are your transactions idempotent? It is possible that the transaction is
> being run (and succeeding) twice in this case. What other request is
> colliding with the first? You are not using any non-ancestor queries or
> setting read_policy=EVENTUAL on any reads, correct?
>
> On Fri, Aug 19, 2011 at 12:42 PM, Tom Phillips <[email protected]> wrote:
> > I'm seeing the same since moving to HR last week. It happens rarely,
> > and the only clue is a ConcurrentModification in the logs (java in my
> > case).
>
> > Pure speculation, but it looks to me like some sort of background
> > transaction retry might overwrite the entity with stale data, rather
> > than a rollback.
>
> > Scenario for me is like:
> > pre) Entity bob has property height=70
> > 1) thread 1, transaction X, height=75->commit() [appears to succeed]
> > 2) Meanwhile (within a second or so) thread 2, transaction Y,
> > height=80->commit() [ConcurrentModificationException] -> I pause 500ms
> > and retry -> commit() [appears to succeed this time]
> > 3) For a while (up to a minute or so, but possibly much longer) all
> > get-by-key on bob show height==80 (ok)
> > 4) Another while later all get-by-key on bob suddenly show height==75,
> > as per transaction X (not good!)
>
> > My speculation is that the ConcurrentModification could sometimes
> > indicate there was disruption of BOTH transaction X and Y, even though
> > reported for Y.  Perhaps X had gotten past commit() call but hadn't
> > yet reached milestone A of
> > http://code.google.com/appengine/articles/transaction_isolation.html,
> > and was also (temporarily) aborted due to the contention.
>
> > Then some sort of background retry on X sometimes (rarely) re-inserts
> > it into the transaction queue BEHIND my explicit retry on Y, and
> > eventually overwrites with the whole entity state from X in 1)
>
> > And it appears that sometimes the background retry of X may not even
> > happen till a good while later.
>
> > Any chance something like this is happening?
>
> > /Tom
>
> > On Aug 16, 10:11 pm, Greg <[email protected]> wrote:
> > > Please check your logs for a warning "Transaction collision.
> > > Retrying...".
>
> > > Something very similar is happening on my app, where DB put()s
> > > silently fail (equivalent to the entity being rolled back) very
> > > occasionally. This has only started happening after moving to HR.
>
> > > In my app, I get this warning very consistently (every time) at
> > > exactly the time the entity is supposed to be stored. I would be very
> > > interested to hear if you find this warning too. If so, I think it
> > > points to a bug in the transaction collision handler in put(). Please
> > > let me know!
>
> > > See my earlier post here:
> > > http://groups.google.com/group/google-appengine/browse_thread/thread/...
>
> > > Cheers
> > > Greg.
>
> > > On Aug 14, 10:21 pm, "Raymond C." <[email protected]> wrote:
>
> > > > I recently ran into a problem after migrating to HRD:
>
> > > > My application is a social online game that I migrated from M/S to
> > > > the HR Datastore around 3 weeks ago. For the past 2 weeks I have
> > > > been receiving reports from players that their game progress is
> > > > suddenly rolled back while playing, with the progress made in the
> > > > previous few days missing. I have verified against data on other
> > > > entities (in different entity groups) that the reports are
> > > > legitimate: at least several days of progress were rolled back (all
> > > > updates to the entities in the last few days are missing).
>
> > > > Player data in the game is retrieved by id
> > > > ( Player.get_by_id(player_id) ), and because the gap is so large
> > > > (days) I believe it is not a problem in my code (nowhere in my code
> > > > is player data cached).
>
> > > > It had never happened before in nearly 1 year, so I am guessing it
> > > > is related to HRD. I remember there was a thread here before which
> > > > reported data being rolled back on HRD, but I cannot find it anymore.
>
> > > > As you know, with the AppEngine datastore's distributed nature it is
> > > > hard to monitor this kind of problem and confirm that it exists. I
> > > > would like to ask if anyone has run into this problem as well, or
> > > > suspects having had it with an HRD application?
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Google App Engine" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine?hl=en.
>
>

