Hari,

It seems you are already thinking along the "correct" lines with your final
suggestion.

There is no requirement that something that is "deleted" must be removed
from a model immediately.

For example, when you delete an entity from the datastore, it isn't deleted
right away. It is marked as "deleted", and occasionally the datastore
tablets are compacted and all entities marked "deleted" get removed.
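
To make the mark-then-compact idea concrete, here is a minimal in-memory
sketch. It is not how the datastore is actually implemented; the
TombstoneStore class and its method names are made up for illustration.

```python
class TombstoneStore:
    """Toy key-value store: delete() only marks a row, compact() removes it."""

    def __init__(self):
        # key -> (value, deleted_flag); hypothetical layout for illustration
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = (value, False)

    def delete(self, key):
        # Mark only. The row stays in storage until the next compaction.
        if key in self._rows:
            value, _ = self._rows[key]
            self._rows[key] = (value, True)

    def get(self, key):
        # Readers treat a marked row as if it were gone.
        row = self._rows.get(key)
        if row is None or row[1]:
            return None
        return row[0]

    def compact(self):
        # Physically drop every marked row and report how many were removed.
        dead = [k for k, (_, deleted) in self._rows.items() if deleted]
        for k in dead:
            del self._rows[k]
        return len(dead)

    def stored_rows(self):
        return len(self._rows)
```

Readers never see a marked row, but the storage cost is only paid back when
compact() runs.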

What seems to be inelegant is really used all over the place in computer
science. When something gets deleted, either a "delete flag" is turned on,
or a pointer to that thing gets set to null.

See here for Nick Johnson's description of Log Structured storage:

http://blog.notdot.net/2009/12/Damn-Cool-Algorithms-Log-structured-storage

For more wonky underneaths of distributed filesystems, see Matt Dillon's
description of Hammer ("Data is not (never!) immediately overwritten so no
UNDO is needed for file data."):

http://www.dragonflybsd.org/hammer/hammer.pdf


Also, there is an added benefit to not immediately deleting an entity. What
if someone is on a roll, deleting questions left and right, and then they
realize that they deleted five questions that shouldn't have been deleted?
If you've been furiously enforcing all deletes with transactions, there is
nothing they can do. If you are simply marking items as deleted, you can
provide them with an un-delete option.
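
A rough sketch of what that un-delete option could look like, in plain
Python rather than any particular datastore API. The Question class and the
helper names are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Question:
    # Hypothetical entity; only the is_deleted flag matters here.
    key: str
    text: str
    is_deleted: bool = False


def soft_delete(question):
    # Flip the flag instead of removing the entity.
    question.is_deleted = True


def undelete(question):
    # Undo is just flipping the flag back, as long as nothing has purged it.
    question.is_deleted = False


def visible(questions):
    # What the user sees: everything that is not flagged.
    return [q for q in questions if not q.is_deleted]
```

Each call touches a single entity, so nothing here needs a transaction
spanning several entities.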

So, I may start to sound like a broken record (since I feel like I say this
in every other post), but do not use transactions and entity groups unless
it is absolutely necessary (you have gone mad and are creating a banking
subsystem on App Engine, for example).

Most of the time, people just get hung up thinking that a delete or some
other event should happen immediately at the moment it was conceived (I
blame Twitter and txting and chat for this), and that if it doesn't, there
is something wrong with the design.

So, long story short, consider doing something like the "IS_DELETED" flag.
Or, if more than one Exam can share the same Question, just have Exams
point to Pages which point to Questions, mark IS_DELETED only when an
entity is no longer pointed to by anything, and have your nightly delete
process verify that IS_DELETED is correct by checking whether an entity
still belongs to something else before deleting it (that might be a little
much).
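
The nightly check could be sketched roughly like this. It uses an in-memory
dict in place of real datastore queries, and the Entity class, the children
field and nightly_sweep are hypothetical names, so treat it as an outline
under those assumptions rather than working App Engine code.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical stand-in for Exam / Page / Question.
    key: str
    children: list = field(default_factory=list)  # keys this entity points to
    is_deleted: bool = False


def nightly_sweep(store):
    """Purge flagged entities unless a live entity still points to them.

    store is a dict of key -> Entity. Returns (purged_keys, kept_keys).
    """
    # Collect every key that some non-deleted entity still references.
    referenced = set()
    for entity in store.values():
        if not entity.is_deleted:
            referenced.update(entity.children)

    purged, kept = [], []
    for key in list(store):
        if not store[key].is_deleted:
            continue
        if key in referenced:
            kept.append(key)  # flag looks wrong: something still uses it
        else:
            del store[key]
            purged.append(key)
    return sorted(purged), sorted(kept)
```

For example, if exam1 and exam2 both point to page1 and only exam1 is
deleted, a sweep removes exam1 and leaves page1 alone even if page1 was
flagged by mistake.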

On Fri, Dec 3, 2010 at 5:53 AM, har_shan <[email protected]> wrote:

> Hello,
> Am learning AppEngine and have started developing new app and want to
> clarify something.
>
> I understood that
> a. To achieve atomicity of update/delete of several entities we need
> to do it in a transaction and hence all should fall under same entity
> group
> b. Having big entity groups is not scalable as it causes contention.
> (Q1: Correct?)
>
> So here is an entity model of an online examination system for sake of
> discussion:
>
> Entities:
> Subject
> Exam
> Page
> Question
> Answer
>
> As you can see from the top, each entity has a one-to-many relationship
> with the one immediately below it, i.e. 1 Subject can have many exams,
> 1 exam -> many pages, 1 page can have many questions...
>
> As you can see, i would like to establish cascading update/delete
> relationship among these entities (the JPA DataNucleus App Engine
> implementation supports this (under the hood) by putting all entities
> under same entity group (Q2: Correct?) though AppEngine natively
> doesn't support this constraint) so naturally all would go under same
> entity group so that
> a. i can delete a Page (if my user does) in a transaction and be sure
> that all pages, questions, answers are all deleted
> b. or i can delete a subject altogether in a transaction and clear all
> stuff underneath it
>
> So when i extend this to my real app, i see that all of my (or at least
> most) entities are interrelated and fit into same entity group to be
> able to transact them altogether - making my model inefficient.
>
> Q3: Please advise on how to rethink this design (and the best
> practice) and still achieve what i need. Ask me more if needed.
> Would be great if you could point me to relevant examples.
>
> p.s. 1 solution i could think of is having each entity in a separate
> entity group and a separate persistent field in each entity (say Exam)
> named 'IS_DELETED' defaulting to FALSE (value 0). Once a user deletes
> an Exam, i will set the field to 1 (TRUE) and that i don't load them
> anymore. I shall write a Cron job which clears all related entities in
> separate transactions in the backend which will retry upon
> failures if needed. But am sure this is not elegant and not sure
> whether this will work out..
>
> Thanks all for your responses,
> Hari
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>

