Hari,

It seems you are already thinking along the right lines with your final suggestion.
There is no requirement that something that is "deleted" must be removed from a model immediately. For example, when you delete an entity from the datastore, it isn't physically deleted right away. It is marked as "deleted", and occasionally the datastore tablets are compacted and all entities marked "deleted" get removed.

What seems inelegant is really used all over the place in computer science. When something gets deleted, either a "delete flag" is turned on, or a pointer to that thing gets set to null, or something along those lines.

See here for Nick Johnson's description of log-structured storage:
http://blog.notdot.net/2009/12/Damn-Cool-Algorithms-Log-structured-storage

For more of the wonky internals of distributed filesystems, see Matt Dillon's description of Hammer ("Data is not (never!) immediately overwritten so no UNDO is needed for file data."):
http://www.dragonflybsd.org/hammer/hammer.pdf

Also, there is an added benefit to not immediately deleting an entity. What if someone is on a roll, deleting questions left and right, and then realizes that they deleted five questions that shouldn't have been deleted? If you've been furiously enforcing all deletes with transactions, there is nothing they can do. If you are simply marking items as deleted, you can provide them with an un-delete option.

So, I may start to sound like a broken record (since I feel like I say this in every other post), but do not use transactions and entity groups unless it is absolutely necessary (you have gone mad and are creating a banking subsystem on App Engine, for example). Most of the time, people just get hung up thinking that a delete or some other event should happen immediately at the moment it was conceived (I blame Twitter and texting and chat for this), and that if it doesn't, there is something wrong with the design.

So, long story short, consider doing something like the "IS_DELETED" flag.
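
The flag-then-purge idea above can be sketched roughly as follows. This is only an illustration: `Entity`, `Store`, and all method names are hypothetical, and the store is a plain in-memory dict standing in for the datastore, not the App Engine API. It assumes each entity holds an ordinary reference to its parent rather than sharing an entity group.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entity:
    """Hypothetical stand-in for a datastore entity (Exam, Page, Question, ...)."""
    key: str
    parent_key: Optional[str] = None  # plain reference, not an entity-group parent
    is_deleted: bool = False


class Store:
    """In-memory dict standing in for the datastore (an assumption for this sketch)."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}

    def put(self, entity: Entity) -> None:
        self.entities[entity.key] = entity

    def soft_delete(self, key: str) -> None:
        # A user-facing "delete" only flips the flag on a single entity.
        self.entities[key].is_deleted = True

    def undelete(self, key: str) -> None:
        # Possible only because nothing has been physically removed yet.
        self.entities[key].is_deleted = False

    def _flagged(self, key: str) -> bool:
        # An entity counts as deleted if it, or any ancestor, carries the flag.
        current = self.entities.get(key)
        while current is not None:
            if current.is_deleted:
                return True
            if current.parent_key is None:
                return False
            current = self.entities.get(current.parent_key)
        return False

    def visible(self) -> List[str]:
        # What normal reads would load: everything not flagged.
        return sorted(k for k in self.entities if not self._flagged(k))

    def purge(self) -> List[str]:
        # The nightly job: physically remove flagged entities and their descendants.
        doomed = sorted(k for k in self.entities if self._flagged(k))
        for k in doomed:
            del self.entities[k]
        return doomed
```

Flagging a Page hides it and its Questions from `visible()` without touching the Questions themselves, and `undelete` brings them all back until `purge` has run. In a real app the purge would run from a cron job and retry on failure, one small write at a time.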
(Or, if more than one Exam can share the same Question, just have Exams point to Pages, which point to Questions. IS_DELETED is then only set once an entity is no longer pointed to by anything, and your nightly delete process verifies that IS_DELETED is correct by checking whether an entity still belongs to something else before deleting it. That might be a little much.)

On Fri, Dec 3, 2010 at 5:53 AM, har_shan <[email protected]> wrote:
> Hello,
> I am learning App Engine, have started developing a new app, and want to
> clarify something.
>
> I understood that
> a. To achieve atomicity of update/delete of several entities we need
> to do it in a transaction, and hence all should fall under the same entity
> group
> b. Having big entity groups is not scalable as it causes contention.
> (Q1: Correct?)
>
> So here is an entity model of an online examination system for the sake of
> discussion:
>
> Entities:
> Subject
> Exam
> Page
> Question
> Answer
>
> As you can see from the top, each entity has a 1-to-many relationship with the
> one immediately below it, i.e. 1 Subject can have many Exams, 1 Exam -> many
> Pages, 1 Page can have many Questions...
>
> As you can see, I would like to establish a cascading update/delete
> relationship among these entities (the JPA DataNucleus App Engine
> implementation supports this (under the hood) by putting all entities
> under the same entity group (Q2: Correct?) though App Engine natively
> doesn't support this constraint), so naturally all would go under the same
> entity group so that
> a. I can delete a Page (if my user does) in a transaction and be sure
> that all pages, questions, answers are deleted
> b. or I can delete a Subject altogether in a transaction and clear all
> stuff underneath it
>
> So when I extend this to my real app, I see that all of my (or at least
> most) entities are interrelated and fit into the same entity group to be
> able to transact them altogether - making my model inefficient.
>
> Q3: Please advise on how to rethink this design (and the best
> practice) and still achieve what I need. Ask me more if needed.
> Would be great if you could point me to relevant examples.
>
> p.s. One solution I could think of is having each entity in a separate
> entity group and a separate persistent field in each entity (say Exam)
> named 'IS_DELETED' defaulting to FALSE (value 0). Once a user deletes
> an Exam, I will set the field to 1 (TRUE) so that I don't load it
> anymore. I shall write a cron job which clears all related entities in
> separate transactions in the backend, which will retry upon
> failures if needed. But I am sure this is not elegant and am not sure
> whether this will work out.
>
> Thanks all for your responses,
> Hari
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
