Thanks for the feedback, Tim. It sounds to me like what you are looking for is MapReduce support. There's a feature request in our issue tracker for this:
http://code.google.com/p/googleappengine/issues/detail?id=112

Map/Reduce would be a great fit for our model, since the work could be transparently distributed among your application instances. App Engine definitely favors the approach you describe of breaking a big job into smaller pieces and reassembling the data, but currently it is up to the developer to build and manage this.

On Thu, Nov 12, 2009 at 8:26 AM, [email protected] <[email protected]> wrote:

> Ikai,
>
> This is not really a relational data question. It is a summary data
> question. To give a brief overview, here is the history of my approach
> to summary information over the past 20 years:
>
> 1. Calculate the summary information on the fly per user request.
>    Very database intensive and potentially slow for the user.
> 2. Create summary data tables which the application can read very
>    quickly, and use database triggers to create/update the summary
>    values. Improved user experience, but has a penalty at write time
>    and requires developers to know two tools (database triggers and
>    the application language).
> 3. Same approach as number 2, but create/update the summary values
>    in the application code. Reduces maintenance headaches by having a
>    single tool, but makes write performance a little worse because now
>    the transaction spans computers/servers. Since servers are cheap and
>    developers are not, this became the preferred approach.
> 4. Avoid the possible create/search of steps two and three and assume
>    a summary record exists at time of write. Increases performance by
>    eliminating the check for a summary record at each write. The
>    downside: you need an asynchronous process to pre-create all
>    possible summary records and prune ones which were never used after
>    a reasonable time.
>
> Depending on the requirements, I prefer the first or fourth choice
> (mostly the read-to-write ratio is what matters).
> However, it is hard to create a long-running process with the existing
> toolset and constraints provided by GAE. Because of this, I was falling
> back to the third option, which was the basis for my original question.
> (I am looking into breaking the process into many tasks of 30 seconds
> or less, but it is not looking like a practical solution yet. This is
> another reason we need support for long-running batch processes within
> GAE.)
>
> Tim
>
> On Nov 10, 5:44 pm, "Ikai L (Google)" <[email protected]> wrote:
> > Tim,
> >
> > It really depends on what you're doing. One of the challenges of
> > developing on a distributed store like the App Engine data store is
> > adjusting the way you approach persistence for objects. For instance,
> > suppose you store favorite colors per application user. The canonical
> > way of solving this problem in a relational environment is to
> > normalize the color data and create a lock around inserting each
> > individual new color. In App Engine's environment, we would likely
> > recommend that you take advantage of data store list properties as a
> > much more performant alternative to data normalization: App Engine
> > will handle all the indexing for you.
> >
> > If you are working with objects in parent/child relationships and
> > require transactional integrity, you should take a look at our
> > documentation describing Entities and Entity Groups:
> > http://code.google.com/appengine/docs/java/datastore/transactions.html.
> >
> > On Fri, Nov 6, 2009 at 12:12 PM, [email protected] <[email protected]> wrote:
> >
> > > Guys,
> > > In a normal relational database, I am used to using a combination
> > > of singletons (single application server), synchronized objects in a
> > > dedicated thread (single application server) or table locks (multiple
> > > application servers) to manage the creation of summary data records
> > > which could be created by multiple simultaneous requests.
> > > In GAE, none of the methods seem to be supported; what would be
> > > the suggested method?
> > >
> > > I am using the JPA method of accessing the data store.
> > >
> > > Thanks,
> > >
> > > Tim
> >
> > --
> > Ikai Lan
> > Developer Programs Engineer, Google App Engine
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine for Java" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine-java?hl=.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
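
The approach discussed at the top of this thread, breaking a big job into smaller pieces and reassembling the data, can be sketched in plain Java. This is only an illustration under stated assumptions, not App Engine's MapReduce support and not anyone's actual implementation. The class name ChunkedSummary, its methods, and the color data are hypothetical, and no App Engine API is used. In a real application each chunk would presumably run as its own short request or task, with the partial summaries persisted between steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: summarize a large list in small, bounded pieces.
public class ChunkedSummary {

    // Split the input into chunks of at most chunkSize items, so that
    // each piece of work stays small enough to finish quickly.
    static <T> List<List<T>> split(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    // "Map" step: build a partial summary (count per key) for one chunk.
    static Map<String, Integer> summarizeChunk(List<String> chunk) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : chunk) {
            Integer current = counts.get(key);
            counts.put(key, current == null ? 1 : current + 1);
        }
        return counts;
    }

    // "Reduce" step: add the partial summaries together into one total.
    static Map<String, Integer> merge(List<Map<String, Integer>> partials) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                Integer current = totals.get(e.getKey());
                totals.put(e.getKey(),
                        current == null ? e.getValue() : current + e.getValue());
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data: one favorite color per user.
        List<String> colors =
                Arrays.asList("red", "blue", "red", "green", "blue", "red");

        List<Map<String, Integer>> partials = new ArrayList<>();
        for (List<String> chunk : split(colors, 2)) {
            partials.add(summarizeChunk(chunk));
        }
        System.out.println(merge(partials));
    }
}
```

Because the merge step only adds counts together, the chunks can be processed in any order, which is what would let the pieces be spread across separate requests.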
