Hi Bob,

Some thoughts inline.

On Fri, May 13, 2011 at 05:40, Bob <[email protected]> wrote:
> Hello,
>
> I've started writing a GAE app that will have a 3D lattice object model
> with a couple of thousand nodes per user. Each request will typically need
> to read and analyse 1-2% of the records to walk the model in order to find
> the target node, plus half a dozen writes to update it and its neighbours.
>
> With the new pricing I'm now thinking of storing the model as a single
> large serialised object, with one read to get the whole thing and one
> write to put it back after manipulation in memory.
How big is this object? If it is over 1MB then you're going to need to shard
it across multiple datastore entities. If you know the 'addresses' of the
nodes you'll be interested in, then using those 'addresses' as the key_names
might be another option to consider. Otherwise, if the object will be under
1MB and the cost of serializing / deserializing it is acceptable, storing it
in a single entity might not be bad. I suggest you think about the frequency
of each operation (i.e. writes vs reads) and how you need to access and
present the data, and optimize for the common cases. (A couple of rough,
untested sketches of what I mean are at the bottom of this message.)

> It seems wrong to be reading and writing so much *unchanged* data but we're
> being charged per datastore API operation, not by actual volume. The CPU
> cost of serialising and de-serialising is, apparently, irrelevant.

Well, it takes time to serialize / deserialize the data, and while that is
happening the instance is occupied. In other words, you should figure out
which option is the fastest for the 'common' operation(s), because that will
make your app as a whole perform better and be more cost-effective.

> I must surely be missing something?
>
> (It does mean that if I decided to migrate away from GAE I can replace all
> the clever storage with a flat file!)
>
> If we're charged per record on bulk fetches (still to be decided?) then
> this strategy might be good for iterating over any list. Storing a
> serialised copy of the whole collection alongside the individual records
> would add one (large) read and one (large) write per record update, but
> would allow iteration for one read rather than "N".

Yeah, you might not even need to read all of the data on updates. If you
already have all of the data you'll need, and you know the key(s), you can
just overwrite any existing data. Obviously this depends on your use-case,
but it is something to be aware of. I'd suggest you do some simple
experiments to figure out which method will be best for your use-case.

Robert

> Admittedly I'm new to GAE and haven't really studied this through yet, but
> Google's policy of charging a "real" monetary value for "real" resources
> (CPU time, disk space, network bandwidth) via an "abstract" resource
> concept like datastore API calls seems like asking for trouble. Developers
> will always try to optimise their costs, so Google needs to set charges
> based on what _they_ want optimised. I can't see how minimising the number
> of individual API calls can possibly be a real target for Google.
>
> I think I'll investigate other hosting possibilities before committing a
> lot of time to learning this one. I can't guess at how much it's going to
> cost and I haven't the heart to fight.
>
> Bob
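P.S. Here is a rough, untested sketch of the key_name idea, using the Python
datastore API. The LatticeNode kind, the "x:y:z" address scheme and the
state field are just placeholders for whatever your lattice actually uses:

from google.appengine.ext import db

class LatticeNode(db.Model):
    # pickled / packed per-node state, stored as raw bytes
    state = db.BlobProperty()

def node_key_name(x, y, z):
    # deterministic 'address' of a node, e.g. "3:7:12"
    return '%d:%d:%d' % (x, y, z)

def get_node(x, y, z):
    # one small get by key_name, no query needed
    return LatticeNode.get_by_key_name(node_key_name(x, y, z))

def overwrite_node(x, y, z, state_bytes):
    # "blind" write: if you already know the new state and the key,
    # you can put() without reading the existing entity first
    LatticeNode(key_name=node_key_name(x, y, z),
                state=db.Blob(state_bytes)).put()

def put_nodes(nodes):
    # batch-put the half-dozen changed neighbours in one datastore call
    db.put(nodes)

The main point is that get_by_key_name plus a batch db.put avoids queries
entirely, so each request costs a handful of small gets and one batch put.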
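P.P.S. And a similarly rough sketch of the single-serialised-object approach,
with a naive shard scheme in case the pickle grows past the ~1MB entity
limit. The ModelBlob kind, the 900KB chunk size and the 16-shard cap are
arbitrary choices; a real version should also record the shard count and
delete stale shards when the model shrinks:

import pickle

from google.appengine.ext import db

CHUNK = 900 * 1024  # stay safely under the 1MB entity size limit

class ModelBlob(db.Model):
    data = db.BlobProperty()

def save_model(user_id, model):
    # serialize the whole lattice and split it into <1MB chunks
    blob = pickle.dumps(model, pickle.HIGHEST_PROTOCOL)
    chunks = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)]
    entities = [ModelBlob(key_name='%s:%d' % (user_id, n), data=db.Blob(chunk))
                for n, chunk in enumerate(chunks)]
    db.put(entities)  # one batch write (a single entity in the common case)

def load_model(user_id, max_shards=16):
    # one batch read; shards that don't exist come back as None
    keys = [db.Key.from_path('ModelBlob', '%s:%d' % (user_id, n))
            for n in range(max_shards)]
    entities = db.get(keys)
    blob = ''.join(e.data for e in entities if e is not None)
    return pickle.loads(blob) if blob else None

If the model stays under 1MB this is exactly the one-read / one-write pattern
you described; the sharding only matters once it outgrows a single entity.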
