Hi all,

If you don't want to read this whole post, skip directly to 'QUESTION 1' and 'QUESTION 2' below.

A bit of background first: GeoModel is a library I wrote that adds very basic geospatial indexing and querying functionality to App Engine apps. It is similar in approach to geohashing. The equivalent location hash in GeoModel is called a 'geocell.' Source here: http://code.google.com/p/geomodel/

Currently, the GeoModel library adds 13 properties (location_geocell_n, n=1..13) to each location-aware entity. For example, an entity can have property values such as location_geocell_1='a', location_geocell_2='a3', location_geocell_3='a3f', etc. This is required in order to not use up an inequality filter during spatial queries.

The problem with the 13-properties approach is that, for any geo query an app would like to run, 13 new indexes must be defined and built. This is definitely a maintenance hassle, as I've just painfully realized while rewriting the demo app for the project. However, this also leads to my first question:

**QUESTION 1.** Is there any significant storage overhead per index? I.e., if I have 13 indexes with n entities in each, versus 1 index with 13*n entities in it, is the former much worse than the latter in terms of storage?

It seems like the answer to (1) is no, per http://code.google.com/appengine/articles/index_building.html, but I'd just like to see if anyone has had a different experience.

Now, I'm considering adjusting the GeoModel library so that instead of 13 string properties, there'd only be one StringListProperty called location_geocells, and it would contain values like ['a', 'a3', 'a3f']. This results in a much cleaner index.yaml. But, I do question the performance implications:

**QUESTION 2.**
If I switch from 13 string properties to 1 StringListProperty, will query performance be adversely affected? My current query looks like:

    .filter('location_geocell_%d =' % len(search_cell), search_cell)

and the new query would look like:

    .filter('location_geocells =', search_cell)

Note that the first query has a search space of n entities, whereas the second query has a search space of 13*n entities.

It seems like the answer to (2) is that both result in equal query performance, per tip #6 in http://googleappengine.blogspot.com/2009/06/10-things-you-probably-didnt-know-about.html, but again, I'd like to see if anyone has any differing real-world experiences with this.

Lastly, if anyone has any other suggestions or tips that can help improve storage utilization, query performance and/or ease of use (specifically w.r.t. index.yaml), please do let me know!

The source can be found here:
http://code.google.com/p/geomodel/
http://code.google.com/p/geomodel/source/browse/trunk/geo/geomodel.py

Thanks,
Roman
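
To make the two layouts concrete, here is a minimal sketch in plain Python with no datastore involved. The function names are hypothetical and are not part of GeoModel; entities are modelled as plain dicts, and the two `matches_*` helpers only imitate what the two `.filter(...)` calls above would select. It assumes, as in the example values above, that each geocell is a prefix of the next, finer one.

```python
def geocell_prefixes(cell):
    """Return every prefix of a geocell string, shortest first."""
    return [cell[:n] for n in range(1, len(cell) + 1)]


def as_separate_properties(cell):
    """Current layout: one property per resolution (location_geocell_n)."""
    return dict(('location_geocell_%d' % len(p), p)
                for p in geocell_prefixes(cell))


def as_list_property(cell):
    """Proposed layout: a single list-valued location_geocells property."""
    return {'location_geocells': geocell_prefixes(cell)}


def matches_separate(entity, search_cell):
    # Imitates .filter('location_geocell_%d =' % len(search_cell), search_cell)
    return entity.get('location_geocell_%d' % len(search_cell)) == search_cell


def matches_list(entity, search_cell):
    # Imitates .filter('location_geocells =', search_cell): an equality
    # filter on a list property matches if any element equals the value.
    return search_cell in entity['location_geocells']


if __name__ == '__main__':
    old_style = as_separate_properties('a3f')
    new_style = as_list_property('a3f')
    for cell in ('a', 'a3', 'a3f', 'b', 'a4'):
        print(cell, matches_separate(old_style, cell), matches_list(new_style, cell))
```

Under that prefix assumption both layouts select the same entities for a given search cell; the sketch says nothing about index size or query latency, which is what the two questions are really asking.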
