Hi all,

If you don't want to read this whole post, skip directly to 'QUESTION
1' and 'QUESTION 2' below.

A bit of background first: GeoModel is a library I wrote that adds
very basic geospatial indexing and querying functionality to App
Engine apps. It is similar in approach to geohashing. The equivalent
location hash in GeoModel is called a 'geocell.' Source here:
http://code.google.com/p/geomodel/.

Currently, the GeoModel library adds 13 properties
(location_geocell_n, n=1..13) to each location-aware entity. For
example, an entity can have property values such as
location_geocell_1='a', location_geocell_2='a3',
location_geocell_3='a3f', etc. Storing one property per resolution lets
spatial queries use an equality filter, so they don't use up the
query's inequality filter.
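
To make the scheme concrete, here is a minimal sketch of how the 13
property values could be derived from one full-resolution geocell
string. The names geocell_prefixes and geocell_properties are
hypothetical and not part of GeoModel's API; only the
location_geocell_n naming comes from the description above.

```python
MAX_RESOLUTION = 13  # number of location_geocell_n properties described above


def geocell_prefixes(cell, max_resolution=MAX_RESOLUTION):
    """Return every prefix of `cell`, shortest first, up to max_resolution."""
    return [cell[:n] for n in range(1, min(len(cell), max_resolution) + 1)]


def geocell_properties(cell):
    """Map each prefix to its property name, e.g. location_geocell_2 -> 'a3'."""
    return dict(('location_geocell_%d' % len(prefix), prefix)
                for prefix in geocell_prefixes(cell))


# Hypothetical usage with the example cell from the text.
props = geocell_properties('a3f')
```

Each prefix length maps to exactly one property, which is why a query
at a given resolution only needs an equality filter on that one
property.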

The problem with the 13-properties approach is that, for any geo query
an app would like to run, 13 new indexes must be defined and built.
This is definitely a maintenance hassle, as I've just painfully
realized while rewriting the demo app for the project. However, this
also leads to my first question:

**QUESTION 1. Is there any significant storage overhead per index?
That is, if I have 13 indexes with n entities in each, versus 1 index
with 13*n entities in it, is the former much worse than the latter in
terms of storage?

It seems like the answer to (1) is no, per
http://code.google.com/appengine/articles/index_building.html, but I'd
just like to see if anyone has had a different experience.

Now, I'm considering adjusting the GeoModel library so that instead of
13 string properties, there'd only be one StringListProperty called
location_geocells, and it would contain values like ['a', 'a3',
'a3f']. This results in a much cleaner index.yaml. But, I do question
the performance implications:

**QUESTION 2. If I switch from 13 string properties to 1
StringListProperty, will query performance be adversely affected? My
current query looks like:

    .filter('location_geocell_%d =' % len(search_cell), search_cell)

and the new query would look like:

    .filter('location_geocells =', search_cell)

Note that the first query has a search space of n entities, whereas
the second query has a search space of 13*n entities.
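
The two query shapes can be compared with a small in-memory stand-in
for the datastore indexes. This is only a sketch: plain Python dicts
play the role of index rows, the entity keys and cells are made up, and
it says nothing about real datastore latency. It does show that both
approaches return the same entities, while the single index holds one
row per (entity, prefix) pair.

```python
from collections import defaultdict

# Hypothetical entities: key -> full-resolution geocell.
entities = {'cafe': 'a3f1', 'park': 'a3f2', 'pier': 'a40b'}


def prefixes(cell):
    """All prefixes of a geocell, shortest first."""
    return [cell[:n] for n in range(1, len(cell) + 1)]


# Approach 1: one index per resolution (location_geocell_n).
per_resolution = defaultdict(lambda: defaultdict(set))
# Approach 2: a single index over the location_geocells list property.
single_index = defaultdict(set)

for key, cell in entities.items():
    for prefix in prefixes(cell):
        per_resolution[len(prefix)][prefix].add(key)
        single_index[prefix].add(key)


def query_per_resolution(search_cell):
    """Equality lookup in the index for this cell's resolution."""
    return per_resolution.get(len(search_cell), {}).get(search_cell, set())


def query_single(search_cell):
    """Equality lookup in the one combined index."""
    return single_index.get(search_cell, set())


# Total rows in the combined index: one per (entity, prefix) pair.
rows_single = sum(len(keys) for keys in single_index.values())
```

Because a search cell of length k can only equal the length-k prefix of
an entity's cell, the combined index cannot produce matches from other
resolutions; it is simply larger.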

It seems like the answer to (2) is that both result in equal query
performance, per tip #6 in
http://googleappengine.blogspot.com/2009/06/10-things-you-probably-didnt-know-about.html,
but again, I'd like to see if anyone has any differing real-world
experiences with this.

Lastly, if anyone has any other suggestions or tips that can help
improve storage utilization, query performance and/or ease of use
(specifically w.r.t. index.yaml), please do let me know! The source
can be found here:

http://code.google.com/p/geomodel/
http://code.google.com/p/geomodel/source/browse/trunk/geo/geomodel.py

Thanks,
Roman
