I don't think we are yet at the point where we need to start worrying about RAM caches; there is still a lot of room for using the database in a smarter (or rather, less braindead) way. In the specific example of displaying GeoNetwork search results, we hit the database separately for each search result to grab its metadata. Luke has implemented a workaround that cleverly sidesteps this issue by holding off on loading data from the database until the user expands a JS widget, but really we should be able to batch up those requests if we think about it for a minute (see the sketch below).
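
A rough illustration of the kind of batching I mean, in Django terms
(the Layer import path and the typename field on the search results
are assumptions, not GeoNode's actual API):

    # assumed import path; adjust to wherever the Layer model lives
    from geonode.maps.models import Layer

    def metadata_for_results(results):
        """Fetch metadata for a whole page of search results in one
        query, instead of one query per result (the N+1 pattern)."""
        typenames = [r.typename for r in results]
        layers = Layer.objects.filter(typename__in=typenames)
        by_typename = dict((layer.typename, layer) for layer in layers)
        # Preserve the ordering of the search results.
        return [by_typename.get(t) for t in typenames]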

Conceptually, I'm not sure we even need to hit Django's DB for this at all; all the metadata needed to display search results could probably live in GeoNetwork (and the more metadata that lives in GeoNetwork, the better, since GeoNetwork has its own search interface and can be federated with other GeoNetworks). However, mirroring this stuff in Django is going to be fairly important if we want to do the kinds of things we've been talking up for GeoNode - having user profiles influence layer metadata, etc.

I suppose one formulation of the problem is in a use case:

   Jorge has uploaded several dozen layers to GeoNode.  Since he has
   filled out the profile for his GeoNode account, he has been able to
   avoid a lot of repetitive work filling out the descriptions for each
   of these layers.  Now, however, he's been promoted and needs to
   change his title from Data Wrangler to Poobah of Informatology ...
   on 200 layers.  GeoNode to the rescue!  He simply edits his profile
   and GeoNode updates all the metadata documents that reference him as
   provider or metadata maintainer with current contact information.


So, we need some sort of data architecture that can:
* figure out which layers need updating when a user profile changes
* update just the fields corresponding to that user profile (actually, GN basically stores the metadata documents as blobs, so we will have to overwrite everything... but we need to make sure that we don't clobber the fields that aren't being modified)

One possible implementation would be to have a more relational model in GeoNode, use the typical "WHERE owner.uid = updated_profile.uid" kind of query to figure out which documents to update, and then generate entirely new metadata documents to clobber the pre-existing ones. To preserve the fields that aren't coming from GeoNetwork, we'd probably want to store everything in the layer's Django representation. A sketch of this approach follows.
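
Here is a minimal sketch, assuming hypothetical Contact and Layer
models and a post_save signal to trigger the regeneration - none of
this matches GeoNode's actual schema, it's just the shape of the
idea:

    from django.db import models
    from django.db.models.signals import post_save
    from django.contrib.auth.models import User

    class Contact(models.Model):
        """Profile information that gets embedded in metadata docs."""
        user = models.ForeignKey(User, on_delete=models.CASCADE)
        title = models.CharField(max_length=255)  # "Poobah of ..."

    class Layer(models.Model):
        """Django-side mirror of a GeoNetwork metadata record."""
        typename = models.CharField(max_length=128)
        provider = models.ForeignKey(
            Contact, on_delete=models.CASCADE,
            related_name="provided_layers")
        maintainer = models.ForeignKey(
            Contact, on_delete=models.CASCADE,
            related_name="maintained_layers")
        # A field GeoNetwork must not clobber, so it lives here too.
        abstract = models.TextField(blank=True)

        def save_to_geonetwork(self):
            # Regenerate the full metadata document from the Django
            # fields and push it to GeoNetwork, overwriting the blob.
            raise NotImplementedError

    def update_metadata(sender, instance, **kwargs):
        # The relational equivalent of
        # "WHERE owner.uid = updated_profile.uid":
        layers = Layer.objects.filter(
            models.Q(provider=instance) | models.Q(maintainer=instance))
        for layer in layers:
            layer.save_to_geonetwork()

    post_save.connect(update_metadata, sender=Contact)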

--
David Winslow
OpenGeo - http://opengeo.org/

On 06/14/2010 12:06 PM, Ariel Nunez wrote:
Short story:
http://github.com/sebleier/django-redis-cache

Long story:
IMHO, the best idea is to just cache the metadata in RAM. Unlike
memcached, Redis also writes a backup to disk periodically and can
keep the data across restarts. What we would do then is either write
the key/value pairs individually or just store a GeoJSON dict for a
given layer.
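
Something along these lines with redis-py (the key scheme and the
typename identifier are just placeholders):

    import json
    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    def cache_layer_metadata(typename, metadata_dict):
        # One key per layer, holding the whole GeoJSON dict.
        r.set("layer:%s" % typename, json.dumps(metadata_dict))

    def get_layer_metadata(typename):
        raw = r.get("layer:%s" % typename)
        return json.loads(raw) if raw is not None else None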

Here is some code I wrote a while ago that uses redis in a very simple
yet effective way to cache an expensive operation:

http://github.com/ingenieroariel/dondevoto/blob/master/server.py#L16
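
The linked file isn't reproduced here, but the general pattern is
something like this (function and key names made up for the example):

    import json
    import redis

    r = redis.Redis()

    def cached(key, ttl=300):
        """Cache a function's JSON-serializable result in Redis
        for ttl seconds."""
        def wrap(fn):
            def wrapper(*args, **kwargs):
                hit = r.get(key)
                if hit is not None:
                    return json.loads(hit)
                result = fn(*args, **kwargs)
                r.setex(key, ttl, json.dumps(result))
                return result
            return wrapper
        return wrap

    @cached("dondevoto:places")
    def expensive_query():
        # Imagine a slow database aggregation here.
        return {"type": "FeatureCollection", "features": []}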
