I don't think we are yet at the point where we need to start worrying
about RAM caches; there is still a lot of room for using the database in
a smarter (or rather, less braindead) way. In the specific example of
displaying GeoNetwork search results, we are hitting the database
separately for each search result to grab the metadata for it. Luke has
implemented a workaround that cleverly sidesteps this issue by holding
off on loading data from the database until the user expands a JS
widget, but really we should be able to work out a way to batch up those
requests if we think about it for a minute.
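For example, instead of going back to the database once per result, we
could collect the identifiers up front and fetch the whole batch in one
query. A rough sketch; the Layer model, its uuid field, and the shape
of the search results are stand-ins for whatever our actual code looks
like:

    def layers_for_results(results):
        # what we do now, roughly -- one query per search result:
        #   [Layer.objects.get(uuid=r['uuid']) for r in results]

        # instead, one query for the whole page of results:
        uuids = [r['uuid'] for r in results]
        by_uuid = dict(
            (layer.uuid, layer)
            for layer in Layer.objects.filter(uuid__in=uuids)
        )
        return [by_uuid.get(u) for u in uuids]

Same information either way, but one round trip to the database per
page of results instead of one per layer.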
Conceptually, I'm not sure we even need to hit Django's DB for this at
all; all the metadata needed to display search results could probably
live in GeoNetwork (and the more metadata that lives in GeoNetwork, the
better, since GeoNetwork has its own search interface and can be
federated with other GeoNetworks). However, mirroring this stuff in
Django is going to be fairly important if we want to do the kinds of
things we've been talking about for GeoNode: having user profiles
influence layer metadata, etc.
I suppose one way to formulate the problem is as a use case:
Jorge has uploaded several dozen layers to GeoNode. Since he has
filled out the profile for his GeoNode account, he has been able to
avoid a lot of repetitive work filling out the descriptions for each
of these layers. Now, however, he's been promoted and needs to
change his title from Data Wrangler to Poobah of Informatology ...
on 200 layers. GeoNode to the rescue! He simply edits his profile,
and GeoNode updates every metadata document that references him as
provider or metadata maintainer with his current contact information.
So, we need some sort of data architecture that can
* figure out which layers need updating after a user profile changes
(see the sketch just below this list)
* update just the fields corresponding to that user profile (actually,
GN basically stores the metadata documents as blobs, so we will have
to overwrite everything... but we need to make sure that we don't
clobber the fields that aren't being modified)
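For the first bullet, Django's post_save signal seems like the natural
hook. A minimal sketch, assuming the profile model is called Contact
and deferring the actual rewriting to an update_metadata_for function
(sketched further below):

    from django.db.models.signals import post_save

    def on_profile_change(sender, instance, **kwargs):
        # 'instance' is the profile that was just saved; go touch
        # every metadata document that references it
        update_metadata_for(instance)

    post_save.connect(on_profile_change, sender=Contact)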
One possible implementation would be to have a more relational model in
GeoNode and use the typical "WHERE owner.uid = updated_profile.uid" kind
of query to figure out which documents to update, and then just generate
entirely new metadata documents to clobber the pre-existing ones. To
preserve the fields that aren't derived from the profile, we'd probably
want to store everything in the layer's Django representation.
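To make that concrete, here is a rough sketch of the relational side
and the update pass. All the names are invented for illustration;
generate_metadata_document stands in for whatever template or
XML-building code produces the full record, and geonetwork.update_record
for however we actually push it:

    from django.db import models

    class Contact(models.Model):
        name = models.CharField(max_length=255)
        position = models.CharField(max_length=255)  # "Data Wrangler", etc.
        email = models.EmailField()

    class Layer(models.Model):
        uuid = models.CharField(max_length=36, unique=True)
        title = models.CharField(max_length=255)
        abstract = models.TextField()
        owner = models.ForeignKey(Contact, related_name='layers',
                                  on_delete=models.PROTECT)

    def update_metadata_for(profile):
        # the ORM spelling of "WHERE owner.uid = updated_profile.uid"
        for layer in Layer.objects.filter(owner=profile):
            # regenerate and clobber the whole document; this is only
            # safe because every field in it is stored on the Django
            # side, so nothing is lost in the overwrite
            document = generate_metadata_document(layer)
            geonetwork.update_record(layer.uuid, document)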
--
David Winslow
OpenGeo - http://opengeo.org/
On 06/14/2010 12:06 PM, Ariel Nunez wrote:
Short story:
http://github.com/sebleier/django-redis-cache
Long story:
IMHO, the best idea is to just cache the metadata in RAM. Unlike
memcached, Redis also writes a backup to disk periodically and can
maintain the data between restarts. What we would do then is either
write the key/value pairs or just store a geojson dict for a given
layer.
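For the layer case that might look like the following sketch (redis-py
assumed; fetch_layer_geojson is a stand-in for the expensive lookup
being cached):

    import json
    import redis

    r = redis.Redis(host='localhost', port=6379)

    def layer_geojson(layer_uuid):
        cached = r.get('layer:' + layer_uuid)
        if cached is not None:
            return json.loads(cached)
        data = fetch_layer_geojson(layer_uuid)  # the expensive part
        r.set('layer:' + layer_uuid, json.dumps(data))
        return data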
Here is some code I wrote a while ago that uses Redis in a very simple
yet effective way to cache an expensive operation:
http://github.com/ingenieroariel/dondevoto/blob/master/server.py#L16