Chris, correct me if I'm wrong, but we got the go-ahead to work on this, yeah?

I guess that means we should have a new git branch for this?

On Wed, Jun 23, 2010 at 2:41 PM, Gabriel Roldan <[email protected]> wrote:

> This is gonna be awesome. Some comments inline.
>
> On 6/23/10 12:08 PM, David Winslow wrote:
> > shooting from the hip with some feedback on these ideas
> >
> > On 06/22/2010 06:18 PM, Chris Holmes wrote:
> >> I've been thinking a bit about how we can bring GeoWebCache in to
> >> GeoNode, to get at some of the great performance enhancements it can
> >> bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both
> >> local and remote, even when those change.   There are twists with each,
> >> and both revolve around stale caches.
> >>
> >> With local caches we need a way to truncate the cache if the style
> >> changes.  Ideally when one is in style edit mode we don't use GWC at
> >> all, only when someone does a final 'save' does it start caching the
> >> change.
> >>
> >> With remote caches we need a way for a user to manage the cache, to
> >> invalidate it when the remote server changes, either data or style.
> >> Ideally it would have a GeoRSS feed of changes that GWC automatically
> >> truncates based on.  Less ideally there's a manual way to restart the
> >> caching.
> >>
> >> A rough roadmap of how we might achieve the end goal:
> >>
> >> * Start with just caching remote layers.  So when anyone puts in a
> >> remote WMS it automatically gets added as a GWC layer.
> The GWC REST API is definitely clunky currently and would benefit
> greatly if we do this.
>
> >>  Gabriel is about
> >> to commit a Least Recently Used cache to GWC, which will allow an admin
> >> to set a total max for the cache.
> Right now the diskquota is an opt-in process meaning there's no global
> cache size cap, but you need to set the limit on a layer by layer basis.
> I think it would be easy to add a global limit so any non explicitly
> configured layer gets evenly capped to stay within the global limit.
> How does that sound?
>
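A rough sketch of the "evenly capped" idea above, just to make the arithmetic concrete. Everything here is an assumption for illustration: the function name, the byte units, and the choice to subtract explicitly configured limits before splitting the remainder. It is not how the GWC diskquota module is implemented.

```python
def layer_quotas(global_limit, explicit, all_layers):
    """Split a global cache cap (in bytes) across layers.

    explicit:   dict of layer name -> explicitly configured limit (bytes)
    all_layers: iterable of every layer name known to the cache

    Layers without an explicit limit share whatever is left of the
    global limit in equal parts (hypothetical policy).
    """
    remaining = global_limit - sum(explicit.values())
    if remaining < 0:
        raise ValueError("explicit layer limits exceed the global limit")

    unconfigured = [name for name in all_layers if name not in explicit]
    quotas = dict(explicit)
    if unconfigured:
        share = remaining // len(unconfigured)
        for name in unconfigured:
            quotas[name] = share
    return quotas


# One configured layer, two layers sharing the remainder.
print(layer_quotas(1000, {"roads": 400}, ["roads", "rivers", "parcels"]))
```

One consequence of splitting evenly is that every newly added layer shrinks the share of all other unconfigured layers, so the quotas would need recomputing whenever a layer is added.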
> >> So we could let people add any layer,
> >> but the admin of GeoNode can configure it to just cache the most used
> >> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
> >> first step the caches may just get invalid, but the admin would have the
> >> ability to truncate them in the GWC admin.
> >>
> > LRU worries me a bit; if we set the disk limit too low we may just end
> > up with a lot of cache churn for little/no performance benefit.
> In my mind, GeoWebCache is "incomplete" as a product until we add the
> following enhancements:
>  - configuration option to cache layers only up to a certain zoom
> level, and from that level on, defer to pure proxy mode
>  - diskquota, which is kind of in beta testing now
>  - Identify and avoid seeding empty tiles. This can be easily done with
> the JAI Extrema operation (or even Histogram) or the user might
> configure a no-data color for the layer?
>  - Definition of an area of interest, so that a geometry defines the
> allowed seeding area for a layer
>
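To illustrate the empty-tile item in the list above: the extrema idea boils down to checking whether every band has min == max. The sketch below does that in plain Python on a list of pixel tuples instead of JAI, and the optional no-data color check is an assumption about how such a setting might work. Treating any uniform tile as "empty" is also an assumption; a solid ocean fill would be flagged too.

```python
def band_extrema(pixels):
    """Return a (min, max) pair per band for equal-length pixel tuples."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("tile has no pixels")
    bands = zip(*pixels)  # regroup pixel tuples into per-band sequences
    return [(min(band), max(band)) for band in bands]


def is_empty_tile(pixels, no_data=None):
    """True if the tile is a single flat color.

    If no_data is given (hypothetical per-layer setting), the flat color
    must also equal it for the tile to count as empty.
    """
    extrema = band_extrema(pixels)
    if any(lo != hi for lo, hi in extrema):
        return False
    if no_data is None:
        return True
    return tuple(lo for lo, _ in extrema) == tuple(no_data)


white = [(255, 255, 255)] * 4
print(is_empty_tile(white))                     # flat tile
print(is_empty_tile(white + [(0, 0, 0)]))       # has real content
print(is_empty_tile(white, no_data=(0, 0, 0)))  # flat, but not the no-data color
```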
> > And the
> > disk requirements can grow with minimal warning, since anyone can add a
> > layer. There's also an easy DOS attack - anyone can fetch 18 zoom levels
> > of some layer nobody uses and trash the cache (not a huge deal, how long
> > would it take an attacker to do that anyway?).
> That would put the LRU diskquota enforcement job to work and hence wipe
> out those tiles that are least used. This plus the ability to set a
> limit on the number of zoom levels to actually cache would bring us
> closer to the safe zone?
>
> > I'm not saying an LRU is
> > a bad idea.  I think caching will be a great improvement.  It's just
> > that there is a lot of room for refinement here (probably once we have
> > better usage tracking we can use that to prioritize tilesets, for
> > example.)
> Wouldn't the LRU stats be enough for that? Note we also have an LFU
> (Least Frequently Used) expiration policy for diskquota enforcement,
> which looks closer to the kind of usage tracking you mention?
> >
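For anyone following along, here is a toy comparison of the two expiration policies discussed above. The tile records, field names, and quota handling are all made up for illustration and say nothing about how GWC actually stores its usage stats. The point is only that LRU ranks by last access time while LFU ranks by hit count, so they can pick different victims.

```python
def tiles_to_evict(tiles, quota, policy="lru"):
    """Pick tile keys to delete until total size fits within quota.

    tiles: list of dicts with 'key', 'size', 'last_access', 'hits'
           (hypothetical bookkeeping, not GWC's data model)
    """
    if policy == "lru":
        rank = lambda t: t["last_access"]  # oldest access goes first
    elif policy == "lfu":
        rank = lambda t: t["hits"]         # fewest hits goes first
    else:
        raise ValueError("unknown policy: %r" % policy)

    used = sum(t["size"] for t in tiles)
    evicted = []
    for tile in sorted(tiles, key=rank):
        if used <= quota:
            break
        evicted.append(tile["key"])
        used -= tile["size"]
    return evicted


tiles = [
    {"key": "a", "size": 10, "last_access": 1, "hits": 50},  # popular, but old
    {"key": "b", "size": 10, "last_access": 2, "hits": 1},   # fetched once
    {"key": "c", "size": 10, "last_access": 3, "hits": 5},
]
print(tiles_to_evict(tiles, 20, "lru"))
print(tiles_to_evict(tiles, 20, "lfu"))
```

In this toy data LRU drops the popular-but-old tile and LFU drops the one fetched only once, which is roughly the difference that matters for the "someone crawls 18 zoom levels of an unused layer" scenario.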
> >> * Cache local layers, coordinating with Style changes.  I think Arne may
> >> have coded this up, at least for the embedded GWC.
> Yes. The problem with the embedded GWC is that it completely wipes out
> the entire layer cache upon _any_ modification, including WFS
> transactions, resulting in too heavily truncated caches. You
> add/remove/edit a single feature, the whole layer cache is discarded.
> There's room to improve that based on bounding box/bounding polygon with
> some stuff created for the GeoRSS module though.
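A minimal sketch of what bounding-box-based truncation could look like, as opposed to discarding the whole layer. It assumes a global geodetic grid with 2 x 1 tiles at zoom 0 and a lower-left origin at (-180, -90); the function name and the grid layout are assumptions for illustration, and this is not the GeoRSS module's code.

```python
import math


def tiles_for_bbox(bbox, zoom):
    """Inclusive (min_col, min_row, max_col, max_row) covering bbox.

    bbox is (minx, miny, maxx, maxy) in degrees. Assumed grid:
    2**(zoom + 1) columns by 2**zoom rows, origin at (-180, -90).
    """
    minx, miny, maxx, maxy = bbox
    tile_deg = 180.0 / (2 ** zoom)  # tile edge length in degrees
    cols, rows = 2 ** (zoom + 1), 2 ** zoom

    def index(value, origin, count):
        i = int(math.floor((value - origin) / tile_deg))
        return max(0, min(count - 1, i))  # clamp to the grid

    return (index(minx, -180.0, cols), index(miny, -90.0, rows),
            index(maxx, -180.0, cols), index(maxy, -90.0, rows))


# A one-degree edit touches a single tile at zoom 2, not the whole layer.
print(tiles_for_bbox((10, 10, 11, 11), 2))
```

Only the tiles in that range (repeated per cached zoom level) would need to be dropped after a WFS transaction whose affected area is the given bbox.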
> >> We could perhaps
> >> start with just doing the cache on the embedded maps, since those won't
> >> have people switching to 'style mode'.  Maybe that intermediary step
> >> isn't necessary, but when we're in the map composer view we want to be
> >> sure that when people are styling they're not seeing GWC tiles.
> Related: I've been wondering for some time now if it wouldn't make
> sense to also integrate the WMS service endpoints for WMS and GWC, like
> in GWC being a front barrier for /geoserver/wms instead of having to
> explicitly go through /geoserver/gwc?service=WMS...
>
> Back to topic: couldn't the styles just use a CGI flag to indicate when
> to ignore the cache and go straight to the WMS? AFAIK tiled=false would
> do the trick.
>
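A small sketch of the flag idea above, from the client side: the map composer would build the same GetMap request either way and only flip `tiled` while a layer is being styled. The helper name, the example host, and the fixed parameter set are assumptions for illustration; whether `tiled=false` really bypasses the cache is the "AFAIK" from the message above, not something this code verifies.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, styling=False, width=256, height=256):
    """Build a WMS 1.1.1 GetMap URL, asking for uncached output while styling."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
        # Assumed switch: skip the tile cache only for the layer being styled.
        "tiled": "false" if styling else "true",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and layer name.
base = "http://example.org/geoserver/wms"
print(getmap_url(base, "topp:states", (-180, -90, 0, 90)))
print(getmap_url(base, "topp:states", (-180, -90, 0, 90), styling=True))
```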
> >> When
> >> they finish styling we should then truncate the existing cache and start
> >> over.  Another simplifying assumption we could also consider making is
> >> only cache on the default style.  Not sure how much that actually helps.
> >>
> > I don't think we need to avoid caching alternative styles.
> I think right now GWC only seeds on the default style, and lazily caches
> non default styles. Are we talking about preseeding here or just lazy
> caching?
>
> >
> > I do think we need to skip the cache while editing styles.
> >
> > It would be nice if we could use cached layers everywhere, and have only
> > the layer being styled switch to "straight" WMS when styling is active.
> >> * Remote layer management.  This is sort of more general, I think in the
> >> future we should figure out some more full representation in each
> >> GeoNode of a remote layer.  Right now remote layers can be added, but no
> >> metadata can be found out about it.  This is another whole topic, but
> >> the implication for here is that such a page should/could have a way to
> >> manage the cache of the local GeoNode.  So you could truncate the cache
> >> there (maybe just the person who added?  Maybe you can set permissions
> >> of who can truncate?).  And then possibly also add a GeoRSS location to
> >> automatically truncate from.
> >>
> > Yeah, it would be awesome if adding a WMS to the composer application
> > got that service added to the GeoNode's GeoNetwork index, complete with
> > metadata pages in the Django web app.  And GeoNode can periodically scan
> > the capabilities for added/removed layers, updated descriptions, new
> > styles.  These would be reflected in GeoNetwork and GeoWebCache as well
> > as the Django database.
> >
> > It might be nice to also provide a listing of indexed services so users
> > can track down the originating WMS services if they want.
> >> The cool thing this set of things should lead to is to give a benefit to
> >> people adding remote servers.  They get increased speed and reliability
> >> if they just add it to a map on a GeoNode.  So we can come in with a
> >> GeoNode to an existing nice SDI implementation that already has a bunch
> >> of WMS services, and then people can start creating maps on top of it,
> >> and those maps perform even faster than the straight WMS.
> >>
> >> Thoughts?  I think this could be a nice performance win, as most all our
> >> maps are tiled.  Should obviously be complemented by other
> >> optimizations, like on the javascript side, but the two together should
> >> make things quite zippy.
> >>
> > Having the WMS capabilities handled on the server side (and cached
> > there) would probably be a nice win for loading services.  We could do
> > away with reading capabilities entirely until the user pulls up the add
> > layers dialog (which is not available in the embedded viewers).
> >
> > We don't do GFI requests now but it might be worth thinking about how
> > they interact with the cache.  I also don't see this map caching doing
> > much for offline/distributed data management, which seems like caching
> > of another sort.  It would be good to work out some answers related to
> > that.
> I don't get it. Could you elaborate?
>
> Cheers,
> Gabriel
> >
> > --
> > David Winslow
> > OpenGeo - http://opengeo.org/
>
>
> --
> Gabriel Roldan
> OpenGeo - http://opengeo.org
> Expert service straight from the developers.
>



-- 
Sebastian Benthall
OpenGeo - http://opengeo.org
