Re: [geonode] GWC in GeoNode

Gabriel Roldan Fri, 25 Jun 2010 11:07:00 -0700

We can figure out the workflow details later.
What I'd want is for us to lay down a clear scope, something we can
easily track progress against.
How should we proceed? Use cases -> feature specs -> tech spec?


Gabriel
On 6/25/10 1:56 PM, David Winslow wrote:
> We don't have a build of GeoWebCache in the GeoNode source tree right
> now so if we are going to start customizing it we will need more than a
> branch. Gabriel, let me know if you think we need to do something about
> this.
>
> --
> David Winslow
> OpenGeo - http://opengeo.org/
>
> On 06/25/2010 12:51 PM, Sebastian Benthall wrote:
>> Chris, correct me if I'm wrong, but we got the go-ahead to work on this,
>> yeah?
>>
>> I guess that means we should have a new git branch for this?
>>
>> On Wed, Jun 23, 2010 at 2:41 PM, Gabriel Roldan <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> This is gonna be awesome. Some comments inline.
>>
>> On 6/23/10 12:08 PM, David Winslow wrote:
>> > shooting from the hip with some feedback on these ideas
>> >
>> > On 06/22/2010 06:18 PM, Chris Holmes wrote:
>> >> I've been thinking a bit about how we can bring GeoWebCache in to
>> >> GeoNode, to get at some of the great performance enhancements
>> it can
>> >> bring. Ideally we seamlessly cache all layers viewed in
>> GeoNode, both
>> >> local and remote, even when those change. There are twists
>> with each,
>> >> and both revolve around stale caches.
>> >>
>> >> With local caches we need a way to truncate the cache if the style
>> >> changes. Ideally when one is in style edit mode we don't use
>> GWC at
>> >> all, only when someone does a final 'save' does it start
>> caching the
>> >> change.
>> >>
>> >> With remote caches we need a way for a user to manage the cache, to
>> >> invalidate it when the remote server changes, either data or style.
>> >> Ideally it would have a GeoRSS feed of changes that GWC
>> automatically
>> >> truncates based on. Less ideally there's a manual way to
>> restart the
>> >> caching.
>> >>
>> >> A rough roadmap of how we might achieve the end goal:
>> >>
>> >> * Start with just caching remote layers. So when anyone puts in a
>> >> remote WMS it automatically gets added as a GWC layer.
>> The GWC REST API is definitely clunky currently and would highly benefit
>> if we do this
>>
>> >> Gabriel is about
>> >> to commit a Least Recently Used cache to GWC, which will allow
>> an admin
>> >> to set a total max for the cache.
>> Right now the diskquota is an opt-in process meaning there's no global
>> cache size cap, but you need to set the limit on a layer by layer
>> basis.
>> I think it would be easy to add a global limit so any non explicitly
>> configured layer gets evenly capped to cope with the global limit.
>> How does that sound?
>>
>> So we could let people add any layer,
>> >> but the admin of GeoNode can configure it to just cache the
>> most used
>> >> tiles, up to a limit they set, be it 100 megs or 2 terabytes.
>> For this
>> >> first step the caches may just get invalid, but the admin would
>> have the
>> >> ability to truncate them in the GWC admin.
>> >>
>> > LRU worries me a bit; if we set the disk limit too low we may
>> just end
>> > up with a lot of cache churn for little/no performance benefit.
>> In my mind, GeoWebCache is "incomplete" as a product until we add the
>> following enhancements:
>> - configuration option to cache layers only up to a certain zoom
>> level, and from that level on, defer to pure proxy mode
>> - diskquota, which is kind of in beta testing now
>> - Identify and avoid seeding empty tiles. This can be easily done
>> with
>> the JAI Extrema operation (or even Histogram) or the user might
>> configure a no-data color for the layer?
>> - Definition of an area of interest, so that a geometry defines the
>> allowed seeding area for a layer
>>
>> And the
>> > disk requirements can grow with minimal warning, since anyone
>> can add a
>> > layer. There's also an easy DOS attack - anyone can fetch 18
>> zoom levels
>> > of some layer nobody uses and trash the cache (not a huge deal,
>> how long
>> > would it take an attacker to do that anyway?).
>> That would put the LRU diskquota enforcement job to work and hence
>> wipe
>> out those tiles that are least used. This plus the ability to set a
>> limit on the number of zoom levels to actually cache would bring us
>> closer to the safe zone?
>>
>> I'm not saying an LRU is
>> > a bad idea. I think caching will be a great improvement. It's just
>> > that there is a lot of room for refinement here (probably once
>> we have
>> > better usage tracking we can use that to prioritize tilesets,
>> for example.)
>> Wouldn't the LRU stats be enough for that? Note we also have an LFU
>> (Least Frequently Used) expiration policy for diskquota enforcement,
>> which looks closer to the kind of usage tracking you mention?
>> >
>> >> * Cache local layers, coordinating with Style changes. I think
>> Arne may
>> >> have coded this up, at least for the embedded GWC.
>> Yes. The problem with the embedded GWC is that it completely wipes out
>> the entire layer cache upon _any_ modification, including WFS
>> transactions, resulting in too heavily truncated caches. You
>> add/remove/edit a single feature, the whole layer cache is discarded.
>> There's room to improve that based on bounding box/bounding
>> polygon with
>> some stuff created for the GeoRSS module though.
>> We could perhaps
>> >> start with just doing the cache on the embedded maps, since
>> those won't
>> >> have people switching to 'style mode'. Maybe that intermediary
>> step
>> >> isn't necessary, but when we're in the map composer view we
>> want to be
>> >> sure that when people are styling they're not seeing GWC tiles.
>> Related: I've been wondering for some time now if it wouldn't make
>> sense to also integrate the WMS service endpoints for WMS and GWC,
>> like
>> in GWC being a front barrier for /geoserver/wms instead of having to
>> explicitly go through /geoserver/gwc?service=WMS...
>>
>> Back to topic: couldn't the styles just use a CGI flag to indicate
>> when
>> to ignore the cache and go straight to the WMS? AFAIK tiled=false
>> would
>> do the trick.
>>
>> When
>> >> they finish styling we should then truncate the existing cache
>> and start
>> >> over. Another simplifying assumption we could also consider
>> making is
>> >> only cache on the default style. Not sure how much that
>> actually helps.
>> >>
>> > I don't think we need to avoid caching alternative styles.
>> I think right now GWC only seeds on the default style, and lazily
>> caches
>> non default styles. Are we talking about preseeding here or just lazy
>> caching?
>>
>> >
>> > I do think we need to skip the cache while editing styles.
>> >
>> > It would be nice if we could use cached layers everywhere, and
>> have only
>> > the layer being styled switch to "straight" WMS when styling is
>> active.
>> >> * Remote layer management. This is sort of more general, I
>> think in the
>> >> future we should figure out some more full representation in each
>> >> GeoNode of a remote layer. Right now remote layers can be
>> added, but no
>> >> metadata can be found out about it. This is another whole
>> topic, but
>> >> the implication for here is that such a page should/could have
>> a way to
>> >> manage the cache of the local GeoNode. So you could truncate
>> the cache
>> >> there (maybe just the person who added? Maybe you can set
>> permissions
>> >> of who can truncate?). And then possibly also add a GeoRSS
>> location to
>> >> automatically truncate from.
>> >>
>> > Yeah, it would be awesome if adding a WMS to the composer
>> application
>> > got that service added to the GeoNode's GeoNetwork index,
>> complete with
>> > metadata pages in the Django web app. And GeoNode can
>> periodically scan
>> > the capabilities for added/removed layers, updated descriptions, new
>> > styles. These would be reflected in GeoNetwork and GeoWebCache
>> as well
>> > as the Django database.
>> >
>> > It might be nice to also provide a listing of indexed services
>> so users
>> > can track down the originating WMS services if they want.
>> >> The cool thing this set of things should lead to is to give a
>> benefit to
>> >> people adding remote servers. They get increased speed and
>> reliability
>> >> if they just add it to a map on a GeoNode. So we can come in
>> with a
>> >> GeoNode to an existing nice SDI implementation that already has
>> a bunch
>> >> of WMS services, and then people can start creating maps on top
>> of it,
>> >> and those maps perform even faster than the straight WMS.
>> >>
>> >> Thoughts? I think this could be a nice performance win, as
>> most all our
>> >> maps are tiled. Should obviously be complemented by other
>> >> optimizations, like on the javascript side, but the two
>> together should
>> >> make things quite zippy.
>> >>
>> > Having the WMS capabilities handled on the server side (and cached
>> > there) would probably be a nice win for loading services. We
>> could do
>> > away with reading capabilities entirely until the user pulls up
>> the add
>> > layers dialog (which is not available in the embedded viewers).
>> >
>> > We don't do GFI requests now but it might be worth thinking
>> about how
>> > they interact with the cache. I also don't see this map caching
>> doing
>> > much for offline/distributed data management, which seems like
>> caching
>> > of another sort. It would be good to work out some answers
>> related to that.
>> I don't get it. Could you elaborate?
>>
>> Cheers,
>> Gabriel
>> >
>> > --
>> > David Winslow
>> > OpenGeo - http://opengeo.org/
>>
>>
>> --
>> Gabriel Roldan
>> OpenGeo - http://opengeo.org
>> Expert service straight from the developers.
>>
>>
>>
>>
>> --
>> Sebastian Benthall
>> OpenGeo - http://opengeo.org
>>
>
>


-- 
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.
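
The global limit proposed in the thread (layers without an explicit
quota get "evenly capped" so the total stays within a global cap) can be
sketched as below. This is only an illustration under stated assumptions,
not GWC's diskquota code: the function name, the argument names and the
use of bytes as the unit are all hypothetical.

```python
def allocate_quotas(global_limit, explicit, layers):
    """Split a global cache limit among layers (hypothetical helper).

    global_limit -- total cache size allowed, in bytes
    explicit     -- dict of layer name -> explicitly configured quota
    layers       -- iterable of all known layer names

    Layers with an explicit quota keep it; every other layer receives
    an equal share of whatever remains of the global limit.
    """
    reserved = sum(explicit.values())
    if reserved > global_limit:
        raise ValueError("explicit quotas exceed the global limit")

    unconfigured = [name for name in layers if name not in explicit]
    quotas = dict(explicit)
    if unconfigured:
        # Integer division: a few bytes may stay unallocated.
        share = (global_limit - reserved) // len(unconfigured)
        for name in unconfigured:
            quotas[name] = share
    return quotas
```

For example, with a 1000-byte global limit and one layer explicitly set
to 400 bytes, two unconfigured layers would each be capped at 300 bytes.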
