We don't have a build of GeoWebCache in the GeoNode source tree right now, so if we are going to start customizing it we will need more than a branch. Gabriel, let me know if you think we need to do something about this.

--
David Winslow
OpenGeo - http://opengeo.org/

On 06/25/2010 12:51 PM, Sebastian Benthall wrote:
Chris, correct me if I'm wrong, but we got the go-ahead to work on this, yeah?

I guess that means we should have a new git branch for this?

On Wed, Jun 23, 2010 at 2:41 PM, Gabriel Roldan wrote:

    This is gonna be awesome. Some comments inline.

    On 6/23/10 12:08 PM, David Winslow wrote:
    > shooting from the hip with some feedback on these ideas
    >
    > On 06/22/2010 06:18 PM, Chris Holmes wrote:
    >> I've been thinking a bit about how we can bring GeoWebCache in to
    >> GeoNode, to get at some of the great performance enhancements it can
    >> bring.  Ideally we seamlessly cache all layers viewed in GeoNode, both
    >> local and remote, even when those change.   There are twists with each,
    >> and both revolve around stale caches.
    >>
    >> With local caches we need a way to truncate the cache if the style
    >> changes.  Ideally when one is in style edit mode we don't use GWC at
    >> all, only when someone does a final 'save' does it start caching the
    >> change.
    >>
    >> With remote caches we need a way for a user to manage the cache, to
    >> invalidate it when the remote server changes, either data or style.
    >> Ideally it would have a GeoRSS feed of changes that GWC automatically
    >> truncates based on.  Less ideally there's a manual way to restart the
    >> caching.
    >>
    >> A rough roadmap of how we might achieve the end goal:
    >>
    >> * Start with just caching remote layers.  So when anyone puts in a
    >> remote WMS it automatically gets added as a GWC layer.
    The GWC REST API is definitely clunky currently and would benefit
    greatly if we do this.

    >>  Gabriel is about
    >> to commit a Least Recently Used cache to GWC, which will allow an admin
    >> to set a total max for the cache.
    Right now the diskquota is an opt-in process, meaning there's no global
    cache size cap; you need to set the limit on a layer-by-layer basis.
    I think it would be easy to add a global limit so that any layer not
    explicitly configured gets evenly capped to fit within the global limit.
    How does that sound?

    >>  So we could let people add any layer,
    >> but the admin of GeoNode can configure it to just cache the most used
    >> tiles, up to a limit they set, be it 100 megs or 2 terabytes.  For this
    >> first step the caches may just get invalid, but the admin would have the
    >> ability to truncate them in the GWC admin.
    >>
    > LRU worries me a bit; if we set the disk limit too low we may just end
    > up with a lot of cache churn for little/no performance benefit.
    In my mind, GeoWebCache is "incomplete" as a product until we add the
    following enhancements:
     - configuration option to cache layers only up to a certain zoom
    level, and from that level on, defer to pure proxy mode
     - diskquota, which is kind of in beta testing now
     - Identify and avoid seeding empty tiles. This can be easily done with
    the JAI Extrema operation (or even Histogram), or the user might
    configure a no-data color for the layer?
     - Definition of an area of interest, so that a geometry defines the
    allowed seeding area for a layer

    >  And the
    > disk requirements can grow with minimal warning, since anyone can add a
    > layer. There's also an easy DOS attack - anyone can fetch 18 zoom levels
    > of some layer nobody uses and trash the cache (not a huge deal, how long
    > would it take an attacker to do that anyway?).
    That would put the LRU diskquota enforcement job to work and hence wipe
    out those tiles that are least used. This plus the ability to set a
    limit on the number of zoom levels to actually cache would bring us
    closer to the safe zone?

    >  I'm not saying an LRU is
    > a bad idea.  I think caching will be a great improvement.  It's just
    > that there is a lot of room for refinement here (probably once we have
    > better usage tracking we can use that to prioritize tilesets, for example.)
    Wouldn't the LRU stats be enough for that? Note we also have an LFU
    (Least Frequently Used) expiration policy for diskquota enforcement,
    which looks closer to the kind of usage tracking you mention?
    >
    >> * Cache local layers, coordinating with Style changes.  I think Arne may
    >> have coded this up, at least for the embedded GWC.
    Yes. The problem with the embedded GWC is that it completely wipes out
    the entire layer cache upon _any_ modification, including WFS
    transactions, resulting in too heavily truncated caches. You
    add/remove/edit a single feature, the whole layer cache is discarded.
    There's room to improve that based on bounding box/bounding polygon with
    some stuff created for the GeoRSS module though.
    >>  We could perhaps
    >> start with just doing the cache on the embedded maps, since those won't
    >> have people switching to 'style mode'.  Maybe that intermediary step
    >> isn't necessary, but when we're in the map composer view we want to be
    >> sure that when people are styling they're not seeing GWC tiles.
    Related: I've been wondering for some time now if it wouldn't make
    sense to also integrate the WMS service endpoints for WMS and GWC, like
    in GWC being a front barrier for /geoserver/wms instead of having to
    explicitly go through /geoserver/gwc?service=WMS...

    Back to topic: couldn't the styles just use a CGI flag to indicate when
    to ignore the cache and go straight to the WMS? AFAIK tiled=false would
    do the trick.

    >>  When
    >> they finish styling we should then truncate the existing cache and start
    >> over.  Another simplifying assumption we could also consider making is
    >> only cache on the default style.  Not sure how much that actually helps.
    >>
    > I don't think we need to avoid caching alternative styles.
    I think right now GWC only seeds on the default style, and lazily caches
    non-default styles. Are we talking about preseeding here or just lazy
    caching?

    >
    > I do think we need to skip the cache while editing styles.
    >
    > It would be nice if we could use cached layers everywhere, and have only
    > the layer being styled switch to "straight" WMS when styling is active.
    >> * Remote layer management.  This is sort of more general, I think in the
    >> future we should figure out some more full representation in each
    >> GeoNode of a remote layer.  Right now remote layers can be added, but no
    >> metadata can be found out about it.  This is another whole topic, but
    >> the implication for here is that such a page should/could have a way to
    >> manage the cache of the local GeoNode.  So you could truncate the cache
    >> there (maybe just the person who added?  Maybe you can set permissions
    >> of who can truncate?).  And then possibly also add a GeoRSS location to
    >> automatically truncate from.
    >>
    > Yeah, it would be awesome if adding a WMS to the composer application
    > got that service added to the GeoNode's GeoNetwork index, complete with
    > metadata pages in the Django web app.  And GeoNode can periodically scan
    > the capabilities for added/removed layers, updated descriptions, new
    > styles.  These would be reflected in GeoNetwork and GeoWebCache as well
    > as the Django database.
    >
    > It might be nice to also provide a listing of indexed services so users
    > can track down the originating WMS services if they want.
    >> The cool thing this set of things should lead to is to give a benefit to
    >> people adding remote servers.  They get increased speed and reliability
    >> if they just add it to a map on a GeoNode.  So we can come in with a
    >> GeoNode to an existing nice SDI implementation that already has a bunch
    >> of WMS services, and then people can start creating maps on top of it,
    >> and those maps perform even faster than the straight WMS.
    >>
    >> Thoughts?  I think this could be a nice performance win, as most all our
    >> maps are tiled.  Should obviously be complemented by other
    >> optimizations, like on the javascript side, but the two together should
    >> make things quite zippy.
    >>
    > Having the WMS capabilities handled on the server side (and cached
    > there) would probably be a nice win for loading services.  We could do
    > away with reading capabilities entirely until the user pulls up the add
    > layers dialog (which is not available in the embedded viewers).
    >
    > We don't do GFI requests now but it might be worth thinking about how
    > they interact with the cache.  I also don't see this map caching doing
    > much for offline/distributed data management, which seems like caching
    > of another sort.  It would be good to work out some answers related to that.
    I don't get it. Could you elaborate?

    Cheers,
    Gabriel
    >
    > --
    > David Winslow
    > OpenGeo - http://opengeo.org/


    --
    Gabriel Roldan
    OpenGeo - http://opengeo.org
    Expert service straight from the developers.




--
Sebastian Benthall
OpenGeo - http://opengeo.org
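
For reference, the global quota idea Gabriel floats earlier in the thread (a global cap, with every layer that has no explicit limit sharing what is left evenly) could be sketched roughly as below. This is only a Python illustration of the arithmetic, not GeoWebCache code: the function name effective_quotas, the use of plain byte counts, and the layer names in the usage example are all hypothetical.

```python
def effective_quotas(global_limit, explicit, layers):
    """Return a per-layer quota in bytes (hypothetical helper, not a GWC API).

    global_limit: total bytes the whole cache may use
    explicit:     dict of layer name -> bytes for explicitly configured layers
    layers:       names of all cached layers
    """
    # Space already promised to explicitly configured layers.
    reserved = sum(explicit[name] for name in layers if name in explicit)
    unconfigured = [name for name in layers if name not in explicit]

    # Whatever is left (never negative) is split evenly among the rest.
    remaining = max(global_limit - reserved, 0)
    share = remaining // len(unconfigured) if unconfigured else 0

    return {name: explicit.get(name, share) for name in layers}


quotas = effective_quotas(1000, {"roads": 400}, ["roads", "rivers", "parcels"])
print(quotas)
```

One consequence of splitting evenly: every newly added remote layer shrinks the share of all other unconfigured layers, so the eviction job would need to re-check quotas whenever a layer is added.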

