Re: [Geoserver-devel] WPS integration back into the catalog: import, temp layers, dynamic layers

Justin Deoliveira Mon, 19 Jul 2010 08:12:15 -0700

Very interesting problems. Some random thoughts for you inline. Thanks 
for your continued hard work on pushing WPS to the core.


On 10-07-19 7:56 AM, Andrea Aime wrote:
> Hi all,
> I would like to submit to the community three ideas that I'd like
> to implement, one in the short term, the other possibly in the
> short term as well, and one for the future.
>
> As you may know, I've been working to revitalize WPS enough that it may
> become an extension in GS 2.1 (at least, that's the plan).
>
> One of the main attractions of a WPS in GeoServer is that the WPS
> is not stand alone, but it has at its disposal local services and
> a catalog.
> So far that means a WPS process does not have to painfully gather
> data from remote, but it can also get it directly from the local
> catalog. This is great, but it's one way, the outputs are still
> going out in some form (gml, shapefiles, json)
> that the client has to process by itself.
>
> I want to integrate back in the other direction by having an "import"
> process that can be used at the end of a processing chain to save back
> the results into the catalog, so that the result can then be rendered
> by WMS and queried by WFS. Which makes it possible to interact with
> GS WPS with lightweight clients without the limitation of using small
> data sets (not to mention the fact that the result layer can be
> a legitimate new layer to be used long term).
>
> For vectors the import process would take:
> - the feature collection to be stored
> - a layer name
> - the workspace (optional, we can use the default)
> - the target store (optional, on trunk we have a concept of default
>     store). I'd say the target store must exist (and be either a DB, or a
>     directory store)
> - a style name (optional, we can use one of the built ins)
>
> It's evident there is some overlap with restconfig, but a processing
> chain will result in a feature collection, something we cannot
> throw at REST (plus we don't want the data to travel back to the client
> and then again to the server).
> This would be a special case, I don't intend to actually go and redo
> RESTConfig as a set of WPS processes (btw, if you have ideas of how
> to integrate the two without having the data go round the world I'm
> all ears).
Well, there has been talk of integrating restconfig into the core for
2.1.x. So a hackish but relatively easy way to integrate could be to
just use the restlet resources directly, sort of mocking up request
objects.

A cleaner way would be to refactor restconfig into some reusable
command-like objects, and add those objects to the core. I like this
approach and think it could be useful in terms of code reuse even today,
as there is some code overlap between the ui and restconfig. But it is
not a trivial undertaking, to be sure.
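
To make the command-object idea a bit more concrete, here is a rough
sketch in Python rather than GeoServer's Java. Every name in it (Catalog,
AddLayerCommand, RemoveLayerCommand, rest_post_layer, wps_import) is made
up for illustration and does not correspond to existing GeoServer or
restconfig code. The point is only that the REST front end and a WPS
import process could both delegate to the same object:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Catalog:
    """Toy stand-in for the catalog: maps layer name -> store name."""
    default_store: str = "default"
    layers: Dict[str, str] = field(default_factory=dict)


@dataclass
class AddLayerCommand:
    """Hypothetical command: carries the parameters and applies them."""
    layer_name: str
    store: Optional[str] = None  # None means "use the catalog default"

    def execute(self, catalog: Catalog) -> str:
        if self.layer_name in catalog.layers:
            raise ValueError(f"layer already exists: {self.layer_name}")
        store = self.store or catalog.default_store
        catalog.layers[self.layer_name] = store
        return store


@dataclass
class RemoveLayerCommand:
    """Hypothetical command: removes a layer from the toy catalog."""
    layer_name: str

    def execute(self, catalog: Catalog) -> None:
        if self.layer_name not in catalog.layers:
            raise KeyError(f"no such layer: {self.layer_name}")
        del catalog.layers[self.layer_name]


def rest_post_layer(catalog: Catalog, body: dict) -> int:
    """REST-style front end: parse the body, delegate to the command."""
    AddLayerCommand(body["name"], body.get("store")).execute(catalog)
    return 201  # HTTP "Created"


def wps_import(catalog: Catalog, layer_name: str,
               store: Optional[str] = None) -> str:
    """WPS-style front end: same command, no HTTP round trip."""
    AddLayerCommand(layer_name, store).execute(catalog)
    return layer_name
```

With something like this the ui, restconfig and WPS would each only
translate their own inputs into a command and run it.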

> At most it could be useful to add a RemoveLayer process that would
> remove the layer and the underlying contents from the catalog, so
> that a client can actually do the two most common things without having
> to switch protocols (add a layer, remove a layer).
>
> Oh, the process would actually run only if an admin level user is
> invoking it (yeah, would be nice to have more granular administration
> rights, but that's a can of worms I don't intend to open in my spare time)
>
> So ok, this would be step one, and something I'd definitely like to do
> this week.
>
> For step two, let's consider not all processed layers are meant to live
> for a long time. You do some processing, _look_ at the results, decide
> some of the filtering, buffering distances, or stuff like that, is not
> ok, and want to redo the process with different params, and look at the
> results again.
> Of course this can be done by using the above Import process, but
> reality is, you probably don't want to:
> a) have the layer be visible to the world
> b) maybe you don't want to have to manage its lifecycle, the layer is
>      meant to be a throwaway anyways
> So it would be nice to mark a layer as temporary and as private somehow.
> Temporary means the layer (and the data backing it) would disappear into
> thin air some time after it has last been used; private could
> either mean:
> - it would not be advertised in the caps (but anyone knowing its full
>     name could access it)
> - it would be protected using security so that only a certain user can
>     access it
> I would go for the first, since the second implies working on the
> granular security can of worms.
> Also, adding a handling of temp layers sounds relatively easy to
> implement, a little touch in the capabilities transformers, a scheduled
> activity that periodically checks when the layer has last been accessed,
> and it's done. Perfect for spare time coding (whilst more complex
> solutions still can get it using funding, when and if there is some).
>
The temp layer idea makes sense, but having to explicitly skip over temp
layer objects seems a bit error prone. There are quite a few places in
the code that iterate through layers: capabilities, ui, restconfig,
etc. It would be a lot to update, and probably the first thing someone
forgets to do when writing code to iterate over layers.

Obviously some sort of thread local view of the catalog would not work
on its own, since the layers need to live across requests. But I wonder
if it could work in conjunction with a sort of token or key system. What
I am thinking is that the temp layers are stored outside the core
catalog, but can be engaged (thread locally) when the client specifies a
particular token.

How does the WPS send back the info for a temp layer to a client? If it 
is the full OGC request link like a GetFeature or GetMap request the 
token could be used relatively transparently... anyways, just a thought.
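
A very rough sketch of the token idea, in Python for brevity and with
entirely made-up names (TempLayerCatalog, register_temp_layer, engaged),
none of which exist in GeoServer. Temp layers live in a separate map
keyed by token, and only become visible on the thread that engages that
token, so code iterating over layers does not need to know about them:

```python
import threading
import uuid
from contextlib import contextmanager
from typing import Dict, Iterator, List, Optional


class TempLayerCatalog:
    """Toy catalog view: core layers plus token-scoped temp layers."""

    def __init__(self, core_layers: List[str]) -> None:
        self._core = list(core_layers)
        self._temp: Dict[str, List[str]] = {}  # token -> temp layer names
        self._lock = threading.Lock()
        self._local = threading.local()        # per-thread engaged token

    def register_temp_layer(self, name: str,
                            token: Optional[str] = None) -> str:
        """Store a temp layer outside the core list; return its token."""
        token = token or uuid.uuid4().hex
        with self._lock:
            self._temp.setdefault(token, []).append(name)
        return token

    @contextmanager
    def engaged(self, token: str) -> Iterator["TempLayerCatalog"]:
        """Make the token's temp layers visible on this thread only."""
        previous = getattr(self._local, "token", None)
        self._local.token = token
        try:
            yield self
        finally:
            self._local.token = previous

    def layers(self) -> List[str]:
        """Core layers, plus temp layers for the engaged token, if any."""
        token = getattr(self._local, "token", None)
        with self._lock:
            extra = list(self._temp.get(token, [])) if token else []
        return self._core + extra
```

The dispatcher would wrap request handling in something like
"with catalog.engaged(token):" whenever the request carries a token.
Expiring unused tokens after a while is left out of the sketch.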

> Step three is daydreaming. But let me dream for once. Say I have a
> process that generates a layer. It does so in a way that the layer is
> cached, but dynamic: it is computed, but the process used to compute it
> is saved, and it's run every time the input data changes (well, maybe
> driven by a certain polling so that a storm of changes does not result
> in a storm of processing routines running).
> Actually this should not be _so_ hard to implement. Add to the layer
> definition four new entries in the metadata section:
> - the full process definition (as xml)
> - the last reprocessing date
> - the recompute interval
> - the last date an input was changed
> Then add a scheduled job that:
> - knows about all the dynamic layers
> - knows about the sources and has a transaction listener on them
> - runs the processes again when the output is stale and uses
> transactions to change in a single shot the data under the layer's feet.
>
Yeah, I agree, I don't think this is too far fetched. Might be cool
to try and drag in the H2 datastore as the temporary storage... I have
been working on improving its performance lately and hope to get it
usable as a first class datastore in geoserver.
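
For the scheduled job in the dynamic layer proposal, the staleness check
could be about as simple as the sketch below. It is Python for brevity,
and the names (DynamicLayerMetadata, needs_recompute) are invented for
illustration. It also assumes the recompute interval acts as a minimum
gap between two runs, which is one reading of the "polling" remark, not
something the proposal spells out:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DynamicLayerMetadata:
    """The four proposed metadata entries (field names are hypothetical)."""
    process_xml: str               # full process definition, as XML
    last_reprocessed: datetime     # last reprocessing date
    recompute_interval: timedelta  # assumed: minimum gap between runs
    last_input_change: datetime    # last date an input was changed


def needs_recompute(meta: DynamicLayerMetadata, now: datetime) -> bool:
    """True when an input changed since the last run and the interval
    has elapsed, so a burst of changes triggers at most one run."""
    input_changed = meta.last_input_change > meta.last_reprocessed
    interval_elapsed = now - meta.last_reprocessed >= meta.recompute_interval
    return input_changed and interval_elapsed
```

The transaction listener on the sources would only bump
last_input_change; the scheduled job would call needs_recompute for each
dynamic layer and rerun the stored process when it returns True.
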
> How does this sound? I think temporary/private layers are more important
> than this (my dream is to have one day a GeoServer based in-browser
> client that can behave similarly to a desktop gis).
> On the other side it seems the latter is doable without API changes,
> which makes it a low hanging fruit.
>
> Opinions and comments... very welcome!
>
> Cheers
> Andrea
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Sprint
> What will you do first with EVO, the first 4G phone?
> Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
> _______________________________________________
> Geoserver-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/geoserver-devel


-- 
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.
