Sounds pretty interesting, and the general approach sounds good, although
you lost me at the part about streaming and result persistence. Admittedly I
know very little about how the WPS works. How is it that a process will exit
without doing anything? Is the data not being "pulled" in that thread?

Aside from that, a general comment: the number of components that do their
own thread/job/task management seems to be growing, although a few of them
are still just community modules. I wonder if it's worth at some point
trying to come up with a central task manager of sorts. If the task manager
you come up with for WPS processes is generic enough, it might be worth
trying to put it in the core for other components to use.

On Fri, Oct 28, 2011 at 2:44 PM, Andrea Aime
<[email protected]> wrote:

> Hi,
> in the next weeks I'll be working on adding asynchronous execution
> support to GeoServer.
> I'd like to give you a heads up and discuss some design details.
>
> The specification allows asynchronous requests when the caller asks for
> storeExecuteResponse=true and status=true, meaning the actual response
> document is stored somewhere and the status contained in it is updated
> while the process proceeds.
> (By spec, if status is "true" and storeExecuteResponse is "false", the
> service shall raise an exception.)
>
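The flag rule quoted above can be sketched as a small check. This is only an illustration: the class and method names are made up, IllegalArgumentException stands in for whatever service exception the WPS code would actually raise, and "asynchronous" follows the reading given in this mail (both flags set).

```java
// Hypothetical helper, not GeoServer code: validates the two Execute flags.
final class ExecuteFlags {
    private ExecuteFlags() {}

    /**
     * Returns true when the request should run asynchronously, that is when
     * both storeExecuteResponse and status are true. Rejects the combination
     * the spec forbids: status=true with storeExecuteResponse=false.
     */
    static boolean isAsynchronous(boolean storeExecuteResponse, boolean status) {
        if (status && !storeExecuteResponse) {
            // Stand-in for the service exception required by the spec.
            throw new IllegalArgumentException(
                    "status=true requires storeExecuteResponse=true");
        }
        return storeExecuteResponse && status;
    }
}
```

A dispatcher could call isAsynchronous(...) once while parsing the request and branch on the result.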
> The location of the document is reported in the execute response and then
> shall be updated while the process performs the computation.
> The spec does not say where this document should be located, but for ease
> of implementation I propose to make it into another service call, looking
> like:
>
> wps?service=WPS&version=1.0&request=executionStatus&identifier=xyz
>
> This makes it rather natural to implement with our current framework.
>
> Process wise, we already have the factories pass down a ProgressListener
> among the call arguments, so the process can update its status.
>
> Now, how to handle asynchronous process execution and tracking?
> I was thinking of having a ProcessManager interface that the WPS service
> code submits processes to and can also query for their status.
> It might look roughly like this:
>
> import java.util.Map;
>
> interface ProcessManager {
>  /**
>   * Submits the process for execution and returns an id to refer to the
>   * execution later.
>   */
>  String submit(String processName, Map<String, Object> inputs);
>
>  /** Returns the current status of the execution with the given id. */
>  Status getStatus(String executionId);
> }
>
> Where Status is:
>
> class Status {
>  StatusType status; /* queued, paused, executing, complete, ... */
>  double progress;
>  Map<String, Object> output;
> }
>
> The default implementation of the process manager would use a fixed-size
> thread pool, callables and futures to handle the execution, but the
> interface will allow other custom managers to be plugged in (from the
> Spring context).
> For example, people might want to run very long processes (several hours
> or more) that can be restarted from known checkpoints. In that case the
> manager would also need persistent storage of the processes in flight, so
> they can be resumed after a crash and restart of the WPS server. Others
> might want to enforce their own execution policies (e.g., linking priority
> and the number of processes executed to the user).
>
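For discussion, here is a rough sketch of what a thread-pool-backed manager along those lines might look like. Everything beyond submit/getStatus is an assumption made for illustration: the SimpleProcess interface, the registry map passed to the constructor, the FAILED state, and the class name. Progress reporting is left out, and queued executions are reported as EXECUTING because a plain Future cannot tell the two apart.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a real WPS process.
interface SimpleProcess {
    Map<String, Object> execute(Map<String, Object> inputs) throws Exception;
}

enum StatusType { EXECUTING, COMPLETE, FAILED }

final class Status {
    final StatusType status;
    final Map<String, Object> output; // null unless status is COMPLETE

    Status(StatusType status, Map<String, Object> output) {
        this.status = status;
        this.output = output;
    }
}

final class ThreadPoolProcessManager {
    private final ExecutorService pool;
    private final Map<String, SimpleProcess> registry;
    private final Map<String, Future<Map<String, Object>>> executions =
            new ConcurrentHashMap<>();

    ThreadPoolProcessManager(int threads, Map<String, SimpleProcess> registry) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.registry = registry;
    }

    /** Queues the process on the pool and returns an id for later lookups. */
    String submit(String processName, Map<String, Object> inputs) {
        SimpleProcess process = registry.get(processName);
        if (process == null) {
            throw new IllegalArgumentException("Unknown process: " + processName);
        }
        Callable<Map<String, Object>> task = () -> process.execute(inputs);
        String id = UUID.randomUUID().toString();
        executions.put(id, pool.submit(task));
        return id;
    }

    /** Reports the state of a submitted execution without blocking on it. */
    Status getStatus(String executionId) {
        Future<Map<String, Object>> future = executions.get(executionId);
        if (future == null) {
            throw new IllegalArgumentException("Unknown execution: " + executionId);
        }
        if (!future.isDone()) {
            return new Status(StatusType.EXECUTING, null);
        }
        try {
            return new Status(StatusType.COMPLETE, future.get());
        } catch (ExecutionException e) {
            return new Status(StatusType.FAILED, null);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new Status(StatusType.FAILED, null);
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

A persistent or policy-driven manager would keep the same two methods and replace the in-memory executions map with its own storage.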
> Now, the above might look fine, but there is a catch: streaming execution
> and result persistence.
>
> Streaming execution means that most vector processes, and raster ones too,
> compute the result only as data gets pulled from them via iterators or
> tile access. The process will therefore exit from asynchronous execution
> without having computed anything, and may take a long time to actually
> compute once the results are finally accessed.
> Also, the inputs might no longer be there when the result is accessed
> (think of a source layer that was removed in the meantime).
>
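A toy illustration of that streaming behaviour, with no GeoTools or GeoServer types involved: the class name, the "buffered(...)" strings and the counter are all invented for the example. The point is only that execute() returns before any per-feature work has run, and the work happens in whichever thread later calls next().

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical process whose result is a lazy iterator over its input.
final class LazyBufferProcess {
    /** Number of features actually computed so far. */
    final AtomicInteger computed = new AtomicInteger();

    /** Returns immediately; nothing is computed until the result is iterated. */
    Iterator<String> execute(List<String> features) {
        final Iterator<String> source = features.iterator();
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public String next() {
                String feature = source.next();
                computed.incrementAndGet(); // stand-in for the real computation
                return "buffered(" + feature + ")";
            }
        };
    }
}
```

If execute() runs on a pool thread and the iterator is only drained when the response is encoded, the pool thread finishes almost at once and the real cost lands on the thread doing the encoding.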
> If the result is not streaming but fully loaded in memory, there is the
> problem of how many results we can keep in memory (and for how long; this
> should be configurable too).
>
> Long story short, imho we want to write the results out to disk as soon
> as possible, and I guess include that in the "execution" phase from the
> user's point of view.
>
> This changes the process manager, which at this point should take care of
> laying out the results on disk and, when the process is done, returning
> not a map of outputs but a link to the file that contains the response
> XML. That file in turn might link to other documents, which happens if the
> user asked for the output to be returned as references (common if you are
> generating a TIFF, which you probably would not want base64-encoded inline
> in the XML).
>
> This "lay out on disk" part would be pretty common among the various
> implementations, so I guess I'll make a helper object for it that the
> various ProcessManager implementations can reuse.
>
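One possible shape for such a helper, purely as a sketch: the ResultStore name, the one-directory-per-execution layout, the response.xml file name and the placeholder XML are all assumptions, not the WPS response schema. It also assumes the execution id and output names are already safe to use as file names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper that lays out one execution's results on disk.
final class ResultStore {
    private final Path root;

    ResultStore(Path root) {
        this.root = root;
    }

    /**
     * Writes every output to its own file under root/executionId, then writes
     * a response document that references those files by name.
     * Returns the path of the response document.
     */
    Path store(String executionId, Map<String, byte[]> outputs) {
        try {
            Path dir = Files.createDirectories(root.resolve(executionId));
            StringBuilder xml = new StringBuilder("<response>\n");
            for (Map.Entry<String, byte[]> output : outputs.entrySet()) {
                Path file = dir.resolve(output.getKey());
                Files.write(file, output.getValue());
                // Placeholder element, not the real WPS reference syntax.
                xml.append("  <reference href=\"")
                   .append(file.getFileName())
                   .append("\"/>\n");
            }
            xml.append("</response>\n");
            Path response = dir.resolve("response.xml");
            Files.write(response, xml.toString().getBytes(StandardCharsets.UTF_8));
            return response;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A ProcessManager implementation could call store(...) as the last step of the execution and hand back the returned path instead of the output map.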
> Opinions, suggestions?
>
> Cheers
> Andrea
>
>
>
> --
> -------------------------------------------------------
> Ing. Andrea Aime
> GeoSolutions S.A.S.
> Tech lead
>
> Via Poggio alle Viti 1187
> 55054  Massarosa (LU)
> Italy
>
> phone: +39 0584 962313
> fax:      +39 0584 962313
>
> http://www.geo-solutions.it
> http://geo-solutions.blogspot.com/
> http://www.youtube.com/user/GeoSolutionsIT
> http://www.linkedin.com/in/andreaaime
> http://twitter.com/geowolf
>
> -------------------------------------------------------
>
>
> _______________________________________________
> Geoserver-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/geoserver-devel
>



-- 
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.