On 10 December 2011 00:21, Brett Henderson <[email protected]> wrote:

> I've been experimenting with some changes to the way the Osmosis pipeline
> executes.
>
> *Existing Operation*
>
> Currently, the typical interaction between a source task and its sink is
> as follows:
>
>    - Zero or more calls to process(xxxx).
>    - One call to complete() if processing is successful.
>    - One call to release() regardless of success or failure.
>
> This works well enough in most cases. The main disadvantage of the current
> functionality is that *many* classes have to implement lazy initialisation
> and initialise on the first call to process.
>
> *New Operation*
>
> However there's a new feature I'd like to introduce. I'd like "header"
> information to be able to be passed through the pipeline. This will take
> the form of a Map<String, Object> and provide a generic way to pass
> additional metadata through the pipeline. The task interaction would now
> look like:
>
>    - *One call to initialize(Map<String, Object>) at the start of
>    processing. If startup fails it doesn't have to be called.*
>    - Zero or more calls to process(xxxx).
>    - One call to complete() if processing is successful.
>    - One call to release() regardless of success or failure.
>
> *Reasons*
>
> This may be used for something as simple as passing additional information
> such as replication timestamps, but may also be used by closely related
> tasks to exchange more complex objects. My main driver for doing this
> right now is to allow me to decompose the current monolithic tasks used for
> replication into smaller tasks. For example, I can separate the apidb
> schema specific code which extracts data by tracking PostgreSQL specific
> transaction ids from the code that writes the data into change and state
> files. This allows the apidb code to then feed changes into other tasks
> (eg. constant updates streaming over HTTP). Along with the metatags now
> able to be attached to all entities, it is now possible to pass all kinds
> of additional data through the pipeline without extending the core.
>
> The XML tasks already support writing the recently added entity metatags
> as additional entity attributes and I'd like them to support this new
> global metadata as well by adding new XML attributes to the main <osm> or
> <osmChange> elements.
>
> Longer term I'd like to replace the existing Bound class with something
> like this. Bound is currently being treated as a normal Entity like nodes,
> ways and relations but it is awkward and involves a number of hacks. Passing
> all bound information during pipeline startup would be much cleaner (I
> think). But that isn't a trivial task so will have to wait for another day.
>
> *Code Changes*
>
> The pipeline design hasn't changed much since it was introduced so this is
> a fairly significant change. The code is already implemented and at least
> compiles and passes unit tests.
>
> https://github.com/brettch/osmosis/tree/init
>
> All tasks have been updated where necessary to support the new initialize
> method, but I haven't updated tasks to take full advantage of it (eg.
> eliminate lazy initialization logic). Unless I hear any major objections
> I'll merge it into the master branch at least on my repository and probably
> the main openstreetmap/osmosis repository within the next few days.
>

This change is now merged. It's a fairly invasive change so let me know if
you notice any issues.

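
For anyone adapting a custom task, the call sequence described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions: SketchSink, RecordingSink, SketchSource and the "replication.timestamp" key are hypothetical names, not the actual Osmosis interfaces, and entities are plain strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a pipeline sink; not the real Osmosis interface. */
interface SketchSink {
    void initialize(Map<String, Object> metaData);
    void process(String entity);
    void complete();
    void release();
}

/** Records each lifecycle call so the ordering can be inspected. */
class RecordingSink implements SketchSink {
    final List<String> calls = new ArrayList<>();

    @Override
    public void initialize(Map<String, Object> metaData) {
        // Setup can happen here rather than lazily on the first process() call.
        // "replication.timestamp" is an assumed key, chosen for illustration only.
        calls.add("initialize:" + metaData.get("replication.timestamp"));
    }

    @Override
    public void process(String entity) {
        calls.add("process:" + entity);
    }

    @Override
    public void complete() {
        calls.add("complete");
    }

    @Override
    public void release() {
        calls.add("release");
    }
}

/** Hypothetical source that drives a sink through the lifecycle. */
class SketchSource {
    static void run(SketchSink sink, Map<String, Object> metaData, List<String> entities) {
        try {
            sink.initialize(metaData);   // once, at the start of processing
            for (String entity : entities) {
                sink.process(entity);    // zero or more times
            }
            sink.complete();             // only reached if nothing above threw
        } finally {
            sink.release();              // always, on success or failure
        }
    }
}

class LifecycleSketch {
    public static void main(String[] args) {
        Map<String, Object> metaData = new HashMap<>();
        metaData.put("replication.timestamp", "example-timestamp"); // made-up value
        RecordingSink sink = new RecordingSink();
        SketchSource.run(sink, metaData, Arrays.asList("node-1", "way-2"));
        System.out.println(sink.calls);
    }
}
```

The try/finally in the source is what gives the "release() regardless of success or failure" behaviour: if process() throws, complete() is skipped but release() still runs.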
Brett
_______________________________________________
osmosis-dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/osmosis-dev
