Re: [osmosis-dev] Streaming Replication

2012-10-13 Thread Jochen Topf
Very interesting work!

How do you handle new incoming requests? They have to start from a known point
so I guess you have to do an SQL query for each of them? Or do you just read
the existing .osc files from disk and stream them out? This could take a long
time...

Jochen

On Sat, Oct 13, 2012 at 03:43:32PM +1100, Brett Henderson wrote:
 Date: Sat, 13 Oct 2012 15:43:32 +1100
 From: Brett Henderson br...@bretth.com
 To: osmosis-dev osmosis-dev@openstreetmap.org
 Subject: [osmosis-dev] Streaming Replication
 
 Hi All,
 
 For those of you who currently use the minute diffs to keep a local
 database up to date, you may be interested to know that a new form of
 replication has hit the street.
 
 The current replication system is based on a series of static replication
 files that are placed on a web server for clients to download as described
 here:
 http://wiki.openstreetmap.org/wiki/Planet.osm/diffs#Using_the_replication_diffs
 
 It is a very simple mechanism and works well for the existing daily, hourly
 and minutely replication feeds.  Unfortunately it doesn't work well for
 sub-minute replication because it becomes far too chatty.  On the server
 side, the current feeds are generated from cron which also works well down
 to one minute intervals, but the overhead of launching a new process and
 connecting to the database for every replication interval also becomes too
 inefficient for shorter intervals.
 
 To solve this, a new streaming replication mechanism has been developed.
 Under the covers the same database queries are utilised, but the process
 performing the queries runs continuously and polls the database for changes
 at a shorter interval.  It is currently set to poll every 10 seconds, but
 it can be reduced further if required.  The network transport is also
 continuous and holds a single HTTP connection open for the lifetime of
 communication between the server and client.  It is all implemented in
 Osmosis 0.41, the latest release.  If you wish to experiment with the
 server-side tasks, however, note that several bugs have since been fixed in
 the development version.  Internally it uses the JBoss Netty framework,
 which means that it's all event-driven (i.e. it doesn't require a thread per
 client) and should theoretically support a large number of concurrent
 clients.
 
 To quickly see this in action, point your browser at this URL and you
 should see new replication state data become available approximately
 every 10 seconds.
 http://planet.openstreetmap.org/replication/streaming/replicationState/current/tail
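
For readers who would rather watch the stream from a script than a browser, here is a minimal Python sketch. It makes two assumptions that are not confirmed above: that the URL is still being served, and that each state update resembles the key=value (Java properties) layout of the existing state.txt files. The names parse_state and follow are hypothetical helpers, and the framing between updates is deliberately not interpreted; the wire protocol page linked further down is the place to check that.

```python
import urllib.request

# URL taken from the message above; it may no longer be served.
TAIL_URL = ("http://planet.openstreetmap.org/replication/streaming/"
            "replicationState/current/tail")


def parse_state(text):
    """Parse properties-style 'key=value' lines into a dict.

    Assumption: state data looks like the state.txt files of the static
    replication feeds, where colons in timestamps are escaped as '\\:'.
    """
    state = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, '#' comment lines and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        state[key.strip()] = value.strip().replace("\\:", ":")
    return state


def follow(url=TAIL_URL, max_lines=20):
    """Print raw lines from the long-lived HTTP response as they arrive."""
    with urllib.request.urlopen(url) as response:
        for count, raw in enumerate(response, start=1):
            print(raw.decode("utf-8", errors="replace").rstrip())
            if count >= max_lines:
                break
```

Calling follow() holds one HTTP connection open and prints whatever arrives, stopping after max_lines lines; parse_state can then be applied to a captured block of text.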
 
 New Osmosis tasks have been developed to consume this data.  For some basic
 instructions to help you get started, refer to this link:
 http://wiki.openstreetmap.org/wiki/Osmosis/Replication#Client-side_Streaming
 
 If you don't wish to use Osmosis, some basic documentation on the wire
 protocol is available here:
 http://wiki.openstreetmap.org/wiki/Osmosis/Replication#Streaming_Replication_Wire_Protocol
 
 This is very much experimental and bugs will undoubtedly be encountered, so
 please be wary about trusting it to update your database if you've just
 spent two weeks importing a planet file.  However, I'd love to see it get
 some usage and would welcome any feedback.  This is not intended for use in
 updating a local planet file as the existing daily files are better suited
 to that.  For databases that can tolerate a minute delay, the existing
 mechanism is very simple and has proven to be fairly reliable.  But if you
 really need current access to data, and can cope with the additional
 complexity, this should be useful.  The current 10 second delay is not a
 lower limit, but is a good starting point for now.
 
 Cheers,
 Brett

 ___
 osmosis-dev mailing list
 osmosis-dev@openstreetmap.org
 http://lists.openstreetmap.org/listinfo/osmosis-dev


-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298



Re: [osmosis-dev] java.lang.AbstractMethodError: org.openstreetmap.osm.data.osmbin.v1_0.OsmBinV10Writer.initialize

2012-10-13 Thread Brett Henderson
Hi Kim,

On 13 October 2012 00:36, KHOO KIM HWA khoofu...@yahoo.com wrote:

 Hello,

 I got some errors when I tried to convert an .osm file to an osmbin file by
 executing:
 osmosis --read-xml file=idf_main_roads_new.osm --write-osmbin-0.6 dir=./osmbin-full-map-dir

 Thread for task 1-read-xml failed
 java.lang.AbstractMethodError: org.openstreetmap.osm.data.osmbin.v1_0.OsmBinV10Writer.initialize(Ljava/util/Map;)V
     at org.openstreetmap.osmosis.xml.v0_6.XmlReader.run(XmlReader.java:95)
     at java.lang.Thread.run(Unknown Source)

 Can someone give me some ideas to fix the problems?


This error appears to be coming from an Osmosis plugin, not Osmosis
itself.  Recent versions of Osmosis require all tasks to support a new
initialize method, and the plugin's OsmBinV10Writer doesn't provide it.
The plugin will need to be updated to support the latest version of
Osmosis.  If you can't get a newer version of the plugin you'll need to
downgrade to version 0.39 of Osmosis.

Brett


Re: [osmosis-dev] Streaming Replication

2012-10-13 Thread Brett Henderson
Hi Jochen,

On 14 October 2012 06:17, Jochen Topf joc...@remote.org wrote:

 Very interesting work!

 How do you handle new incoming requests? They have to start from a known
 point so I guess you have to do an SQL query for each of them? Or do you
 just read the existing .osc files from disk and stream them out? This could
 take a long time...


It just reads existing .osc files.  The server side is made up of two
processes.  The first extracts data from the database and writes
.state.txt and .osc files in a similar way to the existing replication, but
it runs continuously using a single database connection.  The second
serves the data to clients.  The two talk via an internal HTTP-based
channel so that the extractor can notify the data server when new intervals
have been processed.  Both can be run within a single Osmosis process, but
I usually run them separately.
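
To make the "reads existing files" step concrete, here is a rough Python sketch of how a data server might work out which files a newly connected client still needs. It assumes the files are laid out the way the existing static replication feeds lay them out (a nine-digit sequence number split into three directory levels) and that change files carry a .osc.gz suffix; the message above only says the files are written "in a similar way", so treat both as assumptions. sequence_to_path and catch_up_paths are hypothetical names, not Osmosis APIs.

```python
def sequence_to_path(sequence, suffix):
    """Map a sequence number to a relative path, e.g. 1234567 -> 001/234/567.

    Assumption: same nine-digit, three-level layout as the static feeds.
    """
    digits = f"{sequence:09d}"
    return f"{digits[0:3]}/{digits[3:6]}/{digits[6:9]}{suffix}"


def catch_up_paths(client_sequence, server_sequence, suffix=".osc.gz"):
    """List the change files a client is missing, oldest first.

    client_sequence is the last interval the client has already applied;
    server_sequence is the newest interval the extractor has written.
    """
    return [
        sequence_to_path(seq, suffix)
        for seq in range(client_sequence + 1, server_sequence + 1)
    ]
```

A client that is already up to date gets an empty list and simply waits for the next notification, which matches the expectation that bulk downloads are rare.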

Downloading a long time interval could take a while, but it is perhaps
faster than you'd expect.  I suspect that in most cases the bottleneck will
be the client side trying to consume the data.  I've thought
about writing the data into a database instead, but it's more effort to
both develop and manage.  I'm planning to wait to see if it becomes an
issue.  Downloading bulk data should be relatively rare because most
connected clients should be up to date and just waiting for new data to
arrive.

One other thing to note is that it supports a tree of servers where one
master server feeds data to any number of slaves which in turn feed data to
end users.  Example commands are here:
http://wiki.openstreetmap.org/wiki/Osmosis/Replication#Streaming_Caching

Brett