
Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-21 Thread Jaak Laineste

 To put OSM data live to xmpp is very simple and I don't think it's
 expensive.

Coming back to this slightly older topic: XMPP is a server-based solution, so
you will overload some server. Why not use the good old, free Kazaa network, in
its Skype groupchat reincarnation, so that the delivery channel is nicely
distributed?

There could be traffic limitations in Skype, so that needs to be checked and
tested. Creating a Skype plugin for generating and loading the feed might
also be even easier than with XMPP.

/Jaak



___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes

Ian Dees wrote:

 I'd like to continue this part of the thread. As was discussed by 
 Frederik, I think the end goal should be a real-time OSM stream of 
 what's getting applied to the database. Doing that in a performant way 
 is relatively difficult (which is why we're using Osmosis and minutely 
 diffs right now), but I think we should be striving for having a 
 realtime XML feed.

I have to say I don't see any great reason to strive for it. I don't 
think anybody has ever given a use case which requires such a stream and 
can't work with the diffs.

Given that such a stream is uncacheable (and hence requires much higher 
bandwidth outgoing from the core servers) and much more fragile than the 
diffs, it is not obvious that we should put what would undoubtedly be a 
huge amount of effort into creating and maintaining such a system rather 
than into doing other things.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Lennard

Matt Amos wrote:

 these might be of interest:
 
 http://matt.sandbox.cloudmade.com/

Which would have been fine and dandy in the past, but somebody needs to 
nudge that one into life again, /me thinks.

-- 
Lennard

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes

Frederik Ramm wrote:

 Tom Hughes wrote:
 It's a completely insane solution though. If we want to do it we
 should just do it properly in the database not fart around with stupid 
 hacks in the rails code that break as soon as any updates are not done 
 via rails.
 
 Assuming for a moment that the database was our bottleneck, something 
 that can be done by farting around on a number of easily scalable API 
 servers would of course compare favourably to burdening the 
 not-so-scalable database with triggers and extra write operations, would 
 it not?

The fact that the servers are easily scalable is part of the problem as 
it means that any such logging system involves merging the actions of 
some 80 or so processes spread over 4 separate machines (at present).

That either means some complicated and fragile locking scheme to control 
who is writing to the log at any given time or some scheme for merging a 
whole load of separate logs.
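
To make the second option concrete, here is a minimal Python sketch of merging separate per-process logs into one ordered log. It assumes each process writes its own log already sorted by timestamp and that clocks are synchronised across the machines; `LogEntry`, `process_id` and `merge_logs` are hypothetical names for illustration, not anything in the OSM codebase.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class LogEntry(NamedTuple):
    timestamp: float  # seconds since epoch, as recorded by the writing process
    process_id: int   # hypothetical identifier of the API process
    payload: str      # the logged change, treated as opaque here


def merge_logs(logs: Iterable[Iterable[LogEntry]]) -> Iterator[LogEntry]:
    """Merge per-process logs (each sorted by timestamp) into one ordered stream.

    Ties on timestamp are broken by process_id so the output is deterministic.
    """
    return heapq.merge(*logs, key=lambda e: (e.timestamp, e.process_id))
```

Even in this simplified form the merge can only emit an entry once every input log has advanced past that timestamp, so a single stalled process holds up the whole stream, which hints at why such a scheme could be fragile in practice.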

 Now I don't know how often you manually modify database contents, but I 
 would think that any operation of a scale that would lead us to bypass 
 the rails API would also be very likely to blow apart anyone who listens 
 for edits downstream, so in my eyes there's not much to be gained by 
 streaming these manual override kinds of edits as well.

I'm not thinking about manual modifications. I'm thinking about things 
like the gpx import that are no longer in rails. I think that is only 
likely to spread to include much of the API in the not too distant future.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Bernhard zwischenbrugger

Hi

Maybe you like this:
http://datenkueche.com/osmlive/

If I get nice feedback I will make it zoomable.

Bernhard

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 8:43 AM, Lennard l...@xs4all.nl wrote:
 Matt Amos wrote:

 these might be of interest:

 http://matt.sandbox.cloudmade.com/

 Which would have been fine and dandy in the past, but somebody needs to
 nudge that one into life again, /me thinks.

yeah, sorry. it's on my todo list ;-)

cheers,

matt

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 2:41 AM, Tom Hughes t...@compton.nu wrote:

 Ian Dees wrote:

  I'd like to continue this part of the thread. As was discussed by
 Frederik, I think the end goal should be a real-time OSM stream of what's
 getting applied to the database. Doing that in a performant way is
 relatively difficult (which is why we're using Osmosis and minutely diffs
 right now), but I think we should be striving for having a realtime XML
 feed.


 I have to say I don't see any great reason to strive for it.


Because it's there? Why are we striving to cover the globe with map data? :)


 I don't think anybody has ever given a use case which requires such a
 stream and can't work with the diffs.


I agree, but the point is that minutely-diffs are a minute old. At some
point in the future someone will want to see the data in real time as a
stream. The only reason I can currently think of is because they don't want
to have to deal with downloading the minutely diffs and would rather read a
stream of XML messages, applying each one to their database somehow as they
came in.

Given that such a stream is uncacheable (and hence requires much higher
 bandwidth outgoing from the core servers)


The stream would be uncacheable, but could be repeated by others outside of
the core server so that the bandwidth load was spread amongst the community.


 and much more fragile than the diffs, it is not obvious that we should put
 what would undoubtedly be a huge amount of effort into creating and
 maintaining such a system rather than into doing other things.


Ok, this I'll agree on. My original post was just to talk about it... not
really to do it. But it sounds like we should take baby steps. Let's work
on the minutely diffs first and if some crazy person comes up with a good
use case for streaming, we can talk about it then.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Peter Childs

2009/5/13 Ian Dees ian.d...@gmail.com:

 Ok, this I'll agree on. My original post was just to talk about it... not
 really to do it. But it sounds like we should take baby steps. Let's work
 on the minutely diffs first and if some crazy person comes up with a good
 use case for streaming, we can talk about it then.


The problem is that you can't rebuild the map from a continuing
stream. This is the problem with database replication in general.

If you lose the stream for any reason you have to start again, which
is a nightmare.

Things that can be streamed, like TV and radio, change over time and you
don't need what's gone before. If you miss the beginning of a 24x7 news
channel or a soap you can still work out what's going on after a
few minutes. With a map you have not got a chance.

Maps worry about quality. Streams don't.

I'm not quite sure how your stream would work.

Great in theory, doesn't work in practice.

Peter.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Frederik Ramm

Hi,

Peter Childs wrote:
 The Problem is that you can't rebuild the map from a continuing
 stream, This is the problem with Database Replication in general.

True, but maybe the stream use cases don't require that? Maybe it is 
more important for an application to know in an instant where something 
is being edited, than having complete knowledge of what has been edited 
yesterday?

I don't have a killer app in mind where I would say this works with a 
stream and doesn't work with minute diffs. But I can think of a number 
of applications that would be cooler with a proper stream. I mean, just 
look at Bernhard's application:

http://datenkueche.com/osmlive/

It looks very cool and you have the individual spots lighting up in 
something that looks like real time but then it is five minutes 
delayed and based on chunked diffs - meaning what you see is a 
fabricated replay of what has probably happened, and not the real 
thing. Which diminishes the coolness, if only slightly.

Now I'm not saying we should turn the database inside out to support 
fractionally more coolness.

But saying: We don't intend to support this because we cannot think of 
an application that absolutely requires it, is quite un-OSM, is it not?

Bye
Frederik

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Iván Sánchez Ortega

On Wednesday, 13 May 2009, Ian Dees wrote:
 [...] the point is that minutely-diffs are a minute old. At some point in 
 the future someone will want to see the data in real time as a stream.

If you can't wait *one* minute to see the data, you have a very acute case of 
OSMOCD, and you should see a psychiatrist.

 The only reason I can currently think of is because they don't want 
 to have to deal with downloading the minutely diffs and would rather read a
 stream of XML messages, applying each one to their database somehow as they
 came in.

As a wise man once said, all problems in computer science can be solved by 
adding another indirection layer.

If you really really want a stream, I'm positive it can be hacked with a 
couple of scripts and the minutely diffs.
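
A minimal sketch of such a script in Python, assuming the diff is an osmChange document with create/modify/delete blocks; the function name `stream_changes` is hypothetical, and fetching the diff file is left out entirely.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple


def stream_changes(osmchange_xml: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (action, element_type, element_id) for each element in one diff."""
    root = ET.fromstring(osmchange_xml)
    for action in root:         # <create>, <modify> or <delete> blocks
        for element in action:  # <node>, <way> or <relation>
            yield action.tag, element.tag, element.get("id", "")
```

Fed with each new diff in turn, a generator like this gives consumers one event per changed element, which is a replay of the diff rather than a true live stream.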

-- 
--
Iván Sánchez Ortega i...@sanchezortega.es

A computer is not a television or a microwave; it is a complex tool.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

2009/5/13 Iván Sánchez Ortega i...@sanchezortega.es

 As a wise man once said, all problems in computer science can be solved by
 adding another indirection layer.

 If you really really want a stream, I'm positive it can be hacked with a
 couple of scripts and the minutely diffs.


You have discovered one of my use-cases for the stream: the minutely diffs
should be generated from the stream by slicing the stream up into
minute-long segments and saving them to disk, not the other way around.

From previous discussions with Brett, this is essentially what Osmosis is
doing, but with the database as the input instead of the stream.
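
As an illustration of the slicing idea only, here is a short Python sketch that groups timestamped stream events into minute-long segments. The event shape (a `datetime` plus an opaque payload) and the name `slice_by_minute` are assumptions made for the example; writing each segment to disk is omitted.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def slice_by_minute(
    events: Iterable[Tuple[datetime, str]],
) -> Dict[datetime, List[str]]:
    """Group (timestamp, payload) events by the minute in which they occurred."""
    segments: Dict[datetime, List[str]] = defaultdict(list)
    for timestamp, payload in events:
        # Truncate to the start of the minute to get the segment key.
        minute = timestamp.replace(second=0, microsecond=0)
        segments[minute].append(payload)
    return dict(segments)
```

Each key of the returned dictionary would correspond to one minutely diff file under this scheme.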

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett

Frederik Ramm wrote:
 But saying: We don't intend to support this because we cannot think of 
 an application that absolutely requires it, is quite un-OSM, is it not?

Qualify "application" as "application which actually uses the geodata",
and it's not so far off the mark. We don't need a million tools that
just tell us where people are mapping.
-- 
Jonathan (Jonobennett)

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread andrzej zaborowski

2009/5/13 Jonathan Bennett openstreet...@jonno.cix.co.uk:
 Ian Dees wrote:
     I don't think anybody has ever given a use case which requires such
    a stream and can't work with the diffs.


 I agree, but the point is that minutely-diffs are a minute old. At some
 point in the future someone will want to see the data in real time as a
 stream. The only reason I can currently think of is because they don't
 want to have to deal with downloading the minutely diffs and would
 rather read a stream of XML messages, applying each one to their
 database somehow as they came in.

 The updates to the database aren't records of real-time, real-world
 events; They're just mappers updating parts of the map. Anything which
 analyses that, rather than the data itself as a whole is just
 navel-gazing. It tells you something about the project, but not the
 world it's mapping.

 You're not missing out on anything by having minute-old data.

You might be missing out on a cool visualisation tool though (maybe
what Bernhard is trying to do is similar), but that's the only use
case I can think of right now.

What is a little worrying is that, as far as I see, there's no simple
way to get a copy of the osm data (as in, everything that's in the
database), even a week old -- because the planet file is only a
projection of the data on a plane.  AFAIK Wikipedia manages to
provide full database dumps so technically it should also be possible
for OSM as we still (?) have less data and less traffic than WP.

I'd think the streaming/download and upload (merging) of new data are
two separable tasks that can be provided by separate servers with db
replication between them.

Cheers

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett

andrzej zaborowski wrote:
 You might be missing out on a cool visualisation tool though (maybe
 what Bernhard is trying doing is similar), but that's the only use
 case I can think of right now.

How does that help anyone a) use the data, or b) improve the data? See
ITO's OSM Mapper if you want a *useful* visualisation tool. No live
stream needed there.

 What is a little worrying is that, as far as I see, there's no simple
 way to get a copy of the osm data (as in, everything that's in the
 database), even a week old -- because the planet file is only a
 projection of the data on a plane.  

I have no idea what a "projection of the data on a plane" is, unless
you're talking about an in-flight OSM movie. The planet file is
everything that's in the database, barring history info. Nothing more,
nothing less.

-- 
Jonathan (Jonobennett)

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

2009/5/13 Ian Dees ian.d...@gmail.com:
 2009/5/13 Iván Sánchez Ortega i...@sanchezortega.es

 As a wise man once said, all problems in computer science can be solved
 by
 adding another indirection layer.

 If you really really want a stream, I'm positive it can be hacked with a
 couple of scripts and the minutely diffs.

+1

 You have discovered one of my use-cases for the stream: the minutely diffs
 should be generated from the stream by slicing the stream up into
 minute-long segments and saving them to disk, not the other way around.

why not?

when it's done the other way around it's far, far simpler - just xml
files on disk.

cheers,

matt

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 10:11 AM, Jonathan Bennett 
openstreet...@jonno.cix.co.uk wrote:

 Frederik Ramm wrote:
  But saying: We don't intend to support this because we cannot think of
  an application that absolutely requires it, is quite un-OSM, is it not?

 Qualify application as application which actually uses the geodata,
 and it's not so far off the mark. We don't need a million tools that
 just tell us where people are mapping.


Woah! Since when can OSM tell me what sort of applications I can and can't
write with the open source data that OSM is providing**?

OSM isn't about the geodata, it's about the data. That includes the fact
that it is in the geographic domain, but it also means that we can
manipulate it or store it however we want.

** Provided it meets the requirements of the license that the data is
released under.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett

Ian Dees wrote:
 Woah! Since when can OSM tell me what sort of applications I can and
 can't write with the open source data that OSM is providing**?

You're not being told what to do with the data, but it's being suggested
to you that you can't have it in a particular, resource-intensive format
unless you can justify why you need it over and above an existing, less
resource-hungry format, for an application that does something other
than go "Ooooh, shiny!"

 OSM isn't about the geodata, it's about the data. That includes the fact
 that it is in the geographic domain, but it also means that we can
 manipulate it or store it however we want.

You can. On your own infrastructure.

-- 
Jonathan (Jonobennett)

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread andrzej zaborowski

2009/5/13 Jonathan Bennett openstreet...@jonno.cix.co.uk:
 andrzej zaborowski wrote:
 You might be missing out on a cool visualisation tool though (maybe
 what Bernhard is trying doing is similar), but that's the only use
 case I can think of right now.

 How does that help anyone a) use the data, or b) improve the data? See
 ITO's OSM Mapper if you want a *useful* visualisation tool. No live
 stream needed there.

Cool visualisation tools don't have to comply with a) or b), they just
need to be cool :)


 What is a little worrying is that, as far as I see, there's no simple
 way to get a copy of the osm data (as in, everything that's in the
 database), even a week old -- because the planet file is only a
 projection of the data on a plane.

 I have no idea what a projection of the data on a plane is, unless
 you're talking about an in-flight OSM movie. The planet file is
 everything that's in the database, barring history info.

Yup, barring history info.  One of the dimensions is thrown away, this
operation is called projection.

I don't say I need to have a use case for the full database, but in
any project it's only fair to give contributors a way to download the
entire database with the data they created.

Cheers

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett

andrzej zaborowski wrote:
  Cool visualisation tools don't have to comply with a) or b), they just
 need to be cool :)

So cool you're prepared to pay for the infrastructure to support it?


-- 
Jonathan (Jonobennett)

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread andrzej zaborowski

2009/5/13 Jonathan Bennett openstreet...@jonno.cix.co.uk:
 andrzej zaborowski wrote:
   Cool visualisation tools don't have to comply with a) or b), they just
 need to be cool :)

 So cool you're prepared to pay for the infrastructure to support it?

I didn't say that.  I said there *are* things you're missing out on.

In a different mail you said:
 Ian Dees wrote:
 OSM isn't about the geodata, it's about the data. That includes the fact
 that it is in the geographic domain, but it also means that we can
 manipulate it or store it however we want.

 You can. On your own infrastructure.

Except you can't right now, the dumps don't provide enough information
to duplicate OSM database even on your own infrastructure.

Cheers

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 10:28 AM, Jonathan Bennett 
openstreet...@jonno.cix.co.uk wrote:

 Ian Dees wrote:
  Woah! Since when can OSM tell me what sort of applications I can and
  can't write with the open source data that OSM is providing**?

 You're not being told what to do with the data, but it's being suggested
 to you that you can't have it in a particular, resource-intensive format
 unless you can justify why you need it over and above an existing, less
 resource hungry format, for an application that does something other
 than go Ooooh, shiny!


The whole argument I'm making is that after the initial implementation**,
streaming the data is a lot less resource intensive than what we are
currently doing. Perhaps I don't have the whole picture of what goes on in
the backend, but at some point the changeset XML files are applied to the
database. At this point, we already have the XML changeset that was created
by the client. The stream would simply be mirroring that out to anyone
listening over a compressed HTTP channel.

Of course it could then be propagated to other servers if the bandwidth load
was too great.

One of the clients to this stream might be Osmosis, saving off chunks of
data one minute wide and sending it to planet.openstreetmap.org, for
example.

** ...and I've always said I would be willing to implement this if we
discussed it and decided there was a way to source the data in a technically
feasible way.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Frederik Ramm

Hi,

Jonathan Bennett wrote:
 andrzej zaborowski wrote:
 You might be missing out on a cool visualisation tool though (maybe
 what Bernhard is trying doing is similar), but that's the only use
 case I can think of right now.
 
 How does that help anyone a) use the data, or b) improve the data? See
 ITO's OSM Mapper if you want a *useful* visualisation tool. No live
 stream needed there.

Who are you to say what is useful and what isn't? The presentation from 
SOTM 2007 that I remember most vividly - the wiggly maps - was also 
the most useless.

Every day someone says "let's not map some detail here because it is
useless", and our mantra is "maybe it is just your limited imagination
that makes this look useless". Why suddenly this very different attitude
of yours?

I fully agree that streaming is probably a niche thing, a nice-to-have 
and not a must-have, and I have no problem if the idea is treated as a 
small priority. But dismissing it just because your imagination is too 
limited...?

Bye
Frederik


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 10:33 AM, Jonathan Bennett 
openstreet...@jonno.cix.co.uk wrote:

 andrzej zaborowski wrote:
   Cool visualisation tools don't have to comply with a) or b), they just
  need to be cool :)

 So cool you're prepared to pay for the infrastructure to support it?


I think talking about hardware infrastructure is a little premature at this
point, but yes, I would be happy to set up a server or 3 to send this
streaming data around the world and take the load off of the db/api servers.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Bernhard zwischenbrugger

Jonathan Bennett wrote:
 andrzej zaborowski wrote:
   Cool visualisation tools don't have to comply with a) or b), they just
   
 need to be cool :)
 

 So cool you're prepared to pay for the infrastructure to support it?


   
To put OSM data live to xmpp is very simple and I don't think it's
expensive.

An easy way would be to post it to a xmpp groupchat:

<message type="groupchat" to="osml...@conference.thejabberserver.org/bot">
<osm>geodata here</osm>
</message>

After login it's just a copy to a TCP socket on port 5222.
Everybody who wants the data can log into the groupchat and get all the
new data.
Jabber servers can probably handle the load without a problem (though I am
not sure about that), and maybe it's possible to use an existing Jabber
server such as jabber.org or jabber.ru.
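
For illustration, a groupchat stanza of the shape shown above could be built in Python roughly as follows. This is only a sketch: the room address is a placeholder, `build_groupchat_stanza` is a hypothetical helper, and the login, stream setup and actual socket write that XMPP requires are not shown.

```python
import xml.etree.ElementTree as ET


def build_groupchat_stanza(room_jid: str, osm_payload: str) -> str:
    """Return a <message type="groupchat"> stanza wrapping the OSM payload."""
    message = ET.Element("message", {"type": "groupchat", "to": room_jid})
    osm = ET.SubElement(message, "osm")
    osm.text = osm_payload  # ElementTree escapes special characters on output
    return ET.tostring(message, encoding="unicode")


# Placeholder room address, not a real conference room.
stanza = build_groupchat_stanza("room@conference.example.org", "geodata here")
```

The resulting string is what would be copied to the already authenticated connection.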

I would like to see that. It would be a perfect playground for me.

Bernhard

OSM Live (6 Minutes delay):
http://datenkueche.com/osmlive/






Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes

Bernhard zwischenbrugger wrote:

 To put OSM data live to xmpp is very simple and I don't think it's
 expensive.
 
 An easy way would be to post it to a xmpp groupchat:
 
 <message type="groupchat" to="osml...@conference.thejabberserver.org/bot">
 <osm>geodata here</osm>
 </message>
 
 After login it's just a copy to a tcp socket port 5222.
 Everybody who wants the data can log into the groupchat and gets all the 
 new data.
 Jabber Servers can handle the load without problem (not sure about that 
 ) and maybe its possible to use an existing jabber server like 
 jabber.org, jabber.ru,

Yes, and then as soon as your client disconnects for a second you've lost a
ton of edits and you have no way to resync.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes

Ian Dees wrote:

 The whole argument I'm making is that after the initial 
 implementation**, streaming the data is a lot less resource intensive 
 than what we are currently doing. Perhaps I don't have the whole picture 
 of what goes on in the backend, but at some point the changeset XML 
 files are applied to the database. At this point, we already have the 
 XML changeset that was created by the client. The stream would simply be 
 mirroring that out to anyone listening over a compressed HTTP channel.

You don't want Potlatch's changes then? or changes made by changing 
individual objects rather than uploading diffs?

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 4:40 PM, andrzej zaborowski balr...@gmail.com wrote:
 In a different mail you said:
 Ian Dees wrote:
 OSM isn't about the geodata, it's about the data. That includes the fact
 that it is in the geographic domain, but it also means that we can
 manipulate it or store it however we want.

 You can. On your own infrastructure.

 Except you can't right now, the dumps don't provide enough information
 to duplicate OSM database even on your own infrastructure.

they don't *yet*. brett has been working on full diffs, i.e: diffs
with all edits, whether they were later overridden or not. this would
allow you to fully reproduce the whole database. see
http://planet.openstreetmap.org/history/ for what's been done so far.

Frederik said:
 I fully agree that streaming is probably a niche thing, a nice-to-have
 and not a must-have, and I have no problem if the idea is treated as a
 small priority. But dismissing it just because your imagination is too
 limited...?

+1

i think if we can get the delay on the diffs down from 5 mins to under
2 mins then there's no reason why streaming can't be built on top of
the diffs and be able to support all the things people want to do with
streaming.

cheers,

matt

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 5:15 PM, Tom Hughes t...@compton.nu wrote:
 Ian Dees wrote:
 The whole argument I'm making is that after the initial
 implementation**, streaming the data is a lot less resource intensive
 than what we are currently doing. Perhaps I don't have the whole picture
 of what goes on in the backend, but at some point the changeset XML
 files are applied to the database. At this point, we already have the
 XML changeset that was created by the client. The stream would simply be
 mirroring that out to anyone listening over a compressed HTTP channel.

 You don't want Potlatch's changes then? or changes made by changing
 individual objects rather than uploading diffs?

+1

or even the diffs? any diff where someone creates an element has
negative placeholder IDs, so extra work would have to be done altering
the XML to match the IDs returned by the database.

and the HTTP stream would contain many osmChange documents? that won't
really work with any XML parser i know of... you'd need to pre-parse
it into separate XML documents first.

and how would you take these XML documents on the API servers and
merge them into a consistent ordered stream, ensuring all data
dependencies are satisfied?

all of that in less work than osmosis' diff queries?
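
To give a feel for just the first of those steps, here is a rough Python sketch of rewriting placeholder IDs in an uploaded osmChange document. It assumes a mapping from (element type, placeholder id) to the database-assigned id is already available; `rewrite_placeholder_ids` and `id_map` are hypothetical names, and ordering and dependency handling are not addressed at all.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

IdMap = Dict[Tuple[str, str], str]  # (element type, old id) -> new id


def rewrite_placeholder_ids(osmchange_xml: str, id_map: IdMap) -> str:
    """Replace placeholder ids and references to them with real ids."""
    root = ET.fromstring(osmchange_xml)
    for element in root.iter():
        if element.tag in ("node", "way", "relation"):
            key, attr = (element.tag, element.get("id", "")), "id"
        elif element.tag == "nd":       # way -> node reference
            key, attr = ("node", element.get("ref", "")), "ref"
        elif element.tag == "member":   # relation member reference
            key, attr = (element.get("type", ""), element.get("ref", "")), "ref"
        else:
            continue
        if key in id_map:
            element.set(attr, id_map[key])
    return ET.tostring(root, encoding="unicode")
```

Even this small step needs the whole document parsed and re-serialised on the server before anything can be sent downstream.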

cheers,

matt

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Frederik Ramm

Hi,

Matt Amos wrote:
 i think if we can get the delay on the diffs down from 5 mins to under
 2 mins then there's no reason why streaming can't be built on top of
 the diffs and be able to support all the things people want to do with
 streaming.

What you are talking about is "simulated streaming", not real streaming.
But it would be a good start: establish some kind of simulated streaming
that is based on the diffs and costs us almost nothing (can be done by
someone on their own server off-site!), and when interesting
applications spring from this where everybody says "oh, if these could
only be real-time instead of 2 minutes delayed", then one can still work
on providing the same stream in a live fashion.

By the way, if someone really wants to chase the edge of the database by 
always downloading the latest minute diff, what is the suggested way to 
do this? If he makes only one GET request per minute then the diff he is 
looking for might already be 59 seconds delayed ;-) can any of today's 
hip & trendy messaging protocols be used to painlessly notify anyone who
is interested that there's a new diff ready, instead of having 
over-eager scripts poll the directory every 10 seconds?

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 12:18 PM, Matt Amos zerebub...@gmail.com wrote:

 On Wed, May 13, 2009 at 5:15 PM, Tom Hughes t...@compton.nu wrote:
  Ian Dees wrote:
  The whole argument I'm making is that after the initial
  implementation**, streaming the data is a lot less resource intensive
  than what we are currently doing. Perhaps I don't have the whole picture
  of what goes on in the backend, but at some point the changeset XML
  files are applied to the database. At this point, we already have the
  XML changeset that was created by the client. The stream would simply be
  mirroring that out to anyone listening over a compressed HTTP channel.
 
  You don't want Potlatch's changes then? or changes made by changing
  individual objects rather than uploading diffs?

 +1

 or even the diffs? any diff where someone creates an element has
 negative placeholder IDs, so extra work would have to be done altering
 the XML to match the IDs returned by the database.


These are implementation details that would have to be hammered out after we
talk about design.

You're right, I would prefer to have the database itself (via triggers) dump
to a file/network handle the data that's being written to it. This way, it
would be able to get everything (including Potlatch and diffs) as it was
created.
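
On the placeholder-ID point quoted above, here is a minimal sketch of the
extra work involved: rewriting an uploaded diff so that its negative
placeholder IDs match the IDs returned by the database. It assumes the
osmChange layout used for API 0.6 uploads; the function name
resolve_placeholders, the id_map argument and the sample document are
illustrative assumptions, not existing OSM code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample upload: a new node and a new way that refers to it.
SAMPLE = (
    '<osmChange version="0.6"><create>'
    '<node id="-1" lat="49.0" lon="8.4"/>'
    '<way id="-2"><nd ref="-1"/></way>'
    '</create></osmChange>'
)

# Hypothetical mapping (element type, placeholder id) -> id assigned by the database.
ID_MAP = {("node", -1): 1001, ("way", -2): 2002}


def resolve_placeholders(osmchange_xml, id_map):
    """Return the osmChange XML with placeholder IDs replaced from id_map."""
    root = ET.fromstring(osmchange_xml)
    for el in root.iter():
        if el.tag in ("node", "way", "relation"):
            key, attr = (el.tag, int(el.get("id"))), "id"
        elif el.tag == "nd":
            # Way members always reference nodes.
            key, attr = ("node", int(el.get("ref"))), "ref"
        elif el.tag == "member":
            # Relation members carry their own type attribute.
            key, attr = (el.get("type"), int(el.get("ref"))), "ref"
        else:
            continue
        if key in id_map:
            el.set(attr, str(id_map[key]))
    return ET.tostring(root, encoding="unicode")


print(resolve_placeholders(SAMPLE, ID_MAP))
```

Keying the map by element type as well as ID matters because a node and a
way may both use the placeholder -1 in the same upload.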

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 12:30 PM, Frederik Ramm frede...@remote.org wrote:

 can any of today's
 hip & trendy messaging protocols be used to painlessly notify anyone who
 is interested that there's a new diff ready, instead of having
 over-eager scripts poll the directory every 10 seconds?


The server would need to open up a socket and send out some sort of
notification to whoever is listening whenever a new diff is ready.

I imagine there might even be an intermediate server application that listens to
that notification, grabs the diff, and creates the pseudo-stream for others.
This way, the pseudo-stream would be delayed by N+60 seconds, where N is the
number of seconds it took to create/post/notify/download the diff. That's
pretty darn good.
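
A rough sketch of what such an intermediate application could do once it has
grabbed a diff: walk the document and emit one small event per changed
element, which is the pseudo-stream. It assumes the diff is osmChange XML
with create/modify/delete blocks; iter_changes and SAMPLE are made-up names
for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical minute diff with one element per action.
SAMPLE = """<osmChange version="0.6">
  <create><node id="101" lat="49.0" lon="8.4" version="1"/></create>
  <modify><way id="202" version="3"><nd ref="101"/></way></modify>
  <delete><node id="303" version="2"/></delete>
</osmChange>"""


def iter_changes(osmchange_xml):
    """Yield (action, element_type, element_id) for every changed element."""
    root = ET.fromstring(osmchange_xml)
    for action in root:          # <create>, <modify> or <delete>
        for element in action:   # <node>, <way> or <relation>
            yield action.tag, element.tag, int(element.get("id"))


for change in iter_changes(SAMPLE):
    print(change)
```

Each tuple could then be pushed to whoever is listening, so consumers never
have to parse a whole diff themselves.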

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 7:05 PM, Ian Dees ian.d...@gmail.com wrote:
 On Wed, May 13, 2009 at 12:18 PM, Matt Amos zerebub...@gmail.com wrote:
 On Wed, May 13, 2009 at 5:15 PM, Tom Hughes t...@compton.nu wrote:
  Ian Dees wrote:
  The whole argument I'm making is that after the initial
  implementation**, streaming the data is a lot less resource intensive
  than what we are currently doing. Perhaps I don't have the whole
  picture
  of what goes on in the backend, but at some point the changeset XML
  files are applied to the database. At this point, we already have the
  XML changeset that was created by the client. The stream would simply
  be
  mirroring that out to anyone listening over a compressed HTTP channel.
 
  You don't want Potlatch's changes then? or changes made by changing
  individual objects rather than uploading diffs?

 +1

 or even the diffs? any diff where someone creates an element has
 negative placeholder IDs, so extra work would have to be done altering
 the XML to match the IDs returned by the database.

 These are implementation details that would have to be hammered out after we
 talk about design.

 You're right, I would prefer to have the database itself (via triggers) dump
 to a file/network handle the data that's being written to it. This way, it
 would be able to get everything (including Potlatch and diffs) as it was
 created.

why via triggers?

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 1:09 PM, Matt Amos zerebub...@gmail.com wrote:

 why via triggers?


Because the database is the only aggregation point for the data. There are
many API servers (which would be the ideal spot for creating this data
feed), but my initial thought was that it was quite cumbersome to try and
aggregate the streams from the various API servers (along with time-aligning
them) when the DB server was already doing that for you.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 7:13 PM, Ian Dees ian.d...@gmail.com wrote:
 On Wed, May 13, 2009 at 1:09 PM, Matt Amos zerebub...@gmail.com wrote:

 why via triggers?

 Because the database is the only aggregation point for the data. There are
 many API servers (which would be the ideal spot for creating this data
 feed), but my initial thought was that it was quite cumbersome to try and
 aggregate the streams from the various API servers (along with time-aligning
 them) when the DB server was already doing that for you.

sorry, i wasn't clear in my question: why triggers in particular,
rather than one of the many other features that the DB provides for
doing this?

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 6:30 PM, Frederik Ramm frede...@remote.org wrote:
 Matt Amos wrote:

 i think if we can get the delay on the diffs down from 5 mins to under
 2 mins then there's no reason why streaming can't be built on top of
 the diffs and be able to support all the things people want to do with
 streaming.

 What you are talking about is simulated streaming not real streaming. But
 it would be a good start; establish some kind of simulated streaming that is
 based on the diffs and costs us almost nothing (can be done by someone on
 their own server off-site!),

indeed! good, isn't it? ;-)

 and when interesting applications spring from
 this where everybody says "oh, if these could only be real-time instead of 2
 minutes delayed", then one can still work on providing the same stream in a
 live fashion.

given that nothing is ever truly live - there will be a processing
delay with any method - what's the real advantage of a 1 minute delay
over a 2 minute delay?

 By the way, if someone really wants to chase the edge of the database by
 always downloading the latest minute diff, what is the suggested way to do
 this? If he makes only one GET request per minute then the diff he is
 looking for might already be 59 seconds delayed ;-)

yep... but does another 59 seconds really matter? ;-)

 can any of today's hip &
 trendy messaging protocols be used to painlessly notify anyone who is
 interested that there's a new diff ready, instead of having over-eager
 scripts poll the directory every 10 seconds?

i guess it would be fairly easy to have a CGI script for the next
diff, i.e: after receiving the request it blocks until a new diff is
ready and then returns that diff.
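
a minimal sketch of that blocking idea, not an actual CGI script: the
handler keeps checking a marker for the latest diff and only returns once it
changes, or gives up after a timeout. read_timestamp is a hypothetical
callable returning whatever identifies the newest diff (for example the
contents of a timestamp file); the interval and timeout values are
arbitrary assumptions.

```python
import time


def wait_for_new_diff(read_timestamp, last_seen, poll_interval=1.0,
                      timeout=120.0, sleep=time.sleep, clock=time.monotonic):
    """Block until read_timestamp() differs from last_seen.

    Returns the new marker, or None if nothing changed before the timeout.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        current = read_timestamp()
        if current != last_seen:
            return current
        if clock() >= deadline:
            return None
        sleep(poll_interval)


# Usage with a fake marker source that changes on the third read.
markers = iter(["10:00", "10:00", "10:01"])
print(wait_for_new_diff(lambda: next(markers), "10:00", sleep=lambda s: None))
```

the polling still happens, but only once on the server side rather than in
every client script.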

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 1:27 PM, Matt Amos zerebub...@gmail.com wrote:

 On Wed, May 13, 2009 at 7:13 PM, Ian Dees ian.d...@gmail.com wrote:
  On Wed, May 13, 2009 at 1:09 PM, Matt Amos zerebub...@gmail.com wrote:
 
  why via triggers?
 
  Because the database is the only aggregation point for the data. There
 are
  many API servers (which would be the ideal spot for creating this data
  feed), but my initial thought was that it was quite cumbersome to try and
  aggregate the streams from the various API servers (along with
 time-aligning
  them) when the DB server was already doing that for you.

 sorry, i wasn't clear in my question: why triggers in particular,
 rather than one of the many other features that the DB provides for
 doing this?


Mostly because it would allow us to use the same XML format that everybody
already knows how to parse and because it's what I've worked with in my
limited PostgreSQL experience.

What other features were you thinking about?

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 7:30 PM, Ian Dees ian.d...@gmail.com wrote:
 On Wed, May 13, 2009 at 1:27 PM, Matt Amos zerebub...@gmail.com wrote:
 sorry, i wasn't clear in my question: why triggers in particular,
 rather than one of the many other features that the DB provides for
 doing this?

 Mostly because it would allow us to use the same XML format that everybody
 already knows how to parse and because it's what I've worked with in my
 limited PostgreSQL experience.

why would it allow us to use the XML format? nothing in XML ever goes
near the database.

 What other features were you thinking about?

i was looking at snapshots and transaction IDs to isolate the updated
rows in the history tables.

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 1:33 PM, Matt Amos zerebub...@gmail.com wrote:

 On Wed, May 13, 2009 at 7:30 PM, Ian Dees ian.d...@gmail.com wrote:
  On Wed, May 13, 2009 at 1:27 PM, Matt Amos zerebub...@gmail.com wrote:
  sorry, i wasn't clear in my question: why triggers in particular,
  rather than one of the many other features that the DB provides for
  doing this?
 
  Mostly because it would allow us to use the same XML format that
 everybody
  already knows how to parse and because it's what I've worked with in my
  limited PostgreSQL experience.

 why would it allow us to use the XML format? nothing in XML ever goes
 near the database.


I meant that it would trigger some external executable that would build up
the XML, not that the database would do it.


  What other features were you thinking about?

 i was looking at snapshots and transaction IDs to isolate the updated
 rows in the history tables.


I yield to your judgment on that. I haven't given myself enough time to
explore abusing the database app for such a thing.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos

On Wed, May 13, 2009 at 7:36 PM, Ian Dees ian.d...@gmail.com wrote:
 On Wed, May 13, 2009 at 1:33 PM, Matt Amos zerebub...@gmail.com wrote:
 On Wed, May 13, 2009 at 7:30 PM, Ian Dees ian.d...@gmail.com wrote:
  On Wed, May 13, 2009 at 1:27 PM, Matt Amos zerebub...@gmail.com wrote:
  sorry, i wasn't clear in my question: why triggers in particular,
  rather than one of the many other features that the DB provides for
  doing this?
 
  Mostly because it would allow us to use the same XML format that
  everybody
  already knows how to parse and because it's what I've worked with in my
  limited PostgreSQL experience.

 why would it allow us to use the XML format? nothing in XML ever goes
 near the database.

 I meant that it would trigger some external executable that would build up
 the XML, not that the database would do it.

is the external executable called osmosis?

  What other features were you thinking about?

 i was looking at snapshots and transaction IDs to isolate the updated
 rows in the history tables.

 I yield to your judgment on that. I haven't given myself enough time to
 explore abusing the database app for such a thing.

it's better to get this done without the main db and the rails_port
code diverging too much, so i'm looking for methods which are as
un-invasive as possible.

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees

On Wed, May 13, 2009 at 1:48 PM, Matt Amos zerebub...@gmail.com wrote:

 its better to get this done without the main db and the rails_port
 code diverging too much, so i'm looking for methods which are as
 un-invasive as possible.


I agree. Since it seems like a huge amount of work to augment the current
infrastructure to support this, perhaps it would make more sense to follow
what (I think) Frederik said: use the minutely diffs to create a
pseudo-stream and see what sort of apps build up around it.

What's left on making the diffs work?

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Erik Johansson

Here is an implementation of this idea for LiveJournal:
http://updates.sixapart.com/

It lets you connect to a TCP port and get a live XML feed of all updates on
LiveJournal. It has some cool features, such as discarding data from the
stream when you can't keep up.
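
A small sketch of how a feed could discard data for a slow consumer, using a
bounded buffer. It is not a description of how the LiveJournal stream works
internally: this thread does not say which items get dropped, so the sketch
assumes the oldest ones go first. LossyBuffer is a made-up name.

```python
from collections import deque


class LossyBuffer:
    """Bounded buffer that drops the oldest item when a consumer falls behind."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)
        self.dropped = 0  # number of items discarded so far

    def push(self, item):
        if len(self._items) == self._items.maxlen:
            # The deque is full, so appending will silently evict the oldest item.
            self.dropped += 1
        self._items.append(item)

    def drain(self):
        """Return everything currently buffered and empty the buffer."""
        items = list(self._items)
        self._items.clear()
        return items


buf = LossyBuffer(capacity=3)
for update in range(5):
    buf.push(update)
print(buf.drain(), buf.dropped)
```

Keeping a drop counter lets the consumer notice that it has missed updates
and decide whether to resynchronise from the diffs.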

/Erik


[OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Bernhard zwischenbrugger

Hi all

Is there a possibility to get all new data entered to OSM in realtime?

If someone adds a new road, building, restaurant,... I would like to 
have this data.

There were talks about putting this kind of data on the Jabber network.
Is this already available?

Bernhard


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Ian Dees

On Tue, May 12, 2009 at 4:56 PM, Bernhard zwischenbrugger 
b...@datenkueche.com wrote:

 Hi all

 Is there a possibility to get all new data entered to OSM in realtime?

 If someone adds a new road, building, restaurant,... I would like to
 have this data.

 There were talks about putting this kind of data on the Jabber network.
 Is this already available?


There is no live feed of data available. The closest to live is the minutely
diffs on the planet server.

Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Iván Sánchez Ortega

On Tuesday, 12 May 2009, Bernhard zwischenbrugger wrote:
 Is there a possibility to get all new data entered to OSM in realtime?

No, AFAIK. The closest you can get is the minutely diffs (all the changes done 
in the last minute).

 If someone adds a new road, building, restaurant,... I would like to
 have this data.

Well, head on to planet.openstreetmap.org and download planets and diffs to 
your heart's content :-)


Cheers,
-- 
--
Iván Sánchez Ortega i...@sanchezortega.es

Notice: This e-mail is confidential and should not be used by anyone other
than the original recipient. Reproduction by photocopy, walkie-talkie,
amateur radio, satellite, cable television, projector, smoke signals, Morse
code, Braille, sign language, shorthand or any other medium is not
permitted. Under no circumstances should this e-mail be translated into
French. This e-mail may not be ridiculed, parodied, judged in a
competition, or read aloud in a funny accent while wearing a false
moustache and/or any kind of hat, including but not limited to bandanas.
Do not incite or provoke this e-mail. If you are on medication, you may
experience nausea, disorientation, hysteria, vomiting, temporary loss of
short-term memory and general discomfort when reading this e-mail. Consult
your doctor or pharmacist before reading this e-mail. All models described
in this e-mail are over 18 years old. If you have received this e-mail in
error, it is probably because I was drunk when I wrote the recipient's
address.



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Iván Sánchez Ortega

On Wednesday, 13 May 2009, andrzej zaborowski wrote:
 From the minutely diffs if a new way is created and deleted in the same 
 minute, you would never know about it 

Can't you get the changeset IDs from the diff, then query the API to know the 
exact time of the changeset?

-- 
--
Iván Sánchez Ortega i...@sanchezortega.es

Good people do not need laws to tell them to act responsibly, while bad 
people will find a way around the laws.
 - Plato (427-347 B.C.)



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos

On Tue, May 12, 2009 at 11:50 PM, andrzej zaborowski balr...@gmail.com wrote:
 You can in theory extract all edits, at higher than 1 minute
 granularity, from http://www.openstreetmap.org/browse/changesets
 together with all history.  (From the minutely diffs if a new way is
 created and deleted in the same minute, you would never know about it)

in theory, yes, but please don't as it puts extra strain on the
servers. please use the minute diffs from the planet server instead
:-)

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Tom Hughes

andrzej zaborowski wrote:

 You can in theory extract all edits, at higher than 1 minute
 granularity, from http://www.openstreetmap.org/browse/changesets
 together with all history.  (From the minutely diffs if a new way is
 created and deleted in the same minute, you would never know about it)

Anybody trying such a stunt will be liable to summary blocking when 
caught however.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos

2009/5/13 Iván Sánchez Ortega i...@sanchezortega.es:
 On Wednesday, 13 May 2009, andrzej zaborowski wrote:
 From the minutely diffs if a new way is created and deleted in the same
 minute, you would never know about it

 Can't you get the changeset IDs from the diff, then query the API to know the
 exact time of the changeset?

the way (and the changeset it's in) may not even appear in the diff.
also, changesets are not atomic, so they don't have a single time -
they have a created_at time and a closed_at time which can be up to
24h apart.

however, brett is testing a new form of diffs that contain all edits,
which should solve that problem.

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm

Hi,

Tom Hughes wrote:
 Anybody trying such a stunt will be liable to summary blocking when 
 caught however.

I was waiting for that ;-)

To be just slightly more constructive, the least invasive way of 
querying the API for new data only without changing the code would be to 
make multi-GETs for batches of object IDs just above the highest known 
object ID. That would probably not disrupt services if done by one user, 
but then if one user is allowed to do it, what can we say if 10 others 
wanted to do the same?

Probably the best way to have a live feed - and a technique that has 
been discussed on dev about two years ago - would be to have the rails 
code log all successful database operations into a file which could then 
be retrieved by an independent daemon and fed into whatever distribution 
network you want. That would be about the same thing that database 
replication does, just on a higher level.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Tom Hughes

Frederik Ramm wrote:

 Probably the best way to have a live feed - and a technique that has 
 been discussed on dev about two years ago - would be to have the rails 
 code log all successful database operations into a file which could then 
 be retrieved by an independent daemon and fed into whatever distribution 
 network you want. That would be about the same thing that database 
 replication does, just on a higher level.

It's a completely insane solution though. If we want to do it we should
just do it properly in the database not fart around with stupid hacks in 
the rails code that break as soon as any updates are not done via rails.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos

On Wed, May 13, 2009 at 12:36 AM, Frederik Ramm frede...@remote.org wrote:
 To be just slightly more constructive, the least invasive way of
 querying the API for new data only without changing the code would be to
 make multi-GETs for batches of object IDs just above the highest known
 object ID. That would probably not disrupt services if done by one user,
 but then if one user is allowed to do it, what can we say if 10 others
 wanted to do the same?

the least invasive way is to use the minutely diffs, as it doesn't
touch the API or DB servers at all.

 Probably the best way to have a live feed - and a technique that has
 been discussed on dev about two years ago - would be to have the rails
 code log all successful database operations into a file which could then
 be retrieved by an independent daemon and fed into whatever distribution
 network you want. That would be about the same thing that database
 replication does, just on a higher level.

given that there are more efficient ways of doing the database
replication than aggregating these feeds from all the different API
servers into a coherent whole, i think it's probably better to continue
creating the feed (i.e: diffs) from the database.

unless, of course, you're talking about twittering the updates. that
would be teh moar ;-)

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm

Hi,

Tom Hughes wrote:
 It's a completely insane solution though. If we want to do it we should
 just do it properly in the database not fart around with stupid hacks in 
 the rails code that break as soon as any updates are not done via rails.

Assuming for a moment that the database was our bottleneck, something 
that can be done by farting around on a number of easily scalable API 
servers would of course compare favourably to burdening the 
not-so-scalable database with triggers and extra write operations, would 
it not?

Now I don't know how often you manually modify database contents, but I 
would think that any operation of a scale that would lead us to bypass 
the rails API would also be very likely to blow apart anyone who listens 
for edits downstream, so in my eyes there's not much to be gained by 
streaming these manual override kinds of edits as well.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Bernhard zwischenbrugger

Hi
 http://planet.openstreetmap.org/minute/

That's perfect!!!

Is there also a file with the *newest* data?
Or do I have to read the timestamp file?

I don't want to synchronize a database. The thing I'm thinking
about is a visualization of the current activity.

Bernhard


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm

Hi,

Bernhard zwischenbrugger wrote:
 I don't want to synchronize a database. The thing I'm thinking
 about is a visualization of the current activity.

Google for OSMAware for some inspiration!

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos

On Wed, May 13, 2009 at 1:10 AM, Bernhard zwischenbrugger
b...@datenkueche.com wrote:
 Hi
 http://planet.openstreetmap.org/minute/

 That's perfect!!!

 Is there also a file with the *newest* data?
 Or do I have to read the timestamp file?

reading the timestamp.txt is the best way to do it.
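
a rough sketch of that check, for illustration only. it assumes that
timestamp.txt sits directly in the minute directory and holds a single UTC
timestamp like 2009-05-13T00:10:00Z; neither the exact URL nor the file
format is confirmed in this thread, and the function names are made up.

```python
from datetime import datetime, timezone
from urllib.request import urlopen

# Assumed location, built from the directory URL mentioned in this thread.
TIMESTAMP_URL = "http://planet.openstreetmap.org/minute/timestamp.txt"


def parse_timestamp(text):
    """Parse an assumed 'YYYY-MM-DDTHH:MM:SSZ' string into an aware datetime."""
    parsed = datetime.strptime(text.strip(), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def fetch_timestamp(url=TIMESTAMP_URL):
    """Download the timestamp file and parse it (one small GET request)."""
    with urlopen(url, timeout=10) as response:
        return parse_timestamp(response.read().decode("ascii"))


def has_new_diff(last_seen, current):
    """True if nothing has been seen yet or the timestamp has moved forward."""
    return last_seen is None or current > last_seen


earlier = parse_timestamp("2009-05-13T00:10:00Z")
later = parse_timestamp("2009-05-13T00:11:00Z")
print(has_new_diff(earlier, later))
```

a visualisation would call fetch_timestamp() about once a minute and only
download a diff when has_new_diff() says the timestamp has advanced.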

 I don't want to synchronize a database. The thing I'm thinking
 about is a visualization of the current activity.

these might be of interest:

http://matt.sandbox.cloudmade.com/
http://trac.openstreetmap.org/browser/applications/utils/export/tile_expiry
http://lists.openstreetmap.org/pipermail/dev/2009-February/013934.html
http://vimeo.com/4548155

/shameless plug

cheers,

matt


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Shaun McDonald

Frederik,

On 13 May 2009, at 01:01, Frederik Ramm wrote:

 Hi,

 Tom Hughes wrote:
 It's a completely insane solution though. If we want to do it we should
 just do it properly in the database not fart around with stupid hacks in
 the rails code that break as soon as any updates are not done via rails.

 Assuming for a moment that the database was our bottleneck, something
 that can be done by farting around on a number of easily scalable API
 servers would of course compare favourably to burdening the
 not-so-scalable database with triggers and extra write operations, would
 it not?

 Now I don't know how often you manually modify database contents, but I
 would think that any operation of a scale that would lead us to bypass
 the rails API would also be very likely to blow apart anyone who listens
 for edits downstream, so in my eyes there's not much to be gained by
 streaming these manual override kinds of edits as well.


I really don't want to be attempting to try and collate the edits from  
the api server logs. For a start they don't contain all the  
information that you would need.

There needs to be a fix for the Osmosis method where things are  
committed with a huge delay from the timestamp, however that is still  
the best method of distributing updates of the OSM data. It is then up  
to someone else to do what they like with that. If you want to  
summarise each minutely diff and twitter it, be my guest, though  
remember you need to compress it into 140 chars.

Shaun



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm

Matt,

 the least invasive way is to use the minutely diffs, as it doesn't
 touch the API or DB servers at all.

Sure, but they are (a) delayed by 5 minutes and (b) broken ;-)

I was initially opposed to the concept of diffs. I remember a developer 
meeting in Essen in 2007 where I rather violently requested more 
frequent updates and NickB said something like "we could do daily or
hourly diffs" and I said "I want the f*ing real thing, not canned diffs".

I must say that, especially with the convenience Osmosis brings in 
dealing with them, I have meanwhile changed my mind. The diffs are a 
very crude solution but they work remarkably well, and they are quite 
robust compared to some kind of replication feed that may go out of sync 
at any time.

I still think that there are use cases for almost-realtime feeds but the 
diffs work for most people. - I didn't know the original poster was 
unaware of the diffs; I assumed he must know the diffs and was looking 
for something better!

 given that there are more efficient ways of doing the database
 replication than aggregating these feeds from all the different API
 servers into a coherent whole, 

As I said in another post, I was under the impression that while you can 
easily have any number of servers running API daemons on them, you'd 
rather not stuff too much into the database because at least for write 
requests we'll be stuck with it for a long while to come. But hey, maybe 
I underestimate the Postgres factor ;-)

 unless, of course, you're talking about twittering the updates. that
 would be teh moar ;-)

For once, it would not be TomH who bans an IP range then ;-)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm

Hi,

Shaun McDonald wrote:
 I really don't want to be attempting to try and collate the edits from 
 the api server logs. For a start they don't contain all the information 
 that you would need.

I was not talking about the web server logs, but special log files 
created solely for the purpose of recording, and relaying, changes.

 If you want to summarise 
 each minutely diff and twitter it, be my guest, though remember you need 
 to compress it into 140 chars.

Well, I could spread the content over 60 seconds ;-)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos

On Wed, May 13, 2009 at 1:27 AM, Frederik Ramm frede...@remote.org wrote:
 Matt,

 the least invasive way is to use the minutely diffs, as it doesn't
 touch the API or DB servers at all.

 Sure, but they are (a) delayed by 5 minutes and (b) broken ;-)

we're working on both (a) and (b) at the moment... we'll fix it real
soon now, i promise :-)

 I was initially opposed to the concept of diffs. I remember a developer
 meeting in Essen in 2007 where I rather violently requested more frequent
 updates and NickB said something like "we could do daily or hourly diffs"
 and I said "I want the f*ing real thing, not canned diffs".

the trouble with the f*ing real thing is that, because it needs the
very latest information, it has to hit the database. imagine that
TF*RT is like WMS - every different request has a slightly different
lat/lon/scale, so it's basically uncacheable unless some clever things
are done. granular diffs are like tiles - you only get discrete
chunks, but it makes caching *so* much easier. in fact, you could look
at the files on planet.osm.org as direct access to the cache - no need
to hit the DB, no extra DB load which would be better used serving
editors**. :-)

 I must say that, especially with the convenience Osmosis brings in dealing
 with them, I have meanwhile changed my mind. The diffs are a very crude
 solution but they work remarkably well, and they are quite robust compared
 to some kind of replication feed that may go out of sync at any time.

exactly. because they're just files on disk they're robust against API
downtime or bugs, they're quick to download, etc...

 I still think that there are use cases for almost-realtime feeds but the
 diffs work for most people. - I didn't know the original poster was unaware
 of the diffs; I assumed he must know the diffs and was looking for something
 better!

i think we can find a compromise. if we could get the diff generation
time down from about 5 minutes (and fix (b)!) to 1-2 minutes, would
that be good enough for almost-realtime?

 given that there are more efficient ways of doing the database
 replication than aggregating these feeds from all the different API
 servers into a coherent whole,

 As I said in another post, I was under the impression that while you can
 easily have any number of servers running API daemons on them, you'd rather
 not stuff too much into the database because at least for write requests
 we'll be stuck with it for a long while to come. But hey, maybe I
 underestimate the Postgres factor ;-)

but then a single something has to communicate with all the API
daemons, collate all the API activity, and ensure edits' atomicity,
consistency, isolation and durability... what kind of software might
have these ACID properties, i wonder? ;-)

 unless, of course, you're talking about twittering the updates. that
 would be teh moar ;-)

 For once, it would not be TomH who bans an IP range then ;-)

hey, the postgres guys were happy with OSM using postgres - why
wouldn't twitter be happy? they just re-wrote their backend for better
scalability, so we'd be doing them a favour by testing it!

cheers,

matt

**: yeah, there's going to be an overhead for pulling the minute diffs
out, but that's done once and amortised over all the consumers of the
data.
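
For illustration, a minimal sketch of what one consumer might do with a
single diff after downloading it. It assumes the diffs use the osmChange
XML layout (create/modify/delete blocks wrapping node/way/relation
elements); the sample document is hand-written with made-up IDs, and
summarise_diff is a hypothetical helper, not part of Osmosis.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample in the osmChange layout; IDs and coordinates are made up.
SAMPLE_DIFF = """<osmChange version="0.6">
  <create>
    <node id="1" version="1" lat="51.5" lon="-0.1"/>
  </create>
  <modify>
    <node id="2" version="3" lat="48.2" lon="16.4"/>
    <way id="10" version="2"><nd ref="1"/><nd ref="2"/></way>
  </modify>
  <delete>
    <node id="3" version="4" lat="0.0" lon="0.0"/>
  </delete>
</osmChange>"""


def summarise_diff(xml_text):
    """Count (action, element type) pairs in one osmChange document."""
    counts = Counter()
    root = ET.fromstring(xml_text)
    for action in root:          # <create>, <modify> or <delete>
        for element in action:   # <node>, <way> or <relation>
            counts[(action.tag, element.tag)] += 1
    return counts


summary = summarise_diff(SAMPLE_DIFF)
```

Because each diff is a plain file, a consumer that was offline can simply
run the same function over every file it missed, in order.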


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Paul Johnson

Iván Sánchez Ortega wrote:
 On Tuesday, 12 May 2009, Bernhard zwischenbrugger wrote:
 Is there a possibility to get all new data entered to OSM in realtime?
 
 No, AFAIK. The closest you can get is the minutely diffs (all the changes
 done in the last minute).

It would be cool to get this automagically delivered via XMPP... that
would be handy since both XMPP and OSM are XML.
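
A rough sketch of the idea, assuming one OSM element is carried as a child
of an XMPP groupchat message stanza. The wrapper element, its
urn:example:osm-live namespace, the room address and wrap_in_message are
all made up for this example; no such feed or extension is known to exist.

```python
import xml.etree.ElementTree as ET


def wrap_in_message(osm_element, room_jid):
    """Build a groupchat <message> stanza carrying one OSM element as a child."""
    message = ET.Element("message", {"to": room_jid, "type": "groupchat"})
    # Hypothetical payload wrapper; the namespace is invented for this sketch.
    payload = ET.SubElement(message, "{urn:example:osm-live}change")
    payload.append(osm_element)
    return message


# A made-up node, in the same attribute style the OSM API uses.
node = ET.fromstring('<node id="1" version="1" lat="51.5" lon="-0.1"/>')
stanza = wrap_in_message(node, "osm-live@conference.example.org")
stanza_xml = ET.tostring(stanza, encoding="unicode")
```

Since both formats are XML, the OSM element needs no escaping or
re-encoding; it is appended to the stanza as-is.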


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Ian Dees

Sorry, I lost the thread in Gmail here, but:
On Tue, May 12, 2009 at 7:53 PM, Matt Amos zerebub...@gmail.com wrote:

  unless, of course, you're talking about twittering the updates. that
  would be teh moar ;-)
 


I'd like to continue this part of the thread. As was discussed by Frederik,
I think the end goal should be a real-time OSM stream of what's getting
applied to the database. Doing that in a performant way is relatively
difficult (which is why we're using Osmosis and minutely diffs right now),
but I think we should be striving for having a realtime XML feed.

If we assume that's the goal (ok, it can just be my goal and you guys can
think I'm crazy :)), what do we need to think about or plan for in the
future to make it happen?

DB triggers? API collation? Realtime data stream server**?

I'd love to hear lively, continued discussion on this topic.

-Ian

** Currently, my day job is writing server software for medical devices that
broadcasts streams of XML data over TCP/HTTP channels. I'd love to
spend some time working on this if I knew there was a source for the data.
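
To make the "realtime data stream server" option concrete, here is a
minimal in-process sketch of the fan-out step, assuming each change arrives
as a string of XML. ChangeBroadcaster and max_backlog are hypothetical
names, and a real server would sit behind TCP or HTTP connections rather
than in-memory queues.

```python
import queue


class ChangeBroadcaster:
    """Fan each published change out to every subscriber's own queue."""

    def __init__(self, max_backlog=1000):
        self.max_backlog = max_backlog
        self.subscribers = []

    def subscribe(self):
        """Register a new consumer and return the queue it should read from."""
        q = queue.Queue(maxsize=self.max_backlog)
        self.subscribers.append(q)
        return q

    def publish(self, change_xml):
        """Deliver to all subscribers; drop any whose backlog is full."""
        still_connected = []
        for q in self.subscribers:
            try:
                q.put_nowait(change_xml)
                still_connected.append(q)
            except queue.Full:
                # A slow consumer would otherwise stall everyone else.
                pass
        self.subscribers = still_connected
        return len(still_connected)


hub = ChangeBroadcaster(max_backlog=2)
fast = hub.subscribe()
slow = hub.subscribe()
for i in range(3):
    hub.publish('<node id="%d"/>' % i)
    fast.get_nowait()  # the fast consumer keeps up; the slow one never reads
```

The sketch drops a consumer once its backlog fills up. That consumer then
has no way to catch up from the stream alone, which is the sort of
out-of-sync failure that files on disk avoid.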
