Re: [OSM-dev] question about node identifiers

2009-08-23 Thread Frank O'Dwyer
Frederik Ramm wrote:
 Frank O'Dwyer wrote:
 Are the IDs on poi nodes reliable as unique identifiers for the same 
 POI? I.e. will future updates to that POI have the same node ID or 
 could they have a different one?

 They will have a different id if someone deletes and re-adds them. If 
 they're just moved around or their properties changed, they will 
 retain the id.
Thanks - do you know whether deletion and re-adding is common? I assume 
that would normally happen only in error or is there a use case for that?

I guess then I will have to use some heuristics to figure out if two 
different node ids near each other represent the same POI, when one 
has been added and the other deleted from OSM.

Thanks,
Frank


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] question about node identifiers

2009-08-23 Thread John Smith
--- On Sun, 23/8/09, Frank O'Dwyer frank-...@wordonthestreethq.com wrote:

 Thanks - do you know whether deletion and re-adding is common? I
 assume that would normally happen only in error or is there a use
 case for that?

 I guess then I will have to use some heuristics to figure out if
 two different node ids near each other represent the same POI, when
 one has been added and the other deleted from OSM.

If you parse change sets, a delete-and-re-add shows up as a delete of the old 
node id along with a create for a new node id. If the node was just modified, 
you get a modify entry along with the original id.
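
For illustration, a minimal Python sketch of reading node ids per action out of an osmChange-style document. It assumes the usual layout of `create`, `modify` and `delete` blocks wrapping `node` elements; the sample data and the function name `node_actions` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Made-up sample in osmChange layout: one delete, one create, one modify.
SAMPLE = """<osmChange version="0.6">
  <delete><node id="101" version="3" lat="53.34" lon="-6.26"/></delete>
  <create><node id="202" version="1" lat="53.34" lon="-6.26">
    <tag k="amenity" v="pub"/></node></create>
  <modify><node id="303" version="2" lat="53.35" lon="-6.27">
    <tag k="amenity" v="cafe"/></node></modify>
</osmChange>"""


def node_actions(xml_text):
    """Return the node ids found under each action block."""
    actions = {"create": [], "modify": [], "delete": []}
    root = ET.fromstring(xml_text)
    for block in root:
        if block.tag in actions:
            # Only direct <node> children; ways and relations are ignored.
            for node in block.findall("node"):
                actions[block.tag].append(int(node.get("id")))
    return actions


result = node_actions(SAMPLE)
print(result)
```

A modified node keeps its id (303 here), while a delete plus re-add appears as two unrelated ids (101 and 202).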


  



Re: [OSM-dev] question about node identifiers

2009-08-23 Thread Frank O'Dwyer
John Smith wrote:
 If you parse change sets, you would get the node id to be deleted 
 along with a new node id if there was something added, otherwise if 
 it's just modified you get a modified change along with the original id.
But I would still have to pair them up myself, right?  I would parse out a 
set of deleted nodes and a set of added nodes and would need to figure 
out that (delete x) and (add y) referred to the same logical POI - e.g. 
because they had the same / similar name and properties and were near each 
other? Or are they already paired up in the change set?
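
A rough Python sketch of such a pairing heuristic, for illustration only. It assumes each node is a dict with `id`, `lat`, `lon` and `tags`; the 50 m threshold, the exact-name comparison and the function names are arbitrary choices for this example, not anything OSM defines.

```python
import math


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def same_label(old, new):
    """Same amenity type and same name, ignoring case and outer spaces."""
    if old["tags"].get("amenity") != new["tags"].get("amenity"):
        return False
    name_old = old["tags"].get("name", "").strip().lower()
    name_new = new["tags"].get("name", "").strip().lower()
    return name_old == name_new


def pair_replacements(deleted, added, max_dist_m=50.0):
    """Pair each deleted node with the closest matching added node."""
    pairs = []
    unused = list(added)
    for old in deleted:
        best = None
        for new in unused:
            if not same_label(old, new):
                continue
            d = distance_m(old["lat"], old["lon"], new["lat"], new["lon"])
            if d <= max_dist_m and (best is None or d < best[0]):
                best = (d, new)
        if best is not None:
            pairs.append((old["id"], best[1]["id"]))
            unused.remove(best[1])  # each added node is used at most once
    return pairs


deleted = [{"id": 101, "lat": 53.3400, "lon": -6.2600,
            "tags": {"amenity": "pub", "name": "The Crown"}}]
added = [{"id": 202, "lat": 53.3401, "lon": -6.2601,
          "tags": {"amenity": "pub", "name": "the crown"}},
         {"id": 203, "lat": 53.3401, "lon": -6.2601,
          "tags": {"amenity": "cafe", "name": "The Crown"}}]
pairs = pair_replacements(deleted, added)
print(pairs)
```

A fuzzier name comparison could replace `same_label` if exact matches turn out to be too strict.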

Also when tracking changesets, is the maximum period daily or is it 
possible to get them on a weekly / monthly cycle? Or to put it another 
way, I'm currently experimenting with a snapshot of planet.osm I grabbed 
a few days ago. How often do I need to grab changesets to stay current? 
Say if I want to get a single changeset to cover a period of a month, is 
that possible, or do I need to be mirroring changesets every day?

I also only want to get POI nodes and not ways/relations etc, so I don't 
really need the whole planet.osm, just a subset - is there a recommended 
way to ask for only those to save bandwidth and load on the server? It 
would also save me time on parsing as currently parsing out the amenity 
nodes takes ages.

Last thing I am wondering is if there is a description somewhere of the 
mandatory / optional tags per different type of amenity? I assume this 
changes over time and so if I'm parsing these I need to handle cases 
where a particular expected tag is not there.

Cheers,
Frank



Re: [OSM-dev] question about node identifiers

2009-08-23 Thread John Smith
--- On Sun, 23/8/09, Frank O'Dwyer frank-...@wordonthestreethq.com wrote:

 But I would still have to pair them up myself, right?  I
 would parse out a set of deleted nodes and a set of added
 nodes and would need to figure out that (delete x) and (add
 y) referred to the same logical POI - e.g. because they had
 the same / similar name and properties and were near each other?
 Or are they already paired up in the change set?

It's unlikely someone would delete and add a POI in the same spot; it is 
more likely to be modified.

 Also when tracking changesets, is the maximum period daily
 or is it possible to get them on a weekly / monthly cycle?
 Or to put it another way, I'm currently experimenting with a
 snapshot of planet.osm I grabbed a few days ago. How often
 do I need to grab changesets to stay current? Say if I want
 to get a single changeset to cover a period of a month, is
 that possible, or do I need to be mirroring changesets every
 day?

I tend to use the minutely ones to keep my DB as up to date as possible, 
but I think daily is the longest snapshot available. You could just grab the 7 
daily change files if you only wanted to update once a week.

 I also only want to get POI nodes and not ways/relations
 etc, so I don't really need the whole planet.osm, just a
 subset - is there a recommended way to ask for only those to
 save bandwidth and load on the server? It would also save me
 time on parsing as currently parsing out the amenity nodes
 takes ages.

The vast majority of data in changesets is nodes, but most of these 
wouldn't be amenity nodes. On a half-decent machine it doesn't take much to 
process a daily file through an XML parser to pull out just the amenity 
changes. Also, a node change could turn an amenity into something else if it 
was tagged wrong, so you would need to keep tabs on this and delete such nodes.
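
As a sketch of that bookkeeping in Python: stream through a change file, keep only nodes carrying an `amenity` tag, and drop a node from the local store when it is deleted or loses its amenity tag. The `pois` dict, the function name and the sample data are assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET

# Made-up change file: 101 is deleted, 202 is a new pub,
# 303 is retagged from an amenity to a shop, 404 has no tags at all.
SAMPLE = b"""<osmChange version="0.6">
  <delete><node id="101" version="4"/></delete>
  <create>
    <node id="202" version="1" lat="53.34" lon="-6.26"><tag k="amenity" v="pub"/></node>
    <node id="404" version="1" lat="53.36" lon="-6.28"/>
  </create>
  <modify>
    <node id="303" version="2" lat="53.35" lon="-6.27"><tag k="shop" v="bakery"/></node>
  </modify>
</osmChange>"""


def apply_change(stream, pois):
    """Update the pois dict (node id -> data) from one change file."""
    action = None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            # Remember which action block we are currently inside.
            if elem.tag in ("create", "modify", "delete"):
                action = elem.tag
            continue
        if elem.tag != "node":
            continue
        node_id = int(elem.get("id"))
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        if action == "delete" or "amenity" not in tags:
            # Deleted, or no longer (or never) an amenity: forget it.
            pois.pop(node_id, None)
        else:
            pois[node_id] = {"lat": float(elem.get("lat")),
                             "lon": float(elem.get("lon")),
                             "tags": tags}
        elem.clear()  # release the node's children once handled
    return pois


pois = {101: {"lat": 53.34, "lon": -6.26, "tags": {"amenity": "pub"}},
        303: {"lat": 53.35, "lon": -6.27, "tags": {"amenity": "cafe"}}}
apply_change(io.BytesIO(SAMPLE), pois)
print(sorted(pois))
```

Ways and relations in the file are simply skipped, since only `node` elements are inspected.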

 Last thing I am wondering is if there is a description
 somewhere of the mandatory / optional tags per different
 type of amenity? I assume this changes over time and so if
 I'm parsing these I need to handle cases where a particular
 expected tag is not there.

There is a list on the Map Features wiki page:

http://wiki.openstreetmap.org/wiki/Map_Features

However people are free to choose their own tags as well.


  



Re: [OSM-dev] question about node identifiers

2009-08-23 Thread Frank O'Dwyer
John Smith wrote:
 It's unlikely someone would delete and add a POI in the same spot, 
 they are more likely to be modified.
Cheers, that's what I expected. So I can probably just ignore that case 
for now and run a cleanup script separately later on if needed.
 I tend to use the minutely ones to keep my DB as up to date as 
 possible, but I think daily is the longest snapshot available. You could 
 just grab the 7 daily change files if you only wanted to update once a week.
 
Is there a recommended way to do this so as not to cause undue load on 
osm servers? Any existing code I could use to pull changesets?
 The vast majority of data in changesets is nodes, however most of 
 these wouldn't be amenity nodes, on a half decent machine it doesn't 
 take much to process a daily file through an xml parser to pull out 
 just the amenity changes. Also node changes could turn an amenity into 
 something else if it was tagged wrong so you would need to keep tabs 
 on this and delete them.
Is there already existing ruby code to parse the changesets? So far I've 
been playing around with osmlib but it seems like it only supports .osm 
files.
 There is a list on the Map Features wiki page:

 http://wiki.openstreetmap.org/wiki/Map_Features
   
Cool - exactly what I was looking for - thanks.

Thanks,
Frank 





Re: [OSM-dev] question about node identifiers

2009-08-23 Thread John Smith
--- On Sun, 23/8/09, Frank O'Dwyer frank-...@wordonthestreethq.com wrote:

 Cheers, that's what I expected. So I can probably just
 ignore that case for now and run a cleanup script separately
 later on if needed.

Just keep tabs on the ID numbers; these are unique.

 Is there a recommended way to do this so as not to cause
 undue load on osm servers? Any existing code I could use to pull
 changesets?

Use a mirror.

 Is there already existing ruby code to parse the
 changesets? So far I've been playing around with osmlib but
 it seems like it only supports .osm files.

No idea, most stuff seems to be in C/C++ or java.


  



Re: [OSM-dev] question about node identifiers

2009-08-23 Thread andrzej zaborowski
2009/8/23 Frank O'Dwyer frank-...@wordonthestreethq.com:
 John Smith wrote:
 It's unlikely someone would delete and add a POI in the same spot,
 they are more likely to be modified.
 Cheers, that's what I expected. So I can probably just ignore that case
 for now and run a cleanup script separately later on if needed.

It happens more often with ways than nodes because someone splits the
way at a node and the editor creates two new ways and deletes the old
one, and then even if they're merged back together, the way gets a new
id. (though some editors are smarter about it)

 Is there a recommended way to do this so as not to cause undue load on
 osm servers? Any existing code I could use to pull changesets?

You can run osmosis weekly in a script to download 7 daily diffs and
produce one weekly diff, which you'd then apply, or apply four of
them at the end of the month.  Osmosis can track what new diffs there
are since you last ran it, so you don't need to mess with the names of
the diffs, which include day numbers.
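
To show the idea of collapsing several daily diffs into one, here is a small Python sketch that keeps only the latest action seen for each node id. This is a simplified stand-in written for this example (nodes only, "last action wins"); it is not how osmosis itself works, and the function name and sample data are made up.

```python
import xml.etree.ElementTree as ET

# Two made-up daily diffs, oldest first.
DAY1 = """<osmChange version="0.6">
  <create><node id="1" version="1" lat="53.34" lon="-6.26"/>
          <node id="2" version="1" lat="53.35" lon="-6.27"/></create>
</osmChange>"""
DAY2 = """<osmChange version="0.6">
  <modify><node id="1" version="2" lat="53.34" lon="-6.25"/></modify>
  <delete><node id="2" version="2"/></delete>
</osmChange>"""


def merge_changes(diff_texts):
    """Merge diffs given in chronological order; later entries win."""
    latest = {}
    for text in diff_texts:
        for block in ET.fromstring(text):
            for node in block.findall("node"):
                latest[int(node.get("id"))] = (block.tag, node)
    merged = ET.Element("osmChange", version="0.6")
    for node_id in sorted(latest):
        action, node = latest[node_id]
        # One action block per node keeps the sketch simple.
        ET.SubElement(merged, action).append(node)
    return ET.tostring(merged, encoding="unicode")


merged_xml = merge_changes([DAY1, DAY2])
summary = [(block.tag, block[0].get("id"), block[0].get("version"))
           for block in ET.fromstring(merged_xml)]
print(summary)
```

Applying the merged result to a local store then touches each node once instead of once per day.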

 The vast majority of data in changesets is nodes, however most of
 these wouldn't be amenity nodes, on a half decent machine it doesn't
 take much to process a daily file through an xml parser to pull out
 just the amenity changes. Also node changes could turn an amenity into
 something else if it was tagged wrong so you would need to keep tabs
 on this and delete them.
 Is there already existing ruby code to parse the changesets? So far I've
 been playing around with osmlib but it seems like it only supports .osm
 files.

I know very little about ruby but I'd expect you can reuse the ruby
code for parsing changesets that runs the API server.

Cheers



Re: [OSM-dev] Osmosis problem or problem with planet.osm.bz2?

2009-08-23 Thread Timo Juhani Lindfors
John Smith delta_foxt...@yahoo.com writes:
 If that's the case, shouldn't the osmosis code be updated to call
 bzcat if it's given a bz2 file?

bzcat might not always be present.




Re: [OSM-dev] Osmosis problem or problem with planet.osm.bz2?

2009-08-23 Thread John Smith
--- On Sun, 23/8/09, Timo Juhani Lindfors timo.lindf...@iki.fi wrote:

 bzcat might not always be present.

Neither is java, but people still install and use osmosis anyway.


  



Re: [OSM-dev] Osmosis problem or problem with planet.osm.bz2?

2009-08-23 Thread Timo Juhani Lindfors
John Smith delta_foxt...@yahoo.com writes:
 Neither is java, but people still install and use osmosis anyway.

Ah, I first thought that we'd need some runtime detection for the
existence of bzcat. If we make it mandatory then it could be easier
indeed.
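
For comparison, runtime detection does not have to be much code. A Python sketch (osmosis itself is Java, so this only illustrates the idea): use an external `bzcat` when one is on the PATH and fall back to the built-in `bz2` module otherwise. The function names and the temporary sample file are made up for this example.

```python
import bz2
import os
import shutil
import subprocess
import tempfile


def open_bz2_stream(path):
    """Return (stream, process); process is None for the bz2 fallback."""
    bzcat = shutil.which("bzcat")  # None when bzcat is not installed
    if bzcat:
        proc = subprocess.Popen([bzcat, path], stdout=subprocess.PIPE)
        return proc.stdout, proc
    return bz2.open(path, "rb"), None


def read_all(path):
    """Read the whole decompressed content, whichever backend is used."""
    stream, proc = open_bz2_stream(path)
    try:
        return stream.read()
    finally:
        stream.close()
        if proc is not None:
            proc.wait()  # reap the external bzcat process


# Small self-test with a throwaway compressed file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.osm.bz2")
    with open(sample, "wb") as f:
        f.write(bz2.compress(b"<osm/>"))
    data = read_all(sample)
print(data)
```

Making bzcat mandatory would remove the fallback branch, which is the simplification discussed above.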




[OSM-dev] Ways without nodes

2009-08-23 Thread Stephan Knauss
Hi,

Is there any practical use for ways without nodes? There are over 600 
such ways in the database, the latest one changed on 2009-04-16.

SELECT id FROM planet_osm_ways  WHERE # nodes = 0

I stored an extracted list here:
http://www.stephans-server.de/osm/zeroways.xml

The affected ways don't seem to have anything in common. Some changes 
had even been made by xybot.
Is this a known problem that somehow nodes can get lost during the upload?

Is it of any use for the project to fix these? I also extracted ways 
consisting of only a single node.
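
The same check can also be run against an .osm XML extract instead of the database. A minimal Python sketch, assuming the usual layout of `way` elements with `nd` children; the sample data and the function name are made up for this example.

```python
import xml.etree.ElementTree as ET

# Made-up extract: way 1 has no nodes, way 2 has one, way 3 is normal.
SAMPLE = """<osm version="0.6">
  <way id="1"/>
  <way id="2"><nd ref="10"/></way>
  <way id="3"><nd ref="10"/><nd ref="11"/></way>
</osm>"""


def degenerate_ways(xml_text):
    """Return (ids of ways with no nodes, ids of ways with one node)."""
    empty, single = [], []
    for way in ET.fromstring(xml_text).iter("way"):
        count = len(way.findall("nd"))
        if count == 0:
            empty.append(int(way.get("id")))
        elif count == 1:
            single.append(int(way.get("id")))
    return empty, single


empty_ways, single_node_ways = degenerate_ways(SAMPLE)
print(empty_ways, single_node_ways)
```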

Should the API check this?

Stephan



Re: [OSM-dev] Ways without nodes

2009-08-23 Thread Stefan de Konink
Stephan Knauss wrote:
 Is it of any use for the project to fix these? I also extracted ways 
 consisting of only a single node.

I think you should delete them in one go (changeset), the ZoneImport
script mentioned on the wiki shows you how to do it from the
commandline. If you need help, I'm willing to assist :)


Stefan



Re: [OSM-dev] Osmosis problem or problem with planet.osm.bz2?

2009-08-23 Thread John Smith
--- On Mon, 24/8/09, Timo Juhani Lindfors timo.lindf...@iki.fi wrote:

 Ah, I first thought that we'd need some runtime detection
 for the
 existence of bzcat. If we make it mandatory then it could
 be easier
 indeed.

That would be easier than trying to fix the bz2 decompression code in osmosis, 
and apparently it's quicker too.


  



Re: [josm-dev] Spam in Trac

2009-08-23 Thread Dirk Stöcker
On Sat, 22 Aug 2009, Sebastian Waschik wrote:

 sorry for the German text.  There is new spam in trac.  All spam has the
 same url.  The spam bot is some sort of eliza [1].  Can someone who has
 the right to do so delete that?

Uih. That is no fun anymore. Some of the texts sounded like valid answers 
to the bugs. One even caused plaicy to give a comment. I removed the 
entries and added the URL to the list of BadContents, but I fear the current 
anti-SPAM methods will fail against such attacks. Neither the list of 
BadContents nor the Bayes filters will be able to handle this.

I think we see the first beginnings of a new SPAM era.

Ciao
-- 
http://www.dstoecker.eu/ (PGP key available)


___
josm-dev mailing list
josm-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/josm-dev


Re: [josm-dev] Spam in Trac

2009-08-23 Thread Tobias Wendorff
On Sun, 23.08.2009, 13:26, Dirk Stöcker wrote:
 On Sat, 22 Aug 2009, Sebastian Waschik wrote:

 to the bugs. One even caused plaicy to give a comment.

plaice = the flatfish (German: Scholle)?!




Re: [josm-dev] Spam in Trac

2009-08-23 Thread Sebastian Waschik
Hello,

Dirk Stöcker openstreet...@dstoecker.de
writes:

 On Sat, 22 Aug 2009, Sebastian Waschik wrote:
 I think we see the first beginnings of a new SPAM era.

Some wikis use a captcha if you add many links.  But this spam uses
only one URL, which is rare today (but won't be tomorrow).

Greetings
Sebastian Waschik




Re: [josm-dev] Spam in Trac

2009-08-23 Thread Dirk Stöcker
On Sun, 23 Aug 2009, Sebastian Waschik wrote:

 I think we see the first beginnings of a new SPAM era.

 Some wikis use a captcha if you add many links.  But this spam uses
 only one URL, which is rare today (but won't be tomorrow).

I know. But currently (after a lot of training and fine-tuning as well as 
bugfixing in the spam filter) our misdetection rate is very low and I did 
not want to introduce one more step making the Trac more complicated.

And also I fear the time of captchas will be over very soon as well. There 
are already a lot of captchas which machines can read better than humans.

Currently I have the same problem with my mail spam filter, where I get 
more and more properly and well formed SPAM mails which slip through the 
filters. Anti-SPAM mechanisms used to work more reliably, and SPAMMERS 
keep increasing their efforts. This will be a never ending fight I fear.

Ciao
-- 
http://www.dstoecker.eu/ (PGP key available)




Re: [josm-dev] Spam in Trac

2009-08-23 Thread Lubomir Varga
Hi.

Just a non-professional suggestion: what about adding a vote on tickets or a 
link to report spam? It wouldn't be a machine filter, but Google also uses 
human spam filtering. A human is, IMHO, currently the only 100% reliable way 
to filter spam.

