Re: [Talk-us] Is this a bad import or an experiment?

2017-03-23 Thread andrzej zaborowski
On 23 March 2017 at 17:19, Eric Ladner <eric.lad...@gmail.com> wrote:
> On Thu, Mar 23, 2017 at 9:25 AM andrzej zaborowski <balr...@gmail.com>
> wrote:
>> Unfortunately it looks like someone has started deleting the areas you
>> found.  I looked at a random neighborhood and they were still visible
>> in the tiles, but the map data shows only the small ones, now
>> unconnected to anything as the bigger ones are missing.  I haven't
>> looked at the edit history.
>>
>
> Nobody objected so I'm going through the area and removing the small
> driveway areas and replacing larger ones with service roads and/or parking
> areas as appropriate.

Ah I now see where you proposed this.  You should first contact the
original authors in any case.

It is a shame to lose all the work done when it is easy to retag
properly.  On the other hand it didn't seem too precise and could be
easier to replace than improve.  In any case I wouldn't delete it
before having something to replace the data.  It is orthogonal to the
highway centerlines -- one can exist without the other but eventually
both are useful and both will be added at some point.

Best regards

___
Talk-us mailing list
Talk-us@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Is this a bad import or an experiment?

2017-03-23 Thread andrzej zaborowski
Hi,

On 22 March 2017 at 19:41, Eric Ladner wrote:
> On Wed, Mar 22, 2017 at 1:07 PM Mike N wrote:
>> On 3/22/2017 2:02 PM, Kevin Kenny wrote:
>> > Are small driveways offensive, or is it just the polygonal ones that
>> > don't connect to anything?
>>
>> To me, it's just the disconnected polygons.   Small driveways don't hurt
>> anything, and can only provide information such as telling self-driving
>> cars which driveway to pull into.
>
> Really, any "highway=*" drawn as an outline rather than a center line is a
> problem.   Routers and other processing code expects to follow the way
> segments, not honor its area as somewhere you can drive.

I believe the most popular tag for this currently is area:highway.
There's quite a lot of area mapping going on in OSM now, and there are
some potential uses in addition to rendering.  Originally the
highway=* tag plus area=yes was used, but that was problematic for
various reasons, including confusing routers that don't support
area mapping (all the popular ones), and those ways have long since
been retagged.  Note that the area:highway polygons are not supposed to
be connected to the centerlines, only to each other.

http://wiki.openstreetmap.org/wiki/Proposed_features/area:highway

Unfortunately it looks like someone has started deleting the areas you
found.  I looked at a random neighborhood and they were still visible
in the tiles, but the map data shows only the small ones, now
unconnected to anything as the bigger ones are missing.  I haven't
looked at the edit history.

Best regards



Re: [Talk-us] Public Labs/balloon mapping?

2013-10-28 Thread andrzej zaborowski
Hi,

On 28 October 2013 02:35, Ian McEwen <ianmcorvi...@ianmcorvidae.net> wrote:
> Hi; I've been recently looking around http://publiclab.org/, especially
> at their tools for doing ground-tethered balloon and kite mapping
> (http://publiclab.org/wiki/balloon-mapping). The bulk of the prose on
> the site seems to be activism-oriented -- documenting the BP oil spill,
> Occupy encampments, etc. As you might guess I'm more interested in the
> potential to use this for OSM, but stories of others doing that seem to
> be sparse.
>
> Has anyone here used balloon mapping or these tools (or similar ones)
> who can share experience, pitfalls, etc.?

I've done some kite photography around the San Francisco Bay area and
more recently one session in Seattle, but haven't had time to process
and stitch any images from within the US.  I've been following what
Public Lab / grassrootsmapping.org do, and had a chance to fly a kite
with Jeff Warren and Stuart Long of Public Lab, but as you say their
process and tools are designed for activism, perhaps documentation
(historical, social, not geographical), and not exactly what we need in
OSM.  The MapKnitter tool is great for easy stitching but it's
difficult to get a precise map from it, although it surely would be
a good base for an OSM-oriented tool.  In theory most of the process
can be automated away but there's a shortage of open-source tools for
that.

Public Lab generally (not always) shoot from low altitudes at high
ground resolution, thus covering small areas.  It's possible to go up
to at least 3000 ft, so you can actually cover a couple of square miles
if you allow for a bigger angle than Google Maps etc., which is not so
much of an issue for mapping.  Going high is difficult technically and
possibly legally though, and requires great conditions.

I've done some attempts with balloon mapping and many attempts using a
cheap DIY RC platform (which is gradually improving) but I've had most
success with the kite so far.

These same methods (kites, balloons, drones) are used a lot in
archaeology with established processes, but mostly with commercial
software.

Best regards



Re: [Talk-us] Removing US Bicycle Route tags

2013-06-05 Thread andrzej zaborowski
On 5 June 2013 23:50, Martin Koppenhöfer <dieterdre...@gmail.com> wrote:
> On 05.06.2013 at 19:20, Frederik Ramm <frede...@remote.org> wrote:
>
>> The usual OSM approach would be that if a route is signposted, then it
>> can be mapped - if not, then not.
>
> Somehow the on-the-ground rule was extended to include what is verifiable
> on paper as well. See administrative borders for instance, they are only
> very punctually surveyable.


I think that, more than that, the surveyable / on-the-ground criterion
is extended to things that can be surveyed by asking a local or a few
locals and getting reasonably consistent answers, even when not
signposted in the usual way.  This is sometimes not consistent with the
official answers.  This could be the case with cycling routes, but also
with place names and borders.

(Not a US mapper either except when staying in the US)

Cheers


Re: [Talk-us] User pxptyrone's edits

2013-02-10 Thread andrzej zaborowski
Hi John,

On 29 January 2013 03:21, the Old Topo Depot <oldto...@novacell.com> wrote:
> Message sent to user via osm messaging

Have you had any success communicating with pxptyrone?

If not then I think it makes sense to undelete the objects and tags
that were removed by this user.  Some of it was apparently imported
data, but a lot was user contributed or enhanced with local knowledge.

pxptyrone has 39 changesets altogether and his last 35 changesets
consist of removals, some changesets containing 1000s of objects.

Cheers



[Talk-us] User pxptyrone's edits

2013-01-28 Thread andrzej zaborowski
Hi,

denimboy on IRC mentioned that the label for Bakersfield, CA was
missing and a few other things had disappeared.  Harry Wood found that
Bakersfield's boundary relation was missing the actual outer ring.
This relation was edited by user pxptyrone on Nov 18, when he removed
some of its members with the comment "without trees".  It seems that
almost all of this user's ~30 edits were done over a period of a few
days with this same comment and consisted almost solely of deletions
or of removing attributes from objects (including names, as in
http://www.openstreetmap.org/browse/way/59727184/history).  denimboy
mentions that Bakersfield had a lot of trees mapped, which I guess
could be annoying since he says they were sometimes in the middle of
roads.  I haven't contacted pxptyrone but it would be great if someone
could check with him, or analyze the edits, to see if they need to be
reverted.

Cheers



Re: [Talk-us] MassGIS Building Import - process

2012-12-15 Thread andrzej zaborowski
On 15 December 2012 20:09, Jeff Meyer <j...@gwhat.org> wrote:
> Paul - I've added a few comments and questions about changeset size and
> revert policies on the Import Guidelines & Plan Outline wiki pages.
>
> Are there any recommended changeset size limits and/or revert plan
> practices?

One good practice is not to revert data that is not known to be wrong.
If a big changeset fails halfway through, it's possible to fix the
remaining part to use the nodes that have already been uploaded and
continue, rather than delete thousands of nodes just to create new ones
in the same places.

You can probably now do that in JOSM by downloading the changeset
containing the orphaned nodes, opening it in JOSM together with the data
being uploaded and telling the validator to fix all duplicate nodes.
Myself, I've been using the Python scripts at
http://wiki.openstreetmap.org/wiki/Upload.py in such situations,
although the API is much more stable now than it was a couple of years
ago.
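
To illustrate the idea of re-using nodes that already made it to the server, here is a minimal sketch.  It assumes both datasets are already parsed into dictionaries of node id -> (lat, lon); the function names and the data layout are hypothetical, and this is neither what the JOSM validator nor what the Upload.py scripts actually do.

```python
def reuse_uploaded_nodes(pending, uploaded, precision=7):
    """Map pending node ids to already-uploaded ids at the same position.

    pending, uploaded: dict of node id -> (lat, lon).
    precision: number of decimal places compared (assumed default of 7,
    the resolution at which OSM stores coordinates).
    """
    by_position = {}
    for node_id, (lat, lon) in uploaded.items():
        by_position[(round(lat, precision), round(lon, precision))] = node_id

    mapping = {}
    for node_id, (lat, lon) in pending.items():
        key = (round(lat, precision), round(lon, precision))
        if key in by_position:
            mapping[node_id] = by_position[key]
    return mapping


def rewrite_way(node_refs, mapping):
    """Point a way's node references at the re-used nodes."""
    return [mapping.get(ref, ref) for ref in node_refs]
```

Pending nodes that have no uploaded twin keep their placeholder ids and still need to be created in the next upload.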

The other good practice, but possibly not usable with JOSM alone, is
not to let the program upload the naked nodes in bulk and then the
buildings in bulk.  You can sort the elements in such a way that every
50k-element changeset contains, say, 45k nodes and 5k ways.  The scripts
let you keep the number of elements in a chunk that the next chunk
depends on to a minimum (optimally 0).  This way there's no risk of a
passer-by spotting orphan nodes and deleting some, causing
conflicts in your next chunk.
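
A rough sketch of that ordering, assuming ways are given as a dictionary of way id -> list of node ids.  The names group_by_way and pack_chunks are made up for this example and are not taken from the upload scripts.

```python
def group_by_way(ways, nodes):
    """Build groups in which each way directly follows its new nodes.

    ways:  dict of way id -> list of node ids (insertion order is kept).
    nodes: set of node ids that still have to be uploaded.
    """
    emitted = set()
    groups = []
    for way_id, refs in ways.items():
        group = []
        for ref in refs:
            if ref in nodes and ref not in emitted:
                emitted.add(ref)
                group.append(("node", ref))
        group.append(("way", way_id))
        groups.append(group)
    # Nodes that no way uses become single-element groups at the end.
    for ref in sorted(nodes - emitted):
        groups.append([("node", ref)])
    return groups


def pack_chunks(groups, limit):
    """Pack whole groups into chunks of at most `limit` elements.

    A single group bigger than `limit` still ends up in one chunk.
    """
    chunks, current = [], []
    for group in groups:
        if current and len(current) + len(group) > limit:
            chunks.append(current)
            current = []
        current.extend(group)
    if current:
        chunks.append(current)
    return chunks
```

Because a group is never split, a chunk never leaves behind nodes whose way only arrives in a later chunk; a way may still refer to nodes uploaded in an earlier chunk, which is harmless.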

Cheers



Re: [Talk-us] MassGIS Building Import - process

2012-12-15 Thread andrzej zaborowski
On 15 December 2012 22:45, Paul Norman <penor...@mac.com> wrote:
>> From: andrzej zaborowski [mailto:balr...@gmail.com]
>> Subject: Re: [Talk-us] MassGIS Building Import - process
>>
>> On 15 December 2012 20:09, Jeff Meyer <j...@gwhat.org> wrote:
>>> Paul - I've added a few comments and questions about changeset size
>>> and revert policies on the Import Guidelines & Plan Outline wiki
>>> pages.
>>>
>>> Are there any recommended changeset size limits and/or revert plan
>>> practices?
>>
>> One good practice is not to revert data that is not known to be wrong.
>> If a big changeset fails halfway through it's possible to fix the
>> remaining part to use the nodes that have been uploaded and continue,
>> rather than delete the 1000s of nodes just to create new ones in the
>> same places.
>>
>> You can probably now do that in JOSM by downloading the changeset
>> containing the orphaned nodes, opening in JOSM together with the data
>> being uploaded and telling the validator to fix all duplicate nodes.
>
> This hasn't come up for me on an import but I've tried it with normal
> mapping. I don't believe you can do a fix all on duplicate nodes and instead
> have to resolve them all individually

With a current JOSM it seems you can select all the errors and click
fix, or you can select the "Other duplicate nodes" category and
click fix; I just checked.  But JOSM will add each pair of merged
nodes as an individual operation in the undo history.

I noticed it also merges the duplicate nodes when you're merging two
layers, rather than copying from one layer and pasting onto another.



>> Myself I've been using the python scripts at
>> http://wiki.openstreetmap.org/wiki/Upload.py in such situations,
>> although the api is much more stable now than it was a couple years ago.
>
> What's your workflow? Do changes in JOSM, save and then pass to the scripts?

Yes.  The osm2change script understands the JOSM format and produces
an .osc file.  With the TIGER name expansion the bot produced .osc files
directly, which were reviewed in a text editor.


>> The other good practice, but possibly not usable with JOSM alone, is not
>> to let the program upload the naked nodes in bulk and then the
>> buildings in bulk.  You can sort the elements in such a way that every
>> 50k element changeset contains say 45k nodes and 5k ways.  The scripts
>> let you limit the number of elements in a chunk which the next chunk
>> depends on to the minimum (optimally 0), this way there's no risk of a
>> passer by spotting orphan nodes and deleting some causing you conflicts
>> in your next chunk.
>
> The best way to do this in JOSM alone is to only merge in 1-5k nodes+ways at
> a time, review them, then upload. This also avoids most of the problems
> above.
>
> I would only ever do a 50k object changeset in very limited circumstances
> where I am confident that it is safe to do so. Even then I'd try to keep it
> under 25k normally. For normal imports I'd suggest 10k as a soft limit.


Yes, good points.

Cheers



Re: [Talk-us] MassGIS Building Import - simplify

2012-12-13 Thread andrzej zaborowski
Hi,

On 14 December 2012 02:12, Jason Remillard <remillard.ja...@gmail.com> wrote:
> In my town, there are 5427 buildings. 43,628 nodes, or 8 nodes per
> structure. I did a 0.25 meter simplify on the entire town, and the node
> count went down to 41,809. We are looking at an excess of 5%. 0.25 meter may
> seem tight, but consider that source data was from 30cm and 15cm per pixel,
> and I am sure the algorithms used do some kind of sub pixel interpolation.
> Until I can figure out how to do the simplification from my scripts,
> ..

As I understand it, some processing is already done in PostGIS.  You
could add an ST_SimplifyPreserveTopology(the_geom, 0.25) at the end of
the process and it should have the same effect.
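
For readers who want to see what such a simplification does, below is a plain Douglas-Peucker sketch in Python.  It is only an illustration of the general technique: ST_SimplifyPreserveTopology is documented as a Douglas-Peucker variant, but this sketch does not preserve topology, and it assumes planar coordinates in the same unit as the tolerance (e.g. meters in a projected coordinate system).

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. a closed ring whose ends coincide.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right
```

For example, simplify([(0, 0), (1, 0.1), (2, 0), (2, 2)], 0.25) drops the vertex that sits 0.1 units off the straight edge and keeps the corner.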

Cheers



Re: [Talk-us] Proposed Welcome Working Group meeting (was: Role of the Wiki)

2012-12-11 Thread andrzej zaborowski
Hi,

On 10 December 2012 22:59, Martijn van Exel <m...@rtijn.org> wrote:
> * What has been attempted before? Did it work? Why (not)?

I won't be in the meeting but there's a (small) dataset that could be
used for analyses of whether automated welcome messages work.  I'm not
planning to produce statistics myself but others can run the
statistics that they find useful.  We're sending automated welcome
messages to users whose first node edit is within Poland, and some two
months ago, after talking to Paul Norman, I changed the logic to only
send the message to ~75% of those users so that the other ~25% could
be used for comparisons, with a good probability that receiving
the message is independent of other factors.

Roughly all editing users with UIDs >= 813385 and not divisible by 4
were greeted and those divisible by 4 were not (note that the order of
messages is based on the time of first edit, not on the UID).  The
sample is still small; we see about 4 new users a day on average.
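
The split could be expressed roughly as below.  This is only a sketch: the function names are invented, and it assumes the threshold UID itself is part of the experiment.

```python
# First UID covered by the experiment (from the message above; treating
# the bound as inclusive is an assumption).
FIRST_UID_IN_EXPERIMENT = 813385


def in_experiment(uid):
    """True for users who registered after the 75/25 split was introduced."""
    return uid >= FIRST_UID_IN_EXPERIMENT


def should_greet(uid):
    """Greet about 75% of new users; UIDs divisible by 4 are the control group."""
    return in_experiment(uid) and uid % 4 != 0


def is_control(uid):
    """True for users in the experiment who get no welcome message."""
    return in_experiment(uid) and uid % 4 == 0
```

Using the UID rather than a random draw makes the assignment reproducible later, when someone compares the two groups.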

I maintain a simple Python API to send OSM messages[1] and if it is
decided that it would be good to start welcoming users in a different
region with a specific message I can add that fairly easily, provided the
rate of new users is below the rate of messages allowed by the Rails port.

Cheers

1. https://github.com/balrog-kun/osm-scripts



Re: [Talk-us] MassGIS building conversion

2012-12-11 Thread andrzej zaborowski
On 11 December 2012 03:46, Jason Remillard <remillard.ja...@gmail.com> wrote:
> I think the plan should be to give the local mappers some time to hand merge
> the data in, record what towns were done by hand, then import the rest of the
> data with a script (of course not putting buildings over buildings).

Sounds like a reasonable plan.

> Most of the towns in the state don't have local mappers, most of the
> state will be done with the script.



 Re: license, I've heard personally from MassGIS director Christian Jacqz
 that all of Mass' GIS data is public domain based on the state's open policy
 on public records. I think this is fantastic, a link to a law or similar
 would still be useful.

 Is there address data that could be conflated with these buildings?


 Yes, there is a level 3 parcel data that has addresses in it. The buildings
 need to go in first, so we can do the addresses next year.

It might be better to include the addresses in the tags of the same
dataset so that the manually imported areas don't have to be worked on
twice, at least those addresses that can be associated with a
building.

It's a shame that the average height information was not included in
the attributes of those roofprints.  This information is often
estimated when people draw buildings manually by adding the floor
count, but actual heights for buildings in OSM more often come from
imports.
http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/datalayers/structures.html
has an interesting description of how the heights were calculated for
MassGIS buildings in the areas covered by state LIDAR data for the
purpose of shifting the outlines to their correct places.  I wonder
how difficult it would be to repeat this process to add the height=
tags.

Cheers



Re: [Talk-us] [Imports] City of Seattle imports

2012-12-08 Thread andrzej zaborowski
Hi,

On 6 December 2012 14:04, Paul Norman penor...@mac.com wrote:
 Jeff Meyer @
 http://lists.osm.org/pipermail/imports/2012-December/001602.html
  How will you handle object conflation?
 Manually and methodically.

 Although not a trivial problem there is work underway on code that will
 handle the address-address conflation
 (https://github.com/pnorman/addressmerge).
 Address-POI and address-building conflation remains a purely manual job.

 It shouldn't be too hard to merge addresses with buildings they are within
 when the building has only one address within in and the building does
 not itself have an address. Addresses placed by building doors outside
 the building itself add complications but I expect they are solvable.
 Having said that it shouldn't be too hard, it's not trivial.

We've had a few localized address+building outline imports.  I
haven't developed clean re-usable tools to do those tasks because the
datasets are so different every time, in scope, in data format,
accuracy, etc.  But usually I wrote simple Python scripts for each
one, made of similar, re-usable blocks of code.  In the case of
address nodes placed roughly inside the buildings' outlines you can
re-use this script:
https://gist.github.com/4241509

Call it with two arguments:
./merge-building-addrs.py buildings.osm addresses.osm

It'll produce an output.osm file containing the same buildings and
addresses, but with the address tags assigned to the outlines where
there was exactly one address node contained within it.  Everywhere
else the nodes and the buildings remain intact.
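
The rule "exactly one address node inside the outline" can be sketched as follows.  This is not the code from the gist; the data layout (dictionaries with "ring", "pos" and "tags" keys) and the function names are assumptions made for the example, and coordinates are treated as planar.

```python
def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, closed or open."""
    x, y = point
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Count edges that a horizontal ray going right from the point crosses.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def merge_addresses(buildings, addresses):
    """Copy address tags onto outlines containing exactly one address node.

    buildings: dict of id -> {"ring": [(x, y), ...], "tags": {...}}
    addresses: dict of id -> {"pos": (x, y), "tags": {...}}
    Returns the set of address ids that were merged; everything else is
    left untouched.
    """
    merged = set()
    for building in buildings.values():
        hits = [addr_id for addr_id, addr in addresses.items()
                if point_in_polygon(addr["pos"], building["ring"])]
        if len(hits) == 1:
            building["tags"].update(addresses[hits[0]]["tags"])
            merged.add(hits[0])
    return merged
```

The loop compares every building with every address, which is fine for a neighborhood-sized extract; a larger dataset would want a spatial index.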

In more complicated cases, e.g. nodes placed near the entrances
slightly outside the building, as you mention, I found it's easiest to
load both datasets into two Postgres tables and assign the addresses
using a PostGIS function like ST_DWithin() with a max. distance of a
couple of feet.

Similarly, finding buildings that overlap with existing buildings in
OSM can be easily done with ST_Intersects().  I'd usually add a
boolean column "overlap" containing true if the building collides
with an existing one.  Then in JOSM I'd skip those buildings with
overlap=true from upload and deal with each one manually.

Address node conflation is easier done in Python because it requires
a little text processing.

Cheers



Re: [Talk-us] US Addressing

2012-11-29 Thread andrzej zaborowski
On 29 November 2012 21:12, Toby Murray <toby.mur...@gmail.com> wrote:
> Now for the hard part. Converting and conflating the information with
> the non-trivial number of addresses I have already collected on the
> ground.

Compared to conflating names or geometries, addresses are not a
problem because the street and the housenumber form a unique id.
Before comparing them it is worth splitting the street name into
words, reordering the words within the name to order them lexically,
and abbreviating them using a not-necessarily-perfect word list (such
as that from TIGER or Nominatim), to account for variations.  We've
had to do that for one city recently but it turned out to be a simple
check.
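
A small sketch of such a comparison key follows.  The word list here is a tiny made-up sample, not the TIGER or Nominatim list, and the function names are hypothetical.  The sketch abbreviates before sorting so that both spellings of a word end up in the same position.

```python
# Illustrative sample only; a real list would come from TIGER or Nominatim.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "north": "n",
    "south": "s",
}


def street_key(name):
    """Normalize a street name: lowercase, abbreviate, sort the words."""
    words = name.lower().replace(".", "").split()
    words = [ABBREVIATIONS.get(word, word) for word in words]
    return " ".join(sorted(words))


def address_key(street, housenumber):
    """Key identifying an address regardless of street spelling variants."""
    return (street_key(street), housenumber.strip().lower())
```

With this key, address_key("North Main Street", "12") and address_key("Main St. N", "12") compare equal, so a plain dictionary or set lookup is enough to find addresses that are already mapped.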

The (150 LOC) conversion script for that data would read 4 files:
* the new addresses in a particular format,
* an output of an overpass/xapi query for elements tagged with
  addr:housenumber=*,
* an output of an overpass/xapi query for ways tagged with building=*,
* an output of an overpass/xapi query for named highways.

It would output:
* an .osm file adding the addresses missing from OSM, either attached
  to existing buildings or added as nodes,
* a list of potentially missing streets whose names appeared in the addresses.

Cheers



Re: [Talk-us] street name expansion thoughts

2012-11-26 Thread andrzej zaborowski
On 27 November 2012 04:26, Toby Murray <toby.mur...@gmail.com> wrote:
> On Sun, Nov 25, 2012 at 7:34 PM, Clifford Snow <cliff...@snowandsnow.us>
> wrote:
>> I've been cleaning up an area of Jackson County, NC and found roads where
>> the name expansion algorithm failed to expand all of the abbreviations. For
>> example Yellow Bird Br Road.
>> http://www.openstreetmap.org/?lat=35.36564&lon=-83.24253&zoom=17&layers=M
>> (not fixed yet)
>>
>> I didn't write down the problems thinking at first it was just an
>> abnormality.
>>
>> I think it might be worthwhile to look at the names in the TIGER database
>> for Jackson County NC to see why the names are not being expanded properly.
>> I'll try to pull up a list tomorrow to see if it can help improve the
>> expansion algorithm.
>
> I assume the "Br" is supposed to be "Branch"? It seems to just be an
> oddity in the TIGER data. The name field, where you would see "Main"
> for "North Main Street", has the value of "Yellow Bird Br".
>
> It might be because this road seems to have two type suffixes: Branch
> and Road. But the TIGER data model only allows for one so they shoved
> the first one into the name field. Ideally (IMO) they should have put
> "Branch" in its unabbreviated form into the name field. But I guess
> that would have been too easy. Have you found very many of these?

Another common example is Cemetery Road (Cem Rd) or XYZ Lake Road (Lk Rd).

For this reason the Python expansion bot that ran on the western US
allows multiple prefixes and suffixes regardless of the name_type tag,
but this required adding some special cases.

Cheers



Re: [Talk-us] TIGER expansion bot

2012-11-26 Thread andrzej zaborowski
On 27 November 2012 03:53, Serge Wroclawski <emac...@gmail.com> wrote:
> On Mon, Nov 26, 2012 at 9:50 PM, Clifford Snow <cliff...@snowandsnow.us>
> wrote:
>> I did look at your tiger.py script. I think br might also stand for branch
>> as well as bridge. Also, I've seen mtn for Mountain.
>
> How would one determine whether a br would be bridge or branch? If we
> can't, then the script can't expand it.

In the TIGER docs br is only for branch, though there may be cases
that deviate from this.

Cheers



Re: [Talk-us] Abbreviating names in tools

2012-10-07 Thread andrzej zaborowski
On 6 October 2012 21:57, Werner Poppele <popp...@hm.edu> wrote:
> Hi Toby,
>
> there is a typo in line 262 of file shorten.c:
>
> L"hauptbanhof", L"hbf",
>
> must be
>
> L"hauptbahnhof", L"hbf",

Thanks, fixed.  Note that the German word list is just a stub and
needs input from a more knowledgeable person than me.

Cheers



Re: [Talk-us] Announcing Remap-a-tron

2012-09-01 Thread andrzej zaborowski
On 1 September 2012 15:28, Mike N <nice...@att.net> wrote:
> On 8/31/2012 11:17 PM, Martijn van Exel wrote:
>> I want to add The Remap-A-Tron to the ever growing list of tools
>> designed to support the ongoing remapping effort.
>
> That's a fantastic application!

Isn't this because it directly uses the data that OSM is not supposed
to use?  I guess it's less of a problem with roads in the continental
US, but elsewhere, when a user is pointed at a possibly blank spot on
a map with no imagery available, you can only expect them to attempt
to copy what they saw in the application.

Cheers



Re: [Talk-us] Discardable TIGER tags

2012-07-29 Thread andrzej zaborowski
On 29 July 2012 23:21, Toby Murray <toby.mur...@gmail.com> wrote:
> On Sun, Jul 29, 2012 at 3:25 PM, Ian Dees <ian.d...@gmail.com> wrote:
>> While we're incrementing every single version number of TIGER data, we
>> should think about expanding the road names, too. Using the prefix and
>> suffix data already on the majority of the ways makes this pretty
>> fool-proof, so where it makes sense I think we should do that, too.
>
> Please check your glasses and re-read my message :) I am absolutely
> not proposing an automated edit.
>
> I have a working JOSM that automatically deletes tiger:upload_uuid.
> The way I implemented it mirrors JOSM's concept of uninteresting
> tags. This means that there is a default list of discardable tag
> keys. The list can be modified in the advanced preferences. Search for
> tags.uninteresting in the preferences if you want to see how it
> works now.
>
> The question is just what goes in the default list that gets populated
> the first time you fire up a new JOSM version with this feature.
>
> So far I have:
>
> tiger:upload_uuid - definite yes
>
> tiger:source - No opinions
>
> tiger:separated - mixed feelings. If I could just remove "no" values
> there seems to be unanimous support but that would require doing it
> differently, probably hardcoding it in the source instead of going off
> of a user preference.

Or extending this user preference's syntax, probably not much work.


> tiger:tlid - there seems to be support for removing it although I do
> recall someone opposing it strongly in the past as Anthony mentioned.
> In theory it lets you link back to a specific TIGER object. In
> practice it seems minimally useful with way splitting/merging and a
> fairly high degree of certainty that an automated TIGER 2011+ reimport
> where this could actually be used is probably not going to happen.

I have opposed removing tiger:tlid in the past but this tag is
unlikely ever to be used in practice.  This information could also be
gathered from the history dumps, even in the case of way splits, which
are easy to detect.  (For people interested in DBpedia and linked data
it's probably nice to see a direct external key of another database in
a database, but it's apparent that this data is not being maintained.)

Cheers



Re: [Talk-us] Seeing things you don't care about in the database

2012-06-11 Thread andrzej zaborowski
On Mon, Jun 11, 2012 at 4:47 PM, Nathan Edgars II <nerou...@gmail.com>
wrote:
> I agree with this. But I'm not sure that there is a solution. You can use
> XAPI/Overpass API to download only roads in an area, but you get conflicts
> (or worse, you move a node and screw up something else without realizing it)
> when nodes are shared with other non-downloaded features. This can happen
> directly (road passing through a building or the IMO bad practice of using
> roads as landuse borders) or indirectly (e.g. road - parking lot - building
> - landuse).

One option would be for every object returned from an API query to
have a complete/incomplete flag.  This flag would be set if an object
(e.g. a node) is part of another object that has not been downloaded
because it's not on the same layer.  If the editor sees such an
object being modified, it pulls all the parent elements from the
API.
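
To make the proposal concrete, here is a sketch of how such a flag could be computed on the server side for nodes.  No such flag exists in the API as far as this message goes; the function name and the data layout are purely hypothetical.

```python
def flag_incomplete(returned_nodes, returned_ways, all_ways):
    """Flag returned nodes that also belong to ways left out of the reply.

    returned_nodes: set of node ids included in the API reply.
    returned_ways:  set of way ids included in the API reply.
    all_ways:       dict of way id -> list of node ids for every way the
                    server knows about in the area (hypothetical input).
    Returns a dict of node id -> True if the node is "incomplete".
    """
    flags = {node_id: False for node_id in returned_nodes}
    for way_id, refs in all_ways.items():
        if way_id in returned_ways:
            continue
        for ref in refs:
            if ref in flags:
                flags[ref] = True
    return flags
```

An editor would then fetch the parent elements of a flagged node before letting the user move or delete it.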

Cheers



Re: [Talk-us] Special issues in LA remap

2012-06-06 Thread andrzej zaborowski
On 6 June 2012 09:07, Steve All <steve...@softworkers.com> wrote:
> andrzej replied:
>> Is it a pressing issue though?  Mike N already said this, but the
>> license redaction algorithm is being designed to do no more damage
>> than a revert of the tainted edits, with the exception of undeletions
>> mentioned by NE2.  So, by my understanding, the best you can get by
>> reverting edits is a state similar to that which you'll obtain by
>> doing nothing and moving on to actual useful mapping.
>
> SteveA here:  Then I think what might make most sense is to point Charlotte,
> me, and other readers of this list to Mike N's license redaction algorithm
> thread.  I guess I missed that.

I was referring to the post at the beginning of this thread.  You're
right that the redaction algorithm is being created at this time so
it's hard to know what it's going to look like.  However it's quite
clear that a plain revert of all tainted edits would produce a
(mostly) clean dataset and I believe that is the baseline assumption
for the redaction algorithm.  From there all the work happening is
done to minimize the damage, make it better than a plain automatic
revert if possible.

So all I'm saying is that a plain revert of the edits is not going to
produce better results than just doing nothing, because the redaction
algorithm is likely to do a job that's at least as good.  On top of
that the definition of what is tainted is still changing (there are
places where the current definition as written on the wiki gives an
almost opposite effect to what is intended -- incompatible data would
be preserved and compatible data removed).

Replacing data with TIGER 2011 roads might be a better idea but it's
orthogonal to the license change, it can be done before as well as
after the change.

Cheers



Re: [Talk-us] Road Expansions and Seattle Imports (if you're the Seattle importer, please read!)

2012-05-19 Thread andrzej zaborowski
On 19 May 2012 21:02, Serge Wroclawski <emac...@gmail.com> wrote:
> On Sat, May 19, 2012 at 1:31 PM, Mike N <nice...@att.net> wrote:
>> On 5/19/2012 12:52 PM, Serge Wroclawski wrote:
>>
>> Ultimately, the only way to get a 100% successful upload is to query the
>> changeset on failure to determine what was actually uploaded, then resume
>> from that point.
>
> While true, that method in isolation is a bit complex and should be
> unnecessary.
>
> The OSC upload method is transactional in nature. If you get a
> positive response from it, you can know the upload succeeded. The
> issue is that there can be a network interruption between the upload
> completion and the return response. During that time the transaction
> has completed successfully but the client won't know.
>
> If there is a 1:1 correlation between OSC and changeset, that's fine,
> but generally uploads are often done with large changesets.
>
> There are a few ways to handle this (including async upload methods in
> the API code, which has been proposed), but right now, AFAIK, there's
> no public code for making reliable mass uploads to OSM.

As mentioned on IRC, I trust the tool set I did the original TIGER
expansion with.  I've used it for many other uploads too and I never
had to revert a single upload, i.e. it was always possible to resume
the process in case of a network error or conflict.  I've heard of two
or three more people who used it successfully.  It also (tries to)
solve the problem of nodes being uploaded in bulk before ways get
uploaded, and ways before relations, which minimizes the probability of
someone spotting orphan nodes in the middle of an import and removing
them.

Unfortunately it's not very user friendly and requires manual
intervention sometimes.  I'd try improving it instead of starting from
the beginning though.

It's documented here: http://wiki.openstreetmap.org/wiki/Upload.py

It is also limited to xml files that fit in memory because it uses the
python ElementTree parser right now.
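
A minimal sketch of how the memory limit could be avoided, assuming the
input is a plain .osm file with node/way/relation elements directly
under the root.  This is not how upload.py works today; count_elements
is a hypothetical helper that only shows the streaming pattern with
ElementTree.iterparse from the standard library.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(source):
    """Stream an OSM XML file and count nodes, ways and relations.

    Hypothetical helper, not part of upload.py.
    """
    counts = Counter()
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag in ("node", "way", "relation"):
            counts[elem.tag] += 1
            # Drop the finished children so memory use stays bounded.
            root.clear()
    return counts


sample = io.BytesIO(
    b'<osm version="0.6">'
    b'<node id="-1" lat="0" lon="0"/>'
    b'<node id="-2" lat="0" lon="1"/>'
    b'<way id="-3"><nd ref="-1"/><nd ref="-2"/>'
    b'<tag k="highway" v="residential"/></way>'
    b'</osm>'
)
counts = count_elements(sample)
```

A real uploader would do its per-element work at the point where the
counter is incremented, before the element is discarded.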

Cheers



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread andrzej zaborowski
On 11 May 2012 22:17, Dale Puch dale.p...@gmail.com wrote:
 I understand the script checks for only one instance of the abbreviation.
 My point was what if someone manually expanded ONE of the abbreviations,
 leaving st something street?  Is that checked for?  The question also
 applies to Dr something Dr previously changed to Dr something Drive, and
 possibly directionals as well.  Serge seems to be doing a good job with
 this, and this is just feedback so there aren't any incorrect expansions.

The old script deals with those by keeping a list of abbreviations
that come as a suffix and those that come as a prefix, from the TIGER
documentation.  It checks suffixes starting from the end, so if you
have "St something St E" or "St something St East", it'll only check
"E" or "East" and then "St" and then stop because "something" is not a
known suffix.

There are cases where something can be both a suffix and a prefix, but
those cases are known from the TIGER documentation.

Note that "St something St" can be "Saint something St", but it can
also be "State something St".  The script uses a list of things that
can be saints and those that can be state-owned.
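
Roughly, the suffix pass described above can be sketched like this.
It is an illustration only, not the actual script: the SUFFIXES table
is a tiny made-up subset rather than the TIGER list, and prefix
handling and the Saint/State distinction are left out.

```python
# Illustrative subset only; the real lists come from the TIGER documentation.
SUFFIXES = {
    "St": "Street",
    "Ave": "Avenue",
    "Dr": "Drive",
    "N": "North",
    "S": "South",
    "E": "East",
    "W": "West",
}


def expand_suffixes(name):
    """Expand known suffix abbreviations, walking from the end of the name."""
    words = name.split()
    i = len(words) - 1
    # Never treat the first word as a suffix.
    while i > 0:
        word = words[i]
        if word in SUFFIXES:
            words[i] = SUFFIXES[word]
        elif word not in SUFFIXES.values():
            # Not a known suffix in either form: stop here.
            break
        i -= 1
    return " ".join(words)


expanded = expand_suffixes("St something St E")
```

Both "St something St E" and "St something St East" come out as
"St something Street East"; the leading "St" is never looked at by
this pass because the walk stops at "something".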

Cheers



Re: [Talk-us] First bona fide mini-roundabout spotted

2012-05-07 Thread andrzej zaborowski
On 7 May 2012 22:28, Nathan Mills nat...@nwacg.net wrote:
 So this is not/should not be a mini_roundabout? It seems a little silly to
 call it anything else, since the city just dug a hole in the center of the
 existing intersection, built a circular curb, and planted a tree:

 http://g.co/maps/e2gsv

 What about this one? Also a full on roundabout?

 http://g.co/maps/d6n74

These two don't give priority to the vehicles going round so they're
not roundabouts going by the wiki definition.

This is one in Berkeley that I have previously tagged mini_roundabout:
http://g.co/maps/5uc9t (using the size criteria) but from Google it
looks like there are stop signs only from Mathews St even though there
are roundabout symbols on the island.  So should this be redrawn as a
circle, with highway=stop nodes from N/S only?

Cheers



Re: [Talk-us] Excellent progress, u.s.

2012-04-14 Thread andrzej zaborowski
On 14 April 2012 03:30, John F. Eldredge j...@jfeldredge.com wrote:
 One drawback to this new-coordinate technique is that, in some cases, the 
 tainted nodes will have been in the proper locations to match the real world. 
  So, in order to make the cleanup bot not consider the nodes to be tainted, 
 we have to knowingly make the map data less accurate than it had formerly 
 been.


It also will remain tainted, only the bot will not know about it and
consider it untainted.  So it's a way to trick the bot and potentially
put the OSM Foundation under legal risk.

This is why the remapping effort before the bot run is finished, is a
Really Bad Idea.  It is both more time costly and it is provoking
users to cause incompatible IP to be preserved over the license
change, often unconsciously.  See all the ideas of using the
incompatible IP to create the new compatible IP, such as using the
tainted coastlines data to remap small islands.  (RichardF said he
does not agree it's a bad idea, but he wouldn't explain which point he
disagrees with or why)

Cheers



Re: [Talk-us] name expansion bot (Re: Imports information on the wiki)

2012-02-17 Thread andrzej zaborowski
On 18 February 2012 00:43, TC Haddad tchad...@gmail.com wrote:
 On Fri, Feb 17, 2012 at 2:42 PM, Paul Johnson ba...@ursamundi.org wrote:
 Sounds like a problem for the renderer to solve.  It's possible for
 renderers to easily create abbreviations when full words are not desired,
 but impossible for automated translation and renderers to expand
 abbreviations accurately.


 I *guess*, but that seems unrealistic expectation to put on GPS hardware
 manufacturers. Particularly if name expansion is inappropriate in one town,
 but perfectly appropriate in another, and usual practice is to load a large
 area (like a whole state or region) into a GPS device. How would a device
 renderer know to even try to distinguish across community lines?

 From the user perspective it would be nicer if the names in the data set
 correspond to the actual street sign names. In Portland the street name is
 Tillamook and if I am on NE Tillamook that just helps me know the
 quadrant of town. Northeast on its own doesn't tell me anything if I
 can't see the rest of the street name.

 This example feels more like tag for reality, vs tag for the renderer
 argument, and the short prefixes feel more like reality in Portland, but
 maybe that's just me...

 I do see the value if text-to-speech is the real reason this was done
 though.

I think the other benefit is the consistency.  If you decide to
abbreviate in some areas, skip prefixes in other areas, use full words
in yet other areas, the edit wars are never going to end.  I don't see
much value in following the street signs too closely because those are
a very specific use case, where, depending most likely only on someone
in highways management and the manufacturer of the signage AND
circumstances like there being only a limited space to mount the signs
at some crossings, segments of the name can be skipped, abbreviated,
reordered.  There's no requirement that the signage be consistent
along a single street.  In the end it has little to do with what
people call the street.

I also want to state, for the record, that the name expansion I ran over
the western states, in general should have dealt with distinguishing
between Saints/States/Streets, single letter streets, and other
ambiguities correctly.  There were cases where it failed for various
reasons (e.g. poor metadata in TIGER) but in general it worked fine.
There were also cases in the first two states I worked on, where the
script had a bug that I fixed only after uploading the first changeset
and had to send a second correcting changeset.  The script has also
flagged a number of cases as too ambiguous and those I dealt with
manually.  Some errors surely still remain.  But all in all the
technical side of the name expansion should not be a problem.

Cheers



Re: [Talk-us] name expansion bot (Re: Imports information on the wiki)

2012-02-17 Thread andrzej zaborowski
On 15 January 2012 14:35, Mike N nice...@att.net wrote:
 On 1/15/2012 8:28 AM, Nathan Edgars II wrote:
 Actually the script also expanded the W to West. But my point is that it
 is a TIGER entry error, and any future script needs to take into account
 that these exist and people may have already fixed them to the correct
 names.

  Agreed- if we're thinking of a bot that periodically fixes everything, we
 need a special tag that says abbreviation_bot=back_off (but perhaps not so
 verbose) - something that tells the bot not to touch the name because it is
 unusual and has been manually checked.

Running a bot periodically would be a really bad idea IMHO.  Even if
you add such a tag, the average mapper is not going to know about it.
An edit war between an unsuspecting human and a script is something
that shouldn't happen even if the mapper turns out to be wrong.

It only makes sense where there's a huge import that has simply not
considered the abbreviations.  Such an import will also affect what
first-time local mappers think is the agreed way of doing things in
OSM so imho it made sense to do a one-time name expansion over all
highways.

Cheers



Re: [Talk-us] Analysis of US road network and TIGER status

2012-02-16 Thread andrzej zaborowski
On 17 February 2012 01:04, Mike N nice...@att.net wrote:
 On 2/16/2012 6:38 PM, Martijn van Exel wrote:
 I did a state-by-state and county-by-county analysis of the road
 network in the US. I focused particularly on TIGER and user-related
 metrics.
 Results (with maps of course) are here:

 https://oegeo.wordpress.com/2012/02/16/the-state-of-the-openstreetmap-road-network-in-the-us/
 I'd love to hear your ideas for further analysis, and other feedback.


  A very good analysis!   I have some observations, that may or may not be
 significant.

  For the Average version increase over TIGER ways - the effects of the name
 expansion bot may have created the green states out west 'balrog-kun'.
 This might also apply to the 'Percentage untouched TIGER ways' map.

Yes, I had the same feeling watching the visualisations.  You can see
a clear difference between east and west because of the one
additional revision.

But yes, this is a nice analysis.

Cheers



Re: [Talk-us] Remapping tips

2012-02-07 Thread andrzej zaborowski
On 7 February 2012 05:47, Nick Hocking nick.hock...@gmail.com wrote:
 andrzej wrote

 my personal appeal is that you spend
 the time between now and April 1 mapping one of the so many blank
 spots in OSM

 I totally disagree,

 There are real people out there actually using OSM Data for
 car navigation, cycle navigation etc..

 For the OSM data to become unroutable when that is so easily
 avoided would be unthinkable.

And there are real people using OSM in many other fields.  What I mean
is that by deciding to drop a (mostly arbitrary) subset of OSM, you're
setting back some or all of those fields either by breaking the
service for those people or by not fixing it in other places so people
can start using it.  All you can do is select a path with least damage
and remote re-mapping is not that path.

Cheers



Re: [Talk-us] Remapping tips

2012-02-06 Thread andrzej zaborowski
On 7 February 2012 00:27, Nathan Edgars II nerou...@gmail.com wrote:
 On 2/6/2012 6:19 PM, John F. Eldredge wrote:

 So, you are implying that nothing further can be done after April 1st?  If
 the remapping can't be completed by then, OSM is doomed?  I agree that you
 are being overly pessimistic.

 Any remapping after April Fools will not be able to use tags added by good
 users to ungood objects. Any loss will be on the hands of the OSMF.

Why not?  They'll be in the history same as they are now.  Right now
you also have to check the edits history to be able to use them.  In
fact it'd be very difficult to decide a tag is CT-clean even today.
It's also an at least questionable practice when you remove an object
and then immediately re-add it (with the assumption you're not using
any of the information your brain registered when you selected the
object -- even that of it being a map feature).  In any clean-room
process I know, where the goal is to obtain a copy not tainted by
patents or copyright, the two actions must be performed by different
people.

The other ironic thing is the armchair remote re-mapping from imagery
by those who complain about armchair mapping (but then happily join a
baseball field challenge) and the amount of effort put into making an
area green on a red and green colored map.  We've seen that go wrong
with the dupe-nodes map.

I also want to point out that there is a false perception that OSM
will gain anything if everything becomes CT-clean before the potential
switchover.  The time you spend re-mapping things is the time you
don't spend mapping new places.  The net gain is the same only if the
switchover happens as planned and only if (unlikely) the cost of
remapping is the same as that of mapping one of the many blank spots
from scratch.  In every other scenario OSM as a whole loses if you
spend your time re-mapping instead of adding new data.

The whole change process is harmful to the project in many other ways
too.  I know Martijn van Exel and Michael Collinson have issued
personal appeals for people to help remapping but I think those are
based on false assumptions and my personal appeal is that you spend
the time between now and April 1 mapping one of the so many blank
spots in OSM or otherwise adding new verbatim information.  Re-mapping
may help LWG show that the change has been less harmful than it really
was, but it doesn't enrich the commons of free data available
whichever way you look at it.

Cheers



Re: [Talk-us] Anyone want to team up to work on Austin, Texas (was LA and other license changeover challenged areas)

2012-01-31 Thread andrzej zaborowski
On 31 January 2012 18:51, Paul Johnson ba...@ursamundi.org wrote:
 Looks like about 4% of Austin was balrog-kun; I'm in the process of tagging
 that odbl=clean right now per his previous request.

I don't believe I made any non-automatic edits in TX, and those are
already considered clean by the license plugins.

Note also that if you use odbl=clean you need to make sure other edits
in the history are ODbL-clean and as far as I know there's no general
way to do that.

Cheers



Re: [Talk-us] name expansion bot (Re: Imports information on the wiki)

2012-01-15 Thread andrzej zaborowski
On 15 January 2012 14:35, Mike N nice...@att.net wrote:
 On 1/15/2012 8:28 AM, Nathan Edgars II wrote:

 Actually the script also expanded the W to West. But my point is that it
 is a TIGER entry error, and any future script needs to take into account
 that these exist and people may have already fixed them to the correct
 names.


  Agreed- if we're thinking of a bot that periodically fixes everything, we
 need a special tag that says abbreviation_bot=back_off (but perhaps not so
 verbose) - something that tells the bot not to touch the name because it is
 unusual and has been manually checked.

Perhaps checking if either the name= tag or the direction_suffix tag
has ever been edited by a human would be a good measure.  The ways
which have been edited might need to be manually reviewed if they
contain an unexpanded N, E, W or S.
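
A rough sketch of that check, under some loud assumptions: the way's
history is already available as a list of versions (oldest first),
each a dict with "user" and "tags"; the set of automated accounts is
supplied by the caller; and the watched keys (name and
tiger:name_direction_suffix) are my guess at the relevant tags.  The
function name is made up for illustration.

```python
# Assumed tag keys; adjust to whatever the import actually used.
WATCHED_KEYS = ("name", "tiger:name_direction_suffix")


def name_touched_by_human(versions, automated_users, keys=WATCHED_KEYS):
    """Return True if a non-automated account ever changed a watched tag.

    `versions` is the way's history, oldest first; each entry is a dict
    with "user" and "tags" (hypothetical structure for this sketch).
    """
    previous = {}
    for version in versions:
        tags = version["tags"]
        changed = any(tags.get(k) != previous.get(k) for k in keys)
        if changed and version["user"] not in automated_users:
            return True
        previous = tags
    return False


history = [
    {"user": "DaveHansenTiger", "tags": {"name": "N Main St"}},
    {"user": "mapper", "tags": {"name": "N Main St", "highway": "residential"}},
]
untouched = name_touched_by_human(history, {"DaveHansenTiger"})
```

Ways for which this returns True and whose name still contains a bare
N, E, W or S would go on the manual review list instead of being
expanded automatically.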

Cheers



Re: [Talk-us] Getting ready for the license change

2012-01-13 Thread andrzej zaborowski
On 13 January 2012 22:50, Paul Johnson ba...@ursamundi.org wrote:
 On Fri, Jan 13, 2012 at 12:44:30PM -0700, Martijn van Exel wrote:

 I do that too. There is of course a small chance of the decliner
 changing his or her mind, so I only delete data that is tainted by a
 decliner that I have personally been in touch with about the license
 change and my best judgement is that his / her decision is final.

 Speaking of, has anyone talked to balrog-kun yet?  I know he was at
 one point insanely prolific and I often stumble across his data, he's
 currently a decliner.

I'm balrog-kun, if you see my data in the US you can add odbl=clean*
because I haven't used any non-ODbL sources outside of Europe.  But I
have used such sources in Europe and so (Richard Weait says..) this is
not compatible with CT 1.2.4.  This is not the main reason I have
declined CT, but in any case all of my own work is ODbL compatible and
me saying that probably has more legal value than a half-automatic
click-through.. well, unless I'm not balrog-kun, but then I couldn't
have put it on the user page.

I'm not in Oregon these days which is why no new data is appearing from me.

Cheers

* provided other authors in the history also license their
contributions ODbL, which old LWG minutes say is *not* implied by CT
acceptance.



Re: [Talk-us] Getting ready for the license change

2012-01-13 Thread andrzej zaborowski
On 14 January 2012 01:19, Andrew Cleveland evil.salt...@gmail.com wrote:
 So every TIGER way in the western US will require the odbl=clean tag?

No, the bot edits are assumed by LWG to not deserve protection, plus
additionally all the related changesets are on Frederik Ramm's
whitelist which I think is likely to be used during the license switch
should it happen.

Cheers



Re: [Talk-us] User adding many Safeway grocery stores, with ref number in name

2012-01-08 Thread andrzej zaborowski
On 8 January 2012 19:51, Mike N nice...@att.net wrote:
 On 1/8/2012 1:31 PM, Toby Murray wrote:

 He has done good work mapping several Safeway subsidiaries throughout
 the US. But yeah I agree the store number should go in the ref tag. We
 don't really care about internal Safeway naming conventions... we care
 about how that data fits into OUR data model.


  Our data model is freeform tags.   Anything goes.   The only penalty for
 excessive creativity is the loss of usefulness to data consumers. A quick
 Wiki search didn't turn up anything about refs and buildings. I've tagged
 gas stations as Lou's Spinx #46 before from the receipt info without
 knowing that it was not useful information.

It is certainly useful but the ref tag (like many other tags) is
popular enough that I think it's fair to say it's standard in OSM
tagging.  So let's tag consistently and use this tag for reference
numbers.

BTW It's great that people in companies like Safeway are tasked with
adding their stores, this is so good to see.  They only need to find,
or make, a render that will show their name + ref number on the map.

Cheers



Re: [Talk-us] Now you can see how much vandalism the OSMF will carry out on April Fools

2011-12-14 Thread andrzej zaborowski
Hi Grant,

On 14 December 2011 21:48, Humphries, Grant humph...@trimet.org wrote:
 It was mentioned balrog-kun’s automated edits are exempted from being
 reverted.  Can anyone expand on that or point me in the direction to find
 more information about this?  Does anyone have an idea of what percentage of
 edits done from that account were automated?

The changesets tagged bot=yes are automated and it was mentioned
somewhere in LWG minutes that those could be relicensed.  They're not
100% automated, but mostly (i.e. about a day of manual review went
into each of the abbreviations expanding changesets).  AFAIK
Frederik's visualization tool already treats those changesets as
clean.

It is a small percentage of my edits changeset-count-wise but they are
large changesets so it's a notable percentage data wise.

(I hope my other changesets will not be reverted either but it's hard
to tell right now with the amount of false assumptions made in the
license change process)
Cheers



Re: [Talk-us] note to abbreviation bot authors

2011-05-16 Thread andrzej zaborowski
On 16 May 2011 06:09, Richard Welty rwe...@averillpark.net wrote:
 i recently spent some quality time doing tiger review in a community
 that has a rather straightforward naming system for its avenues.
 it starts with A Avenue to the south and steps through the alphabet
 as the avenues appear to the north.

 a de-abbreviation bot had been run on the area.

 it should most assuredly _not_ have corrected E Ave to East Avenue,
 nor should it have corrected N Ave to North Avenue.

 please be careful with these things, folks.

Yes, the bot tried to use direction prefix tag to distinguish between
E as a letter and E for East and it turned out to be wrong in the 2006
data way too often.

Cheers



Re: [Talk-us] note to abbreviation bot authors

2011-05-16 Thread andrzej zaborowski
On 16 May 2011 17:56, Richard Welty rwe...@averillpark.net wrote:
 On 5/16/11 11:10 AM, andrzej zaborowski wrote:

 On 16 May 2011 06:09, Richard Welty rwe...@averillpark.net wrote:

 i recently spent some quality time doing tiger review in a community
 that has a rather straightforward naming system for its avenues.
 it starts with A Avenue to the south and steps through the alphabet
 as the avenues appear to the north.

 a de-abbreviation bot had been run on the area.

 it should most assuredly _not_ have corrected E Ave to East Avenue,
 nor should it have corrected N Ave to North Avenue.

 please be careful with these things, folks.

 Yes, the bot tried to use direction prefix tag to distinguish between
 E as a letter and E for East and it turned out to be wrong in the 2006
 data way too often.

 i'm not sure i'm following this. there are no prefix tags that i can see
 on the ways in question:

 http://www.openstreetmap.org/browse/way/16034490

Ironically this one looks like a manual edit (or at least it doesn't
have a bot=yes tag on the changeset and it's more localised than bot
edits tend to be), and it's in Iowa where the automated expansion has
not been run.

So this particular one is a counterexample to the "don't run bots,
leave it to be done manually" rule :)  I also think that large imports
are a special case because the tagging is already a result of
automated processing by the import script, and if you want to fix it
you can't really count on users having a sense of ownership /
maintainership of the whole area.

Cheers



Re: [Talk-us] TIGER edited map updated with Toby's suggestion

2011-01-24 Thread andrzej zaborowski
On 25 January 2011 00:57, Alan Mintz alan_mintz+...@earthlink.net wrote:
 At 2011-01-24 15:37, Toby Murray wrote:

 On Mon, Jan 24, 2011 at 5:18 PM, Andrew Ayre a...@britishideas.com
 wrote:
  Sorry, the village of Summerhaven, which I totally reworked in Sep 2009
  is
  still shown in red:
 
 
  http://open.mapquestapi.com/tigerviewer/index.html?zoom=12&lat=32.438&lon=-110.75635&layers=B

 I haven't checked every way in the area but it looks like most of
 these aren't TIGER roads to begin with. You created them  (version 1)
 then balrog-kun renamed expanded all the street name abbreviations
 (version 2) so they are in version 2 and last touched by balrog-kun
 which is why they are being rendered as red. The TIGER edited map
 isn't really intended for these ways since they aren't originally
 TIGER data.

 I guess it would be nice to turn ways that weren't imported from TIGER
 green, regardless of last editor and version number. But only the
 current version of the way is available for inspection while rendering
 so this is kind of hard to do...

 Only objects with tiger:* tags should be candidates for being red. I'd
 suggest looking for the tiger:county or tiger:name_base tags, since some
 have removed other tiger:* tags but left ones like tiger:cfcc or tiger:zip_*
 for reference.

 However, I find another problem. When I split a TIGER-imported way and keep
 the tiger:* tags on it, I end up with what looks like a TIGER way, but
 isn't. It has tiger:*=* and v=1 (or v=2 if I edited it again). However, the
 UID is mine, not balrog-kun or DaveHansenTiger, so filtering for this would
 solve the problem as well.

 In summary, I propose to add the following requirements to the existing
 filter for turning a feature red:
 - Must have tiger:name_base tag

I'd suggest tiger:reviewed=no which is kind of what the tag was for.

Also I (balrog-kun) have edited a good amount of data manually, from
survey or imagery.  At the same time there is a portion of roads where
I changed the name twice, generating v=2 and then v=3, because after
the first run I found that some more patterns needed manual review:
it was impossible to automatically Do The Right Thing in more
situations than expected.  On the TIGER scale they're both small
groups though.

It's a pity that the full history dump can't be used easily because
that's often the assumption when processing OSM data and it's often
recommended on the mailing lists to use changeset tags instead of tags
on features.  You could for example filter out changesets with bot=yes
which would skip all of my road name expanding and possibly more stuff
skewing the results.
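
If the history and the changeset tags were available, the filtering
could look something like the sketch below.  The data layout is
assumed for illustration (versions as dicts with a "changeset" id, and
a separate mapping from changeset id to its tags), and
last_human_version is a made-up name.

```python
def last_human_version(versions, changeset_tags):
    """Return the newest version not made in a bot=yes changeset, or None.

    `versions` is oldest first; `changeset_tags` maps a changeset id to
    that changeset's tags (both structures are assumptions of this sketch).
    """
    for version in reversed(versions):
        tags = changeset_tags.get(version["changeset"], {})
        if tags.get("bot") != "yes":
            return version
    return None


versions = [
    {"version": 1, "user": "local_mapper", "changeset": 10},
    {"version": 2, "user": "balrog-kun", "changeset": 20},
]
changeset_tags = {20: {"bot": "yes"}}
result = last_human_version(versions, changeset_tags)
```

Here the renderer would color the way by version 1 and its author, as
if the name expansion had never touched it.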

Cheers



Re: [Talk-us] TIGER edited map updated with Toby's suggestion

2011-01-24 Thread andrzej zaborowski
On 25 January 2011 02:54, Alan Mintz alan_mintz+...@earthlink.net wrote:
 At 2011-01-24 16:55, andrzej zaborowski wrote:

 On 25 January 2011 00:57, Alan Mintz alan_mintz+...@earthlink.net wrote:
  At 2011-01-24 15:37, Toby Murray wrote:
 
  On Mon, Jan 24, 2011 at 5:18 PM, Andrew Ayre a...@britishideas.com
  wrote:
   Sorry, the village of Summerhaven, which I totally reworked in Sep
   2009
   is
   still shown in red:
  
  
  
   http://open.mapquestapi.com/tigerviewer/index.html?zoom=12&lat=32.438&lon=-110.75635&layers=B
 
  I haven't checked every way in the area but it looks like most of
  these aren't TIGER roads to begin with. You created them  (version 1)
  then balrog-kun renamed expanded all the street name abbreviations
  (version 2) so they are in version 2 and last touched by balrog-kun
  which is why they are being rendered as red. The TIGER edited map
  isn't really intended for these ways since they aren't originally
  TIGER data.
 
  I guess it would be nice to turn ways that weren't imported from TIGER
  green, regardless of last editor and version number. But only the
  current version of the way is available for inspection while rendering
  so this is kind of hard to do...
 
  Only objects with tiger:* tags should be candidates for being red. I'd
  suggest looking for the tiger:county or tiger:name_base tags, since some
  have removed other tiger:* tags but left ones like tiger:cfcc or
  tiger:zip_*
  for reference.
 
  However, I find another problem. When I split a TIGER-imported way and
  keep
  the tiger:* tags on it, I end up with what looks like a TIGER way, but
  isn't. It has tiger:*=* and v=1 (or v=2 if I edited it again). However,
  the
  UID is mine, not balrog-kun or DaveHansenTiger, so filtering for this
  would
  solve the problem as well.
 
  In summary, I propose to add the following requirements to the existing
  filter for turning a feature red:
  - Must have tiger:name_base tag

 I'd suggest tiger:reviewed=no which is kind of what the tag was for.

 ...except that some (many?) people don't know (or don't care) to remove the
 tag after they edit/confirm the feature. There are many edited TIGER ways
 out there with this tag.

Right, but at this point we just want to determine if the way comes
from TIGER... so if either tiger:reviewed=no or tiger:name_base *is*
set, it's an indication that it may be from TIGER.

I can imagine someone adding a tiger:name_base to a non-TIGER name for
consistency, but I can't imagine someone reasonably adding
tiger:reviewed=no.
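
As a sketch, the test is a one-liner on the way's current tags.  It
uses tiger:name_base, the key from the quoted proposal, and the
function name is only illustrative.

```python
def may_be_from_tiger(tags):
    """Heuristic: either tag suggests the way originated in the TIGER import."""
    return tags.get("tiger:reviewed") == "no" or "tiger:name_base" in tags


imported = may_be_from_tiger({"highway": "residential", "tiger:reviewed": "no"})
hand_mapped = may_be_from_tiger({"highway": "residential", "name": "A Avenue"})
```

This only says "may be from TIGER"; a way that passes still needs the
user and version checks to decide whether it has been edited.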



 Also I (balrog-kun) have edited a good amount of data manually, from
 survey or imagery, at the same time there are a portion of roads where
 I changed the name twice, generating v=2 and then v=3 because after
 the first run I found that some more patterns needed manual review
 because it was impossible to automatically Do The Right Thing in more
 situations than expected.  On the TIGER scale they're both small
 groups though.

 It's a pity that the full history dump can't be used easily because
 that's often the assumption when processing OSM data and it's often
 recommended on the mailing lists to use changeset tags instead of tags
 on features.  You could for example filter out changesets with bot=yes
 which would skip all of my road name expanding and possibly more stuff
 skewing the results.

 How about changing:
        - UID must be balrog_kun or DaveHansenTiger
 to:
        - (UID=balrog_kun and changeset in list_of_changesets) or
 UID=DaveHansenTiger

 where list_of_changesets is the list of changeset IDs that were used for the
 name expansion.

That would work I guess but still not for 100% of cases, so I think the
difference isn't worth the effort.

Cheers



Re: [Talk-us] TIGER edited map updated with Toby's suggestion

2011-01-24 Thread andrzej zaborowski
On 25 January 2011 03:36, Mike N nice...@att.net wrote:
 How about changing:
 - UID must be balrog_kun or DaveHansenTiger
 to:
 - (UID=balrog_kun and changeset in list_of_changesets) or
 UID=DaveHansenTiger

 where list_of_changesets is the list of changeset IDs that were used for
 the name expansion.

  This won't work either because balrog_kun may have un-abbreviated a road
 that had previously been edited.

Together with the version check it would work better than the current
heuristic, but only minimally better.


  It currently looks correct, in pseudo code,

  Not Edited = (user:DaveHansenTiger AND (Date between 2007-09-01 and
 2008-05-04)
 OR  (user:Milenko and (Date between 2008-10-29 and 2007-12-12) )
 OR (user:balrog-kun and version3)

   I believe that is the correct logic.  The scripts of balrog-kun that I've
 seen only automatically un-abbreviate TIGER ways, so there shouldn't be any
 false positives.

As someone noticed, the scripts did not go too far in checking that
the ways were from TIGER import and affected some 100% user-mapped
areas too (mostly positively though)

Cheers



Re: [Talk-us] [OSM-talk] Watch out if mapping in Florida, Georgia...

2011-01-23 Thread andrzej zaborowski
On 21 January 2011 19:18, John Smith deltafoxtrot...@gmail.com wrote:
 On 22 January 2011 04:13,  si...@mungewell.org wrote:
 Also if you know where the source of the jamming is coming from would
 it be possible to use that information to over come some of the
 degradation?

 Nope, not if the jamming is any good.

 Maybe they aim to find out.

GPS-jammer-based global positioning system? :)
According to [1] it is easy to detect and locate GPS jammers.

Note that the news or the NOTAM doesn't actually say the signals are
being jammed, but the decreasing radius with decrease of altitude
suggests that...

[1] http://en.wikipedia.org/wiki/Selective_Availability#Artificial_sources_of_interference

Cheers



Re: [Talk-us] Creating relations for abandoned railway lines

2011-01-10 Thread andrzej zaborowski
Hi,

On 10 January 2011 17:23, Nathan Edgars II nerou...@gmail.com wrote:
 On Mon, Jan 10, 2011 at 11:11 AM, Kristian M Zoerhoff
 kristian.zoerh...@gmail.com wrote:
 type = route
 route = train
 operator = Elgin & Belvidere Electric Co.
 abandoned = yes

 It's that last tag I'm unsure of. Is abandoned = yes allowed/understood in
 relations?

 I think what you want to use is route=railway, not route=train. The
 latter would include trackage (if any) owned by other companies that
 the EBE used to reach downtown terminals, while the former would be
 the single line owned and operated by the EBE.

At some point route=historic was a preset or on the wiki (I don't
remember which); I think it would work better here.

Something like:
route=historic
historic=railway
following the convention of not misleading the tools, which
usually just look at the one tag that interests them (route=railway
for example).

Cheers



Re: [Talk-us] San Francisco Geodata

2010-09-01 Thread andrzej zaborowski
Hi,

On 26 August 2010 22:15, Gregory Arenius greg...@arenius.com wrote:
 We need addresses if we want to have usable data for routing programs.  The
 addresses file has them in point format.  The city lots file also has
 addresses or address ranges for each parcel.  Has anybody done imports of
 similar address data?  If so, did you keep it in a point format or convert
 it into the parallel ways format?

The parallel ways is an approximation of the address points, so if you
have the points, use them IMO.  I've imported some of both in Europe,
but I surveyed some addresses in San Francisco too and put them on the
buildings where possible.
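
To illustrate why the parallel ways are only an approximation: an
interpolation way stores just two end points and a number range, and every
position in between is guessed.  A minimal Python sketch, assuming
straight-line, evenly spaced numbering and made-up coordinates
(interpolate_addresses is an illustrative name, not an existing tool):

```python
def interpolate_addresses(start, end, first_number, last_number, step=2):
    """Guess (lat, lon) positions for house numbers along a straight way.

    start/end are the (lat, lon) of the interpolation way's end nodes.
    Real houses are rarely spaced this evenly, which is the point:
    surveyed address points carry information this guess cannot.
    """
    span = last_number - first_number
    points = []
    for number in range(first_number, last_number + 1, step):
        t = (number - first_number) / span if span else 0.0
        lat = start[0] + t * (end[0] - start[0])
        lon = start[1] + t * (end[1] - start[1])
        points.append((number, (lat, lon)))
    return points


for number, position in interpolate_addresses((0.0, 0.0), (0.0, 4.0), 100, 104):
    print(number, position)
```

Going the other way (from points to ranges) throws away the real positions,
so it only makes sense when the points are not available.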

Cheers



Re: [Talk-us] A Friendly Guide to 'Bots and Imports

2010-08-05 Thread andrzej zaborowski
Hi,

On 5 August 2010 21:46, Richard Weait rich...@weait.com wrote:
 On Sat, Jan 8, 2000 at 4:20 PM, Katie Filbert filbe...@gmail.com wrote:
 Leaving imports to local mappers is good.  They are best able to assess the
 quality of the data for that area and care about quality of their local map
 data.   It also leaves low hanging fruit for them. Some areas without
 local mappers may take longer to finish. That is okay.

Definitely there are advantages from the import being done by a local,
but, as always, there are also advantages from the import being done
by the author of conversion script, someone who understands exactly
what parts need to be checked manually and someone who has done many
such imports instead of only a limited area.  (I have taken part in an
import where I made converted data available on the web for locals to
import and often had to spend longer fixing stuff after them than it
would have taken me to do it myself).

So it's hard to stand on one side or the other, probably best to look
at it case by case.


 I have no arguments with this.

 Consider this: Does importing to an area where there is no thriving
 OSM community inhibit the creation of that thriving community in
 future?

 At SotM, one of our friends suggested that imports are okay, except
 road networks.  Never import road networks.  The suggestion is that
 building the road network also builds the community.  An existing road
 network inhibits the community.  I apologize for not attributing that
 comment.  I've forgotten who said it to me.

 Or from another point of view.  If the local community isn't
 substantial enough to maintain the imported data and keep it up to
 date, is it better to not import until the community can maintain it?
 Why import 2004 data, if it will be unchanged when the 2006 update is
 published?  Does that mean that you should only import once you have
 such a thriving community and high quality local data that you no
 longer would benefit substantially from that import?

I totally agree here, it's a bit of a trade-off choosing the right
moment.  If you do it too soon, you get an unmaintained map of the
area.  If you do it too late, local mappers who didn't know about the
datasource contribute their time to re-collect the data, which later
clashes with the datasource and costs time to choose the better
version, to merge, and it is frustrating when someone finds out they
could have spent the time on the finer details.

Cheers



Re: [Talk-us] Abbreviation Police

2010-08-04 Thread andrzej zaborowski
On 4 August 2010 08:23, Apollinaris Schoell ascho...@gmail.com wrote:
 On Tue, Aug 3, 2010 at 11:04 PM, Kevin Atkinson ke...@atkinson.dhs.org 
 wrote:
 On Tue, 3 Aug 2010, Apollinaris Schoell wrote:
 On 3 Aug 2010, at 22:32 , Kevin Atkinson wrote:
 most of the times I see it
 name=Frontage Road
 ref=US 29

 this will be rendered in similar way as on other maps. Name is on the
 street and US, I, is on a shield. Doesn't make sense to duplicate the ref on
 the name.

 Since when does a frontage road get a Highway shield?


 got this wrong and meant Frontage road is a name, but now need to
 correct altogether.
 but what is meant here has most likely no name at all. frontage road is
 then a type of highway, not a name. and US 29 in any form is not really a
 name either.
 again all other maps will not render names unless there really is a defined
 name.
 normally ramp, access road, frontage road are mapped as highway=*link
 without name

Maybe the description= tag would be better for that, although name= is
traditionally abused so much for descriptions that I don't see it as a
problem.

I agree about U.S. not being read out fully, so possibly it's better
written this way too.

Cheers



Re: [Talk-us] Directional Prefix/Postfix Proposal

2010-07-31 Thread andrzej zaborowski
On 1 August 2010 03:54, Kevin Atkinson ke...@atkinson.dhs.org wrote:
  1) An exception to the abbreviation rule for directional indicators
     with the fully expanded name going into alt_name

First I'd like to oppose making exceptions from the global rules in
local rules.  The global rules are vague enough that there's always
some space to express all that is needed by further specifying the
rules.

(Note that in the end there are no official local or global rules, which
was the question you asked at the end of your mail.  So many people
will try to learn the scheme by looking at the map data around them,
or by doing what seems logical.  This does not mean that there
shouldn't be any rules, but it does mean that they need to be rather
simple.)

...
 #1)

 I propose an exception to the abbreviation rule be made for directional
 indicators.  'North', 'South', 'East', and 'West' when a directional
 indicator (and not part of the street name) shall be abbreviated 'N.', 'S.',
 'E.', and 'W.' (with a period, will explain why below), and Northeast,
 Southeast, Northwest, Southwest shall be abbreviated as 'NE', 'SE', 'NW',
 and 'SW' (without any periods).  The fully expanded name may be included in
 alt_name.

You can make up a new tag, like full_name, or alt_spelling, there's no
limitation on this.  I feel that alt_name should probably be left for
actual alternative names.
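
A rough sketch of what that could look like, in Python.  Everything here is
an assumption for illustration: full_name is just the made-up key suggested
above, the intercardinal abbreviations are taken to be NE, SE, NW and SW,
and the caller has to say whether the first word is a directional indicator,
because the name alone cannot tell "North Main Street" from "North Avenue".

```python
# Abbreviations as in the proposal: periods on the four cardinal
# directions, none on the intercardinal ones.
ABBREVIATIONS = {
    "North": "N.", "South": "S.", "East": "E.", "West": "W.",
    "Northeast": "NE", "Southeast": "SE",
    "Northwest": "NW", "Southwest": "SW",
}


def tag_street(expanded_name, has_directional_prefix):
    """Return tags for a street, abbreviating a directional prefix if told to."""
    tags = {"name": expanded_name}
    first, _, rest = expanded_name.partition(" ")
    if has_directional_prefix and rest and first in ABBREVIATIONS:
        tags["name"] = f"{ABBREVIATIONS[first]} {rest}"
        # Hypothetical key for the spelled-out form; alt_name stays free
        # for genuinely different names.
        tags["full_name"] = expanded_name
    return tags


print(tag_street("North Main Street", True))
print(tag_street("North Avenue", False))
```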

...
 3) Spelling out the prefix can lead to ambiguous situations where it is
 unclear if the prefix is part of the street name (vid the kid gave several
 examples in his web page)

But you later propose the included thing which removes this ambiguity.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 30 July 2010 21:00, Anthony o...@inbox.org wrote:
 On Fri, Jul 30, 2010 at 2:24 PM, Alan Mintz
 alan_mintz+...@earthlink.net wrote:
 There's another, very important use for the tiger:reviewed tag.

 As I've said above, that's the one tiger tag I don't remove (until
 I've reviewed the way, of course).

 You don't seem to have read that message.  In it I went through each
 of the tiger tags individually and explained what was wrong with them.
  The tiger:tlid key in particular is in horrible shape, to the point
 where I guess at least 95% of them *are* wrong.

How do you come to that figure?  My guess would be that 95% are right.
 The only objects that may contain a TLID that refers to a different
real life object and don't contain a TLID that refers to the actual
object can be those that (a) underwent very heavy surgery (not simple
splitting or joining, but exchanging tags and geometry with another
object for example) or (b) were fictitious and shouldn't have been in
tiger in the first place.

Most objects have not been touched at all, out of those which have
been touched by a mapper, most have been changed using common sense to
find the shortest path to make the object correct (e.g. change
street's name tag and leave geometry mostly alone or change geometry
and leave the name alone, splitting, joining, etc)

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 30 July 2010 22:12, Alan Mintz alan_mintz+...@earthlink.net wrote:
 Do we really need
 the database space that badly?

I've heard arguments on the talk list that this clutters the database
and similarly wikipedia= tags should be massively removed and if at
all, links should be maintained then in the wikipedia database *to*
our objects rather in our database to wikipedia pages.

This just sounds like passing on the hot potato.  Even if osm comes to
a point where references to ten different databases are kept for
objects, it's still valuable information, and I personally don't see
how it's inconvenient.  If it hurts your eyes how the name= and
highway= tags are lost among the other tags in your favourite editor,
then perhaps modify the editor.  Keep the links in whatever database
it makes most sense, for example wikipedia pages are indexed by their
title, which is a pretty stable reference, as opposed to OSM id's,
that's why it makes more sense to keep them here.  TIGER data we can't
edit, that's why it makes more sense to keep the id's here.

Flickr (if treated as a big database where each photo is a record) had
the balls to store references to osm objects, as well as dopplr.com
IDs and foursquare.com venue IDs in their machine tags for each
picture that is a photograph of a given object.  There was no fear of
cluttering their machine tags space.  Why would it be an issue in
osm?

Also note that once there's a photo on flickr that is tagged with an
osm object id and a foursquare.com venue id at the same time, you have
a link between OSM and foursquare.com, no need to duplicate this
information in either of these databases.  If that osm object contains
a tiger tlid, you can tie the foursquare.com venue to a tiger record
and so on.

I'm not asking anyone to go adding these tags, but just saying that
they don't hurt, even if they're just a hint (a bridge that contains
twenty TLIDs and perhaps only one of them is the right one).

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 31 July 2010 00:50, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 6:36 PM, andrzej zaborowski balr...@gmail.com wrote:
 Also note that once there's a photo on flickr that is tagged with an
 osm object id and a foursquare.com venue id at the same time, you have
 a link between OSM and foursquare.com, no need to duplicate this
 information in either of these databases.  If that osm object contains
 a tiger tlid, you can tie the foursquare.com venue to a tiger record
 and so on.

 Serious question: why would anyone want to do this? (putting aside the
 fact that foursquare is probably not for streets) Does the TLID have
 any significance outside TIGER?

Various use cases I can see right now, and there are more.
 * You may just want to display a link to the osm object or tiger
object on a flickr photo page (flickr already does it for photos
tagged with osm:node|way|relation= ), the service may even
automatically extract metadata from either of the databases, like
this is a building, this is a road, so even the computer can know
what exactly is on the photo, no need to analyse the picture.  Google
could use it to enhance picture search etc.  OSM gives you some
information on the object, TIGER gives you other type of information
(official classification, weird area codes etc), another database
(like foursquare.com? not sure) can tell you the capacity of a bar and
maybe even price level for a restaurant that's a node in OSM.
 * knowing which direction the camera looked, you can actually overlay
the road geometry on it, make it clickable etc., same way Google
Street View shows 3d lines for roads on the panoramas.
 * knowing that road A in TIGER crosses roads B, C and D, you can do
sanity checks if the same ways cross each other in OSM, that may be
helpful both to the tiger maintainers and to OSM.  Same way you can
check if a junction has the right number of roads meeting there.
 * you can provide routing in one area using map A, and seamlessly
switch to map B when you cross some border and based on some other
criteria.  In effect you can generate a single route using multiple
maps, you can mix and match in any ways you like.

Wikipedia page on Linked Data has more on this, there are endless
possibilities.


 I'm not asking anyone to go adding these tags, but just saying that
 they don't hurt, even if they're just a hint (a bridge that contains
 twenty TLIDs and perhaps only one of them is the right one).

 What about a bridge that contains forty TLIDs and none is the right
 one because the right one was the fiftieth and that many TLIDs
 wouldn't fit in the maximum field size (255 characters, I believe)?

 The way I see it is that if I were mapping an area from scratch,
 nobody would go adding the TIGER tags. So if I completely redo an
 area, whether I use existing ways or draw new ways, there's no reason
 to keep the TIGER tags. If anyone objects, I can change my workflow to
 delete the old ways and create new ways rather than redrawing the old
 ways :)


What I mean is keep the tags if it doesn't cost you anything.  If it
would impact your mapping efficiency then perhaps it makes more sense to
skip them, it's a tradeoff.  However when you map an area from
scratch, what metadata do you add?  Perhaps highway= classes and
name=; all the other information is pretty boring to survey and
it's easier to just copy it over from the tiger ways you delete.  I
just use ctrl+c + ctrl+shift+v, this copies all the tags in JOSM, and
you can then modify the values if anything is wrong in that data.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 31 July 2010 02:24, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 8:11 PM, andrzej zaborowski balr...@gmail.com wrote:
 On 31 July 2010 00:50, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 6:36 PM, andrzej zaborowski balr...@gmail.com 
 wrote:
 Also note that once there's a photo on flickr that is tagged with an
 osm object id and a foursquare.com venue id at the same time, you have
 a link between OSM and foursquare.com, no need to duplicate this
 information in either of these databases.  If that osm object contains
 a tiger tlid, you can tie the foursquare.com venue to a tiger record
 and so on.

 Serious question: why would anyone want to do this? (putting aside the
 fact that foursquare is probably not for streets) Does the TLID have
 any significance outside TIGER?

 Various use cases I can see right now, and there are more.
  * You may just want to display a link to the osm object or tiger
 object on a flickr photo page (flickr already does it for photos
 tagged with osm:node|way|relation= ), the service may even
 automatically extract metadata from either of the databases, like
 this is a building, this is a road, so even the computer can know
 what exactly is on the photo, no need to analyse the picture.  Google
 could use it to enhance picture search etc.  OSM gives you some
 information on the object, TIGER gives you other type of information
 (official classification, weird area codes etc), another database
 (like foursquare.com? not sure) can tell you the capacity of a bar and
 maybe even price level for a restaurant that's a node in OSM.
  * knowing which direction the camera looked, you can actually overlay
 the road geometry on it, make it clickable etc., same way Google
 Street View shows 3d lines for roads on the panoramas.
  * knowing that road A in TIGER crosses roads B, C and D, you can do
 sanity checks if the same ways cross each other in OSM, that may be
 helpful both to the tiger maintainers and to OSM.  Same way you can
 check if a junction has the right number of roads meeting there.
  * you can provide routing in one area using map A, and seamlessly
 switch to map B when you cross some border and based on some other
 criteria.  In effect you can generate a single route using multiple
 maps, you can mix and match in any ways you like.

 I don't think you understand how the TLIDs are stored in OSM. They
 were never one TLID per way; the initial import joined a bunch of
 adjacent ways and concatenated the TLIDs.

I don't see how it changes anything.  If a piece of interstate I-405
is described by one relation or two ways one for each carriage in osm,
and 10 segments in TIGER, then that's a way to describe it.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 31 July 2010 02:33, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 8:28 PM, andrzej zaborowski balr...@gmail.com wrote:
 I don't see how it changes anything.  If a piece of interstate I-405
 is described by one relation or two ways one for each carriage in osm,
 and 10 segments in TIGER, then that's a way to describe it.

 So how would you do any of the applications described above? They all
 require either a single TLID or everything to be tagged with a field
 that includes the correct TLID (due to joining, splitting, and
 redrawing, the latter is not true).

So your program tries to come up with a route, it knows it's driving
on road A in osm.
A has id=1 and is tagged tiger:tlid=20:21:22:23, and it is connected
to road B (id=2, tiger:tlid=24:25:26:27) by node id=3.  You also see
that tiger way 23 meets 24.  That clearly means that from osm road A
you can go into tiger way 24 when you reach node id=3, without even
looking at the geometry and fuzzy guessing things (remember routing
works on huge graphs).
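
A minimal Python sketch of that lookup, using the made-up ids from the
example.  The data structures and the tiger_adjacent set are assumptions
for illustration; in practice the adjacency would have to come from the
TIGER files themselves.

```python
# OSM ways from the example: id -> tags and node list (hypothetical data).
osm_ways = {
    1: {"tiger:tlid": "20:21:22:23", "nodes": [10, 11, 3]},
    2: {"tiger:tlid": "24:25:26:27", "nodes": [3, 12, 13]},
}

# Pairs of TIGER lines that meet each other, as TIGER would report them.
tiger_adjacent = {(23, 24)}


def tlids(way):
    """Split a tiger:tlid value into integers (':' or ';' separated)."""
    raw = way.get("tiger:tlid", "").replace(";", ":")
    return [int(part) for part in raw.split(":") if part.strip()]


def tiger_handover(way_a, way_b, node_id):
    """TLID pairs that meet in TIGER where two OSM ways share node_id."""
    if node_id not in way_a["nodes"] or node_id not in way_b["nodes"]:
        return []
    return [
        (a, b)
        for a in tlids(way_a)
        for b in tlids(way_b)
        if (a, b) in tiger_adjacent or (b, a) in tiger_adjacent
    ]


print(tiger_handover(osm_ways[1], osm_ways[2], 3))  # [(23, 24)]
```

No geometry is compared anywhere; the concatenated TLID list is only used
as a set of candidates, which is why a way carrying several TLIDs is not a
problem for this kind of use.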

Or say that the government releases a database that says how many
traffic signals there are on each segment of road.  Then you can find
the junction nodes on which they should be in OSM, or at least count
how many there should be on a given street.

And yes, street names are not 100% correct or complete in OSM today..
we don't remove them though.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 31 July 2010 03:02, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 8:44 PM, andrzej zaborowski balr...@gmail.com wrote:
 On 31 July 2010 02:33, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 8:28 PM, andrzej zaborowski balr...@gmail.com 
 wrote:
 I don't see how it changes anything.  If a piece of interstate I-405
 is described by one relation or two ways one for each carriage in osm,
 and 10 segments in TIGER, then that's a way to describe it.

 So how would you do any of the applications described above? They all
 require either a single TLID or everything to be tagged with a field
 that includes the correct TLID (due to joining, splitting, and
 redrawing, the latter is not true).

 So your program tries to come up with a route, it knows it's driving
 on road A in osm.
 A has id=1 and is tagged tiger:tlid=20:21:22:23, and it is connected
 to road B (id=2, tiger:tlid=24:25:26:27) by node id=3.  You also see
 that tiger way 23 meets 24.  That clearly means that from osm road A
 you can go into tiger way 24 when you reach node id=3, without even
 looking at the geometry and fuzzy guessing things (remember routing
 works on huge graphs).

 But road A has been rerouted since the TIGER data was created and now
 ends at road C, without touching road B. You can't use shortcuts like
 this.

Sure it can be outdated same as any other tag value.


 Or am I misunderstanding your example? If you already know A and B are
 joined at node 3, what do the TLIDs tell you?

The TLIDs tell you where you are if you want to switch from OSM
routing to TIGER routing at that node for example.  And they tell you
road A in TIGER has (say) 4 crossings with other roads, so if that's
not true in OSM, you know one of the maps needs fixing.

If something changes between TIGER2006 and TIGER2009 you can see which
osm segments may need fixing too.


 Or say that the government releases a database that says how many
 traffic signals there are on each segment of road.  Then you can find
 the junction nodes on which they should be in OSM, or at least count
 how many there should be on a given street.

 TLID 24 has two lights and TLID 25 has three. Joined TLID 24;25 might
 have four or five.

Well.. sure, possible, but that's assuming that the database was made
in such an unfortunate way that each single light can be counted two or
more times.  The census data tends to not be that bad (at least in the
design)

 Add one to the possible error for each new segment.
 Then split out bridges and it becomes unmanageable.

Again note about bad data in osm..

Plus if your program sees a non-bridge segment with
tiger:tlid=20:21:23 and a next (bridge) segment with the same
tiger:tlid, it should really notice that the five traffic lights are
somewhere on those two segments, not five on each.
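
As a sketch in Python (the per-TLID counts and segment ids are invented, and
no such signals database is known to exist): collect the distinct TLIDs over
the segments first and only then sum, so a TLID repeated on a split-off
bridge segment is counted once.

```python
# Hypothetical government data: traffic signals per TIGER line.
lights_per_tlid = {20: 2, 21: 0, 23: 3}

# Two OSM segments that came from splitting one way at a bridge, so both
# still carry the same tiger:tlid value.
segments = [
    {"id": 101, "bridge": False, "tiger:tlid": "20:21:23"},
    {"id": 102, "bridge": True, "tiger:tlid": "20:21:23"},
]


def expected_lights(segments, lights_per_tlid):
    """Signals expected somewhere along the segments, each TLID counted once."""
    seen = set()
    for segment in segments:
        seen.update(int(t) for t in segment["tiger:tlid"].split(":"))
    return sum(lights_per_tlid.get(tlid, 0) for tlid in seen)


print(expected_lights(segments, lights_per_tlid))  # 5
```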


 And yes, street names are not 100% correct or complete in OSM today..
 we don't remove them though.

 So are you saying you or someone else will be checking all TLIDs
 against the TIGER data and correcting errors and adding missing ones?

So people in Germany are mapping curb heights for streets.  There's
the openpistemap and special tagging for ski piste types.  There's a
whole spectrum of information with different numbers of people who
care about it, and it changes in time. (especially when a visualisation
becomes available.. who cared about dupe nodes before the dupe nodes
map?)

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
 So are you saying you or someone else will be checking all TLIDs
 against the TIGER data and correcting errors and adding missing ones?

I can imagine someone making some clever scripts and then manually
verifying it where there are doubts as a kind of personal project of
the week or something.  A couple of months ago Marcus Wolschon was
reporting on the general talk list on his progress in adding the
TMC road IDs to OSM in some parts of Germany or Austria.  TMC is a
kind of radio broadcast of current traffic estimates; some satnavs
can use it to avoid traffic jams automatically.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-30 Thread andrzej zaborowski
On 31 July 2010 04:06, Nathan Edgars II nerou...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 9:40 PM, andrzej zaborowski balr...@gmail.com wrote:
 So are you saying you or someone else will be checking all TLIDs
 against the TIGER data and correcting errors and adding missing ones?

 I can imagine someone making some clever scripts and then manually
 verifying it where there are doubts as a kind of personal project of
 the week or something.  A couple of months ago Marcus Wolschon was
 reporting on the general talk list on his progress in adding the
 TMC road IDs to OSM in some parts of Germany or Austria.  TMC is a
 kind of radio broadcast of current traffic estimates; some satnavs
 can use it to avoid traffic jams automatically.

 Sounds like a useful ID number to map. Unlike TLIDs.

Internet is a *network* of linked databases [1].. if someone has a TMC
to TIGER mapping, you get a TMC to OSM mapping for free.
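
"For free" here just means composing two lookups.  A small Python sketch,
assuming a purely hypothetical TMC-to-TLID table (the location codes below
are invented) and the tiger:tlid values already present on OSM ways:

```python
from collections import defaultdict

# Hypothetical third-party table: TMC location code -> TIGER line ids.
tmc_to_tlid = {"D01-12345": [20, 21], "D01-12346": [24]}

# tiger:tlid tag values keyed by OSM way id (made-up ids).
osm_tlids = {1: "20:21:22:23", 2: "24:25:26:27"}


def tlid_index(osm_tlids):
    """Invert the OSM tags: TLID -> set of OSM way ids carrying it."""
    index = defaultdict(set)
    for way_id, value in osm_tlids.items():
        for part in value.split(":"):
            index[int(part)].add(way_id)
    return index


def tmc_to_osm(tmc_to_tlid, osm_tlids):
    """Compose TMC -> TLID with TLID -> OSM way to get TMC -> OSM ways."""
    index = tlid_index(osm_tlids)
    return {
        tmc: sorted(set().union(*(index.get(t, set()) for t in tlid_list)))
        for tmc, tlid_list in tmc_to_tlid.items()
    }


print(tmc_to_osm(tmc_to_tlid, osm_tlids))
```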

Cheers

1. http://en.wikipedia.org/wiki/File:Lod-datasets_2009-07-14_colored.png
(see US Census bubble)



Re: [Talk-us] Removing tiger:* tags

2010-07-29 Thread andrzej zaborowski
On 29 July 2010 19:12, Alan Mintz alan_mintz+...@earthlink.net wrote:
 One responded that it was because they were sometimes wrong (which is, of
 course, true, for those roads that we've corrected) and that they did not
 seem to provide any useful data. However, they also contain the original
 breakdown of the prefix, root, and suffix before they got combined into the
 name and then expanded by the balrog-kun bot - information which will be
 useful in the majority of cases if we ever get back to
 splitting/standardizing.

The only tiger tag that is important to keep (to me) is the
tiger:tlid, all the other values can be pulled from the original TIGER
database provided the TLID.  I can also see the argument for keeping
the name segments as they are now largely used as generic tags, in the
absence of some agreed non tiger: -prefixed tags.

For the record I (balrog-kun) removed the tiger:upload_uuid on any
ways that I touched back when I was expanding the names.  This tag has
no value whatsoever now that API 0.6 supports changesets (and even
without it), but other ways still have the upload_uuid.  The uuid is a
quite long, random string so it occupied a very big part of the planet
snapshots and made it very hard to for example build a search index of
all the tag values including substrings (for example using suffix
trees).

I would recommend that sequential, integer ids are always used in
databases like OSM, instead of UUIDs.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-29 Thread andrzej zaborowski
On 30 July 2010 00:58, Anthony o...@inbox.org wrote:
 Please define them in the wiki, and I'll keep them.  Unless I have a
 definition, I have no way of determining if they're correct or not.

So you're going to delete anything you can't verify?  Well good luck.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-29 Thread andrzej zaborowski
On 30 July 2010 02:26, Anthony o...@inbox.org wrote:
 But as I've shown (http://www.openstreetmap.org/browse/way/44945783)
 the tlids don't even make sense.  tiger:tlid =
 86486485:86486486:86486387;
 86507262:86489492:86507324:86490164:86489590:86489573:86490037:86489467:86490875:86490202:86499582:86497723:86486483:86486384:86486386:86520528:86520529:86489713:86489637:86489612:86489601?
  Just for that short little bridge?  This info should be right (which
 means *one* tlid) or it shouldn't be there at all.  We shouldn't keep
 this crap around just for the hell of it.

By deleting it you're not making it more correct.  Probably the bridge
just corresponds to one TLID (if you can't be bothered checking which,
a good rule is leave it alone for someone to fix), but there are other
situations where one way will correspond to two or more TLIDs.

Cheers



Re: [Talk-us] Removing tiger:* tags

2010-07-29 Thread andrzej zaborowski
On 30 July 2010 03:04, Anthony o...@inbox.org wrote:
 On Thu, Jul 29, 2010 at 8:46 PM, Anthony o...@inbox.org wrote:
 If the tlids represent the original set of data from
 which the bridge might have come, then it's best off in the history.

 And sticking with the theme of creating a general solution rather
 than maintaining kludgy tiger-specific hacks, maybe we could

It's not tiger specific to be specific.  If anybody wants to find
correspondences between OSM objects and USGS objects and store in the
db then I really believe it's useful information.  We can't help
having many databases on the internet referring to / describing the
same real objects, so let's at least order the mess.

That's also why it's not best stored in the object's history -- the
same osm object may come to describe a different real world object
after some edits.

Cheers



Re: [Talk-us] [Imports] NHD data skipped by nhd2osm

2010-06-28 Thread andrzej zaborowski
On 28 June 2010 04:36, Nakor nakor@gmail.com wrote:
 On 06/23/2010 01:07 PM, James U wrote:


 2.  Merge all the files in josm.  Make sure you have allocated a lot of
 memory
 to josm as this can take a lot of memory.  It will also take a lot of
 time.



 Would it be possible to add a parameter to specify the first id for the
 generated file?

 You can then generate the flowline file with id starting at -1, the flowline
 with id starting at -1,000,000 and so on and instead of waiting hours for
 JOSM to merge the files, you just concatenate all the files.

There's a dummy script at
http://repo.or.cz/w/ump2osm.git/blob/HEAD:/osm-merge to do that (would
need to be modified for 2 layers).  The bug mentioned there has been
fixed and I thought josm now does something clever on merging, like
merging nodes.
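
The idea of giving each file its own id range can be sketched in Python
like this.  This is not the osm-merge script linked above, only an
illustration assuming small files that fit in memory and the usual negative
placeholder ids for objects that have not been uploaded yet:

```python
import xml.etree.ElementTree as ET

ID_CARRIERS = ("node", "way", "relation", "nd", "member")


def shift_ids(osm_xml, offset):
    """Parse an .osm document and move every negative id/ref down by offset."""
    root = ET.fromstring(osm_xml)
    for elem in root.iter():
        if elem.tag not in ID_CARRIERS:
            continue
        for attr in ("id", "ref"):
            value = elem.get(attr)
            if value is not None and int(value) < 0:
                elem.set(attr, str(int(value) - offset))
    return root


def concatenate(first_xml, second_xml, offset):
    """Append the second file's objects to the first without id clashes.

    Note: the result is no longer ordered nodes, ways, relations, which
    some tools expect, so a sort step may still be needed afterwards.
    """
    merged = ET.fromstring(first_xml)
    merged.extend(shift_ids(second_xml, offset))
    return merged


a = '<osm version="0.6"><node id="-1" lat="0" lon="0"/></osm>'
b = ('<osm version="0.6"><node id="-1" lat="1" lon="1"/>'
     '<node id="-2" lat="1" lon="2"/>'
     '<way id="-3"><nd ref="-1"/><nd ref="-2"/></way></osm>')
print(ET.tostring(concatenate(a, b, 1000000), encoding="unicode"))
```

Unlike a real merge this does nothing about nodes that sit at the same
position in both layers.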

BTW I have always strongly discouraged doing big uploads with josm as
it always dropped them on the floor on network errors in a way that
doesn't let you recover and continue from where it broke, being one of
the main causes of people needing to do reverts and spamming db
history with deleted nodes.  It may have gotten more clever recently,
not sure.

Cheers



Re: [Talk-us] Percentage of data imported vs mapped

2010-06-23 Thread andrzej zaborowski
On 23 June 2010 18:12, Dave Hansen d...@sr71.net wrote:
 On Wed, 2010-06-23 at 12:05 -0400, Richard Weait wrote:
 On Wed, Jun 23, 2010 at 11:42 AM, McGuire, Matthew
 matt.mcgu...@metc.state.mn.us wrote:
  Does anyone know the percentage of OSM data that is imported vs mapped in
  the US? How does this compare to other countries?

 When TIGER was imported it dominated the database and contributed data
 from all countries combined.  The US community is growing and
 individual contributions are growing, but so are imports from other
 sources.

Hence the question for statistics.. I imagine. :)


 In Oregon, 48% of the objects have been most recently modified by
 non-TIGER sources.  Calculated this way:

 grep -o 'user=[a-zA-Z0-9]*' oregon.osm | sort | uniq -c | sort -n | grep 
 -vi tiger | awk '{sum += $1} END { print sum;}'
 grep -o 'user=[a-zA-Z0-9]*' oregon.osm | sort | uniq -c | sort -n | awk 
 '{sum += $1} END { print sum;}'

95% of the objects in Oregon edited by myself I'd still consider
TIGER-sourced (it was a massive name= change and based on TIGER docs
only).  grep -vi tiger\\\|balrog would be more accurate.
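
The same count as a Python sketch, with the extra exclusion built in.  It
assumes the extract stores the last editor as a user="..." attribute and,
like the grep version, only looks at who touched each object last, so it
says nothing about what was actually changed:

```python
import re
from collections import Counter

USER_RE = re.compile(r'\buser="([^"]*)"')

# Substrings of account names treated as TIGER-derived (an assumption
# matching the grep patterns above).
IMPORT_HINTS = ("tiger", "balrog")


def non_import_share(osm_text):
    """Fraction of objects whose last editor does not look like an import account."""
    counts = Counter(USER_RE.findall(osm_text))
    total = sum(counts.values())
    manual = sum(
        n for user, n in counts.items()
        if not any(hint in user.lower() for hint in IMPORT_HINTS)
    )
    return manual / total if total else 0.0


sample = (
    '<node id="1" user="DaveHansenTiger"/>'
    '<node id="2" user="DaveHansenTiger"/>'
    '<way id="3" user="balrog-kun"/>'
    '<way id="4" user="alice"/>'
)
print(non_import_share(sample))  # 0.25
```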

Cheers



Re: [Talk-us] Cloudmade california.osm: dups or osmosis bug?

2010-05-21 Thread andrzej zaborowski
Hi Mike,
good catch, I forgot to add the ...Factory.java file.  Hopefully I
haven't forgotten any more files.

Cheers

On 22 May 2010 04:08, Mike N. nice...@att.net wrote:
 Hi andrzej,

  I tried to apply the patch, but I get an undefined symbol when trying to
 build 'FlattenFilterFactory'.   Am I missing part of the patch or is there
 another setting to create the Factory?

  Thanks,

  Mike Nice

 --
 From: andrzej zaborowski balr...@gmail.com
 Sent: Friday, May 21, 2010 2:40 PM
 To: David Carmean d...@halibut.com
 Cc: osmosis-...@openstreetmap.org; OSM US Talk
 talk-us@openstreetmap.org
 Subject: Re: [Talk-us] Cloudmade california.osm: dups or osmosis bug?

 Hi,

 On 21 May 2010 01:54, David Carmean d...@halibut.com wrote:

 Is it the dataset or osmosis that is giving me a single 2-node duplicate
 for each postGIS table created by osmosis?

 Not sure if it's related, but I noticed that many of the CloudMade
 extracts can't be processed using osmosis (even though they're
 generated with osmosis), except for the operations that don't try to
 validate the order of elements.  It seems they often contain
 consecutive elements with the same id, but possibly different version.
 I have a little patch (attached) to add a new filter (--ff) that
 flattens the file.  It does the same operation on sorted entity
 streams as --simplify-change does on change streams, unfortunately
 --simplify is taken by a different operation now.  I decided it was
 faster to modify osmosis than to download the planet and make my own
 un-broken extracts.
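
The flattening described above can be sketched in a few lines of Python.
This is not the osmosis code, only an illustration of the idea, assuming
entities are represented as dicts with hypothetical "type", "id" and
"version" keys and that the stream is already sorted so that repeated ids
are adjacent.

```python
def flatten(entities):
    """Yield one entity per (type, id), keeping the highest version.

    `entities` must be sorted so that entries with the same (type, id)
    are adjacent, as in a sorted entity stream.
    """
    previous = None
    for current in entities:
        if previous is None:
            previous = current
        elif (current["type"], current["id"]) != (previous["type"], previous["id"]):
            # A new entity starts: the stored one is final, pass it on.
            yield previous
            previous = current
        elif current["version"] > previous["version"]:
            # Same entity, newer version: remember the newer one.
            previous = current
    if previous is not None:
        yield previous
```

Only one entity is held back at a time, so memory use does not depend on
the size of the extract.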

 Cheers
 (I'm not subscribed to osmosis-dev)






Index: src/org/openstreetmap/osmosis/core/filter/v0_6/FlattenFilter.java
===
--- src/org/openstreetmap/osmosis/core/filter/v0_6/FlattenFilter.java	(revision 0)
+++ src/org/openstreetmap/osmosis/core/filter/v0_6/FlattenFilter.java	(revision 0)
@@ -0,0 +1,91 @@
+/* This software is released into the Public Domain.
+ * See copying.txt for details.  */
+package org.openstreetmap.osmosis.core.filter.v0_6;
+
+import org.openstreetmap.osmosis.core.container.v0_6.BoundContainer;
+import org.openstreetmap.osmosis.core.container.v0_6.EntityContainer;
+import org.openstreetmap.osmosis.core.container.v0_6.EntityProcessor;
+import org.openstreetmap.osmosis.core.domain.v0_6.Entity;
+import org.openstreetmap.osmosis.core.task.v0_6.Sink;
+import org.openstreetmap.osmosis.core.task.v0_6.SinkSource;
+
+
+/**
+ * Flatten / simplify a sorted entity stream.
+ * (similar to --simplify-change)
+ */
+public class FlattenFilter implements SinkSource {
+	private Sink sink;
+	private EntityContainer previous_container;
+
+	/**
+	 * Creates a new instance.
+	 */
+	public FlattenFilter() {
+	}
+
+	/**
+	 * Process a node, way or relation.
+	 *
+	 * @param current_container
+	 *The entity container to be processed.
+	 */
+	public void process(EntityContainer current_container) {
+		if (previous_container == null) {
+			previous_container = current_container;
+			return;
+		}
+
+		Entity current = current_container.getEntity();
+		Entity previous = previous_container.getEntity();
+
+		if (current.getId() != previous.getId() ||
+				current.getClass() != previous.getClass()) {
+			sink.process(previous_container);
+			previous_container = current_container;
+			return;
+		}
+
+		if (current.getVersion() > previous.getVersion())
+			previous_container = current_container;
+	}
+
+	/**
+	 * Process the bound.
+	 *
+	 * @param boundContainer
+	 *The bound to be processed.
+	 */
+	public void process(BoundContainer boundContainer) {
+		/* By default, pass it on unchanged */
+		sink.process(boundContainer);
+	}
+
+	/**
+	 * {@inheritDoc}
+	 */
+	public void complete() {
+		/*
+		 * If we've stored entities temporarily, we now need to
+		 * forward the stored ones to the output.
+		 */
+		if (previous_container != null)
+			sink.process(previous_container);
+
+		sink.complete();
+	}
+
+	/**
+	 * {@inheritDoc}
+	 */
+	public void release() {
+		sink.release();
+	}
+
+	/**
+	 * {@inheritDoc}
+	 */
+	public void setSink(Sink sink) {
+		this.sink = sink;
+	}
+}
Index: src/org/openstreetmap/osmosis/core/filter/v0_6/FlattenFilterFactory.java
===
--- src/org/openstreetmap/osmosis/core/filter/v0_6/FlattenFilterFactory.java	(revision 0)
+++ src/org/openstreetmap/osmosis/core/filter/v0_6/FlattenFilterFactory.java	(revision 0)
@@ -0,0 +1,26 @@
+/* This software is released into the Public Domain.
+ * See copying.txt for details.  */
+package org.openstreetmap.osmosis.core.filter.v0_6;
+
+import org.openstreetmap.osmosis.core.pipeline.common.TaskConfiguration;
+import org.openstreetmap.osmosis.core.pipeline.common.TaskManager;
+import

Re: [Talk-us] Street Naming Conventions

2010-05-17 Thread andrzej zaborowski
Hi,

On 15 May 2010 05:58, David ``Smith'' vidthe...@gmail.com wrote:
 I now believe that it is /also/ acceptable for the
 name=* tag to specify the full, unabbreviated name -- however, if
 abbreviation of that name is used commonly and consistently, then that
 abbreviated form should go in another tag.  (I've been using
 abbr_name=* for that.)

 2) I've heard there's a bot that's automatically expanding names from
 the TIGER import.  To the operator of that bot: proceed with caution
 if you will, and /PLEASE/ preserve the abbreviated name in some other
 tag(s).

I've also been using abbr_name where there's a non-obvious way to
abbreviate the name, but actually there's no reason why it couldn't be
used always for commonly used abbreviated versions, other than it's a
bit of work to add.

In case of TIGER (I'm the person that ran the bot) it can be done
automatically to some extent so I can make another run to re-add the
original shortened names as abbr_name.  The original names, however,
were shortened according to some rules documented in the TIGER docs,
regardless of how common or consistent their usage was (following the
USPS rules, for example, they would be different in some small
percentage), so I'm not sure if that name should ever appear in a tag
that is not namespaced as a tiger tag.  On the other hand it may be a
good start for the abbr_name values; in 90% of cases it should just be
correct.  The remaining cases can be fixed manually (rather than
having to add *all* abbr_names manually), and especially if we can
get mapnik to use the value of that tag when there isn't enough space
for the full name, mappers would probably take some care to keep the
values correct.

Cheers



Re: [Talk-us] Resigning in protest

2010-05-13 Thread andrzej zaborowski
On 13 May 2010 13:07, Frederik Ramm frede...@remote.org wrote:
 Andrzej,

 andrzej zaborowski wrote:

 1) A creates road; B edits road; C edits road.
 2) A creates road; B deletes road; C undeletes road.

 Well, I can kind of see a problem here (and am not in the states now
 :-) ).  In both situations the final version is a derived work of
 version A or B, or even a copy.  User C obtained version B under
 CC-By-SA, but claims to hold copyright of it and grant all the rights
 to OSMF when she uploads her change.

 That's not how it works. If what you sketched here was true, then anything
 in OSM that I have edited last would be PD[*] because I say so. But in
 reality, changing the license of something in OSM generally requires consent
 from all those who ever modified it.

That's exactly what I'm saying -- I assumed user C is a new user,
registered after the recent change, and B an old user.  So by
uploading any change, user C confirms that they hold the copyright to
the work and transfer all rights to OSMF.  But it's obvious they don't
because they just downloaded the previous version from OSM (usually),
and they may be in violation of the sharealike in CC-By-SA (assuming
CC-By-SA was valid for data).

That means that newly registered users as of two days ago can't make
any edits other than those exceptional edits where a new version is a
total remake of the object, not deriving from the previous versions.
Especially they can't undelete things, under the contributor terms
they agreed to.

Cheers



Re: [Talk-us] Resigning in protest

2010-05-13 Thread andrzej zaborowski
On 13 May 2010 14:18, Frederik Ramm frede...@remote.org wrote:
 Hi,

 andrzej zaborowski wrote:

 That's exactly what I'm saying -- I assumed user C is a new user,
 registered after the recent change, and B an old user.  So by
 uploading any change, user C confirms that they hold the copyright to
 the work and transfer all rights to OSMF.  But it's obvious they don't
 because they just downloaded the previous version from OSM (usually),
 and they may be in violation of the sharealike in CC-By-SA (assuming
 CC-By-SA was valid for data).

 C only makes a statement about his (own) contributions which, in the case of
 an object downloaded and edited only, make up *part* of the whole object.
 Just because I download your motorway and add some detail, the motorway does
 not become my contribution.

Okay, you may be right, I assumed the contents of your contribution
are the contents of osmChange xml you upload, but the contributor
terms page doesn't make it clear.  So you say that when you undelete
an object, your only contribution is the setting of the visible flag.

I think it still could be argued that in the majority of cases your edit
is a derived work of the original work.

Cheers



Re: [Talk-us] Resigning in protest

2010-05-12 Thread andrzej zaborowski
On 13 May 2010 02:32, Frederik Ramm frede...@remote.org wrote:
 Anthony wrote:
         What if a new contributor reverts it?  Would the revert then be
         considered ODBL?

     A revert is an edit like any other.

 What does that mean?

 It means that the legal situation in the following two cases is exactly
 the same:

 1) A creates road; B edits road; C edits road.
 2) A creates road; B deletes road; C undeletes road.

Well, I can kind of see a problem here (and am not in the states now
:-) ).  In both situations the final version is a derived work of
version A or B, or even a copy.  User C obtained version B under
CC-By-SA, but claims to hold copyright of it and grant all the rights
to OSMF when she uploads her change.

Cheers



Re: [Talk-us] Admin boundaries tied to roads

2010-04-25 Thread andrzej zaborowski
Hi Alan,

On 24 April 2010 06:33, Alan Mintz alan_mintz+...@earthlink.net wrote:
 At 2010-04-22 13:09, andrzej zaborowski wrote:
  On 22 April 2010 04:24, Alan Mintz alan_mintz+...@earthlink.net wrote:
   At 2010-04-21 17:12, andrzej zaborowski wrote:
  On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote:
    On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com
    wrote:
    Where's damage in that -- is it in that you can now read the name out
    without checking the documentation for what that funny string means in
    that particular database that is TIGER?
  
   I just had a machine crash as I was trying to find stats, but I'll bet
   that at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the
   occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W
   as a prefix to a street name, it's clear to everyone what that means.
   These are the same abbreviations that _everyone_ uses every day -
   children, adults, businesses, governments, etc.
  
  Well, you just gave examples of the obvious ones, I'm not claiming any of
 these are not known.  But the list has 672 different forms.

 My point, though, was that we were going to a lot of trouble for a small
 percentage of real-world cases that _might_ (see below) present a problem
 for someone to understand.

Right, but we don't want to be inconsistent or we again have to keep
lists of exceptions to the normal rules in every tool.  Even if we
just wanted to document that on the wiki (or elsewhere, really doesn't
need to be wiki) for new mappers, then it would have to say something
like "Don't use abbreviations in name=, except final St in
English-speaking countries and Foo in Bar-speaking countries and..."
and so on.  Let's just avoid this area completely.



  (but even the easy ones are hard for non-human consumers because St has
 at least three possible meanings, all three quite popular across the db).

 I'm sorry, but as a suffix (i.e. for the regex / St$/), what else does St
 mean but Street?

Sure you can have a regex for every allowed abbreviation, perhaps a
few regexes for some of the more complicated ones like St before names
of saints, and then for every language and every source of data, at
which point you start having to look at the source= tag or other tags
before you can fully interpret name=, because in TIGER data Stra at
the end is for Stravenue while in other places (nominatim's current
list of abbreviations) Stra at the end is for Straight.
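
To make the point concrete, here is a rough Python sketch of what a data
consumer ends up writing once abbreviations are allowed in name=.  The
tables are tiny made-up samples, not the real TIGER or nominatim lists,
and the source labels "tiger" and "other" are hypothetical; the Stra
entries follow the two readings mentioned above.

```python
# Suffix expansions depend on which ruleset the data came from.
SUFFIX_RULES = {
    "tiger": {"St": "Street", "Ave": "Avenue", "Stra": "Stravenue"},
    "other": {"St": "Street", "Ave": "Avenue", "Stra": "Straight"},
}
# The same token means something else at the start of a name.
PREFIX_RULES = {"St": "Saint"}


def expand(name, source):
    """Expand a leading and a trailing abbreviation in `name`."""
    words = name.split()
    if len(words) > 1 and words[-1] in SUFFIX_RULES[source]:
        words[-1] = SUFFIX_RULES[source][words[-1]]
    if len(words) > 1 and words[0] in PREFIX_RULES:
        words[0] = PREFIX_RULES[words[0]]
    return " ".join(words)
```

The same input string expands differently depending on `source`, which is
exactly the extra lookup a consumer would have to do before it could
interpret name=.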



   And I will do so again. My problem is mostly that this was done without a
   safety net. You clobbered existing data with no easy way to walk it
 back...
  
  Well, the way to walk it back is pretty easy, all the names can be
 taken from version-1 or reassembled from the tiger tags, so no worries there.

 This doesn't work for streets that were edited by users. Again, my problem
 is that, in thousands of edits, I specifically only expanded, for example,
 the prefix N to North when it is logically part of the root name. When
 it is logically a housenumber suffix, as it is in the majority of southern
 CA, I left the prefix alone. The road name may have been otherwise edited,
 though (to correct spelling, rename completely, etc.) This was to be used
 in the future when we could agree on a way to correctly separate these
 component parts of the name, as they are and must be in any database to be
 used with routing and street addressing in the real world. To walk it
 back, we will have to query the history of the way and find the version
 before the bot, to see what was done. It's not just v1, or TIGER, because
 it may have been otherwise edited. It's not even v[last-1] any more because
 there may have been other edits since the bot (I've done many myself).

Well I can provide you a list of the original names before I touched
them with the script along with their id's and versions so you can
check if the name has been edited afterwards, if you need to revert
these edits.  Note the edits also contain hundreds if not thousands of
my manual fixes for some frequent typos in TIGER and for some cases of
wrong segmentation into direction_prefix, base_name etc.
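
A minimal sketch of how such a list could be used, in Python.  It assumes
a hypothetical layout where the list maps each way id to the version the
bot produced plus the original name, and where the current version of
each way has been fetched separately; none of these structures come from
the actual script.

```python
def split_by_later_edits(bot_log, current):
    """Separate ways nobody touched after the bot from the rest.

    bot_log: {way_id: (version_written_by_bot, original_name)}
    current: {way_id: (current_version, current_name)}
    """
    untouched, edited_since = [], []
    for way_id, (bot_version, _original_name) in bot_log.items():
        entry = current.get(way_id)
        if entry is not None and entry[0] == bot_version:
            untouched.append(way_id)      # safe to restore the old name
        else:
            edited_since.append(way_id)   # edited or deleted later: review
    return untouched, edited_since
```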

 I don't understand. Why do I have to remember them? Am I not capable of
 inferring their meaning? Do I have to infer anything anyway, since they are
 likely to be similar/identical to signage?

You have to if you want to give the name to somebody on the phone or
find a name someone gave you on the phone.

Cheers



Re: [Talk-us] Admin boundaries tied to roads

2010-04-22 Thread andrzej zaborowski
On 22 April 2010 04:24, Alan Mintz alan_mintz+...@earthlink.net wrote:
 At 2010-04-21 17:12, andrzej zaborowski wrote:
On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote:
  On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com
  wrote:
  Where's damage in that -- is it in that you can now read the name out
  without checking the documentation for what that funny string means in
  that particular database that is TIGER?

 I just had a machine crash as I was trying to find stats, but I'll bet that
 at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the
 occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W
 as a prefix to a street name, it's clear to everyone what that means. These
 are the same abbreviations that _everyone_ uses every day - children,
 adults, businesses, governments, etc.

Well, you just gave examples of the obvious ones, I'm not claiming any
of these are not known.  But the list has 672 different forms.
(but even the easy ones are hard for non-human consumers because St
has at least three possible meanings, all three quite popular across
the db).

 And I will do so again. My problem is mostly that this was done without a
 safety net. You clobbered existing data with no easy way to walk it back.
 The existing name value should have been put in a foo_name tag so we could
 at least see what used to be. I would at least encourage that a bot be run
 to find these edits, find the previous version in history, and do this, if
 we can't soon agree on a better schema to split the name up into components
 at the same time.

Well, the way to walk it back is pretty easy, all the names can be
taken from version-1 or reassembled from the tiger tags, so no worries
there.


I don't know who defined the ones used in TIGER but this is not the
only way to abbreviate the names, that is proven by USPS having their
own list that is not identical.  The most popular words will be the
same in both lists but some are really cryptic and arbitrary, could as
well be numeric codes.  Then TIGER also includes Spanish names and the
list has abbreviations for those too, which hardly anyone in the US can
read, while they can cope with the unabbreviated forms just fine.

 I don't agree. Much of the US speaks Spanish. Many more possess the
 tremendous brainpower and enough grade-school Spanish required to know that
 Cl. in front of a street name might mean Calle or Cam. might mean Camino,
 or that S means Sur and N means Norte.

But do you remember the 600 abbreviations used in TIGER?  They are
neither practical nor useful and don't help anyone; they're much like
numerical codes.  The one thing they may be good for is rendering at
lower zoom levels.




 name: The pre-balrog name

In 99% of the cases this was an arbitrary version of the name, taken
from a database which was chosen only on the basis of its license, not
because it was more correct or anything.  So I don't see any reason to
hang on to it.


  The reason it was done with a script is that doing it manually was
  taking a lot of time and mappers were spending that time doing this
  instead of going out mapping.  And it's always been on the wiki about
  not using abbreviated names, even when the original import was done,
  ignoring this.

 So what most newbies, including myself, did, was to follow the style of the
 majority of the data, instead of the often-outdated, incomplete, and
 inaccurate wiki, which is often not even self-consistent.

The majority of the data in this case was an imported dataset that
hasn't even been fully reviewed by a human, so while I agree learning
by example is a good way to make a quick start, it doesn't mean if you
followed the example then it's the only correct way to go.
I'm not using wiki as an argument to tell you what you should do, but
I think it's a good way to see what others were thinking.  I have
never edited the Key:name page, and I had never read it before
noticing that using abbreviations in a dataset that is supposed to be
parseable is a recipe for problems.



 In the Los Angeles area, I rarely saw expanded names (which is why I
 continue to abbreviate), except for those rare instances where someone drew
 a street from scratch before TIGER (apparently), and not even all of those.


You could surely change the wiki but it's a conclusion that a lot of
people individually seem to come to so I'm sure you wouldn't even need
a bot before someone would add a phrase to that effect.

 I don't know about a lot. I mostly just hear people regurgitate the
 don't abbreviate mantra without justification. Admittedly, maybe it's
 because it's already been hashed out to death and I'm late to the party.
 Regardless, maybe I'm not alone, and it deserves some re-thinking.

 Do people that are actually mapping (not bulk-importers) really want to
 type in North Martin Luther King, Junior Boulevard Southwest and then
 proofread that to make sure they didn't typo anything?

It completely depends on what

Re: [Talk-us] Admin boundaries tied to roads

2010-04-22 Thread andrzej zaborowski
On 22 April 2010 17:40, Apollinaris Schoell ascho...@gmail.com wrote:
 On 21 Apr 2010, at 17:12 , andrzej zaborowski wrote:
 The signs are posted there by authorities so this is similar to having
 access to a tiny piece of a map or database made by these authorities.
 For maps people usually agreed on this list that we don't trust them.


 are you saying authorities are wrong and we should correct what they are 
 doing and follow tiger or USPS standards instead?

I'm saying we should name the objects what they're called, not what it
is written as in somebody's database.


 Is the wiki any better as a reference than what is in the osm DB? I could
 change the wiki and then will someone write a bot to reverse it? Is the wiki
 written with the situation in US in mind?

 Well one good rule is if there should be any rules then they should be 
 global.


 no not at all. US is very different in many aspects and has to be done 
 different. several countries don't use abbrev names on maps or addresses. 
 Most street names don't even have a st/ave/blvd/ct … postfix at all and so 
 there is no reason to even discuss this topic. And in case they use abbrev 
 it's only when there is a need to shorten. But all official use will be 
 expanded. But in US it looks very much it's the opposite. abbrev is the 
 standard use model and expanded name is the exception

Seriously?  I can't think of a single place in Europe where the
street part is not commonly abbreviated just like what you describe
(maybe Germany, but I wouldn't know).  Just look at some paper maps or
postal addresses, or google, you will very rarely find the names
spelled out in full.  In the UK it's pretty much like in the US with
regard to the feature type suffix (St/Ave...) ([1]) but people have
been fixing it in OSM for some time, in Germany I think they use Str.
though not sure how commonly.  In all the Slavic countries Street is
abbreviated to the prefix "ul." and Avenue to "al." practically always
(just look at Belarus in OSM), in Hungary it's a Ut. prefix, in
Spain C/ (although the OSM community there agreed to not go with the
popular forms and spell everything out and put in any optional
articles someone might possibly squeeze in when referring to the
street -- basically use the longest form, to avoid ambiguity.  So you
won't find C/ in OSM even though it's on the signs), in Turkey it's
Sk. for sokak, in Greece it's something like Od, I don't remember
exactly.  Someone on IRC yesterday asked whether they should put the
Greek names in all caps because the street signs are in all caps.  I
guess your answer would be yes, they should?

Cheers

1. http://osm.org/go/erdGBcIdM-



Re: [Talk-us] Admin boundaries tied to roads

2010-04-21 Thread andrzej zaborowski
On 20 April 2010 05:24, Apollinaris Schoell ascho...@gmail.com wrote:
 Sounds a lot like the IMO ill-considered road name expansion that was
 apparently agreed upon by a small group of people without input from the
 majority of active mappers whose work has been damaged.

 agreed, no idea why this was done. it's a change without much benefit but 
 lot's of damage.

Where's damage in that -- is it in that you can now read the name out
without checking the documentation for what that funny string means in
that particular database that is TIGER?  You can now also write an
intelligent search engine that will understand both forms, you can
pipe the names through text-to-speech and do a lot more.

The reason it was done with a script is that doing it manually was
taking a lot of time and mappers were spending that time doing this
instead of going out mapping.  And it's always been on the wiki about
not using abbreviated names, even when the original import was done,
ignoring this.

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-13 Thread andrzej zaborowski
On 8 April 2010 01:09, andrzej zaborowski balr...@gmail.com wrote:
 As for the different segments of
 the name, there are already fields for them which we inherited from
 TIGER, you'll find the middle of the name is unmodified in the
 tiger:name_base= tag, the cardinal direction in
 tiger:name_direction_prefix= and tiger:name_direction_suffix= and the
 feature type (Street, Ave etc) in tiger:name_type=.

Going back to the segmenting discussion, I have been manually
reviewing great numbers of TIGER streets, confirming some of the
riskier changes done by the bot, and can say with good confidence that
all or most of the name_direction_prefix, name_direction_suffix,
name_base and name_type attributes have been automatically generated
from the full name.  It can be seen in the types of errors found in
the segmentation.

One of the more interesting errors is that in areas where there are
only English names, if the name starts or ends with O, TIGER removed
it from name_base and put it in the prefix or suffix.  That must be
because TIGER is adapted to both English and Spanish, so for it the
cardinal directions are all of N, E, W, S, O, SE, SW, SO, NE, NW, NO.

I've been correcting the ones I found so that the tiger tags can be
used as basis for some unified tags.

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-11 Thread andrzej zaborowski
On 9 April 2010 15:30, Matthias Julius li...@julius-net.net wrote:
 Val Kartchner val...@gmail.com writes:

 3) Prefix, body, suffix is available from the TIGER data, but what about
 streets that have already been added (or corrected) by users?  As we've
 seen, a bot won't always be able to correctly make these separations (as
 in the example of Southbay vs. South Bay given previously)  How do
 we make it so that it meets the goals I've given?

 I would say:
 - assemble the name out of the tiger:name_* tags
 - if that matches the name tag re-assemble the name while expanding
 tiger:name_direction_prefix and tiger:name_direction_suffix and
 replace the name tag.

Ok, added the check in r20882 although I'd say the script is useful
for data from sources other than TIGER too.

I don't think that only the direction_prefix/suffix should be
expanded; basically the whole name should be written the way it is
pronounced, to avoid ambiguity.
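
The check proposed above (only rewrite the name when it still matches
what the tiger:name_* tags assemble to) could look roughly like this in
Python.  This is a sketch, not the script from r20882: the expansion
tables are tiny samples rather than the TIGER list, and the part order
(prefix, base, type, suffix) is an assumption.

```python
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
TYPES = {"St": "Street", "Ave": "Avenue", "Blvd": "Boulevard", "Rd": "Road"}


def expanded_name(tags):
    """Return the expanded name, or None if the name tag was edited."""
    parts = [
        (tags.get("tiger:name_direction_prefix"), DIRECTIONS),
        (tags.get("tiger:name_base"), {}),  # base name is never expanded here
        (tags.get("tiger:name_type"), TYPES),
        (tags.get("tiger:name_direction_suffix"), DIRECTIONS),
    ]
    present = [(value, table) for value, table in parts if value]
    assembled = " ".join(value for value, _table in present)
    if not assembled or tags.get("name") != assembled:
        return None  # hand-edited name or missing tags: leave it alone
    return " ".join(table.get(value, value) for value, table in present)
```

A way whose name was already changed by a mapper no longer matches the
assembled string, so the function returns None and the name is left
untouched.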

The East Doctor Martin Luther King, Junior Boulevard is an example
that I think shows that the direction parts of the name are the least
of the problems.  On the signage the name appears as E DR MLKjr BLVD
or similar.

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-11 Thread andrzej zaborowski
On 10 April 2010 11:07, Richard Finegold goldfndr+...@gmail.com wrote:
 On Thu, Apr 8, 2010 at 20:32, Val Kartchner val...@gmail.com wrote:
 On Thu, 2010-04-08 at 00:59 +0200, andrzej zaborowski wrote:
 On 7 April 2010 20:12, Mike Thompson miketh...@gmail.com wrote:
  Having said that, I think it is a bad idea to have a bot going
 through
  and attempting to expand abbreviations.

 I agree. If a bot can do this then that is evidence that a renderer or
 other data consumer can expand them if desired. But is the bot
 supplying its source of heightened accuracy? Surely the bot isn't
 checking physical signage.

No, its only useful piece of knowledge is that the TIGER ruleset
applies to this data here.  If you had to incorporate in your renderer
such a bot for every one of the 200 countries this wouldn't be fun,
that's why the consensus (according to wiki and discussions on t...@..
and irc) is not to use abbrevs at all. (with some exceptions)

So it really should have been done as part of the conversion from
TIGER to osm format.



 but not at lower zooms. There's a claim of This will allow a renderer
 to introduce abbreviations as necessary. in the wiki, but is it true?
 Does andrzej or someone else have an algorithm that can work with the
 examples in the essay at http://vidthekid.info/misc/osm-abbr.html

No, but it's a good idea, on some weekend I'll set up a mapnik and see
how to get it to display shortened names in US at lower zoom levels.

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-09 Thread andrzej zaborowski
On 9 April 2010 15:06, Matthias Julius li...@julius-net.net wrote:
 Val Kartchner val...@gmail.com writes:

 On Thu, 2010-04-08 at 12:23 -0400, Richard Welty wrote:
 i don't think anyone would argue with this. it's why having a bot
 rampage through
 fixing things is probably a Real Bad Idea unless it's extremely well
 thought out
 and comprehensively tested beforehand.

 While I didn't like what the bot was doing (at the time),

 What was the bot doing?

 I don't think rampage is the correct word to use.  That implies
 malice, which wasn't what was attempted.  However, it did have a
 beneficial side effect: this topic.  ;-)

 In the special case of TIGER data there is a tag
 tiger:name_type=Rd|Ct|Dr|...

 I would have thought it should be fairly safe to reconstruct the name
 from the tiger:name_* tags while expanding tiger:name_type - IF the
 name is still the original one.

Except for a few caveats the bot follows the TIGER documentation and
expands everything listed there (taking into account the suffix/prefix
requirements), it only touches name and name_1, 2 and so on, leaving
alone other tags.  I did a dry run on a piece of Canada and the
ruleset applies pretty well there too, the streets there were from
Geobase.

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-08 Thread andrzej zaborowski
Hi Val,
you sent the mail to me only; you might want to resend it to the list.

On 8 April 2010 08:31, Val Kartchner val...@gmail.com wrote:
 5) Should suggestions be given to renderers to use the USPS
 abbreviations?

Another possibility is to use the TIGER guidelines, if the USPS list
has issues related to copyright.

  b) I have used the alternate names (name, name_1, name_2, etc.) for
 alternates which would include expansions of the abbreviations.  Should
 we establish a standard for how these are used and their order?  For
 instance, north of 200 North, Washington Blvd is also 400 East and State
 Route 235 (though I know that routes are now tagged by relations).

There's also some confusion with what name_N tags are used for in
TIGER imports vs. the rest of OSM.  There are different tags for the
different names according to their type:

loc_name= for a not-necessarily official name, but used by locals,
official_name=,
alt_name=
int_name= (I'm not sure what this one is for)
ref=
reg_ref=

while in TIGER all of these are stuffed into name_N= tags as same
level citizens in seemingly random order.  I think maybe the _N
convention should only be used for different possible spellings of the
same name, while things like National Forest Development Road 0160
should be put in ref= or reg_ref=, and things like US Highway 30
removed completely and moved to relation's name.  And things like
Cemetery Road into loc_name unless name is empty.

Cheers,
Andrew



Re: [Talk-us] Street Naming Conventions

2010-04-08 Thread andrzej zaborowski
On 8 April 2010 15:48, Lord-Castillo, Brett
blord-casti...@stlouisco.com wrote:
 One issue with using unabbreviated names, is sometimes the abbreviations are 
 part of the official name.
 Examples here:
 1st Community CU Dr (First Community Credit Union goes to a -different- 
 address)
 River City Blvd/River City Casino Blvd; many people think the first is an 
 abbreviation of the latter. It isn't; they are two different streets that will 
 route mail (and traffic) to two different sets of addresses
 St Louis Street, which is different from Rue St Louis, which is different 
 from Saint Louis Street and Saint Louis Boulevard, which is still different 
 from The Boulevard St Louis. In each of those cases, the non-type 
 abbreviations are part of the name and expanding the abbreviations can turn 
 them into different streets.

In 1st Community CU Dr when you read it out, I guess you do say ...
CU Drive.

But St Louis Street vs. Saint Louis Street? They're pronounced
exactly the same way and I'm sure 90% of people will write both of them
as St Louis Street when in a rush.  Are there seriously cases where
these are different addresses?  What do the people living at St Louis St
say on the phone when asked for their address?

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-08 Thread andrzej zaborowski
On 8 April 2010 22:40, Dale Puch dale.p...@gmail.com wrote:
 Using a bot only for specific, well-known suffix abbreviations should be
 reasonably safe, with rules such as: never change ST to Street if it is a
 prefix.

TIGER specifies which qualifiers can appear as prefixes and which as
suffixes.  Unfortunately a lot of the data is inconsistent with the
docs: for example, for National Forest Development Road I counted about
30 different variations of the short version, while the documentation
lists only one.
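
For what it's worth, the "suffix only" rule could look roughly like
the Python sketch below.  The table is a tiny made-up subset rather
than the real TIGER list, and the function name is hypothetical.

```python
# Tiny, made-up subset of suffix abbreviations; the real TIGER list is longer.
SUFFIXES = {
    "St": "Street",
    "Ave": "Avenue",
    "Blvd": "Boulevard",
    "Dr": "Drive",
    "Rd": "Road",
}


def expand_suffix(name):
    """Expand only the last word of a name, and only if it is a known suffix.

    Words in any other position are left alone, so a leading "St"
    (which may mean "Saint") is never touched.
    """
    words = name.split()
    if len(words) < 2:
        return name  # a lone word is not "base name + suffix"
    last = words[-1].rstrip(".")
    if last in SUFFIXES:
        words[-1] = SUFFIXES[last]
    return " ".join(words)
```

With this, "St Louis St" becomes "St Louis Street" while "Rue St Louis"
stays as it is.  A name with a trailing direction, like "Main St NW",
is not expanded at all, which errs on the safe side.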

Cheers



Re: [Talk-us] Street Naming Conventions

2010-04-07 Thread andrzej zaborowski
Hi,

On 7 April 2010 20:12, Mike Thompson miketh...@gmail.com wrote:
 Having said that, I think it is a bad idea to have a bot going through
 and attempting to expand abbreviations.

I ran the bot ([1]) over the west half of the US because another
mapper in Portland, OR and I were tired of correcting the names
manually when a lot of the work could really be automated.
Abbreviations in the TIGER dataset are documented, both English and
Spanish, and the bot is based on the list from the TIGER docs (I
believe this should have been done before the import, as it has been
done in the case of other imports elsewhere in the world).  After
Oregon I ran the bot on the other states because of the comments I got
from mappers on IRC.  This was what prompted Val to start the
discussion here.  I'm going to hold off on it in light of your
comment.  Funnily enough, in an IRC discussion we concluded that it was
nice that at least one thing had been agreed on in OSM :)

I need to mention that the script is not fully automatic and a lot of
names have been reviewed manually with note= tags added here and
there, where stuff was abbreviated/misspelled beyond recognition.  The
current version knows about a lot of undocumented abbreviations that
appear a lot in TIGER and the most common misspellings -- TIGER is
really bad about consistency.  Surely there are cases of erroneously
expanded names and I'll be fixing them as they're found.
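
As an illustration only (this is not the code from [1]), the "flag it
for a human with a note= tag" step could be sketched in Python like
this.  The vowel heuristic and the function names are assumptions made
up for the example.

```python
VOWELS = set("aeiouy")


def suspicious_words(name, known):
    """Return words that look abbreviated but are not in the known set.

    Heuristic (an assumption, not a rule from the TIGER docs): a purely
    alphabetic word without any vowel is probably an abbreviation.
    """
    found = []
    for word in name.split():
        bare = word.strip(".,")
        if bare in known:
            continue
        if bare.isalpha() and not VOWELS & set(bare.lower()):
            found.append(bare)
    return found


def flag_for_review(tags, known):
    """Return a copy of tags, with a note= added when the name needs a look."""
    out = dict(tags)
    flagged = suspicious_words(tags.get("name", ""), known)
    if flagged:
        msg = "unrecognised abbreviation(s) in name: " + ", ".join(flagged)
        # Keep any note that is already there instead of overwriting it.
        out["note"] = out["note"] + "; " + msg if "note" in out else msg
    return out
```

For example, with known = {"Rd", "St"}, a way named "Ntl Frst Dev Rd 160"
gets a note mentioning "Ntl" and "Frst", while "Main Street" passes
through unchanged.  Abbreviations that contain a vowel ("Dev") slip
through, so this only narrows down what has to be checked by eye.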

Cheers

1. http://svn.openstreetmap.org/applications/utils/import/tiger2osm/expand/



Re: [Talk-us] Street Naming Conventions

2010-04-07 Thread andrzej zaborowski
On 8 April 2010 00:13, Dale Puch dale.p...@gmail.com wrote:
 Personally I like the 3 field, or even the 4 field storage of the name.
 Yes that means there is not a single field that has the full name, but I do
 not think a lookup and concat of 4 fields vs 1 field is much different in an
 indexed database.  Perhaps a DB admin can shed some light on the best
 approach from a design and system resources standpoint.

 This may be a problem for apps, but if it is the best way to manage the
 data, the apps just need to be changed.  And it would not be that hard to
 make the programming change if I can figure out how to do it with my limited
 knowledge.

 Some of the examples comma separated into the 4 field format:
 South, ,1000 East, Street
 ,State, Park, Street
 ,Saint, Tropez, Street

Paul Johnson mentioned on IRC today the case of East Doctor Martin
Luther King, Junior Boulevard, which wouldn't work with this schema,
and I don't even want to imagine how the schema would be adapted to
the 200 other languages used in the database :)  I think a tag like
name= should really be consistent so tools can rely on it without
adapting to every single country.  As for the different segments of
the name, there are already fields for them which we inherited from
TIGER: you'll find the middle of the name unmodified in the
tiger:name_base= tag, the cardinal direction in
tiger:name_direction_prefix= and tiger:name_direction_suffix=, and the
feature type (Street, Ave etc.) in tiger:name_type=.
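
A tool that wants the separate segments can put them back together
itself, roughly as in the Python sketch below.  The key names in PARTS
are my assumption of how the TIGER import named these tags, so check
them against the actual tags on your ways; the function name is made
up.

```python
# Key names are assumed; check them against the actual tags on your ways.
PARTS = (
    "tiger:name_direction_prefix",
    "tiger:name_base",
    "tiger:name_type",
    "tiger:name_direction_suffix",
)


def name_from_parts(tags, parts=PARTS):
    """Join the non-empty component tags, in order, with single spaces."""
    return " ".join(tags[k].strip() for k in parts if tags.get(k, "").strip())
```

Note that the components still carry TIGER's abbreviated forms, so the
result will usually differ from an expanded name= on the same way.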

Cheers
