Re: [Talk-us] Admin boundaries tied to roads
On Tue, 2010-04-27 at 10:57 -0700, am12 wrote:

I'm saying that abbreviations are part of every day life, and locals know what to abbreviate and what not to.

Sure, according to their local usage, which will be inconsistent with local usage in other places. What one local thinks is an obvious abbreviation usage because everyone knows it will not be obvious to a map user from elsewhere.

How does commercial text-2-speech handle this?

Unabbreviated, better-structured data.

So, does that mean that street names like 40th Street, for instance should be expanded to Fortieth Street?

- Val -
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us
Re: [Talk-us] Admin boundaries tied to roads
I understand that this is a collaborative project, where standards are as much defined by what somebody decides to do as anything else. Neither the wiki pages nor mailing list opinions (or votes) are definitive mandates. Given that, I'll toss my opinion out here. I'm saying that abbreviations are part of every day life, and locals know what to abbreviate and what not to. Sure, according to their local usage, which will be inconsistent with local usage in other places. What one local thinks is an obvious abbreviation usage because everyone knows it will not be obvious to a map user from elsewhere. How does commercial text-2-speech handle this? Unabbreviated, better-structured data. Can we agree for now that, with appropriate local knowledge, it will be acceptable to strip just these prefixes out of the name tag into another tag? Supplemental tags are great, but don't remove it from the name tag. Accepted OSM usage is the name tag is the complete full name. There are other variations like local_name or alt_name for the shortened version. There would have to be both a Something XYZ and a Something ABC in the same general area for you to get lost. Apparently you don't have many of these in your local area so you don't seem too concerned about it. My local area? I have them, and it's a pain. Multiply this by the already small percentage of both ABC and XYZ being uncommon abbreviations, and you have a really small set. And keeping unabbreviated data still eliminates this problem completely. To me, it's pretty simple: you can go from more data to less easily (full to abbreviated), but when you extrapolate backwards from less to more you will lose somewhere. Remember the mantra about don't tag for the renderer? It's there for a reason. OSM, in philosophy, is not about creating a pretty map. It is about creating an underlying map data set, and creating a pretty map is one of the key uses of it, but not the only one. Printing abbreviations is a job for the renderer. 
I understand the feeling that "I can't change the renderer myself, but I can change the data entry myself, so that's the right thing for me to do." But it still doesn't make it the best solution. Let's make the data as clear and unambiguous as possible, and if the renderer needs fixing, work on it there. That's my free opinion, worth every penny :-) - Alan Millar
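As a rough illustration of the point above that going from full names to abbreviations is easy while going back loses information, here is a minimal Python sketch. The word table is a made-up four-entry sample chosen for illustration; it is not TIGER's list, the USPS list, or anything used by an OSM tool.

```python
# Hypothetical sample table (full word -> abbreviation). Real lists are far
# longer; the thread mentions several hundred forms in TIGER alone.
ABBREVIATIONS = {
    "Street": "St",
    "Saint": "St",
    "Avenue": "Ave",
    "North": "N",
}


def abbreviate(name: str) -> str:
    """Full -> abbreviated: a plain lookup per word, always well defined."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in name.split())


def candidate_expansions(token: str) -> list[str]:
    """Abbreviated -> full: every full word that shares this abbreviation."""
    return sorted(full for full, short in ABBREVIATIONS.items() if short == token)


print(abbreviate("North Saint Charles Street"))  # N St Charles St
print(candidate_expansions("St"))                # ['Saint', 'Street']
```

Even with this tiny table, "St" has two candidate expansions, so a consumer of abbreviated data needs extra rules (position in the name, language, data source) that a consumer of unabbreviated data does not.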
Re: [Talk-us] Admin boundaries tied to roads
On Mon, 2010-04-26 at 16:31 -0700, Alan Mintz wrote:

Good. We also need to settle on a set of component tags to make best use of the information present in those edits - particularly to separate out cardinal directions from those that are really part of the name. Can we agree for now that, with appropriate local knowledge, it will be acceptable to strip just these prefixes out of the name tag into another tag? Should I propose a set of component tags for a (hopefully quick) vote? The suffixes and root tags could then be populated at the same time (without stripping them from the name).

I second you proposing this. We need to separate out the prefix, suffix and root. Though you need to remember these things when you make your proposal: http://vidthekid.info/misc/osm-abbr.html

- Val -
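To make the prefix/root/suffix idea concrete, here is a small Python sketch of how component tags might be derived from an unabbreviated name. No tag names had been agreed in the thread, so the keys `name_prefix`, `name_root` and `name_suffix` are placeholders invented for this example, and the word lists are deliberately tiny. The sketch leaves `name` untouched, which is one of the two positions argued in the thread, not a settled convention.

```python
# Illustrative word lists only; a real proposal would need a much fuller set.
DIRECTIONS = {"North", "South", "East", "West"}
STREET_TYPES = {"Street", "Avenue", "Boulevard", "Lane", "Circle"}


def split_name(name: str) -> dict[str, str]:
    """Derive hypothetical component tags while keeping the full name intact."""
    words = name.split()
    parts = {"name": name}  # the complete name is never stripped
    # Peel off a trailing street type, but never consume the only word.
    if len(words) > 1 and words[-1] in STREET_TYPES:
        parts["name_suffix"] = words.pop()
    # Peel off a leading direction, again leaving at least one word as root.
    if len(words) > 1 and words[0] in DIRECTIONS:
        parts["name_prefix"] = words.pop(0)
    parts["name_root"] = " ".join(words)
    return parts


print(split_name("North Main Street"))
print(split_name("North Street"))  # "North" stays as the root here
```

A purely mechanical split like this cannot tell whether a leading direction is logically part of the root name or an addressing prefix; that distinction is exactly the local knowledge the proposal asks mappers to supply.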
Re: [Talk-us] Admin boundaries tied to roads
Hi Alan, On 24 April 2010 06:33, Alan Mintz alan_mintz+...@earthlink.net wrote: At 2010-04-22 13:09, andrzej zaborowski wrote: On 22 April 2010 04:24, Alan Mintz alan_mintz+...@earthlink.net wrote: At 2010-04-21 17:12, andrzej zaborowski wrote: On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote: On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com wrote: Where's damage in that -- is it in that you can now read the name out without checking the documentation for what that funny string means in that particular database that is TIGER? I just had a machine crash as I was trying to find stats, but I'll bet that at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W as a prefix to a street name, it's clear to everyone what that means. These are the same abbreviations that _everyone_ uses every day - children, adults, businesses, governments, etc. Well, you just gave examples of the obvious ones, I'm not claiming any of these are not known. But the list has 672 different forms. My point, though, was that we were going to a lot of trouble for a small percentage of real-world cases that _might_ (see below) present a problem for someone to understand. Right, but we don't want to be inconsistent or we again have to keep lists of exception to the normal rules in every tool. Even if we just wanted to document that on the wiki (or elsewhere, really doesn't need to be wiki) for new mappers, then it would have to say something like Don't use abbreviations in name=, except final St in English speaking countries and Foo in Bar speaking countries and... and.. and so on... Let's just avoid this area completely. (but even the easy ones are hard for non-human consumers because St has at least three possible meanings, all three quite popular across the db). I'm sorry, but as a suffix (i.e. for the regex / St$/), what else does St mean but Street? 
Sure you can have a regex for every allowed abbreviation, perhaps a few regexes for some of the more complicated ones like St before names of saints, and then for every language and every source of data, at which point you start having to look at the source= tag or other tags before you can fully interpret name=, because in TIGER data Stra at the end is for Stravenue while in other places (nominatim's current list of abbreviations) Stra at the end is for Straight. And I will do so again. My problem is mostly that this was done without a safety net. You clobbered existing data with no easy way to walk it back... Well, the way to walk it back is pretty easy, all the names can be taken from version-1 or reassembled from the tiger tags, so no worries there. This doesn't work for streets that were edited by users. Again, my problem is that, in thousands of edits, I specifically only expanded, for example, the prefix N to North when it is logically part of the root name. When it is logically a housenumber suffix, as it is in the majority of southern CA, I left the prefix alone. The road name may have been otherwise edited, though (to correct spelling, rename completely, etc.) This was to be used in the future when we could agree on a way to correctly separate these component parts of the name, as they are and must be in any database to be used with routing and street addressing in the real world. To walk it back, we will have to query the history of the way and find the version before the bot, to see what was done. It's not just v1, or TIGER, because it may have been otherwise edited. It's not even v[last-1] any more because there may have been other edits since the bot (I've done many myself). Well I can provide you a list of the original names before I touched them with the script along with their id's and versions so you can check if the name has been edited afterwards, if you need to revert these edits. 
Note the edits also contain hundreds if not thousands of my manual fixes for some frequent typos in TIGER and for some cases of wrong segmentation into direction_prefix, base_name etc. I don't understand. Why do I have to remember them? Am I not capable of inferring their meaning? Do I have to infer anything anyway, since they are likely to be similar/identical to signage? You have to if you want to give the name to somebody on the phone or find a name someone gave you on the phone. Cheers
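The walk-back problem discussed in this message (the pre-bot name is not necessarily v1, and not necessarily the second-to-last version) can be sketched in Python as follows. This assumes each way's history is already available as a list of records with `version`, `user` and `name` fields; that record layout, the sample history and the account name `example_bot` are all assumptions made for illustration, not the actual OSM API format or the real bot account.

```python
from typing import Optional


def name_before_bot(history: list[dict], bot_user: str) -> Optional[str]:
    """Return the name from the version just before the bot's first edit."""
    previous = None
    for version in sorted(history, key=lambda v: v["version"]):
        if version["user"] == bot_user:
            return previous["name"] if previous else None
        previous = version
    return None  # the bot never touched this way


def edited_since_bot(history: list[dict], bot_user: str) -> bool:
    """True if anyone other than the bot edited the way after the bot did."""
    seen_bot = False
    for version in sorted(history, key=lambda v: v["version"]):
        if version["user"] == bot_user:
            seen_bot = True
        elif seen_bot:
            return True
    return False


# Made-up history: import, a manual spelling fix, the bot, then a later edit.
history = [
    {"version": 1, "user": "importer", "name": "N Mian St"},
    {"version": 2, "user": "mapper_a", "name": "N Main St"},
    {"version": 3, "user": "example_bot", "name": "North Main Street"},
    {"version": 4, "user": "mapper_a", "name": "North Main Street"},
]

print(name_before_bot(history, "example_bot"))   # N Main St
print(edited_since_bot(history, "example_bot"))  # True
```

In the sample, the name worth restoring is the one from version 2: version 1 still has the import typo, and the second-to-last version is the bot's own edit.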
Re: [Talk-us] Admin boundaries tied to roads
-Original Message- From: Apollinaris Schoell [mailto:ascho...@gmail.com] Sent: Friday, April 23, 2010 9:47 AM To: Lord-Castillo, Brett Cc: 'talk-us@openstreetmap.org' Subject: Re: [Talk-us] Admin boundaries tied to roads On 23 Apr 2010, at 7:13 , Lord-Castillo, Brett wrote: On 19 Apr 2010, at 20:24, Apollinaris Schoell wrote: On 19 Apr 2010, at 20:07 , Alan Mintz wrote: Not to mention that merging them will result in the inability to hide these boundaries. When doing a bunch of editing on a road that follows one, in the past, I've taken the time to verify that the boundary doesn't share any nodes with anything and then remove it from my local OSM file manually so I don't have to constantly deal with it. If it shares nodes with anything else, this is no longer possible. fully agree, the good thing is these boundaries are tiger data and bad data anyway and should be replaced with better boundaries While I understand the mantra of TIGER=Bad because of the state of the road data, this is not true for the boundary data. Most of the boundary data comes directly from recorded surveys (something not available for roads) and is not bad data for most of the United States. The rural areas would be the one exception (mostly because they did not have surveys converted to digital layers in 2000), but rural areas are also highly likely to have realigned boundary roads that no longer correspond to the original boundaries. I can tell for sure that they are completely wrong in California. They are not even close to USGS 24k, don't align with official county borders from official sources and don't align with natural features, fences which are sometimes visible on Yahoo. Yes, California is one of the well-known exceptions. Their LUCA program fell apart (and this time around has been split into two separate regions as a result). 
If you take the Midwest states though, like Iowa, Minnesota, Missouri with their 300+ counties between them, the TIGER lines are directly from official sources, especially the 2009 updates. Brett Lord-Castillo Information Systems Designer/GIS Programmer St. Louis County Police Office of Emergency Management 14847 Ladue Bluffs Crossing Drive Chesterfield, MO 63017 Office: 314-628-5400 Fax: 314-628-5508 Direct: 314-628-5407
Re: [Talk-us] Admin boundaries tied to roads
I'd agree with Brett on the boundaries. The Census data is not perfect by any means, but it's pretty good, at least in my area--Minnesota. (and orders of magnitude better than it was in 2000!) And if it's not good in your area, you should talk to your local government and make sure they're participating in the Census' yearly Boundary and Annexation Survey. http://www.census.gov/geo/www/bas/bashome.html

I can tell for sure that they are completely wrong in California. They are not even close to USGS 24k, don't align with official county borders from official sources and don't align with natural features, fences which are sometimes visible on Yahoo.

To further respond to this, there is no claim by the Census that it has survey accuracy, or that it aligns with other data. Fundamentally, it is created by the Census for internal purposes, and all TIGER boundary data is relative to the other TIGER data. (just like a lot of traced OSM data is relative to the Yahoo imagery) Everybody gets access to it for free and you can use it when it's good or ignore it when it's bad or modify it when it's in between. The bigger issue with it being imported into OSM is the currency, because municipal boundaries are always changing, and as has been mentioned, boundaries are not usually something that is easily verifiable on the ground.

Cheers, Brad
Re: [Talk-us] Admin boundaries tied to roads
On Fri, Apr 23, 2010 at 11:01 AM, Brad Neuhauser brad.neuhau...@gmail.com wrote:

The bigger issue with it being imported into OSM is the currency, because municipal boundaries are always changing, and as has been mentioned, boundaries are not usually something that is easily verifiable on the ground

I'd say the biggest issue is the fact that, when the census bureau couldn't find data on municipalities, they decided to just make shit up. They picked some arbitrary boundary which had roughly the right number of people in it, and then named it after an actual place which happened to be nearby. The CDPs are horrible when used for any purpose other than interpreting census data. I really wish the census bureau had named them CDP 1283, CDP 1284, CDP 1285, etc.
Re: [Talk-us] Admin boundaries tied to roads
At 2010-04-23 18:11, Anthony wrote: A navi system is more useful if the instructions and signs match. Depends on your purpose. If you're trying to navigate to the missigned street (e.g. California Street, where the sign reads Carolina Street), you don't want to get a response of street not found. For most other purposes you'd rather have the incorrect name (at least until it gets fixed). Yeah - this is always a quandary. In my experience, the street sign usually ends up being right anyway, so I'm usually asking the responsible authority to fix their GIS and/or the source map (yes, even tract maps that are decades old :) ). I don't really consider this as original research, since it's really a matter of reconciling sources, but it's admittedly time consuming and requires additional research that many mappers (understandably) may not want to do. Still, I think it's value that I can add, not only to OSM, but also for my fellow citizens. When the sign is wrong, I notify the signing authority and, if it seems that they intend to fix it soon (the usual case), I put the correct value in the name tag and the signed value in the alt_name tag, with a note tag describing the situation. If there is no easy contact with the authority, or it seems they may not fix it soon, I reverse the tagging. Either way, there are notes/FIXMEs there to remind me (or others) to survey again in the future. BTW, technically, I would call surveying/photographing, and then mapping based on it, original research :) P.S. http://www.openstreetmap.org/browse/way/56123368 is one of those strange cases where it's been signed and likely known wrong according to the cited docs, because the signed name is more logical in context. I name'd it as signed and put the recorded name in the official_name tag instead. 
If there's anyone nearby that would like to have a look, it'd be useful to know how it's signed at the intersection with Outer Traffic Circle here: http://www.openstreetmap.org/browse/node/122696036 . -- Alan Mintz alan_mintz+...@earthlink.net
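The tagging convention described in this message for missigned streets can be written down as a small Python sketch. The function name, its parameters and the wording of the note are invented for illustration; only the `name`, `alt_name` and `note` keys come from the message itself, and the street names are the example quoted at the top of it.

```python
def missigned_street_tags(correct: str, signed: str, fix_expected_soon: bool) -> dict[str, str]:
    """Choose which value goes in name= when the sign and the records disagree."""
    if fix_expected_soon:
        # The authority intends to fix the sign: lead with the recorded name.
        name, alt_name = correct, signed
    else:
        # No fix in sight: lead with what a driver will actually see.
        name, alt_name = signed, correct
    return {
        "name": name,
        "alt_name": alt_name,
        "note": f"Signed as '{signed}' but recorded as '{correct}'; resurvey later.",
    }


tags = missigned_street_tags("California Street", "Carolina Street", fix_expected_soon=False)
print(tags["name"])      # Carolina Street
print(tags["alt_name"])  # California Street
```

Either way both names stay searchable, and the note records why the two tags disagree so a later survey can clean it up.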
Re: [Talk-us] Admin boundaries tied to roads
At 2010-04-23 07:47, Apollinaris Schoell wrote:

While I understand the mantra of TIGER=Bad because of the state of the road data, this is not true for the boundary data. Most of the boundary data comes directly from recorded surveys (something not available for roads) and is not bad data for most of the United States. The rural areas would be the one exception (mostly because they did not have surveys converted to digital layers in 2000), but rural areas are also highly likely to have realigned boundary roads that no longer correspond to the original boundaries.

I can tell for sure that they are completely wrong in California. They are not even close to USGS 24k, don't align with official county borders from official sources and don't align with natural features, fences which are sometimes visible on Yahoo.

I don't know about completely. The parts of the Kern/LA/Orange/San Bernardino/Riverside/San Diego borders that I have surveyed are at least close to the signage at important points (admittedly a weak standard), but I've also gone hunting for detail in law in some spots and found that the borders were right as of their date of creation in the source data. I remember manually fixing a little bit of the OC/LA border in La Habra from some sort of change description - maybe something out of the BAS project. What a pain that was.

Is anyone working on borders currently? Is the BAS a reasonable source?

-- Alan Mintz alan_mintz+...@earthlink.net
Re: [Talk-us] Admin boundaries tied to roads
At 2010-04-22 13:33, andrzej zaborowski wrote: On 22 April 2010 17:40, Apollinaris Schoell ascho...@gmail.com wrote: On 21 Apr 2010, at 17:12 , andrzej zaborowski wrote:

The signs are posted there by authorities so this is similar to having access to a tiny piece of a map or database made by these authorities. For maps people usually agreed on this list that we don't trust them.

are you saying authorities are wrong and we should correct what they are doing and follow tiger or USPS standards instead?

I'm saying we should name the objects what they're called, not what it is written as in somebody's database.

What they're called, though, may indeed be from somebody's database, when that database is the county recorder's or assessor's. The recorder, in particular, should be the truth by definition, except when you can see that there's an obvious mistake and can confirm it with them.

-- Alan Mintz alan_mintz+...@earthlink.net
Re: [Talk-us] Admin boundaries tied to roads
On 23 Apr 2010, at 19:46 , Alan Mintz wrote: At 2010-04-23 07:47, Apollinaris Schoell wrote:

I don't know about completely. The parts of the Kern/LA/Orange/San Bernardino/Riverside/San Diego borders that I have surveyed are at least close to the signage at important points (admittedly a weak standard), but I've also gone hunting for detail in law in some spots and found that the borders were right as of their date of creation in the source data. I remember manually fixing a little bit of the OC/LA border in La Habra from some sort of change description - maybe something out the BAS project. What a pain that was.

depends on the definition, for me a difference of 100-200m is too bad. any GPS or verbal description is better if matched with Yahoo. In some corners it's even worse: complex edges have been entirely clipped. USGS is pretty good and matches county borders. County borders are from official state data and are high accuracy. Also Sat matches well when borders follow natural features. USGS tracing is very difficult because borders are often hard to identify among other features.

Is anyone working on borders currently? Is the BAS a reasonable source?

what is BAS? any better source will be useful
Re: [Talk-us] Admin boundaries tied to roads
At 2010-04-22 13:09, andrzej zaborowski wrote: On 22 April 2010 04:24, Alan Mintz alan_mintz+...@earthlink.net wrote: At 2010-04-21 17:12, andrzej zaborowski wrote: On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote: On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com wrote: Where's damage in that -- is it in that you can now read the name out without checking the documentation for what that funny string means in that particular database that is TIGER? I just had a machine crash as I was trying to find stats, but I'll bet that at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W as a prefix to a street name, it's clear to everyone what that means. These are the same abbreviations that _everyone_ uses every day - children, adults, businesses, governments, etc. Well, you just gave examples of the obvious ones, I'm not claiming any of these are not known. But the list has 672 different forms. My point, though, was that we were going to a lot of trouble for a small percentage of real-world cases that _might_ (see below) present a problem for someone to understand. (but even the easy ones are hard for non-human consumers because St has at least three possible meanings, all three quite popular across the db). I'm sorry, but as a suffix (i.e. for the regex / St$/), what else does St mean but Street? And I will do so again. My problem is mostly that this was done without a safety net. You clobbered existing data with no easy way to walk it back... Well, the way to walk it back is pretty easy, all the names can be taken from version-1 or reassembled from the tiger tags, so no worries there. This doesn't work for streets that were edited by users. Again, my problem is that, in thousands of edits, I specifically only expanded, for example, the prefix N to North when it is logically part of the root name. 
When it is logically a housenumber suffix, as it is in the majority of southern CA, I left the prefix alone. The road name may have been otherwise edited, though (to correct spelling, rename completely, etc.) This was to be used in the future when we could agree on a way to correctly separate these component parts of the name, as they are and must be in any database to be used with routing and street addressing in the real world. To walk it back, we will have to query the history of the way and find the version before the bot, to see what was done. It's not just v1, or TIGER, because it may have been otherwise edited. It's not even v[last-1] any more because there may have been other edits since the bot (I've done many myself). ...Then TIGER also includes Spanish names and the list has abbreviations for those too, which rarely anyone in US can read, while they can cope with unabbreviated ok. I don't agree. Much of the US speaks Spanish. Many more possess the tremendous brainpower and enough grade-school Spanish required to know that Cl. in front of a street name might mean Calle or Cam. might mean Camino, or that S means Sur and N means Norte. But do you remember the 600 abbreviations used in tiger? It's neither practical or useful or helps anyone, they're much like numerical codes. The one single thing they may be good for is for rendering at lower zoom levels. I don't understand. Why do I have to remember them? Am I not capable of inferring their meaning? Do I have to infer anything anyway, since they are likely to be similar/identical to signage? Also, to me lower zoom levels is almost any level at which I want to see a map. Anything more than a small neighborhood, and it's all we can do just to fit the root of the name in - we don't need any _more_ characters.
name: The pre-balrog name 99% percent of the cases this was an arbitrary version of name, taken from a database which was chosen only on the basis of its license, not because it was more correct or anything. So I don't see any reason to hang on to it. If I understand you correctly, I disagree completely. In my experience in southern CA, 90% of the time, TIGER is correct with the exception of the presence of the directional prefix. The real problem was the geometry[1]. In the Los Angeles area, I rarely saw expanded names (which is why I continue to abbreviate), except for those rare instances where someone drew a street from scratch before TIGER (apparently), and not even all of those. BTW, from my previously cited data chunk (35988 unique names in about 4400 sq mi (11000 sq km) of southern CA) , I can now say that only ~0.2% of suffixes were present in their expanded form (i.e. Street, Avenue, etc.). You could surely change the wiki but it's a conclusion that a lot of people individually seem to come to so I'm sure you wouldn't even need a bot before someone would add a phrase to that effect. I don't know
Re: [Talk-us] Admin boundaries tied to roads
On 22 April 2010 04:24, Alan Mintz alan_mintz+...@earthlink.net wrote: At 2010-04-21 17:12, andrzej zaborowski wrote: On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote: On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com wrote: Where's damage in that -- is it in that you can now read the name out without checking the documentation for what that funny string means in that particular database that is TIGER? I just had a machine crash as I was trying to find stats, but I'll bet that at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W as a prefix to a street name, it's clear to everyone what that means. These are the same abbreviations that _everyone_ uses every day - children, adults, businesses, governments, etc. Well, you just gave examples of the obvious ones, I'm not claiming any of these are not known. But the list has 672 different forms. (but even the easy ones are hard for non-human consumers because St has at least three possible meanings, all three quite popular across the db). And I will do so again. My problem is mostly that this was done without a safety net. You clobbered existing data with no easy way to walk it back. The existing name value should have been put in a foo_name tag so we could at least see what used to be. I would at least encourage that a bot be run to find these edits, find the previous version in history, and do this, if we can't soon agree on a better schema to split the name up into components at the same time. Well, the way to walk it back is pretty easy, all the names can be taken from version-1 or reassembled from the tiger tags, so no worries there. I don't know who defined the ones used in TIGER but this is not the only way to abbreviate the names, that is proven by USPS having their own list that is not identical. 
The most popular words will be the same in both lists but some are really cryptic and arbitrary, could as well be numeric codes. Then TIGER also includes Spanish names and the list has abbreviations for those too, which rarely anyone in US can read, while they can cope with unabbreviated ok. I don't agree. Much of the US speaks Spanish. Many more possess the tremendous brainpower and enough grade-school Spanish required to know that Cl. in front of a street name might mean Calle or Cam. might mean Camino, or that S means Sur and N means Norte. But do you remember the 600 abbreviations used in tiger? It's neither practical or useful or helps anyone, they're much like numerical codes. The one single thing they may be good for is for rendering at lower zoom levels. name: The pre-balrog name 99% percent of the cases this was an arbitrary version of name, taken from a database which was chosen only on the basis of its license, not because it was more correct or anything. So I don't see any reason to hang on to it. The reason it was done with a script is that doing it manually was taking a lot of time and mappers were spending that time doing this instead of going out mapping. And it's always been on the wiki about not using abbreviated names, even when the original import was done, ignoring this. So what most newbies, including myself, did, was to follow the style of the majority of the data, instead of the often-outdated, incomplete, and inaccurate wiki, which is often not even self-consistent. The majority of the data in this case was an imported dataset that hasn't even been fully reviewed by a human, so while I agree learning by example is a good way to make a quick start, it doesn't mean if you followed the example then it's the only correct way to go. I'm not using wiki as an argument to tell you what you should do, but I think it's a good way to see what others were thinking.
I have never edited the Key:name page, and I had never read it before noticing that using abbreviations in a dataset that is supposed to be parseable is a recipe for problems. In the Los Angeles area, I rarely saw expanded names (which is why I continue to abbreviate), except for those rare instances where someone drew a street from scratch before TIGER (apparently), and not even all of those. You could surely change the wiki but it's a conclusion that a lot of people individually seem to come to so I'm sure you wouldn't even need a bot before someone would add a phrase to that effect. I don't know about a lot. I mostly just hear people regurgitate the don't abbreviate mantra without justification. Admittedly, maybe it's because it's already been hashed out to death and I'm late to the party. Regardless, maybe I'm not alone, and it deserves some re-thinking. Do people that are actually mapping (not bulk-importers) really want to type in North Martin Luther King, Junior Boulevard Southwest and then proofread that to make sure they didn't typo anything? It completely depends on what
Re: [Talk-us] Admin boundaries tied to roads
On 22 April 2010 17:40, Apollinaris Schoell ascho...@gmail.com wrote: On 21 Apr 2010, at 17:12 , andrzej zaborowski wrote: The signs are posted there by authorities so this is similar to having access to a tiny piece of a map or database made by these authorities. For maps people usually agreed on this list that we don't trust them. are you saying authorities are wrong and we should correct what they are doing and follow tiger or USPS standards instead? I'm saying we should name the objects what they're called, not what it is written as in somebody's database. Is the wiki any better as a reference than what is in the osm DB? I could change the wiki and then will someone write a bot to reverse it? Is the wiki written with the situation in US in mind? Well one good rule is if there should be any rules then they should be global. no not at all. US is very different in many aspects and has to be done different. several countries don't use abbrev names on maps or addresses. Most street names don't even have a st/ave/blvd/ct … postfix at all and so there is no reason to even discuss this topic. And in case they use abbrev it's only when there is a need to shorten. But all official use will be expanded. But in US it looks very much it's the opposite. abbrev is the standard use model and expanded name is the exception Seriously? I can't think of a single place in Europe where the street part is not commonly abbreviated just like what you describe (maybe Germany, but I wouldn't know). Just look at some paper maps or postal addresses, or google, you will very rarely find the names spelled out in full. In the UK it's pretty much like in the US with regard to the feature type suffix (St/Ave...) ([1]) but people have been fixing it in OSM for some time, in Germany I think they use Str. though not sure how commonly. In all the slavic countries Street is abbreviated as ul. prefix and Avenue as al. practically always (just look at Belarus in OSM), in Hungary it's a Ut. 
prefix, in Spain C/ (although the OSM community there agreed to not go with the popular forms and spell everything out and put in any optional articles someone might possibly squeeze in when referring to the street -- basically use the longest form, to avoid ambiguity. So you won't find C/ in OSM even though it's on the signs), in Turkey it's Sk. for sokak, in Greece it's something like Od, I don't remember exactly. Someone on IRC yesterday asked whether they should put the Greek names in all caps because the street signs are in all caps. I guess your answer would be yes, they should?
Cheers
1. http://osm.org/go/erdGBcIdM-
Re: [Talk-us] Admin boundaries tied to roads
On 20 April 2010 05:24, Apollinaris Schoell ascho...@gmail.com wrote: Sounds a lot like the IMO ill-considered road name expansion that was apparently agreed upon by a small group of people without input from the majority of active mappers whose work has been damaged. agreed, no idea why this was done. it's a change without much benefit but lots of damage. Where's damage in that -- is it in that you can now read the name out without checking the documentation for what that funny string means in that particular database that is TIGER? You can now also write an intelligent search engine that will understand both forms, you can pipe the names through text-to-speech and do a lot more. The reason it was done with a script is that doing it manually was taking a lot of time and mappers were spending that time doing this instead of going out mapping. And it's always been on the wiki about not using abbreviated names, even when the original import was done, ignoring this. Cheers
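As an illustration of the "search engine that will understand both forms" point, here is a minimal sketch in Python. The abbreviation tables are a deliberately tiny, hypothetical subset (not the TIGER or USPS lists), and the function names normalize and same_street are made up for this example. It assumes a direction can only be abbreviated as the first word and a street type only as the last word.

```python
# Hypothetical, deliberately tiny tables; NOT the TIGER or USPS lists.
SUFFIXES = {"st": "street", "ave": "avenue", "av": "avenue",
            "blvd": "boulevard", "ln": "lane"}
DIRECTIONS = {"n": "north", "s": "south", "e": "east", "w": "west"}


def normalize(name: str) -> str:
    """Expand a leading direction and a trailing street type, lowercase the rest."""
    tokens = name.lower().replace(".", "").replace(",", "").split()
    out = []
    for i, tok in enumerate(tokens):
        is_first = i == 0
        is_last = i == len(tokens) - 1
        if len(tokens) > 1 and is_last and tok in SUFFIXES:
            out.append(SUFFIXES[tok])      # "St" at the end -> "street"
        elif len(tokens) > 1 and is_first and tok in DIRECTIONS:
            out.append(DIRECTIONS[tok])    # "N" at the start -> "north"
        else:
            out.append(tok)                # everything else is left alone
    return " ".join(out)


def same_street(a: str, b: str) -> bool:
    """Treat two spellings as the same street if they normalize identically."""
    return normalize(a) == normalize(b)
```

With this, same_street("N Main St", "North Main Street") is true. The sketch also shows where guessing breaks down: a leading "St" in "St Johns Ave" is left as "st" because it could mean Saint, and a stored "N" cannot tell you whether North is part of the name or an address direction. Matching a query against data stored in full form avoids having to guess in that direction.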
Re: [Talk-us] Admin boundaries tied to roads
At 2010-04-21 17:12, andrzej zaborowski wrote: On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote: On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com wrote: Where's damage in that -- is it in that you can now read the name out without checking the documentation for what that funny string means in that particular database that is TIGER? I just had a machine crash as I was trying to find stats, but I'll bet that at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W as a prefix to a street name, it's clear to everyone what that means. These are the same abbreviations that _everyone_ uses every day - children, adults, businesses, governments, etc. Even when travelling to another country, it takes me very little time to understand what common abbreviations are used for in addresses. there is damage by doing it wrong, others have pointed to it already. And I will do so again. My problem is mostly that this was done without a safety net. You clobbered existing data with no easy way to walk it back. The existing name value should have been put in a foo_name tag so we could at least see what used to be. I would at least encourage that a bot be run to find these edits, find the previous version in history, and do this, if we can't soon agree on a better schema to split the name up into components at the same time. I am not deep enough into the history of the abbreviations used and who defined them. But I am pretty sure there is a lot of errors. Errors that I, and a lot of other mappers, painstakingly fixed by hand, based on ground surveys and research into public records. 
In particular, I'm worried about the cases where I spelled out North because it was actually part of the name, as opposed to a cardinal direction related to addresses, which I left alone, hoping to later move the latter directions to an addr:direction_prefix tag, while leaving the former alone. I can no longer distinguish between the two. I don't know who defined the ones used in TIGER but this is not the only way to abbreviate the names, that is proven by USPS having their own list that is not identical. The most popular words will be the same in both lists but some are really cryptic and arbitrary, could as well be numeric codes. Then TIGER also includes Spanish names and the list has abbreviations for those too, which rarely anyone in US can read, while they can cope with unabbreviated ok. I don't agree. Much of the US speaks Spanish. Many more possess the tremendous brainpower and enough grade-school Spanish required to know that Cl. in front of a street name might mean Calle or Cam. might mean Camino, or that S means Sur and N means Norte. - in the city I live there is no street sign with street, avenue, boulevard, and even more surprising there are no abbreviations either. osm principle is to map what's on the ground. So tiger import is definitely wrong and expanding the names is also wrong. on the other hand postal address usually use it in one or the other form so it's not completely fiction. Exactly. Many places in Orange County have the bad habit of leaving the suffix off the large street signs at intersections, perhaps as a way of saving space to reduce sign size and cost. Just because the big sign says just Orange doesn't mean that the street's real name isn't Orange Street, nor that it shouldn't be entered into any reasonable database or map that way. map what's on the ground is the wrong thing to do so often that I don't really understand why it was decided upon, nor why people continue to hold it up on a pedestal, despite continuing problems with it. 
For the record street signs on different ends of the same street often use different forms and you'll sometimes find really strange conventions, so while I agree mapping what's on the ground is good because stuff can be confirmed, in this case it's not a solution. In many places you'll find the names are all caps on the signs but in a local newspaper they're capitalized the usual way. And the signs are sometimes wrong. In the thousands of streets I've photographed and mapped, I've corrected hundreds of signage errors/inconsistencies, often requiring substantial research into records, and resulting in notification of the appropriate authority to fix the records and/or signs (for free :( ). - many geocoding engines do not find expanded names. even google doesn't in many cases. To me it looks like hardly anyone uses the expanded name at all. So my question is: is the expanded name really the correct name? Exactly! Sounds like its only useful purpose is text-2-speech. Here's what I'd like to see:
name: The pre-balrog name
name_direction_prefix: The 1-2 char cardinal direction before the root
use_name_direction_prefix: {yes|no} Yes indicates that the name_direction_prefix
Re: [Talk-us] Admin boundaries tied to roads
On 4/20/10 3:44 AM, Frederik Ramm wrote: Hi, Alan Mintz wrote: At 2010-04-19 10:45, Mike N. wrote: I see that the separate VS tangled argument has been settled in the US by the Duplicate Node attack bots, who have blindly merged all duplicate nodes. http://www.openstreetmap.org/browse/way/38855677 Is this really happening? Can someone describe exactly what criteria are being used, and just how it was decided that this was a good idea? It seems that someone is, more or less blindly, using the JOSM validator de-duplication. Doesn't look like a bot but, as Richard said, has similar results. given the way that it is currently set up, i'll wager that a lot of less experienced josm users are doing this, because the validator, in its current form, leads them down this path. richard
Re: [Talk-us] Admin boundaries tied to roads
On Mon, Apr 19, 2010 at 12:45 PM, Mike N. nice...@att.net wrote: From an old message: I take the point that 'road realignment' may require the boundary also to move, but the word is MAY and so what ever happens to the road, the location of the boundary needs to be checked separately! It is quite surprising in the UK how many roads are being moved, but that does not also move the original boundary. I see that the separate VS tangled argument has been settled in the US by the Duplicate Node attack bots, who have blindly merged all duplicate nodes. http://www.openstreetmap.org/browse/way/38855677 When I imported GNIS last year, a fairly significant portion of the data (2-5%) had POI with coordinates exactly the same as another POI (e.g. a post office inside a town hall building). I wonder what these duplicate node bots are doing with those nodes...
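The GNIS case above (two distinct POIs at exactly the same coordinates) suggests that "same position" alone is not enough to justify a merge. A minimal sketch in Python, assuming nodes are available as simple (id, lat, lon, tags) tuples; the function names and the "at most one tagged node" rule are hypothetical illustrations, not how the JOSM validator actually decides.

```python
from collections import defaultdict


def group_duplicates(nodes):
    """Group nodes that share exactly the same (lat, lon).

    nodes: iterable of (node_id, lat, lon, tags) where tags is a dict.
    Returns {(lat, lon): [(node_id, tags), ...]} for positions with 2+ nodes.
    """
    by_pos = defaultdict(list)
    for node_id, lat, lon, tags in nodes:
        by_pos[(lat, lon)].append((node_id, tags))
    return {pos: group for pos, group in by_pos.items() if len(group) > 1}


def safe_to_merge(group):
    """Hypothetical rule: merge only if at most one node carries any tags.

    Two tagged nodes at one position (post office inside a town hall)
    are probably separate features and need a human to look at them.
    """
    tagged = [tags for _node_id, tags in group if tags]
    return len(tagged) <= 1
```

Under this rule, two untagged nodes at the same position would be flagged as mergeable, while the post office and town hall pair would be reported for manual review instead. It says nothing about whether the ways those nodes belong to (for example a road and a boundary) should be joined; that is a separate question.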
Re: [Talk-us] Admin boundaries tied to roads
On 4/19/10 1:45 PM, Mike N. wrote: From an old message: I take the point that 'road realignment' may require the boundary also to move, but the word is MAY and so what ever happens to the road, the location of the boundary needs to be checked separately! It is quite surprising in the UK how many roads are being moved, but that does not also move the original boundary. I see that the separate VS tangled argument has been settled in the US by the Duplicate Node attack bots, who have blindly merged all duplicate nodes. http://www.openstreetmap.org/browse/way/38855677 i don't know if settled is the word for it, the debate is still open, but currently the josm validator reports duplicate nodes as errors, and provides a fix button that merges them. it's not fully automated like a bot, but the result is effectively the same. richard
Re: [Talk-us] Admin boundaries tied to roads
At 2010-04-19 10:45, Mike N. wrote: I see that the separate VS tangled argument has been settled in the US by the Duplicate Node attack bots, who have blindly merged all duplicate nodes. http://www.openstreetmap.org/browse/way/38855677 Is this really happening? Can someone describe exactly what criteria are being used, and just how it was decided that this was a good idea? Seems like the wrong thing to do - city and county boundaries are often defined in law, or by survey, and do not necessarily keep up with changes in road alignment. I have resisted editing most of these boundaries until/unless I take the time to research the true definition of the boundary. Not to mention that merging them will result in the inability to hide these boundaries. When doing a bunch of editing on a road that follows one, in the past, I've taken the time to verify that the boundary doesn't share any nodes with anything and then remove it from my local OSM file manually so I don't have to constantly deal with it. If it shares nodes with anything else, this is no longer possible. Sounds a lot like the IMO ill-considered road name expansion that was apparently agreed upon by a small group of people without input from the majority of active mappers whose work has been damaged. -- Alan Mintz alan_mintz+...@earthlink.net
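The manual check described above (verifying that a boundary way shares no nodes with anything else before removing it from a local file) could be sketched roughly like this in Python. The in-memory representation, a dict mapping way id to its list of node ids, and the function name shared_nodes are assumptions made for this example; reading the ways out of an actual OSM file is left out.

```python
def shared_nodes(way_id, ways):
    """Return {other_way_id: set_of_shared_node_ids} for the given way.

    ways: dict mapping way id -> list of node ids (hypothetical layout).
    An empty result means the way can be dropped from a local copy
    without touching the geometry of any other way.
    """
    target = set(ways[way_id])
    result = {}
    for other_id, node_ids in ways.items():
        if other_id == way_id:
            continue
        common = target & set(node_ids)
        if common:
            result[other_id] = common
    return result
```

For a boundary that has been merged onto a road, the result would list the road's way id with the merged nodes, which is exactly the situation in which the boundary can no longer be hidden independently.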