Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Dale Puch
Lots of weird ones from Florida. Many should not give you an issue due to how you're processing, but it is best to test them anyhow. Also, it might be a good reference when looking at other expansions after this runs. way id=10761946 name v=E 10th Ct E way id=10763539 name v=E 10th St E way

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread andrzej zaborowski
On 11 May 2012 22:17, Dale Puch dale.p...@gmail.com wrote: I understand the script checks for only one instance of the abbreviation. My point was what if someone manually expanded ONE of the abbreviations, leaving st something street?  Is that checked for?  The question also applies to Dr

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Anthony
On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski balr...@gmail.com wrote: It checks suffixes starting from the end, so if you have St something St E or St something St East, it'll only check E or East and then St and then stop because something is not a known suffix. So Calle Ave Maria
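The end-first suffix check described in this message could be sketched roughly as below. This is only an illustration, not the actual script: the lookup tables and the function name expand_trailing are hypothetical, and the rule that the first word is never touched is an assumption added so that a leading St or Dr is left alone.

```python
# Hypothetical lookup tables; the real script's lists are not shown in this thread.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
SUFFIXES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive",
            "Ct": "Court", "Blvd": "Boulevard"}
KNOWN = {**DIRECTIONS, **SUFFIXES}
FULL_FORMS = set(KNOWN.values())


def expand_trailing(name):
    """Expand known abbreviations, walking backwards from the last word."""
    words = name.split()
    # Assumption: stop before index 0 so the first word is never expanded.
    for i in range(len(words) - 1, 0, -1):
        word = words[i]
        if word in KNOWN:
            words[i] = KNOWN[word]
        elif word not in FULL_FORMS:
            # First word that is not a known suffix: stop checking.
            break
    return " ".join(words)


print(expand_trailing("St something St E"))
print(expand_trailing("Calle Ave Maria"))
```

With these tables, "St something St E" and "St something St East" both come out with only the trailing part expanded, and "Calle Ave Maria" is left unchanged because Maria is not a known suffix.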

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Anthony
On Sat, May 12, 2012 at 4:47 PM, Anthony o...@inbox.org wrote: On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski balr...@gmail.com wrote: It checks suffixes starting from the end, so if you have St something St E or St something St East, it'll only check E or East and then St and then stop

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Nathan Edgars II
The process seems obvious to me: check that the name is still what it originally was (from the tiger:name_base etc. tags), and if so, use those tags to expand abbreviations. (Ignore any with semicolons/colons from joining.) If not, set it aside for semi-manual checking. The only false
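As a rough sketch of the process proposed here: rebuild the original name from the TIGER tags, and expand only if the current name still matches it. Only tiger:name_base is named in the message; the other three tiger:* keys, the expansion tables, and the function name expand_from_tiger are assumptions made for illustration. Returning None stands for "do not expand automatically" (either ignored because of joined values, or set aside for semi-manual checking).

```python
# Hypothetical expansion tables, for illustration only.
TYPE_EXPANSIONS = {"St": "Street", "Ave": "Avenue", "Dr": "Drive",
                   "Ct": "Court", "Blvd": "Boulevard"}
DIRECTION_EXPANSIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}

# Assumed tag keys; the thread only spells out tiger:name_base.
TIGER_KEYS = (
    "tiger:name_direction_prefix",
    "tiger:name_base",
    "tiger:name_type",
    "tiger:name_direction_suffix",
)


def expand_from_tiger(tags):
    """Return the expanded name, or None if it should not be expanded automatically."""
    prefix, base, kind, suffix = (tags.get(key, "") for key in TIGER_KEYS)
    parts = [prefix, base, kind, suffix]
    # Values joined from several ways contain semicolons or colons: skip them.
    if any(";" in p or ":" in p for p in parts):
        return None
    original = " ".join(p for p in parts if p)
    # The name was edited since the import (or there is no base): needs review.
    if not base or tags.get("name") != original:
        return None
    try:
        expanded = [
            DIRECTION_EXPANSIONS[prefix] if prefix else "",
            base,
            TYPE_EXPANSIONS[kind] if kind else "",
            DIRECTION_EXPANSIONS[suffix] if suffix else "",
        ]
    except KeyError:
        # Abbreviation not in the tables: leave it for a human.
        return None
    return " ".join(p for p in expanded if p)
```

Because the base name is copied through untouched, a base such as "Dr Martin Luther King" would never be altered by this approach; only the separately tagged type and directions are expanded.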

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Kristian M Zoerhoff
On Fri, May 11, 2012 at 04:47:37AM -0400, Serge Wroclawski wrote: I've added direction expansion into a new version, and thrown it up as a gist: https://gist.github.com/2656735 I don't treat direction prefixes and suffixes any differently- I haven't seen an example where there is both a

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Thu, May 10, 2012 at 11:45 PM, Dale Puch dale.p...@gmail.com wrote: Clarity!  The abbreviations are just that, they mean the full word, and are spoken that way, but written and displayed as the abbreviation.  I also disagree. I have never known anyone that said whatever A V E  they do not

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 9:45 AM, Anthony o...@inbox.org wrote: The only way to capture the full information is to have additional tags telling you what the base is.  And if you do that, abbreviating or not abbreviating doesn't matter. And if you want to avoid tremendous redundancy, the way to

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Minh Nguyen
On 2012-05-11 6:45 AM, Anthony wrote: Not really. Is 1515 South West Shore Boulevard, Tampa abbreviated 1515 S West Shore Blvd, Tampa, or is it abbreviated 1515 S W Shore Blvd, Tampa? If you want the answer, ask usps.com. The only way to capture the full information is to have additional tags

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz
At 2012-05-10 19:40, Anthony wrote: On Thu, May 10, 2012 at 10:25 PM, Mike N nice...@att.net wrote: The only question is what to do about those cases where it's only referred to locally as 'Ave', and the postal service would refuse letters addressed to 'Avenue'. The postal service would

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz
At 2012-05-10 19:56, Anthony wrote: On Thu, May 10, 2012 at 10:45 PM, Mike N nice...@att.net wrote: But you wouldn't be confused if a stranger came in asking how to get to Whatever Avenue? If not, then there's no problem with the expansion. Okay, so basically we're ignoring the

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 12:26 PM, Minh Nguyen m...@1ec5.org wrote: On 2012-05-11 6:45 AM, Anthony wrote: The only way to capture the full information is to have additional tags telling you what the base is.  And if you do that, abbreviating or not abbreviating doesn't matter. That's similar

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 1:35 PM, Alan Mintz alan_mintz+...@earthlink.net wrote: At 2012-05-10 19:40, Anthony wrote: On Thu, May 10, 2012 at 10:25 PM, Mike N nice...@att.net wrote:  The only question is what to do about those cases where it's only referred to locally as 'Ave', and the

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz
At 2012-05-11 10:20, David ``Smith'' wrote: Third, I suggest retaining the abbreviated form in a tag like abbr_name. Ideally, this should be the exact abbreviated form used on signs, if that's consistent. Getting this right requires local knowledge, but TIGER's abbreviation might be better

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Dale Puch
I understand the script checks for only one instance of the abbreviation. My point was what if someone manually expanded ONE of the abbreviations, leaving st something street? Is that checked for? The question also applies to Dr something Dr previously changed to Dr something Drive, and possibly

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Mike N
On 5/11/2012 1:36 PM, Alan Mintz wrote: Okay, so basically we're ignoring the on-the-ground rule in order to map for the renderer. Exactly :) Why that is ok, I don't know :( Mapping for the renderer has never been wrong or discouraged. Tagging incorrectly for the renderer is another

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz
At 2012-05-11 14:11, Mike N wrote: On 5/11/2012 1:36 PM, Alan Mintz wrote: Okay, so basically we're ignoring the on-the-ground rule in order to map for the renderer. Exactly :) Why that is ok, I don't know :( Mapping for the renderer has never been wrong or discouraged. Tagging

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Serge Wroclawski
On Fri, May 11, 2012 at 4:17 PM, Dale Puch dale.p...@gmail.com wrote: I understand the script checks for only one instance of the abbreviation. My point was what if someone manually expanded ONE of the abbreviations, leaving st something street?  Is that checked for? I have a number of

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
Sorry, I should have been clearer, the results I posted were from my quick test. I just wanted to report the abbreviations I saw as possible additions to the list in Serge's script. And to give an idea of which showed up most either for scripting or if someone wanted to handle the lesser used

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Anthony
On Thu, May 10, 2012 at 3:28 PM, Dale Puch dale.p...@gmail.com wrote: As a quick and dirty test I took Florida and Illinois road data from cloudmade.  A simple replace of the top 7 or so suffixes at the end of the name, with a space in front of it, resulted in over 700,000 name changes for
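A quick count of this kind could be reproduced along the following lines. This is a minimal sketch under stated assumptions: the suffix list is a guess at "the top 7 or so" (the message does not list them), count_suffix_changes is a hypothetical name, and the sample names are made up rather than taken from the Florida or Illinois extracts.

```python
from collections import Counter

# Assumed list of common abbreviated suffixes; not taken from the thread.
TOP_SUFFIXES = {"St", "Ave", "Dr", "Ct", "Rd", "Ln", "Blvd"}


def count_suffix_changes(names):
    """Count names whose last word is an abbreviated suffix preceded by a space."""
    counts = Counter()
    for name in names:
        # rpartition splits on the last space; sep is empty for one-word names.
        _, sep, last = name.rpartition(" ")
        if sep and last in TOP_SUFFIXES:
            counts[last] += 1
    return counts


sample = ["E 10th Ct", "Main St", "St", "Ocean Dr", "Oak St", "Calle Ave Maria"]
changes = count_suffix_changes(sample)
print(sum(changes.values()), dict(changes))
```

A count like this only gives an order of magnitude; it says nothing about whether each individual change would be correct.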

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N
On 5/10/2012 9:48 PM, Anthony wrote: You seem to be assuming all the changes are positive. I didn't take it that way - it was just a quick test for orders of magnitude. An actual script takes more review. What happened to the on the ground rule, anyway? That already doesn't

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N
On 5/10/2012 10:19 PM, Anthony wrote: What I'm questioning is why it doesn't apply. If the people call it Whatever Ave, shouldn't the data read Whatever Ave? Most of the US wouldn't call it 'Whatever Ave'; when spoken, it would be 'Avenue'. Having it expanded makes programs with spoken

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Anthony
On Thu, May 10, 2012 at 10:25 PM, Mike N nice...@att.net wrote: On 5/10/2012 10:19 PM, Anthony wrote: What I'm questioning is why it doesn't apply.  If the people call it Whatever Ave, shouldn't the data read Whatever Ave?  Most of the US wouldn't call it 'Whatever Ave'; when spoken, it

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N
On 5/10/2012 10:40 PM, Anthony wrote: Depends on what street you're talking about. I've certainly lived in places where the vast majority of the locals called it Whatever Ave, and not Whatever Avenue. Most of the US...wouldn't talk about the street at all. But you wouldn't be confused if

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
The issue with abbreviations is very muddy. BUT it has been said many times that we do not want to abbreviate where possible. There are several reasons. - Clarity! The abbreviations are just that, they mean the full word, and are spoken that way, but written and displayed as the

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
I think I came up with a rare possibility for error. The original st something st was manually expanded to st something street; you're checking for a single st, and there would be one. Or am I missing another check? I can't think of any other situations besides Saint and Street like this. Possibly
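One possible guard against this case, sketched here as an assumption rather than anything the script is known to do, is to flag any name where an ambiguous abbreviation appears before the last word, so that a leftover leading st or Dr is sent for review instead of being expanded. The table and the name needs_review are hypothetical.

```python
# Abbreviations with more than one plausible expansion (illustrative only).
AMBIGUOUS = {"St": ("Saint", "Street"), "Dr": ("Doctor", "Drive")}


def needs_review(name):
    """True if an ambiguous abbreviation appears anywhere before the last word."""
    # Normalise case and trailing periods so "st" and "St." both match "St".
    words = [w.rstrip(".").capitalize() for w in name.split()]
    return any(w in AMBIGUOUS for w in words[:-1])


for candidate in ["st something street", "Dr something Drive", "Main St"]:
    print(candidate, needs_review(candidate))
```

Here "st something street" and "Dr something Drive" are flagged, while "Main St" is not, since its only abbreviation is in the final position where a street type is expected.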

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-08 Thread Anthony
On Wed, May 2, 2012 at 7:28 AM, Mike N nice...@att.net wrote: On 5/1/2012 11:49 PM, Anthony wrote:  That assumes that the TIGER tags will always be present to assist with  proper automatic expansion. I'm not sure what you mean, because I am not making that assumption at all.  You

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-08 Thread Anthony
On Wed, May 2, 2012 at 12:01 PM, Serge Wroclawski emac...@gmail.com wrote: 2) My human error rate estimation of 1/1000 seems entirely reasonable. Think typos, or misreading. I'm sure we see error rates that high now in OSM and we find them acceptable. A computer that's acting conservatively

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-08 Thread Anthony
On Tue, May 8, 2012 at 11:31 PM, Anthony o...@inbox.org wrote: Doctor Martin Luther King Boulevard is one thing.  Drive Martin Luther King Boulevard is another. And if we're going to make so many mistakes (1/1000 means thousands of mistakes), I'd rather it just be left as Dr Martin Luther King

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Chris Lawrence
ISTM this might be a good mechanical turk application if there is genuine concern that there will be a substantial error rate (my point-of-view as a social scientist is that a hypothesized 1/1000 error rate is pretty darn low, but I can appreciate that some might have more exacting standards),

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Serge Wroclawski
On Wed, May 2, 2012 at 11:08 AM, Chris Lawrence lordsu...@gmail.com wrote: ISTM this might be a good mechanical turk application if there is genuine concern that there will be a substantial error rate (my point-of-view as a social scientist is that a hypothesized 1/1000 error rate is pretty

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Toby Murray
On Wed, May 2, 2012 at 10:08 AM, Chris Lawrence lordsu...@gmail.com wrote: ISTM this might be a good mechanical turk application if there is genuine concern that there will be a substantial error rate (my point-of-view as a social scientist is that a hypothesized 1/1000 error rate is pretty

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Mon, Apr 30, 2012 at 11:28 PM, Serge Wroclawski emac...@gmail.com wrote: On Mon, Apr 30, 2012 at 8:14 PM, Paul Johnson ba...@ursamundi.org wrote: There have been some limited automated expansions, though they can be problematic, because abbreviations can mean many possible things.  Expanding

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:06 PM, Serge Wroclawski emac...@gmail.com wrote: The other point that's being missed is that we as a community already accept an error rate in our data that's far larger than any potential mistake rate on a well written script. If the script makes one error in 1000

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Mike N
On 5/1/2012 12:59 PM, Anthony wrote: I'm not sure what you're saying. Automatically expanding abbreviations is a terrible idea. If an abbreviation is unambiguous, then it can be expanded during the preprocessing step. If, on the other hand, it is ambiguous, then you are turning ambiguous data

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II nerou...@gmail.com wrote: On 5/1/2012 12:59 PM, Anthony wrote: Automatically expanding abbreviations is a terrible idea.  If an abbreviation is unambiguous, then it can be expanded during the preprocessing step.  If, on the other hand, it is

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Nathan Edgars II
On 5/1/2012 1:23 PM, Anthony wrote: On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II nerou...@gmail.com wrote: On 5/1/2012 12:59 PM, Anthony wrote: Automatically expanding abbreviations is a terrible idea. If an abbreviation is unambiguous, then it can be expanded during the preprocessing

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:26 PM, Nathan Edgars II nerou...@gmail.com wrote: On 5/1/2012 1:23 PM, Anthony wrote: On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II nerou...@gmail.com wrote: On 5/1/2012 12:59 PM, Anthony wrote: Automatically expanding abbreviations is a terrible idea.  If an

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:31 PM, Anthony o...@inbox.org wrote: And actually, if the bot is going to be smart enough to look at the history, to find deleted TIGER tags, then maybe there is some advantage to doing this during the preprocessing step (which would often not have access to history

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Mike N
On 5/1/2012 1:21 PM, Anthony wrote: The preprocessing step between downloading the data from OSM and doing something with it. That assumes that the TIGER tags will always be present to assist with proper automatic expansion. And I'd rather have the US data in line with the world-wide

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Ian Dees
On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II nerou...@gmail.com wrote: The TIGER tags are not exactly standard OSM tags that belong in the database. Better that we get rid of them at the same time as we expand abbreviations. Although the tiger:* keys aren't standard, the information they

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:36 PM, Mike N nice...@att.net wrote: On 5/1/2012 1:21 PM, Anthony wrote: The preprocessing step between downloading the data from OSM and doing something with it.  That assumes that the TIGER tags will always be present to assist with proper automatic expansion.

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:41 PM, Ian Dees ian.d...@gmail.com wrote: On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II nerou...@gmail.com wrote: The TIGER tags are not exactly standard OSM tags that belong in the database. Better that we get rid of them at the same time as we expand

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Toby Murray
On Mon, Apr 30, 2012 at 7:14 PM, Paul Johnson ba...@ursamundi.org wrote: On Apr 30, 2012 5:00 PM, David Litke dwli...@comcast.net wrote: I just did a few manual TIGER reviews in JOSM and got a validation warning that words like Street and Avenue were abbreviated as St and Ave. So I wonder if

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Mike N
On 4/30/2012 10:24 PM, Toby Murray wrote: I believe it was stopped after some complaints about it not handling some situations correctly. But I would probably be in favor of trying to complete it. I would agree - there's no point in asserting that we have to spend time manually expanding

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread John F. Eldredge
David Litke dwli...@comcast.net wrote: I just did a few manual TIGER reviews in JOSM and got a validation warning that words like Street and Avenue were abbreviated as St and Ave. So I wonder if this is considered something that needs to be fixed? If so, shouldn't it be easy to somehow do a

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Richard Welty
On 4/30/12 10:35 PM, Mike N wrote: On 4/30/2012 10:24 PM, Toby Murray wrote: I believe it was stopped after some complaints about it not handling some situations correctly. But I would probably be in favor of trying to complete it. I would agree - there's no point in asserting that we have