Lots of weird ones from Florida. Many should not give you an issue due to
how you're processing, but it is best to test them anyhow. Also it might be
a good reference when looking at other expansions after this runs.
way id=10761946 name v=E 10th Ct E
way id=10763539 name v=E 10th St E
way
On 11 May 2012 22:17, Dale Puch dale.p...@gmail.com wrote:
I understand the script checks for only one instance of the abbreviation.
My point was what if someone manually expanded ONE of the abbreviations,
leaving st something street? Is that checked for? The question also
applies to Dr
On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski balr...@gmail.com wrote:
It checks suffixes starting from the
end, so if you have St something St E or St something St East,
it'll only check E or East and then St and then stop because
something is not a known suffix.
So Calle Ave Maria
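The right-to-left check described above can be sketched roughly as follows. This is only an illustration, not the script from the thread: the abbreviation tables and the function name expand_from_end are assumptions made for the example.

```python
# Hypothetical tables; the real script's lists are not shown in this thread.
SUFFIXES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive",
            "Ct": "Court", "Blvd": "Boulevard"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
KNOWN = {**SUFFIXES, **DIRECTIONS}
FULL_FORMS = set(KNOWN.values())


def expand_from_end(name):
    """Expand trailing abbreviations, stopping at the first unknown word."""
    words = name.split()
    for i in range(len(words) - 1, -1, -1):
        word = words[i]
        if word in KNOWN:
            words[i] = KNOWN[word]      # abbreviated suffix: expand it
        elif word in FULL_FORMS:
            continue                    # already expanded: keep walking left
        else:
            break                       # not a known suffix: stop here
    return " ".join(words)


print(expand_from_end("St something St E"))
print(expand_from_end("Calle Ave Maria"))
```

In this sketch the leading St is never reached because the walk stops at something, and a name such as Calle Ave Maria is left alone because Maria is not a known suffix.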
On Sat, May 12, 2012 at 4:47 PM, Anthony o...@inbox.org wrote:
On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski balr...@gmail.com wrote:
It checks suffixes starting from the
end, so if you have St something St E or St something St East,
it'll only check E or East and then St and then stop
The process seems obvious to me: check that the name is still what it
originally was (from the tiger:name_base etc. tags), and if so, use
those tags to expand abbreviations. (Ignore any with semicolons/colons
from joining.) If not, set it aside for semi-manual checking. The only
false
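A minimal sketch of the process proposed above, assuming a way's tags are available as a plain dict. Only tiger:name_base is named in the message; the other tiger:* keys, the abbreviation tables and the function name expand_with_tiger are assumptions for illustration.

```python
# Hypothetical abbreviation tables, for illustration only.
TYPES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive",
         "Blvd": "Boulevard", "Ct": "Court"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}

# Name parts in display order; the base name is never expanded.
PARTS = [
    ("tiger:name_direction_prefix", DIRECTIONS),
    ("tiger:name_base", None),
    ("tiger:name_type", TYPES),
    ("tiger:name_direction_suffix", DIRECTIONS),
]


def expand_with_tiger(tags):
    """Return (status, name); status is 'expanded', 'skipped' or 'review'."""
    if "tiger:name_base" not in tags:
        return "review", tags.get("name")
    values = [tags.get(key, "") for key, _ in PARTS]
    # Semicolons/colons come from joined ways: ignore those entirely.
    if any(";" in v or ":" in v for v in values):
        return "skipped", tags.get("name")
    original = " ".join(v for v in values if v)
    if tags.get("name") != original:
        # The name was edited after import: set aside for semi-manual checking.
        return "review", tags.get("name")
    expanded = [table.get(v, v) if table else v
                for (_, table), v in zip(PARTS, values) if v]
    return "expanded", " ".join(expanded)


way = {"name": "Dr Martin Luther King Blvd",
       "tiger:name_base": "Dr Martin Luther King",
       "tiger:name_type": "Blvd"}
print(expand_with_tiger(way))
```

Because only the type and direction parts are looked up, the Dr inside the base name is left as it is in this sketch.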
On Fri, May 11, 2012 at 04:47:37AM -0400, Serge Wroclawski wrote:
I've added direction expansion into a new version, and thrown it up as a gist:
https://gist.github.com/2656735
I don't treat direction prefixes and suffixes any differently- I
haven't seen an example where there is both a
On Thu, May 10, 2012 at 11:45 PM, Dale Puch dale.p...@gmail.com wrote:
Clarity! The abbreviations are just that, they mean the full word, and are
spoken that way, but written and displayed as the abbreviation. I also
disagree; I have never known anyone that said whatever A V E they do not
On Fri, May 11, 2012 at 9:45 AM, Anthony o...@inbox.org wrote:
The only way to capture the full information is to have additional
tags telling you what the base is. And if you do that, abbreviating
or not abbreviating doesn't matter.
And if you want to avoid tremendous redundancy, the way to
On 2012-05-11 6:45 AM, Anthony wrote:
Not really. Is 1515 South West Shore Boulevard, Tampa abbreviated
1515 S West Shore Blvd, Tampa, or is it abbreviated 1515 S W Shore
Blvd, Tampa? If you want the answer, ask usps.com.
The only way to capture the full information is to have additional
tags
At 2012-05-10 19:40, Anthony wrote:
On Thu, May 10, 2012 at 10:25 PM, Mike N nice...@att.net wrote:
The only question is what to do about those cases where it's only referred
to locally as 'Ave', and the postal service would refuse letters addressed
to 'Avenue'.
The postal service would
At 2012-05-10 19:56, Anthony wrote:
On Thu, May 10, 2012 at 10:45 PM, Mike N nice...@att.net wrote:
But you wouldn't be confused if a stranger came in asking how to get to
Whatever Avenue? If not, then there's no problem with the expansion.
Okay, so basically we're ignoring the
On Fri, May 11, 2012 at 12:26 PM, Minh Nguyen m...@1ec5.org wrote:
On 2012-05-11 6:45 AM, Anthony wrote:
The only way to capture the full information is to have additional
tags telling you what the base is. And if you do that, abbreviating
or not abbreviating doesn't matter.
That's similar
On Fri, May 11, 2012 at 1:35 PM, Alan Mintz
alan_mintz+...@earthlink.net wrote:
At 2012-05-10 19:40, Anthony wrote:
On Thu, May 10, 2012 at 10:25 PM, Mike N nice...@att.net wrote:
The only question is what to do about those cases where it's only
referred
to locally as 'Ave', and the
At 2012-05-11 10:20, David ``Smith'' wrote:
Third, I suggest retaining the abbreviated form in a tag like abbr_name.
Ideally, this should be the exact abbreviated form used on signs, if
that's consistent. Getting this right requires local knowledge, but
TIGER's abbreviation might be better
I understand the script checks for only one instance of the abbreviation.
My point was what if someone manually expanded ONE of the abbreviations,
leaving st something street? Is that checked for? The question also
applies to Dr something Dr previously changed to Dr something Drive,
and possibly
On 5/11/2012 1:36 PM, Alan Mintz wrote:
Okay, so basically we're ignoring the on-the-ground rule in order to
map for the renderer.
Exactly :) Why that is ok, I don't know :(
Mapping for the renderer has never been wrong or discouraged.
Tagging incorrectly for the renderer is another
At 2012-05-11 14:11, Mike N wrote:
On 5/11/2012 1:36 PM, Alan Mintz wrote:
Okay, so basically we're ignoring the on-the-ground rule in order to
map for the renderer.
Exactly :) Why that is ok, I don't know :(
Mapping for the renderer has never been wrong or discouraged. Tagging
On Fri, May 11, 2012 at 4:17 PM, Dale Puch dale.p...@gmail.com wrote:
I understand the script checks for only one instance of the abbreviation.
My point was what if someone manually expanded ONE of the abbreviations,
leaving st something street? Is that checked for?
I have a number of
Sorry, I should have been clearer, the results I posted were from my quick
test. I just wanted to report the abbreviations I saw as possible
additions to the list in Serge's script. And to give an idea of which
showed up most either for scripting or if someone wanted to handle the
lesser used
On Thu, May 10, 2012 at 3:28 PM, Dale Puch dale.p...@gmail.com wrote:
As a quick and dirty test I took Florida and Illinois road data from
cloudmade. A simple replace of the top 7 or so suffixes at the end of the
name and with a space in front of it resulted in over 700,000 name changes
for
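A quick test of this kind might look like the sketch below. The thread does not say which suffixes were used, so the seven listed here, along with the names PATTERN, quick_replace and count_changes, are assumptions for illustration.

```python
import re
from collections import Counter

# Illustrative list only; the actual top suffixes are not given in the thread.
TOP_SUFFIXES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive", "Rd": "Road",
                "Ln": "Lane", "Ct": "Court", "Blvd": "Boulevard"}

# Match a suffix only at the very end of the name, with a space in front.
PATTERN = re.compile(r" (" + "|".join(TOP_SUFFIXES) + r")$")


def quick_replace(name):
    """Expand a single trailing suffix; leave everything else untouched."""
    return PATTERN.sub(lambda m: " " + TOP_SUFFIXES[m.group(1)], name)


def count_changes(names):
    """Count how many names would change, per abbreviation."""
    counts = Counter()
    for name in names:
        match = PATTERN.search(name)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = ["Main St", "E 10th St E", "Oak Ave", "St"]
print([quick_replace(n) for n in sample])
print(count_changes(sample))
```

A name such as E 10th St E is not changed here, since the last word is a direction rather than one of the listed suffixes. That is one reason a count like this is only good for orders of magnitude.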
On 5/10/2012 9:48 PM, Anthony wrote:
You seem to be assuming all the changes are positive.
I didn't take it that way - it was just a quick test for orders of
magnitude. An actual script takes more review.
What happened to the on the ground rule, anyway?
That already doesn't
On 5/10/2012 10:19 PM, Anthony wrote:
What I'm questioning is why it doesn't apply. If the people call it
Whatever Ave, shouldn't the data read Whatever Ave?
Most of the US wouldn't call it 'Whatever Ave'; when spoken, it would
be 'Avenue'. Having it expanded makes programs with spoken
On Thu, May 10, 2012 at 10:25 PM, Mike N nice...@att.net wrote:
On 5/10/2012 10:19 PM, Anthony wrote:
What I'm questioning is why it doesn't apply. If the people call it
Whatever Ave, shouldn't the data read Whatever Ave?
Most of the US wouldn't call it 'Whatever Ave'; when spoken, it
On 5/10/2012 10:40 PM, Anthony wrote:
Depends on what street you're talking about. I've certainly lived in
places where the vast majority of the locals called it Whatever Ave,
and not Whatever Avenue. Most of the US...wouldn't talk about the
street at all.
But you wouldn't be confused if
The issue with abbreviations is very muddy. BUT it has been said many times
that we want to avoid abbreviations where possible. There are several
reasons.
- Clarity! The abbreviations are just that, they mean the full word,
and are spoken that way, but written and displayed as the
I think I came up with a rare possibility for error.
The original st something st was manually expanded to st something
street. You're checking for a single st, and there would be one. Or am I
missing another check? I can't think of any other situations besides Saint
and Street like this. Possibly
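One way to catch the case described above before any expansion runs is to flag names whose first word is an ambiguous abbreviation and whose last word is already the matching street type. This is a sketch under that assumption; the table and the function name partially_expanded are hypothetical and not taken from any script in the thread.

```python
# Abbreviations that can also be a street type, mapped to that street type.
STREET_TYPE = {"st": "street", "dr": "drive"}


def partially_expanded(name):
    """True if the name starts with st/dr and already ends in street/drive."""
    words = [w.rstrip(".").casefold() for w in name.split()]
    if len(words) < 2:
        return False
    # The leading abbreviation is then probably Saint or Doctor, not a suffix.
    return STREET_TYPE.get(words[0]) == words[-1]


for candidate in ["st something street", "Dr something Drive",
                  "St something St", "Main Street"]:
    print(candidate, partially_expanded(candidate))
```

Names flagged this way could be held back for manual review instead of being expanded automatically.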
On Wed, May 2, 2012 at 7:28 AM, Mike N nice...@att.net wrote:
On 5/1/2012 11:49 PM, Anthony wrote:
That assumes that the TIGER tags will always be present to assist with
proper automatic expansion.
I'm not sure what you mean, because I am not making that assumption at
all.
You
On Wed, May 2, 2012 at 12:01 PM, Serge Wroclawski emac...@gmail.com wrote:
2) My human error rate estimation of 1/1000 seems entirely reasonable.
Think typos, or misreading. I'm sure we see error rates that high now
in OSM and we find them acceptable. A computer that's acting
conservatively
On Tue, May 8, 2012 at 11:31 PM, Anthony o...@inbox.org wrote:
Doctor Martin Luther King Boulevard is one thing. Drive Martin Luther King
Boulevard is another.
And if we're going to make so many mistakes (1/1000 means thousands of
mistakes), I'd rather it just be left as Dr Martin Luther King
ISTM this might be a good mechanical turk application if there is
genuine concern that there will be a substantial error rate (my
point-of-view as a social scientist is that a hypothesized 1/1000
error rate is pretty darn low, but I can appreciate that some might
have more exacting standards),
On Wed, May 2, 2012 at 11:08 AM, Chris Lawrence lordsu...@gmail.com wrote:
ISTM this might be a good mechanical turk application if there is
genuine concern that there will be a substantial error rate (my
point-of-view as a social scientist is that a hypothesized 1/1000
error rate is pretty
On Wed, May 2, 2012 at 10:08 AM, Chris Lawrence lordsu...@gmail.com wrote:
ISTM this might be a good mechanical turk application if there is
genuine concern that there will be a substantial error rate (my
point-of-view as a social scientist is that a hypothesized 1/1000
error rate is pretty
On Mon, Apr 30, 2012 at 11:28 PM, Serge Wroclawski emac...@gmail.com wrote:
On Mon, Apr 30, 2012 at 8:14 PM, Paul Johnson ba...@ursamundi.org wrote:
There have been some limited automated expansions, though they can be
problematic, because abbreviations can mean many possible things. Expanding
On Tue, May 1, 2012 at 1:06 PM, Serge Wroclawski emac...@gmail.com wrote:
The other point that's being missed is that we as a community already
accept an error rate in our data that's far larger than any potential
mistake rate on a well written script. If the script makes one error
in 1000
On 5/1/2012 12:59 PM, Anthony wrote:
I'm not sure what you're saying.
Automatically expanding abbreviations is a terrible idea. If an
abbreviation is unambiguous, then it can be expanded during the
preprocessing step. If, on the other hand, it is ambiguous, then you
are turning ambiguous data
On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II nerou...@gmail.com wrote:
On 5/1/2012 12:59 PM, Anthony wrote:
Automatically expanding abbreviations is a terrible idea. If an
abbreviation is unambiguous, then it can be expanded during the
preprocessing step. If, on the other hand, it is
On 5/1/2012 1:23 PM, Anthony wrote:
On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II nerou...@gmail.com wrote:
On 5/1/2012 12:59 PM, Anthony wrote:
Automatically expanding abbreviations is a terrible idea. If an
abbreviation is unambiguous, then it can be expanded during the
preprocessing
On Tue, May 1, 2012 at 1:26 PM, Nathan Edgars II nerou...@gmail.com wrote:
On 5/1/2012 1:23 PM, Anthony wrote:
On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II nerou...@gmail.com
wrote:
On 5/1/2012 12:59 PM, Anthony wrote:
Automatically expanding abbreviations is a terrible idea. If an
On Tue, May 1, 2012 at 1:31 PM, Anthony o...@inbox.org wrote:
And actually, if the bot is going to be smart enough to look at the
history, to find deleted TIGER tags, then maybe there is some
advantage to doing this during the preprocessing step (which would
often not have access to history
On 5/1/2012 1:21 PM, Anthony wrote:
The preprocessing step between downloading the data from OSM and doing
something with it.
That assumes that the TIGER tags will always be present to assist
with proper automatic expansion.
And I'd rather have the US data in line with the world-wide
On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II nerou...@gmail.com wrote:
The TIGER tags are not exactly standard OSM tags that belong in the
database. Better that we get rid of them at the same time as we expand
abbreviations.
Although the tiger:* keys aren't standard, the information they
On Tue, May 1, 2012 at 1:36 PM, Mike N nice...@att.net wrote:
On 5/1/2012 1:21 PM, Anthony wrote:
The preprocessing step between downloading the data from OSM and doing
something with it.
That assumes that the TIGER tags will always be present to assist with
proper automatic expansion.
On Tue, May 1, 2012 at 1:41 PM, Ian Dees ian.d...@gmail.com wrote:
On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II nerou...@gmail.com
wrote:
The TIGER tags are not exactly standard OSM tags that belong in the
database. Better that we get rid of them at the same time as we expand
On Mon, Apr 30, 2012 at 7:14 PM, Paul Johnson ba...@ursamundi.org wrote:
On Apr 30, 2012 5:00 PM, David Litke dwli...@comcast.net wrote:
I just did a few manual TIGER reviews in JOSM and got a validation warning
that words like Street and Avenue were abbreviated as St and Ave. So I
wonder if
On 4/30/2012 10:24 PM, Toby Murray wrote:
I believe it was stopped after some
complaints about it not handling some situations correctly. But I
would probably be in favor of trying to complete it.
I would agree - there's no point in asserting that we have to spend
time manually expanding
David Litke dwli...@comcast.net wrote:
I just did a few manual TIGER reviews in JOSM and got a validation
warning that words like Street and Avenue were abbreviated as St and
Ave. So I wonder if this is considered something that needs to be
fixed? If so, shouldn't it be easy to somehow do a
On 4/30/12 10:35 PM, Mike N wrote:
On 4/30/2012 10:24 PM, Toby Murray wrote:
I believe it was stopped after some
complaints about it not handling some situations correctly. But I
would probably be in favor of trying to complete it.
I would agree - there's no point in asserting that we have