I was asked to post this here instead of emailing folks directly.

So here it goes.

I have a form field that users will be asked to enter "city, state or
zip". Then they will select a range such as 50 miles. This input will
be used to find matches of records that are within that range.

I expect to get entries such as

St. Louis, MO
St. Louis, Missouri
Saint Louis, MO 63101
Saint Louis, Missouri 63101

etc.

I've already made great progress on this, but have hit some
roadblocks.  At this point, I already have the functions to strip out
a zip code (5 digit numeric) and check to see if it is a valid zip
code in the database. Since the zip is the most accurate, this
location data has highest priority in the string.

So if someone enters St. Louis, MO 63101, the code grabs the 63101 and
uses that. The form asks for City, State OR zip, but if by chance
someone enters the full string with a valid zip, then all that is
needed is the zip.

So to the point in the script where I am having problems.

Let's say a user enters

St. Louis, MO 63107 634

The 634 is removed from the string since it doesn't match the
characteristics of a zip.

The 63107 is used to check for a valid zip, and let's say it doesn't
match anything and is deemed worthless. It is removed from the string
as well.

As soon as it is determined that there is no valid zip in the string,
all numerics are stripped out to isolate the possible city/state
combo.

At this point, the string would be

'St. Louis, MO'
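
For reference, the zip step described above could be sketched like this in
Python (not the CFML this list normally deals with, just a compact way to
show the logic). The names `extract_zip` and `VALID_ZIPS` are made up for
illustration, and the set stands in for the real database lookup.

```python
import re

# Hypothetical stand-in for the zip table in the database.
VALID_ZIPS = {"63101", "63102"}

def extract_zip(text, valid_zips=VALID_ZIPS):
    """Return (zip_or_None, remaining_text).

    The first standalone 5-digit run that is a known zip wins. If none is
    valid, every digit is stripped so only the city/state part remains.
    """
    for candidate in re.findall(r"\b\d{5}\b", text):
        if candidate in valid_zips:
            return candidate, text
    # No valid zip: drop all digits and tidy up the leftover whitespace.
    remaining = re.sub(r"\d+", "", text)
    remaining = re.sub(r"\s+", " ", remaining).strip()
    return None, remaining
```

With the sample set above, 'St. Louis, MO 63107 634' has no valid zip, so
the function hands back the cleaned 'St. Louis, MO' for the next steps.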

From here, I first try to find the state. So my idea is to look for
the first comma, remove any spaces after that comma, test the 1st and
2nd position after the comma to see if the characters are
alphabetical, then test if the 3rd position is a space or if the 3rd
position even exists.

'St. Louis, MO' would become 'St. Louis,MO' and since the two
characters after the comma are alphabetical, and there isn't a 3rd
character in the string (which would disqualify it as a state
abbreviation), MO would be deemed a qualified possible state
abbreviation.

Then I query the db, and if there is a match I set a variable called
"state_match" to the state that is found.

If there is no match, then I need to check for a full state name.

My stance is that if the string after the comma doesn't match a state
name as is, it's really not worth trying a bunch of different things
to figure out if it's valid.

If I have 'St. Louis, Missouri', I get 'Missouri' and check it. If the
string yields 'missour', I flag it as not valid and tell the user that
they didn't enter a valid location. Maybe down the road I can do
something more advanced, but I need to move on.
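
A rough Python sketch of the state check described above, assuming the
digits have already been stripped. `match_state` and the `STATES` dict are
hypothetical; the dict stands in for the state table that would really be
queried.

```python
# Hypothetical stand-in for the state table; a real lookup would query the db.
STATES = {"MO": "Missouri", "IL": "Illinois"}

def match_state(text, states=STATES):
    """Return the 2-letter abbreviation found after the first comma, or None."""
    _, comma, after = text.partition(",")
    if not comma:
        return None
    candidate = after.strip()
    head = candidate[:2]
    # Two letters followed by a space or by nothing: possible abbreviation.
    if len(head) == 2 and head.isalpha() and candidate[2:3] in ("", " "):
        abbr = head.upper()
        if abbr in states:
            return abbr
    # Otherwise only an exact (case-insensitive) full state name counts.
    for abbr, name in states.items():
        if name.lower() == candidate.lower():
            return abbr
    return None
```

A near miss such as 'missour' simply returns None, which matches the
"don't try to guess" stance above.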

Now at this point, I either have a state match or I don't. It's time
to look at the city.

I just want to find the first comma, strip out everything to the right
of the comma, including the comma, so I'm just left with the potential
city name.

'St. Louis, MO' yields 'St. Louis'

First, I need to run a replace on the string for any abbreviations
such as 'st.', 'ft.', etc. and replace the matches with 'saint',
'fort', etc.

Let's just use st. for now. What's the best way to look for 'st.' and
replace it with 'saint'?
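
One possible approach, sketched in Python: a case-insensitive regular
expression with a word boundary in front, so that 'st.' only matches as its
own word and not as the tail of something like 'Forest.'. The function name
and the `ABBREVIATIONS` table are made up for illustration, and the table
only holds the two abbreviations mentioned above.

```python
import re

# Illustrative table; extend it with whatever abbreviations are needed.
ABBREVIATIONS = {"st": "saint", "ft": "fort"}

# \b keeps the match at the start of a word; the period is required.
_ABBR_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, ABBREVIATIONS)) + r")\.\s*",
    re.IGNORECASE,
)

def expand_abbreviations(city):
    """Replace 'St.' / 'Ft.' style abbreviations with their full words."""
    def repl(match):
        return ABBREVIATIONS[match.group(1).lower()] + " "
    expanded = _ABBR_RE.sub(repl, city)
    # Collapse any doubled spaces left behind by the replacement.
    return re.sub(r"\s+", " ", expanded).strip()
```

The replacement is always lowercase ('saint'), so the later city comparison
would need to be case-insensitive.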

From here, I just need to search for the city name match.

1. If a statematch was found, I will search the database where city =
'citymatch' and state = 'statematch'. This is why I wanted to search
for state first, so I could isolate the city to the defined state.

2. If a statematch was not found, I will just search for city matches.
In the case of saint louis, there are three states that have a match.
I want to grab the city which has the most related zip codes.
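
The two cases above could be sketched as follows, using Python with an
in-memory SQLite table. The `zips(zip, city, state)` layout, the function
names, and the sample rows are all assumptions for illustration only; the
real schema will differ, but the GROUP BY / ORDER BY COUNT(*) idea should
carry over to most databases.

```python
import sqlite3

def demo_connection():
    """Build a throwaway table with illustrative sample rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE zips (zip TEXT, city TEXT, state TEXT)")
    conn.executemany("INSERT INTO zips VALUES (?, ?, ?)", [
        ("63101", "Saint Louis", "MO"),
        ("63102", "Saint Louis", "MO"),
        ("63103", "Saint Louis", "MO"),
        ("48880", "Saint Louis", "MI"),
        ("74866", "Saint Louis", "OK"),
    ])
    return conn

def find_city(conn, city, state=None):
    """Return (city, state) for the best match, or None if nothing matches."""
    if state is not None:
        # Case 1: a state was matched, so restrict the city search to it.
        row = conn.execute(
            "SELECT city, state FROM zips "
            "WHERE lower(city) = lower(?) AND state = ? LIMIT 1",
            (city, state),
        ).fetchone()
    else:
        # Case 2: no state, so pick the state with the most zips for the city.
        row = conn.execute(
            "SELECT city, state FROM zips WHERE lower(city) = lower(?) "
            "GROUP BY city, state ORDER BY COUNT(*) DESC LIMIT 1",
            (city,),
        ).fetchone()
    return tuple(row) if row else None
```

With the sample rows, a search for 'saint louis' without a state picks the
MO entry because it has the most zip rows.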

From here on out, I should have parsed enough data out to find a
location, and if not, tell the user to enter some valid data.
