I was asked to post this here instead of emailing folks directly. So here it goes.
I have a form field in which users are asked to enter "city, state or zip". They then select a range, such as 50 miles. This input is used to find records that fall within that range. I expect to get entries such as:

  St. Louis, MO
  St. Louis, Missouri
  Saint Louis, MO 63101
  Saint Louis, Missouri 63101

I've already made good progress on this, but have hit some roadblocks. At this point I have the functions to strip out a zip code (5 digit numeric) and check whether it is a valid zip code in the database. Since the zip is the most accurate piece of location data, it has the highest priority in the string. So if someone enters 'St. Louis, MO 63101', the code grabs the 63101 and uses that. The form asks for City, State OR zip, but if by chance someone enters the full string with a valid zip, then the zip is all that is needed.

Now to the point in the script where I am having problems. Let's say a user enters 'St. Louis, MO 63107 634'. The 634 is removed from the string since it doesn't match the characteristics of a zip. The 63107 is checked against the database, and let's say it doesn't match anything and is deemed worthless, so it is removed from the string as well. As soon as it is determined that there is no valid zip in the string, all numerics are stripped out to isolate the possible city/state combo. At this point, the string would be 'St. Louis, MO'.

From here, I first try to find the state. My idea is to look for the first comma, remove any spaces after that comma, test whether the 1st and 2nd characters after the comma are alphabetic, and then test whether the 3rd position is a space or doesn't exist at all. 'St. Louis, MO' would become 'St. Louis,MO', and since the two characters after the comma are alphabetic and there isn't a 3rd character in the string (a 3rd letter would disqualify it as a state abbreviation), MO would be deemed a possible state abbreviation.
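To make the zip and state-abbreviation steps above concrete, here is a minimal sketch in Python (chosen only for illustration; the regular expressions should carry over to other engines). The function names and the VALID_ZIPS set are hypothetical stand-ins for the existing functions and the database lookup, not the actual code.

```python
import re

# Hypothetical stand-in for the zip table in the database.
VALID_ZIPS = {"63101"}


def find_valid_zip(text, valid_zips=VALID_ZIPS):
    """Return the first standalone 5-digit token that is a known zip, else None."""
    # The lookarounds reject runs like '634' or '123456' that are not exactly 5 digits.
    for candidate in re.findall(r"(?<!\d)\d{5}(?!\d)", text):
        if candidate in valid_zips:
            return candidate
    return None


def strip_numerics(text):
    """Remove every digit, then collapse the leftover whitespace."""
    without_digits = re.sub(r"\d+", "", text)
    return re.sub(r"\s+", " ", without_digits).strip()


def state_abbreviation(text):
    """Return a possible 2-letter state code found after the first comma, else None."""
    _, comma, tail = text.partition(",")
    if not comma:
        return None
    # Two letters, followed by a space or the end of the string.
    match = re.match(r"([A-Za-z]{2})(?:\s|$)", tail.lstrip())
    return match.group(1).upper() if match else None
```

With these assumptions, 'St. Louis, MO 63107 634' yields no valid zip, strip_numerics reduces it to 'St. Louis, MO', and state_abbreviation returns 'MO'. A tail such as 'Missouri' returns None, which is the cue to fall through to the full state name check.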
Then I query the db, and if there is a match I set a variable called "state_match" to the state that is found. If there is no match, then I need to check for a full state name. My stance is that if the string after the comma doesn't match a state name as is, it's really not worth trying a bunch of different things to figure out if it's valid. If I have 'St. Louis, Missouri', I get 'Missouri' and check it. If the string yields 'missour', I flag it as not valid and tell the user that they didn't enter a valid location. Maybe down the road I can do something more advanced, but I need to move on.

Now at this point, I either have a state match or I don't. It's time to look at the city. I just want to find the first comma and strip out everything to the right of it, including the comma itself, so I'm left with only the potential city name. 'St. Louis, MO' yields 'St. Louis'.

First, I need to run a replace on the string for any abbreviations such as 'st.', 'ft.', etc. and replace the matches with 'saint', 'fort', etc. Let's just use 'st.' for now. What's the best way to look for 'st.' and replace it with 'saint'?

From here, I just need to search for the city name match.

1. If a state match was found, I will search the database where city = 'citymatch' and state = 'statematch'. This is why I wanted to search for the state first, so I could narrow the city down to the defined state.

2. If a state match was not found, I will just search for city matches. In the case of saint louis, there are three states that have a match. I want to grab the city which has the most related zip codes.

From here on out, I should have parsed out enough data to find a location, and if not, I tell the user to enter some valid data.
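One possible answer to the 'st.' question, again as a hedged Python sketch rather than a definitive implementation: match the abbreviation as a whole word with an optional trailing period, so 'St. Louis' and 'St Louis' both become 'saint louis' while 'Stanton' is left alone. The names ABBREVIATIONS, STATE_NAMES and SAMPLE_ROWS are hypothetical, and the rows are made-up sample data standing in for the zip table, not real counts.

```python
import re
from collections import Counter

# Hypothetical lookup tables; a real version would come from the database.
ABBREVIATIONS = {"st": "saint", "ft": "fort"}
STATE_NAMES = {"missouri": "MO", "michigan": "MI", "oklahoma": "OK"}

# Whole-word match with an optional trailing period, case-insensitive.
ABBREVIATION_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, ABBREVIATIONS)) + r")\b\.?",
    re.IGNORECASE,
)


def city_part(text):
    """Keep only what is left of the first comma."""
    return text.partition(",")[0].strip()


def expand_abbreviations(city):
    """Replace 'st.'/'ft.' style abbreviations and normalise case and spacing."""
    expanded = ABBREVIATION_PATTERN.sub(
        lambda m: ABBREVIATIONS[m.group(1).lower()] + " ", city
    )
    return re.sub(r"\s+", " ", expanded).strip().lower()


def match_state(candidate):
    """Exact match only: a 2-letter code or a full state name, else None."""
    cleaned = candidate.strip().lower()
    if cleaned.upper() in STATE_NAMES.values():
        return cleaned.upper()
    return STATE_NAMES.get(cleaned)


def best_city_match(city, rows, state=None):
    """Pick the (city, state) pair with the most zip rows, optionally within one state."""
    counts = Counter(
        row_state
        for row_city, row_state, _zip in rows
        if row_city == city and (state is None or row_state == state)
    )
    if not counts:
        return None
    return city, counts.most_common(1)[0][0]


# Made-up sample rows of (city, state, zip) for illustration only.
SAMPLE_ROWS = [
    ("saint louis", "MO", "00001"),
    ("saint louis", "MO", "00002"),
    ("saint louis", "MO", "00003"),
    ("saint louis", "MI", "00004"),
    ("saint louis", "OK", "00005"),
]
```

With the sample rows, best_city_match(expand_abbreviations(city_part('St. Louis, MO')), SAMPLE_ROWS) picks the MO entry because it has the most zip rows, and passing state='MI' restricts the search to that state. In SQL the same idea would be a GROUP BY on city and state ordered by the zip count, but that depends on the table layout, which the sketch does not assume.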
