# [PHP] Re: Postal / Zip Code Proximity Search

```>   I have a couple of questions with respect to creating a postal /
>zip code proximity search that is remotely accurate. The system I am
>using now is OK for small distances, but is terrible at calculating
>large distances.```
```
What formula are you using to calculate distance?

You see, the Earth is not flat :-)

So, after a certain distance, the error margin of the curvature of the Earth
becomes quite significant.

There are several formulas you could be using...

Geographics and Cartographers have been known to challenge each other to
duels over the relative merits of these formulas, so be careful which one
you pick :-)

A quick search for "longitude, latitude, distance, and Great Circle" should
turn up several options, with explaining text about their relative merits.
(Or it did when I did this same thing years ago...)

In the end, though, *I* ended up just using Cartesian distance, since
anybody searching for things "close by" is not going to be worried about
curvature of the Earth in their calculations.

So, yeah, it's way wrong for those establishments thousands of miles away...
So what?  Nobody cares about how far is the Jiffy Lube 2000 miles away from
them :-)

In *SOME* applications, you'd need a *MUCH* more accurate formula -- But if
you're searching for "nearby" establishments, forget worrying about the
long-distance error.

The performance on the trigonometric functions for the "better" formula were
intolerable, even in PostgreSQL user-defined functions.  YMMV.

But there's just no way you're going to want to download all the
establishments to PHP and do the calculations there.  Dog-slow.

You'll *HAVE* to get them into SQL some way.

Also, don't even think of doing a JOIN with your establishments and the Zip
Code table.

It's rule-breaking time. :-)

Add long/lat columns to your establishments and fill them in with the Zip
Code data if you haven't already.

Again, YMMV, but the JOIN of 2000 music venues and the ~10K Zip codes was
intolerably slow.

>   What I have is a database of establishments and their corresponding
>latitudes and longitudes, and I need to have users enter their own
>postal / zip code, and have the system calculate which establishments
>fall into a set range (\$range) from their location. Not only does it
>need to find the establishments, it needs to calculate the distances
>(straight line, of course).

So long as you keep that "distance" small -- a few score miles, the
curvature should not be that big a deal...

You will need a "fudge factor" of "arc" (long/lat) to miles to divide your
Cartesian long/lat "distance" (using the term distance very loosely).

Also note that the closer you get to the North/South Pole, the worse your
error rate will be, since the longitude lines converge.

>   Now, I have dug through the PHP archives, but there is no single
>clear explanation of how to do this, from beginning to end. I will
>start by putting down what I need, and then maybe other people can
>contribute to this and in the end we will have the solution, nice and
>tidy, in one place.

There really can't be a tidy solution, due to the underlying formulas'
relative merits and the performance issues, and the widely different
requirements of the definition of "distance"...

Still, a summary of the options would be nice.

>*** NOTE: I know doing this in PHP will be SLOW, but I do not want to
>write a MySQL user-defined function and recompile it and blah, blah,
>blah... (unless someone wants to cover that aspect!!! :D)

If you switch to PostgreSQL instead of MySQL, you can define a function in
PostgreSQL for *MUCH* better performance.   (Actually, PostgreSQL may even
have a good built-in long/lat distance function by now...)

>   Step 1: Take the postal / zip code that the user entered, and
>search the national postal / zip code database (which I already have)
>and return its corresponding latitude and longitude (\$userLat and
>\$userLong).
>
>   Step 2: ???
>
>   Step 3: Return the establishments information to the user, complete
>with the distances to them, ordered by smallest distance to largest.

For sure, limit the number of responses.  In fact, let the *USER* decide how
far they will travel.  Somebody in Idaho is probably willing to drive a lot
further than, say, a Manhattan resident...

For the short distances you are talking about, you could *ALMOST* get away
with:

# Assume \$longitude/\$latitude is the user's location.
select name, address from establishments where sqrt((longitude -
\$longitude)*(longitude - \$longitude) + (latitude - \$latitude)*(latitude -
\$latitude))

IE, the plain old cartesian (x, y) distance from Algebra II class.

You could then take those results and divide by a "fudge factor" of arcs to
miles (long/lat to mile conversion) to get a pretty close approximation for
*NEARBY* establishments.

Once you get past a few score miles, though, it will be noticeably "off"

Seek out the formulas as I suggested, though, and you'll be able to get
pretty accurate distances, but you'll have to use the Trigonometric
functions MySQL might not have, so then you get to review Trigonometry and
play some identity function games to convert it to the ones that MySQL
*does* have, assuming it has any.  If MySQL has *no* Trig, your options just
got real easy, if not attractive, to choose...

Sample with the Cartesian crappy fake "distance" and fudge factor I chose:

http://chatmusic.com/venues.htm

When I calculate LA to Chicago, it's way off, but so what?  Nobody needs it
to work that far away if they're using the application for what it's
intended.

--
Like Music?  http://l-i-e.com/artists.htm

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

```