You can using any language to automate it, a scripting language would probably 
be the easiest (php, perl, ruby, python, shell, etc).

The table type doesn't matter too much. Although if the tables are static, I 
would use MyISAM.

Spatial indexes come across as more for use in geometry, rather than distances between locations using long/lat. But all you're doing is trying to do is find distances on a spehere that happens to be called Earth, the points of which you refer to as long/lat instead of x/y. It is geometry. You should be able to find a number of examples showing distances between zip codes.

----- Original Message ----- From: "Chris Prakoso" <[EMAIL PROTECTED]>
To: "Brent Baisley" <[EMAIL PROTECTED]>
Cc: <mysql@lists.mysql.com>
Sent: Wednesday, May 23, 2007 9:41 AM
Subject: Re: Help on selecting a View with 3 Billions rows !


Brent,

Thanks for the reply. I knew that MySQL has spatial index, although reading it quickly, I wasn't sure that this is for me. But maybe I'll read it again in more details.

On your second suggestion, i.e. to do INSERT a bit at a time, how can I 
automate this ?

Also, does it matter if the tables are using MyISAM or InnoDB ?

Cheers,
Chris

----- Original Message ----
From: Brent Baisley <[EMAIL PROTECTED]>
To: Chris Prakoso <[EMAIL PROTECTED]>; mysql@lists.mysql.com
Sent: Wednesday, 23 May, 2007 1:30:50 PM
Subject: Re: Help on selecting a View with 3 Billions rows !

I think what you want to do is look into creating a spacial index on the raw data. Then just searching on that index will allow you
to judge distances between things. You won't need to create a table with every 
single possible combination. There are quite a few
examples in the manual and online.

But, if you really want to go that route, do a little bit at a time instead of all at once. It will go quicker and you'll be able to
judge the prgress. For example, join 1 record from one table with the 65K 
records from the other and insert that into the
combination table. Then grab the second record, etc. Do that 54K times. You'll 
be surprised at how quickly it goes.

----- Original Message ----- From: "Chris Prakoso" <[EMAIL PROTECTED]>
To: <mysql@lists.mysql.com>
Sent: Tuesday, May 22, 2007 6:43 PM
Subject: Help on selecting a View with 3 Billions rows !


Hi all,




I am having a big performance performance problem here, and I was wondering if 
you could give me some advise here.


I have 2 big static tables, one with 65K rows, the other with 54K rows.
I need to do a calculation for each combination of these two table
rows.


So what I did was to create a view like so:




 select


   `c`.`TRACT` AS `TRACT`,


   `c`.`LNG` AS `tlng`,


   `c`.`LAT` AS `tlat`,


   `p`.`NAME` AS `name`,


   `p`.`LNG` AS `lng`,


   `p`.`LAT` AS `lat`,


   `Calc_Distance`(`c`.`LAT`,`c`.`LNG`,`p`.`LAT`,`p`.`LNG`) AS `distance`


 from


   (`tracts` `c` join `parks` `p`);




This give me a view with more than 3,500,000,000  rows !



Now, the second part of this exercise is to grab only a set or rows, where the 
distance is less than 50.


So, I thought I would create another view like so:




 select


   `t`.`TRACT` AS `TRACT`,


   `t`.`tlng` AS `tlng`,


   `t`.`tlat` AS `tlat`,


   `t`.`name` AS `name`,


   `t`.`lng` AS `lng`,


   `t`.`lat` AS `lat`,


   `t`.`distance` AS `distance`


 from


   `tractparkdistance` `t`


 where


   (`t`.`distance` < 50);




tractparkdisctance is the name of the view.




But opening this view takes 'a lot of time' !  I just couldn't wait for it.



So, I though I would try to export this to an external file via
SELECT INTO, and re-import the resulting file back to a new table.




So I did like so:




select *


into outfile "park_distances"


from tractparkdistance


where distance < 50;



Running this statement took more than 12 hours, and still counting
until I killed the process. So far it has produced an 800 MB file.


Moreover, I still need to do a Mean calculation from that 'limited' set of 
data, and still do more calculations.


Next try, I using INSERT INTO SELECT like this:

insert into park_distance_radius50s(tract,tlng,tlat,name,lng,lat,distance)
select tract,tlng,tlat,name,lng,lat,distance
from tractparkdistance
where distance < 50

This was running a very long time as well.



I think I'm out of my depth here.  Anybody has any idea on this ?




Thanks very much in advance !




Regards,


Chris


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]







--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to