Hi Fred,

So I am guessing your "real time" calculations are all going to be focused on the moving vehicles, right? If the way-points are relatively static, you can preprocess information about them offline (distance between each pair, mining historical data for the average time taken to travel between two, etc.).
I am also guessing you will need to find way-points relative to a given vehicle - if that is the case, I think you are going to need to investigate some kind of index for the way-points. We do this for our 150 million points by assigning each one to an identified 1 degree x 1 degree cell (and then to 0.1 x 0.1 degree cells), so that when someone is interested in points near a location, we first determine which cells are candidates and immediately we have reduced the candidate points to check. In database terms, we have latitude and longitude and then create a (cell_id int, centi_cell_id int).

If you know the routes that a vehicle is taking, is there any way you could perhaps preplan its route and cache that, or somehow store known routes between way-points? This might allow you to really reduce the candidates to check.

Just some ideas,

Tim
skype: timrobertson100

On Fri, Jun 19, 2009 at 10:16 PM, Fred Zappert<[email protected]> wrote:
> Tim,
>
> Thanks so much for the additional links.
>
> Our problem is for the moment much smaller - 4,000,000 mapped way-points,
> and 80,000 moving vehicles.
>
> Clustering the way-points into polygons makes a lot of sense.
>
> Fred.
>
> On Fri, Jun 19, 2009 at 2:43 PM, tim robertson
> <[email protected]> wrote:
>
>> Hi Fred,
>>
>> I was working on 150 million point records, and 150,000 fairly detailed
>> polygons. I had to batch it up and do 40,000 polygons in memory at a
>> time on the MapReduce jobs.
>>
>> If you are dealing with a whole bunch of points, might it be worth
>> clustering them into polygons first to get candidate points?
>> We are running this:
>> http://code.flickr.com/blog/2008/10/30/the-shape-of-alpha/ and
>> clustering 1 million points into multipolygons in 5 seconds. This
>> might get the numbers down to a sensible number.
>>
>> It is a problem of great interest to us also, so happy to discuss
>> ideas...
>> http://biodivertido.blogspot.com/2008/11/reproducing-spatial-joins-using-hadoop.html
>> was one of my early tests.
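In case it helps, the cell / centi-cell idea above could be sketched roughly like this (Python just for illustration; the row-major numbering of the ids is my assumption, not necessarily what our system actually uses):

```python
import math

def cell_ids(lat, lon):
    """Return (cell_id, centi_cell_id) for a point given in degrees.

    cell_id identifies the 1 x 1 degree cell (numbered row-major over the
    180 x 360 grid); centi_cell_id identifies the 0.1 x 0.1 degree sub-cell
    within that cell (0..99). The numbering scheme is illustrative only.
    """
    # Shift to positive ranges: lat -90..90 -> 0..180, lon -180..180 -> 0..360
    row = int(math.floor(lat + 90))        # 1-degree row, 0..179
    col = int(math.floor(lon + 180))       # 1-degree column, 0..359
    cell_id = row * 360 + col
    # 0.1-degree sub-cell position inside the 1-degree cell
    sub_row = int(math.floor((lat + 90) * 10)) % 10
    sub_col = int(math.floor((lon + 180) * 10)) % 10
    centi_cell_id = sub_row * 10 + sub_col
    return cell_id, centi_cell_id
```

A "points near X" query then first computes the candidate cell_ids around X and only does exact distance checks on points in those cells.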
>>
>> Cheers
>>
>> Tim
>>
>>
>> On Fri, Jun 19, 2009 at 9:37 PM, Fred Zappert<[email protected]> wrote:
>> > Tim,
>> >
>> > Thanks. That suggests an implementation that could be very effective at
>> > the current scale.
>> >
>> > Regards,
>> >
>> > Fred.
>> >
>> > On Fri, Jun 19, 2009 at 2:27 PM, tim robertson <
>> > [email protected]> wrote:
>> >
>> >> I've used it as a source for a bunch of point data, and then tested
>> >> them in polygons with a contains(). I ended up loading the polygons
>> >> into memory with an RTree index though, using the GeoTools libraries.
>> >>
>> >> Cheers
>> >>
>> >> Tim
>> >>
>> >>
>> >> On Fri, Jun 19, 2009 at 9:22 PM, Fred Zappert<[email protected]> wrote:
>> >> > Hi,
>> >> >
>> >> > I would like to know if anyone is using HBase for spatial databases.
>> >> >
>> >> > The requirements are relatively simple.
>> >> >
>> >> > 1. Two dimensions.
>> >> > 2. Each object represented as a point.
>> >> > 3. Basic query is nearest neighbor, with a few qualifications such as:
>> >> > a
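P.S. For what it's worth, the candidate-cell lookup for a nearest-neighbor query could look something like this in memory (a toy sketch: it scans the 3x3 block of 1-degree cells around the query and ignores longitude wrap-around at +/-180; a real system would instead filter on the stored cell_id column):

```python
import math
from collections import defaultdict

def cell_of(lat, lon):
    # 1-degree cell as a (row, col) pair; same shifting as the cell_id scheme
    return (int(math.floor(lat + 90)), int(math.floor(lon + 180)))

class CellIndex:
    """Toy in-memory index: bucket points by 1-degree cell."""

    def __init__(self):
        self.cells = defaultdict(list)

    def add(self, lat, lon, payload):
        self.cells[cell_of(lat, lon)].append((lat, lon, payload))

    def near(self, lat, lon):
        """Return candidate points from the 3x3 cells around (lat, lon).

        Callers still do exact distance checks on these candidates to find
        the true nearest neighbor.
        """
        row, col = cell_of(lat, lon)
        out = []
        for r in (row - 1, row, row + 1):
            for c in (col - 1, col, col + 1):
                out.extend(self.cells.get((r, c), []))
        return out
```

The point is just that the index reduces 4,000,000 way-points to the handful that share nearby cells before any expensive distance math runs.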
