Hi Kurt and everyone, I have an untried, vaguely specified idea:
1. Add a column for "consistent_track_id". This will hold a number identifying a "consistent track": one formed by a run of points for a given ship in which each point is within one hour of the previous one. It will be null at first.

2. In a procedural language, grab the list of all the measurement points for a given ship *ordered by time*. Start with the first one, give it an arbitrary unique ID for a "consistent track", and store that ID in a variable. Grab the next measurement point; if the time difference is less than one hour, assign it the same ID, otherwise get a new "consistent track" ID and start the process again. This fills the table with IDs that say which "consistent track" each point belongs to. (My logic may be slightly off, especially at the edges, but you get the idea. A rough PL/pgSQL sketch follows this list.)

3. Now it is easy: GROUP BY consistent_track_id in a select to aggregate the points into LINESTRINGs, and select the result into a new table (a sketch of that query is below as well).

4. With a ship ID you could do all of this in one table for all the ships and measurement points, which would be a little more graceful. The primary key would probably be (ship_id, measurement_timestamp). Keep a separate table for the coalesced tracks.

5. If you are streaming the measurement points in, try to assign a consistent track ID as each one arrives, which would be easy and saves query time later (there is a trigger sketch below too).

6. "Indexes", not "keys", are what (sometimes) speed up queries; an example is below.
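Something like the following is what I mean for steps 1 and 2 -- a rough, untried PL/pgSQL sketch, where the table and column names (oneShip, key, userid, cg_timestamp -- or whatever the underlying table is, if oneShip is a view), the new consistent_track_id column, the sequence, and the one-hour cutoff are all assumptions taken from your sample:

-- Step 1: the new column, plus a sequence to hand out track IDs
ALTER TABLE oneShip ADD COLUMN consistent_track_id integer;
CREATE SEQUENCE consistent_track_seq;

-- Step 2: walk one ship's points in time order and assign IDs,
-- starting a new track whenever the gap to the previous point exceeds one hour
CREATE OR REPLACE FUNCTION assign_consistent_tracks(ship integer) RETURNS void AS $$
DECLARE
    pt       RECORD;
    prev_ts  timestamp := NULL;
    track_id integer   := NULL;
BEGIN
    FOR pt IN
        SELECT key, cg_timestamp
        FROM oneShip
        WHERE userid = ship
        ORDER BY cg_timestamp
    LOOP
        IF prev_ts IS NULL OR pt.cg_timestamp - prev_ts > interval '1 hour' THEN
            track_id := nextval('consistent_track_seq');  -- start a new consistent track
        END IF;
        UPDATE oneShip SET consistent_track_id = track_id WHERE key = pt.key;
        prev_ts := pt.cg_timestamp;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- e.g. for the ship in your sample:
SELECT assign_consistent_tracks(366725230);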
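Step 3 would then look roughly like this -- again untried, and the geometry column (the_geom here) and output table name are guesses; MakeLine/ST_MakeLine is the PostGIS aggregate that strings points into a LINESTRING:

-- Step 3: one LINESTRING per consistent track, kept in a new table
CREATE TABLE ship_tracks AS
SELECT userid,
       consistent_track_id,
       MIN(key)              AS start_key,
       MAX(key)              AS end_key,
       ST_MakeLine(the_geom) AS track_geom
FROM (
    SELECT userid, consistent_track_id, key, the_geom
    FROM oneShip
    ORDER BY userid, consistent_track_id, cg_timestamp
) AS ordered_points
GROUP BY userid, consistent_track_id;

The MIN(key)/MAX(key) per group is the startKey/endKey pair you listed, and the ORDER BY in the subselect is the usual trick for getting the points into the line in time order.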
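For step 5, a trigger is one way to do the streaming assignment -- purely a sketch, reusing the sequence above and assuming a ship's points arrive in time order:

-- Step 5: assign consistent_track_id at insert time
CREATE OR REPLACE FUNCTION set_consistent_track() RETURNS trigger AS $$
DECLARE
    last_pt RECORD;
BEGIN
    -- find this ship's most recent point, if any
    SELECT consistent_track_id, cg_timestamp INTO last_pt
    FROM oneShip
    WHERE userid = NEW.userid
    ORDER BY cg_timestamp DESC
    LIMIT 1;

    IF FOUND AND NEW.cg_timestamp - last_pt.cg_timestamp <= interval '1 hour' THEN
        NEW.consistent_track_id := last_pt.consistent_track_id;     -- same track
    ELSE
        NEW.consistent_track_id := nextval('consistent_track_seq'); -- new track
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER assign_track BEFORE INSERT ON oneShip
    FOR EACH ROW EXECUTE PROCEDURE set_consistent_track();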
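And on point 6, an index matching that per-ship, time-ordered access pattern (names assumed) is what should keep those 4.5M-row scans from taking all day:

-- Step 6: index on (ship, time) for the ordered per-ship scans
CREATE INDEX oneship_userid_ts_idx ON oneShip (userid, cg_timestamp);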
HTH, and let us all know how you finally fix it.

On 7/1/07, Kurt Schwehr <[EMAIL PROTECTED]> wrote:

Hi Rodrigo,

I am not sure how to dump a view in a way that is useful, so here is what makes up the data set and the answer for one ship.

select key,userid,cg_sec,cg_timestamp from oneShip;

 key |  userid   |   cg_sec   |    cg_timestamp
-----+-----------+------------+---------------------
 251 | 366725230 | 1178661410 | 2007-05-08 21:56:50
 252 | 366725230 | 1178661421 | 2007-05-08 21:57:01
 253 | 366725230 | 1178661561 | 2007-05-08 21:59:21
 254 | 366725230 | 1178661710 | 2007-05-08 22:01:50
 255 | 366725230 | 1178661900 | 2007-05-08 22:05:00
 256 | 366725230 | 1178661940 | 2007-05-08 22:05:40
 257 | 366725230 | 1178663311 | 2007-05-08 22:28:31
 258 | 366725230 | 1178663360 | 2007-05-08 22:29:20
 259 | 366725230 | 1178663441 | 2007-05-08 22:30:41
 260 | 366725230 | 1178663540 | 2007-05-08 22:32:20
 -- here is the time jump between the two that I need to split on
 266 | 366725230 | 1178729461 | 2007-05-09 16:51:01
 267 | 366725230 | 1178730221 | 2007-05-09 17:03:41
(12 rows)

Where the results should be this for the separated transits:

 userid    | startKey | endKey
 366725230 |      251 |    260
 366725230 |      266 |    267

I've also put a dump of the db up on the web ("pg_dump --create ais > ais_pgdump.sql") in case that might be easier to deal with:

http://vislab-ccom.unh.edu/~schwehr/ais/tmp/

I've written a little python/psycopg2 script, but the full database has 4.5M entries and the processing is going to take the whole day (which I think implies that I need to tell postgresql that userid should be a key).

Thanks much!
-kurt
