I am going to change the query to CTE structure to make it more readable. To know more about CTE take a look at http://www.postgresql.org/docs/8.4/static/queries-with.html

The CTE query is then like:

with
--first part
status_flag as(
select vehicle_id, geom, time, taxi_is_occupied, case when (taxi_is_occupied <> lag(taxi_is_occupied,1) over(order by time)) then 1 else 0 end as taxi_status_change from gpx
)
,
--second part
track_group as (
select vehicle_id,geom, time, taxi_is_occupied, sum(taxi_status_change) over (order by time) as track_num from status_flag
)

--third part
select vehicle_id, st_makeline(geom), track_num, taxi_is_occupied from track_group
group by track_num,taxi_is_occupied, vehicle_id order by track_num



Yes. With the first part of the query (status_flag) you get the same amount of raws but with an extra column "taxi_status_change". This extra column detects when a change in the taxi_is_occupied value occurs.


The second part (track_group), keeps the extra column but changes its value. The new value makes it possible to group (in the third part of the query) all the consecutive points with the same taxi_is_occupied value.



On 02/12/2014 08:59, Oliver Burgfeld wrote:
Hi and thanks, that's exactly what I was looking for.

But nevertheless it seems not to work as it's supposed to, I think.

If I just run the first query to detect when the status changes, I get a really strange result.
I have at least the same amount of rows as in my original table.
But the status does not change at every point of course, so the result has to be lower. As I have only very basic knowledge of SQL, I really don't know how to handle this strange behaviour.

Or did I miss a thing here?


Am Montag, 1. Dezember 2014 10:50:09 UTC+1 schrieb toni hernández:

    Hi,

    Maybe this is already solved but anyway....

    As the "taxi_is_occupied" field  is boolean (or binary), if you
    group by this field you will get only two multilinestrings for
    each vehicle. One multilinestring when "taxi_is_occupied" is true,
    and one multilinestring when is false.

If you want to get as many geometries as clients a taxi has done, then you can use the window functions "LAG". With this function
    you can detect when the taxi changes from free to occupied and
    viceversa

    This sentence detects when taxi changes status:
    select geom, time, status, case when (status <> lag(status,1)
    over(order by time)) then 1 else 0 end as canvi from table

    From there you can count the number of times the
    "taxi_is_occupied" has changed.

    select geom, time, status, sum(canvi) over (order by time) as
    track_num from
    (
    select geom, time, status, case when (status <> lag(status,1)
    over(order by time)) then 1 else 0 end as canvi from table
    ) as foo


    And finally , you can use the previous SELECT statement to create
    all tracks for all taxis

    select vehicleid, st_makeline(geom), track_num, status
    from
    (
    select vehicleid,geom, time, status, sum(canvi) over (order by
    time) as track_num from
    (
    select vehicleid, geom, time, status, case when (status <>
    lag(status,1) over(order by time)) then 1 else 0 end as canvi from
    table
    ) as foo
    ) as fooo
    group by track_num, status, vehicleid
    order by track_num



    On 25/11/2014 19:16, Roxanne Reid-Bennett wrote:

        On 11/25/2014 11:48 AM, Oliver Burgfeld wrote:

            I also tried that and it works but it does not give me
            those two columns in my new table. There are only id and
            status inside.

        From the quer below change
        SELECT id, status, (ST_Dump(mylines)).geom
        to

        SELECT id, status, (ST_Dump(mylines)).geom, time_start, time_end

        Something to keep an eye out for... Depending upon your
        version of PostGIS and the underlying libraries, because we
        were working with an older version of the underlying
libraries, I don't know if this is still a potential issue. We ran into issues with "stacked" points (2 GPS points with
        the same coordinates), and GPS drift causing issues with the
        linestrings being pretty goofy (jagged points in the
        linestring that were clearly NOT what the vehicle did). We
        wrote cleanup scripts to take care of those issues (that are
        still in place).  I've not taken time to revisit the library
        routines to see if they now handle those conditions cleanly.

        Roxanne


            Am Dienstag, 25. November 2014 17:39:21 UTC+1 schrieb
            Brent Wood:

                as in my previous reply, I figured that would be useful...
                WITH multis AS (
                SELECT id, status,*min(timestamp) as time_start,
                max(timestamp) as time_end, *ST_MakeLine( point_geom
                ORDER BY timestamp) AS mylines
                FROM your_table
                GROUP BY id, status
                )
                SELECT id, status, (ST_Dump(mylines)).geom
                FROM multisBrent Wood
                
------------------------------------------------------------------------
                *From:* Oliver Burgfeld <[email protected]>
                *To:* [email protected]
                *Cc:* [email protected]; [email protected];
                [email protected]; [email protected]
                *Sent:* Wednesday, November 26, 2014 5:10 AM
                *Subject:* Re: [postgis-users] Creating
                trajectory/lines from millions of points[PostGIS]

                Thank you and all the others who were answering :)

                I tried that and it seems that its working.
                Nevertheless I only tried it with a small part of my
                data (round about 1 million rows out of ~500 million)
                but if it's working now, it should also work with the
                whole dataset.

                Is there a way to also include the time_field into the
                result? I created a new table with this statement
                given but there are only two columns (vehicleid and
                status) included.
                I know thats logical because I only included those two
                into my select clause but it would be great to not
                only order by time but also have a time column in my
                table.

                For example:

                vehicleid | status | time_start | time_end


                I hope its understandable and not to mixed up...

                Thanks!


                Am Dienstag, 25. November 2014 16:06:33 UTC+1 schrieb
                Rémi Cura:

                    Hey, a small correction :
                    ST_MakeLine is already an aggregate, and you may
                    want to enforce the order inside the aggregate
                    (see at the end).
                    Another interesting point is the possiblity to pu
                    somehting in the M value of each point of the
                    line, for instance the time.
                    This comes very handy when you want to extrat
                    parts of the lines.


                    So for instance for the first proposition :

                    WITH multis AS (
                    SELECT id, status,*ST_MakeLine( point_**geom ORDER
                    BY time_field) *AS mylines
                    FROM your_table
                    GROUP BY id, status
                    )
                    SELECT id, status, (ST_Dump(mylines)).geom
                    FROM multis

                    Cheers,
                    Rémi-c


                    2014-11-25 9:53 GMT+01:00 Brent Wood
                    <[email protected]>:

                        or automatically get the start & end times for
                        each trackline in the record like this:

                        WITH multis AS (
                        SELECT id, min(time_field) AS time_start,
                        max(time_field) as time_end, status,
                        ST_MakeLine(array_agg(point_ geom )) AS mylines
                        FROM your_table
                        GROUP BY id, status
                        ORDER BY time_field
                        )
                        SELECT id, status, (ST_Dump(mylines)).geom
                        FROM multis;


                        Cheers,
                           Brent Wood
                        
------------------------------------------------------------------------
                        *From:* Hugues François <[email protected]>
                        *To:* PostGIS Users Discussion
                        <[email protected] >
                        *Sent:* Tuesday, November 25, 2014 8:13 PM
                        *Subject:* Re: [postgis-users] Creating
                        trajectory/lines from millions of points[PostGIS]

                        Hello,
                        In your case I would have try to make
                        multilines for each taxi and each status (i.e.
                        two multi by taxi) and then dump them into
                        simple linestrings. All in a query that may
                        look like this assuming you have a taxi id field:
                        WITH multis AS (
                        SELECT id, status,
                        ST_MakeLine(array_agg(point_ geom )) AS mylines
                        FROM your_table
                        GROUP BY id, status
                        ORDER BY time_field
                        )
                        SELECT id, status, (ST_Dump(mylines)).geom
                        FROM multis
                        You may want to add a time reference to your
                        lines. To do this, you can add an extraction
                        from your timestamp field (e.g. day or month)
                        and add it into the WITH and to the group by
                        clause.
                        Hugues.
                        *De :*postgis-us...@lists. osgeo.org
                        [mailto:postgis-us...@ lists.osgeo.org] *De la
                        part de* Oliver Burgfeld
                        *Envoyé :* mardi 25 novembre 2014 07:09
                        *À :* [email protected]
                        *Objet :* [postgis-users] Creating
                        trajectory/lines from millions of points[PostGIS]


                        Hi,
                        I have millions of points in a PostGIS
                        database containing taxi gps tracks. Now I
                        want to create lines from these points by
                        vehicleid and ordered by timestamp. But, and
                        that's my problem right now, at first I want
                        to include every column of my point table into
                        the "line table" and I also need to intersect
                        those lines at specific points.
                        I have one column representing the
                        "taxi_is_occupied" status with 0 or 1.
                        What I want now is to create lines which are
                        divided every time this status changes. In the
                        end I need lines which show the path of every
                        taxi over time, divided every time the status
                        of the car changes so that I can query all
                        lines where the taxi is occupied, for example.
                        What do I have to use therefore? I know that
                        there is the ST_MakeLines tool existing in
                        PostGIS, but as I am a new PostGIS

    ...


_______________________________________________
postgis-users mailing list
[email protected]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

Reply via email to