I33Buckler commented on issue #1287:
URL: https://github.com/apache/age/issues/1287#issuecomment-1783935480

   A factor that I've noticed with regard to the noticeable performance 
difference between joining the PostgreSQL tables in the database and and 
joining the nodes is the difference in estimated cost.
   The native query below in response to EXECUTION PLAN returns a total cost 
score of 296852.63.
   ```SQL
   select st.* from 
   allservices.stops s,
   allservices.trips t,
   allservices.stop_times st 
   where s.tc_date=t.tc_date and t.tc_date=st.tc_date
                and s.stop_id=st.stop_id and t.trip_id=st.trip_id 
        ;
   ```
   Whereas the graph query in response to the EXECUTION PLAN returns a total 
cost score of 2115202.70
   ```SQL
   CREATE OR REPLACE FUNCTION create_stop_times(graph_name text, tc_date date, 
trip_id text, stop_id text, 
                                                arrival_time text, 
arrival_time_sec bigint, departure_time text, departure_time_sec bigint,
                                                stop_sequence bigint, 
pickup_type int, drop_off_type int)
   returns text 
   LANGUAGE plpgsql
   VOLATILE
   as $stop_times$
   declare 
        nodename text := graph_name || '.' || 'stop_times';
   BEGIN
   execute 
   format ('select * from 
                        cypher(''%1$s'', 
                                $$match (t:trips),(s:stops)
                                                 where t.id=%2$s and s.id=%3$s 
and t.tc_date=%11$s and t.tc_date=s.tc_date
                                        create(t)<-
                                        [:PART_OF_TRIP]-
                                        (st:stop_times {tc_date: %11$s, 
arrival_time: %4$s, arrival_time_seconds: %5$s, departure_time: %6$s, 
departure_time_seconds: %7$s, 
                                                        stop_sequence: %8$s, 
pickup_type: %9$s, drop_off_type: %10$s})-
                                        [:LOCATED_AT]->(s)$$) as (st agtype);',
                        
quote_ident(graph_name),quote_ident(trip_id),quote_ident(stop_id),
                                        
quote_ident(arrival_time),to_char(arrival_time_sec,'999999'),
                                        
quote_ident(departure_time),to_char(departure_time_sec,'999999'),
                                        
to_char(stop_sequence,'9999'),to_char(pickup_type,'9'),to_char(drop_off_type,'9'),
                                        
format('"%s"',to_char(tc_date,'yyyy-mm-dd')));
   return nodename;
   END
   $stop_times$
   ;
   select 
create_stop_times('transport_network',a.tc_date,a.trip_id,a.stop_id,a.arrival_time,a.arrival_time_sec,a.departure_time,a.departure_time_sec,a.row_num,a.pickup_type,a.drop_off_type)
   from stop_times a
   ;
   ```
   This about 7:1 difference between the native SQL and the graph query. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@age.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to