jrgemignani commented on issue #2194: URL: https://github.com/apache/age/issues/2194#issuecomment-3145606964
@serdarmicrosoft I need to point out that these 2 commands are still not quite the same - ``` psql-17.5-5432-pgsql=# select count(*), u.id from goodreads_graph_x."User" u, goodreads_graph_x."HAS_INTERACTION" h, goodreads_graph_x."Book" b where u.id = h.start_id and b.id = h.end_id GROUP BY u.id ORDER BY 1 DESC LIMIT 10; count | id -------+----------------- 8889 | 844424930606649 8328 | 844424930455288 6882 | 844424930701142 6793 | 844424930595806 6165 | 844424930340731 5915 | 844424930542262 5899 | 844424930582002 5858 | 844424930295719 5699 | 844424930678986 5649 | 844424930222488 (10 rows) psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ MATCH (u:User)-[:HAS_INTERACTION]->(:Book) RETURN u.user_id, count(*) ORDER BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count agtype); user_id | interaction_count ------------------------------------+------------------- "bc7862cf7449815372fc58c8a817b488" | 8889 "806c0fdc5bed9757ed4d3ca8e3a13be5" | 8328 "e1f84cc029a12d6b57ad10432a0d3649" | 6882 "b83502f98865ff9ed70755404c353b56" | 6793 "52f5430e583a5e0043b3e6c83953b68a" | 6165 "a2ed685cf6398e72c5ec21a556489761" | 5915 "b2b770716941c4aab7227ac62230cc7f" | 5899 "410ffd0e48b0899ec40b54e571c4607d" | 5858 "d93961c6b197f33c3747ceba6fafbbdc" | 5699 "2406ed0afc9f589ec7bf0766242371e0" | 5649 (10 rows) ``` **u.id** and **user_id** are different. **u.id** is the id of the vertex, **user_id** is the id inside the vertex properties. This would be the comparable command - ``` psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ MATCH (u:User)-[:HAS_INTERACTION]->(:Book) RETURN count(*), id(u) ORDER BY count(*) DESC LIMIT 10 $$) AS (interaction_count agtype, user_id agtype); interaction_count | user_id -------------------+----------------- 8889 | 844424930606649 8328 | 844424930455288 6882 | 844424930701142 6793 | 844424930595806 6165 | 844424930340731 5915 | 844424930542262 5899 | 844424930582002 5858 | 844424930295719 5699 | 844424930678986 5649 | 844424930222488 (10 rows) ``` Here is the explain analyze for this command - ``` psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ explain analyze MATCH (u:User)-[:HAS_INTERACTION]->(:Book) RETURN count(*), id(u) ORDER BY count(*) DESC LIMIT 10 $$) AS (interaction_count agtype, user_id agtype); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------- Limit (cost=1007580.65..1007580.67 rows=10 width=64) (actual time=17823.193..17844.658 rows=10 loops=1) -> Sort (cost=1007580.65..1009192.36 rows=644686 width=64) (actual time=17823.191..17844.655 rows=10 loops=1) Sort Key: ((count(*))::agtype) DESC Sort Method: top-N heapsort Memory: 26kB -> Finalize GroupAggregate (cost=823871.40..993649.21 rows=644686 width=64) (actual time=17402.177..17766.446 rows=132661 loops=1) Group Key: (age_id(_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties))) -> Gather Merge (cost=823871.40..974308.63 rows=1289372 width=40) (actual time=17402.160..17558.219 rows=133027 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=822871.37..824483.09 rows=644686 width=40) (actual time=17394.384..17400.814 rows=44342 loops=3) Sort Key: (age_id(_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties))) Sort Method: quicksort Memory: 3516kB Worker 0: Sort Method: quicksort Memory: 3379kB Worker 1: Sort Method: quicksort Memory: 3438kB -> Partial HashAggregate (cost=674786.11..743035.36 rows=644686 width=40) (actual time=17102.089..17116.448 rows=44342 loops=3) Group Key: age_id(_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties)) Planned Partitions: 16 Batches: 1 Memory Usage: 6673kB Worker 0: Batches: 1 Memory Usage: 6417kB Worker 1: Batches: 1 Memory Usage: 6417kB -> Parallel Hash Join (cost=23994.82..342748.46 rows=4166747 width=32) (actual time=2721.135..13781.338 rows=3333333 loops=3) Hash Cond: (_age_default_alias_0.start_id = u.id) -> Parallel Hash Join (cost=2374.89..219877.18 rows=4166747 width=8) (actual time=19.124..1877.394 rows=3333333 loops=3 ) Hash Cond: (_age_default_alias_0.end_id = _age_default_alias_1.id) -> Parallel Seq Scan on "HAS_INTERACTION" _age_default_alias_0 (cost=0.00..184515.47 rows=4166747 width=16) (actu al time=0.034..505.455 rows=3333333 loops=3) -> Parallel Hash (cost=1888.44..1888.44 rows=38916 width=8) (actual time=18.173..18.174 rows=31133 loops=3) Buckets: 131072 Batches: 1 Memory Usage: 4736kB -> Parallel Index Only Scan using idx_book_id on "Book" _age_default_alias_1 (cost=0.29..1888.44 rows=38916 width=8) (actual time=0.104..7.185 rows=31133 loops=3) Heap Fetches: 0 -> Parallel Hash (cost=13802.19..13802.19 rows=268619 width=110) (actual time=118.915..118.916 rows=214895 loops=3) Buckets: 65536 (originally 65536) Batches: 32 (originally 16) Memory Usage: 9792kB -> Parallel Seq Scan on "User" u (cost=0.00..13802.19 rows=268619 width=110) (actual time=0.045..42.617 rows=2148 95 loops=3) Planning Time: 0.956 ms Execution Time: 17847.382 ms (33 rows) psql-17.5-5432-pgsql=# ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org