serdarmicrosoft commented on issue #2194: URL: https://github.com/apache/age/issues/2194#issuecomment-3104155140
Sorry @jrgemignani I missed your question. Here is the output of the queries: EXPLAIN select count(*), u.id from "User" u, "HAS_INTERACTION" h, "Book" b where u.id = h.start_id and b.id = h.end_id GROUP BY u.id ORDER BY 1 DESC LIMIT 10; `QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=728449.72..728449.74 rows=10 width=16) -> Sort (cost=728449.72..730061.43 rows=644686 width=16) Sort Key: (count(*)) DESC -> Finalize HashAggregate (cost=695479.90..714518.28 rows=644686 width=16) Group Key: u.id Planned Partitions: 16 -> Gather (cost=470126.84..638062.55 rows=1289372 width=16) Workers Planned: 2 -> Partial HashAggregate (cost=469126.84..508125.35 rows=644686 width=16) Group Key: u.id Planned Partitions: 8 -> Parallel Hash Join (cost=15770.68..299858.22 rows=4166612 width=8) Hash Cond: (h.start_id = u.id) -> Parallel Hash Join (cost=1875.90..219375.78 rows=4166612 width=8) Hash Cond: (h.end_id = b.id) -> Parallel Seq Scan on "HAS_INTERACTION" h (cost=0.00..184514.12 rows=4166612 width=16) -> Parallel Hash (cost=1389.45..1389.45 rows=38916 width=8) -> Parallel Index Only Scan using "Book_id_idx" on "Book" b (cost=0.29..1389.45 rows=38916 width=8) -> Parallel Hash (cost=9487.05..9487.05 rows=268619 width=8) -> Parallel Index Only Scan using "User_id_idx" on "User" u (cost=0.42..9487.05 rows=268619 width=8) (20 rows)` SELECT * FROM cypher('goodreads_graph', $$ EXPLAIN MATCH (u:User)-[:HAS_INTERACTION]->() RETURN u.user_id, count(*) ORDER BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count agtype); `QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=34254481.74..34254481.76 rows=10 width=64) -> Sort (cost=34254481.74..34256093.45 rows=644686 width=64) Sort Key: ((count(*))::agtype) DESC -> GroupAggregate (cost=1109643.97..34240550.31 rows=644686 width=64) Group Key: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('1332417'::oid, u.id), u.properties), '"user_id"'::agtype]) -> Nested Loop (cost=1109643.97..21754040.28 rows=2494723262 width=32) -> Gather Merge (cost=1109643.96..2274293.50 rows=9999869 width=118) Workers Planned: 2 -> Sort (cost=1108643.94..1119060.47 rows=4166612 width=118) Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('1332417'::oid, u.id), u.properties), '"user_id"'::agtype])) -> Parallel Hash Join (cost=21655.93..284305.72 rows=4166612 width=118) Hash Cond: (_age_default_alias_0.start_id = u.id) -> Parallel Seq Scan on "HAS_INTERACTION" _age_default_alias_0 (cost=0.00..184514.12 rows=4166612 width=16) -> Parallel Hash (cost=13838.19..13838.19 rows=268619 width=110) -> Parallel Seq Scan on "User" u (cost=0.00..13838.19 rows=268619 width=110) -> Memoize (cost=0.01..1.54 rows=5 width=8) Cache Key: _age_default_alias_0.end_id Cache Mode: logical -> Append (cost=0.00..1.53 rows=5 width=8) -> Seq Scan on _ag_label_vertex _age_default_alias_1_1 (cost=0.00..0.00 rows=1 width=8) Filter: (id = _age_default_alias_0.end_id) -> Index Only Scan using "User_id_idx" on "User" _age_default_alias_1_2 (cost=0.42..0.44 rows=1 width=8) Index Cond: (id = _age_default_alias_0.end_id) -> Index Only Scan using "Book_id_idx" on "Book" _age_default_alias_1_3 (cost=0.29..0.31 rows=1 width=8) Index Cond: (id = _age_default_alias_0.end_id) -> Index Only Scan using "Author_id_idx" on "Author" _age_default_alias_1_4 (cost=0.29..0.31 rows=1 width=8) Index Cond: (id = _age_default_alias_0.end_id) -> Index Only Scan using "Review_id_idx" on "Review" _age_default_alias_1_5 (cost=0.43..0.45 rows=1 width=8) Index Cond: (id = _age_default_alias_0.end_id) (29 rows)` Here is the indexes I used: ```sql -- Index on 'id' for all vertex labels CREATE INDEX IF NOT EXISTS idx_user_id ON goodreads_graph."User" USING BTREE (id); CREATE INDEX IF NOT EXISTS idx_review_id ON goodreads_graph."Review" USING BTREE (id); CREATE INDEX IF NOT EXISTS idx_book_id ON goodreads_graph."Book" USING BTREE (id); CREATE INDEX IF NOT EXISTS idx_author_id ON goodreads_graph."Author" USING BTREE (id); CREATE INDEX ON goodreads_graph."User" USING GIN (properties); CREATE INDEX ON goodreads_graph."Review" USING GIN (properties); CREATE INDEX ON goodreads_graph."Book" USING GIN (properties); CREATE INDEX ON goodreads_graph."Author" USING GIN (properties); -- Index for WHERE clause: u.user_id = '...' CREATE INDEX IF NOT EXISTS idx_user_userid_op ON goodreads_graph."User" USING BTREE (agtype_access_operator(VARIADIC ARRAY[properties, '"user_id"'::agtype])); -- Index for WHERE clause: b.book_id = '...' CREATE INDEX IF NOT EXISTS idx_book_bookid_op ON goodreads_graph."Book" USING BTREE (agtype_access_operator(VARIADIC ARRAY[properties, '"book_id"'::agtype])); -- WRITTEN_BY edge: Review -[:WRITTEN_BY]-> User CREATE INDEX IF NOT EXISTS idx_written_by_start ON goodreads_graph."WRITTEN_BY" USING BTREE (start_id); CREATE INDEX IF NOT EXISTS idx_written_by_end ON goodreads_graph."WRITTEN_BY" USING BTREE (end_id); -- HAS_REVIEW edge: Book -[:HAS_REVIEW]-> Review CREATE INDEX IF NOT EXISTS idx_has_review_start ON goodreads_graph."HAS_REVIEW" USING BTREE (start_id); CREATE INDEX IF NOT EXISTS idx_has_review_end ON goodreads_graph."HAS_REVIEW" USING BTREE (end_id); -- Index for WHERE clause: r.review_id = '...' CREATE INDEX IF NOT EXISTS idx_review_reviewid_op ON goodreads_graph."Review" USING BTREE (agtype_access_operator(VARIADIC ARRAY[properties, '"review_id"'::agtype])); -- AUTHORED_BY edge: Book -[:AUTHORED_BY]-> Author CREATE INDEX IF NOT EXISTS idx_authored_by_start ON goodreads_graph."AUTHORED_BY" USING BTREE (start_id); CREATE INDEX IF NOT EXISTS idx_authored_by_end ON goodreads_graph."AUTHORED_BY" USING BTREE (end_id); -- HAS_INTERACTION edge: User -[:HAS_INTERACTION]-> Book CREATE INDEX IF NOT EXISTS idx_has_interaction_start ON goodreads_graph."HAS_INTERACTION" USING BTREE (start_id); CREATE INDEX IF NOT EXISTS idx_has_interaction_end ON goodreads_graph."HAS_INTERACTION" USING BTREE (end_id); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org