Trying to reply to Rob: Apologies if this does not end up in the thread (gmail is just driving me mad, I can't seem to receive messages, so I've subscribed again)
For some reason Limit 1 cause my query to go on for minutes without a response, which was not the case. The following query takes about 10.6 seconds, with a table that has about 20 million rows. First, the indices of the table with create statement: CREATE TABLE public.path_value ( val_string TEXT, feature_mapping_id INTEGER NOT NULL, parent_feature_mapping_id INTEGER, feature_name TEXT, rm_type_name TEXT, path INTEGER NOT NULL, payload_id INTEGER NOT NULL, id INTEGER NOT NULL, ehr_id INTEGER, CONSTRAINT path_value_pkey PRIMARY KEY(id) ) WITHOUT OIDS; CREATE INDEX path_value_idx ON public.path_value USING btree (rm_type_name COLLATE pg_catalog."default", feature_name COLLATE pg_catalog."default"); CREATE INDEX path_value_idx1 ON public.path_value USING btree (feature_mapping_id, parent_feature_mapping_id); CREATE INDEX path_value_idx2 ON public.path_value USING btree (payload_id); Now results of EXPLAIN ANALYZE: "QUERY PLAN" "Nested Loop Semi Join (cost=543422.34..571013.45 rows=5 width=4) (actual time=5854.267..10684.798 rows=82003 loops=1)" " Join Filter: (root.feature_mapping_id = anodeid.parent_feature_mapping_id)" " -> Merge Join (cost=543422.34..543426.29 rows=12 width=24) (actual time=5854.182..5953.531 rows=82003 loops=1)" " Merge Cond: ((root.feature_mapping_id = node1.parent_feature_mapping_id) AND (root.payload_id = node2.payload_id))" " -> Sort (cost=1150.65..1151.35 rows=283 width=12) (actual time=359.940..380.707 rows=82003 loops=1)" " Sort Key: root.feature_mapping_id, root.payload_id" " Sort Method: external merge Disk: 1768kB" " -> Index Scan using path_value_idx on path_value root (cost=0.00..1139.12 rows=283 width=12) (actual time=0.073..261.240 rows=82003 loops=1)" " Index Cond: ((rm_type_name = 'COMPOSITION'::text) AND (feature_name = 'composition'::text))" " -> Sort (cost=542271.69..542272.31 rows=247 width=12) (actual time=5494.236..5516.788 rows=82003 loops=1)" " Sort Key: node1.parent_feature_mapping_id, node2.payload_id" " Sort Method: external sort Disk: 2088kB" " -> HashAggregate (cost=542259.40..542261.87 rows=247 width=12) (actual time=5377.761..5394.453 rows=82003 loops=1)" " -> Hash Semi Join (cost=506297.37..542253.38 rows=1205 width=12) (actual time=2788.758..5325.838 rows=164004 loops=1)" " Hash Cond: ((node1.payload_id = node2.payload_id) AND (node1.feature_mapping_id = node2.parent_feature_mapping_id))" " -> Bitmap Heap Scan on path_value node1 (cost=77.61..9421.62 rows=2464 width=12) (actual time=76.628..2430.473 rows=246005 loops=1)" " Recheck Cond: ((rm_type_name = 'CONTENTITEM'::text) AND (feature_name = 'content'::text))" " -> Bitmap Index Scan on path_value_idx (cost=0.00..77.00 rows=2464 width=0) (actual time=75.640..75.640 rows=246005 loops=1)" " Index Cond: ((rm_type_name = 'CONTENTITEM'::text) AND (feature_name = 'content'::text))" " -> Hash (cost=498674.38..498674.38 rows=399092 width=8) (actual time=2711.653..2711.653 rows=410015 loops=1)" " Buckets: 4096 Batches: 16 Memory Usage: 1013kB" " -> Bitmap Heap Scan on path_value node2 (cost=10345.32..498674.38 rows=399092 width=8) (actual time=71.718..2607.467 rows=410015 loops=1)" " Recheck Cond: (rm_type_name = 'ITEMSTRUCTURE'::text)" " -> Bitmap Index Scan on path_value_idx (cost=0.00..10245.55 rows=399092 width=0) (actual time=69.634..69.634 rows=410015 loops=1)" " Index Cond: (rm_type_name = 'ITEMSTRUCTURE'::text)" " -> Index Scan using path_value_idx2 on path_value anodeid (cost=0.00..2298.91 rows=1 width=8) (actual time=0.057..0.057 rows=1 loops=82003)" " Index Cond: (payload_id = node2.payload_id)" " Filter: ((feature_name = 'archetypeNodeId'::text) AND (val_string = 'openEHR-EHR-COMPOSITION.discharge.v1'::text))" "Total runtime: 10692.277 ms" and finaly, the query that gave this result: EXPLAIN ANALYZE SELECT root.id from path_value as root WHERE root.rm_type_name = 'COMPOSITION' AND root.feature_name = 'composition' AND EXISTS (SELECT anodeid.id from path_value as anodeid WHERE anodeId.parent_feature_mapping_id = root.feature_mapping_id AND anodeId.payload_id = root.payload_id AND anodeId.feature_name = 'archetypeNodeId' AND anodeId.val_string = 'openEHR-EHR-COMPOSITION.discharge.v1' ) AND EXISTS (SELECT 1 from path_value as node1 WHERE node1.payload_id = root.payload_id AND node1.parent_feature_mapping_id = root.feature_mapping_id AND node1.feature_name = 'content' AND node1.rm_type_name = 'CONTENTITEM' AND EXISTS (SELECT 1 from path_value as node2 WHERE node2.payload_id = node1.payload_id AND node2.parent_feature_mapping_id = node1.feature_mapping_id AND node2.rm_type_name = 'ITEMSTRUCTURE' ) ) Is there a glaring error in my approach? Should I be better off with another SQL query, or Ltree/XPATH queries? Best regards Seref