I think you need a union there instead of a union all. Otherwise you're double 
(or more) counting leaves where there is more than 1 path to get to the leaf.

I don't have a large dataset to test it on, but how about something like:

create table nodes
(
  id integer primary key,
  description text
);

create table edges
(
  parent int not null references nodes,
  child int not null references nodes,
  primary key (parent, child),
  check (parent != child)
) without rowid;
create index reverseEdges on edges (child, parent);

create view leafCounts as with recursive
leaves (id) as (
  select nodes.id
  from nodes left outer join edges
  on nodes.id = edges.parent
  where edges.parent is null
),
paths (parent, child) as (
  select parent, child from edges
  union
  select paths.parent, edges.child
  from paths inner join edges
  on paths.child = edges.parent
)
select parent, count(*) as leafCount
from paths
where child in leaves
group by parent;


-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Shane Dev
Sent: Monday, January 01, 2018 11:14 AM
To: SQLite mailing list
Subject: [sqlite] Efficient query to count number of leaves in a DAG.

Hi,

I want to the count the number of leaves (descendants without children) for
each node in a DAG

DAG definition -

CREATE TABLE nodes(id integer primary key, description text);
CREATE TABLE edges(parent not null references nodes, child not null
references nodes, primary key(parent, child));

My query -

CREATE VIEW v_count_leaves as with recursive r(id, top) as (
select id, id from nodes
union all
select t.id, top from nodes as t, edges as e, r where e.parent=r.id and t.id
=e.child)
select top, count(*) from r where top<>id and id not in (select parent from
edges where parent=id) group by top;

It seems to work but is complex to understand and debug despite my aim to
keep it simple as possible, but more importantly - it is very slow when
there are more than a few thousand nodes and edges.

It there a more efficient (and ideally simpler) way?
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to