# Re: [cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

```Randall Britten wrote:
>> However, in general, I haven't been able to find an efficient
>> (polynomial time) algorithm to compute this break-down (but I also
>> haven't yet proved that the problem is NP-complete, so there may be a
>> polynomial time solution even if P != NP).
>>
> Hi Andrew
>
> If possible, please outline the algorithm that you have found, even though
> it is not efficient.```
```
The most trivial algorithm is of course a simple depth first exhaustive
search. I believe this is what Justin already started implementing.

Firstly, I'll define the exact problem that I'm trying to solve in
formal language.

========
There are several ways to formulate the problem, but matrix notation
seems the cleanest.

Let A be an n*n matrix, where n is the number of equations in the model
(which haven't already been used), and is also the number of unknown
variables (which must be equal for the problem to be correctly constrained).

Let A_ij = 1 if the ith equation involves the jth unknown, and 0 otherwise.

The objective is to find the smallest possible non-empty set E such that
|E| = |V|, where V = {j where exists i in E such that A_ij = 1 }
=======

Algorithm findSmallestSystem(A, n):
for setSize from 1 through to n inclusive:
result <- findSystemOfSize(A, n, setSize, {}, 1)
if result is not failure:
return result
end
end
Not reached because there is always a system of size n.
end

Algorithm findSystemOfSize(A, n, addMembers, existingMembers,
lowestConsidered):
if (addMembers != 0):
result <- findSystemOfSize(A, n, addMembers - 1, existingMembers \
{i}, lowestConsidered + 1)
if result is not failure:
return result
end
result <- findSystemOfSize(A, n, addMembers, existingMembers,
lowestConsidered + 1)
if result is not failure:
return result
end
return failure because this branch is impossible.
end

V <- {}
for j from 1 through to n inclusive:
for i from 1 through to n inclusive:
if A_ij = 1:
V <- V union {j}
next iteration of the outer loop
end
end
end

if |V| = |existingMembers|:
return existingMembers
else
return failure
end
end

I am measuring the time complexity of this by the number of times we
check if A_ij = 1. In the worst case, which is that the only possible
system is the system of size n, we fail for systems of size 1 through to
n-1, and run to the end of the nth step before failing. This takes  n^2
* (choose(n, 1) + choose(n, 2) + ... + choose(n, n)) = n^2 * (2^n - 1)
steps.

We can optimise certain cases. For example, if we can isolate disjoint
sets of variables and equations using a disjoint sets algorithm, we can
do better on this case (this won't improve the worst case, because there
is no guarantee of this being possible).

We can also adapt the above algorithm to be branch-and-bound by noting
that even before we get to the end of a branch, if |V| > |E|, then it is
never going to shrink as a result of adding more equations so we can
close off that branch.

Another possibility might be to implement a beam search based on some
kind of graph connectivity heuristic. However, it is not clear if this
would actually improve real world problems.

Also, it is worth noting that the above algorithm builds up the equation
set. We can also attack the problem from the other end (i.e. a divisive
algorithm) by starting from the set of all equations, which means we
know that |V| = |E|, and removing single equations or pairs of equations
and so on from |E| to see if this causes |V| to shrink by the
corresponding amount.

What I suspect will be the best improvement for average world models
would combine the disjoint sets separator with alternating between
divisive (shrink until we fail) and build-up (grow until we succeed)
algorithm attempts. This might work for real world problems if we limit
the number of equations we try to add or remove before just returning
the best divisive algorithm result even though it may be suboptimal.
However, whether this is actually useful is something we would need to
try with a number of models to find out; we may find that we still need
the metadata as a fallback approach for models where the heuristic is
not useful.

Best regards,
Andrew

>
> Regards,
> Randall
>
> _______________________________________________
> cellml-discussion mailing list
> cellml-discussion@cellml.org
> http://www.cellml.org/mailman/listinfo/cellml-discussion

_______________________________________________
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion
```