Re: [cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

2008-04-21 Thread Randall Britten
 However, in general, I haven't been able to find an efficient
 (polynomial time) algorithm to compute this break-down (but I also
 haven't yet proved that the problem is NP-complete, so there may be a
 polynomial time solution even if P != NP).
 
Hi Andrew

If possible, please outline the algorithm that you have found, even though
it is not efficient.  

Regards,
Randall

___
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion


Re: [cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

2008-04-21 Thread Andrew Miller
Randall Britten wrote:
 However, in general, I haven't been able to find an efficient
 (polynomial time) algorithm to compute this break-down (but I also
 haven't yet proved that the problem is NP-complete, so there may be a
 polynomial time solution even if P != NP).

 Hi Andrew
 
 If possible, please outline the algorithm that you have found, even though
 it is not efficient.

The most trivial algorithm is of course a simple depth first exhaustive
search. I believe this is what Justin already started implementing.

Firstly, I'll define the exact problem that I'm trying to solve in
formal language.


There are several ways to formulate the problem, but matrix notation
seems the cleanest.

Let A be an n*n matrix, where n is the number of equations in the model
(which haven't already been used), and is also the number of unknown
variables (which must be equal for the problem to be correctly constrained).

Let A_ij = 1 if the ith equation involves the jth unknown, and 0 otherwise.

The objective is to find the smallest possible non-empty set E such that
|E| = |V|, where V = {j where exists i in E such that A_ij = 1 }
===

Algorithm findSmallestSystem(A, n):
  for setSize from 1 through to n inclusive:
result - findSystemOfSize(A, n, setSize, {}, 1)
if result is not failure:
  return result
end
  end
  Not reached because there is always a system of size n.
end

Algorithm findSystemOfSize(A, n, addMembers, existingMembers,
lowestConsidered):
  if (addMembers != 0):
result - findSystemOfSize(A, n, addMembers - 1, existingMembers \
{i}, lowestConsidered + 1)
if result is not failure:
  return result
end
result - findSystemOfSize(A, n, addMembers, existingMembers,
lowestConsidered + 1)
if result is not failure:
  return result
end
return failure because this branch is impossible.
  end

  V - {}
  for j from 1 through to n inclusive:
for i from 1 through to n inclusive:
  if A_ij = 1:
V - V union {j}
next iteration of the outer loop
  end
end
  end

  if |V| = |existingMembers|:
return existingMembers
  else
return failure
  end
end

I am measuring the time complexity of this by the number of times we
check if A_ij = 1. In the worst case, which is that the only possible
system is the system of size n, we fail for systems of size 1 through to
n-1, and run to the end of the nth step before failing. This takes  n^2
* (choose(n, 1) + choose(n, 2) + ... + choose(n, n)) = n^2 * (2^n - 1)
steps.

We can optimise certain cases. For example, if we can isolate disjoint
sets of variables and equations using a disjoint sets algorithm, we can
do better on this case (this won't improve the worst case, because there
is no guarantee of this being possible).

We can also adapt the above algorithm to be branch-and-bound by noting
that even before we get to the end of a branch, if |V|  |E|, then it is
never going to shrink as a result of adding more equations so we can
close off that branch.

Another possibility might be to implement a beam search based on some
kind of graph connectivity heuristic. However, it is not clear if this
would actually improve real world problems.

Also, it is worth noting that the above algorithm builds up the equation
set. We can also attack the problem from the other end (i.e. a divisive
algorithm) by starting from the set of all equations, which means we
know that |V| = |E|, and removing single equations or pairs of equations
and so on from |E| to see if this causes |V| to shrink by the
corresponding amount.

What I suspect will be the best improvement for average world models
would combine the disjoint sets separator with alternating between
divisive (shrink until we fail) and build-up (grow until we succeed)
algorithm attempts. This might work for real world problems if we limit
the number of equations we try to add or remove before just returning
the best divisive algorithm result even though it may be suboptimal.
However, whether this is actually useful is something we would need to
try with a number of models to find out; we may find that we still need
the metadata as a fallback approach for models where the heuristic is
not useful.

Best regards,
Andrew

 
 Regards,
 Randall
 
 ___
 cellml-discussion mailing list
 cellml-discussion@cellml.org
 http://www.cellml.org/mailman/listinfo/cellml-discussion

___
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion


[cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

2008-04-15 Thread Andrew Miller
Hi all,

One issue which has recently been discussed amongst those of us working 
on CellML tools is the best way to handle models which require systems 
of equations to be solved efficiently. For example, if a model has 
equations like:

f1(a, b, c) = 0
f2(a, b, c) = 0
f3(a, c) = 0
f4(b, c, d, e) = 0
f5(b, d) = 0

We could simply solve the entire system in one go in a non-linear 
solver, over all five variables. However, this is not an efficient 
approach generally; we could have thousands of variables in a complex 
model, but only have one place in which we need to solve a system of two 
models. In this case, we could break the solution down to the following 
procedural steps:
Step 1) Solve the following system:
f1(a, b, c) = 0
f2(a, b, c) = 0
f3(a, c) = 0

to get a, b, and c.

Step 2) Solve the following equation:
f4(b, d) = 0
using the value of b obtained in the above step, to get d.

Step 3) Solve the following equation:
f4(b, c, d, e) = 0
using the known values of b, c, and d, to get e.

However, in general, I haven't been able to find an efficient 
(polynomial time) algorithm to compute this break-down (but I also 
haven't yet proved that the problem is NP-complete, so there may be a 
polynomial time solution even if P != NP).

We have several other approaches available to us, even if finding the 
exactly optimal break-down into systems is intractable:
  1) Use a heuristic algorithm which cannot always break down the 
system, but can for simple cases. For example, we could limit the size 
of systems allowed to control the complexity, or we could try some 
divisive algorithm which starts with the system which requires that all 
variables be solved, and tries to isolate individual equations or pairs 
or triples from it to see if this eliminates the need to solve for any 
variables, or perhaps tries both heuristics. We can also efficiently 
identify completely disconnected systems which don't share any common 
variables and solve them individually using a disjoint sets datatype.
  2) Require the model creator to provide us with this information. This 
is perhaps an unclean approach because certain kinds of changes to the 
model may change the types of systems that need to be solved. However, I 
understand it is essentially the status quo for most non-CellML 
modelling work which requires systems of equations to be expressed, so 
might still be a good last resort if we don't find a better solution (or 
even useful as a override for cases where the heuristic algorithm fails 
to find the optimal procedural steps).

I would be very grateful if anyone has feedback or suggestions on how to 
handle this issue.

Best regards,
Andrew
___
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion