Re: [cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

2008-04-21 Thread Andrew Miller
Andrew Miller wrote:
> Randall Britten wrote:
>>> However, in general, I haven't been able to find an efficient
>>> (polynomial time) algorithm to compute this break-down (but I also
>>> haven't yet proved that the problem is NP-complete, so there may be a
>>> polynomial time solution even if P != NP).
>>>
>> Hi Andrew
>>
>> If possible, please outline the algorithm that you have found, even though
>> it is not efficient.
> 
> The most trivial algorithm is of course a simple depth first exhaustive
> search. I believe this is what Justin already started implementing.
> 
> Firstly, I'll define the exact problem that I'm trying to solve in
> formal language.
> 
> 
> There are several ways to formulate the problem, but matrix notation
> seems the cleanest.
> 
> Let A be an n*n matrix, where n is the number of equations in the model
> (which haven't already been used), and is also the number of unknown
> variables (which must be equal for the problem to be correctly constrained).
> 
> Let A_ij = 1 if the ith equation involves the jth unknown, and 0 otherwise.
> 
> The objective is to find the smallest possible non-empty set E such that
> |E| = |V|, where V = {j where exists i in E such that A_ij = 1 }
> ===
> 
> Algorithm findSmallestSystem(A, n):
>   for setSize from 1 through to n inclusive:
> result <- findSystemOfSize(A, n, setSize, {}, 1)
> if result is not failure:
>   return result
> end
>   end
>   Not reached because there is always a system of size n.
> end
> 
> Algorithm findSystemOfSize(A, n, addMembers, existingMembers,
> lowestConsidered):
>   if (addMembers != 0):
> result <- findSystemOfSize(A, n, addMembers - 1, existingMembers \
> {i}, lowestConsidered + 1)

Of course, we don't have i here, I meant lowestConsidered

> if result is not failure:
>   return result
> end
> result <- findSystemOfSize(A, n, addMembers, existingMembers,
> lowestConsidered + 1)
> if result is not failure:
>   return result
> end
> return failure because this branch is impossible.
>   end
> 
>   V <- {}
>   for j from 1 through to n inclusive:
> for i from 1 through to n inclusive:
>   if A_ij = 1:
> V <- V union {j}
> next iteration of the outer loop
>   end
> end
>   end
> 
>   if |V| = |existingMembers|:
> return existingMembers
>   else
> return failure
>   end
> end
> 
> I am measuring the time complexity of this by the number of times we
> check if A_ij = 1. In the worst case, which is that the only possible
> system is the system of size n, we fail for systems of size 1 through to
> n-1, and run to the end of the nth step before failing. This takes  n^2
> * (choose(n, 1) + choose(n, 2) + ... + choose(n, n)) = n^2 * (2^n - 1)
> steps.
> 
> We can optimise certain cases. For example, if we can isolate disjoint
> sets of variables and equations using a disjoint sets algorithm, we can
> do better on this case (this won't improve the worst case, because there
> is no guarantee of this being possible).
> 
> We can also adapt the above algorithm to be branch-and-bound by noting
> that even before we get to the end of a branch, if |V| > |E|, then it is
> never going to shrink as a result of adding more equations so we can
> close off that branch.
> 
> Another possibility might be to implement a beam search based on some
> kind of graph connectivity heuristic. However, it is not clear if this
> would actually improve real world problems.
> 
> Also, it is worth noting that the above algorithm builds up the equation
> set. We can also attack the problem from the other end (i.e. a divisive
> algorithm) by starting from the set of all equations, which means we
> know that |V| = |E|, and removing single equations or pairs of equations
> and so on from |E| to see if this causes |V| to shrink by the
> corresponding amount.
> 
> What I suspect will be the best improvement for average world models

I meant real world models.

> would combine the disjoint sets separator with alternating between
> divisive (shrink until we fail) and build-up (grow until we succeed)
> algorithm attempts. This might work for real world problems if we limit
> the number of equations we try to add or remove before just returning
> the best divisive algorithm result even though it may be suboptimal.
> However, whether this is actually useful is something we would need to
> try with a number of models to find out; we may find that we still need
> the metadata as a fallback approach for models where the heuristic is
> not useful.
> 
> Best regards,
> Andrew
> 
>> Regards,
>> Randall
>>
>> ___
>> cellml-discussion mailing list
>> cellml-discussion@cellml.org
>> http://www.cellml.org/mailman/listinfo/cellml-discussion
> 
> ___
> cellml-discussion mailing list
> cellml-discussion@cellml.org
> http://www.cellml.org/mailman/listinfo/cellml-discussion


Re: [cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

2008-04-21 Thread Andrew Miller
Randall Britten wrote:
>> However, in general, I haven't been able to find an efficient
>> (polynomial time) algorithm to compute this break-down (but I also
>> haven't yet proved that the problem is NP-complete, so there may be a
>> polynomial time solution even if P != NP).
>>
> Hi Andrew
> 
> If possible, please outline the algorithm that you have found, even though
> it is not efficient.

The most trivial algorithm is of course a simple depth first exhaustive
search. I believe this is what Justin already started implementing.

Firstly, I'll define the exact problem that I'm trying to solve in
formal language.


There are several ways to formulate the problem, but matrix notation
seems the cleanest.

Let A be an n*n matrix, where n is the number of equations in the model
(which haven't already been used), and is also the number of unknown
variables (which must be equal for the problem to be correctly constrained).

Let A_ij = 1 if the ith equation involves the jth unknown, and 0 otherwise.

The objective is to find the smallest possible non-empty set E such that
|E| = |V|, where V = {j where exists i in E such that A_ij = 1 }
===

Algorithm findSmallestSystem(A, n):
  for setSize from 1 through to n inclusive:
result <- findSystemOfSize(A, n, setSize, {}, 1)
if result is not failure:
  return result
end
  end
  Not reached because there is always a system of size n.
end

Algorithm findSystemOfSize(A, n, addMembers, existingMembers,
lowestConsidered):
  if (addMembers != 0):
result <- findSystemOfSize(A, n, addMembers - 1, existingMembers \
{i}, lowestConsidered + 1)
if result is not failure:
  return result
end
result <- findSystemOfSize(A, n, addMembers, existingMembers,
lowestConsidered + 1)
if result is not failure:
  return result
end
return failure because this branch is impossible.
  end

  V <- {}
  for j from 1 through to n inclusive:
for i from 1 through to n inclusive:
  if A_ij = 1:
V <- V union {j}
next iteration of the outer loop
  end
end
  end

  if |V| = |existingMembers|:
return existingMembers
  else
return failure
  end
end

I am measuring the time complexity of this by the number of times we
check if A_ij = 1. In the worst case, which is that the only possible
system is the system of size n, we fail for systems of size 1 through to
n-1, and run to the end of the nth step before failing. This takes  n^2
* (choose(n, 1) + choose(n, 2) + ... + choose(n, n)) = n^2 * (2^n - 1)
steps.

We can optimise certain cases. For example, if we can isolate disjoint
sets of variables and equations using a disjoint sets algorithm, we can
do better on this case (this won't improve the worst case, because there
is no guarantee of this being possible).

We can also adapt the above algorithm to be branch-and-bound by noting
that even before we get to the end of a branch, if |V| > |E|, then it is
never going to shrink as a result of adding more equations so we can
close off that branch.

Another possibility might be to implement a beam search based on some
kind of graph connectivity heuristic. However, it is not clear if this
would actually improve real world problems.

Also, it is worth noting that the above algorithm builds up the equation
set. We can also attack the problem from the other end (i.e. a divisive
algorithm) by starting from the set of all equations, which means we
know that |V| = |E|, and removing single equations or pairs of equations
and so on from |E| to see if this causes |V| to shrink by the
corresponding amount.

What I suspect will be the best improvement for average world models
would combine the disjoint sets separator with alternating between
divisive (shrink until we fail) and build-up (grow until we succeed)
algorithm attempts. This might work for real world problems if we limit
the number of equations we try to add or remove before just returning
the best divisive algorithm result even though it may be suboptimal.
However, whether this is actually useful is something we would need to
try with a number of models to find out; we may find that we still need
the metadata as a fallback approach for models where the heuristic is
not useful.

Best regards,
Andrew

> 
> Regards,
> Randall
> 
> ___
> cellml-discussion mailing list
> cellml-discussion@cellml.org
> http://www.cellml.org/mailman/listinfo/cellml-discussion

___
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion


Re: [cellml-discussion] Identifying smaller subsystems of simultaneous equations in differential-algebraic models

2008-04-21 Thread Randall Britten
> However, in general, I haven't been able to find an efficient
> (polynomial time) algorithm to compute this break-down (but I also
> haven't yet proved that the problem is NP-complete, so there may be a
> polynomial time solution even if P != NP).
> 
Hi Andrew

If possible, please outline the algorithm that you have found, even though
it is not efficient.  

Regards,
Randall

___
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion