[sympy] Re: Summations for machine learning applications

Danny Tarlow Mon, 29 Dec 2008 22:03:40 -0800

For concreteness, here is an example of one of my attempts:

from sympy import *


N = 5   # Number of data instances

HOnes = Matrix(ones((1,N)))
VOnes = Matrix(ones((N,1)))

# Allocate variational mean parameters and put in matrix
tau = []
for i in range(N):
        tau.append(Symbol('tau_' + str(i)))
Tau = Matrix(tau)

# Allocate model parameters
mu = Symbol('mu')
sigma = Symbol('sigma')

# Create L as a matrix with one row per term
L_matrix = Tau.applyfunc(lambda i: (i - mu) ** 2 / (2 * sigma))

print L_matrix

mu_update = solve((HOnes * L_matrix).diff(mu)[0], mu)
print "mu :=", mu_update

# The sigma term also needs to incorporate the normalization constant.  Add
# it in right before the differentiation.
sigma_update = solve(sigma**2 * ((-log(sigma)/2 + (HOnes *
L_matrix)[0]).diff(sigma)), sigma)
print "sigma :=", sigma_update

tau_0_update = solve((HOnes * L_matrix).diff(tau[0])[0], tau[0])
print "tau_0 :=", tau_0_update

The output is a mess:
mu := [tau_0/5 + tau_1/5 + tau_2/5 + tau_3/5 + tau_4/5]
sigma := [2*mu*tau_0 + 2*mu*tau_1 + 2*mu*tau_2 + 2*mu*tau_3 +
2*mu*tau_4 - tau_0**2 - tau_1**2 - tau_2**2 - tau_3**2 - tau_4**2 -
5*mu**2]
tau_0 := [mu]

Ideally, I'd like to get something more like this, where the sums are
left intact:
mu := 1/5 * \sum_{i=0}^4 tau_i
sigma := 1/5 * \sum_{i=0}^4 (mu - tau_i) ** 2
tau_0 := [mu]
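
One way to keep the sums symbolic is sketched below. It is a minimal
sketch, assuming a recent SymPy where Sum and IndexedBase are available;
it is not the code from this thread. The names i, k, N and check_N are
illustrative, and the mu update is written down by hand from the
linearity of the sum rather than obtained from solve.

```python
from sympy import IndexedBase, Sum, simplify, symbols

i, k = symbols('i k', integer=True)          # summation index and one fixed index
N = symbols('N', integer=True, positive=True)
mu, sigma = symbols('mu sigma')
tau = IndexedBase('tau')                      # tau[i] stands for tau_i

# L = sum_i (tau_i - mu)**2 / (2*sigma), kept as an unevaluated Sum
L = Sum((tau[i] - mu)**2 / (2*sigma), (i, 0, N - 1))

# Differentiating with respect to mu keeps the Sum; nothing is expanded.
dL_dmu = L.diff(mu)

# Stationarity gives N*mu = sum_i tau_i, so the update is written by hand:
mu_update = Sum(tau[i], (i, 0, N - 1)) / N

# Differentiating with respect to one tau[k] introduces a KroneckerDelta
# that selects the single relevant term of the sum.
dL_dtau = L.diff(tau[k])

# Sanity check for a small concrete N: the update zeroes the gradient.
check_N = 5
explicit = dL_dmu.subs(N, check_N).doit()
mu_star = mu_update.subs(N, check_N).doit()
residual = simplify(explicit.subs(mu, mu_star))

print("dL/dmu :=", dL_dmu)
print("mu :=", mu_update)
print("dL/dtau_k :=", dL_dtau)
print("residual for N = 5:", residual)
```

The expressions stay the same size no matter how large N is; only the
check at the end expands anything.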

For extra credit, there would also be a way to substitute sufficient
statistics for the \sum_{i=0}^4 tau_i terms, for example, so that the
sample mean could be computed once, then used many times in the
updates.
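
The sufficient-statistics idea can be sketched by replacing the sums
with plain symbols before differentiating. This is a minimal sketch
under stated assumptions: S1 and S2 are hypothetical names for
\sum_i tau_i and \sum_i tau_i**2, N is the number of data instances, and
the normalization term is taken to be N*log(sigma)/2, which is not the
term used in the code above.

```python
from sympy import log, simplify, solve, symbols

mu, sigma = symbols('mu sigma')
# Hypothetical sufficient statistics: S1 = sum_i tau_i, S2 = sum_i tau_i**2
S1, S2 = symbols('S1 S2')
N = symbols('N', positive=True)   # number of data instances

# sum_i (tau_i - mu)**2 written in terms of the statistics
Q = S2 - 2*mu*S1 + N*mu**2
L = Q / (2*sigma)

# The mu update only needs S1 and N.
mu_update = solve(L.diff(mu), mu)

# Assumed normalization term N*log(sigma)/2 added before differentiating.
sigma_update = solve((N*log(sigma)/2 + L).diff(sigma), sigma)

print("mu :=", mu_update)
print("sigma :=", sigma_update)
```

S1 and S2 would be computed once from the data and substituted into the
updates, so no update ever touches the individual tau_i.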

Thanks in advance.

Danny



On Mon, Dec 29, 2008 at 1:58 PM, Danny <[email protected]> wrote:
> Hi,
>
> I'm interested in using Sympy to automatically derive updates for a
> general specification of some machine learning problems (specifically
> variational EM algorithms).
>
> Most likelihood functions I'm concerned with follow some variation of
> the form:
> L = \sum_{i=1}^N \frac{(\mu - \tau_{i})^2}{2 \sigma^2}
>
> Where there is one set of model parameters for each _variable_ (\mu
> and \sigma), and one set of variational parameters for each
> _data_instance_ (\tau_{i}).
>
> I want to do the following:
> mu_update = solve(L.diff('mu'), 'mu')
> tau_i_update = solve(L.diff('tau_i'), 'tau_i')
>
> I'd like to express L in sympy such that these calls will give me
> proper updates in a compact form (there will often be tens of
> thousands of data instances, so I don't ever want summations expanded
> in the mu_update, for example).  I'd also like to be able to
> differentiate with respect to some arbitrary tau, which will discard
> all of the other irrelevant terms in the summation.
>
> Is this possible to do with Sympy?  If I were interested in working on
> something like this, how hard would it be for a sympy newbie (but
> competent programmer) to implement?  Any tips on where to start?
>
> Thanks,
> Danny

