[theano-users] text generator LSTM

Lien Hoang Mon, 22 Aug 2016 11:32:17 -0700

I am trying to create a word generator using probabilities for a LSTM 
model. The problem is every time it runs it gives me the same output words 
regardless of what the input text is and the sentences produced does not 
make much sense either.



"""
Define output and cost function
logsoftmax output unit as in the torch code
as note in the rnn-theano code, T.nnet.softmax will not operate on 
T.tensor3 types, only matrices
We take our n_steps x n_seq x n_classes output from the net
and reshape it into a (n_steps * n_seq) x n_classes matrix
apply softmax, then reshape back
"""
y_p_m = T.reshape(y_vals, (y_vals.shape[0] * y_vals.shape[1], -1))
y_p_s = T.nnet.softmax(y_p_m) # pred
#off = 1e-8
# for the PTB language model, the cost is the perplexity
y_f = y.flatten(ndim=1) # y is n_seq x n_steps
cost = -T.mean(T.log(y_p_s)[T.arange(y_p_s.shape[0]),y_f])

#cost = -T.log(y_p_s[T.arange(y_p_s.shape[0]),y_f] + off).mean()
#p_y_given_x = T.nnet.softmax(y_vals)
#p_y_given_x_sentence = y_f[:, 0, :]
self.p_dist = theano.function([x],y_p_s, on_unused_input='ignore')

def random_generator(probs):
xk = xrange(10000)
custm = stats.rv_discrete(name='custm', values=(xk,probs))
return custm.rvs(size=1)

# idxss = T.ivector()
# prediction = theano.function([idxss],self.y_f)

def next_word(text, vocab_map, index2word, 
seq_length,length,p_dist,vocab_size):
words = text.split()
for j in xrange(20):
#idxs = vocab_map[1]
idxs = [vocab_map[w] for w in words]
vocab_id = 
np.zeros((idxs[1],seq_length,vocab_size)).astype(theano.config.floatX)
for i in xrange(length):
prob_dist = p_dist(np.asarray(vocab_id).astype('int32'))
#prob_dist = np.asarray((vocab_id)).astype('int32')
next_index = random_generator(prob_dist[-1,:])
idxs.append(next_index[0])
print [index2word[index] for index in idxs]

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

[theano-users] text generator LSTM

Reply via email to