Folks --
I am stuck in version hell, I suppose, and need some help running Theano
on my Mac.
Here is the config:
Mac OS: 10.12.3 (16D32)
python 2.7 (I use anaconda but I have tried the /usr-version that comes
with the OS as well -- same result)
theano: Theano (0.9.0rc1)
lasagne: Lasagne (0.2.dev1)
numpy: numpy (1.12.0)
Xcode: Version 7.3 (7D175)
clang:
Apple LLVM version 7.3.0 (clang-703.0.29)
Target: x86_64-apple-darwin16.4.0
Thread model: posix
Unfortunately, I cannot downgrade Xcode (clang) because I need it for other
projects.
So, here is what happens. I am working on LSTM models and have narrowed my
version problem down as follows: using the code provided
here: http://colinraffel.com/talks/hammer2015recurrent.pdf (Python file
attached), the call to theano.function (line 154 in the attached file)
results in the following error message (excerpt):
======/ SNIP /============
Problem occurred during compilation with the command line below:
/usr/bin/clang++ -dynamiclib -g -O3 -fno-math-errno -Wno-unused-label
-Wno-unused-variable -Wno-write-strings -march=haswell
-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -fPIC -undefined
dynamic_lookup
-I/Users/thomas/anaconda/lib/python2.7/site-packages/numpy/core/include
-I/Users/thomas/anaconda/include/python2.7
-I/Users/thomas/anaconda/lib/python2.7/site-packages/theano/gof
-L/Users/thomas/anaconda/lib -fvisibility=hidden -o
/Users/thomas/.theano/compiledir_Darwin-16.4.0-x86_64-i386-64bit-i386-2.7.13-64/tmp5GtYpU/39a151e745f8754653c9e8ca5ea9cf75.so
/Users/thomas/.theano/compiledir_Darwin-16.4.0-x86_64-i386-64bit-i386-2.7.13-64/tmp5GtYpU/mod.cpp
/Users/thomas/.theano/compiledir_Darwin-16.4.0-x86_64-i386-64bit-i386-2.7.13-64/tmp5GtYpU/mod.cpp:894:21:
warning: comparison of array 'outputs' equal to a null pointer is always
false [-Wtautological-pointer-compare]
if (outputs == NULL) {
^~~~~~~ ~~~~
/Users/thomas/.theano/compiledir_Darwin-16.4.0-x86_64-i386-64bit-i386-2.7.13-64/tmp5GtYpU/mod.cpp:919:54:
error: arithmetic on a pointer to void
PyArray_DATA(V3) + data_offset,
~~~~~~~~~~~~~~~~ ^
1 warning and 1 error generated.
Traceback (most recent call last):
File "lstm_baseline.py", line 154, in <module>
train = theano.function([l_in.input_var, target_values,
l_mask.input_var], cost, updates=updates)
File
"/Users/thomas/anaconda/lib/python2.7/site-packages/theano/compile/function.py",
line 326, in function
output_keys=output_keys)
File
"/Users/thomas/anaconda/lib/python2.7/site-packages/theano/compile/pfunc.py",
line 486, in pfunc
output_keys=output_keys)
======/ SNAP /============
Later on in the output:
======/ SNIP /============
Exception: ('The following error happened while compiling the node',
Split{4}(Assert{msg='Theano Assert failed!'}.0, TensorConstant{1},
MakeVector{dtype='int64'}.0), '\n', "Compilation failed (return status=1):
/Users/thomas/.theano/compiledir_Darwin-16.4.0-x86_64-i386-64bit-i386-2.7.13-64/tmp5GtYpU/mod.cpp:894:21:
warning: comparison of array 'outputs' equal to a null pointer is always
false [-Wtautological-pointer-compare]. if (outputs ==
NULL) {. ^~~~~~~ ~~~~.
/Users/thomas/.theano/compiledir_Darwin-16.4.0-x86_64-i386-64bit-i386-2.7.13-64/tmp5GtYpU/mod.cpp:919:54:
error: arithmetic on a pointer to void.
PyArray_DATA(V3) + data_offset,.
~~~~~~~~~~~~~~~~ ^. 1 warning and 1 error generated.. ", '[*1 ->
Split{4}(<TensorType(float64, matrix)>, TensorConstant{1},
<TensorType(int64, vector)>), *1::1, *1::2, *1::3]')
======/ SNAP /============
Now, I know that this code works because we have it up and running on
another Mac with the same configuration APART from the clang version, which
there is:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin14.5.0
Thread model: posix
The code also runs fine on a recent Ubuntu box.
==> So, it looks like the newer clang version is to blame for being more
pedantic with the generated C code: the NULL-pointer comparison is only a
warning, but the "arithmetic on a pointer to void" is now treated as a hard
error.
Now, this behaviour is a little tricky to debug (to say the least). I have
quadruple-checked the Python code and it looks fine. The fact that it runs
on two of our other machines, and that the original author (Colin Raffel --
all kudos to him for the tutorial code) uses it in his tutorial, tells me
that my suspicion is not entirely wrong.
I tried installing an older version of the Xcode command line tools
(essentially the C compiler) on my box, to run in parallel to the current
version (which I need for other projects), but that did not work out.
According to the Apple developer forum I would need to install a complete
(!) older Xcode package, which is not really an option.
So, does anyone have a clue / experience / advice on this?
Many thanks!
Thomas
--
---
You received this message because you are subscribed to the Google Groups
"theano-users" group.
# coding: utf-8
# In[1]:
from __future__ import print_function
import theano
import theano.tensor as T
import lasagne
import numpy as np
import sklearn.datasets
import os
import matplotlib.pyplot as plt
#get_ipython().magic(u'matplotlib inline')
# In[2]:
# Min/max sequence length
MIN_LENGTH = 50
MAX_LENGTH = 55
# Number of units in the hidden (recurrent) layer
N_HIDDEN = 100
# Number of training sequences in each batch
N_BATCH = 100
# Optimization learning rate
LEARNING_RATE = .001
# All gradients above this will be clipped
GRAD_CLIP = 100
# How often should we check the output?
EPOCH_SIZE = 100
# Number of epochs to train the net
NUM_EPOCHS = 10
def gen_data(min_length=MIN_LENGTH, max_length=MAX_LENGTH, n_batch=N_BATCH):
    '''
    Generate a batch of sequences for the "add" task, e.g. the target for the
    following

    ``| 0.5 | 0.7 | 0.3 | 0.1 | 0.2 | ... | 0.5 | 0.9 | ... | 0.8 | 0.2 |
      |  0  |  0  |  1  |  0  |  0  |     |  0  |  1  |     |  0  |  0  |``

    would be 0.3 + .9 = 1.2. This task was proposed in [1]_ and explored in
    e.g. [2]_.

    Parameters
    ----------
    min_length : int
        Minimum sequence length.
    max_length : int
        Maximum sequence length.
    n_batch : int
        Number of samples in the batch.

    Returns
    -------
    X : np.ndarray
        Input to the network, of shape (n_batch, max_length, 2), where the last
        dimension corresponds to the two sequences shown above.
    y : np.ndarray
        Correct output for each sample, shape (n_batch,).
    mask : np.ndarray
        A binary matrix of shape (n_batch, max_length) where ``mask[i, j] = 1``
        when ``j <= (length of sequence i)`` and ``mask[i, j] = 0`` when ``j >
        (length of sequence i)``.

    References
    ----------
    .. [1] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory."
        Neural computation 9.8 (1997): 1735-1780.
    .. [2] Sutskever, Ilya, et al. "On the importance of initialization and
        momentum in deep learning." Proceedings of the 30th international
        conference on machine learning (ICML-13). 2013.
    '''
    # Generate X - we'll fill the last dimension later
    X = np.concatenate([np.random.uniform(size=(n_batch, max_length, 1)),
                        np.zeros((n_batch, max_length, 1))],
                       axis=-1)
    mask = np.zeros((n_batch, max_length))
    y = np.zeros((n_batch,))
    # Compute masks and correct values
    for n in range(n_batch):
        # Randomly choose the sequence length
        length = np.random.randint(min_length, max_length)
        # Make the mask for this sample 1 within the range of length
        mask[n, :length] = 1
        # Zero out X after the end of the sequence
        X[n, length:, 0] = 0
        # Set the second dimension to 1 at the indices to add
        X[n, np.random.randint(length/10), 1] = 1
        X[n, np.random.randint(length/2, length), 1] = 1
        # Multiply and sum the dimensions of X to get the target value
        y[n] = np.sum(X[n, :, 0]*X[n, :, 1])
    # Center the inputs and outputs
    X -= X.reshape(-1, 2).mean(axis=0)
    y -= y.mean()
    return (X.astype(theano.config.floatX), y.astype(theano.config.floatX),
            mask.astype(theano.config.floatX))
# In[3]:
#from recurrent import gen_data
# By setting the first and second dimensions to None, we allow
# arbitrary minibatch sizes with arbitrary sequence lengths.
# The number of feature dimensions is 2, as described above.
l_in = lasagne.layers.InputLayer(shape=(None, None, 2))
# This input will be used to provide the network with masks.
# Masks are expected to be matrices of shape (n_batch, n_time_steps);
# both of these dimensions are variable for us, so we will use
# an input shape of (None, None)
l_mask = lasagne.layers.InputLayer(shape=(None, None))
# Our LSTM will have 10 hidden/cell units
N_HIDDEN = 10
l_lstm = lasagne.layers.recurrent.LSTMLayer(
    # We need to specify a separate input for masks
    l_in, N_HIDDEN, mask_input=l_mask,
    # The per-gate parameters (ingate, forgetgate, cell, outgate) are left
    # at their Lasagne defaults here.
    # We'll learn the initialization and use gradient clipping
    learn_init=True, grad_clipping=100.)
# In[4]:
l_lstm_back = lasagne.layers.recurrent.LSTMLayer(
    l_in, N_HIDDEN,
    mask_input=l_mask, backwards=True)
l_sum = lasagne.layers.ElemwiseSumLayer([l_lstm, l_lstm_back])
# In[5]:
# First, retrieve symbolic variables for the input shape
n_batch, n_time_steps, n_features = l_in.input_var.shape
# Now, squash the n_batch and n_time_steps dimensions
l_reshape = lasagne.layers.ReshapeLayer(l_sum, (-1, N_HIDDEN))
# Now, we can apply feed-forward layers as usual.
# We want the network to predict a single value, the sum, so we'll use a single unit.
l_dense = lasagne.layers.DenseLayer(
    l_reshape, num_units=1, nonlinearity=lasagne.nonlinearities.tanh)
# Now, the shape will be (n_batch*n_timesteps, 1). We can then reshape to
# (n_batch, n_timesteps) to get a single value for each timestep from each sequence
l_out = lasagne.layers.ReshapeLayer(l_dense, (n_batch, n_time_steps))
# In[6]:
target_values = T.vector('target_output')
#mask = T.matrix('mask')
network_output = lasagne.layers.get_output(l_out)
predicted_values = network_output[:, -1]
cost = T.mean((predicted_values - target_values)**2)
all_params = lasagne.layers.get_all_params(l_out)
updates = lasagne.updates.adam(cost, all_params)
train = theano.function([l_in.input_var, target_values, l_mask.input_var], cost, updates=updates)
compute_cost = theano.function([l_in.input_var, target_values, l_mask.input_var], cost)
# In[ ]:
X_val, y_val, mask_val = gen_data()
NUM_EPOCHS = 10
EPOCH_SIZE = 100
for epoch in range(NUM_EPOCHS):
    for _ in range(EPOCH_SIZE):
        X, y, m = gen_data()
        train(X, y, m)
    cost_val = compute_cost(X_val, y_val, mask_val)
    print("Epoch {} validation cost = {}".format(epoch + 1, cost_val))