You would be better off opening a new thread with your own error message. A small change in the error message can point to a different cause.
Fred

On Fri, Aug 5, 2016 at 4:30 PM, <[email protected]> wrote:
> Hi, Tassilo,
> Have you solved this problem?
> I am encountering the same problem at present and have no idea how to fix it.
> I just run some separate programs at the same time, and then function
> compilation produces a similar error.
> Thanks a lot for any ideas.
>
> On Monday, March 9, 2015 at 1:51:30 PM UTC-4, Tassilo Klein wrote:
>>
>> Hi Fred,
>>
>> I just wanted to follow up on this issue as I am sort of deadlocked with
>> this problem now. Do you see any solution?
>> Btw, the non-thread safety issue of Theano has not been fixed yet, right?
>> Because it sounds somewhat related to that.
>>
>> -Tassilo
>>
>> On Mon, Mar 2, 2015 at 7:07 PM, Tassilo Klein <[email protected]> wrote:
>>
>>> Hi Fred,
>>>
>>> yes, master and slave have the same directory. Sometimes it works
>>> better, sometimes worse - feels like a race condition. It is weird. Here
>>> are stack-traces for different error outputs.
>>>
>>> Cheers,
>>> Tassilo
>>>
>>>
>>> 15/03/02 18:55:57 WARN TaskSetManager: Lost task 64.0 in stage 7.19 (TID
>>> 17710, node004.cm.cluster): org.apache.spark.api.python.PythonException:
>>> Traceback (most recent call last):
>>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line
>>> 101, in main
>>> process()
>>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line
>>> 96, in process
>>> serializer.dump_stream(func(split_index, iterator), outfile)
>>> File "/scratch/users/tjklein/215178/spark/python/pyspark/serializers.py",
>>> line 236, in dump_stream
>>> vs = list(itertools.islice(iterator, batch))
>>> File "/home/tjklein/cnn-3d/spark_LORM_refactored.py", line 1046, in
>>> distributed_gradient_computation
>>> return broadcast_ADMM_gradient_function.value(*param_list)
>>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py",
>>> line 106, in value
>>> self._value = self.load(self._path)
>>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py",
>>> line
95, in load >>> return cPickle.loads(data) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 747, in _constructor_Function >>> f = maker.create(input_storage, trustme = True) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 1415, in create >>> input_storage=input_storage_lists) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/link.py", >>> line 525, in make_thunk >>> output_storage=output_storage)[:3] >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/vm.py", line >>> 897, in make_all >>> no_recycling)) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 1002, in make_thunk >>> compute_map, no_recycling) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 739, in make_thunk >>> output_storage=node_output_storage) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1072, in make_thunk >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1014, in __compile__ >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1441, in cthunk_factory >>> key=key, lnk=self, keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 1076, in module_from_key >>> module = lnk.compile_cmodule(location) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1353, in compile_cmodule >>> preargs=preargs) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 1889, in compile_str >>> raise MissingGXX("g++ not available! 
We can't compile c code.") >>> MissingGXX: (MissingGXX('The following error happened while compiling the >>> node', Elemwise{neg,no_inplace}(y), '\n', "g++ not available! We can't >>> compile c code.", '[Elemwise{neg,no_inplace}(y)]'), <function >>> _constructor_Function at 0x2aaacc32d230>, >>> (<theano.compile.function_module.FunctionMaker object at 0x2aab2a53ff10>, >>> [<None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <array([[ >>> 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], 
>>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>], [None, None, None, None, None, None, >>> None, None, None, None, None, None, None, None, None, None, None, None, >>> None, None, None, None, None, None, None, None, None, None, None, None, >>> None, None, None, None, None, None, None, None, None, None, None, None, >>> None, None, None, None, None, None, None, None, None, None, None, None, >>> None, None, None, None, None, None, None, None, 
array([[ 0., 0., 0., 0., >>> 0.], >>> >>> >>> aceback (most recent call last): >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line >>> 101, in main >>> process() >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line >>> 96, in process >>> serializer.dump_stream(func(split_index, iterator), outfile) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/serializers.py", >>> line 236, in dump_stream >>> vs = list(itertools.islice(iterator, batch)) >>> File "/home/tjklein/cnn-3d/spark_LORM_refactored.py", line 1046, in >>> distributed_gradient_computation >>> return broadcast_ADMM_gradient_function.value(*param_list) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py", >>> line 106, in value >>> self._value = self.load(self._path) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py", >>> line 95, in load >>> return cPickle.loads(data) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 747, in _constructor_Function >>> f = maker.create(input_storage, trustme = True) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 1415, in create >>> input_storage=input_storage_lists) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/link.py", >>> line 525, in make_thunk >>> output_storage=output_storage)[:3] >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/vm.py", line >>> 897, in make_all >>> no_recycling)) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 1002, in make_thunk >>> compute_map, no_recycling) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 739, in make_thunk >>> output_storage=node_output_storage) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1072, in make_thunk >>> 
keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1014, in __compile__ >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1441, in cthunk_factory >>> key=key, lnk=self, keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 1054, in module_from_key >>> module = self._get_from_hash(module_hash, key, keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 954, in _get_from_hash >>> module = self._get_from_key(None, key_data) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 949, in _get_from_key >>> return self._get_module(name) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 632, in _get_module >>> self.module_from_name[name] = dlimport(name) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 293, in dlimport >>> rval = __import__(module_name, {}, {}, [module_name]) >>> ImportError: (ImportError('The following error happened while compiling the >>> node', Elemwise{add,no_inplace}(TensorConstant{(1,) of -1}, >>> AdvancedSubtensor1.0), '\n', 'No module named >>> tmpuawGWX.7ecbb18ed585719993b1efa9e2a60ff5'), <function >>> _constructor_Function at 0x2aaacc3570c8>, >>> (<theano.compile.function_module.FunctionMaker object at 0x2aab2a314110>, >>> [<None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, 
<None>, <None>, >>> <None>, <None>, <None>, <None>, <None>, <None>, <None>, <None>, <array([[ >>> 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 
0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>, <array([[ 0., 0., 0., 0., 0.], >>> [ 1., 0., 0., 0., 0.], >>> [ 1., 1., 0., 0., 0.], >>> [ 1., 1., 1., 0., 0.], >>> [ 1., 1., 1., 1., 0.], >>> [ 1., 1., 1., 1., 1.]])>] >>> >>> >>> 15/03/02 18:56:42 WARN TaskSetManager: Lost task 56.1 in stage 7.19 (TID >>> 17719, node004.cm.cluster): org.apache.spark.api.python.PythonException: >>> Traceback (most recent call last): >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line >>> 101, in main >>> process() >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line >>> 96, in process >>> serializer.dump_stream(func(split_index, iterator), outfile) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/serializers.py", >>> line 236, in dump_stream >>> vs = list(itertools.islice(iterator, batch)) >>> File "/home/tjklein/cnn-3d/spark_LORM_refactored.py", line 1046, in >>> distributed_gradient_computation >>> return broadcast_ADMM_gradient_function.value(*param_list) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py", >>> line 106, in value >>> self._value = self.load(self._path) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py", >>> line 95, in load >>> return cPickle.loads(data) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 747, in _constructor_Function 
>>> f = maker.create(input_storage, trustme = True) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 1415, in create >>> input_storage=input_storage_lists) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/link.py", >>> line 525, in make_thunk >>> output_storage=output_storage)[:3] >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/vm.py", line >>> 897, in make_all >>> no_recycling)) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 1002, in make_thunk >>> compute_map, no_recycling) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 739, in make_thunk >>> output_storage=node_output_storage) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1072, in make_thunk >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1014, in __compile__ >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1441, in cthunk_factory >>> key=key, lnk=self, keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 1045, in module_from_key >>> module = self._get_from_key(key) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 949, in _get_from_key >>> return self._get_module(name) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 632, in _get_module >>> self.module_from_name[name] = dlimport(name) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 293, in dlimport >>> rval = __import__(module_name, {}, {}, [module_name]) >>> ImportError: (ImportError('The following error happened while compiling the >>> node', 
Elemwise{add,no_inplace}(TensorConstant{(1,) of -1}, >>> AdvancedSubtensor1.0), '\n', 'No module named >>> tmpuawGWX.7ecbb18ed585719993b1efa9e2a60ff5'), < >>> >>> >>> 15/03/02 18:57:02 INFO TaskSetManager: Lost task 51.1 in stage 7.19 (TID >>> 17724) on executor node004.cm.cluster: >>> org.apache.spark.api.python.PythonException (Traceback (most recent call >>> last): >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line >>> 101, in main >>> process() >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/worker.py", line >>> 96, in process >>> serializer.dump_stream(func(split_index, iterator), outfile) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/serializers.py", >>> line 236, in dump_stream >>> vs = list(itertools.islice(iterator, batch)) >>> File "/home/tjklein/cnn-3d/spark_LORM_refactored.py", line 1046, in >>> distributed_gradient_computation >>> return broadcast_ADMM_gradient_function.value(*param_list) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py", >>> line 106, in value >>> self._value = self.load(self._path) >>> File "/scratch/users/tjklein/215178/spark/python/pyspark/broadcast.py", >>> line 95, in load >>> return cPickle.loads(data) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 747, in _constructor_Function >>> f = maker.create(input_storage, trustme = True) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", >>> line 1415, in create >>> input_storage=input_storage_lists) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/link.py", >>> line 525, in make_thunk >>> output_storage=output_storage)[:3] >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/vm.py", line >>> 897, in make_all >>> no_recycling)) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line >>> 739, in make_thunk >>> 
output_storage=node_output_storage) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1072, in make_thunk >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1014, in __compile__ >>> keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cc.py", line >>> 1441, in cthunk_factory >>> key=key, lnk=self, keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 1054, in module_from_key >>> module = self._get_from_hash(module_hash, key, keep_lock=keep_lock) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 957, in _get_from_hash >>> key_data.add_key(key, save_pkl=bool(key[0])) >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 467, in add_key >>> self.save_pkl() >>> File >>> "/home/tjklein/anaconda/lib/python2.7/site-packages/theano/gof/cmodule.py", >>> line 484, in save_pkl >>> with open(self.key_pkl, 'wb') as f: >>> IOError: [Errno [Errno 2] No such file or directory: >>> '/scratch/users/tjklein/215178/theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/tmpQrM4nE/key.pkl'] >>> >>> >>> >>> On Mon, Mar 2, 2015 at 11:21 AM, Frédéric Bastien <[email protected] >>> > wrote: >>> >>>> Hi, >>>> >>>> Can you give the full stack trace of the error? >>>> >>>> Do you use the same compiledir for the master and all process when it >>>> is on the same compute node? >>>> >>>> Fred >>>> >>>> On Sun, Mar 1, 2015 at 5:17 PM, Tassilo Klein <[email protected]> wrote: >>>> >>>>> Hi Fred, >>>>> >>>>> I got the latest version of Theano from the repository. I ran it on a >>>>> single machine and also tried it on a distributed cluster. But still have >>>>> the same issues as before, in both scenarios. 
>>>>> >>>>> I get many: >>>>> >>>>> INFO (theano.gof.compilelock): To manually release the lock, delete >>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2. >>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir >>>>> >>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process >>>>> '252635' (I am process '249796') >>>>> >>>>> INFO (theano.gof.compilelock): To manually release the lock, delete >>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2. >>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir >>>>> >>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process >>>>> '252635' (I am process '238059') >>>>> >>>>> INFO (theano.gof.compilelock): To manually release the lock, delete >>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2. >>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir >>>>> >>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process >>>>> '252635' (I am process '237960') >>>>> >>>>> INFO (theano.gof.compilelock): To manually release the lock, delete >>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2. >>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir >>>>> >>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process >>>>> '252635' (I am process '238179') >>>>> >>>>> INFO (theano.gof.compilelock): To manually release the lock, delete >>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2. >>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir >>>>> >>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process >>>>> '252635' (I am process '247811') >>>>> >>>>> INFO (theano.gof.compilelock): To manually release the lock, delete >>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2. 
>>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir
>>>>>
>>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process
>>>>> '252635' (I am process '237684')
>>>>>
>>>>> INFO (theano.gof.compilelock): To manually release the lock, delete
>>>>> /scratch/users/tjklein/STANDALONE/theano/compiledir_Linux-2.
>>>>> 6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/lock_dir
>>>>>
>>>>> INFO (theano.gof.compilelock): Waiting for existing lock by process
>>>>> '252635' (I am process '251284')
>>>>>
>>>>> and
>>>>>
>>>>> IOError: [Errno [Errno 2] No such file or directory:
>>>>> '/scratch/users/STANDALONE/theano/compiledir_Linux-2.6-el6.
>>>>> x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7.9-64/tmptlW68q/key.pkl']
>>>>> <function _constructor_Function at 0x2aaacc1b7b90>:
>>>>> (<theano.compile.function_module.FunctionMaker object at
>>>>> 0x2aab24139b50>,
>>>>>
>>>>>
>>>>>
>>>>> Let me know if you want me to try out something else.
>>>>>
>>>>> Best,
>>>>>
>>>>> Tassilo
>>>>>
>>>>> On Wed, Feb 25, 2015 at 5:39 PM, Frédéric Bastien <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> This could cause duplicate compilation and warnings related to that.
>>>>>> Can you try with an up-to-date Theano without your workaround?
>>>>>> On Feb 22, 2015 at 09:16, "György Solymosi" <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Fred,
>>>>>>>
>>>>>>> Sorry to jump in, but I had some experience with the same problem,
>>>>>>> so I simply switched off the lock and had no issues (at least
>>>>>>> apparently) when running several Theano tasks in parallel, even four
>>>>>>> others. Is that plausible, or was it just an illusion, and did it
>>>>>>> cause issues in the background that I didn't see?
>>>>>>>
>>>>>>> George
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> ---
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "theano-users" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> For more options, visit https://groups.google.com/d/optout.
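
For readers who arrive here with the same compilelock messages and missing key.pkl errors: the question quoted above about whether all processes share one compiledir suggests an experiment, namely giving each worker process its own compile directory. The following is only a minimal sketch of that idea, and nothing in this thread confirms that it resolves the errors. It relies on the THEANO_FLAGS environment variable and the base_compiledir config option, which Theano reads when it is first imported. The helper names, the directory naming scheme, and the use of the system temp directory are hypothetical choices made for illustration.

```python
import os
import tempfile


def per_process_theano_flags(root=None, pid=None):
    """Return a THEANO_FLAGS fragment pointing at a per-process compile dir.

    Hypothetical helper: `root` defaults to the system temp directory and the
    directory name is derived from the process id.
    """
    root = root or tempfile.gettempdir()
    pid = os.getpid() if pid is None else pid
    compiledir = os.path.join(root, "theano_compiledir_%d" % pid)
    return "base_compiledir=%s" % compiledir


def configure_environment(environ=None, **kwargs):
    """Append the per-process flag to THEANO_FLAGS in `environ`.

    Must be called before the first `import theano` in the process, because
    the flags are read at import time.
    """
    environ = os.environ if environ is None else environ
    new_flag = per_process_theano_flags(**kwargs)
    existing = environ.get("THEANO_FLAGS", "")
    # THEANO_FLAGS is a comma-separated list of key=value pairs.
    environ["THEANO_FLAGS"] = ",".join(f for f in (existing, new_flag) if f)
    return environ["THEANO_FLAGS"]
```

In a worker, one would call configure_environment() at the very top of the task function and import theano only afterwards. The trade-off is that each process then compiles its own modules instead of reusing a shared cache, so startup gets slower and disk usage grows.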
