Re: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray

2015-02-11 Thread Kartik Kumar Perisetla
Hi David, thanks for your response. But I can't install anything on the cluster. Could anyone please help me understand how the file 'multiarray.so' is used by the tagger? I mean, how is it loaded (I assume it's some sort of DLL on Windows and a shared library on Unix-based systems)? Is it a module or
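
For what it's worth, multiarray is a compiled C extension, so Python loads it the same way it loads any extension module: as a shared library (.so on Unix, .pyd on Windows). A minimal sketch of how to confirm this from the interpreter (the printed path will differ per installation):

    import numpy.core.multiarray as multiarray
    # Extension modules report the shared library they were loaded from:
    print(multiarray.__file__)
    # e.g. /usr/lib/python2.7/dist-packages/numpy/core/multiarray.so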

Re: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray

2015-02-11 Thread Daπid
On 11 February 2015 at 08:06, Kartik Kumar Perisetla wrote: > Thanks, David. But do I need to install virtualenv on every node in the Hadoop > cluster? Actually I am not very sure whether the same namenodes are assigned for > every Hadoop job. So how shall I proceed in such a scenario? I have never used

Re: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray

2015-02-10 Thread Kartik Kumar Perisetla
Thanks, David. But do I need to install virtualenv on every node in the Hadoop cluster? Actually I am not very sure whether the same namenodes are assigned for every Hadoop job. So how shall I proceed in such a scenario? Thanks for your inputs. Kartik On Feb 11, 2015 1:56 AM, "Daπid" wrote: > On 11 Febr
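
If the set of task nodes varies from job to job, one option (an assumption on my part, not something confirmed in this thread) is to skip per-node installs entirely: build the virtualenv once, zip it, and let Hadoop streaming ship and unpack it on each task node via the distributed cache. A hedged sketch, where the paths and the 'env' link name are hypothetical:

    # Build the archive once on the gateway/login machine:
    import shutil
    shutil.make_archive("env", "zip", "/home/kartik/env")  # hypothetical env path
    # Then submit the job with something along the lines of (flags from the
    # standard streaming docs; the layout after unpacking should be verified):
    #   hadoop jar hadoop-streaming.jar -archives env.zip#env \
    #       -mapper 'env/bin/python mapper.py' -file mapper.py ...

One caveat: virtualenvs are not always relocatable, so whether the shipped interpreter actually runs on the task nodes needs testing.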

Re: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray

2015-02-10 Thread Daπid
On 11 February 2015 at 03:38, Kartik Kumar Perisetla wrote: > Also, I don't have root access, thus can't install numpy or any other > package on the cluster You can create a virtualenv and install packages into it without needing root access. To minimize trouble, you can ensure it uses the system pack
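
Concretely, this suggestion might look like the following (a minimal sketch, assuming the 'virtualenv' tool is already on the machine; the install path is a hypothetical example):

    import subprocess
    # Create an environment in $HOME that falls back to the system
    # site-packages (--system-site-packages), so an existing system numpy
    # is reused rather than rebuilt:
    subprocess.check_call(
        ["virtualenv", "--system-site-packages", "/home/kartik/env"])
    # Only needed if the system numpy is missing or too old:
    subprocess.check_call(["/home/kartik/env/bin/pip", "install", "numpy"])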

[Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray

2015-02-10 Thread Kartik Kumar Perisetla
Hi all, for one of my projects I am basically using NLTK for POS tagging, which internally uses an 'english.pickle' file. I managed to package the nltk library with these pickle files to make them available to the mapper and reducer for a Hadoop streaming job using the -file option. However, when nltk
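
For context, a streaming mapper along these lines is what this setup implies (a hedged sketch, not the poster's actual code; it assumes the shipped nltk package and pickle data are unpacked into the task's working directory):

    #!/usr/bin/env python
    import os
    import sys

    # Files shipped with -file land in the task's current working directory,
    # so make both the nltk package and its pickle data findable there:
    sys.path.insert(0, os.getcwd())
    import nltk
    nltk.data.path.insert(0, os.getcwd())

    # Tag each input line and emit word<TAB>tag pairs:
    for line in sys.stdin:
        for word, tag in nltk.pos_tag(nltk.word_tokenize(line)):
            print("{0}\t{1}".format(word, tag))

Note that nltk.pos_tag pulls in numpy, which is why the missing multiarray surfaces only once the tagger runs on the cluster.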