Hi Rishi,

I've had success using the approach outlined here: https://community.hortonworks.com/articles/58418/running-pyspark-with-conda-env.html
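For reference, here is a rough sketch of the general pattern of shipping a packaged conda environment with a job instead of building a parcel. It is not copied from the linked article, and it assumes Spark on YARN in cluster mode. The archive name `pyarrow_env.zip`, the alias `conda_env`, the script `job.py`, and the helper `build_spark_submit` are all hypothetical, and the sketch assumes the zip has `bin/python` at its top level.

```python
import shlex


def build_spark_submit(env_archive, app_script, alias="conda_env"):
    """Build a spark-submit command that ships a zipped conda env to YARN.

    Hypothetical helper for illustration. Assumes the archive contains the
    environment's bin/ directory at its top level.
    """
    # YARN unpacks the archive into each container's working directory
    # under the alias given after '#'.
    python_in_env = f"./{alias}/bin/python"
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--archives", f"{env_archive}#{alias}",
        # Point the driver (application master) and the executors at the
        # Python interpreter inside the unpacked environment.
        "--conf", f"spark.yarn.appMasterEnv.PYSPARK_PYTHON={python_in_env}",
        "--conf", f"spark.executorEnv.PYSPARK_PYTHON={python_in_env}",
        app_script,
    ]


cmd = build_spark_submit("pyarrow_env.zip", "job.py")
print(" ".join(shlex.quote(part) for part in cmd))
```

The environment would need to include pyarrow and the other libraries before it is zipped. Whether this fits a Cloudera-managed cluster depends on how Spark is deployed there, so treat it as a starting point only.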
Does this work for you?

On Tue, Apr 30, 2019 at 12:32 AM Rishi Shah <rishishah.s...@gmail.com> wrote:

> modified the subject & would like to clarify that I am looking to create
> an anaconda parcel with pyarrow and other libraries, so that I can
> distribute it on the cloudera cluster..
>
> On Tue, Apr 30, 2019 at 12:21 AM Rishi Shah <rishishah.s...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have been trying to figure out a way to build anaconda parcel with
>> pyarrow included for my cloudera managed server for distribution but this
>> doesn't seem to work right. Could someone please help?
>>
>> I have tried to install anaconda on one of the management nodes on
>> cloudera cluster... tarred the directory, but this directory doesn't
>> include all the packages to form a proper parcel for distribution.
>>
>> Any help is much appreciated!
>>
>> --
>> Regards,
>>
>> Rishi Shah
>>
>
>
> --
> Regards,
>
> Rishi Shah
>

--
*Patrick McCarthy*
Senior Data Scientist, Machine Learning Engineering
Dstillery
470 Park Ave South, 17th Floor, NYC 10016