Re: ml models distribution
Hi Sean,

On Fri, Jul 22, 2016 at 12:52 PM, Sean Owen wrote:
>
> If you mean, how do you distribute a new model in your application,
> then there's no magic to it. Just reference the new model in the
> functions you're executing in your driver.
>
> If you implemented some other manual way of deploying model info, just
> do that again. There's no special thing to know.

Well, because some models are huge, we typically bundle the logic (pipeline/application) and the models separately. Normally we use a shared store (e.g., HDFS) or a coordinated distribution of the models. But I wanted to know whether there is any infrastructure in Spark that specifically addresses such a need.

Thanks.

Cheers,

P.S.: Sorry Jacek, by "ml" I meant "Machine Learning". I thought it was a fairly widespread acronym. Sorry for the possible confusion.

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co
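For the shared-store approach mentioned above, one common pattern is to publish each model under a versioned path and have the driver resolve the newest version at startup, so the pipeline code itself never changes when a model is upgraded. A minimal sketch in plain Python — the directory layout, the `v<number>` naming scheme, and `latest_model_path` are illustrative assumptions, not anything Spark provides:

```python
import os

def latest_model_path(models_root):
    """Resolve the newest model version under a shared store.

    Assumes each published model lives in a subdirectory named
    v<number> (e.g. v1, v2, ...), as you might lay it out on HDFS.
    """
    versions = [d for d in os.listdir(models_root)
                if d.startswith("v") and d[1:].isdigit()]
    if not versions:
        raise FileNotFoundError("no model versions under %s" % models_root)
    # Compare numerically so v10 sorts after v2.
    newest = max(versions, key=lambda d: int(d[1:]))
    return os.path.join(models_root, newest)
```

On HDFS you would list the directory through the Hadoop FileSystem API rather than `os.listdir`, but the versioning idea is the same: the driver resolves the path and loads the model from it (e.g. with an MLlib model's `load`), and every node reads from the shared store.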
ml models distribution
Hi,

I have one question: how is ML model distribution done across all the nodes of a Spark cluster? I'm thinking about scenarios where the pipeline implementation does not necessarily need to change, but the models have been upgraded.

Thanks in advance. Best regards,

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co
Re: processing 50 gb data using just one machine
In theory yes... common sense says that volume / resources = time, so more volume on the same processing resources will just take more time.

On Jun 15, 2016 6:43 PM, "spR" wrote:
> I have 16 gb ram, i7
>
> Will this config be able to handle the processing without my ipython
> notebook dying?
>
> The local mode is for testing purposes. But I do not have any cluster at
> my disposal. So can I make this work with the configuration that I have?
> Thank you.
>
> On Jun 15, 2016 9:40 AM, "Deepak Goel" wrote:
>
>> What do you mean by "EFFICIENTLY"?
>>
>> Hey
>>
>> Namaskara~Nalama~Guten Tag~Bonjour
>>
>> --
>> Keigu
>>
>> Deepak
>> 73500 12833
>> www.simtree.net, dee...@simtree.net
>> deic...@gmail.com
>>
>> LinkedIn: www.linkedin.com/in/deicool
>> Skype: thumsupdeicool
>> Google talk: deicool
>> Blog: http://loveandfearless.wordpress.com
>> Facebook: http://www.facebook.com/deicool
>>
>> "Contribute to the world, environment and more :
>> http://www.gridrepublic.org"
>>
>> On Wed, Jun 15, 2016 at 9:33 PM, spR wrote:
>>
>>> Hi,
>>>
>>> can I use spark in local mode using 4 cores to process 50gb data
>>> efficiently?
>>>
>>> Thank you
>>>
>>> misha
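The "volume / resources = time" intuition above can be made a little more concrete with back-of-envelope numbers. The 128 MB split size and the round numbers below are illustrative assumptions, not measured figures:

```python
def processing_waves(data_gb, split_mb=128, cores=4):
    """Rough estimate of how many sequential 'waves' of tasks a
    local[cores] run needs to work through the data, assuming
    HDFS-style input splits and one task per core at a time."""
    partitions = (data_gb * 1024) // split_mb
    waves = -(-partitions // cores)  # ceiling division
    return int(partitions), int(waves)

partitions, waves = processing_waves(50)
print(partitions, waves)  # 400 partitions, 100 waves on 4 cores
```

So the job can run with 4 cores and 16 GB of RAM as long as each 128 MB partition fits in memory one at a time; it simply takes ~100 waves of tasks instead of finishing in a few waves on a cluster.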
Re: ImportError: No module named numpy
On Thu, Jun 2, 2016 at 9:59 AM, Bhupendra Mishra wrote:
>
> and I have already exported the environment variable in spark-env.sh as
> follows... the error is still there: ImportError: No module named numpy
>
> export PYSPARK_PYTHON=/usr/bin/python

According to the documentation at
http://spark.apache.org/docs/latest/configuration.html#environment-variables
the PYSPARK_PYTHON environment variable is for pointing to the Python
interpreter binary. If you check the programming guide
https://spark.apache.org/docs/0.9.0/python-programming-guide.html#installing-and-configuring-pyspark
it says you need to add your custom path to PYTHONPATH (the script
automatically adds bin/pyspark there). So typically on Linux you would need
to add the following (assuming you installed numpy there):

export PYTHONPATH=$PYTHONPATH:/usr/lib/python2.7/dist-packages

Hope that helps.

> On Thu, Jun 2, 2016 at 12:04 AM, Julio Antonio Soto de Vicente <
> ju...@esbet.es> wrote:
>
>> Try adding to spark-env.sh (renaming it if you still have it with
>> .template at the end):
>>
>> PYSPARK_PYTHON=/path/to/your/bin/python
>>
>> Where your bin/python is your actual Python environment with numpy
>> installed.
>>
>> On 1 Jun 2016, at 20:16, Bhupendra Mishra wrote:
>>
>> I have numpy installed, but where should I set up PYTHONPATH?
>>
>> On Wed, Jun 1, 2016 at 11:39 PM, Sergio Fernández wrote:
>>
>>> sudo pip install numpy
>>>
>>> On Wed, Jun 1, 2016 at 5:56 PM, Bhupendra Mishra <
>>> bhupendra.mis...@gmail.com> wrote:
>>>
>>>> Thanks.
>>>> How can this be resolved?
>>>>
>>>> On Wed, Jun 1, 2016 at 9:02 PM, Holden Karau wrote:
>>>>
>>>>> Generally this means numpy isn't installed on the system, or your
>>>>> PYTHONPATH has somehow gotten pointed somewhere odd.
>>>>>
>>>>> On Wed, Jun 1, 2016 at 8:31 AM, Bhupendra Mishra <
>>>>> bhupendra.mis...@gmail.com> wrote:
>>>>>
>>>>>> If anyone can please help me with the following error:
>>>>>>
>>>>>> File
>>>>>> "/opt/mapr/spark/spark-1.6.1/python/lib/pyspark.zip/pyspark/mllib/__init__.py",
>>>>>> line 25, in
>>>>>>
>>>>>> ImportError: No module named numpy
>>>>>>
>>>>>> Thanks in advance!
>>>>>
>>>>> --
>>>>> Cell : 425-233-8271
>>>>> Twitter: https://twitter.com/holdenkarau
>>>
>>> --
>>> Sergio Fernández
>>> Partner Technology Manager
>>> Redlink GmbH
>>> m: +43 6602747925
>>> e: sergio.fernan...@redlink.co
>>> w: http://redlink.co

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co
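The PYTHONPATH advice in this thread boils down to: the directory containing the package must be on the interpreter's module search path on every node. A small stand-in experiment shows the mechanism — `fakenumpy` here is a hypothetical module playing the role of numpy, and `sys.path.append` does at runtime what exporting PYTHONPATH does at interpreter startup:

```python
import os
import sys
import tempfile

# Create a stand-in "package" in a non-default location, mimicking
# a numpy installed somewhere the interpreter doesn't look.
pkg_dir = tempfile.mkdtemp()
with open(os.path.join(pkg_dir, "fakenumpy.py"), "w") as f:
    f.write("version = '1.0'\n")

try:
    import fakenumpy            # not on the search path yet
except ImportError:
    pass                        # same failure mode as the numpy error above

sys.path.append(pkg_dir)        # what PYTHONPATH achieves at startup
import fakenumpy                # now it resolves
print(fakenumpy.version)        # prints 1.0
```

On a real cluster you can run the equivalent check on the executors, e.g. `sc.parallelize(range(2)).map(lambda _: __import__("numpy").__version__).collect()`, which tells you whether the workers (not just the driver) can import numpy.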
Re: ImportError: No module named numpy
sudo pip install numpy

On Wed, Jun 1, 2016 at 5:56 PM, Bhupendra Mishra wrote:
> Thanks.
> How can this be resolved?
>
> On Wed, Jun 1, 2016 at 9:02 PM, Holden Karau wrote:
>
>> Generally this means numpy isn't installed on the system, or your
>> PYTHONPATH has somehow gotten pointed somewhere odd.
>>
>> On Wed, Jun 1, 2016 at 8:31 AM, Bhupendra Mishra <
>> bhupendra.mis...@gmail.com> wrote:
>>
>>> If anyone can please help me with the following error:
>>>
>>> File
>>> "/opt/mapr/spark/spark-1.6.1/python/lib/pyspark.zip/pyspark/mllib/__init__.py",
>>> line 25, in
>>>
>>> ImportError: No module named numpy
>>>
>>> Thanks in advance!
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co