Re: installation of spark

2019-06-05 Thread Alonso Isidoro Roman
When using macOS, it is recommended to install Java, Scala and Spark using
brew.

Run these commands on a terminal:

brew update

brew install scala

brew install sbt

brew cask install java

brew install spark


There is no need to install HDFS; you can use your local file system
without a problem.
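
Once brew finishes, a quick way to check that Spark can read from the local
file system (just a sketch: the file path below is a placeholder, and it assumes
the Spark 2.x that brew installs, where a spark session is available in
spark-shell):

spark-shell

scala> val lines = spark.read.textFile("file:///Users/yourname/some-local-file.txt")

scala> lines.count()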


*How to set JAVA_HOME on Mac OS X temporarily*

   1. Open *Terminal*.
   2. Confirm you have a JDK by typing "which java".
   3. Check you have the needed version of Java by typing "java -version".
   4. *Set JAVA_HOME* using this command in *Terminal*: export JAVA_HOME=/Library/Java/Home
   5. Type echo $JAVA_HOME in *Terminal* to confirm the path.
   6. You should now be able to run your application.


*How to set JAVA_HOME on Mac OS X permanently*

$ vim ~/.bash_profile

Add this line to the file:

export JAVA_HOME=$(/usr/libexec/java_home)

Then reload the profile and confirm:

$ source ~/.bash_profile

$ echo $JAVA_HOME
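
If you prefer a one-shot version instead of editing the file in vim, the
following is equivalent (a sketch assuming your login shell is bash and reads
~/.bash_profile; on zsh the file would be ~/.zshrc):

$ echo 'export JAVA_HOME=$(/usr/libexec/java_home)' >> ~/.bash_profile

$ source ~/.bash_profile

$ echo $JAVA_HOME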


Have fun!

Alonso


On Wed, Jun 5, 2019 at 6:10, Jack Kolokasis ()
wrote:

> Hello,
>
> First, you will need to make sure that Java is installed, or install
> it otherwise. Then install Scala and a build tool (sbt or Maven). From my
> point of view, IntelliJ IDEA is a good option for creating your Spark
> applications. At the end you have to install a distributed file system, e.g.
> HDFS.
>
> I think there is no all-in-one configuration. But there are
> examples of how to configure your Spark cluster (e.g.
> https://github.com/jaceklaskowski/mastering-apache-spark-book/blob/master/spark-standalone-example-2-workers-on-1-node-cluster.adoc
> ).
> Best,
> --Iacovos
> On 5/6/19 5:50 a.m., ya wrote:
>
> Dear list,
>
> I am very new to Spark, and I am having trouble installing it on my Mac. I
> have the following questions; please give me some guidance. Thank you very much.
>
> 1. How many pieces of software, and which ones, should I install before installing Spark? I
> have been searching online, and people discuss their experiences on this
> topic with different opinions: some say there is no need to install Hadoop
> before installing Spark, while others say Hadoop has to be installed before Spark.
> Some people say Scala has to be installed, whereas others say Scala is
> included in Spark and is installed automatically once Spark is
> installed. So I am confused about what to install for a start.
>
> 2. Is there a simple way to configure this software? For instance, an
> all-in-one configuration file? It takes forever for me to configure things
> before I can really use it for data analysis.
>
> I hope my questions make sense. Thank you very much.
>
> Best regards,
>
> YA
>
>

-- 
Alonso Isidoro Roman
https://about.me/alonso.isidoro.roman



Re: installation of spark

2019-06-04 Thread Jack Kolokasis

Hello,

    First, you will need to make sure that Java is installed, or
install it otherwise. Then install Scala and a build tool (sbt or
Maven). From my point of view, IntelliJ IDEA is a good option for creating
your Spark applications. At the end you have to install a distributed
file system, e.g. HDFS.


    I think there is no all-in-one configuration. But there are
examples of how to configure your Spark cluster (e.g.
https://github.com/jaceklaskowski/mastering-apache-spark-book/blob/master/spark-standalone-example-2-workers-on-1-node-cluster.adoc).
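
For reference, bringing up a minimal single-machine standalone cluster is
roughly the following (a sketch only, assuming SPARK_HOME points at your Spark
directory and the default master port 7077; the linked page walks through the
multi-worker variant):

$SPARK_HOME/sbin/start-master.sh

$SPARK_HOME/sbin/start-slave.sh spark://localhost:7077

$SPARK_HOME/bin/spark-shell --master spark://localhost:7077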


Best,
--Iacovos
On 5/6/19 5:50 a.m., ya wrote:

Dear list,

I am very new to Spark, and I am having trouble installing it on my
Mac. I have the following questions; please give me some guidance. Thank
you very much.


1. How many pieces of software, and which ones, should I install before installing
Spark? I have been searching online, and people discuss their
experiences on this topic with different opinions: some say there is
no need to install Hadoop before installing Spark, while others say Hadoop has
to be installed before Spark. Some people say Scala has to be
installed, whereas others say Scala is included in Spark and is
installed automatically once Spark is installed. So I am confused about what
to install for a start.


2. Is there a simple way to configure this software? For instance,
an all-in-one configuration file? It takes forever for me to configure
things before I can really use it for data analysis.


I hope my questions make sense. Thank you very much.

Best regards,

YA


installation of spark

2019-06-04 Thread ya
Dear list,


I am very new to Spark, and I am having trouble installing it on my Mac. I have
the following questions; please give me some guidance. Thank you very much.


1. How many pieces of software, and which ones, should I install before installing Spark? I have
been searching online, and people discuss their experiences on this topic with
different opinions: some say there is no need to install Hadoop before installing
Spark, while others say Hadoop has to be installed before Spark. Some people say
Scala has to be installed, whereas others say Scala is included in Spark, and
it is installed automatically once Spark is installed. So I am confused about what to
install for a start.


2. Is there a simple way to configure this software? For instance, an
all-in-one configuration file? It takes forever for me to configure things
before I can really use it for data analysis.


I hope my questions make sense. Thank you very much.


Best regards,


YA

Re: clear steps for installation of spark, cassandra and cassandra connector to run on spyder 2.3.7 using python 3.5 and anaconda 2.4 ipython 4.0

2016-09-06 Thread ayan guha
Spark has pretty extensive documentation; that should be your starting
point. I do not use Cassandra much, but the Cassandra connector should be a
Spark package, so look for the Spark Packages website.
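
For example, pulling the connector in at launch time usually looks something
like the line below (a sketch only: the artifact version has to match your
Spark and Scala versions, so check the connector's compatibility table, and the
Cassandra host is a placeholder):

pyspark --packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.0 --conf spark.cassandra.connection.host=127.0.0.1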

If I may say so, all the docs should be one or two Google searches away :)
On 6 Sep 2016 20:34, "muhammet pakyürek" <mpa...@hotmail.com> wrote:

>
>
> Could you send me documents and links covering all the above requirements for the
> installation of Spark, Cassandra and the Cassandra connector, to run on Spyder 2.3.7 using
> Python 3.5 and Anaconda 2.4 / IPython 4.0?
>
>
> --
>
>


clear steps for installation of spark, cassandra and cassandra connector to run on spyder 2.3.7 using python 3.5 and anaconda 2.4 ipython 4.0

2016-09-06 Thread muhammet pakyürek


Could you send me documents and links covering all the above requirements for the
installation of Spark, Cassandra and the Cassandra connector, to run on Spyder 2.3.7
using Python 3.5 and Anaconda 2.4 / IPython 4.0?






Installation Issues - Spark 1.6.0 With Hadoop 2.6 - Pre Built On Windows 7

2016-04-12 Thread My List
Dear Experts,

Need help to get this resolved -

What am I doing wrong? Any help greatly appreciated.

Env -
Windows 7 - 64 bit OS
Spark 1.6.0 With Hadoop 2.6 - Pre Built setup
JAVA_HOME - points to 1.7
SCALA_HOME - 2.11

I have Admin User and Standard User on Windows.

All the setup and running of Spark is done using the Standard User account.

Have Spark set up on the D drive
- D:\Home\Prod_Inst\BigData\Spark\VER_1_6_0_W_H_2_6
Have set HADOOP_HOME to point to winutils.exe (64-bit version) on the D drive
- D:\Home\Prod_Inst\BigData\Spark\MySparkSetup\winutils

Standard User account - w7-PC\Shaffu_Knowledge
Using the Standard User account - created mkdir D:\tmp\hive
Using the Standard User account - winutils.exe chmod -R 777 D:\tmp
Using the Standard User account - winutils.exe ls D:\tmp and D:\tmp\hive

*drwxrwxrwx 1 w7-PC\Shaffu_Knowledge w7-PC\None 0 Apr 12 2016 \tmp*
*drwxrwxrwx 1 w7-PC\Shaffu_Knowledge w7-PC\None 0 Apr 12 2016 \tmp\hive*

Running spark-shell results in the following exception

D:\Home\Prod_Inst\BigData\Spark\VER_1_6_0_W_H_2_6>./bin/spark-shell
'.' is not recognized as an internal or external command,
operable program or batch file.

D:\Home\Prod_Inst\BigData\Spark\VER_1_6_0_W_H_2_6>.\bin\spark-shell
16/04/12 14:40:16 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java
1.7.0_25)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/04/12 14:40:22 WARN General: Plugin (Bundle)
"org.datanucleus.store.rdbms" is already registered. Ensure you dont have
multiple JAR versions of the same plugin in the classpath. The URL
"file:/D:/Home/Prod_Inst/BigData/Spark/VER_1_6_0_W_H_2_6/lib/
16/04/12 14:40:22 WARN General: Plugin (Bundle) "org.datanucleus" is
already registered. Ensure you dont have multiple JAR versions of the same
plugin in the classpath. The URL
"file:/D:/Home/Prod_Inst/BigData/Spark/VER_1_6_0_W_H_2_6/bin/../lib/datan
16/04/12 14:40:22 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo"
is already registered. Ensure you dont have multiple JAR versions of the
same plugin in the classpath. The URL
"file:/D:/Home/Prod_Inst/BigData/Spark/VER_1_6_0_W_H_2_6/bin/../l
16/04/12 14:40:22 WARN Connection: BoneCP specified but not present in
CLASSPATH (or one of dependencies)
16/04/12 14:40:23 WARN Connection: BoneCP specified but not present in
CLASSPATH (or one of dependencies)
16/04/12 14:40:38 WARN ObjectStore: Version information not found in
metastore. hive.metastore.schema.verification is not enabled so recording
the schema version 1.2.0
16/04/12 14:40:38 WARN ObjectStore: Failed to get database default,
returning NoSuchObjectException
16/04/12 14:40:39 WARN : Your hostname, w7-PC resolves to a
loopback/non-reachable address: fe80:0:0:0:8d4f:1fa9:cf7d:23d0%17, but we
couldn't find any external IP address!
*java.lang.RuntimeException: java.lang.RuntimeException: The root scratch
dir: /tmp/hive on HDFS should be writable. Current permissions are:
rw-rw-rw-*
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at
org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
at
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
at
org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218)
at
org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208)
at
org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462)
at
org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461)
at
org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
at
org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
at
org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
at $iwC$$iwC.<init>(<console>:15)
at $iwC.<init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at

Re: Please help: installation of spark 1.6.0 on ubuntu fails

2016-02-25 Thread Shixiong(Ryan) Zhu
Please use Java 7 instead.
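
On Ubuntu that usually comes down to something like the following (a sketch:
the package name and JAVA_HOME path are the stock Ubuntu OpenJDK 7 defaults, so
adjust them if your JDK lives elsewhere):

sudo apt-get install openjdk-7-jdk

sudo update-alternatives --config java

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

java -version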

On Thu, Feb 25, 2016 at 1:54 PM, Marco Mistroni  wrote:

> Hello all
>  Could anyone help?
> I have tried to install Spark 1.6.0 on Ubuntu, but the installation failed.
> Here are my steps
>
> 1. download spark (successful)
>
> 31  wget http://d3kbcqa49mib13.cloudfront.net/spark-1.6.0.tgz
>
> 33  tar -zxf spark-1.6.0.tgz
>
>
>
> 2. cd spark-1.6.0
>
> 2.1 sbt assembly
>
>
>
> error] /home/vagrant/spark-1.6.0/project/SparkBuild.scala:19: object file
> is not a member of package java.nio
>
> error] import java.nio.file.Files
>
> error] ^
>
> error] /home/vagrant/spark-1.6.0/project/SparkBuild.scala:465: not found:
> value Files
>
> error]   Files.copy(jar.toPath, dest.toPath)
>
> error]   ^
> error] two errors found
>
> Could anyone assist?
> Kind regards
>


Please help: installation of spark 1.6.0 on ubuntu fails

2016-02-25 Thread Marco Mistroni
Hello all
Could anyone help?
I have tried to install Spark 1.6.0 on Ubuntu, but the installation failed.
Here are my steps

1. download spark (successful)

31  wget http://d3kbcqa49mib13.cloudfront.net/spark-1.6.0.tgz

33  tar -zxf spark-1.6.0.tgz



2. cd spark-1.6.0

2.1 sbt assembly



error] /home/vagrant/spark-1.6.0/project/SparkBuild.scala:19: object file
is not a member of package java.nio

error] import java.nio.file.Files

error] ^

error] /home/vagrant/spark-1.6.0/project/SparkBuild.scala:465: not found:
value Files

error]   Files.copy(jar.toPath, dest.toPath)

error]   ^
error] two errors found

Could anyone assist?
Kind regards


Re: Integrate Spark Editor with Hue for source compiled installation of spark/spark-jobServer

2014-07-03 Thread Sunita Arvind
That's good to know. I will try it out.

Thanks Romain

On Friday, June 27, 2014, Romain Rigaux romain.rig...@gmail.com wrote:

 So far Spark Job Server does not work with Spark 1.0:
 https://github.com/ooyala/spark-jobserver

 So this works only with Spark 0.9 currently:

 http://gethue.com/get-started-with-spark-deploy-spark-server-and-compute-pi-from-your-web-browser/

 Romain





 On Tue, Jun 24, 2014 at 9:04 AM, Sunita Arvind sunitarv...@gmail.com wrote:

 Hello Experts,

 I am attempting to integrate the Spark Editor with Hue on CDH 5.0.1. I have
 the Spark installation built manually from the sources for Spark 1.0.0. I am
 able to integrate this with Cloudera Manager.

 Background:
 ---
 We have a 3 node VM cluster with CDH5.0.1
 We required Spark 1.0.0 due to some features in it, so I did a

  yum remove spark-core spark-master spark-worker spark-python

  of the default spark0.9.0 and compiled spark1.0.0 from source:

 Downloaded the spark-trunk from

 git clone https://github.com/apache/spark.git
 cd spark
 SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true ./sbt/sbt assembly

  The spark-assembly-1.0.0-SNAPSHOT-hadoop2.2.0.jar was built and spark by
 itself seems to work well. I was even able to run a text file count.

 Current attempt:
 
 Referring to this article -
 http://gethue.com/a-new-spark-web-ui-spark-app/
 Now I am trying to add the Spark editor to Hue. AFAIK, this requires
 git clone https://github.com/ooyala/spark-jobserver.git
 cd spark-jobserver
 sbt
 re-start

 This was successful after a lot of struggle with the proxy settings.
 However, is this the job server itself? Does that mean the job server has
 to be started manually? I intend to have the Spark editor show up in the Hue
 web UI, and I am nowhere close. Can someone please help?

 Note, the 3 VMs are Linux CentOS. Not sure if setting something like this can
 be expected to work:

 [desktop]
 app_blacklist=


 Also, I have made the changes to
 job-server/src/main/resources/application.conf (edited with vim) as recommended;
 however, I do not expect this to impact Hue in any way.

 Also, I intend to let the editor stay available, not spawn it every time
 it is required.


 Thanks in advance.

 regards





Re: Integrate Spark Editor with Hue for source compiled installation of spark/spark-jobServer

2014-06-27 Thread Romain Rigaux
So far Spark Job Server does not work with Spark 1.0:
https://github.com/ooyala/spark-jobserver

So this works only with Spark 0.9 currently:
http://gethue.com/get-started-with-spark-deploy-spark-server-and-compute-pi-from-your-web-browser/
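
As a quick sanity check once the job server from that walkthrough is running
(after sbt re-start), its REST API should answer; a sketch assuming the default
port 8090 from its application.conf:

curl http://localhost:8090/jars

curl http://localhost:8090/contexts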

Romain





On Tue, Jun 24, 2014 at 9:04 AM, Sunita Arvind sunitarv...@gmail.com
wrote:

 Hello Experts,

 I am attempting to integrate the Spark Editor with Hue on CDH 5.0.1. I have the
 Spark installation built manually from the sources for Spark 1.0.0. I am
 able to integrate this with Cloudera Manager.

 Background:
 ---
 We have a 3 node VM cluster with CDH5.0.1
 We required Spark 1.0.0 due to some features in it, so I did a

  yum remove spark-core spark-master spark-worker spark-python

  of the default spark0.9.0 and compiled spark1.0.0 from source:

 Downloaded the spark-trunk from

 git clone https://github.com/apache/spark.git
 cd spark
 SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true ./sbt/sbt assembly

  The spark-assembly-1.0.0-SNAPSHOT-hadoop2.2.0.jar was built and spark by
 itself seems to work well. I was even able to run a text file count.

 Current attempt:
 
 Referring to this article -
 http://gethue.com/a-new-spark-web-ui-spark-app/
 Now I am trying to add the Spark editor to Hue. AFAIK, this requires
 git clone https://github.com/ooyala/spark-jobserver.git
 cd spark-jobserver
 sbt
 re-start

 This was successful after a lot of struggle with the proxy settings.
 However, is this the job server itself? Does that mean the job server has
 to be started manually? I intend to have the Spark editor show up in the Hue
 web UI, and I am nowhere close. Can someone please help?

 Note, the 3 VMs are Linux CentOS. Not sure if setting something like this can
 be expected to work:

 [desktop]
 app_blacklist=


 Also, I have made the changes to
 job-server/src/main/resources/application.conf (edited with vim) as recommended;
 however, I do not expect this to impact Hue in any way.

 Also, I intend to let the editor stay available, not spawn it every time it
 is required.


 Thanks in advance.

 regards