If you aren't using Hadoop, I don't think it matters which you download.
I'd probably just grab the Hadoop 2 package.
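For what it's worth, standalone mode doesn't depend on which pre-built
package you pick. Here's a minimal sketch of a job run against a standalone
master (the hostname "master-node" below is made up; substitute your own):

    // Minimal smoke test against a hypothetical standalone master at
    // spark://master-node:7077 (started with sbin/start-master.sh).
    import org.apache.spark.{SparkConf, SparkContext}

    object StandaloneSmokeTest {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("StandaloneSmokeTest")
          .setMaster("spark://master-node:7077") // standalone manager, no YARN
        val sc = new SparkContext(conf)
        // Trivial job just to confirm the workers are reachable
        val sum = sc.parallelize(1 to 1000).reduce(_ + _)
        println("Sum = " + sum)
        sc.stop()
      }
    }

If that runs, your cluster is fine without any Hadoop installation at all.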

Out of curiosity, what are you using as your data store? I get the
impression most Spark users are using HDFS or something built on top.
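If you aren't on HDFS, note that sc.textFile accepts local paths too. A
hedged sketch, assuming the data sits at a path visible to every worker
(e.g., an NFS mount; the path below is hypothetical):

    // Reading from the local filesystem instead of HDFS; the file:// scheme
    // bypasses any Hadoop filesystem configuration that may be present.
    import org.apache.spark.{SparkConf, SparkContext}

    object LocalFileRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("LocalFileRead"))
        val lines = sc.textFile("file:///data/events.log") // hypothetical path
        println("Line count: " + lines.count())
        sc.stop()
      }
    }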


On Thu, Aug 28, 2014 at 4:07 PM, Sanjeev Sagar <
[email protected]> wrote:

> Hello there,
>
> I have a basic question about the download: which option do I need to
> download for a standalone cluster?
>
> I have a private cluster of three machines on CentOS. When I click on
> download, it shows me the following:
>
>
>    Download Spark
>
> The latest release is Spark 1.0.2, released August 5, 2014 (release notes
> <http://spark.apache.org/releases/spark-release-1-0-2.html>) (git tag
> <https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f>)
>
> Pre-built packages:
>
>  * For Hadoop 1 (HDP1, CDH3): find an Apache mirror
>    <http://www.apache.org/dyn/closer.cgi/spark/spark-1.0.2/spark-1.0.2-bin-hadoop1.tgz>
>    or direct file download
>    <http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-hadoop1.tgz>
>  * For CDH4: find an Apache mirror
>    <http://www.apache.org/dyn/closer.cgi/spark/spark-1.0.2/spark-1.0.2-bin-cdh4.tgz>
>    or direct file download
>    <http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-cdh4.tgz>
>  * For Hadoop 2 (HDP2, CDH5): find an Apache mirror
>    <http://www.apache.org/dyn/closer.cgi/spark/spark-1.0.2/spark-1.0.2-bin-hadoop2.tgz>
>    or direct file download
>    <http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-hadoop2.tgz>
>
> Pre-built packages, third-party (NOTE: may include non-ASF-compatible
> licenses):
>
>  * For MapRv3: direct file download (external)
>    <http://package.mapr.com/tools/apache-spark/1.0.2/spark-1.0.2-bin-mapr3.tgz>
>  * For MapRv4: direct file download (external)
>    <http://package.mapr.com/tools/apache-spark/1.0.2/spark-1.0.2-bin-mapr4.tgz>
>
>
> From the above, it looks like I have to download Hadoop or CDH4 first in
> order to use Spark? I have a standalone cluster, and my data size is in
> the hundreds of gigabytes, close to a terabyte.
>
> I don't understand which one I need to download from the above list.
>
> Could someone tell me which one to download for a standalone cluster
> with a big data footprint?
>
> Or is Hadoop mandatory for using Spark? That's not my understanding. My
> understanding is that you can run Spark on Hadoop via YARN if you like,
> but you can also run Spark standalone without Hadoop.
>
> Please assist. I'm confused!
>
> -Sanjeev
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>


-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: [email protected] W: www.velos.io
