If you aren't using Hadoop, I don't think it matters which package you download; any of the pre-built packages will run in standalone mode. I'd probably just grab the Hadoop 2 package.
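For reference, the standalone setup is roughly the following (a sketch, not a tested recipe: "master-host" is a placeholder for one of your machines, and the worker-launch command is the one described in the standalone-mode docs):

```shell
# Sketch of a Spark 1.0.2 standalone setup (no Hadoop/HDFS required).
# "master-host" is a placeholder for whichever machine you pick as master.

# On every machine: grab and unpack the Hadoop 2 pre-built package.
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-hadoop2.tgz
tar xzf spark-1.0.2-bin-hadoop2.tgz
cd spark-1.0.2-bin-hadoop2

# On the master machine:
./sbin/start-master.sh    # master URL is spark://master-host:7077

# On each worker machine, point the worker at the master:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://master-host:7077
```

Once the workers register, you submit jobs against spark://master-host:7077 instead of a YARN cluster.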
Out of curiosity, what are you using as your data store? I get the impression most Spark users are using HDFS or something built on top of it.

On Thu, Aug 28, 2014 at 4:07 PM, Sanjeev Sagar <[email protected]> wrote:

> Hello there,
>
> I have a basic question on the download: which option do I need to
> download for a standalone cluster?
>
> I have a private cluster of three machines on CentOS. When I click on
> download it shows me the following:
>
> Download Spark
>
> The latest release is Spark 1.0.2, released August 5, 2014 (release notes)
> <http://spark.apache.org/releases/spark-release-1-0-2.html> (git tag)
> <https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=8fb6f00e195fb258f3f70f04756e07c259a2351f>
>
> Pre-built packages:
>
> * For Hadoop 1 (HDP1, CDH3): find an Apache mirror
>   <http://www.apache.org/dyn/closer.cgi/spark/spark-1.0.2/spark-1.0.2-bin-hadoop1.tgz>
>   or direct file download
>   <http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-hadoop1.tgz>
> * For CDH4: find an Apache mirror
>   <http://www.apache.org/dyn/closer.cgi/spark/spark-1.0.2/spark-1.0.2-bin-cdh4.tgz>
>   or direct file download
>   <http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-cdh4.tgz>
> * For Hadoop 2 (HDP2, CDH5): find an Apache mirror
>   <http://www.apache.org/dyn/closer.cgi/spark/spark-1.0.2/spark-1.0.2-bin-hadoop2.tgz>
>   or direct file download
>   <http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-hadoop2.tgz>
>
> Pre-built packages, third-party (NOTE: may include non-ASF-compatible
> licenses):
>
> * For MapRv3: direct file download (external)
>   <http://package.mapr.com/tools/apache-spark/1.0.2/spark-1.0.2-bin-mapr3.tgz>
> * For MapRv4: direct file download (external)
>   <http://package.mapr.com/tools/apache-spark/1.0.2/spark-1.0.2-bin-mapr4.tgz>
>
> From the above it looks like I have to download Hadoop or CDH4 first in
> order to use Spark? I have a standalone cluster, and my data size is in
> the hundreds of gigabytes, close to a terabyte.
> I don't get which one I need to download from the above list.
>
> Could someone assist me with which one I need to download for a
> standalone cluster and for a big data footprint?
>
> Or is Hadoop needed or mandatory for using Spark? That's not my
> understanding. My understanding is that you can use Spark with Hadoop
> via YARN if you like, but you can also use Spark standalone without
> Hadoop.
>
> Please assist. I'm confused!
>
> -Sanjeev
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning
440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: [email protected] W: www.velos.io
