1. You don't need to start HDFS or anything like that; you just need to set up Spark so that it can use the Hadoop APIs for some things, which on Windows depends on some native libs. This means you don't need to worry about learning HDFS itself. Focus on the Spark APIs, Python and/or Scala.
2. You should be able to find the native Windows binaries here: https://github.com/steveloughran/winutils These are either versions I lifted out of HDP for Windows or, more recently, built by checking out and building on Windows the same git commit as was used/voted in for the ASF releases.

3. For performance, you also need the native libs for the compression codecs: snappy, LZO &c. I see I've put them in the Windows 2.7.1 release, but not the others (the ASF mvn package doesn't add them): https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin ... If the version of the Hadoop libs isn't working with one of those versions I've put up, ping me and I'll build the relevant binaries out of the ASF source.

From: My List <mylistt...@gmail.com>
Date: Monday, 18 April 2016 at 13:13
To: Deepak Sharma <deepakmc...@gmail.com>
Cc: SparkUser <user@spark.apache.org>
Subject: Re: How to start HDFS on Spark Standalone

Deepak,

The following could be very dumb questions, so pardon me for the same.

1) When I download the binary for Spark with a version of Hadoop (Hadoop 2.6), does it not come in the zip or tar file?
2) If it does not come along, is there an Apache Hadoop for Windows? Is it in binary format, or will I have to build it?
3) Is there a basic tutorial for Hadoop on Windows for the basic needs of Spark?

Thanks in advance!

On Mon, Apr 18, 2016 at 5:35 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:

Once you download Hadoop and format the namenode, you can use start-dfs.sh to start HDFS. Then use 'jps' to see if the datanode/namenode services are up and running.

Thanks
Deepak

On Mon, Apr 18, 2016 at 5:18 PM, My List <mylistt...@gmail.com> wrote:

Hi,

I am a newbie on Spark. I wanted to know how to start and verify if HDFS has started on Spark standalone.
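Putting Steve's points together, a minimal Windows setup might look like the sketch below. This is only an illustration: the install path C:\hadoop is an assumed location, and you should unpack whichever winutils release from the repo above matches the Hadoop version your Spark build was compiled against.

```shell
:: Windows cmd sketch: point Spark at the winutils native binaries.
:: C:\hadoop is a hypothetical location; unpack the matching
:: hadoop-x.y.z\bin folder from the winutils repo there, so that
:: %HADOOP_HOME%\bin\winutils.exe exists.
set HADOOP_HOME=C:\hadoop
set PATH=%HADOOP_HOME%\bin;%PATH%

:: No HDFS daemons needed: with HADOOP_HOME set, spark-shell can use
:: the Hadoop file APIs against the local filesystem.
spark-shell
```

Setting these in the System Properties environment-variable dialog instead of per-session has the same effect and survives reboots.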
Env - Windows 7 64-bit, Spark 1.4.1 with Hadoop 2.6, using the Scala shell (spark-shell)

--
Thanks,
Harry

--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net

--
Thanks,
Harmeet
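For anyone who does want a local HDFS rather than the winutils-only route, the steps Deepak describes can be sketched as below. This assumes a Unix-like environment with a Hadoop distribution unpacked and its bin/ and sbin/ directories on the PATH; on Windows you would need the native binaries discussed above first.

```shell
# One-time: format the namenode (this wipes any existing HDFS metadata).
hdfs namenode -format

# Start the HDFS daemons (namenode and datanode).
start-dfs.sh

# Verify: jps lists the running JVMs; NameNode and DataNode
# should appear among them if HDFS came up cleanly.
jps
```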