Data may be spilled to disk, so HDFS is a necessity for Spark. You can run Spark on a single machine without HDFS, but in distributed mode HDFS will be required.
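For the single-machine case, here is a minimal sketch of reading a local file with an explicit file:// URI, so the path is not resolved against whatever default Hadoop filesystem is configured (the trace below shows the shell trying to reach the namenode at master:9000 for a bare "README.md"). It assumes Spark 0.9 or later (for SparkConf) and local mode; the object name, app name and temp file are made up for illustration.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of Spark.
object LocalFileCount {
  // Count the lines of a text file addressed by a full URI such as file:///tmp/x.md
  def countLines(sc: SparkContext, uri: String): Long = sc.textFile(uri).count()

  def main(args: Array[String]): Unit = {
    // local[2] runs Spark inside this JVM, with no cluster and no HDFS involved
    val conf = new SparkConf().setMaster("local[2]").setAppName("local-file-count")
    val sc = new SparkContext(conf)
    try {
      // Write a small throwaway file so the example does not depend on any existing path
      val tmp = Files.createTempFile("readme", ".md")
      Files.write(tmp, "line one\nline two\n".getBytes("UTF-8"))
      // tmp.toUri carries the file: scheme, so the local filesystem is used
      println(countLines(sc, tmp.toUri.toString))
    } finally {
      sc.stop()
    }
  }
}
```

Inside spark-shell the context already exists as sc, so the equivalent would be something like sc.textFile("file:///path/to/README.md").count, with the path replaced by wherever the file actually lives.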

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Wed, Mar 19, 2014 at 4:10 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:

> Mayur,
>
> While reading a local file which is not in HDFS through spark shell, does
> the HDFS need to be up and running ???
>
> On Tue, Mar 18, 2014 at 9:46 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
>
>> Your hdfs is down. Probably forgot to format namenode.
>> check if namenode is running
>> ps -aef|grep Namenode
>> if not & data in hdfs is not critical
>> hadoop namenode -format
>> & restart hdfs
>>
>> Mayur Rustagi
>> Ph: +1 (760) 203 3257
>> http://www.sigmoidanalytics.com
>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>
>> On Tue, Mar 18, 2014 at 5:59 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>>
>>> Hi ALL !!
>>>
>>> In the interactive spark shell i get the following error.
>>> I just followed the steps of the video "First steps with spark - spark
>>> screen cast #1" by andy konwinski...
>>>
>>> Any thoughts ???
>>>
>>> scala> val textfile = sc.textFile("README.md")
>>> textfile: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12
>>>
>>> scala> textfile.count
>>> java.lang.RuntimeException: java.net.ConnectException: Call to master/192.168.1.11:9000 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
>>>     at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
>>>     at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
>>>     at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:439)
>>>     at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:439)
>>>     at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:112)
>>>     at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:112)
>>>     at scala.Option.map(Option.scala:133)
>>>     at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:112)
>>>     at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:134)
>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:199)
>>>     at scala.Option.getOrElse(Option.scala:108)
>>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
>>>     at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:26)
>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:199)
>>>     at scala.Option.getOrElse(Option.scala:108)
>>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
>>>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:886)
>>>     at org.apache.spark.rdd.RDD.count(RDD.scala:698)
>>>     at <init>(<console>:15)
>>>     at <init>(<console>:20)
>>>     at <init>(<console>:22)
>>>     at <init>(<console>:24)
>>>     at <init>(<console>:26)
>>>     at .<init>(<console>:30)
>>>     at .<clinit>(<console>)
>>>     at .<init>(<console>:11)
>>>     at .<clinit>(<console>)
>>>     at $export(<console>)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>     at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
>>>     at org.apache.spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:897)
>>>     at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
>>>     at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
>>>     at java.lang.Thread.run(Thread.java:744)
>>> Caused by: java.net.ConnectException: Call to master/192.168.1.11:9000 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1075)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at com.sun.proxy.$Proxy8.getProtocolVersion(Unknown Source)
>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>     at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
>>>     ... 39 more
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1050)
>>>     ... 53 more
>>>
>>> --
>>> *Sai Prasanna. AN*
>>> *II M.Tech (CS), SSSIHL*
>>>
>>> *Entire water in the ocean can never sink a ship, Unless it gets inside.
>>> All the pressures of life can never hurt you, Unless you let them in.*
>
> --
> *Sai Prasanna. AN*
> *II M.Tech (CS), SSSIHL*
>
> *Entire water in the ocean can never sink a ship, Unless it gets inside.
> All the pressures of life can never hurt you, Unless you let them in.*