Hi Jasbir,

As a first note: if you are using a vendor distribution, please contact the vendor for any issue you are facing. This mailing list is for the community, so we focus on the community edition of Spark.
Anyway, the error seems quite clear: your file on HDFS has a missing block. This can happen if you lose a datanode, or if the block gets corrupted and no other replicas are available for it. The exact root cause is hard to tell from here, but in any case you have to investigate what is going on with your HDFS; a good starting point is hdfs fsck (see the PS at the bottom of this mail for a sketch). Spark has nothing to do with this problem.

Thanks,
Marco

On Tue, 24 Apr 2018, 09:21 Sing, Jasbir, <jasbir.s...@accenture.com> wrote:

> I am using HDP 2.6.3 and 2.6.4 and running the code below:
>
> 1. Create a SparkContext object.
> 2. Read a text file: rdd = sc.textFile("hdfs://192.168.142.129:8020/abc/test1.txt")
> 3. println(rdd.count)
>
> After executing the third line I get the error below:
>
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-32082187-172.17.0.2-1517480669419:blk_1073742897_2103 file=/abc/test1.txt
>   at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
>   at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
>   at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
>   at java.io.DataInputStream.read(Unknown Source)
>   at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
>   at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
>   at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
>   at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:246)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208)
>   at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>   at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1595)
>   at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1143)
>   at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1143)
>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
>
> Can you please help me out with this?
>
> Regards,
>
> Jasbir Singh
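PS: the sketch mentioned above. From a shell on the cluster, hdfs fsck /abc/test1.txt -files -blocks -locations will list every block of the file with its replica locations and flag the missing ones. The Scala snippet below does a similar check from the Spark shell through the standard Hadoop FileSystem API; it is only a rough, untested illustration that reuses the path and the sc from your example:

  // Rough sketch (untested): print where HDFS thinks each block of the
  // file lives. Assumes `sc` is the SparkContext you already created.
  import org.apache.hadoop.fs.{FileSystem, Path}

  val path = new Path("hdfs://192.168.142.129:8020/abc/test1.txt")
  val fs = FileSystem.get(path.toUri, sc.hadoopConfiguration)
  val status = fs.getFileStatus(path)

  // One BlockLocation per block: an empty host list (or corrupt=true)
  // means no healthy replica is left, which is the condition behind
  // the BlockMissingException you are seeing.
  fs.getFileBlockLocations(status, 0, status.getLen).foreach { block =>
    println(s"offset=${block.getOffset} length=${block.getLength} " +
      s"corrupt=${block.isCorrupt} hosts=${block.getHosts.mkString(", ")}")
  }

If fsck reports the block as MISSING, Spark cannot do anything about it: you have to restore the replica (e.g. bring the lost datanode back) or re-upload the file.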