Re: Block Missing Exception while connecting Spark with HDP

2018-04-24 Thread Marco Gaido
Hi Jasbir,

As a first note: if you are using a vendor distribution, please contact the
vendor for any issue you are facing. This mailing list is for the community,
so we focus on the community edition of Spark.

That said, the error seems quite clear: a block of your file on HDFS is
missing. This can happen if you lose a datanode, or if a block gets
corrupted and no replicas remain available for it. The exact root cause is
hard to tell from here, but either way you have to investigate what is going
on in your HDFS. Spark has nothing to do with this problem.
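A reasonable first step, assuming you have shell access to a cluster node,
is to ask HDFS itself which blocks are missing or corrupt, for example with
`hdfs fsck` (the path below is the one from your error message):

```shell
# Check the health of the specific file, listing its blocks and
# the datanodes that are supposed to hold each replica.
hdfs fsck /abc/test1.txt -files -blocks -locations

# Get a cluster-wide list of files with corrupt/missing blocks.
hdfs fsck / -list-corruptfileblocks

# Check how many datanodes the namenode currently sees as live.
hdfs dfsadmin -report
```

If fsck reports the file as CORRUPT and no datanode holds a replica of the
missing block, that data is gone from HDFS: you would need to restore the
file from its original source (or, as a last resort, remove the corrupt file
with `hdfs fsck /abc/test1.txt -delete` and re-upload it).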

Thanks,
Marco

On Tue, 24 Apr 2018, 09:21 Sing, Jasbir wrote:

> I am using HDP 2.6.3 and 2.6.4 and running the code below:
>
> 1. Create a SparkContext object.
> 2. Read a text file: rdd = sc.textFile("hdfs://192.168.142.129:8020/abc/test1.txt");
> 3. println(rdd.count);
>
> After executing the 3rd line I get the error below:
>
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain
> block: BP-32082187-172.17.0.2-1517480669419:blk_1073742897_2103
> file=/abc/test1.txt
> at
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
> at
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
> at
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
> at java.io.DataInputStream.read(Unknown Source)
> at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
> at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
> at
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
> at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:246)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1595)
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1143)
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1143)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
>
> Can you please help me out with this?
>
>
>
> Regards,
>
> Jasbir Singh
>
>
>
> --
>
> This message is for the designated recipient only and may contain
> privileged, proprietary, or otherwise confidential information. If you have
> received it in error, please notify the sender immediately and delete the
> original. Any other use of the e-mail by you is prohibited. Where allowed
> by local law, electronic communications with Accenture and its affiliates,
> including e-mail and instant messaging (including content), may be scanned
> by our systems for the purposes of information security and assessment of
> internal compliance with Accenture policy. Your privacy is important to us.
> Accenture uses your personal data only in compliance with data protection
> laws. For further information on how Accenture processes your personal
> data, please see our privacy statement at
> https://www.accenture.com/us-en/privacy-policy.
>
> __
>
> www.accenture.com
>
