Re: InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag

2015-02-23 Thread fanooos
I have resolved this issue. There were actually two problems.

The first problem was the HDFS port. It was configured (in core-site.xml)
as 9000, but in the application I was using 50070, which is the NameNode's
default web UI port, not the RPC port that hdfs:// URIs need.

The second problem: I forgot to put the file into HDFS :( .
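For anyone hitting the same exception: the hdfs:// URI in the application has to use the RPC port from fs.defaultFS in core-site.xml (9000 here), not the NameNode web UI port 50070. A sketch of the matching configuration, using the IP from this thread (older setups may use the deprecated name fs.default.name instead):

```xml
<!-- core-site.xml on the Hadoop machine: this is the RPC endpoint
     that clients (including Spark's textFile) must connect to -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://10.62.57.141:9000</value>
</property>
```

The textFile call would then read `hdfs://10.62.57.141:9000/tmp/lines.txt`, after the file has actually been copied into HDFS (e.g. with `hdfs dfs -put lines.txt /tmp/lines.txt`).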





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/InvalidProtocolBufferException-Protocol-message-end-group-tag-did-not-match-expected-tag-tp21777p21781.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag

2015-02-23 Thread أنس الليثي
Hadoop version: 2.6.0
Spark version: 1.2.1

Here is also the pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>TestSpark</groupId>
  <artifactId>TestSpark</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.2.1</version>
    </dependency>
  </dependencies>
  <build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
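Given the question below about which hadoop release the Spark artifact was built against: another thing worth checking is whether the Hadoop client classes pulled in by spark-core_2.10 1.2.1 match the 2.6.0 cluster. A hedged sketch of pinning the client to the cluster version in the pom above (an assumption about the fix, not something confirmed in this thread):

```xml
<!-- Pin the Hadoop client to the cluster's release so client and
     NameNode speak the same RPC/protobuf protocol version -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.6.0</version>
</dependency>
```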

Best regards

On 24 February 2015 at 08:43, Ted Yu  wrote:

> bq. have installed hadoop on a local virtual machine
>
> Can you tell us which release of hadoop you installed?
>
> What Spark release are you using? Or, to be more specific, which hadoop
> release was the Spark built against?
>
> Cheers
>

Re: InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag

2015-02-23 Thread Ted Yu
bq. have installed hadoop on a local virtual machine

Can you tell us which release of hadoop you installed?

What Spark release are you using? Or, to be more specific, which hadoop
release was the Spark built against?

Cheers

On Mon, Feb 23, 2015 at 9:37 PM, fanooos  wrote:

> Hi
>
> I have installed hadoop on a local virtual machine using the steps from
> this
> URL
>
>
> https://www.digitalocean.com/community/tutorials/how-to-install-hadoop-on-ubuntu-13-10
>
> In the local machine I write a little Spark application in java to read a
> file from the hadoop instance installed in the virtual machine.
>
> The code is below
>
> public static void main(String[] args) {
>
>     JavaSparkContext sc = new JavaSparkContext(new
>         SparkConf().setAppName("Spark Count").setMaster("local"));
>
>     JavaRDD<String> lines =
>         sc.textFile("hdfs://10.62.57.141:50070/tmp/lines.txt");
>     JavaRDD<Integer> lengths = lines.flatMap(new FlatMapFunction<String,
>         Integer>() {
>         @Override
>         public Iterable<Integer> call(String t) throws Exception {
>             return Arrays.asList(t.length());
>         }
>     });
>     List<Integer> collect = lengths.collect();
>     int totalLength = lengths.reduce(new Function2<Integer, Integer,
>         Integer>() {
>         @Override
>         public Integer call(Integer v1, Integer v2) throws Exception {
>             return v1 + v2;
>         }
>     });
>     System.out.println(totalLength);
> }
>
>
> The application throws this exception
>
> Exception in thread "main" java.io.IOException: Failed on local
> exception: com.google.protobuf.InvalidProtocolBufferException: Protocol
> message end-group tag did not match expected tag.; Host Details : local
> host
> is: "TOSHIBA-PC/192.168.56.1"; destination host is: "10.62.57.141":50070;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1351)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at
>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at
>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at
>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
> at
>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
> at
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
> at
>
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
> at
>
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
> at
>
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
>
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
> at
> org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1701)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1647)
> at
>
> org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:222)
> at
>
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
> at
> org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:201)
> at
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
> at
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
> at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
> at
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
> at
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
> at
> org.apache.spark.rdd.FlatMappedRDD.getPartitions(FlatMappedRDD.scala:30)
> at
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
> at
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1367)
> at org.apache.spark.rdd.RDD.collect(RDD.scala:797)
>