Re: InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag
I have resolved this issue. There were actually two problems. The first problem in the application was the HDFS port: it was configured (in core-site.xml) as 9000, but in the application I was using 50070, which (I think) is the default web UI port rather than the RPC port. The second problem: I had forgotten to put the file into HDFS :( .

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/InvalidProtocolBufferException-Protocol-message-end-group-tag-did-not-match-expected-tag-tp21777p21781.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
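For reference, the fix means the application must address the NameNode RPC endpoint defined by fs.defaultFS (port 9000 here), not the 50070 web UI port. A sketch of the relevant core-site.xml entry, assuming the VM address used elsewhere in this thread:

```xml
<!-- core-site.xml (sketch): fs.defaultFS is the NameNode RPC endpoint.
     The Spark job should then read hdfs://10.62.57.141:9000/tmp/lines.txt,
     not port 50070 (the NameNode web UI). -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://10.62.57.141:9000</value>
</property>
```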
Re: InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag
Hadoop version : 2.6.0
Spark version : 1.2.1

Here is also the pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>TestSpark</groupId>
  <artifactId>TestSpark</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.2.1</version>
    </dependency>
  </dependencies>
  <build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Best regards

On 24 February 2015 at 08:43, Ted Yu wrote:
> bq. have installed hadoop on a local virtual machine
>
> Can you tell us the release of hadoop you installed ?
>
> What Spark release are you using ? Or to be more specific, what hadoop
> release was the Spark built against ?
>
> Cheers
>
> On Mon, Feb 23, 2015 at 9:37 PM, fanooos wrote:
> [...]
Re: InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag
bq. have installed hadoop on a local virtual machine

Can you tell us the release of hadoop you installed ?

What Spark release are you using ? Or to be more specific, what hadoop release was the Spark built against ?

Cheers

On Mon, Feb 23, 2015 at 9:37 PM, fanooos wrote:
> Hi
>
> I have installed hadoop on a local virtual machine using the steps from
> this URL
>
> https://www.digitalocean.com/community/tutorials/how-to-install-hadoop-on-ubuntu-13-10
>
> On the local machine I wrote a little Spark application in Java to read a
> file from the hadoop instance installed in the virtual machine.
>
> The code is below
>
>     public static void main(String[] args) {
>
>         JavaSparkContext sc = new JavaSparkContext(new
>                 SparkConf().setAppName("Spark Count").setMaster("local"));
>
>         JavaRDD<String> lines =
>                 sc.textFile("hdfs://10.62.57.141:50070/tmp/lines.txt");
>         JavaRDD<Integer> lengths = lines.flatMap(new FlatMapFunction<String, Integer>() {
>             @Override
>             public Iterable<Integer> call(String t) throws Exception {
>                 return Arrays.asList(t.length());
>             }
>         });
>         List<Integer> collect = lengths.collect();
>         int totalLength = lengths.reduce(new Function2<Integer, Integer, Integer>() {
>             @Override
>             public Integer call(Integer v1, Integer v2) throws Exception {
>                 return v1 + v2;
>             }
>         });
>         System.out.println(totalLength);
>     }
>
> The application throws this exception
>
>     Exception in thread "main" java.io.IOException: Failed on local
>     exception: com.google.protobuf.InvalidProtocolBufferException: Protocol
>     message end-group tag did not match expected tag.; Host Details : local host
>     is: "TOSHIBA-PC/192.168.56.1"; destination host is: "10.62.57.141":50070;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1351)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>         at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1701)
>         at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1647)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:222)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:201)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
>         at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
>         at org.apache.spark.rdd.FlatMappedRDD.getPartitions(FlatMappedRDD.scala:30)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:222)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:220)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:220)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1367)
>         at org.apache.spark.rdd.RDD.collect(RDD.scala:797)
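For anyone who wants to sanity-check the job's logic without a cluster, the computation above (each line mapped to its length, then summed) can be sketched in plain Java with an in-memory list standing in for the RDD. The sample lines below are made up for illustration; no Spark or HDFS is needed:

```java
import java.util.Arrays;
import java.util.List;

public class LineLengthSum {
    public static void main(String[] args) {
        // Stand-in for the lines the job reads from hdfs://.../tmp/lines.txt
        List<String> lines = Arrays.asList("hello", "spark", "hdfs");

        // Same map-then-reduce as the Spark job:
        // each line -> its length, then sum the lengths.
        int totalLength = lines.stream()
                .mapToInt(String::length)
                .sum();

        System.out.println(totalLength); // 5 + 5 + 4 = 14
    }
}
```

Once the URI points at the RPC port and the file actually exists in HDFS, the Spark version produces the same sum over the real file's lines.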