Note: I am running Spark on Windows 7 in standalone mode.

In my app, I run the following:

        DataFrame df = sqlContext.sql("SELECT * FROM tbBER");
        System.out.println("Count: " + df.count());

tbBER is registered as a temp table in my SQLContext. When I try to print
the number of rows in the DataFrame, the job fails and I get the following
error message:

        java.io.EOFException
        at
java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2747)
        at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1033)
        at
org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
        at 
org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
        at org.apache.hadoop.io.UTF8.readChars(UTF8.java:216)
        at org.apache.hadoop.io.UTF8.readString(UTF8.java:208)
        at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
        at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
        at 
org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
        at
org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:43)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1137)
        at
org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
        at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
        at
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

This only happens when I try to call df.count(). The rest runs fine. Is the
count() function not supported in standalone mode? The stack trace makes it
appear to be Hadoop functionality...



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-3-0-DataFrame-count-method-throwing-java-io-EOFException-tp22344.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

Reply via email to