Hi all,

I tested spark-1.5.*-bin-hadoop2.6 and found the following problem; it's easy to reproduce.

Environment:
OS:
CentOS release 6.5 (Final)
2.6.32-431.el6.x86_64

JVM:
java version "1.7.0_60"
Java(TM) SE Runtime Environment (build 1.7.0_60-b19)
Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)


With both spark.unsafe.offHeap and spark.sql.tungsten.enabled set to true, the query
"select distinct name from people" fails with java.lang.NullPointerException.
With either spark.unsafe.offHeap or spark.sql.tungsten.enabled disabled, the query succeeds.
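Based on that observation, keeping tungsten on but turning off-heap off should avoid the failure; a minimal spark-defaults.conf sketch of that working configuration (untested beyond the behavior reported here):

```
spark.driver.memory        16g
spark.unsafe.offHeap       false
spark.sql.tungsten.enabled true
```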


$ pwd
/data1/spark-1.5.2-bin-hadoop2.6

$ cat conf/spark-defaults.conf
spark.driver.memory     16g
spark.unsafe.offHeap       true
spark.sql.tungsten.enabled true

$ bin/beeline
0: jdbc:hive2://192.168.1.19:10000/default> show tables;
+------------+--------------+--+
| tableName  | isTemporary  |
+------------+--------------+--+
+------------+--------------+--+
No rows selected (0.66 seconds)
0: jdbc:hive2://192.168.1.19:10000/default> CREATE TABLE people USING 
org.apache.spark.sql.json OPTIONS (path 
"examples/src/main/resources/people.json");
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.378 seconds)
0: jdbc:hive2://192.168.1.19:10000/default> show tables;
+------------+--------------+--+
| tableName  | isTemporary  |
+------------+--------------+--+
| people     | false        |
+------------+--------------+--+
1 row selected (0.039 seconds)
0: jdbc:hive2://192.168.1.19:10000/default> select * from people;
+-------+----------+--+
|  age  |   name   |
+-------+----------+--+
| NULL  | Michael  |
| 30    | Andy     |
| 19    | Justin   |
+-------+----------+--+
3 rows selected (1.515 seconds)
0: jdbc:hive2://192.168.1.19:10000/default> select distinct name from people;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 2.0 failed 1 times, most recent failure: Lost task 1.0 in stage 2.0 (TID 5, localhost): java.lang.NullPointerException
        at org.apache.spark.sql.catalyst.expressions.UnsafeRowWriters$UTF8StringWriter.getSize(UnsafeRowWriters.java:90)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
        at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator$$anonfun$generateResultProjection$3.apply(TungstenAggregationIterator.scala:306)
        at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator$$anonfun$generateResultProjection$3.apply(TungstenAggregationIterator.scala:305)
        at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.next(TungstenAggregationIterator.scala:666)
        at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.next(TungstenAggregationIterator.scala:76)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:119)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace: (state=,code=0)
0: jdbc:hive2://192.168.1.19:10000/default>
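The trace bottoms out in UTF8StringWriter.getSize(UnsafeRowWriters.java:90), which suggests a null string value reaching the row writer on the off-heap path. A toy Java illustration of that failure pattern and the usual null guard (a hypothetical stand-in for illustration only, not Spark's actual writer code):

```java
// Toy stand-in for a row writer; NOT Spark's actual UTF8StringWriter.
final class ToyStringWriter {
    // Unguarded: throws NullPointerException when value is null,
    // mirroring the getSize failure in the stack trace above.
    static int getSizeUnguarded(String value) {
        return value.getBytes().length;
    }

    // Guarded variant: treats a null value as zero-length.
    static int getSizeGuarded(String value) {
        return value == null ? 0 : value.getBytes().length;
    }

    public static void main(String[] args) {
        System.out.println(getSizeGuarded(null)); // prints 0
        try {
            getSizeUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("NPE, as in the report above");
        }
    }
}
```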



