I tried with the master branch.

scala> sc.getConf.getAll.foreach(println)
(spark.executor.id,driver)
(spark.driver.memory,16g)
(spark.unsafe.offHeap,true)
(spark.driver.host,172.18.128.12)
(spark.repl.class.uri,http://172.18.128.12:59780)
(spark.sql.tungsten.enabled,true)
(spark.fileserver.uri,http://172.18.128.12:50889)
(spark.driver.port,55842)
(spark.app.name,Spark shell)
(spark.externalBlockStore.folderName,spark-a3ebcb78-dd13-434c-ad9a-df2fd5e9c107)
(spark.jars,)
(spark.master,local[*])
(spark.submit.deployMode,client)
(spark.app.id,local-1447386154142)
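FWIW, here is a minimal standalone sketch of the same setup, in case anyone wants to reproduce this outside the shell. Everything in it mirrors the conf dump above except the app name, which is made up; spark.driver.memory is omitted because setting it from code after the JVM is up has no effect (pass --driver-memory 16g at launch instead). Wrap it in an object with a main method to actually run it.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Mirrors the conf dump above; the app name is hypothetical.
val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("offheap-distinct-repro")
  .set("spark.unsafe.offHeap", "true")       // off-heap allocation for Tungsten
  .set("spark.sql.tungsten.enabled", "true") // Tungsten execution path

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Same table and query as in the report quoted below.
sqlContext.read.json("examples/src/main/resources/people.json")
  .registerTempTable("people")
sqlContext.sql("select distinct name from people").show()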
When I ran:

scala> sqlContext.sql("select distinct name from people").show

15/11/12 19:42:00 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
    at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
    at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:728)
    at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:196)
    at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:211)
    at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:704)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:704)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

FYI
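My guess about the master-branch failure above, unverified: "Unable to acquire 262144 bytes of memory, got 0" reads as if the off-heap execution pool is sized at zero. On master the memory manager takes its off-heap budget from spark.memory.offHeap.size, which defaults to 0, so enabling off-heap allocation without also sizing the pool would plausibly fail exactly like this. If that's what is happening, something along these lines would be needed in spark-defaults.conf on master (the 2g figure is an arbitrary example, not a recommendation):

spark.driver.memory          16g
spark.unsafe.offHeap         true
spark.sql.tungsten.enabled   true
# assumption: any non-zero off-heap budget
spark.memory.offHeap.size    2g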
On Thu, Nov 12, 2015 at 5:20 PM, tyronecai <tyrone...@163.com> wrote:

> Hi, all:
>
> I tested spark-1.5.*-bin-hadoop2.6 and ran into this problem; it's easy
> to reproduce.
>
> Environment:
>
> OS:
> CentOS release 6.5 (Final)
> 2.6.32-431.el6.x86_64
>
> JVM:
> java version "1.7.0_60"
> Java(TM) SE Runtime Environment (build 1.7.0_60-b19)
> Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)
>
> With both spark.unsafe.offHeap and spark.sql.tungsten.enabled set to
> true, the query "select distinct name from people" fails with
> java.lang.NullPointerException. With either of the two disabled, it
> works.
>
> $ pwd
> /data1/spark-1.5.2-bin-hadoop2.6
>
> $ cat conf/spark-defaults.conf
> spark.driver.memory 16g
> spark.unsafe.offHeap true
> spark.sql.tungsten.enabled true
>
> $ bin/beeline
> 0: jdbc:hive2://192.168.1.19:10000/default> show tables;
> +------------+--------------+--+
> | tableName  | isTemporary  |
> +------------+--------------+--+
> +------------+--------------+--+
> No rows selected (0.66 seconds)
> 0: jdbc:hive2://192.168.1.19:10000/default> CREATE TABLE people USING
> org.apache.spark.sql.json OPTIONS (path
> "examples/src/main/resources/people.json");
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.378 seconds)
> 0: jdbc:hive2://192.168.1.19:10000/default> show tables;
> +------------+--------------+--+
> | tableName  | isTemporary  |
> +------------+--------------+--+
> | people     | false        |
> +------------+--------------+--+
> 1 row selected (0.039 seconds)
> 0: jdbc:hive2://192.168.1.19:10000/default> select * from people;
> +-------+----------+--+
> | age   | name     |
> +-------+----------+--+
> | NULL  | Michael  |
> | 30    | Andy     |
> | 19    | Justin   |
> +-------+----------+--+
> 3 rows selected (1.515 seconds)
> 0: jdbc:hive2://192.168.1.19:10000/default> select distinct name from people;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 1 in stage 2.0 failed 1 times, most recent failure: Lost task 1.0 in
> stage 2.0 (TID 5, localhost): java.lang.NullPointerException
>     at org.apache.spark.sql.catalyst.expressions.UnsafeRowWriters$UTF8StringWriter.getSize(UnsafeRowWriters.java:90)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>     at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator$$anonfun$generateResultProjection$3.apply(TungstenAggregationIterator.scala:306)
>     at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator$$anonfun$generateResultProjection$3.apply(TungstenAggregationIterator.scala:305)
>     at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.next(TungstenAggregationIterator.scala:666)
>     at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.next(TungstenAggregationIterator.scala:76)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:119)
>     at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> Driver stacktrace: (state=,code=0)
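For the 1.5.2 case, the report above already contains the practical workaround: disable one of the two flags until this is fixed. Restating that as a spark-defaults.conf sketch (untested here; it just writes out the observation from the report):

spark.driver.memory          16g
# workaround: disable off-heap allocation (or, equivalently per the
# report, keep it and set spark.sql.tungsten.enabled to false instead)
spark.unsafe.offHeap         false
spark.sql.tungsten.enabled   true

It may also work to flip the SQL flag per session from beeline with SET spark.sql.tungsten.enabled=false; I haven't verified that this takes effect through the Thrift server, though.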