Re: Spark 1.5: How to trigger expression execution through UnsafeRow/TungstenProject

Ted Yu Wed, 09 Sep 2015 14:18:38 -0700

Here is the example from Reynold (http://search-hadoop.com/m/q3RTtfvs1P1YDK8d):


scala> val data = sc.parallelize(1 to size, 5).map(x => (util.Random.nextInt(size / repetitions), util.Random.nextDouble)).toDF("key", "value")
data: org.apache.spark.sql.DataFrame = [key: int, value: double]

scala> data.explain
== Physical Plan ==
TungstenProject [_1#0 AS key#2,_2#1 AS value#3]
 Scan PhysicalRDD[_1#0,_2#1]

...
scala> val res = df.groupBy("key").agg(sum("value"))
res: org.apache.spark.sql.DataFrame = [key: int, sum(value): double]

scala> res.explain
15/09/09 14:17:26 INFO MemoryStore: ensureFreeSpace(88456) called with curMem=84037, maxMem=556038881
15/09/09 14:17:26 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 86.4 KB, free 530.1 MB)
15/09/09 14:17:26 INFO MemoryStore: ensureFreeSpace(19788) called with curMem=172493, maxMem=556038881
15/09/09 14:17:26 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 19.3 KB, free 530.1 MB)
15/09/09 14:17:26 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:42098 (size: 19.3 KB, free: 530.2 MB)
15/09/09 14:17:26 INFO SparkContext: Created broadcast 2 from explain at <console>:27
== Physical Plan ==
TungstenAggregate(key=[key#19], functions=[(sum(value#20),mode=Final,isDistinct=false)], output=[key#19,sum(value)#21])
 TungstenExchange hashpartitioning(key#19)
  TungstenAggregate(key=[key#19], functions=[(sum(value#20),mode=Partial,isDistinct=false)], output=[key#19,currentSum#25])
   Scan ParquetRelation[file:/tmp/data][key#19,value#20]

FYI
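
To check a plan programmatically rather than by eye, something like the following could work. It is only a minimal sketch: `PlanCheck` and `tungstenOperators` are hypothetical names, not part of Spark, and the helper simply scans the plan text for operator names that start with "Tungsten". The sample text is the plan printed above.

```scala
// Minimal sketch: list the Tungsten operators named in a physical plan dump.
// PlanCheck and tungstenOperators are hypothetical helpers, not Spark APIs.
object PlanCheck {
  // Matches operator names such as TungstenProject or TungstenAggregate.
  private val TungstenOp = "Tungsten[A-Za-z]+".r

  // Returns each distinct Tungsten operator name, in order of first appearance.
  def tungstenOperators(planText: String): List[String] =
    TungstenOp.findAllIn(planText).toList.distinct
}

// In spark-shell the plan text could presumably be obtained with
// (assumed usage, Spark 1.5.x):
//   val planText = res.queryExecution.executedPlan.toString
// Here the plan from the message above is pasted in as a literal instead.
val planText =
  """TungstenAggregate(key=[key#19], functions=[(sum(value#20),mode=Final,isDistinct=false)], output=[key#19,sum(value)#21])
    | TungstenExchange hashpartitioning(key#19)
    |  TungstenAggregate(key=[key#19], functions=[(sum(value#20),mode=Partial,isDistinct=false)], output=[key#19,currentSum#25])
    |   Scan ParquetRelation[file:/tmp/data][key#19,value#20]""".stripMargin

// Prints: TungstenAggregate, TungstenExchange
println(PlanCheck.tungstenOperators(planText).mkString(", "))
```

An empty result would suggest the query was planned with the regular Project/Aggregate operators instead.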

On Wed, Sep 9, 2015 at 12:31 PM, lonikar <loni...@gmail.com> wrote:

> The Tungsten, codegen, etc. options are enabled by default, but I am not able
> to get execution to go through UnsafeRow/TungstenProject. It still
> executes using InternalRow/Project.
>
> I see this comment in SparkStrategies.scala: "If unsafe mode is enabled and we
> support these data types in Unsafe, use the tungsten project. Otherwise use
> the normal project."
>
> Can someone give example code that triggers this? I tried some of
> the primitive types, but it did not work.
