Github user ajithme commented on the issue:

    https://github.com/apache/spark/pull/22277
  
    Attaching a SQL file to reproduce the issue and show the effect of this PR:
    [test.txt](https://github.com/apache/spark/files/2356468/test.txt)

    ### Without patch:
    ```
    spark-2.3.1-bin-hadoop2.7/bin # ./spark-sql -f test.txt
    Time taken: 3.405 seconds
    Time taken: 0.373 seconds
    Time taken: 0.202 seconds
    Time taken: 0.024 seconds
    18/09/06 11:29:49 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table11 specified for non-external table:table11
    Time taken: 0.541 seconds
    18/09/06 11:29:49 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table22 specified for non-external table:table22
    Time taken: 0.115 seconds
    18/09/06 11:29:50 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table33 specified for non-external table:table33
    Time taken: 6.075 seconds
    18/09/06 11:31:38 ERROR SparkSQLDriver: Failed in [
    create table table44 as
    select a.*
    from
    (
    select
    (concat(
    case when a1 is null then '' else cast(a1 as string) end,'|~|',
    case when a2 is null then '' else cast(a2 as string) end,'|~|',
    case when a3 is null then '' else cast(a3 as string) end,'|~|',
    case when a4 is null then '' else cast(a4 as string) end,'|~|',
    case when a5 is null then '' else cast(a5 as string) end,'|~|',
    case when a6 is null then '' else cast(a6 as string) end,'|~|',
    case when a7 is null then '' else cast(a7 as string) end,'|~|',
    case when a8 is null then '' else cast(a8 as string) end,'|~|',
    case when a9 is null then '' else cast(a9 as string) end,'|~|',
    case when a10 is null then '' else cast(a10 as string) end,'|~|',
    case when a11 is null then '' else cast(a11 as string) end,'|~|',
    case when a12 is null then '' else cast(a12 as string) end,'|~|',
    case when a13 is null then '' else cast(a13 as string) end,'|~|',
    case when a14 is null then '' else cast(a14 as string) end,'|~|',
    case when a15 is null then '' else cast(a15 as string) end,'|~|',
    case when a16 is null then '' else cast(a16 as string) end,'|~|',
    case when a17 is null then '' else cast(a17 as string) end,'|~|',
    case when a18 is null then '' else cast(a18 as string) end,'|~|',
    case when a19 is null then '' else cast(a19 as string) end
    )) as KEY_ID ,
    case when a1 is null then '' else cast(a1 as string) end as a1,
    case when a2 is null then '' else cast(a2 as string) end as a2,
    case when a3 is null then '' else cast(a3 as string) end as a3,
    case when a4 is null then '' else cast(a4 as string) end as a4,
    case when a5 is null then '' else cast(a5 as string) end as a5,
    case when a6 is null then '' else cast(a6 as string) end as a6,
    case when a7 is null then '' else cast(a7 as string) end as a7,
    case when a8 is null then '' else cast(a8 as string) end as a8,
    case when a9 is null then '' else cast(a9 as string) end as a9,
    case when a10 is null then '' else cast(a10 as string) end as a10,
    case when a11 is null then '' else cast(a11 as string) end as a11,
    case when a12 is null then '' else cast(a12 as string) end as a12,
    case when a13 is null then '' else cast(a13 as string) end as a13,
    case when a14 is null then '' else cast(a14 as string) end as a14,
    case when a15 is null then '' else cast(a15 as string) end as a15,
    case when a16 is null then '' else cast(a16 as string) end as a16,
    case when a17 is null then '' else cast(a17 as string) end as a17,
    case when a18 is null then '' else cast(a18 as string) end as a18,
    case when a19 is null then '' else cast(a19 as string) end as a19
    from table22
    ) A
    left join table11 B ON A.KEY_ID = B.KEY_ID
    where b.KEY_ID is null]
    java.lang.OutOfMemoryError: GC overhead limit exceeded
            at java.lang.Class.copyConstructors(Class.java:3130)
            at java.lang.Class.getConstructors(Class.java:1651)
            at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$makeCopy$1.apply(TreeNode.scala:387)
            at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$makeCopy$1.apply(TreeNode.scala:385)
            at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
            at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:385)
            at org.apache.spark.sql.catalyst.trees.TreeNode.withNewChildren(TreeNode.scala:244)
            at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized$lzycompute(Expression.scala:190)
            at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized(Expression.scala:188)
            at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$1.apply(Expression.scala:189)
            at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$1.apply(Expression.scala:189)
            at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
            at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
            at scala.collection.immutable.List.foreach(List.scala:381)
            at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
            at scala.collection.immutable.List.map(List.scala:285)
            at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized$lzycompute(Expression.scala:189)
            at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized(Expression.scala:188)
            at org.apache.spark.sql.catalyst.expressions.ExpressionSet.add(ExpressionSet.scala:63)
            at org.apache.spark.sql.catalyst.expressions.ExpressionSet$$anonfun$$plus$plus$1.apply(ExpressionSet.scala:79)
            at org.apache.spark.sql.catalyst.expressions.ExpressionSet$$anonfun$$plus$plus$1.apply(ExpressionSet.scala:79)
            at scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:316)
            at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:972)
            at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:972)
            at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:972)
            at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:972)
            at org.apache.spark.sql.catalyst.expressions.ExpressionSet.$plus$plus(ExpressionSet.scala:79)
            at org.apache.spark.sql.catalyst.expressions.ExpressionSet.$plus$plus(ExpressionSet.scala:55)
            at org.apache.spark.sql.catalyst.plans.logical.UnaryNode$$anonfun$getAliasedConstraints$1.apply(LogicalPlan.scala:254)
            at org.apache.spark.sql.catalyst.plans.logical.UnaryNode$$anonfun$getAliasedConstraints$1.apply(LogicalPlan.scala:249)
            at scala.collection.immutable.List.foreach(List.scala:381)
            at org.apache.spark.sql.catalyst.plans.logical.UnaryNode.getAliasedConstraints(LogicalPlan.scala:249)
    ```  
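
    The trace shows the optimizer stuck canonicalizing expressions while propagating constraints through the aliased projection (`UnaryNode.getAliasedConstraints` → `ExpressionSet.++` → `Expression.canonicalized`). As a quick cross-check that this code path is the bottleneck, constraint propagation can be switched off entirely; a minimal sketch, assuming the standard `spark.sql.constraintPropagation.enabled` SQLConf flag (a diagnostic, not the fix proposed here):

    ```scala
    import org.apache.spark.sql.SparkSession

    // Disabling constraint propagation sidesteps getAliasedConstraints entirely,
    // so the CTAS from test.txt should plan without the canonicalization blowup.
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    spark.conf.set("spark.sql.constraintPropagation.enabled", "false")
    ```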
    
    ### After applying patch: 
    ```
    spark-2.3.1-bin-hadoop2.7/bin # ./spark-sql -f test.txt
    Time taken: 3.469 seconds
    Time taken: 0.294 seconds
    Time taken: 0.223 seconds
    Time taken: 0.023 seconds
    18/09/06 11:33:08 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table11 specified for non-external table:table11
    Time taken: 0.546 seconds
    18/09/06 11:33:08 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table22 specified for non-external table:table22
    Time taken: 0.11 seconds
    18/09/06 11:33:10 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table33 specified for non-external table:table33
    Time taken: 6.258 seconds
    18/09/06 11:33:15 WARN HiveMetaStore: Location: file:/user/hive/warehouse/table44 specified for non-external table:table44
    Time taken: 2.603 seconds
    ``` 
    
    As shown above, when the projection contains many aliases, computing the aliased constraints introduces significant overhead with the current code, to the point of exhausting memory during optimization.
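
    For anyone who wants to reproduce this without the attached file, here is a self-contained sketch of the same query shape (hypothetical driver code: the table/column names mirror the SQL above, and the tiny `range(10)` inputs are illustrative, since the slowdown happens during planning rather than execution):

    ```scala
    import org.apache.spark.sql.SparkSession

    object AliasConstraintRepro extends App {
      val spark = SparkSession.builder()
        .appName("alias-constraint-repro")
        .master("local[*]")
        .getOrCreate()

      // Build table22 with 19 columns a1..a19 and table11 with a KEY_ID column,
      // mirroring the shapes used in test.txt.
      val cols = (1 to 19).map(i => s"a$i")
      spark.range(10)
        .selectExpr(cols.map(c => s"id as $c"): _*)
        .createOrReplaceTempView("table22")
      spark.range(10)
        .selectExpr("cast(id as string) as KEY_ID")
        .createOrReplaceTempView("table11")

      // The same null-safe CASE/CAST expression the query repeats per column.
      def nullSafe(c: String): String =
        s"case when $c is null then '' else cast($c as string) end"

      val query =
        s"""select A.* from (
           |  select concat(${cols.map(nullSafe).mkString(", '|~|', ")}) as KEY_ID,
           |         ${cols.map(c => s"${nullSafe(c)} as $c").mkString(", ")}
           |  from table22
           |) A left join table11 B on A.KEY_ID = B.KEY_ID
           |where B.KEY_ID is null""".stripMargin

      // Trigger optimization only; no data needs to be read to hit the hot path.
      val t0 = System.nanoTime()
      spark.sql(query).queryExecution.optimizedPlan
      println(f"optimization took ${(System.nanoTime() - t0) / 1e9}%.1f s")
    }
    ```

    Planning time grows steeply with the number of aliases, so raising `1 to 19` makes the effect even more pronounced.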

