spark git commit: [SPARK-15255][SQL] limit the length of name for cached DataFrame

rxin Tue, 10 May 2016 22:30:07 -0700

Repository: spark
Updated Branches:
  refs/heads/branch-2.0 a675f5e1d -> 1b446a461



[SPARK-15255][SQL] limit the length of name for cached DataFrame

## What changes were proposed in this pull request?

We use the tree string of a SparkPlan as the name of a cached DataFrame. That 
string can be very long and can make the browser unresponsive. This PR limits 
the length of the name to 1024 characters.
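
To make the truncation rule concrete, here is a minimal, self-contained sketch. It assumes the two-argument `StringUtils.abbreviate(str, maxWidth)` from Commons Lang returns the input unchanged when it fits and otherwise keeps the first `maxWidth - 3` characters followed by `...`. The names `CacheNameSketch`, `abbreviate`, and `cacheName` are hypothetical stand-ins for illustration, not part of the patch.

```scala
object CacheNameSketch {
  // Hypothetical stand-in for StringUtils.abbreviate(str, maxWidth):
  // strings that fit are returned as-is; longer ones are cut to
  // maxWidth characters in total, the last three being "...".
  def abbreviate(s: String, maxWidth: Int): String = {
    require(maxWidth >= 4, "maxWidth must be at least 4")
    if (s.length <= maxWidth) s
    else s.substring(0, maxWidth - 3) + "..."
  }

  // Mirrors the naming logic in the diff: an explicit table name is used
  // verbatim; otherwise the plan string is capped at 1024 characters.
  def cacheName(tableName: Option[String], planString: String): String =
    tableName
      .map(n => s"In-memory table $n")
      .getOrElse(abbreviate(planString, 1024))

  def main(args: Array[String]): Unit = {
    val longPlan = "x" * 5000                   // stands in for a large plan tree string
    println(cacheName(None, longPlan).length)   // prints 1024
    println(cacheName(Some("t1"), longPlan))    // prints In-memory table t1
  }
}
```

Only the fallback name is abbreviated. A name derived from `tableName` is passed through unchanged, as in the patch.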

## How was this patch tested?

Here is how the UI looks right now:

![ui](https://cloud.githubusercontent.com/assets/40902/15163355/d5640f9c-16bc-11e6-8655-809af8a4fed1.png)

Author: Davies Liu <[email protected]>

Closes #13033 from davies/cache_name.

(cherry picked from commit 1fbe2785dff53a9eae5f13809091de7520a1e1b2)
Signed-off-by: Reynold Xin <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1b446a46
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1b446a46
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1b446a46

Branch: refs/heads/branch-2.0
Commit: 1b446a461de42358895d252739bf6477e775b2a6
Parents: a675f5e
Author: Davies Liu <[email protected]>
Authored: Tue May 10 22:29:41 2016 -0700
Committer: Reynold Xin <[email protected]>
Committed: Tue May 10 22:29:48 2016 -0700

----------------------------------------------------------------------
 .../spark/sql/execution/columnar/InMemoryTableScanExec.scala   | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/1b446a46/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
index a36071a..009fbaa 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
@@ -19,6 +19,8 @@ package org.apache.spark.sql.execution.columnar
 
 import scala.collection.mutable.ArrayBuffer
 
+import org.apache.commons.lang.StringUtils
+
 import org.apache.spark.{Accumulable, Accumulator}
 import org.apache.spark.network.util.JavaUtils
 import org.apache.spark.rdd.RDD
@@ -177,7 +179,9 @@ private[sql] case class InMemoryRelation(
       }
     }.persist(storageLevel)
 
-    cached.setName(tableName.map(n => s"In-memory table $n").getOrElse(child.toString))
+    cached.setName(
+      tableName.map(n => s"In-memory table $n")
+        .getOrElse(StringUtils.abbreviate(child.toString, 1024)))
     _cachedColumnBuffers = cached
   }
 


