[jira] [Created] (SPARK-21429) show on structured Dataset is equivalent to writeStream to console once

Jacek Laskowski (JIRA) Sun, 16 Jul 2017 11:24:35 -0700

Jacek Laskowski created SPARK-21429:
---------------------------------------


             Summary: show on structured Dataset is equivalent to writeStream to console once
                 Key: SPARK-21429
                 URL: https://issues.apache.org/jira/browse/SPARK-21429
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.2.0
            Reporter: Jacek Laskowski
            Priority: Minor


While working with Datasets it's often helpful to call {{show}}. It does not 
work for streaming Datasets (it throws an {{AnalysisException}}, see below), 
but I think it could simply do the following under the covers, which would be 
very helpful and would save plenty of keystrokes.

{code}
val sq = ...
scala> sq.isStreaming
res0: Boolean = true

import org.apache.spark.sql.streaming.Trigger
scala> sq.writeStream.format("console").trigger(Trigger.Once).start
{code}

Since {{show}} returns {{Unit}}, starting the one-off query and discarding its 
result could just work.
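
To make the proposal concrete, here is a minimal sketch of the behaviour as a user-side helper. {{ShowOnce}} and {{showOnce}} are hypothetical names for illustration only; they are not part of Spark and not necessarily how the change would be implemented inside {{Dataset.show}}. The sketch assumes it is acceptable to block until the single trigger finishes, and it only uses existing public API ({{isStreaming}}, {{writeStream}}, {{Trigger.Once()}}, {{awaitTermination}}).

```scala
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.streaming.Trigger

// Hypothetical helper (not part of Spark): mimics what `show` could do
// for a streaming Dataset according to this proposal.
object ShowOnce {
  def showOnce[T](ds: Dataset[T]): Unit = {
    if (ds.isStreaming) {
      // Run exactly one micro-batch to the console sink, then stop.
      val query = ds.writeStream
        .format("console")
        .trigger(Trigger.Once())
        .start()
      // Assumption: block until the one-off batch completes, so the call
      // behaves like a synchronous `show`.
      query.awaitTermination()
    } else {
      // Batch Datasets keep the existing behaviour.
      ds.show()
    }
  }
}
```

With such a helper in scope, {{ShowOnce.showOnce(sq)}} would replace the longer {{writeStream}} chain above; what the single batch contains depends on the source, so it may well be empty.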

Currently {{show}} on a streaming Dataset throws an {{AnalysisException}}:

{code}
scala> sq.show
org.apache.spark.sql.AnalysisException: Queries with streaming sources must be executed with writeStream.start();;
rate
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:297)
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:36)
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:34)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForBatch(UnsupportedOperationChecker.scala:34)
  at org.apache.spark.sql.execution.QueryExecution.assertSupported(QueryExecution.scala:63)
  at org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:74)
  at org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:72)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:78)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:78)
  at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84)
  at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3027)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2340)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2553)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:241)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:671)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:630)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:639)
  ... 50 elided
{code}



