Jacek Laskowski created SPARK-21429:
---------------------------------------
Summary: show on structured Dataset is equivalent to writeStream
to console once
Key: SPARK-21429
URL: https://issues.apache.org/jira/browse/SPARK-21429
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 2.2.0
Reporter: Jacek Laskowski
Priority: Minor
While working with Datasets it's often helpful to call {{show}}. It does not
work for streaming Datasets (it throws {{AnalysisException}}, see below), but
I think it could just do the following under the covers, which would be very
helpful (and would save plenty of keystrokes).
{code}
scala> val sq = ...
scala> sq.isStreaming
res0: Boolean = true
scala> import org.apache.spark.sql.streaming.Trigger
scala> sq.writeStream.format("console").trigger(Trigger.Once()).start()
{code}
Since {{show}} returns {{Unit}}, that could just work: the {{StreamingQuery}}
returned by {{start}} can simply be discarded.
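The proposed behaviour can be approximated today with a small user-side helper. The sketch below is only an illustration, not the way {{Dataset.show}} is or would be implemented: {{showOnce}} is a hypothetical name, and blocking on {{awaitTermination}} until the single batch has been printed is an assumption that goes beyond the snippet above.

```scala
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.streaming.Trigger

// Hypothetical helper (not part of Spark): approximates the proposed show semantics.
def showOnce[T](ds: Dataset[T]): Unit =
  if (ds.isStreaming) {
    // Streaming Dataset: run one micro-batch to the console sink.
    // Assumption: wait for that batch to finish so the call behaves like a blocking show.
    ds.writeStream
      .format("console")
      .trigger(Trigger.Once())
      .start()
      .awaitTermination()
  } else {
    // Batch Dataset: the regular show already works.
    ds.show()
  }
```

With this in scope, {{showOnce(sq)}} prints one batch for the streaming Dataset above, and the same call falls back to {{show}} for a batch Dataset.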
Currently {{show}} on a streaming Dataset throws {{AnalysisException}}:
{code}
scala> sq.show
org.apache.spark.sql.AnalysisException: Queries with streaming sources must be executed with writeStream.start();;
rate
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:297)
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:36)
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:34)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
  at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForBatch(UnsupportedOperationChecker.scala:34)
  at org.apache.spark.sql.execution.QueryExecution.assertSupported(QueryExecution.scala:63)
  at org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:74)
  at org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:72)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:78)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:78)
  at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84)
  at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3027)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2340)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2553)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:241)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:671)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:630)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:639)
... 50 elided
{code}