[
https://issues.apache.org/jira/browse/SPARK-27975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877836#comment-16877836
]
Gabor Somogyi commented on SPARK-27975:
---------------------------------------
After moving the console sink from v1 to v2, the progress looks like this:
{code:java}
{
  "id" : "b0a8ffaa-900b-4a8b-8769-c2f43b9cf99b",
  "runId" : "6b4d01ed-7d0f-44de-8997-0b31bf2a106e",
  "name" : null,
  "timestamp" : "2019-07-03T12:55:17.104Z",
  "batchId" : 0,
  "numInputRows" : 1,
  "processedRowsPerSecond" : 0.5115089514066496,
  "durationMs" : {
    "addBatch" : 1486,
    "getBatch" : 3,
    "latestOffset" : 0,
    "queryPlanning" : 335,
    "triggerExecution" : 1955,
    "walCommit" : 69
  },
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "MemoryStream[value#1]",
    "startOffset" : null,
    "endOffset" : 0,
    "numInputRows" : 1,
    "processedRowsPerSecond" : 0.5115089514066496
  } ],
  "sink" : {
    "description" : "org.apache.spark.sql.execution.streaming.ConsoleTable@373e6cb2",
    "numOutputRows" : 1
  }
}
{code}
I think TextSocketV2 differs from ConsoleTable. In TextSocketV2 the mentioned
parameters (host and port) identify the instance, which is not the case for
ConsoleTable.
I agree that at the moment it's not possible to see the parameters of
ConsoleWrite, but in this case I suggest adding a log entry (consider a sink
such as Kafka, which may have 10+ different parameters).
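A minimal sketch of what such a log entry could look like. The class and both methods below are hypothetical helpers written for illustration, not existing Spark APIs, and the option values are just examples. It renders the options in the same style as the TextSocketV2 description and logs them once:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.logging.Logger;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Spark.
class SinkOptionsLogging {
    private static final Logger LOG =
        Logger.getLogger(SinkOptionsLogging.class.getName());

    // Renders "alias[key1: value1, key2: value2]".
    // Keys are sorted so the output is stable across runs.
    static String describe(String alias, Map<String, String> options) {
        String rendered = new TreeMap<>(options).entrySet().stream()
            .map(e -> e.getKey() + ": " + e.getValue())
            .collect(Collectors.joining(", "));
        return alias + "[" + rendered + "]";
    }

    // Intended to be called once when the sink is created, so the
    // per-batch progress output stays short even with many options.
    static void logSinkOptions(String alias, Map<String, String> options) {
        LOG.info("Created sink " + describe(alias, options));
    }

    public static void main(String[] args) {
        // Example option values, assumed for this sketch.
        logSinkOptions("console", Map.of("numRows", "20", "truncate", "false"));
    }
}
```

Logging the options once at creation keeps the progress JSON compact, which matters more as the number of parameters grows.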
> ConsoleSink should display alias and options for streaming progress
> -------------------------------------------------------------------
>
> Key: SPARK-27975
> URL: https://issues.apache.org/jira/browse/SPARK-27975
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 3.0.0
> Reporter: Jacek Laskowski
> Priority: Minor
>
> {{console}} sink shows itself in progress with this internal instance
> representation as follows:
> {code:json}
> "sink" : {
> "description" :
> "org.apache.spark.sql.execution.streaming.ConsoleSinkProvider@12fa674a"
> }
> {code}
> That is not very user-friendly and would be much better for debugging if it
> included the alias and options as {{socket}} does:
> {code}
> "sources" : [ {
> "description" : "TextSocketV2[host: localhost, port: 8888]",
> ...
> } ],
> {code}
> The entire sample progress looks as follows:
> {code}
> 19/06/07 11:47:18 INFO MicroBatchExecution: Streaming query made progress: {
> "id" : "26bedc9f-076f-4b15-8e17-f09609aaecac",
> "runId" : "8c365e74-7ac9-4fad-bf1b-397eb086661e",
> "name" : "socket-console",
> "timestamp" : "2019-06-07T09:47:18.969Z",
> "batchId" : 2,
> "numInputRows" : 0,
> "inputRowsPerSecond" : 0.0,
> "durationMs" : {
> "getEndOffset" : 0,
> "setOffsetRange" : 0,
> "triggerExecution" : 0
> },
> "stateOperators" : [ ],
> "sources" : [ {
> "description" : "TextSocketV2[host: localhost, port: 8888]",
> "startOffset" : 0,
> "endOffset" : 0,
> "numInputRows" : 0,
> "inputRowsPerSecond" : 0.0
> } ],
> "sink" : {
> "description" :
> "org.apache.spark.sql.execution.streaming.ConsoleSinkProvider@12fa674a"
> }
> }
> {code}