[ 
https://issues.apache.org/jira/browse/SPARK-20378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975395#comment-15975395
 ] 

Apache Spark commented on SPARK-20378:
--------------------------------------

User 'ymahajan' has created a pull request for this issue:
https://github.com/apache/spark/pull/17689

> StreamSinkProvider should provide schema in createSink. 
> --------------------------------------------------------
>
>                 Key: SPARK-20378
>                 URL: https://issues.apache.org/jira/browse/SPARK-20378
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: Yogesh Mahajan
>
> We have our own Sink implementation based on our in memory store and this 
> sink is also queryable through SparkSQL with corresponding logical and 
> physical plans.  It is very similar to memory Sink provided in structured 
> streaming. Custom Sinks are registered through DataSource and it's per query 
> and hence per schema. 
> StreamingQueryManager can have multiple queries in one sparkSession and their 
> schema could be different. 
> So with the proposed changes StreamSinkProvider trait will change as follows 
> - 
> From this definition 
> trait StreamSinkProvider {
>   def createSink(
>       sqlContext: SQLContext,
>       parameters: Map[String, String],
>       partitionColumns: Seq[String],
>       outputMode: OutputMode): Sink
> }
> to this definition - 
> trait StreamSinkProvider {
>   def createSink(
>       schema: StructType,
>       sqlContext: SQLContext,
>       parameters: Map[String, String],
>       partitionColumns: Seq[String],
>       outputMode: OutputMode): Sink
> }



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to