[
https://issues.apache.org/jira/browse/SPARK-20378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-20378:
------------------------------------
Assignee: (was: Apache Spark)
> StreamSinkProvider should provide schema in createSink.
> --------------------------------------------------------
>
> Key: SPARK-20378
> URL: https://issues.apache.org/jira/browse/SPARK-20378
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 2.1.0
> Reporter: Yogesh Mahajan
>
> We have our own Sink implementation based on our in memory store and this
> sink is also queryable through SparkSQL with corresponding logical and
> physical plans. It is very similar to memory Sink provided in structured
> streaming. Custom Sinks are registered through DataSource and it's per query
> and hence per schema.
> StreamingQueryManager can have multiple queries in one sparkSession and their
> schema could be different.
> So with the proposed changes StreamSinkProvider trait will change as follows
> -
> From this definition
> trait StreamSinkProvider {
> def createSink(
> sqlContext: SQLContext,
> parameters: Map[String, String],
> partitionColumns: Seq[String],
> outputMode: OutputMode): Sink
> }
> to this definition -
> trait StreamSinkProvider {
> def createSink(
> schema: StructType,
> sqlContext: SQLContext,
> parameters: Map[String, String],
> partitionColumns: Seq[String],
> outputMode: OutputMode): Sink
> }
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]