[ 
https://issues.apache.org/jira/browse/SPARK-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453412#comment-15453412
 ] 

Ofir Manor commented on SPARK-15406:
------------------------------------

For me, structured streaming is currently all about real window operations 
based on event time (fields in the event), not processing time (already in 2.0 
with some limitations). In a future release it may also be about some new 
sink-related features (managing exactly-once from Spark to relational databases 
or HDFS, automatically doing upserts to databases).
So, I just want the same Kafka features as before. The value is in the new 
processing capabilities; it just happens that my source of real-time events is 
Kafka, not Parquet files (as in 2.0).
I expect a couple of things. First, some basic config control: a pointer to 
Kafka (bootstrap servers), one or more topics, optionally an existing consumer 
group or an offset definition, and optionally a kerberised connection. Second, 
I expect exactly-once processing from Kafka to Spark (including correct 
recovery after a Spark node failure).
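
As a rough illustration of the config control described above, here is a 
minimal Python sketch. It assumes the option names used by the Kafka source 
that later shipped with Spark (kafka.bootstrap.servers, subscribe, 
startingOffsets, plus pass-through kafka.* consumer settings), which did not 
exist when this comment was written. The helper names kafka_source_options 
and kafka_stream are hypothetical, and the consumer group is left out.

```python
def kafka_source_options(bootstrap_servers, topics,
                         starting_offsets="latest",
                         kerberos_service_name=None):
    """Build an option map for a Kafka streaming source.

    Hypothetical helper: the keys follow the Kafka source that shipped
    in later Spark releases, not anything available in Spark 2.0.
    """
    if not bootstrap_servers:
        raise ValueError("at least one bootstrap server is required")
    if not topics:
        raise ValueError("at least one topic is required")

    options = {
        # Pointer to the Kafka cluster.
        "kafka.bootstrap.servers": ",".join(bootstrap_servers),
        # One or more topics, comma separated.
        "subscribe": ",".join(topics),
        # Offset definition: "earliest", "latest", or a JSON offset map.
        "startingOffsets": starting_offsets,
    }
    if kerberos_service_name is not None:
        # Assumption: SASL over plaintext; use SASL_SSL if TLS is required.
        options["kafka.security.protocol"] = "SASL_PLAINTEXT"
        options["kafka.sasl.kerberos.service.name"] = kerberos_service_name
    return options


def kafka_stream(spark, options):
    """Apply the options to a streaming reader on a SparkSession."""
    reader = spark.readStream.format("kafka")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()
```

A call such as kafka_source_options(["host1:9092"], ["events"], 
starting_offsets="earliest") gives the map that kafka_stream would pass to 
the reader. Keeping the map separate from the session makes the settings 
easy to inspect without a running cluster.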

> Structured streaming support for consuming from Kafka
> -----------------------------------------------------
>
>                 Key: SPARK-15406
>                 URL: https://issues.apache.org/jira/browse/SPARK-15406
>             Project: Spark
>          Issue Type: New Feature
>            Reporter: Cody Koeninger
>
> Structured streaming doesn't have support for Kafka yet.  I personally feel 
> like time-based indexing would make for a much better interface, but it's 
> been pushed back to Kafka 0.10.1
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index



