[
https://issues.apache.org/jira/browse/SPARK-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488420#comment-15488420
]
Tathagata Das commented on SPARK-15406:
---------------------------------------
[Combining the comments in the doc and on the JIRA]
Thank you very much for highlighting the issues in more detail. I agree
that string-string is NOT SUFFICIENT, and that configurations may need to be
passed in non-string form. In fact, Kafka configurations allow
[string-to-object|http://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#KafkaConsumer(java.util.Map)]
configs to be passed, and some configurations do need to be passed as
non-string objects (integers, etc.). So eventually, it's very likely that we
will have to come up with a solution that accommodates such cases. All I am
suggesting now is:
1. Let's put in something simple that works generically across all languages,
but that may not be complete.
2. After that, let's discuss and design something that makes it complete, and
this may include new APIs, language-specific stuff, etc.
Rather than trying to come up with a complete solution and delaying the
release, I feel that this approach allows us to give the community something
that covers the most common use cases as soon as possible. Even if it is
incomplete, the community can start testing ASAP and provide more concrete
feedback on what it would take to make this feature complete.
Do you think this plan is okay? If so, do you think there is anything in the
design for 1 that prevents us from doing what is described in 2?
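
To make point 1 concrete, here is a minimal sketch (not the proposed
implementation) of how string-to-string options could be turned into a
string-to-object map before being handed to a Kafka client. The function
to_object_config and the TYPED_KEYS table are hypothetical names invented for
this illustration; the keys are ordinary Kafka consumer settings, but which of
them would need typed handling is an assumption.

```python
from typing import Any, Callable, Dict

# Hypothetical table: option keys whose values should not stay strings.
# The choice of keys and parsers here is an assumption for illustration.
TYPED_KEYS: Dict[str, Callable[[str], Any]] = {
    "session.timeout.ms": int,
    "max.poll.records": int,
    "enable.auto.commit": lambda s: s.strip().lower() == "true",
}


def to_object_config(options: Dict[str, str]) -> Dict[str, Any]:
    """Convert string-to-string options into a string-to-object config."""
    config: Dict[str, Any] = {}
    for key, raw in options.items():
        # Keys without a registered parser are passed through as strings.
        parse = TYPED_KEYS.get(key, str)
        config[key] = parse(raw)
    return config


# String-to-string options, the form that works the same in every language.
options = {
    "bootstrap.servers": "host1:9092",
    "session.timeout.ms": "30000",
    "enable.auto.commit": "false",
}
config = to_object_config(options)
```

A lookup table like this only covers values that have a string form. Options
that must be arbitrary objects would still need the richer design discussed
in point 2.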
> Structured streaming support for consuming from Kafka
> -----------------------------------------------------
>
> Key: SPARK-15406
> URL: https://issues.apache.org/jira/browse/SPARK-15406
> Project: Spark
> Issue Type: New Feature
> Reporter: Cody Koeninger
>
> This is the parent JIRA to track all the work for building a Kafka source
> for Structured Streaming. Here is the design doc for an initial version of
> the Kafka Source.
> https://docs.google.com/document/d/19t2rWe51x7tq2e5AOfrsM9qb8_m7BRuv9fel9i0PqR8/edit?usp=sharing
> ================== Old description =========================
> Structured Streaming doesn't have support for Kafka yet. I personally feel
> like time-based indexing would make for a much better interface, but it's
> been pushed back to Kafka 0.10.1
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index