gaborgsomogyi commented on issue #24738: [WIP][SPARK-23098][SQL] Migrate Kafka 
Batch source to v2.
URL: https://github.com/apache/spark/pull/24738#issuecomment-503029450
 
 
   @rdblue @HeartSaVioR Thanks for the helpful comments! I've just had a look 
and the suggested approach looks good.
   
   @HeartSaVioR thanks for raising those important concerns about adding new 
required columns to Kafka. Regarding the two bullet points you mentioned, the 
`dropDuplicates` issue shouldn't be a problem here because this is only the batch 
source, but this one still stands: `"select *" returns different schemas and 
results.`
   
   The topic column also looked odd to me at first glance, but from a usage 
perspective it makes sense and is useful. Lately I've seen a couple of use cases 
where an error topic is used as a sink when processing was not successful.
   
   I'm basically fine with exposing metadata once questions like the ones 
@HeartSaVioR brought up have been discussed. I presume this can be done in a 
separate PR.
   

