Thank you.

From: Peyman Mohajerian [mailto:[email protected]]
Sent: Thursday, February 24, 2022 9:00 AM
To: Michael Williams (SSI) <[email protected]>
Cc: [email protected]
Subject: Re: Consuming from Kafka to delta table - stream or batch mode?

If you want to batch consume from Kafka, the trigger-once config would work with Structured Streaming, and you get the benefit of the checkpointing.

On Thu, Feb 24, 2022 at 6:07 AM Michael Williams (SSI) <[email protected]> wrote:

Hello,

Our team is working with Spark (for the first time) and one of the sources we need to consume is Kafka (multiple topics). Are there any practical or operational issues to be aware of when deciding whether to

a) consume in batches until all messages are consumed, then shut down the Spark job, then when new messages show up, start a new job; or
b) use Spark streaming and run the job continuously?

If it makes a difference, the environment is on-premise Spark on k8s. Any experience shared is appreciated.

Thank you,
Mike

This electronic message may contain information that is Proprietary, Confidential, or legally privileged or protected. It is intended only for the use of the individual(s) and entity named in the message. If you are not an intended recipient of this message, please notify the sender immediately and delete the material from your computer. Do not deliver, distribute or copy this message and do not disclose its contents or take any action in reliance on the information it contains. Thank You.
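
For reference, the trigger-once suggestion in the reply above might look roughly like the PySpark sketch below. It is only an illustration under stated assumptions: the broker address, topic names, table path, and checkpoint path are hypothetical placeholders, and it assumes the Kafka source package (spark-sql-kafka) and Delta Lake are available on the cluster. The helper names kafka_source_options and run_once are made up for this example.

```python
def kafka_source_options(bootstrap_servers, topics):
    """Build options for Spark's Kafka source.

    The arguments are hypothetical placeholders; substitute real values.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        # Multiple topics are passed as one comma-separated string.
        "subscribe": ",".join(topics),
        # Only applies to the first run; afterwards the checkpoint
        # decides where reading resumes.
        "startingOffsets": "earliest",
    }


def run_once(spark, bootstrap_servers, topics, table_path, checkpoint_path):
    """Consume whatever is available in Kafka, write it to Delta, then stop.

    `spark` is an existing SparkSession. Offsets are tracked in
    `checkpoint_path`, so the next run picks up where this one ended.
    """
    source = (
        spark.readStream
        .format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topics))
        .load()
    )

    # Kafka keys and values arrive as binary; cast them for readability.
    records = source.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
        "topic",
        "partition",
        "offset",
        "timestamp",
    )

    query = (
        records.writeStream
        .format("delta")  # assumes Delta Lake is installed
        .option("checkpointLocation", checkpoint_path)
        .trigger(once=True)  # process the available data, then stop
        .start(table_path)
    )
    query.awaitTermination()  # returns once the single trigger finishes
```

Each scheduled run (for example a k8s CronJob, which is an assumption about the deployment and not something stated in the thread) would call run_once with the same checkpoint path. Dropping the trigger(once=True) line would turn the same job into the continuously running option (b).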
