Re: Structured Streaming Process Each Record Individually

2024-01-11 Thread Mich Talebzadeh
Hi, let us revisit the approach, as some fellow members have correctly highlighted the use case for Spark Structured Streaming. Two key concepts that I will mention:
- foreach: a method for applying custom write logic to each individual row in a streaming DataFrame or Dataset.
- foreachBatch: a method for applying custom write logic to each micro-batch of a streaming DataFrame or Dataset.
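
A minimal PySpark sketch of that foreachBatch routing; the broker address, topic name, checkpoint path, and table-naming scheme are illustrative assumptions, not something stated in the thread:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, get_json_object

    spark = SparkSession.builder.getOrCreate()

    # Read the raw Kafka stream and pull the @type tag out of each JSON value.
    tagged = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
              .option("subscribe", "events")                        # assumed topic
              .load()
              .select(get_json_object(col("value").cast("string"),
                                      "$['@type']").alias("type"),
                      col("value").cast("string").alias("json")))

    def route(batch_df, batch_id):
        # Within each micro-batch, write every tag group to a table named
        # after the tag itself (assumes tag values are valid table names).
        for r in batch_df.select("type").distinct().collect():
            (batch_df.filter(col("type") == r["type"])
                     .write.mode("append").saveAsTable(r["type"]))

    query = (tagged.writeStream
             .option("checkpointLocation", "/tmp/chk/route")  # assumed path
             .foreachBatch(route)
             .start())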

Re: Structured Streaming Process Each Record Individually

2024-01-10 Thread Ant Kutschera
It might be good to first split the stream up into smaller streams, one per type. If the ordering of the Kafka records is important, then you could partition them at the source based on the type, but be careful how you configure Spark to read from Kafka, as that could also influence ordering. kdf …
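
For example, a sketch of that fan-out, assuming a fixed, known set of @type values and the `tagged` DataFrame built from the Kafka source as in the foreachBatch sketch above (all names illustrative):

    from pyspark.sql.functions import col

    # One independent streaming query per type; each query must get its own
    # checkpoint location so the fanned-out streams do not interfere.
    for t in ["order", "invoice", "payment"]:          # assumed @type values
        (tagged.filter(col("type") == t)
               .writeStream
               .option("checkpointLocation", f"/tmp/chk/{t}")
               .toTable(t))

    # Block until one of the per-type queries terminates.
    spark.streams.awaitAnyTermination()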

Re: Structured Streaming Process Each Record Individually

2024-01-10 Thread Mich Talebzadeh
Use an intermediate work table to land the incoming streaming JSON data in the first place, and then, according to the tag, store the data in the correct table. HTH, Mich Talebzadeh, London, United Kingdom
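
A sketch of that two-step flow, again assuming the `tagged` DataFrame and `spark` session from the first sketch; `work_json` and the checkpoint path are made-up names:

    from pyspark.sql.functions import col

    # Step 1: land everything in one intermediate work table, unparsed.
    (tagged.writeStream
           .option("checkpointLocation", "/tmp/chk/staging")  # assumed path
           .toTable("work_json"))                             # assumed work table

    # Step 2 (a separate batch job, run on a schedule): route the landed
    # rows into per-tag tables according to their @type value.
    for r in spark.table("work_json").select("type").distinct().collect():
        (spark.table("work_json")
              .filter(col("type") == r["type"])
              .write.mode("append").saveAsTable(r["type"]))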

Re: Structured Streaming Process Each Record Individually

2024-01-10 Thread Khalid Mammadov
Use the foreachBatch or foreach methods: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach-and-foreachbatch
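
The linked guide's foreach variant applies custom logic per record rather than per micro-batch. A sketch, with the HTTP endpoint purely illustrative and `tagged` built as in the earlier sketches:

    import requests

    def process_row(row):
        # Called once for every record in the stream; any side-effecting
        # per-record write logic can go here.
        requests.post(f"https://example.com/ingest/{row['type']}",  # illustrative endpoint
                      data=row["json"])

    query = tagged.writeStream.foreach(process_row).start()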

Structured Streaming Process Each Record Individually

2024-01-10 Thread PRASHANT L
Hi, I have a use case where I need to process JSON payloads coming from Kafka using Structured Streaming, but the thing is that the JSON can have different formats: the schema is not fixed, and each JSON will have a @type tag. Based on that tag, the JSON has to be parsed and loaded into the table with the tag name, and if a json …