Use the foreachBatch or foreach methods:
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach-and-foreachbatch
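Inside foreachBatch you can iterate over the micro-batch's Kafka values, parse each JSON individually, and fan the pieces out to per-tag tables. Here is a minimal plain-Python sketch of that routing step (no Spark dependency; `route_record` is a hypothetical helper, and the child-table naming `<parent>_<tag>` is an assumption based on your description):

```python
import json

def route_record(payload: str) -> dict:
    """Split one JSON payload into per-table row lists, keyed by its
    @table tag. Inside a foreachBatch function you would call this for
    each Kafka message value and append each list to its own table."""
    doc = json.loads(payload)
    parent_table = doc.pop("@table")  # e.g. "product"
    tables = {}
    parent_row = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            # nested object -> one row in a <parent>_<tag> child table
            tables.setdefault(f"{parent_table}_{key}", []).append(value)
        elif isinstance(value, list):
            # nested array -> one row per element in the child table
            tables.setdefault(f"{parent_table}_{key}", []).extend(value)
        else:
            # scalar -> column of the parent table
            parent_row[key] = value
    tables[parent_table] = [parent_row]
    return tables

sample = json.dumps({
    "os": "android", "type": "mobile",
    "device": {"warranty": "3 years", "replace": "yes"},
    "zones": [{"city": "Bangalore", "state": "KA", "pin": "577401"},
              {"city": "Mumbai", "state": "MH", "pin": "576003"}],
    "@table": "product",
})
routed = route_record(sample)
print(sorted(routed))  # ['product', 'product_device', 'product_zones']
```

In the streaming job you would wire this up as `df.writeStream.foreachBatch(process_batch)`, where `process_batch` collects or maps over the batch's value column, routes each record as above, and writes each resulting list to its destination table.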

On Wed, 10 Jan 2024, 17:42 PRASHANT L, <prashant...@gmail.com> wrote:

> Hi,
> I have a use case where I need to process JSON payloads coming from Kafka
> using Structured Streaming. The thing is, the JSON can have different
> formats; the schema is not fixed. Each JSON will carry a @type tag, so
> based on that tag the JSON has to be parsed and loaded into a table named
> after the tag, and if a JSON has nested sub-tags, those should go to
> different tables.
> So I need to process each JSON record individually and determine the
> destination tables. What would be the best approach?
>
>
>> {
>>     "os": "android",
>>     "type": "mobile",
>>     "device": {
>>         "warranty": "3 years",
>>         "replace": "yes"
>>     },
>>     "zones": [
>>         {
>>             "city": "Bangalore",
>>             "state": "KA",
>>             "pin": "577401"
>>         },
>>         {
>>             "city": "Mumbai",
>>             "state": "MH",
>>             "pin": "576003"
>>         }
>>     ],
>>     "@table": "product"
>> }
>
>
> So for the above JSON, three tables are created:
> 1. product (from the @table tag) - this is the parent table
> 2. product_zones and product_devices - the child tables
>
