Hi,

I guess whether Structured Streaming inherited anything from Spark Streaming is a moot point now; Structured Streaming is a concept built on Spark Streaming, which will soon be defunct.
Going forward, it all depends on what problem you are trying to address. These are explained in the following doc:
<https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>

Within Structured Streaming's micro-batching you have the concept of working out running aggregates within a given timeframe, akin to Spark Streaming with its sliding window and window length. The other approach relies on a fixed triggering mechanism to invoke a function that performs specific tasks (processing CDC, writing the result set to a database, working out average prices per security, etc.) on the streaming data that arrived in that trigger period.

For a discussion of running aggregates, please look for the thread "Calculate average from Spark stream" in this forum. For the triggering mechanism, you can see an example in my LinkedIn article below:
https://www.linkedin.com/pulse/processing-change-data-capture-spark-structured-talebzadeh-ph-d-/

HTH

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Mon, 31 May 2021 at 19:32, S <sheelst...@gmail.com> wrote:

> Hi,
>
> I am using Structured Streaming on Azure HDInsight. The version is 2.4.6.
>
> I am trying to understand the micro-batch modes - default and fixed interval. Does the fixed-interval micro-batch follow something similar to the receiver-based model, where records keep getting pulled and stored into blocks for the duration of the interval, at the end of which a job is kicked off? Or does the job just process the current micro-batch and sleep for the rest of the interval, pulling records only at the end of the interval?
>
> I am fully aware of the two DStream models - receiver-based and direct. I am just trying to figure out whether either of these two models was reused in Structured Streaming.
>
> Regards,
> Sheel
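As a rough illustration of the running-aggregate case: the sketch below is a pure-Python simulation of the sliding-window semantics, not Spark code (the function name `sliding_window_avg` and the toy data are made up). It mirrors what Spark's `window(ts_col, "10 minutes", "5 minutes")` groupBy aggregate does: each event lands in every window that covers its timestamp, and prices are averaged per window.

```python
from collections import defaultdict

def sliding_window_avg(events, window_s, slide_s):
    """Pure-Python sketch of a sliding-window running average.

    events: list of (epoch_seconds, price) pairs. Each event is assigned
    to every window [start, start + window_s) whose start is a multiple
    of slide_s, then prices are averaged per window.
    """
    buckets = defaultdict(list)
    for t, price in events:
        # Earliest slide-aligned window start that still covers t.
        start = (t - window_s) // slide_s * slide_s + slide_s
        while start <= t:
            buckets[start].append(price)
            start += slide_s
    return {s: sum(p) / len(p) for s, p in sorted(buckets.items())}

# With a 10-second window sliding every 5 seconds, the event at t=6
# falls into the windows starting at 0 and at 5.
averages = sliding_window_avg([(1, 10.0), (6, 20.0), (12, 30.0)], 10, 5)
```

In real Spark the equivalent would be a `groupBy(window(...), ...)` aggregate on a streaming DataFrame, with a watermark to bound state.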
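And a toy sketch of how I understand the fixed-interval trigger (`trigger(processingTime="N seconds")`) working, which also bears on the question above: there is no receiver filling blocks during the interval. At each trigger the driver asks the source which offsets are available at that moment, processes that range as one micro-batch, and sleeps only for whatever is left of the interval (if a batch overruns, the next one starts immediately). `FakeSource` and `run_fixed_interval` are made-up names for illustration:

```python
import time

class FakeSource:
    # Hypothetical in-memory stand-in for an offset-based source such as Kafka.
    def __init__(self, records):
        self.records = records

    def latest_offset(self):
        return len(self.records)

    def read(self, start, end):
        return self.records[start:end]

def run_fixed_interval(source, process_batch, interval_s, n_triggers):
    # Sketch of the fixed-interval micro-batch loop: poll, process, then
    # sleep out the remainder of the interval.
    last_offset = 0
    for _ in range(n_triggers):
        started = time.monotonic()
        latest = source.latest_offset()   # poll the source at trigger time
        if latest > last_offset:          # anything new since the last batch?
            process_batch(source.read(last_offset, latest))
            last_offset = latest
        remaining = interval_s - (time.monotonic() - started)
        if remaining > 0:                 # wait for the next trigger point
            time.sleep(remaining)

batches = []
run_fixed_interval(FakeSource([1, 2, 3]), batches.append, 0.01, 2)
# the first trigger picks up all three records; the second finds nothing new
```

The default micro-batch mode differs only in the pacing: it kicks off the next batch as soon as the previous one completes and new data is available, rather than waiting for a fixed interval.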