To find the offset ranges in a microbatch in Structured Streaming, look at StreamingQuery.lastProgress or register a StreamingQueryListener <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/StreamingQueryListener.html>. Both approaches give you access to the SourceProgress <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/SourceProgress.html>, whose startOffset and endOffset fields hold the Kafka offsets as JSON strings.
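As a rough sketch of what you could do with those JSON strings before writing them to your external checkpoint store: the snippet below assumes the Kafka source reports offsets as a topic -> partition -> offset map, and treats endOffset as the position the next batch starts from. OffsetRange, offset_ranges and the sample values are made up for illustration and are not part of the Spark API.

```python
import json
from typing import List, NamedTuple, Optional


class OffsetRange(NamedTuple):
    """Hypothetical stand-in for the DStream OffsetRange, not a Spark class."""
    topic: str
    partition: int
    from_offset: Optional[int]  # None when no start offset was reported
    until_offset: int


def offset_ranges(start_json: Optional[str], end_json: Optional[str]) -> List[OffsetRange]:
    """Pair up start/end offsets per topic-partition.

    Assumes both arguments look like '{"topic": {"0": 100, "1": 250}}'.
    """
    # A missing value or a JSON null is treated as "no offsets reported".
    start = (json.loads(start_json) if start_json else None) or {}
    end = (json.loads(end_json) if end_json else None) or {}

    ranges = []
    for topic, partitions in sorted(end.items()):
        # Partition ids are JSON object keys, so they arrive as strings.
        for partition, until in sorted(partitions.items(), key=lambda kv: int(kv[0])):
            begin = start.get(topic, {}).get(partition)
            ranges.append(OffsetRange(topic, int(partition), begin, until))
    return ranges


# Made-up sample values in the assumed format.
sample_start = '{"events": {"0": 100, "1": 250}}'
sample_end = '{"events": {"0": 180, "1": 250, "2": 40}}'

ranges = offset_ranges(sample_start, sample_end)
for r in ranges:
    print(r)
```

In a real job you would pass in the startOffset and endOffset of each entry in the progress object's sources, either from lastProgress or from the listener's onQueryProgress callback, and persist the result from there.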
Hope this helps!

On Wed, May 22, 2024 at 10:04 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> OK to understand better your current model relies on streaming data input
> through Kafka topic, Spark does some ETL and you send to a sink, a
> database for file storage like HDFS etc?
>
> Your current architecture relies on Direct Streams (DStream) and RDDs and
> you want to move to Spark Structured Streaming based on dataframes and
> datasets?
>
> You have not specified your sink
>
> With regard to your question?
>
> "Is there an equivalent of Dstream HasOffsetRanges in structure streaming
> to get the microbatch end offsets to the checkpoint in our external
> checkpoint store ?"
>
> There is not a direct equivalent of DStream HasOffsetRanges in Spark
> Structured Streaming. However, Structured Streaming provides mechanisms to
> achieve similar functionality:
>
> HTH
>
> Mich Talebzadeh,
> Technologist | Architect | Data Engineer | Generative AI | FinCrime
> London
> United Kingdom
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner Von Braun
> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
> On Wed, 22 May 2024 at 10:32, ashok34...@yahoo.com.INVALID
> <ashok34...@yahoo.com.invalid> wrote:
>
>> Hello,
>>
>> what options are you considering yourself?
>>
>> On Wednesday 22 May 2024 at 07:37:30 BST, Anil Dasari <
>> adas...@guidewire.com> wrote:
>>
>> Hello,
>>
>> We are on Spark 3.x and using Spark dstream + kafka and planning to use
>> structured streaming + Kafka.
>> Is there an equivalent of Dstream HasOffsetRanges in structure streaming
>> to get the microbatch end offsets to the checkpoint in our external
>> checkpoint store ? Thanks in advance.
>>
>> Regards