Hi Pedro,

You should be able to use Flight for this: pack your subscription call in a DoGet and listen on the FlightDataStream for new data.
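Roughly, I have something like the sketch below in mind (Python with pyarrow, untested; the ticket contents, the port, and the synthetic time.sleep loop are just placeholders for whatever your real subscription source looks like). The server answers DoGet with a GeneratorStream and keeps yielding record batches as data arrives; the client simply iterates over the reader.

    import time

    import pyarrow as pa
    import pyarrow.flight as flight

    SCHEMA = pa.schema([("key", pa.string()), ("value", pa.int64())])


    class SubscriptionServer(flight.FlightServerBase):
        """Answers DoGet by keeping the stream open and pushing new batches."""

        def do_get(self, context, ticket):
            # The ticket carries the "subscription call", e.g. a topic name.
            topic = ticket.ticket.decode("utf-8")

            def batches():
                # Placeholder for a real subscription (e.g. a Kafka consumer):
                # block until data is available, then yield it as a record batch.
                for i in range(10):
                    time.sleep(1.0)
                    yield pa.record_batch(
                        [pa.array([topic]), pa.array([i])], schema=SCHEMA
                    )

            # GeneratorStream keeps the DoGet response open and sends each
            # batch to the client as soon as the generator produces it.
            return flight.GeneratorStream(SCHEMA, batches())


    if __name__ == "__main__":
        SubscriptionServer("grpc://0.0.0.0:8815").serve()

The client, in a separate process, just consumes the live stream:

    client = flight.connect("grpc://localhost:8815")
    for chunk in client.do_get(flight.Ticket(b"my-topic")):
        print(chunk.data)  # chunk.data is a pyarrow.RecordBatch

How much data you pack into each yielded batch is also what determines the latency of each message, which ties into the granularity point below.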
I think you can control the granularity of your messages through the size of the record batches you are writing, but I am not a Flight developer, so don't take my word for it. Overall, "live" data streaming was not the primary use case behind Arrow Flight, but I think there is a lot of interest in that application, and I think the Flight fundamentals are quite suitable for it. Here is a somewhat related thread: http://mail-archives.apache.org/mod_mbox/arrow-dev/202008.mbox/%3CCADr7h-dAJrsYB%2BOUN94Z-KBkd4Jt82F78pfE3%2Bj7fg7MX1BrXw%40mail.gmail.com%3E

> On Sep 4, 2020, at 3:39 AM, Pedro Silva <pedro.cl...@gmail.com> wrote:
>
> Hello,
>
> This may be a stupid question, but is Arrow used for or designed with
> streaming processing use cases in mind, where data is non-stationary,
> e.g. Flink stream processing jobs?
>
> Particularly, is it possible from a given event source (say Kafka) to
> efficiently generate incremental record batches for stream processing?
>
> Suppose there is a data source that continuously generates messages with
> 100+ fields. You want to compute grouped aggregations (sums, averages,
> count distinct, etc.) over a select few of those fields, say 5 fields at
> most used for all queries.
>
> Is this a valid use case for Arrow?
> What if time is important and some windowing technique has to be applied?
>
> Thank you very much for your time!
> Have a good day.