Hi Lu,

It would be helpful to know your query requirements before making a recommendation.
E.g., does it just need to be a key-value store, so that you’re querying by a single key (which has to match the state partitioning key)?

What about latency requirements? E.g., if you’re processing Flink state (option 3), then latency is going to be high.

As a final take-away, in my experience I’ve always wound up shoving data into a separate system (Pinot is my current favorite) for queries.

-- Ken

> On Aug 29, 2022, at 3:19 PM, Lu Niu <qqib...@gmail.com> wrote:
>
> Hi, Flink Users
>
> We have a user case that requests running ad hoc queries to query flink
> state. There are several options:
>
> 1. Dump flink state to external data systems, like kafka, s3 etc. from there
> we can query the data. This is a very straightforward approach, but adds
> system complexity and overall cost.
> 2. Flink Queryable State. This requires additional development and also when
> the job is down, we can not query the data, which violates the need for
> debugging in the first place. Last, from some channel I happen to know this
> feature is on the deprecation list.
> 3. Flink State API. This requires additional development.
>
> I am wondering what are some best practices applied in production. For me, I
> really hope there is one product that 1. let me query the flink state using
> SQL 2. decouple with flink job
>
> Best
> Lu

--------------------------
Ken Krugler
http://www.scaleunlimited.com
Custom big data solutions
Flink, Pinot, Solr, Elasticsearch