Each keyed state in Flink is backed by a hash table (heap state backend) or by a column family in RocksDB. Having too many of those is not memory efficient.
Having fewer states is better, if you can adapt your schema that way.

I would also look into "MapState", which is an efficient way to have "sub keys" under a keyed state.

Stephan

On Mon, Jul 31, 2017 at 6:01 PM, shashank agarwal <shashank...@gmail.com> wrote:

> Hello,
>
> I have to compute results on basis of lot of history data, parameters like
> total transactions in last 1 month, last 1 day, last 1 hour etc. by email
> id, ip, mobile, name, address, zipcode etc.
>
> So my question is this right approach to create keyed state by email,
> mobile, zipcode etc. or should i create 1 big mapped state (BS) and than
> process that BS, may be in process function or by applying some loop and
> filter logic in window or process function.
>
> My main worry is i will end up with millions of states, because there can
> be millions unique emails, phone numbers or zipcode if i create keyed state
> by email, phone etc.
>
> am i right ? is this impact on the performance or is this wrong approach ?
> Which approach would you suggest in this use case.
>
> --
> Thanks Regards
>
> SHASHANK AGARWAL
> --- Trying to mobilize the things....
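
For illustration, here is a rough sketch of the "sub keys" idea behind the MapState suggestion in the reply. It is plain Java and deliberately does not use the Flink API. The outer map stands in for the stream key (for example an email address) and the inner map stands in for one MapState holding hour buckets as sub-keys. The class name `SubKeyedCounts`, the hourly bucket size, and the method names are all hypothetical choices made for this example, not anything taken from the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of "one state with sub-keys"; NOT Flink API.
// Outer key: the stream key (e.g. an email address).
// Inner key: an hour bucket, standing in for a MapState sub-key.
final class SubKeyedCounts {
    // Assumed bucket size for this sketch: one hour in milliseconds.
    private static final long HOUR_MS = 60L * 60L * 1000L;

    private final Map<String, Map<Long, Long>> countsByKey = new HashMap<>();

    // Record one transaction for the given key at the given timestamp.
    void add(String key, long timestampMs) {
        long bucket = Math.floorDiv(timestampMs, HOUR_MS);
        countsByKey
            .computeIfAbsent(key, k -> new HashMap<>())
            .merge(bucket, 1L, Long::sum);
    }

    // Sum the counts of all hour buckets from the one containing fromMs
    // up to and including the one containing toMs.
    long countBetween(String key, long fromMs, long toMs) {
        Map<Long, Long> buckets = countsByKey.get(key);
        if (buckets == null) {
            return 0L;
        }
        long first = Math.floorDiv(fromMs, HOUR_MS);
        long last = Math.floorDiv(toMs, HOUR_MS);
        long total = 0L;
        for (Map.Entry<Long, Long> e : buckets.entrySet()) {
            if (e.getKey() >= first && e.getKey() <= last) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        SubKeyedCounts counts = new SubKeyedCounts();
        counts.add("a@example.com", 0L);
        counts.add("a@example.com", 30L * 60L * 1000L);
        counts.add("a@example.com", 2L * HOUR_MS);
        System.out.println(counts.countBetween("a@example.com", 0L, HOUR_MS - 1));
        System.out.println(counts.countBetween("a@example.com", 0L, 3L * HOUR_MS));
    }
}
```

The point of the layout is that the "last 1 hour", "last 1 day" and "last 1 month" totals can all be read from a single structure by summing a range of buckets, instead of declaring a separate state per time window.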