Re: Multiple MapState vs single nested MapState in stateful Operator

2019-01-11 Thread Gagan Agrawal
This makes perfect sense to me. Thanks, Congxian and Kostas, for your inputs. Gagan

Re: Multiple MapState vs single nested MapState in stateful Operator

2019-01-10 Thread Kostas Kloudas
Hi Gagan, I agree with Congxian! In MapState, accessing the state/value associated with a key in the map de-serializes the whole value (and a put() serializes it). Given this, it is more efficient to have many keys with small state than fewer keys with huge state. Cheers,
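
To make the point above concrete, here is a minimal, self-contained Java sketch. It does not use Flink or RocksDB: a plain HashMap of byte arrays stands in for the state backend, Java serialization stands in for Flink's serializers, and the class name, method names, and the "stream1/record-0" key format are all hypothetical. It only compares how many bytes would have to be de-serialized to read a single record under each layout.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a byte-oriented state backend; not Flink code.
class NestedVsFlatSketch {

    // Serialize a value to bytes, as a byte-oriented store would before a put().
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Nested layout: one store entry per stream, whose value is the whole inner map.
    // Reading one record means de-serializing every record of that stream.
    static int bytesReadNested(int recordsPerStream) {
        HashMap<String, String> inner = new HashMap<>();
        for (int i = 0; i < recordsPerStream; i++) {
            inner.put("record-" + i, "payload-" + i);
        }
        Map<String, byte[]> store = new HashMap<>();
        store.put("stream1", serialize(inner));
        return store.get("stream1").length;
    }

    // Flat layout: one store entry per record, so a read touches one small value.
    static int bytesReadFlat(int recordsPerStream) {
        Map<String, byte[]> store = new HashMap<>();
        for (int i = 0; i < recordsPerStream; i++) {
            store.put("stream1/record-" + i, serialize("payload-" + i));
        }
        return store.get("stream1/record-0").length;
    }

    public static void main(String[] args) {
        System.out.println("nested: " + bytesReadNested(1000) + " bytes per read");
        System.out.println("flat:   " + bytesReadFlat(1000) + " bytes per read");
    }
}
```

In this sketch the nested read grows with the number of records in the stream, while the flat read stays the size of one record, which is the trade-off described above.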

Re: Multiple MapState vs single nested MapState in stateful Operator

2019-01-10 Thread Congxian Qiu
Hi Gagan Agrawal, I prefer the first option. Here is the reason: in the RocksDB StateBackend, we serialize the key, namespace, and user-key into one byte array (key-bytes) and the user-value into another byte array (value-bytes), then insert the key-bytes/value-bytes into
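
The key composition described in this message can be sketched roughly as follows. This is an illustration only, not Flink's actual on-disk format: the class CompositeKeySketch, the method keyBytes, and the use of DataOutputStream.writeUTF (a two-byte length prefix followed by the string bytes) are assumptions chosen to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration of building one composite key per map entry.
class CompositeKeySketch {

    // Concatenate key, namespace, and user-key into a single byte array.
    // Each part is length-prefixed so that different triples cannot collide.
    static byte[] keyBytes(String key, String namespace, String userKey) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(key);        // the keyBy key of the record
            out.writeUTF(namespace);  // e.g. a window or a default namespace
            out.writeUTF(userKey);    // the key inside the MapState
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Because each user-key yields its own key-bytes in this scheme, every map entry becomes a separate entry in the underlying store, and its value-bytes can be read or written without touching the other entries.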

Multiple MapState vs single nested MapState in stateful Operator

2019-01-09 Thread Gagan Agrawal
Hi, I have a use case where 4 streams get merged (union) and grouped on a common key (keyBy), and a custom KeyedProcessFunction is called. Now I need to keep state (RocksDB backend) for all 4 streams in my custom KeyedProcessFunction, where each of these 4 streams would be stored as a map. So I have 2