Hi, +1
I think it would be a good idea to separate the 2 state backends. I think you are right in most cases the new partitioned state implementations will benefit from this as it removes a lot of additional overhead (although sometimes it's nice to have the 2 together, for instance if they both use the filesystem) :) Cheers, Gyula Aljoscha Krettek <aljos...@apache.org> ezt írta (időpont: 2016. jan. 7., Cs, 11:02): > Hi, > I’m currently examining ways to 1) change the window operators to use the > partitioned state abstraction for window state and 2) implement state > backends for managed memory/out-of-core state. > > I think it would be helpful to pull the state backend apart. Right now, > for example, the DbStateBackend has a custom way of specifying another > state backend that should be used for non-partitioned state since a data > base really only makes sense for partitioned state. I was thinking about > adding a state backend based on RocksDB, which would also only make sense > for partitioned state. Pulling the two ways of state apart would allow the > implementation to focus on the important parts and give the user > flexibility without requiring every state backend to implement this. > > What do you think about pulling the back ends apart? > > — > Aljoscha > > P.S. I have a prototype WindowOperator on partitioned state that does not > regress in performance compared to the current WindowOperator. Also, I have > a prototype RocksDB state backend. Here, the performance is about 1/10th > compared to using the in-memory state backend (with 100.000 keys) but it > scales way better (with the in-memory state backend performance goes down > when increasing the number of keys while it stays constant with RocksDB). > This is quite nice since it allows to use the same windowing code while > exchanging the state backend based on the job requirements.