Hi,

Our application is designed to operate continuously on a large sliding window (a day or more) of data. The operations performed on the window will change fairly frequently, so I need a way to save and restore the sliding window across an app upgrade without having to wait the full duration of the window for it to "warm up". Because this is an app upgrade, checkpointing unfortunately will not work.
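To make the warm-up problem concrete, here is a minimal plain-Python sketch. It is not Spark code: BatchWindow and sizes_after_restart are hypothetical names used only for illustration, and the window is measured in batches rather than in wall-clock time.

```python
from collections import deque


class BatchWindow:
    """Toy sliding window keyed by batch index (hypothetical, not a Spark API)."""

    def __init__(self, window_batches):
        self.window_batches = window_batches
        self.batches = deque()  # entries are (batch_index, records)

    def add_batch(self, batch_index, records):
        self.batches.append((batch_index, list(records)))
        # Evict batches that have slid out of the window.
        oldest_allowed = batch_index - self.window_batches + 1
        while self.batches and self.batches[0][0] < oldest_allowed:
            self.batches.popleft()

    def size(self):
        return sum(len(records) for _, records in self.batches)


def sizes_after_restart(window_batches, records_per_batch, total_batches):
    """Return the window size after each batch, starting from an empty window."""
    window = BatchWindow(window_batches)
    sizes = []
    for batch_index in range(total_batches):
        window.add_batch(batch_index, range(records_per_batch))
        sizes.append(window.size())
    return sizes


if __name__ == "__main__":
    # With a 4-batch window, results only cover a full window from the 4th batch on.
    print(sizes_after_restart(window_batches=4, records_per_batch=10, total_batches=6))
    # prints [10, 20, 30, 40, 40, 40]
```

With a window of a day or more, that ramp-up is the period I want to avoid after every upgrade.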
I can potentially dump the window to external storage periodically or on app shutdown, but I don't have an ideal way of restoring it. I thought about two non-ideal solutions:

1. Load the previous data all at once into the sliding window on app startup. The problem is that the window will at some point hold double the data, until the initial batch of data goes out of scope.
2. Broadcast the previous state of the window separately from the window, and perform the operations on both sets of data until the previous state goes out of scope. The problem is that the data will not fit into memory.

Features that would solve my problem:

1. The ability to pre-populate the sliding window.
2. Control over batch slicing. It would be nice for a Receiver to dictate the current batch timestamp in order to slow down or fast-forward time.

Any feedback would be greatly appreciated!

Thank you,
Matus