Hi,

Our application is being designed to operate at all times on a large
sliding window (a day or more) of data. The operations performed on the
window will change fairly frequently, so I need a way to save and
restore the sliding window after an app upgrade without waiting the
full window duration for it to "warm up". Because the application
itself changes in an upgrade, checkpointing will unfortunately not work.

I can potentially dump the window to external storage periodically or
on app shutdown, but I don't have a good way of restoring it.

I have thought of two non-ideal solutions:
1. Load the previous data all at once into the sliding window on app
startup. The problem is that, at some point, I will have nearly double
the data in the sliding window until the initial batch of data goes
out of scope.
2. Broadcast the previous state of the window separately from the live
window and perform the operations on both sets of data until the old
state goes out of scope. The problem is that the data will not fit
into memory.
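
To illustrate the problem with option 1, here is a minimal plain-Python
model. It is not Spark code: the function name, its parameters, and the
numbers are made up for illustration, and it assumes the window simply
keeps the last N batches and that the restored data arrives as one
oversized first batch.

```python
from collections import deque


def peak_window_size(window_batches, records_per_batch, restored_records):
    """Return the largest record count held by a window of N batches.

    Hypothetical model: the restored data is injected as one oversized
    first batch; every later batch carries `records_per_batch` records.
    """
    window = deque()  # sizes of the batches currently inside the window
    peak = 0
    # Run long enough for the restored batch to slide out of scope.
    for batch_index in range(2 * window_batches):
        size = restored_records if batch_index == 0 else records_per_batch
        window.append(size)
        if len(window) > window_batches:
            window.popleft()  # the oldest batch leaves the window
        peak = max(peak, sum(window))
    return peak


# Example: 24 hourly batches of 1000 records, restored in one shot.
steady_state = 24 * 1000
peak = peak_window_size(24, 1000, restored_records=steady_state)
print(steady_state, peak)
```

In this model the window peaks at 47000 records against a steady state
of 24000, because the restored batch only leaves once a full window of
new batches has arrived behind it.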

Features that would solve my problem:
1. The ability to pre-populate the sliding window.
2. Control over batch slicing. It would be nice for a Receiver to
dictate the current batch timestamp in order to slow down or fast
forward time.
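
As a rough sketch of what I mean by pre-populating, here is a
plain-Python model. This is not an existing Spark API: the
TimestampedWindow class and its `restored` parameter are hypothetical.
It assumes every record carries its original timestamp, so restored
records expire on their original schedule as new data arrives.

```python
from collections import deque


class TimestampedWindow:
    """Hypothetical sliding window keyed by record timestamps."""

    def __init__(self, duration_seconds, restored=()):
        self.duration = duration_seconds
        # Pre-populate with saved (timestamp, value) pairs, oldest first.
        self.records = deque(sorted(restored, key=lambda r: r[0]))

    def add(self, timestamp, value):
        """Append a new record and evict everything that is out of scope."""
        self.records.append((timestamp, value))
        cutoff = timestamp - self.duration
        while self.records and self.records[0][0] <= cutoff:
            self.records.popleft()

    def __len__(self):
        return len(self.records)


# Restore a 10-second window that was saved with records at t = 0..9.
saved = [(t, "old") for t in range(10)]
window = TimestampedWindow(10, restored=saved)

# Feed one new record per second and track the window size.
sizes = []
for t in range(10, 20):
    window.add(t, "new")
    sizes.append(len(window))
print(sizes)
```

Here the window stays at 10 records throughout: each new record pushes
out exactly one restored record, so there is no warm-up period and no
doubling of data.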

Any feedback would be greatly appreciated!

Thank you,
Matus
