Hi all,

During our tests at Twitter, we found that the current checkpointing mechanism doesn't work well when a state is very large (GBs). Here's a proposal for transferring large checkpointing state through local disk to solve the problem we've found:
https://docs.google.com/document/d/1MiOAV3bZATezuIgpwk8JkwaGx7Og4EeCwNcxXlEH5OM/edit?usp=sharing

Please take a look and feel free to comment with any ideas.

Best Regards,
Neng
