Github user kl0u commented on the pull request:
https://github.com/apache/flink/pull/934#issuecomment-126076384
Hi @mxm ,
Thanks a lot for the comments!
I integrated most of them. Please have a look and let me know what you
think.
As for merging the different types of snapshots and handling them uniformly,
I do not have a solution at the moment. If you have one, I am of course open
to discussing it, because I agree that this would be nice.
Regarding the comment on getAccumulatorResultsStringified():
1) its result is presented to the user by the web interface, just for
monitoring purposes
2) it is called at the jobManager.
The problem is that the jobManager has only the blobKeys that point to the
stored accumulators. The serialized data resides in the blobCache and has to be
fetched before it can be inspected.
Currently the jobManager just forwards the blobKeys to the client, which
fetches the blobs, deserializes them and does the final merging. This is done
for jobManager scalability reasons: since we are talking about accumulators of
arbitrary size, loading them from disk and deserializing them at the jobManager
would be time and resource consuming. The same holds if we wanted to get the
type of these large accumulators (which the method needs): we would again have
to load and deserialize them at the jobManager. The currently implemented
solution is just the result of this design decision. If you have any other
strategy or solution that is worth implementing, let me know.
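
To make the current flow concrete, here is a minimal sketch in plain Java. It
is not the actual Flink code: the in-memory map standing in for the blobCache,
the String blob keys, and the names storeBlob, forwardKeys and fetchAndMerge
are all hypothetical, and the accumulators are simplified to long counters that
are merged by addition.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClientSideAccumulatorMerge {

    // Hypothetical stand-in for the blobCache: blob key -> serialized bytes.
    static final Map<String, byte[]> blobStore = new HashMap<>();

    // Task side: serialize a partial value, store it, and hand back only its key.
    static String storeBlob(String blobKey, long partialValue) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(Long.valueOf(partialValue));
        }
        blobStore.put(blobKey, bytes.toByteArray());
        return blobKey;
    }

    // jobManager side: passes the keys on untouched and never reads the blobs,
    // so it pays no loading or deserialization cost.
    static Map<String, List<String>> forwardKeys(Map<String, List<String>> keysPerAccumulator) {
        return keysPerAccumulator;
    }

    // Client side: fetch each blob, deserialize it, and merge the partial values.
    static Map<String, Long> fetchAndMerge(Map<String, List<String>> keysPerAccumulator)
            throws IOException, ClassNotFoundException {
        Map<String, Long> merged = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : keysPerAccumulator.entrySet()) {
            long sum = 0L;
            for (String blobKey : entry.getValue()) {
                byte[] data = blobStore.get(blobKey);
                try (ObjectInputStream in =
                        new ObjectInputStream(new ByteArrayInputStream(data))) {
                    sum += (Long) in.readObject();
                }
            }
            merged.put(entry.getKey(), sum);
        }
        return merged;
    }
}
```

In this sketch the only thing the jobManager ever holds is the map of keys,
which is why producing a stringified value or a type there would force it to do
the same fetch and deserialization work as the client.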