Hello all,
I would also be really interested in how a PS-like architecture would work
in Flink. Note that we are not necessarily talking about PS, but generally how
QueryableState can be used for ML tasks, with, I guess, a focus on
model-parallel training.
One suggestion I would make is to take a look
Hey Ufuk,
I'm happy to contribute. At least I'll get a bit more understanding of
the details.
Breaking the assumption that only a single thread updates state would
bring us from strong isolation guarantees (i.e. serializability at the
updates and read committed at the external queries) to
Hey Gabor,
great ideas here. It's only slightly related, but I'm currently working on a
proposal to improve the queryable state APIs for lookups (partly along the
lines of what you suggested with higher level accessors). Maybe you are
interested in contributing there?
I really like your ideas
Hi Gyula, Jinkui Shi,
Thanks for your thoughts!
@Gyula: I'll try to explain in a bit more detail.
The API could be almost like the QueryableState's. It could be
higher-level though: returning Java objects instead of serialized data
(because there would not be issues with class loading). Also,
Hi Gábor Hermann,
The online parameter server is a good proposal.
The PS paper [1] has an early implementation [2], and now it's MXNet [3].
I have some thoughts about an online PS in Flink:
1. Should it support a flexible and configurable update strategy?
For example, in one iteration, computing
Hi Gábor,
I think the general idea is very nice, but it would be nice to see more
clearly what benefit this brings from the developer's perspective. Maybe a
rough API sketch and 1-2 examples.
I am wondering what sort of consistency guarantees you imagine for such
operations, or why the fault
Hi all,
TL;DR: Is it worth implementing a special QueryableState for querying
state from another part of a Flink streaming job and aligning it with
fault tolerance?
I've been thinking about implementing a Parameter Server with/within
Flink. A Parameter Server is basically a specialized