Yes exactly Avi. Cheers, Till
On Wed, Jan 2, 2019 at 5:42 PM Avi Levi <avi.l...@bluevoyant.com> wrote: > Thanks Till I will defiantly going to check it. just to make sure that I > got you correctly. you are suggesting the the list that I want to broadcast > will be broadcasted via control stream and it will be than be kept in the > relevant operator state correct ? and updates (CRUD) on that list will be > preformed via the control stream. correct ? > BR > Avi > > On Wed, Jan 2, 2019 at 4:28 PM Till Rohrmann <trohrm...@apache.org> wrote: > >> Hi Avi, >> >> you could use Flink's broadcast state pattern [1]. You would need to use >> the DataStream API but it allows you to have two streams (input and control >> stream) where the control stream is broadcasted to all sub tasks. So by >> ingesting messages into the control stream you can send model updates to >> all sub tasks. >> >> [1] >> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_stream_state_broadcast-5Fstate.html&d=DwMFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=uITdFlQPKLbqxkTux4nR21JhUpLIkS5Pdfi9D_ZSUwE&e=> >> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html >> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_stream_state_broadcast-5Fstate.html&d=DwQFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=uITdFlQPKLbqxkTux4nR21JhUpLIkS5Pdfi9D_ZSUwE&e=> >> >> Cheers, >> Till >> >> On Tue, Jan 1, 2019 at 6:49 PM miki haiat <miko5...@gmail.com> wrote: >> >>> Im trying to understand your use case. >>> What is the source of the data ? FS ,KAFKA else ? >>> >>> >>> On Tue, Jan 1, 2019 at 6:29 PM Avi Levi <avi.l...@bluevoyant.com> wrote: >>> >>>> Hi, >>>> I have a list (couple of thousands text lines) that I need to use in my >>>> map function. I read this article about broadcasting variables >>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_batch_-23broadcast-2Dvariables&d=DwMFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=U3vGeHdL9fGDfP0GNZUkGpSlcVLz9CNLg2MXNwHP0_M&e=> >>>> or >>>> using distributed cache >>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_batch_-23distributed-2Dcache&d=DwMFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=m5IHbX1Dbz7AYERvVgyxKXmrUQQ06IkA4VCDllkR0HM&e=> >>>> however I need to update this list from time to time, and if I understood >>>> correctly it is not possible on broadcast or cache without restarting the >>>> job. Is there idiomatic way to achieve this? A db seems to be an overkill >>>> for that and I do want to be cheap on io/network calls as much as possible. >>>> >>>> Cheers >>>> Avi >>>> >>>>