Yes, exactly, Avi.
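
To make the confirmed behaviour concrete, here is a minimal plain-Java model of it, offered as a sketch rather than as Flink code. It uses no Flink classes: `SubTask`, `Control`, `Op`, and `broadcast` are hypothetical names invented for illustration. Only the two method names mirror the callbacks of Flink's `BroadcastProcessFunction` (`processBroadcastElement` for the control stream, `processElement` for the input stream). In a real job the list would live in Flink's broadcast state instead of a plain `Set`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of the broadcast state pattern; no Flink classes are used. */
class BroadcastListSketch {

    /** Hypothetical control operations (the create/delete part of CRUD). */
    enum Op { ADD, REMOVE }

    /** Hypothetical control message: an operation plus the text line it applies to. */
    static final class Control {
        final Op op;
        final String line;

        Control(Op op, String line) {
            this.op = op;
            this.line = line;
        }
    }

    /** One parallel subtask holding its own copy of the list. */
    static final class SubTask {
        private final Set<String> list = new LinkedHashSet<>();

        // Mirrors processBroadcastElement: every subtask sees every control message
        // and applies it to its local copy.
        void processBroadcastElement(Control c) {
            if (c.op == Op.ADD) {
                list.add(c.line);
            } else {
                list.remove(c.line);
            }
        }

        // Mirrors processElement: the map logic only reads the local copy,
        // so no I/O or network call is needed per input record.
        boolean processElement(String input) {
            return list.contains(input);
        }
    }

    /** Delivers one control message to all subtasks, as a broadcast stream would. */
    static void broadcast(List<SubTask> subTasks, Control c) {
        for (SubTask t : subTasks) {
            t.processBroadcastElement(c);
        }
    }

    public static void main(String[] args) {
        List<SubTask> subTasks = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            subTasks.add(new SubTask());
        }

        broadcast(subTasks, new Control(Op.ADD, "blocked.example"));
        System.out.println(subTasks.get(2).processElement("blocked.example")); // true

        broadcast(subTasks, new Control(Op.REMOVE, "blocked.example"));
        System.out.println(subTasks.get(2).processElement("blocked.example")); // false
    }
}
```

The point of the model is that the list is updated while the job keeps running: a new control message changes what every subtask sees, with no restart and no external lookup per record.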

Cheers,
Till

On Wed, Jan 2, 2019 at 5:42 PM Avi Levi <avi.l...@bluevoyant.com> wrote:

> Thanks Till, I will definitely check it. Just to make sure that I
> understood you correctly: you are suggesting that the list I want to broadcast
> be sent via the control stream and then kept in the
> relevant operator state, correct? And updates (CRUD) to that list will be
> performed via the control stream, correct?
> BR
> Avi
>
> On Wed, Jan 2, 2019 at 4:28 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
>> Hi Avi,
>>
>> You could use Flink's broadcast state pattern [1]. You would need to use
>> the DataStream API, but it allows you to have two streams (an input stream
>> and a control stream), where the control stream is broadcast to all subtasks.
>> So by ingesting messages into the control stream you can send model updates
>> to all subtasks.
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
>>
>> Cheers,
>> Till
>>
>> On Tue, Jan 1, 2019 at 6:49 PM miki haiat <miko5...@gmail.com> wrote:
>>
>>> I'm trying to understand your use case.
>>> What is the source of the data? A file system, Kafka, or something else?
>>>
>>>
>>> On Tue, Jan 1, 2019 at 6:29 PM Avi Levi <avi.l...@bluevoyant.com> wrote:
>>>
>>>> Hi,
>>>> I have a list (a couple of thousand text lines) that I need to use in my
>>>> map function. I read this article about broadcast variables
>>>> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#broadcast-variables>
>>>> or
>>>> using the distributed cache
>>>> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#distributed-cache>.
>>>> However, I need to update this list from time to time, and if I understood
>>>> correctly, that is not possible with broadcast variables or the cache
>>>> without restarting the job. Is there an idiomatic way to achieve this? A
>>>> database seems like overkill for that, and I want to keep I/O and network
>>>> calls to a minimum.
>>>>
>>>> Cheers
>>>> Avi
>>>>
>>>>
