Hello!

I think you should consider using the putAll() operation if resiliency is
important to you, since this operation will be salvaged if the initiator
node fails.
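
For illustration, a minimal sketch of batched loading with putAll() instead
of a data streamer (the cache name and key/value types are my assumptions):

import java.util.HashMap;
import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class PutAllLoading {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();
        IgniteCache<Long, String> cache = ignite.getOrCreateCache("myCache");

        Map<Long, String> batch = new HashMap<>();
        for (long i = 0; i < 1_000; i++)
            batch.put(i, "value-" + i);

        // Per the note above: putAll() is a regular cache operation rather
        // than a client-side streamer buffer, so there is no unflushed
        // buffer to lose if the initiator fails.
        cache.putAll(batch);
    }
}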

Regards,
-- 
Ilya Kasnacheev


On Thu, Jan 16, 2020 at 3:48 PM narges saleh <[email protected]> wrote:

> Thanks Saikat.
>
> I am not sure if sequential keys/timestamps and Kafka-like offsets would
> help if there are many data source clients and many streamer nodes in play;
> depending on the checkpoint, we might still end up with duplicates (unless
> you're saying each client sequences its payload before sending it to the
> streamer; even then, duplicates are possible in the cache). The only sure
> way, it seems to me, is for the client that catches the exception to check
> the cache and resend only the diff, which makes things very complex. The
> other approach, if I am right, is to enable overwrite, so the streamer
> would dedup the data in the cache. The latter is costly too. I think the
> ideal approach would be some type of streamer resiliency, where another
> streamer node could pick up the buffer from a crashed streamer and
> continue the work.
>
>
> On Wed, Jan 15, 2020 at 9:00 PM Saikat Maitra <[email protected]>
> wrote:
>
>> Hi,
>>
>> To minimise data loss during a streamer node failure, I think we can use
>> the following steps:
>>
>> 1. Use the autoFlushFrequency param to set the desired flush frequency.
>> Depending on the desired consistency level and performance, you can choose
>> how frequently you would like the data to be flushed to Ignite nodes.
>>
>> 2. Develop an automated checkpointing process to capture and store the
>> source data offset. It can be something like a Kafka message offset, cache
>> keys (if keys are sequential), or the timestamp of the last flush. Based
>> on that, the Ignite client can restart the data streaming process from the
>> last checkpoint if there is a node failure. A sketch combining both steps
>> follows below.
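>>
>> A minimal sketch of steps 1 and 2 together (the cache name "myCache", the
>> checkpoint store, and the readBatchFrom() source reader are hypothetical
>> placeholders):
>>
>> import java.util.Collections;
>> import java.util.Map;
>>
>> import org.apache.ignite.Ignite;
>> import org.apache.ignite.IgniteDataStreamer;
>> import org.apache.ignite.Ignition;
>>
>> public class CheckpointedStreaming {
>>     public static void main(String[] args) {
>>         Ignite ignite = Ignition.start();
>>
>>         // Resume from the offset recorded by the last successful flush.
>>         long offset = loadLastCheckpoint();
>>
>>         try (IgniteDataStreamer<Long, String> streamer =
>>                  ignite.dataStreamer("myCache")) {
>>             // Step 1: flush buffered entries at least once per second.
>>             streamer.autoFlushFrequency(1_000);
>>
>>             Map<Long, String> batch;
>>             while (!(batch = readBatchFrom(offset)).isEmpty()) {
>>                 streamer.addData(batch);
>>                 offset += batch.size();
>>
>>                 // Step 2: checkpoint only after an explicit flush, so a
>>                 // restart replays at most the last un-checkpointed batch.
>>                 streamer.flush();
>>                 saveCheckpoint(offset);
>>             }
>>         }
>>     }
>>
>>     // Hypothetical helpers backed by a durable store and a data source.
>>     static long loadLastCheckpoint() { return 0L; }
>>     static void saveCheckpoint(long offset) { }
>>     static Map<Long, String> readBatchFrom(long offset) {
>>         return Collections.emptyMap();
>>     }
>> }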
>>
>> HTH
>>
>> Regards,
>> Saikat
>>
>> On Fri, Jan 10, 2020 at 4:34 AM narges saleh <[email protected]>
>> wrote:
>>
>>> Thanks Saikat for the feedback.
>>>
>>> But if I set the overwrite option to true, to avoid duplicates in case I
>>> have to resend the entire payload after a streamer node failure, then I
>>> won't get optimal performance, right?
>>> What's the best practice for dealing with data streamer node failures?
>>> Are there examples?
>>>
>>> On Thu, Jan 9, 2020 at 9:12 PM Saikat Maitra <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> AFAIK, the DataStreamer checks for the presence of a key, and if it is
>>>> present in the cache, it does not overwrite the value when
>>>> allowOverwrite is set to false.
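>>>>
>>>> For illustration, a minimal sketch of that behavior (the cache name is
>>>> an assumption; false is also the default for allowOverwrite):
>>>>
>>>> import org.apache.ignite.Ignite;
>>>> import org.apache.ignite.IgniteCache;
>>>> import org.apache.ignite.IgniteDataStreamer;
>>>> import org.apache.ignite.Ignition;
>>>>
>>>> public class AllowOverwriteExample {
>>>>     public static void main(String[] args) {
>>>>         Ignite ignite = Ignition.start();
>>>>
>>>>         IgniteCache<Long, String> cache = ignite.getOrCreateCache("myCache");
>>>>         cache.put(1L, "original");
>>>>
>>>>         try (IgniteDataStreamer<Long, String> streamer =
>>>>                  ignite.dataStreamer("myCache")) {
>>>>             streamer.allowOverwrite(false);
>>>>             // The key already exists, so this update is skipped.
>>>>             streamer.addData(1L, "replacement");
>>>>         }
>>>>
>>>>         System.out.println(cache.get(1L)); // prints "original"
>>>>     }
>>>> }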
>>>>
>>>> Regards,
>>>> Saikat
>>>>
>>>> On Thu, Jan 9, 2020 at 6:04 AM narges saleh <[email protected]>
>>>> wrote:
>>>>
>>>>> Thanks Andrei.
>>>>>
>>>>> If an external data source client is sending batches of 2-3 MB, say
>>>>> via TCP socket connections, to a bunch of socket streamers (deployed as
>>>>> Ignite services on each Ignite node), and one of the streamer nodes
>>>>> dies, does the data source client that catches the exception have to
>>>>> check the cache to see how much of the 2-3 MB batch has been flushed to
>>>>> the cache, and then resend the rest? Would setting the streamer's
>>>>> overwrite option to true work, if the data source client resends the
>>>>> entire batch?
>>>>> A question regarding the streamer with the overwrite option set to
>>>>> true: how does the streamer compare the data in hand with the data in
>>>>> the cache, if each record is assigned a UUID when it is inserted into
>>>>> the cache?
>>>>>
>>>>>
>>>>> On Tue, Jan 7, 2020 at 4:40 AM Andrei Aleksandrov <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Unflushed data in a data streamer will be lost. A data streamer works
>>>>>> through a particular Ignite node, and if that node fails, the streamer
>>>>>> cannot somehow switch to another one. So your application should take
>>>>>> care of tracking that all data was loaded (wait for completion of
>>>>>> loading, catch exceptions, check the cache sizes, etc.) and use
>>>>>> another client for data loading in case the previous one failed.
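>>>>>>
>>>>>> A minimal sketch of that tracking (the cache name and the expected
>>>>>> record count are assumptions):
>>>>>>
>>>>>> import javax.cache.CacheException;
>>>>>>
>>>>>> import org.apache.ignite.Ignite;
>>>>>> import org.apache.ignite.IgniteCache;
>>>>>> import org.apache.ignite.IgniteDataStreamer;
>>>>>> import org.apache.ignite.Ignition;
>>>>>> import org.apache.ignite.cache.CachePeekMode;
>>>>>>
>>>>>> public class LoadTracking {
>>>>>>     public static void main(String[] args) {
>>>>>>         Ignite ignite = Ignition.start();
>>>>>>         long expected = 1_000_000L;
>>>>>>
>>>>>>         try (IgniteDataStreamer<Long, String> streamer =
>>>>>>                  ignite.dataStreamer("myCache")) {
>>>>>>             for (long i = 0; i < expected; i++)
>>>>>>                 streamer.addData(i, "value-" + i);
>>>>>>
>>>>>>             // Wait for buffered entries to reach the Ignite nodes.
>>>>>>             streamer.flush();
>>>>>>         }
>>>>>>         catch (CacheException e) {
>>>>>>             // A node failed: whatever was still buffered is lost.
>>>>>>         }
>>>>>>
>>>>>>         // Check how much actually arrived; reload any gap.
>>>>>>         IgniteCache<Long, String> cache = ignite.cache("myCache");
>>>>>>         long loaded = cache.sizeLong(CachePeekMode.PRIMARY);
>>>>>>         if (loaded < expected) {
>>>>>>             // Restart loading for the missing range on a healthy client.
>>>>>>         }
>>>>>>     }
>>>>>> }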
>>>>>>
>>>>>> BR,
>>>>>> Andrei
>>>>>>
>>>>>> On 1/6/2020 2:37 AM, narges saleh wrote:
>>>>>> > Hi All,
>>>>>> >
>>>>>> > Another question regarding Ignite's streamer.
>>>>>> > What happens to the data if the streamer node crashes before the
>>>>>> > buffer's content is flushed to the cache? Is the client responsible
>>>>>> > for making sure the data is persisted, or does Ignite redirect the
>>>>>> > data to another node's streamer?
>>>>>> >
>>>>>> > thanks.
>>>>>>
>>>>>
