Re: DataStreamer as a Service

narges saleh Tue, 04 Feb 2020 06:51:58 -0800

Understood. Thank you for the feedback.

On Tue, Feb 4, 2020 at 7:09 AM Ilya Kasnacheev <[email protected]>
wrote:


> Hello!
>
> Data Streamer can be used on a server node all right; however, it is still a
> "client" operation, i.e., it will batch some data locally and only then
> send it to the server nodes, including itself.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Tue, Feb 4, 2020 at 13:58, narges saleh <[email protected]>:
>
>> Hi,
>> I am not sure I follow the relationship between per-partition batching
>> and client-side capabilities. Does this mean that the data streamer cannot
>> do per-partition batching on the server side, for example in the service grid?
>>
>> I understand that low-intensity streaming defeats the purpose of having
>> batching (whether client-side or not), but my case is long-lived,
>> high-intensity data traffic.
>>
>> thanks.
>>
>> On Tue, Feb 4, 2020 at 4:14 AM Ilya Kasnacheev <[email protected]>
>> wrote:
>>
>>> Hello!
>>>
>>> In case of long-lived, low-intensity streaming, Data Streamer will not
>>> be able to utilize its client-side per-partition batching capabilities,
>>> instead being just a wrapper over cache update operations, which are
>>> available as part of Cache API.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> Tue, Feb 4, 2020 at 03:41, Denis Magda <[email protected]>:
>>>
>>>> Ilya,
>>>>
>>>> I don't quite understand why the data streamer is not suitable as a
>>>> long-running solution. Please don't mislead; otherwise, list the specific
>>>> limitations. I don't see anything wrong with having an open data
>>>> streamer that transfers data to Ignite in real time.
>>>>
>>>> Narges, if the streamer crashes, then your service/app needs to resend
>>>> those records that were not acknowledged. You could probably use Kafka
>>>> Connect here, which keeps track of committed/pending records.
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Mon, Feb 3, 2020 at 6:13 AM Ilya Kasnacheev <
>>>> [email protected]> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> I think these benefits are imaginary. You will have to worry about the
>>>>> service more than about the data streamer, which may be recreated at any
>>>>> time.
>>>>>
>>>>> Regards,
>>>>> --
>>>>> Ilya Kasnacheev
>>>>>
>>>>>
>>>>> Mon, Feb 3, 2020 at 16:58, narges saleh <[email protected]>:
>>>>>
>>>>>> Thanks Ilya.
>>>>>> I have to listen to these bursts of data, which arrive every few
>>>>>> seconds, meaning an almost constant stream of bursts from different data
>>>>>> sources.
>>>>>> The main reason that the service grid is appealing to me is its
>>>>>> resiliency; I don't have to worry about it. With the client-side streamer,
>>>>>> I will have to deploy it, keep it up and running, and load/rebalance it.
>>>>>>
>>>>>> On Mon, Feb 3, 2020 at 7:17 AM Ilya Kasnacheev <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hello!
>>>>>>>
>>>>>>> I don't see why you would deploy it as a service; it sounds like you
>>>>>>> will have to send more data over the network. If you have to pull batches
>>>>>>> in, then a service should work. I recommend re-acquiring the data streamer
>>>>>>> for each batch.
>>>>>>>
>>>>>>> Please note that Data Streamer is very scalable, so it is preferable
>>>>>>> to tune it rather than to use more than one streamer.
>>>>>>>
>>>>>>> Regards,
>>>>>>> --
>>>>>>> Ilya Kasnacheev
>>>>>>>
>>>>>>>
>>>>>>> Mon, Feb 3, 2020 at 16:11, narges saleh <[email protected]>:
>>>>>>>
>>>>>>>> Hi Ilya
>>>>>>>> The data comes in huge batches of records (each burst can be up to
>>>>>>>> 50-100 MB, which I plan to spread across multiple streamers), so the
>>>>>>>> streamer seems to be the way to go. Also, I don't want to establish a
>>>>>>>> JDBC connection each time.
>>>>>>>> So, if the streamer is the way to go, is it feasible to deploy it
>>>>>>>> as a service?
>>>>>>>> thanks.
>>>>>>>>
>>>>>>>> On Mon, Feb 3, 2020 at 6:51 AM Ilya Kasnacheev <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>> Contrary to its name, the data streamer is not actually suitable for
>>>>>>>>> long-lived, low-intensity streaming. What it's good for is burst loading
>>>>>>>>> of a large amount of data in a short period of time.
>>>>>>>>>
>>>>>>>>> If your data arrives in large batches, you can use Data Streamer
>>>>>>>>> for each batch. If not, better use Cache API.
>>>>>>>>>
>>>>>>>>> If you are worried that the plain Cache API is slow, but also want
>>>>>>>>> failure resilience, there's a catch-22. The only way to make something
>>>>>>>>> resilient is to put it into the cache :)
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> --
>>>>>>>>> Ilya Kasnacheev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Mon, Feb 3, 2020 at 14:34, narges saleh <[email protected]>:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> But services are by definition long-lived, right? Here is my
>>>>>>>>>> layout: the data is continuously generated and sent to the streamer
>>>>>>>>>> services (via a JDBC connection with the SET STREAMING ON option),
>>>>>>>>>> deployed, say, as node singletons (actually also deployed as
>>>>>>>>>> microservices), to load the data into the caches. The streamers flush
>>>>>>>>>> data based on some timers.
>>>>>>>>>> If the streamer crashes before the buffer is flushed, the client
>>>>>>>>>> catches the exception and resends the batch. Any issue with this
>>>>>>>>>> layout?
>>>>>>>>>>
>>>>>>>>>> thanks.
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 3, 2020 at 5:02 AM Ilya Kasnacheev <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello!
>>>>>>>>>>>
>>>>>>>>>>> It is not recommended to have long-lived data streamers; it's
>>>>>>>>>>> best to acquire one when it is needed.
>>>>>>>>>>>
>>>>>>>>>>> If you have to keep a data streamer around, don't forget to
>>>>>>>>>>> flush() it. This way you don't have to worry about its queue.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> --
>>>>>>>>>>> Ilya Kasnacheev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Mon, Feb 3, 2020 at 13:24, narges saleh <[email protected]>:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> My specific question/concern is with regard to the state of the
>>>>>>>>>>>> streamer when it runs as a service, i.e. when it crashes and gets
>>>>>>>>>>>> redeployed. Specifically, what happens to the data?
>>>>>>>>>>>> I have a similar question with regard to the state of a
>>>>>>>>>>>> continuous query when it is deployed as a service: what happens to
>>>>>>>>>>>> the data in the listener's queue?
>>>>>>>>>>>>
>>>>>>>>>>>> thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 2, 2020 at 4:18 PM Mikael <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Not as far as I know. I have a number of services using streamers
>>>>>>>>>>>>> without any problems. Do you have any specific problem with it?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Mikael
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2020-02-02 at 22:33, narges saleh wrote:
>>>>>>>>>>>>> > Hi All,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Is there a problem with running the datastreamer as a service, being
>>>>>>>>>>>>> > instantiated in init method? Or loading the data via JDBC connection
>>>>>>>>>>>>> > with streaming mode enabled?
>>>>>>>>>>>>> > In either case, the deployment is affinity based.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > thanks.
>>>>>>>>>>>>>
>>>>>>>>>>>>
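
The two recommendations made in the thread (acquire a streamer for each batch, and have the service resend records that were not acknowledged) could be sketched roughly as below. This is only an illustration under stated assumptions, not Ignite code: `BatchLoader`, `Streamer` and `loadBatch` are hypothetical names, and `Streamer` is a stand-in interface whose close() is assumed to flush buffered data. With Ignite itself, the factory would presumably wrap `ignite.dataStreamer(cacheName)`, which is AutoCloseable. Resending a whole batch also assumes that writing the same key twice is acceptable.

```java
import java.util.List;
import java.util.function.Supplier;

class BatchLoader {
    /** Hypothetical stand-in for a data streamer: buffers on addData, sends on close. */
    interface Streamer extends AutoCloseable {
        void addData(String key, String value);

        /** Assumed to flush remaining buffered data; may throw if the flush fails. */
        @Override
        void close();
    }

    /**
     * Loads one batch with a freshly acquired streamer. If the streamer fails
     * before close() returns, the batch is treated as not acknowledged and is
     * resent with a new streamer. Returns the number of attempts used.
     */
    static int loadBatch(Supplier<Streamer> factory, List<String[]> batch, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A new streamer per attempt: nothing buffered in a failed one is reused.
            try (Streamer streamer = factory.get()) {
                for (String[] kv : batch) {
                    streamer.addData(kv[0], kv[1]);
                }
            } catch (RuntimeException e) {
                last = e;      // includes failures thrown by close()
                continue;      // resend the whole batch
            }
            return attempt;    // close() completed, so the batch was flushed
        }
        throw new IllegalStateException(
            "batch not acknowledged after " + maxAttempts + " attempts", last);
    }
}
```

A service would call loadBatch once per incoming burst, so no streamer outlives the batch it was acquired for.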
