Ilya, I don't quite understand why the data streamer is not suitable as a long-running solution. Please don't just say it's unsuitable; list the specific limitations. I don't see anything wrong with keeping a data streamer open that transfers data to Ignite in real time.
Narges, if the streamer crashes then your service/app needs to resend those records that were not acknowledged. You could probably use Kafka Connect here, which keeps track of committed/pending records.

- Denis

On Mon, Feb 3, 2020 at 6:13 AM Ilya Kasnacheev <[email protected]> wrote:

> Hello!
>
> I think these benefits are imaginary. You will have to worry more about the
> service than about a data streamer, which may be recreated at any time.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Mon, Feb 3, 2020 at 16:58, narges saleh <[email protected]> wrote:
>
>> Thanks Ilya.
>> I have to listen to these bursts of data, which arrive every few seconds,
>> meaning almost constant bursts from different data sources.
>> The main reason the service grid is appealing to me is its resiliency; I
>> don't have to worry about it. With a client-side streamer, I will have to
>> deploy it, keep it running, and load/rebalance it.
>>
>> On Mon, Feb 3, 2020 at 7:17 AM Ilya Kasnacheev <[email protected]>
>> wrote:
>>
>>> Hello!
>>>
>>> I don't see why you would deploy it as a service; it sounds like you
>>> will have to send more data over the network. If you have to pull
>>> batches in, then a service should work. I recommend re-acquiring the
>>> data streamer for each batch.
>>>
>>> Please note that the data streamer is very scalable, so it is preferable
>>> to tune it rather than try to use more than one streamer.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> On Mon, Feb 3, 2020 at 16:11, narges saleh <[email protected]> wrote:
>>>
>>>> Hi Ilya
>>>> The data comes in huge batches of records (each burst can be up to
>>>> 50-100 MB, which I plan to spread across multiple streamers), so the
>>>> streamer seems to be the way to go. Also, I don't want to establish a
>>>> JDBC connection each time.
>>>> So, if the streamer is the way to go, is it feasible to deploy it as a
>>>> service?
>>>> thanks.
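Ilya's advice to re-acquire the streamer for each batch can be sketched roughly as below. This is a minimal illustration, not code from the thread: the cache name `myCache`, the key/value types, and the buffer size are made up for the example, and it assumes a running Ignite cluster with the `ignite-core` dependency on the classpath.

```java
import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

public class PerBatchStreaming {
    /**
     * Loads one burst of records, acquiring a fresh streamer per batch.
     * try-with-resources closes the streamer on exit, and close() performs
     * a final flush of any remaining buffered entries.
     */
    static void loadBatch(Ignite ignite, Map<Long, String> batch) {
        try (IgniteDataStreamer<Long, String> streamer =
                 ignite.dataStreamer("myCache")) {
            // Tune a single streamer instead of running several in
            // parallel, as suggested above (value here is arbitrary).
            streamer.perNodeBufferSize(1024);

            for (Map.Entry<Long, String> e : batch.entrySet())
                streamer.addData(e.getKey(), e.getValue());
        }
        // If an exception escapes here, the batch was not fully
        // acknowledged and the caller must resend it, as Denis notes.
    }
}
```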
>>>>
>>>> On Mon, Feb 3, 2020 at 6:51 AM Ilya Kasnacheev <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> Contrary to its name, the data streamer is not actually suitable for
>>>>> long-lived, low-intensity streaming. What it is good for is burst
>>>>> loading of a large amount of data in a short period of time.
>>>>>
>>>>> If your data arrives in large batches, you can use a data streamer
>>>>> for each batch. If not, it is better to use the Cache API.
>>>>>
>>>>> If you are worried that the plain Cache API is slow, but also want
>>>>> failure resilience, there is a catch-22: the only way to make
>>>>> something resilient is to put it into a cache :)
>>>>>
>>>>> Regards,
>>>>> --
>>>>> Ilya Kasnacheev
>>>>>
>>>>>
>>>>> On Mon, Feb 3, 2020 at 14:34, narges saleh <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>> But services are by definition long-lived, right? Here is my layout:
>>>>>> the data is continuously generated and sent to the streamer services
>>>>>> (via a JDBC connection with streaming mode set on), deployed, say,
>>>>>> as a node singleton (actually also deployed as microservices) to
>>>>>> load the data into the caches. The streamers flush data based on
>>>>>> timers.
>>>>>> If the streamer crashes before the buffer is flushed, the client
>>>>>> catches the exception and resends the batch. Any issue with this
>>>>>> layout?
>>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> On Mon, Feb 3, 2020 at 5:02 AM Ilya Kasnacheev <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello!
>>>>>>>
>>>>>>> It is not recommended to have long-lived data streamers; it is best
>>>>>>> to acquire one when it is needed.
>>>>>>>
>>>>>>> If you have to keep a data streamer around, don't forget to flush()
>>>>>>> it. This way you don't have to worry about its queue.
>>>>>>>
>>>>>>> Regards,
>>>>>>> --
>>>>>>> Ilya Kasnacheev
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 3, 2020 at 13:24, narges saleh <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> My specific question/concern is about the state of the streamer
>>>>>>>> when it runs as a service, i.e. when it crashes and gets
>>>>>>>> redeployed. Specifically, what happens to the data?
>>>>>>>> I have a similar question about the state of a continuous query
>>>>>>>> when it is deployed as a service: what happens to the data in the
>>>>>>>> listener's queue?
>>>>>>>>
>>>>>>>> thanks.
>>>>>>>>
>>>>>>>> On Sun, Feb 2, 2020 at 4:18 PM Mikael <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi!
>>>>>>>>>
>>>>>>>>> Not as far as I know; I have a number of services using streamers
>>>>>>>>> without any problems. Do you have any specific problem with it?
>>>>>>>>>
>>>>>>>>> Mikael
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020-02-02 at 22:33, narges saleh wrote:
>>>>>>>>> > Hi All,
>>>>>>>>> >
>>>>>>>>> > Is there a problem with running the data streamer as a service,
>>>>>>>>> > being instantiated in the init method? Or loading the data via
>>>>>>>>> > a JDBC connection with streaming mode enabled?
>>>>>>>>> > In either case, the deployment is affinity-based.
>>>>>>>>> >
>>>>>>>>> > thanks.
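The layout debated in the thread, a service that owns a streamer created in init() and flushed on a timer, could be sketched as follows. This is an assumption-laden illustration, not code from the thread: the cache name `myCache` and the 1-second auto-flush interval are invented, the record source is left as a placeholder, and it requires a running Ignite cluster.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

public class StreamerService implements Service {
    /** Injected by Ignite on the node where the service is deployed. */
    @IgniteInstanceResource
    private Ignite ignite;

    /** Transient: the streamer is per-deployment, never serialized. */
    private transient IgniteDataStreamer<Long, String> streamer;

    @Override public void init(ServiceContext ctx) {
        streamer = ignite.dataStreamer("myCache");
        // Flush buffered entries periodically rather than letting them
        // sit in the queue, per Ilya's advice for long-lived streamers.
        streamer.autoFlushFrequency(1_000);
    }

    @Override public void execute(ServiceContext ctx) {
        while (!ctx.isCancelled()) {
            // Pull the next record from the external source and add it
            // (hypothetical; the real source is outside this sketch):
            // streamer.addData(key, value);
        }
    }

    @Override public void cancel(ServiceContext ctx) {
        if (streamer != null)
            streamer.close(); // final flush on redeploy/shutdown
    }
}
```

Note the trade-off raised by Denis and Ilya still applies: entries buffered but not yet flushed when the node crashes are lost, so the upstream producer must track and resend unacknowledged records.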
