Re: Continuous Query

narges saleh Wed, 07 Oct 2020 18:15:24 -0700

Thanks Raymond for the pointer.

On Wed, Oct 7, 2020 at 4:39 PM Raymond Wilson <raymond_wil...@trimble.com>
wrote:


> It's possible to configure the continuous query to return only keys in the
> cache that are stored on the local node.
>
> On the C# client we do it like this:
>
> var _queryHandle = queueCache.QueryContinuous
>             (qry: new ContinuousQuery<[..KeyType..],
> [...ValueType...]>(listener) {Local = true},
>               initialQry: new ScanQuery< [..KeyType..], [...ValueType...]
> > {Local = true});
>
> On Thu, Oct 8, 2020 at 9:53 AM narges saleh <snarges...@gmail.com> wrote:
>
>> Thanks for the detailed explanation.
>>
>> I don't know how efficient it would be if you have to filter each record
>> one by one and then update each record, three times, to keep track of the
>> status, if you're dealing with millions of records each hour, even if the
>> cache is partitioned. I guess I will need to benchmark this.  thanks again.
>>
>> On Wed, Oct 7, 2020 at 12:00 PM Denis Magda <dma...@apache.org> wrote:
>>
>>> I recalled a complete solution. That's what you would need to do if
>>> decide to process records in real-time with continuous queries in the
>>> *fault-tolerant fashion* (using pseudo-code rather than actual Ignite APIs).
>>>
>>> First. You need to add a flag field to the record's class that keeps the
>>> current processing status. Like that:
>>>
>>> MyRecord {
>>> int id;
>>> Date created;
>>> byte status; //0 - not processed, 1 - being processed withing a
>>> continuous query filter, 2 - processed by the filter, all the logic
>>> successfully completed
>>> }
>>>
>>> Second. The continuous query filter (that will be executed on nodes that
>>> store a copy of a record) needs to have the following structure.
>>>
>>> @IgniteAsyncCallback
>>> filterMethod(MyRecords updatedRecord) {
>>>
>>> if (isThisNodePrimaryNodeForTheRecord(updatedRecord)) { // execute on a
>>> primary node only
>>>     updatedRecord.status = 1 // setting the flag to signify the
>>> processing is started.
>>>
>>>     //execute your business logic
>>>
>>>     updatedRecord.status = 2 // finished processing
>>> }
>>>   return false; //you don't want a notification to be sent to the client
>>> application or another node that deployed the continuous query
>>> }
>>>
>>> Third. If any node leaves the cluster or the whole cluster is restarted,
>>> then you need to execute your custom for all the records with status=0 or
>>> status=1. To do that you can broadcast a compute task:
>>>
>>> // Application side
>>>
>>> int[] unprocessedRecords = "select id from MyRecord where status < 2;"
>>>
>>> IgniteCompute.affinityRun(idsOfUnprocessedRecords,
>>> taskWithMyCustomLogic); //the task will be executed only on the nodes that
>>> store the records
>>>
>>> // Server side
>>>
>>> taskWithMyCustomLogic() {
>>>     updatedRecord.status = 1 // setting the flag to signify the
>>> processing is started.
>>>
>>>     //execute your business logic
>>>
>>>     updatedRecord.status = 2 // finished processing
>>> }
>>>
>>>
>>> That's it. So, the third step already requires you to have a compute
>>> task that would run the calculation in case of failures. Thus, if the
>>> real-time aspect of the processing is not crucial right now, then you can
>>> start with the batch-based approach by running a compute task once at a
>>> time and then introduce the continuous queries-based improvement whenever
>>> is needed. You decide. Hope it helps.
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Wed, Oct 7, 2020 at 9:19 AM Denis Magda <dma...@apache.org> wrote:
>>>
>>>> So, a lesson for the future, would the continuous query approach still
>>>>> be preferable if the calculation involves the cache with continuous query
>>>>> and say a look up table? For example, if I want to see the country in the
>>>>> cache employee exists in the list of the countries that I am interested 
>>>>> in.
>>>>
>>>>
>>>> You can access other caches from within the filter but the logic has to
>>>> be executed asynchronously to avoid deadlocks:
>>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>
>>>> Also, what do I need to do if I want the filter for the continuous
>>>>> query to execute on the cache on the local node only? Say, I have the
>>>>> continuous query deployed as singleton service on each node, to capture
>>>>> certain changes to a cache on the local node.
>>>>
>>>>
>>>> <https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>
>>>>
>>>> The filter will be deployed and executed on every server node. The
>>>> filter is executed only on a server node that owns a record that is being
>>>> modified and passed into a filter. Hmm, it's also said that the filter can
>>>> be executed on a backup node. Check if it's true, and then you need to add
>>>> a special check into the filter that would allow executing the logic only
>>>> if it's the primary node:
>>>>
>>>> https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter
>>>>
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Wed, Oct 7, 2020 at 4:39 AM narges saleh <snarges...@gmail.com>
>>>> wrote:
>>>>
>>>>> Also, what do I need to do if I want the filter for the continuous
>>>>> query to execute on the cache on the local node only? Say, I have the
>>>>> continuous query deployed as singleton service on each node, to capture
>>>>> certain changes to a cache on the local node.
>>>>>
>>>>> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <snarges...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you, Denis.
>>>>>> So, a lesson for the future, would the continuous query approach
>>>>>> still be preferable if the calculation involves the cache with continuous
>>>>>> query and say a look up table? For example, if I want to see the country 
>>>>>> in
>>>>>> the cache employee exists in the list of the countries that I am 
>>>>>> interested
>>>>>> in.
>>>>>>
>>>>>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dma...@apache.org> wrote:
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Then, I would consider the continuous queries based solution as long
>>>>>>> as the records can be updated in real-time:
>>>>>>>
>>>>>>>    - You can process the records on the fly and don't need to come
>>>>>>>    up with any batch task.
>>>>>>>    - The continuous query filter will be executed once on a node
>>>>>>>    that stores the record's primary copy. If the primary node fails in 
>>>>>>> the
>>>>>>>    middle of the filter's calculation execution, then the filter will be
>>>>>>>    executed on a backup node. So, you will not lose any updates but 
>>>>>>> might need
>>>>>>>    to introduce some logic/flag that confirms the calculation is not 
>>>>>>> executed
>>>>>>>    twice for a single record (this can happen if the primary node 
>>>>>>> failed in
>>>>>>>    the middle of the calculation execution and then the backup node 
>>>>>>> picked up
>>>>>>>    and started executing the calculation from scratch).
>>>>>>>    - Updates of other tables or records from within the continuous
>>>>>>>    query filter must go through an async thread pool. You need to use
>>>>>>>    IgniteAsyncCallback annotation for that:
>>>>>>>    
>>>>>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>>>>
>>>>>>> Alternatively, you can always run the calculation in the
>>>>>>> batch-fashion:
>>>>>>>
>>>>>>>    - Run a compute task once in a while
>>>>>>>    - Read all the latest records that satisfy the requests with SQL
>>>>>>>    or any other APIs
>>>>>>>    - Complete the calculation, mark already processed records just
>>>>>>>    in case if everything is failed in the middle and you need to run the
>>>>>>>    calculation from scratch
>>>>>>>
>>>>>>>
>>>>>>> -
>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <snarges...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Denis
>>>>>>>>  The calculation itself doesn't involve an update or read of
>>>>>>>> another record, but based on the outcome of the calculation, the 
>>>>>>>> process
>>>>>>>> might make changes in some other tables.
>>>>>>>>
>>>>>>>> thanks.
>>>>>>>>
>>>>>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dma...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Good. Another clarification:
>>>>>>>>>
>>>>>>>>>    - Does that calculation change the state of the record
>>>>>>>>>    (updates any fields)?
>>>>>>>>>    - Does the calculation read or update any other records?
>>>>>>>>>
>>>>>>>>> -
>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <snarges...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The latter; the server needs to perform some calculations on the
>>>>>>>>>> data without sending any notification to the app.
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dma...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> And after you detect a record that satisfies the condition, do
>>>>>>>>>>> you need to send any notification to the application? Or is it more 
>>>>>>>>>>> like a
>>>>>>>>>>> server detects and does some calculation logically without updating 
>>>>>>>>>>> the app.
>>>>>>>>>>>
>>>>>>>>>>> -
>>>>>>>>>>> Denis
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <
>>>>>>>>>>> snarges...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>>>>>> record is inserted in the cache but all the detections are local 
>>>>>>>>>>>> to the
>>>>>>>>>>>> node. But some records with the current timestamp might show up in 
>>>>>>>>>>>> the
>>>>>>>>>>>> system with big delays.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dma...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> What are your requirements? Do you need to process the records
>>>>>>>>>>>>> as soon as they are put into the cluster?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Friday, October 2, 2020, narges saleh <snarges...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>>>>>> reliability, which approach is preferable? Does a continuous 
>>>>>>>>>>>>>> query based
>>>>>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dma...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Use continuous queries if you need to be notified in
>>>>>>>>>>>>>>> real-time, i.e. 1) a record is inserted, 2) the continuous 
>>>>>>>>>>>>>>> filter confirms
>>>>>>>>>>>>>>> the record's time satisfies your condition, 3) the continuous 
>>>>>>>>>>>>>>> queries
>>>>>>>>>>>>>>> notifies your application that does require processing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <
>>>>>>>>>>>>>>> snarges...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all
>>>>>>>>>>>>>>>> the records that get inserted to all my caches, is it more 
>>>>>>>>>>>>>>>> efficient to use
>>>>>>>>>>>>>>>> timer based jobs (that checks all the records in some 
>>>>>>>>>>>>>>>> interval) or
>>>>>>>>>>>>>>>> continuous queries that locally filter on the pattern? These 
>>>>>>>>>>>>>>>> records can
>>>>>>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>>>>>>> An example is to watch for all the records whose timestamp
>>>>>>>>>>>>>>>> ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> -
>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>
> --
> <http://www.trimble.com/>
> Raymond Wilson
> Solution Architect, Civil Construction Software Systems (CCSS)
> 11 Birmingham Drive | Christchurch, New Zealand
> +64-21-2013317 Mobile
> raymond_wil...@trimble.com
>
>
> <https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>
>

Re: Continuous Query

Reply via email to