It's possible to configure the continuous query to return only keys in the cache that are stored on the local node.
On the C# client we do it like this:

var _queryHandle = queueCache.QueryContinuous(
    qry: new ContinuousQuery<[..KeyType..], [...ValueType...]>(listener) { Local = true },
    initialQry: new ScanQuery<[..KeyType..], [...ValueType...]> { Local = true });

On Thu, Oct 8, 2020 at 9:53 AM narges saleh <snarges...@gmail.com> wrote:

> Thanks for the detailed explanation.
>
> I don't know how efficient it would be if you have to filter each record
> one by one and then update each record, three times, to keep track of the
> status, if you're dealing with millions of records each hour, even if the
> cache is partitioned. I guess I will need to benchmark this. Thanks again.
>
> On Wed, Oct 7, 2020 at 12:00 PM Denis Magda <dma...@apache.org> wrote:
>
>> I recalled a complete solution. That's what you would need to do if you
>> decide to process records in real-time with continuous queries in the
>> *fault-tolerant fashion* (using pseudo-code rather than actual Ignite APIs).
>>
>> First. You need to add a flag field to the record's class that keeps the
>> current processing status. Like that:
>>
>> MyRecord {
>>     int id;
>>     Date created;
>>     byte status; // 0 - not processed,
>>                  // 1 - being processed within a continuous query filter,
>>                  // 2 - processed by the filter, all the logic successfully completed
>> }
>>
>> Second. The continuous query filter (that will be executed on nodes that
>> store a copy of a record) needs to have the following structure.
>>
>> @IgniteAsyncCallback
>> filterMethod(MyRecord updatedRecord) {
>>
>>     if (isThisNodePrimaryNodeForTheRecord(updatedRecord)) { // execute on a primary node only
>>         updatedRecord.status = 1 // setting the flag to signify the processing is started.
>>
>>         // execute your business logic
>>
>>         updatedRecord.status = 2 // finished processing
>>     }
>>     return false; // you don't want a notification to be sent to the client
>>                   // application or another node that deployed the continuous query
>> }
>>
>> Third.
>> If any node leaves the cluster or the whole cluster is restarted,
>> then you need to execute your custom logic for all the records with status=0 or
>> status=1. To do that you can broadcast a compute task:
>>
>> // Application side
>>
>> int[] idsOfUnprocessedRecords = "select id from MyRecord where status < 2;"
>>
>> IgniteCompute.affinityRun(idsOfUnprocessedRecords,
>>     taskWithMyCustomLogic); // the task will be executed only on the nodes that
>>                             // store the records
>>
>> // Server side
>>
>> taskWithMyCustomLogic() {
>>     updatedRecord.status = 1 // setting the flag to signify the processing is started.
>>
>>     // execute your business logic
>>
>>     updatedRecord.status = 2 // finished processing
>> }
>>
>>
>> That's it. So, the third step already requires you to have a compute task
>> that would run the calculation in case of failures. Thus, if the real-time
>> aspect of the processing is not crucial right now, then you can start with
>> the batch-based approach by running a compute task once at a time and then
>> introduce the continuous queries-based improvement whenever it is needed. You
>> decide. Hope it helps.
>>
>>
>> -
>> Denis
>>
>>
>> On Wed, Oct 7, 2020 at 9:19 AM Denis Magda <dma...@apache.org> wrote:
>>
>>> So, a lesson for the future, would the continuous query approach still
>>>> be preferable if the calculation involves the cache with continuous query
>>>> and say a look up table? For example, if I want to see the country in the
>>>> cache employee exists in the list of the countries that I am interested in.
>>>
>>>
>>> You can access other caches from within the filter but the logic has to
>>> be executed asynchronously to avoid deadlocks:
>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>
>>> Also, what do I need to do if I want the filter for the continuous query
>>>> to execute on the cache on the local node only?
>>>> Say, I have the continuous
>>>> query deployed as singleton service on each node, to capture certain
>>>> changes to a cache on the local node.
>>>
>>>
>>> <https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>
>>>
>>> The filter will be deployed and executed on every server node. The
>>> filter is executed only on a server node that owns a record that is being
>>> modified and passed into a filter. Hmm, it's also said that the filter can
>>> be executed on a backup node. Check if it's true, and then you need to add
>>> a special check into the filter that would allow executing the logic only
>>> if it's the primary node:
>>>
>>> https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Wed, Oct 7, 2020 at 4:39 AM narges saleh <snarges...@gmail.com> wrote:
>>>
>>>> Also, what do I need to do if I want the filter for the continuous
>>>> query to execute on the cache on the local node only? Say, I have the
>>>> continuous query deployed as singleton service on each node, to capture
>>>> certain changes to a cache on the local node.
>>>>
>>>> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <snarges...@gmail.com> wrote:
>>>>
>>>>> Thank you, Denis.
>>>>> So, a lesson for the future, would the continuous query approach still
>>>>> be preferable if the calculation involves the cache with continuous query
>>>>> and say a look up table? For example, if I want to see the country in the
>>>>> cache employee exists in the list of the countries that I am interested in.
>>>>>
>>>>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dma...@apache.org> wrote:
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Then, I would consider the continuous queries based solution as long
>>>>>> as the records can be updated in real-time:
>>>>>>
>>>>>>    - You can process the records on the fly and don't need to come
>>>>>>      up with any batch task.
>>>>>>    - The continuous query filter will be executed once on a node
>>>>>>      that stores the record's primary copy. If the primary node fails in the
>>>>>>      middle of the filter's calculation execution, then the filter will be
>>>>>>      executed on a backup node. So, you will not lose any updates but might need
>>>>>>      to introduce some logic/flag that confirms the calculation is not executed
>>>>>>      twice for a single record (this can happen if the primary node failed in
>>>>>>      the middle of the calculation execution and then the backup node picked up
>>>>>>      and started executing the calculation from scratch).
>>>>>>    - Updates of other tables or records from within the continuous
>>>>>>      query filter must go through an async thread pool. You need to use
>>>>>>      the IgniteAsyncCallback annotation for that:
>>>>>>
>>>>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>>>
>>>>>> Alternatively, you can always run the calculation in the
>>>>>> batch-fashion:
>>>>>>
>>>>>>    - Run a compute task once in a while
>>>>>>    - Read all the latest records that satisfy the requests with SQL
>>>>>>      or any other APIs
>>>>>>    - Complete the calculation, mark already processed records just
>>>>>>      in case everything fails in the middle and you need to run the
>>>>>>      calculation from scratch
>>>>>>
>>>>>>
>>>>>> -
>>>>>> Denis
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <snarges...@gmail.com> wrote:
>>>>>>
>>>>>>> Denis
>>>>>>> The calculation itself doesn't involve an update or read of another
>>>>>>> record, but based on the outcome of the calculation, the process might make
>>>>>>> changes in some other tables.
>>>>>>>
>>>>>>> thanks.
>>>>>>>
>>>>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dma...@apache.org> wrote:
>>>>>>>
>>>>>>>> Good.
>>>>>>>> Another clarification:
>>>>>>>>
>>>>>>>>    - Does that calculation change the state of the record (updates
>>>>>>>>      any fields)?
>>>>>>>>    - Does the calculation read or update any other records?
>>>>>>>>
>>>>>>>> -
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <snarges...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> The latter; the server needs to perform some calculations on the
>>>>>>>>> data without sending any notification to the app.
>>>>>>>>>
>>>>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dma...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> And after you detect a record that satisfies the condition, do
>>>>>>>>>> you need to send any notification to the application? Or is it more like a
>>>>>>>>>> server detects and does some calculation logically without updating
>>>>>>>>>> the app?
>>>>>>>>>>
>>>>>>>>>> -
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <snarges...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>>>>>> system with big delays.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dma...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> What are your requirements? Do you need to process the records
>>>>>>>>>>>> as soon as they are put into the cluster?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, October 2, 2020, narges saleh <snarges...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you Denis for the reply.
>>>>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>>>>> reliability, which approach is preferable?
>>>>>>>>>>>>> Does a continuous query based
>>>>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dma...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Use continuous queries if you need to be notified in
>>>>>>>>>>>>>> real-time, i.e. 1) a record is inserted, 2) the continuous filter confirms
>>>>>>>>>>>>>> the record's time satisfies your condition, 3) the continuous query
>>>>>>>>>>>>>> notifies your application that does require processing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <snarges...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>> If I want to watch for a rolling timestamp pattern in all
>>>>>>>>>>>>>>> the records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>>>>>> timer based jobs (that check all the records in some interval) or
>>>>>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>>>>>> get inserted in any order and some can arrive with delays.
>>>>>>>>>>>>>>> An example is to watch for all the records whose timestamp
>>>>>>>>>>>>>>> ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> -
>>>>>>>>>>>> Denis
>>>>>>>>>>>>

--
<http://www.trimble.com/>
Raymond Wilson
Solution Architect, Civil Construction Software Systems (CCSS)
11 Birmingham Drive | Christchurch, New Zealand
+64-21-2013317 Mobile
raymond_wil...@trimble.com
<https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>