Vladimir,

The partition counter is supposed to be used internally to solve the duplication issue. Does that sound like the right approach then?
What would be the approach for SQL queries? I'm not sure the partition counter is applicable there.

--
Denis

On Thu, Dec 13, 2018 at 11:16 AM Vladimir Ozerov <voze...@gridgain.com> wrote:

> The partition counter is an internal implementation detail, which has no
> sensible meaning to end users. It should not be exposed through the
> public API.
>
> On Thu, Dec 13, 2018 at 10:14 PM Denis Magda <dma...@apache.org> wrote:
>
> > Hello Piotr,
> >
> > That's a known problem and I thought a JIRA ticket already existed.
> > However, I failed to locate it. A ticket for the improvement should be
> > created as a result of this conversation.
> >
> > Speaking of the initial query type, I would differentiate between
> > ScanQueries and SqlQueries. For the former, it sounds reasonable to
> > apply the partitionCounter logic. As for the latter, Vladimir Ozerov,
> > will it be addressed as part of the MVCC/Transactional SQL activities?
> >
> > Btw, Piotr, what's your initial query type?
> >
> > --
> > Denis
> >
> > On Thu, Dec 13, 2018 at 3:28 AM Piotr Romański
> > <piotr.roman...@gmail.com> wrote:
> >
> > > Hi, as suggested by Ilya here:
> > >
> > > http://apache-ignite-users.70518.x6.nabble.com/Continuous-queries-and-duplicates-td25314.html
> > >
> > > I'm resending it to the developers list.
> > >
> > > From that thread we know that there might be duplicates between the
> > > initial query results and the listener entries received as part of a
> > > continuous query. That means that users need to deduplicate the data
> > > manually.
> > >
> > > In my opinion, manual deduplication may lead to memory problems on
> > > the client side in some use cases. In order to remove the duplicated
> > > notifications that we receive in the local listener, we need to keep
> > > all initial query results in memory (or at least their unique ids).
> > > Unfortunately, there is no way (is there?) to find a point in time
> > > when we can be sure that no more duplicates will arrive. That would
> > > mean that we need to keep that data indefinitely and use it every
> > > time a new notification arrives. In the case of multiple continuous
> > > queries run from a single JVM, this might eventually become a memory
> > > or performance problem. I can see the following possible improvements
> > > to Ignite:
> > >
> > > 1. The deduplication between the initial query and incoming
> > > notifications could be done fully in Ignite. As far as I know, there
> > > is already an updateCounter and a partition id for all the objects,
> > > so they could be used internally.
> > >
> > > 2. Add a guarantee that notifications arriving in the local listener
> > > after the query() method returns are not duplicates. This kind of
> > > functionality would require specific synchronization inside Ignite.
> > > It would also mean that the query() method cannot return before all
> > > potential duplicates are processed by a local listener, which looks
> > > wrong.
> > >
> > > 3. Notify users that, starting from a given notification, they can be
> > > sure they will not receive any more duplicates. This could be an
> > > additional boolean flag in the CacheQueryEntryEvent.
> > >
> > > 4. CacheQueryEntryEvent already exposes the partitionUpdateCounter.
> > > Unfortunately, we don't have this information for initial query
> > > results. If we had it, a client could manually deduplicate
> > > notifications and get rid of the initial query results for a given
> > > partition after newer notifications arrive. It would also be very
> > > convenient to expose the partition id as well, but for now we can
> > > figure it out using the affinity service. The assumption here is that
> > > notifications are ordered by partitionUpdateCounter (is it true?).
> > >
> > > Please correct me if I'm missing anything.
> > >
> > > What do you think?
> > >
> > > Piotr
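
For reference, the client-side bookkeeping described in Piotr's proposal 4 could look roughly like the sketch below. It is plain Java with no Ignite dependency, and the class InitialQueryDedup and its methods are hypothetical names, not part of any Ignite API. It rests on two assumptions taken from the proposal itself: that initial query results would carry a partition update counter (they do not today), and that notifications within a partition arrive ordered by that counter (the open question at the end of item 4).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical client-side helper; not part of the Ignite API. */
final class InitialQueryDedup<K> {
    /** Initial query results still held for one partition. */
    private static final class PartitionState<K> {
        final Map<K, Long> counterByKey = new HashMap<>();
        long maxCounter = Long.MIN_VALUE;
    }

    /** Partitions for which initial results are still kept, by partition id. */
    private final Map<Integer, PartitionState<K>> pending = new HashMap<>();

    /**
     * Records one initial query row. ASSUMPTION: the row exposes the
     * partition update counter, which proposal 4 asks for.
     */
    void recordInitial(int partition, K key, long updateCounter) {
        PartitionState<K> st =
            pending.computeIfAbsent(partition, p -> new PartitionState<>());
        st.counterByKey.merge(key, updateCounter, Math::max);
        st.maxCounter = Math.max(st.maxCounter, updateCounter);
    }

    /**
     * Returns true if a listener notification should be delivered,
     * false if it duplicates an initial query row.
     * ASSUMPTION: notifications of one partition are ordered by counter.
     */
    boolean accept(int partition, K key, long updateCounter) {
        PartitionState<K> st = pending.get(partition);
        if (st == null) {
            return true; // nothing kept for this partition any more
        }
        if (updateCounter > st.maxCounter) {
            // Newer than every initial row: under the ordering assumption
            // no further duplicates can follow, so free the memory.
            pending.remove(partition);
            return true;
        }
        Long initial = st.counterByKey.get(key);
        return initial == null || updateCounter > initial;
    }

    /** Number of partitions whose initial results are still held. */
    int trackedPartitions() {
        return pending.size();
    }
}
```

A client would call recordInitial() for every row of the initial query cursor and accept() from the local listener. The point of the design is that state for a partition is discarded as soon as one notification with a higher counter arrives, which addresses the "keep that data indefinitely" concern. If the ordering assumption does not hold, the early discard is not safe.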