Re: Performance vs correctness: I vote fore the second
Well, I agree that's arguable :) But keeping in mind that the cache is also used for atomics, a replicated cache doesn't make sense in any case. Setting the default number of backups to 1 for the atomics and collections caches should be good enough.

-Val

On Thu, Apr 20, 2017 at 3:09 PM, Vladimir Ozerov wrote:
> Valya,
>
> Why do you think locks should be in REPLICATED cache? It will make their
> performance so poor, that users are likely to give using them :-)
Re: Performance vs correctness: I vote fore the second
Valya,

Why do you think locks should be in a REPLICATED cache? It would make their performance so poor that users are likely to give up using them :-)

On Thu, Apr 20, 2017 at 4:04 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
> Yeah, that's a very good point. However, the problem here is that we use
> single cache for very different structures. For atomics, for example,
> partitioned cache makes sense (usually with 1 or more backups though).
> While reentrant locks should always be in replicated cache in my view (or
> at least by default). Currently it's one or another for both.
>
> -Val
Re: Performance vs correctness: I vote fore the second
Yeah, that's a very good point. However, the problem here is that we use a single cache for very different structures. For atomics, for example, a partitioned cache makes sense (usually with 1 or more backups, though), while reentrant locks should, in my view, always be in a replicated cache (or at least be there by default). Currently it's one or the other for both.

-Val

On Thu, Apr 20, 2017 at 2:37 PM, Vladimir Ozerov wrote:
> Evgeniy,
>
> Good catch! I personally had to explain users several time why they loose
> data in these cases with default configuration.
> "AtomicConfiguration.backups = 0" and "CollectionConfiguration.backups = 0"
> as defaults is nonsense.
Re: Performance vs correctness: I vote fore the second
Evgeniy,

Good catch! I have personally had to explain to users several times why they lose data in these cases with the default configuration. "AtomicConfiguration.backups = 0" and "CollectionConfiguration.backups = 0" as defaults are nonsense.

On Thu, Apr 20, 2017 at 3:27 PM, Evgeniy Stanilovskiy <estanilovs...@gridgain.com> wrote:
> Guys, hope i can add one more example here.
> Ones we use IgniteAtomicSequence, after topology changes some assertions
> can be catched due to default AtomicConfiguration
> i.e.
> public static final int DFLT_BACKUPS = 0;
> public static final CacheMode DFLT_CACHE_MODE = PARTITIONED;
>
> minimal improvements here would be to set DFLT_BACKUPS = 1; or change into
> REPLICATED mode.
>
> thanks.
Re: Performance vs correctness: I vote fore the second
Guys, I hope I can add one more example here.
Once we use IgniteAtomicSequence, assertions can be triggered after topology changes because of the default AtomicConfiguration, i.e.

    public static final int DFLT_BACKUPS = 0;
    public static final CacheMode DFLT_CACHE_MODE = PARTITIONED;

A minimal improvement here would be to set DFLT_BACKUPS = 1, or to change to REPLICATED mode.

Thanks.

> Folks,
>
> I received a number of complaints from users that our default settings favor
> performance at the cost of correctness and subtle behavior. Yesterday I
> faced one such situation on my own.
>
> I started a REPLICATED cache on several nodes, put some data, executed a
> simple SQL query and got a wrong result. No errors, no warnings. The problem
> was caused by the default PRIMARY_SYNC mode. WTF, our cache doesn't work out
> of the box!
>
> Other widely known examples are data streamer behavior and "read from
> backups" + continuous queries.
>
> I propose to change our defaults to favor *correctness* over performance,
> and create good documentation and JavaDocs to explain to users how to tune
> our product. Proposed changes:
>
> 1) FULL_SYNC as default;
> 2) "readFromBackups=false" as default;
> 3) "IgniteDataStreamer.allowOverwrite=true" as default.
>
> Users should not have to think about how to make Ignite work correctly. It
> should be correct out of the box.
>
> Vladimir.
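For readers who want to opt into the proposed behavior today rather than wait for the defaults to change, the settings named above can be applied explicitly. The following is only a configuration sketch, assuming `ignite-core` is on the classpath; the class name `CorrectnessFirstConfig` and the cache name "demo" are hypothetical, and note that the cache property is spelled `readFromBackup` in the API.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.AtomicConfiguration;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CorrectnessFirstConfig {
    public static void main(String[] args) {
        // Keep one backup for atomics (sequences, longs, ...) instead of relying on a default of 0.
        AtomicConfiguration atomicCfg = new AtomicConfiguration();
        atomicCfg.setCacheMode(CacheMode.PARTITIONED);
        atomicCfg.setBackups(1);

        IgniteConfiguration igniteCfg = new IgniteConfiguration();
        igniteCfg.setAtomicConfiguration(atomicCfg);

        // "demo" is a hypothetical cache name used only for illustration.
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("demo");
        cacheCfg.setCacheMode(CacheMode.REPLICATED);
        // 1) Wait for all copies to be updated before a write returns.
        cacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
        // 2) Do not serve reads from backup copies.
        cacheCfg.setReadFromBackup(false);

        try (Ignite ignite = Ignition.start(igniteCfg)) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cacheCfg);

            try (IgniteDataStreamer<Integer, String> streamer = ignite.dataStreamer(cache.getName())) {
                // 3) Let the streamer overwrite existing entries.
                streamer.allowOverwrite(true);
                streamer.addData(1, "one");
            }
        }
    }
}
```

Each of these settings trades some throughput for predictability, which is exactly the trade-off under discussion in this thread.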
Re: Performance vs correctness: I vote fore the second
I agree with Yakov. Let's not swing like a pendulum from one side to the other. Why not switch to FULL_SYNC only for REPLICATED caches?

D.

On Tue, Apr 18, 2017 at 5:48 AM, Yakov Zhdanov wrote:
> Sergi, most probably, performance will not be vert much affected since we
> can batch delayed responses and this for primary_sync only. However, I
> agree with your point about overcomplicated code, but I did not state that
> it would be trivial.
>
> So, what is the solution here? Switch replicated cache to full_sync by
> default? This seems very simple from coding standpoint and should work for
> most deployments.
>
> --Yakov
Re: Performance vs correctness: I vote fore the second
Sergi, most probably performance will not be affected very much, since we can batch delayed responses, and this is for primary_sync only. However, I agree with your point about overcomplicated code, but I did not state that it would be trivial.

So, what is the solution here? Switch replicated caches to full_sync by default? This seems very simple from a coding standpoint and should work for most deployments.

--Yakov
Re: Performance vs correctness: I vote fore the second
Yakov,

The idea of tracking current operations and waiting if needed looks overcomplicated and will most probably result in a performance drop. I think there is no way to have this guarantee with PRIMARY_SYNC in the general case.

Sergi

2017-04-18 13:25 GMT+03:00 Yakov Zhdanov:
> Guys, what if we look at this from another point - we can switch to read
> from primary only if there is any primary_sync operation that is not acked
> by backups yet. Or we can wait until all operations of the kind are acked
> and then proceed with query. This seems to work when we have query after
> sequence of puts, but fails if we have sequence of puts then compute job
> spawning a query from remote node. And this seems to bring lots of
> complications to cache update protocol.
>
> Given this I would vote for switching default (probably, for replicated
> cache only) to full_sync and output a performance warning.
>
> However, there is still an open question - how can I guarantee query
> consistency with primary_sync?
>
> --Yakov
Re: Performance vs correctness: I vote fore the second
Guys, what if we look at this from another angle: we can switch to reading from primary only if there is any primary_sync operation that has not been acked by backups yet. Or we can wait until all such operations are acked and then proceed with the query. This seems to work when we have a query after a sequence of puts, but fails if we have a sequence of puts followed by a compute job that spawns a query from a remote node. And this seems to bring lots of complications to the cache update protocol.

Given this, I would vote for switching the default (probably for replicated caches only) to full_sync and outputting a performance warning.

However, there is still an open question: how can I guarantee query consistency with primary_sync?

--Yakov
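To make the idea above concrete, here is a toy, single-process model of one primary copy and one backup copy. It is only a sketch under simplifying assumptions and does not use or reflect Ignite internals: the class `PrimarySyncModel` and its methods are hypothetical names, and "ack" is simulated by an explicit `ackAll()` call. It shows a stale backup read under primary-sync, and a guarded read that falls back to the primary while updates are unacked.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model (hypothetical, not Ignite code) of a primary copy and one backup copy. */
public class PrimarySyncModel {
    private final Map<String, Integer> primary = new HashMap<>();
    private final Map<String, Integer> backup = new HashMap<>();
    /** Backup updates that were sent but not yet applied and acknowledged. */
    private final Queue<Map.Entry<String, Integer>> pending = new ArrayDeque<>();
    private final boolean fullSync;

    public PrimarySyncModel(boolean fullSync) {
        this.fullSync = fullSync;
    }

    /** Returns once the primary is updated; in full-sync mode the backup is updated first too. */
    public void put(String key, int val) {
        primary.put(key, val);
        pending.add(new AbstractMap.SimpleEntry<>(key, val));
        if (fullSync)
            ackAll();
    }

    /** Simulates the backup applying and acknowledging every outstanding update. */
    public void ackAll() {
        Map.Entry<String, Integer> e;
        while ((e = pending.poll()) != null)
            backup.put(e.getKey(), e.getValue());
    }

    /** Naive read that always uses the backup copy; it may be stale in primary-sync mode. */
    public Integer readFromBackup(String key) {
        return backup.get(key);
    }

    /** Guarded read: uses the primary while any update is still unacked by the backup. */
    public Integer readGuarded(String key) {
        return pending.isEmpty() ? backup.get(key) : primary.get(key);
    }

    public static void main(String[] args) {
        PrimarySyncModel cache = new PrimarySyncModel(false);
        cache.put("k", 1);
        System.out.println("backup read before ack:  " + cache.readFromBackup("k"));
        System.out.println("guarded read before ack: " + cache.readGuarded("k"));
        cache.ackAll();
        System.out.println("backup read after ack:   " + cache.readFromBackup("k"));
    }
}
```

The model keeps the pending-update queue in the same object as the reader, which is precisely what a real cluster cannot assume: a query spawned from a remote node has no cheap way to know which updates are still unacked, and that is the failing case described above.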
Re: Performance vs correctness: I vote fore the second
Dima,

If we change the behavior of REPLICATED caches this way, then instead of nice scalability out of the box, users will have slow distributed joins by default. All we need to do is set strict FULL_SYNC as the default.

On Tue, Apr 18, 2017 at 12:26 PM, Dmitriy Setrakyan wrote:
> On Tue, Apr 18, 2017 at 2:21 AM, Sergi Vladykin wrote:
>
> > We never read from backups on partitioned cache, but for replicated we do
> > that to be able to execute the whole query on single node locally.
>
> But I thought that we agreed to change that behavior and have REPLICATED
> cache work the same as PARTITIONED. I think Valentin provided a link to the
> discussion we had on the dev list.
>
> I would not make FULL_SYNC as default, but I would definitely fix this
> behavior for the REPLICATED caches.
>
> D.
Re: Performance vs correctness: I vote fore the second
It only means that we will always parse the query and check whether it contains only replicated tables. If it does, then we execute the query on a single node across all the partitions.

Sergi

2017-04-18 12:26 GMT+03:00 Dmitriy Setrakyan:
> On Tue, Apr 18, 2017 at 2:21 AM, Sergi Vladykin wrote:
>
> > We never read from backups on partitioned cache, but for replicated we do
> > that to be able to execute the whole query on single node locally.
>
> But I thought that we agreed to change that behavior and have REPLICATED
> cache work the same as PARTITIONED. I think Valentin provided a link to the
> discussion we had on the dev list.
>
> I would not make FULL_SYNC as default, but I would definitely fix this
> behavior for the REPLICATED caches.
>
> D.
Re: Performance vs correctness: I vote fore the second
On Tue, Apr 18, 2017 at 2:21 AM, Sergi Vladykin wrote:
> We never read from backups on partitioned cache, but for replicated we do
> that to be able to execute the whole query on single node locally.

But I thought that we agreed to change that behavior and have REPLICATED caches work the same as PARTITIONED. I think Valentin provided a link to the discussion we had on the dev list.

I would not make FULL_SYNC the default, but I would definitely fix this behavior for REPLICATED caches.

D.
Re: Performance vs correctness: I vote fore the second
We never read from backups on a partitioned cache, but for a replicated cache we do, so that the whole query can be executed locally on a single node.

Sergi

2017-04-18 12:07 GMT+03:00 Dmitriy Setrakyan:
> Sergi, I am confused. If we don't read from backups, then why do we care
> about sync or async backup updates?
Re: Performance vs correctness: I vote fore the second
Sergi, I am confused. If we don't read from backups, then why do we care about sync or async backup updates?

On Tue, Apr 18, 2017 at 1:11 AM, Sergi Vladykin wrote:
> Val,
>
> There we were not able to run queries against partitioned tables using
> replicated cache API (I already fixed that in master).
>
> Here we are talking about query result inconsistency in case of PRIMARY_SYNC
> because of async backup update.
>
> Sergi
Re: Performance vs correctness: I vote fore the second
Val,

In that discussion, we were not able to run queries against partitioned tables using the replicated cache API (I have already fixed that in master).

Here we are talking about query result inconsistency under PRIMARY_SYNC caused by asynchronous backup updates.

Sergi

2017-04-18 11:04 GMT+03:00 Valentin Kulichenko <valentin.kuliche...@gmail.com>:
> Can you please elaborate then? What is the logic there?
>
> -Val
Re: Performance vs correctness: I vote fore the second
Can you please elaborate then? What is the logic there?

-Val

On Tue, Apr 18, 2017 at 9:55 AM, Sergi Vladykin wrote:
> Val,
>
> That discussion has nothing to do with this PRIMARY_SYNC problem.
>
> Sergi
Re: Performance vs correctness: I vote fore the second
Val,

That discussion has nothing to do with this PRIMARY_SYNC problem.

Sergi

2017-04-18 10:51 GMT+03:00 Valentin Kulichenko <valentin.kuliche...@gmail.com>:

> Sergi,
>
> I'm talking about this discussion:
> http://apache-ignite-developers.2346864.n4.nabble.com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html
>
> -Val
Re: Performance vs correctness: I vote fore the second
Sergi,

I'm talking about this discussion:
http://apache-ignite-developers.2346864.n4.nabble.com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html

-Val

On Tue, Apr 18, 2017 at 9:46 AM, Vladimir Ozerov wrote:

> Val,
>
> PRIMARY_SYNC doesn't work correctly with the most common case of SQL query
> execution over REPLICATED cache. Also it has weird consequences for
> continuous queries when coupled with another performance-over-correctness
> property "readFromBackup=true": user may receive CQ notification with new
> value, but subsequent GET on local node may return old value.
Re: Performance vs correctness: I vote fore the second
Val,

I'm not sure I understand which optimization you are talking about and what exactly you decided to switch off. Can you explain, please?

Sergi

2017-04-18 10:42 GMT+03:00 Valentin Kulichenko <valentin.kuliche...@gmail.com>:

> This sounds more like an issue with query execution, rather than wrong
> PRIMARY_SYNC behavior. We already had a discussion about this optimization
> in replicated cache and decided to switch it off by default.
>
> -Val
Re: Performance vs correctness: I vote fore the second
Val,

PRIMARY_SYNC doesn't work correctly with the most common case of SQL query execution over REPLICATED cache. Also it has weird consequences for continuous queries when coupled with another performance-over-correctness property "readFromBackup=true": user may receive CQ notification with new value, but subsequent GET on local node may return old value.

On Tue, Apr 18, 2017 at 10:42 AM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:

> This sounds more like an issue with query execution, rather than wrong
> PRIMARY_SYNC behavior. We already had a discussion about this optimization
> in replicated cache and decided to switch it off by default.
>
> -Val
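To make the continuous-query case in this message concrete, here is a toy Java model. It is not Ignite code: the class `ReadFromBackupSketch`, its two maps, and the method names are hypothetical, and the ordering (notification fires before the local backup copy is updated) is an assumption chosen to mirror the behavior described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Toy model only: a primary copy on a remote node and a backup copy on the local node. */
class ReadFromBackupSketch {
    private final Map<String, String> primaryCopy = new HashMap<>();
    private final Map<String, String> localBackupCopy = new HashMap<>();
    private final boolean readFromBackup;

    ReadFromBackupSketch(boolean readFromBackup) {
        this.readFromBackup = readFromBackup;
    }

    /** A GET issued on the local node: served locally only if reads from backups are allowed. */
    String get(String key) {
        return readFromBackup ? localBackupCopy.get(key) : primaryCopy.get(key);
    }

    /** PRIMARY_SYNC-like update (assumed ordering): notify after the primary, before the backup. */
    void put(String key, String val, BiConsumer<String, String> listener) {
        primaryCopy.put(key, val);
        listener.accept(key, val);      // notification carries the new value
        localBackupCopy.put(key, val);  // backup catches up only afterwards
    }

    /** Returns what a GET issued from inside the notification callback observes. */
    static String valueSeenInCallback(boolean readFromBackup) {
        ReadFromBackupSketch cache = new ReadFromBackupSketch(readFromBackup);
        cache.put("k", "old", (k, v) -> { });
        String[] seen = new String[1];
        cache.put("k", "new", (k, v) -> seen[0] = cache.get(k));
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("readFromBackup=true  -> GET sees: " + valueSeenInCallback(true));
        System.out.println("readFromBackup=false -> GET sees: " + valueSeenInCallback(false));
    }
}
```

In this model the callback is told about the new value, yet a local GET still returns the previous one whenever reads may be served from the not-yet-updated backup copy.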
Re: Performance vs correctness: I vote fore the second
This sounds more like an issue with query execution, rather than wrong PRIMARY_SYNC behavior. We already had a discussion about this optimization in replicated cache and decided to switch it off by default.

-Val

On Tue, Apr 18, 2017 at 9:35 AM, Sergi Vladykin wrote:

> With replicated cache we can execute a query against backup partitions that
> were not updated yet because of PRIMARY_SYNC. Thus we do not see an update.
>
> Sergi
Re: Performance vs correctness: I vote fore the second
With replicated cache we can execute a query against backup partitions that were not updated yet because of PRIMARY_SYNC. Thus we do not see an update.

Sergi

2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan:

> Vladimir,
>
> What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?
>
> D.
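The effect described in this message can be sketched with a small toy model in Java. This is not Ignite code: `SyncModeSketch`, its `Mode` enum, and the queue of pending backup updates are hypothetical names used only to show why a query served from a backup copy may miss an acknowledged write under PRIMARY_SYNC but not under FULL_SYNC.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model only: one primary copy and one backup copy of a replicated cache. */
class SyncModeSketch {
    enum Mode { PRIMARY_SYNC, FULL_SYNC }

    private final Map<String, Integer> primary = new HashMap<>();
    private final Map<String, Integer> backup = new HashMap<>();
    private final Queue<Runnable> pendingBackupUpdates = new ArrayDeque<>();
    private final Mode mode;

    SyncModeSketch(Mode mode) {
        this.mode = mode;
    }

    /** Returns as soon as the write counts as acknowledged under the chosen mode. */
    void put(String key, int val) {
        primary.put(key, val);
        Runnable backupUpdate = () -> backup.put(key, val);
        if (mode == Mode.FULL_SYNC)
            backupUpdate.run();                     // wait for the backup before returning
        else
            pendingBackupUpdates.add(backupUpdate); // return first, backup is updated later
    }

    /** Simulates the backup eventually catching up. */
    void deliverPendingUpdates() {
        while (!pendingBackupUpdates.isEmpty())
            pendingBackupUpdates.poll().run();
    }

    /** A query executed on the node that holds only the backup copy. */
    Integer queryOnBackup(String key) {
        return backup.get(key);
    }

    public static void main(String[] args) {
        SyncModeSketch primarySync = new SyncModeSketch(Mode.PRIMARY_SYNC);
        primarySync.put("k", 1);
        System.out.println("PRIMARY_SYNC right after put: " + primarySync.queryOnBackup("k"));
        primarySync.deliverPendingUpdates();
        System.out.println("PRIMARY_SYNC after catch-up:  " + primarySync.queryOnBackup("k"));

        SyncModeSketch fullSync = new SyncModeSketch(Mode.FULL_SYNC);
        fullSync.put("k", 1);
        System.out.println("FULL_SYNC right after put:    " + fullSync.queryOnBackup("k"));
    }
}
```

In the model, the query on the backup returns nothing right after a PRIMARY_SYNC put and only sees the value once the pending update is delivered, while FULL_SYNC makes the value visible before the put returns.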
Re: Performance vs correctness: I vote fore the second
Vladimir,

What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?

D.

On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov wrote:

> Folks,
>
> I received a number of complaints from users that our default settings favor
> performance at the cost of correctness and subtle behavior. Yesterday I
> faced one such situation on my own.
>
> I started REPLICATED cache on several nodes, put some data, executed simple
> SQL and got wrong result. No errors, no warnings. The problem was caused by
> default PRIMARY_SYNC mode. WTF, our cache doesn't work out of the box!
>
> Other widely known examples are data streamer behavior, "read from
> backups" + continuous queries.
>
> I propose to change our defaults to favor *correctness* over performance,
> and create good documentation and JavaDocs to explain to users how to tune our
> product. Proposed changes:
>
> 1) FULL_SYNC as default;
> 2) "readFromBackups=false" as default;
> 3) "IgniteDataStreamer.allowOverwrite=true" as default.
>
> Users should not think how to make Ignite work correctly. It should be
> correct out of the box.
>
> Vladimir.
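Until the defaults change, the three proposed settings can be applied explicitly. The following is a configuration sketch only, assuming Ignite is on the classpath; the class name `CorrectnessFirstConfig` and the cache name `exampleCache` are made up for illustration. Note that the thread writes "readFromBackups", while the setter used here is `setReadFromBackup`.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class CorrectnessFirstConfig {
    public static void main(String[] args) {
        // Starts a node with the default configuration; closing it stops the node.
        try (Ignite ignite = Ignition.start()) {
            // "exampleCache" is a hypothetical cache name.
            CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("exampleCache");
            ccfg.setCacheMode(CacheMode.REPLICATED);

            // 1) Wait for backups as well as the primary before a write completes.
            ccfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

            // 2) Do not serve reads from backup copies.
            ccfg.setReadFromBackup(false);

            ignite.getOrCreateCache(ccfg);

            // 3) Let the data streamer overwrite existing entries.
            try (IgniteDataStreamer<Integer, String> streamer = ignite.dataStreamer("exampleCache")) {
                streamer.allowOverwrite(true);
                streamer.addData(1, "one");
            }
        }
    }
}
```

Each setting trades some throughput or latency for more predictable behavior, which is the trade-off the proposal asks to make the default.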