Dima, I agree that cache==table is definitely the wrong choice, but as far as I can see, Vova suggests cache==tablespace rather than cache==schema. I tend to agree with decoupling physical and logical grouping this way; my concern is that it will require much more work.
Sergi

2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Vova,
>
> Currently I see only 2 ways we can proceed here:
>
> 1. cache == table
> 2. cache == schema
>
> I agree that "cache==table" may be more flexible, but I don't think it will
> work in Ignite. We may end up with 1,000s of caches, which will carry
> significant overhead on memory and on the cluster overall. I think that we
> have no choice but to take the "cache==schema" approach.
>
> D.
>
> On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
> > Sergi, Dima,
> >
> > In the scope of Ignite 1.x it is perfectly fine to have "schema = cache".
> > Nobody suffers from it because nobody uses Ignite as a database. But in
> > the future, thanks to page memory, we are going to target real database
> > use cases. Users will have multiple tables in Ignite, plus views,
> > triggers, constraints, etc. All these features are very useful and easy
> > to implement provided that we already have table and index
> > implementations. And in databases all related objects are *logically*
> > grouped in a "schema". This is convenient for users: less boilerplate in
> > SQL, better manageability (remember that database users will definitely
> > need some console and/or UI tools to manage Ignite as a database).
> >
> > What you offer is to group database objects *physically* rather than
> > logically. It will lead to:
> > - Boilerplate in queries
> > - Inconvenient database management. All the things database users are
> >   used to - import/export tools, UIs, the "USING" keyword, etc. - will
> >   look weird in Ignite as there will be no way to group arbitrary
> >   objects logically.
> >
> > With this approach almost every user will have to use two schemas instead
> > of one - one for operational data (PARTITIONED) and one for reference
> > data (REPLICATED). No conventional database works this way.
> >
> > Vladimir.
> >
> > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan
> > <dsetrak...@apache.org> wrote:
> >
> > > Vova,
> > >
> > > I will join Sergi here. It seems like "schema = cache" will take care
> > > of all the different configuration properties required for different
> > > groups of caches. In addition, it maps cleanly onto the current Ignite
> > > architecture. We will need to have a very strong reason to move away
> > > from it.
> > >
> > > D.
> > >
> > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov
> > > <voze...@gridgain.com> wrote:
> > >
> > > > Correct, it worked, because Ignite has never had a real database use
> > > > case in mind. Unfortunately, if our global plans go as expected, it
> > > > will not work for Ignite 2.x+.
> > > >
> > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin
> > > > <sergi.vlady...@gmail.com> wrote:
> > > >
> > > > > Let's move on with SQL schema == Ignite cache. It has always
> > > > > worked like this, and I see no reason to change it.
> > > > >
> > > > > Sergi
> > > > >
> > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > > >
> > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > > > > > "schema" to our cache. But still not ideally.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov
> > > > > > <voze...@gridgain.com> wrote:
> > > > > >
> > > > > > > Alex,
> > > > > > >
> > > > > > > Currently Ignite is not used as a database. It is used as a
> > > > > > > search engine - several types, several tables, several joins.
> > > > > > > This is why having "SCHEMA == cache" was never a problem.
> > > > > > > Users have never built complex SQL applications on top of
> > > > > > > Ignite. But we are going towards a database.
> > > > > > > And my question still stands - suppose it is the year 2019:
> > > > > > > how is a user going to migrate his database containing
> > > > > > > 20-30-50-100 tables in a single schema in Oracle to Ignite?
> > > > > > >
> > > > > > > Single cache for all tables? Doesn't work - not flexible.
> > > > > > > Users will definitely require different cache modes, different
> > > > > > > co-location rules, different numbers of backups, etc.
> > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > > > > convenient for users even for relatively small databases.
> > > > > > >
> > > > > > > From the user's perspective a schema is a logical grouping of
> > > > > > > database objects, nothing more.
> > > > > > >
> > > > > > > For Ignite a schema could be a logical group of resources
> > > > > > > (nodes, memory pools, caches, etc.), and multiple tables over
> > > > > > > multiple caches should reside in it. In contrast, the table
> > > > > > > definition governs how data is stored. This is similar to, for
> > > > > > > example, the MySQL approach, where you define how you store
> > > > > > > data at the per-table level, and at the schema level you define
> > > > > > > only minor things like collation.
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko
> > > > > > > <alexander.a.pasche...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Vova,
> > > > > > >>
> > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > > > > > >> > cache". Consider some small database with, say, ~30 tables.
> > > > > > >> > And the user wants to migrate to Ignite. How is he supposed
> > > > > > >> > to do so?
> > > > > > >> > 30 schemas, leading to a rewrite of all his SQL scripts? Or
> > > > > > >> > 30 key-value pairs in a single cache, leading to lack of
> > > > > > >> > flexibility and performance problems?
> > > > > > >>
> > > > > > >> But currently schema *is* semantically equal to cache, while
> > > > > > >> table is equal to type descriptor (i.e. type of stored
> > > > > > >> entities), nothing new here.
> > > > > > >>
> > > > > > >> Say, in a single cache we may have entities of types Person
> > > > > > >> and Organization; those map to two tables with the same names,
> > > > > > >> and can be accessed within the same cache (i.e. schema).
> > > > > > >>
> > > > > > >> If we want to limit the user to a single type descriptor per
> > > > > > >> cache (i.e. a cache has only one type of stored entities - BTW,
> > > > > > >> where are we with this 2.0-wise?), then this notion could
> > > > > > >> change. But currently what has been suggested already fits
> > > > > > >> quite well with what we have at the moment regarding the
> > > > > > >> semantics of SQL objects.
> > > > > > >>
> > > > > > >> - Alex
> > > > > > >>
> > > > > > >> > Another example is how to deal with reference tables. Lots
> > > > > > >> > of databases have small reference tables which fit a
> > > > > > >> > REPLICATED cache best, while others are usually bound to
> > > > > > >> > PARTITIONED mode. "SCHEMA == cache" will force users to
> > > > > > >> > split them into separate schemas, leading to poor user
> > > > > > >> > experience.
> > > > > > >> >
> > > > > > >> > I understand that we may have some implementation details
> > > > > > >> > around it at the moment. But from the user's perspective
> > > > > > >> > "SCHEMA == cache" doesn't make sense. As we are going
> > > > > > >> > towards AI 2.0 we'd better rethink this approach.
> > > > > > >> >
> > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda
> > > > > > >> > <dma...@apache.org> wrote:
> > > > > > >> >
> > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan
> > > > > > >> >> > <dsetrak...@apache.org> wrote:
> > > > > > >> >> >
> > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin
> > > > > > >> >> > <sergi.vlady...@gmail.com> wrote:
> > > > > > >> >> >
> > > > > > >> >> >> The XML config was only an example. We can put cache
> > > > > > >> >> >> config parameters directly into this configuration
> > > > > > >> >> >> string, like this:
> > > > > > >> >> >>
> > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > >> >> >
> > > > > > >> >> > This approach makes sense, if it can be easily supported
> > > > > > >> >> > with H2.
> > > > > > >> >>
> > > > > > >> >> What about affinity keys? Can we make an exception for them
> > > > > > >> >> by defining them in this part of the statement
> > > > > > >> >>
> > > > > > >> >> CREATE TABLE employee (
> > > > > > >> >>   id BIGINT PRIMARY KEY,
> > > > > > >> >>   dept_id BIGINT AFFINITY KEY,
> > > > > > >> >>   name VARCHAR(128)
> > > > > > >> >> );
> > > > > > >> >>
> > > > > > >> >> or that
> > > > > > >> >>
> > > > > > >> >> CREATE TABLE employee (
> > > > > > >> >>   id BIGINT PRIMARY KEY,
> > > > > > >> >>   dept_id BIGINT,
> > > > > > >> >>   name VARCHAR(128),
> > > > > > >> >>   CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > >> >> );
> > > > > > >> >>
> > > > > > >> >> ?
> > > > > > >> >>
> > > > > > >> >> --
> > > > > > >> >> Denis
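
For illustration only, the "key=value;key=value" configuration string from the CREATE SCHEMA example, and the idea of one logical schema holding tables with different physical cache settings, could be modelled roughly as below. This is a sketch under stated assumptions, not Ignite code: `parse_cache_params`, `Schema`, `add_table`, `tables_with`, and the SALES/ORDERS/COUNTRIES names are all hypothetical, and the thread does not settle on any such syntax or API.

```python
from dataclasses import dataclass, field


def parse_cache_params(text):
    """Parse a 'key=value;key=value' string into a dict (hypothetical helper)."""
    params = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing or doubled ';'
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed parameter: {item!r}")
        params[key.strip()] = value.strip()
    return params


@dataclass
class Schema:
    """Logical group of tables; each table carries its own physical settings."""
    name: str
    tables: dict = field(default_factory=dict)  # table name -> parsed params

    def add_table(self, table, cache_params):
        self.tables[table] = parse_cache_params(cache_params)

    def tables_with(self, key, value):
        """Return the sorted names of tables whose setting `key` equals `value`."""
        return sorted(t for t, p in self.tables.items() if p.get(key) == value)


# One logical schema, two tables with different (assumed) cache settings.
sales = Schema("SALES")
sales.add_table("ORDERS", "cacheMode=PARTITIONED;backups=1")
sales.add_table("COUNTRIES", "cacheMode=REPLICATED;atomicityMode=ATOMIC")
print(sales.tables_with("cacheMode", "REPLICATED"))
```

Here the schema is purely a namespace, while the per-table parameters stand in for the physical grouping (cache or tablespace) that the thread argues should be decoupled from it.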