Re: DDL implementation details

Sergi Vladykin Mon, 16 Jan 2017 12:07:56 -0800

Dima,

I agree that cache==table is definitely a wrong choice, but as far as I see
Vova suggests having cache==tablespace instead of cache==schema. I tend to
agree with this decoupling of physical and logical grouping, but the
concern is that it will require much more work to do.


Sergi

2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:

> Vova,
>
> Currently I see only 2 ways we can proceed here:
>
>    1. cache == table
>    2. cache == schema
>
> I agree that "cache==table" may be more flexible, but I don't think it will
> work in Ignite.
> We may end up with 1,000s of caches, which will carry significant overhead
> on memory and cluster overall. I think that we have no choice but to take
> "cache==schema" approach.
>
> D.
>
> On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
> > Sergi, Dima,
> >
> > In the scope of Ignite 1.x it is perfectly fine to have "schema = cache".
> > Nobody suffers from it because nobody use Ignite as database. But in
> > future, thanks to page memory, we are going to target real database use
> > cases. Users will have multiple tables in Ignite. Plus views, triggers,
> > constraints, etc.. All these features are very useful and easy to
> implement
> > provided that we already have table and index implementations. And in
> > databases all related objects are *logically *grouped in a "schema". This
> > is convenient for users: less boilerplate in SQL, better manageability
> > (remember that database users will definitely need some console and/or UI
> > tools to manage Ignite as a database).
> >
> > What you offer is to group database objects *physically *rather than
> > logically. It will lead to:
> > - Boilerplate in queries
> > - Inconvenient database management. All the things database users are
> used
> > to - import/export tools, UIs, "USING" keyword, etc, will look weird in
> > Ignite as there will be no way to group arbitrary objects logically.
> >
> > With this approach almost every user will have to use two schemes instead
> > of one - one for operational data (PARTITIONED) and one for reference
> data
> > (REPLICATED). No conventional database works this way.
> >
> > Vladimir.
> >
> > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> > wrote:
> >
> > > Vova,
> > >
> > > I will join Sergi here. It seems like "schema = cache" will take care
> of
> > > all different configuration properties required for different groups of
> > > caches. In addition, it cleanly maps into current Ignite architecture.
> We
> > > will need to have a very strong reason to move away from it.
> > >
> > > D.
> > >
> > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <voze...@gridgain.com
> >
> > > wrote:
> > >
> > > > Correct, it worked, because Ignite has never had real database use
> case
> > > in
> > > > mind. Unfortunately, if our global plans go as expected, it will not
> > work
> > > > for Ignite 2.x+.
> > > >
> > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > sergi.vlady...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Lets move on with SQL schema == Ignite cache. It worked always like
> > > > this, I
> > > > > see no reasons to change this.
> > > > >
> > > > > Sergi
> > > > >
> > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> > > > >
> > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> "schema"
> > > to
> > > > > our
> > > > > > cache. But not ideally still.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > voze...@gridgain.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Alex,
> > > > > > >
> > > > > > > Currently Ignite is not used as database. It is used as search
> > > > engine -
> > > > > > > several types, several tables, several joins. This is why
> having
> > > > > "SCHEMA
> > > > > > ==
> > > > > > > cache" was never a problem. Users have never build complex SQL
> > > > > > applications
> > > > > > > on top of Ignite. But we are going towards database. And my
> > > question
> > > > > > stands
> > > > > > > still - suppose it is Y2019, how is user going to migrate his
> > > > database
> > > > > > > containing 20-30-50-100 tables in a single schema in Oracle to
> > > > Ignite?
> > > > > > >
> > > > > > > Single cache for all tables? Doens't work - not flexible. Users
> > > will
> > > > > > > definitely require different cache modes, different co-location
> > > > rules,
> > > > > > > different number of backups, etc..
> > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > convenient
> > > > > > > for users even for relatively small databases.
> > > > > > >
> > > > > > > From user perspective schema is logical grouping of database
> > > objects,
> > > > > > > nothing more.
> > > > > > >
> > > > > > > For Ignite schema could be a logical group of resources (nodes,
> > > > memory
> > > > > > > pools, caches, etc.). And multiple tables over multiple caches
> > > should
> > > > > > > reside in it. To the contrast, table definition governs how
> data
> > is
> > > > > > stored.
> > > > > > > This is similar to, for example, MySQL approach, where you
> define
> > > how
> > > > > you
> > > > > > > store data on per-table level, and on schema level you define
> > only
> > > > > minor
> > > > > > > things like collation.
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > alexander.a.pasche...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Vova,
> > > > > > >>
> > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> voze...@gridgain.com
> > >:
> > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > cache".
> > > > > > Consider
> > > > > > >> > some small database with, say, ~30 tables. And user wants to
> > > > migrate
> > > > > > to
> > > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> > > rewrite
> > > > > of
> > > > > > >> all
> > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> > leading
> > > > to
> > > > > > >> lack of
> > > > > > >> > flexibility and performance problems?
> > > > > > >>
> > > > > > >> But currently schema *is* semantically equal to cache while
> > table
> > > is
> > > > > > >> equal to type descriptor (i.e. type of stored entities),
> nothing
> > > new
> > > > > > >> here.
> > > > > > >>
> > > > > > >> Say, in single cache we may have entities of types Person and
> > > > > > >> Organization, those map to two tables with same names, and can
> > be
> > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > >>
> > > > > > >> If we want to limit the user with having single type
> descriptor
> > > per
> > > > > > >> cache (i.e. cache has only one type of stored entities - BTW,
> > > where
> > > > we
> > > > > > >> are with this 2.0-wise?), then this notion could change. But
> > > > currently
> > > > > > >> what has been suggested already fits quite good with what we
> do
> > > have
> > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > >>
> > > > > > >> - Alex
> > > > > > >>
> > > > > > >> > Another example is how to deal with referene tables? Lots
> > > database
> > > > > has
> > > > > > >> > small reference tables which is best to fit REPLICATED
> cache,
> > > > while
> > > > > > >> others
> > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache"
> will
> > > > force
> > > > > > >> users
> > > > > > >> > to split them into separate schemes leading to poor user
> > > > experience.
> > > > > > >> >
> > > > > > >> > I understand that we may have some implementation details
> > around
> > > > it
> > > > > at
> > > > > > >> the
> > > > > > >> > moment. But from user perspective "SCHEMA == cache" doesn't
> > make
> > > > > > sense.
> > > > > > >> As
> > > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > > approach.
> > > > > > >> >
> > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > dma...@apache.org>
> > > > > > >> wrote:
> > > > > > >> >
> > > > > > >> >>
> > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > >> dsetrak...@apache.org>
> > > > > > >> >> wrote:
> > > > > > >> >> >
> > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > >> >> sergi.vlady...@gmail.com>
> > > > > > >> >> > wrote:
> > > > > > >> >> >
> > > > > > >> >> >> The xml config was only for example. We can put in this
> > > > > > >> configuration
> > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > >> >> >>
> > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > >> >> >>
> > > > > > >> >> >
> > > > > > >> >> > This approach makes sense, if it can be easily supported
> > with
> > > > H2.
> > > > > > >> >>
> > > > > > >> >> What’s for affinity keys? Can we make an exception for them
> > by
> > > > > > >> defining in
> > > > > > >> >> this part of the statement
> > > > > > >> >>
> > > > > > >> >> CREATE TABLE employee (
> > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > >> >>    name VARCHAR(128),
> > > > > > >> >> );
> > > > > > >> >>
> > > > > > >> >> or that l
> > > > > > >> >>
> > > > > > >> >> CREATE TABLE employee (
> > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > >> >>    dept_id BIGINT,
> > > > > > >> >>    name VARCHAR(128),
> > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > >> >> );
> > > > > > >> >>
> > > > > > >> >> ?
> > > > > > >> >>
> > > > > > >> >> —
> > > > > > >> >> Denis
> > > > > > >> >>
> > > > > > >> >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: DDL implementation details

Reply via email to