Re: DDL implementation details

Sergi Vladykin Tue, 17 Jan 2017 01:30:24 -0800

Sergey,

See inline:


2017-01-17 0:50 GMT+03:00 Sergey Kozlov <[email protected]>:

> For the schema==cache approach it is not clear how it works in some cases
> (especially if SQL and regular cache operations are mixed):
>
> 1. I want to store two tables in the same cache (same schema), and both
> tables have an identical structure, like id Integer, name String.
> How will they be processed by regular cache operations such as
> put/putall/get/getall, where we get the cache instance by its name (at the
> SQL level we use the table name)?
>

This already works. We can have multiple QueryEntities configured for a
single cache. The only restriction here is that the cache key must be unique
across all the tables inside the same schema.
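
To illustrate the key restriction, here is a minimal sketch in plain Java. It is
only a model: a HashMap stands in for the cache, this is not the Ignite API, and
the Person and Organization classes, the sample names and the key formats are
all made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model (not the Ignite API) of two tables sharing one cache. */
public class SharedCacheSketch {
    /** Row type of a hypothetical "Person" table: the id is the cache key. */
    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    /** Row type of a hypothetical "Organization" table with the same structure. */
    static final class Organization {
        final String name;
        Organization(String name) { this.name = name; }
    }

    /** Counts the entries whose value type belongs to the given table. */
    static long rowCount(Map<Object, Object> cache, Class<?> valueType) {
        return cache.values().stream().filter(valueType::isInstance).count();
    }

    /** Both tables use key 1: the second put replaces the first entry. */
    static Map<Object, Object> collidingKeys() {
        Map<Object, Object> cache = new HashMap<>();
        cache.put(1, new Person("Alice"));
        cache.put(1, new Organization("Acme"));
        return cache;
    }

    /** Keys are unique across both tables, so both rows are kept. */
    static Map<Object, Object> distinctKeys() {
        Map<Object, Object> cache = new HashMap<>();
        cache.put("person:1", new Person("Alice"));
        cache.put("org:1", new Organization("Acme"));
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(rowCount(collidingKeys(), Person.class)); // prints 0
        System.out.println(rowCount(distinctKeys(), Person.class));  // prints 1
    }
}
```

With colliding keys the Person row is silently replaced by the Organization
row, which is why the keys have to be unique across all tables of the cache.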


>
> 2. What about the ALTER TABLE statement? Will changes to one table lock
> all tables in the cache?
>

That will not be needed: any table structure modification will be performed
in the table scope.


> 3. In the future, if we introduce table-based features like SQL roles
> (as the next step of the SQL implementation), the "many tables, one cache"
> rule looks more difficult to implement.
>

I don't think so; it should be the same thing.


>
> P.S. Maybe we need to think about a new entity called "table" (based on the
> type descriptor) inside the cache to better support SQL DDL.
>

We already have it: QueryEntity. It has exactly those semantics.

Sergi


>
>
> On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <[email protected]>
> wrote:
>
> > Dima,
> >
> > I agree that cache==table is definitely a wrong choice, but as far as I
> > see, Vova suggests having cache==tablespace instead of cache==schema. I
> > tend to agree with this decoupling of physical and logical grouping, but
> > the concern is that it will require much more work to do.
> >
> > Sergi
> >
> > 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <[email protected]>:
> >
> > > Vova,
> > >
> > > Currently I see only 2 ways we can proceed here:
> > >
> > >    1. cache == table
> > >    2. cache == schema
> > >
> > > I agree that "cache==table" may be more flexible, but I don't think it
> > > will work in Ignite.
> > > We may end up with 1,000s of caches, which will carry significant
> > > overhead on memory and on the cluster overall. I think that we have no
> > > choice but to take the "cache==schema" approach.
> > >
> > > D.
> > >
> > > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <[email protected]>
> > > wrote:
> > >
> > > > Sergi, Dima,
> > > >
> > > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> > > > cache". Nobody suffers from it because nobody uses Ignite as a
> > > > database. But in the future, thanks to page memory, we are going to
> > > > target real database use cases. Users will have multiple tables in
> > > > Ignite. Plus views, triggers, constraints, etc. All these features
> > > > are very useful and easy to implement provided that we already have
> > > > table and index implementations. And in databases all related objects
> > > > are *logically* grouped in a "schema". This is convenient for users:
> > > > less boilerplate in SQL, better manageability (remember that database
> > > > users will definitely need some console and/or UI tools to manage
> > > > Ignite as a database).
> > > >
> > > > What you offer is to group database objects *physically* rather than
> > > > logically. It will lead to:
> > > > - Boilerplate in queries
> > > > - Inconvenient database management. All the things database users are
> > > > used to (import/export tools, UIs, the "USING" keyword, etc.) will
> > > > look weird in Ignite as there will be no way to group arbitrary
> > > > objects logically.
> > > >
> > > > With this approach almost every user will have to use two schemas
> > > > instead of one: one for operational data (PARTITIONED) and one for
> > > > reference data (REPLICATED). No conventional database works this way.
> > > >
> > > > Vladimir.
> > > >
> > > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <[email protected]>
> > > > wrote:
> > > >
> > > > > Vova,
> > > > >
> > > > > I will join Sergi here. It seems like "schema = cache" will take
> > > > > care of all the different configuration properties required for
> > > > > different groups of caches. In addition, it cleanly maps onto the
> > > > > current Ignite architecture. We will need to have a very strong
> > > > > reason to move away from it.
> > > > >
> > > > > D.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Correct, it worked, because Ignite has never had a real database
> > > > > > use case in mind. Unfortunately, if our global plans go as
> > > > > > expected, it will not work for Ignite 2.x+.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Let's move on with SQL schema == Ignite cache. It has always
> > > > > > > worked like this, and I see no reason to change it.
> > > > > > >
> > > > > > > Sergi
> > > > > > >
> > > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <[email protected]>:
> > > > > > >
> > > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > > > > > > > "schema" to our cache. But still not ideally.
> > > > > > > >
> > > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Alex,
> > > > > > > > >
> > > > > > > > > Currently Ignite is not used as a database. It is used as a
> > > > > > > > > search engine: several types, several tables, several joins.
> > > > > > > > > This is why having "SCHEMA == cache" was never a problem.
> > > > > > > > > Users have never built complex SQL applications on top of
> > > > > > > > > Ignite. But we are going towards a database. And my question
> > > > > > > > > still stands: suppose it is 2019, how is a user going to
> > > > > > > > > migrate a database containing 20-30-50-100 tables in a
> > > > > > > > > single schema in Oracle to Ignite?
> > > > > > > > >
> > > > > > > > > Single cache for all tables? Doesn't work - not flexible.
> > > > > > > > > Users will definitely require different cache modes,
> > > > > > > > > different co-location rules, different numbers of backups,
> > > > > > > > > etc.
> > > > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > > > > > > convenient for users even for relatively small databases.
> > > > > > > > >
> > > > > > > > > From the user's perspective a schema is a logical grouping
> > > > > > > > > of database objects, nothing more.
> > > > > > > > >
> > > > > > > > > For Ignite a schema could be a logical group of resources
> > > > > > > > > (nodes, memory pools, caches, etc.). And multiple tables
> > > > > > > > > over multiple caches should reside in it. In contrast, the
> > > > > > > > > table definition governs how data is stored. This is similar
> > > > > > > > > to, for example, the MySQL approach, where you define how
> > > > > > > > > you store data at the per-table level, and at the schema
> > > > > > > > > level you define only minor things like collation.
> > > > > > > > >
> > > > > > > > > Vladimir.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > > > [email protected]> wrote:
> > > > > > > > >
> > > > > > > > >> Vova,
> > > > > > > > >>
> > > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <[email protected]>:
> > > > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > > > > > > > >> > cache". Consider some small database with, say, ~30
> > > > > > > > >> > tables. And a user wants to migrate to Ignite. How is he
> > > > > > > > >> > supposed to do so? 30 schemas, leading to a rewrite of
> > > > > > > > >> > all his SQL scripts? Or 30 key-value pairs in a single
> > > > > > > > >> > cache, leading to a lack of flexibility and performance
> > > > > > > > >> > problems?
> > > > > > > > >>
> > > > > > > > >> But currently schema *is* semantically equal to cache,
> > > > > > > > >> while table is equal to type descriptor (i.e. type of
> > > > > > > > >> stored entities); nothing new here.
> > > > > > > > >>
> > > > > > > > >> Say, in a single cache we may have entities of types Person
> > > > > > > > >> and Organization; those map to two tables with the same
> > > > > > > > >> names, and can be accessed within the same cache (i.e.
> > > > > > > > >> schema).
> > > > > > > > >>
> > > > > > > > >> If we want to limit the user to a single type descriptor
> > > > > > > > >> per cache (i.e. a cache has only one type of stored
> > > > > > > > >> entities - BTW, where are we with this 2.0-wise?), then
> > > > > > > > >> this notion could change. But currently what has been
> > > > > > > > >> suggested already fits quite well with what we have at the
> > > > > > > > >> moment regarding the semantics of SQL objects.
> > > > > > > > >>
> > > > > > > > >> - Alex
> > > > > > > > >>
> > > > > > > > >> > Another example is how to deal with reference tables.
> > > > > > > > >> > Lots of databases have small reference tables that best
> > > > > > > > >> > fit a REPLICATED cache, while others are usually bound to
> > > > > > > > >> > PARTITIONED mode. "SCHEMA == cache" will force users to
> > > > > > > > >> > split them into separate schemas, leading to a poor user
> > > > > > > > >> > experience.
> > > > > > > > >> >
> > > > > > > > >> > I understand that we may have some implementation details
> > > > > > > > >> > around it at the moment. But from the user's perspective
> > > > > > > > >> > "SCHEMA == cache" doesn't make sense. As we are going
> > > > > > > > >> > towards AI 2.0 we'd better rethink this approach.
> > > > > > > > >> >
> > > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <[email protected]>
> > > > > > > > >> > wrote:
> > > > > > > > >> >
> > > > > > > > >> >>
> > > > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <[email protected]>
> > > > > > > > >> >> > wrote:
> > > > > > > > >> >> >
> > > > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <[email protected]>
> > > > > > > > >> >> > wrote:
> > > > > > > > >> >> >
> > > > > > > > >> >> >> The XML config was only an example. We can put cache
> > > > > > > > >> >> >> config parameters directly into this configuration
> > > > > > > > >> >> >> string, like this:
> > > > > > > > >> >> >>
> > > > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > > >> >> >>
> > > > > > > > >> >> >
> > > > > > > > >> >> > This approach makes sense if it can be easily
> > > > > > > > >> >> > supported with H2.
> > > > > > > > >> >>
> > > > > > > > >> >> What about affinity keys? Can we make an exception for
> > > > > > > > >> >> them by defining them in this part of the statement
> > > > > > > > >> >>
> > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > > > >> >>    name VARCHAR(128)
> > > > > > > > >> >> );
> > > > > > > > >> >>
> > > > > > > > >> >> or that l
> > > > > > > > >> >>
> > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > >> >>    dept_id BIGINT,
> > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > > > >> >> );
> > > > > > > > >> >>
> > > > > > > > >> >> ?
> > > > > > > > >> >>
> > > > > > > > >> >> —
> > > > > > > > >> >> Denis
> > > > > > > > >> >>
> > > > > > > > >> >>
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>
