Re: Remove cache groups in AI 3.0

Dmitry Pavlov Wed, 11 Apr 2018 03:46:07 -0700

Hi Igniters,

Actually, I do not understand either point of view, neither that we need to
keep cache groups nor that we need to remove them.


The only reason for refactoring I see is 'too many fsyncs', but that could be
solved at the level of FilePageStoreV2 with a new virtual FS for
partition/index data, without any other changes.

Sincerely,
Dmitriy Pavlov

Wed, 11 Apr 2018 at 13:30, Vladimir Ozerov <voze...@gridgain.com>:

> Anton,
>
> I do not see the point. What is the problem with creating or removing a
> real cache?
>
> On Wed, Apr 11, 2018 at 1:05 PM, Anton Vinogradov <a...@apache.org> wrote:
>
> > Vova,
> >
> > Cache groups are very useful.
> >
> > For example, you can develop multi-tenant applications using cache groups
> > as templates.
> > If you have some cache groups, e.g. Users, Loans, Deposits, you can
> > keep records for Organisation_A, Organisation_B and Organisation_C in the
> > same data structures, but logically separated.
> > Adding or removing an organisation will not cause creation or removal of
> > real caches.
> >
> > AFAIK, you can use GridSecurity [1] over caches inside cache groups, and
> > get a secured multi-tenant environment as a result.
> >
> > Can you propose a better solution that does not use cache groups?
> >
> > [1] https://docs.gridgain.com/docs/security-concepts
> >
> > 2018-04-11 0:24 GMT+03:00 Denis Magda <dma...@apache.org>:
> >
> > > Vladimir,
> > >
> > > - Data size per-cache
> > >
> > >
> > > Could you elaborate on how the data size per-cache/table task will be
> > > addressed with the proposed architecture? Are you going to store the data
> > > of a specific cache in dedicated pages/segments? What about index size?
> > >
> > > --
> > > Denis
> > >
> > > On Tue, Apr 10, 2018 at 2:31 AM, Vladimir Ozerov <voze...@gridgain.com>
> > > wrote:
> > >
> > > > Dima,
> > > >
> > > > 1) Easy to understand for users
> > > > AI 2.x: cluster -> cache group -> cache -> table
> > > > AI 3.x: cluster -> cache(==table)
> > > >
> > > > 2) Fine grained cache management
> > > > - MVCC on/off per-cache
> > > > - WAL mode on/off per-cache
> > > > - Data size per-cache
> > > >
> > > > 3) Performance:
> > > > - Efficient scans are not possible with cache groups
> > > > - Efficient destroy/DROP - O(N) now, O(1) afterwards
> > > >
> > > > "Huge refactoring" is not a precise estimate. Let's think about how to
> > > > do it instead of how not to do it :-)
> > > >
> > > > On Tue, Apr 10, 2018 at 11:41 AM, Dmitriy Setrakyan
> > > > <dsetrak...@apache.org> wrote:
> > > >
> > > > > Vladimir, sounds like a huge refactoring. Other than "cache groups are
> > > > > confusing", are we solving any other big issues with the new proposed
> > > > > approach?
> > > > >
> > > > > (every time we try to refactor rebalancing, I get goose bumps)
> > > > >
> > > > > D.
> > > > >
> > > > > On Tue, Apr 10, 2018 at 1:32 AM, Vladimir Ozerov
> > > > > <voze...@gridgain.com> wrote:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > Cache groups were implemented for a sole purpose: to hide internal
> > > > > > inefficiencies. Namely (add more if I missed something):
> > > > > > 1) Excessive heap usage for affinity/partition data
> > > > > > 2) Too many data files, as we employ a file-per-partition approach.
> > > > > >
> > > > > > These problems were resolved, but now cache groups are a great
> > > > > > source of confusion both for users and for us: hard to understand,
> > > > > > no way to configure them in a deterministic way. Had we resolved the
> > > > > > mentioned performance issues, we would never have had cache groups.
> > > > > > I propose to think about what it would take for us to get rid of
> > > > > > cache groups.
> > > > > >
> > > > > > Please provide your input on the suggestions below.
> > > > > >
> > > > > > 1) "Merge" partition data from different caches
> > > > > > Consider that we start a new cache with the same affinity
> > > > > > configuration (cache mode, partition number, affinity function) as
> > > > > > some already existing cache. Is it possible to re-use the partition
> > > > > > distribution and history of the existing cache for the new cache?
> > > > > > Think of it as a kind of automatic cache grouping which is
> > > > > > transparent to the user. This would remove heap pressure. It could
> > > > > > also resolve our long-standing issue with FairAffinityFunction, when
> > > > > > two caches with the same affinity configuration are not co-located
> > > > > > if they were started on different topology versions.
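[Editorial sketch] The idea in point 1 can be illustrated roughly as follows. This is only a minimal sketch, assuming that an affinity configuration is fully described by the three properties named above; `AffinityKey`, `SharedAssignment` and `AffinityRegistry` are hypothetical names invented for illustration and are not Ignite classes. Caches that start with an equal key receive the same assignment object, so the heap cost is paid once per distinct configuration rather than once per cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical key built from the properties listed in the proposal. */
final class AffinityKey {
    final String cacheMode;
    final int partitions;
    final String affinityFunction;

    AffinityKey(String cacheMode, int partitions, String affinityFunction) {
        this.cacheMode = cacheMode;
        this.partitions = partitions;
        this.affinityFunction = affinityFunction;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof AffinityKey)) return false;
        AffinityKey k = (AffinityKey) o;
        return partitions == k.partitions
            && cacheMode.equals(k.cacheMode)
            && affinityFunction.equals(k.affinityFunction);
    }

    @Override public int hashCode() {
        return Objects.hash(cacheMode, partitions, affinityFunction);
    }
}

/** Hypothetical shared state: partition index -> owning node id. */
final class SharedAssignment {
    final int[] partitionOwner;
    int refCount;

    SharedAssignment(int partitions) {
        this.partitionOwner = new int[partitions];
    }
}

/** Hands out one assignment per distinct affinity configuration. */
final class AffinityRegistry {
    private final Map<AffinityKey, SharedAssignment> byKey = new HashMap<>();
    private final Map<String, AffinityKey> keyByCache = new HashMap<>();

    SharedAssignment startCache(String cacheName, AffinityKey key) {
        if (keyByCache.containsKey(cacheName))
            throw new IllegalStateException("Cache already started: " + cacheName);
        // Re-use the existing assignment when the configuration matches.
        SharedAssignment a = byKey.computeIfAbsent(key, k -> new SharedAssignment(k.partitions));
        a.refCount++;
        keyByCache.put(cacheName, key);
        return a;
    }

    void stopCache(String cacheName) {
        AffinityKey key = keyByCache.remove(cacheName);
        if (key == null) return;
        SharedAssignment a = byKey.get(key);
        // Release the shared state once the last cache using it is gone.
        if (--a.refCount == 0) byKey.remove(key);
    }

    int distinctAssignments() {
        return byKey.size();
    }
}
```

Because both caches read the same assignment object, they are co-located by construction regardless of when each one was started, which is the property the proposal hopes to gain. How partition history would be shared is not covered by this sketch.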
> > > > > >
> > > > > > 2) Employ a segment-extent based approach instead of
> > > > > > file-per-partition
> > > > > > - Every object (cache, index) resides in a dedicated segment
> > > > > > - A segment consists of extents (minimal allocation units)
> > > > > > - Extents are allocated and deallocated as needed
> > > > > > - *Ignite specific*: a particular extent can be used by only one
> > > > > > partition
> > > > > > - Segments may be located in any number of data files we find
> > > > > > convenient
> > > > > > With this approach the "too many fsyncs" problem goes away
> > > > > > automatically. At the same time it would still be possible to
> > > > > > implement efficient rebalance, as partition data will be split
> > > > > > across a moderate number of extents, not chaotically.
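[Editorial sketch] The segment-extent layout in point 2 could look roughly like the following. This is a minimal sketch under stated assumptions, not a description of any Ignite storage code: the class names, the 1 MiB extent size and the round-robin placement across data files are all hypothetical choices made for illustration. It keeps the two rules from the list above: every object owns one segment, and each extent belongs to at most one partition.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** An extent: a fixed-size region of a data file, owned by at most one partition. */
final class Extent {
    final int fileId;
    final long offset;
    int partition = -1; // -1 means the extent is free

    Extent(int fileId, long offset) {
        this.fileId = fileId;
        this.offset = offset;
    }
}

/** A segment: all extents that belong to one object (a cache or an index). */
final class Segment {
    final String objectName;
    final List<Extent> extents = new ArrayList<>();

    Segment(String objectName) {
        this.objectName = objectName;
    }

    /** Extents holding data of one partition, e.g. the unit to ship on rebalance. */
    List<Extent> extentsOf(int partition) {
        List<Extent> res = new ArrayList<>();
        for (Extent e : extents)
            if (e.partition == partition) res.add(e);
        return res;
    }
}

/** Allocates extents across a fixed number of data files (assumed round-robin). */
final class ExtentAllocator {
    static final long EXTENT_BYTES = 1L << 20; // assumed extent size: 1 MiB

    private final long[] nextOffset;            // next unused offset per data file
    private final Deque<Extent> free = new ArrayDeque<>();
    private int nextFile;

    ExtentAllocator(int dataFiles) {
        this.nextOffset = new long[dataFiles];
    }

    Extent allocate(Segment seg, int partition) {
        Extent e = free.poll(); // prefer a previously released extent
        if (e == null) {
            int f = nextFile;
            nextFile = (nextFile + 1) % nextOffset.length;
            e = new Extent(f, nextOffset[f]);
            nextOffset[f] += EXTENT_BYTES;
        }
        e.partition = partition;
        seg.extents.add(e);
        return e;
    }

    /** Drops an object by releasing its extents; returns how many were released. */
    int drop(Segment seg) {
        int n = seg.extents.size();
        for (Extent e : seg.extents) {
            e.partition = -1;
            free.add(e);
        }
        seg.extents.clear();
        return n;
    }
}
```

In this sketch the number of files to sync is fixed by the `dataFiles` argument and does not grow with the number of caches or partitions. Dropping an object costs work proportional to its extent count rather than its entry count.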
> > > > > >
> > > > > > Once we have p.1 and p.2 ready, cache groups could be removed,
> > > > > > couldn't they?
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > >
> > > >
> > >
> >
>
