Dmitry,

If you do this, why would you need cache groups at all?

On Tue, Apr 10, 2018 at 1:58 PM, Dmitry Pavlov <dpavlov....@gmail.com>
wrote:

> Hi Vladimir,
>
> We can solve "too many fsyncs" or "too many small files" by placing several
> partitions of a cache group in one file.
>
> We don't need to get rid of cache groups in this case.
>
> It is not a trivial task, but it is doable. We need to create a simple FS for
> partition chunks inside one file.
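>
> A minimal sketch of what such an in-file chunk table could look like (the
> class and constants below are illustrative assumptions, not existing Ignite
> code):
>
> import java.util.HashMap;
> import java.util.Map;
>
> // One data file per cache group, carved into fixed-size chunks; a small
> // table maps (partition, chunk index) to the chunk's offset in the file,
> // so every partition keeps its own contiguous logical address space.
> class GroupFileLayout {
>     private static final long CHUNK_SIZE = 4L * 1024 * 1024; // Assumed 4 MB.
>
>     /** Packed (partitionId, chunkIdx) -> offset of the chunk in the group file. */
>     private final Map<Long, Long> chunkTable = new HashMap<>();
>
>     /** Next free offset at the end of the group file. */
>     private long fileEnd;
>
>     /** Resolves a partition-local offset to an offset in the shared file. */
>     long resolve(int partId, long partOffset) {
>         long key = ((long)partId << 32) | (partOffset / CHUNK_SIZE);
>         long chunkOff = chunkTable.computeIfAbsent(key, k -> allocateChunk());
>         return chunkOff + partOffset % CHUNK_SIZE;
>     }
>
>     /** Appends a new chunk to the end of the group file. */
>     private long allocateChunk() {
>         long off = fileEnd;
>         fileEnd += CHUNK_SIZE;
>         return off;
>     }
> }
>
> The point being that a checkpoint would then fsync one group file instead of
> one file per partition.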
>
> Sincerely,
> Dmitriy Pavlov
>
> On Tue, Apr 10, 2018 at 12:31 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:
>
> > Dima,
> >
> > 1) Easy to understand for users
> > AI 2.x: cluster -> cache group -> cache -> table
> > AI 3.x: cluster -> cache(==table)
> >
> > 2) Fine-grained cache management
> > - MVCC on/off per-cache
> > - WAL mode on/off per-cache
> > - Data size per-cache
> >
> > 3) Performance:
> > - Efficient scans are not possible with cache groups
> > - Efficient destroy/DROP - O(N) now, O(1) afterwards
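> >
> > To make point 1 concrete - the 2.x call below uses the existing API, while
> > the 3.x half reflects only this proposal, so treat it as an assumption
> > rather than a committed design:
> >
> > import org.apache.ignite.configuration.CacheConfiguration;
> >
> > class GroupingExample {
> >     void configure() {
> >         // AI 2.x: the user opts into grouping explicitly to avoid the
> >         // per-cache overhead; this is exactly the knob that confuses people.
> >         CacheConfiguration<Integer, String> cfg2x =
> >             new CacheConfiguration<Integer, String>("person")
> >                 .setGroupName("sharedGroup");
> >
> >         // Proposed AI 3.x: no group name at all; caches with identical
> >         // affinity settings would share partition metadata transparently.
> >         CacheConfiguration<Integer, String> cfg3x =
> >             new CacheConfiguration<Integer, String>("organization");
> >     }
> > }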
> >
> > "Huge refactoring" is not precise estimate. Let's think on how to do that
> > instead of how not to do :-)
> >
> > On Tue, Apr 10, 2018 at 11:41 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >
> > > Vladimir, sounds like a huge refactoring. Other than "cache groups are
> > > confusing", are we solving any other big issues with the new proposed
> > > approach?
> > >
> > > (every time we try to refactor rebalancing, I get goose bumps)
> > >
> > > D.
> > >
> > > On Tue, Apr 10, 2018 at 1:32 AM, Vladimir Ozerov <voze...@gridgain.com> wrote:
> > >
> > > > Igniters,
> > > >
> > > > Cache groups were implemented for a single purpose - to hide internal
> > > > inefficiencies. Namely (add more if I missed something):
> > > > 1) Excessive heap usage for affinity/partition data
> > > > 2) Too many data files, as we employ a file-per-partition approach.
> > > >
> > > > These problems were resolved, but now cache groups are a great source
> > > > of confusion both for users and for us - hard to understand, with no
> > > > way to configure them in a deterministic way. Had we resolved the
> > > > mentioned performance issues, we would never have had cache groups. I
> > > > propose to think about what it would take for us to get rid of cache
> > > > groups.
> > > >
> > > > Please provide your input on the suggestions below.
> > > >
> > > > 1) "Merge" partition data from different caches
> > > > Consider that we start a new cache with the same affinity configuration
> > > > (cache mode, partition count, affinity function) as some already
> > > > existing cache. Is it possible to reuse the partition distribution and
> > > > history of the existing cache for the new cache? Think of it as a kind
> > > > of automatic cache grouping which is transparent to the user. This would
> > > > remove the heap pressure. It could also resolve our long-standing issue
> > > > with FairAffinityFunction, where two caches with the same affinity
> > > > configuration are not co-located when started on different topology
> > > > versions.
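> > > >
> > > > A minimal sketch of this kind of sharing, assuming a hypothetical
> > > > registry keyed by the affinity settings (all names below are
> > > > illustrative, not existing Ignite code):
> > > >
> > > > import java.util.Collections;
> > > > import java.util.List;
> > > > import java.util.Map;
> > > > import java.util.UUID;
> > > > import java.util.concurrent.ConcurrentHashMap;
> > > >
> > > > // Caches whose affinity settings are identical resolve to the same
> > > > // shared assignment, so affinity/partition state is not duplicated
> > > > // per cache on the heap.
> > > > class SharedAffinityRegistry {
> > > >     /** Affinity descriptor -> per-partition owner lists. */
> > > >     private final Map<String, List<List<UUID>>> shared =
> > > >         new ConcurrentHashMap<>();
> > > >
> > > >     List<List<UUID>> assignmentFor(String cacheMode, int parts,
> > > >         Class<?> affFunc) {
> > > >         String key = cacheMode + ":" + parts + ":" + affFunc.getName();
> > > >
> > > >         // Reuse the existing assignment (and its history) if present;
> > > >         // otherwise compute it once for this affinity configuration.
> > > >         return shared.computeIfAbsent(key, k -> computeAssignment(parts));
> > > >     }
> > > >
> > > >     private List<List<UUID>> computeAssignment(int parts) {
> > > >         return Collections.emptyList(); // Placeholder for the real calculation.
> > > >     }
> > > > }
> > > >
> > > > Since two caches started on different topology versions would resolve
> > > > to the same shared assignment, they would stay co-located by
> > > > construction.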
> > > >
> > > > 2) Employ a segment/extent-based approach instead of file-per-partition
> > > > - Every object (cache, index) resides in a dedicated segment
> > > > - A segment consists of extents (minimal allocation units)
> > > > - Extents are allocated and deallocated as needed
> > > > - *Ignite specific*: a particular extent can be used by only one partition
> > > > - Segments may be located in any number of data files we find convenient
> > > > With this approach the "too many fsyncs" problem goes away automatically.
> > > > At the same time it would still be possible to implement efficient
> > > > rebalance, as partition data will be split across a moderate number of
> > > > extents rather than chaotically.
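> > > >
> > > > A minimal sketch of such an allocator, under the assumption of
> > > > fixed-size extents tracked per segment (names are illustrative only):
> > > >
> > > > import java.util.ArrayDeque;
> > > > import java.util.ArrayList;
> > > > import java.util.Deque;
> > > > import java.util.HashMap;
> > > > import java.util.List;
> > > > import java.util.Map;
> > > >
> > > > // One segment per object (cache or index); extents are the minimal
> > > > // allocation units, and each extent belongs to exactly one partition.
> > > > class Segment {
> > > >     private static final long EXTENT_SIZE = 8L * 1024 * 1024; // Assumed 8 MB.
> > > >
> > > >     /** Extents currently owned by each partition, in allocation order. */
> > > >     private final Map<Integer, List<Long>> partExtents = new HashMap<>();
> > > >
> > > >     /** Extents freed earlier and available for reuse. */
> > > >     private final Deque<Long> freeExtents = new ArrayDeque<>();
> > > >
> > > >     /** Offset of the next never-used extent in the backing data file(s). */
> > > >     private long nextExtentOff;
> > > >
> > > >     /** Gives one more extent to a partition, reusing a free one if possible. */
> > > >     long allocateExtent(int partId) {
> > > >         Long off = freeExtents.poll();
> > > >         if (off == null) {
> > > >             off = nextExtentOff;
> > > >             nextExtentOff += EXTENT_SIZE;
> > > >         }
> > > >         partExtents.computeIfAbsent(partId, p -> new ArrayList<>()).add(off);
> > > >         return off;
> > > >     }
> > > >
> > > >     /** Moving or destroying a partition releases all of its extents at once. */
> > > >     void releasePartition(int partId) {
> > > >         List<Long> owned = partExtents.remove(partId);
> > > >         if (owned != null)
> > > >             freeExtents.addAll(owned);
> > > >     }
> > > > }
> > > >
> > > > Rebalancing a partition then means streaming a handful of extents, and
> > > > destroying a cache is just dropping its segment's extent table.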
> > > >
> > > > Once we have p.1 and p.2 ready, cache groups could be removed, couldn't
> > > > they?
> > > >
> > > > Vladimir.
> > > >
> > >
> >
>
