[jira] [Updated] (IGNITE-15351) Research possibility of having caching layer on top of RocksDB partitions

Ivan Bessonov (Jira) Fri, 03 Sep 2021 02:10:04 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-15351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ivan Bessonov updated IGNITE-15351:
-----------------------------------
    Description: 
In Ignite 2.x there's a concept of "Data Regions", which is basically a set of 
fixed-size in-memory caches that store data for a number of cache groups 
(let's ignore the system region and similar stuff for now). It is very 
convenient and represents a core design feature of Apache Ignite as an 
In-Memory Database.

Currently, the Page Memory subsystem is not yet ported to the Ignite 3.x 
codebase. Instead, there's a RocksDB-based implementation that stores data 
persistently.

However, this implementation is very simple and naive. It has no notion of an 
in-memory cache shared across multiple tables, so it can't be called an 
In-Memory Database. We should investigate ways to add this concept back on top 
of the RocksDB implementation.

There are several things to investigate here:
 * how to set up RocksDB properly and control its memory consumption - we 
should allow some configuration and provide a meaningful set of defaults;
 * how to put a cache on top of several RocksDB instances. This is actually 
pretty easy: just use "org.rocksdb.Options#setRowCache(org.rocksdb.Cache)", 
which accepts LRU and Clock cache implementations. A way to configure it is 
still required;
 * how to introduce data regions into our system. I see something like this:
 ** the list of regions is either a node or a cluster configuration;
 ** the name of the region is a property of every individual table or table 
group (or whatever else we'll be having).

The last proposition is a bit tricky, because it won't look like "create table 
with rocks engine with Clock cache..."; it would look like "create table in 
region Foo". We have to conceptualize all these things and at least come up 
with proper naming.
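
To make the "create table in region Foo" idea more concrete, here is a minimal 
Java sketch. It assumes regions are a named list held in configuration and 
that a table stores only the name of its region. All class, field and method 
names (DataRegionSketch, DataRegion, CacheType, createTable, regionOf) are 
hypothetical and are not part of Ignite or RocksDB.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: named data regions that tables reference by name only. */
final class DataRegionSketch {
    /** Cache flavor of a region; mirrors the LRU and Clock caches mentioned above. */
    enum CacheType { LRU, CLOCK }

    /** One named region with a fixed cache size (illustrative fields only). */
    static final class DataRegion {
        final String name;
        final CacheType type;
        final long sizeBytes;

        DataRegion(String name, CacheType type, long sizeBytes) {
            this.name = name;
            this.type = type;
            this.sizeBytes = sizeBytes;
        }
    }

    private final Map<String, DataRegion> regions = new LinkedHashMap<>();
    private final Map<String, String> tableToRegion = new LinkedHashMap<>();

    /** Registers a region; region names must be unique. */
    void addRegion(DataRegion region) {
        if (regions.putIfAbsent(region.name, region) != null) {
            throw new IllegalArgumentException("Duplicate region: " + region.name);
        }
    }

    /** Models "create table in region Foo": the table only names its region. */
    void createTable(String table, String regionName) {
        if (!regions.containsKey(regionName)) {
            throw new IllegalArgumentException("Unknown region: " + regionName);
        }
        tableToRegion.put(table, regionName);
    }

    /** Resolves the region (and thus the cache settings) a table belongs to. */
    DataRegion regionOf(String table) {
        String regionName = tableToRegion.get(table);
        if (regionName == null) {
            throw new IllegalArgumentException("Unknown table: " + table);
        }
        return regions.get(regionName);
    }

    public static void main(String[] args) {
        DataRegionSketch node = new DataRegionSketch();
        node.addRegion(new DataRegion("Foo", CacheType.CLOCK, 256L * 1024 * 1024));
        node.createTable("person", "Foo");
        System.out.println(node.regionOf("person").name);
    }
}
```

The point of the sketch is that the cache type and size live in the region, 
so the table definition never mentions the storage engine or cache flavor.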
h3. Update 1
 * the only way to control RocksDB memory usage is to have a single DB 
instance. For every table there will be several column families:
 ** one for table meta;
 ** one for every partition;
 ** one for every index;
 * data regions are a configuration of every individual node. They will have 
a name, a type and some other settings. The way tables choose the region 
remains to be defined;
 * there have to be common RocksDB settings outside of region settings, like 
memtable size, WAL settings, etc.
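
The per-table column family layout from the first bullet could be sketched as 
follows. The naming scheme (table name plus ".meta", ".part-N", ".idx-NAME") 
and the class and method names are assumptions made purely for illustration; 
the ticket does not define how column families are named.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: the column families one table would need in a single DB instance. */
final class ColumnFamilyLayoutSketch {
    /**
     * Lists column family names for a table: one for meta, one per partition,
     * one per index. The name format is illustrative only.
     */
    static List<String> columnFamilies(String table, int partitions, List<String> indexes) {
        List<String> names = new ArrayList<>();
        names.add(table + ".meta");
        for (int p = 0; p < partitions; p++) {
            names.add(table + ".part-" + p);
        }
        for (String index : indexes) {
            names.add(table + ".idx-" + index);
        }
        return names;
    }

    public static void main(String[] args) {
        // A table with 3 partitions and 2 indexes needs 1 + 3 + 2 = 6 column families.
        for (String name : columnFamilies("person", 3, List.of("byName", "byAge"))) {
            System.out.println(name);
        }
    }
}
```

With this layout the number of column families grows as 1 + partitions + 
indexes per table, which is worth keeping in mind when choosing defaults.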

  was:
In Ignite 2.x there's a concept of "Data Regions", which is basically a set of 
fixed-sized in-memory caches that store data for a number of cache groups 
(let's ignore system region and similar stuff for now). It is very convenient 
and represents a core design feature in Apache Ignite - In-Memory Database.

Currently, Page Memory subsystem is not yet ported to Ignite 3.x codebase. 
Instead, there's an implementation based on RocksDB database to store data 
persistently.

But, this implementation is very simple and naive. There's no notion of 
in-memory cache across multiple tables, meaning that it can't be called an 
In-Memory Database. We should investigate ways to add this concept back on top 
of RocksDB implementation.

There are several things to investigate here:
 * how do you set up rocksdb properly and control its memory consumption - we 
should allow some configuration and a meaningful set of defaults;
 * how do you put a cache on top of several rocksdb instances. This is actually 
pretty easy, just use "org.rocksdb.Options#setRowCache(org.rocksdb.Cache)", it 
has LRU and Clock implementations. A way to configure it is still required;
 * how do we introduce data regions into our system? I see something like this:
 ** list of regions is either a node or cluster configuration;
 ** name of the region is a property of every individual table or table group 
(or whatever else we'll be having).

Last proposition is a bit tricky, cause it won't look like "create table with 
rocks engine with Clock cache...", it would look like "create table in region 
Foo". We have to conceptualize all these things and come up with proper naming 
at least.


> Research possibility of having caching layer on top of RocksDB partitions
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-15351
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15351
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Assignee: Ivan Bessonov
>            Priority: Major
>              Labels: iep-74, ignite-3
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
