Hi All,

Suggestions
==========

When analyzing real-time data collected by PostgreSQL,
it can be difficult to tune the current PostgreSQL server for satisfactory 
performance.
Therefore, we propose Vertical Clustered Indexing (VCI), an in-memory column 
store function that holds data in a state suitable for business analysis and is 
also expected to improve analysis performance.
With VCI, you can also expect to run analysis 7.8 times faster. This is 
achieved by the analytics engine, which optimizes parallel processing of 
column-oriented data, in addition to the fact that VCI stores data in a 
columnar format, enabling efficient retrieval of the columns needed for 
analysis.

Similar Features
============

One column store feature available with postgres is Citus Columnar Table.
If you introduces the citus extension, which allows columnar tables to be used 
using the columnar access method.
This function is intended to analyze the accumulated data. Therefore, you 
cannot update or delete data.
VCI supports data updates and deletions. This enables you to analyze not only 
the accumulated data but also the data that occurs in real time.

Implementing VCI
============

To introduce an updatable column store, we explain how updates to row-oriented 
data are propagated to column-oriented data.

VCI has two storage areas.

- Write Optimized Storage (WOS)
- Read Optimized Storage (ROS)

Describes WOS.
The WOS stores data for all columns in the VCI in a row-oriented format.
All newly added data is stored in the WOS relation along with the transaction 
information.
Using WOS to delete and update newly added data has no significant performance 
impact compared to deleting from columnar storage.

ROS is the storage area where all column data is stored.

When inserting/updating/deleting, data is written synchronously to WOS. It does 
not compress or index the data.
This avoids the overhead of converting to a columnar while updating the data.
After a certain amount of data accumulates in the WOS, the ROS control daemon 
converts it to column data asynchronously with updates.
Column data transformation compresses and indexes the data and writes it to ROS.

Describes searching for data.
Since there are two storage formats, the SELECT process needs to convert the 
WOS data to local ROS to determine whether it is visible or invisible. This 
conversion cost depends on the number of tuples present in the WOS file. This 
may introduce some performance overhead.
Obtain search results by referencing the local ROS and referencing the ROS in 
parallel.

These implementation ideas are also posted on Fujitsu's blog for your 
reference. [1]

Past discussions
===========

We've proposed features before. [2]
This thread also explains the details, so please check it.

In a previous thread, we suggested implementing a modification to the 
PostgreSQL backend code.
Based on the FB we received at that time, we think we need to re-implement this 
feature in pluggable storage using the table access method API.
I also got a FB of the features I needed from a PostgreSQLPro member. We 
believe it is necessary to deal with these issues in stages.
- Need to provide vector processing for nodes (filter, grand aggregate, 
aggregation with group by...) to speed up computation
- Requires parallel processing support such as scanning

It is assumed that the re-implementation will also require additional 
functionality to the current Table Access Method API.
It is useful not only for VCI but also for other access methods.
Therefore, we decided to propose the VCI feature to the community and proceed 
with development.


Request matter
===========

Are members of the PostgreSQL hackers interested in VCI features?
We welcome your comments and suggestions on this feature.
In particular, if you have any questions, required features, or 
implementations, please let me know.


[1] 
https://www.postgresql.fastware.com/blog/improve-data-analysis-performance-without-impacting-business-transactions-with-vertical-clustered-index

[2]https://www.postgresql.org/message-id/cajrrpgfac7wc9nk6ptty6yn-nn+hcy8xolah2doyhvg5d6h...@mail.gmail.com

Regards,
Aya Iwata

FUJITSU LIMITED

Reply via email to