On 02.12.2019 4:15, Hubert Zhang wrote:
The prototype extension is at https://github.com/zhangh43/vectorize_engine
I am very sorry that I have not followed this link.
A few questions concerning your design decisions:
1. Would it be more efficient to use native arrays in vtype instead of
an array of Datum? I think it would allow the compiler to generate more
efficient code for operations on float4 and int32 types.
It is possible to use a union to keep the size of vtype fixed.
2. Why does VectorTupleSlot contain an array (batch) of heap tuples
rather than vectors (arrays of vtype)?
3. Why do you have to implement your own plan_tree_mutator rather than using
4. As far as I understand, you currently always try to replace SeqScan
with your custom vectorized scan. But this makes sense only if the scan
has quals or an aggregation is performed.
In other cases batch+unbatch just adds extra overhead, doesn't it?
5. Throwing and catching an exception for queries which cannot be
vectorized seems to be neither the safest nor the most efficient way of handling
Maybe it is better to return an error code from plan_tree_mutator and
propagate this error up to the caller?
6. Have you experimented with different batch sizes? I did similar
experiments in VOPS and found that tile sizes larger than 128 do not
provide a noticeable increase in performance.
You are currently using a batch size of 1024, which is significantly
larger than the typical number of tuples on one page.
7. How can vectorized scan be combined with parallel execution (it is
already supported in 9.6, isn't it?)
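
As an illustration of question 1, the union idea could look roughly like
the sketch below. The names vtype_sketch, BATCH_SIZE and vint4_add are
hypothetical and not taken from the extension, and Datum is replaced by
a uintptr_t stand-in so that the fragment compiles outside of Postgres.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BATCH_SIZE 128      /* hypothetical tile size, not the extension's value */

typedef uintptr_t Datum;    /* stand-in for PostgreSQL's Datum (assumption) */

/*
 * Hypothetical vtype layout: the union keeps the struct size fixed
 * regardless of the element type, while each member is a native array.
 */
typedef struct vtype_sketch
{
    int     dim;                    /* number of valid elements in the batch */
    bool    isnull[BATCH_SIZE];     /* per-element NULL flags */
    union
    {
        Datum   datums[BATCH_SIZE]; /* generic fallback representation */
        int32_t i4[BATCH_SIZE];     /* int32 column */
        float   f4[BATCH_SIZE];     /* float4 column */
        int64_t i8[BATCH_SIZE];     /* int64 column */
        double  f8[BATCH_SIZE];     /* float8 column */
    } values;
} vtype_sketch;

/*
 * Element-wise int32 addition over densely packed native values.
 * A tight loop over int32_t[] is something a compiler can auto-vectorize.
 */
static void
vint4_add(const vtype_sketch *a, const vtype_sketch *b, vtype_sketch *res)
{
    assert(a->dim == b->dim);
    res->dim = a->dim;
    for (int i = 0; i < a->dim; i++)
    {
        res->values.i4[i] = a->values.i4[i] + b->values.i4[i];
        res->isnull[i] = a->isnull[i] || b->isnull[i];
    }
}
```

With an array of Datum the int32 values would be spread over 8-byte
slots on a 64-bit platform, so the same loop would touch twice as much
memory per element.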
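
For question 5, returning a status code instead of throwing could be
sketched as follows. This is a toy model under loud assumptions:
PlanSketch, NodeKind, VecStatus, check_vectorizable and try_vectorize
are all hypothetical names, the plan is a simple chain of nodes, and
treating a sort node as "not vectorizable" is chosen only for the
example, not taken from the extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy node kinds; a real plan tree has many more (assumption). */
typedef enum { NODE_SEQSCAN, NODE_AGG, NODE_SORT } NodeKind;

typedef enum { VEC_OK, VEC_UNSUPPORTED } VecStatus;

typedef struct PlanSketch
{
    NodeKind            kind;
    bool                vectorized; /* set once the node is replaced */
    struct PlanSketch  *left;       /* single child, for brevity */
} PlanSketch;

/* Walk the tree and report the first unsupported node via the return code. */
static VecStatus
check_vectorizable(const PlanSketch *plan)
{
    if (plan == NULL)
        return VEC_OK;
    if (plan->kind == NODE_SORT)    /* assumed unsupported in this sketch */
        return VEC_UNSUPPORTED;
    return check_vectorizable(plan->left);  /* propagate the child's status */
}

/*
 * Vectorize the whole plan or leave it untouched: the status is passed
 * up to the caller, which can fall back to the original plan without
 * any exception handling.
 */
static VecStatus
try_vectorize(PlanSketch *plan)
{
    assert(plan != NULL);
    if (check_vectorizable(plan) != VEC_OK)
        return VEC_UNSUPPORTED;
    for (PlanSketch *p = plan; p != NULL; p = p->left)
        p->vectorized = true;
    return VEC_OK;
}
```

Checking first and mutating afterwards means a rejected plan is never
left half-converted, which is one thing the exception approach otherwise
has to clean up.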
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company