On 02.12.2019 4:15, Hubert Zhang wrote:

The prototype extension is at https://github.com/zhangh43/vectorize_engine

I am very sorry that I had not followed this link.
A few questions concerning your design decisions:

1. Would it be more efficient to use native arrays in vtype instead of an array of Datum? I think it would allow the compiler to generate more efficient code for operations on float4 and int32 types.
It is possible to use a union to keep the size of vtype fixed.
2. Why does VectorTupleSlot contain an array (batch) of heap tuples rather than vectors (an array of vtype)?
3. Why did you have to implement your own plan_tree_mutator instead of using expression_tree_mutator?
4. As far as I understand, you now always try to replace SeqScan with your custom vectorized scan. But this makes sense only if there are quals for the scan or an aggregation is performed.
In other cases batch+unbatch just adds extra overhead, doesn't it?
5. Throwing and catching an exception for queries which cannot be vectorized does not seem to be the safest or most efficient way of handling such cases. Maybe it is better to return an error code from plan_tree_mutator and propagate this error upwards?
6. Have you experimented with different batch sizes? I have done similar experiments in VOPS and found that tile sizes larger than 128 do not provide a noticeable increase in performance. You are currently using a batch size of 1024, which is significantly larger than the typical number of tuples on one page.
7. How can vectorized scan be combined with parallel execution (it is already supported in 9.6, isn't it)?
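
To make point 1 more concrete, here is a minimal sketch of what I mean by native arrays plus a union. It is only an illustration: VTypeSketch, VValuesSketch, vfloat4_add and the field names are hypothetical and are not taken from the extension, and BATCH_SIZE just mirrors the 1024 mentioned in point 6.

```c
#include <stdbool.h>
#include <stdint.h>

#define BATCH_SIZE 1024			/* assumed; mirrors the batch size in point 6 */

/*
 * Hypothetical value storage: one native array per element type.
 * All members overlap, so the union is always as large as its widest member.
 */
typedef union VValuesSketch
{
	int32_t		i4[BATCH_SIZE];
	float		f4[BATCH_SIZE];
	int64_t		i8[BATCH_SIZE];
	double		f8[BATCH_SIZE];
} VValuesSketch;

/* The size of the vector does not depend on which member is in use. */
_Static_assert(sizeof(VValuesSketch) == BATCH_SIZE * sizeof(int64_t),
			   "union is sized by its widest member");

/* Hypothetical vector type, not the actual vtype of the extension. */
typedef struct VTypeSketch
{
	int			dim;			/* number of valid entries */
	bool		isnull[BATCH_SIZE];
	VValuesSketch values;
} VTypeSketch;

/*
 * Element-wise float addition over native arrays. The loop reads
 * contiguous 4-byte floats and needs no per-element Datum conversion,
 * which is the kind of loop a compiler can auto-vectorize.
 */
static void
vfloat4_add(const VTypeSketch *a, const VTypeSketch *b, VTypeSketch *res)
{
	int			n = a->dim < b->dim ? a->dim : b->dim;

	for (int i = 0; i < n; i++)
	{
		res->values.f4[i] = a->values.f4[i] + b->values.f4[i];
		res->isnull[i] = a->isnull[i] || b->isnull[i];
	}
	res->dim = n;
}
```

With an array of Datum each float4 occupies a pointer-sized slot, so on a 64-bit platform the same loop touches twice as much memory per element.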
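
For point 5, something like the following is what I have in mind. Again this is just a sketch under my own assumptions: PlanNodeSketch, can_vectorize and try_vectorize_plan are made-up names, the real plan nodes are of course different, and treating NODE_SORT as unsupported is an arbitrary choice for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified plan node; not the PostgreSQL Plan struct. */
typedef enum { NODE_SEQSCAN, NODE_AGG, NODE_SORT } NodeKindSketch;

typedef struct PlanNodeSketch
{
	NodeKindSketch kind;
	bool		vectorized;
	struct PlanNodeSketch *left;
	struct PlanNodeSketch *right;
} PlanNodeSketch;

/* Assumption for the example only: sort nodes have no vectorized form. */
static bool
can_vectorize(const PlanNodeSketch *node)
{
	return node->kind != NODE_SORT;
}

/* Read-only pass: false as soon as any node cannot be vectorized. */
static bool
plan_is_vectorizable(const PlanNodeSketch *node)
{
	if (node == NULL)
		return true;
	if (!can_vectorize(node))
		return false;
	return plan_is_vectorizable(node->left) &&
		plan_is_vectorizable(node->right);
}

static void
mark_vectorized(PlanNodeSketch *node)
{
	if (node == NULL)
		return;
	node->vectorized = true;
	mark_vectorized(node->left);
	mark_vectorized(node->right);
}

/*
 * Returns false and leaves the plan untouched if it cannot be vectorized,
 * so the caller simply falls back to the original plan. No exception is
 * thrown or caught.
 */
static bool
try_vectorize_plan(PlanNodeSketch *root)
{
	if (!plan_is_vectorizable(root))
		return false;
	mark_vectorized(root);
	return true;
}
```

The failure is reported through the return value of each recursive call, so the caller can decide what to do without relying on error recovery.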

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
