Thanks Eric and Qifan. I am sorry to reply after so long a time because I'm
on the Chinese mid-autumn festival holiday.
According to Qifan's advice, I upload a log which contains the DDL and the
query plan hope to get more advice.
and to Eric, I summit a jira about the block-encoding and the compr
Hi Qiao,
I see that your select query is using an equi-predicate on sid, and that
you have partitioned on SID. That is going to be a problem for parallelism,
as all the data to scan will be located on the same region.
Please consider removing the “ON (SID)” from the ddl.
I see that you use U
Hi Qiao,
Thank you for the data. It is very helpful.
There are several things noticed.
- The key columns are: _SALT_, _DIVISION_1_, SID, V_DATE, UID, VID
- The salt column is built from column SID only, which means all rows
with identical SID values V will be stored in the same partitio
Many thanks to Eric and Qifan again.
I upload a new log about the new ddl and the query plan.
@Eric, Following your advice I modified the ddl, the sum query now takes
25seconds while the old ddl will take 90seconds.(there are 3 nodes in my
cluster, 8cores(16 v-cores) and 16GRAM), Is there any othe
Hi Qiao
Optimizer picked 6 as DOP. You can force it higher:
Use
CQD parallel_num_esps ‘12’;
Then prepare and run the query.
You may experiment with re-generating you table with SALT 15 and CQD
parallel_num_esps ‘15’.
you can even draw a curve
SALT 18, parallel_num_esp ‘18’
SALT 21,
I also see that you should not be using DIVISION BY, since your V_DATE are
already on day boundary. Using DIVISION_BY is counter-productive here.
And sorry I opened wrong file, I had old and new opened at the same time
and looked wrong window J, you did change the charset and char length, my
bad