Hi all,
I found an interesting problem with global secondary index updates. I am using
Phoenix 4.6.1. See the following SQL statements:
create table tt (pk1 integer not null, pk2 varchar not null, cf.a integer,
cf.b varchar constraint pk primary key (pk1, pk2));
create index tta on tt(cf.a);
create index ttb on tt(cf.b);
Now we have a data table TT and its index tables TTA and TTB. Do some UPSERT
into TT.
upsert into tt values (1, '#1', 100, '111');
For this UPSERT statement, I expect one Scan and one Put executed on the data
table TT, and one Put for each index table. But the operations that actually
took place were different.
For TT, two Scans were executed, one for CF.A and another for CF.B. I think
this should be reduced to one.
And for each index table, a useless Delete was executed: when the UPSERT ran,
the data table was empty, so there was nothing to delete in the index tables.
I added some logging to the code and dumped the Delete objects.
{"ts":1470037773335,"totalColumns":0,"families":{},"row":"\\x00\\x80\\x00\\x00\\x01#1"}
'\\x80\\x00\\x00\\x01' is PK1 and '#1' is PK2, so the leading '\\x00' must be
CF.A. Since we cannot get a value for CF.A, do we just use 0 instead? Why are
we doing this? Why not skip the Delete if we cannot get the old state?
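For reference, here is how I read the dumped row key. Phoenix serializes an
INTEGER as 4 big-endian bytes with the sign bit flipped, so '\\x80\\x00\\x00\\x01'
decodes to 1. A minimal Python sketch (decode_phoenix_int is just a helper name
of my own, not a Phoenix API):

```python
import struct

def decode_phoenix_int(b: bytes) -> int:
    """Decode a Phoenix-encoded INTEGER: 4-byte big-endian, sign bit flipped."""
    flipped = bytes([b[0] ^ 0x80]) + b[1:]
    return struct.unpack('>i', flipped)[0]

# The dumped Delete row key: \x00 (empty slot for CF.A) + PK1 + PK2
row = b'\x00\x80\x00\x00\x01#1'
indexed_value, pk1, pk2 = row[0:1], row[1:5], row[5:]

print(indexed_value)            # b'\x00' -- placeholder, CF.A has no old value
print(decode_phoenix_int(pk1))  # 1
print(pk2.decode('ascii'))      # '#1'
```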
A direct result of this behavior is that the first region of the index table
handles more than 10 times the requests of the other regions, since it receives
all the delete requests itself when the workload is 'append more, update less'.
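To illustrate the hot-spot effect: every such Delete row key begins with the
null placeholder byte, so they all sort to the very front of the index key
space and land in the first region. A rough sketch (the non-null key below is
only illustrative, not necessarily Phoenix's exact index-key encoding):

```python
# Delete row keys from the logs all start with the null placeholder for CF.A,
# so lexicographically they precede any key whose indexed INTEGER value is
# non-null (which encodes to a non-zero leading byte).
delete_keys = [b'\x00\x80\x00\x00\x01#1', b'\x00\x80\x00\x00\x02#2']

# Illustrative only: a key for a row whose CF.A has a real value (100).
non_null_key = b'\x80\x00\x00\x64\x80\x00\x00\x01#1'

# Every delete key sorts before the non-null key, so with range-based region
# splits they all hit the first region of the index table.
assert all(k < non_null_key for k in delete_keys)
```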
This wastes a lot of RPC calls and increases the response time of the data
table UPSERT operation.
Could anyone explain this to me? Thanks.
William.