Tomas Vondra <tomas.von...@enterprisedb.com> writes:

>>> 7) v20240502-0007-Detect-wrap-around-in-parallel-callback.patch
>>>
>>> There's one more efficiency problem - the parallel scans are required to
>>> be synchronized, i.e. the scan may start half-way through the table, and
>>> then wrap around. Which however means the TID list will have a very wide
>>> range of TID values, essentially the min and max TID for the key.
>> 
>> I have two questions here, and both of them are general GIN index
>> questions rather than questions specific to this patch.
>> 
>> 1. What does "wrap around" mean in "the scan may start half-way
>> through the table, and then wrap around"?  Searching for "wrap" in
>> gin/README turns up nothing.
>> 
>
> The "wrap around" is about the scan used to read data from the table
> when building the index. A "sync scan" may start e.g. at TID (1000,0)
> and read till the end of the table, and then wraps and returns the
> remaining part at the beginning of the table for blocks 0-999.
>
> This means the callback would not see a monotonically increasing
> sequence of TIDs.
>
> This is why the serial build disables sync scans: values can simply be
> appended to the sorted list, and even with regular flushes of data into
> the index we can simply append data to the posting lists.

Thanks for the hints, I see now that the sync scan strategy comes from
syncscan.c.
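
To make sure I understand the non-monotonic TID stream, here is a minimal
sketch of what a build callback would have to check to notice the
wrap-around. This is only my own illustration (the struct and function
names are made up), not code from the 0007 patch:

    #include "postgres.h"
    #include "storage/itemptr.h"

    /* hypothetical per-worker state, just for illustration */
    typedef struct BuildCallbackState
    {
        ItemPointerData last_tid;   /* last TID this worker has seen */
        bool            saw_any;    /* seen at least one TID? */
        bool            wrapped;    /* did the sync scan wrap around? */
    } BuildCallbackState;

    static void
    note_tid(BuildCallbackState *state, ItemPointer tid)
    {
        /*
         * With sync scans enabled the TIDs are not guaranteed to be
         * monotonically increasing; a TID lower than the previous one
         * means the scan wrapped around to the start of the table.
         */
        if (state->saw_any &&
            ItemPointerCompare(tid, &state->last_tid) < 0)
            state->wrapped = true;

        state->last_tid = *tid;
        state->saw_any = true;
    }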

>>> Without 0006 this would cause frequent failures of the index build, with
>>> the error I already mentioned:
>>>
>>>   ERROR: could not split GIN page; all old items didn't fit

>> 2. I can't understand the below error.
>> 
>>>   ERROR: could not split GIN page; all old items didn't fit

>   if (!append || ItemPointerCompare(&maxOldItem, &remaining) >= 0)
>     elog(ERROR, "could not split GIN page; all old items didn't fit");
>
> It can fail simply because of the !append part.

Got it, thanks!

>> If we split the blocks among workers 1 block at a time, we will have a
>> serious issue like this one.  If we could hand out N blocks at a time,
>> with N chosen so that they roughly fill work_mem (which is what backs
>> the dedicated temp file), we could make things much better, couldn't we?

> I don't understand the question. The blocks are distributed to workers
> by the parallel table scan, and it certainly does not do that block by
> block. But even if it did, that's not a problem for this code.

OK, I see that ParallelBlockTableScanWorkerData.phsw_chunk_size is
designed for this.
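
Just to write down my understanding, here is a toy version of chunked
block allocation, showing how handing out runs of consecutive blocks keeps
them with one worker. The names and the shared counter are invented by me
for illustration; this is not the real table_block_parallelscan_nextpage:

    #include "postgres.h"
    #include "port/atomics.h"
    #include "storage/block.h"

    /*
     * Toy chunked allocator: each worker claims a chunk of blocks from a
     * shared counter, so it scans long runs of consecutive blocks instead
     * of interleaving block-by-block with other workers.
     */
    static BlockNumber
    next_block_for_worker(pg_atomic_uint64 *shared_next, BlockNumber nblocks,
                          BlockNumber chunk_size,
                          BlockNumber *chunk_remaining, BlockNumber *next_block)
    {
        if (*chunk_remaining == 0)
        {
            /* claim a new chunk of blocks for this worker */
            uint64      start = pg_atomic_fetch_add_u64(shared_next, chunk_size);

            if (start >= nblocks)
                return InvalidBlockNumber;  /* whole table handed out */

            *next_block = (BlockNumber) start;
            *chunk_remaining = Min(chunk_size, nblocks - (BlockNumber) start);
        }

        (*chunk_remaining)--;
        return (*next_block)++;
    }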

> The problem is that if the scan wraps around, then one of the TID lists
> for a given worker will have the min TID and max TID, so it will overlap
> with every other TID list for the same key in that worker. And when the
> worker does the merging, this list will force a "full" merge sort for
> all TID lists (for that key), which is very expensive.

OK.
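
For the archive, here is a toy sketch of the append-vs-merge decision as I
understand it. This is my own code, not the patch's; full_merge is a
hypothetical placeholder for the expensive path:

    #include "postgres.h"
    #include "storage/itemptr.h"

    /* hypothetical placeholder: full merge of two sorted TID arrays */
    static void full_merge(ItemPointerData *acc, int *nacc,
                           ItemPointerData *in, int nin);

    static void
    merge_tid_list(ItemPointerData *acc, int *nacc,
                   ItemPointerData *in, int nin)
    {
        if (*nacc == 0 ||
            ItemPointerCompare(&acc[*nacc - 1], &in[0]) < 0)
        {
            /* lists don't overlap: a cheap append keeps the result sorted */
            memcpy(&acc[*nacc], in, nin * sizeof(ItemPointerData));
            *nacc += nin;
            return;
        }

        /*
         * Lists overlap.  A wrapped-around scan produces one list spanning
         * both the min and the max TID, so it overlaps every other list
         * for the same key and forces this expensive path every time.
         */
        full_merge(acc, nacc, in, nin);
    }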

Thanks for all the answers, they are pretty instructive!

-- 
Best Regards
Andy Fan


