On 7/16/25 19:56, Tomas Vondra wrote:
> On 7/16/25 18:39, Peter Geoghegan wrote:
>> On Wed, Jul 16, 2025 at 11:29 AM Peter Geoghegan <p...@bowt.ie> wrote:
>>> For example, with "linear_10 / eic=16 / sync", it looks like "complex"
>>> has about half the latency of "simple" in tests where selectivity is
>>> 10. The advantage for "complex" is even greater at higher
>>> "selectivity" values. All of the other "linear" test results look
>>> about the same.
>>
>> It's hard to interpret the raw data that you've provided. For example,
>> I cannot figure out where "selectivity" appears in the raw CSV file
>> from your results repo.
>>
>> Can you post a single spreadsheet or CSV file, with descriptive column
>> names, and a row for every test case you ran? And with the rows
>> ordered such that directly comparable results/rows appear close
>> together?
>>
>
> That's a good point, sorry about that. I forgot the CSV files don't
> have proper headers; I'll fix that and document the structure better.
>
> The process.sh script starts by loading the CSV(s) into sqlite, in
> order to do the processing / aggregations. If you copy and run the
> first couple of lines, you'll get scans.db, with proper column names
> and all that.
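>
> FWIW the import is just the standard sqlite3 CSV import, roughly like
> this (the database/file/table names here are only illustrative, the
> actual ones are in the script):
>
>   sqlite3 scans.db <<EOF
>   .mode csv
>   .import results.csv runs
>   EOF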
>
> The selectivity is calculated as
>
> (rows / total_rows)
>
> where "rows" is the rowcount returned by the query, and "total_rows"
> is reltuples for the table. I also had charts with "page selectivity",
> but those often got a bunch of 100% points squashed against the right
> edge, so I stopped generating them.
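>
> In SQL terms it's roughly this (assuming a "runs" table with one row
> per query execution; the column names are just illustrative):
>
>   -- nrows      = rowcount returned by the query
>   -- total_rows = reltuples for the table, captured during the run
>   SELECT nrows * 1.0 / total_rows AS row_selectivity FROM runs;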
>
I've pushed results from a couple more runs (cyclic_25 is still
running), and I added "export.csv", which has a subset of the columns
plus the calculated row/page selectivities.
Does this work for you?
regards
--
Tomas Vondra