Re: pgbench - allow to specify scale as a size

Fabien COELHO Mon, 19 Feb 2018 09:28:30 -0800


Hello Mark,

What if we consider using ascii (utf8?) text file sizes as a reference
point, something independent from the database?


Why not.

TPC-B basically specifies that rows (accounts, tellers, branches) are all
padded to 100 bytes, thus we could consider (i.e. document) that
--scale=SIZE refers to the amount of useful data held, and warn that
actual storage would add various overheads for page and row headers, free
space at the end of pages, indexes...


Then one scale step is 100,000 accounts, 10 tellers and 1 branch, i.e.
100,011 * 100 ~ 9.5 MiB of useful data per scale step.
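
The arithmetic above can be sketched as follows. This is only an illustration of the "useful data" reading of --scale=SIZE, not code from the patch: the helper name scale_for_size and the choice to round up to the next whole scale step are assumptions.

```python
import math

# TPC-B pads every row (accounts, tellers, branches) to 100 bytes.
ROW_BYTES = 100
# Rows added by one scale step: 100,000 accounts + 10 tellers + 1 branch.
ROWS_PER_SCALE = 100_000 + 10 + 1

# Useful data per scale step, in bytes (100,011 * 100).
BYTES_PER_SCALE = ROWS_PER_SCALE * ROW_BYTES

MIB = 1024 ** 2
GIB = 1024 ** 3


def scale_for_size(target_bytes: int) -> int:
    """Hypothetical helper: smallest scale whose useful data reaches
    target_bytes. Rounding up is an assumption, not the patch's rule."""
    return max(1, math.ceil(target_bytes / BYTES_PER_SCALE))


if __name__ == "__main__":
    print(f"{BYTES_PER_SCALE / MIB:.2f} MiB of useful data per scale step")
    print(f"scale for 50 GiB of useful data: {scale_for_size(50 * GIB)}")
```

Under these assumptions the figure is independent of page size, alignment or backend version, which is the point of using it as the reference.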

I realize even if a flat file size can be used as a more consistent
reference across platforms, it doesn't help with accurately determining
the final data file sizes due to any architecture specific nuances or
changes in the backend.  But perhaps it might still offer a little more
meaning to be able to say "50 gigabytes of bank account data" rather
than "10 million rows of bank accounts" when thinking about size over
cardinality.


Yep.

Now the overhead is really 60-65%. Although the specification is
unambiguous, we still need some maths to know whether it fits in
buffers or memory... The point of Karel's regression is to take this into
account.
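
A rough sketch of the kind of maths meant here, assuming the 60-65% figure is read as extra storage on top of the useful data. The function names and the fixed overhead factor are illustrative assumptions. This is not Karel's regression, which would fit the overhead from measurements rather than assume it.

```python
# TPC-B: every row is padded to 100 bytes; one scale step is
# 100,000 accounts + 10 tellers + 1 branch.
ROW_BYTES = 100
ROWS_PER_SCALE = 100_000 + 10 + 1

GIB = 1024 ** 3


def estimated_storage_bytes(scale: int, overhead: float) -> float:
    """Useful data inflated by an assumed overhead fraction (e.g. 0.60-0.65)
    standing in for page and row headers, free space and indexes."""
    useful = scale * ROWS_PER_SCALE * ROW_BYTES
    return useful * (1.0 + overhead)


def fits(scale: int, budget_bytes: int, overhead: float = 0.65) -> bool:
    """True if the estimated storage stays within the given buffer or
    memory budget, using the pessimistic end of the range by default."""
    return estimated_storage_bytes(scale, overhead) <= budget_bytes


if __name__ == "__main__":
    for budget_gib in (1, 2):
        print(f"scale 100 within {budget_gib} GiB: "
              f"{fits(100, budget_gib * GIB)}")
```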

Also, whether this option would be more admissible to Tom is still an open
question. Tom?


--
Fabien.
