On 2019-03-22 13:40:28 +0100, Francisco Olarte wrote:
> On Fri, Mar 22, 2019 at 11:22 AM Thomas Güttler
> <[email protected]> wrote:
> > Thank you for asking several times for a benchmark.
> > I wrote it now and it is visible: inserting random bytes into bytea
> > is much slower, if you use the psycopg2 defaults.
> > Here is the chart:
> >
> > https://github.com/guettli/misc/blob/master/bench-bytea-inserts-postrgres.png
> > And here is the script which creates the chart:
> >
> > https://github.com/guettli/misc/blob/master/bench-bytea-inserts-postrgres.py
>
> I'm not too sure, but I read ( in the code ) you are measuring a
> nearly not compressible urandom data againtst a highly compressible (
> 'x'*i ) data,

Yes, that seems to be the main difference. My "ascii" test creates
random data in the range [32, 126], which is not very compressible, and
I didn't see much of a difference from the full range (10th percentile and
median were the same, 90th percentile was noticeably better). If I
create "random" data in the range [120, 120], I also get a large speedup
(about 3.5 times). Interestingly the difference vanishes for large (> 10
MB) blobs.
> are you sure the difference is not due to data being compressed and
> generating much less disk usage in toast-tables/wal?
Yes, I think that's it: He is basically measuring how fast his CPU
can compress a stream of constant bytes. Not very meaningful.
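
To make the compressibility argument concrete, here is a minimal sketch
that compares the three kinds of payload discussed in this thread. It is
not taken from either benchmark script. It uses Python's zlib purely as a
stand-in, which is an assumption: PostgreSQL compresses TOASTed values
with its own algorithm (pglz by default), so the absolute ratios will
differ, but the ordering should be similar. The helper names
compression_ratio and make_payloads are made up for this illustration.

```python
import os
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data)) / len(data)


def make_payloads(size: int, seed: int = 0) -> dict:
    """Build one payload of each kind discussed in the thread."""
    rng = random.Random(seed)  # seeded so the ASCII payload is repeatable
    return {
        # full byte range, essentially incompressible
        "urandom": os.urandom(size),
        # random printable ASCII, only mildly compressible
        "ascii [32, 126]": bytes(rng.randint(32, 126) for _ in range(size)),
        # one repeated byte, like 'x'*i, extremely compressible
        "constant 'x'": b"x" * size,
    }


if __name__ == "__main__":
    for name, data in make_payloads(100_000).items():
        print(f"{name:16s} ratio = {compression_ratio(data):.3f}")
```

The constant payload should shrink to a tiny fraction of its size, while
the two random payloads stay close to their original size, which matches
the timing pattern described above.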
Another difference I noticed between our benchmarks is that I used a
plain bytes object while he used a psycopg2.Binary object. Those might
be serialized differently, but since the speed difference is adequately
explained by the (lack of) randomness, I am not going to investigate
this.
hp
--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | [email protected]  | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
