Tom Lane wrote:
Andrew Dunstan <[EMAIL PROTECTED]> writes:
Here are some timing tests on 1m rows of random utf8-encoded 100-char data. It doesn't look to me like the saving you're suggesting is worth the trouble.

Hmm ... not sure I believe your numbers.  Using a test file of 1m lines
of 100 random latin1 characters converted to utf8 (thus, about half and
half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII:

regression=# \timing
Timing is on.
regression=# create temp table test(f1 text);
Time: 5.047 ms
regression=# copy test from '/home/tgl/zzz1m';
COPY 1000000
Time: 4337.089 ms

and this in UTF8 encoding:

utf8=# \timing
Timing is on.
utf8=# create temp table test(f1 text);
Time: 5.108 ms
utf8=# copy test from '/home/tgl/zzz1m';
COPY 1000000
Time: 7776.583 ms

The numbers aren't super repeatable, but it sure looks to me like the
encoding check adds at least 50% to the runtime in this example; so
doing it twice seems unpleasant.
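
For reference, the relative slowdown implied by the two COPY timings quoted above can be worked out as below. This is only arithmetic on the two single-run numbers from the psql sessions; the helper name overhead_pct is made up for illustration, and one run per encoding says nothing about run-to-run noise.

```python
# COPY timings (ms) copied from the two psql sessions quoted above.
sql_ascii_ms = 4337.089
utf8_ms = 7776.583


def overhead_pct(baseline_ms, measured_ms):
    """Extra runtime of measured_ms relative to baseline_ms, in percent."""
    return 100.0 * (measured_ms - baseline_ms) / baseline_ms


extra = overhead_pct(sql_ascii_ms, utf8_ms)
print(f"UTF8 run took {extra:.1f}% longer than the SQL_ASCII run")
```

For these two runs the figure comes out at about 79%, which is consistent with the "at least 50%" estimate.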

(This is CVS HEAD, compiled without assert checking, on an x86_64
Fedora Core 6 box.)
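
A test file of the kind described above could be produced with something like the following. This is a sketch under assumptions, not the script actually used: the choice of "printable Latin-1" code points, the exclusion of backslash (COPY's text format treats it as an escape), the seed, and the function names are all made up here.

```python
import random

# Printable Latin-1 code points: ASCII 0x20-0x7E plus 0xA0-0xFF.
# Roughly half are 7-bit ASCII and half need two bytes in UTF-8.
# Backslash is dropped (an assumption) so COPY does not see escapes.
ALPHABET = [
    chr(c)
    for c in list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))
    if chr(c) != "\\"
]


def make_line(rng, length=100):
    """Return one line of `length` random characters from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def write_test_file(path, rows=1_000_000, length=100, seed=0):
    """Write `rows` random lines to `path`, encoded as UTF-8."""
    rng = random.Random(seed)
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for _ in range(rows):
            f.write(make_line(rng, length) + "\n")
```

Calling write_test_file("copy_test_1m.txt") (a hypothetical file name) would give a file that can be loaded with copy test from '...' as in the sessions above.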


Here are some test results that are closer to yours. I used a temp table and had cassert off and fsync off, and tried with several encodings.

The additional load from the test isn't 50% (I think you have added the cost of going from ASCII to UTF8 to the cost of the test to get that 50%), but it is nevertheless appreciable.

I agree that we should look at not testing if the client and server encodings are the same, so we can reduce the difference.



                  Run  SQL_ASCII    LATIN1      UTF8

  Without test      1    4659.38   4766.07   9134.53
                    2    7999.64   4003.13   6231.41
                    3    4178.46   6178.89   7266.39
                    4    4201.70   3930.84  10154.38
                    5    4092.44   4444.52   9438.24
                    6    3977.34   4197.09   8866.56
              Average    4851.49   4586.76   8515.25

  With test         1   11993.86  12625.80  10109.89
                    2    4647.16   9192.53  11251.27
                    3    4211.02   9903.77  10097.37
                    4    9203.62   7045.06  10372.25
                    5    4121.39   4138.78  10386.92
                    6    3722.73   4552.09   7432.56
              Average    6316.63   7909.67   9941.71
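
To put the table above in percentage terms, the per-encoding overhead of the test can be derived from the posted runs roughly as follows. This is a sketch: the dictionary layout and names are made up here, the units are taken as posted (presumably milliseconds), and the median is shown next to the mean only because the individual runs vary a lot.

```python
from statistics import mean, median

# Run timings copied from the table above, six runs per cell.
runs = {
    "SQL_ASCII": {
        "without": [4659.38, 7999.64, 4178.46, 4201.70, 4092.44, 3977.34],
        "with": [11993.86, 4647.16, 4211.02, 9203.62, 4121.39, 3722.73],
    },
    "LATIN1": {
        "without": [4766.07, 4003.13, 6178.89, 3930.84, 4444.52, 4197.09],
        "with": [12625.80, 9192.53, 9903.77, 7045.06, 4138.78, 4552.09],
    },
    "UTF8": {
        "without": [9134.53, 6231.41, 7266.39, 10154.38, 9438.24, 8866.56],
        "with": [10109.89, 11251.27, 10097.37, 10372.25, 10386.92, 7432.56],
    },
}


def overhead_pct(baseline, measured):
    """Extra runtime of `measured` relative to `baseline`, in percent."""
    return 100.0 * (measured - baseline) / baseline


summary = {}
for encoding, r in runs.items():
    summary[encoding] = {
        "mean_pct": overhead_pct(mean(r["without"]), mean(r["with"])),
        "median_pct": overhead_pct(median(r["without"]), median(r["with"])),
    }
    print(
        f"{encoding:10s} mean +{summary[encoding]['mean_pct']:.1f}%  "
        f"median +{summary[encoding]['median_pct']:.1f}%"
    )
```

Going by the averages, the test adds roughly 30% for SQL_ASCII, 72% for LATIN1 and 17% for UTF8. With only six noisy runs per cell, these figures should be read as rough indications.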
