Tom Lane wrote:
Andrew Dunstan <[EMAIL PROTECTED]> writes:
Here are some timing tests on 1m rows of random utf8-encoded 100-char data. It doesn't look to me like the saving you're suggesting is worth the trouble.

Hmm ... not sure I believe your numbers.  Using a test file of 1m lines
of 100 random latin1 characters converted to utf8 (thus, about half and
half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII encoding:

regression=# \timing
Timing is on.
regression=# create temp table test(f1 text);
Time: 5.047 ms
regression=# copy test from '/home/tgl/zzz1m';
COPY 1000000
Time: 4337.089 ms

and this in UTF8 encoding:

utf8=# \timing
Timing is on.
utf8=# create temp table test(f1 text);
Time: 5.108 ms
utf8=# copy test from '/home/tgl/zzz1m';
COPY 1000000
Time: 7776.583 ms

The numbers aren't super repeatable, but it sure looks to me like the
encoding check adds at least 50% to the runtime in this example; so
doing it twice seems unpleasant.
(This is CVS HEAD, compiled without assert checking, on an x86_64
Fedora Core 6 box.)
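
For anyone wanting to reproduce this kind of comparison, here is a rough Python sketch of how a similar test file could be built, plus a helper for the relative slowdown between two timings. It is not the script used for the numbers above. The function names, the seed, and the exact character set (printable ASCII without backslash, plus the Latin-1 range 0xA0-0xFF) are assumptions chosen so that COPY's text format sees no escape characters and the mix is roughly half 1-byte and half 2-byte UTF-8.

```python
import random

# Assumed alphabet: printable ASCII minus backslash (COPY's escape character),
# plus the Latin-1 high range, which encodes to 2 bytes each in UTF-8.
ASCII_CHARS = [chr(c) for c in range(0x20, 0x7F) if chr(c) != "\\"]
HIGH_CHARS = [chr(c) for c in range(0xA0, 0x100)]
ALPHABET = ASCII_CHARS + HIGH_CHARS


def random_line(rng, length=100):
    """Return one line of `length` random characters from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def write_test_file(path, rows=1_000_000, seed=0):
    """Write `rows` random lines to `path`, encoded as UTF-8."""
    rng = random.Random(seed)
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for _ in range(rows):
            f.write(random_line(rng) + "\n")


def relative_overhead(baseline_ms, measured_ms):
    """Fraction by which measured_ms exceeds baseline_ms."""
    return (measured_ms - baseline_ms) / baseline_ms
```

Calling write_test_file() with a path of your choosing produces a file that can be loaded with the same copy command in both databases. For the two timings quoted above, relative_overhead(4337.089, 7776.583) comes out just under 0.8, which is consistent with the "at least 50%" estimate.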

Are you comparing apples with apples? The db is utf8 in both of my cases.


