Some of my PostgreSQL Experts colleagues have been complaining to me that servers under load, running very large queries, produce corrupted CSV log files in which lines from different backends are apparently multiplexed. The log chunking protocol between the elog routines and the syslogger is supposed to prevent exactly that, so I did a little work to try to reproduce it in a controlled way. On my dual quad-core Xeon setup, this script:

   #!/bin/sh
   # usage: $0 <parallel-clients> <inserts-per-client>
   par=$1
   seq=$2

   # build one very large text value from the first 2000 dictionary words
   sed 2000q /usr/share/dict/words > words

   psql -q -c 'drop table if exists foo'
   psql -q -c 'create table foo (t text)'

   # generate a psql script that inserts that value $seq times
   echo '\set words `cat words`' > wordsin.sql
   echo 'prepare fooplan (text) as insert into foo values ($1);' >> wordsin.sql

   for i in `seq 1 $seq`; do
      echo "execute fooplan(:'words');" >> wordsin.sql
   done

   # run $par of those clients concurrently
   for i in `seq 1 $par`; do
      psql -q -t -f wordsin.sql &
   done
   wait
called with parameters of 100 and 50 (i.e. 100 simultaneous clients, each doing 50 very large inserts) causes CSV log corruption quite reliably on PostgreSQL 9.1.
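
To reproduce it, the server needs to be writing the statements to a CSV log; something along these lines should do (the settings and the script name here are illustrative, not a fixed recipe):

   # postgresql.conf settings assumed for this test (illustrative values):
   #   log_destination = 'csvlog'
   #   logging_collector = on
   #   log_statement = 'all'    # so the big INSERT statements reach the log
   #
   # with the script above saved as (say) csvlog_stress.sh:
   chmod +x csvlog_stress.sh
   ./csvlog_stress.sh 100 50    # 100 parallel clients, 50 large inserts each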

This is a serious bug. I'm going to investigate, but it's causing major pain, so anyone else who has any ideas is welcome to chime in.
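
One rough way to confirm the corruption (a sketch, using the postgres_log table definition from the 9.1 CSV logging docs; the log file path is just an example, point it at your actual csvlog file) is to try loading the file back in with COPY, which will choke on interleaved lines:

   # create the table the CSV log format is documented to load into
   psql -q -c "create table if not exists postgres_log (
      log_time timestamp(3) with time zone,
      user_name text,
      database_name text,
      process_id integer,
      connection_from text,
      session_id text,
      session_line_num bigint,
      command_tag text,
      session_start_time timestamp with time zone,
      virtual_transaction_id text,
      transaction_id bigint,
      error_severity text,
      sql_state_code text,
      message text,
      detail text,
      hint text,
      internal_query text,
      internal_query_pos integer,
      context text,
      query text,
      query_pos integer,
      location text,
      application_name text,
      primary key (session_id, session_line_num))"

   # a corrupted CSV log will normally fail to load cleanly
   psql -c "\copy postgres_log from 'pg_log/postgresql.csv' with csv"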

cheers

andrew
