Re: [GENERAL] Trust intermediate CA for client certificates

2013-03-09 Thread Ian Pilcher
On 03/07/2013 12:42 PM, Ray Stell wrote:
 What Tom said works for me.  Here is a page that gives an example and I think 
 it demonstrates that the root CA does not allow everybody in the gate, the 
 chain has to be in place:
  
 http://stackoverflow.com/questions/1456034/trouble-understanding-ssl-certificate-chain-verification

That page doesn't even mention PostgreSQL.

 You can use the openssl verify command to test that the root is not wide
 open on its own.

The issue is the behavior of the PostgreSQL server.  openssl verify is
germane only in that it points to the source of the problem -- OpenSSL's
insistence on ultimately validating all certificates against a self-
signed root CA.  This requires that the root CA certificate be present
in root.crt, which causes the server to accept connections from all
clients that can present a certificate chain leading to that root CA.
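
For instance (a sketch -- good-client.crt is an assumed name for the
attached good-client certificate):

    $ openssl verify -CAfile client-ca.crt good-client.crt
    (fails: unable to get issuer certificate)

    $ cat client-ca.crt root-ca.crt > ca-chain.crt
    $ openssl verify -CAfile ca-chain.crt good-client.crt
    good-client.crt: OK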

If you don't believe me, test with the attached files, which implement
the following hierarchy.

                +---------+
                | Root CA |
                +---------+
                 /       \
                /         \
               /           \
    +-----------+         +-----------+
    | Server CA |         | Client CA |
    +-----------+         +-----------+
       /      \                  \
      /        \                  \
     /          \                  \
+----------+  +--------+        +--------+
| postgres |  |  Bad   |        |  Good  |
| (server) |  | client |        | client |
+----------+  +--------+        +--------+

The goal is to configure the server such that the good client will
be allowed to connect (because its certificate is signed by the Client
CA), but the bad client will not be allowed to connect (because its
certificate is not signed by the Client CA).

You will find the following:

1. You cannot simply use client-ca.crt as $PGDATA/root.crt.  OpenSSL
   will not validate a client certificate without access to the root CA
   certificate.

2. To enable client connections, you must add the root CA certificate
   to $PGDATA/root.crt -- cat client-ca.crt root-ca.crt > root.crt.

3. Once the root CA certificate is trusted, however, the bad client
   can also connect by using a certificate chain that includes the
   Server CA certificate -- cat bad-client.crt server-ca.crt >
   ~/.postgresql/postgresql.crt.  (See the sketch below.)
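
To reproduce findings 2 and 3 end to end, something like this should do it
(a sketch: the host name is made up, pg_hba.conf is assumed to require
client certificates, and the matching private keys are assumed to be in
place already):

    # server side: trusting the Client CA drags the root CA in as well
    cat client-ca.crt root-ca.crt > $PGDATA/root.crt

    # good client: chain ends at the Client CA -- accepted, as intended
    cp good-client.crt ~/.postgresql/postgresql.crt
    psql "host=db.example.com dbname=postgres sslmode=require"

    # bad client: splicing in the Server CA certificate still reaches the
    # trusted root, so this connection is accepted too
    cat bad-client.crt server-ca.crt > ~/.postgresql/postgresql.crt
    psql "host=db.example.com dbname=postgres sslmode=require"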

After looking at be-secure.c and investigating the way that OpenSSL
validates certificates, I do not believe that there is any way of
achieving the desired behavior with the current codebase.

Adding pgsql-hackers to see if there is any interest in a patch to add
this functionality.

-- 

Ian Pilcher arequip...@gmail.com
Sometimes there's nothing left to do but crash and burn...or die trying.



root-ca.crt
Description: application/pkix-cert


server-ca.crt
Description: application/pkix-cert


client-ca.crt
Description: application/pkix-cert


bad-client.crt
Description: application/pkix-cert
-BEGIN RSA PRIVATE KEY-
MIIEpAIBAAKCAQEA5rerLO8F7DxFxuOXFmJT/YDQDmQtaMPxMQs1fufUiFIqgCyA
oBMCTghSQJPpm5dPP4385ZKBys2noqIMz2zr3JFQaTU8mO5wAHBuMjCPKbzqap/o
YAQTejaDYW/vh9Kz3wWLgabcaQl0xl9ZFjBUAV9sEh9P3drG58k+f5Glnf6OgDSK
P4SBZm6nT2tP4XL7T1NFHtqUMQn6TLkpLYGUxg3zgb4j69la6ItPjVEKuMkAW5zl
biLDYc5uuqtNUJHGeRmd03rWeMwamJClH7bUwEuuq7e4wn5mufzZR63uYrNgrG+h
1k/ar+7roc02icQh2rPdU4nxIJrs+c4l4HdwFQIDAQABAoIBAHO/a3pEhFUrO9p3
LcKGHBsPN9IwgfOQcf2n4PPE/QRTLI1XRkSIpNxfIlzRmB59/70jz9+g68rB+DsI
T6L0wzPKF2xg0ADthnVB8pbtc7V92KEbjmo1QUxL8we8L5CVrbXSw1WNUADGRLaM
+VW/czWpGL/Sw6/K5YU9mkRH3q3vJqZbSYyPxO42mZUJcH1wk/vIfYtzcR9tMPkW
Ln9zHZvDT9uBTTzWrPqr+R8QPsdW01AW2+D4OrQF5XB09N5fNGB9oUIxFdr3p+Lw
XeqMjNEafGtzYLHITMrIb0btvXJ8VdAUgT4Rm5QU0aATwP5iC0cCxCnefN0Q1qKO
MiU1OvUCgYEA/L1EZiT7yqa/+UEm9z2w7m//FOSzS+1tV6HazHhIr9xQ9C3AK5W5
s2YARG+k1WIvgLBpF2Bp4SAvMIBFSY0TpD+f3wek/lhO1OrrETXtZkdhFTi90qox
65wgybPkITXXLJoYNpmUKgTy7GosoSI8E4VO+inusj+fsyk+7RkiddcCgYEA6bGr
PgpYHehCblY/4XVaKB76C73pDRUyQh4KLYrcZwEq1tV5XdfRW8WHDP349xdpFli1
ABSuoc0DKqarVQBjLhnJNmJ8g3/FnzrMqk6JIYBbinYZmFFs888H92IK5/ctMT32
9D7GOD4JrfBRqbOn91c5Nq/ne015vLREGIZ+c/MCgYAr2f8DJgmWCMaoTbigD1Ei
ncYJbwD4/JILMWcQMRKTiMt3AnUkWs8kpF8JgMF90JJjZrhlOPJGAFqPtMHQ2Cx/
RBbOELp88v+Ci9wLWWr+YwYiM30kDymoMqext4euh3P1JitrVcxSWhd4E5f4wULh
NDEW0K28ubNQ16g2ZTUIcwKBgQCOs3pA2SozoQcnvy0k7HcQNtIzZ1UvMvlMnHFU
nA24LGNPam3BGy9xna25BkEICViXV7W3BeoZTUoYuku3DRSDKyXOOteTqOsxL0OY

[GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread akp geek
Hi all -

 I am in desperate need of your help. The replication/streaming
stopped working on March 5th. I followed the procedure below to restart
the streaming. After running for a couple of hours, the database crashes
on the slave.  This is on our production server.  Thanks for your
help.

Steps that I followed

1.  SELECT pg_start_backup('label', true);
2.  Created a tar of the data directory.
3.  Untarred the data on the slave.
4.  Deleted the pid file on the slave.
5.  Created recovery.conf.
6.  SELECT pg_stop_backup(); on the primary.
7.  Started postgres on the slave.
8.  After 2 hours, the slave server crashes.

Please help.

Regards


Re: [GENERAL] Replication stopped on 9.0.2 after making change to conf file

2013-03-09 Thread Lonni J Friedman
It sounds like all you did was set up the slave from scratch with a
fresh base backup, without understanding or debugging what caused
everything to break.  Clearly whatever was wrong on March 5 is still
wrong, and nothing has been fixed.  The first step in debugging this
problem is to look at and/or post the log content (from both the
master & slave) from the time when this stopped working (March 5).

On Fri, Mar 8, 2013 at 11:06 PM, akp geek akpg...@gmail.com wrote:
 What I noticed is...

 our streaming stopped on march 5th.   I restarted the process today.
 However the new data is syncing correctly. But the data for these  dates
 between 5th and today is not syncing.  Is there some thing wrong that I did.

 here is what I did.

 1.. created base backup
 2.  Took the data directory on to the slave
 3.  stopped the base backup on master
 4. started the db on slave

 Appreciate your help.

 Regards



 On Sat, Mar 9, 2013 at 12:52 AM, akp geek akpg...@gmail.com wrote:

 Thanks a lot. I started the replication. It became very slow. It is taking
 long time to sync the masters data onto slave. Is there a way to find what's
 causing the issue?

 Regards



 On Fri, Mar 8, 2013 at 12:06 PM, John Laing john.la...@gmail.com wrote:

 I'm not sure about the existence of any standard scripts, but we have a
 pair of checks running periodically on the backup server.

 This shouldn't return anything:
 tail -3 /var/log/postgresql/postgresql-9.1-main.log | grep FATAL

 And this should return something:
 ps -u postgres -o cmd | grep "postgres: wal receiver process   streaming"

 These have worked very reliably for many months.
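
 If you want a check from inside the database as well, this works on the
 standby on 9.0+ (a sketch; run it however your monitoring runs commands):

 psql -c "SELECT pg_is_in_recovery(),
                 pg_last_xlog_receive_location(),
                 pg_last_xlog_replay_location();"

 If the receive location stops advancing while the master keeps writing,
 streaming has stalled.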

 -John


 On Fri, Mar 8, 2013 at 11:53 AM, akp geek akpg...@gmail.com wrote:

 I got it fixed.

 What I did was

 $ psql -c "SELECT pg_start_backup('label', true)"
 $ rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude
 postmaster.pid
 $ psql -c "SELECT pg_stop_backup()"

 It took a while a to catch up the data.


 One question I have: are there any scripts to monitor the status of the
 replication, so that I can be a little proactive?


 Regards



 On Thu, Mar 7, 2013 at 9:25 PM, Scott Marlowe scott.marl...@gmail.com
 wrote:

 On Thu, Mar 7, 2013 at 5:28 PM, akp geek akpg...@gmail.com wrote:
  Hi all -
 
 Recently made change on our primary database
 
default_text_search_config = 'pg_catalog.simple' .
  After that
  the replication is stopped. Can you please help me ? how to fix the
  issue. I
  am sure I made the change on the slave also.
 
  How can I start the replication and catch up the data. Thanks for
  your time.

 What are you seeing in your slony and / or postgresql logs, if
 anything?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread Lonni J Friedman
That process merely sets up a new server, it doesn't start streaming,
unless the server has been configured correctly.  You state that the
slave crashed after two hours.  How did you make this determination?

All you seem to be doing is setting up the slave from scratch
repeatedly, and assuming that it will magically just work, rather than
understanding & debugging why it's not working.  Where is the log
output from your servers that shows what is really transpiring?

On Sat, Mar 9, 2013 at 6:51 AM, akp geek akpg...@gmail.com wrote:
 Hi all -

  I am in desperate need of your help. The replication/streaming
 stopped working on March5th. I followed the following procedure to restart
 the streaming. After running it for couple of hours , the database is
 crashing on the slave.  This is on our production server.  Thanks for your
 help.

 Steps that I followed

 1.  SELECT pg_start_backup('label', true);
 2.  I created tar of data.
 3.  un tar the data on the slave.
 4. deleted the pid file on the slave
 5. created recovery.conf
 5. SELECT pg_stop_backup();  on the primary
 6.  started the postgres on slave
 7.  After 2 hours,  the slave server crashes

 Please help.

 Regards



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] table spaces

2013-03-09 Thread Gregg Jaskiewicz
Performance related question.
With Linux (CentOS 6.3+), 64-bit, ext4 in mind, how would you guys go about
distributing write load across disks?

Let's say I have quite a few disks, and I can partition them the way I want,
in a mirrored configuration (to get some hardware failure resilience). Should
I separate tables from indexes onto separate RAIDs?

I know WAL has to go on a separate disk, for added performance.

I'm looking for your experiences, and most importantly how you go about
deciding which way is best, i.e. which combinations make sense to try out
first, short of all permutations :-)

Thanks.



-- 
GJ


Re: [GENERAL] table spaces

2013-03-09 Thread Bèrto ëd Sèra
Hi Gregg

Yes, keep the indexes on a separate channel. Much depends on how the
data is mapped and accessed; sometimes even distributing the data
itself onto different tablespaces can help.
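
The mechanics are just tablespaces; a quick sketch (the mount point and
object names are made up):

    CREATE TABLESPACE idx_space LOCATION '/mnt/raid1_indexes/pgdata';
    ALTER INDEX some_big_index SET TABLESPACE idx_space;
    CREATE INDEX newer_idx ON some_table (some_col) TABLESPACE idx_space;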

If you do a lot of logging (say you feed a massive amount of activity
into pgFouine), you would want to have that on yet another separate
channel, too.

There is no silver bullet for this; iostat will eventually tell you
whether your load distribution is good enough or not.

Cheers
Bèrto

On 9 March 2013 17:51, Gregg Jaskiewicz gryz...@gmail.com wrote:
 Performance related question.
 With Linux (centos 6.3+), 64bit, ext4 in mind, how would you guys go about
 distributing write load across disks.

 Lets say I have quite few disks, and I can partition them the way I want, in
 mirror configuration (to get some hardware failure resilience). Should I
 separate tables from indexes onto separate raids ?

 I know WAL has to go on a separate disk, for added performance.

 I'm looking for your experiences, and most importantly how do you go about
 deciding which way is best. I.e. which combinations make sense to try out
 first, short of all permutations :-)

 Thanks.



 --
 GJ



-- 
==
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Splitting Postgres into Separate Clusters?

2013-03-09 Thread Paul Jungwirth
Hello,

I'm running a specialized search engine that indexes a few tens of millions
of web pages, keeping everything in Postgres, and one problem I'm starting
to see is poor cache hit rates. My database has two or three tables just
for the text of the scraped pages, with one row every time a page is
scraped and a `text` column for the HTML. These tables are
almost-but-not-quite write-only. They are only read by one job, which uses
them to create other objects in the system. I'd like the rest of the
database to be in-memory all the time, but I don't really care if I have to
read these tables from disk. To keep my problem tables from dragging down
performance on the rest of the system, I'm thinking of splitting them out.
I suppose I could save the HTML outside of Postgres entirely, but I'm
wondering if a nicer solution would be to keep these tables in a separate
cluster (i.e. /var/lib/postgresql/9.1/something_else -- I hope this is
the right terminology). Then I could tune that cluster differently from the
main cluster, or even put it on a different machine. And I could still use
dblink to query both clusters conveniently (I think; this isn't really that
important). Does this seem like a worthwhile approach? Is there a better
way to deal with a few out-sized tables?
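
Concretely, what I have in mind is something like this (a sketch; the port,
paths, and table/column names are placeholders):

    # a second cluster, initialized and tuned independently of the main one
    initdb -D /var/lib/postgresql/9.1/scraped_pages
    # set port = 5433 and a small shared_buffers in its postgresql.conf
    pg_ctl -D /var/lib/postgresql/9.1/scraped_pages -l scraped_pages.log start

    -- the one reader job pulls rows from it over dblink
    SELECT t.html
    FROM dblink('host=localhost port=5433 dbname=scrape',
                'SELECT html FROM pages WHERE id = 42')
         AS t(html text);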

Thanks,
Paul

-- 
_
Pulchritudo splendor veritatis.


[GENERAL] Avoiding a deadlock

2013-03-09 Thread Paul Jungwirth
I have a long-running multi-row UPDATE that is deadlocking with a
single-row UPDATE:

2013-03-09 11:07:51 CST ERROR:  deadlock detected
2013-03-09 11:07:51 CST DETAIL:  Process 18851 waits for ShareLock on
transaction 10307138; blocked by process 24203.
Process 24203 waits for ShareLock on transaction 10306996; blocked
by process 18851.
Process 18851: UPDATE  taggings tg
SET score_tier = COALESCE(x.perc, 0)
FROM(SELECT tg2.id,
percent_rank() OVER (PARTITION BY
tg2.tag_id ORDER BY tg2.score ASC) AS perc
 FROM   taggings tg2, tags t
 WHERE  tg2.score IS NOT NULL
 ANDtg2.tag_id = t.id
 ANDt.tier = 2) AS x
WHERE   tg.id = x.id
AND tg.score IS NOT NULL
;
Process 24203: UPDATE taggings SET score = 2 WHERE
taggings.id = 29105523

Note that these two queries are actually updating different columns, albeit
apparently in the same row.

Is there anything I can do to avoid a deadlock here? The big query does
nothing else in its transaction; the little query's transaction might
update several rows from `taggings`, which I guess is the real reason for
the deadlock.

I'd be pretty satisfied with approximate values for the big query. As you
can see, it is just taking the `score` of each `tagging` and computing the
percentage of times it beats other taggings of the same tag. Is there
something I can do with transaction isolation levels here? I don't care if
the big query operates on slightly-out-of-date values. Since each query
updates different columns, I think there should be no issue with them
overwriting each other, right?
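
One idea I've had, since approximate values are fine: compute the ranks
first, then apply them in short, id-ordered batches so each transaction
holds its row locks only briefly (a sketch; the batch bounds are
illustrative):

    -- stage the ranks outside any long transaction
    CREATE TEMP TABLE tagging_ranks AS
    SELECT tg2.id,
           percent_rank() OVER (PARTITION BY tg2.tag_id
                                ORDER BY tg2.score ASC) AS perc
    FROM   taggings tg2
    JOIN   tags t ON tg2.tag_id = t.id
    WHERE  tg2.score IS NOT NULL
    AND    t.tier = 2;

    -- apply in small, id-ordered batches; repeat for each id range
    UPDATE taggings tg
    SET    score_tier = COALESCE(r.perc, 0)
    FROM   tagging_ranks r
    WHERE  tg.id = r.id
    AND    tg.id BETWEEN 1 AND 10000;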

Thanks,
Paul



-- 
_
Pulchritudo splendor veritatis.


Re: [GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread akp geek
Thank you. As you mentioned, I understand that I am starting the streaming
from scratch, which is not what I wanted to do.

Here is what I am planning to do.

Our replication process has been down since March 5th.

1. Is it OK to copy all WALs from March 5th till now to the standby's pg_xlog?
2. take pg_basebackup
3. export the data dir
4. stop backup
5. restart the standby.

Based on my understanding it should work, because when the standby started,
it was not able to find the files, as I had deleted them from the slave when
I tried to fix the issue:


  0  2013-03-09 04:56:08 GMT LOG:  entering standby mode
cp: cannot access /backup/9.0.4/archive/0001010600E1
  0  2013-03-09 04:56:08 GMT LOG:  redo starts at 106/E120
cp: cannot access /backup/9.0.4/archive/0001010600E2
cp: cannot access /backup/9.0.4/archive/0001010600E3
cp: cannot access /backup/9.0.4/archive/0001010600E4
  0  2013-03-09 04:56:10 GMT LOG:  consistent recovery state reached at
106/E45AD4A8
  0  2013-03-09 04:56:10 GMT LOG:  database system is ready to accept
read only connections
cp: cannot access /backup/9.0.4/archive/0001010600E5
cp: cannot access /backup/9.0.4/archive/0001010600E6
cp: cannot access /backup/9.0.4/archive/0001010600E7
cp: cannot access /backup/9.0.4/archive/0001010600E8
cp: cannot access /backup/9.0.4/archive/0001010600E9
cp: cannot access /backup/9.0.4/archive/0001010600EA
  0  2013-03-09 04:56:12 GMT LOG:  invalid record length at 106/EA10B8C0
cp: cannot access /backup/9.0.4/archive/0001010600EA
  0  2013-03-09 04:56:12 GMT LOG:  streaming replication successfully
connected to primary
[unknown]  0  2013-03-09 04:57:00 GMT [unknown]LOG:  connection
received: host=[local]
[unknown]  0  2013-03-09 04:57:03 GMT [unknown]LOG:  connection
received: host=[local]

[unknown] 10.155.253.43(51257) 0 SELECT 2013-03-09 07:07:18 GMT
prodLog:  duration: 6316.649 ms
[unknown] 10.155.253.43(51257) 0 idle 2013-03-09 07:47:53 GMT prodLog:
 disconnection: session time: 0:41:06.529 user=postgres database=fprod host=
10.155.253.43 port=51257
  0  2013-03-09 07:55:48 GMT LOG:  restartpoint starting: time
  0  2013-03-09 08:25:47 GMT LOG:  restartpoint complete: wrote 19419
buffers (7.4%); write=1799.792 s, sync=0.066 s, total=1799.867 s
  0  2013-03-09 08:25:47 GMT LOG:  recovery restart point at
107/FB01B238
  0  2013-03-09 08:25:47 GMT DETAIL:  last completed transaction was at
log time 2013-03-09 08:25:41.85776+00
  0  2013-03-09 08:55:48 GMT LOG:  restartpoint starting: time
psql [local] 08P01 idle 2013-03-09 09:10:52 GMT prodLog:  unexpected EOF on
client connection
psql [local] 0 idle 2013-03-09 09:10:52 GMT prodLog:  disconnection:
session time: 2:15:06.351 user=postgres database=fprod host=[local]
  XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index page
in 100924/100937/1225845
  XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item, node:
100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata: F isleaf F
isdelete
F updateBlkno:36483
  0  2013-03-09 09:23:46 GMT LOG:  startup process (PID 3880) exited
with exit code 1
  0  2013-03-09 09:23:46 GMT LOG:  terminating any other active server
processes


thanks


On Sat, Mar 9, 2013 at 11:16 AM, Lonni J Friedman netll...@gmail.comwrote:

 That process merely sets up a new server, it doesn't start streaming,
 unless the server has been configured correctly.  You state that the
 slave crashed after two hours.  How did you make this determination?

 All you seem to be doing is setting up the slave from scratch
 repeatedly, and assuming that it will magically just work, rather than
 understanding  debugging why its not working.  Where is the log
 output from your servers that shows what is really transpiring?

 On Sat, Mar 9, 2013 at 6:51 AM, akp geek akpg...@gmail.com wrote:
  Hi all -
 
   I am in desperate need of your help. The replication/streaming
  stopped working on March5th. I followed the following procedure to
 restart
  the streaming. After running it for couple of hours , the database is
  crashing on the slave.  This is on our production server.  Thanks for
 your
  help.
 
  Steps that I followed
 
  1.  SELECT pg_start_backup('label', true);
  2.  I created tar of data.
  3.  un tar the data on the slave.
  4. deleted the pid file on the slave
  5. created recovery.conf
  5. SELECT pg_stop_backup();  on the primary
  6.  started the postgres on slave
  7.  After 2 hours,  the slave server crashes
 
  Please help.
 
  Regards
 



Re: [GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread Lonni J Friedman
On Sat, Mar 9, 2013 at 1:51 PM, akp geek akpg...@gmail.com wrote:
 thank you. As you mentioned, I understood that I am starting the streaming
 scratch which is not what I wanted to do.

 Here is what I am planning to .

 Our replication process was down since March5th.

 1. Is it Ok to get all wals from March5th till now to standby pg_xlog
 2. take pg_basebackup
 3. export the data dir
 4. stop backup
 5. restart the standby.

 Based on my understanding it should work, because when standby started, it
 is not able find the files, as I have deleted from the slave them when I
 tried to fix the issue

Clearly it didn't work because you've been having issues for the past
4 days, and you've already tried this approach unsuccessfully.  Based
on the log snippets below, it looks like you have multiple problems.
First your slave is definitely missing WAL segments.  You should
increase the number of WAL segments that are archived on the master
and ensure that your base backup is including the WAL segments (by
generating it with the -x option).  However, that's the least of your
problems at the moment, because it looks like the master is what is
crashing.
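
Roughly like this (a sketch; the segment count and role name are
placeholders, and pg_basebackup assumes a release that ships it, i.e. 9.1+):

    # postgresql.conf on the master: keep enough WAL around for the standby
    wal_keep_segments = 256

    # base backup that bundles the WAL needed to make it self-contained
    pg_basebackup -h master.example.com -U replication -D /srv/pgsql/standby -x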



   0  2013-03-09 04:56:08 GMT LOG:  entering standby mode
 cp: cannot access /backup/9.0.4/archive/0001010600E1
   0  2013-03-09 04:56:08 GMT LOG:  redo starts at 106/E120
 cp: cannot access /backup/9.0.4/archive/0001010600E2
 cp: cannot access /backup/9.0.4/archive/0001010600E3
 cp: cannot access /backup/9.0.4/archive/0001010600E4
   0  2013-03-09 04:56:10 GMT LOG:  consistent recovery state reached at
 106/E45AD4A8
   0  2013-03-09 04:56:10 GMT LOG:  database system is ready to accept
 read only connections
 cp: cannot access /backup/9.0.4/archive/0001010600E5
 cp: cannot access /backup/9.0.4/archive/0001010600E6
 cp: cannot access /backup/9.0.4/archive/0001010600E7
 cp: cannot access /backup/9.0.4/archive/0001010600E8
 cp: cannot access /backup/9.0.4/archive/0001010600E9
 cp: cannot access /backup/9.0.4/archive/0001010600EA
   0  2013-03-09 04:56:12 GMT LOG:  invalid record length at 106/EA10B8C0
 cp: cannot access /backup/9.0.4/archive/0001010600EA
   0  2013-03-09 04:56:12 GMT LOG:  streaming replication successfully
 connected to primary
 [unknown]  0  2013-03-09 04:57:00 GMT [unknown]LOG:  connection
 received: host=[local]
 [unknown]  0  2013-03-09 04:57:03 GMT [unknown]LOG:  connection
 received: host=[local]

 [unknown] 10.155.253.43(51257) 0 SELECT 2013-03-09 07:07:18 GMT prodLog:
 duration: 6316.649 ms
 [unknown] 10.155.253.43(51257) 0 idle 2013-03-09 07:47:53 GMT prodLog:
 disconnection: session time: 0:41:06.529 user=postgres database=fprod host=
 10.155.253.43 port=51257
   0  2013-03-09 07:55:48 GMT LOG:  restartpoint starting: time
   0  2013-03-09 08:25:47 GMT LOG:  restartpoint complete: wrote 19419
 buffers (7.4%); write=1799.792 s, sync=0.066 s, total=1799.867 s
   0  2013-03-09 08:25:47 GMT LOG:  recovery restart point at
 107/FB01B238
   0  2013-03-09 08:25:47 GMT DETAIL:  last completed transaction was at
 log time 2013-03-09 08:25:41.85776+00
   0  2013-03-09 08:55:48 GMT LOG:  restartpoint starting: time
 psql [local] 08P01 idle 2013-03-09 09:10:52 GMT prodLog:  unexpected EOF on
 client connection
 psql [local] 0 idle 2013-03-09 09:10:52 GMT prodLog:  disconnection:
 session time: 2:15:06.351 user=postgres database=fprod host=[local]
   XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index page in
 100924/100937/1225845
   XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item, node:
 100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata: F isleaf F
 isdelete
 F updateBlkno:36483
   0  2013-03-09 09:23:46 GMT LOG:  startup process (PID 3880) exited
 with exit code 1
   0  2013-03-09 09:23:46 GMT LOG:  terminating any other active server
 processes

The real problem is here:
  XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index
page in 100924/100937/1225845
  XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item,
node: 100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata:
F isleaf F isdelete F updateBlkno:36483

That looks like something is badly broken (maybe data corruption)?
Most of the google hits on that error are associated with GIST
indexes.  Are you using GIST indexes?

Are you really (still) using 9.0.4 ?  I hope you realize that there
are known data corruption bugs in that version, and that version is
super old at this point.  You really need to update to 9.0.12.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread akp geek
Appreciate your findings.  Taking your points and acting on them now.  We can
upgrade; which version is more stable, 9.2 or 9.1?

We use GiST indexes quite a bit, and GIS as well.

I recently compiled postgres 9.2.

Regards



On Sat, Mar 9, 2013 at 5:09 PM, Lonni J Friedman netll...@gmail.com wrote:

 On Sat, Mar 9, 2013 at 1:51 PM, akp geek akpg...@gmail.com wrote:
  thank you. As you mentioned, I understood that I am starting the
 streaming
  scratch which is not what I wanted to do.
 
  Here is what I am planning to .
 
  Our replication process was down since March5th.
 
  1. Is it Ok to get all wals from March5th till now to standby pg_xlog
  2. take pg_basebackup
  3. export the data dir
  4. stop backup
  5. restart the standby.
 
  Based on my understanding it should work, because when standby started,
 it
  is not able find the files, as I have deleted from the slave them when I
  tried to fix the issue

 Clearly it didn't work because you've been having issues for the past
 4 days, and you've already tried this approach unsuccessfully.  Based
 on the log snippets below, it looks like you have multiple problems.
 First your slave is definitely missing WAL segments.  You should
 increase the number of WAL segments that are archived on the master
 and ensure that your base backup is including the WAL segments (by
 generating it with the -x option).  However, that's the least of your
 problems at the moment, because it looks like the master is what is
 crashing.

 
 
0  2013-03-09 04:56:08 GMT LOG:  entering standby mode
  cp: cannot access /backup/9.0.4/archive/0001010600E1
0  2013-03-09 04:56:08 GMT LOG:  redo starts at 106/E120
  cp: cannot access /backup/9.0.4/archive/0001010600E2
  cp: cannot access /backup/9.0.4/archive/0001010600E3
  cp: cannot access /backup/9.0.4/archive/0001010600E4
0  2013-03-09 04:56:10 GMT LOG:  consistent recovery state reached
 at
  106/E45AD4A8
0  2013-03-09 04:56:10 GMT LOG:  database system is ready to accept
  read only connections
  cp: cannot access /backup/9.0.4/archive/0001010600E5
  cp: cannot access /backup/9.0.4/archive/0001010600E6
  cp: cannot access /backup/9.0.4/archive/0001010600E7
  cp: cannot access /backup/9.0.4/archive/0001010600E8
  cp: cannot access /backup/9.0.4/archive/0001010600E9
  cp: cannot access /backup/9.0.4/archive/0001010600EA
0  2013-03-09 04:56:12 GMT LOG:  invalid record length at
 106/EA10B8C0
  cp: cannot access /backup/9.0.4/archive/0001010600EA
0  2013-03-09 04:56:12 GMT LOG:  streaming replication successfully
  connected to primary
  [unknown]  0  2013-03-09 04:57:00 GMT [unknown]LOG:  connection
  received: host=[local]
  [unknown]  0  2013-03-09 04:57:03 GMT [unknown]LOG:  connection
  received: host=[local]
 
  [unknown] 10.155.253.43(51257) 0 SELECT 2013-03-09 07:07:18 GMT
 prodLog:
  duration: 6316.649 ms
  [unknown] 10.155.253.43(51257) 0 idle 2013-03-09 07:47:53 GMT
 prodLog:
  disconnection: session time: 0:41:06.529 user=postgres database=fprod
 host=
  10.155.253.43 port=51257
0  2013-03-09 07:55:48 GMT LOG:  restartpoint starting: time
0  2013-03-09 08:25:47 GMT LOG:  restartpoint complete: wrote 19419
  buffers (7.4%); write=1799.792 s, sync=0.066 s, total=1799.867 s
0  2013-03-09 08:25:47 GMT LOG:  recovery restart point at
  107/FB01B238
0  2013-03-09 08:25:47 GMT DETAIL:  last completed transaction was
 at
  log time 2013-03-09 08:25:41.85776+00
0  2013-03-09 08:55:48 GMT LOG:  restartpoint starting: time
  psql [local] 08P01 idle 2013-03-09 09:10:52 GMT prodLog:  unexpected EOF
 on
  client connection
  psql [local] 0 idle 2013-03-09 09:10:52 GMT prodLog:  disconnection:
  session time: 2:15:06.351 user=postgres database=fprod host=[local]
XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index
 page in
  100924/100937/1225845
XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item, node:
  100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata: F isleaf
 F
  isdelete
  F updateBlkno:36483
0  2013-03-09 09:23:46 GMT LOG:  startup process (PID 3880) exited
  with exit code 1
0  2013-03-09 09:23:46 GMT LOG:  terminating any other active
 server
  processes

 The real problem is here:
   XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index
 page in 100924/100937/1225845
   XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item,
 node: 100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata:
 F isleaf F isdelete F updateBlkno:36483

 That looks like something is badly broken (maybe data corruption)?
 Most of the google hits on that error are associated with GIST
 indexes.  Are you using GIST indexes?

 Are you really (still) using 9.0.4 ?  I hope you realize that there
 are known data corruption bugs 

Re: [GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread Lonni J Friedman
On Sat, Mar 9, 2013 at 4:05 PM, akp geek akpg...@gmail.com wrote:
 Appreciate your findings.  taking your points and doing things now.  We can
 upgrade. Which version is more stable 9.2 or 9.1

They're two entirely different branches.  They should both be equally
stable.  However, if you're looking for the most straightforward path
I'd recommend going to 9.0.12.  Also be sure to read the release notes
first.


 We use GIST indexes quite a bit. and we gis also

 I recently compiled postgres 9.2 ..

 Regards



 On Sat, Mar 9, 2013 at 5:09 PM, Lonni J Friedman netll...@gmail.com wrote:

 On Sat, Mar 9, 2013 at 1:51 PM, akp geek akpg...@gmail.com wrote:
  thank you. As you mentioned, I understood that I am starting the
  streaming
  scratch which is not what I wanted to do.
 
  Here is what I am planning to .
 
  Our replication process was down since March5th.
 
  1. Is it Ok to get all wals from March5th till now to standby pg_xlog
  2. take pg_basebackup
  3. export the data dir
  4. stop backup
  5. restart the standby.
 
  Based on my understanding it should work, because when standby started,
  it
  is not able find the files, as I have deleted from the slave them when I
  tried to fix the issue

 Clearly it didn't work because you've been having issues for the past
 4 days, and you've already tried this approach unsuccessfully.  Based
 on the log snippets below, it looks like you have multiple problems.
 First your slave is definitely missing WAL segments.  You should
 increase the number of WAL segments that are archived on the master
 and ensure that your base backup is including the WAL segments (by
 generating it with the -x option).  However, that's the least of your
 problems at the moment, because it looks like the master is what is
 crashing.

 
 
0  2013-03-09 04:56:08 GMT LOG:  entering standby mode
  cp: cannot access /backup/9.0.4/archive/0001010600E1
0  2013-03-09 04:56:08 GMT LOG:  redo starts at 106/E120
  cp: cannot access /backup/9.0.4/archive/0001010600E2
  cp: cannot access /backup/9.0.4/archive/0001010600E3
  cp: cannot access /backup/9.0.4/archive/0001010600E4
0  2013-03-09 04:56:10 GMT LOG:  consistent recovery state reached
  at
  106/E45AD4A8
0  2013-03-09 04:56:10 GMT LOG:  database system is ready to
  accept
  read only connections
  cp: cannot access /backup/9.0.4/archive/0001010600E5
  cp: cannot access /backup/9.0.4/archive/0001010600E6
  cp: cannot access /backup/9.0.4/archive/0001010600E7
  cp: cannot access /backup/9.0.4/archive/0001010600E8
  cp: cannot access /backup/9.0.4/archive/0001010600E9
  cp: cannot access /backup/9.0.4/archive/0001010600EA
0  2013-03-09 04:56:12 GMT LOG:  invalid record length at
  106/EA10B8C0
  cp: cannot access /backup/9.0.4/archive/0001010600EA
0  2013-03-09 04:56:12 GMT LOG:  streaming replication
  successfully
  connected to primary
  [unknown]  0  2013-03-09 04:57:00 GMT [unknown]LOG:  connection
  received: host=[local]
  [unknown]  0  2013-03-09 04:57:03 GMT [unknown]LOG:  connection
  received: host=[local]
 
  [unknown] 10.155.253.43(51257) 0 SELECT 2013-03-09 07:07:18 GMT
  prodLog:
  duration: 6316.649 ms
  [unknown] 10.155.253.43(51257) 0 idle 2013-03-09 07:47:53 GMT
  prodLog:
  disconnection: session time: 0:41:06.529 user=postgres database=fprod
  host=
  10.155.253.43 port=51257
0  2013-03-09 07:55:48 GMT LOG:  restartpoint starting: time
0  2013-03-09 08:25:47 GMT LOG:  restartpoint complete: wrote
  19419
  buffers (7.4%); write=1799.792 s, sync=0.066 s, total=1799.867 s
0  2013-03-09 08:25:47 GMT LOG:  recovery restart point at
  107/FB01B238
0  2013-03-09 08:25:47 GMT DETAIL:  last completed transaction was
  at
  log time 2013-03-09 08:25:41.85776+00
0  2013-03-09 08:55:48 GMT LOG:  restartpoint starting: time
  psql [local] 08P01 idle 2013-03-09 09:10:52 GMT prodLog:  unexpected EOF
  on
  client connection
  psql [local] 0 idle 2013-03-09 09:10:52 GMT prodLog:  disconnection:
  session time: 2:15:06.351 user=postgres database=fprod host=[local]
XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index
  page in
  100924/100937/1225845
XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item, node:
  100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata: F isleaf
  F
  isdelete
  F updateBlkno:36483
0  2013-03-09 09:23:46 GMT LOG:  startup process (PID 3880) exited
  with exit code 1
0  2013-03-09 09:23:46 GMT LOG:  terminating any other active
  server
  processes

 The real problem is here:
   XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index
 page in 100924/100937/1225845
   XX000  2013-03-09 09:23:46 GMT CONTEXT:  xlog redo Insert item,
 node: 100924/100937/1225845 blkno: 72666 offset: 234 nitem: 1 isdata:
 F 

Re: [GENERAL] selecting for type cast failures

2013-03-09 Thread David Johnston
Adrian Klaver-3 wrote
 My opinion, it would take more time to concoct regexes that cover all 
 the corner cases than to write a script that walks the through the data 
 , finds the problem data and flags them.

ISTM that using regular expressions is necessary regardless of whether you
put them into a function/script or otherwise use them interactively via
queries...
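
For example, to flag values that would fail an integer cast (the table and
column names here are hypothetical):

    SELECT id, raw_value
    FROM   staging_table
    WHERE  raw_value !~ '^\s*-?[0-9]+\s*$';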

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/selecting-for-type-cast-failures-tp5747875p5747890.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] crosstab creating multiple rows for same id

2013-03-09 Thread David Johnston
You provide too little information for anything more than a guess...

Are you sure your issue isn't simply a client display consideration -
basically word-wrapping?

David J.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/crosstab-creating-multiple-rows-for-same-id-tp5747865p5747883.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL connect with Visual C++

2013-03-09 Thread dhaval257
I worked through the link you sent. It ends up giving the error "This
application can not start because SSLEAY32.dll is missing". I want to connect
in any form, whether it is ODBC or direct. If you know how, please help me. I
am stuck here. Thank you



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/PostgreSQL-connect-with-Visual-C-tp5747463p5747642.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres 9.0.2 replicated database is crashing

2013-03-09 Thread akp geek
thanks a lot.  One final question on this.

If the streaming fails, as in our case, we missed 4 days' worth of data.  We
archive the log files on the primary.  If we were to restart, what should the
starting point be?

Do we need to get all the archived logs into pg_xlog starting from that date
(in our case 03/05)?  This is where I got confused.
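
For reference, the standby's recovery.conf is roughly this (the
primary_conninfo values below are placeholders; the restore_command path is
the one from the logs above):

    standby_mode     = 'on'
    primary_conninfo = 'host=primary.example.com port=5432 user=replication'
    restore_command  = 'cp /backup/9.0.4/archive/%f %p'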

Again thanks a lot. Now we will definitely upgrade

Regards



On Sat, Mar 9, 2013 at 7:11 PM, Lonni J Friedman netll...@gmail.com wrote:

 On Sat, Mar 9, 2013 at 4:05 PM, akp geek akpg...@gmail.com wrote:
  Appreciate your findings.  taking your points and doing things now.  We
 can
  upgrade. Which version is more stable 9.2 or 9.1

 They're two entirely different branches.  They should both be equally
 stable.  However, if you're looking for the most straightforward path
 I'd recommend going to 9.0.12.  Also be sure to read the release notes
 first.

 
  We use GIST indexes quite a bit. and we gis also
 
  I recently compiled postgres 9.2 ..
 
  Regards
 
 
 
  On Sat, Mar 9, 2013 at 5:09 PM, Lonni J Friedman netll...@gmail.com
 wrote:
 
  On Sat, Mar 9, 2013 at 1:51 PM, akp geek akpg...@gmail.com wrote:
   thank you. As you mentioned, I understood that I am starting the
   streaming
   scratch which is not what I wanted to do.
  
   Here is what I am planning to .
  
   Our replication process was down since March5th.
  
   1. Is it Ok to get all wals from March5th till now to standby pg_xlog
   2. take pg_basebackup
   3. export the data dir
   4. stop backup
   5. restart the standby.
  
   Based on my understanding it should work, because when standby
 started,
   it
   is not able find the files, as I have deleted from the slave them
 when I
   tried to fix the issue
 
  Clearly it didn't work because you've been having issues for the past
  4 days, and you've already tried this approach unsuccessfully.  Based
  on the log snippets below, it looks like you have multiple problems.
  First your slave is definitely missing WAL segments.  You should
  increase the number of WAL segments that are archived on the master
  and ensure that your base backup is including the WAL segments (by
  generating it with the -x option).  However, that's the least of your
  problems at the moment, because it looks like the master is what is
  crashing.
 
  
  
 0  2013-03-09 04:56:08 GMT LOG:  entering standby mode
   cp: cannot access /backup/9.0.4/archive/0001010600E1
 0  2013-03-09 04:56:08 GMT LOG:  redo starts at 106/E120
   cp: cannot access /backup/9.0.4/archive/0001010600E2
   cp: cannot access /backup/9.0.4/archive/0001010600E3
   cp: cannot access /backup/9.0.4/archive/0001010600E4
 0  2013-03-09 04:56:10 GMT LOG:  consistent recovery state
 reached
   at
   106/E45AD4A8
 0  2013-03-09 04:56:10 GMT LOG:  database system is ready to
   accept
   read only connections
   cp: cannot access /backup/9.0.4/archive/0001010600E5
   cp: cannot access /backup/9.0.4/archive/0001010600E6
   cp: cannot access /backup/9.0.4/archive/0001010600E7
   cp: cannot access /backup/9.0.4/archive/0001010600E8
   cp: cannot access /backup/9.0.4/archive/0001010600E9
   cp: cannot access /backup/9.0.4/archive/0001010600EA
 0  2013-03-09 04:56:12 GMT LOG:  invalid record length at
   106/EA10B8C0
   cp: cannot access /backup/9.0.4/archive/0001010600EA
 0  2013-03-09 04:56:12 GMT LOG:  streaming replication
   successfully
   connected to primary
   [unknown]  0  2013-03-09 04:57:00 GMT [unknown]LOG:  connection
   received: host=[local]
   [unknown]  0  2013-03-09 04:57:03 GMT [unknown]LOG:  connection
   received: host=[local]
  
   [unknown] 10.155.253.43(51257) 0 SELECT 2013-03-09 07:07:18 GMT
   prodLog:
   duration: 6316.649 ms
   [unknown] 10.155.253.43(51257) 0 idle 2013-03-09 07:47:53 GMT
   prodLog:
   disconnection: session time: 0:41:06.529 user=postgres database=fprod
   host=
   10.155.253.43 port=51257
 0  2013-03-09 07:55:48 GMT LOG:  restartpoint starting: time
 0  2013-03-09 08:25:47 GMT LOG:  restartpoint complete: wrote
   19419
   buffers (7.4%); write=1799.792 s, sync=0.066 s, total=1799.867 s
 0  2013-03-09 08:25:47 GMT LOG:  recovery restart point at
   107/FB01B238
 0  2013-03-09 08:25:47 GMT DETAIL:  last completed transaction
 was
   at
   log time 2013-03-09 08:25:41.85776+00
 0  2013-03-09 08:55:48 GMT LOG:  restartpoint starting: time
   psql [local] 08P01 idle 2013-03-09 09:10:52 GMT prodLog:  unexpected
 EOF
   on
   client connection
   psql [local] 0 idle 2013-03-09 09:10:52 GMT prodLog:
  disconnection:
   session time: 2:15:06.351 user=postgres database=fprod host=[local]
 XX000  2013-03-09 09:23:46 GMT FATAL:  failed to add item to index
   page in
   100924/100937/1225845
 XX000  2013-03-09 09:23:46 GMT CONTEXT:  

Re: [GENERAL] table spaces

2013-03-09 Thread Scott Marlowe
On Sat, Mar 9, 2013 at 10:51 AM, Gregg Jaskiewicz gryz...@gmail.com wrote:
 Performance related question.
 With Linux (centos 6.3+), 64bit, ext4 in mind, how would you guys go about
 distributing write load across disks.

 Lets say I have quite few disks, and I can partition them the way I want, in
 mirror configuration (to get some hardware failure resilience). Should I
 separate tables from indexes onto separate raids ?

 I know WAL has to go on a separate disk, for added performance.

 I'm looking for your experiences, and most importantly how do you go about
 deciding which way is best. I.e. which combinations make sense to try out
 first, short of all permutations :-)

First get a baseline for how things work with just pg_xlog on one
small set (RAID 1 is often plenty) and RAID-10 on all the rest, with
all the data (i.e. the base directory) there. With a fast HW RAID
controller this is often just about as fast as any amount of breaking
things out will be. But if you do break things out and they are faster,
then you'll know by how much. If it's slower then you know you've got
a really busy set and some not so busy ones. And so on...
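
For the baseline itself, something simple like pgbench is usually enough
(the scale and client counts here are just illustrative):

    createdb benchdb
    pgbench -i -s 100 benchdb           # load test data at scale factor 100
    pgbench -c 16 -j 4 -T 300 benchdb   # 16 clients, 4 threads, 5 minutes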


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general