Re: [GENERAL] FATAL: database a/system_data does not exist
This error is getting logged at an interval of 2 minutes and 10 seconds:

2013-05-10 00:22:50 GMT:[4180]FATAL: database a/system_data does not exist
2013-05-10 00:25:00 GMT:[4657]FATAL: database a/system_data does not exist
2013-05-10 00:27:10 GMT:[5394]FATAL: database a/system_data does not exist

--
View this message in context: http://postgresql.1045698.n5.nabble.com/FATAL-database-a-system-data-does-not-exist-tp5754839p5754975.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Large amount of serialization errors in transactions
Hi! I have a problem understanding how transactions with the serializable isolation level work in PostgreSQL. What exactly may cause a serialization error?

My problem is a system where process one adds data to a database. Shortly afterwards, process two reads and possibly modifies the same data (keys are not touched). When large amounts of data arrive at about the same time, I get loads of serialization errors in process two. In a perfect world this should not happen, since the data entered by process one is almost always different from the data that is simultaneously being read and written by process two.

I have tried increasing max_pred_locks_per_transaction, but it seems to have no effect. I do retry the transactions, and eventually they succeed, but my concern here is the number of errors I get at certain periods. Am I expecting too much of serializable isolation level transactions, or is there anything else that I am missing?

- Vegard
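The retry the poster describes is usually wrapped in a small helper. A minimal sketch in Python follows; SerializationFailure here is a stand-in for the driver's serialization-failure error (SQLSTATE 40001 in PostgreSQL - with psycopg2 you would catch the exception whose pgcode is '40001'), and flaky_txn simulates a transaction that fails twice before committing:

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for the driver's serialization-failure error (SQLSTATE 40001)."""

def run_with_retry(txn, max_attempts=5):
    """Run a transaction callable, retrying serialization failures
    with a small randomized backoff between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            time.sleep(random.uniform(0, 0.05 * attempt))

# Simulated transaction that fails twice before succeeding.
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"

print(run_with_retry(flaky_txn))  # prints "committed" after two retries
```

The backoff matters in practice: retrying immediately tends to make the same transactions collide again.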
Re: [GENERAL] Storing small image files
Hi Nelson. I worked with images and PostgreSQL, and want to add some comments:

On Thu, 9 May 2013 13:40:15 -0500 Nelson Green nelsongree...@gmail.com wrote:
> OK, this is kind of convoluted, but I got a couple of test cases that work for me. The steps to make the first one are below. First I took one of the photos and shrunk it real small using GIMP.

If you want to manipulate images automatically, don't use GIMP; use ImageMagick (for shell scripts) or OpenCV (for C source code).

> Then I manually converted that to a base64 encoded text file: /usr/bin/base64 test.jpg > test.64

If you must use the pg shell, perhaps coding Misa's function in another language (Python, for example) would allow you to insert the bytea directly.

A usage hint: disable TOAST compression for that table; images are already compressed, so you don't need to waste time with it.

> That outputs a base64 string that matches test.64. Outputting that to a file and then converting it back gives me my image: /usr/bin/base64 -d output.64 > newtest.jpg
> Like I said, kind of crazy, but it satisfies me that my basic premise is doable. I'll still get one of the front-end developers to whip out some PHP just to be safe. Thanks to all!

--- ---
Eduardo Morras emorr...@yahoo.es
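The base64 round trip Nelson does on the shell can be sketched with Python's standard library (the bytes here are a placeholder, not a real JPEG):

```python
import base64

# Encode image bytes to base64 text, as `base64 test.jpg > test.64` does.
original = b"\xff\xd8\xff\xe0 placeholder image bytes"
encoded = base64.b64encode(original)

# Decode back, as `base64 -d output.64 > newtest.jpg` does.
decoded = base64.b64decode(encoded)

assert decoded == original  # the round trip is lossless
```

The decoded bytes are what would land in the bytea column, e.g. via PostgreSQL's decode(..., 'base64') on the server side.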
Re: [GENERAL] Storing small image files
Nelson Green wrote on 09.05.2013 19:05:
> On Thu, May 9, 2013 at 10:51 AM, Achilleas Mantzios wrote:
>> then here: http://www.dbforums.com/postgresql/1666200-insert-jpeg-files-into-bytea-column.html
> Thanks Achilleas. I usually do the physical design in vi using sql scripts, and I like to include a couple of inserts and selects to make sure everything is going according to plan. It looks like I may just have to work with a front-end developer for this particular instance. Of all the stupid things, in all of my years doing this I've never once had to work with storing binary files, other than years ago when I was studying for some of the MySQL certs.

The thread from DbForums links to the SQL tool I'm maintaining, SQL Workbench/J: http://www.sql-workbench.net

I assume the image files are stored on the client where you run the SQL rather than on the Postgres server, right? If you can use a different SQL client than psql, then SQL Workbench is probably the easiest way to solve this. I added that extended (proprietary) SQL syntax exactly for this purpose. Your statement would become:

INSERT INTO security_badge
VALUES ('PK00880918',
        (SELECT employee_id
           FROM employee
          WHERE employee_lastname = 'Kendell'
            AND employee_firstname = 'Paul'),
        {$blobfile='/path/to/test.jpg'});

The /path/to/test.jpg is local to the computer where SQL Workbench is running.

SQL Workbench is not only usable as a GUI application but also in console mode (similar to psql) or in batch mode to run scripts automatically. For bulk loading, the SQL Workbench specific WbImport command also supports text files that contain a filename to be stored into a bytea column.

All this support is for bytea only; it does not support large objects, but as you are storing small images, bytea is the better choice anyway.

Regards
Thomas
[GENERAL] PL/R etc.
I don't know whether anybody active on the list has R (and in particular PL/R) experience, but just in case... :-)

i) Something like APL can operate on an array with minimal regard for index order, i.e. operations across the array are as easily expressed and as efficient as operations down the array. Does this apply to PL/R?

ii) Things like OpenOffice can be very inefficient if operating over a table comprising a non-trivial number of rows. Does PL/R offer a significant improvement, e.g. by using a cursor rather than trying to read an entire resultset into memory?

-- Mark Morgan Lloyd markMLl .AT. telemetry.co .DOT. uk [Opinions above are the author's, not those of his employers or colleagues]
Re: [GENERAL] Storing small image files
2013/5/10 Eduardo Morras emorr...@yahoo.es:
> If you want to manipulate images automatically, don't use GIMP, use ImageMagick (for shell scripts) or OpenCV (for C source code)
> If you must use the pg shell, perhaps coding Misa's function in another language (Python, for example) allows you to directly insert the bytea.

Well, with PL/Python there is more power - no need to use lo_* large objects... However, the solution for him is not in a function, because the file is on a different machine than Postgres... I think Nelson has found a solution that works for him... But now, what Thomas Kellerer suggested sounds like a very good approach if the user doesn't want to write his own client code...

> A usage hint: disable TOAST compression for that table, images are already compressed, you don't need to waste time with it.
Re: [GENERAL] FATAL: database a/system_data does not exist
On Fri, May 10, 2013 at 5:13 PM, sumita su...@avaya.com wrote:
> This error is getting logged at an interval of 2 minutes and 10 seconds
> 2013-05-10 00:22:50 GMT:[4180]FATAL: database a/system_data does not exist
> 2013-05-10 00:25:00 GMT:[4657]FATAL: database a/system_data does not exist
> 2013-05-10 00:27:10 GMT:[5394]FATAL: database a/system_data does not exist

Sounds like you have an app in a retry loop of some sort. Does the figure 130 seconds call anything to mind? Would you have written anything to delay that long before reconnecting to the database?

ChrisA
Re: [GENERAL] Storing small image files
On Fri, May 10, 2013 at 2:59 AM, Eduardo Morras emorr...@yahoo.es wrote:
> If you want to manipulate images automatically, don't use GIMP, use ImageMagick (for shell scripts) or OpenCV (for C source code)
> If you must use the pg shell, perhaps coding Misa's function in another language (Python, for example) allows you to directly insert the bytea.
> A usage hint: disable TOAST compression for that table, images are already compressed, you don't need to waste time with it.

Thanks Eduardo, I should have caught that, but good advice. I appreciate it.
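Eduardo's TOAST hint is applied per column. A minimal sketch (the table and column names are illustrative, not from the thread):

```sql
-- EXTERNAL keeps out-of-line TOAST storage for large values but skips
-- the compression step, which is wasted effort on already-compressed JPEGs.
ALTER TABLE security_badge
    ALTER COLUMN badge_photo SET STORAGE EXTERNAL;
```

The default strategy for bytea is EXTENDED (compressed and out-of-line); EXTERNAL trades a little disk space for faster reads and writes of incompressible data.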
Re: [GENERAL] Storing small image files
On Fri, May 10, 2013 at 5:24 AM, Misa Simic misa.si...@gmail.com wrote:
> Well, with PL/Python there is more power - no need to use lo_* large objects... However, the solution for him is not in a function, because the file is on a different machine than Postgres... I think Nelson has found a solution that works for him...

I did, but I have made note of your and Thomas's suggestions. I just wanted to be able to verify that my DDL had done what I meant with a simple insert and select. The real work will be done via a PHP front-end. Thanks everyone!
Re: [GENERAL] FATAL: database a/system_data does not exist
On 05/10/2013 12:13 AM, sumita wrote:
> This error is getting logged at an interval of 2 minutes and 10 seconds
> 2013-05-10 00:22:50 GMT:[4180]FATAL: database a/system_data does not exist
> 2013-05-10 00:25:00 GMT:[4657]FATAL: database a/system_data does not exist
> 2013-05-10 00:27:10 GMT:[5394]FATAL: database a/system_data does not exist

A suggestion: turn up your log_statement to 'all' to see if you can catch what is triggering the error.

-- Adrian Klaver adrian.kla...@gmail.com
Re: [GENERAL] PL/R etc.
On Fri, May 10, 2013 at 4:41 AM, Mark Morgan Lloyd markmll.pgsql-gene...@telemetry.co.uk wrote:
> i) Something like APL can operate on an array with minimal regard for index order, i.e. operations across the array are as easily-expressed and as efficient as operations down the array. Does this apply to PL/R?
> ii) Things like OpenOffice can be very inefficient if operating over a table comprising a non-trivial number of rows. Does PL/R offer a significant improvement, e.g. by using a cursor rather than trying to read an entire resultset into memory?

PL/R (via R) is very terse and expressive. It will probably meet or beat any performance expectations you have coming from OpenOffice. That said, it's definitely a memory-bound language; typically, problem solving involves stuffing data into huge data frames which are then passed to high-level problem-solving functions like glm. You have full access to SQL within the PL/R function, so nothing is keeping you from paging data into the frame via a cursor, but that only helps so much. A lot depends on the specific problem you solve, of course.

merlin
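A minimal sketch of what Merlin describes - a PL/R function that pulls a result set into an R data frame via SQL run from inside the function. This assumes the PL/R extension is installed; the table and column names are illustrative:

```sql
CREATE OR REPLACE FUNCTION mean_measurement()
RETURNS float8 AS $$
    -- pg.spi.exec runs a query inside the function and returns a data frame,
    -- so the whole result set is pulled into memory before R operates on it.
    df <- pg.spi.exec("SELECT measurement FROM readings")
    mean(df$measurement)
$$ LANGUAGE plr;
```

For larger sets, PL/R also exposes the cursor-style pg.spi.prepare/pg.spi.cursor_fetch interface, which is the paging approach Merlin alludes to.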
[GENERAL] PG in cash till machines
Hi,

We are developing a solution which will run in thousands of small cash till machines running Linux, and we would like to use PostgreSQL, but there is an insecurity feeling regarding the solution, basically because these boxes would be exposed to an insecure environment and insecure procedures: uncontrolled power outages, untrusted hardware, no appropriate environment for the hardware (dusty, hot), etc.

Our major concern is related to the write-back issues we can face in this environment. Is there a way to prevent it, or a way PG can be configured to detect write-back configurations? Don't think so, but...

Basically, the question is: is there a way to guarantee a reliable PG database in an untrusted and insecure environment? Maybe some kind of file system could give this guarantee, not sure...

-- Reimer carlos.rei...@opendb.com.br
Re: [GENERAL] FATAL: database a/system_data does not exist
Adrian Klaver adrian.kla...@gmail.com writes:
> A suggestion, turn up your log_statement to 'all' to see if you can catch what is triggering the error.

log_connections would probably be more useful, since this looks like a connection-time failure. That would at least tell you what machine was issuing the bogus connection attempts.

regards, tom lane
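The two logging settings suggested in this thread go in postgresql.conf and take effect on reload. A sketch (the log_line_prefix value is an illustrative choice, not from the thread):

```
# postgresql.conf - settings discussed in this thread
log_connections = on            # log every connection attempt
log_statement = 'all'           # log every statement (verbose; debugging only)
log_line_prefix = '%t [%p] %h ' # timestamp, PID, client host in each line
```

With log_connections on and %h in the prefix, the log will show which host keeps asking for the nonexistent database.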
Re: [GENERAL] LONG delete with LOTS of FK's
On 2013-05-09 16:43, Larry Rosenman wrote:
> On 2013-05-09 16:40, Tom Lane wrote:
>> Larry Rosenman l...@lerctr.org writes:
>>> On 2013-05-09 16:22, Tom Lane wrote:
>>>> Perhaps it's blocked on a lock? Did you look into pg_locks? Did you note whether the process was consuming CPU time and/or doing IO?
>>> all the locks were clear, and it was consuming CPU and doing I/O (D-S-D state), etc.
>> Hm. I'm suspicious that you still ended up with a seqscan checking plan. Was this session started after you added all the missing indexes? If not, it seems possible that it was using a bad pre-cached plan.
>> regards, tom lane

I added the indexes last Friday, and we've done a number of vacuumdb -zav's (every night) since then. So if there's a cached plan, it's not from me. (We also restarted our app on Saturday night.)

Any ideas on how to figure out whether we ARE getting seqscan check plans, and how to fix it?

-- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 (c) E-Mail: l...@lerctr.org US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
Re: [GENERAL] PG in cash till machines
On Fri, May 10, 2013 at 8:43 AM, Carlos Henrique Reimer carlos.rei...@opendb.com.br wrote:
> Our major concern is related to the write-back issues we can face in this environment. Is there a way to prevent it or a way PG can be configured to detect write-back configurations? Don't think so, but...

This is not the job of postgres, but of the operating system (and possibly various hardware attached to the server). In practice, it's very possible to configure things so that a full flush all the way to physical storage is guaranteed -- in fact that's the default behavior for many configurations. So this part, at least, is trivially done.

> Basically, the question is: is there a way to guaranty a reliable PG database in an untrusted and insecure environment? Maybe some kind of file system could give this guaranty, not sure..

Again, this has more to do with the hardware, (especially) the operating environment, and the operating system than postgres itself.

merlin
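For reference, the PostgreSQL side of the durability picture is a handful of settings that should stay at their defaults on machines like these. A sketch of the relevant postgresql.conf lines:

```
# postgresql.conf - durability settings (these are the defaults)
fsync = on               # force WAL writes to stable storage
synchronous_commit = on  # don't report commit before the WAL is flushed
full_page_writes = on    # protect against torn pages on power loss
```

As Merlin says, these only help if the OS and drives honor the flush; a drive with a volatile write-back cache can still lose acknowledged commits on power failure.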
[GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
Hello.

We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with hardware RAID.

My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6 on a server with SSD drives in a hardware RAID 1 configuration? Will there be any compatibility problems (seems unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the hardware?

The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use of this hardware and software?

The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 server with P420i RAID controller, and two MLC SSDs in RAID 1 for the OS, and two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary to have used this specific hardware setup in order to have experience with these general issues. The P420i controller appears to be compatible with recent versions of CentOS, so drivers should not be a concern (hopefully).

Any insights anyone can offer on these issues would be most welcome.

Regards, Matt.
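For context, the SSD-specific tuning usually discussed comes down to a couple of planner and I/O settings in postgresql.conf. A sketch with illustrative starting values (not recommendations from this thread):

```
# postgresql.conf - common SSD-oriented knobs (illustrative values)
random_page_cost = 1.1          # random reads cost nearly the same as
                                # sequential reads on SSD (default 4.0)
effective_io_concurrency = 200  # SSDs service many concurrent requests
```

Lowering random_page_cost makes the planner more willing to use index scans, which matches SSD behavior much better than the spinning-disk default.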
Re: [GENERAL] LONG delete with LOTS of FK's
Larry Rosenman l...@lerctr.org writes:
> Any ideas on how to figure out if we ARE getting seqscan check plans, and better fix it?

Try an EXPLAIN ANALYZE VERBOSE on something that just deletes one row, and wait however long it takes. The printout should show how much time is taken in the implementation trigger for each foreign key. That will at least nail down which table(s) are causing problems.

A different line of thought is that the EXPLAIN I suggested in 25119.1367507...@sss.pgh.pa.us isn't an entirely accurate representation of what a foreign-key checking query is like, because the check queries are parameterized. You might need to do this instead:

prepare foo(referenced_column_data_type) as
  select 1 from referencing_table where referencing_column = $1;
explain execute foo(sample_value);

and verify you get a cheap plan for each referencing table.

regards, tom lane
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 9:14 AM, Matt Brock m...@mattbrock.co.uk wrote:
> The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 server with P420i RAID controller, and two MLC SSDs in RAID 1 for the OS, and two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary to have used this specific hardware setup in order to have experience with these general issues.

The specific drive models have a huge impact on SSD performance. In fact, that you are using SLC drives suggests you might be using antiquated (by SSD standards) hardware. All the latest action is on MLC now (see here: http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-dc-s3700-series.html).

merlin
Re: [GENERAL] LONG delete with LOTS of FK's
On 2013-05-10 09:14, Tom Lane wrote:
> prepare foo(referenced_column_data_type) as
>   select 1 from referencing_table where referencing_column = $1;
> explain execute foo(sample_value);
> and verify you get a cheap plan for each referencing table.

We don't :(

[lrosenman@233175-blueprint-db1 ~]$ cat Seq.new_exp
Seq Scan on account_billing_info (cost=0.00..7.19 rows=1 width=0)
Seq Scan on account_main_admin (cost=0.00..4.69 rows=1 width=0)
Seq Scan on bnymellon1_values (cost=0.00..288848.72 rows=10357338 width=0)
Seq Scan on capgemini8_values (cost=0.00..380499.85 rows=12309748 width=0)
Seq Scan on cityofcalgary_values (cost=0.00..245690.53 rows=8410682 width=0)
Seq Scan on css_fro_values (cost=0.00..505110.71 rows=15228057 width=0)
Seq Scan on cvscaremarkadp_values (cost=0.00..17062.58 rows=602126 width=0)
Seq Scan on ericsson2_values (cost=0.00..104704.84 rows=3513987 width=0)
Seq Scan on ibmbpmandrules_values (cost=0.00..153210.55 rows=5337724 width=0)
Seq Scan on ibmbwlteam_values (cost=0.00..7903.44 rows=274515 width=0)
Seq Scan on ibmgbs_values (cost=0.00..399206.24 rows=13983459 width=0)
Seq Scan on ibmtechsales_values (cost=0.00..232201.80 rows=8204144 width=0)
Seq Scan on jmffamilyent_values (cost=0.00..53596.24 rows=1874339 width=0)
Seq Scan on johnsoncontrols5_values (cost=0.00..69047.31 rows=2405705 width=0)
Seq Scan on keybank3_values (cost=0.00..23789.16 rows=855293 width=0)
Seq Scan on mondialassistancegroup_values (cost=0.00..122394.54 rows=4454283 width=0)
Seq Scan on permitted_ips (cost=0.00..4.01 rows=15 width=0)
Seq Scan on presby_health_serv_values (cost=0.00..37387.31 rows=1340345 width=0)
Seq Scan on principal_fin_grp4_values (cost=0.00..69872.73 rows=2436698 width=0)
Seq Scan on processdoc_tc_nz_values (cost=0.00..360360.30 rows=10975144 width=0)
Seq Scan on saml2_idp_account_junction (cost=0.00..1.07 rows=1 width=0)
Seq Scan on troweprice2_values (cost=0.00..253867.86 rows=9135429 width=0)

[lrosenman@233175-blueprint-db1 ~]$ cat fix_sql
prepare foo_account_activity(bigint) as select 1 from account_activity where account_id = $1;
explain execute foo_account_activity(29818880);
prepare foo_account_billing_info(bigint) as select 1 from account_billing_info where account_id = $1;
explain execute foo_account_billing_info(29818880);
prepare foo_account_cleaving(bigint) as select 1 from account_cleaving where account_id = $1;
explain execute foo_account_cleaving(29818880);
prepare foo_account_locked_by(bigint) as select 1 from account_locked_by where account_id = $1;
explain execute foo_account_locked_by(29818880);
prepare foo_account_logo(bigint) as select 1 from account_logo where account_id = $1;
explain execute foo_account_logo(29818880);
prepare foo_account_main_admin(bigint) as select 1 from account_main_admin where account_id = $1;
explain execute foo_account_main_admin(29818880);
prepare foo_account_organization(bigint) as select 1 from account_organization where account_id = $1;
explain execute foo_account_organization(29818880);
prepare foo_preferences(bigint) as select 1 from preferences where account_id = $1;
explain execute foo_preferences(29818880);
prepare foo_account_properties(bigint) as select 1 from account_properties where account_id = $1;
explain execute foo_account_properties(29818880);
prepare foo_avatars(bigint) as select 1 from avatars where account_id = $1;
explain execute foo_avatars(29818880);
prepare foo_billing_address(bigint) as select 1 from billing_address where account_id = $1;
explain execute foo_billing_address(29818880);
prepare foo_billing_transaction(bigint) as select 1 from billing_transaction where account_id = $1;
explain execute foo_billing_transaction(29818880);
prepare foo_billing_transaction_item(bigint) as select 1 from billing_transaction_item where account_id = $1;
explain execute foo_billing_transaction_item(29818880);
prepare foo_blobs(bigint) as select 1 from blobs where account_id = $1;
explain execute foo_blobs(29818880);
prepare foo_bnymellon1_values(bigint) as select 1 from bnymellon1_values where account_id = $1;
explain execute foo_bnymellon1_values(29818880);
prepare foo_capgemini8_values(bigint) as select 1 from capgemini8_values where account_id =
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
After googling this for a while, it seems that high-endurance MLC is only starting to rival SLC for endurance and write performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that SLCs are still a better bet for databases than MLC?

The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs isn't very wide, so cost isn't really an issue. We just want to use whichever is better for the database.

On 10 May 2013, at 15:20, Merlin Moncure mmonc...@gmail.com wrote:
> The specific drive models play a huge impact on SSD performance. In fact, the point you are using SLC drives suggests you might be using antiquated (by SSD standards) hardware. All the latest action is on MLC now (see here: http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-dc-s3700-series.html).
> merlin
Re: [GENERAL] PG in cash till machines
> Our major concern is related to the write-back issues we can face in this environment.

I agree this is the OS's responsibility. Greg Smith's Postgres: High Performance book has a lot to say about this, but there's also a lot you could read online, since it's really a requirement for any Postgres installation anywhere.

Paul

-- _ Pulchritudo splendor veritatis.
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On 5/10/2013 9:19 AM, Matt Brock wrote:
> After googling this for a while, it seems that High Endurance MLC is only starting to rival SLC for endurance and write performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that SLCs are still a better bet for databases than MLC?

I've never looked at SLC drives in the past few years and don't know anyone who uses them these days.

> The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs isn't very wide, so cost isn't really an issue. We just want to use whichever is better for the database.

Could you post some specific drive models, please? HP probably doesn't make the drives, and it really helps to know what devices you're using, since they are not nearly as generic in behavior and features as magnetic drives.
[GENERAL] authentication/privileges
Hi, Although I'm quite happy with the way my system (Debian sid) has set up the server (PostgreSQL 9.1), I'm not sure I'm using the authentication/privilege mechanism properly. In particular, I'd like to understand how the administrative user (postgres) is set up. Here is what pg_hba.conf contains: # Database administrative login by Unix domain socket local all postgres peer With peer authentication, one can only log in as postgres from a local connection. I'm not sure what password the postgres user was set up with in the OS; however, I assigned one to it (the same as for the PostgreSQL user). I've read somewhere that the postgres OS user should be left locked without a password, although it's not clear what was meant by locked. In any case, what is recommended practice WRT setting Unix vs. PostgreSQL passwords for postgres and other DB users? Thanks, -- Seb
Re: [GENERAL] LONG delete with LOTS of FK's
Larry Rosenman l...@lerctr.org writes: On 2013-05-10 09:14, Tom Lane wrote: ... and verify you get a cheap plan for each referencing table. We don't :( Ugh. I bet the problem is that in some of these tables, there are lots and lots of duplicate account ids, such that seqscans look like a good bet when searching for an otherwise-unknown id. You don't see this with a handwritten test for a specific id because then the planner can see it's not any of the common values. 9.2 would fix this for you --- any chance of updating? regards, tom lane
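Tom's "cheap plan for each referencing table" check can be reproduced with a prepared statement: on 9.1 a prepared query is planned generically, much like the internal RI-trigger query, so it exposes the seqscan that a handwritten DELETE with a literal id hides. A sketch, with hypothetical table and column names:

```sql
-- Approximation of the query the FK trigger issues against each
-- referencing table when a parent row is deleted.  Written as a
-- prepared statement so the planner cannot peek at the constant,
-- mirroring how RI checks are planned on 9.1.
PREPARE fk_probe(int) AS
    SELECT 1 FROM child_table WHERE account_id = $1 FOR SHARE;

EXPLAIN EXECUTE fk_probe(42);  -- want an Index Scan here, not a Seq Scan

DEALLOCATE fk_probe;
```

If this shows a sequential scan on a large referencing table, an index on the referencing column (or the 9.2 upgrade Tom suggests) is the likely fix.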
Re: [GENERAL] LONG delete with LOTS of FK's
On 2013-05-10 10:57, Tom Lane wrote: 9.2 would fix this for you --- any chance of updating? I'll see what we can do. I was looking for a reason, this may be it. Thanks for all your help. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 (c) E-Mail: l...@lerctr.org US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
Not sure of your space requirements, but I'd think a RAID 10 of 8x or more Samsung 840 Pro 256/512 GB would be the best value. Using a simple mirror won't get you the reliability that you want, since heavy writing will burn the drives out over time, and if you're writing the exact same content to both drives, they could likely fail at the same time. Regardless of the underlying hardware you should still follow best practices for provisioning disks, and RAID 10 is the way to go. I don't know what your budget is, though. Anyway, mirrored SSD will probably work fine, but I'd avoid using just two drives for the reasons above. I'd suggest at least testing RAID 5 or something else to spread the load around. Personally, I think the ideal configuration would be a RAID 10 of at least 8 disks plus 1 hot spare. The Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg. YMMV but they are amazing drives. On Fri, May 10, 2013 at 11:25 AM, David Boreham david_l...@boreham.org wrote: [...]
Re: [GENERAL] authentication/privileges
Sebastian P. Luque splu...@gmail.com writes: With peer authentication, one can only login as postgres from a local connection. I'm not sure what password the postgres user was set up in the OS, however, I assigned one to it (the same as for the PostgreSQL user). I've read somewhere that the postgres OS user should be left locked without password, although it's not clear what was meant by locked. It's fairly common for distro-supplied packages to create a postgres OS user but not assign it any password. In that state, the only way to become postgres is to su to it from root, or perhaps from a sudoer account with root-equivalent privileges. While that might be okay for machines with just one person administering everything, I can't say that I think it's recommendable practice in general: you don't want to have to give somebody root to let them admin the database. Better to give the postgres user a password. regards, tom lane
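Tom's recommendation boils down to two independent passwords; a minimal sketch, assuming a Debian-style install where you can already reach psql via peer authentication (the password strings are placeholders):

```shell
# 1. Set a PostgreSQL-level password for the postgres database role,
#    so md5/password authentication can be enabled for it in pg_hba.conf.
sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'replace-me';"

# 2. Optionally give the postgres OS account a login password of its own;
#    distro packages usually ship it "locked" (no usable password hash),
#    meaning it is reachable only via su/sudo from a privileged account.
sudo passwd postgres
```

Note that these are two different doors: the first guards database connections, the second guards the Unix account, and there is no reason for them to share a password.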
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 10:19 AM, Matt Brock m...@mattbrock.co.uk wrote: After googling this for a while, it seems that High Endurance MLC is only starting to rival SLC for endurance and write performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that SLCs are still a better bet for databases than MLC? The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs isn't very wide, so cost isn't really an issue. We just want to use whichever is better for the database. Well, it's more complicated than that. While SLC drives were indeed inherently faster and had longer lifespans, all flash drives basically have the requirement of having to carefully manage writes in order to get good performance. Unfortunately, this means that for database use the drives must have some type of non-volatile cache and/or sufficient backup juice in a capacitor to flush out pending writes in the event of sudden loss of power. Many drives, including (famously) the so-called Intel X25-E enterprise lines, did not do this and were therefore unsuitable for database use. As it turns out, the list of flash drives that are suitable for database use is surprisingly small. The s3700 I noted upthread seems to be specifically built with databases in mind and is likely the best choice for new deployments. The older Intel 320 is also a good choice. I think that's pretty much it until you get into expensive pci-e based gear. There might be some non-Intel drives out there that are suitable, but be very very careful and triple-verify that the drive has an on-board capacitor and has gotten real traction in enterprise database usage. merlin
Re: [GENERAL] PL/R etc.
On 05/10/2013 06:33 AM, Merlin Moncure wrote: you have full access to sql within the pl/r function, so nothing is keeping you from paging data into the frame via a cursor, but that only helps so much. I have thought about implementing the paging transparently beneath the dataframe object, but have not gotten around to it. Apparently the Oracle connector to R is able to do that, so I'm sure it is a SMOP. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 11:11 AM, Evan D. Hoffman evandhoff...@gmail.com wrote: [...] The Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg. YMMV but they are amazing drives. Samsung 840 has no power loss protection and is therefore useless for database use IMO unless you don't care about data safety and/or are implementing redundancy via some other method (say, by synchronous replication). merlin
Re: [GENERAL] authentication/privileges
Tom Lane wrote: It's fairly common for distro-supplied packages to create a postgres OS user but not assign it any password. [...] Better to give the postgres user a password. Of course, it's also possible to give multiple people sudo-to-postgres capability without giving them sudo-to-root. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
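Álvaro's suggestion maps onto a one-line sudoers rule; a sketch, assuming a hypothetical Unix group dbadmin whose members should be able to become postgres but not root:

```conf
# /etc/sudoers.d/postgres -- members of the dbadmin group may run
# any command as the postgres user, and nothing else via sudo.
%dbadmin ALL = (postgres) ALL
```

A member of that group can then administer the database with, e.g., sudo -u postgres psql, without ever holding root.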
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
I'd expect to use a RAID controller with either BBU or NVRAM cache to handle that, and that the server itself would be on UPS for a production DB. That said, a standby replica DB on conventional disk is definitely a good idea in any case. On Fri, May 10, 2013 at 12:25 PM, Merlin Moncure mmonc...@gmail.com wrote: [...] Samsung 840 has no power loss protection and is therefore useless for database use IMO unless you don't care about data safety and/or are implementing redundancy via some other method (say, by synchronous replication). merlin
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 11:34 AM, Evan D. Hoffman evandhoff...@gmail.com wrote: I'd expect to use a RAID controller with either BBU or NVRAM cache to handle that, and that the server itself would be on UPS for a production DB. That said, a standby replica DB on conventional disk is definitely a good idea in any case. Sadly, NVRAM cache doesn't help (unless the raid controller is managing drive writes down to the flash level, and no such products exist that I am aware of). The problem is that to provide guarantees the raid controller still needs to be able to tell the device to flush down to physical storage. While flash drives can be configured to do that (basically write-through mode), it's pretty silly to do so as it will ruin performance and quickly destroy the drive. Trusting a UPS is up to you, but if your UPS dies, someone knocks out the power cable, etc., you have data loss. With an on-drive capacitor you only get data loss via physical damage or corruption on the drive. merlin
[GENERAL] pg_basebackup, requested WAL has already been removed
Hi, I've recently started to use pg_basebackup --xlog-method=stream to back up my multi-Tb database. Previously I did the backup when there was not much activity in the DB and it went perfectly fine, but today I started the backup and it failed twice, almost at the same time as the CREATE INDEX (and another time CLUSTER) commands finished. Here:

postgres@cappc118:/mnt/backup/wsdb_130510$ pg_basebackup --xlog-method=stream --progress --verbose --pg
transaction log start point: 23AE/BD003E70
pg_basebackup: starting background WAL receiver
pg_basebackup: unexpected termination of replication stream: FATAL: requested WAL segment 000123B100FE has already been removed
4819820/16816887078 kB (4%), 0/1 tablespace (/mnt/backup/wsdb_130510/base/1)

And the logs from around that time contained:

some_user:wsdb:2013-05-10 14:35:41 BST:10587 LOG: duration: 40128.163 ms statement: CREATE INDEX usno_clean_q3c_idx ON usno_clean (q3c_ang2ipix(ra,dec));
::2013-05-10 14:35:43 BST:25529 LOG: checkpoints are occurring too frequently (8 seconds apart)
::2013-05-10 14:35:43 BST:25529 HINT: Consider increasing the configuration parameter checkpoint_segments.
::2013-05-10 14:35:51 BST:25529 LOG: checkpoints are occurring too frequently (8 seconds apart)
::2013-05-10 14:35:51 BST:25529 HINT: Consider increasing the configuration parameter checkpoint_segments.
postgres:[unknown]:2013-05-10 14:35:55 BST:8177 FATAL: requested WAL segment 000123B100FE has already been removed
some_user:wsdb:2013-05-10 14:36:59 BST:10599 LOG: duration: 78378.194 ms statement: CLUSTER usno_clean_q3c_idx ON usno_clean;

On the previous occasion when it happened, a CREATE INDEX was being executed:

some_user:wsdb:2013-05-10 09:17:20 BST:3300 LOG: duration: 67.680 ms statement: SELECT name FROM (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings UNION ALL SELECT 'session authorization' UNION ALL SELECT 'all') ss WHERE substring(name,1,4)='rand' LIMIT 1000
::2013-05-10 09:22:47 BST:25529 LOG: checkpoints are occurring too frequently (18 seconds apart)
::2013-05-10 09:22:47 BST:25529 HINT: Consider increasing the configuration parameter checkpoint_segments.
postgres:[unknown]:2013-05-10 09:22:49 BST:27659 FATAL: requested WAL segment 000123990040 has already been removed
some_user:wsdb:2013-05-10 09:22:57 BST:3236 LOG: duration: 542955.262 ms statement: CREATE INDEX xmatch_temp_usnoid_idx ON xmatch_temp (usno_id);

The configuration (PG 9.2.4, Debian 7.0, amd64):

shared_buffers = 10GB
work_mem = 1GB
maintenance_work_mem = 1GB
effective_io_concurrency = 5
synchronous_commit = off
checkpoint_segments = 32
max_wal_senders = 2
effective_cache_size = 30GB
autovacuum_max_workers = 3
wal_level = archive
archive_mode = off

Does it look like a bug or am I missing something? Thanks, Sergey
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
It's definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that the segments aren't recycled faster than the time required to complete the base backup (plus some buffer). On Fri, May 10, 2013 at 9:48 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: Hi, I've recently started to use pg_basebackup --xlog-method=stream to backup my multi-Tb database. Before I did the backup when there was not much activity in the DB and it went perfectly fine, but today, I've started the backup and it failed twice almost at the same time as the CREATE INDEX (and another time CLUSTER) commands were finished. [...] Does it look like a bug or am I missing something? Thanks, Sergey -- ~ L. Friedman netll...@gmail.com LlamaLand https://netllama.linux-sxs.org
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
On Fri, 10 May 2013, Lonni J Friedman wrote: It's definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that they aren't recycled faster than the time required to complete the base backup (plus some buffer). But I thought that wal_keep_segments is not needed for the streaming regime (--xlog-method=stream)? And the documentation http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html only mentions wal_keep_segments when talking about --xlog-method=fetch. [...]
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
That's a good point. Then I dunno; perhaps it is a bug, but I'd be surprised if this wasn't working, as it's not really a corner case that could be missed in testing, as long as all the options were exercised. Hopefully someone else can weigh in. On Fri, May 10, 2013 at 10:00 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: But I thought that wal_keep_segments is not needed for the streaming regime (--xlog-method=stream)? And the documentation http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html only mentions wal_keep_segments when talking about --xlog-method=fetch. [...] -- ~ L. Friedman netll...@gmail.com LlamaLand https://netllama.linux-sxs.org
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On 5/10/2013 10:21 AM, Merlin Moncure wrote: As it turns out the list of flash drives that are suitable for database use is surprisingly small. The s3700 I noted upthread seems to be specifically built with databases in mind and is likely the best choice for new deployments. The older Intel 320 is also a good choice. I think that's pretty much it until you get into expensive pci-e based gear. This may have been a typo: did you mean the Intel 710 series rather than the 320? While the 320 has the supercap, it isn't specified for high write endurance. Definitely usable for a database, and a better choice than most of the alternatives, but I'd have listed the 710 ahead of the 320.
Re: [GENERAL] pg_basebackup, requested WAL has already been removed
On Fri, May 10, 2013 at 7:00 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote: On Fri, 10 May 2013, Lonni J Friedman wrote: Its definitely not a bug. You need to set/increase wal_keep_segments to a value that ensures that they aren't recycled faster than the time required to complete the base backup (plus some buffer). But I thought that wal_keep_segments is not needed for the streaming regime ( --xlog-method=stream) And the documentation http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html only mentions wal_keep_segments when talking about --xlog-method=fetch. It's not a bug in the software - this will happen if the background stream in pg_basebackup cannot keep up, and that's normal. It works the same way as a regular standby in that if it's unable to keep up it will eventually fall so far behind that it can't recover. It may definitely be something that needs to be cleared up in the documentation though. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
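Putting Lonni's and Magnus's answers together, the knob that gives the background WAL receiver headroom is wal_keep_segments on the server; a sketch of postgresql.conf changes (the values are illustrative, not tuned recommendations):

```conf
# Retain extra WAL segments (16 MB each) so pg_basebackup's streaming
# receiver can survive write-heavy bursts such as CREATE INDEX/CLUSTER;
# 256 segments keeps roughly 4 GB of WAL on the server.
wal_keep_segments = 256

# Also eases the "checkpoints are occurring too frequently" warnings
# seen in the log during the failed backups.
checkpoint_segments = 64
```

Both settings can be applied with a configuration reload; no server restart is needed.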
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 12:03 PM, David Boreham david_l...@boreham.org wrote: This may have been a typo: did you mean the Intel 710 series rather than the 320? While the 320 has the supercap, it isn't specified for high write endurance. Definitely usable for a database, and a better choice than most of the alternatives, but I'd have listed the 710 ahead of the 320. It wasn't a typo. The 320 is perfectly fine, although it will wear out faster -- so it fills a niche for low-write-intensity applications. I find the s3700 to be superior to the 710 in just about every way (although you're right -- it is suitable for database use). merlin
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On 05/10/2013 12:46 PM, Merlin Moncure wrote: On Fri, May 10, 2013 at 11:34 AM, Evan D. Hoffman evandhoff...@gmail.com wrote: I'd expect to use a RAID controller with either BBU or NVRAM cache to handle that, and that the server itself would be on UPS for a production DB. That said, a standby replica DB on conventional disk is definitely a good idea in any case. Sadly, NVRAM cache doesn't help (unless the raid controller is managing drive writes down to the flash level, and no such products exist that I am aware of). The problem is that to provide guarantees, the raid controller still needs to be able to tell the device to flush down to physical storage. While flash drives can be configured to do that (basically write-through mode), it's pretty silly to do so as it will ruin performance and quickly destroy the drive. Trusting a UPS is up to you, but if your UPS fails, someone knocks the power cable, etc., you have data loss. With an on-drive capacitor you only get data loss via physical damage or corruption on the drive. merlin Well, we have dual redundant power supplies on separate UPSes, so could something go wrong? Yes, but a tornado could come along and destroy the building also. -- Stephen Clark *NetWolves* Director of Technology Phone: 813-579-3200 Fax: 813-882-0209 Email: steve.cl...@netwolves.com http://www.netwolves.com
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 10:20 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, May 10, 2013 at 12:03 PM, David Boreham david_l...@boreham.org wrote: On 5/10/2013 10:21 AM, Merlin Moncure wrote: As it turns out the list of flash drives are suitable for database use is surprisingly small. The s3700 I noted upthread seems to be specifically built with databases in mind and is likely the best choice for new deployments. The older Intel 320 is also a good choice. I think that's pretty much it until you get into expensive pci-e based gear. This may have been a typo : did you mean Intel 710 series rather than 320 ? While the 320 has the supercap, it isn't specified for high write endurance. Definitely usable for a database, and a better choice than most of the alternatives, but I'd have listed the 710 ahead of the 320. It wasn't a typo. The 320 though is perfectly fine although it will wear out faster -- so it fills a niche for low write intensity applications. I find the s3700 to be superior to the 710 in just about every way (although you're right -- it is suitable for database use). There's also the 520 series, which has better performance than the 320 series (which is EOL now). -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On 5/10/2013 11:20 AM, Merlin Moncure wrote: I find the s3700 to be superior to the 710 in just about every way (although you're right -- it is suitable for database use). merlin The s3700 series replaces the 710 so it should be superior :) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On 5/10/2013 11:23 AM, Lonni J Friedman wrote: There's also the 520 series, which has better performance than the 320 series (which is EOL now). I wouldn't use the 520 series for production database storage -- it has the Sandforce controller and apparently no power failure protection. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
Steve Clark wrote: Well we have dual redundant power supplies on separate UPS so could something go wrong yes, but a tornado could come along and destroy the building also. .. hence your standby server across the country? -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On May 10, 2013, at 7:14 AM, Matt Brock m...@mattbrock.co.uk wrote: Hello. We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware RAID. My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6 on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the hardware? The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use of this hardware and software?

A couple of things I noticed with a similar-ish setup:

* Some forms of RAID / LVM break the kernel's automatic disk tuning mechanism. In particular, there is a rotational tunable that often does not get set right. You might end up tweaking read ahead and friends as well. http://www.mjmwired.net/kernel/Documentation/block/queue-sysfs.txt#112
* The default Postgres configuration is awful for an SSD-backed database. You really need to futz with checkpoints to get acceptable throughput. The PostgreSQL 9.0 High Performance book is fantastic and is what I used to great success.
* The default Linux virtual memory configuration is awful for this configuration. Briefly, it will accept a ton of incoming data, and then go through an awful stall as soon as it calls fsync() to write all that data to disk. We had multi-second delays all the way through to the application because of this. We had to change the zone_reclaim_mode and the dirty buffer limits. http://www.postgresql.org/message-id/500616cb.3070...@2ndquadrant.com

I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from tuning a 4xMLC SSD RAID-10. I haven't proven that this is optimal, but it was way better than the defaults. We ended up with the following list of changes:

* Change IO scheduler to noop
* Mount DB volume with nobarrier, noatime
* Turn blockdev readahead to 16MiB
* Turn sdb's rotational tunable to 0

PostgreSQL configuration changes:

synchronous_commit = off
effective_io_concurrency = 4
checkpoint_segments = 1024
checkpoint_timeout = 10min
checkpoint_warning = 8min
shared_buffers = 32GB
temp_buffers = 128MB
work_mem = 512MB
maintenance_work_mem = 1GB

Linux sysctls:

vm.swappiness = 0
vm.zone_reclaim_mode = 0
vm.dirty_bytes = 134217728
vm.dirty_background_bytes = 1048576

Hope that helps, Steven -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
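For reference, the OS-level items in the checklist above translate roughly into the commands below. This is a sketch only: the device name (sdb) and mount point are taken from or assumed around the poster's setup, everything requires root, and nobarrier is only safe on drives with power-loss protection:

```
# A sketch, not a drop-in script; adjust device and mount point to your system.
echo noop > /sys/block/sdb/queue/scheduler        # IO scheduler to noop
echo 0    > /sys/block/sdb/queue/rotational       # mark the device non-rotational
blockdev --setra 32768 /dev/sdb                   # readahead: 32768 x 512B sectors = 16MiB
mount -o remount,nobarrier,noatime /var/lib/pgsql # DB volume mount options (assumed path)
sysctl -w vm.swappiness=0 vm.zone_reclaim_mode=0 \
          vm.dirty_bytes=134217728 vm.dirty_background_bytes=1048576
```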
Re: [GENERAL] Large amount of serialization errors in transactions
Vegard Bønes vegard.bo...@met.no wrote: I have a problem understanding how transactions with serializable isolation level work in PostgreSQL. What exactly may cause a serialization error? A pattern of read-write dependencies among concurrent transactions which indicates that a serialization anomaly is possible. There is a possibility of false positives -- mostly because of the granularity of tracking the information, to limit the RAM used to track reads and read-write conflicts. The granularity promotion logic is pretty simple, and we may need to make it more sophisticated, but I haven't yet seen a real-life use case where that would help. So far a high serialization failure rate with serializable transactions is either intrinsic to the nature of the workload or could be solved by increasing the RAM used to track locks. If you have a case where a more sophisticated heuristic would help, I would be interested in more details, but first let's see whether things can be improved to an acceptable level with tuning. For details on how the logic works, you can review the examples and an overview of the technique here: http://wiki.postgresql.org/wiki/SSI ... which links to the documentation and to a Wiki page which was used to manage development of the current implementation -- which goes into more detail about how it works, and which in turn links to the academic papers which were the foundation for the development. My problem is a system where process one adds data to a database. Shortly afterwards, process two reads and possibly modifies the same data (keys are not touched). When large amounts of data arrive at about the same time, I get loads of serialization errors in process two. In a perfect world this should not happen, since data that is entered by process one is almost always different from the data that at the same time is being read and written by process two.
It would be easier to tell what the cause is if you pasted the exact messages which are getting logged most frequently, with associated DETAIL. I have tried increasing max_pred_locks_per_transaction, but it seems to have no effect. How high have you tried to set it? I have seen some reports of people needing to set it about 10 times higher than the default to reduce false positives to a negligible level. I do retry the transactions, and eventually they succeed, but my concern here is the amount of errors I get at certain periods. Am I expecting too much of serializable isolation level transactions, or is there anything else that I am missing? Have you reviewed the issues listed under "for optimal performance" on this page?: http://www.postgresql.org/docs/current/interactive/transaction-iso.html#XACT-SERIALIZABLE If you allow large numbers of connections, you may see a large benefit from using a connection pooler which funnels many logical client connections down to a much smaller number of database connections. This tends to help in general, but is especially important when using serializable transactions. http://wiki.postgresql.org/wiki/Number_Of_Database_Connections -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
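The retry discipline Kevin assumes above (retry only on serialization failures, SQLSTATE 40001) can be sketched as a small wrapper. The exception class and backoff parameters here are illustrative, not from the thread; with a real driver such as psycopg2 the raised error exposes the same pgcode attribute checked below:

```python
import random
import time

SERIALIZATION_FAILURE = "40001"  # SQLSTATE raised on serializable conflicts

class SerializationError(Exception):
    """Stand-in for a driver error; e.g. psycopg2 errors carry .pgcode."""
    pgcode = SERIALIZATION_FAILURE

def run_serializable(txn_body, max_retries=10):
    """Run txn_body(), retrying with jittered backoff on SQLSTATE 40001."""
    for attempt in range(max_retries):
        try:
            return txn_body()
        except Exception as exc:
            if getattr(exc, "pgcode", None) != SERIALIZATION_FAILURE:
                raise  # only serialization failures are safely retryable
            time.sleep(random.uniform(0, 0.001 * 2 ** attempt))
    raise RuntimeError("gave up after %d serialization failures" % max_retries)
```

txn_body would open a SERIALIZABLE transaction, do its reads and writes, and commit; the wrapper simply guarantees that a 40001 abort is retried rather than surfaced.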
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 11:23 AM, Steven Schlansker ste...@likeness.com wrote: On May 10, 2013, at 7:14 AM, Matt Brock m...@mattbrock.co.uk wrote: Hello. We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware RAID. My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6 on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the hardware? The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use of this hardware and software? A couple of things I noticed with a similar-ish setup: * Some forms of RAID / LVM break the kernel's automatic disk tuning mechanism. In particular, there is a rotational tunable that often does not get set right. You might end up tweaking read ahead and friends as well. http://www.mjmwired.net/kernel/Documentation/block/queue-sysfs.txt#112 * The default Postgres configuration is awful for a SSD backed database. You really need to futz with checkpoints to get acceptable throughput. The PostgreSQL 9.0 High Performance book is fantastic and is what I used to great success. * The default Linux virtual memory configuration is awful for this configuration. Briefly, it will accept a ton of incoming data, and then go through an awful stall as soon as it calls fsync() to write all that data to disk. We had multi-second delays all the way through to the application because of this. We had to change the zone_reclaim_mode and the dirty buffer limits. http://www.postgresql.org/message-id/500616cb.3070...@2ndquadrant.com I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from tuning a 4xMLC SSD RAID-10. 
I haven't proven that this is optimal, but it was way better than the defaults. We ended up with the following list of changes: * Change IO scheduler to noop * Mount DB volume with nobarrier, noatime * Turn blockdev readahead to 16MiB * Turn sdb's rotational tuneable to 0 PostgreSQL configuration changes: synchronous_commit = off effective_io_concurrency = 4 checkpoint_segments = 1024 checkpoint_timeout = 10min checkpoint_warning = 8min shared_buffers = 32gb temp_buffers = 128mb work_mem = 512mb maintenance_work_mem = 1gb Linux sysctls: vm.swappiness = 0 vm.zone_reclaim_mode = 0 vm.dirty_bytes = 134217728 vm.dirty_background_bytes = 1048576 Can you provide more details about your setup, including: * What kind of filesystem are you using? * Linux distro and/or kernel version * hardware (RAM, CPU cores etc) * database usage patterns (% writes, growth, etc) thanks -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On Fri, May 10, 2013 at 1:23 PM, Steven Schlansker ste...@likeness.com wrote: On May 10, 2013, at 7:14 AM, Matt Brock m...@mattbrock.co.uk wrote: Hello. We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware RAID. My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6 on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the hardware? The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use of this hardware and software? A couple of things I noticed with a similar-ish setup: * Some forms of RAID / LVM break the kernel's automatic disk tuning mechanism. In particular, there is a rotational tunable that often does not get set right. You might end up tweaking read ahead and friends as well. http://www.mjmwired.net/kernel/Documentation/block/queue-sysfs.txt#112 * The default Postgres configuration is awful for a SSD backed database. You really need to futz with checkpoints to get acceptable throughput. The PostgreSQL 9.0 High Performance book is fantastic and is what I used to great success. * The default Linux virtual memory configuration is awful for this configuration. Briefly, it will accept a ton of incoming data, and then go through an awful stall as soon as it calls fsync() to write all that data to disk. We had multi-second delays all the way through to the application because of this. We had to change the zone_reclaim_mode and the dirty buffer limits. http://www.postgresql.org/message-id/500616cb.3070...@2ndquadrant.com I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from tuning a 4xMLC SSD RAID-10. 
I haven't proven that this is optimal, but it was way better than the defaults. We ended up with the following list of changes: * Change IO scheduler to noop * Mount DB volume with nobarrier, noatime * Turn blockdev readahead to 16MiB * Turn sdb's rotational tuneable to 0 PostgreSQL configuration changes: synchronous_commit = off effective_io_concurrency = 4 checkpoint_segments = 1024 checkpoint_timeout = 10min checkpoint_warning = 8min shared_buffers = 32gb temp_buffers = 128mb work_mem = 512mb maintenance_work_mem = 1gb Linux sysctls: vm.swappiness = 0 vm.zone_reclaim_mode = 0 vm.dirty_bytes = 134217728 vm.dirty_background_bytes = 1048576 that's good info, but it should be noted that synchronous_commit trades a risk of some data loss (but not nearly as much risk as volatile storage) for a big increase in commit performance. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On May 10, 2013, at 11:38 AM, Merlin Moncure mmonc...@gmail.com wrote: PostgreSQL configuration changes: synchronous_commit = off that's good info, but it should be noted that synchronous_commit trades a risk of some data loss (but not nearly as much risk as volatile storage) for a big increase in commit performance. Yes, that is a choice we consciously made. If our DB server crashes losing the last few ms worth of transactions is an acceptable loss to us. But that may not be OK for everyone :-) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
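A middle ground worth noting: synchronous_commit is settable per transaction, so the durability trade Merlin describes can be limited to commits that are actually expendable rather than made instance-wide. A sketch (the table name is hypothetical):

```
-- Keep synchronous_commit = on in postgresql.conf, and relax it only for
-- transactions whose loss on crash is acceptable.
BEGIN;
SET LOCAL synchronous_commit = off;  -- applies to this transaction only
INSERT INTO metrics_log (ts, payload) VALUES (now(), 'sample');
COMMIT;  -- returns without waiting for the WAL flush
```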
Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID
On May 10, 2013, at 11:35 AM, Lonni J Friedman netll...@gmail.com wrote: I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from tuning a 4xMLC SSD RAID-10. I haven't proven that this is optimal, but it was way better than the defaults. We ended up with the following list of changes:

* Change IO scheduler to noop
* Mount DB volume with nobarrier, noatime
* Turn blockdev readahead to 16MiB
* Turn sdb's rotational tunable to 0

PostgreSQL configuration changes:

synchronous_commit = off
effective_io_concurrency = 4
checkpoint_segments = 1024
checkpoint_timeout = 10min
checkpoint_warning = 8min
shared_buffers = 32GB
temp_buffers = 128MB
work_mem = 512MB
maintenance_work_mem = 1GB

Linux sysctls:

vm.swappiness = 0
vm.zone_reclaim_mode = 0
vm.dirty_bytes = 134217728
vm.dirty_background_bytes = 1048576

Can you provide more details about your setup, including:

* What kind of filesystem are you using?
* Linux distro and/or kernel version
* hardware (RAM, CPU cores etc)
* database usage patterns (% writes, growth, etc)

Yes, as long as you promise not to just use my configuration without doing proper testing on your own system, even if it seems similar!

Linux version 2.6.32.225 (gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) ) #2 SMP Thu Mar 29 16:43:20 EDT 2012
DMI: Supermicro X8DTN/X8DTN, BIOS 2.1c 10/28/2011
CPU0: Intel(R) Xeon(R) CPU X5670 @ 2.93GHz stepping 02
Total of 24 processors activated (140796.98 BogoMIPS). (2 sockets x 2 hyperthreads x 6 cores)
96GB ECC RAM
Filesystem is ext4 on LVM on hardware RAID 1+0, Adaptec 5405

Database is very much read heavy, but there is a base load of writes and bursts of much larger writes. I don't have specifics regarding how it breaks down. The database is about 400GB and is growing moderately, maybe a few GB/day. More of the write traffic is re-writes rather than writes.
Hope that helps, Steven -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PL/R etc.
Merlin Moncure wrote: On Fri, May 10, 2013 at 4:41 AM, Mark Morgan Lloyd markmll.pgsql-gene...@telemetry.co.uk wrote: I don't know whether anybody active on the list has R (and in particular PL/R) experience, but just in case... :-) i) Something like APL can operate on an array with minimal regard for index order, i.e. operations across the array are as easily-expressed and as efficient as operations down the array. Does this apply to PL/R? ii) Things like OpenOffice can be very inefficient if operating over a table comprising a non-trivial number of rows. Does PL/R offer a significant improvement, e.g. by using a cursor rather than trying to read an entire resultset into memory? pl/r (via R) is very terse and expressive. it will probably meet or beat any performance expectations you have coming from openoffice. that said, it's definitely a memory bound language; typically problem solving involves stuffing data into huge data frames which then pass to the high level problem solving functions like glm. you have full access to sql within the pl/r function, so nothing is keeping you from paging data into the frame via a cursor, but that only helps so much. a lot depends on the specific problem you solve of course.

Thanks Merlin and Joe. As an occasional APL user terse and oppressive doesn't really bother me :-) As a particular example of the sort of thing I'm thinking, using pure SQL the operation of summing the columns in each row and summing the rows in each column are very different. In contrast, in APL if I have an array B

 1  2  3  4
 5  6  7  8
 9 10 11 12

I can perform a reduction operation using + over whichever axis I specify:

+/[1]B
15 18 21 24

+/[2]B
10 26 42

or even by default

+/B
10 26 42

Does PL/R provide that sort of abstraction in a uniform fashion? -- Mark Morgan Lloyd markMLl .AT. telemetry.co .DOT. uk [Opinions above are the author's, not those of his employers or colleagues] -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PL/R etc.
On Fri, May 10, 2013 at 2:32 PM, Mark Morgan Lloyd markmll.pgsql-gene...@telemetry.co.uk wrote: Merlin Moncure wrote: On Fri, May 10, 2013 at 4:41 AM, Mark Morgan Lloyd markmll.pgsql-gene...@telemetry.co.uk wrote: I don't know whether anybody active on the list has R (and in particular PL/R) experience, but just in case... :-) i) Something like APL can operate on an array with minimal regard for index order, i.e. operations across the array are as easily-expressed and as efficient as operations down the array. Does this apply to PL/R? ii) Things like OpenOffice can be very inefficient if operating over a table comprising a non-trivial number of rows. Does PL/R offer a significant improvement, e.g. by using a cursor rather than trying to read an entire resultset into memory? pl/r (via R) very terse and expressive. it will probably meet or beat any performance expectations you have coming from openoffice. that said, it's definitely a memory bound language; typically problem solving involves stuffing data into huge data frames which then pass to the high level problem solving functions like glm. you have full access to sql within the pl/r function, so nothing is keeping you from paging data into the frame via a cursor, but that only helps so much. a lot depends on the specific problem you solve of course. Thanks Merlin and Joe. As an occasional APL user terse and oppressive doesn't really bother me :-) As a particular example of the sort of thing I'm thinking, using pure SQL the operation of summing the columns in each row and summing the rows in each column are very different. In contrast, in APL if I have an array B 1 2 3 4 5 6 7 8 9 10 11 12 I can perform a reduction operation using + over whichever axis I specify: +/[1]B 15 18 21 24 +/[2]B 10 26 42 or even by default +/B 10 26 42 Does PL/R provide that sort of abstraction in a uniform fashion? 
certainly (for example see here: http://stackoverflow.com/questions/13352180/sum-different-columns-in-a-data-frame) -- getting good at R can take some time but it's worth it. R is hot right now with all the buzz around big data lately. The main challenge actually is the language is so rich it can be difficult to zero in on the precise behaviors you need. Also, the documentation is all over the place. pl/r plays in nicely because with some thought you can marry the R analysis functions directly to the query in terms of both inputs and outputs -- basically very, very sweet syntax sugar. It's a little capricious though (and be advised: Joe has put up some very important and necessary fixes quite recently) so usually I work out the R code in the R console first before putting in the database. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
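The uniform axis-wise reduction Mark asks about is exactly what R provides with apply(B, 1, sum) / apply(B, 2, sum) (or rowSums/colSums). The same idea, sketched here in Python/NumPy as a runnable illustration of the APL example from the thread:

```python
import numpy as np

# The 3x4 array B from the APL example
B = np.arange(1, 13).reshape(3, 4)

col_sums = B.sum(axis=0)  # APL +/[1]B: sum down each column
row_sums = B.sum(axis=1)  # APL +/[2]B (and the default +/B): sum across each row

print(col_sums)  # [15 18 21 24]
print(row_sums)  # [10 26 42]
```

The point in both R and NumPy is the same as in APL: the axis is a parameter of one reduction operation, rather than two structurally different queries as in pure SQL.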
Re: [GENERAL] PL/R etc.
Merlin Moncure wrote: On Fri, May 10, 2013 at 2:32 PM, Mark Morgan Lloyd markmll.pgsql-gene...@telemetry.co.uk wrote: Merlin Moncure wrote: On Fri, May 10, 2013 at 4:41 AM, Mark Morgan Lloyd markmll.pgsql-gene...@telemetry.co.uk wrote: I don't know whether anybody active on the list has R (and in particular PL/R) experience, but just in case... :-) i) Something like APL can operate on an array with minimal regard for index order, i.e. operations across the array are as easily-expressed and as efficient as operations down the array. Does this apply to PL/R? ii) Things like OpenOffice can be very inefficient if operating over a table comprising a non-trivial number of rows. Does PL/R offer a significant improvement, e.g. by using a cursor rather than trying to read an entire resultset into memory? pl/r (via R) very terse and expressive. it will probably meet or beat any performance expectations you have coming from openoffice. that said, it's definitely a memory bound language; typically problem solving involves stuffing data into huge data frames which then pass to the high level problem solving functions like glm. you have full access to sql within the pl/r function, so nothing is keeping you from paging data into the frame via a cursor, but that only helps so much. a lot depends on the specific problem you solve of course. Thanks Merlin and Joe. As an occasional APL user terse and oppressive doesn't really bother me :-) As a particular example of the sort of thing I'm thinking, using pure SQL the operation of summing the columns in each row and summing the rows in each column are very different. In contrast, in APL if I have an array B 1 2 3 4 5 6 7 8 9 10 11 12 I can perform a reduction operation using + over whichever axis I specify: +/[1]B 15 18 21 24 +/[2]B 10 26 42 or even by default +/B 10 26 42 Does PL/R provide that sort of abstraction in a uniform fashion? 
certainly (for example see here: http://stackoverflow.com/questions/13352180/sum-different-columns-in-a-data-frame) -- getting good at R can take some time but it's worth it. R is hot right now with all the buzz around big data lately. The main challenge actually is the language is so rich it can be difficult to zero in on the precise behaviors you need. Also, the documentation is all over the place. pl/r plays in nicely because with some thought you can marry the R analysis functions directly to the query in terms of both inputs and outputs -- basically very, very sweet syntax sugar. It's a little capricious though (and be advised: Joe has put up some very important and necessary fixes quite recently) so usually I work out the R code in the R console first before putting in the database. [Peruse] Thanks, I think I get the general idea. I'm aware of the significance of R, and in particular that it's attracting attention due to the undesirability of hiding functionality in spreadsheets where these usurped APL for certain types of operation. -- Mark Morgan Lloyd markMLl .AT. telemetry.co .DOT. uk [Opinions above are the author's, not those of his employers or colleagues] -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PG in cash till machines
Hi Carlos, On 5/10/2013 6:43 AM, Carlos Henrique Reimer wrote: We are developing a solution which will run in thousands of small cash till machines running Linux and we would like to use PostgreSQL, but there is an insecurity feeling regarding the solution, basically because these boxes would be exposed to an insecure environment and insecure procedures like: non controlled power outages, untrusted hardware, no appropriate environment for hardware (dusty, hot) etc... Welcome to my world!

To clarify your requirements:
- the devices contain hard currency which they either *accept* from the user, dispense *to* the user, or both.
- the contents of the DB are critical to the proper operation of the device (e.g., perhaps they maintain account balances that, if corruptible, represent an exploit)
- presumably, the environment is secure enough that an adversary can't just remove the entire device to abscond with the monies therein (i.e., while it may not be as secure as a bank lobby, neither is it sitting in the middle of an open field! Perhaps sitting *in* a storefront establishment?)
- the device has some physical safeguards to prevent an adversary vandalizing it GIVEN THE LEVEL OF SUPERVISION IN ITS ENVIRONMENT (e.g., the coinbox of a US pay phone is pretty hard to break into)
- yet, a subtle, clever hacker could tease the device into a state that is more generous to his wishes (e.g., a high voltage discharge at a key point along the case/wiring to crash the processor at a critical point in its transaction processing)
- a single device may not be a significant value target -- but an exploit that can be reapplied to *each* target (possibly repeatedly, over time) makes this a sufficiently large honey pot
- you're very confident in the robustness of the underlying OS, application and PostgreSQL itself that you don't fear problems from any of those in normal operation (e.g., the kernel isn't likely to panic due to some unforeseen out-of-memory condition -- nor the processor hiccup from operating in an overtemperature condition)

Our major concern is related to the write-back issues we can face in this environment. Is there a way to prevent it or a way PG can be configured to detect write-back configurations?

Don't think so, but... I'm not sure what specific threat you are anticipating, here. Could you posit a potential attack scenario against which you would like to defend?

Basically, the question is: is there a way to guarantee a reliable PG database in an untrusted and insecure environment? Maybe some kind of file system could give this guarantee, not sure..

Absent *physical* security, you can't do *anything*! (especially given your reliance on FOSS components!) E.g., an adversary can disembowel the device and massage the contents of the raw media, etc.

Having designed (regulated) gaming and grey area devices (each handling hard currency), I can tell you that you have to have already performed a pretty exhaustive threat analysis (e.g., red team, blue team) *before* you start the product's design.
If you can't imagine *all* of the ways you can be targeted, then you can't determine how/if you will be secure in each of those scenarios (e.g., I've incorporated features into the hardware designs to counter certain types of physical attacks). Always consider how you can design the system to *screw* the attacker instead of *reward* him, given the option. E.g., given the (fictitious) example of allowing the user to withdraw funds from his account (possibly), then opt for:

  decrease_balance(amount)
  dispense_currency(amount)

instead of:

  dispense_currency(amount)
  decrease_balance(amount)

as the latter case could allow an attacker to interrupt the atomic operation *after* receiving his hard currency yet before a record of this has been made. In the former case, an attack results in his account *losing* money without ever receiving the hard currency! Attackers are shy about complaining that "the machine cheated me" as that brings attention to their activities! (Oh? Let's review the video surveillance to see what happened... :) Of course, the above *should* be atomic, but you can't roll back a *physical* transaction (i.e., reach out and grab the currency *from* him if the transaction is aborted). Perhaps if you can highlight specific actions that you imagine as being vulnerabilities, a compromise solution might be available? [BTW, I find these the most delightful projects as you *really* have to think outside the box -- unlike a desktop/server environment where all you have to worry about is a disk crash, DoS attack, etc.] G'luck! --don -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
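The debit-before-dispense ordering argued for above can be sketched as follows. All names here are illustrative (not from the thread), and a real system would commit the debit in a database transaction before triggering the dispenser; a crash or attack between the two steps then leaves the account debited with no cash paid out, which punishes the attacker rather than rewarding him:

```python
class DispenseError(Exception):
    """Raised when the physical cash dispenser fails (e.g., a jam)."""

def withdraw(balances, account, amount, dispense):
    """Debit first, then dispense; refund only on a *reported* dispense failure."""
    if balances[account] < amount:
        raise ValueError("insufficient funds")
    balances[account] -= amount      # step 1: durable debit (commit happens here)
    try:
        dispense(amount)             # step 2: irreversible physical action
    except DispenseError:
        # Machine kept the cash, so a compensating refund is safe.
        balances[account] += amount
        raise
```

Note the asymmetry: an interruption *between* the steps (power glitch, induced crash) is never compensated automatically, because the software cannot know whether cash left the machine; that case goes to a human audit, exactly as the email suggests.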
Re: [GENERAL] PG in cash till machines
On 5/10/2013 2:11 PM, Bexley Hall wrote: Having designed (regulated) gaming and grey area devices (each handling hard currency), I can tell you that you have to have already performed a pretty exhaustive threat analysis (e.g., red team, blue team) *before* you start the product's design. If you can't imagine *all* of the ways you can be targeted, then you can't determine how/if you will be secure in each of those scenarios (e.g., I've incorporated features into the hardware designs to counter certain types of physical attacks). indeed, and there are always threat models that no one could foresee; witness the recent story of coordinated ATM withdrawals of $45,000,000 enabled by some back-door hacking of the bank databases. -- john r pierce 37N 122W somewhere on the middle of the left coast