Re: [GENERAL] Hardware requirements for a PostGIS server
Responses in-line: On Tue, 10 Feb 2015 19:52:41 -0500 Mathieu Basille basille@ase-research.org wrote: I am posting here a question that I initially asked on the PostGIS list [1], where I was advised to try here too (I will keep both lists updated about the developments on this issue). I am currently planning to set up a PostgreSQL + PostGIS instance for my lab. Turns out I believe this would be useful for the whole center, so that I'm now considering setting up the server for everyone, if interest is shared of course. At the moment, I am however struggling with what would be required in terms of hardware, and of course, the cost will depend on that; at the end of the day, it's really a matter of money well spent. I have then a series of questions/remarks, and I would welcome any feedback from people with existing experience on setting up a multi-user PostGIS server. I'm insisting on the PostGIS aspect, since the heaviest requests will be GIS requests (intersections, spatial queries, etc.). However, people with similar PostgreSQL setups may have very relevant comments about their own configuration. * My own experience with servers is rather limited: I used PostGIS quite a bit, but only on a desktop, with only 2 users. The desktop was quite good (quad-core Xeon, 12 GB RAM, 500 GB hd), running Debian, and we never had any performance issue (although some queries were rather long, but still acceptable). * The use case I'm envisioning would be (at least in the foreseeable future): - About 10 faculty users (which means potentially a little bit more students using it); I would have a hard time considering more than 4 concurrent users; - Data would primarily involve a lot (hundreds/thousands) of high resolution (spatial and temporal) raster and vector maps, possibly over large areas (Florida / USA / continental), as well as potentially millions of GPS records (animals individually monitored); - Queries will primarily involve retrieving points/maps over given areas/time, as well as intersecting points over environmental layers [from what I understand, a lot of I/O, with many intermediary tables involved]; other use cases will involve working with steps, i.e. the straight line segment connecting two successive locations, and intersecting them with environmental layers; * I couldn't find comprehensive or detailed guidelines on-line about hardware, but from what I could see, it seems that memory wouldn't be the main issue, but the number of cores would be (one core per database connection if I'm not mistaken). At the same time, we want to make sure that the experience is smooth for everyone... I was advised on the PostGIS list to take a look at pgpool (however, UNIX only). # of cores helps in parallel processing. But 4 simultaneous users doesn't necessarily mean 4 simultaneous queries. How much time do your users spend running queries vs. idling? If you don't expect more than 4 concurrent users, I would think you'll be fine with a single quad-core CPU. I would get the fastest CPU available, though, as it will make number crunching go faster. I can't see any reason why you'd want/need pgpool. pgpool is generally useful when you have a LOT of simultaneous connections, and you're only estimating 4. Additionally, pgpool is fairly easy to add on later if you need it ... so my recommendation would be not to worry about it just yet. * Has anyone worked with a server running the DB engine, while the DB itself was stored on another box/server?
That would likely be the case here since we already have a dedicated box for file storage. Along these lines, does the system of the file storage box matter (Linux vs. MS)? Yes. If you have a lot of data that will need to be crunched, I would consider getting SSDs directly attached to the computer running Postgres. Anything you put between RAM and your disks that slows down transfers is going to hurt performance. However, since you haven't made an estimate of the physical size of the data, I can't comment on whether sufficient SSD storage is cost effective or not. If you can't get DAS storage, you can make up for some of the performance hit by getting lots of RAM. Part of the effectiveness of the RAM is dependent on the OS and its storage drivers, though, and I have no experience with how well Windows does that ... and since you didn't mention which file storage technology you're using, I can't comment on that either. SAN and NAS storage vary wildly from brand to brand in their performance characteristics, so it's difficult to say unless you can find someone who has tried the exact hardware you're liable to be using. If performance is important, I highly recommend DAS, and furthermore SSDs if you can afford them. * We may also use the server as a workstation to streamline PostGIS processing with further R analyses/modeling (or even use R from within the database using PL/R).
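A note on the queries-vs-idling question above: pg_stat_activity makes it easy to measure this on the existing desktop setup before sizing the new machine. A rough sketch (the state column assumes PostgreSQL 9.2 or later):

-- How many sessions are connected, and how many are actually running something?
SELECT state, count(*) AS sessions
FROM pg_stat_activity
GROUP BY state;

-- The statements currently executing, longest-running first.
SELECT pid, now() - query_start AS runtime, left(query, 60) AS query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY runtime DESC;

If 'active' rarely climbs above one or two even when several people are logged in, that supports the single quad-core recommendation above.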
Re: [GENERAL] Improving performance of merging data between tables
Sorry, it took me a while to respond, but I re-factored all of this process following the suggestions. On Wed, Jan 7, 2015 at 7:49 PM, Maxim Boguk maxim.bo...@gmail.com wrote: On Wed, Jan 7, 2015 at 8:49 PM, Pawel Veselov pawel.vese...@gmail.com wrote: PS: your setup looks pretty complicated and hard to analyze without seeing all involved table structures, transaction/query flow, and (especially) involved procedures source code. Sure :) At this point, I've put together the bulk merge code as well. I can't quite see much of a difference, actually, but it's hard to trust the execution times, as on the same amount of data they vary from, say, 0.5s to 2s, and the sample data is not stepping on any other locks. In general, I'm afraid all those left joins and multiple scans, even over a small amount of data, are nullifying any positive effect. Now some ideas to check. High CPU usage usually isn't related to locking, but to seq scans, wrong plans, or simply inefficient pl/pgsql code; locked processes usually don't use too much CPU. 1) On the test database perform select pg_stat_reset(); then perform a full round of merges, then check select * from pg_stat_user_tables where seq_scan > 0 order by seq_tup_read; and if you find a lot of seq_scan and seq_tup_read on a particular table, try to find where they are coming from (it could be the reason for the high CPU usage). 2) Enable track_functions in postgresql.conf and perform the same sequence (select pg_stat_reset() + full round of merges), then check select * FROM pg_stat_user_functions order by self_time desc; and check which function is using the most time. These are good pointers; if the new process is having the same sort of problems, this will come in handy in figuring out where they are coming from, thank you. 3) Old/lost prepared transactions can have a deadly effect on database performance as a whole. So check select * from pg_prepared_xacts; and verify that you don't have hours- (or weeks-) old prepared xacts lying around. If there are lost prepared transactions, they will lock up a particular instance from being able to write into its table data, so it will just stall the node. But it does happen, and we have an application mechanism to find and delete those. PS: btw I still don't fully understand the relation between: - merges data into its own node tables (using merge_xxx PL/pgSQL functions) and the provided code for public.merge_all02-9A-46-8B-C1-DD and PUBLIC.merge_agrio. As I see it, public.merge_all02-9A-46-8B-C1-DD calls PUBLIC.merge_agrio, and PUBLIC.merge_agrio updates a global table R_AGRIO (but not the node's own table). It's a bit irrelevant at this point, but: the merge_all02-9A-46-8B-C1-DD() function will take all data for the 02-9A-46-8B-C1-DD node and move it into the master table. There is an analogous merge_02-9A-46-8B-C1-DD() function that takes data from the application, and writes it into the tables for the 02-9A-46-8B-C1-DD node. The process of moving data node tables -> main tables and application -> node tables is nearly identical, hence I only provided the body once. The big difference is that when merging into master, there is a lot more data to look through, as node tables only contain data that has not yet been merged into the master. I think the best implementation of such a task is asynchronous processing of these changes via a background process.
An application only inserts events into the queue table (it's a lockless process), and some background process reads this data from the queue table and merges it into the main table (again lockless, because it is a single thread so there are no concurrent writes), and then deletes the merged data from the queue table. Well, that was a really good suggestion, thank you. Some weeks later, I've put it together. This hasn't hit production yet, so I'm yet to see the overall improvement effect. Along with turning it into a queue, I've added provisions to try to combine as much data as possible before writing it out into the database tables, and merged all of the satellite tables with the main data. Before, I had:
- r_agrio
- r_brk_xxx (multiple entries reference rows in r_agrio)
Now, I have:
- r_agrio_daily
- r_agrio_total
- r_agrio_hourly
All the data that was in the r_brk_xxx tables is now in columns of the r_agrio* tables. To get around the fact that there are potentially multiple BRK records for each AGR record, the data is now stored as a JSON object. The primary key used for the BRK tables is turned into a string that serves as a key in a top-level JSON object. This should help me tremendously on the side that needs to read that data, as I had to join or left join the BRK tables. Splitting this into 3 tables may come back and bite me, since it's two more inserts and corresponding look-ups, but it seriously helps me on the reading side of things. The code that aggregates the JSON data is still done in PL/pgSQL, which is probably a bad idea, considering that PL doesn't have good ways of manipulating JSON objects.
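For readers who want to picture the queue approach described above, here is a minimal sketch of its shape. The names (event_queue, r_agrio_total, key, cnt) are invented for illustration, not taken from Pawel's schema, and the ON CONFLICT form needs PostgreSQL 9.5 or later; on older releases the single merge worker would do an UPDATE followed by an INSERT of the remainder instead.

-- Application side: append-only, so concurrent writers never block each other.
CREATE TABLE event_queue (id bigserial PRIMARY KEY, key text NOT NULL, cnt integer NOT NULL);
CREATE TABLE r_agrio_total (key text PRIMARY KEY, cnt bigint NOT NULL DEFAULT 0);

INSERT INTO event_queue (key, cnt) VALUES ('some-key', 1);

-- Single background worker: drain the queue and fold it into the main table.
-- The DELETE only sees rows visible when the statement started, so anything the
-- application inserts meanwhile is simply left for the next run.
WITH batch AS (
    DELETE FROM event_queue RETURNING key, cnt
)
INSERT INTO r_agrio_total AS t (key, cnt)
SELECT key, sum(cnt) FROM batch GROUP BY key
ON CONFLICT (key) DO UPDATE SET cnt = t.cnt + EXCLUDED.cnt;

Because only one process runs the merge statement, there are no concurrent writers to the main table and hence no lock contention on it.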
Re: [GENERAL] Hardware requirements for a PostGIS server
I am currently planning to set up a PostgreSQL + PostGIS instance for my lab. Turns out I believe this would be useful for the whole center, so that I'm now considering setting up the server for everyone—if interest is shared of course. At the moment, I am however struggling with what would be required in terms of hardware Just for perspective, here are the specs required to run a Nominatim server, which uses PostGIS to do geocoding on OpenStreetMap data: http://wiki.openstreetmap.org/wiki/Nominatim/Installation Of course maybe your users have more detailed data, but at least that link will give you something to think about. Good luck! Paul -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hardware requirements for a PostGIS server
On 11/02/15 13:52, Mathieu Basille wrote: Dear PostgreSQL users, I am posting here a question that I initially asked on the PostGIS list [1], where I was advised to try here too (I will keep both lists updated about the developments on this issue). I am currently planning to set up a PostgreSQL + PostGIS instance for my lab. Turns out I believe this would be useful for the whole center, so that I'm now considering setting up the server for everyone—if interest is shared of course. At the moment, I am however struggling with what would be required in terms of hardware, and of course, the cost will depend on that—at the end of the day, it's really a matter of money well spent. I have then a series of questions/remarks, and I would welcome any feedback from people with existing experience on setting up a multi-user PostGIS server. I'm insisting on the PostGIS aspect, since the most heavy requests will be GIS requests (intersections, spatial queries, etc.). However, people with similar PostgreSQL setup may have very relevant comments about their own configuration. * My own experience about servers is rather limited: I used PostGIS quite a bit, but only on a desktop, with only 2 users. The desktop was quite good (quad-core Xeon, 12 Go RAM, 500 GB hd), running Debian, and we never had any performance issue (although some queries were rather long, but still acceptable). * The use case I'm envisioning would be (at least in the foreseeable future): - About 10 faculty users (which means potentially a little bit more students using it); I would have hard time considering more than 4 concurrent users; - Data would primarily involve a lot (hundreds/thousands) of high resolution (spatial and temporal) raster and vector maps, possibly over large areas (Florida / USA / continental), as well as potentially millions of GPS records (animals individually monitored); - Queries will primarily involve retrieving points/maps over given areas/time, as well as intersecting points over environmental layers [from what I understand, a lot of I/O, with many intermediary tables involved]; other use cases will involve working with steps, i.e. the straight line segment connecting two successive locations, and intersecting them with environmental layers; * I couldn't find comprehensive or detailed guidelines on-line about hardware, but from what I could see, it seems that memory wouldn't be the main issue, but the number of cores would be (one core per database connection if I'm not mistaken). At the same time, we want to make sure that the experience is smooth for everyone... I was advised on the PostGIS list to give a look at pgpool (however, UNIX only). * Is there a difference in terms of possibilities, performance and usability between a Linux-based and a MS-based server (from the user perspective)? My center is unfortunately MS-centered, and existing equipment runs with MS systems... It would thus be easier for them to set up a MS-based server. Does it change anything for the user? (I insist on the user perspective, since I and others will not admin the system, but only use it) * Does anyone have worked with a server running the DB engine, while the DB itself was stored on another box/server? That would likely be the case here since we already have a dedicated box for file storage. Along these lines, does the system of the file storage box matter (Linux vs. MS)? 
* We may also use the server as a workstation to streamline PostGIS processing with further R analyses/modeling (or even use R from within the database using PL/R). Again, does anyone have experience doing it? Is a single workstation the recommended way to work with such workflow? Or would it be better (but more costly) to have one server dedicated to PostGIS and another one, with different specs, dedicated to analyses (R)? I realize my questions and comments may be a confusing, likely because of the lack of experience about these issues on my side. I really welcome any feedback of people working with PostgreSQL servers (+ PostGIS ideally!) in a small unit, or any similar setting that could be informative! In advance, thank you very much! Sincerely, Mathieu Basille. [1] Start of the thread here: http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html When I looked at tuning options for PostgreSQL, I found that there were limitations mentioned for Microsoft O/S's. I get the general impression from my reading from multiple sources over the years, that if you are serious about performance on a server, then you should prefer Linux to Microsoft. Note that most servers run Linux, and that over 95% of the top 500 super computers runs Linux - which is rather telling about the perception of Microsoft's performance. Cheers, Gavin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:
Re: [GENERAL] Cluster seems broken after pg_basebackup
Adrian, in response to your question: 2015-02-06 07:11:38 EST FATAL: role "208375PT$" does not exist So where is role 208375PT$ supposed to come from? I found that when I stop/start/restart pgsql through the services.msc application in Windows, this message is issued in the log file. This makes sense since the account I use to start the services.msc application is my admin account, 208375PT\Admlocal. If I use instead the command line with pg_ctl restart -U postgres, I don't get this message in the log file. 2015-02-06 11:28 GMT-05:00 Adrian Klaver adrian.kla...@aklaver.com: On 02/06/2015 05:03 AM, Guillaume Drolet wrote: Hi, Yesterday I ran a pg_basebackup of my cluster. Since it has completed, my cluster doesn't work properly. I tried restarting the computer (or service) a few times but I always get the same messages in my logs (they are in French; translated into English below): Enter Google Translate :) First some questions: 1) What Postgres version? 2) What OS(s)? I am assuming Windows from the log info below, but we all know what assuming gets you. 3) Where were you backing up from and to? 4) Which cluster does not start, the master or the child you created with pg_basebackup?
2015-02-06 07:11:38 EST LOG: database system was interrupted; last known up at 2015-02-06 07:05:05 EST
2015-02-06 07:11:38 EST LOG: database system was not properly shut down; automatic recovery in progress
2015-02-06 07:11:38 EST LOG: record with zero length at 24B/2C000160
2015-02-06 07:11:38 EST LOG: redo is not required
2015-02-06 07:11:38 EST LOG: database system is ready to accept connections
2015-02-06 07:11:38 EST LOG: autovacuum launcher started
2015-02-06 07:11:38 EST FATAL: role "208375PT$" does not exist
So where is role 208375PT$ supposed to come from? Then if I start pgAdmin I get a series of pop-ups I have to click OK to continue: An error has occurred: Column not found in pgSet: datlastsysoid An error has occurred: Column not found in pgSet: datlastsysoid An error has occurred: Column not found in pgSet: oid An error has occurred: Column not found in pgSet: encoding An error has occurred: Column not found in pgSet: Connection to database broken Not sure about this, someone more versed in pgAdmin will have to answer. And after that, I went back to the log file and there's new information added:
2015-02-06 07:51:05 EST LOG: server process (PID 184) was terminated by exception 0x8004
2015-02-06 07:51:05 EST DETAIL: Failed process was running: SELECT version();
2015-02-06 07:51:05 EST HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
Well according to here: https://msdn.microsoft.com/en-us/library/cc704588.aspx 0x8004 STATUS_SINGLE_STEP {EXCEPTION} Single Step A single step or trace operation has just been completed. A developer is going to have to explain what that means.
2015-02-06 07:51:05 EST LOG: terminating any other active server processes
2015-02-06 07:51:05 EST WARNING: terminating connection because of crash of another server process
2015-02-06 07:51:05 EST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-02-06 07:51:05 EST HINT: In a moment you should be able to reconnect to the database and repeat your command.
2015-02-06 07:51:05 EST LOG: archiver process (PID 692) exited with exit code 1
2015-02-06 07:51:05 EST LOG: all server processes terminated; reinitializing
2015-02-06 07:51:15 EST FATAL: pre-existing shared memory block is still in use
2015-02-06 07:51:15 EST HINT: Check if there are any old server processes still running, and terminate them.
I was about to try restarting postgresql using the base backup I made yesterday, but since this means I'll have to copy my database again (700 GB takes a while...) I am looking for a better solution from more experienced people. My suspicion is you copied at least partly over a running server. Thanks a lot for helping! Guillaume -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] Logical Decoding Callbacks
Hi, On 2015-02-09 16:52:59 -0800, Xavier Stevens wrote: I'm trying to write a logical decoding plugin and I'm seeing some interesting behavior around the startup callback. When I use psql and the built-in SQL functions (like pg_logical_slot_peek_changes) to use my module I see the startup_cb get called. I have written my own streaming replication client and for some reason when I use my client the startup_cb does not get called, but I can still see the changes just fine. The reason why the startup_cb is particularly important in my case, is I'm trying to set two dynamic OIDs for PostGIS geometry and geography types. I've tried setting them both as globals and as part of the output_plugin_private data. Any idea what's wrong here? Did I miss something in the streaming replication documentation? Hm, that's odd. The startup callback should (and is, I used it just a couple hours ago) definitely be called when using the streaming replication protocol. You're saying that the callback is not called when you do START_LOGICAL_REPLICATION ... ? Can you reproduce the problem with test_decoding? Because that'd actually crash if no startup callback were registered... Can you show your code/a testcase? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
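For anyone wanting to try the reproduction suggested above, the SQL-level interface is enough to check whether a plugin's startup_cb fires (this is the path Xavier reports as working); pg_recvlogical then drives the same slot over the streaming replication protocol for comparison. A minimal sketch using the built-in test_decoding plugin; the slot and table names are arbitrary:

-- Requires wal_level = logical and max_replication_slots > 0 in postgresql.conf.
SELECT * FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');

-- Generate a change and peek at it (peeking does not consume the stream).
CREATE TABLE decoding_test (id serial PRIMARY KEY, data text);
INSERT INTO decoding_test (data) VALUES ('hello');
SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);

-- Clean up when done.
SELECT pg_drop_replication_slot('regression_slot');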
Re: [GENERAL] Logical Decoding Callbacks
On 2015-02-10 11:01:08 +0900, Michael Paquier wrote: Just in case, I have just done a quick test with pg_recvlogical on latest HEAD of REL9_4_STABLE and the startup_cb gets called: DEBUG: received replication command: IDENTIFY_SYSTEM DEBUG: received replication command: START_REPLICATION SLOT slot LOGICAL 0/0 LOG: called startup_cb_wrapper Note that on 9.5 (master HEAD at c619c23) I am seeing an assertion failure. What is the server version you are using? I would imagine 9.4 but let's be sure. Uh, which plugin and what assertion? What's the backtrace? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Performance slowing down when doing same UPDATE many times
Hi, we recently found a bug in one of our applications which was doing exactly the same UPDATE operation a few thousand times inside a transaction. This caused the UPDATEs to become slower and slower, from a few milliseconds to several seconds. We already fixed the application, but I am wondering if this might be a PostgreSQL bug, too. Here is a simple test case that performs and benchmarks 100,000 UPDATEs (benchmarking only every 10,000th to reduce output):
BEGIN;
CREATE TEMP TABLE test (id integer PRIMARY KEY, flag boolean DEFAULT false);
INSERT INTO test (id) SELECT generate_series(1, 100000);
DO $$
DECLARE
  s timestamp;
  e timestamp;
BEGIN
  FOR i IN 1..100000 LOOP
    SELECT clock_timestamp() INTO s;
    UPDATE test SET flag = true WHERE id = 12345;
    SELECT clock_timestamp() INTO e;
    IF i % 10000 = 0 THEN
      RAISE NOTICE '%', e - s;
    END IF;
  END LOOP;
END $$;
ROLLBACK;
The output looks like this:
NOTICE: 00:00:00.000525
NOTICE: 00:00:00.000992
NOTICE: 00:00:00.001404
NOTICE: 00:00:00.001936
NOTICE: 00:00:00.002374
NOTICE: 00:00:00.002925
NOTICE: 00:00:00.003525
NOTICE: 00:00:00.004015
NOTICE: 00:00:00.00453
NOTICE: 00:00:00.004976
The problem only occurs inside a transaction and only if the same record is updated again and again. I'm using PostgreSQL 9.1.15. Jan
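For what it's worth, one common application-side fix for this pattern, if it is acceptable to skip updates that would not change anything, is to add a guard clause so the statement stops matching once the value is already set; after the first iteration no new row version is created inside the transaction, so the timings stay flat. A sketch against the test table above (not the original application code):

-- Only the first execution actually modifies the row; later ones match nothing.
UPDATE test SET flag = true
WHERE id = 12345
  AND flag IS DISTINCT FROM true;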
Re: [GENERAL] Command to reset entire PgSQL cluster?
On Tue, 10 Feb 2015 13:02:05 -0600 Felipe Gasper fel...@felipegasper.com wrote: On 2/10/15 12:57 PM, David G Johnston wrote: Felipe Gasper wrote Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Not that I am aware of. If you describe your use-case then meaningful suggestions could be offered. We have a lot of code that tests interaction with a PgSQL cluster on the same server that has a live production cluster. To keep the testing environment separate from production, we create a PgSQL cluster in a temp directory then run tests against that. This, as you can imagine, is pretty expensive to set up each time … initdb is not fast! It would be nice if we could just create a single test PgSQL cluster then wipe it clean at the end of each test. Without more details, I can't be sure that this will work for you, but it seems like you should be able to simply drop/create the database that the tests use to rebuild the environment, since most things associated with a test environment will be destroyed when the database is dropped. There are some exceptions, such as roles and users, but I wouldn't think you would need to wipe/recreate those. In any event, I don't know if that suggestion will help, but it's what was successful for me. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
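For the drop/recreate approach, the per-test cycle is essentially two statements run from a maintenance database; a sketch, assuming the tests live in a dedicated database (here called testdb, owned by a hypothetical test_role):

DROP DATABASE IF EXISTS testdb;
CREATE DATABASE testdb OWNER test_role TEMPLATE template0;

Creating a database from a template is a file-level copy, so it is far cheaper than a fresh initdb; if the tests need preloaded schemas or extensions, keep them in a custom template database and name that in the TEMPLATE clause instead of template0.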
Re: [GENERAL] Logical Decoding Callbacks
Sorry to raise the issue on startup_cb. I added a whole bunch of logging statements and I was only running the section of code I wanted when the startup callback had options. This now gets me to the next issue I encounter. In my output plugin, I'm trying to use the SPI interface to query about PostGIS OIDs in the startup callback. Just calling SPI_connect() seems to be causing a segfault. This is the last thing I see in the logs before the segfault occurs: https://github.com/xstevens/decoderbufs/blob/master/src/decoderbufs.c#L151 On Tue, Feb 10, 2015 at 4:18 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Feb 10, 2015 at 5:59 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-10 11:01:08 +0900, Michael Paquier wrote: Just in case, I have just done a quick test with pg_recvlogical on latest HEAD of REL9_4_STABLE and the startup_cb gets called: DEBUG: received replication command: IDENTIFY_SYSTEM DEBUG: received replication command: START_REPLICATION SLOT slot LOGICAL 0/0 LOG: called startup_cb_wrapper Note that on 9.5 (master HEAD at c619c23) I am seeing an assertion failure. What is the server version you are using? I would imagine 9.4 but let's be sure. Uh, which plugin and what assertion? What's the backtrace? I am raising that on a new thread on -hackers, that's not related to the report here. Thanks for reminding. -- Michael
Re: [GENERAL] Command to reset entire PgSQL cluster?
On 2/10/15 12:57 PM, David G Johnston wrote: Felipe Gasper wrote Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Not that I am aware of. If you describe your use-case then meaningful suggestions could be offered. We have a lot of code that tests interaction with a PgSQL cluster on the same server that has a live production cluster. To keep the testing environment separate from production, we create a PgSQL cluster in a temp directory then run tests against that. This, as you can imagine, is pretty expensive to set up each time … initdb is not fast! It would be nice if we could just create a single test PgSQL cluster then wipe it clean at the end of each test. -FG -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Command to reset entire PgSQL cluster?
Hello, Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Thank you! -FG -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Command to reset entire PgSQL cluster?
On 02/10/2015 10:49 AM, Felipe Gasper wrote: Hello, Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Something like: cp -r pgdata/ pgdata_bak/ at the initial state. When you want to reset: cp -r pgdata_bak/ pgdata/ Thank you! -FG -- Adrian Klaver adrian.kla...@aklaver.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Command to reset entire PgSQL cluster?
On 2/10/2015 11:02 AM, Felipe Gasper wrote: On 2/10/15 12:57 PM, David G Johnston wrote: Felipe Gasper wrote Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Not that I am aware of. If you describe your use-case then meaningful suggestions could be offered. We have a lot of code that tests interaction with a PgSQL cluster on the same server that has a live production cluster. To keep the testing environment separate from production, we create a PgSQL cluster in a temp directory then run tests against that. This, as you can imagine, is pretty expensive to set up each time … initdb is not fast! It would be nice if we could just create a single test PgSQL cluster then wipe it clean at the end of each test. use a file system, like zfs, that supports snapshots. initdb the first time, configure your system, then snapshot it. each successive time, stop postgres, revert the snapshot, restart postgres. -- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Command to reset entire PgSQL cluster?
Felipe Gasper wrote Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Not that I am aware of. If you describe your use-case then meaningful suggestions could be offered. David J. -- View this message in context: http://postgresql.nabble.com/Command-to-reset-entire-PgSQL-cluster-tp5837399p5837400.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Command to reset entire PgSQL cluster?
On 2/10/2015 10:49 AM, Felipe Gasper wrote: Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? backup the .conf file(s), initdb, restore the .conf files. -- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Command to reset entire PgSQL cluster?
On 10/02/2015 19:02, Felipe Gasper wrote: On 2/10/15 12:57 PM, David G Johnston wrote: Felipe Gasper wrote Is there a quick way to reset a PgSQL cluster to its pristine state--i.e., to what initdb gives but preserving configuration customizations? Not that I am aware of. If you describe your use-case then meaningful suggestions could be offered. We have a lot of code that tests interaction with a PgSQL cluster on the same server that has a live production cluster. To keep the testing environment separate from production, we create a PgSQL cluster in a temp directory then run tests against that. This, as you can imagine, is pretty expensive to set up each time … initdb is not fast! It would be nice if we could just create a single test PgSQL cluster then wipe it clean at the end of each test. How about this? - - Create your new, pristine cluster. - Stop the PG server. - Make a filesystem copy of the data directory to somewhere else. - Restart the server. Then, when you want to restore the cluster to its pristine state, you could just stop the server, delete the data directory, restore it from the copy made above, then restart the server. You'd need to make sure that permissions were correct. Mind you, I've never found initdb particularly slow... Ray. -- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Stability of JSON textual representation
On Mon, Feb 9, 2015 at 3:07 AM, David Evans david.ev...@cantab.net wrote: On 9 February 2015 at 03:57, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Would using jsonb be more consistent? Yes, casting to jsonb seems to produce consistent output:
# SELECT array_to_json(array[1, 2, 3])::jsonb, json_build_array(1, 2, 3)::jsonb;
 array_to_json | json_build_array
---------------+------------------
 [1, 2, 3]     | [1, 2, 3]
That makes sense I suppose, given that only the structure is preserved in the jsonb representation. This is not really due to the whitespace/'structure only' preservation aspects of json and jsonb, respectively. What is happening here is that the various functions that render json from text are not consistent in terms of spacing. Personally, I greatly prefer the xxx_to_json functions because they don't insert spurious whitespace, making for more compact documents; putting in spaces for aesthetic purposes is the job for a prettifier. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
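The spacing difference Merlin describes is visible without the jsonb cast; a quick example (output spacing as observed on 9.4-era releases):

SELECT array_to_json(ARRAY[1, 2, 3])::text AS from_array,
       json_build_array(1, 2, 3)::text     AS from_builder;

-- from_array | from_builder
-- -----------+--------------
-- [1,2,3]    | [1, 2, 3]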
Re: [GENERAL] Logical Decoding Callbacks
On 2015-02-10 10:33:41 -0800, Xavier Stevens wrote: Sorry to raise the issue on startup_cb. I added a whole bunch of logging statements and I was only running the section of code I wanted when the startup callback had options. Heh. Just to make sure: you can pass options via the replication protocol too. This now gets me to the next issue I encounter. In my output plugin, I'm trying to use the SPI interface to query about PostGIS OIDs in the startup callback. Just calling SPI_connect() seems to be causing a segfault. This is the last thing I see in the logs before the segfault occurs: https://github.com/xstevens/decoderbufs/blob/master/src/decoderbufs.c#L151 The problem likely is that in the startup callback you're neither guaranteed to be in a transaction, nor to have a snapshot set up. It'd generally be easier to analyze such problems if you provide a backtrace (e.g. by enabling core files). Another generally very advisable thing to do when developing code running in the backend is to enable assertions (you may already do that...). You can look up types much more easily than that, btw. Cf. TypenameGetTypid(typname). But note that both that and what you do would possibly fail if there's more than one geometry type around. You could either hardcode postgis' schema name and use
namespaceId = LookupExplicitNamespace(schemaname, false);
typoid = GetSysCacheOid2(TYPENAMENSP, PointerGetDatum(typname), ObjectIdGetDatum(namespaceId));
if (typoid == InvalidOid)
    elog(ERROR, "cache lookup failed for type %u", typoid);
or be a bit more complex and look up the postgis extension's schema pg_extension.extnamespace first. Anyway, to run full queries in the startup callback you're going to have to do something like:
if (!IsTransactionState())
{
    tx_started = true;
    StartTransactionCommand();
}
PushActiveSnapshot(GetTransactionSnapshot());
/* do your stuff */
PopActiveSnapshot();
if (tx_started)
    CommitTransactionCommand();
Note that the begin, change, commit callbacks *do* run with a transaction and snapshot set up. But you can't run general SQL queries - only catalog tables (or tables marked as such) are accessible. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [ADMIN] [GENERAL] Change postgresql encoding
Thanks, problem was mine, all is ok now, thanks. I were putting es_ES@iso885915 or es_ES@latin9 and it wasn't working .. Solution was using es_ES.iso885915@euro for collation and ctype (and latin9 for encoding). Thanks to all! Regards... 2015-02-10 23:26 GMT+00:00 Jeremiah Ocasio joca...@kaleidoscopemediasystems.com: Hello Adrian, Not sure if this is what your looking for insofar as a solution but have you taken a look at this page from the documentation. http://www.postgresql.org/docs/9.3/static/multibyte.html On Mon, Feb 9, 2015 at 12:11 AM, Oliver ofab...@gmail.com wrote: 2015-02-09 7:52 GMT+00:00 Oliver ofab...@gmail.com: 2015-02-08 20:44 GMT+00:00 Adrian Klaver adrian.kla...@aklaver.com: On 02/08/2015 11:20 AM, Oliver wrote: Hi, I have a new postgresql installation done, it has utf-8 encoding. I have to make a Oracle database migration and it has ISO8859-15 encoding, should I change postgresql encoding to same Oracle encoding or with utf-8 it should go well? If I want change postgresql encoding, I have understood that I should reinstall postgresql (I do installation from rpm official binary files for red hat), someone knows say me if when you install software it asks you for choosing encoding or it puts it from what you have in the system? (I did installation from many time ago and I don't remember it). Thanks beforehand. What version of Postgres? 9.3.4 What version of RedHat? Red Hat Enterprise Linux Server release 6.5 P.D.: I've tried change encoding to a new postgresql database to latin9, but it gives me error for having utf-8. What was the command you gave? CREATE DATABASE xxx WITH ENCODING 'LATIN9' OWNER=xxx TEMPLATE=template0 LC_COLLATE='es_ES.latin9' LC_CTYPE='es_ES.latin9' CONNECTION LIMIT=-1 TABLESPACE=xxx; What was the exact error message? regional configuration name not valid es_ES.latin9 SQL state: 42809 I've tried es_ES.iso8859-15 and same error. I'm using pgadmin III 1.18.1 for running querys. When I run SHOW client_encoding; it shows UNICODE. Regards... Thanks beforehand. -- Adrian Klaver adrian.kla...@aklaver.com If I set encoding to latin9, lc_collate and lc_ctype to 'C', database is created correctly, but I'm not sure if it is ok :-? I want have database with iso8859-15 encoding. My system has, when I run 'locale', the next: LANG=es_ES.UTF-8 LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=es_ES.UTF-8 LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=es_ES.UTF-8 LC_ADDRESS=es_ES.UTF-8 LC_TELEPHONE=es_ES.UTF-8 LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=es_ES.UTF-8 LC_ALL= Thanks beforehand. -- Jeremiah Ocasio joca...@kaleidoscopemediasystems.com 1200 Westlake Ave N, Suite 604 Seattle, WA 98109 Tel: 855-247-5231 Fax: 206-400-2712 This electronic mail communication is intended only for the individual or entity to which it is addressed and contains privileged and confidential information. If you receive this communication in error, please advise us by return e-mail or call us immediately at 855-247-5231 and delete this communication and all copies and attachments. Any dissemination, distribution or copying of this communication is strictly prohibited. 
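Putting the fix described above into one statement, for anyone hitting the same error (the database name is a placeholder, and the locale must exist at the operating-system level; check with locale -a):

CREATE DATABASE mydb
    WITH ENCODING 'LATIN9'
         LC_COLLATE 'es_ES.iso885915@euro'
         LC_CTYPE   'es_ES.iso885915@euro'
         TEMPLATE template0;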
Re: [GENERAL] EXCLUDE constraint with not equals
Hi Thomas, The partial unique index would prevent multiple person records with the same email. I want to allow that as long as they agree on the value of user. Kai On Tue, Feb 10, 2015 at 2:14 AM, Thomas Kellerer spam_ea...@gmx.net wrote: Kai Groner wrote on 10.02.2015 at 01:38: Given the following table, I would like to ensure that all the rows for an email that have a user defined map to the same user.
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    user TEXT,
    email TEXT NOT NULL);
What I think I'm looking for is something like this:
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    user TEXT,
    email TEXT NOT NULL,
    EXCLUDE (email WITH =, user WITH <>) WHERE (user IS NOT NULL));
The not-equals comparison isn't supported, but it would be useful here. Is there another way to do this, short of creating a separate table that associates email and user? A partial unique index on (user, email) should do: create unique index on person (email, user) where user is not null; Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Logical Decoding Callbacks
On Tue, Feb 10, 2015 at 5:59 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-10 11:01:08 +0900, Michael Paquier wrote: Just in case, I have just done a quick test with pg_recvlogical on latest HEAD of REL9_4_STABLE and the startup_cb gets called: DEBUG: received replication command: IDENTIFY_SYSTEM DEBUG: received replication command: START_REPLICATION SLOT slot LOGICAL 0/0 LOG: called startup_cb_wrapper Note that on 9.5 (master HEAD at c619c23) I am seeing an assertion failure. What is the server version you are using? I would imagine 9.4 but let's be sure. Uh, which plugin and what assertion? What's the backtrace? I am raising that on a new thread on -hackers, that's not related to the report here. Thanks for reminding. -- Michael
Re: [ADMIN] [GENERAL] Change postgresql encoding
Hello Adrian, Not sure if this is what your looking for insofar as a solution but have you taken a look at this page from the documentation. http://www.postgresql.org/docs/9.3/static/multibyte.html On Mon, Feb 9, 2015 at 12:11 AM, Oliver ofab...@gmail.com wrote: 2015-02-09 7:52 GMT+00:00 Oliver ofab...@gmail.com: 2015-02-08 20:44 GMT+00:00 Adrian Klaver adrian.kla...@aklaver.com: On 02/08/2015 11:20 AM, Oliver wrote: Hi, I have a new postgresql installation done, it has utf-8 encoding. I have to make a Oracle database migration and it has ISO8859-15 encoding, should I change postgresql encoding to same Oracle encoding or with utf-8 it should go well? If I want change postgresql encoding, I have understood that I should reinstall postgresql (I do installation from rpm official binary files for red hat), someone knows say me if when you install software it asks you for choosing encoding or it puts it from what you have in the system? (I did installation from many time ago and I don't remember it). Thanks beforehand. What version of Postgres? 9.3.4 What version of RedHat? Red Hat Enterprise Linux Server release 6.5 P.D.: I've tried change encoding to a new postgresql database to latin9, but it gives me error for having utf-8. What was the command you gave? CREATE DATABASE xxx WITH ENCODING 'LATIN9' OWNER=xxx TEMPLATE=template0 LC_COLLATE='es_ES.latin9' LC_CTYPE='es_ES.latin9' CONNECTION LIMIT=-1 TABLESPACE=xxx; What was the exact error message? regional configuration name not valid es_ES.latin9 SQL state: 42809 I've tried es_ES.iso8859-15 and same error. I'm using pgadmin III 1.18.1 for running querys. When I run SHOW client_encoding; it shows UNICODE. Regards... Thanks beforehand. -- Adrian Klaver adrian.kla...@aklaver.com If I set encoding to latin9, lc_collate and lc_ctype to 'C', database is created correctly, but I'm not sure if it is ok :-? I want have database with iso8859-15 encoding. My system has, when I run 'locale', the next: LANG=es_ES.UTF-8 LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=es_ES.UTF-8 LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=es_ES.UTF-8 LC_ADDRESS=es_ES.UTF-8 LC_TELEPHONE=es_ES.UTF-8 LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=es_ES.UTF-8 LC_ALL= Thanks beforehand. -- Jeremiah Ocasio joca...@kaleidoscopemediasystems.com 1200 Westlake Ave N, Suite 604 Seattle, WA 98109 Tel: 855-247-5231 Fax: 206-400-2712 This electronic mail communication is intended only for the individual or entity to which it is addressed and contains privileged and confidential information. If you receive this communication in error, please advise us by return e-mail or call us immediately at 855-247-5231 and delete this communication and all copies and attachments. Any dissemination, distribution or copying of this communication is strictly prohibited. IRS Circular 230 disclosure: To ensure compliance with requirements imposed by the IRS and other taxing authorities, we inform you that any tax advice contained in this communication (including any attachments) is not intended or written to be used, and cannot be used, for the purpose of (i) avoiding penalties that may be imposed on any taxpayer or (ii) promoting, marketing or recommending to another party any transaction or matter addressed herein.
Re: [GENERAL] Logical Decoding Callbacks
There was no reason I needed to run full statements in this case. I just didn't know I could get the type ids like that. Thanks for all of your help Andres! On Tue, Feb 10, 2015 at 1:23 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-10 10:33:41 -0800, Xavier Stevens wrote: Sorry to raise the issue on startup_cb. I added a whole bunch of logging statements and I was only running the section of code I wanted when the startup callback had options. Heh. Just to make sure: You can pass options via replication protocol too. This now gets me to the next issue I encounter. In my output plugin, I'm trying to use the SPI interface to query about PostGIS OIDs in the startup callback. Just calling SPI_connect() seems to be causing a segfault. This is the last thing I see in the logs before the segfault occurs: https://github.com/xstevens/decoderbufs/blob/master/src/decoderbufs.c#L151 The problem likely is that in the startup callback you're neither guaranteed to be in a transaction, nor to have a snapshot set up. It'd generally be easier to analyze such problems if you provide a backtrace (e.g. by enabling core files). Another generally very adviseable thing to do when developing code running in the backend is to enable assertions (you may already do that...). You can lookup types much easier than that btw. C.f. TypenameGetTypid(typname) But note that both that and what you do would possibly fail if there's more than one geometry type around. You could either hardcode postgis' schema name and use namespaceId = LookupExplicitNamespace(schemaname, false); typoid = GetSysCacheOid2(TYPENAMENSP, PointerGetDatum(typname), ObjectIdGetDatum(namespaceId)); if (typoid == InvalidOid) elog(ERROR, cache lookup failed for type %u, typoid); or be a bit more complex and lookup the postgis' extension's schema pg_extension.extnamespace first. Anyway, to run full queries in the startup callback you're going to have to do something like: if (!IsTransactionState()) { tx_started = true; StartTransactionCommand(); } PushActiveSnapshot(GetTransactionSnapshot()); /* do your stuff */ PopActiveSnapshot(); if (tx_started) CommitTransactionCommand(); Note that the begin, change, commit callbacks *do* run with a transaction and snapshot setup. But you can't run general SQL queries - only catalog tables (or tables marked as such) are accessible. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[GENERAL] Hardware requirements for a PostGIS server
Dear PostgreSQL users, I am posting here a question that I initially asked on the PostGIS list [1], where I was advised to try here too (I will keep both lists updated about the developments on this issue). I am currently planning to set up a PostgreSQL + PostGIS instance for my lab. Turns out I believe this would be useful for the whole center, so that I'm now considering setting up the server for everyone—if interest is shared of course. At the moment, I am however struggling with what would be required in terms of hardware, and of course, the cost will depend on that—at the end of the day, it's really a matter of money well spent. I have then a series of questions/remarks, and I would welcome any feedback from people with existing experience on setting up a multi-user PostGIS server. I'm insisting on the PostGIS aspect, since the most heavy requests will be GIS requests (intersections, spatial queries, etc.). However, people with similar PostgreSQL setup may have very relevant comments about their own configuration. * My own experience about servers is rather limited: I used PostGIS quite a bit, but only on a desktop, with only 2 users. The desktop was quite good (quad-core Xeon, 12 Go RAM, 500 GB hd), running Debian, and we never had any performance issue (although some queries were rather long, but still acceptable). * The use case I'm envisioning would be (at least in the foreseeable future): - About 10 faculty users (which means potentially a little bit more students using it); I would have hard time considering more than 4 concurrent users; - Data would primarily involve a lot (hundreds/thousands) of high resolution (spatial and temporal) raster and vector maps, possibly over large areas (Florida / USA / continental), as well as potentially millions of GPS records (animals individually monitored); - Queries will primarily involve retrieving points/maps over given areas/time, as well as intersecting points over environmental layers [from what I understand, a lot of I/O, with many intermediary tables involved]; other use cases will involve working with steps, i.e. the straight line segment connecting two successive locations, and intersecting them with environmental layers; * I couldn't find comprehensive or detailed guidelines on-line about hardware, but from what I could see, it seems that memory wouldn't be the main issue, but the number of cores would be (one core per database connection if I'm not mistaken). At the same time, we want to make sure that the experience is smooth for everyone... I was advised on the PostGIS list to give a look at pgpool (however, UNIX only). * Is there a difference in terms of possibilities, performance and usability between a Linux-based and a MS-based server (from the user perspective)? My center is unfortunately MS-centered, and existing equipment runs with MS systems... It would thus be easier for them to set up a MS-based server. Does it change anything for the user? (I insist on the user perspective, since I and others will not admin the system, but only use it) * Does anyone have worked with a server running the DB engine, while the DB itself was stored on another box/server? That would likely be the case here since we already have a dedicated box for file storage. Along these lines, does the system of the file storage box matter (Linux vs. MS)? * We may also use the server as a workstation to streamline PostGIS processing with further R analyses/modeling (or even use R from within the database using PL/R). 
Again, does anyone have experience doing it? Is a single workstation the recommended way to work with such a workflow? Or would it be better (but more costly) to have one server dedicated to PostGIS and another one, with different specs, dedicated to analyses (R)? I realize my questions and comments may be a bit confusing, likely because of the lack of experience about these issues on my side. I really welcome any feedback from people working with PostgreSQL servers (+ PostGIS ideally!) in a small unit, or any similar setting that could be informative! In advance, thank you very much! Sincerely, Mathieu Basille. [1] Start of the thread here: http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html -- ~$ whoami Mathieu Basille http://ase-research.org/basille ~$ locate --details University of Florida \\ Fort Lauderdale Research and Education Center (+1) 954-577-6314 ~$ fortune « Le tout est de tout dire, et je manque de mots Et je manque de temps, et je manque d'audace. » -- Paul Éluard ___ postgis-users mailing list postgis-us...@lists.osgeo.org http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users