Re: [GENERAL] Build in function to verify email addresses
On Thu, Nov 16, 2017 at 1:56 AM, Nick Dro wrote:
> I believe that every information system needs to send emails.
> Currently PostgreSQL doesn't have a function which takes TEXT and returns
> true if it's a valid email address (x...@yyy.com / .co.ZZ)
> Do you believe such a function should exist in PostgreSQL, or is it best to
> let every user implement his own function?

There's a world of difference between an email address that is well-formed and one that actually works. In the systems I administer there's a lot of time spent dealing with bounced mail to make sure that the email addresses we have actually reach someone, hopefully the intended target.

And in the US, bulk emailers also have to deal with the CAN-SPAM act, which specifies procedures that must be in place to allow easy administrative options to remove one's email address from mailing lists. Procedures to verify that an email address works and to administer its use under rules like CAN-SPAM cannot exist solely within the database itself.

And as others have noted, what makes for a 'well-formed' email address has always been a bit complicated.
--
Mike Nolan
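The well-formed/working distinction above can be made concrete with a small sketch (a hypothetical helper, not from the thread): a loose syntax check is a one-line regex, but passing it says nothing about deliverability.

```python
import re

# Deliberately simple pattern: one '@', a non-empty local part, and a dotted
# domain. The real grammar (RFC 5322) is far more permissive, and passing
# this check proves nothing about whether mail to the address will bounce.
WELL_FORMED = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')

def looks_like_email(text: str) -> bool:
    """True if text is superficially well-formed; NOT a guarantee it works."""
    return WELL_FORMED.match(text) is not None
```

Bounce handling, opt-out administration, and the like still have to live in the application layer, exactly as the post argues.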
Re: [GENERAL] Postgres 9.6 fails to start on VMWare
On Mon, Oct 23, 2017 at 3:14 AM, Martin Moore wrote:
> Same server. I tried a few times.
>
> I didn’t move the db separately, but did a ‘dd’ to copy the disk to an
> image file which was converted and loaded into VMWare.
>
> I ‘believed’ that this should keep the low level disk structure the same,
> but if this has corrupted the files I can drop, dump and restore, in which
> case how do I ‘drop’ the DB without postgres running?
>
> Ta,
>
> Martin.

Was the server you were backing up shut down or in backup mode when you did the 'dd' copy?
--
Mike Nolan
Re: [GENERAL] Nice to have features: Percentage function
I also have some pre-defined percentage functions; they check the denominator and return null if it is zero, to avoid 'divide by zero' errors.
--
Mike Nolan

On Sun, Apr 16, 2017 at 11:37 AM, Melvin Davidson wrote:
> On Sun, Apr 16, 2017 at 12:23 PM, Adrian Klaver wrote:
>> On 04/15/2017 10:47 PM, Ron Ben wrote:
>>> Hi,
>>> I'm always finding myself writing many variations of functions to
>>> calculate percentage.
>>> I think it would be nice if PostgreSQL had built-in functions for that.
>>> I think the major functionality is something like the 3 options here:
>>> https://percentagecalculator.net/
>>>
>>> It may help to keep code simple and clean and it seems like something
>>> simple to implement.
>>
>> Plan B, CREATE your own extension. What follows is my first attempt at
>> creating an extension and the functions included are simple placeholders
>> more than anything else:
>>
>> File name: calc_percents--1.0.sql
>>
>> -- complain if script is sourced in psql, rather than via CREATE EXTENSION
>> \echo Use "CREATE EXTENSION calc_percents" to load this file. \quit
>>
>> CREATE OR REPLACE FUNCTION public.percent_of(val1 numeric, val2 numeric)
>> RETURNS numeric
>> LANGUAGE sql
>> AS $function$
>>   select (val1 / 100) * val2;
>> $function$;
>>
>> CREATE OR REPLACE FUNCTION public.what_percent(val1 numeric, val2 numeric)
>> RETURNS numeric
>> LANGUAGE sql
>> AS $function$
>>   SELECT (val1 / val2) * 100;
>> $function$;
>>
>> CREATE OR REPLACE FUNCTION public.percent_diff(val1 numeric, val2 numeric)
>> RETURNS numeric
>> LANGUAGE sql
>> AS $function$
>>   select (val2 - val1) / val1 * 100;
>> $function$;
>>
>> File name: calc_percents.control
>>
>> # calc_percents extension
>> comment = 'Functions for calculating percentages'
>> default_version = '1.0'
>> relocatable = true
>>
>> Install the above in $SHARE/extension, in my case
>> /usr/local/pgsql/share/extension/
>>
>> Then:
>>
>> test=# create extension calc_percents;
>> CREATE EXTENSION
>>
>> test=# \df percent_of
>>                              List of functions
>>  Schema |    Name    | Result data type |    Argument data types     |  Type
>> --------+------------+------------------+----------------------------+--------
>>  public | percent_of | numeric          | val1 numeric, val2 numeric | normal
>>
>> test=# select * from round(percent_of(10, 100), 2);
>>  round
>> -------
>>  10.00
>>
>> test=# \df percent_diff
>>                               List of functions
>>  Schema |     Name     | Result data type |    Argument data types     |  Type
>> --------+--------------+------------------+----------------------------+--------
>>  public | percent_diff | numeric          | val1 numeric, val2 numeric | normal
>>
>> test=# select * from round(percent_diff(100, 109), 2);
>>  round
>> -------
>>   9.00
>> (1 row)
>>
>> test=# \df what_percent
>>                               List of functions
>>  Schema |     Name     | Result data type |    Argument data types     |  Type
>> --------+--------------+------------------+----------------------------+--------
>>  public | what_percent | numeric          | val1 numeric, val2 numeric | normal
>> (1 row)
>>
>> test=# select * from round(what_percent(10, 109), 2);
>>  round
>> -------
>>   9.17
>>
>>> If you think it's a good idea it would be nice if someone can implement
>>> this.
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general

> Or, you could just as easily compute inline in SQL:
>
> SELECT datname,
>        pg_size_pretty(pg_database_size(datname)) AS size_pretty,
>        pg_database_size(datname) AS size,
>        (SELECT pg_size_pretty(SUM(pg_database_size(datname))::bigint)
>           FROM pg_database) AS total,
>        ((pg_database_size(datname) / (SELECT SUM(pg_database_size(datname))
>                                         FROM pg_database)) * 100)::numeric(6,3) AS pct
>   FROM pg_database
>  ORDER BY datname;
>
> --
> Melvin Davidson
> I reserve the right to fantasize. Whether or not you
> wish to share my fantasy is entirely up to you.
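The zero-denominator guard Nolan mentions, applied to the three placeholder functions from the extension, can be sketched outside SQL too (a Python sketch for illustration; None stands in for SQL NULL):

```python
from typing import Optional

def percent_of(val1: float, val2: float) -> float:
    """val1 percent of val2, e.g. percent_of(10, 100) -> 10.0."""
    return (val1 / 100) * val2

def what_percent(val1: float, val2: float) -> Optional[float]:
    """What percent val1 is of val2; None (like SQL NULL) when val2 is zero."""
    if val2 == 0:
        return None
    return (val1 / val2) * 100

def percent_diff(val1: float, val2: float) -> Optional[float]:
    """Percent change from val1 to val2; None when val1 is zero."""
    if val1 == 0:
        return None
    return (val2 - val1) / val1 * 100
```

The SQL versions as posted would raise a division-by-zero error instead; adding the same NULLIF-style guard there is a one-line change.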
Re: [GENERAL] Bypassing NULL elements in row_to_json function
In case it wasn't clear, the sample data was 3 rows of data. (There are actually around 890K rows in the table pgfutter built from the JSON file.) - Mike Nolan
Re: [GENERAL] Bypassing NULL elements in row_to_json function
Here's what I did:

uscf=> \d gold1604_test
Table "uscf.gold1604_test"
 Column | Type | Modifiers
--------+------+-----------
 data   | json |

Some sample data:

{"id":"1001","name":"MISNER, J NATHAN","st":"NY","exp":"2012-05-31","sts":"A"}
{"id":"1002","name":"MISNER, JUDY","st":"TN","exp":"2007-07-31","sts":"I"}
{"id":"1003","name":"MISNER, J AMSCHEL","st":"NY","exp":"2007-05-31","sts":"A"}

uscf=> \d goldmast_test
Table "uscf.goldmast_test"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 id     | character varying(8)  |
 name   | character varying(40) |
 st     | character varying(2)  |
 exp    | date                  |
 sts    | character(1)          |
 supp   | date                  |
 rrtg   | character varying(8)  |
 qrtg   | character varying(8)  |
 brtg   | character varying(8)  |
 oqrtg  | character varying(8)  |
 obrtg  | character varying(8)  |
 fid    | character varying(12) |

insert into goldmast_test
select * from json_populate_record(NULL::"goldmast_test",
    (select * from gold1604_test limit 1));

produces:

uscf=> select * from goldmast_test;
  id  |       name       | st |    exp     | sts | supp | rrtg | qrtg | brtg | oqrtg | obrtg | fid
------+------------------+----+------------+-----+------+------+------+------+-------+-------+-----
 1001 | MISNER, J NATHAN | NY | 2012-05-31 | A   |      |      |      |      |       |       |
(1 row)

The fact that the null values were stripped out is not an issue here. But:

uscf=> insert into goldmast_test
uscf-> select * from json_populate_record(NULL::"goldmast_test",
uscf->     (select * from gold1604_test limit 2));
ERROR: more than one row returned by a subquery used as an expression

Is there a way to get around the one row per subquery issue?
--
Mike Nolan
Re: [GENERAL] Bypassing NULL elements in row_to_json function
On Sun, Apr 10, 2016 at 2:30 AM, David G. Johnston < david.g.johns...@gmail.com> wrote: > On Sat, Apr 9, 2016 at 9:48 PM, Michael Nolan <htf...@gmail.com> wrote: > >> >> 2nd Followup: It turns out that loading a table from a JSON string is >> more complicated than going from a table to JSON, perhaps for good reason. >> There does not appear to be a direct inverse to the row_to_json() function, >> but it wasn't difficult for me to write a PHP program that takes the JSON >> file I created the other day and converts it back to a series of inserts, >> recreating the original table. >> >> Of course this simple program does NO validation (not that this file >> needed any), so if the JSON string is not well-formed for any of a number >> of reasons, or if it is not properly mapped to the table into which the >> inserts are made, an insert could fail or result in incorrect data. >> -- >> Mike Nolan >> > > See: http://www.postgresql.org/docs/9.5/interactive/functions-json.html > > json_populate_record(base anyelement, from_json json) > json_populate_recordset(base anyelement, from_json json) > > Exists in 9.3 too...though if you are going heavy json I'd suggest doing > whatever you can to keep up with the recent releases. > > David J. > > If there's a way to use the json_populate_record() or json_populate_recordset() functions to load a table from a JSON file (eg, using copy), it would be nice if it was better documented. I did find a tool that loads a JSON file into a table (pgfutter), and even loaded one row from that table into another table using json_populate_record(), but the 'subquery returned multiple rows' issue wouldn't let me do the entire table. But that still doesn't deal with validating individual fields or checking that the JSON is complete and consistent with the table to be loaded. -- Mike Nolan
Re: [GENERAL] Bypassing NULL elements in row_to_json function
2nd Followup: It turns out that loading a table from a JSON string is more complicated than going from a table to JSON, perhaps for good reason. There does not appear to be a direct inverse to the row_to_json() function, but it wasn't difficult for me to write a PHP program that takes the JSON file I created the other day and converts it back to a series of inserts, recreating the original table. Of course this simple program does NO validation (not that this file needed any), so if the JSON string is not well-formed for any of a number of reasons, or if it is not properly mapped to the table into which the inserts are made, an insert could fail or result in incorrect data. -- Mike Nolan
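The PHP converter itself isn't shown, but the idea — walk a file of JSON objects, check each one against the target table's column list, and emit INSERTs — can be sketched briefly (the column list and naive quoting here are illustrative; real code should use a driver's parameter binding):

```python
import json

# Hypothetical target-table columns; a real loader would read these
# from the catalog rather than hard-coding them.
COLUMNS = ["id", "name", "st", "exp", "sts"]

def json_line_to_insert(line: str, table: str = "goldmast_test") -> str:
    """Turn one JSON object into an INSERT, rejecting unknown keys.

    Values are naively quoted for illustration only; production code
    should pass them as bound parameters instead.
    """
    rec = json.loads(line)                    # raises on malformed JSON
    unknown = set(rec) - set(COLUMNS)
    if unknown:
        raise ValueError(f"keys not in target table: {sorted(unknown)}")
    vals = []
    for col in COLUMNS:
        v = rec.get(col)                      # missing keys load as NULL
        vals.append("NULL" if v is None else "'" + str(v).replace("'", "''") + "'")
    return f"insert into {table} ({', '.join(COLUMNS)}) values ({', '.join(vals)});"
```

Even this much catches two of the failure modes mentioned: malformed JSON (the parse raises) and keys that don't map onto the table.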
Re: [GENERAL] Bypassing NULL elements in row_to_json function
I was able to try it on a test server; the combination of row_to_json() and json_strip_nulls() worked exactly as I had hoped. Stripping nulls reduced the JSON file by over 50%. (The data I needed to export has around 900,000 rows, so it gets quite large.) I've got a test file I can make available to app developers. My next task is to find out if validating and importing a JSON file into a table is as easy as exporting a table in JSON turned out to be. Thanks for the help.
--
Mike Nolan
Re: [GENERAL] Bypassing NULL elements in row_to_json function
It looks like json_strip_nulls() may be what I need. I'm currently on 9.3, which doesn't have that function, but I may be in a position to upgrade to 9.5 this summer. I think the apps that would be receiving the data can deal with any resulting 'holes' in the data set by just setting them to null.
--
Mike Nolan
[GENERAL] Bypassing NULL elements in row_to_json function
I'm looking at the possibility of using JSON as a data exchange format with some apps running on both PCs and Macs. The table I would be exporting has a lot of NULL values in it. Is there any way to skip the NULL values in the row_to_json function and include only the fields that are non-null?
--
Mike Nolan
no...@tssi.com
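What row_to_json() plus the json_strip_nulls() suggested later in the thread do per row can be sketched client-side (illustrative only — the server-side functions are the real answer):

```python
import json

def row_to_json_no_nulls(row: dict) -> str:
    """Serialize one row as JSON, omitting NULL (None) columns,
    like json_strip_nulls(row_to_json(r)) in PostgreSQL 9.5+."""
    return json.dumps({k: v for k, v in row.items() if v is not None})
```

Consumers then treat any missing key as NULL, which is exactly how the receiving apps are expected to fill the 'holes'.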
Re: [GENERAL] Regex help again (sorry, I am bad at these)
On Mon, Dec 28, 2015 at 2:08 PM, Christopher Molnar wrote:
> Hello all!
>
> Sorry to have to ask the experts here for some regex assistance again. I
> am admittedly awful with these and could use some help.
>
> Any suggestions?

I have found over the years that it is far easier to write a short PHP or Perl program to do tasks like this. Much easier to debug, and the speed advantage of doing it in SQL is not important for 200,000 records.
--
Mike Nolan
Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
On Wed, Nov 18, 2015 at 4:38 PM, Adrian Klaver wrote:
> Alright, I was following you up to this. Seems to me deleted data would
> represent stale/old data and would be less valuable.

It may depend on WHY the data was deleted. If it represented, say, Hillary Clinton's deleted email, recovering that data might be more valuable to some people than the data that was not deleted.
--
Mike Nolan
Re: [GENERAL] A table of magic constants
On Sat, Jul 11, 2015 at 4:53 PM, Vincent Veyron vv.li...@wanadoo.fr wrote:
> On Sat, 11 Jul 2015 16:55:44 -0400 Dane Foster studdu...@gmail.com wrote:
>
> After a while, you'll find your way around the documentation. I've been
> doing it almost every day for years, still learning every time.

I highly recommend reading the documentation from 'cover to cover' periodically. Yes, there will be things you don't understand yet, but each time you'll pick up things you didn't get in previous passes. A lot of people have put in a lot of time on that documentation, and it is first-rate. (I've been working on a project that requires MySQL; their documentation is far inferior.)
--
Mike Nolan
no...@tssi.com
Re: [GENERAL] Average New Users Per DOW
On 7/6/15, Robert DiFalco robert.difa...@gmail.com wrote:
> I'm not sure how to create a result where I get the average number of new
> users per day of the week. My issue is that days that did not have any
> new users will not be factored into the average, giving an overinflated
> result. This is what I started with:
>
> WITH userdays AS
>   (SELECT u.created::DATE AS created,
>           to_char(u.created,'Dy') AS d,
>           COUNT(*) AS total
>      FROM users u GROUP BY 1,2),
> userdays_avg AS
>   (SELECT extract('dow' FROM created) AS nDay,
>           d AS "Day",
>           AVG(total) AS "New Users"
>      FROM userdays GROUP BY 1,2 ORDER BY 1)
> SELECT "Day", "New Users" FROM userdays_avg ORDER BY nDay;
>
> But you can see it won't give correct results since (for example) Mondays
> with no new users will not be counted in the average as 0.

One way to handle this is to union your query with one that has a generate_series(0,6) for the DOW column and nulls for the other columns, then treat both that and your original query as a subquery and do your averages, since nulls are not included in either count() or avg() aggregates:

select dow, count(some_number), avg(some_number) from
  (select extract('dow' from some_date) as dow, some_number
     from some_table
   union
   select generate_series(0,6) as dow, null as some_number) as x
group by 1 order by 1

--
Mike Nolan
no...@tssi.com
Re: [GENERAL] Average New Users Per DOW
On Mon, Jul 6, 2015 at 5:50 PM, David G. Johnston david.g.johns...@gmail.com wrote:
> On Mon, Jul 6, 2015 at 6:16 PM, Michael Nolan htf...@gmail.com wrote:
>> One way to handle this is to union your query with one that has a
>> generate_series(0,6) for the DOW column and nulls for the other columns,
>> then treat both that and your original query as a subquery and do your
>> averages, since nulls are not included in either count() or avg()
>> aggregates: [...]
>
> I'm not seeing how this is at all useful. As you said, the average
> function ignores the null introduced by the union, so the final answer
> with and without the union is the same. No matter how you work a
> generate_series(0,6) based query it will never be able to give a correct
> answer except accidentally.
>
> Each actual missing date contributes a ZERO to the numerator and a ONE to
> the denominator in the final division that constitutes the mean-average.
> You must have those dates. In a series with four Mondays (4, 0, 0, 8)
> the average desired is 3, not 6 (or 4). There is no way to make the
> denominator (number of Mondays) 4 instead of 3 by using
> generate_series(0,6).
>
> David J.

Ah, you're right. The problem is that avg() is going to treat missing data as missing (of course.) It will either be necessary to add in the missing days as a zero value (but ONLY the missing days, requiring some kind of 'not exists' select, I suppose) or to 'roll your own' average function by adding in the missing days as I did with a union in my earlier post. The real problem is that the DOW is not the field where the missing data is; it is in the underlying date field.

I created a test dataset.
It has 1 day missing in a two-week period from June 1st through June 14th (Sunday, June 7th). Here's what the OP's SQL generates:

 Day | New Users
-----+-----------
 Sun |    2.0000
 Mon |    4.5000
 Tue |    2.0000
 Wed |    4.5000
 Thu |    1.0000
 Fri |    3.0000
 Sat |    3.0000

Here's the SQL to generate the missing day and do the average function by hand:

select "Day", "New Users" from
  (select dow, "Day", sum(total) / count(distinct created) as "New Users"
     from (select extract(dow from created) as dow,
                  to_char(created,'Dy') as "Day",
                  created, created2, total
             from (select created, created as created2, count(*) as total
                     from users group by 1, 2
                   union
                   (select generate_series('2015-06-01 00:00'::timestamp,
                                           '2015-06-14'::timestamp, '1 day')::date,
                           null, 0)
                  ) as x) as y
    group by 1, 2) as z
order by dow;

 Day | New Users
-----+-----------
 Sun |    1.0000
 Mon |    4.5000
 Tue |    2.0000
 Wed |    4.5000
 Thu |    1.0000
 Fri |    3.0000
 Sat |    3.0000

--
Mike Nolan
no...@tssi.com
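The denominator logic David describes — every calendar day in the range contributes, zero signups or not — can be cross-checked with a small sketch outside the database (pure Python, standing in for the SQL; the function name and inputs are illustrative):

```python
from collections import Counter
from datetime import date, timedelta

def avg_new_users_per_dow(signups, start, end):
    """Average signups per day of week (0=Mon .. 6=Sun), where signups maps
    date -> count; days absent from the map count as 0, as the SQL union
    of generate_series() dates accomplishes."""
    per_day = Counter(signups)
    totals, days = Counter(), Counter()
    d = start
    while d <= end:
        dow = d.weekday()
        totals[dow] += per_day.get(d, 0)   # 0 for days with no new users
        days[dow] += 1                     # every calendar day hits the denominator
        d += timedelta(days=1)
    return {dow: totals[dow] / days[dow] for dow in days}
```

With four Mondays of (4, 0, 0, 8) this yields 3, matching David's hand calculation rather than the inflated 6.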
Re: [GENERAL] Average New Users Per DOW
Here's a minor refinement that doesn't require knowing the range of dates in the users table:

   (select created, created as created2, count(*) as total
      from users group by 1, 2
    union
    (select generate_series(
              (select min(created)::timestamp from users),
              (select max(created)::timestamp from users),
              '1 day')::date,
            null, 0)
   ) as x) as y
 group by 1, 2) as z
order by dow;

 Day | New Users
-----+-----------
 Sun |    1.0000
 Mon |    4.5000
 Tue |    2.0000
 Wed |    4.5000
 Thu |    1.0000
 Fri |    3.0000
 Sat |    3.0000

--
Mike Nolan
Re: [GENERAL] alter column type
On Fri, Jun 5, 2015 at 12:23 PM, Casey Deccio ca...@deccio.net wrote:
> I have a database in which one table references the primary key of
> another. The type of the primary key was initially int, but I changed it
> to bigint. However, I forgot to update the type of a column that
> references it. So, I've initiated ALTER TABLE foo ALTER COLUMN bar TYPE
> bigint, where foo/bar is the table/column referencing the primary key
> that is now of type bigint.
>
> However, with 2^31 rows, it is taking a long time to write the rows (it's
> been 12 hours). Is there a more efficient way to do this? Even if/when
> this one finishes, there are other column types that I have to update.
>
> This update effectively locked me out of all access to the data anyway,
> so I don't foresee any concern of writes that might affect integrity.
>
> Cheers,
> Casey

Probably too late for this time, but in the past when I've needed to redefine the type for a column, I've made a dump, edited the dump file to change the type, and then renamed the table and reloaded it. That's usually several orders of magnitude faster.
--
Mike Nolan
no...@tssi.com
[GENERAL] Consistent state for pg_dump and pg_dumpall
The documentation for pg_dump says that dump files are created in a consistent state. Is that true across multiple tables in the same pg_dump command? (Obviously it would not be true if I dumped tables using separate pg_dump commands. But if I put the database into a backup state using 'pg_start_backup', would separately executed pg_dump commands be in a consistent state across the set of dump files?) The documentation for pg_dumpall does not say that its dump file is in a consistent state (eg, across all tables), but it does say that it uses pg_dump to dump clusters. So, how consistent are the tables in pg_dumpall files? -- Mike Nolan
Re: [GENERAL] Consistent state for pg_dump and pg_dumpall
On Wed, May 20, 2015 at 12:40 PM, David G. Johnston david.g.johns...@gmail.com wrote:
> On Wed, May 20, 2015 at 9:24 AM, Michael Nolan htf...@gmail.com wrote:
>> The documentation for pg_dump says that dump files are created in a
>> consistent state. Is that true across multiple tables in the same
>> pg_dump command? (Obviously it would not be true if I dumped tables
>> using separate pg_dump commands.
>
> Yes. The entire dump is performed within a single transaction.
>
>> But if I put the database into a backup state using 'pg_start_backup',
>> would separately executed pg_dump commands be in a consistent state
>> across the set of dump files?)
>
> pg_start_backup and pg_dump are not designed to work together. Namely,
> pg_start_backup is mostly concerned with making sure future writes are
> accounted for in the final backup, while pg_dump says to ignore
> everything that happens after the command begins.
>
>> The documentation for pg_dumpall does not say that its dump file is in a
>> consistent state (eg, across all tables), but it does say that it uses
>> pg_dump to dump clusters. So, how consistent are the tables in
>> pg_dumpall files?
>
> Each database is internally consistent. There is no guarantee that
> databases and globals are consistent with each other (though those are
> typically seldom changed), nor will different databases represent the
> same point in time vis-a-vis each other.
>
> You might want to describe what you are trying to do here.
>
> David J.

I'm getting ready for a security audit and I want to make sure I have the database backup procedures properly documented, including what the limitations are on each type of backup. We use a combination of low level backups with log shipping, dumping of key individual tables, dumping of entire databases, and dumping the entire system (pg_dumpall.)

Hardware for setting up a slave server may be in a future budget, though I hope to be able to test having a slave server in the cloud later this year. (I'm not sure we have enough network bandwidth for that, hence the test.)

When I moved to a new release of pg (9.3) last December, I stopped all transaction processing first so that pg_dumpall had no consistency issues.
--
Mike Nolan
Re: [GENERAL] PostgreSQL-related legal question
One of my sons was hired by Google last year after spending the past several years working on various open-source projects; it took 2 days of back-and-forth with Google's legal department before he was satisfied with the restrictions in their offer.
--
Mike Nolan

On Wed, Mar 11, 2015 at 4:46 PM, Jan de Visser j...@de-visser.net wrote:
> On March 12, 2015 06:43:40 AM Gavin Flower wrote:
>> Bill cannot comment, but it might be along the lines of assigning all
>> intellectual property rights, or something of that ilk. In that case, it
>> might give the company ownership of stuff he may have contributed (or
>> intends to contribute) to PostgreSQL in some way – which could lead to
>> legal complications affecting PostgreSQL adversely, which would be
>> expensive and an unnecessary distraction.
>
> I used to work for a company that did exactly that - you had to sign a
> contract that claimed copyright of all your work, even work done outside
> of work hours, to the company. They did however tell you beforehand that
> if you were an established contributor to an open-source project they
> could make exceptions for that, but you had to go through legal.
>
> But the upshot was that if you wrote an iPhone app in 15 minutes, the
> company would own that, technically.
Re: [GENERAL] Temporarily suspend a user account?
Might not do what you want, but I just change the password.
--
Mike Nolan

On Fri, Feb 6, 2015 at 4:11 PM, Melvin Davidson melvin6...@gmail.com wrote:
> Possibly,
> To disable: ALTER USER name RENAME TO xname;
> To enable: ALTER USER xname RENAME TO name;
> ???
>
> On Fri, Feb 6, 2015 at 3:57 PM, Felipe Gasper fel...@felipegasper.com wrote:
>> Hello,
>>
>> Is there a way to temporarily suspend a user account? I would prefer not
>> to revoke login privileges since that will break things that mine
>> pg_users and pg_shadow. I also am trying to find something that is
>> completely reversible, so something like setting the connection limit to
>> 0, which would lose a potentially customized connection limit, doesn’t
>> work.
>>
>> We do this in MySQL by reversing the password hash then running FLUSH
>> PRIVILEGES; however, that doesn’t seem to work in PostgreSQL/pg_authid
>> as some sort of cache prevents this from taking effect.
>>
>> Has anyone else solved this issue?
>>
>> Thank you!
>>
>> -Felipe Gasper
>> Houston, TX
>
> --
> Melvin Davidson
> I reserve the right to fantasize. Whether or not you
> wish to share my fantasy is entirely up to you.
Re: [GENERAL] Temporarily suspend a user account?
On 2/6/15, David G Johnston david.g.johns...@gmail.com wrote:
> On Fri, Feb 6, 2015 at 2:22 PM, Michael Nolan wrote:
>> Might not do what you want, but I just change the password.
>
> How do you do that and re-enable using the previous password?
>
> David J.

Encrypted passwords are kept in the pg_shadow file and should start with 'md5'. Just save a copy of the encrypted password for that user, and when you want to re-enable that user do:

alter user xxx encrypted password 'md5';

I have tested this on 9.3.5.
--
Mike Nolan
Re: [GENERAL] How to analyze a slowdown in 9.3.5?
For what it's worth, this week's run covered even more months than last week's did, and ran in about 5 1/2 hours, with no slowdowns, under a similar system load. So, it could have been a one-time thing or some combination of factors that will be difficult to reproduce.
--
Mike Nolan
Re: [GENERAL] Re: Stuck trying to backup large database - best practice? How about a cloud service?
On Mon, Jan 12, 2015 at 7:46 PM, Bob Futrelle bob.futre...@gmail.com wrote:
> You should be able to find a cloud provider that could give you many TB.
> Or so they like to claim.

Nope, but you can probably find one willing to SELL you access to many TB.
--
Mike Nolan
Re: [GENERAL] How to analyze a slowdown in 9.3.5?
On Sat, Jan 10, 2015 at 8:54 PM, Melvin Davidson melvin6...@gmail.com wrote:
> Just curious. Have you checked that the tables are being
> vacuumed/analyzed periodically and that the statistics are up to date?
> Try running the following query to verify:

A vacuum analyze runs every night, and there would not have been many inserts or updates to the tables used by the lookup function since the latest vacuum analyze. I think I may have even done a vacuum analyze on the two largest tables after the first DB shutdown.
--
Mike Nolan
Re: [GENERAL] How to analyze a slowdown in 9.3.5?
On Fri, Jan 9, 2015 at 7:52 PM, Tomas Vondra tomas.von...@2ndquadrant.com wrote:
> On 9.1.2015 23:14, Michael Nolan wrote:
>> I'm running 9.3.5 on a virtual machine with 5 cores and 24 GB of memory.
>> Disk is on a SAN. I have a task that runs weekly that processes possibly
>> as many as 120 months worth of data, one month at a time. Since moving
>> to 9.3.5 (from 8.2!!) the average time for a month has been 3 minutes or
>> less.
>
> Congrats on migrating to a supported version!

Yeah, it's been a long and annoying 7 years since we updated the server or database version, but I don't make the budget decisions. Going to PGCon was frustrating when nearly all the talks were about features added several versions after the one I was stuck running!
--
Mike Nolan

PS. Sorry about the top-posting in my last note.
Re: [GENERAL] How to analyze a slowdown in 9.3.5?
The function is a complicated plpgsql function that makes numerous database queries, all read-only. (Other parts of that program may make changes to the database.) The first database shutdown and the shutdown/reboot later on were both 'clean' shutdowns, so there shouldn't have been any kind of transaction rollback.

I have sar running on that server, if that provides any useful data. Mostly I'm just trying to make up a list of what to look for and what to log in case it happens again. (It runs again on Tuesday, and I already know it will be going back to review 2004 data, so it'll be an even longer run than this week's was.)
--
Mike Nolan

On Sat, Jan 10, 2015 at 12:55 PM, Andy Colson a...@squeakycode.net wrote:
> On 01/09/2015 07:52 PM, Tomas Vondra wrote:
>> On 9.1.2015 23:14, Michael Nolan wrote:
>>> I'm running 9.3.5 on a virtual machine with 5 cores and 24 GB of
>>> memory. Disk is on a SAN. I have a task that runs weekly that processes
>>> possibly as many as 120 months worth of data, one month at a time.
>>> Since moving to 9.3.5 (from 8.2!!) the average time for a month has
>>> been 3 minutes or less.
>>
>> Congrats on migrating to a supported version!
>>
>> Please, a comparison of the configuration used on 8.2 and 9.3.5 would be
>> helpful (i.e. how have you updated the config on the new version?).
>>
>>> However, when this job ran this Tuesday, it ran fine for a number of
>>> months, but then started slowing down dramatically: 300 minutes for one
>>> month and then 167 minutes for the next. I stopped and restarted
>>> postgresql; the next block also ran really slow (157 minutes.) I then
>>> rebooted the server and the remaining blocks ran at the usual fast
>>> speed again, so restarting postgresql didn't fix the problem but
>>> rebooting the server did.
>>
>> What amounts of data are we talking about? Gigabytes? Tens of gigabytes?
>>
>>> Looking at the logs, I see queries with a function call that would
>>> normally take no more than 100-200 milliseconds, usually far less, that
>>> were taking 100 seconds or longer. This function gets called thousands
>>> of times for each month, so that appears to be one source of the
>>> slowdown.
>>
>> But why are the functions taking so much longer? Are they eating CPU or
>> I/O, or are they generally waiting for something (e.g. locks)?
>>
>>> I don't suspect a memory leak in the calling program (in php), because
>>> since moving to this server in December this weekly task has run
>>> several times over the same range of months, making pretty much the
>>> same function calls each time. I also ran the entire range several
>>> times during testing. One change made to the server since the previous
>>> week's run was that I moved up to the latest CentOS kernel (Linux
>>> version 3.10.0-123.13.2.el7.x86_64).
>>
>> And what was the previous kernel version? However, if it worked fine
>> after rebooting the server, it may not be a kernel issue (unless it
>> somehow depends on uptime). Is there something in /var/log/messages?
>
> At first, I was thinking: lots of activity within one transaction was
> messing up the stats and the planner started getting it wrong. But a
> reboot wouldn't fix that. Would it? What if the reboot rolled back the
> db -- would the stats make sense again? I have a process that makes a
> big temp table (with indexes). After it's built, if I don't run a quick
> analyze on it the planner never uses the indexes right.
>
> Another thing I can think of is never committing. If it started
> collecting lots and lots of row versions it could get slower and slower.
> But, then, you'd see the same thing on 8.2, so that's probably not it.
> Do you have any 'idle in transaction' connections?
>
> -Andy
[GENERAL] How to analyze a slowdown in 9.3.5?
I'm running 9.3.5 on a virtual machine with 5 cores and 24 GB of memory. Disk is on a SAN. I have a task that runs weekly that processes possibly as many as 120 months worth of data, one month at a time. Since moving to 9.3.5 (from 8.2!!) the average time for a month has been 3 minutes or less.

However, when this job ran this Tuesday, it ran fine for a number of months, but then started slowing down dramatically: 300 minutes for one month and then 167 minutes for the next. I stopped and restarted postgresql; the next block also ran really slow (157 minutes.) I then rebooted the server and the remaining blocks ran at the usual fast speed again, so restarting postgresql didn't fix the problem but rebooting the server did.

Looking at the logs, I see queries with a function call that would normally take no more than 100-200 milliseconds, usually far less, that were taking 100 seconds or longer. This function gets called thousands of times for each month, so that appears to be one source of the slowdown.

I don't suspect a memory leak in the calling program (in php), because since moving to this server in December this weekly task has run several times over the same range of months, making pretty much the same function calls each time. I also ran the entire range several times during testing. One change made to the server since the previous week's run was that I moved up to the latest CentOS kernel (Linux version 3.10.0-123.13.2.el7.x86_64). As far as I can tell, the other virtual servers weren't being slowed down, so I don't suspect problems with the virtual server or the SAN.

If this happens again, what sorts of settings in postgresql.conf or other tools should I be using to try to track down what's causing this?
--
Mike Nolan
[GENERAL] Blocking access by remote users for a specific time period
I have several web apps that access our Postgresql database that I'd like to lock out of the database for about an hour during a weekly maintenance interval. (There are some internal users that do not get locked out, because they're running the maintenance tasks.) There are no time-of-day access limitation parameters in the pg_hba.conf file, are there any simple ways to do this? -- Mike Nolan
Re: [GENERAL] Blocking access by remote users for a specific time period
Yeah, a cron job to swap pg_hba.conf files is the best solution I've come up with so far. It's not one web app, it's closer to two dozen of them, on multiple sites. -- Mike Nolan On Sat, Dec 13, 2014 at 11:10 PM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 12/13/2014 08:13 PM, Michael Nolan wrote: I have several web apps that access our Postgresql database that I'd like to lock out of the database for about an hour during a weekly maintenance interval. (There are some internal users that do not get locked out, because they're running the maintenance tasks.) There are no time-of-day access limitation parameters in the pg_hba.conf file, are there any simple ways to do this? Use a cron job that at beginning of period swaps out the pg_hba.conf with one that denies access, reloads server and then at end of time period reverse procedure ? -- Mike Nolan -- Adrian Klaver adrian.kla...@aklaver.com
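Another route that avoids touching pg_hba.conf at all, sketched here under the assumption that the web apps all connect as one dedicated role (web_app and mydb are placeholder names), is to revoke the CONNECT privilege for the maintenance window and terminate any sessions already open:

```sql
-- At the start of the maintenance window (run as superuser).
-- Note: if CONNECT is still granted to PUBLIC on this database,
-- it must be revoked as well or this has no effect.
REVOKE CONNECT ON DATABASE mydb FROM web_app;

-- Disconnect sessions that are already open
-- (the column is named procpid rather than pid before 9.2):
SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE usename = 'web_app';

-- At the end of the window:
GRANT CONNECT ON DATABASE mydb TO web_app;
```

With two dozen apps across multiple sites this only works if they share a role or a manageable set of roles; otherwise the pg_hba.conf swap is still the simpler answer.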
Re: [GENERAL] Merge rows based on Levenshtein distance
I don't think you've defined your problem very clearly. Suppose you have 1000 names in your database. Are you planning to compare each name to the other 999 names to see which is closest? What if two names are equally close to a third name but not to each other, how do you decide which is better? -- Mike Nolan
Re: [GENERAL] Merge rows based on Levenshtein distance
Have you considered using a soundex function to sort names into similarity groups? In my experience it works fairly well with Western European names, not quite as well with names from other parts of the world. It also doesn't deal well with many nicknames (Mike instead of Michael, etc.) -- Mike Nolan
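Both soundex and edit distance are available in the fuzzystrmatch contrib module, so the two approaches in this thread can be combined; a sketch (the names table and the distance threshold are hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;

-- Bucket names by soundex code first:
SELECT soundex(name) AS bucket, array_agg(name) AS members
  FROM names
 GROUP BY soundex(name)
HAVING count(*) > 1;

-- Then, within a bucket, rank candidate merges by edit distance:
SELECT a.name, b.name, levenshtein(a.name, b.name) AS dist
  FROM names a
  JOIN names b ON a.id < b.id AND soundex(a.name) = soundex(b.name)
 WHERE levenshtein(a.name, b.name) <= 2
 ORDER BY dist;
```

Bucketing first keeps the pairwise levenshtein() comparison from being run across the full cross product of the table.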
Re: [GENERAL] Problem with query
On 4/11/14, Chris Curvey ch...@chriscurvey.com wrote: On Fri, Apr 11, 2014 at 1:50 PM, Susan Cassidy susan.cass...@decisionsciencescorp.com wrote: I have a query with several joins, where I am searching for specific data in certain columns. Have you tried running each of your joins separately to see if there are row values common to both tables, ie: select count(*) from scenes s left outer join scene_thing_instances si on s.scene_id = si.scene_id then select count(*) from scene_thing_instances si left outer join scene_things st on si.scene_thing_id = st.scene_thing_id etc. I find when building complex queries (I've written some that ran over 100 lines and involved a dozen or more joined tables), I need to build them up, testing them as I build. -- Mike Nolan
Re: [GENERAL] Mysterious DB reset
On 3/6/14, Israel Brewster isr...@eraalaska.net wrote: LOG: received smart shutdown request LOG: autovacuum launcher shutting down LOG: shutting down LOG: database system is shut down However, there are no timestamps on any of the entries (can I fix that?) Yes, change the log_line_prefix in the postgresql.conf file and reload it. I use: log_line_prefix = '%m %u ' You might also want to use this, at least temporarily: log_statement = all -- Mike Nolan
Re: [GENERAL] Moving data from M$ JetDB file to Postgres on Linux
I think that PHP has modules (eg, PEAR) that can read MS Access database files, and once you have it in an array you can create INSERT statements for PostgreSQL, including cleaning up any data format issues (eg, dates of 00-00-) -- Mike Nolan On Fri, Feb 28, 2014 at 6:21 PM, Rich Shepard rshep...@appl-ecosys.com wrote: I just downloaded two scientific data files from a federal agency's Web site. Both are in M$ JetDB format. I run only linux and keep all my scientific data in postgres. My Web search did not turn up anything useful; the closest was a thread from this mail list in 2000 on how to send a postgres query through odbc to an Access database. Is there a filter I can use to get the data from these files? TIA, Rich
Re: [GENERAL] to_date() and invalid dates
Thomas, try this: '2013-02-31'::date -- Mike Nolan On Mon, Jan 20, 2014 at 7:44 AM, Thomas Kellerer spam_ea...@gmx.net wrote: Hi, I asked this a while back already: select to_date('2013-02-31', 'yyyy-mm-dd'); will not generate an error (unlike e.g. Oracle) However in the release notes of 9.2.3[1] it is mentioned that - Reject out-of-range dates in to_date() (Hitoshi Harada) I tried the above statement using 9.2.6 and 9.3.2 in both versions 2013-02-03 is returned instead of rejecting the input. The same is true if e.g. an invalid month is specified: to_date('2013-17-09', 'yyyy-mm-dd'). Does this check need a configuration setting to be in effect? Regards Thomas [1] http://www.postgresql.org/docs/9.2/static/release-9-2-3.html
Re: [GENERAL] Junk date getting uploaded into date field
On 11/5/13, bsreejithin bsreejit...@gmail.com wrote: I am not sure why : select to_date('33-OCT-2013', 'dd-mon-yyyy') is returning 2013-11-02. For cases like the issue I am facing, where we need to raise an error saying the data is wrong, DB manipulating the data is not proper. Try using a cast to date instead: select '33-oct-2013'::date throws an error. -- Mike Nolan
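If the goal is a boolean check rather than an error, one common sketch is to wrap the strict cast in a PL/pgSQL function and trap the exception (the function name is my own):

```sql
CREATE OR REPLACE FUNCTION is_valid_date(txt text) RETURNS boolean AS $$
BEGIN
    PERFORM txt::date;          -- the strict cast, not to_date()
    RETURN true;
EXCEPTION WHEN OTHERS THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql STABLE;

-- is_valid_date('33-OCT-2013') returns false
-- is_valid_date('31-OCT-2013') returns true
```

It is marked STABLE rather than IMMUTABLE because the cast's result depends on the DateStyle setting.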
Re: [GENERAL] pg_dumpall from a script
You could write a plperlu function that runs a shell script to back up your database; you can even pass it parameters and put a call to that in a trigger. BUT, this could result in multiple backups running at the same time and become a performance drag. -- Mike Nolan On Tue, Oct 22, 2013 at 9:19 PM, James Sewell james.sew...@lisasoft.com wrote: Oh I missed that, I skimmed and thought it was the same as \set Turns out it's not and it's exactly what I want! Thanks! James Sewell, PostgreSQL Team Lead / Solutions Architect, Level 2, 50 Queen St, Melbourne VIC 3000 On Wed, Oct 23, 2013 at 11:48 AM, Adrian Klaver adrian.kla...@gmail.com wrote: On 10/22/2013 03:41 PM, James Sewell wrote: Hello All, Thanks for the replies. Sorry I must have been a bit unclear, I realise I *could* do this from the shell level, but can I do it from a PSQL session somehow? Lucas' \setenv method won't work for you? Cheers, James Sewell, PostgreSQL Team Lead / Solutions Architect -- Adrian Klaver adrian.kla...@gmail.com
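For illustration, a minimal version of that arrangement; the script path and table name are placeholders, and the trailing & keeps the trigger from blocking on the script:

```sql
-- plperlu (untrusted) is required to reach the shell from inside the server
CREATE OR REPLACE FUNCTION push_site_update() RETURNS trigger AS $$
    system("/usr/local/bin/update_website.sh &");
    return;   -- undef: let the triggering statement proceed unchanged
$$ LANGUAGE plperlu;

CREATE TRIGGER site_update_trg
    AFTER INSERT OR UPDATE OR DELETE ON some_table
    FOR EACH STATEMENT EXECUTE PROCEDURE push_site_update();
```

A statement-level trigger fires once per statement rather than once per row, which limits how many copies of the script a bulk update can launch.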
Re: [GENERAL] How to failover from Primary to Standby and Set the old Primary as a new Standby
Assuming the database hasn't changed much since the failover, doing an rsync from the new primary back to the old primary should be fairly quick. -- Mike Nolan On 9/19/13, Vick Khera vi...@khera.org wrote: On Thu, Sep 19, 2013 at 11:31 AM, ascot.m...@gmail.com ascot.m...@gmail.com wrote: I use PG 9.2.4 with streaming replication. What will be the manual procedure to failover from Primary to Standby and Set the old Primary as a new standby? From what I understand, you start over by setting up the old primary as a new standby from scratch.
Re: [GENERAL] How to failover from Primary to Standby and Set the old Primary as a new Standby
On 9/19/13, John R Pierce pie...@hogranch.com wrote: On 9/19/2013 1:29 PM, Vick Khera wrote: On Thu, Sep 19, 2013 at 11:31 AM, ascot.m...@gmail.com wrote: I use PG 9.2.4 with streaming replication. What will be the manual procedure to failover from Primary to Standby and Set the old Primary as a new standby? From what I understand, you start over by setting up the old primary as a new standby from scratch. if you use rsync for the base backup of new master to old, it should go fairly quickly as relatively few files should have changed assuming not much time has elapsed. Of course, before you do anything, you should spend some time figuring out WHY the old master failed. There could be issues that need to be resolved before putting it back online, and fixing them could affect how much work you have to do to get the physical files back in sync. -- Mike Nolan
Re: [GENERAL] How to switch file systems with least downtime?
Have you considered setting up a synchronously replicated slave database on the new file system (using a port other than 5432), getting it in sync, then shutting both databases down (master first), switching the slave over to become the master and restarting just that database on port 5432? -- Mike Nolan On Sat, Sep 14, 2013 at 8:32 AM, Moshe Jacobson mo...@neadwerx.com wrote: How do I migrate my 9.1 directory to a new file system with the least downtime possible? I don't know if this makes any difference, but my pg_xlog directory is on its own volume as well, so I would have to unmount it and remount it as well, but I would not have to copy over my xlogs. I figure the first part of this is to do a pg_start_backup() and rsync the files over., bu I'm not sure what to do after that. Thanks for your help. Moshe Jacobson Nead Werx, Inc. | Manager of Systems Engineering 2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339 mo...@neadwerx.com | www.neadwerx.com Quality is not an act, it is a habit. -- Aristotle
Re: [GENERAL] question about age()
On 8/29/13, Andreas Kretschmer akretsch...@spamfence.net wrote: is there a bug in age()?

test=*# select *, age(birthday), age(current_date-1, birthday) from birthday ;
 id |  birthday  |   age   |           age
----+------------+---------+-------------------------
  1 | 2010-08-29 | 3 years | 2 years 11 mons 30 days
(1 row)
Time: 0,322 ms

test=*# select * from birthday where age(birthday) != age(current_date-1, birthday);
 id | birthday
----+----------
(0 rows)

'3 years' != '2 years 11 mons 30 days', but i got 0 rows, why? I'm using 9.2.4.

What is the content of the field 'birthday'? My guess is there's a null value for the field, in which case you are comparing two nulls. -- Mike Nolan
Re: [GENERAL] question about age()
On 8/29/13, Michael Nolan htf...@gmail.com wrote: On 8/29/13, Andreas Kretschmer akretsch...@spamfence.net wrote: I'm using 9.2.4. What is the content of the field 'birthday'? My guess is there's a null value for the field, in which case you are comparing two nulls. Oops, missed seeing the first half of the post with the data. It may be a function output type issue. Modifying the query as follows works (in 8.2, which I'm still stuck on): select * from birthday where age(birthday)::text != age (current_date-1, birthday)::text; -- Mike Nolan
Re: [GENERAL] incremental dumps
On 8/10/13, haman...@t-online.de haman...@t-online.de wrote: currently the source uses some 20 GB in a database partition and about 700 GB in a general data partition. For the database, a diff -e grows to about 10% of the size of a full dump in a week The remote site is a raid box at a hosting center, with paid backup Regards Wolfgang It sounds like you have catastrophic failure covered, but what about data integrity and data security? You may need to 'roll your own' solution, possibly using something like Slony. Having a timestamp field that indicates when the row was inserted or last updated may help. A true incremental backup would IMHO be a very useful tool for database administrators, but there are a number of technical challenges involved, especially dealing with deleted records. -- Mike Nolan
Re: [GENERAL] incremental dumps
On 8/1/13, haman...@t-online.de haman...@t-online.de wrote: Hi, I want to store copies of our data on a remote machine as a security measure. Wolfgang 2 questions: 1. How secure is the remote site? 2. How much data are we talking about? -- Mike Nolan
Re: [GENERAL] How to clone a running master cluster?
On 5/11/13, Moshe Jacobson mo...@neadwerx.com wrote: I have a master database cluster on one server, and it is configured to ship logs via scp to an archive directory on my slave server. The slave server is configured for streaming replication, and also is configured to delete the archived xlogs when they are no longer needed (using pg_archivecleanup). I have a third machine on which I'd like to get another master cluster running, and I'd like it to start with a copy of my current master. I cannot restart my master, and would prefer not to restart my slave either. Given my xlog archive configuration, is there a way to clone my master cluster to another machine, including all of the necessary xlogs, without bringing down the original master or slave? Step-by-step instructions would be much appreciated. Thank you! -- Moshe Jacobson Nead Werx, Inc. | Manager of Systems Engineering 2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339 mo...@neadwerx.com | www.neadwerx.com Quality is not an act, it is a habit. -- Aristotle Moshe, if you need a detailed cookbook tailored to your specific requirements, you may need to hire a PostgreSQL expert as a consultant to write it for you. Generalized guidelines can't possibly cover every possible situation. The Binary Replication Tutorial at http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial (which is slightly out of date as it does not cover 9.2 and 9.3 improvements yet) is probably going to cover most of what you need. The 'long method' is going to be pretty much what you need, you will still need to do a pg_start_backup() and pg_stop_backup() while you copy the data directory files, but you probably won't need to restart the master to change the master configuration files since you've already got replication working to one server and you're apparently not planning to have the second slave server poll the master for updates.
-- Mike Nolan
Re: [GENERAL] How large can a PostgreSQL database get?
On 4/17/13, Scott Marlowe scott.marl...@gmail.com wrote: My experience, doing production and dev dba work on both postgresql and oracle, is that either works well, as long as you partition properly or even break things into silos. Oracle isn't magic pixie dust that suddenly gets hardware with 250MB/s seq read arrays to read at 1GB/s, etc. With oracle partitioning is easier, and everything else on the freaking planet is harder. Scott, thank you for the best laugh I've had all day! I started out on Oracle (some 20 years ago) and have been running both MySQL and PostgreSQL databases for the last 10 years or so. I'd take PostgreSQL over the other two in a heartbeat! Data integrity/data preservation issues (backup is just one aspect of that) are going to be your biggest problems with VERY large databases, no matter how much money you throw at it. -- Mike Nolan
Re: [GENERAL] Money casting too liberal?
On 3/27/13, Steve Crawford scrawf...@pinpointresearch.com wrote: Somewhat more worrisome is the fact that it automatically rounds input (away from zero) to fit. select '123.456789'::money; money - $123.46

So does casting to an integer:

select 1.25::integer;
 int4
------
    1

And then there's this:

create table wkdata (numval numeric(5,2));
CREATE TABLE
Time: 6.761 ms
insert into wkdata values (123.456789);
INSERT 569625265 1
Time: 4.063 ms
select * from wkdata;
 numval
--------
 123.46

So rounding a money field doesn't seem inconsistent with other data types. -- Mike Nolan
Re: [GENERAL] .pgpass and root: a problem
On Tue, Feb 5, 2013 at 1:57 PM, Scott Mead sco...@openscg.com wrote: I would love to see pgpass storing encrypted stuff here, that'd be great... in the meantime... I would suggest going one step further, and making encrypted pgpass authorization something that has to be specifically enabled in pg_hba.conf. -- Mike Nolan
Re: [GENERAL] query by partial timestamp
On 1/8/13, Gavan Schneider pg-...@snkmail.com wrote: 2. SELECT ... WHERE '2011-01-01'::TIMESTAMP <= col_of_type_timestamp AND col_of_type_timestamp <= '2011-12-31'::TIMESTAMP; This won't quite work, because '2011-12-31'::TIMESTAMP is the same as 2011-12-31 00:00:00.0, so records timestamped later in the day on the 31st would not get selected. SELECT ... WHERE '2011-01-01'::TIMESTAMP <= col_of_type_timestamp AND col_of_type_timestamp < '2012-01-01'::TIMESTAMP; would get all records with a 2011 timestamp. -- Mike Nolan
Re: [GENERAL] query by partial timestamp
It is probably not the most efficient, but I often use this syntax, which reads better: Select ... where col_type_timestamp::date between '2011-01-01' and '2011-12-31' This will use a timestamp index. -- Mike Nolan
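One way to be sure the date-cast form stays indexable is an expression index on the cast. For a timestamp without time zone column the cast to date is immutable, so a sketch like this works (table and column names assumed); for timestamptz the cast depends on the TimeZone setting and PostgreSQL will reject the index:

```sql
CREATE INDEX events_day_idx ON events ((col_type_timestamp::date));

SELECT count(*)
  FROM events
 WHERE col_type_timestamp::date BETWEEN '2011-01-01' AND '2011-12-31';
```

The planner can match the WHERE clause to the index only because the indexed expression is written exactly the same way in the query.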
[GENERAL] Any experience with Drobo SAN and PG?
I'm looking to spec a new production server for a small client and have been looking at the Drobo SAN units. Has anybody run PG on one of these yet? It looks like only the B1200i supports Linux operating systems.
Re: [GENERAL] Streaming replication and high query cancellation values
On 8/1/12, Christophe Pettus x...@thebuild.com wrote: I have a couple of questions about how streaming replication works in the presence of a high timeout for query cancellation: Are you referring to queries on the slave? The master doesn't know what the slave is doing, so it would keep on shipping streaming replication data. -- Mike Nolan
Re: [GENERAL] Replication/cloning: rsync vs modification dates?
On 7/16/12, Chris Angelico ros...@gmail.com wrote: I'm speccing up a three-node database for reliability, making use of streaming replication, and it's all working but I have a bit of a performance concern. Can the individual files' modification timestamps be relied upon? If so, it'd potentially mean a lot of savings, as the directory entries can be read fairly efficiently. I could still then use rsync to transfer those files (so if it's only a small part that's changed, we take advantage of its optimizations too). I did several weeks of tests on 9.1.3 using mod time and file size rather than checksumming the files, that did not appear to cause any problems and it sped up the rsync considerably. (This was about a 40 GB database.) -- Mike Nolan
Re: [GENERAL] Replication/cloning: rsync vs modification dates?
On 7/16/12, Chris Angelico ros...@gmail.com wrote: On Tue, Jul 17, 2012 at 1:40 AM, Michael Nolan htf...@gmail.com wrote: I did several weeks of tests on 9.1.3 using mod time and file size rather than checksumming the files, that did not appear to cause any problems and it sped up the rsync considerably. (This was about a 40 GB database.) Thanks! Is file size a necessary part of the check, or can mod time alone cover it? I'm looking at having my monitoring application automatically bring database nodes up, so it looks like the simplest way to handle it will be to have the new slave mandatorially do the backup/rsync, even if it's been down for only a couple of minutes. With a mod time check, I could hopefully do this without too much hassle. As I understand the docs for rsync, it will use both mod time and file size if told not to do checksums. -- Mike Nolan
Re: [GENERAL] Replication/cloning: rsync vs modification dates?
On 7/16/12, Sergey Konoplev sergey.konop...@postgresql-consulting.com wrote: On Mon, Jul 16, 2012 at 8:01 PM, Chris Angelico ros...@gmail.com wrote: On Tue, Jul 17, 2012 at 1:58 AM, Michael Nolan htf...@gmail.com wrote: As I understand the docs for rsync, it will use both mod time and file size if told not to do checksums. I wonder if it is correct in general to use mtime and size to perform these checks from the point of view of PostgreSQL. If it works with the current version then is there a guaranty that it will work with the future versions? There are many things for which no guarantee of future compatibility (or sufficiency) are the case. For that matter, there's really no assurance that timestamp+size is sufficient NOW. But checksums aren't 100% reliable, either. Without doing a byte-by-byte comparison of two files, there's no way to ensure they are identical. -- Mike Nolan
Re: [GENERAL] Replication/cloning: rsync vs modification dates?
On 7/16/12, Steven Schlansker ste...@likeness.com wrote: I think it's pretty easy to show that timestamp+size isn't good enough to do this 100% reliably. That may not be a problem if the slave server synchronization code always starts to play back WAL entries at a time before the worst case for timestamp precision. I'm assuming here that the WAL playback process works something like this: Look at a WAL entry, see if the disk block it references matches the 'before' indicators for that block in the WAL. If so, update it to the 'after' data content. There are two non-matching conditions: If the disk block information indicates that it should match a later update, then that block does not need to be updated. But if the disk block information indicates that it should match an earlier update than the one in the WAL entry, then the synchronization fails.
Re: [GENERAL] Hot standby streaming replication doesn't work
On Sun, Jun 24, 2012 at 1:57 AM, Tim Uckun timuc...@gmail.com wrote: I am following the instructions on the wiki https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial#PITR.2C_Warm_Standby.2C_Hot_Standby.2C_and_Streaming_Replication using the 10 minute version of the setup. What version of postgresql are you running? -- Mike Nolan
Re: [GENERAL] Procedural Languages
On Thu, May 31, 2012 at 10:36 AM, John Townsend jtowns...@advancedformulas.com wrote: There are at least 10 procedural languages (http://en.wikipedia.org/wiki/PL/pgSQL) available for PostgreSQL. The one that comes with the installation is PL/pgSQL. Which ones do you use and why? Thanks, John Townsend PL/pgSQL and PL/perlu are the only ones I use. I use PL/perlu primarily to launch shell scripts from triggers, for example to update an external website when a row in a table has been inserted, deleted or updated. -- Mike Nolan
Fwd: [GENERAL] Procedural Languages
-- Forwarded message -- From: Michael Nolan htf...@gmail.com Date: Thu, May 31, 2012 at 2:49 PM Subject: Re: [GENERAL] Procedural Languages To: Darren Duncan dar...@darrenduncan.net On Thu, May 31, 2012 at 2:23 PM, Darren Duncan dar...@darrenduncan.net wrote: Michael Nolan wrote: PL/pgSQL and PL/perlu are the only ones I use. I use PL/perlu primarily to launch shell scripts from triggers, for example to update an external website when a row in a table has been inserted, deleted or updated. There is also another way to do what you describe that might be more secure. Rather than having the DBMS launch shell scripts directly, instead use LISTEN/NOTIFY messaging, where the trigger posts a message, and you have an ordinary client script listening for them, and the client script launches the shell scripts when it gets a message. This way, you need a persistent client script, but you don't need to invoke the shell in the DBMS ... or use the untrusted version of PL/Perl if that's all it was for. -- Darren Duncan Anybody have examples of a persistent client script? -- Mike Nolan
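The database half of that pattern is small. A sketch of the notifying trigger (channel and table names are arbitrary); the persistent client is then any ordinary connection that runs LISTEN table_changed and blocks waiting for notifications:

```sql
CREATE OR REPLACE FUNCTION announce_change() RETURNS trigger AS $$
BEGIN
    -- pg_notify with a payload requires 9.0 or later
    PERFORM pg_notify('table_changed', TG_TABLE_NAME || ':' || TG_OP);
    RETURN NULL;   -- return value of an AFTER statement trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER announce_trg
    AFTER INSERT OR UPDATE OR DELETE ON some_table
    FOR EACH STATEMENT EXECUTE PROCEDURE announce_change();
```

Note that notifications are delivered only when the transaction commits, so the listening client never acts on work that later rolls back.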
Fwd: [GENERAL] Disable Streaming Replication without restarting either master or slave
-- Forwarded message -- From: Michael Nolan htf...@gmail.com Date: Tue, May 29, 2012 at 1:37 PM Subject: Re: [GENERAL] Disable Streaming Replication without restarting either master or slave To: Fujii Masao masao.fu...@gmail.com On Tue, May 29, 2012 at 1:15 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, May 29, 2012 at 10:17 PM, François Beausoleil franc...@teksol.info wrote: Are per-chance looking for pg_xlog_replay_pause() and pg_xlog_replay_resume() ? Those can pause and resume WAL replay in the standby, but not streaming replication. Even while WAL replay is being paused, WAL can be streamed from the master to the standby. Regards, -- Fujii Masao So, that means that the only ways to stop streaming replication are to stop the slave server, to disable access to the master via the pg_hba.conf file (requiring the master configs be reloaded) or to set the trigger file on the slave to tell it to stop replicating the master. And if the master/slave are set to synchronous streaming replication, your options are more limited, since the master has to know to stop waiting for the synchronous slave to respond. Once the slave has gone out of asynchronous replication mode, would it be possible to resume asynchronous replication by stopping the slave server, removing the trigger file, and restarting it in asynchronous streaming replication mode? This would, at a minimum, depend on how many updates have occurred on the master during the time streaming replication was disabled and having all the WAL files available, right? -- Mike Nolan
Re: [GENERAL] Lost one tablespace - can't access whole database
On Sat, May 5, 2012 at 4:19 PM, Stefan Tzeggai tzeg...@wikisquare.de wrote: Hi postgresql 9.1 on Ubuntu 10.04 All important information is in the other tablespaces. I would be totally happy to just lose all relations in that lost tablespace. It's just indexes. Is there any way to tell PG to drop/ignore that tablespace and access the database? Stefan, the reason you're getting those messages when you try to access any tables with SQL is because it is trying to access the indexes in the lost tablespace. I tried recreating your problem on a test server and you do have a few options; which you choose may depend on how big your database is. First, if you haven't already done so, BEFORE DOING ANYTHING ELSE, make a complete file level backup of your database (after shutting it down), less the lost tablespace, of course. There are two types of options that come to mind, there may be others. You should be able to pg_dump your database table by table. I haven't tried it, but I think dumping your databases one by one should work, too, since pg_dump doesn't appear to need to access the missing indexes. pg_dumpall appears to work, too. This gives you several choices, depending upon how many tables had indexes in the lost tablespace. You could, for example, just dump and restore the affected tables. Or you could restore the affected database(s) completely or the entire system from the pg_dumpall file. Another option that seems to work for me is this: 1. Recreate the missing directories in the lost tablespace, specifically the one that starts with 'PG_9.1' and the subdirectories under it. The error messages from psql will tell you what their exact names were. 2. Re-index all the tables that had indexes in the lost tablespace. Whichever method you use, you need to re-think your backup protocols. You got lucky here, because there were only index files in the tablespace you lost. Next time you may not be so fortunate. -- Mike Nolan
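For the re-index step, the system catalogs can list exactly which indexes lived in the lost tablespace before rebuilding; a sketch (the tablespace name is a placeholder):

```sql
-- Indexes whose storage was in the lost tablespace:
SELECT i.indexrelid::regclass AS index_name,
       i.indrelid::regclass  AS table_name
  FROM pg_index i
  JOIN pg_class c ON c.oid = i.indexrelid
  JOIN pg_tablespace t ON t.oid = c.reltablespace
 WHERE t.spcname = 'lost_tablespace';

-- Then rebuild each affected table's indexes:
REINDEX TABLE affected_table;
```

pg_class.reltablespace is 0 for objects in the database's default tablespace, so the join only matches indexes that were explicitly placed elsewhere.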
Re: [GENERAL] Streaming replication: sequences on slave seemingly ahead of sequences on master
On Mon, May 7, 2012 at 4:01 AM, Vincent de Phily vincent.deph...@mobile-devices.fr wrote: On Sunday 06 May 2012 10:29:17 Simon Riggs wrote: On 4 May 2012 14:55, Vincent de Phily vincent.deph...@mobile-devices.fr wrote: Would be nice to see it added to the documentation (unless I just didn't find it ?), as it is quite surprising, and might lead to problems if people expect to be able to read sequence values from the slave. What people need to understand is that there is no way to 'read' a sequence value from a slave. 'SELECT * from sequence_name' will not reliably give you either the most recently assigned or the next sequence value. This is currently covered in the documentation for sequences, but could probably be improved upon and mentioned somewhere in the documentation on setting up slave servers. (I will look at adding it to the binary replication tutorial wiki page.) Since 'nextval' cannot be called on a sequence on a slave (because a slave can only support read-only transactions), 'currval' will by definition return an error. To cross-pollinate with another thread, if temporary tables (and insert/delete/update transactions to them) are to be supported on a slave, will the applications using those temporary tables expect to be able to use 'nextval' on inserts to temporary tables as well? As a bonus question, I guess it would be the same if using synchroneous replication ? Yes. -- Mike Nolan
Re: [GENERAL] Streaming replication: sequences on slave seemingly ahead of sequences on master
This is due to how sequences are pre-allocated in blocks to sessions running on the master. Since the slave is updated via the WALs, and not via 'nextval' function calls in queries, the sequences that are actually used will remain in sync with the master. -- Mike Nolan
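A sketch of why the slave can appear 'ahead': with a CACHE clause each session grabs a whole block of values at once, and what gets WAL-logged (and therefore streamed to the slave) is the advanced on-disk position, not the value most recently handed out:

```sql
CREATE SEQUENCE demo_seq CACHE 20;

-- On the master this returns 1, but the session has reserved a block of
-- values up front, so the sequence state shipped to the slave is already
-- past the values actually in use.
SELECT nextval('demo_seq');
```

The same effect occurs even without CACHE, since sequence advances are WAL-logged in batches rather than one value at a time.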
Re: [GENERAL] Issue with rsync based incremental backup : postgres: startup process waiting for 0000000100000001000000D2
On Thu, May 3, 2012 at 11:49 AM, Samba saas...@gmail.com wrote: Hi, please advise me if what I'm doing makes sense and is an accepted mechanism for taking backups, or if there is any other procedure that I can employ to avoid unnecessarily archiving gigabytes of WAL logs which may grow to many times the size of the actual data directory. Thanks and Regards, Samba

The problem is that rsync isn't copying all the xlog files created during the time the rsync is taking place, which is why it is complaining that there are files missing. There may be other logical flaws with your process as well.

Something similar to the steps given in 'Starting Replication with only a Quick Master Restart' as laid out in the wiki tutorial on binary replication might give you a way to make this work. (You probably won't need the restart of the master, since you're not actually setting up replication, so you won't be changing the postgresql.conf file on your master.) This uses a two-step process: first you copy all the files EXCEPT the ones in pg_xlog, then you copy those files, so you have a complete set.

See http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial -- Mike Nolan
Re: [GENERAL] Re: how to set up automatically startup database when the server boot or reboot.
Your options range from doing something simple to something complex. A simple option on a Linux server would be placing a command like this in /etc/rc.d/rc.local:

su - postgres -c "/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logfile start"

However, that might not be the optimal choice for every situation, because it doesn't take into account WHY the system rebooted. A system that rebooted because of a power/UPS issue might need to be treated differently than one that rebooted because of a hardware failure or kernel panic. (And just because postgres can restart the database, that doesn't always mean it should. Even a well-written startup script might not know enough to make that decision for you.)

This might be good material for a tutorial on the wiki site, with some system-specific sections. -- Mike Nolan
Re: [GENERAL] Backups using Solaris ZFS Snapshots
On Tue, Apr 24, 2012 at 4:08 PM, Yunong J Xiao yjx...@gmail.com wrote: I am currently backing up my postgres instances using ZFS snapshots instead of the sanctioned pg_dump utility, mainly because I am running on Solaris and it offers a copy-on-write file system. Anecdotally this has been working fine for me. Are there any issues I should be aware of since I'm not using pg_dump at all but merely copying snapshots of the postgres data directory?

As a matter of principle, you should test your backup strategy periodically, no matter what it is! This test should tell you:

1. Whether the backup method even works. (Sadly, a backup method that worked in the past may no longer work.)
2. What steps are needed to recover from a backup.
3. How much data loss (if any) you are likely to experience.
4. How long it will take to bring up the recovered database.

All of these are things that you need to know in advance, and your management will be vitally interested in #3 and #4 when the occasion arises to have to use a backup. You also need to know how to do partial recoveries (such as recovering an inadvertently deleted or corrupted but important table). Not all backup strategies lend themselves readily to partial recoveries. -- Mike Nolan
Re: [GENERAL] PostgreSQL 9.1 Hot Backup Error: the database system is starting up
On Thu, Apr 19, 2012 at 12:46 PM, Jen jennifer.s...@oeconnection.com wrote: I have been working on a hot backup for Postgres 9.1 for awhile and have run into a consistent issue.

The instructions in the Binary Replication Tutorial work well for me; I suggest you read through the '10 minute' version. Specifically, look at the way the rsyncs are done in two stages, one while the primary database is in backup mode, and one afterwards. -- Mike Nolan
Re: [GENERAL] PostgreSQL 9.1 Hot Backup Error: the database system is starting up
On Thu, Apr 19, 2012 at 1:07 PM, Michael Nolan htf...@gmail.com wrote: On Thu, Apr 19, 2012 at 12:46 PM, Jen jennifer.s...@oeconnection.com wrote: I have been working on a hot backup for Postgres 9.1 for awhile and have run into a consistent issue. The instructions in the Binary Replication Tutorial work well for me; I suggest you read through the '10 minute' version. Specifically, look at the way the rsyncs are done in two stages, one while the primary database is in backup mode, and one afterwards. -- Mike Nolan

Sorry, forgot the link: http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
Re: [GENERAL] LOCK TABLE is not allowed in a non-volatile function
On Tue, Apr 17, 2012 at 5:20 PM, Eliot Gable ega...@gmail.com wrote: I cannot find a single non-volatile function in the call path; so I am baffled on where this error message is coming from. I would be thankful for any ideas anyone might have on where this error message might be coming from or how to locate where it is coming from. According to the documentation, the current_timestamp family of functions is stable, could that be the cause? Better yet, should it? -- Mike Nolan
Re: [GENERAL] Searchable chess positions in a Postgress DB
On Thu, Apr 12, 2012 at 4:50 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: On 11/04/12 21:24, Gavin Flower wrote: On 11/04/12 19:15, Sidney Cadot wrote: Dear all, As a hobby project, I am toying around with a database containing about 5 million chess games. On average, these games have about 80 positions (~ 40 moves by both black and white), which means there are about 400 million chess positions in there. If you haven't done so already, you should read through the literature on chess and computers. I'm quite a few years out of date, but there's been a lot of research into efficient ways to store and search chess positions, and some of it may have dealt with SQL database structures. -- Mike Nolan
Re: [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
On 4/11/12, 乔志强 qiaozhiqi...@leadcoretech.com wrote: Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64, the amount of disk space for WAL files is only 1GB, so there is no need to worry so much, I think. No? But when a transaction larger than 1GB...

Then you may need WAL space larger than 1GB as well. For replication to work, it seems likely that you may need to have sufficient WAL space to handle a row, possibly the entire transaction. But since a single statement can update thousands or millions of rows, do you always need enough WAL space to hold the entire transaction?

So in sync streaming replication, if master delete WAL before sent to the only standby, all transaction will fail forever, the master tries to avoid a PANIC error rather than termination of replication. but in sync replication, termination of replication is THE bigger PANIC error.

That's somewhat debatable. Would I rather have a master that PANICed or a slave that lost replication? I would choose the latter. A third option, which may not even be feasible, would be to have the master fail the transaction if synchronous replication cannot be achieved, although that might have negative consequences as well.

Another question: Does master send WAL to standby before the transaction commit?

That's another question for the core team, I suspect. A related question is what happens if there is a rollback? -- Mike Nolan

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
On 4/11/12, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Michael Nolan htf...@gmail.com wrote: On 4/11/12, 乔志强 qiaozhiqi...@leadcoretech.com wrote: But when a transaction larger than 1GB... Then you may need WAL space larger than 1GB as well. For replication to work, it seems likely that you may need to have sufficient WAL space to handle a row, possibly the entire transaction. But since a single statement can update thousands or millions of rows, do you always need enough WAL space to hold the entire transaction? No.

Does master send WAL to standby before the transaction commit? Yes.

A related question is what happens if there is a rollback? PostgreSQL doesn't use a rollback log; WAL files can be reclaimed as soon as the work they represent has been persisted to the database by a CHECKPOINT, even if it is not committed. Because there can be multiple versions of each row in the base table, each with its own xmin (telling which transaction committed it) and xmax (telling which transaction expired it), visibility checking can handle the commits and rollbacks correctly. It also uses a commit log (CLOG), hint bits, and other structures to help resolve visibility. It is a complex topic, but it does work.

Thanks, Kevin. That does lead to a question about the problem that started this thread, though. How does one determine how big the WAL space needs to be to not cause streaming replication to fail? Or maybe this is a bug after all? -- Mike Nolan
Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
On 4/11/12, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Apr 11, 2012 at 3:31 PM, 乔志强 qiaozhiqi...@leadcoretech.com wrote: So in sync streaming replication, if master delete WAL before sent to the only standby, all transaction will fail forever, the master tries to avoid a PANIC error rather than termination of replication. but in sync replication, termination of replication is THE bigger PANIC error. I see your point. When there are backends waiting for replication, the WAL files which the standby might not have received yet must not be removed. If they are removed, replication keeps failing forever because required WAL files don't exist in the master, and then waiting backends will never be released unless replication mode is changed to async. This should be avoided. To fix this issue, we should prevent the master from deleting the WAL files including the minimum waiting LSN or bigger ones. I'll think more and implement the patch.

With asynchronous replication, does the master even know if a slave fails because of a WAL problem? And does/should it care?

Isn't there a separate issue with synchronous replication? If it fails, what's the appropriate action to take on the master? PANICing it seems to be a bad idea, but having transactions never complete because they never hear back from the synchronous slave (for whatever reason) seems bad too. -- Mike Nolan
Fwd: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
-- Forwarded message -- From: Michael Nolan htf...@gmail.com Date: Wed, 11 Apr 2012 14:48:18 -0400 Subject: Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ? To: Robert Haas robertmh...@gmail.com

On Wed, Apr 11, 2012 at 2:14 PM, Robert Haas robertmh...@gmail.com wrote: We've talked about teaching the master to keep track of how far back all of its known standbys are, and retaining WAL back to that specific point, rather than the shotgun approach that is wal_keep_segments. It's not exactly clear what the interface to that should look like, though.

Moreover, how does the database decide when to drop a known standby from the queue because it has failed, and how does the DBA notify the database that a particular standby should no longer be included?
Fwd: [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
-- Forwarded message -- From: Michael Nolan htf...@gmail.com Date: Tue, Apr 10, 2012 at 9:47 PM Subject: Re: [GENERAL] [streaming replication] 9.1.3 streaming replication bug ? To: Fujii Masao masao.fu...@gmail.com

On Tue, Apr 10, 2012 at 9:09 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Apr 11, 2012 at 10:06 AM, 乔志强: What can I do when I need a backup standby server and wal_keep_segments = 3 to save master disk usage? (Under heavy load the master now deletes WAL before it is sent to the standby. Do I need to modify some config?) Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64, the amount of disk space for WAL files is only 1GB, so there is no need to worry so much, I think. No?

If you're writing records with a 100MB blob object in them, you definitely need to keep more than 3 WAL segments at a time, because at 16MB each that won't hold even one of your largest records. That's the kind of value-added information that the DBA brings to the table that the database itself won't know, which is why one of the DBA's most important tasks is to properly configure the postgresql.conf file, and to revise it as the database changes over time. -- Mike Nolan
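The sizing arithmetic above can be written as a postgresql.conf sketch (the value is illustrative, not a recommendation):

```
# Each WAL segment is 16MB by default, so the retained WAL is roughly
# wal_keep_segments * 16MB.  64 segments ~ 1GB, enough to cover several
# of the 100MB rows described above.
wal_keep_segments = 64
```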
Re: [GENERAL] 9.1.3 Standby catchup mode
On Thu, Apr 5, 2012 at 12:35 PM, hans wulf lo...@gmx.net wrote: I am wondering how the catchup mode of a hot synchronous slave server works on 9.1.3 if there is no WAL archive.

Why would you not want to maintain a WAL archive? Are you depending on the slave server(s) as your only form of backup? It isn't clear what you want from synchronous streaming replication, or if you understand the difference between synchronous streaming replication and asynchronous streaming replication. -- Mike Nolan
Re: [GENERAL] 9.1.3: launching streaming replication
On Mon, Apr 2, 2012 at 4:21 PM, Welty, Richard rwe...@ltionline.com wrote:

I got similar messages the first few times I tried to start up my slave server; I never did figure out exactly what caused it. You can either delete all the files on the slave and try again, or do what I did: write a script that handles transferring just the files needed to resync the slave.

Here's the script I've been using to transfer the files between my two servers to resync them. This is not a production-ready script. I have a second tablespace, so there are two 'data' transfers plus the xlog transfer. (You may run into issues transferring the pg_tblspc directory, as I did, hence the '--safe-links' parameter. The '--delete' option deletes any files on the slave that aren't on the master, unless you list them in an '--exclude' clause.)

/usr/local/pgsql/bin/psql -c "select pg_start_backup('tardir',true)" postgres postgres
rsync -av --exclude log.out --exclude postgresql.conf \
  --exclude postgresql.pid --delete --exclude pg_hba.conf \
  --exclude pg_xlog --exclude server.crt --exclude server.key \
  --exclude restore.conf --exclude restore.done \
  --safe-links /usr/local/pgsql/data/ postgres@xxx:/usr/local/pgsql/data
rsync -av /usr/local/pgsql/data2/ postgres@xxx:/usr/local/pgsql/data2
/usr/local/pgsql/bin/psql -c "select pg_stop_backup()" postgres postgres
rsync -av /usr/local/pgsql/data/pg_xlog postgres@xxx:/usr/local/pgsql/data/
echo ok to start standby

-- Mike Nolan
Re: [GENERAL] 9.1.3: launching streaming replication
On Mon, Apr 2, 2012 at 6:19 PM, Michael Nolan htf...@gmail.com wrote: I got similar messages the first few times I tried to start up my slave server, I never did figure out exactly what caused it. One possibility is that I may not have restarted the master server after changing the postgresql.conf file, thus the server still didn't know it was going into hot-standby mode. -- Mike Nolan
Re: [GENERAL] Problems with Binary Replication
On Sat, Mar 31, 2012 at 6:58 PM, Andreas maps...@gmx.net wrote: Now what could one do to prevent those sequence gaps? There might be scenarios where it's important not to have gaps in the numbering even when one has to switch to the standby if there is a failure on the master. E.g. numbers of invoices need to be gapless.

Then you may need to find some other way within your application to assign invoice numbers, because sequences aren't GUARANTEED not to have gaps, especially if there is a failure of the primary server that results in a switch over to the standby server. A transaction that is rolled back (such as due to an error) after the nextval() function has been called will not roll back the sequence value, for example. You cannot issue a nextval() call on a standby server, because it is in read-only mode. -- Mike Nolan
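One common application-level approach, sketched here with hypothetical names: keep the last-used invoice number in a single-row table and increment it inside the same transaction that creates the invoice. The row lock taken by UPDATE serializes concurrent callers, and unlike nextval() the increment rolls back with the transaction, so committed invoices get gapless numbers (at the cost of contention on that one row):

```sql
CREATE TABLE invoice_counter (id int PRIMARY KEY, last_invoice bigint NOT NULL);
INSERT INTO invoice_counter VALUES (1, 0);

-- Inside the transaction that creates the invoice:
UPDATE invoice_counter
   SET last_invoice = last_invoice + 1
 WHERE id = 1
RETURNING last_invoice;  -- use this value as the invoice number
```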
Re: [GENERAL] huge price database question..
right now I am having about 7000 tables for individual stocks and I use perl to do inserts; it's very slow. I would like to use copy or another bulk loading tool to load the daily raw gz data, but I need to split the file into per-stock files first before I do bulk loading. I consider this a bit messy.

Are you committing each insert separately or doing them in batches using 'begin transaction' and 'commit'? I have a database that I do inserts into from a text file. Doing a commit every 1000 transactions cut the time by over 90%. -- Mike Nolan
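A sketch of the batching idea, with a hypothetical table; the point is one commit per batch rather than one per row, plus COPY for raw file loads:

```sql
BEGIN;
INSERT INTO daily_prices (symbol, trade_date, close_price) VALUES ('AAA', '2012-03-19', 10.00);
INSERT INTO daily_prices (symbol, trade_date, close_price) VALUES ('BBB', '2012-03-19', 12.50);
-- ... a few hundred to a few thousand rows per batch ...
COMMIT;

-- For the daily files themselves, COPY is faster still than batched INSERTs:
COPY daily_prices FROM '/path/to/daily.csv' WITH CSV;
```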
Re: [GENERAL] How to isolate the result of SELECT's?
On Sun, Mar 18, 2012 at 6:33 PM, Andre Lopes lopes80an...@gmail.com wrote: Hi, I need to do an operation in which I will run some SELECTs and get the results, but I want to be sure that those tables have not been changed by INSERTs or UPDATEs during the operation. Example:

BEGIN OPERATION
Select field from table1;
...
Select other_field from table2;
...
END OPERATION

How can I lock these tables to ensure that the tables are not getting INSERTs or UPDATEs during the operation? Best Regards,

Isn't that what 'begin transaction' and 'commit' are for? -- Mike Nolan
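If the goal is only a consistent view of the data (rather than actually blocking other sessions from writing), a snapshot-isolation transaction avoids table locks entirely; a sketch using the table names from the question:

```sql
-- Both SELECTs see the same snapshot; concurrent INSERTs/UPDATEs by
-- other sessions are simply invisible here and are not blocked.
BEGIN ISOLATION LEVEL SERIALIZABLE;  -- REPEATABLE READ also suffices
SELECT field FROM table1;
SELECT other_field FROM table2;
COMMIT;
```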
Re: [GENERAL] Recommendations for SSDs in production?
On Thu, Nov 3, 2011 at 4:15 AM, Allan Kamau kamaual...@gmail.com wrote: How about SSDs on RAID 1+0 (I have no experience with SSDs and RAID though), with replication to another server having the same setup, while still doing frequent backups? The Crucial m4 SSDs seem to be reasonably priced and perform well. The savings on power and cooling may be used to offset some of the cost of the warm standby server.

A question I have wondered about is whether RAID controllers, which were designed with conventional disk drives in mind, aren't likely to spread the write load out fairly evenly among the SSDs, and thus lead to the situation where all of the drives are approaching their rated write-cycle capacity at around the same time. I've asked a few RAID manufacturers whether their controllers can be reconfigured to use SSDs more appropriately; I have yet to get a substantive answer.

Benjamin, have you checked to see if your 'sudden death' problem is heat related? -- Mike Nolan
Re: [GENERAL] securing the sql server ?
On Mon, Aug 22, 2011 at 3:40 AM, Condor con...@stz-bg.com wrote: Hello ppl, can anyone tell me how I can secure a Linux server with a PostgreSQL database, for example? I'm thinking of making a cryptfs file system and deploying the database over the cryptfs. The problem here may be that whenever the front end needs any data, the server's CPUs will always be decrypting/encrypting data and performance will be very low. I remember a few months ago someone asked a similar question about how to encrypt data stored in the database, and the problem was the key: the key is stored on the same server, so if someone gets access they can decrypt the data. Does anyone have ideas how to make something like a crypt bubble and store the database there? Or something else?

Recently, as a test, I created a database using encfs and ran some tests against the same database unencrypted. In both cases the data was being stored on a 500 GB external hard drive connected via USB2 to an HP laptop running Linux Fedora 15. I found that the encrypted database ran 15-20% slower on PostgreSQL 9.0.4 on most queries. -- Mike Nolan
[GENERAL] using xmin in a query?
Why does this query succeed:

select count(*) from tablename where xmin = 2

while this query fails:

select count(*) from tablename where xmin != 2

The latter generates an error message (using 9.0.4, but it does not seem to be version-specific):

ERROR: operator does not exist: xid <> integer
LINE 1: select count(*) from tablename where xmin != 2;
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.

What cast or comparison operator would work? You cannot cast an xid to an integer, nor can you cast an integer to an xid. The only way I can get this to work is:

select count(*) from tablename where not xmin = 2

That seems pretty obscure. -- Mike Nolan no...@tssi.com
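One workaround (my assumption, not something suggested in the thread): go through text, since xid has a text representation even though there is no direct cast to integer:

```sql
-- xid -> text uses the type's output function; text -> bigint then
-- gives an ordinary comparable value.
SELECT count(*) FROM tablename WHERE xmin::text::bigint <> 2;
```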
Re: [GENERAL] using xmin in a query?
On Thu, Jul 28, 2011 at 12:23 PM, Andy Colson a...@squeakycode.net wrote: On 7/28/2011 11:40 AM, Michael Nolan wrote: Why does this query succeed: select count(*) from tablename where xmin = 2 while this query fails: select count(*) from tablename where xmin != 2 You probably want <>: select count(*) from tablename where xmin <> 2 -Andy

That doesn't work either. -- Mike Nolan
Re: [GENERAL] using xmin in a query?
On Thu, Jul 28, 2011 at 2:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Nolan htf...@gmail.com writes: Why does this query succeed: select count(*) from tablename where xmin = 2 while this query fails: select count(*) from tablename where xmin != 2 It told you why not: ERROR: operator does not exist: xid <> integer You could do where not (xmin = 2), I suppose.

I understand that, Tom, and my original post did cite 'not xmin = 2' as working. The parentheses appear to be optional, though in a more complex query they would probably be necessary to make sure it parses properly. It appears to me that it is doing an implicit cast of the integer '2' into an xid in the first query. It seems like we're being inconsistent here in allowing 'where xid = integer' but not allowing 'where xid != integer'. Is there no explicit 'cast to xid' available? -- Mike Nolan
Re: [GENERAL] using xmin in a query?
On Thu, Jul 28, 2011 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Nolan htf...@gmail.com writes: It seems like we're being inconsistent here in allowing 'where xid = integer' but not allowing 'where xid != integer'. Well, if you look into pg_operator you'll soon find that there are exactly two built-in operators that accept type xid: =(xid,xid) and =(xid,integer) (where I'd say the latter is just a kluge). There hasn't previously been any demand to flesh it out more than that. Do you have an actual use-case where <> would be helpful, or is this just experimentation?

I'm not sure yet. I was doing some thinking about ways to do incremental backups (at least for inserted/updated rows; deleted rows present a different challenge), and was just doing some simple queries to see what worked and what didn't. It also appears you cannot group on a column of type xid. Would adding a <> operator enable that? -- Mike Nolan
Re: [GENERAL] using xmin in a query?
On Thu, Jul 28, 2011 at 5:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Nolan htf...@gmail.com writes: It also appears you cannot group on a column of type xid. You can in 8.4 and up. Previous versions only know how to GROUP BY sortable columns, which requires a btree opclass, which xid doesn't have and really can't have because it doesn't have a linear ordering. There is a hash opclass for it, though, so in versions that know how to GROUP BY using hashing, it'll work.

:sigh: I thought I had done all the tests on my 9.0.4 testbed server too. One of these days I hope to get the production and development servers off 8.2. I more or less understand why xid types don't have a linear ordering from Robert Haas's tutorial at PGCon 11. So, a <> operator (either (xid,xid) or (xid,integer)) would need to be implemented using the hash opclass, correct? (I don't have a use case for it yet, though.)

Would adding a <> operator enable that? No, it's pretty irrelevant ...

OK, thanks for putting up with my noobie questions. -- Mike Nolan
Re: [GENERAL] Suggested enhancement to pg_restore
On Tue, Jul 26, 2011 at 6:10 PM, Chris Travers chris.trav...@gmail.com wrote: On Tue, Jul 26, 2011 at 3:48 PM, Michael Nolan htf...@gmail.com wrote: I suggest adding the following parameter to pg_restore: --rename-table= When used in conjunction with the --data-only, --schema and -t options (all three of which would be necessary), it would allow restoring a table (without indexes) to a different table name (which would need to already exist and match the structure of the table being restored, of course.) Does pg_restore allow you to specify a set of tables the same way pg_dump does, i.e. by -t table1 -t table2? If so, how would this feature play along?

Not sure; the man page for pg_restore seems to imply that -t can be used to restore just ONE table, though it also seems to say that pg_restore can be used to affect the order in which the tables are restored. If it can handle multiple -t entries, presumably they must all be in the same schema, otherwise things could get really confused if the same table name exists in more than one schema.

If multiple -t entries are supported, I guess we would have two options:

1. Only allow one table to be restored using the --rename-table parameter at a time.
2. Require that the command have matching pairs of -t and --rename-table entries to make sure that the tables are restored to the intended new names.

I don't have a major preference between these, though I suspect #1 would be easier to implement. -- Mike Nolan no...@tssi.com
[GENERAL] Suggested enhancement to pg_restore
I suggest adding the following parameter to pg_restore: --rename-table= When used in conjunction with the --data-only, --schema and -t options (all three of which would be necessary), it would allow restoring a table (without indexes) to a different table name (which would need to already exist and match the structure of the table which is being restored, of course.) This would give PostgreSQL users the ability to reload a table from a dump file to a separate table name in the same database and schema. In other words, this command: pg_restore --data-only --schema=abc -t xyz --rename-table=xyz_copy would restore a copy of table xyz into the existing (and presumably empty) table xyz_copy, leaving table xyz untouched. -- Mike Nolan no...@tssi.com
Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?
On Mon, Jul 25, 2011 at 7:18 AM, Tomas Vondra t...@fuzzy.cz wrote: On 25 Červenec 2011, 11:39, Yan Chunlu wrote: I am using Debian and apt-get to install postgresql; dpkg's list shows they are the same. Is there any way to tell which compiler it was compiled with? Thanks! AFAIK there's no way to find out which compiler was used to build PostgreSQL binaries.

You can do a strings on a binary file (e.g., postmaster) and search for GCC in the output. -- Mike Nolan
Re: [GENERAL] Slow query with sub-select
2011/7/16 - - loh@hotmail.com The weird thing is that before I updated my server the query was about 5 times faster. Updated it from what to what, and how? -- Mike Nolan no...@tssi.com
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas robertmh...@gmail.com wrote: But if that's what you want, just don't put your data in different databases in the first place. That's what schemas are for. Sadly, DBAs don't always have the ability to put all their data in one database, even if that is what schemas are for. The ability to do cross-database (most likely cross-server as well) queries would address a lot of real-world problems. - Mike Nolan no...@tssi.com
Re: [GENERAL] Performance Monitoring of PostGRE
On Wed, Jul 6, 2011 at 3:57 AM, BangarRaju Vadapalli bangarraju.vadapa...@infor.com wrote: Hi Everybody, We want to monitor the performance of the PostGRE database. Could anyone please suggest any tools tried/working successfully… 1. We want an AWR kind of report in Postgres

By AWR do you mean something like AWStats? The impression I got at PGCon 11 is that all of the major log file analysis tools have their fans and their weaknesses. Splunk is on my 'I want to test this' list, but it gets pricey.

2. We would like to do CPU utilization monitoring based on the postgres processes

I don't think log file analysis tools can gather information about CPU usage. You'd need something that gathered real-time data from the OS, e.g. from /proc on a Linux kernel. sar doesn't tell you a lot about what postgres is up to. -- Mike Nolan no...@tssi.com
Re: [GENERAL] Contrib source
On Thu, Jun 30, 2011 at 10:17 AM, gnuo...@rcn.com wrote: D'oh! I didn't recall that it was packaged together, but the contrib source isn't in src, where I looked. Oh well.

It's not a separate file; there should be a contrib subdirectory in the source code tree. However, if you're using a packaged pre-built binary, you're pretty much at the mercy of the packager as to which contrib packages are built in (if any) or available separately. Using a pre-built binary and then building a contrib package from the source code might cause some problems; I've always just built everything from the source code. -- Mike Nolan no...@tssi.com
[GENERAL] An amusing MySQL weakness--not!
Earlier today I was working on a MySQL database (not by choice, I assure you), and I typed a statement like this:

Update tablexyz set field1 = '15' where field2 - 20;

The '-' was supposed to be an equal sign, but MySQL executed it anyway. (field2 is an integer.) I was not amused. PostgreSQL reports this as an error, of course. -- Mike Nolan no...@tssi.com
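For comparison, here is roughly what PostgreSQL does with the same slip (table and column names are the hypothetical ones from the statement above); the non-boolean WHERE clause is rejected outright instead of being treated as true for any nonzero value:

```sql
UPDATE tablexyz SET field1 = '15' WHERE field2 - 20;
-- ERROR:  argument of WHERE must be type boolean, not type integer
```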
[GENERAL] postgresql and encfs?
Has anyone successfully used encfs with postgresql recently? I'm not sure if this is specifically a postgresql problem, but I'm trying to get postgresql to run on an encrypted file system that is a copy of my live data directory tree (made after a shutdown, of course) and am getting the following errors in the log:

LOG: database system was shut down at 2011-06-21 23:21:08 CDT
LOG: checkpoint record is at 9/D6A7078
LOG: redo record is at 9/D6A7078; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 0/6712; next OID: 137017002
LOG: next MultiXactId: 1; next MultiXactOffset: 0
PANIC: could not open file pg_xlog/00010009000D (log file 9, segment 13): Invalid argument
LOG: startup process (PID 21248) was terminated by signal 6
LOG: aborting startup due to startup process failure

The database version here is 8.2.11, running on a Linux Fedora Core 14 server. The encrypted file system was created and mounted as the postgres user. I can read the pg_xlog/00010009000D file as the postgres user, and the log entries were written in a directory that is part of the encrypted file system, so it appears that encfs is working as it should. Since as far as I can tell the postgres user has full access to that directory once mounted, my best guess is that postgresql is trying to access the pg_xlog file as some user other than postgres. Mounting the file system in --public mode doesn't have any impact.

FWIW, the reason I'm doing this is that I'm trying to create a copy of a client database I can take with me to a conference this summer, but I'd prefer it to be encrypted in the event the portable drive it is on is lost or stolen. So far this appears to be the best option available. I'm testing it to see if performance is going to be a major concern. -- Mike Nolan no...@tssi.com
Re: [GENERAL] postgresql and encfs?
On Wed, Jun 22, 2011 at 3:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Nolan htf...@gmail.com writes: Has anyone successfully used encfs with postgresq recently? PANIC: could not open file pg_xlog/00010009000D (log file 9, segment 13): Invalid argument The database version here is 8.2.11, running on a Linux Fedora Core 14 server. Does it work any better if you set wal_sync_method = fdatasync? That's been our default on Linux since 8.2.19, because of certain filesystems failing like this. I'm kind of wondering why you're not using a more current minor release, anyway, if this is a new install. regards, tom lane Yes, that fixes the problem. I was using that minor release because that's what the production server is still running. For the purposes of this conference, I may use a 9.0.4 server, which I have running as I continue to work on upgrade issues. (I just want to be able to run some ad hoc queries on the database while at that meeting without having to deal with net access problems, so the issues keeping them from running a more current release shouldn't be a factor. With luck, I'll get the budget and approval to schedule the move to a more current release while there.) As always, you are a font of knowledge, Tom. -- Mike Nolan
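For reference, the one-line postgresql.conf change that resolved it:

```
wal_sync_method = fdatasync   # the default on Linux since 8.2.19
```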
Re: [GENERAL] Query to return every 1st Sat of a month between two dates
On Wed, May 11, 2011 at 10:22 AM, Alex - ainto...@hotmail.com wrote: Hi, is there an easy way to return the date of every first Saturday of a month in a date range, i.e. 2011-2013?

This is one way to do it; there are others:

select '2011-01-01'::date + s.a as dates
from generate_series(0,1095) as s(a)
where to_char('2011-01-01'::date+s.a,'dd') between '01' and '07'
and to_char('2011-01-01'::date+s.a,'dy') = 'sat'

-- Mike Nolan
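An alternative sketch that generates one row per month and computes the first Saturday directly, instead of scanning every day of the range (the arithmetic: Saturday has dow = 6, so add (6 - dow) mod 7 days to the first of the month):

```sql
SELECT (d::date + ((6 - extract(dow FROM d)::int) % 7)) AS first_saturday
  FROM generate_series('2011-01-01'::timestamp,
                       '2013-12-01'::timestamp,
                       interval '1 month') AS g(d);
```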