Re: [GENERAL] Recover from corrupted database due to failing disk
On 11/2/16 6:21 PM, Jim Nasby wrote:
> I wouldn't trust the existing cluster that far. Since it sounds like you have no better options, you could use zero_damaged_pages to allow a pg_dumpall to complete, but you're going to end up with missing data. So what I'd suggest would be:
>
> 1. stop Postgres
> 2. make a copy of the cluster
> 3. start with zero_damaged_pages
> 4. pg_dumpall
> 5. stop and remove the cluster (make sure you've got that backup)
> 6. create a new cluster and load the dump

Oh, and while you're at it, upgrade to a version that's supported. 8.1 has been out of support for 5+ years.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Recover from corrupted database due to failing disk
On 11/2/16 2:02 PM, Gionatan Danti wrote:
> However, backup continue to fail with "invalid page header in block" message. Morever, I am very near the xid wraparound limit and, as vacuum fails due to the invalid blocks, I expect a database shutdown (triggered by the 1M transaction protection) within some days.

That means at least some of the Postgres files have been damaged (possibly due to the failing disk). Postgres will complain when it sees internal data structures that don't make sense, but it has no way to know if any of the user data has been screwed up.

> From my understanding, both problem *should* be solved enabling "zero_damaged_pages" and executing a "vacuumdb -a". Is this expectation correct? Will a "reindexdb -a" necessary?

I wouldn't trust the existing cluster that far. Since it sounds like you have no better options, you could use zero_damaged_pages to allow a pg_dumpall to complete, but you're going to end up with missing data. So what I'd suggest would be:

1. stop Postgres
2. make a copy of the cluster
3. start with zero_damaged_pages
4. pg_dumpall
5. stop and remove the cluster (make sure you've got that backup)
6. create a new cluster and load the dump
Re: [GENERAL] Questions on Post Setup MASTER and STANDBY replication - Postgres9.1
On 11/2/16 2:49 PM, Joanna Xu wrote:
> The replication is verified and works. My questions are what's the reason causing "cp: cannot stat `/opt/postgres/9.1/archive/00010003': No such file or directory" on STANDBY and how to fix it?

What instructions/tools did you use to set up replication?

> Also, it seems the startup process stucks on "recovering 00010004", how to resolve it?

As far as I know that's normal while in streaming mode.
Re: [GENERAL] initdb createuser commands
On 10/31/16 9:50 AM, Christofer C. Bell wrote:
> He's getting a lot of pushback that really feels it's coming from the wrong direction. "Just learn it." "It's always been this way." "No one agrees with you." These arguments are unconvincing. That said, there's nothing wrong with just saying, "we're not going to change it because we don't want to."

The community often does a horrible job of viewing things through the eyes of a new user. This is why mysql became so popular for a while. Comments like "just learn it" are unproductive and push new users away. And we wonder why we're having trouble attracting new developers...

This has actually been discussed recently on -hackers as well[1], and there is some general consensus that simplification in this area would be a good idea.

1: https://www.postgresql.org/message-id/20160826202911.GA320593@alvherre.pgsql
Re: [GENERAL] Checking Postgres Streaming replication delay
On 10/31/16 3:39 PM, Patrick B wrote:
> (
>     extract(epoch FROM now()) -
>     extract(epoch FROM pg_last_xact_replay_timestamp())
> )::int lag

You could certainly simplify it though...

    extract(epoch FROM now() - pg_last_xact_replay_timestamp())
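The simplification works because subtracting two timestamps yields an interval, and extracting the epoch from that interval equals the difference of the two epochs. A quick sanity check in Python (standing in for the SQL, with made-up timestamps):

```python
from datetime import datetime, timezone

# Two sample timestamps standing in for now() and pg_last_xact_replay_timestamp()
now = datetime(2016, 10, 31, 15, 39, 30, tzinfo=timezone.utc)
last_replay = datetime(2016, 10, 31, 15, 39, 12, tzinfo=timezone.utc)

# Original form: difference of the two epochs
lag_original = now.timestamp() - last_replay.timestamp()

# Simplified form: epoch of the difference (the "interval", in SQL terms)
lag_simplified = (now - last_replay).total_seconds()

assert lag_original == lag_simplified
print(int(lag_simplified))  # 18 seconds of replay lag
```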
Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2
On 9/12/16 3:24 PM, Jeff Janes wrote:
> The man page you quote seems clear to me that setting it to 1, rather than leaving it at 0, makes the opportunity for corruption wider, not narrower.

Yeah, I actually read it backwards. :/

> I don't see how --modify-window is helpful at all here; you need to use --ignore-times. I thought that David's "-1" suggestion was tongue in cheek. But it turns out that that actually does work. Of course, it works by forcing every file to be copied, which removes the point of using this over pg_basebackup, but nonetheless it would preserve the integrity of the data.

AFAIK pg_basebackup blindly copies all data files, while rsync will transfer only the parts of the files that have actually changed (see --block-size). If the source and destination are on different servers, that can mean less data transferred over the network.
Re: [GENERAL] overwrite column data select - Postgres 9.2
On 9/21/16 1:50 PM, Steve Petrie, P.Eng. wrote:
> The reason I ask is -- the maximum length of a valid email address is actually 256 characters (or 254, according to comments in the PHP function is_valid_email_address(...) that I found on the Internet at http://code.iamcal.com/ and use myself). In my own PG DDL, I define email addresses like:
>
>     contact_email_addr varchar(256) NOT NULL,

FWIW, I stay away from varchar limits that are more than a "suggestion". Generally speaking it's just not worth limiting to something like 50, then the business decides they want 60, then 70, then... I still use varchar to ensure the database can't get DOS'd with garbage, but I'll just set something like varchar(100). That said, if there's a defined limit for email address length, might as well use it...

> Disclosure: I'm a PG newbie (and a relative SQL newbie, too), and not familiar with the DEFAULT ":: notation in your DDL.

The :: is a cast that was presumably added by Postgres when the default was assigned. It's equivalent to DEFAULT ''. I definitely don't like defaults like that... if you don't know what the email is then it should be NULL. Or to put it another way, having a default set largely defeats the purpose of NOT NULL (IMHO).
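For reference, the 254 figure comes from the SMTP path limit of 256 octets, which includes the surrounding angle brackets; the local part is separately capped at 64 octets. A minimal length check sketched in Python (the function name is mine, not from the PHP code mentioned above, and this checks length only, not address syntax):

```python
# Commonly cited SMTP-derived limits: 64 octets for the local part,
# 254 for the whole address (256-octet path minus the angle brackets).
MAX_LOCAL = 64
MAX_TOTAL = 254

def email_length_ok(addr: str) -> bool:
    """Return True if addr fits the usual length limits (syntax not checked)."""
    if len(addr) > MAX_TOTAL or "@" not in addr:
        return False
    local, _, _domain = addr.rpartition("@")
    return 0 < len(local) <= MAX_LOCAL

print(email_length_ok("someone@example.com"))        # True
print(email_length_ok(("a" * 65) + "@example.com"))  # False: local part too long
```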
Re: [GENERAL] Duplicate data despite unique constraint
On 9/2/16 8:02 AM, Adrian Klaver wrote:
> Best guess is the INDEX on the column is corrupted and needs to be reindexed:

You should contact AWS support about this; they'd want to know.
Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?
On 9/8/16 3:15 AM, Nicolas Grilly wrote:
> So my question is not to challenge the Postgres way. It's simply to ask whether there are any known figures that would directly support or refute his claims. Does Postgres really spend 96% of its time in thumb-twiddling once the entire database resides in memory? Alas, I've been unable to find any relevant benchmark. I'm not motivated enough to install a PostgreSQL and VoltDB and try it for myself :-)

My guess is this is a test scenario that completely favors VoltDB while hamstringing Postgres, such as using no transaction durability at all in VoltDB while using maximum durability in Postgres. Comparing the cost of every COMMIT doing an fsync vs not could certainly produce a 25x difference. There could be other cases where you'd get a 25x difference.

You need to be careful of benchmarks from commercial companies. MySQL used to tout how fast it was compared to Postgres, using a benchmark it created specifically for that purpose that had very little to do with the real world. People eventually discovered that as soon as you had a concurrent workload Postgres was actually faster.
Re: [GENERAL] Trigger is not working for Inserts from the application
On 9/10/16 6:03 AM, Kiran wrote:
> If I insert a record from my application using the following code:
>
>     db.myschema.cf_question.insert({
>         cf_question_type_id: request.payload.type_id,
>         cf_question_category_id: request.payload.cat_id,
>         lang: request.payload.lang,
>         body: request.payload.body
>     })
>
> The above app code inserts the record in the DB, but the respective trigger in the database is not triggered, hence the "weighted_tsv" column is empty for this record. But if I insert another record from psql, it will insert and the respective trigger works perfectly. What could be the problem? Why is the trigger not working if I insert from the application? Am I doing anything wrong? Any help would be really really appreciated.

Without knowing what that app code is doing it's impossible to know. Try turning on logging of all statements (log_statement = 'all') and see what queries the app is actually running.
Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2
On 9/8/16 3:29 PM, David Gibbons wrote:
>> Isn't this heading in the wrong direction? We need to be more precise than 0 (since 0 is computed off of rounded/truncated time stamps), not less precise than 0.
>>
>> Cheers,
>> Jeff
>
> Hmm, you may be right; reading it 4 more times for comprehension, it looks like it should be set to -1 not 1.

Not according to my man page:

    --modify-window
        When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more than the modify-window value. This is normally 0 (for an exact match), but you may find it useful to set this to a larger value in some situations. In particular, when transferring to or from an MS Windows FAT filesystem (which represents times with a 2-second resolution), --modify-window=1 is useful (allowing times to differ by up to 1 second).
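The comparison the man page describes can be sketched as follows. This is my own illustrative model of the documented behavior, not rsync's actual code:

```python
def timestamps_equal(mtime_a: int, mtime_b: int, modify_window: int = 0) -> bool:
    """Treat two mtimes as equal if they differ by no more than modify_window seconds."""
    return abs(mtime_a - mtime_b) <= modify_window

# With the default window of 0, only exact matches count as "unchanged".
print(timestamps_equal(1473360000, 1473360000))     # True
print(timestamps_equal(1473360000, 1473360001))     # False

# --modify-window=1 widens the match, as suggested for FAT's 2-second resolution.
print(timestamps_equal(1473360000, 1473360001, 1))  # True
```

This also shows why widening the window can only make rsync *more* willing to skip a file, which is exactly the direction Jeff is warning about.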
Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2
Please include the mailing list in replies...

On 9/7/16 6:10 PM, David Gibbons wrote:
>> That is NOT safe. The problem is it allows rsync to use mtime alone to decide that a file is in sync, and that will fail if Postgres writes to a file in the same second that the first rsync reads from it (assuming Postgres writes after rsync reads). You need to add the --checksum flag to rsync (which means it will still have to read everything that's in /var/lib/pgsql).
>
> The checksum flag as you mention is not performant,

Definitely not. :/

> If this is a concern, you're much better off using the --modify-window flag:
>
>     When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more than the modify-window value. This is normally 0 (for an exact match), but you may find it useful to set this to a larger value in some situations.
>
> Hence, rsync -va --modify-window=1 would remove your concern about a same-second race condition without forcing the sync to read through all the files.

Very interesting and useful!
Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?
On 9/2/16 7:39 PM, dandl wrote:
> I don't think this is quite true. The mechanism he proposes has a small window in which committed transactions can be lost, and this should be addressed by replication or by a small amount of UPC (a few seconds).

Except that's the entire point where all those kinds of solutions *completely* part ways with Postgres. Postgres is designed to *lose absolutely no data after a COMMIT*, potentially including requiring that data to be synchronized out to a second server. That is worlds apart from "we might lose a few seconds", and there's a lot of stuff Postgres has to worry about to accomplish that. Some of that stuff can be short-circuited if you don't care (that's what SET synchronous_commit = off does), but there's always going to be some amount of extra work to support synchronous_commit = local or remote_*.

Presumably there are more improvements that could be made to Postgres in this area, but if you really don't care about losing seconds' worth of data and you need absolutely the best performance possible, then maybe Postgres isn't the right choice for you.

"All databases suck, each one just sucks in a different way." - Me, circa 1999.
Re: [GENERAL] IDE for function/stored proc development.
On 9/3/16 7:49 AM, Tim Uckun wrote:
> I was hoping there was some IDE which made that process seamless. Something like PgAdmin but with better editing features and features like "find definition" or "find usages" and such. The jetbrains products come close but as I said they are buggy and don't work very well with postgres.

Keep in mind that workflow doesn't work well if you need to deploy to production on a regular basis. The workflow I generally use is sqitch[1] and a thin wrapper that runs my unit tests (you do write unit tests for your functions, right? :)). Something like:

    revert_to=`sqitch tag|tail -n2|head -n1` # Get second to last deployed tag
    sqitch rebase -y --onto $revert_to $DB && db/run_test $DB

Normally you won't be re-deploying that much, so that would be pretty fast. Note that you'll want to create a separate sqitch migration for each object.

[1] http://sqitch.org/
Re: [GENERAL] How to assemble all fields of (any) view into a string?
On 9/7/16 6:07 PM, Ken Tanzer wrote:
> ERROR: PL/Python functions cannot accept type record

Ugh, yeah... that won't work. plperl might be able to do it, but I suspect you're going to be stuck pulling the size info out of info_schema or the catalog.

Actually, there is a way you could hack this via plpython: pass the row in as text as well as the relation (regclass is good for that). You could then do plpy.execute('SELECT ({0}::{1}).*'.format(row_text, relation)); that should give you a dict just like Adrian's example did.

It would be nice if there was a function that accepted something with a row descriptor and spit out the details of the descriptor. http://pgxn.org/dist/colnames/doc/colnames.html comes close; if you know much about C at all it shouldn't be hard to add a function to that extension that returned the full details of the row. That and converting the row to JSON would make it relatively easy to accomplish what you want in a plpgsql (or maybe even plsql) function.
Re: [GENERAL] Clustered index to preserve data locality in a multitenant application?
On 9/6/16 11:21 AM, Nicolas Grilly wrote:
> It looks like I can move forward with our migration from MySQL to PostgreSQL, without worrying about the lack of clustered indexes, because there are better solutions to keep tenant data contiguous!

First rule of performance tuning: don't. :) There are lots of areas where Postgres can be expected to perform better than MySQL, so without testing your app you really don't know how it's going to fare.

There's also another option: use a logical replication system (such as pg_logical, BDR, londiste or Slony) to maintain at least one replica. You can take that replica down to perform maintenance (such as a database-wide CLUSTER) as needed, and let replication catch up once you bring it back online. That, combined with scripted failover, makes a lot of database maintenance items far easier, at the cost of having to maintain the replication.

Depending on your needs, a major benefit to this method is it makes major version upgrades very simple: you just stand up a new replica on the new version and then fail over to it. If anything goes wrong, you can fail back to the old version without losing any data.
Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2
On 9/2/16 11:44 AM, David Gibbons wrote:
> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
> service postgres stop
> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>
> The second rsync will only copy the deltas from the first; it still has to go in and determine what needs to be copied/what changed, but the bulk of it can be prepared/migrated before the actual downtime window.

That is NOT safe. The problem is it allows rsync to use mtime alone to decide that a file is in sync, and that will fail if Postgres writes to a file in the same second that the first rsync reads from it (assuming Postgres writes after rsync reads). You need to add the --checksum flag to rsync (which means it will still have to read everything that's in /var/lib/pgsql).
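The race can be illustrated with a toy model of rsync's default "quick check" (same size and same mtime means skip), assuming whole-second timestamps. This is my own sketch of the failure mode, not rsync's code:

```python
from typing import NamedTuple

class FileStat(NamedTuple):
    size: int
    mtime: int  # whole seconds, as on filesystems with 1-second resolution

def quick_check_says_unchanged(src: FileStat, dest: FileStat) -> bool:
    """Toy version of rsync's default skip heuristic: same size and mtime."""
    return src.size == dest.size and src.mtime == dest.mtime

# The first rsync copies a file at t=100; the copy matches what it read.
dest = FileStat(size=8192, mtime=100)

# Postgres then overwrites part of the file within the same second:
# same size, same whole-second mtime, different contents.
src_after_write = FileStat(size=8192, mtime=100)

# The second rsync's quick check wrongly concludes nothing changed,
# so the modified page never reaches the new copy.
print(quick_check_says_unchanged(src_after_write, dest))  # True -> silently skipped
```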
Re: [GENERAL] Privileges on public schema can't be revoked?
On 9/6/16 3:16 PM, Greg Fodor wrote:
> It seems that functionality that lets a superuser quickly audit the privileges for a user (including those granted via PUBLIC) would be really helpful for diagnosing cases where that user can do something they shouldn't be allowed to.

That's actually relatively easy to do today; see the has_*_privilege() functions. You might also find http://pgxn.org/dist/pg_acl useful.
Re: [GENERAL] Materialized view auto refresh
On 9/6/16 9:47 PM, Nguyễn Trần Quốc Vinh wrote:
> I'm sorry. Please check it again at http://it.ued.udn.vn/myprojects/pgTriggerGen/: http://it.ued.udn.vn/myprojects/pgTriggerGen/PgSQLMVTriggersGenerator_v2_src.rar

FWIW, people would be much more likely to find this if you put it on GitHub (and it would be far easier for them to submit improvements).
Re: [GENERAL] Postgres UPGRADE from 9.2 to 9.4
On 9/6/16 10:53 PM, John R Pierce wrote:
> 1) setup new streaming slave at new provider using 9.2, wait for replication to complete and satisfy yourself that the database is complete and intact.
> 2) shut down master at old provider (and any other slaves), promote new provider 9.2 to master.
> 3) pg_upgrade using --link option to 9.4 on new provider. if this fails, restart old master and start over at step 1.
> 4) bring up applications on new database. if they fail, restart old master, and start over at step 1.
> 5) bring up new slave(s) on new database.
>
> Or if you want, upgrade on your existing provider first, then setup streaming replication. But no matter what, pg_upgrade will require some form of downtime.

You could also use either Slony or londiste to directly migrate from Rackspace to an EC2 instance on 9.5; no need for DMS at all. That has the added benefit of allowing you to switch to a new version with virtually zero downtime (as in seconds if you script it) and allowing you to maintain the old version (along with any new data changes) in case you need to fail back (without data loss).

Keep in mind that in this case you're really only using DMS as a form of replication, so you might be better off just sticking with Postgres tools.
Re: [GENERAL] How to assemble all fields of (any) view into a string?
On 9/7/16 5:32 PM, Ken Tanzer wrote:
> SELECT my_cat(ebh_gain,'ebh_gain') FROM ebh_gain;
>
> I know TCL and probably Python and others can work with a record as a trigger function. But TCL doesn't seem to accept a record as an argument. Can any of the other languages accomplish this function? Or some other way? Thanks.

A PL that can accept composite types (such as plpythonu) should be able to do this.
Re: [GENERAL] PostgreSQL Database performance
On 9/6/16 2:08 PM, Scott Marlowe wrote:
> checkpoint_completion_target = 0.9
>
> Too high of a checkpoint completion target may cause buffers to get written out more often than needed, but it varies based on load etc.

The odds on that don't seem to be terribly high. Even if that is a common occurrence, if it's enough to make a difference then you're already close to the limits of your IO, and if that's true then you definitely want to spread the checkpoint out over a longer interval.
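For context on "spreading the checkpoint out": checkpoint_completion_target tells Postgres what fraction of the checkpoint interval to spend writing out dirty buffers, so the write window is roughly timeout times target. A quick back-of-the-envelope (using the stock 5-minute checkpoint_timeout):

```python
def checkpoint_write_window(checkpoint_timeout_s: float, completion_target: float) -> float:
    """Approximate seconds over which checkpoint writes are spread."""
    return checkpoint_timeout_s * completion_target

# Default checkpoint_timeout is 300s (5 minutes).
print(checkpoint_write_window(300, 0.5))  # 150.0s with the old 0.5 default
print(checkpoint_write_window(300, 0.9))  # 270.0s -- gentler, more spread-out I/O
```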
Re: [GENERAL] a column definition list is required for functions returning "record"
On 9/7/16 2:02 PM, Merlin Moncure wrote:
>>> Exactly. You can pass anonymous record types around today, as long as you don't do anything that requires knowing what their contents are, either in the function or in the calling query:
>>
>> What I was thinking of is something (like a function) that has explicitly defined what the contents of the record are.
>
> We have that already, it's named 'json_each_text'

Apparently you haven't looked at json parse/deparse costs ;P
Re: [GENERAL] a column definition list is required for functions returning "record"
On 8/29/16 6:28 AM, Tom Lane wrote:
> Pavel Stehule <pavel.steh...@gmail.com> writes:
>> 2016-08-29 1:59 GMT+02:00 Jim Nasby <jim.na...@bluetreble.com>:
>>> It would be nice if there was a way to pass dynamically formed records around, similar to how you can pass the results of row() around. Someone else has actually been asking about this at https://github.com/decibel/pg_lambda/issues/1.
>> Probably there is a space to be PLpgSQL more flexible - but there are limits - PLpgSQL is black box for SQL engine, and when output is any record type, then SQL engine knows zero about returning data structure in preprocessing time.
> Exactly. You can pass anonymous record types around today, as long as you don't do anything that requires knowing what their contents are, either in the function or in the calling query:

What I was thinking of is something (like a function) that has explicitly defined what the contents of the record are.
Re: [GENERAL] LOG: could not fork new process for connection: Cannot allocate memory
On 8/25/16 7:45 PM, Ahsan Ali wrote:

Please don't top-post; it's harder to read.

> On Thu, Aug 25, 2016 at 5:29 PM, John R Pierce <pie...@hogranch.com> wrote:
>> so there were 1818 postgres client processes at the time it coudln't create a new process. thats certainly a larger number than I've ever run. if I have client software that has lots and lots of idle connections, I use a connection pooler like pgbouncer, in transaction mode.

While not the most ideal, people pay way too much attention to large connection counts. It's not *that* big a deal.

> we have a pooling on the application level. however we never had this issues before this start happning since last couple of days in past we had over 2300 sessions but no issues.

Well, if I'm reading your original post correctly, this is on a server that only has 252MB of memory, which is *very* small. Even so, according to `free` there's 175MB cached, which should become available as necessary.

While the shared memory settings are an interesting theory, there's nothing in 9.3 that would attempt to allocate more shared memory after the database is started, so that can't be it.

The only thing I can think of is that someone enabled user quotas on the system... though if that was the case I would expect it to apply to all the existing backends as well (though maybe there's some mode where that doesn't happen...).

It might also be possible that Postgres is reporting the wrong error... ISTR one or two cases in startup code where failure to allocate something other than memory (like a socket) could result in a false memory error in some pathological cases. If you've got debug symbols you could try attaching to the postmaster and setting a breakpoint at ereport and then trying to connect. You could then get a backtrace; just don't leave the system in that state for long. (There might be a more elegant way to do that...)
Re: [GENERAL] a column definition list is required for functions returning "record"
On 8/26/16 10:32 AM, Alexander Farber wrote:
> Thank you, I was just wondering if there is a simpler way... but ok

It would be nice if there was a way to pass dynamically formed records around, similar to how you can pass the results of row() around. Someone else has actually been asking about this at https://github.com/decibel/pg_lambda/issues/1.

BTW, there's no advantage I can think of to using plpgsql just to return the output of a query. You'd be better off using a SQL function instead. Actually, I guess the plan for the plpgsql version would be cached; I'm not sure if the same is true for SQL functions. But you'd probably need a more complex query for that to be a win over the lighter-weight nature of SQL functions.
Re: [GENERAL] Any reasons for 'DO' statement not returning result?
On 8/16/16 1:05 PM, Adrian Klaver wrote: On 08/16/2016 07:54 AM, Jim Nasby wrote: On 8/14/16 5:13 AM, Xtra Coder wrote: - ability to switch session language from 'sql' to 'pl/pgsql' Actually, something I wish I had was the ability to temporarily switch to an entirely different interpreter (such as ipython), while still retaining the current database connection and context. That would be especially useful for debugging plpython functions. What aspects of ipython are you interested in? First, it's not ipython in particular, just a python environment. What would be *really* cool is finding a good way to integrate with Jupyter Notebook... Mostly in being able to deal with plpython code. Right now that's difficult because the plpy object is only exposed inside a plpython function, and it's a bit painful to get pythonic representations of data out of the plpython environment and into a python environment. Outside of that though, there are still useful things that can be done. I think a big part of why people keep asking to add things like IF/THEN to psql is because psql is an incredibly useful tool for handling lots of SQL statements (either DDL or DML). It's much better at that than any other tool I've seen. BUT, there's a limit to what psql or SQL can do. String manipulation (for example) pretty much sucks. python (or perl or ...) is quite good at that stuff though. It's certainly not terribly hard to run a query or two from python. But a dozen? That becomes very tedious very quickly. And if you're trying to do this interactively, it just sucks. (No \d et al., no tab completion, no \h, etc.) So what I ultimately wish for is a way to blend these things together as needed. What would be truly amazing is if we had a way to pass a database connection around to different tools. Do a bunch of SQL stuff within psql... now I need to deal with a plpython function; do that in Jupyter... now I'm back to SQL, go back to psql. 
I wonder how hard it would be to allow psql to expose a local port that other stuff could connect to...
Re: [GENERAL] C++ port of Postgres
On 8/16/16 2:52 AM, Gavin Flower wrote: In both cases, part of the motivation to change from C was to appeal to new developers - from what I remember of the discussions. I have moved this discussion over to -hackers. (https://www.postgresql.org/message-id/f7682c24-4271-1ff5-d963-053ecb0fc...@bluetreble.com)
Re: [GENERAL] Any reasons for 'DO' statement not returning result?
On 8/16/16 11:17 AM, Chris Travers wrote: I am thinking adding a temporary keyword to functions would make a lot more sense. Well, right now that's just syntactic sugar, so I think the only real benefit might be visibility (though, really we should be marketing the idea that you can create almost *any* object in pg_temp!). What would be a lot more interesting is if creating a temp function didn't involve writing an entry to the catalog (something being discussed for temp tables right now).
Re: [GENERAL] Jsonb extraction very slow
On 8/16/16 10:19 AM, Tom Lane wrote: [ thinks for a bit... ] In principle we could have the planner notice whether there are multiple references to the same Var of a varlena type, and then cue the executor to do a pre-emptive detoasting of that field of the input tuple slot. But it would be hard to avoid introducing some regressions along with the benefits, I'm afraid. I suspect that the ExpandedObject stuff makes this even more appealing... it would certainly be nice if we only needed to pay the expansion cost once (assuming no one dirtied the expanded object). I certainly think there's more need for this kind of thing as the use of JSON expands. Perhaps that's part of what Robert was suggesting recently with moving datums around the executor instead of tuples. > Some of these would have been nested ->/->>. In a chain of functions only the first one would be paying the overhead we're talking about here; though I'm not sure how efficient the case is overall in JSONB. I've since heard that chaining -> is a really bad idea compared to #>, which is unfortunate because -> is the normal idiom in other languages (and what I suspect everyone will use by default). I've wondered if an expanded object version of json might expand only top-level keys (and maybe only as needed), and then -> is actually just a pointer to the originally expanded data. A chained -> then wouldn't need to duplicate everything... and in fact might be able to do its expansion in the original object so that subsequent references to that key wouldn't need to re-expand it. I don't think the current EO framework supports that, but it doesn't seem impossible to add...
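The chaining cost discussed above is easy to experiment with. This is a hedged sketch (the table and key names are invented for illustration): each chained -> operator produces an intermediate jsonb value, while the #>/#>> path operators walk the whole path in a single call.

```sql
-- Hypothetical table for comparing the two idioms.
CREATE TABLE docs (id serial PRIMARY KEY, body jsonb);

-- Chained operators: every -> step returns an intermediate jsonb value.
SELECT body -> 'a' -> 'b' ->> 'c' FROM docs;

-- Path extraction: one operator call for the whole path.
SELECT body #>> '{a,b,c}' FROM docs;
```

With EXPLAIN ANALYZE on a table of large, TOASTed documents the difference should be visible; exact numbers will depend on document size and compression.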
Re: [GENERAL] Jsonb extraction very slow
On 8/11/16 8:45 AM, Tom Lane wrote: Jim Nasby <jim.na...@bluetreble.com> writes: I never dug into why. As Tom posited, decompression might explain the time to get a single key out. Getting 10 keys instead of just 1 wasn't 10x more expensive, but it was significantly more expensive than just getting a single key. What were you doing to "get ten keys out"? If those were ten separate JSON operators, they'd likely have done ten separate decompressions. You'd have saved something by having the TOAST data already fetched into shared buffers, but it'd still hardly be free. Multiple -> or ->> operators, but all operating on the same field (which I thought would mean a single datum that would end up detoasted?). Some of these would have been nested ->/->>. In essence, this was a set of nested views that ultimately pulled from a single JSONB field.
Re: [GENERAL] Any reasons for 'DO' statement not returning result?
On 8/14/16 5:13 AM, Xtra Coder wrote: - ability to switch session language from 'sql' to 'pl/pgsql' Actually, something I wish I had was the ability to temporarily switch to an entirely different interpreter (such as ipython), while still retaining current database connection and context. That would be especially useful for debugging plpython functions.
Re: [GENERAL] Jsonb extraction very slow
Please CC the list. On 8/11/16 2:19 AM, hari.prasath wrote: Actually I've done some testing with this and there is a *significant* overhead in getting multiple keys from a large document. There's a significant extra cost for the first key, but there's also a non-trivial cost for every key after that. Why does it take some extra cost for the first key and less for keys after that? Is there any specific reason for this? If so, please explain. I never dug into why. As Tom posited, decompression might explain the time to get a single key out. Getting 10 keys instead of just 1 wasn't 10x more expensive, but it was significantly more expensive than just getting a single key.
Re: [GENERAL] Jsonb extraction very slow
On 8/9/16 9:29 AM, Tom Lane wrote: "hari.prasath" <hari.pras...@zohocorp.com> writes: I am using jsonb for storing key-value pair information (500 keys) and it is a very big data set with some 10M rows. Whenever I try to extract some keys (let's say some 10 keys and their values) it's really very slow. Is this due to jsonb parsing, or will the json be loaded from disk to memory each time for the 10 keys (mainly, if my keys are at the end of the 500, is that why this is very slow)? It's probably mostly the cost to fetch and decompress the very wide json field. jsonb is pretty quick at finding an object key once it's got the value available to look at. You could possibly alleviate some of the speed issue by storing the column uncompressed (see ALTER TABLE ... SET STORAGE EXTERNAL), but that would bloat your disk space requirements so I'm not really sure it'd be a win. Actually I've done some testing with this and there is a *significant* overhead in getting multiple keys from a large document. There's a significant extra cost for the first key, but there's also a non-trivial cost for every key after that. I suspect the issue is the goofy logic used to store key name offsets (to improve compression), but I never got around to actually tracing it. I suspect there's a win to be had by having both json types use the ExpandedObject stuff.
Re: [GENERAL] Any reasons for 'DO' statement not returning result?
On 8/8/16 7:25 PM, Xtra Coder wrote: With some former experience with MSSQL Server, where a 'complex' script is executed easily and straightforwardly without any 'wrapping', like this dummy one ... DECLARE @a int; DECLARE @b int; ... select @a + @b as "a+b" ... every time I need to execute some one-time-throw-away complex code in PostgreSQL which returns a rowset I'm disappointed - it has to be wrapped into a normal 'temp' function which I have to delete all the time in the current session, thus making an anonymous 'DO' statement useless in 95% of my use-cases. So ... does anyone know good reasons for such an inconvenient design of the 'DO' statement? I don't recall why DO was designed that way, but I created http://pgxn.org/dist/pg_lambda/ to do what you're looking for. Unfortunately it's not quite as convenient as DO, and you also must ALWAYS provide at least one correctly typed input (even if it's NULL) so the pseudotype will work.
[GENERAL] plpython.h not installed in 9.4
9.5+ installs plpython.h under include/server. 9.4 apparently doesn't. I'm guessing that changed in 9.5? Or am I doing something wrong?
Re: [GENERAL] Array value from table as parameter
On 7/22/16 7:31 AM, Charles Clavadetscher wrote: I need to pass an array as a parameter in a function, but it is part of a trigger that gets a "new" value. > > I've tested it like > > select function_x(1,55,array[['x'],['y']]) > or > select function_x(1,55,array[['x','y']]) > > and it worked. > > But if I use > > select function_x(1,55,new.situations) > > it doesn't work, as the value comes like this: {"x","y"} > I couldn't find a function that converts {} values back to [] without treating it as a string and using replace, which I think is not the ideal solution because it may not satisfy more complex arrays. Would it help to simply cast the argument to TEXT[]? That would simply be masking the problem. '{"x","y"}' is a completely valid representation of an array, but not one you should be getting out of code (because there's no reason for array_out to toss the extra double quotes in there). It also doesn't match either of your other examples. Please create a stand-alone scenario that demonstrates the problem you're seeing. I suspect that in the process of doing that you're going to uncover a bug in your code, but if not then we'll have something concrete we can look at.
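A stand-alone scenario of the kind requested might look like the following hedged sketch (all names are hypothetical; function_x is replaced by RAISE NOTICE so the example is self-contained). It demonstrates that NEW.situations arrives in the trigger as a real text[], not a string:

```sql
-- Minimal stand-alone scenario (names are illustrative only).
CREATE TABLE t (id int, situations text[]);

CREATE OR REPLACE FUNCTION trig() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  -- function_x from the thread would be called here; RAISE NOTICE
  -- stands in so the example needs nothing else installed.
  RAISE NOTICE 'situations = %, first element = %',
    NEW.situations, NEW.situations[1];
  RETURN NEW;
END;
$$;

CREATE TRIGGER t_trig BEFORE INSERT ON t
  FOR EACH ROW EXECUTE PROCEDURE trig();

INSERT INTO t VALUES (1, array['x','y']);
```

If the array misbehaves here too, that narrows the problem; if it works, the bug is likely in the original trigger code.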
Re: [GENERAL] unique constraint with several null values
On 7/20/16 1:14 PM, Mark Lybarger wrote: This leads me to think I need to create 2^5 or 32 unique constraints to handle the various combinations of data that I can store. Another option would be to create a unique index on a bit varying field that sets a bit to true for each field that is NULL, WHERE field != 0. Let me know if you want to go that route; I could probably add that to http://pgxn.org/dist/count_nulls/ without much difficulty. Though, probably a better way to accomplish that would be to add a function to count_nulls that spits out an array of the fields that are NULL; you could then do a unique index on that WHERE array != array[]. Maybe a less obtuse option would be to use a boolean array. Storage would be ~8x larger, but since there should be very few rows I doubt that matters.
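One way to sketch the single-index alternative to 32 separate constraints (column names are hypothetical, and this is not the count_nulls extension itself): index the coalesced values together with each column's NULLness, so rows that differ only in which columns are NULL still compare as distinct.

```sql
-- Five nullable columns whose combinations must be unique, including
-- which of them are NULL.
CREATE TABLE combos (a int, b int, c int, d int, e int);

-- coalesce() makes NULLs indexable; the (col IS NULL) expressions
-- distinguish a real 0 from a NULL coalesced to 0.
CREATE UNIQUE INDEX combos_uniq ON combos (
  coalesce(a, 0), coalesce(b, 0), coalesce(c, 0),
  coalesce(d, 0), coalesce(e, 0),
  (a IS NULL), (b IS NULL), (c IS NULL), (d IS NULL), (e IS NULL)
);
```

A second insert of any value pattern, NULLs included, should then fail with a unique violation, which is the behavior the 32-constraint approach was trying to get.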
Re: [GENERAL] Constraint using a SQL function executed during SELECT
On 7/19/16 7:43 AM, Cyril B. wrote: Hello, Is there a way to execute a SELECT on a table having a constraint that uses a non-working SQL function? ... ALTER TABLE ONLY t1 ADD CONSTRAINT c EXCLUDE (id WITH =) WHERE ((f(id) IS NOT TRUE)); ... ERROR: relation "rename_me.t2" does not exist ... CONTEXT: SQL function "f" during inlining In this example, you should be able to avoid that by setting constraint_exclusion=off.
Re: [GENERAL] Stored procedure version control
On 6/30/16 9:16 AM, Merlin Moncure wrote: It's not really necessary to create version down scripts. In five years of managing complex database environments we've never had to roll a version back and likely never will; in the event of a disaster it's probably better to restore from backup anyway. I'm surprised no one has mentioned http://sqitch.org. It makes it very easy to manage migrations, as well as creating downgrade scripts (if you use rework, and put each object into its own file). I do agree that down scripts are pretty over-rated as long as you have good test practices (as in, database unit tests). In 9 years in an environment where downtime was 6 figures per hour I only had 1 or 2 deployments that had problems, and never bad enough to consider reverting.
Re: [GENERAL] How safe is pg_basebackup + continuous archiving?
On 6/30/16 8:30 AM, Stephen Frost wrote: > How can we check for backup corruption in this case? Thank you very much. There has been some discussion about a specific tool for checking the checksums throughout the entire system. I don't know of anyone actively working on that, unfortunately. If someone did want that though, it could probably be done as an extension. I believe you just have to pull all of each relation into shared buffers for the checksums to be verified.
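One hedged way to sketch that "pull every page through shared buffers" idea is the pg_prewarm contrib module, which reads each page of a relation; on a cluster initialized with data checksums, a damaged page should then show up as an error. The query itself is just a sketch:

```sql
-- Requires the pg_prewarm contrib module.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Read every page of every ordinary table and index in the current
-- database into shared buffers; reading a page verifies its checksum.
SELECT c.oid::regclass AS relation, pg_prewarm(c.oid, 'buffer') AS pages_read
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'i')
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');
```

This assumes checksums were enabled at initdb time; without them, reading the pages proves nothing about corruption.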
Re: [GENERAL] [HACKERS] sign function with INTERVAL?
On 4/13/16 1:36 PM, Daniel Lenski wrote: Hi all, Is there a good reason why the SIGN() function does not work with the INTERVAL type? (It is only defined for numeric types.) (http://www.postgresql.org/docs/9.5/static/functions-math.html) The only thing that comes to mind is that you can get some strange circumstances with intervals, like '-1 mons +4 days'. I don't think that precludes sign() though. What I have come up with is this rather inelegant and error-prone case statement: How is it error prone? case when x is null then null when x > interval '0' then +1 when x < interval '0' then -1 else 0 end You don't need to handle null explicitly. You could do SELECT CASE WHEN x > interval '0' THEN 1 WHEN x < interval '0' THEN -1 WHEN x = interval '0' THEN 0 END Or, you could do... CREATE FUNCTION sign(interval) RETURNS int LANGUAGE sql STRICT IMMUTABLE AS $$ SELECT CASE WHEN $1 > interval '0' THEN 1 WHEN $1 < interval '0' THEN -1 ELSE 0 END $$; That works because a STRICT function won't even be called if any of its inputs are NULL. Is there a more obvious way to do sign(interval)? Would it be technically difficult to make it "just work"? Actually, after looking at the code for interval_lt, all that needs to happen to add this support is to expose interval_cmp_internal() as a strict function. It already does exactly what you want.
[GENERAL] Odd quoting behavior from \set
Is this odd quoting behavior expected? (Note difference between "'a':1" and "'b':'a'") ~@decina.local/53896# \set df pd.DataFrame.from_dict([{'a':1,'b':'a'},{'a':2,'b':'b'}]) ~@decina.local/53896# \echo :df pd.DataFrame.from_dict([{a:1,b:'a'},{a:2,b:'b'}]) ~@decina.local/53896#
Re: [GENERAL] Execute commands in single-user mode
On 1/10/16 3:44 PM, Andreas Joseph Krogh wrote: It might be about time to come up with an extension that's a replacement for large objects. What would it take to fund such an extension? Time and/or money. It would "have to" support: - Migrate existing LOs away from pg_largeobject - Proper driver support (JDBC) It would probably be possible to extend JDBC (it looks like that's what the current support does). - Possible to exclude from pg_dump That'd be up to the extension dump control semantics. - Support pg_upgrade > And -hackers should agree on the goal of ultimately being merged into core and replacing pg_largeobject. Well, there's a reason I suggested an extension. I think it's very unlikely -hackers would want to add another LO format to the database. Nowadays, it's generally preferred to do most things as extensions, and only incorporate things in the backend that really can't be done with an extension. If this theoretical new replacement for LOs took the world by storm and everyone was using it, maybe it'd be a different thing. The xml and JSON types are examples of that; they started life as add-ons and were eventually pulled in because they became extremely popular.
Re: [GENERAL] Unable to build python extension with PGXS
On 1/13/16 3:11 PM, Jim Nasby wrote: On 1/12/16 10:04 PM, Jim Nasby wrote: Attempting to build a python extension, I'm getting: Undefined symbols for architecture x86_64: "_PyErr_Clear", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o In case anyone runs into this in the future, my eventual solution was https://github.com/decibel/PandaPost/blob/master/Makefile
Re: [GENERAL] Moving a large DB (> 500GB) to another DB with different locale
On 1/13/16 2:03 AM, Andreas Joseph Krogh wrote: Seems like pglogical might be better? http://2ndquadrant.com/en/resources/pglogical/ It's available for 9.4 also. It would certainly be faster. It's also less tested than Slony is though, as it's fairly new. If it were me, I'd use pglogical.
Re: [GENERAL] Changing varchar length by manipulating pg_attribute
On 1/13/16 5:59 AM, Christian Ramseyer wrote: UPDATE pg_attribute SET atttypmod = 35+4 -- new desired length + 4 WHERE attrelid = 'TABLE1'::regclass AND attname = 'COL1'; I don't know of any reason that wouldn't work. Note that you might have to make the same change to all the views too. Is this safe to do in Postgres 9.4? Also, best practice seems to be to use text nowadays; is there even a variant of this that lets me convert a column from "character varying(256)" to "text" without having to recreate all the nested views? FWIW, I prefer using varchar with a fairly large limit unless the field really does need to be unlimited. That protects against bad code or a malicious user filling your database with garbage. You could probably change pg_attribute.atttypid to 'text'::regtype. You should change atttypmod to -1 at the same time if you do that. Obviously you should test all of this thoroughly before doing it in production.
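For comparison, here is a hedged sketch of both routes (table and column names are placeholders). On reasonably recent versions, widening a varchar is a metadata-only ALTER, though dependent views can still force you toward the catalog hack the thread describes:

```sql
-- Supported route: since 9.2, widening a varchar is a catalog-only
-- change (no table rewrite), but it fails if views depend on the column.
ALTER TABLE table1 ALTER COLUMN col1 TYPE varchar(35);

-- Catalog-hack route from the thread, wrapped in a transaction so it
-- can be inspected (or rolled back) before committing. Test on a
-- throwaway copy of the database first.
BEGIN;
UPDATE pg_attribute
   SET atttypmod = 35 + 4   -- new desired length + 4
 WHERE attrelid = 'table1'::regclass
   AND attname  = 'col1';
COMMIT;
```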
Re: [GENERAL] Unable to build python extension with PGXS
On 1/12/16 10:04 PM, Jim Nasby wrote: Attempting to build a python extension, I'm getting: Undefined symbols for architecture x86_64: "_PyErr_Clear", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyImport_ImportModule", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyObject_CallFunction", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyObject_GetAttrString", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyString_AsString", referenced from: _PLyObject_To_ndarray in pg_ndarray.o "_PyString_FromStringAndSize", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o "_PyString_Size", referenced from: _PLyObject_To_ndarray in pg_ndarray.o ld: symbol(s) not found for architecture x86_64 I've included $(python_includespec) in my Makefile: override CPPFLAGS := $(python_includespec) $(CPPFLAGS) Is there some other magic I need? Do I need to switch to using MODULE_big or something? After some Screwing Around(TM), I figured out that the magic trick is to add the output of python-config --ldflags to LDFLAGS. I see we have a $(perl_embed_ldflags), should there be something similar for python?
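Putting the thread's conclusion together, a PGXS Makefile along these lines appears to work. This is a sketch: the module and object names are placeholders, and it assumes the python-config on PATH matches the Python your plpython was built against:

```makefile
# Hypothetical extension name and objects.
MODULE_big = pg_ndarray
OBJS = pg_ndarray.o

# Link the shared module against libpython; PGXS does not add this itself.
SHLIB_LINK += $(shell python-config --ldflags)

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

# python_includespec is only defined once Makefile.global (pulled in via
# PGXS) has been included, so the CPPFLAGS override comes after it.
override CPPFLAGS := $(python_includespec) $(CPPFLAGS)
```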
Re: [GENERAL] Moving a large DB (> 500GB) to another DB with different locale
On 1/13/16 2:39 PM, Andreas Joseph Krogh wrote: Where can I find more info about how to use and configure pg_logical to replicate a 9.4 DB to 9.5? http://2ndquadrant.com/en/resources/pglogical/
Re: [GENERAL] Call postgres PL/Python stored function from another PL/Python block.
On 1/13/16 9:47 AM, lodopidolo wrote: Hello. Is it possible to call a PL/Python stored function natively from another PL/Python function? Stackoverflow is stupid and won't let me post there, but here's what you want: There is no special capability to call other plpython functions. You need to call them as you would any other Postgres function, ie: do $$ ... rv = plpy.execute("SELECT f1()") t = rv[0]["f1"] ... $$ language plpython3u; See http://www.postgresql.org/docs/9.5/static/plpython-database.html#AEN65599 for more information.
[GENERAL] Unable to build python extension with PGXS
Attempting to build a python extension, I'm getting: Undefined symbols for architecture x86_64: "_PyErr_Clear", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyImport_ImportModule", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyObject_CallFunction", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyObject_GetAttrString", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o _PLyObject_To_ndarray in pg_ndarray.o "_PyString_AsString", referenced from: _PLyObject_To_ndarray in pg_ndarray.o "_PyString_FromStringAndSize", referenced from: _PLyNdarray_FromDatum in pg_ndarray.o "_PyString_Size", referenced from: _PLyObject_To_ndarray in pg_ndarray.o ld: symbol(s) not found for architecture x86_64 I've included $(python_includespec) in my Makefile: override CPPFLAGS := $(python_includespec) $(CPPFLAGS) Is there some other magic I need? Do I need to switch to using MODULE_big or something?
[GENERAL] New hacker item posted
Anyone looking to get their feet wet in the backend code, please take a look at http://www.postgresql.org/message-id/568f03ef.4070...@bluetreble.com.
Re: [GENERAL] Using xmax to detect deleted rows
On 1/8/16 9:02 AM, Alvaro Herrera wrote: Meel Velliste wrote: I would like to use the "xmax" column to detect rows that have been recently deleted. Is it possible to get the deleted row versions with non-zero xmax to remain visible long enough that I could periodically check, say once an hour, and still be able to see rows that were deleted since I last checked? No. Maybe you want a trigger that saves the deleted row somewhere (a separate table perhaps) which you can later inspect and delete again? If that is what you need then PgQ might be a good solution. But it'll be a lot simpler to just do whatever you need to do when the row is actually deleted. Just be sure you deal with rollbacks correctly if you're doing something external.
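The trigger approach suggested above could be sketched like this (table and column names are hypothetical). Because the audit insert happens in the same transaction as the DELETE, a rollback automatically discards the audit row, which sidesteps the rollback caveat for external mechanisms:

```sql
-- Hypothetical base table plus a side table for deleted rows.
CREATE TABLE widgets (id int PRIMARY KEY, payload text);
CREATE TABLE widgets_deleted (
  deleted_at timestamptz NOT NULL DEFAULT now(),
  LIKE widgets
);

CREATE OR REPLACE FUNCTION log_widget_delete() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  -- OLD carries the row being deleted.
  INSERT INTO widgets_deleted (id, payload) VALUES (OLD.id, OLD.payload);
  RETURN OLD;
END;
$$;

CREATE TRIGGER widgets_delete_log
  AFTER DELETE ON widgets
  FOR EACH ROW EXECUTE PROCEDURE log_widget_delete();
```

An hourly job can then read (and purge) widgets_deleted instead of trying to observe xmax on dead tuples.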
Re: [GENERAL] Trigger function interface
On 1/7/16 6:15 PM, Tatsuo Ishii wrote: On 1/6/16 7:03 PM, Tatsuo Ishii wrote: Is it possible to get the parse tree in a C trigger function which is invoked when DML (INSERT/UPDATE/DELETE against a view) is executed? Yes, it's in fcinfo->flinfo->fn_expr. Thanks for the info. But is this the parse tree for the top level query which involves the trigger? Hrm, apparently not. fcinfo->context would maybe be helpful, but I'm not sure.
Re: [GENERAL] Execute commands in single-user mode
On 1/10/16 10:01 AM, Andreas Joseph Krogh wrote: pg_largeobject being a system relation doesn't quite make sense to me, but that's another discussion. I know there have been some discussions in the past about making it a non-system relation, but it never got anywhere AFAIK. BTW, there are some other issues with large objects, notably their use of OIDs. Lots of LOs can lead to OID depletion. There was a thread about this recently. It might be about time to come up with an extension that's a replacement for large objects.
Re: [GENERAL] COPY FROM STDIN
On 1/8/16 10:37 AM, Luke Coldiron wrote: On 1/6/16 9:45 PM, Luke Coldiron wrote: In the example above I'm not sure if I can use some sub struct of the SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I need to go about this entirely different. Any advice on the matter would be much appreciated. I don't know off-hand. I suggest you look at what psql does to implement \copy (note the \). I took a look at the \copy but as far as I can tell this is using the frontend (client side) of postgresql and I need to be able to do this on the backend (server side). Oops. I don't see a way with this either to attach to the connection that called the C function and execute the copy statement. The DoCopy in commands/copy.h appears to me to be the server side copy command. And I think I can get it to work if I can figure out how to attach to the connection of the calling function and pass in the CopyStmt variable. That sounds... bad. I think the way to handle this is to add a new type to CopyDest and modify CopyGetData() and CopySendEndOfRow() accordingly. It might be tempting to use CopyState->filename as the pointer to a StringInfoData (StringInfo), but I'm not sure that's a great idea. I think it'd be better to add a new field to CopyStateData.
Re: [GENERAL] Code of Conduct: Is it time?
On 1/10/16 10:07 AM, Bill Moran wrote: The fact that Postgres has not needed a CoC up till now is a testament to the quality of the people in the community. However, if Postgres continues to become more popular, the number of people involved is going to increase. Simply as a matter of statistics, the project will be forced to deal with some unsavory people at some point. Having a CoC lays the foundation to ensure that dealing with those people involves the least pain possible. It will always involve _some_ pain, but less is better. I've done the job of #3 with other groups, and 99% of the time there was nothing to do. The one incident I had to handle was terrible, but at least I had some guidance on how to deal with it. Bingo. To me, the CoC is as much about protecting Postgres itself as it is about protecting contributors. Haters are going to hate, no matter what you do... so how do you remove them and their toxicity as cleanly as possible? BTW, IMHO I think it was a mistake for the FreeBSD community to try to keep things quiet. Sweeping stuff like this under the rug doesn't help anyone. The problem is how to publicize things without scaring people away from reporting. Also, not allowing your CoC to become a weapon that someone can use offensively.
Re: [GENERAL] Code of Conduct: Is it time?
On 1/6/16 2:17 AM, Victor Yegorov wrote: Another much-wanted change in the community is mentorship. Personally, I don't feel confident asking the endless questions I have when looking into the code, as I understand that this might be very basic (for PostgreSQL hackers) stuff. For me it would be a great help if I could talk this over (via e-mail or any messenger) with an experienced developer. Reminds me of what we do for GSoC, where developers volunteer to mentor students. Something similar would be handy in general, perhaps with a web interface similar to the CommitFest one. Please, ask your questions! The Postgres community really is one of the most patient and helpful OSS communities out there, and there are plenty of people that would be happy to explain things. Questions are also a good way to show where things could be better commented/documented.
Re: [GENERAL] Code of Conduct: Is it time?
On 1/6/16 9:48 AM, Melvin Davidson wrote: there is no way it can be enforced. So while I applaud Joshua Drake for his good intent, I must therefore take the position of saying no to a CoC. A good CoC is not just a code, it is also a means of enforcement. To wit, from the CouchDB CoC[1]: "If you believe someone is violating this code of conduct, you may reply to them and point out this code of conduct. Such messages may be in public or in private, whatever is most appropriate. Assume good faith; it is more likely that participants are unaware of their bad behaviour than that they intentionally try to degrade the quality of the discussion. Should there be difficulties in dealing with the situation, you may report your compliance issues in confidence to priv...@couchdb.apache.org. "If the violation is in documentation or code, for example inappropriate pronoun usage or word choice within official documentation, we ask that people report these privately to the project at priv...@couchdb.apache.org, and, if they have sufficient ability within the project, to resolve or remove the concerning material, being mindful of the perspective of the person originally reporting the issue." Importantly, the code clearly states what is and isn't acceptable, in a calm and rational manner, so that when an incident does occur -core or whoever else can deal with it much more easily. Vague statements like "don't be an ass" are useless for dealing with an actual situation. (BTW, if your concern on enforcement is about control, not only can people be removed from mailing lists and the like, but there actually is a Postgres legal entity that could start legal proceedings if it ever came to it.) [1] http://couchdb.apache.org/conduct.html
Re: [GENERAL] Code of Conduct: Is it time?
On 1/6/16 1:36 AM, Pavel Stehule wrote: The CoC doesn't solve it. We work on mature, stable, pretty complex code - we use C (not JavaScript or Java). This isn't a hobby project or a student project. No, a CoC by itself doesn't grow the community. That doesn't mean we shouldn't have one. Another weakness we have is the mentality that the only way to contribute to the community is as a developer. There's tons of other ways people could help, if we made an effort to engage them. Infrastructure, website design, documentation, project management (ie: CF manager), issue tracker wrangler (if we had one), advocacy. There's probably some others. It wouldn't even take effort from the existing community to attract those people; all we'd need to do is decide we wanted non-developers to work on that stuff and find some volunteers to go find them. But the big thing is, the existing community would have to welcome that help. Part of that would mean some changes to how the community currently operates, and the community can be very resistant to that. (I suspect partly because it pays to be very conservative when writing database software... :) ) Taking on new developers requires hard individual work with each potential developer/student. I see one interesting point - PostgreSQL extensibility - a less experienced developer can write an extension, and there can be interesting experimental extensions that can be supported without risking the stability of the core code. It would be nice to allow extensions in languages other than C. Then Postgres could be used in universities and in some startup companies - and that could increase the number of active developers. My very talented colleague doesn't write for Postgres due to C. He likes to write a planner in Lisp or Erlang, or to play in these languages. C is a barrier for younger people. Agreed. I recently said something to that effect to a few others, using Python as an example. If you look at the Python source, there are 380 .c files and 2000 .py files.
Postgres has 1200 .c, 2000 .h and only 652 .sql. Since there are 640 .out files, most of the .sql is presumably tests. I'm not suggesting we switch to Python; the point is we could do a better job of "eating our own dog food". I think it would also be very interesting if there were add-on frameworks that allowed things like a planner written in another language (which with the planner hooks might actually be possible).
Re: [GENERAL] Definitive answer: can functions use indexes?
On 1/6/16 5:15 PM, Seamus Abshere wrote: I've been using Postgres for years ( :heart: ) and I'm still in doubt about this. Would somebody provide an authoritative, definitive, narrative answer? -> Can a function like `LEFT()` use an index? (Or do I have to find an "equivalent" operator in order to leverage indexes?) If you're looking for magic here, there is none.

CREATE INDEX ON a(field);
... WHERE field = LEFT(...)   -- can use index
... WHERE LEFT(field) = ...   -- can NOT use index

CREATE INDEX ON a(LEFT(field,5));
... WHERE field = LEFT(...)   -- can NOT use index
... WHERE LEFT(field,5) = ... -- CAN use index
... WHERE LEFT(field,6) = ... -- can NOT use index
Re: [GENERAL] Definitive answer: can functions use indexes?
On 1/6/16 5:41 PM, Tom Lane wrote: Since the question makes little sense as stated, I'm going to assume you mean "can a query like SELECT ... WHERE left(foo, 3) = 'bar' use an index on column foo?" The answer to that is no, there is no such optimization built into Postgres. (In principle there could be, but I've not heard enough requests to make me think we'd ever pursue it.) BTW, the case where this would be highly valuable is timestamps. Being able to do something like date_part('month',timestamptz)='Jan' would be a big, big deal for warehousing.
Re: [GENERAL] COPY FROM STDIN
On 1/6/16 9:45 PM, Luke Coldiron wrote: In the example above I'm not sure if I can use some sub struct of the SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I need to go about this entirely different. Any advice on the matter would be much appreciated. I don't know off-hand. I suggest you look at what psql does to implement \copy (note the \).
Re: [GENERAL] Trigger function interface
On 1/6/16 7:03 PM, Tatsuo Ishii wrote: Is it possible to get the parse tree in a C trigger function which is invoked when DML (INSERT/UPDATE/DELETE against a view) is executed? Yes, it's in fcinfo->flinfo->fn_expr.
Re: [GENERAL] Efficiently selecting single row from a select with window functions row_number, lag and lead
On 1/1/16 9:39 PM, Andrew Bailey wrote:

select id, row_number() over w as rownum, lag(id, 1) over w as prev, lead(id, 1) over w as next
from route
where id=1350
window w as (order by shortname, id asc rows between 1 preceding and 1 following)
order by shortname, id;

However this gives the result 1350;1;;

The following query gives the result I am expecting:

select * from
  (select id, row_number() over w as rownum, lag(id, 1) over w as prev, lead(id, 1) over w as next
   from route
   window w as (order by shortname, id rows between 1 preceding and 1 following)
   order by shortname, id) as s
where id=1350

1350;3;1815;1813

The explain plan is:

Subquery Scan on s  (cost=0.14..15.29 rows=1 width=32)
  Filter: (s.id = 1350)
  ->  WindowAgg  (cost=0.14..13.51 rows=143 width=12)
        ->  Index Only Scan using route_idx on route  (cost=0.14..10.29 rows=143 width=12)

I'm pretty sure the issue here is that the WHERE clause is limiting your result set before the window can find what you're looking for. You could probably switch the WHERE in your original query to a HAVING and get the same results. I'm not sure the filter can actually be pushed past the window functions to get the result you want. That Index Only Scan could still be pulling every row in the table. BTW, if you switch the order by to id, shortname then it might be able to use the index, but of course the results would be different.
Re: [GENERAL] Code of Conduct: Is it time?
On 1/5/16 10:03 PM, John R Pierce wrote: On 1/5/2016 5:31 PM, Jim Nasby wrote: IMHO, the real problem here is not simply a CoC, it is that the Postgres community doesn't focus on developing the community itself. The closest we come to "focus" is occasional talk on -hackers about how we need more developers. There is no formal discussion/leadership/coordination towards actively building and strengthening our community. Until that changes, I fear we will always have a lack of developers. More importantly, we will continue to lack all the other ways that people could contribute beyond writing code. IE: the talk shouldn't be about needing more developers, it should be about needing people who want to contribute time to growing the community. That sounds like a bunch of modern marketing graduate mumbo jumbo to me. The postgres community are the people who actually support it on the email lists and IRC, as well as the core development teams, and IMO, they are quite strong and effective. When you start talking about social marketing and facebook and twitter and stuff, that's just a bunch of feelgood smoke and mirrors. The project's output is what supports it, not having people going out 'growing community'; that is just a bunch of hot air. You actively 'grow community' when you're pushing worthless products (soda pop, etc) based on slick marketing plans rather than actually selling something useful. Then why is it that there is almost no contribution to the community other than code and mailing list discussion? Why is the infrastructure team composed entirely of highly experienced code contributors, of which there are ~200 on the planet, when there are literally 100s of thousands (if not millions) of people out there that could do that work (and could probably do it better if it's what they do for a living, no offense to the efforts of the infrastructure team)? Why is there a lack of developers? And a serious lack of code reviewers?
Re: [GENERAL] Code of Conduct: Is it time?
On 1/5/16 6:32 PM, Scott Marlowe wrote: I don't think I am a good person to rationalize their reasoning because I don't like the idea of a CoC. That said, I think a lot of it boils down to perception, responsibility, accountability and the fact that a lot of people are flat out jerks. I am not talking the ball busting type of jerk but honest, just not nice people or people who vastly lack the ability to integrate with larger society. Those people tend to need guidelines for their jerkiness because they will say, "I didn't know I couldn't do/say XYZ". Whether that is true or not, I have no idea. CoC: 1: Use our code how you want 2: Don't sue us 3: Don't be a jerk Well, that highlights that it's not just about a CoC, it's the things that surround it. Especially what the conflict resolution policy is. I suspect JD thought about this because of a recent Facebook thread[1] about how the FreeBSD community just screwed this up big-time[2]. The big screw-up was not having solid ways to deal with such complaints in place. Sadly, as part of that thread, it comes to light that there is some history of this in the Postgres project as well. IMHO, the real problem here is not simply a CoC, it is that the Postgres community doesn't focus on developing the community itself. The closest we come to "focus" is occasional talk on -hackers about how we need more developers. There is no formal discussion/leadership/coordination towards actively building and strengthening our community. Until that changes, I fear we will always have a lack of developers. More importantly, we will continue to lack all the other ways that people could contribute beyond writing code. IE: the talk shouldn't be about needing more developers, it should be about needing people who want to contribute time to growing the community. I saw a great presentation about building a strong community by Joan Touzet of CouchDB. The presentation link is currently down, but there's a great interview with her at [3].
CouchDB didn't focus on community building until they had a major problem to deal with. Now, they make community one of their focal points. Just one example, this is the 3rd paragraph on their home page: "We welcome your contributions. CouchDB is an open source project. Everything, from this website to the core of the database itself, has been contributed by helpful individuals. The time and attention of our contributors is our most precious resource, and we always need more of it. Our primary goal is to build a welcoming, supporting, inclusive and diverse community. We abide by Code of Conduct and a set of Project Bylaws. Come join us!" What I'd love to see is support and commitment from the Postgres community to actively attract people who will focus not on the code but on building the community itself. I know there are people in the community that would be interested in doing that, but without active support and some encouragement things aren't going to change. [1] https://www.facebook.com/jon.erdman.jr/posts/10153828693183899 [2] http://blog.randi.io/2015/12/31/the-developer-formerly-known-as-freebsdgirl/ [3] https://opensource.com/life/15/8/couchdb-community-apache-way
Re: [GENERAL] Cannot upgrade from 9.3 to 9.4 using pg_upgrade
On 1/4/16 7:40 PM, Adrian Klaver wrote: or even better yet could you post the section of the log above the error? The server log itself might be useful, especially if full query logging was turned on. Dunno how easy/possible that is with pg_upgrade.
Re: [GENERAL] Getting the function definition from oid in before firing the function
On 1/5/16 8:13 AM, Mohammed Ajil wrote: Now my problem consists of two steps: - First I need to decide if the algorithm supports the trigger type (only of LANGUAGE PLPGSQL). plpgsql is NOT the only language that supports triggers. - Second I need to get the SQL command that the function will execute. What I tried was the following: In the file trigger.c I have found the location where the function is called. I think triggers is the wrong way to approach this; it's full of holes (not the least of which is triggers don't fire on SELECT). You'd be much better off with an executor hook. Have you looked at https://github.com/2ndQuadrant/pgaudit?
Re: [GENERAL] to_timestamp alternatives
On 1/1/16 4:11 AM, Thomas Kellerer wrote: You only need to provide the text value to be cast (no format mask). Use only a single expression without those unnecessary parentheses: SELECT gmt_date||' '||lpad(gmt_time,8,'0')::timestamp You don't need the lpad() either: SELECT gmt_date||' '||gmt_time::timestamp I suspect you need to wrap that in (). Even if the parser does the right thing there, it'd certainly make the intent a lot clearer. SELECT (gmt_date || ' ' || gmt_time)::timestamp
Re: [GENERAL] COPY FROM STDIN
On 1/4/16 12:18 PM, Luke Coldiron wrote: Is there a way to achieve the performance of the COPY FROM STDIN command within a C extension function connected to the db connection that called the C function? I have text that I would like to receive as input to a C function that contains many COPY command statements in the file that would be parsed similar to how psql would handle the file, but I don't want to shell out to psql, as I need to do all of this work on the db connection that the function was called from; there are other commands that I need to perform as well before and after handling the COPY commands on this same connection. I would like the unit of work to be all or nothing and have the performance of the COPY FROM STDIN command, not break things out into SELECT INTO or INSERT statements. Ideally I would like to be able to attach to the calling db connection via SPI_connect() and then use the libpq library to issue the copy commands via PQputCopyData, PQputCopyEnd. C functions can use SPI, so I'm not sure what the issue is? http://www.postgresql.org/docs/9.5/static/spi.html (BTW, you'll want to scroll to the bottom of that page...)
Re: [GENERAL] to_timestamp alternatives
On 1/1/16 7:15 AM, Alban Hertroys wrote: Since you're dealing with GPS data and presumably have lat/long, it shouldn't be hard to do this dynamically either, either by just blindly dividing longitude by 15 or using actual timezone shape polygons and @> or <@. That would be a bad idea for global data, since not all time zones are full hours apart, or have the same (if any) DST change-over dates. For example, India is currently at UTC+05:30, probably because they wrapped the entire country in the same TZ after their independence. The original desire was to ignore DST, hence my suggestion. Normally you'd want to figure out the correct TZ for the lat/long.
Re: [GENERAL] cannot get stable function to use index
On 12/30/15 2:12 PM, Andy Colson wrote: random_page_cost = 1 Humm, nope. I removed the config option, restarted PG, then analyzed the search table: FYI, you can set that inside any session, any time you want. What's in postgresql.conf is just the default value. (For that matter, you can also set a default for that on a specific database, or a specific user, using ALTER DATABASE SET or ALTER USER SET.)
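For example (the database and role names here are made up):

```sql
-- Session-level: overrides the postgresql.conf default until changed
-- again or the session ends. No restart needed.
SET random_page_cost = 1;
SHOW random_page_cost;

-- Persistent defaults, applied at connection start:
ALTER DATABASE mydb SET random_page_cost = 1;
ALTER ROLE myuser SET random_page_cost = 1;
```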
Re: [GENERAL] Efficiently Triggering Autovacuum Analyze?
On 12/30/15 1:31 PM, Joe Conway wrote: On 12/30/2015 11:09 AM, Cory Tucker wrote: With this scenario you can expect an autoanalyze every 5 million rows and autovacuum every 10 million. In my experience (and based on your description, yours as well) this is not often enough. Not only that, when it does run it runs longer than you would like, causing an I/O hit while it does. You probably should tune this table specifically, e.g. Another option is to explicitly analyze then SELECT from the table after you're done inserting into it. The advantage is it doesn't tie up an autovac worker and you can ensure that the newly added tuples get properly hinted. You can run the ANALYZE immediately after your insert finishes. The reason to do that is to get up-to-date statistics for other queries to use. That can be particularly important if the new rows have values significantly outside what was in the table before. That's common with things like sequence IDs and timestamp data. The SELECT is a bit trickier; you want to ensure that there is no transaction still running in the database that's older than the transaction that added all the new data. You can check that by comparing the xmin field of one of your new rows with txid_snapshot_xmin(txid_current_snapshot()). Note that because of wraparound you can't do a simple comparison; txid 3 is actually greater than txid 2^32. The whole point of this SELECT is to get the new tuples hinted while the pages are still hot in cache. If you don't do that, the next query that reads the tuple will have to set the hints, which also dirties the page. VACUUM does that too, but there's really no point in having vacuum run through the entire table just to set hints on less than 1% of it.
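A sketch of that sequence, with a hypothetical table name and an assumed monotonically increasing id column; txid_snapshot_xmin() and txid_current_snapshot() are the function names from this era of Postgres:

```sql
-- 1. Right after the bulk load: refresh statistics so other queries see
--    the new value ranges immediately.
ANALYZE my_big_table;

-- 2. Check that no transaction older than the loading one is still
--    running (compare against the xmin of one of the new rows, minding
--    the wraparound caveat described above):
SELECT txid_snapshot_xmin(txid_current_snapshot());

-- 3. Once it is safe, touch the new rows so their hint bits are set
--    while the pages are still hot in cache (the WHERE clause is a
--    made-up way of selecting just the newly loaded rows):
SELECT count(*) FROM my_big_table WHERE id > 12345678;
```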
Re: [GENERAL] to_timestamp alternatives
On 12/31/15 5:05 PM, gkhan wrote: For example, this fails because of the day-before-month format: Right, which is why Tom had in his example: regression=# set datestyle = dmy; BTW, my recommendation would be to store in a timestamptz field *with the correct timezone*, and then convert on output as necessary. This is easy to do by either SET timezone or SELECT timestamptz_field AT TIME ZONE '...'; If you want the time without DST, you can just use a timezone like '+8' or '-8'. Since you're dealing with GPS data and presumably have lat/long, it shouldn't be hard to do this dynamically either, either by just blindly dividing longitude by 15 or using actual timezone shape polygons and @> or <@.
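As a sketch (gps_log and ts are made-up names): AT TIME ZONE also accepts an interval, which gives a fixed offset and is one unambiguous way to get DST-free output:

```sql
-- Hypothetical table: gps_log(ts timestamptz).
-- Display at a fixed UTC-8 offset, ignoring DST entirely:
SELECT ts AT TIME ZONE INTERVAL '-08:00' AS local_time FROM gps_log;

-- Or change the display zone for the whole session:
SET timezone = 'America/Los_Angeles';
SELECT ts FROM gps_log;
```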
Re: [GENERAL] cannot get stable function to use index
On 12/29/15 5:21 PM, David G. Johnston wrote: Neither STABLE nor VOLATILE functions are candidates for indexing; only IMMUTABLE ones are. The default for functions is VOLATILE. I haven't the time to provide a solution to your problem - I'm just pointing out "cannot get stable function to use index" is working as designed and as is logically required. An index must not rely upon outside information, most typically time, since there exists no means for an index to update itself based upon changes in the environment. The only type of function guaranteed not to rely upon the external environment is an immutable one. And no, you shouldn't lie by marking a function immutable to get this to work. The system does not check that the stated volatility and the actual implementation match. So while the above is all true I apparently misunderstood your question... :( I'm not an expert on text search, but presumably the tsvector knows what config was used, which means as long as the config is part of the index it should be OK. Marking to_tsquery_partial as immutable would be wrong and potentially dangerous, as you can change default_text_search_config at any time, which would change the output of to_tsquery (as well as to_tsvector). I'm going to wait for someone thinking more clearly to answer... but it seems that given an inability to prove that the result of the function call is meaningfully selective, the system would default to choosing a sequential scan plan over an index. You happen to have chosen a value that only returns a single row, but nothing prevents you from picking one that returns the entire table. There may be other factors involved as I am not that familiar with the full text search capabilities of PostgreSQL.
If I'm reading EXPLAIN ANALYZE correctly, to_tsquery_partial is being simplified out of the query entirely: Filter: (search_vec @@ to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) || ':*'::text))) Part of this could well be that you're not feeding the same data to to_tsquery. Your hard-coded example is where search_vec @@ to_tsquery('213 & E & 13 & ST & N'); but your second query becomes '213 & E & 13 & ST & N:*'. Have you tried that as a hard-coded value? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
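That is, compare the plan for the literal prefix-match form against the original; the table name here is a guess, since the original isn't shown:

```sql
-- Hard-coded equivalent of what to_tsquery_partial builds: note the
-- ':*' prefix match on the last term, absent from the original test.
EXPLAIN ANALYZE
SELECT *
FROM addresses  -- hypothetical table name
WHERE search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');
```

If this literal form also seq-scans, the problem is the prefix match itself rather than the function's volatility.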
Re: [GENERAL] efficient math vector operations on arrays
On 12/27/15 2:00 AM, Jony Cohen wrote: Hi, I don't know if it's exactly what you're looking for, but the MADlib package has utility functions for matrix and vector operations; see: http://doc.madlib.net/latest/group__grp__array.html To apply an operator to all elements of an array or pair of arrays: http://theplateisbad.blogspot.com/2015/12/the-arraymath-extension-vs-plpgsql.html, https://github.com/pramsey/pgsql-arraymath. See also http://theplateisbad.blogspot.com/2015/12/more-fortran-90-like-vector-operations.html. BTW, if you want to simply apply a function to all elements in an array there is an internal C function array_map that can do it. There's no SQL interface to it, but it shouldn't be hard to add one. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
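Without any extension, element-wise operations over a pair of arrays can be written in plain SQL using the multi-argument form of unnest() (9.4+), which walks both arrays in parallel:

```sql
-- Element-wise multiply of two arrays; WITH ORDINALITY pins the
-- original element order for the re-aggregation.
SELECT array_agg(a * b ORDER BY i) AS product
FROM unnest(ARRAY[1,2,3]::float8[],
            ARRAY[4,5,6]::float8[])
     WITH ORDINALITY AS t(a, b, i);
-- {4,10,18}
```

The extensions above mainly buy you nicer syntax (arr1 * arr2) and C-level speed over this pattern.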
Re: [GENERAL] efficient math vector operations on arrays
On 12/29/15 6:50 PM, Tom Lane wrote: Jim Nasby<jim.na...@bluetreble.com> writes: >BTW, if you want to simply apply a function to all elements in an array >there is an internal C function array_map that can do it. There's no SQL >interface to it, but it shouldn't be hard to add one. That wouldn't be useful for the example given originally, since it iterates over just one array, not two arrays in parallel. But you could imagine writing something similar that would iterate over two arrays and call a two-argument function. Actually, I suspect you could pretty easily do array_map(regprocedure, VARIADIC anyarray). Whether it's worth a SQL interface is debatable though. Whatever efficiency you might gain from using this would probably be eaten by the overhead of calling a SQL or PL function for each pair of array elements. You'd probably end up in the same ballpark performance-wise as the UNNEST solution given earlier. Take a look at [1]; using a rough equivalent to array_map is 6% faster than unnest(). The array op array version is 30% faster than plpgsql, which based on the code at [2] I assume is doing explain analyze select array(select a*b from unnest(array(select random() from generate_series(1,100)), array(select random() from generate_series(1,100))) u(a,b)); The syntactic sugar of r := array_map('function(a, b)', in1, in2) (let alone r := in1 * in2;) is appealing too. [1] http://theplateisbad.blogspot.com/2015/12/the-arraymath-extension-vs-plpgsql.html [2] http://theplateisbad.blogspot.com/2015/12/more-fortran-90-like-vector-operations.html -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Shared system resources
On 12/23/15 12:05 PM, Melvin Davidson wrote: As others have pointed out, worrying about someone accessing database shared memory is like worrying about an asteroid striking the earth and wiping out all life. It's a one in a billion chance compared to other security violations that can occur. You are better off concentrating on proper O/S security and user/table permissions. That is how to implement database security! True, but in my experience security audits have nothing to do with security and everything to do with marking off checkboxes and complicating lawsuits. ;) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] efficient math vector operations on arrays
On 12/24/15 1:56 AM, Pavel Stehule wrote: I don't know of any extension that calculates Euclidean distance, but it should be trivial in C - if you don't need to use generic types and generic operations. Before messing around with that, I'd recommend trying either pl/r or pl/pythonu. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
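Before reaching for C, pl/r, or pl/pythonu, it's worth noting this is also a one-liner in plain SQL (a sketch; the function name is made up, and it assumes equal-length arrays):

```sql
-- Euclidean distance between two float8 arrays, using the 9.4+
-- multi-argument unnest() to pair elements positionally.
CREATE OR REPLACE FUNCTION euclidean_distance(a float8[], b float8[])
RETURNS float8 LANGUAGE sql IMMUTABLE AS $$
    SELECT sqrt(sum((x - y)^2))
    FROM unnest(a, b) AS t(x, y);
$$;

SELECT euclidean_distance(ARRAY[0,0], ARRAY[3,4]);  -- 5
```

A C version would mostly win on per-call overhead for large arrays or hot loops.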
Re: [GENERAL] Shared system resources
On 12/23/15 7:55 AM, oleg yusim wrote: Sure, David. For simplicity of modeling here, let's assume the raw database data was encrypted and the only possibility for an attacker to get something from raw data is to go and dig into session leftovers. Now, with that said, do you happen to know what information actually gets stored during the session in the memory reserved by the session process? I'm trying to determine, basically, whether it is even worth a talk - maybe there is nothing valuable at all. There's tons of raw data stored in the shared memory segment, and some of that can be copied to process-local memory at any time. If the OS doesn't secure that adequately there's certainly nothing that Postgres or any other database can do about it. As David said, by the time you're concerned about someone getting access to raw memory it's already way too late. As for memory pages being zeroed after they are returned to the OS, that's entirely up to the OS. The only thing you could do on the Postgres side is to compile with memory debugging enabled, which will over-write any memory that's freed with a magic value. That's done to help hunt down memory access bugs, but would have the obvious side effect of obliterating any data that was in the page. The only thing is, I don't know if that over-write happens when we're about to return the memory to the OS. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Table with seemingly duplicated primary key values
On 12/22/15 1:31 PM, Melvin Davidson wrote: The fact that you have rows with an identical id of 2141750 verifies a corrupted primary index. To correct it, you need to decide which row to keep. So review the results of SELECT * FROM some_table WHERE ctid = '(79664,59)' OR ctid = '(79772,23)'; then DELETE FROM some_table WHERE id = 2147150 AND field1 = {value for field1 of row to delete} AND field2 = {value for field2 of row to delete} ... ... AND fieldn = {value for fieldn of row to delete}; Then REINDEX TABLE some_table; And most importantly: review your logs for hardware errors and your Postgres and filesystem settings for anything dangerous. Index corruption is not normal and indicates the underlying hardware or OS is faulty (or maybe a bug in Postgres, but that's very unlikely). You should also consider turning on page checksums if you haven't already. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Shared system resources
On 12/22/15 6:03 PM, oleg yusim wrote: Absolutely. But we are not talking about that type of data leakage here. We are talking about the potential situation where a user who doesn't have access to the database, but has (or gained) access to the Linux box the DB is installed on, gets his hands on the data the database processes store in memory (memory would be a common resource here). Of far larger concern at that point is unauthorized access to the database files. Basically, if someone gains access to the OS user that Postgres is running as, or to root, it's game over. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)
On 12/21/15 8:22 AM, Bill Moran wrote: >> Why? Just because a disk isn't enterprise-grade doesn't mean it has to lie about fsync, which is the only thing diskchecker.pl tests for. > I was thinking that since the disk has a 32M write cache (without a battery) it would lie to the OS (and Postgres) about when data is really on disk (not in the disk write cache). But maybe that thinking was wrong. There are ways to make on-disk write caches safe without a battery. IIRC some hard drives would use the inertia of the platter (turning the motor into a generator) to write contents out on power-off. You could also use a "super cap". It varies by vendor and product, which is why diskchecker.pl exists. It's even possible that the behavior is configurable... check to see if the vendor provides a utility for configuring it. Your OS might let you control it too; I know FreeBSD has support for this. (Whether the drive obeys or not is a different matter...) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Re: Moving Postgresql 9.1 instance from FreeBSD 9.0 machine to FreeBSD 10.2
On 12/20/15 3:43 PM, Amitabh Kant wrote: Just did a transfer without a hitch on a 1 TB (approx) production database. Everything is working fine. Saved a ton of time and complications. Note that that would not work if there were any binary incompatibilities between the two systems; big vs little endian would be an obvious example. Postgres would detect all the obvious cases of this and refuse to start. One thing I'm not sure about is whether it would if there were different locales installed on the two machines. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)
On 12/20/15 1:09 PM, Félix GERZAGUET wrote: After reading http://www.postgresql.org/docs/current/static/wal-reliability.html, I tried the recommended diskchecker.pl <http://brad.livejournal.com/2116715.html> but I am not satisfied: I always get: Total errors: 0 even though I tested with a HGST HTS721010A9E630, which the vendor's datasheet (http://www.hgst.com/sites/default/files/resources/TS7K1000_ds.pdf) advertises as "Designed for low duty cycle, non mission-critical applications in PC, nearline and consumer electronics environments, which vary application to application". Since it is not a high-end disk, I expect some errors. Why? Just because a disk isn't enterprise-grade doesn't mean it has to lie about fsync, which is the only thing diskchecker.pl tests for. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Unique index problem
On 12/20/15 10:18 AM, Andreas Kretschmer wrote: test=*# create unique index on foo(a,b,c) where a is not null and b is not null and c is not null; CREATE INDEX As you discovered, you'd have to build separate indexes for each combination of nullable fields: UNIQUE ON (a,b) WHERE c IS NULL; (a,c) WHERE b IS NULL; (b,c) WHERE a IS NULL; (a) WHERE b IS NULL AND c IS NULL; and so on. That's doable for 3 fields, but obviously gets out of hand really quickly. I wonder if it would be possible to construct an operator family (class?) that would accept 2 rows (i.e. ROW(a,b,c)) and treat NULLs as single values... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
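Spelled out for three nullable columns, the full set looks roughly like this (a sketch following the foo(a,b,c) example above):

```sql
-- Unique over (a,b,c), treating NULL in any column as a single value.
CREATE UNIQUE INDEX ON foo (a, b, c)
    WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL;
CREATE UNIQUE INDEX ON foo (a, b) WHERE c IS NULL;
CREATE UNIQUE INDEX ON foo (a, c) WHERE b IS NULL;
CREATE UNIQUE INDEX ON foo (b, c) WHERE a IS NULL;
CREATE UNIQUE INDEX ON foo (a) WHERE b IS NULL AND c IS NULL;
CREATE UNIQUE INDEX ON foo (b) WHERE a IS NULL AND c IS NULL;
CREATE UNIQUE INDEX ON foo (c) WHERE a IS NULL AND b IS NULL;
-- Note: rows where a, b, and c are ALL null are still unconstrained.
```

Seven indexes for three columns; the count grows as 2^n - 1, which is the "out of hand" problem.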
Re: [GENERAL] wal receiver process always start after startup process recovering all WALs and need new WAL?
On 12/15/15 2:49 AM, Jov wrote: I think this behavior of recovery_min_apply_delay is not good, because if the receiver does not fetch the WAL for a long time (in these cases it must replay 3 days' worth of WAL before the WAL receiver starts), the master will delete the WAL, and the standby will need to be redone. AFAIK, if you use a replication slot (new in 9.4) that won't happen, because the master will know that the replica still hasn't applied the data. If you don't use replication slots then you need to set wal_keep_segments large enough on the master so that it doesn't delete segments that are needed. Or configure WAL archiving and let the replica replay from the archive. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
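A sketch of the slot-based setup (assumes 9.4+; the slot name is made up):

```sql
-- On the master: create a physical replication slot. WAL is then
-- retained until the slot's consumer has received it, instead of being
-- recycled after wal_keep_segments.
SELECT pg_create_physical_replication_slot('standby1');

-- On the standby, reference the slot in recovery.conf:
--   primary_slot_name = 'standby1'

-- On the master, monitor slot state; a slot nobody consumes from will
-- retain WAL indefinitely and can fill the disk.
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
```

Drop a slot you no longer need with SELECT pg_drop_replication_slot('standby1'); otherwise the retained-WAL problem just moves from the standby to the master's disk.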
Re: [GENERAL] Trigger function, C, lookup of attribute type
On 12/15/15 4:42 AM, Paul wrote: I'm writing a trigger procedure in C to catch updates to a table and write them to a log file. You might find https://github.com/arkhipov/temporal_tables useful. Though, if you just want to log things to a file, you should check out http://pgaudit.org/. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] How to get the size of JSONB in bytes?
On 12/15/15 8:24 AM, Petr Korobeinikov wrote: The better approach is to extract your length-validation logic into your application. That's really up to interpretation. The database is the only place the data is stored, and as such is the only place that can constrain that data for every caller. If you're accepting data from a web form or something you certainly want it to also check things, so the user gets immediate feedback. But for anything you need to guarantee, you need to use the database. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
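One way to express that guarantee in the database itself; the table, column, and 2000-byte limit here are only examples:

```sql
-- Cap the size of a jsonb document via its text rendering.
-- octet_length() of the text form is deterministic, whereas
-- pg_column_size() reports post-compression (TOAST) size.
ALTER TABLE docs
    ADD CONSTRAINT docs_data_size_chk
    CHECK (octet_length(data::text) <= 2000);
```

Application-side checks then become a UX nicety rather than the enforcement mechanism.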
Re: [GENERAL] HELP!!! The WAL Archive is taking up all space
On 12/9/15 7:05 PM, Andreas Kretschmer wrote: I'm really a newbie to PostgreSQL but the boss pushed me to handle it and implement it in production f*&%*$%%$#%$#
Re: [GENERAL] Overhead changing varchar(2000) to text
On 12/9/15 5:43 PM, Edson Richter wrote: Actually, the biggest change is that I don't have to keep another constraint between app and database - if I want to increase the user perceived space, now I just have to change the application (of course, under the limits). For what it's worth, I usually put some limit on fields that a webapp can write to in the database. That way a bug in the app (or malicious action) can't just start allocating gigabytes of stuff in your database. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
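A minimal version of that kind of cap (the names and the limit are invented for illustration):

```sql
-- Unbounded text column, but with a sanity ceiling so an app bug or
-- malicious client can't allocate gigabytes per row:
CREATE TABLE comments (
    id   bigserial PRIMARY KEY,
    body text CHECK (length(body) <= 10000)
);
```

Raising the limit later just means replacing the constraint; no table rewrite is needed, unlike shrinking a varchar(n) that already holds longer values.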
Re: [GENERAL] Feature Request: Faceting for full text search
On 12/9/15 11:30 AM, Mirek Svoboda wrote: Is it OK if I add feature request "faceting for FTS" to roadmap? If you're asking how to get a feature added to Postgres, you basically have 4 options: 1) Convince an existing contributor to work on adding the feature. 2) Convince a large enough group of users that they want the feature and get them all to ask for it (large enough to make the developer community notice, agree the feature is good, and decide to implement it). 3) Become involved in Postgres development and learn enough to add the feature yourself. 4) Pay one of the support companies to develop the feature. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Deletion Challenge
On 12/9/15 7:59 PM, Berend Tober wrote: The issue is that I'd like the application (that is, the data base and its stored procedures) to be robust enough to be a "long-running" application, i.e. one that doesn't suffer gradual performance degradation as time and the accumulated data increase. In my experience, no such thing exists. This is one of the things that makes database development very different than other forms of programming. Once you write a program, it's basically always going to perform the same. Database performance slowly changes over time, not only due to different amounts of data, but also different *composition* of data stored. Of course, it is wise to avoid things that will obviously hurt future performance, so it's good that you're thinking about this. But don't expect something you can "set and forget". :) This question was sort of addressed at the "query tuning" aspect, and I'm confident that partitioning would help. Keep in mind that partitioning isn't a magic bullet either, though in this case I agree it would help. Sometimes something as simple as having "active" and "history" partitions is enough. This project is a game, btw, described at You might be interested in https://schemaverse.com/ -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
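The "active"/"history" split can be sketched with table inheritance, which is how partitioning was done in Postgres at the time (all names here are invented):

```sql
CREATE TABLE game_event (
    id        bigserial,
    played_at timestamptz NOT NULL,
    payload   text
);
-- Children inherit the parent's columns; queries against game_event
-- transparently see rows from both children.
CREATE TABLE game_event_active  () INHERITS (game_event);
CREATE TABLE game_event_history () INHERITS (game_event);
-- Inserts go to game_event_active; a periodic job moves finished rows
-- into game_event_history, keeping the hot partition small so scans,
-- vacuums, and reindexes stay cheap as total data grows.
```

The point is that most day-to-day queries only ever touch the small active table, regardless of how large history gets.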
Re: [GENERAL] Table with invalid page blocks
On 12/4/15 11:34 AM, Gerdan Rezende dos Santos wrote: Does someone have a way of identifying all the invalid blocks of a PostgreSQL table? A plpgsql function, a tool, somehow. I found one solution at http://www.postgresql.org/message-id/1184245756.24101.178.ca...@coppola.muc.ecircle.de, but I cannot change it to identify all defective blocks at once. If your question is "How can I modify that function to report ALL invalid CTIDs?" then you probably need to use a cursor and wrap the FETCH in a BEGIN/END block with an exception handler. Something like:

DECLARE
    curs refcursor;
    rec record;
    last_good tid;
    bad boolean := false;
BEGIN
    OPEN curs NO SCROLL FOR EXECUTE 'SELECT ctid FROM ' || table_name;
    LOOP
        BEGIN
            FETCH curs INTO rec;
            EXIT WHEN NOT FOUND;
            IF bad THEN
                RAISE WARNING 'Next good CTID %', rec.ctid;
                bad := false;
            END IF;
            last_good := rec.ctid;
        EXCEPTION WHEN OTHERS THEN
            RAISE WARNING E'Error %: %\nLast good CTID %', SQLSTATE, SQLERRM, last_good;
            bad := true;
        END;
    END LOOP;
END;

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] FATAL: unable to read data from DB node 0
On 11/27/15 5:45 PM, DrakoRod wrote: Hi everybody. I have the following architecture with pgpool (streaming replication mode): 4 nodes: 1 master node, 2 standby nodes, 1 pgpool node. I've disabled load balancing, because some clients reported problems with the load balancer: it sometimes sent queries to standby nodes that had not yet recovered the data, and the queries failed, but that topic is for another thread. When I run a stress test with hammerdb I see the following errors on the pgpool node:

2015-11-27 16:48:21: pid 20190: FATAL: unable to read data from DB node 0
2015-11-27 16:48:21: pid 20190: DETAIL: EOF encountered with backend
2015-11-27 16:48:21: pid 19182: LOG: child process with pid: 20190 exits with status 256
2015-11-27 16:48:21: pid 19182: LOG: fork a new child process with pid: 20298
2015-11-27 16:48:21: pid 20163: FATAL: unable to read data from DB node 0
2015-11-27 16:48:21: pid 20163: DETAIL: EOF encountered with backend
2015-11-27 16:48:21: pid 19182: LOG: child process with pid: 20163 exits with status 256
2015-11-27 16:48:21: pid 19182: LOG: fork a new child process with pid: 20299

This happens in every child, so hammerdb stops the test because it has lost all its connections. I've never run pgPool myself, but maybe this means the connection was broken for some reason. Are there related errors in the Postgres server log for that node? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] plperlu stored procedure seems to freeze for a minute
On 12/4/15 4:15 AM, Peter J. Holzer wrote: On 2015-12-03 10:02:18 -0500, Tom Lane wrote: >"Peter J. Holzer"<hjp-pg...@hjp.at> writes: > >Can those signals be safely ignored? Just blocking them (so that they > >are delivered after the UDF finishes) might be safer. But even that may > >be a problem: If the UDF then executes some SQL, could that rely on > >signals being delivered? I have no idea. > >The minute you start fooling with a backend's signal behavior, we're >going to politely refuse to support whatever breakage you run into. As I understood Jim, he was talking about possible changes to PostgreSQL to shield UDFs from those signals, not something the author of a UDF should do. No, I was suggesting that: 1) It might be possible for Postgres error handling to detect that a signal had been received since entering a UDF if the UDF reported an error, and make that information available as part of the standard error report. 2) If there are alarms being set specifically for authentication, they should probably be cancelled after authentication is successful. I certainly do NOT think it would be wise to disable all signal handling during UDF execution. Just for starters, that means that if a UDF went into an infinite loop your only way to recover would be to PANIC the entire database. It would probably create a bunch of other problems as well. In other words, UDFs *must* be capable of handling an interrupt. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Convert from hex to string
On 11/26/15 1:12 AM, Francisco Olarte wrote: Anyway, I think people should be told to respect types, and people should be taught that strings are sequences of characters; to do what people think is 'hex encoding' (two digits per *byte*, no delimiter), you need to first transform the string to bytes, then hex-encode that. BTW, it wouldn't be that difficult to create a hex data type that worked like bytea but accepted only hex strings. Likewise for a base64 type. The only downside is you'd probably have to do it in C, because no other language can handle cstring and I don't think there are any built-in conversion functions with the correct parameters. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
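The two-step conversion described above (characters to bytes, then bytes to hex) is already expressible with built-in functions:

```sql
-- String -> bytes (explicit encoding) -> hex:
SELECT encode(convert_to('héllo', 'UTF8'), 'hex');
-- 68c3a96c6c6f  ('é' is the two bytes c3 a9 in UTF-8)

-- And back: hex -> bytes -> string:
SELECT convert_from(decode('68c3a96c6c6f', 'hex'), 'UTF8');
-- héllo
```

A dedicated hex type would mainly add input validation and a more compact on-disk form, on top of what encode()/decode() already provide.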
Re: [GENERAL] JSONB performance enhancement for 9.6
On 11/29/15 9:30 AM, Arthur Silva wrote: Is this correct? I'm fairly sure jsonb supports lazily parsing objects and each object level is actually searched using binary search. The problem is there's no support for loading just part of a TOASTed field. Even if that existed, we'd still need a way to know what byte position in the TOASTed field a key lived at. It's possible to add all that, but I think it'd be a serious amount of work. Since someone else was just wondering about storing more specific types in JSON, it might be more useful/interesting to devise a Postgres-specific way to store variable schema documents. That would give us a lot more flexibility over implementation than the JSON type ever will. Or think about it this way: there's really no great reason why everyone chose JSON. There's tons of other serialization storage formats for other languages out there, so why not one specialized to Postgres? (And of course we'd want to be able to cast from that to JSON and back...) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] loading data into cluster - can I daisy-chain streaming replication?
On 12/2/15 9:54 PM, Dennis wrote: What you can do with your current configuration is do a pg_dumpall or pg_dump of A and load that logical dump into B, which would then replicate the data to C. If you can't handle the downtime you could also set up londiste or Slony and logically replicate A to B. But (assuming A and B are binary compatible) you'd be better off just breaking the B-to-C replication, setting B up as a replica of A, and then setting C up as a replica of B. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com