Re: [GENERAL] Recover from corrupted database due to failing disk

2016-11-02 Thread Jim Nasby

On 11/2/16 6:21 PM, Jim Nasby wrote:

I wouldn't trust the existing cluster that far. Since it sounds like you
have no better options, you could use zero_damaged_pages to allow a
pg_dumpall to complete, but you're going to end up with missing data. So
what I'd suggest would be:

stop Postgres
make a copy of the cluster
start with zero_damaged_pages
pg_dumpall
stop and remove the cluster (make sure you've got that backup)
create a new cluster and load the dump


Oh, and while you're at it, upgrade to a version that's supported. 8.1 
has been out of support for 5+ years.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Recover from corrupted database due to failing disk

2016-11-02 Thread Jim Nasby

On 11/2/16 2:02 PM, Gionatan Danti wrote:

However, backups continue to fail with an "invalid page header in block"
message. Moreover, I am very near the xid wraparound limit and, as vacuum
fails due to the invalid blocks, I expect a database shutdown (triggered
by the 1M-transaction protection) within some days.


That means at least some of the Postgres files have been damaged 
(possibly due to the failing disk). Postgres will complain when it sees 
internal data structures that don't make sense, but it has no way to 
know if any of the user data has been screwed up.



From my understanding, both problems *should* be solved by enabling
"zero_damaged_pages" and executing a "vacuumdb -a". Is this expectation
correct? Will a "reindexdb -a" be necessary?


I wouldn't trust the existing cluster that far. Since it sounds like you 
have no better options, you could use zero_damaged_pages to allow a 
pg_dumpall to complete, but you're going to end up with missing data. So 
what I'd suggest would be:


stop Postgres
make a copy of the cluster
start with zero_damaged_pages
pg_dumpall
stop and remove the cluster (make sure you've got that backup)
create a new cluster and load the dump
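
Spelled out as commands, a minimal sketch (paths and the backup location
are assumptions; adjust for your install):

pg_ctl -D /var/lib/pgsql/data stop
cp -a /var/lib/pgsql/data /backup/data.corrupt   # the copy you keep
pg_ctl -D /var/lib/pgsql/data start -o "-c zero_damaged_pages=on"
pg_dumpall > /backup/dumpall.sql                 # damaged pages read back zeroed
pg_ctl -D /var/lib/pgsql/data stop
mv /var/lib/pgsql/data /var/lib/pgsql/data.old
initdb -D /var/lib/pgsql/data
pg_ctl -D /var/lib/pgsql/data start
psql -f /backup/dumpall.sql postgres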
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Questions on Post Setup MASTER and STANDBY replication - Postgres9.1

2016-11-02 Thread Jim Nasby

On 11/2/16 2:49 PM, Joanna Xu wrote:

The replication is verified and works.  My questions are what’s the
reason causing “cp: cannot stat
`/opt/postgres/9.1/archive/00010003': No such file or
directory” on STANDBY and how to fix it?


What instructions/tools did you use to setup replication?


Also, it seems the startup
process is stuck on "recovering 00010004"; how can I resolve
it?


As far as I know that's normal while in streaming mode.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] initdb createuser commands

2016-11-02 Thread Jim Nasby

On 10/31/16 9:50 AM, Christofer C. Bell wrote:

He's getting a lot of pushback that really feels it's coming from the
wrong direction.  "Just learn it."  "It's always been this way."  "No
one agrees with you."  These arguments are unconvincing.  That said,
there's nothing wrong with just saying, "we're not going to change it
because we don't want to."


The community often does a horrible job of viewing things through the 
eyes of a new user. This is why mysql became so popular for a while. 
Comments like "just learn it" are unproductive and push new users away.


And we wonder why we're having trouble attracting new developers...

This has actually been discussed recently on -hackers as well[1], and 
there is some general consensus that simplification in this area would 
be a good idea.


1: 
https://www.postgresql.org/message-id/20160826202911.GA320593@alvherre.pgsql

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Checking Postgres Streaming replication delay

2016-11-02 Thread Jim Nasby

On 10/31/16 3:39 PM, Patrick B wrote:

(
  extract(epoch FROM now()) -
  extract(epoch FROM pg_last_xact_replay_timestamp())
)::int lag


You could certainly simplify it though...

extract(epoch FROM now()-pg_last_xact_replay_timestamp())
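
For example, a monitoring query along those lines (a sketch; it returns
NULL when run on a primary):

SELECT CASE WHEN pg_is_in_recovery() THEN
           extract(epoch FROM now() - pg_last_xact_replay_timestamp())::int
       END AS lag_seconds;

Note that pg_last_xact_replay_timestamp() only advances when there is
something to replay, so a standby behind an idle primary will look
increasingly lagged.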
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-24 Thread Jim Nasby

On 9/12/16 3:24 PM, Jeff Janes wrote:

The man page you quote seems clear to me that setting it to 1, rather
than leaving it at 0, makes the opportunity for corruption wider, not
narrower.


Yeah, I actually read it backwards. :/ I don't see how --modify-window 
is helpful at all here; you need to use --ignore-times.



I thought that David's "-1" suggestion was tongue in cheek. But it
turns out that it actually does work. Of course, it works by forcing
every file to be copied, which removes the point of using this over
pg_basebackup, but it would nonetheless preserve the integrity of the data.


AFAIK pg_basebackup blindly copies all data files, while rsync will 
transfer only the parts of the files that have actually changed (see 
--block-size). If the source and destination are on different servers, 
that can mean less data transferred over the network.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] overwrite column data select - Postgres 9.2

2016-09-21 Thread Jim Nasby

On 9/21/16 1:50 PM, Steve Petrie, P.Eng. wrote:


The reason I ask is -- the maximum length of a valid email address is
actually 256 characters (or 254, according to comments in the PHP function
is_valid_email_address(...) that I found on the Internet at
http://code.iamcal.com/ and use myself).

In my own PG DDL, I define email addresses like:

 contact_email_addr varchar(256) NOT NULL,


FWIW, I stay away from varchar limits that are more than a "suggestion". 
Generally speaking it's just not worth limiting to something like 50, 
then the business decides they want 60, then 70, then... I still use 
varchar to ensure the database can't get DOS'd with garbage, but I'll 
just set something like varchar(100). That said, if there's a defined 
limit for email address length, might as well use it...
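
As a sketch of that convention (names hypothetical):

CREATE TABLE contact (
    contact_name       varchar(100) NOT NULL, -- generous "suggestion" limit
    contact_email_addr varchar(254) NOT NULL  -- a defined limit exists, so use it
);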



Disclosure: I'm a PG newbie (and a relative SQL newbie, too), and not
familiar with the DEFAULT ''::... notation in your DDL.


The :: is a cast that was presumably added by Postgres when the default 
was assigned. It's equivalent to DEFAULT ''. I definitely don't like 
defaults like that... if you don't know what the email is then it should 
be NULL. Or to put it another way, having a default set largely defeats 
the purpose of NOT NULL (IMHO).
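
To illustrate (a sketch; table and column names hypothetical), these two
statements store the same default, and Postgres records the cast form in
the catalog:

ALTER TABLE contact ALTER COLUMN contact_email_addr SET DEFAULT '';
ALTER TABLE contact ALTER COLUMN contact_email_addr SET DEFAULT ''::character varying;

Dropping the default and allowing NULL instead makes "unknown" explicit.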

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Duplicate data despite unique constraint

2016-09-10 Thread Jim Nasby

On 9/2/16 8:02 AM, Adrian Klaver wrote:

Best guess is the INDEX on the column is corrupted and needs to be
reindexed:


You should contact AWS support about this; they'd want to know.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?

2016-09-10 Thread Jim Nasby

On 9/8/16 3:15 AM, Nicolas Grilly wrote:

So my question is not to challenge the Postgres way. It's simply to
ask whether there are any known figures that would directly support
or refute his claims. Does Postgres really spend 96% of its time in
thumb-twiddling once the entire database resides in memory?


Alas, I've been unable to find any relevant benchmark. I'm not motivated
enough to install a PostgreSQL and VoltDB and try it for myself :-)


My guess is this is a test scenario that completely favors VoltDB while 
hamstringing Postgres, such as using no transaction durability at all in 
VoltDB while using maximum durability in Postgres. Comparing the cost of 
every COMMIT doing an fsync vs not could certainly produce a 25x 
difference. There could be other cases where you'd get a 25x difference.


You need to be careful of benchmarks from commercial companies. MySQL 
used to tout how fast it was compared to Postgres, using a benchmark it 
created specifically for that purpose that had very little to do with 
the real world. People eventually discovered that as soon as you had a 
concurrent workload Postgres was actually faster.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Trigger is not working for Inserts from the application

2016-09-10 Thread Jim Nasby

On 9/10/16 6:03 AM, Kiran wrote:

If I insert a record from my application using the following code:

db.myschema.cf_question.insert({
    cf_question_type_id: request.payload.type_id,
    cf_question_category_id: request.payload.cat_id,
    lang: request.payload.lang,
    body: request.payload.body
})

The above app code inserts the record in the DB, but the respective trigger
in the database is not triggered, hence the "weighted_tsv" column is empty
for this record.

But if I insert another record via psql, it inserts and
the respective trigger works perfectly.

What could be the problem? Why is the trigger not working when I insert from the
application? Am I doing anything wrong?
Any help would be really appreciated.


Without knowing what that app code is doing it's impossible to know. Try 
turning on logging of all statements (log_statement = ALL) and see what 
queries the app is actually running.
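
For example (a sketch; ALTER SYSTEM needs 9.4+ and a superuser):

ALTER SYSTEM SET log_statement = 'all';
SELECT pg_reload_conf();
-- now run the application insert and check the server log for what it actually sent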

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-10 Thread Jim Nasby

On 9/8/16 3:29 PM, David Gibbons wrote:


Isn't this heading in the wrong direction?   We need to be more
precise than 0 (since 0 is computed off of rounded/truncated time
stamps), not less precise than 0.

Cheers,

Jeff



Hmm, you may be right; reading it four more times for comprehension, it
looks like it should be set to -1, not 1.


Not according to my man page:

   --modify-window
          When comparing two timestamps, rsync treats the timestamps as
          being equal if they differ by no more than the modify-window
          value.  This is normally 0 (for an exact match), but you may
          find it useful to set this to a larger value in some situations.
          In particular, when transferring to or from an MS Windows FAT
          filesystem (which represents times with a 2-second resolution),
          --modify-window=1 is useful (allowing times to differ by up to
          1 second).


--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-07 Thread Jim Nasby

Please include the mailing list in replies...

On 9/7/16 6:10 PM, David Gibbons wrote:

That is NOT safe. The problem is it allows rsync to use mtime alone
to decide that a file is in sync, and that will fail if Postgres
writes to a file in the same second that the first rsync reads from
it (assuming Postgres writes after rsync reads). You need to add the
--checksum flag to rsync (which means it will still have to read
everything that's in /var/lib/pgsql).


The checksum flag as you mention is not performant,


Definitely not. :/


If this is a concern, you're much better off using the --modify-window flag:
When comparing two timestamps, rsync treats the timestamps as being
equal if they differ by no more than the modify-window value. This is
normally 0 (for an exact match), but you may find it useful to set this
to a larger value in some situations.

Hence, rsync -va --modify-window=1 would remove your concern about a
same-second race condition without forcing the sync to read through all
the files.


Very interesting and useful!
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?

2016-09-07 Thread Jim Nasby

On 9/2/16 7:39 PM, dandl wrote:

I don't think this is quite true. The mechanism he proposes has a small window 
in which committed transactions can be lost, and this should be addressed by 
replication or by a small amount of UPC (a few seconds).


Except that's the entire point where all those kind of solutions 
*completely* depart ways from Postgres. Postgres is designed to *lose 
absolutely no data after a COMMIT*, potentially including requiring that 
data to be synchronized out to a second server. That is worlds apart 
from "we might lose a few seconds", and there's a lot of stuff Postgres 
has to worry about to accomplish that. Some of that stuff can be 
short-circuited if you don't care (that's what SET synchronous_commit = 
off does), but there's always going to be some amount of extra work to 
support synchronous_commit = local or remote_*.


Presumably there's more improvements that could be made to Postgres in 
this area, but if you really don't care about losing seconds worth of 
data and you need absolutely the best performance possible then maybe 
Postgres isn't the right choice for you.


"All databases suck, each one just sucks in a different way." - Me, 
circa 1999.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] IDE for function/stored proc development.

2016-09-07 Thread Jim Nasby

On 9/3/16 7:49 AM, Tim Uckun wrote:

I was hoping there was some IDE which made that process seamless.
Something like PgAdmin but better editing features and features like
"find definition" or "find usages" and such.  The jetbrains products
come close but as I said they are buggy and don't work very well with
postgres.


Keep in mind that workflow doesn't work well if you need to deploy to 
production on a regular basis.


The workflow I generally use is sqitch[1] and a thin wrapper that runs 
my unit tests (you do write unit tests for your functions, right? :)). 
Something like:


revert_to=`sqitch tag | tail -n2 | head -n1` # Get second-to-last deployed tag
sqitch rebase -y --onto $revert_to $DB && sqitch rebase -y --onto $revert_to $DB && db/run_test $DB


Normally you won't be re-deploying that much, so that would be pretty 
fast. Note that you'll want to create a separate sqitch migration for 
each object.


[1] http://sqitch.org/
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] How to assemble all fields of (any) view into a string?

2016-09-07 Thread Jim Nasby

On 9/7/16 6:07 PM, Ken Tanzer wrote:

ERROR:  PL/Python functions cannot accept type record


Ugh, yeah... that won't work. plperl might be able to do it, but I 
suspect you're going to be stuck pulling the size info out of 
info_schema or the catalog.


Actually, there is a way you could hack this via plpython; pass the row
in as text as well as the relation (regclass is good for that). You
could then do plpy.execute("SELECT ({}::{}).*".format(row_text,
relation)); that should give you a dict just like Adrian's example did.
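
A sketch of that hack (the function and argument names are hypothetical):

CREATE OR REPLACE FUNCTION row_as_dict(row_text text, rel regclass)
RETURNS text LANGUAGE plpythonu AS $$
# Re-expand the row inside the function, where plpy is available.
row = plpy.execute("SELECT ({}::{}).*".format(
    plpy.quote_literal(row_text), rel))[0]
return repr(row)  # row is a dict of column name -> value
$$;

SELECT row_as_dict(t::text, 't'::regclass) FROM t;  -- t is a hypothetical table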


It would be nice if there was a function that accepted something with a
row descriptor and spat out the details of the descriptor.
http://pgxn.org/dist/colnames/doc/colnames.html comes close; if you know
much about C at all it shouldn't be hard to add a function to that
extension that returned the full details of the row. That and converting
the row to JSON would make it relatively easy to accomplish what you
want in a plpgsql (or maybe even SQL) function.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Clustered index to preserve data locality in a multitenant application?

2016-09-07 Thread Jim Nasby

On 9/6/16 11:21 AM, Nicolas Grilly wrote:

It looks like I can move forward with our migration from MySQL to
PostgreSQL, without worrying about the lack of clustered indexes,
because there are better solutions to keep tenant data contiguous!


First rule of performance tuning: don't. :) There are lots of areas 
where Postgres can be expected to perform better than MySQL, so without 
testing your app you really don't know how it's going to fare.


There's also another option: use a logical replication system (such as 
pg_logical, BDR, londiste or Slony) to maintain at least one replica. 
You can take that replica down to perform maintenance (such as a 
database-wide CLUSTER) as needed, and let replication catch up once you 
bring it back online. That, combined with scripted failover makes a lot 
of database maintenance items far easier, at the cost of having to 
maintain the replication. Depending on your needs, a major benefit to 
this method is it makes major version upgrades very simple: you just 
stand up a new replica on the new version and then failover to it. If 
anything goes wrong, you can fail back to the old version without losing 
any data.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-07 Thread Jim Nasby

On 9/2/16 11:44 AM, David Gibbons wrote:

rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
service postgres stop
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/

The second rsync will only copy the deltas from the first, it still has
to go in and determine what needs to be copied/what changed but the bulk
of it can be prepared/migrated before the actual downtime window.


That is NOT safe. The problem is it allows rsync to use mtime alone to 
decide that a file is in sync, and that will fail if Postgres writes to 
a file in the same second that the first rsync reads from it (assuming 
Postgres writes after rsync reads). You need to add the --checksum flag 
to rsync (which means it will still have to read everything that's in 
/var/lib/pgsql).
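
A sketch of the amended sequence:

rsync -va /var/lib/pgsql/ /var/lib/pgsql2/             # bulk copy while Postgres is up
service postgres stop
rsync -va --checksum /var/lib/pgsql/ /var/lib/pgsql2/  # final pass compares content, not mtime

The final pass still reads every byte on both sides, but only transfers
what actually differs.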

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Privileges on public schema can't be revoked?

2016-09-07 Thread Jim Nasby

On 9/6/16 3:16 PM, Greg Fodor wrote:

It seems that functionality that lets a superuser quickly audit the
privileges for a user (including those granted via PUBLIC) would be
really helpful for diagnosing cases where that user can do something
they shouldn't be allowed to.


That's actually relatively easy to do today; see the has_*_privilege() 
functions.
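
For example, a quick audit sketch (the role name is hypothetical;
privileges granted via PUBLIC are reflected automatically):

SELECT c.relname,
       has_table_privilege('some_user', c.oid, 'SELECT') AS can_select,
       has_table_privilege('some_user', c.oid, 'INSERT') AS can_insert
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r' AND n.nspname = 'public';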


You might also find http://pgxn.org/dist/pg_acl useful.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Materialized view auto refresh

2016-09-07 Thread Jim Nasby

On 9/6/16 9:47 PM, Nguyễn Trần Quốc Vinh wrote:

I'm sorry. Please check it again at
http://it.ued.udn.vn/myprojects/pgTriggerGen/: 
http://it.ued.udn.vn/myprojects/pgTriggerGen/PgSQLMVTriggersGenerator_v2_src.rar.


FWIW, people would be much more likely to find this if you put it on 
GitHub (and it would be far easier for them to submit improvements).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Postgres UPGRADE from 9.2 to 9.4

2016-09-07 Thread Jim Nasby

On 9/6/16 10:53 PM, John R Pierce wrote:

1) setup new streaming slave at new provider using 9.2, wait for
replication to complete and satisfy yourself that the database is
complete and intact.
2) shut down master at old provider (and any other slaves), promote new
provider 9.2 to master.
3) pg_upgrade using --link option to 9.4 on new provider.   if this
fails, restart old master and start over at step 1
4) bring up applications on new database.  if they fail, restart old
master, and start over at step 1
5) bring up new slave(s) on new database.


Or if you want, upgrade on your existing provider first, then set up
streaming replication. But no matter what, pg_upgrade will require some
form of downtime.


You could also use either Slony or londiste to directly migrate from 
Rackspace to an EC2 instance on 9.5; no need for DMS at all. That has 
the added benefit of allowing you to switch to a new version with 
virtually zero downtime (as in seconds if you script it) and allowing 
you to maintain the old version (along with any new data changes) in 
case you need to fail back (without data loss).


Keep in mind that in this case you're really only using DMS as a form of 
replication, so you might be better off just sticking with Postgres tools.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] How to assemble all fields of (any) view into a string?

2016-09-07 Thread Jim Nasby

On 9/7/16 5:32 PM, Ken Tanzer wrote:

SELECT my_cat(ebh_gain,'ebh_gain') FROM ebh_gain;

I know TCL and probably Python and others can work with a record as a
trigger function.  But TCL doesn't seem to accept a record as an
argument.  Could any of the other languages also accomplish this
function? Or some other way? Thanks.


A PL that can accept composite types (such as plpythonu) should be able 
to do this.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] PostgreSQL Database performance

2016-09-07 Thread Jim Nasby

On 9/6/16 2:08 PM, Scott Marlowe wrote:

checkpoint_completion_target = 0.9

Too high of a checkpoint completion target may cause buffers to get
written out more often than needed, but it varies based on load etc.


The odds on that don't seem to be terribly high. Even if that is a
common occurrence, if it's enough to make a difference then you're
already close to the limits of your IO, and if that's true then you
definitely want to spread the checkpoint out over a longer interval.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] a column definition list is required for functions returning "record"

2016-09-07 Thread Jim Nasby

On 9/7/16 2:02 PM, Merlin Moncure wrote:

>> Exactly.  You can pass anonymous record types around today, as long as you
>> don't do anything that requires knowing what their contents are, either in
>> the function or in the calling query:

>
> What I was thinking of is something (like a function) that has explicitly
> defined what the contents of the record are.

We have that already, it's named 'json_each_text'


Apparently you haven't looked at json parse/deparse costs ;P
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] a column definition list is required for functions returning "record"

2016-09-02 Thread Jim Nasby

On 8/29/16 6:28 AM, Tom Lane wrote:

Pavel Stehule <pavel.steh...@gmail.com> writes:

> 2016-08-29 1:59 GMT+02:00 Jim Nasby <jim.na...@bluetreble.com>:

>> It would be nice if there was a way to pass dynamically formed records
>> around, similar to how you can pass the results of row() around. Someone
>> else has actually been asking about this at
>> https://github.com/decibel/pg_lambda/issues/1.

> Probably there is a space to be PLpgSQL more flexible - but there are
> limits - PLpgSQL is black box for SQL engine, and when output is any record
> type, then SQL engine knows zero about returning data structure in
> preprocessing time.

Exactly.  You can pass anonymous record types around today, as long as you
don't do anything that requires knowing what their contents are, either in
the function or in the calling query:


What I was thinking of is something (like a function) that has 
explicitly defined what the contents of the record are.
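
For contrast, a minimal sketch of what passing an anonymous record around
blindly looks like today:

CREATE FUNCTION pass_through(r record) RETURNS record LANGUAGE plpgsql AS $$
BEGIN
    RETURN r;  -- fine: nothing here inspects the record's contents
END $$;

SELECT pass_through(row(1, 'a'::text));         -- works, prints (1,a)
-- SELECT * FROM pass_through(row(1, 'a'::text));
-- ...fails: a column definition list is required for functions returning "record"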

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] LOG: could not fork new process for connection: Cannot allocate memory

2016-08-28 Thread Jim Nasby

On 8/25/16 7:45 PM, Ahsan Ali wrote:

Please don't top-post; it's harder to read.

On Thu, Aug 25, 2016 at 5:29 PM, John R Pierce <pie...@hogranch.com> wrote:
so there were 1818 postgres client processes at the time it couldn't
create a new process.  That's certainly a larger number than I've
ever run.  If I have client software that has lots and lots of idle
connections, I use a connection pooler like pgbouncer, in
transaction mode.


While not the most ideal, people pay way too much attention to large 
connection counts. It's not *that* big a deal.



We have pooling at the application level. However, we never had this
issue before; it started happening in the last couple of days. In the
past we had over 2300 sessions but no issues.


Well, if I'm reading your original post correctly, this is on a server that 
only has 252MB of memory, which is *very* small. Even so, according to 
`free` there's 175MB cached, which should become available as necessary.


While the shared memory settings are an interesting theory, there's 
nothing in 9.3 that would attempt to allocate more shared memory after 
the database is started, so that can't be it.


The only thing I can think of is that someone enabled user quotas on the 
system... though if that was the case I would expect it to apply to all 
the existing backends as well (though, maybe there's some mode where 
that doesn't happen...).


It might also be possible that Postgres is reporting the wrong error... 
ISTR one or two cases in startup code where failure to allocate 
something other than memory (like a socket) could result in a false 
memory error in some pathological cases. If you've got debug symbols you 
could try attaching to the postmaster and setting a breakpoint at 
ereport and then trying to connect. You could then get a backtrace; just 
don't leave the system in that state for long. (There might be a more 
elegant way to do that...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] a column definition list is required for functions returning "record"

2016-08-28 Thread Jim Nasby

On 8/26/16 10:32 AM, Alexander Farber wrote:

Thank you, I was just wondering if there is a simpler way... but ok


It would be nice if there was a way to pass dynamically formed records 
around, similar to how you can pass the results of row() around. Someone 
else has actually been asking about this at 
https://github.com/decibel/pg_lambda/issues/1.


BTW, there's no advantage I can think of to using plpgsql just to return 
the output of a query. You'd be better off using a SQL function instead.


Actually, I guess the plan for the plpgsql version would be cached; I'm 
not sure if the same is true for SQL functions. But you'd probably need 
a more complex query for that to be a win over the lighter weight nature 
of SQL functions.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Any reasons for 'DO' statement not returning result?

2016-08-16 Thread Jim Nasby

On 8/16/16 1:05 PM, Adrian Klaver wrote:

On 08/16/2016 07:54 AM, Jim Nasby wrote:

On 8/14/16 5:13 AM, Xtra Coder wrote:

- ability to switch session language from 'sql' to 'pl/pgsql'


Actually, something I wish I had was the ability to temporarily switch
to an entirely different interpreter (such as ipython), while still
retaining current database connection and context. That would be
especially useful for debugging plpython functions.


What aspects of ipython are you interested in?


First, it's not ipython in particular, just a python environment. What 
would be *really* cool is finding a good way to integrate with Jupyter 
Notebook...


Mostly in being able to deal with plpython code. Right now that's 
difficult because the plpy object is only exposed inside a plpython 
function, and it's a bit painful to get pythonic representations of data 
out of the plpython environment and into a python environment.


Outside of that though, there's still useful things that can be done. I 
think a big part of why people keep asking to add things like IF/THEN to 
psql is because psql is an incredibly useful tool for handling lots of 
SQL statements (either DDL or DML). It's much better at that than any 
other tool I've seen. BUT, there's a limit to what psql or SQL can do. 
String manipulation (for example) pretty much sucks. python (or perl or 
...) are quite good at that stuff though.


It's certainly not terribly hard to run a query or two from python. But 
a dozen? That becomes very tedious very quickly. And if you're trying to 
do this interactively, it just sucks. (No \d et al., no tab completion, 
no \h, etc.)


So what I ultimately wish for is a way to blend these things together as 
needed. What would be truly amazing is if we had a way to pass a 
database connection around to different tools. Do a bunch of SQL stuff 
within psql... now I need to deal with a plpython function; do that in 
Jupyter... now I'm back to SQL, go back to psql.


I wonder how hard it would be to allow psql to expose a local port that 
other stuff could connect to...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] C++ port of Postgres

2016-08-16 Thread Jim Nasby

On 8/16/16 2:52 AM, Gavin Flower wrote:

In both cases, part of the motivation to change from C was to appeal to
new developers - from what I remember of the discussions.


I have moved this discussion over to -hackers. 
(https://www.postgresql.org/message-id/f7682c24-4271-1ff5-d963-053ecb0fc...@bluetreble.com)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Any reasons for 'DO' statement not returning result?

2016-08-16 Thread Jim Nasby

On 8/16/16 11:17 AM, Chris Travers wrote:

I am thinking adding a temporary keyword to functions would make a lot
more sense.


Well, right now that's just syntactic sugar, so I think the only real 
benefit might be visibility (though, really we should be marketing the 
idea that you can create almost *any* object in pg_temp!).
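
For example (a sketch):

CREATE FUNCTION pg_temp.scratch(int) RETURNS int LANGUAGE sql AS 'SELECT $1 * 2';

SELECT pg_temp.scratch(21);  -- 42; temp functions must be called schema-qualified

The function (and anything else created in pg_temp) vanishes when the
session ends.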


What would be a lot more interesting is if creating a temp function 
didn't involve writing an entry to the catalog (something being 
discussed for temp tables right now).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Jsonb extraction very slow

2016-08-16 Thread Jim Nasby

On 8/16/16 10:19 AM, Tom Lane wrote:

[ thinks for a bit... ]  In principle we could have the planner notice
whether there are multiple references to the same Var of a varlena type,
and then cue the executor to do a pre-emptive detoasting of that field
of the input tuple slot.  But it would be hard to avoid introducing some
regressions along with the benefits, I'm afraid.



I suspect that the ExpandedObject stuff makes this even more 
appealing... it would certainly be nice if we only needed to pay the 
expansion cost once (assuming no one dirtied the expanded object). I 
certainly think there's more need for this kind of thing as the use of 
JSON expands.


Perhaps that's part of what Robert was suggesting recently with moving 
datums around the executor instead of tuples.



> Some of these would have been nested ->/->>.

In a chain of functions only the first one would be paying the overhead
we're talking about here; though I'm not sure how efficient the case is
overall in JSONB.


I've since heard that chaining -> is a really bad idea compared to #>, 
which is unfortunate because -> is the normal idiom in other languages 
(and what I suspect everyone will use by default). I've wondered if an 
expanded object version of json might expand only top-level keys 
(and maybe only as needed), and then -> is actually just a pointer to 
the originally expanded data. A chained -> then wouldn't need to 
duplicate everything... and in fact might be able to do its expansion 
in the original object so that subsequent references to that key 
wouldn't need to re-expand it. I don't think the current EO framework 
supports that, but it doesn't seem impossible to add...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Jsonb extraction very slow

2016-08-16 Thread Jim Nasby

On 8/11/16 8:45 AM, Tom Lane wrote:

Jim Nasby <jim.na...@bluetreble.com> writes:

I never dug into why. As Tom posited, decompression might explain the
time to get a single key out. Getting 10 keys instead of just 1 wasn't
10x more expensive, but it was significantly more expensive than just
getting a single key.


What were you doing to "get ten keys out"?  If those were ten separate
JSON operators, they'd likely have done ten separate decompressions.
You'd have saved something by having the TOAST data already fetched into
shared buffers, but it'd still hardly be free.


Multiple -> or ->> operators, but all operating on the same field (which 
I thought would mean a single datum that would end up detoasted?).


Some of these would have been nested ->/->>. In essence, this was a set 
of nested views that ultimately pulled from a single JSONB field.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Any reasons for 'DO' statement not returning result?

2016-08-16 Thread Jim Nasby

On 8/14/16 5:13 AM, Xtra Coder wrote:

- ability to switch session language from 'sql' to 'pl/pgsql'


Actually, something I wish I had was the ability to temporarily switch 
to an entirely different interpreter (such as ipython), while still 
retaining current database connection and context. That would be 
especially useful for debugging plpython functions.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Jsonb extraction very slow

2016-08-11 Thread Jim Nasby

Please CC the list.

On 8/11/16 2:19 AM, hari.prasath wrote:

Actually I've done some testing with this and there is a *significant*
overhead in getting multiple keys from a large document. There's a
significant extra cost for the first key, but there's also a non-trivial
cost for every key after that.


Why does it take some extra cost for the first key and less for keys after
that?
Is there any specific reason for this? If so, please explain.


I never dug into why. As Tom posited, decompression might explain the 
time to get a single key out. Getting 10 keys instead of just 1 wasn't 
10x more expensive, but it was significantly more expensive than just 
getting a single key.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Jsonb extraction very slow

2016-08-10 Thread Jim Nasby

On 8/9/16 9:29 AM, Tom Lane wrote:

"hari.prasath" <hari.pras...@zohocorp.com> writes:

 I am using jsonb for storing key-value pair information (500 keys) and it
was a very big data set with some 10M rows. Whenever I try to extract some
keys (let's say some 10 keys and their values) it's really very slow.
Is this due to jsonb parsing, or will the json be loaded from disk into
memory each time for the 10 keys (mainly if my keys are at the end of the
500; this is very slow)?


It's probably mostly the cost to fetch and decompress the very wide json
field.  jsonb is pretty quick at finding an object key once it's got
the value available to look at.

You could possibly alleviate some of the speed issue by storing the column
uncompressed (see ALTER TABLE ... SET STORAGE EXTERNAL), but that would
bloat your disk space requirements so I'm not really sure it'd be a win.


Actually I've done some testing with this and there is a *significant* 
overhead in getting multiple keys from a large document. There's a 
significant extra cost for the first key, but there's also a non-trivial 
cost for every key after that.


I suspect the issue is the goofy logic used to store key name offsets 
(to improve compression), but I never got around to actually tracing it. 
I suspect there's a win to be had by having both json types use the 
ExpandedObject stuff.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Any reasons for 'DO' statement not returning result?

2016-08-10 Thread Jim Nasby

On 8/8/16 7:25 PM, Xtra Coder wrote:

With some former experience with MsSQL server, where 'complex' script is
executed easily and straightforward without any 'wrapping', like this
dummy-one ...

DECLARE @a int;
DECLARE @b int;
...
select @a + @b as "a+b"

... every time I need to execute some one-time throw-away complex code
in PostgreSQL which returns a rowset I'm disappointed - this has to be
wrapped into a normal 'temp' function which I have to delete all the time
in the current session, thus making an anonymous 'DO' statement useless in
95% of my use-cases.

So ... may someone know good reasons for such inconvenient design of
'DO' statement?


I don't recall why DO was designed that way, but I created 
http://pgxn.org/dist/pg_lambda/ to do what you're looking for. 
Unfortunately it's not quite as convenient as DO, and you also must 
ALWAYS provide at least one correctly typed input (even if it's NULL) so 
the pseudotype will work.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




[GENERAL] plpython.h not installed in 9.4

2016-08-10 Thread Jim Nasby
9.5+ installs plpython.h under include/server. 9.4 apparently doesn't. 
I'm guessing that changed in 9.5? Or am I doing something wrong?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Array value from table as parameter

2016-07-22 Thread Jim Nasby

On 7/22/16 7:31 AM, Charles Clavadetscher wrote:

> I need to pass an array as a parameter in a function, but it is part of a
> trigger that gets a "new" value.
>
> I've tested it like
>
> select function_x(1,55,array[['x'],['y']])
> or
> select function_x(1,55,array[['x','y']])
>
> and it worked.
>
> But if I use
>
> select function_x(1,55,new.situations)
>
> it doesn't work, as the value comes like this: {"x","y"}
> I couldn't find a function that converts {} values back to [] without
> treating it as a string and using replace, which I think is not the ideal
> solution because it may not handle more complex arrays.

Would it help to simply cast the argument to TEXT[]?


That would simply be masking the problem.

'{"x","y"}' is a completely valid representation of an array, 
but not one you should be getting out of code (because there's no reason 
for array_out to toss the extra "s in there). It also doesn't match 
either of your other examples.
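
To see that the quotes are just optional decoration in the literal:

SELECT '{"x","y"}'::text[] = ARRAY['x','y'];  -- true
SELECT ARRAY['x','y']::text;                  -- {x,y}; array_out omits needless quotes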


Please create a stand-alone scenario that demonstrates the problem 
you're seeing. I suspect that in the process of doing that you're going 
to uncover a bug in your code, but if not then we'll have something 
concrete we can look at.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] unique constraint with several null values

2016-07-22 Thread Jim Nasby

On 7/20/16 1:14 PM, Mark Lybarger wrote:

This leads me to think I need to create 2^5 or 32 unique constraints to
handle the various combinations of data that I can store.


Another option would be to create a unique index on a bit varying field 
that sets a bit to true for each field that is NULL, WHERE <field> != 0.


Let me know if you want to go that route, I could probably add that to 
http://pgxn.org/dist/count_nulls/ without much difficulty. Though, 
probably a better way to accomplish that would be to add a function to 
count_nulls that spits out an array of fields that are NULL; you could 
then do a unique index on that WHERE array != array[].


Maybe a less obtuse option would be to use a boolean array. Storage 
would be ~8x larger, but since there should be very few rows I doubt 
that matters.
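
As a sketch of the array idea (table and column names hypothetical): array
comparison in Postgres, unlike plain SQL comparison, treats NULL elements
as equal, so a single expression index covers every NULL combination:

CREATE UNIQUE INDEX t_nullsafe_u
    ON t ((ARRAY[a::text, b::text, c::text, d::text, e::text]));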

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Constraint using a SQL function executed during SELECT

2016-07-19 Thread Jim Nasby

On 7/19/16 7:43 AM, Cyril B. wrote:

Hello,

Is there a way to execute a SELECT on a table having a constraint that
uses a non-working SQL function?

...

ALTER TABLE ONLY t1 ADD CONSTRAINT c EXCLUDE (id WITH =) WHERE ((f(id)
IS NOT TRUE));

...

ERROR:  relation "rename_me.t2" does not exist

...

CONTEXT:  SQL function "f" during inlining


In this example, you should be able to avoid that by setting 
constraint_exclusion=off.
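
That is (a sketch):

SET constraint_exclusion = off;  -- planner stops examining constraints, so f() isn't inlined
SELECT * FROM t1;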

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] Stored procedure version control

2016-07-01 Thread Jim Nasby

On 6/30/16 9:16 AM, Merlin Moncure wrote:

It's not really necessary to create version down scripts.  In five
years of managing complex database environments we've never had to
roll a version back and likely never will; in the event of a disaster
it's probably better to restore from backup anyways.


I'm surprised no one has mentioned http://sqitch.org. It makes it very 
easy to manage migrations, as well as creating downgrade scripts (if you 
use rework, and put each object into its own file).


I do agree that down scripts are pretty over-rated as long as you have 
good test practices (as in, database unit tests). In 9 years in an 
environment where downtime was 6 figures per hour I only had 1 or 2 
deployments that had problems, and never bad enough to consider reverting.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] How safe is pg_basebackup + continuous archiving?

2016-07-01 Thread Jim Nasby

On 6/30/16 8:30 AM, Stephen Frost wrote:

> How can we check for backup corruption in this case? Thank you very much.

There has been some discussion about a specific tool for checking the
checksums throughout the entire system.  I don't know of anyone actively
working on that, unfortunately.


If someone did want that though, it could probably be done as an 
extension. I believe you just have to pull all of each relation into 
shared buffers for the checksums to be verified.
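
A sketch of that approach using the stock pg_prewarm extension (9.4+); any
page with a bad checksum should error as it's read in:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;

SELECT c.oid::regclass AS rel, pg_prewarm(c.oid::regclass, 'buffer') AS blocks
  FROM pg_class c
 WHERE c.relkind IN ('r', 'i', 't', 'm');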

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [GENERAL] [HACKERS] sign function with INTERVAL?

2016-04-13 Thread Jim Nasby

On 4/13/16 1:36 PM, Daniel Lenski wrote:

Hi all,
Is there a good reason why the SIGN() function does not work with the
INTERVAL type? (It is only defined for numeric types.)
(http://www.postgresql.org/docs/9.5/static/functions-math.html)


The only thing that comes to mind is you can get some strange 
circumstances with intervals, like '-1 mons +4 days'. I don't think that 
precludes sign() though.



What I have come up with is this rather inelegant and error-prone case
statement:


How is it error prone?


case when x is null then null when x > interval '0' then +1 when x <
interval '0' then -1 else 0 end

You don't need to handle null explicitly. You could do

SELECT CASE WHEN x > interval '0' THEN 1 WHEN x < interval '0' THEN -1 
WHEN x = interval '0' THEN 0 END


Or, you could do...

CREATE FUNCTION sign(interval) RETURNS int LANGUAGE sql STRICT IMMUTABLE
AS $$
SELECT CASE WHEN $1 > interval '0' THEN 1 WHEN $1 < interval '0' THEN -1
ELSE 0 END
$$;

That works because a STRICT function won't even be called if any of its 
inputs are NULL.
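
With that definition (expected results in comments):

SELECT sign(interval '-3 days');  -- -1
SELECT sign(interval '0');        --  0
SELECT sign(NULL::interval);      -- NULL, because STRICT short-circuits the call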




Is there a more obvious way to do sign(interval)? Would it be
technically difficult to make it "just work"?


Actually, after looking at the code for interval_lt, all that needs to 
happen to add this support is to expose interval_cmp_internal() as a 
strict function. It already does exactly what you want.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[GENERAL] Odd quoting behavior from \set

2016-01-19 Thread Jim Nasby
Is this odd quoting behavior expected? (Note the difference between "'a':1"
and "'b':'a'".)



~@decina.local/53896# \set df 
pd.DataFrame.from_dict([{'a':1,'b':'a'},{'a':2,'b':'b'}])
~@decina.local/53896# \echo :df
pd.DataFrame.from_dict([{a:1,b:'a'},{a:2,b:'b'}])
~@decina.local/53896#


--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Execute commands in single-user mode

2016-01-18 Thread Jim Nasby

On 1/10/16 3:44 PM, Andreas Joseph Krogh wrote:


It might be about time to come up with an extension that's a replacement
for large objects.

What would it take to fund such an extension?


Time and/or money.


It would "have to" support:
- Migrate existing LOs away from pg_largeobject
- Proper driver-support (JDBC)


It would probably be possible to extend JDBC (it looks like that's what 
the current support does).



- Possible to exclude from pg_dump


That'd be up to the extension dump control semantics.


- Support pg_upgrade

>

And -hackers should agree on the goal to ultimately being merged into
core and replace pg_largeobject.


Well, there's a reason I suggested an extension. I think it's very 
unlikely -hackers would want to add another LO format to the database. 
Nowadays, it's generally preferred to do most things as extensions, 
and only incorporate things in the backend that really can't be done 
with an extension.


If this theoretical new replacement for LOs took the world by storm and 
everyone was using it, maybe it'd be a different thing. The xml and JSON 
types are examples of that; they started life as add-ons and were 
eventually pulled in because they became extremely popular.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Unable to build python extension with PGXS

2016-01-15 Thread Jim Nasby

On 1/13/16 3:11 PM, Jim Nasby wrote:

On 1/12/16 10:04 PM, Jim Nasby wrote:

Attempting to build a python extension, I'm getting:

Undefined symbols for architecture x86_64:
   "_PyErr_Clear", referenced from:
   _PLyNdarray_FromDatum in pg_ndarray.o
   _PLyObject_To_ndarray in pg_ndarray.o


In case anyone runs into this in the future, my eventual solution was 
https://github.com/decibel/PandaPost/blob/master/Makefile

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Moving a large DB (> 500GB) to another DB with different locale

2016-01-13 Thread Jim Nasby

On 1/13/16 2:03 AM, Andreas Joseph Krogh wrote:

Seems like pglogical migth be better?
http://2ndquadrant.com/en/resources/pglogical/
It's available for 9.4 also.


It would certainly be faster. It's also less tested than Slony, though, 
as it's fairly new. If it were me, I'd use pglogical.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Changing varchar length by manipulating pg_attribute

2016-01-13 Thread Jim Nasby

On 1/13/16 5:59 AM, Christian Ramseyer wrote:

UPDATE pg_attribute SET atttypmod = 35+4   -- new desired length + 4
WHERE attrelid = 'TABLE1'::regclass
AND attname = 'COL1';


I don't know of any reason that wouldn't work. Note that you might have 
to make the same change to all the views too.



Is this safe to do in Postgres 9.4? Also, best practice seems to be to
use text nowadays, is there even a variant of this that lets me convert


FWIW, I prefer using varchar with a fairly large limit unless the field 
really does need to be unlimited. That protects against bad code or a 
malicious user filling your database with garbage.



a "column from character varying(256)" to "text" without having to
recreate all the nested views?


You could probably change pg_attribute.atttypid to 'text'::regtype. You 
should change atttypmod to -1 at the same time if you do that.
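
A sketch of that, using the same illustrative table/column names as 
above; this is catalog surgery, so try it on a throwaway copy of the 
database first:

UPDATE pg_attribute
SET atttypid = 'text'::regtype,
    atttypmod = -1
WHERE attrelid = 'TABLE1'::regclass
AND attname = 'COL1';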


Obviously you should test all of this thoroughly before doing it in 
production.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Unable to build python extension with PGXS

2016-01-13 Thread Jim Nasby

On 1/12/16 10:04 PM, Jim Nasby wrote:

Attempting to build a python extension, I'm getting:

Undefined symbols for architecture x86_64:
   "_PyErr_Clear", referenced from:
   _PLyNdarray_FromDatum in pg_ndarray.o
   _PLyObject_To_ndarray in pg_ndarray.o
   "_PyImport_ImportModule", referenced from:
   _PLyNdarray_FromDatum in pg_ndarray.o
   _PLyObject_To_ndarray in pg_ndarray.o
   "_PyObject_CallFunction", referenced from:
   _PLyNdarray_FromDatum in pg_ndarray.o
   _PLyObject_To_ndarray in pg_ndarray.o
   "_PyObject_GetAttrString", referenced from:
   _PLyNdarray_FromDatum in pg_ndarray.o
   _PLyObject_To_ndarray in pg_ndarray.o
   "_PyString_AsString", referenced from:
   _PLyObject_To_ndarray in pg_ndarray.o
   "_PyString_FromStringAndSize", referenced from:
   _PLyNdarray_FromDatum in pg_ndarray.o
   "_PyString_Size", referenced from:
   _PLyObject_To_ndarray in pg_ndarray.o
ld: symbol(s) not found for architecture x86_64

I've included $(python_includespec) in my Makefile:

override CPPFLAGS := $(python_includespec) $(CPPFLAGS)

Is there some other magic I need? Do I need to switch to using
MODULE_big or something?


After some Screwing Around(TM), I figured out that the magic trick is to 
add the output of python-config --ldflags to LDFLAGS. I see we have a 
$(perl_embed_ldflags), should there be something similar for python?
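
In Makefile terms that was roughly the following, mirroring the existing 
CPPFLAGS line (a hedged sketch, not the exact Makefile; make sure 
python-config matches the Python that plpython was built against):

override LDFLAGS := $(shell python-config --ldflags) $(LDFLAGS)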

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Moving a large DB (> 500GB) to another DB with different locale

2016-01-13 Thread Jim Nasby

On 1/13/16 2:39 PM, Andreas Joseph Krogh wrote:

Where can I find more info about how to use and configure pg_logical to
replicate a 9.4 DB to 9.5?


http://2ndquadrant.com/en/resources/pglogical/
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Call postgres PL/Python stored function from another PL/Python block.

2016-01-13 Thread Jim Nasby

On 1/13/16 9:47 AM, lodopidolo wrote:

Hello. It is possible to call al PL/Python stored function natively from
another PL/Python function?


Stackoverflow is stupid and won't let me post there, but here's what you 
want:


There is no special capability to call other plpython functions. You 
need to call them as you would any other Postgres function, ie:


do $$
# ...
rv = plpy.execute("SELECT f1()")
t = rv[0]["f1"]
# ...
$$ language plpython3u;

See 
http://www.postgresql.org/docs/9.5/static/plpython-database.html#AEN65599 for 
more information.


--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[GENERAL] Unable to build python extension with PGXS

2016-01-12 Thread Jim Nasby

Attempting to build a python extension, I'm getting:

Undefined symbols for architecture x86_64:
  "_PyErr_Clear", referenced from:
  _PLyNdarray_FromDatum in pg_ndarray.o
  _PLyObject_To_ndarray in pg_ndarray.o
  "_PyImport_ImportModule", referenced from:
  _PLyNdarray_FromDatum in pg_ndarray.o
  _PLyObject_To_ndarray in pg_ndarray.o
  "_PyObject_CallFunction", referenced from:
  _PLyNdarray_FromDatum in pg_ndarray.o
  _PLyObject_To_ndarray in pg_ndarray.o
  "_PyObject_GetAttrString", referenced from:
  _PLyNdarray_FromDatum in pg_ndarray.o
  _PLyObject_To_ndarray in pg_ndarray.o
  "_PyString_AsString", referenced from:
  _PLyObject_To_ndarray in pg_ndarray.o
  "_PyString_FromStringAndSize", referenced from:
  _PLyNdarray_FromDatum in pg_ndarray.o
  "_PyString_Size", referenced from:
  _PLyObject_To_ndarray in pg_ndarray.o
ld: symbol(s) not found for architecture x86_64

I've included $(python_includespec) in my Makefile:

override CPPFLAGS := $(python_includespec) $(CPPFLAGS)

Is there some other magic I need? Do I need to switch to using
MODULE_big or something?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[GENERAL] New hacker item posted

2016-01-11 Thread Jim Nasby
Anyone looking to get their feet wet in the backend code, please take a 
look at 
http://www.postgresql.org/message-id/568f03ef.4070...@bluetreble.com.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Using xmax to detect deleted rows

2016-01-10 Thread Jim Nasby

On 1/8/16 9:02 AM, Alvaro Herrera wrote:

Meel Velliste wrote:

I would like to use the "xmax" column to detect rows that have been
recently deleted. Is it possible to get the deleted row versions with
non-zero xmax to remain visible long enough that I could periodically
check, say once an hour, and still be able to see rows that were deleted
since I last checked?


No.  Maybe you want a trigger that saves the deleted row somewhere (a
separate table perhaps) which you can later inspect and delete again?


If that is what you need then PgQ might be a good solution. But it'll be 
a lot simpler to just do whatever you need to do when the row is 
actually deleted. Just be sure you deal with rollbacks correctly if 
you're doing something external.
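
A minimal sketch of the trigger approach Alvaro describes (table and 
function names illustrative):

CREATE TABLE mytable_deleted (LIKE mytable);

CREATE FUNCTION mytable_log_delete() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO mytable_deleted SELECT OLD.*;
    RETURN OLD;
END;
$$;

CREATE TRIGGER log_delete
    AFTER DELETE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE mytable_log_delete();

Since the insert happens in the same transaction as the DELETE, a 
rollback discards the logged row too, which takes care of the rollback 
concern as long as everything stays inside the database.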

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Trigger function interface

2016-01-10 Thread Jim Nasby

On 1/7/16 6:15 PM, Tatsuo Ishii wrote:

On 1/6/16 7:03 PM, Tatsuo Ishii wrote:

Is it possible to get the parse tree in a C trigger function which is
invoked when DML (INSERT/UPDATE/DELETE against a view) is executed?


Yes, it's in fcinfo->flinfo->fn_expr.


Thanks for the info. But is this the parse tree for the top level
query which involves the trigger?


Hrm, apparently not. fcinfo->context would maybe be helpful, but I'm not 
sure.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Execute commands in single-user mode

2016-01-10 Thread Jim Nasby

On 1/10/16 10:01 AM, Andreas Joseph Krogh wrote:

pg_largeobject being a system-relation does quite make sense to me, but
that's another discussion. I know there has been some discussions in the
past about making it a non system-relation but it never got anywhere AFAIK.


BTW, there are some other issues with large objects, notably their use of 
OIDs. Lots of LOs can lead to OID depletion. There was a thread about 
this recently.


It might be about time to come up with an extension that's a replacement 
for large objects.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] COPY FROM STDIN

2016-01-10 Thread Jim Nasby

On 1/8/16 10:37 AM, Luke Coldiron wrote:

On 1/6/16 9:45 PM, Luke Coldiron wrote:

In the example above I'm not sure if I can use some sub struct of the
SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or
if I need to go about this entirely different. Any advice on the
matter would be much appreciated.


I don't know off-hand. I suggest you look at what psql does to implement

\copy (note the \).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in

Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in
Treble! http://BlueTreble.com

I took a look at the \copy but as far as I can tell this is using the
frontend (client side) of postgresql and I need to be able to do this on the


Oops.


backend (server side). I don't see a way with this either to attach to the
connection that called the c function and execute the copy statement. The
DoCopy in commands/copy.h appears to me to be the server side copy command.
And I think I can get it to work if I can figure out how to attach to the
connection of the calling function and pass in the CopyStmt variable.


That sounds... bad.

I think the way to handle this is to add a new type to CopyDest and 
modify CopyGetData() and CopySendEndOfRow() accordingly.


It might be tempting to use CopyState->filename as the pointer to a 
StringInfoData (StringInfo), but I'm not sure that's a great idea. I 
think it'd be better to add a new field to CopyStateData.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-10 Thread Jim Nasby

On 1/10/16 10:07 AM, Bill Moran wrote:

The fact that Postgres has not needed a CoC up till now is a
testiment to the quality of the people in the community. However,
if Postgres continues to be more popular, the number of people
involved is going to increase. Simply as a factor of statistics,
the project will be forced to deal with some unsavory people at
some point. Having a CoC is laying the foundation to ensure that
dealing with those people involves the least pain possible. It
will always involve_some_  pain, but less is better.

I've done the job of #3 with other groups, and 99% of the time
there was nothing to do. The one incident I had to handle was
terrible, but at least I had some guidance on how to deal with
it.


Bingo.

To me, the CoC is as much about protecting Postgres itself as it is 
about protecting contributors. Haters are going to hate, no matter what 
you do... so how do you remove them and their toxicity as cleanly as 
possible?


BTW, IMHO it was a mistake for the FreeBSD community to try to 
keep things quiet. Sweeping stuff like this under the rug doesn't help 
anyone. The problem is how to publicize things without scaring people 
away from reporting. Also, not allowing your CoC to become a weapon that 
someone can use offensively.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-06 Thread Jim Nasby

On 1/6/16 2:17 AM, Victor Yegorov wrote:

Another very wanted change in the community is mentorship. Personally, I
don't feel confident to ask endless questions I have when looking into
the code, as I understand, that this might be a very basic (for
PostgreSQL hackers) stuff. For me it'd be a great helper, if I could
talk this over (via e-mail or any messenger) with experienced developer.
Reminds me of what we do for the GSoC, where developers volunteer for
mentoring students.
Something similar would be handy in general, perhaps with a web
interface similar to the CommitFest's one.


Please, ask your questions! The Postgres community really is one of the 
most patient and helpful OSS communities out there, and there's plenty 
of people that would be happy to explain things. Questions are also a 
good way to show where things could possibly be better commented/documented.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-06 Thread Jim Nasby

On 1/6/16 9:48 AM, Melvin Davidson wrote:

there is no way it can be enforced. So while I applaud Joshua Drake for
his good intent, I must therefore take the position of saying no
to a CoC


A good CoC is not just a code, it is also a means of enforcement. To 
wit, from the CouchDB CoC[1]:


"If you believe someone is violating this code of conduct, you may reply 
to them and point out this code of conduct. Such messages may be in 
public or in private, whatever is most appropriate. Assume good faith; 
it is more likely that participants are unaware of their bad behaviour 
than that they intentionally try to degrade the quality of the 
discussion. Should there be difficulties in dealing with the situation, 
you may report your compliance issues in confidence to 
priv...@couchdb.apache.org.


"If the violation is in documentation or code, for example inappropriate 
pronoun usage or word choice within official documentation, we ask that 
people report these privately to the project at 
priv...@couchdb.apache.org, and, if they have sufficient ability within 
the project, to resolve or remove the concerning material, being mindful 
of the perspective of the person originally reporting the issue."


Importantly, the code clearly states what is and isn't acceptable, in a 
calm and rational manner, so that when an incident does occur -core or 
whoever else can deal with it much more easily. Vague statements like 
"don't be an ass" are useless for dealing with an actual situation.


(BTW, if your concern on enforcement is about control, not only can 
people be removed from mailing lists and the like, but there actually is 
a Postgres legal entity that could start legal proceedings if it ever 
came to it.)


[1] http://couchdb.apache.org/conduct.html
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-06 Thread Jim Nasby

On 1/6/16 1:36 AM, Pavel Stehule wrote:

The CoC doesn't solve it. We work on mature, stable, pretty complex
code and use C (not JavaScript or Java).  This isn't a hobby project or
a student project.


No, CoC by itself doesn't grow the community. That doesn't mean we
shouldn't have one.

Another weakness we have is the mentality that the only way to
contribute to the community is as a developer. There's tons of other
ways people could help, if we made an effort to engage them.
Infrastructure, website design, documentation, project management (ie:
CF manager), issue tracker wrangler (if we had one), advocacy. There's
probably some others. It wouldn't even take effort from the existing 
community to attract those people; all we'd need to do is decide we 
wanted non-developers to work on that stuff and find some volunteers to 
go find them. But the big thing is, the existing community would have to 
welcome that help. Part of that would mean some changes to how the 
community currently operates, and the community can be very resistant to 
that. (I suspect partly because it pays to be very conservative when 
writing database software... :) )



Taking new developers needs the hard individual work with any
potential developer/student. I see as interesting one point -
PostgreSQL extensibility - the less experienced developer can write
extension, there can be interesting experimental extensions that can
be supported without risk of unstability of core code. Can be nice to
allow to write not only C language extensions. Then the Postgres can
be used on universities and in some startup companies - and it can
increase the number of active developers. My very talented colleague
doesn't write Postgres code due to the C language. He would like to write
a planner in Lisp or Erlang, or to play in those languages. C is a barrier
for younger people.


Agreed. I recently said something to that effect to a few others, using
Python as an example. If you look at the Python source, there are 380 .c
files and 2000 .py files. Postgres has 1200 .c, 2000 .h and only 652
.sql. Since there's 640 .out files most of the .sql is presumably tests.
I'm not suggesting we switch to Python; the point is we could do a
better job of "eating our own dog food". I think it would also be very
interesting if there were add-on frameworks that allowed things like a
planner written in another language (which with the planner hooks might
actually be possible).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Definitive answer: can functions use indexes?

2016-01-06 Thread Jim Nasby

On 1/6/16 5:15 PM, Seamus Abshere wrote:

I've been using Postgres for years ( :heart: ) and I'm still in doubt
about this. Would somebody provide an authoritative, definitive,
narrative answer?

-> Can a function like `LEFT()` use an index?

(Or do I have to find an "equivalent" operator in order to leverage
indexes?)


If you're looking for magic here, there is none.

CREATE INDEX ON a(field);
... WHERE field = LEFT(...) -- can use index
... WHERE LEFT(field,5) = ... -- can NOT use index

CREATE INDEX ON a(LEFT(field,5))
... WHERE field = LEFT(...) -- can NOT use index
... WHERE LEFT(field,5) = ... -- CAN use index
... WHERE LEFT(field,6) = ... -- can NOT use index
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Definitive answer: can functions use indexes?

2016-01-06 Thread Jim Nasby

On 1/6/16 5:41 PM, Tom Lane wrote:

Since the question makes little sense as stated, I'm going to assume
you mean "can a query like SELECT ... WHERE left(foo, 3) = 'bar'
use an index on column foo?"

The answer to that is no, there is no such optimization built into
Postgres.  (In principle there could be, but I've not heard enough
requests to make me think we'd ever pursue it.)


BTW, the case where this would be highly valuable is timestamps. Being 
able to have something like date_part('month', some_timestamptz) = 1 use 
an index would be a big, big deal for warehousing.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY FROM STDIN

2016-01-06 Thread Jim Nasby

On 1/6/16 9:45 PM, Luke Coldiron wrote:

In the example above I'm not sure if I can use some sub struct of the
SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I
need to go about this entirely different. Any advice on the matter would be
much appreciated.


I don't know off-hand. I suggest you look at what psql does to implement 
\copy (note the \).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Trigger function interface

2016-01-06 Thread Jim Nasby

On 1/6/16 7:03 PM, Tatsuo Ishii wrote:

Is it possible to get the parse tree in a C trigger function which is
invoked when DML (INSERT/UPDATE/DELETE against a view) is executed?


Yes, it's in fcinfo->flinfo->fn_expr.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Efficiently selecting single row from a select with window functions row_number, lag and lead

2016-01-05 Thread Jim Nasby

On 1/1/16 9:39 PM, Andrew Bailey wrote:

select id, row_number() over w as rownum, lag(id, 1) over w as prev,
lead(id, 1) over w as next from route where id=1350 window w as (order
by shortname, id asc rows between 1 preceding and 1 following) order by
shortname, id ;

However this gives the result
1350;1;;

The following query gives the result I am expecting

select * from (select id, row_number() over w as rownum,
lag(id, 1) over w as prev, lead(id, 1) over w as next
from route window w as (order by shortname, id
rows between 1 preceding and 1 following) order by shortname, id) as s
where id=1350

1350;3;1815;1813

The explain plan is
"Subquery Scan on s  (cost=0.14..15.29 rows=1 width=32)"
"  Filter: (s.id <http://s.id> = 1350)"
"  ->  WindowAgg  (cost=0.14..13.51 rows=143 width=12)"
"->  Index Only Scan using route_idx on route  (cost=0.14..10.29
rows=143 width=12)"


I'm pretty sure the issue here is that the WHERE clause is limiting your 
result set before the window can find what you're looking for.


You could probably switch the WHERE in your original query to a HAVING 
and get the same results.


I'm not sure the filter can actually be pushed past the window functions 
to get the result you want. That Index Only Scan could still be pulling 
every row in the table.


BTW, if you switch the order by to id, shortname then it might be able 
to use the index, but of course the results would be different.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-05 Thread Jim Nasby

On 1/5/16 10:03 PM, John R Pierce wrote:

On 1/5/2016 5:31 PM, Jim Nasby wrote:

IMHO, the real problem here is not simply a CoC, it is that the
Postgres community doesn't focus on developing the community itself.
The closest we come to "focus" is occasional talk on -hackers about
how we need more developers. There is no formal
discussion/leadership/coordination towards actively building and
strengthening our community. Until that changes, I fear we will always
have a lack of developers. More importantly, we will continue to lack
all the other ways that people could contribute beyond writing code.
IE: the talk shouldn't be about needing more developers, it should be
about needing people who want to contribute time to growing the
community.



That sounds like a bunch of modern marketing graduate mumbo-jumbo to
me. The Postgres community are the people who actually support it on
the email lists and IRC, as well as the core development teams, and
IMO they are quite strong and effective. When you start talking
about social marketing and Facebook and Twitter and stuff, that's just a
bunch of feel-good smoke and mirrors. The project's output is what
supports it, not having people going out 'growing community'; that is
just a bunch of hot air. You actively 'grow community' when you're
pushing worthless products (soda pop, etc.) based on slick marketing
plans rather than actually selling something useful.


Then why is it that there is almost no contribution to the community 
other than code and mailing list discussion?


Why is the infrastructure team composed entirely of highly experienced 
code contributors, of which there are ~200 on the planet, when there are 
literally 100s of thousands (if not millions) of people out there that 
could do that work (and could probably do it better if it's what they do 
for a living, no offense to the efforts of the infrastructure team).


Why is there a lack of developers? And a serious lack of code reviewers?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Code of Conduct: Is it time?

2016-01-05 Thread Jim Nasby

On 1/5/16 6:32 PM, Scott Marlowe wrote:

I don't think I am a good person to rationalize their reasoning because I
don't like the idea of a CoC. That said, I think a lot of it boils down to
perception, responsibility, accountability and the fact that a lot of people
are flat out jerks. I am not talking the ball busting type of jerk but
honest, just not nice people or people who vastly lack the ability to
integrate with larger society. Those people tend to need guidelines for
their jerkiness because they will say, "I didn't know I couldn't do/say
XYZ". Whether that is true or not, I have no idea.

CoC:
1: Use our code how you want
2: Don't sue us
3: Don't be a jerk


Well, that highlights that it's not just about a CoC, it's the things 
that surround it. Especially what the conflict resolution policy is.


I suspect JD thought about this because of a recent Facebook thread[1] 
about how the FreeBSD community just screwed this up big-time[2]. The 
big screw-up was not having solid ways to deal with such complaints in 
place. Sadly, as part of that thread, it comes to light that there is 
some history of this in the Postgres project as well.


IMHO, the real problem here is not simply a CoC, it is that the Postgres 
community doesn't focus on developing the community itself. The closest 
we come to "focus" is occasional talk on -hackers about how we need more 
developers. There is no formal discussion/leadership/coordination 
towards actively building and strengthening our community. Until that 
changes, I fear we will always have a lack of developers. More 
importantly, we will continue to lack all the other ways that people 
could contribute beyond writing code. IE: the talk shouldn't be about 
needing more developers, it should be about needing people who want to 
contribute time to growing the community.


I saw a great presentation about building a strong community by Joan 
Touzet of CouchDB. The presentation link is currently down, but there's 
a great interview with her at [3]. CouchDB didn't focus on community 
building until they had a major problem to deal with. Now, they make 
community one of their focal points. Just one example, this is the 3rd 
paragraph on their home page:


"We welcome your contributions. CouchDB is an open source project. 
Everything, from this website to the core of the database itself, has 
been contributed by helpful individuals. The time and attention of our 
contributors is our most precious resource, and we always need more of 
it. Our primary goal is to build a welcoming, supporting, inclusive and 
diverse community. We abide by Code of Conduct and a set of Project 
Bylaws. Come join us!"


What I'd love to see is support and commitment from the Postgres 
community to actively attract people who will focus not on the code but 
on building the community itself. I know there are people in the 
community that would be interested in doing that, but without active 
support and some encouragement things aren't going to change.


[1] https://www.facebook.com/jon.erdman.jr/posts/10153828693183899
[2] 
http://blog.randi.io/2015/12/31/the-developer-formerly-known-as-freebsdgirl/

[3] https://opensource.com/life/15/8/couchdb-community-apache-way
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Cannot upgrade from 9.3 to 9.4 using pg_upgrade

2016-01-05 Thread Jim Nasby

On 1/4/16 7:40 PM, Adrian Klaver wrote:

or even better yet could you post the section of the log above the error?


The server log itself might be useful, especially if full query logging 
was turned on. Dunno how easy/possible that is with pg_upgrade.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Getting the function definition from oid in before firing the function

2016-01-05 Thread Jim Nasby

On 1/5/16 8:13 AM, Mohammed Ajil wrote:

Now my problem consists of two steps:
- First I need to decide if the algorithm supports the trigger type
(only of LANGUAGE PLPGSQL).


plpgsql is NOT the only language that supports triggers.


- Second I need to get the SQL command that the function will execute.

What I tried was the following:
In the file trigger.c I have found the location where the function is
called.


I think triggers are the wrong way to approach this; that approach is 
full of holes (not the least of which is that triggers don't fire on 
SELECT). You'd be much better off with an executor hook.


Have you looked at https://github.com/2ndQuadrant/pgaudit?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] to_timestamp alternatives

2016-01-05 Thread Jim Nasby

On 1/1/16 4:11 AM, Thomas Kellerer wrote:

You only need to provide the text value to be casted (no format mask).
Use only a single expression without those unnecessary parentheses:

 SELECT gmt_date||' '||lpad(gmt_time,8,'0')::timestamp

You don't need the lpad() either:

 SELECT gmt_date||' '||gmt_time::timestamp


I suspect you need to wrap that in (). The :: cast binds more tightly 
than ||, so without parentheses the cast applies only to the last 
operand; and even if the parser did what you wanted, the parentheses 
would make the intent a lot clearer.


SELECT (gmt_date || ' ' || gmt_time)::timestamp
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] COPY FROM STDIN

2016-01-05 Thread Jim Nasby

On 1/4/16 12:18 PM, Luke Coldiron wrote:

Is there a way to achieve the performance of the COPY FROM STDIN command
within a C extension function connected to the db connection that called
the C function? I have text that I would like to receive as input to a C
function that contains many COPY command statements in the file that
would be parsed similar to how psql would handle the file but I don't
want to shell out to psql as I need to do all of this work on the db
connection that the function was called from as there are other commands
that I need to perform as well after before and after handling the COPY
commands on this same connection. I would like the unit of work to be
all or nothing and have the performance of the COPY FROM STDIN command
and not break things out into SELECT INTO or INSERT statements for
performance.

Ideally I would like to be able to attach to the calling db connection
via SPI_connect() and then use the libpq library to issue the copy
commands via PQputCopyData, PQputCopyEnd.


C functions can use SPI, so I'm not sure what the issue is?

http://www.postgresql.org/docs/9.5/static/spi.html

(BTW, you'll want to scroll to the bottom of that page...)
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] to_timestamp alternatives

2016-01-04 Thread Jim Nasby

On 1/1/16 7:15 AM, Alban Hertroys wrote:

Since you're dealing with GPS data and presumably have lat/long, it shouldn't be hard 
to do this dynamically, either by just blindly dividing longitude by 15 or by 
using actual timezone shape polygons and @> or <@.

That would be a bad idea for global data, since not all time zones are full 
hours apart, or have the same (if any) DST change-over dates. For example, 
India is currently at UTC+05:30, probably because they wrapped the entire 
country in the same TZ after their independence.


The original desire was to ignore DST, hence my suggestion. Normally 
you'd want to figure out the correct TZ for the lat/long.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] cannot get stable function to use index

2015-12-31 Thread Jim Nasby

On 12/30/15 2:12 PM, Andy Colson wrote:


random_page_cost = 1



Humm, nope.  I removed the config option, restart PG, then analyzed the
search table:


FYI, you can set that inside any session, any time you want. What's in 
postgresql.conf is just the default value.


(For that matter, you can also set a default for that on a specific 
database, or a specific user, using ALTER DATABASE SET or ALTER USER SET.)
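
For example (database and user names illustrative):

SET random_page_cost = 1;                      -- this session only
ALTER DATABASE mydb SET random_page_cost = 1;  -- default for one database
ALTER USER someuser SET random_page_cost = 1;  -- default for one user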

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Efficiently Triggering Autovacuum Analyze?

2015-12-31 Thread Jim Nasby

On 12/30/15 1:31 PM, Joe Conway wrote:

On 12/30/2015 11:09 AM, Cory Tucker wrote:
With this scenario you can expect an autoanalyze every 5 million rows
and autovacuum every 10 million. In my experience (and based on your
description, yours as well) this is not often enough. Not only that,
when it does run it runs longer than you would like, causing an I/O hit
while it does.

You probably should tune this table specifically, e.g.


Another option is to explicitly analyze then SELECT from the table after 
you're done inserting into it. The advantage is it doesn't tie up an 
autovac worker and you can ensure that the newly added tuples get 
properly hinted.


You can run the ANALYZE immediately after your insert finishes. The 
reason to do that is to get up-to-date statistics for other queries to 
use. That can be particularly important if the new rows have values 
significantly outside what was in the table before. That's common with 
things like sequence IDs and timestamp data.



The SELECT is a bit trickier; you want to ensure that there is no 
transaction still running in the database that's older than the 
transaction that added all the new data. You can check that by comparing 
the xmin field of one of your new rows with 
txid_snapshot_xmin(txid_current_snapshot()). Note that because of 
wraparound you can't do a simple comparison; txid 3 is actually greater 
than txid 2^32.


The whole point of this SELECT is to get the new tuples hinted while the 
pages are still hot in cache. If you don't do that, the next query that 
reads the tuple will have to set the hints, which also dirties the page. 
VACUUM does that too, but there's really no point in having vacuum run 
through the entire table just to set hints on less than 1% of it.
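
A minimal sketch of the whole pattern (table names illustrative; the 
SELECT is only worthwhile once no transaction older than the load is 
still running, per the caveat above):

INSERT INTO big_table SELECT * FROM staging;  -- the bulk load
ANALYZE big_table;                            -- fresh stats immediately
SELECT count(*) FROM big_table;               -- sets hint bits while pages are hot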

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] to_timestamp alternatives

2015-12-31 Thread Jim Nasby

On 12/31/15 5:05 PM, gkhan wrote:

For
example, this fails because of the day-before-month format:


Right, which is why Tom had in his example:

regression=# set datestyle = dmy;

BTW, my recommendation would be to store in a timestamptz field *with 
the correct timezone*, and then convert on output as necessary. This is 
easy to do by either


SET timezone

or

SELECT timestamptz_field AT TIME ZONE '...';

If you want the time without DST, you can just use a timezone like '+8' 
or '-8'.


Since you're dealing with GPS data and presumably have lat/long, it 
shouldn't be hard to do this dynamically, either by just blindly 
dividing longitude by 15 or by using actual timezone shape polygons and 
@> or <@.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] cannot get stable function to use index

2015-12-29 Thread Jim Nasby

On 12/29/15 5:21 PM, David G. Johnston wrote:

Neither STABLE functions nor VOLATILE ones are candidates for indexing.
Only IMMUTABLE ones.  The default for functions is VOLATILE.

I haven't the time to provide a solution to your problem - I'm just
pointing out "cannot get stable function to use index" is working as
designed and as is logically required.  An index must not rely upon
outside information, most typically time, since there exists no
means for an index to update itself based upon changes in the
environment. The only type of function guaranteed to not rely upon
the external environment is an immutable one.  And no, you shouldn't
lie by marking a function immutable to get this to work.  The system
does not check that the stated volatility and the actual
implementation match.


​So while the above is all true I apparently mis-understood your
question... :(


I'm not an expert on text search, but presumably the tsvector knows what 
config was used, which means as long as the config is part of the index 
it should be OK.


Marking to_tsquery_partial as immutable would be wrong and potentially 
dangerous, as you can change default_text_search_config at any time, 
which would change the output of to_tsquery (as well as to_tsvector).
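
For what it's worth, the usual way around that is to pass the 
configuration explicitly; the two-argument forms don't depend on 
default_text_search_config and are therefore immutable:

SELECT to_tsvector('english', 'some document text');
SELECT to_tsquery('english', '213 & E & 13 & ST & N:*');

(The 'english' config is just an example; use whatever the index was 
built with.)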



I'm going to wait for someone thinking more clearly to answer...but it
seems that given an inability to prove that the result of the function
call is meaningfully selective the system would default to choosing a
sequential scan plan over an index.  You happen to choose a value that
only returns a single row but nothing prevents you from picking one that
returns the entire table.  There may be other factors involved as I am
not that familiar with the full text search capabilities of PostgreSQL.


If I'm reading EXPLAIN ANALYZE correctly, to_tsquery_partial is being 
simplified out of the query entirely:


   Filter: (search_vec @@ 
to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) || 
':*'::text)))


Part of this could well be that you're not feeding the same data to 
to_tsquery. Your hard-coded example is


where search_vec @@ to_tsquery('213 & E & 13 & ST & N');

but your second query becomes '213 & E & 13 & ST & N:*'. Have you tried 
that as a hard-coded value?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] efficient math vector operations on arrays

2015-12-29 Thread Jim Nasby

On 12/27/15 2:00 AM, Jony Cohen wrote:

Hi, Don't know if it's exactly what you're looking for but the MADLib
package has utility function for matrix and vector operations.
see: http://doc.madlib.net/latest/group__grp__array.html


Apply an operator to all elements of an array or pair of arrays: 
http://theplateisbad.blogspot.com/2015/12/the-arraymath-extension-vs-plpgsql.html, 
https://github.com/pramsey/pgsql-arraymath.


See also 
http://theplateisbad.blogspot.com/2015/12/more-fortran-90-like-vector-operations.html.


BTW, if you want to simply apply a function to all elements in an array 
there is an internal C function array_map that can do it. There's no SQL 
interface to it, but it shouldn't be hard to add one.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] efficient math vector operations on arrays

2015-12-29 Thread Jim Nasby

On 12/29/15 6:50 PM, Tom Lane wrote:

Jim Nasby<jim.na...@bluetreble.com>  writes:

>BTW, if you want to simply apply a function to all elements in an array
>there is an internal C function array_map that can do it. There's no SQL
>interface to it, but it shouldn't be hard to add one.

That wouldn't be useful for the example given originally, since it
iterates over just one array not two arrays in parallel.  But you could
imagine writing something similar that would iterate over two arrays and
call a two-argument function.


Actually, I suspect you could pretty easily do array_map(regprocedure, 
VARIADIC anyarray).



Whether it's worth a SQL interface is debatable though.  Whatever
efficiency you might gain from using this would probably be eaten by the
overhead of calling a SQL or PL function for each pair of array elements.
You'd probably end up in the same ballpark performance-wise as the UNNEST
solution given earlier.


Take a look at [1]; using a rough equivalent to array_map is 6% faster 
than unnest().


The array op array version is 30% faster than plpgsql, which based on 
the code at [2] I assume is doing

 explain analyze select array(select a*b from unnest(array(select 
random() from generate_series(1,100)), array(select random() from 
generate_series(1,100))) u(a,b));


The syntactic sugar of r := array_map('function(a, b)', in1, in2) (let 
alone r := in1 * in2;) is appealing too.


[1] 
http://theplateisbad.blogspot.com/2015/12/the-arraymath-extension-vs-plpgsql.html
[2] 
http://theplateisbad.blogspot.com/2015/12/more-fortran-90-like-vector-operations.html

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Shared system resources

2015-12-25 Thread Jim Nasby

On 12/23/15 12:05 PM, Melvin Davidson wrote:

As others have pointed out, worrying about someone accessing database
shared memory is like worrying about an asteroid striking the earth and
wiping out all life.
It's a one in a billion chance compared to other security violations
that can occur.
You are better off concentrating on proper O/S security and user/table
permissions. That is how to implement database security!


True, but in my experience security audits have nothing to do with 
security and everything to do with marking off checkboxes and 
complicating lawsuits. ;)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] efficient math vector operations on arrays

2015-12-25 Thread Jim Nasby

On 12/24/15 1:56 AM, Pavel Stehule wrote:

I don't know any extension that calculate euclid distance, but it should
be trivial in C - if you don't need to use generic types and generic
operations.


Before messing around with that, I'd recommend trying either pl/r or 
pl/pythonu.
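
For example, a minimal pl/pythonu sketch (function name illustrative; 
assumes the plpythonu language is installed):

CREATE FUNCTION euclidean_distance(a float8[], b float8[])
RETURNS float8 LANGUAGE plpythonu IMMUTABLE STRICT AS $$
import math
# arrays come in as Python lists
return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
$$;

SELECT euclidean_distance(ARRAY[0,0,0]::float8[], ARRAY[1,2,2]::float8[]);  -- 3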

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Shared system resources

2015-12-23 Thread Jim Nasby

On 12/23/15 7:55 AM, oleg yusim wrote:

Sure David. For simplicity of modeling here, let's assume raw database
data was encrypted and the only possibility for attacker to get
something from raw data is to go and dig into sessions leftovers. Now,
with that has been said, do you happen to know what information actually
gets stored during the session into memory, reserved by session process?
I'm trying to determine, basically, does it even worth a talk - maybe
there is nothing at all valuable.


There's tons of raw data stored in the shared memory segment, and some 
of that can be copied to process local memory at any time. If they OS 
doesn't secure that adequately there's certainly nothing that Postgres 
or any other database can do about it.


As David said, by the time you're concerned about someone getting access 
to raw memory it's already way too late.


As for memory pages being zero'd after they are returned to the OS, 
that's entirely up to the OS. The only thing you could do on the 
Postgres side is to compile with memory debugging enabled, which will 
over-write any memory that's freed with a magic value. That's done to 
help hunt down memory access bugs, but would have the obvious side 
effect of obliterating any data that was in the page.


Uh, the only thing is, I don't know whether that overwrite happens when 
the memory is about to be returned to the OS.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Table with seemingly duplicated primary key values

2015-12-22 Thread Jim Nasby

On 12/22/15 1:31 PM, Melvin Davidson wrote:

The fact that you have rows with an identical id of 2141750 verifies a
corrupted primary index.
To correct it, you need to decide which row to keep.

So review the results of
SELECT * FROM some_table WHERE ctid = '(79664,59)' OR ctid = '(79772,23)';

DELETE FROM some_table
WHERE id = 2141750
 AND field1 = {value for field1 of row to delete}
 AND field2 = {value for field2 of row to delete}
...
...
AND fieldn = {value for fieldn of row to delete};

Then
  REINDEX TABLE some_table;


And most importantly: review your logs for hardware errors and your 
Postgres and filesystem settings for anything dangerous. Index 
corruption is not normal and indicates the underlying hardware or OS is 
faulty (or maybe a bug in Postgres, but that's very unlikely).


You should also consider turning on page checksums if you haven't already.
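
(On these versions checksums can only be enabled when the cluster is 
created, so that means an initdb and a dump/reload, along the lines of:

initdb --data-checksums -D /path/to/new/data/dir

with the path being wherever your new cluster should live.)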
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Shared system resources

2015-12-22 Thread Jim Nasby

On 12/22/15 6:03 PM, oleg yusim wrote:

Absolutely. But we are not talking about that type of data leakage here.
We are talking about the potential situation when a user who doesn't have
access to the database, but has (or gained) access to the Linux box the DB is
installed on, gets his hands on data the database processes keep in
memory (memory would be a common resource here).


Of far larger concern at that point is unauthorized access to the 
database files.


Basically, if someone gains access to the OS user that Postgres is 
running as, or to root, it's game-over.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)

2015-12-22 Thread Jim Nasby

On 12/21/15 8:22 AM, Bill Moran wrote:

Why? Just because a disk isn't enterprise-grade doesn't mean it has to lie
about fsync, which is the only thing diskchecker.pl tests for.

I was thinking that since the disk has a 32M write cache (with no
battery) it would lie to the OS (and Postgres) about when data is really
on disk (not in the disk write cache). But maybe that thinking was wrong.


There are ways to make on-disk write caches safe without a battery. IIRC 
some hard drives would use the inertia of the platter (turning the motor 
into a generator) to write contents out on power-off. You could also use 
a "super cap".



It varies by vendor and product, which is why diskchecker.pl exists.
It's even possible that the behavior is configurable ... check to see
if the vendor provides a utility for configuring it.


Your OS might let you control it too; I know FreeBSD has support for 
this. (Whether the drive obeys or not is a different matter...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Re: Moving Postgresql 9.1 instance from FreeBSD 9.0 machine to FreeBSD 10.2

2015-12-20 Thread Jim Nasby

On 12/20/15 3:43 PM, Amitabh Kant wrote:

Just did a transfer without a hitch on a 1 TB (approx) production
database . Everything is working fine. Saved a ton of time and
complications.


Note that that would not work if there were any binary incompatibilities 
between the two systems; big vs. little endian would be an obvious 
example. Postgres detects the obvious cases of this and refuses to start. 
One thing I'm not sure about is whether different locales installed on 
the two machines would be caught.
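A quick sanity check you can run on both machines (a sketch; matching 
values here don't prove the underlying OS collations behave identically):

SELECT datname, datcollate, datctype FROM pg_database;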

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)

2015-12-20 Thread Jim Nasby

On 12/20/15 1:09 PM, Félix GERZAGUET wrote:

After reading
http://www.postgresql.org/docs/current/static/wal-reliability.html, I
tried the recommended diskchecker.pl
<http://brad.livejournal.com/2116715.html> but I am not satisfied:

I always get:
Total errors: 0

even if I tested with a HGST HTS721010A9E630 that the vendor's
datasheet
(http://www.hgst.com/sites/default/files/resources/TS7K1000_ds.pdf)
advertises as
"Designed for low duty cycle, non mission-critical applications in
PC, nearline and consumer electronics environments, which vary
application to application".

Since it is not a high-end disk, I expect some errors.


Why? Just because a disk isn't enterprise-grade doesn't mean it has to 
lie about fsync, which is the only thing diskchecker.pl tests for.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unique index problem

2015-12-20 Thread Jim Nasby

On 12/20/15 10:18 AM, Andreas Kretschmer wrote:

test=*# create unique index on foo(a,b,c) where a is not null and b is
not null and c is not null;
CREATE INDEX


As you discovered, you'd have to build separate indexes for each of the 
nullable fields:


CREATE UNIQUE INDEX ON foo (a, b) WHERE c IS NULL;
CREATE UNIQUE INDEX ON foo (a, c) WHERE b IS NULL;
CREATE UNIQUE INDEX ON foo (b, c) WHERE a IS NULL;
CREATE UNIQUE INDEX ON foo (a) WHERE b IS NULL AND c IS NULL;
CREATE UNIQUE INDEX ON foo (b) WHERE a IS NULL AND c IS NULL;
CREATE UNIQUE INDEX ON foo (c) WHERE a IS NULL AND b IS NULL;


That's doable for 3 fields, but obviously gets out of hand really quickly.

I wonder if it would be possible to construct an operator family
(class?) that would accept 2 rows (i.e., ROW(a,b,c)) and treat NULLs as
single values...
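
In the meantime, a low-tech approximation (a sketch, valid only if you 
have a sentinel value such as -1 that can never occur in the real data) 
is a single unique index over expressions that map NULL to the sentinel:

CREATE UNIQUE INDEX foo_abc_uniq
    ON foo ((coalesce(a, -1)), (coalesce(b, -1)), (coalesce(c, -1)));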

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] wal receiver process always start after startup process recovering all WALs and need new WAL?

2015-12-15 Thread Jim Nasby

On 12/15/15 2:49 AM, Jov wrote:

I think this behavior with recovery_min_apply_delay is not good, because
if the receiver does not fetch the WAL for a long time (in these cases it
must replay three days' worth of WAL before the WAL receiver starts), the
master will delete the WAL, and the standby will need to be redone.


AFAIK, if you use a replication slot (new in 9.4) that won't happen, 
because the master will know that the replica still hasn't applied the data.


If you don't use replication slots then you need to set 
wal_keep_segments large enough on the master so that it doesn't delete 
segments that are needed. Or configure WAL archiving and let the replica 
replay from the archive.
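
Creating a physical slot on the master is a one-liner (a sketch; the slot 
name is arbitrary, and the standby references it via primary_slot_name in 
its recovery configuration):

SELECT * FROM pg_create_physical_replication_slot('standby1');

Just remember that a slot retains WAL indefinitely if its standby goes 
away for good, so drop abandoned slots with pg_drop_replication_slot().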

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Trigger function, C, lookup of attribute type

2015-12-15 Thread Jim Nasby

On 12/15/15 4:42 AM, Paul wrote:

I'm writing a trigger procedure in C to catch updates to a
table and write them to a log file.


You might find https://github.com/arkhipov/temporal_tables useful.

Though, if you just want to log things to a file, you should check out 
http://pgaudit.org/.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to get the size of JSONB in bytes?

2015-12-15 Thread Jim Nasby

On 12/15/15 8:24 AM, Petr Korobeinikov wrote:

The better approach is to extract your length-validation logic into your
application.


That's really up to interpretation.

The database is the only place the data is stored, and as such is the 
only place that can enforce a constraint for every client.


If you're accepting data from a web form or something you certainly want 
it to also check things, so the user gets immediate feedback. But for 
anything you need to guarantee, you need to use the database.
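
For the original question -- constraining the size of a jsonb value in 
bytes -- a CHECK constraint in the database is enough. A sketch with 
hypothetical names (pg_column_size() returns the number of bytes used to 
store the value):

ALTER TABLE docs
    ADD CONSTRAINT doc_body_size CHECK (pg_column_size(body) < 1048576);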

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-14 Thread Jim Nasby

On 12/9/15 7:05 PM, Andreas Kretschmer wrote:

I'm really new to PostgreSQL but the boss pushed me to handle it
and implement it in production f*&%*$%%$#%$#

Re: [GENERAL] Overhead changing varchar(2000) to text

2015-12-14 Thread Jim Nasby

On 12/9/15 5:43 PM, Edson Richter wrote:

Actually, the biggest change is that I don't have to keep another
constraint between app and database - if I want to increase the user
perceived space, now I just have to change the application (of course,
under the limits).


For what it's worth, I usually put some limit on fields that a webapp 
can write to in the database. That way a bug in the app (or malicious 
action) can't just start allocating gigabytes of stuff in your database.
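
A sketch of that, with hypothetical names; a CHECK on a text column keeps 
the cap in one place and is easy to relax later by dropping and re-adding 
the constraint:

ALTER TABLE user_profile
    ADD CONSTRAINT bio_length CHECK (char_length(bio) <= 20000);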

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature Request: Faceting for full text search

2015-12-14 Thread Jim Nasby

On 12/9/15 11:30 AM, Mirek Svoboda wrote:

Is it OK if I add feature request "faceting for FTS" to roadmap?


If you're asking how to get a feature added to Postgres, you basically 
have 4 options:


1) Convince an existing contributor to work on adding the feature.

2) Convince a large enough group of users that they want the feature and 
get them all to ask for it (large enough to make the developer community 
notice, agree the feature is good, and decide to implement it).


3) Become involved in Postgres development and learn enough to add the 
feature yourself.


4) Pay one of the support companies to develop the feature.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deletion Challenge

2015-12-14 Thread Jim Nasby

On 12/9/15 7:59 PM, Berend Tober wrote:

The issue is that I'd like the application (that is, the data base and
its stored procedures) to be robust enough to be a "long-running"
application, i.e. one that doesn't suffer gradual performance
degradation as time and the accumulated data increase.


In my experience, no such thing exists. This is one of the things that 
makes database development very different than other forms of 
programming. Once you write a program, it's basically always going to 
perform the same. Database performance slowly changes over time, not 
only due to different amounts of data, but also different *composition* 
of data stored.


Of course, it is wise to avoid things that will obviously hurt future 
performance, so it's good that you're thinking about this. But don't 
expect something you can "set and forget". :)



This question was sort of addressed at the "query tuning" aspect, and
I'm confident that partitioning would help.


Keep in mind that partitioning isn't a magic bullet either, though in 
this case I agree it would help. Sometimes something as simple as having 
"active" and "history" partitions is enough.



This project is a game, btw, described at


You might be interested in https://schemaverse.com/
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table with invalid page blocks

2015-12-06 Thread Jim Nasby

On 12/4/15 11:34 AM, Gerdan Rezende dos Santos wrote:

Does someone have a way of identifying all the invalid blocks of a
PostgreSQL table? A PL/pgSQL function, a tool, somehow.


I found one solution at
http://www.postgresql.org/message-id/1184245756.24101.178.ca...@coppola.muc.ecircle.de,
but I cannot change it so that it identifies all defective blocks at once.


If your question is "How can I modify that function to report ALL 
invalid CTIDs?" then you probably need to use a cursor and wrap the 
FETCH in a BEGIN/END block with an exception handler. Something like:


-- Report the boundaries of unreadable CTID ranges in the given table
CREATE OR REPLACE FUNCTION find_bad_ctids(table_name text) RETURNS void AS $$
DECLARE
    curs refcursor;
    rec record;
    last_good tid;
    bad boolean := false;
BEGIN
    OPEN curs NO SCROLL FOR EXECUTE 'SELECT ctid FROM ' || quote_ident(table_name);
    LOOP
        BEGIN
            FETCH curs INTO rec;
            EXIT WHEN NOT FOUND;
            IF bad THEN
                RAISE WARNING 'Next good CTID %', rec.ctid;
                bad := false;
            END IF;
            last_good := rec.ctid;
        EXCEPTION WHEN OTHERS THEN
            RAISE WARNING E'Error %: %\nLast good CTID %', SQLSTATE, SQLERRM, last_good;
            -- An error may leave the cursor unusable, in which case the
            -- next FETCH fails too; two consecutive failures means we are
            -- not advancing, so stop rather than loop forever.
            EXIT WHEN bad;
            bad := true;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] FATAL: unable to read data from DB node 0

2015-12-06 Thread Jim Nasby

On 11/27/15 5:45 PM, DrakoRod wrote:

Hi everybody

I have the following architecture with pgpool (streaming replication mode):

4 nodes

1 Master Node
2 Standby Nodes
1 pgpool Node

I've disabled load balancing because some clients reported problems with
it: they told me the load balancer sometimes sends queries to standby
nodes that have not yet replayed the data, so the queries fail. But this
topic is for another thread.

When I try to run a stress test with hammerdb I see the following errors
on the pgpool node:

2015-11-27 16:48:21: pid 20190: FATAL:  unable to read data from DB node 0
2015-11-27 16:48:21: pid 20190: DETAIL:  EOF encountered with backend
2015-11-27 16:48:21: pid 19182: LOG:  child process with pid: 20190 exits
with status 256
2015-11-27 16:48:21: pid 19182: LOG:  fork a new child process with pid:
20298
2015-11-27 16:48:21: pid 20163: FATAL:  unable to read data from DB node 0
2015-11-27 16:48:21: pid 20163: DETAIL:  EOF encountered with backend
2015-11-27 16:48:21: pid 19182: LOG:  child process with pid: 20163 exits
with status 256
2015-11-27 16:48:21: pid 19182: LOG:  fork a new child process with pid:
20299

This happens in all children, and hammerdb therefore stops the test
because all connections are lost.


I've never run pgPool myself, but maybe this means the connection was 
broken for some reason. Are there related errors in the Postgres server 
log for that node?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-06 Thread Jim Nasby

On 12/4/15 4:15 AM, Peter J. Holzer wrote:

On 2015-12-03 10:02:18 -0500, Tom Lane wrote:

>"Peter J. Holzer"<hjp-pg...@hjp.at>  writes:

> >Can those signals be safely ignored? Just blocking them (so that they
> >are delivered after the UDF finishes) might be safer. But even that may
> >be a problem: If the UDF then executes some SQL, could that rely on
> >signals being delivered? I have no idea.

>
>The minute you start fooling with a backend's signal behavior, we're
>going to politely refuse to support whatever breakage you run into.

As I understood Jim he was talking about possible changes to postgresql
to shield UDFs from those signals, not something the author of a UDF
should do.


No, I was suggesting that:

1) It might be possible for Postgres error handling to detect that a 
signal had been received since entering a UDF if the UDF reported an 
error, and to make that information available as part of the standard 
error report.


2) If there are alarms being set specifically for authentication, they 
should probably be cancelled after authentication is successful.


I certainly do NOT think it would be wise to disable all signal handling 
during UDF execution. Just for starters, that means that if a UDF went 
into an infinite loop your only way to recover would be to PANIC the 
entire database. It would probably create a bunch of other problems as well.


In other words, UDFs *must* be capable of handling an interrupt.
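
That's also what lets an administrator recover from a runaway query with 
a simple cancel instead of a PANIC. A sketch (the filter is only an example):

SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()
  AND now() - query_start > interval '1 hour';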
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Convert from hex to string

2015-12-06 Thread Jim Nasby

On 11/26/15 1:12 AM, Francisco Olarte wrote:

Anyway, I think people should be told to respect types, and people
should be taught that strings are sequences of characters, and that to do
what people think is 'hex encoding' (two digits per *byte*, no
delimiter), you need to first transform the string to bytes, then
hex-encode that.


BTW, it wouldn't be that difficult to create a hex data type that worked 
like bytea but accepted only hex strings. Likewise for a base64 type. 
Only downside is you'd probably have to do it in C, because no other 
language can handle cstring, and I don't think there are any built-in 
conversion functions with the correct parameters.
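
For reference, the built-in functions already support exactly that 
two-step dance for text (a sketch; assumes a UTF8 client encoding):

-- text -> bytea -> hex ('héllo' encodes to 68c3a96c6c6f)
SELECT encode(convert_to('héllo', 'UTF8'), 'hex');
-- ... and back again
SELECT convert_from(decode('68c3a96c6c6f', 'hex'), 'UTF8');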

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JSONB performance enhancement for 9.6

2015-12-03 Thread Jim Nasby

On 11/29/15 9:30 AM, Arthur Silva wrote:

Is this correct? I'm fairly sure jsonb supports lazily parsing objects
and each object level is actually searched using binary search.


The problem is there's no support for loading just part of a TOASTed 
field. Even if that existed, we'd still need a way to know what byte 
position in the TOASTed field a key lived at.


It's possible to add all that, but I think it'd be a serious amount of work.

Since someone else was just wondering about storing more specific types 
in JSON, it might be more useful/interesting to devise a 
Postgres-specific way to store variable schema documents. That would 
give us a lot more flexibility over implementation than the JSON type 
ever will.


Or think about it this way: there's really no great reason why everyone 
chose JSON. There's tons of other serialization storage formats for 
other languages out there, so why not one specialized to Postgres? (And 
of course we'd want to be able to cast from that to JSON and back...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] loading data into cluster - can I daisy-chain streaming replication?

2015-12-03 Thread Jim Nasby

On 12/2/15 9:54 PM, Dennis wrote:

What you can do with your current configuration is a pg_dumpall or pg_dump 
of A, loading that logical dump into B, which would then replicate the data to C.


IF you can't handle the downtime you could also setup londiste or Slony 
and logically replicate A to B.


But (assuming A and B are binary compatible) you'd be better off just 
breaking the B to C replication, setting B up as a replica of A, and 
then setting C up as a replica of B.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

