Re: [GENERAL] Is it possible to create an index without keeping the indexed data in a column?

2014-08-01 Thread Amit Langote
On Fri, Aug 1, 2014 at 2:50 PM, Amit Langote amitlangot...@gmail.com wrote:

 Not sure exactly if it applies here;

Re-reading the OP again, perhaps it doesn't. Sorry about the noise

--
Amit




[GENERAL] Upgrading from 9.0.11 to 9.3.5 on CentOS 6 (64 bit)

2014-08-01 Thread Phoenix Kiula
Hello,

I have Postgresql from a few years ago. That's 9.0.11.

During vacuum it basically grinds to a halt. While googling
for this (it gets stuck on pg_class forever) I saw that Tom Lane suggested
upgrading.

So now I must. In doing so, can I follow these instructions?
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-a-centos-vps

I want to make sure all my data remains exactly as it is, and that the
pgbouncer on top of PG (it helps us a lot) also remains on the same port,
etc. Just want to confirm: whether I update via the RPM method or the
YUM method, will the settings in all those places remain?

Ideally, I don't want to be linking new paths and so on as I see in
online instructions on blogs. Many of them (e.g., the official post
here - 
http://wiki.postgresql.org/wiki/FAQ#What_is_the_upgrade_process_for_PostgreSQL.3F
) also speak of clusters. I don't have any, or is my PG basically
one cluster?

Sorry for the noob question, but it would be great to get some simple
to follow, step by step guidance. MySQL etc are so simple to upgrade!

Many thanks,
PK




Re: [GENERAL] Upgrading from 9.0.11 to 9.3.5 on CentOS 6 (64 bit)

2014-08-01 Thread John R Pierce

On 7/31/2014 11:09 PM, Phoenix Kiula wrote:

I have Postgresql from a few years ago. That's 9.0.11.


you can upgrade to 9.0.18 painlessly.   9.1 or .2 or .3, not quite so 
painless.



During the vacuum it's basically crawling to its knees. While googling
for this (it stops at pg_classes forever) I see Tom Lane suggested
upgrading.


have you tried a vacuum full of the whole cluster, with your 
applications shut down?
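For reference, a minimal sketch of what that involves; VACUUM FULL has to be run in
each database of the cluster, it takes exclusive locks, and it rewrites the tables:

-- run in every database of the cluster, with applications stopped
VACUUM FULL VERBOSE;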





So now I must. In doing so, can I follow these instructions?
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-a-centos-vps


those aren't upgrade instructions, those are first-time install 
instructions.




I want to make sure all my data remains exactly as it is, and the
pgbouncer on top of PG (helps us a lot) also remains on the same port
etc. Just want to confirm that whether I update via the RPM method, or
the YUM method, that the settings in all the places will remain?


you will need to either pg_dumpall your old database 'cluster' and load 
this into the new version, or use pg_upgrade, which is a fair bit 
trickier but can do an in-place upgrade.    if your databases aren't 
much over a few dozen gigabytes, pg_dumpall is probably simpler than 
pg_upgrade.  if your databases are large, a pg_dumpall -> psql restore may 
take a LONG time, so the pg_upgrade process may be more efficient.


since you've never done this before, if you chose to go the pg_upgrade 
route, BACKUP EVERYTHING BEFORE YOU START.  it may take several tries to 
get right.





Ideally, I don't want to be linking new paths and so on as I see in
online instructions on blogs. Many of them (e.g., the official post
here 
-http://wiki.postgresql.org/wiki/FAQ#What_is_the_upgrade_process_for_PostgreSQL.3F
) also speak of clusters. I don't have any, or is my PG basically
one cluster?


in PG terminology, a 'cluster' is the set of databases in a single 
instance of the postgres server, with a single $PGDATA directory. poor 
choice of terms, 'instance' probably would have been more appropriate, 
but it's too late to change.
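For example, the databases making up the one cluster you already have can be listed
from any session:

-- each row is one database in this cluster (i.e. in this instance's $PGDATA)
SELECT datname FROM pg_database;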




Sorry for the noob question, but it would be great to get some simple
to follow, step by step guidance. MySQL etc are so simple to upgrade!


mysql hasn't changed its core data formats in eons.  but try to upgrade 
from MyISAM to InnoDB, good luck.



--
john r pierce  37N 122W
somewhere on the middle of the left coast





[GENERAL] Very Limited Toast Compression on JSONB (9.4 beta 2)

2014-08-01 Thread Jeff Janes
On Thursday, July 31, 2014, Larry White ljw1...@gmail.com wrote:

 Hi,

 I'm running an experiment on 9.4 beta 2.

 I put 275,000 identical JSON files into a table using JSONB (one per row).
  Each raw text file is 251K in size, so the total uncompressed is 69GB. The
 column storage is set to EXTENDED.  There are other toastable columns in
 the table, but none have more than 36 bytes of data in them.

 My Toast table is 66GB. I would have expected to get that much (or more)
 compression just from JSONB being a binary format.

 If I compress one of these JSON files outside of Postgres, it goes from
 251K to 1K.


That is an astonishing amount of compression.  Are you really compressing
one of the files in isolation, rather than co-compressing many and then
pro-rating the result?

Can you provide an example of the data, and the command line you used to
compress it?



 So each file should theoretically fit on a single row in the toast table.
 In total, the amount well under a GB when compressed outside of PG.

 Any guesses as to why there is so little compression of this data or how I
 might remedy the situation?


PostgreSQL's built in tuple compression is generally not very good.  It is
good at compressing long strings of identical bytes, but not good at
compressing the type of thing you are likely to find in JSON (unless your
JSON had long strings of spaces to reflect indentation of deeply nested
structures, which JSON probably wouldn't do and which JSONB certainly
wouldn't).  It was designed to be very fast and to be unencumbered with the
patent issues common at the time it was written.  It was not designed to
give the best possible compression ratios.

It also compresses each row independently.  Most of the compression
opportunities in a column of JSON data would probably be between rows, when
the same keys show up over and over again, not within a row.  But it
can't capture those opportunities.
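As an aside, one way to check whether a stored value was compressed at all is to
compare its on-disk size with its text length; a minimal sketch, with a hypothetical
table "document" and jsonb column "payload":

-- pg_column_size() reports the stored (possibly compressed and toasted) size;
-- octet_length() of the text form is the uncompressed length
SELECT pg_column_size(payload) AS stored_bytes,
       octet_length(payload::text) AS text_bytes
FROM document
LIMIT 1;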

Cheers,

Jeff


Re: [GENERAL] Re: User-defined operator function: what parameter type to use for uncast character string?

2014-08-01 Thread Adam Mackler
On Thu, Jul 31, 2014 at 10:03:00AM -0400, Tom Lane wrote:
 2. text is the preferred type among the string class, so any case where
 you have text on one side and some other string type on the other is
 going to get resolved as text vs text.
 Because of #1, domain-specific functions and operators tend to be pretty
 useless; you find yourself always having to cast the other side to get
 
 If you're intent on having this behavior, the way to go at it is to make
 your own actual datatype (not a domain) and create all your own comparison
 operators for it.  You can add an implicit cast to text for cases where

Thanks to everyone who responded to my question.  Yes, the
case-insensitivity was an example contrivance.  My actual operator
function is using regular expressions to do text replacements, so
citext is not going to solve all my problems.  Looks like 'CREATE
TYPE' is the only way to get what I'm after here, and since there are
regular expressions involved, a C-language solution is not seeming
like it's going to be very convenient.  On top of that, the content of
my regular-expression replacement strings are constructed from data
in my database, so really my SQL-language operator function seemed
ideal except--of course--for the limitations you all have explained to
me.

One final question: the 'CREATE CAST' command got my interest.  I'm
assuming that when the docs say it 'performs a conversion between two
data types,' that the meaning of data type includes only those
created using 'CREATE TYPE' and excludes domains.  If I am mistaken on
that point I would be grateful to learn of that mistake.

Thanks again,
-- 
Adam Mackler




Re: [GENERAL] Very Limited Toast Compression on JSONB (9.4 beta 2)

2014-08-01 Thread Larry White
On Fri, Aug 1, 2014 at 2:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thursday, July 31, 2014, Larry White ljw1...@gmail.com wrote:

  Hi,

 I'm running an experiment on 9.4 beta 2.

 I put 275,000 identical JSON files into a table using JSONB (one per
 row).  Each raw text file is 251K in size, so the total uncompressed is
 69GB. The column storage is set to EXTENDED.  There are other toastable
 columns in the table, but none have more than 36 bytes of data in them.

 My Toast table is 66GB. I would have expected to get that much (or more)
 compression just from JSONB being a binary format.

 If I compress one of these JSON files outside of Postgres, it goes from
 251K to 1K.


 That is an astonishing amount of compression.  Are you really compressing
 one of the files in isolation, rather than co-compressing many and then
 pro-rating the result?


Yes, I should have explained. These are generated JSON files for testing
and there is massive repetition in them, which is why they compress so well
outside of Postgres. (Basically there is a repeating array of the same
string) I did compress just the one.



 Can you provide an example of the data, and the command line you used to
 compress it?


Compressed on a Mac with the Compress UI option. Here's a brief sample from
the file. You can see why it compresses so well:
{\junk\:[\124245etweetwet345gwtretwt43 qwrqwq qwre qw
rsdflkas\,\q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535

and so on.

So each file should theoretically fit on a single row in the toast table.
 In total, the amount well under a GB when compressed outside of PG.

 Any guesses as to why there is so little compression of this data or how
 I might remedy the situation?


 PostgreSQL's built in tuple compression is generally not very good.  It is
 good at compressing long strings of identical bytes, but not good at
 compressing the type of thing you are likely to find in JSON (unless your
 JSON had long strings of spaces to reflect indentation of deeply nested
 structures, which JSON probably wouldn't do and which JSONB certainly
 wouldn't).  It was designed to be very fast and to be unencumbered with the
 patent issues common at the time it was written.  It was not designed to
 give the best possible compression ratios.

 It also compresses each row independently.  Most of the compression
 opportunities in a column of JSON data would probably be between rows, when
 the same keys show up and over and over again, not within a row.  But it
 can't capture those opportunities.


I'm not expecting miracles with real data, but as far as I can tell, there
is zero compression happening. I'm wondering if it is disabled for JSONB
for some reason.



 Cheers,

 Jeff



[GENERAL] Re: Is it possible to create an index without keeping the indexed data in a column?

2014-08-01 Thread David G Johnston
Amit Langote wrote
 On Fri, Aug 1, 2014 at 2:50 PM, Amit Langote lt;

 amitlangote09@

 gt; wrote:

 Not sure exactly if it applies here;
 
 Re-reading the OP again, perhaps it doesn't. Sorry about the noise

This is a functional index which lets you store derived data in the index
without having to also store it in the table.  Mostly useful for stuff that
is only relevant in the context of searching and not something you would
ever return to the user.

The restriction here is that the raw data still needs to be stored in the
table.
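For example, a minimal sketch with a hypothetical table:

CREATE TABLE person (person_id serial PRIMARY KEY, email text);

-- the lower-cased value lives only in the index, not in any table column;
-- functions used in an index expression must be IMMUTABLE (lower() is)
CREATE INDEX person_email_lower_idx ON person (lower(email));

-- a query written with the same expression can use the index
SELECT person_id FROM person WHERE lower(email) = 'foo@example.com';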

David J.






[GENERAL] Re: Is it possible to create an index without keeping the indexed data in a column?

2014-08-01 Thread David G Johnston
larrry wrote
 Hi,
 
 I would like to create a GIN index on a set of JSON documents. Right now
 I'm storing the data in a JSONB column. The current index looks like this:
 
 CREATE INDEX document_payload_idx
   ON document
   USING gin
   (payload jsonb_path_ops);
 
 The index is pretty small, but the actual data takes up a *lot* of space.
 Is there a way to get Postgres to index the table *as if* the JSON were
 there, but not actually put the data in the table? I could either store
 the
 docs elsewhere and keep a reference, or compress them and put them in the
 table in compressed form as a blob.
 
 Thanks much for your help.
 
 Larry

No idea if this works but maybe you can store the compressed data and then
write the index expression like:

USING gin (unzip(payload) jsonb_path_ops)

The unzip function would need to be custom I think...
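For concreteness, a minimal sketch of that idea; unzip is a hypothetical user-defined
function (nothing like it ships with PostgreSQL), and it would have to be declared
IMMUTABLE to be usable in an index expression:

-- payload holds the compressed document as bytea; unzip(bytea) returns jsonb
CREATE TABLE document (doc_id serial PRIMARY KEY, payload bytea);

CREATE INDEX document_payload_idx
  ON document
  USING gin (unzip(payload) jsonb_path_ops);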

David J.






Re: [GENERAL] How to implement a uniqueness constraint across multiple tables?

2014-08-01 Thread Pujol Mathieu


On 31/07/2014 20:38, Kynn Jones wrote:
I want to implement something akin to OO inheritance among DB tables.  
The idea is to define some superclass table, e.g.:


CREATE TABLE super (
super_id INT PRIMARY KEY,
...
-- other columns
);

CREATE TABLE sub_1 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);

CREATE TABLE sub_2 (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);

...

CREATE TABLE sub_n (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id),
...
-- other columns
);

I cribbed this pattern from pp. 92-93 of Bill Karwin's SQL 
Antipatterns: Avoiding the Pitfalls of Database Programming. The 
approach has a weakness, however (which the author does not make 
sufficiently clear): as presented above, it would be possible for 
multiple sub records (each from a different sub_k table) to refer 
to the same super record, and this may not be consistent with the 
semantics of some applications.


Does PostgreSQL have a good way to enforce the uniqueness of super_id 
values across multiple tables?


(BTW, one could use PostgreSQL built-in support for table inheritance 
to implement something very much like the scheme above.  
Unfortunately, as explained in the documentation, there's no built-in 
support yet for enforcing uniqueness across multiple subclass tables.)


Thanks in advance!

kj

PS: I'm sure that the problem described above crops up frequently, and 
that one could find much material about it on the Web, but my online 
searches have been hampered (I think) by my not having adequate search 
keywords for it.  I'd be interested in learning keywords to facilitate 
researching this topic.



Hi,
Maybe you can use inheritance.
CREATE TABLE super (
super_id INT PRIMARY KEY,
...
-- other columns
);

CREATE TABLE sub_template (
super_id INT PRIMARY KEY,
FOREIGN KEY (super_id) REFERENCES super(super_id)
);

CREATE TABLE sub_1 (
-- other columns
) INHERITS (sub_template);

CREATE TABLE sub_2 (
  -- other columns
) INHERITS (sub_template);

So the foreign key constraint will be on sub_template, preventing two 
rows of sub_x from referencing the same foreign key.
This is just an idea; I'll let you check the syntax. 
http://www.postgresql.org/docs/9.3/static/ddl-inherit.html

Regards,
Mathieu




[GENERAL] Re: How to implement a uniqueness constraint across multiple tables?

2014-08-01 Thread David G Johnston
Pujol Mathieu wrote
 On 31/07/2014 20:38, Kynn Jones wrote:
 I want to implement something akin to OO inheritance among DB tables.  
 The idea is to define some superclass table, e.g.:

 (BTW, one could use PostgreSQL built-in support for table inheritance 
 to implement something very much like the scheme above.  
 Unfortunately, as explained in the documentation, there's no built-in 
 support yet for enforcing uniqueness across multiple subclass tables.)

 
 Maybe you can use inheritance.
 
 So the foreign key constraint will be on the sub_template avoiding two 
 row of sub_x to reference the same foreign key.
 This is just an idea I let you check for syntax. 
 http://www.postgresql.org/docs/9.3/static/ddl-inherit.html
 Regards,
 Mathieu

You should read Section 5.8.1 (Caveats) of the page you referenced. Or the
BTW in the OP, which reiterates the salient points.

David J.






[GENERAL] Re: User-defined operator function: what parameter type to use for uncast character string?

2014-08-01 Thread David G Johnston
On Thursday, July 31, 2014, Adam Mackler-5 [via PostgreSQL] 
ml-node+s1045698n5813494...@n5.nabble.com wrote:

 On Thu, Jul 31, 2014 at 10:03:00AM -0400, Tom Lane wrote:
  2. text is the preferred type among the string class, so any case where
  you have text on one side and some other string type on the other is
  going to get resolved as text vs text.
  Because of #1, domain-specific functions and operators tend to be pretty
  useless; you find yourself always having to cast the other side to get
 
  If you're intent on having this behavior, the way to go at it is to make
  your own actual datatype (not a domain) and create all your own
 comparison
  operators for it.  You can add an implicit cast to text for cases where

 Thanks to everyone who responded to my question.  Yes, the
 case-insensitivity was an example contrivance.  My actual operator
 function is using regular expressions to do text replacements, so
 citext is not going to solve all my problems.

Looks like 'CREATE TYPE' is the only way to get what I'm after here, and
 since there are
 regular expressions involved, a C-language solution is not seeming
 like it's going to be very convenient.  On top of that, the content of
 my regular-expression replacement strings are constructed from data
 in my database, so really my SQL-language operator function seemed
 ideal except--of course--for the limitations you all have explained to
 me.


I'm obviously under-informed, but you may wish to step back and consider
whether you are being too clever/novel...

One area you need to evaluate is how what you do interplays with indexing -
if necessary - and performance generally.


 One final question: the 'CREATE CAST' command got my interest.  I'm
 assuming that when the docs say it 'performs a conversion between two
 data types,' that the meaning of data type includes only those
 created using 'CREATE TYPE' and excludes domains.  If I am mistaken on
 that point I would be grateful to learn of that mistake.


 Seems easy enough to test...though since 'value'::domain is valid syntax I
would have to assume your conclusion is wrong.

David J.





Re: [GENERAL] Re: How to implement a uniqueness constraint across multiple tables?

2014-08-01 Thread Pujol Mathieu


On 01/08/2014 09:28, David G Johnston wrote:

Pujol Mathieu wrote

On 31/07/2014 20:38, Kynn Jones wrote:

I want to implement something akin to OO inheritance among DB tables.
The idea is to define some superclass table, e.g.:

(BTW, one could use PostgreSQL built-in support for table inheritance
to implement something very much like the scheme above.
Unfortunately, as explained in the documentation, there's no built-in
support yet for enforcing uniqueness across multiple subclass tables.)



Maybe you can use inheritance.

So the foreign key constraint will be on the sub_template avoiding two
row of sub_x to reference the same foreign key.
This is just an idea I let you check for syntax.
http://www.postgresql.org/docs/9.3/static/ddl-inherit.html
Regards,
Mathieu

You should read Section 5.8.1 (Caveats) of the page your referenced. Or the
BTW in the OP which reiterates the salient points.

David J.







You are right, I didn't see that.
So ignore my previous message. Or maybe it will work in a future release, as the 
documentation suggests.

Mathieu

--
Mathieu PUJOL
Virtual Reality Engineer
REAL FUSIO - 3D Computer Graphics
10, rue des arts - 31000 TOULOUSE - FRANCE
mathieu.pu...@realfusio.com - http://www.realfusio.com





Re: [GENERAL] How to implement a uniqueness constraint across multiple tables?

2014-08-01 Thread Alban Hertroys
On 31 Jul 2014, at 20:38, Kynn Jones kyn...@gmail.com wrote:

 I want to implement something akin to OO inheritance among DB tables.  The 
 idea is to define some superclass table, e.g.:
 
 CREATE TABLE super (
 super_id INT PRIMARY KEY,
 ...
 -- other columns
 );
 
 CREATE TABLE sub_1 (
 super_id INT PRIMARY KEY,
 FOREIGN KEY (super_id) REFERENCES super(super_id),
 ...
 -- other columns
 );
 
 CREATE TABLE sub_2 (
 super_id INT PRIMARY KEY,
 FOREIGN KEY (super_id) REFERENCES super(super_id),
 ...
 -- other columns
 );
 
 ...
 
 CREATE TABLE sub_n (
 super_id INT PRIMARY KEY,
 FOREIGN KEY (super_id) REFERENCES super(super_id),
 ...
 -- other columns
 );
 
 I cribbed this pattern from pp. 92-93 of Bill Karwin's SQL Antipatterns: 
 Avoiding the pitfalls of database programming.  The approach has a weakness, 
 however, (which the author does not make sufficiently clear) and that is 
 that, as presented above, it would be possible for multiple sub records 
 (each from a different sub_k table) to refer to the same super record, 
 and this may not be consistent with the semantics of some applications.
 
 Does PostgreSQL have a good way to enforce the uniqueness of super_id values 
 across multiple tables?

Not in and of itself, but if you change the pattern a little you can have 
uniqueness:

CREATE TABLE super (
super_id INT,
-- Add a type to the PK
type text,
PRIMARY KEY (super_id, type),
...
-- other columns
);

CREATE TABLE sub_1 (
super_id INT,
-- Constrain the records in a sub-table to have a specific type
type text CHECK (type = 'sub_1'),
PRIMARY KEY (super_id, type),
FOREIGN KEY (super_id, type) REFERENCES super(super_id, type),
...
-- other columns
);

etc.

You still won’t have a unique super_id, but the combination of (super_id, type) 
will be unique.

Unfortunately, this approach breaks (again) if you would want to allow for 
multiple inheritance. You could fix that by keeping multiple levels of “type”, 
using multiple type-columns or perhaps an array, but that gets ugly fast.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: [GENERAL] How to implement a uniqueness constraint across multiple tables?

2014-08-01 Thread Kynn Jones
On Thu, Jul 31, 2014 at 3:19 PM, Rob Sargent robjsarg...@gmail.com wrote:


 Wouldn't this be a problem only if new subn() could/would re-use an id?
 if new sub() generates a unique id, there would be no chance of two subn
 entries having the same id.


I'd thought that the ids of the sub_k tables were never generated
independently, but rather they must pre-exist as values of the super_id
column of the super table.  After reading your post though, I'm no longer
sure that this is what Karwin had in mind...  (He does not give any details
at all on how the IDs should be created.)

If I understand you correctly, there should be a business rule
*somewhere* that says that entries in the super table must *always* be
created following these steps:

1. generate a new (unique) ID from a sequence super_seq;
2. insert a new entry in the super table having this (necessarily
unique) ID in its super_id column;
3. insert a new entry in some sub_k table, having this ID in its
super_id column;
4. (somehow) disallow any subsequent updating of the super_id field of
this newly-added sub_k table (although it could still be OK to delete a
record from the super table, and cascade this to the appropriate record in
some sub_k table).

I'm sure this sort of thing could be implemented in PostgreSQL, though I'd
be hard-pressed to fill in the details.  How much of this can be specified
in the definitions (CREATE TABLE ...) of the tables?  For example, (1)
could be taken care of by defining the super_id column of the super table
as a SERIAL.  I imagine that (2) and (3) would have to be encapsulated in a
stored procedure.  Can (4) be implemented in the definitions of the
tables?
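As a sketch of what (4) might look like (not in the CREATE TABLE itself, but as a
per-table trigger that rejects later changes to the key; the cross-table uniqueness
and its race conditions are not addressed here):

CREATE OR REPLACE FUNCTION forbid_super_id_change() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF NEW.super_id <> OLD.super_id THEN
        RAISE EXCEPTION 'super_id may not be changed';
    END IF;
    RETURN NEW;
END;
$$;

-- one such trigger per sub_k table
CREATE TRIGGER sub_1_no_super_id_change
    BEFORE UPDATE OF super_id ON sub_1
    FOR EACH ROW EXECUTE PROCEDURE forbid_super_id_change();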

Thanks!


Re: [GENERAL] How to implement a uniqueness constraint across multiple tables?

2014-08-01 Thread Kynn Jones
On Thu, Jul 31, 2014 at 3:52 PM, Kevin Grittner kgri...@ymail.com wrote:

This goes beyond the capabilities of declarative constraints to
 enforce.  You can enforce it using triggers, but you need to handle
 race conditions, which is not easy with MVCC behavior (where reads
 don't block anything and writes don't block reads).  There are
 basically two ways to cover that:

snip

Thanks for the detailed response.  It gives me much to work with/follow-up
on.


Re: [GENERAL] Re: Is it possible to create an index without keeping the indexed data in a column?

2014-08-01 Thread Larry White
Thank you David and Amit. This is more or less what I was looking for.

I _think_ I might be able to store the data as TEXT, which is highly
compressed by TOAST, and then perhaps write the function in terms of a TEXT
to JSONB conversion.  I will give it a try. It might perform terribly, but
it will be an interesting experiment. :)
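A minimal sketch of that idea, reusing the table and index names from the earlier
example and assuming a small IMMUTABLE wrapper is acceptable (a bare text-to-jsonb
cast may not be usable directly in an index expression):

-- store the document as plain text (which TOAST will compress), index its jsonb form
CREATE TABLE document (doc_id serial PRIMARY KEY, payload text);

CREATE FUNCTION text_to_jsonb(t text) RETURNS jsonb
    LANGUAGE sql IMMUTABLE AS $$ SELECT t::jsonb $$;

CREATE INDEX document_payload_idx
    ON document
    USING gin (text_to_jsonb(payload) jsonb_path_ops);

-- queries must use the same expression to hit the index, e.g.
-- ... WHERE text_to_jsonb(payload) @> '{"key": "value"}'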


On Fri, Aug 1, 2014 at 3:14 AM, David G Johnston david.g.johns...@gmail.com
 wrote:

 larrry wrote
  Hi,
 
  I would like to create a GIN index on a set of JSON documents. Right now
  I'm storing the data in a JSONB column. The current index looks like
 this:
 
  CREATE INDEX document_payload_idx
ON document
USING gin
(payload jsonb_path_ops);
 
  The index is pretty small, but the actual data takes up a *lot* of space.
  Is there a way to get Postgres to index the table *as if* the JSON were
  there, but not actually put the data in the table? I could either store
  the
  docs elsewhere and keep a reference, or compress them and put them in the
  table in compressed form as a blob.
 
  Thanks much for your help.
 
  Larry

 No idea if this works but maybe you can store the compressed data and then
 write the index expression like:

 USING gin (unzip(payload) jsonb_path_ops)

 The unzip function would need to be custom I think...

 David J.







Re: [GENERAL] Very Limited Toast Compression on JSONB (9.4 beta 2)

2014-08-01 Thread Larry White
There is no TOAST compression on JSON or JSONB data in 9.4 beta 2. I'm not
sure about other versions.  I'm also not sure if this is a bug or by
design, but if it is by design, I think the documentation should be
updated.

Here is a summary of my results inserting 10,000 highly compressible JSON
docs of 251K each.

Column Type - Storage  - TOAST table size
JSONB       - EXTERNAL - 2448 MB
JSONB       - EXTENDED - 2448 MB
JSON        - EXTENDED - 2504 MB
TEXT        - EXTERNAL - 2409 MB
TEXT        - EXTENDED -   40 MB
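For reference, the Storage setting above is changed per column with ALTER TABLE; a
minimal sketch, with hypothetical table and column names:

-- EXTENDED (the default for jsonb/json/text) allows compression and out-of-line storage;
-- EXTERNAL allows out-of-line storage but disables compression
ALTER TABLE document ALTER COLUMN payload SET STORAGE EXTENDED;
ALTER TABLE document ALTER COLUMN payload SET STORAGE EXTERNAL;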


On Fri, Aug 1, 2014 at 2:36 AM, Larry White ljw1...@gmail.com wrote:


 On Fri, Aug 1, 2014 at 2:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thursday, July 31, 2014, Larry White ljw1...@gmail.com wrote:

  Hi,

 I'm running an experiment on 9.4 beta 2.

 I put 275,000 identical JSON files into a table using JSONB (one per
 row).  Each raw text file is 251K in size, so the total uncompressed is
 69GB. The column storage is set to EXTENDED.  There are other toastable
 columns in the table, but none have more than 36 bytes of data in them.

 My Toast table is 66GB. I would have expected to get that much (or more)
 compression just from JSONB being a binary format.

 If I compress one of these JSON files outside of Postgres, it goes from
 251K to 1K.


 That is an astonishing amount of compression.  Are you really compressing
 one of the files in isolation, rather than co-compressing many and then
 pro-rating the result?


 Yes, I should have explained. These are generated JSON files for testing
 and there is massive repetition in them, which is why they compress so well
 outside of Postgres. (Basically there is a repeating array of the same
 string) I did compress just the one.



 Can you provide an example of the data, and the command line you used to
 compress it?


 Compressed on a Mac with the Compress UI option. Here's a brief sample
 from the file. You can see why it compresses so well:
 {\junk\:[\124245etweetwet345gwtretwt43 qwrqwq qwre qw
 rsdflkas\,\q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535

 and so on.

  So each file should theoretically fit on a single row in the toast
 table. In total, the amount well under a GB when compressed outside of PG.

 Any guesses as to why there is so little compression of this data or how
 I might remedy the situation?


 PostgreSQL's built in tuple compression is generally not very good.  It
 is good at compressing long strings of identical bytes, but not good at
 compressing the type of thing you are likely to find in JSON (unless your
 JSON had long strings of spaces to reflect indentation of deeply nested
 structures, which JSON probably wouldn't do and which JSONB certainly
 wouldn't).  It was designed to be very fast and to be unencumbered with the
 patent issues common at the time it was written.  It was not designed to
 give the best possible compression ratios.

 It also compresses each row independently.  Most of the compression
 opportunities in a column of JSON data would probably be between rows, when
 the same keys show up and over and over again, not within a row.  But it
 can't capture those opportunities.


 I'm not expecting miracles with real data, but as far as I can tell, there
 is zero compression happening. I'm wondering if it is disabled for JSONB
 for some reason.



 Cheers,

 Jeff





[GENERAL] Feature proposal and discussion: full-fledged column/function equivalence

2014-08-01 Thread Chris Travers
Hi all;

I had a pleasant surprise today when, while demonstrating what I had
considered a misfeature, PostgreSQL behaved unexpectedly.  On further
investigation, there is a really interesting syntax I had not known about,
which is very helpful for some things.

Consider the following:

CREATE TABLE keyvaltest (
key text primary key,
value text not null
);
INSERT INTO keyvaltest VALUES ('foo', 'bar'), ('fooprime', 'barprime');
SELECT value(k) from keyvaltest k;

The latter performs exactly like

SELECT k.value from keyvaltest k;

So the column/function equivalence is there.  This is probably not the best
for production SQL code just because it is non-standard, but it is great
for theoretical examples because it shows the functional dependency between
tuple and tuple member.

It gets better:

CREATE OR REPLACE FUNCTION value(test) returns int language sql as $$
select 3; $$;
ERROR:  value is  already an attribute of type test

So this further suggests that value(test) is effectively an implicit
function of test (because it is a trivial functional dependency).

So with all this in mind, is there any reason why we can't or shouldn't
allow:

CREATE FUNCTION testfunction(test) returns int language sql as $$ select 1; $$;
SELECT testfunction FROM test;

That would allow first-class calculated columns.

I assume the work is mostly at the parser/grammatical level.  Is there any
reason why supporting that would be a bad idea?
-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


[GENERAL] jsonb creation functions?

2014-08-01 Thread Larry White
There is a set of creation functions for json, such as:

 to_json(anyelement)

There don't seem to be any equivalent functions for converting text to
jsonb.

Is there a way to do this?

Thanks.


Re: [GENERAL] Re: User-defined operator function: what parameter type to use for uncast character string?

2014-08-01 Thread Tom Lane
Adam Mackler pgsql-gene...@mackler.org writes:
 One final question: the 'CREATE CAST' command got my interest.  I'm
 assuming that when the docs say it 'performs a conversion between two
 data types,' that the meaning of data type includes only those
 created using 'CREATE TYPE' and excludes domains.  If I am mistaken on
 that point I would be grateful to learn of that mistake.

I wouldn't recommend it.  The expected behavior of up-casting to a domain
from its base type is that the value doesn't change but the domain's
current check constraints are applied.  I think that if you defined
a cast via CREATE CAST it would override that behavior (I've not verified
this though), and it would then be entirely on your head whether the
resulting value was actually a legal member of the domain.
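To illustrate the default behavior described above (value unchanged, domain
constraints applied), a small example with a hypothetical domain:

CREATE DOMAIN us_zip AS text CHECK (VALUE ~ '^[0-9]{5}$');

SELECT '12345'::us_zip;  -- succeeds; the text value itself is unchanged
SELECT '1234'::us_zip;   -- fails the domain's CHECK constraint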

regards, tom lane




Re: [GENERAL] Feature proposal and discussion: full-fledged column/function equivalence

2014-08-01 Thread Vik Fearing
On 08/01/2014 04:57 PM, Chris Travers wrote:
 Hi all;
 
 I had a pleasant surprise today when demonstrating a previous misfeature
 in PostgreSQL behaved unexpectedly.  In further investigation, there is
 a really interesting syntax which is very helpful for some things I had
 not known about.
 
 Consider the following:
 
 CREATE TABLE keyvaltest (
 key text primary key,
 value text not null
 );
 INSERT INTO keyvaltest VALUES ('foo', 'bar'), ('fooprime', 'barprime');
 SELECT value(k) from keyvaltest k;
 
 The latter performs exactly like 
 
 SELECT k.value from keyvaltest k;

Interesting.  I wasn't aware of that.

 So the column/function equivalent is there.  This is probably not the
 best for production SQL code just because it is non-standard, but it is
 great for theoretical examples because it shows the functional
 dependency between tuple and tuple member.
 
 It gets better:
 
 CREATE OR REPLACE FUNCTION value(test) returns int language sql as $$
 select 3; $$;
 ERROR:  value is  already an attribute of type test
 
 So this further suggests that value(test) is effectively an implicit
 function of test (because it is a trivial functional dependency).
 
 So with all this in mind, is there any reason why we can't or shouldn't
 allow:
 
 CREATE testfunction(test) returns int language sql as $$ select 1; $$;
 SELECT testfunction FROM test;
 
 That would allow first-class calculated columns.
 
 I assume the work is mostly at the parser/grammatical level.  Is there
 any reason why supporting that would be a bad idea?

This is already supported since forever.

SELECT test.testfunction FROM test;
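A minimal, self-contained sketch of that (table and function names are hypothetical):

CREATE TABLE test (a int, b int);
INSERT INTO test VALUES (1, 2);

CREATE FUNCTION testfunction(t test) RETURNS int
    LANGUAGE sql AS $$ SELECT t.a + t.b $$;

-- attribute notation and function notation are interchangeable here
SELECT test.testfunction FROM test;   -- 3
SELECT testfunction(test) FROM test;  -- 3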

This link might be of interest to you:
http://momjian.us/main/blogs/pgblog/2013.html#April_10_2013
-- 
Vik




Re: [GENERAL] Feature proposal and discussion: full-fledged column/function equivalence

2014-08-01 Thread Vik Fearing
On 08/01/2014 06:28 PM, Vik Fearing wrote:
 So with all this in mind, is there any reason why we can't or shouldn't
  allow:
  
  CREATE testfunction(test) returns int language sql as $$ select 1; $$;
  SELECT testfunction FROM test;
  
  That would allow first-class calculated columns.
  
  I assume the work is mostly at the parser/grammatical level.  Is there
  any reason why supporting that would be a bad idea?
 This is already supported since forever.
 
 SELECT test.testfunction FROM test;
 
 This link might be of interest to you:
 http://momjian.us/main/blogs/pgblog/2013.html#April_10_2013

Sorry, that's not the link I wanted.  This one is:
http://momjian.us/main/blogs/pgblog/2013.html#April_1_2013

Despite being posted on April 1st, it is not a joke. :)
-- 
Vik




Re: [GENERAL] Very Limited Toast Compression on JSONB (9.4 beta 2)

2014-08-01 Thread Jeff Janes
On Thu, Jul 31, 2014 at 11:36 PM, Larry White ljw1...@gmail.com wrote:


 On Fri, Aug 1, 2014 at 2:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thursday, July 31, 2014, Larry White ljw1...@gmail.com wrote:

  Hi,

 I'm running an experiment on 9.4 beta 2.

 I put 275,000 identical JSON files into a table using JSONB (one per
 row).  Each raw text file is 251K in size, so the total uncompressed is
 69GB. The column storage is set to EXTENDED.  There are other toastable
 columns in the table, but none have more than 36 bytes of data in them.

 My Toast table is 66GB. I would have expected to get that much (or more)
 compression just from JSONB being a binary format.

 If I compress one of these JSON files outside of Postgres, it goes from
 251K to 1K.


 That is an astonishing amount of compression.  Are you really compressing
 one of the files in isolation, rather than co-compressing many and then
 pro-rating the result?


 Yes, I should have explained. These are generated JSON files for testing
 and there is massive repetition in them, which is why they compress so well
 outside of Postgres. (Basically there is a repeating array of the same
 string) I did compress just the one.



 Can you provide an example of the data, and the command line you used to
 compress it?


 Compressed on a Mac with the Compress UI option. Here's a brief sample
 from the file. You can see why it compresses so well:
 {\junk\:[\124245etweetwet345gwtretwt43 qwrqwq qwre qw
 rsdflkas\,\q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535

 and so on.


If I take that example (and cap off the array and hash right after the end
of what you show, and remove the escapes of the double quote marks) then it
does not compress, but only because it is not long enough to trigger the
compression attempts.

If I repeat the array portion 4 more times to make the whole thing long
enough for compression to be used, it compresses nicely.  Not 100-fold (but
then again, neither does bzip2 nor gzip on the data I just described), but
text and json compress 10-fold and jsonb 5-fold.

Cheers,

Jeff


Re: [GENERAL] jsonb creation functions?

2014-08-01 Thread Christoph Moench-Tegeder
Hi,

 There is a set of creation functions for json, such as:
 
  to_json(anyelement)
 
 There doesn't seem to be any equivalent functions for converting text to
 jsonb.
 
 Is there a way to do this?

You can always cast json to jsonb:
test_db=# create table t (a integer primary key, b jsonb);
CREATE TABLE
test_db=# insert into t (a, b) values (1, to_json('a'::text)::jsonb);
INSERT 0 1
test_db=# select * from t;
 a |  b  
---+-----
 1 | "a"
(1 row)

test_db=# insert into t (a, b) values (2, 
to_json('{a,b,c}'::text[])::jsonb);
INSERT 0 1
test_db=# select * from t;
 a |        b        
---+-----------------
 1 | "a"
 2 | ["a", "b", "c"]
(2 rows)
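A value that already contains JSON text can also be cast straight to jsonb,
without going through to_json(); for example:

-- a literal (or a text column) holding JSON can be cast directly
SELECT '{"a": 1, "b": ["x", "y"]}'::jsonb;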

Regards,
Christoph

-- 
Spare Space




Re: [GENERAL] Very Limited Toast Compression on JSONB (9.4 beta 2)

2014-08-01 Thread Larry White
Jeff,

Thank you for your help. This is a Postgres bug, but I don't think I'd have
figured it out without your help.

What is happening is that if PG can, after compression, put the entire
'document' into one row/page in the toast table it does. However, if the
document is too big to fit in one row after compression, it does no
compression at all.  This is why it worked for you, but not for me.

I create my test file (in part) with this loop:

for (int j = 0; j < 110; j++) {
    mediumPayload.getJunk().add("124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas");
    mediumPayload.getJunk().add("q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs sdfsd");
}

if the loop runs 110 times as shown, it compresses.
if the loop runs 111 times, it does not:

With 110 iterations:
Extended   8192 bytes (one page)
External   66 MB

With 111 iterations:
Extended   69 MB
External   69 MB

Hopefully they can fix this before the GA release.


On Fri, Aug 1, 2014 at 12:38 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thu, Jul 31, 2014 at 11:36 PM, Larry White ljw1...@gmail.com wrote:


 On Fri, Aug 1, 2014 at 2:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thursday, July 31, 2014, Larry White ljw1...@gmail.com wrote:

  Hi,

 I'm running an experiment on 9.4 beta 2.

 I put 275,000 identical JSON files into a table using JSONB (one per
 row).  Each raw text file is 251K in size, so the total uncompressed is
 69GB. The column storage is set to EXTENDED.  There are other toastable
 columns in the table, but none have more than 36 bytes of data in them.

 My Toast table is 66GB. I would have expected to get that much (or
 more) compression just from JSONB being a binary format.

 If I compress one of these JSON files outside of Postgres, it goes from
 251K to 1K.


 That is an astonishing amount of compression.  Are you really
 compressing one of the files in isolation, rather than co-compressing many
 and then pro-rating the result?


 Yes, I should have explained. These are generated JSON files for testing
 and there is massive repetition in them, which is why they compress so well
 outside of Postgres. (Basically there is a repeating array of the same
 string) I did compress just the one.



 Can you provide an example of the data, and the command line you used to
 compress it?


 Compressed on a Mac with the Compress UI option. Here's a brief sample
 from the file. You can see why it compresses so well:
 {\junk\:[\124245etweetwet345gwtretwt43 qwrqwq qwre qw
 rsdflkas\,\q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535

 and so on.


 If I take that example (and cap off the array and hash right after the end
 of what you show, and remove the escapes of the double quote marks) then it
 does not compress, but only because it is not long enough to trigger the
 compression attempts.

 If I repeat the array portion 4 more times to make the whole thing long
 enough for compression to be used, it compresses nicely.  Not 100 fold (but
 then again, neither does bzip2 or gzip on the data I just described), but
 text and json compresses 10 fold and jsonb 5 fold.

 Cheers,

 Jeff



Re: [GENERAL] Very Limited Toast Compression on JSONB (9.4 beta 2)

2014-08-01 Thread Larry White
Reported as bug #11109.


On Fri, Aug 1, 2014 at 1:46 PM, Larry White ljw1...@gmail.com wrote:

 Jeff,

 Thank you for your help. This is a Postgres bug, but I don't think I'd
 have figured it out without your help.

 What is happening is that if PG can, after compression, put the entire
 'document' into one row/page in the toast table it does. However, if the
 document is too big to fit in one row after compression, it does no
 compression at all.  This is why it worked for you, but not for me.

 I create my test file (in part) with this loop:

 for (int j = 0; j  110; j++) {
 mediumPayload.getJunk().add(124245etweetwet345gwtretwt43
 qwrqwq qwre qw rsdflkas);
 mediumPayload.getJunk().add(q4535 wqrqwrqwrqw2
 wrqwrqwrq32232w kswe sfasrs sdfsd);
 }

 if the loop runs 110 times as shown, it compresses.
 if the loop runs 111 times, it does not:

 With 110 iterations:
 Extended  8192 bytes  (one page)
 External   66 MB

 With 111 iterations:
 Extended  69 MB
 External69 MB

 Hopefully they can fix this before the GA release.


 On Fri, Aug 1, 2014 at 12:38 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thu, Jul 31, 2014 at 11:36 PM, Larry White ljw1...@gmail.com wrote:


 On Fri, Aug 1, 2014 at 2:20 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thursday, July 31, 2014, Larry White ljw1...@gmail.com wrote:

  Hi,

 I'm running an experiment on 9.4 beta 2.

 I put 275,000 identical JSON files into a table using JSONB (one per
 row).  Each raw text file is 251K in size, so the total uncompressed is
 69GB. The column storage is set to EXTENDED.  There are other toastable
 columns in the table, but none have more than 36 bytes of data in them.

 My Toast table is 66GB. I would have expected to get that much (or
 more) compression just from JSONB being a binary format.

 If I compress one of these JSON files outside of Postgres, it goes
 from 251K to 1K.


 That is an astonishing amount of compression.  Are you really
 compressing one of the files in isolation, rather than co-compressing many
 and then pro-rating the result?


 Yes, I should have explained. These are generated JSON files for testing
 and there is massive repetition in them, which is why they compress so well
 outside of Postgres. (Basically there is a repeating array of the same
 string) I did compress just the one.



 Can you provide an example of the data, and the command line you used
 to compress it?


 Compressed on a Mac with the Compress UI option. Here's a brief sample
 from the file. You can see why it compresses so well:
 {\junk\:[\124245etweetwet345gwtretwt43 qwrqwq qwre qw
 rsdflkas\,\q4535 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535
 wqrqwrqwrqw2 wrqwrqwrq32232w kswe sfasrs
 sdfsd\,\124245etweetwet345gwtretwt43 qwrqwq qwre qw rsdflkas\,\q4535

 and so on.


 If I take that example (and cap off the array and hash right after the
 end of what you show, and remove the escapes of the double quote marks)
 then it does not compress, but only because it is not long enough to
 trigger the compression attempts.

 If I repeat the array portion 4 more times to make the whole thing long
 enough for compression to be used, it compresses nicely.  Not 100 fold (but
 then again, neither does bzip2 or gzip on the data I just described), but
 text and json compresses 10 fold and jsonb 5 fold.

 Cheers,

 Jeff





[GENERAL] Re: Feature proposal and discussion: full-fledged column/function equivalence

2014-08-01 Thread David G Johnston
Vik Fearing wrote
 CREATE testfunction(test) returns int language sql as $$ select 1; $$;
 SELECT testfunction FROM test;
 
 That would allow first-class calculated columns.
 
 I assume the work is mostly at the parser/grammatical level.  Is there
 any reason why supporting that would be a bad idea?
 
 This is already supported since forever.
 
 SELECT test.testfunction FROM test;

More to the point: if you are writing a multiple-relation query and have
testfunction functions defined for at least two of the relations used in
the query, how would the system decide which one to use?

SELECT testfunction FROM test JOIN test_extended USING (test_id)

I guess you could allow for the non-ambiguous cases and error out otherwise
but that seems to be adding quite a bit of complexity for little gain.

David J.





Re: [GENERAL] Re: Feature proposal and discussion: full-fledged column/function equivalence

2014-08-01 Thread Chris Travers
On Fri, Aug 1, 2014 at 12:19 PM, David G Johnston 
david.g.johns...@gmail.com wrote:

 Vik Fearing wrote
  CREATE testfunction(test) returns int language sql as $$ select 1; $$;
  SELECT testfunction FROM test;
 
  That would allow first-class calculated columns.
 
  I assume the work is mostly at the parser/grammatical level.  Is there
  any reason why supporting that would be a bad idea?
 
  This is already supported since forever.
 
  SELECT test.testfunction FROM test;

 More to the point: if you are writing a multiple-relation query and have
 testfunction functions defined for at least two of the relations used in
 the query how would the system decide which one to use?


Same way you do it for columns.  Throw an error that it is ambiguous.



 SELECT testfunction FROM test JOIN test_extended USING (test_id)

 I guess you could allow for the non-ambiguous cases and error out otherwise
 but that seems to be adding quite a bit of complexity for little gain.


Hmm.  As I see it, there is one possible backwards compatibility issue but
it is almost certainly extraordinarily rare.

Suppose in your above example, test_extended has a testfunction attribute
but test has a testfunction function.  In the current codebase, there is no
parsing ambiguity (the attribute wins because the function is ignored), but
we'd have to throw the same error as if the function were an attribute if
we did this.

It doesn't seem terribly logically complicated to do this (since it is a
slight extension to the lookup in the system catalogs), and I am having
trouble imagining that there are many cases where these sorts of functions
are added.

The larger question becomes:

Would it be more useful to have such functions in the select * result, or
to treat them as hidden columns from that?  (I am thinking that can be
decided down the road though if I go through and take this up on -hackers).


 David J.







-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: [GENERAL] Upgrading from 9.0.11 to 9.3.5 on CentOS 6 (64 bit)

2014-08-01 Thread Phoenix Kiula
Thank you John.


 you can upgrade to 9.0.18 painlessly.   9.1 or .2 or .3, not quite so
 painless.


What's the best way to get to 9.0.18, as a start?  Is there a simple
single command I can use? I'm on CentOS 6, 64bit.



 have you tried a vacuum full of the whole cluster, with your applications
 shut down?


Not yet, not with the apps shut down entirely, but in read mode, yes.
No new rows being added. SELECTs have to work as it's a high traffic
website.




 you will need to either pg_dumpall your old database 'cluster' and load this
 into the new version, or use pg_upgrade, which is a fair bit trickier but
 can do an in-place upgrade.if your databases aren't much over a few
 dozen gigabytes, pg_dumpall is probably simpler than pg_upgrade.  if your
 databases are large, pg_dumpall - psql restore may take a LONG time, so the
 pg_upgrade process may be more efficient.



I dread using pg_upgrade after that confirmation of my fears. Our DB is
around 200 GB. Even with pg_upgrade, will I have to once again tinker
with all the conf files and authentication (which is md5 right now),
change my passwords and do template1 database stuff again? I saw the
doc page on the postgresql.org site for pg_upgrade, but it presumes a
lot of things in terms of knowledge. I just have one database
instance or cluster. The 10-15 tables are not complex. But they're
large, as in over a billion rows now. All I need is for the upgrade to
happen automatically, retaining my config and paths and whatnot (or
clear instructions that work, step by step).

Thanks!




Re: [GENERAL] Upgrading from 9.0.11 to 9.3.5 on CentOS 6 (64 bit)

2014-08-01 Thread John R Pierce

On 8/1/2014 6:31 PM, Phoenix Kiula wrote:

What's the best way to get to 9.0.18, as a start?  Is there a simple
single command I can use? I'm on CentOS 6, 64bit.


assuming you installed 9.0 from the yum.postgresql.org repositories, 
then `yum update postgresql90-server` and a restart of the postgresql-9.0 
service should do nicely.


if you installed 9.0 by some other method, then I dunno.




--
john r pierce  37N 122W
somewhere on the middle of the left coast





Re: [GENERAL] Re: Feature proposal and discussion: full-fledged column/function equivalence

2014-08-01 Thread David Johnston
On Fri, Aug 1, 2014 at 6:22 PM, Chris Travers chris.trav...@gmail.com
wrote:

 On Fri, Aug 1, 2014 at 12:19 PM, David G Johnston 
 david.g.johns...@gmail.com wrote:

 Vik Fearing wrote
  CREATE testfunction(test) returns int language sql as $$ select 1; $$;
  SELECT testfunction FROM test;
 
  That would allow first-class calculated columns.
 
  I assume the work is mostly at the parser/grammatical level.  Is there
  any reason why supporting that would be a bad idea?
 
  This is already supported since forever.
 
  SELECT test.testfunction FROM test;

 More to the point: if you are writing a multiple-relation query and have
 testfunction functions defined for at least two of the relations used in
 the query how would the system decide which one to use?


 Same way you do it for columns.  Throw an error that it is ambiguous.



I'd rather approach the first-class issue by being able to say:  ALTER
TABLE test ADD COLUMN testfunction(test) -- maybe with an AS col_alias...

I do not have anything particularly against your proposal but neither do I
find it an overwhelming improvement over testfunction(test) and
test.testfunction - especially when I can encapsulate them behind a VIEW.



 SELECT testfunction FROM test JOIN test_extended USING (test_id)

 I guess you could allow for the non-ambiguous cases and error out
 otherwise
 but that seems to be adding quite a bit of complexity for little gain.


 Hmm.  As I see it, there is one possible backwards compatibility issue but
 it is almost certainly extraordinarily rare.

 Suppose in your above example, test_extended has a testfunction attribute
 but test has a testfunction function.  In the current codebase, there is no
 parsing ambiguity (the attribute wins because the function is ignored), but
 we'd have to throw the same error as if the function were an attribute if
 we did this.

 It doesn't seem terribly logically complicated to do this (since it is a
 slight extension to the lookup in the system catalogs), and I am having
 trouble imagining that there are many cases where these sorts of functions
 are added.

 The larger question becomes:

 Would it be more useful to have such functions in the select * result, or
 to treat them as hidden columns from that?  (I am thinking that can be
 decided down the road though if I go through and take this up on -hackers).




If they truly are first-class members of the table they should probably
appear with SELECT *; otherwise, and this is simply semantics, you are
just adding yet another syntax to remember for invoking a function, since the
user will still have to know said function exists.  I read first class to
mean that the fact the value is derived from a function call is
invisible to the user.  And this then leads back to the idea of
defining a generated column on the actual table or, in the absence of that
capability, living with the fact that updateable views can accomplish many, if not
all, of the same goals today.

David J.