from:"Vitaly Burovoy"

Re: [HACKERS] Fix dumping pre-10 DBs by pg_dump10 if table "name" exists

2017-10-31 Thread Vitaly Burovoy

On 10/31/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Yeah, there are quite a few unqualified casts in pg_dump, but AFAICS
> all the rest are OK because the search_path is just pg_catalog.
>
> But I did find psql's describe.c making a similar mistake :-(.
> Pushed that along with your fix.
>
>   regards, tom lane
>

Oops. I missed it in "describe.c" because I grepped for exact "::name" string.

Thank you very much!

--
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix dumping pre-10 DBs by pg_dump10 if table "name" exists

2017-10-31 Thread Vitaly Burovoy

On 10/31/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> Recently my colleagues found a bug.
>
>> -  "SELECT 'bigint'::name AS 
>> sequence_type, "
>> +  "SELECT 
>> 'bigint'::pg_catalog.name AS sequence_type,
>
> Good catch, but I think we could simplify this by just omitting the cast
> altogether:
>
> -   "SELECT 'bigint'::name AS 
> sequence_type, "
> +   "SELECT 'bigint' AS 
> sequence_type,
>
> pg_dump doesn't particularly care whether the column comes back marked
> as 'name' or 'text' or 'unknown'.
>
>   regards, tom lane

OK, just for convenience I'm attaching your version of the fix.
I left an other "NULL::name AS rolname" at
src/bin/pg_dump/pg_dump.c:2978 because can't check (remoteVersion <
9) it and it is under strict "selectSourceSchema(fout,
"pg_catalog");" schema set.

--
Best regards,
Vitaly Burovoy


0001-Fix-dumping-schema-if-a-table-named-name-exists.ver2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Fix dumping pre-10 DBs by pg_dump10 if table "name" exists

2017-10-31 Thread Vitaly Burovoy

Hello, hackers!

Recently my colleagues found a bug.
They could not migrate from PG9.5 to PG10 due to error during
pg_upgrage (the same as in the "reproduce" part below).
An investigation showed there is a table "name" in the same schema
where the dumped sequence is located and the PG tries to unpack the
literal "bigint" as a composite type and fails.

The bug was introduced by two commits, one of them changes
search_path, the second one introduces the "'bigint'::name"
without specifying schema:
da4d1c0c15ab9afdfeee8bad9a1a9989b6bd59b5 pg_dump: Fix some schema
issues when dumping sequences
2ea5b06c7a7056dca0af1610aadebe608fbcca08 Add CREATE SEQUENCE AS  clause

Steps to reproduce:
$ psql -h pre-10 postgres
Password for user postgres:
psql (11devel, server 9.5.8)
Type "help" for help.

postgres=# create table name(id serial);
CREATE TABLE
postgres=# \q

$ pg_dump -h pre-10 postgres
pg_dump: [archiver (db)] query failed: ERROR:  malformed record
literal: "bigint"
LINE 1: SELECT 'bigint'::name AS sequence_type, start_value, increme...
   ^
DETAIL:  Missing left parenthesis.
pg_dump: [archiver (db)] query was: SELECT 'bigint'::name AS
sequence_type, start_value, increment_by, max_value, min_value,
cache_value, is_cycled FROM name_id_seq


I've implemented a little fix (attached), don't think there is
something to be written to docs and tests.

-- 
Best regards,
Vitaly Burovoy


0001-Fix-dumping-schema-if-a-table-named-name-exists.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-04-23 Thread Vitaly Burovoy

On 4/23/17, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Apr 20, 2017 at 12:05 AM, Vitaly Burovoy
> <vitaly.buro...@gmail.com> wrote:
>> OK. Let's go through it again.
>> IDENTITY is a property of a column. There are no syntax to change any
>> property of any DB object via the "ADD" syntax.
>> Yes, a structure (a sequence) is created. But in fact it cannot be
>> independent from the column at all (I remind you that according to the
>> standard it should be unnamed sequence and there are really no way to
>> do something with it but via the column's DDL).
>
> I agree that ADD is a little odd here, but it doesn't seem terrible.

Yes, it is not terrible, but why do we need to support an odd syntax
if we can use a correct one?
If we leave it as it was committed, we have to support it for years.

> But why do we need it?  Instead of:
>
> ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
> SET GENERATED { ALWAYS | BY DEFAULT }
> DROP IDENTITY [ IF EXISTS ]
>
> Why not just:
>
> SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
> DROP IDENTITY [ IF EXISTS ]
>
> Surely the ALTER TABLE command can tell whether the column is already
> GENERATED, so the first form could make it generated if it's not and
> adjust the ALWAYS/BY DEFAULT property if it is.

I thought exactly that way, but Peter gave an explanation[1].
I had to search a different way because no one joined to the
discussion at that time.
One of reasons from Peter was to make "SET GENERATED" follow the
standard (i.e. raise an error).
I asked whether "IF NOT EXISTS" works for him instead of "ADD GENERATED".
The answer[2] was "It could be done", but "it is very difficult to implement".

So I wonder why the adjustment patch is not wished for being committed.

>> It is even hard to detect which sequence (since they have names) is
>> owned by the column:
>>
>> postgres=# CREATE TABLE xxx(i int generated always as identity, j
>> serial);
>> CREATE TABLE
>> postgres=# \d xxx*
>> Table "public.xxx"
>>  Column |  Type   | Collation | Nullable |Default
>> +-+---+--+
>>  i  | integer |   | not null | generated always as identity
>>  j  | integer |   | not null | nextval('xxx_j_seq'::regclass)
>>
>>  Sequence "public.xxx_i_seq"
>>Column   |  Type   | Value
>> +-+---
>>  last_value | bigint  | 1
>>  log_cnt| bigint  | 0
>>  is_called  | boolean | f
>>
>>  Sequence "public.xxx_j_seq"
>>Column   |  Type   | Value
>> +-+---
>>  last_value | bigint  | 1
>>  log_cnt| bigint  | 0
>>  is_called  | boolean | f
>> Owned by: public.xxx.j
>>
>> I can only guess that "public.xxx_i_seq" is owned by "public.xxx.i",
>> nothing proves that.
>> Whereas for regular sequence there are two evidences ("Default" and "Owned
>> by").
>
> This seems like a psql deficiency that should be fixed.

It was a part of explanation why IDENTITY is a property and therefore
"SET" should be used instead of "ADD".

>> Also the created sequence cannot be deleted (only with the column) or
>> left after the column is deleted.
>
> This does not seem like a problem.  In fact I'd say that's exactly the
> desirable behavior.

Also it is not about a problem, it is a part of explanation.

>> The "[ NOT ] EXISTS" is a common Postgres' syntax extension for
>> creating/updating objects in many places. That's why I think it should
>> be used instead of introducing the new "ADD" syntax which contradicts
>> the users' current experience.
>
> As noted above, I don't understand why we need either ADD or IF NOT
> EXISTS.  Why can't SET just, eh, set the property to the desired
> state?

+1

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

[1] https://postgr.es/m/497c40af-bd7a-5cb3-d028-d91434639...@2ndquadrant.com
[2] https://postgr.es/m/59d8e32a-14de-6f45-c2b0-bb52e4a75...@2ndquadrant.com
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-04-19 Thread Vitaly Burovoy

On 4/18/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 4/7/17 01:26, Vitaly Burovoy wrote:
>> I've implement SET GENERATED ... IF NOT EXISTS. It must be placed
>> before other SET options but fortunately it conforms with the
>> standard.
>> Since that form always changes the sequence behind the column, I
>> decided to explicitly write "[NO] CACHE" in pg_dump.
>>
>> As a plus now it is possible to rename the sequence behind the column
>> by specifying SEQUENCE NAME in SET GENERATED.
>>
>> I hope it is still possible to get rid of the "ADD GENERATED" syntax.
>
> I am still not fond of this change.  There is precedent all over the
> place for having separate commands for creating a structure, changing a
> structure, and removing a structure.  I don't understand what the
> problem with that is.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

OK. Let's go through it again.
IDENTITY is a property of a column. There are no syntax to change any
property of any DB object via the "ADD" syntax.
Yes, a structure (a sequence) is created. But in fact it cannot be
independent from the column at all (I remind you that according to the
standard it should be unnamed sequence and there are really no way to
do something with it but via the column's DDL).
It is even hard to detect which sequence (since they have names) is
owned by the column:

postgres=# CREATE TABLE xxx(i int generated always as identity, j serial);
CREATE TABLE
postgres=# \d xxx*
Table "public.xxx"
 Column |  Type   | Collation | Nullable |Default
+-+---+--+
 i  | integer |   | not null | generated always as identity
 j  | integer |   | not null | nextval('xxx_j_seq'::regclass)

 Sequence "public.xxx_i_seq"
   Column   |  Type   | Value
+-+---
 last_value | bigint  | 1
 log_cnt| bigint  | 0
 is_called  | boolean | f

 Sequence "public.xxx_j_seq"
   Column   |  Type   | Value
+-+---
 last_value | bigint  | 1
 log_cnt| bigint  | 0
 is_called  | boolean | f
Owned by: public.xxx.j

I can only guess that "public.xxx_i_seq" is owned by "public.xxx.i",
nothing proves that.
Whereas for regular sequence there are two evidences ("Default" and "Owned by").

Also the created sequence cannot be deleted (only with the column) or
left after the column is deleted.

Everywhere else the "ADD" syntax is used where you can add more than
one object to the altered one:
ALTER TABLE ... ADD COLUMN /* there can be many added columns
differing by names in the table */
ALTER TABLE ... ADD CONSTRAINT /* there can be many added constraints
differing by names in the table */
ALTER TYPE ... ADD VALUE /* many values in an enum differing by names
in the enum */
ALTER TYPE ... ADD ATTRIBUTE /* many attributes in a composite
differing by names in the enum */
etc.

But what is for "ALTER TABLE ... ALTER COLUMN ... ADD GENERATED"?
Whether a property's name is used for a distinction between them in a column?
Whether it is possible to have more than one such property to the
altering column?

The "SET GENERATED" (without "IF NOT EXISTS") syntax conforms to the
standard for those who want it.
The "SET GENERATED ... IF NOT EXISTS" syntax allows users to have the
column be in a required state (IDENTITY with set options) without
paying attention whether it is already set as IDENTITY or not.

The "[ NOT ] EXISTS" is a common Postgres' syntax extension for
creating/updating objects in many places. That's why I think it should
be used instead of introducing the new "ADD" syntax which contradicts
the users' current experience.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2017-04-13 Thread Vitaly Burovoy

On 4/4/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 3/30/17 22:47, Vitaly Burovoy wrote:
>> It seemed not very hard to fix it.
>> Please find attached patch to be applied on top of your one.
>>
>> I've added more tests to cover different cases of changing bounds when
>> data type is changed.
>
> Committed all that.  Thanks!

Thank you very much!

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-04-13 Thread Vitaly Burovoy

On 4/6/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 4/6/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
>> As I tried to mention earlier, it is very difficult to implement the IF
>> NOT EXISTS behavior here, because we need to run the commands the create
>> the sequence before we know whether we will need it.
>
> In fact with the function "get_attidentity" it is not so hard. Please,
> find the patch attached.
...
> I hope it is still possible to get rid of the "ADD GENERATED" syntax.

Hello hackers,

Is it possible to commit the patch from the previous letter?
It was sent before the feature freeze, introduces no new feature (only
syntax change discussed in this thread and not implemented due to lack
of a time of the author) and can be considered as a fix to the
committed patch (similar to the second round in "sequence data type").

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-04-06 Thread Vitaly Burovoy

On 4/6/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 4/4/17 22:53, Vitaly Burovoy wrote:
>> The next nitpickings to the last patch. I try to get places with
>> lacking of variables' initialization.
>> All other things seem good for me now. I'll continue to review the
>> patch while you're fixing the current notes.
>
> Committed with your changes (see below) as well as executor fix.

Thank you very much!

> As I tried to mention earlier, it is very difficult to implement the IF
> NOT EXISTS behavior here, because we need to run the commands the create
> the sequence before we know whether we will need it.

In fact with the function "get_attidentity" it is not so hard. Please,
find the patch attached.
I've implement SET GENERATED ... IF NOT EXISTS. It must be placed
before other SET options but fortunately it conforms with the
standard.
Since that form always changes the sequence behind the column, I
decided to explicitly write "[NO] CACHE" in pg_dump.

As a plus now it is possible to rename the sequence behind the column
by specifying SEQUENCE NAME in SET GENERATED.

I hope it is still possible to get rid of the "ADD GENERATED" syntax.

>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Best regards,
Vitaly Burovoy

v7-0001-Implement-SET-IDENTITY-.-IF-NOT-EXISTS.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-04-05 Thread Vitaly Burovoy

On 4/4/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 4/3/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
>> On 3/30/17 22:57, Vitaly Burovoy wrote:
>>> Why do you still want to leave "ADD IF NOT EXISTS" instead of using
>>> "SET IF NOT EXISTS"?
>>> If someone wants to follow the standard he can simply not to use "IF
>>> NOT EXISTS" at all. Without it error should be raised.
>>
>> As I tried to mention earlier, it is very difficult to implement the IF
>> NOT EXISTS behavior here, because we need to run the commands the create
>> the sequence before we know whether we will need it.
>
> I understand. You did not mention the reason.
> But now you have the "get_attidentity" function to check whether the
> column is an identity or not.
> If it is not so create a sequence otherwise skip the creation.
>
> I'm afraid after the feature freeze it is impossible to do "major
> reworking of things", but after the release we have to support the
> "ALTER COLUMN col ADD" grammar.

So it would be great if you have a time to get rid of "ALTER ... ADD
GENERATED" syntax in favor of "ALTER ... SET GENERATED ... IF NOT
EXIST"

> 3.
> It would be better if tab-completion has ability to complete the
> "RESTART" keyword like:
> ALTER TABLE x1 ALTER COLUMN i RESTART
> tab-complete.c:8898
> - COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "ADD", "DROP");
> + COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

Oops. Of course
+ COMPLETE_WITH_LIST6("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-04-04 Thread Vitaly Burovoy

On 4/3/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 3/30/17 22:57, Vitaly Burovoy wrote:
>> Why do you still want to leave "ADD IF NOT EXISTS" instead of using
>> "SET IF NOT EXISTS"?
>> If someone wants to follow the standard he can simply not to use "IF
>> NOT EXISTS" at all. Without it error should be raised.
>
> As I tried to mention earlier, it is very difficult to implement the IF
> NOT EXISTS behavior here, because we need to run the commands the create
> the sequence before we know whether we will need it.

I understand. You did not mention the reason.
But now you have the "get_attidentity" function to check whether the
column is an identity or not.
If it is not so create a sequence otherwise skip the creation.

I'm afraid after the feature freeze it is impossible to do "major
reworking of things", but after the release we have to support the
"ALTER COLUMN col ADD" grammar.

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

The next nitpickings to the last patch. I try to get places with
lacking of variables' initialization.
All other things seem good for me now. I'll continue to review the
patch while you're fixing the current notes.
Unfortunately I don't know PG internals so deep to understand
correctness of changes in the executor (what Tom and Andres are
talking about).

0. There is a little conflict of applying to the current master
because of 60a0b2e.

1.
First of all, I think the previous usage of the constant
"ATTRIBUTE_IDENTITY_NONE" (for setting a value) is better than just
'\0'.
It is OK for lacking of the constant in comparison.

2.
Lack of "SET IDENTITY" and "RESTART" in the "Compatibility" chapter in
"alter_table.sgml":
"The forms ADD (without USING INDEX), DROP, SET DEFAULT, and SET DATA
TYPE (without USING) conform with the SQL standard."
Also ADD IDENTITY is an extension (I hope temporary), but it looks
like a standard feature (from the above sentance).

3.
It would be better if tab-completion has ability to complete the
"RESTART" keyword like:
ALTER TABLE x1 ALTER COLUMN i RESTART
tab-complete.c:8898
- COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "ADD", "DROP");
+ COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

4.
The block in "gram.y":
/* ALTER TABLE  ALTER [COLUMN]  DROP IDENTITY */
does not contain "n->missing_ok = false;". I think it should be.

5.
In the file "pg_dump.c" in the "getTableAttrs" function at row 7959
the "tbinfo->needs_override" is used (in the "||" operator) but it is
initially nowhere set to "false".

6.
And finally... a segfault when pre-9.6 databases are pg_dump-ing:
SQL query in the file "pg_dump.c" in funcion "getTables" has the
"is_identity_sequence" column only in the "if (fout->remoteVersion >=
90600)" block (frow 5536), whereas 9.6 does not support it (but OK,
for 9.6 it is always FALSE, no sense to create a new "if" block), but
in other blocks no ",FALSE as is_identity_sequence" is added.

The same mistake is in the "getTableAttrs" function. Moreover whether
"ORDER BY a.attrelid, a.attnum" is really necessary ("else if
(fout->remoteVersion >= 90200)" has only "ORDER BY a.attnum")?

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-03-30 Thread Vitaly Burovoy

On 3/29/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 3/24/17 05:29, Vitaly Burovoy wrote:
>> It would be great if "DROP IDENTITY IF EXISTS" is in the current patch
>> since we don't have any disagreements about "DROP IDENTITY" behavior
>> and easiness of implementation of "IF EXISTS" there.
>
> Here is an updated patch that adds DROP IDENTITY IF EXISTS.
>
> Unfortunately, implementing ADD IF NOT EXISTS is much harder, so I
> haven't done that here.
>
> Additionally, this patch fixes a few minor issues that you had pointed
> out, and merges with the new expression evaluation system in the executor.
>
> I have also CC'ed you on a separate patch to improve the behavior when
> changing a sequence's data type.

Thank you a lot. I'll have a deep look by the Sunday evening.

Why do you still want to leave "ADD IF NOT EXISTS" instead of using
"SET IF NOT EXISTS"?
If someone wants to follow the standard he can simply not to use "IF
NOT EXISTS" at all. Without it error should be raised.
But we would not need to introduce a new grammar for a column property
which is single and can't be added more than once.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2017-03-30 Thread Vitaly Burovoy

On 3/30/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 3/29/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
>> On 3/29/17, Michael Paquier <michael.paqu...@gmail.com> wrote:
>>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy
>>> <vitaly.buro...@gmail.com> wrote:
>>>> I think min_value and max_value should not be set to "1" or "-1" but
>>>> to real min/max of the type by default.
>>>
>>> This is the default behavior for ages, since e8647c45 to be exact. So
>>> you would change 20 years of history?
>>
>> ... is it a wrong way to keep historical minimum as "1" by
>> default: it is not a minimum of any of supported type.
>
> I've read the standard about "minvalue", "maxvalue" and "start".
> OK, I was wrong. Since "start" should be equal to "minvalue" unless
> defined explicitly, the only bug left from my first email here is
> resetting "minvalue" back to 1 when data type changes and if the value
> matches the bound of the old type (the last case there).
>
> P.S.: the same thing with "maxvalue" when "increment" is negative.

It seemed not very hard to fix it.
Please find attached patch to be applied on top of your one.

I've added more tests to cover different cases of changing bounds when
data type is changed.

-- 
Best regards,
Vitaly Burovoy


0002-Fix-overriding-max-min-value-of-a-sequence-to-1-on-A.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2017-03-30 Thread Vitaly Burovoy

On 3/30/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 3/29/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
>> On 3/29/17, Michael Paquier <michael.paqu...@gmail.com> wrote:
>>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy
>>> <vitaly.buro...@gmail.com> wrote:
>>>> I think min_value and max_value should not be set to "1" or "-1" but
>>>> to real min/max of the type by default.
>>>
>>> This is the default behavior for ages, since e8647c45 to be exact. So
>>> you would change 20 years of history?
>>
>> ... is it a wrong way to keep historical minimum as "1" by
>> default: it is not a minimum of any of supported type.
>
> I've read the standard about "minvalue", "maxvalue" and "start".
> OK, I was wrong. Since "start" should be equal to "minvalue" unless
> defined explicitly, the only bug left from my first email here is
> resetting "minvalue" back to 1 when data type changes and if the value
> matches the bound of the old type (the last case there).
>
> P.S.: the same thing with "maxvalue" when "increment" is negative.

It seemed not very hard to fix it.
Please find attached patch.

I've added more tests to cover different cases of changing bounds when
data type is changed.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2017-03-30 Thread Vitaly Burovoy

On 3/29/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 3/29/17, Michael Paquier <michael.paqu...@gmail.com> wrote:
>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy
>> <vitaly.buro...@gmail.com> wrote:
>>> I think min_value and max_value should not be set to "1" or "-1" but
>>> to real min/max of the type by default.
>>
>> This is the default behavior for ages, since e8647c45 to be exact. So
>> you would change 20 years of history?
>
> ... is it a wrong way to keep historical minimum as "1" by
> default: it is not a minimum of any of supported type.

I've read the standard about "minvalue", "maxvalue" and "start".
OK, I was wrong. Since "start" should be equal to "minvalue" unless
defined explicitly, the only bug left from my first email here is
resetting "minvalue" back to 1 when data type changes and if the value
matches the bound of the old type (the last case there).

P.S.: the same thing with "maxvalue" when "increment" is negative.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2017-03-29 Thread Vitaly Burovoy

On 3/29/17, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy
> <vitaly.buro...@gmail.com> wrote:
>> I think min_value and max_value should not be set to "1" or "-1" but
>> to real min/max of the type by default.
>
> This is the default behavior for ages, since e8647c45 to be exact. So
> you would change 20 years of history?
> --
> Michael
>

Unfortunately yes, because the current patch has appeared because of
another patch with the "IDENTITY COLUMN" feature.
Since sequences used for such columns are completely hidden (they do
not appear in DEFAULTs like "serial" do) and a new option (sequence
data type) is supported with its altering (which was absent
previously), new errors appear because of that.

Be honest I did not checked the "countdown" case at the current
release, but the most important part is the second one because it is a
direct analogue of what happens (in the parallel thread) on "ALTER
TABLE tbl ALTER col TYPE new_type" where "col" is an identity column.

Since the commit 2ea5b06c7a7056dca0af1610aadebe608fbcca08 declares
"The data type determines the default minimum and maximum values of
the sequence." is it a wrong way to keep historical minimum as "1" by
default: it is not a minimum of any of supported type.

I don't think something will break if min_value is set lesser than
before, but it will decrease astonishment of users in cases I wrote in
the previous email.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2017-03-29 Thread Vitaly Burovoy

On 3/29/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> Over at
> <https://www.postgresql.org/message-id/cakoswnnxmm6ybxnzgnxtzqmpjdgjf+a3wx53wzmrq5wqdyr...@mail.gmail.com>
> is is being discussed that maybe the behavior when altering the sequence
> type isn't so great, because it currently doesn't update the min/max
> values of the sequence at all.  So here is a patch to update the min/max
> values when the old min/max values were the min/max values of the data
> type.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

It seems almost good for me except a single thing (I'm sorry, I missed
the previous discussion).
Why is min_value set to 1 (or -1 for negative INCREMENTs) by default
for all sequence types?
With the committed patch it leads to the extra "MINVALUE" option
besides the "START" one; and it is not the worst thing.

It leads to strange error for countdown sequences:
postgres=# CREATE SEQUENCE abc AS smallint MINVALUE 0 START 2 INCREMENT -1;
ERROR:  MINVALUE (0) must be less than MAXVALUE (-1)

postgres=# CREATE SEQUENCE abc AS smallint MINVALUE 0 START 2
INCREMENT -1 NO MAXVALUE; -- does not help
ERROR:  MINVALUE (0) must be less than MAXVALUE (-1)

With the proposed patch users can impact with the next error:

postgres=# CREATE TABLE tbl(i smallserial);
CREATE TABLE
postgres=# SELECT * FROM pg_sequences;
 schemaname | sequencename | sequenceowner | data_type | start_value |
min_value | max_value | increment_by | cycle | cache_size | last_value
+--+---+---+-+---+---+--+---++
 public | tbl_i_seq| burovoy_va| smallint  |   1 |
1 | 32767 |1 | f |  1 |
(1 row)

postgres=# -- min_value for smallint is "1"? Ok, I want to use the whole range:
postgres=# ALTER SEQUENCE tbl_i_seq MINVALUE -32768 START -32768 RESTART -32768;
ALTER SEQUENCE
postgres=# -- after a while I realized the range is not enough. Try to
enlarge it:
postgres=# ALTER SEQUENCE tbl_i_seq AS integer;
ERROR:  START value (-32768) cannot be less than MINVALUE (1)

It is not an expected behavior.

I think min_value and max_value should not be set to "1" or "-1" but
to real min/max of the type by default.

I recommend to add to the docs explicit phrase that START value is not
changed even if it matches the bound of the original type.

Also it is good to have regressions like:
CREATE SEQUENCE sequence_test10 AS smallint  MINVALUE -1000 MAXVALUE 1000;
ALTER SEQUENCE sequence_test10 AS int NO MINVALUE NO MAXVALUE INCREMENT 1;
ALTER SEQUENCE sequence_test10 AS bigint NO MINVALUE NO MAXVALUE INCREMENT -1;

CREATE SEQUENCE sequence_test11 AS smallint  MINVALUE -32768 MAXVALUE 32767;
ALTER SEQUENCE sequence_test11 AS int NO MINVALUE NO MAXVALUE INCREMENT 1;
ALTER SEQUENCE sequence_test11 AS int NO MINVALUE NO MAXVALUE INCREMENT -1;

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-03-24 Thread Vitaly Burovoy

On 3/23/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 3/23/17 06:09, Vitaly Burovoy wrote:
>> I think we'll end up with "DROP IDENTITY IF EXISTS" to avoid raising
>> an exception and "ADD OR SET" if your grammar remains.
>
> That sounds reasonable to me.

It would be great if "DROP IDENTITY IF EXISTS" is in the current patch
since we don't have any disagreements about "DROP IDENTITY" behavior
and easiness of implementation of "IF EXISTS" there.

>> Right. From that PoV IDENTITY also changes a default value: "SET (ADD
>> ... AS?) IDENTITY" works as setting a default to "nextval(...)"
>> whereas "DROP IDENTITY" works as setting it back to NULL.
>
> But dropping and re-adding an identity destroys state, so it's not quite
> the same.

How does dropping and re-adding affects a choosing between "ADD" and "SET"?
If you drop identity property, you get a column's state destroyed
whatever grammar variation you are using.

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

I have an idea. What about the next version of DDL:

DROP IDENTITY [ IF EXISTS ]
SET GENERATED { ALWAYS | BY DEFAULT } [ IF NOT EXISTS ] | SET ...

That version:
1. does not introduce a new "ADD" variation;
2. without "IF NOT EXISTS" follows the standard;
3. with "IF NOT EXISTS" sets a column's identity property or alters it
(works as "CREATE OR REPLACE" for functions).

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-03-23 Thread Vitaly Burovoy

On 3/22/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 3/22/17 03:59, Vitaly Burovoy wrote:
>> Column's IDENTITY behavior is very similar to a DEFAULT one. We write
>> "SET DEFAULT" and don't care whether it was set before or not, because
>> we can't have many of them for a single column. Why should we do that
>> for IDENTITY?
>
> One indication is that the SQL standard requires that DROP IDENTITY only
> succeed if the column is currently an identity column.  That is
> different from how DEFAULT works.

I think we'll end up with "DROP IDENTITY IF EXISTS" to avoid raising
an exception and "ADD OR SET" if your grammar remains.

> Another difference is that there is no such thing as "no default",
> because in absence of an explicit default, it is NULL.  So all you are
> doing with SET DEFAULT or DROP DEFAULT is changing the default.  You are
> not actually adding or removing it.

Right. From that PoV IDENTITY also changes a default value: "SET (ADD
... AS?) IDENTITY" works as setting a default to "nextval(...)"
whereas "DROP IDENTITY" works as setting it back to NULL.

> Therefore, the final effect of SET DEFAULT is the same no matter whether
> another default was there before or not.  For ADD/SET IDENTITY, you get
> different behaviors.  For example:
>
> ADD .. AS IDENTITY (START 2)
>
> creates a new sequence that starts at 2 and uses default parameters
> otherwise.  But
>
> SET (START 2)
>
> alters the start parameter of an existing sequence.  So depending on
> whether you already have an identity sequence, these commands do
> completely different things.

If you use "SET START 2" to a non-identity columns, you should get an
exception (no information about an identity type: neither "by default"
nor "always"). The correct example is:
ADD GENERATED BY DEFAULT AS IDENTITY (START 2)
and
SET GENERATED BY DEFAULT SET START 2

Note that creating a sequence is an internal machinery hidden from users.
Try to see from user's PoV: the goal is to have a column with an
autoincrement. If it is already autoincremented, no reason to create a
sequence (it is already present) and no reason to restart with 2
(there can be rows with such numbers).
"... SET START 2" is for the next "RESTART" DDL, and if a user insists
to start with 2, it is still possible:

SET GENERATED BY DEFAULT SET START 2 RESTART 2

I still think that introducing "ADD" for a property which can not be
used more than once (compare with "ADD CHECK": you can use it with the
same expression multiple times) is not a good idea.

I think there should be a consensus in the community for a grammar.

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-03-22 Thread Vitaly Burovoy

On 3/21/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 3/21/17 16:11, Vitaly Burovoy wrote:
>> My argument is consistency.
>> Since IDENTITY is a property of a column (similar to DEFAULT, NOT
>> NULL, attributes, STORAGE, etc.), it follows a different rule: it is
>> either set or not set. If it did not set before, the "SET" DDL "adds"
>> it, if that property already present, the DDL replaces it.
>> There is no "ADD" clause in DDLs like "...ALTER table ALTER column..."
>> (only "SET", "RESET" and "DROP")[2].
>> Your patch introduces the single DDL version with "...ALTER column
>> ADD..." for a property.
>
> But it creates a sequence, so it creates state.

Right. But it is an internal mechanism. DDL is not about creating a
sequence, it is about changing a property.

> So mistakes could easily be masked.  With my patch, if you do ADD twice, you 
> get an error.

Agree. But what's for? Whether that parameters are incompatible (and
can't be changed later)?

> With your proposal, you'd have to use SET, and you could overwrite
> existing sequence state without realizing it.

I can't overwrite its state (current value), only its settings like
start, maxval, etc.

In fact when I write a DDL I want to change a schema. For
non-properties it is natural to write "CREATE" (schema, table) and
"ADD" (column, constraints) because there can be many of them (with
different names) in a single object: many schemas in a DB, many tables
in a schema, many columns in a table and even many constraints in a
table. So ADD is used for adding objects which have a name to some
container (DB, schema, table).
It is not true for the IDENTITY property. You can have many identity
columns, but you can not have many of them in a single column.

Column's IDENTITY behavior is very similar to a DEFAULT one. We write
"SET DEFAULT" and don't care whether it was set before or not, because
we can't have many of them for a single column. Why should we do that
for IDENTITY?

Whether I write "ADD" or "SET" I want to have a column with some
behavior and I don't mind what behavior it has until it is
incompatible with my wish (e.g. it has DEFAULT, but I want IDENTITY or
vice versa).

>>> It does change the type, but changing the type doesn't change the
>>> limits.  That is a property of how ALTER SEQUENCE works, which was
>>> separately discussed.
>>
>> Are you about the thread[1]? If so, I'd say the current behavior is not
>> good.
>> I sent an example with users' bad experience who will know nothing
>> about sequences (because they'll deal with identity columns).
>> Would it be better to change bounds of a sequence if they match the
>> bounds of an old type (to the bounds of a new type)?
>
> That's an idea, but that's for a separate patch.

It is very likely to have one in Postgres10. I'm afraid in the other
case we'll impact with many bug reports similar to my example.

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-03-21 Thread Vitaly Burovoy

I haven't seen a patch (I'll do it later), just few notes:

On 3/21/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
>>>> 3. Strange error (not about absence of a column; but see pp.5 and 8):
>>>> test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS
>>>> IDENTITY;
>>>> ERROR:  identity column type must be smallint, integer, or bigint
>>>
>>> What's wrong with that?
>>
>> The column mentioned there does not exist. Therefore the error should
>> be about it, not about a type of absent column:
>
> This was already fixed in the previous version.

I've just checked and still get an error about a type, not about
absence of a column:
test=# CREATE TABLE itest3 (a int generated by default as identity, b text);
CREATE TABLE
test=# ALTER TABLE itest3 ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY;
ERROR:  identity column type must be smallint, integer, or bigint

>> Since we extend the spec by "ADD GENERATED...", "SEQUENCE NAME" and
>> allow more than one identity column, why can't we extend it by
>> allowing "SET GENERATED" for non-identity columns and drop "ADD
>> GENERATED..."?
>
> SQL commands generally don't work that way.  Either they create or they
> alter.  There are "OR REPLACE" options when you do both.

I'd agree with you if we are talking about database's objects like
tables, functions, columns etc.

> So I think it
> is better to keep these two things separate.  Also, while you argue that
> we *could* do it the way you propose, I don't really see an argument why
> it would be better.

My argument is consistency.
Since IDENTITY is a property of a column (similar to DEFAULT, NOT
NULL, attributes, STORAGE, etc.), it follows a different rule: it is
either set or not set. If it did not set before, the "SET" DDL "adds"
it, if that property already present, the DDL replaces it.
There is no "ADD" clause in DDLs like "...ALTER table ALTER column..."
(only "SET", "RESET" and "DROP")[2].
Your patch introduces the single DDL version with "...ALTER column
ADD..." for a property.

>> 16. changing a type does not change an underlying sequence type (its
>> limits):
>
> It does change the type, but changing the type doesn't change the
> limits.  That is a property of how ALTER SEQUENCE works, which was
> separately discussed.

Are you about the thread[1]? If so, I'd say the current behavior is not good.
I sent an example with users' bad experience who will know nothing
about sequences (because they'll deal with identity columns).
Would it be better to change bounds of a sequence if they match the
bounds of an old type (to the bounds of a new type)?

> --
> Peter Eisentraut      http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

[1] 
https://www.postgresql.org/message-id/flat/898deb94-5265-37cf-f199-4f79ef864...@2ndquadrant.com
[2] https://www.postgresql.org/docs/current/static/sql-altertable.html
-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-03-20 Thread Vitaly Burovoy

On 2/28/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> New patch that fixes everything.  ;-)

Great work!

> On 1/4/17 19:34, Vitaly Burovoy wrote:
>> 1. The fact COPY ignores GENERATED ALWAYS constraint (treats as
>> GENERATED BY DEFAULT) should be mentioned as well as rules.
>
> fixed (documentation added)
>
> What do you mean for rules?

I meant the phrase "However, it will not invoke rules." mentioned there.
For rewrite rules this behavior is mentioned on the COPY page, not on
the "CREATE RULE" page.
I think it would be better if that behavior is placed on both CREATE
TABLE and COPY pages.


>> 2. Usually error message for identical columns (with LIKE clause)
>> looks like this:
>> test=# CREATE TABLE def(i serial, n1 int NOT NULL, n2 int);
>> CREATE TABLE
>> test=# CREATE TABLE test(i serial, LIKE def INCLUDING ALL);
>> ERROR:  column "i" specified more than once
>>
>> but for generated columns it is disorienting enough:
>> test=# CREATE TABLE idnt(i int GENERATED BY DEFAULT AS IDENTITY);
>> CREATE TABLE
>> test=# CREATE TABLE test(i int GENERATED BY DEFAULT AS IDENTITY, LIKE
>> idnt INCLUDING ALL);
>> ERROR:  relation "test_i_seq" already exists
>
> Yeah, this is kind of hard to fix though, because the sequence names
> are decided during parse analysis, when we don't see that the same
> sequence name is also used by another new column.  We could maybe
> patch around it in an ugly way, but it doesn't seem worth it for this
> marginal case.

OK

>> 3. Strange error (not about absence of a column; but see pp.5 and 8):
>> test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY;
>> ERROR:  identity column type must be smallint, integer, or bigint
>
> What's wrong with that?

The column mentioned there does not exist. Therefore the error should
be about it, not about a type of absent column:
test=# CREATE TABLE idnt(i int GENERATED BY DEFAULT AS IDENTITY, t TEXT);
CREATE TABLE
test=# ALTER TABLE idnt ALTER COLUMN o TYPE int;  -- expected error message
ERROR:  42703: column "o" of relation "idnt" does not exist
test=# -- compare with:
test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS
IDENTITY;  -- strange error message
ERROR:  identity column type must be smallint, integer, or bigint


>> 5. Attempt to make a column be identity fails:
>> test=# ALTER TABLE idnt ALTER COLUMN n1 SET GENERATED BY DEFAULT;
>> ERROR:  column "n1" of relation "idnt" is not an identity column
>>
>> As I understand from the Spec, chapter 11.20 General Rule 2)b) says
>> about the final state of a column without mentioning of the initial
>> state.
>> Therefore even if the column has the initial state "not generated",
>> after the command it changes the state to either "GENERATED ALWAYS" or
>> "GENERATED BY DEFAULT".
>
> 11.12  states that "If  specification> or  is specified, then C
> shall be an identity column."  So this clause is only to alter an
> existing identity column, not make a new one.

Ouch. Right. The rule was at the upper level.

Nevertheless even now we don't follow rules directly.
For example, we allow "SET NOT NULL" and "TYPE" for the column which
are restricted by the spec.
Since we extend the spec by "ADD GENERATED...", "SEQUENCE NAME" and
allow more than one identity column, why can't we extend it by
allowing "SET GENERATED" for non-identity columns and drop "ADD
GENERATED..."?


>> 6. The docs mention a syntax:
>> ALTER [ COLUMN ] > class="PARAMETER">column_name { SET GENERATED { ALWAYS |
>> BY DEFAULT } | SET sequence_option | RESET
>> } [...]
>>
>> but the command fails:
>> test=# ALTER TABLE idnt ALTER COLUMN i RESET;
>> ERROR:  syntax error at or near ";"
>> LINE 1: ALTER TABLE idnt ALTER COLUMN i RESET;
>
> This works for me.  Check again please.

I'm sorry, it still does not work for me (whether you mix it up with "RESTART"):
test=# ALTER TABLE idnt ALTER COLUMN i RESET;
ERROR:  syntax error at or near ";"
LINE 1: ALTER TABLE idnt ALTER COLUMN i RESET;


>> 7. (the same line of the doc)
>> usually ellipsis appears in inside parenthesis with clauses can be
>> repeated, in other words it should be written as:
>> ALTER  column_name ( { SET GENERATED { ALWAYS |
>> BY DEFAULT } |  } [...] )
>
> But there are no parentheses in that syntax.  I think the syntax
> synopsis as written is correct.

I was wrong. Your version is correct.


>> 9. The command fails:
>> test=# ALTER TABLE idnt ALTER n2 ADD GENERATED A

Re: [HACKERS] identity columns

2017-03-15 Thread Vitaly Burovoy

On 3/15/17, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> Vitaly, will you be able to review this again?
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/

I apologize for a delay. Yes, I'm going to do it by Sunday.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS][REVIEW] macaddr 64 bit (EUI-64) datatype support

2017-01-31 Thread Vitaly Burovoy

Hello,

I've reviewed the patch[1].

Result of testing:

make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed


The patch introduce a new type macaddr8 for EUI-64 addresses[2]
(assuming OUI field is 24 bits wide) with EUI-48 (existing "macaddr"
type) interoperability.
It is a mostly copy-pasted macaddr implementation with necessary
changes for increased range.
Consensus was reached on such implementation in the current thread before.

There are two patch files for convenient reviewing: base macaddr8
implementation and its supporting in btree-gin and btree-gist indexes.

The patch:
* cleanly applies to the current master
(6af8b89adba16f97bee0d3b01256861e10d0e4f1);
* passes tests;
* looks fine, follows the PostgreSQL style guide;
* has documentation changes;
* has tests.

All notes and requirements were took into account and the patch was
changed according to them.
I have no suggestions on improving it.

The new status of this patch is: Ready for Committer

P.S.:
1. The doc and error/hint messages should be proof-read by a native speaker.
2. A committer should bump catversion. It is not bumped in the patch
because it is unclear when it is committed.

[1]https://postgr.es/m/CAJrrPGeT8zrGPMcRVk_wRvYD-ETcgUz6WRrc2C=inubmrkr...@mail.gmail.com
[2]http://standards.ieee.org/develop/regauth/tut/eui64.pdf
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-30 Thread Vitaly Burovoy

On 1/27/17, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> Updated patches are attached.

Hello,

I'm almost ready to mark it as Ready for committer.
The final round.

1.
>+DATA(insert OID = 774 ( macaddr8 ...
>+#define MACADDR8OID 972
What does this number (972) mean? I think it should be 774, the same
number as was used in the type definition.

Since there is an editing required, please, fix whitespaces:
2.
>+DATA(insert OID = 3371 (  403 macaddr8_ops
>PGNSP PGUID ));
>+DATA(insert OID = 3372 (  405 macaddr8_ops
>PGNSP PGUID ));

only one (not three) tab before "PGNSP" should be used (see other rows
around yours: "OID = 1983", "OID = 1991" etc).

3.
>diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
some new rows have two tabs instead of eight spaces.

4.
Some other files need to be fixed by pgindent (it helps supporting in
the future):
contrib/btree_gist/btree_macaddr8.c (a lot of rows)
src/include/utils/inet.h  (I have no idea why it changes indentation
to tabs for macaddr8 whereas it leaves spaces for macaddr)

5.
src/backend/utils/adt/network.c
pg_indent makes it uglier. I've just found how to write the line for it:
res *= ((double) 256) * 256 * 256 * 256;

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-26 Thread Vitaly Burovoy

On 1/25/17, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> On Wed, Jan 25, 2017 at 6:43 PM, Vitaly Burovoy <vitaly.buro...@gmail.com>
> wrote:
>
>> On 1/23/17, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
>> > The patch is split into two parts.
>> > 1. Macaddr8 datatype support
>> > 2. Contrib module support.
>>
>> Hello,
>>
>> I'm sorry for the delay.
>> The patch is almost done, but I have two requests since the last review.
>>
>
> Thanks for the review.
>
>
>> 1.
>> src/backend/utils/adt/mac8.c:
>> +   int a,
>> +   b,
>> +   c,
>> +   d = 0,
>> +   e = 0,
>> ...
>>
>> There is no reason to set them as 0. For EUI-48 they will be
>> reassigned in the "if (count != 8)" block, for EUI-64 -- in one of
>> sscanf.
>> They could be set to "d = 0xFF, e = 0xFE," and avoid the "if" block
>> mentioned above, but it makes the code be much less readable.
>>



>
> Changed accordingly.

I'm going to do (I hope) a final review tonight.
Please, remove initialization of the variables "d" and "e" since there
is really no reason to keep them be zero for a short time.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-24 Thread Vitaly Burovoy

On 1/23/17, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> The patch is split into two parts.
> 1. Macaddr8 datatype support
> 2. Contrib module support.

Hello,

I'm sorry for the delay.
The patch is almost done, but I have two requests since the last review.

1.
src/backend/utils/adt/mac8.c:
+   int a,
+   b,
+   c,
+   d = 0,
+   e = 0,
...

There is no reason to set them as 0. For EUI-48 they will be
reassigned in the "if (count != 8)" block, for EUI-64 -- in one of
sscanf.
They could be set to "d = 0xFF, e = 0xFE," and avoid the "if" block
mentioned above, but it makes the code be much less readable.

Oh. I see. In the current version it must be assigned because for
EUI-48 memory can have values <0 or >255 in uninitialized variables.
It is better to avoid initialization by merging two parts of the code:
+   if (count != 6)
+   {
+   /* May be a 8-byte MAC address */
...
+   if (count != 8)
+   {
+   d = 0xFF;
+   e = 0xFE;
+   }

to a single one:
+   if (count == 6)
+   {
+   d = 0xFF;
+   e = 0xFE;
+   }
+   else
+   {
+   /* May be a 8-byte MAC address */
...

2.
src/backend/utils/adt/network.c:
+   res = (mac->a << 24) | (mac->b << 16) | (mac->c 
<< 8) | (mac->d);
+   res = (double)((uint64)res << 32);
+   res += (mac->e << 24) | (mac->f << 16) | 
(mac->g << 8) | (mac->h);

Khm... I trust that modern compilers can do a lot of optimizations but
for me it looks terrible because of needless conversions.
The reason why earlier versions did have two lines "res *= 256 * 256"
was an integer overflow for four multipliers, but it can be solved by
defining the first multiplier as a double:
+   res = (mac->a << 24) | (mac->b << 16) | (mac->c 
<< 8) | (mac->d);
+   res *= (double)256 * 256 * 256 * 256;
+   res += (mac->e << 24) | (mac->f << 16) | 
(mac->g << 8) | (mac->h);

In this case the left-hand side argument for the "*=" operator is
computed at the compile time as a single constant.
The second line can be written as "res *= 256. * 256 * 256 * 256;"
(pay attention to a dot in the first multiplier), but it is not
readable at all (and produces the same code).

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-05 Thread Vitaly Burovoy

On 1/4/17, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> On Tue, Nov 29, 2016 at 8:36 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>> Updated patch attached with added cast function from macaddr8 to
>> macaddr.
>>
>> Currently there are no support for cross operators. Is this required
>> to be this patch only or can be handled later if required?
>>
>
> Updated patch attached to address the duplicate OID problem.
> There are no other functional changes to the previous patch.

Hello,

several thoughts about the patch:


Documentation:
1.
+ The remaining six input formats are not part of any standard.
Which ones (remaining six formats)?


2.
+ trunc(macaddr8) returns a MAC
+   address with the last 3 bytes set to zero.
It is a misprinting or a copy-paste error.
The implementation and the standard says that the last five bytes are
set to zero and the first three are left as is.


3.
+ for lexicographical ordering
I'm not a native English speaker, but I'd say just "for ordering"
since there are no words, it is just a big number with a special
input/output format.


The code:
4.
+   if ((a == 0) && (b == 0) && (c == 0) && (d == 0)
+   && (e == 0) && (f == 0) && (g == 0) && (h == 0))
...
+   if ((a == 255) && (b == 255) && (c == 255) && (d == 255)
+   && (e == 255) && (f == 255) && (g == 255) && (h == 255))
The standard forbids these values from using in real hardware, no
reason to block them as input data.
Moreover these values can be stored as a result of binary operations,
users can dump them but can not restore.


5.
+   if (!eight_byte_address)
+   {
+   result->d = 0xFF;
...

Don't you think the next version is simplier (all sscanf for macaddr6
skip "d" and "e")?

+   count = sscanf(str, "%x:%x:%x:%x:%x:%x%1s",
+  , , , , , , junk);
...
+   if (!eight_byte_address)
+   {
+   d = 0xFF;
+   e = 0xFE;
+   }
+   result->a = a;
+   result->b = b;
+   result->c = c;
+   result->d = d;
+   result->e = e;
+   result->f = f;
+   result->g = g;
+   result->h = h;

Also:
+   if (buf->len == 6)
+   {
+   addr->d = 0xFF;
+   addr->e = 0xFE;
+   addr->f = pq_getmsgbyte(buf);
+   addr->g = pq_getmsgbyte(buf);
+   addr->h = pq_getmsgbyte(buf);
+   }
+   else
+   {
+   addr->d = pq_getmsgbyte(buf);
+   addr->e = pq_getmsgbyte(buf);
+   addr->f = pq_getmsgbyte(buf);
+   addr->g = pq_getmsgbyte(buf);
+   addr->h = pq_getmsgbyte(buf);
+   }

can be written as:
+   if (buf->len == 6)
+   {
+   addr->d = 0xFF;
+   addr->e = 0xFE;
+   }
+   else
+   {
+   addr->d = pq_getmsgbyte(buf);
+   addr->e = pq_getmsgbyte(buf);
+   }
+   addr->f = pq_getmsgbyte(buf);
+   addr->g = pq_getmsgbyte(buf);
+   addr->h = pq_getmsgbyte(buf);

but it is only my point of view (don't need to pay close attention
that only those two bytes are written differently, not the last tree
ones), it is not an error.


6.
+errmsg("macaddr8 out of range to convert to 
macaddr")));
I think a hint should be added which values are allowed to convert to "macaddr".

7.
+DATA(insert (  829 7744123 i f ));
+DATA(insert (  774  829   4124 i f ));
It is a nitpicking, but try to use tabs as in the code around.
(tab between "774" and "829" and space instead of tab between "829" and "4124").

8.
+#define hibits(addr) \
+  ((unsigned long)(((addr)->a<<24)|((addr)->b<<16)|((addr)->c<<8)))
+
+#define lobits(addr) \
+  ((unsigned long)(((addr)->d<<16)|((addr)->e<<8)|((addr)->f)))
+
+#define lobits_extra(addr) \
+  ((unsigned long)((addr)->g<<8)|((addr)->h))

I mentioned that fitting all 4 bytes is a wrong idea[1]
> The macros "hibits" should be 3 octets long, not 4;

... but now I do not think so (there is no UB[2] because source and
destination are not signed).
Moreover you've already fill in "hibits" the topmost byte by shifting by 24.
If you use those two macros ("hibits" and "lobits") it allows to avoid
two extra comparisons in macaddr8_cmp_internal.
Version from the "macaddr64_poc.patch" is correct.


[1]https://www.postgresql.org/message-id/CAKOSWNng9_+=fvo6oz4tgv1kkhmom11ankihbokpuzki1ca...@mail.gmail.com
[2]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1817.htm


-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints

2017-01-05 Thread Vitaly Burovoy

On 1/5/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> On 1/5/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> My point is that ideally, any value that can physically fit into struct
>>> Interval ought to be considered valid.  The fact that interval_out can't
>>> cope is a bug in interval_out, which ideally we would fix without
>>> artificially restricting the range of the datatype.
>
>> Am I correct that we are still limited by ECPG which is limited by the
>> system "tm" struct?
>
> I'm not really that concerned about whether ECPG can deal with enormous
> intervals.  If it bothers you, and you want to go fix it, more power to
> you --- but I think fixing the backend is much higher priority.

I really do not think it is possible since it uses system struct.
My point - ECPG is a part of Postgres, we can't fix server side and
leave the rest.

>> Also users who use a binary protocol can also use the "tm" struct and
>> can not expect overflow.
>
> If they store an enormous interval value, its really up to them not to
> choke on it when they read it back.  Not our problem.

Those who store and those who read it back can be different groups of people.
For example, the second group are libraries' writers.
My point - that interval is big enough and limiting it can help people
from errors.
Because finally
* either they have an error in their data and that change will not
break their reslut (since it is wrong because of wrong source data)
* or they still get overflow and they have to find a solution - with
that patch or without it
* or they get result in that 16% interval between fitting hours into
"int" and "seconds" in PG_INT64, get silently corrupted result because
of ECPG or a library.

A solution for the third case can be the same as for the second one
because these groups can be the same (just with different data).

>> The docs say[1] the highest value of the interval type is 178_000_000
>> years which is
>
> ... irrelevant really.  That's talking about the largest possible value of
> the "months" field, viz (2^31-1)/12.  Perhaps we ought to document the
> other field limits, but right now there's nothing there about how large
> the hours field can get.

No, it was bad explanation of a point - now there is nothing
documented about "seconds" part of the "interval" type.
And we can just declare limit. Users don't mind reason of limitation,
they just will plan workaround if their data do not fit in limits
whatever they are.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints

2017-01-05 Thread Vitaly Burovoy

On 1/5/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> On 1/5/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> We could think about replacing interval2tm's output format with some
>>> other struct that uses a TimeOffset for hours and so cannot overflow.
>>> I'm not sure though how far the effects would propagate; it might be
>>> more work than we want to put into this.
>
>> If values with overflow are already in a database, what do you expect
>> a new output function should fix?
>
> My point is that ideally, any value that can physically fit into struct
> Interval ought to be considered valid.  The fact that interval_out can't
> cope is a bug in interval_out, which ideally we would fix without
> artificially restricting the range of the datatype.
>
> Now, the problem with that of course is that it's not only interval_out
> but multiple other places.  But your proposed patch also requires touching
> nearly everything interval-related, so I'm not sure it has any advantage
> that way over the less restrictive answer.

OK, I try to limit values from
9223372036854775807/36/24/365=292471.2 years
to
77309411328/36/24/365=245146.5 years

i.e. cut 47325 years or ~16%, but it is only for interval's part
measured in seconds.

Am I correct that we are still limited by ECPG which is limited by the
system "tm" struct?
Also users who use a binary protocol can also use the "tm" struct and
can not expect overflow.

The docs say[1] the highest value of the interval type is 178_000_000
years which is
significantly bigger than both of values (current and limited)
measured in seconds.
I don't think applying that limitation is a big deal.


I tried to find functions should be touched and there are not so many of them:

-- internal functions:
interval_check_overflow(Interval *i)
tm2interval(struct pg_tm * tm, fsec_t fsec, Interval *span)
interval2tm(Interval span, struct pg_tm * tm, fsec_t *fsec)
intervaltypmodleastfield(int32 typmod)

-- output functions:
interval_out(PG_FUNCTION_ARGS)  -- skip output
interval_send(PG_FUNCTION_ARGS) -- skip output

-- main input/computing functions:
interval_in(PG_FUNCTION_ARGS)   -- forgot to cover
interval_recv(PG_FUNCTION_ARGS) -- covered
make_interval(PG_FUNCTION_ARGS) -- covered

AdjustIntervalForTypmod(Interval *interval, int32 typmod)  -- can lead
to overflow

interval_justify_interval(PG_FUNCTION_ARGS)  -- can lead to overflow
interval_justify_hours(PG_FUNCTION_ARGS)  -- can lead to overflow
interval_justify_days(PG_FUNCTION_ARGS)  -- can lead to overflow
interval_um(PG_FUNCTION_ARGS)  -- covered
interval_pl(PG_FUNCTION_ARGS)  -- covered
interval_mi(PG_FUNCTION_ARGS)  -- covered
interval_mul(PG_FUNCTION_ARGS) -- covered

-- can not lead to overflow
interval_transform(PG_FUNCTION_ARGS)
interval_div(PG_FUNCTION_ARGS)
interval_trunc(PG_FUNCTION_ARGS)
interval_part(PG_FUNCTION_ARGS)

-- based on other functions
interval_scale(PG_FUNCTION_ARGS) -- based on AdjustIntervalForTypmod
interval_accum(PG_FUNCTION_ARGS) -- based on interval_pl
interval_accum_inv(PG_FUNCTION_ARGS) -- based on interval_mi
interval_avg(PG_FUNCTION_ARGS)   -- based on interval_pl
interval_combine(PG_FUNCTION_ARGS)   -- based on interval_pl
mul_d_interval(PG_FUNCTION_ARGS) -- based on interval_mul

-- checking, not changing:
interval_finite(PG_FUNCTION_ARGS)
interval_smaller(PG_FUNCTION_ARGS)
interval_larger(PG_FUNCTION_ARGS)
interval_cmp_value(const Interval *interval)
interval_cmp_internal(Interval *interval1, Interval *interval2)
interval_eq(PG_FUNCTION_ARGS)
interval_ne(PG_FUNCTION_ARGS)
interval_lt(PG_FUNCTION_ARGS)
interval_gt(PG_FUNCTION_ARGS)
interval_le(PG_FUNCTION_ARGS)
interval_ge(PG_FUNCTION_ARGS)
interval_cmp(PG_FUNCTION_ARGS)
interval_hash(PG_FUNCTION_ARGS)

-- not applicable:
intervaltypmodin(PG_FUNCTION_ARGS)
intervaltypmodout(PG_FUNCTION_ARGS)
timestamp_pl_interval(PG_FUNCTION_ARGS)
timestamp_mi_interval(PG_FUNCTION_ARGS)
timestamptz_pl_interval(PG_FUNCTION_ARGS)
timestamptz_mi_interval(PG_FUNCTION_ARGS)


=
In fact, only six of them (not "touching nearly everything
interval-related") should be covered:

* interval_in
* interval_out (yes, the SAMESIGN there is useless)
* AdjustIntervalForTypmod
* interval_justify_interval
* interval_justify_hours
* interval_justify_days


[1]https://www.postgresql.org/docs/current/static/datatype-datetime.html
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints

2017-01-05 Thread Vitaly Burovoy

On 1/5/17, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>>> I've written a patch which fixes that bug (in attachment).
>>> Should it be registered in the CF?
>
>> Oops. Forgot to attach the patch. Fixed.
>
> I suspect that many of these SAMESIGN() tests you've added are not
> actually adequate/useful.  That's only sufficient when the output could be
> at most a factor of 2 out-of-range.  If it could overflow past the sign
> bit then you need to work harder.

I disagree. These SAMESIGN() tests cover addition where result can not
be more than one out-of-range.
Multiplications are covered just before.

> (By the same token, the existing SAMESIGN test in interval2tm is
> wrong.)
>
> Possibly we should consider alternatives before plowing ahead in this
> direction, since adding guards to interval_in and interval computations
> doesn't help with oversize values that are already stored in a database.

We can do nothing with values are already stored in a database. But we
can prevent appearing them later.

> We could think about replacing interval2tm's output format with some
> other struct that uses a TimeOffset for hours and so cannot overflow.
> I'm not sure though how far the effects would propagate; it might be
> more work than we want to put into this.

If values with overflow are already in a database, what do you expect
a new output function should fix?

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints

2017-01-05 Thread Vitaly Burovoy

On 1/5/17, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 1/4/17, Pantelis Theodosiou <yperc...@gmail.com> wrote:
>> On Wed, Jan 4, 2017 at 3:03 PM, <web+postgre...@modin.io> wrote:
>>
>>> The following bug has been logged on the website:
>>>
>>> Bug reference:  14486
>>> Logged by:  Per Modin
>>> Email address:  web+postgre...@modin.io
>>> PostgreSQL version: 9.6.1
>>> Operating system:   Linux 303a92173594 4.8.15-moby #1 SMP Sat Dec 17 0
>>> Description:
>>>
>>> Found this bug in 9.4.8, tried it in docker towards psql 9.6.1 and it's
>>> in
>>> there as well. A minimum working example would be as follows:
>>>
>>> ```
>>> postgres=# CREATE TABLE tbl AS SELECT 9223372036854 * interval '1
>>> second'
>>> col; TABLE tbl;
>>> SELECT 1
>>> ERROR:  interval out of range
>>> ```
>>>
>>> ```
>>> postgres=# SELECT count(*) FROM tbl;
>>>  count
>>> ---
>>>  1
>>> (1 row)
>>> ```
>>>
>>> It seems that inserting and retrieving data have different constraints.
>>> As
>>> you
>>> can see from query 2, the data still gets inserted.
>>>
>>> ```
>>> postgres=# select version();
>>>  version
>>> 
>>> --
>>>  PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian
>>> 4.9.2-10)
>>> 4.9.2, 64-bit
>>> (1 row)
>>> ```
>>>
>>> Best regards,
>>> Per Modin
>>>
>>>
>> And these work without error:
>>
>> postgres=# select col - 9223372036854 * interval '1 second' from tbl ;
>>  ?column?
>> --
>>  00:00:00
>> (1 row)
>>
>> postgres=# select col from xx where col < interval '10 year' ;
>>  col
>> -
>> (0 rows)
>>
>
> Yes, it is a bug, but it is not a constraint, it is just different
> internal checks.
> Moreover even if a function does not raise an error, output could be wrong
> (pay attention to the duplicated '-' sign)
> postgres=# SELECT '-2147483647:59:59'::interval - '1s'::interval;
>   ?column?
> 
>  --2147483648:00:00
> (1 row)
>
> I've written a patch which fixes that bug (in attachment).
> Should it be registered in the CF?

Oops. Forgot to attach the patch. Fixed.

-- 
Best regards,
Vitaly Burovoy


0001-Add-check-for-overflow-to-interval-functions.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints

2017-01-05 Thread Vitaly Burovoy

On 1/4/17, Pantelis Theodosiou <yperc...@gmail.com> wrote:
> On Wed, Jan 4, 2017 at 3:03 PM, <web+postgre...@modin.io> wrote:
>
>> The following bug has been logged on the website:
>>
>> Bug reference:  14486
>> Logged by:  Per Modin
>> Email address:  web+postgre...@modin.io
>> PostgreSQL version: 9.6.1
>> Operating system:   Linux 303a92173594 4.8.15-moby #1 SMP Sat Dec 17 0
>> Description:
>>
>> Found this bug in 9.4.8, tried it in docker towards psql 9.6.1 and it's
>> in
>> there as well. A minimum working example would be as follows:
>>
>> ```
>> postgres=# CREATE TABLE tbl AS SELECT 9223372036854 * interval '1 second'
>> col; TABLE tbl;
>> SELECT 1
>> ERROR:  interval out of range
>> ```
>>
>> ```
>> postgres=# SELECT count(*) FROM tbl;
>>  count
>> ---
>>  1
>> (1 row)
>> ```
>>
>> It seems that inserting and retrieving data have different constraints.
>> As
>> you
>> can see from query 2, the data still gets inserted.
>>
>> ```
>> postgres=# select version();
>>  version
>> 
>> --
>>  PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian
>> 4.9.2-10)
>> 4.9.2, 64-bit
>> (1 row)
>> ```
>>
>> Best regards,
>> Per Modin
>>
>>
> And these work without error:
>
> postgres=# select col - 9223372036854 * interval '1 second' from tbl ;
>  ?column?
> --
>  00:00:00
> (1 row)
>
> postgres=# select col from xx where col < interval '10 year' ;
>  col
> -
> (0 rows)
>

Yes, it is a bug, but it is not a constraint, it is just different
internal checks.
Moreover even if a function does not raise an error, output could be wrong
(pay attention to the duplicated '-' sign)
postgres=# SELECT '-2147483647:59:59'::interval - '1s'::interval;
  ?column?

 --2147483648:00:00
(1 row)

I've written a patch which fixes that bug (in attachment).
Should it be registered in the CF?
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2017-01-04 Thread Vitaly Burovoy

ROM ('2016-07-01') TO ('2016-08-01');
ERROR:  cannot inherit from table with identity column
test=# INSERT INTO measurement(logdate) VALUES('2016-07-10');
ERROR:  no partition of relation "measurement" found for row
DETAIL:  Failing row contains (1, 2016-07-10, null, null).


12. You've introduced a new parameter for a sequence declaration
("SEQUENCE NAME") which is not mentioned in the docs and not
supported:
a2=# CREATE SEQUENCE s SEQUENCE NAME s;
ERROR:  option "sequence_name" not recognized

I think the error should look as:
a2=# CREATE SEQUENCE s SEQUENCE__ NAME s;
ERROR:  syntax error at or near "SEQUENCE__"
LINE 1: CREATE SEQUENCE s SEQUENCE__ NAME s;
  ^

13. Also strange error message:
test=# CREATE SCHEMA sch;
test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS
IDENTITY (SEQUENCE NAME sch.s START 1);
ERROR:  relation "sch.idnt" does not exist

But if a table sch.idnt exists, it leads to a silent inconsistency:
test=# CREATE TABLE sch.idnt(n1 int);
CREATE TABLE
test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS
IDENTITY (SEQUENCE NAME sch.s START 1);
ALTER TABLE
test=# DROP TABLE sch.idnt;
ERROR:  could not find tuple for attrdef 0

Also dump/restore fails for it.

Note that by default sequences have different error message:
test=# CREATE SEQUENCE sch.s1 OWNED BY idnt.n1;
ERROR:  sequence must be in same schema as table it is linked to

14. Wrong hint message:
test=# DROP SEQUENCE idnt_i_seq;
ERROR:  cannot drop sequence idnt_i_seq because default for table idnt
column i requires it
HINT:  You can drop default for table idnt column i instead.

but according to DDL there is no "default" which can be dropped:
test=# ALTER TABLE idnt ALTER COLUMN i DROP DEFAULT;
ERROR:  column "i" of relation "idnt" is an identity column


15. And finally. The command:
test=# CREATE TABLE t(i int GENERATED BY DEFAULT AS IDENTITY (SEQUENCE NAME s));

leads to a core dump.
It happens when no sequence parameter (like "START") is set.


[1]https://www.postgresql.org/docs/devel/static/sql-createtable.html

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-10-12 Thread Vitaly Burovoy

On 10/12/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>>>> P.S.: I still think it is a good idea to change storage format,

>>> I'm not sure which part of "no" you didn't understand,

I just paid attention to the words "likelihood" (mixed up with
"likeliness"), "we wanted" and "probably".
Also there was a note about "would also break send/recv" which
behavior can be saved.
And after your letter Julien Rouhaud wrote about mapping from MAC-48
to EUI-64 which leads absence of a bit indicated version of a stored
value. Which can be considered as a new information.

>>> but we're
>>> not breaking on-disk compatibility of existing macaddr columns.

Can I ask why? It will not be a varlen (typstorage will not be
changed), it just changes typlen to 8 and typalign to 'd'.
For every major release 9.0, 9.1, 9.2 .. 9.6 the docs says "A
dump/restore using pg_dumpall, or use of pg_upgrade, is required".
Both handle changes in a storage format. Do they?

>>> Breaking the on-the-wire binary I/O representation seems like a
>>> nonstarter as well.

I wrote that for the EUI-48 (MAC-48) values the binary I/O
representation can be saved.
The binary format (in DataRow message) has a length of the column
value which is reflected in PGresAttValue.len in libpq.
If the client works with the binary format it must consult with the
length of the data.
But until the client works with (and columns have) MAC-48 nothing
hurts it and PGresAttValue.len is "6" as now.

>> I think the suggestion was to rename macaddr to macaddr6 or similar,
>> keeping the existing behavior and the current OID.  So existing columns
>> would continue to work fine and maintain on-disk compatibility, but any
>> newly created columns would become the 8-byte variant.
>
> ... and would have different I/O behavior from before, possibly breaking
> applications that expect "macaddr" to mean what it used to.  I'm still
> dubious that that's a good idea.

Only if a new type will send xx:xx:xx:FF:FF:xx:xx:xx instead of usual
(expected) 6 octets long.
Again, that case in my offer is similar (by I/O behavior) to "just
change 'macaddr' to keep and accept both MAC-48 and MAC-64", but
allows to use "-k" key for pg_upgrade to prevent rewriting possibly
huge (for instance, 'log') tables (but users unexpectedly get
"macaddr6" after upgrade in their columns and function names which
looks strange enough).

> The larger picture here is that we got very little thanks when we squeezed
> IPv6 into the pre-existing inet datatype;

Without a sarcasm, I thank a lot all people involved in it because it
does not hurt me (and many other people) from distinguishing ipv4 and
ipv6 at app-level.
I write apps and just save remote address of clients to an "inet"
column named "remote_ip" without thinking "what if we start serving
clients via ipv6?"; or have a column named "allowed_ip" with IPs or
subnets and just save client's IPv4 or IPv6 as a white list (and use
"allowed_ip >>= $1"). It just works.

> there's a large number of people
> who just said "no thanks" and started using the add-on ip4r type instead.

I found a repository[1] at github. From the description it is
understandable why people used ip4r those days (2005 year). The reason
"Furthermore, they are variable length types (to support ipv6) with
non-trivial overheads" is mentioned as the last in its README.
When you deal with IPv4 in 99.999%, storing it in TOAST tables leads
to a big penalty, but the new version of macaddr is not so wide, so it
does not lead to similar speed decrease (it will be stored inplace).

> So I'm not sure why we want to complicate our lives in order to make
> macaddr follow the same path.

Because according to the Wiki[3] MAC addresses now "are formed
according to the rules of one of three numbering name spaces ...:
MAC-48, EUI-48, and EUI-64.", so IEEE extended range of allowed values
from 48 to 64 bits and since Postgres claims supporting of "mac
addresses", I (as a developer who still uses PG as a primary database)
expect supporting of any kind of mac address, not a limited one. I
expect it is just works.

I reject to imagine what I have to do if I have a column of a type
"macaddr" and unexpectedly I have to deal with an input of EUI-64
type. Add a new column or change columns's type?

In the first case what to do with stored procedures? Duplicate input
parameter to pass the new column of macaddr8 (if macaddr was passed
before)? Duplicate stored procedure?
Also I have to support two columns at the application level. W

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-10-12 Thread Vitaly Burovoy

On 10/12/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> I'm sorry for the offtopic, but does anyone know a reason why a
>> condition in mac.c
>
>>> if ((a < 0) || (a > 255) || (b < 0) || (b > 255) ||
>>> (c < 0) || (c > 255) || (d < 0) || (d > 255) ||
>>> (e < 0) || (e > 255) || (f < 0) || (f > 255))
>
>> can not be rewritten as:
>
>>> if (((a | b | c | d | e | f) < 0) ||
>>> ((a | b | c | d | e | f) > 255))
>

> Well, it's ugly and

> it adds a bunch of assumptions about arithmetic
> behavior that we don't particularly need to make.

It explains the reason, thank you. I'm just not familiar with other
architectures where it is not the same as in X86/X86-64.

> If this were some
> amazingly hot hot-spot then maybe it would be worth making the code
> unreadable to save a few nanoseconds, but I doubt that it is.

> (Anyway, you've not shown that there actually is any benefit ...)
I don't think it has a speed benefit, especially comparing with the sscanf call.
But personally for me the second variant does not seem ugly, just "no
one bit in all variables is out of a byte" (looks better with
comparison with "0xff" as sscanf operates with "%2x").
Sorry for my bad taste and for a noise.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-10-12 Thread Vitaly Burovoy

I'm sorry for the offtopic, but does anyone know a reason why a
condition in mac.c

> if ((a < 0) || (a > 255) || (b < 0) || (b > 255) ||
> (c < 0) || (c > 255) || (d < 0) || (d > 255) ||
> (e < 0) || (e > 255) || (f < 0) || (f > 255))

can not be rewritten as:

> if (((a | b | c | d | e | f) < 0) ||
> ((a | b | c | d | e | f) > 255))

It seems more compact and a compiler can optimize it to keep a result
of a binary OR for the comparison with 255...

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-10-12 Thread Vitaly Burovoy

On 10/12/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 10/12/16, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>> Julien Rouhaud wrote:
>>> On 12/10/2016 14:32, Alvaro Herrera wrote:
>>> > Julien Rouhaud wrote:
>>> >
>>> >> and you can instead make macaddr64 support both format, and provide a
>>> >> macaddr::macaddr64 cast
>>> >
>>> > Having macaddr64 support both formats sounds nice, but how does it
>>> > work?
>>> > Will we have to reserve one additional bit to select the
>>> > representation?
>>> > That would make the type be 65 bits which is a clear loser IMO.
>>> >
>>> > Is it allowed to just leave 16 bits as zeroes which would indicate
>>> > that
>>> > the address is EUI48?  I wouldn't think so ...
>>>
>>> From what I read, you can indicate it's an EUI-48 address by storing
>>> FF:FF (or FF:FE for MAC-48) in 4th and 5th bytes of the EUI-64 address.
>>
>> That seems reasonable at first glance; so the new type would be able to
>> store both 48-bit and 64-bit values, and there would be assignment casts
>> in both directions
>
> I think either "macaddr" should be renamed to "macaddr6" (saved its
> oid), a new type "macaddr8" is added with introducing a new alias
> "macaddr" or the current "macaddr" should accept both versions as the
> "inet" type does.
>
> The function "macaddr_recv" can distinguish them by the
> StringInfoData.len member, "macaddr_in" - by number of parts split by
> ":".
> The "macaddr_out" and "macaddr_send" can out 6 octets if the stored
> value is mapped to MAC-48.
> Storing can be done always as 8 bytes using the rule above.
>
> In the other case there is hard from user's PoV to detect at the
> design stage when it is necessary to define column as macaddr and when
> to macaddr8.
> If users need to update a column type (a new hardware appears with
> EUI-64 address), they should keep in mind that all things are changed
> for the new column type - stored procedure's parameters, application
> code interacting with the DB etc.).
> I don't agree with Tom's proposal to introduce a new type, it would be
> inconvenient for users.
>
> We have types "int2", "int4", "int8" and an alias "int" for a type
> "int4", at least psql does not show it:
> postgres=# \dT+ int
> List of data types
>  Schema | Name | Internal name | Size | Elements | Owner | Access
> privileges | Description
> +--+---+--+--+---+---+-
> (0 rows)
>
> It is understandable to have 3 types for integers because most space
> of the DB occupied by them and in the real life we just don't use big
> numbers, but cases for "inet" and "macaddr" are different.
>
>> and a suite of operators to enable interoperability
>> of 48-bit values in macaddr8 with values in type macaddr.  Right?
>>
>> (The cast function from macaddr8 to macaddr would raise error if the
>> 4th and 5th bytes are not either FF:FF or FF:FE -- I don't think we can
>> in practice distinguish EUI-48 from MAC-48 in this context.
>
> The wikipedia says[1] they are the same things but MAC-48 is an
> obsolete term for a special case, so there is no necessary to
> distinguish them.
>
>> The cast in the other direction would have no restriction and should
>> probably always use FF:FE).
>
> [1]
> https://en.wikipedia.org/wiki/Organizationally_unique_identifier#48-bit_Media_Access_Control_Identifier_.28MAC-48.29

Haribabu Kommi, why have you read enough about EUI-64?
Your function "macaddr64_trunc" sets 4 lower bytes as 0 whereas
(according to the Wikipedia, but I can look for a standard) 3 bytes
are still define an OUI (organizationally unique identifier), so
truncation should be done for 5, not 4 lower octets.

The macros "hibits" should be 3 octets long, not 4; "lobits" --- 5 bytes, not 4.
In the other case your comparisons fail.

What document have you used to write the patch? Are short form formats
correct in macaddr64_in?

P.S.: I still think it is a good idea to change storage format,
macaddr_{in,out,send,recv}, fill 4th and 5th bytes if necessary;
change "lobits" macros and add new fields to bit operation functions.
It avoids a new type, casting, comparison functions between macaddr6
and macaddr8 etc. Proposed patch is mostly copy-paste of mac.c

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-10-12 Thread Vitaly Burovoy

On 10/12/16, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Julien Rouhaud wrote:
>> On 12/10/2016 14:32, Alvaro Herrera wrote:
>> > Julien Rouhaud wrote:
>> >
>> >> and you can instead make macaddr64 support both format, and provide a
>> >> macaddr::macaddr64 cast
>> >
>> > Having macaddr64 support both formats sounds nice, but how does it
>> > work?
>> > Will we have to reserve one additional bit to select the
>> > representation?
>> > That would make the type be 65 bits which is a clear loser IMO.
>> >
>> > Is it allowed to just leave 16 bits as zeroes which would indicate that
>> > the address is EUI48?  I wouldn't think so ...
>>
>> From what I read, you can indicate it's an EUI-48 address by storing
>> FF:FF (or FF:FE for MAC-48) in 4th and 5th bytes of the EUI-64 address.
>
> That seems reasonable at first glance; so the new type would be able to
> store both 48-bit and 64-bit values, and there would be assignment casts
> in both directions

I think either "macaddr" should be renamed to "macaddr6" (saved its
oid), a new type "macaddr8" is added with introducing a new alias
"macaddr" or the current "macaddr" should accept both versions as the
"inet" type does.

The function "macaddr_recv" can distinguish them by the
StringInfoData.len member, "macaddr_in" - by number of parts split by
":".
The "macaddr_out" and "macaddr_send" can out 6 octets if the stored
value is mapped to MAC-48.
Storing can be done always as 8 bytes using the rule above.

In the other case there is hard from user's PoV to detect at the
design stage when it is necessary to define column as macaddr and when
to macaddr8.
If users need to update a column type (a new hardware appears with
EUI-64 address), they should keep in mind that all things are changed
for the new column type - stored procedure's parameters, application
code interacting with the DB etc.).
I don't agree with Tom's proposal to introduce a new type, it would be
inconvenient for users.

We have types "int2", "int4", "int8" and an alias "int" for a type
"int4", at least psql does not show it:
postgres=# \dT+ int
List of data types
 Schema | Name | Internal name | Size | Elements | Owner | Access
privileges | Description
+--+---+--+--+---+---+-
(0 rows)

It is understandable to have 3 types for integers because most space
of the DB occupied by them and in the real life we just don't use big
numbers, but cases for "inet" and "macaddr" are different.

> and a suite of operators to enable interoperability
> of 48-bit values in macaddr8 with values in type macaddr.  Right?
>
> (The cast function from macaddr8 to macaddr would raise error if the
> 4th and 5th bytes are not either FF:FF or FF:FE -- I don't think we can
> in practice distinguish EUI-48 from MAC-48 in this context.

The wikipedia says[1] they are the same things but MAC-48 is an
obsolete term for a special case, so there is no necessary to
distinguish them.

> The cast in the other direction would have no restriction and should
> probably always use FF:FE).

[1] 
https://en.wikipedia.org/wiki/Organizationally_unique_identifier#48-bit_Media_Access_Control_Identifier_.28MAC-48.29
-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-06 Thread Vitaly Burovoy

On 10/6/16, Serge Rielau <se...@rielau.com> wrote:
>> On Oct 6, 2016, at 9:20 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>>> But what I discover for myself is that we have pg_attrdef separately
>>> from the pg_attribute. Why?
>>
>> The core reason for that is that the default expression needs to be
>> a separate object from the column for purposes of dependency analysis.
>> For example, if you have a column whose default is "foo()", then the
>> default expression depends on the function foo(), but the column should
>> not: if you drop the function, only the default expression ought to
>> be dropped, not the column.
>>
>> Because of this, the default expression needs to have its own OID
>> (to be stored in pg_depend) and it's convenient to store it in a
>> separate catalog so that the classoid can identify it as being a
>> default expression rather than some other kind of object.
>
> Good to know.
>
>> If we were going to allow these missing_values or creation_defaults
>> or whatever they're called to be general expressions, then they would
>> need
>> to have their own OIDs for dependency purposes.  That would lead me to
>> think that the best representation is to put them in their own rows in
>> pg_attrdef, perhaps adding a boolean column to pg_attrdef to distinguish
>> regular defaults from these things.  Or maybe they even need their own
>> catalog, depending on whether you think dependency analysis would want
>> to distinguish them from regular defaults using just the classed.
>>
>> Now, as I just pointed out in another mail, realistically we're probably
>> going to restrict the feature to simple constants, which'd mean they will
>> depend only on the column's type and can never need any dependencies of
>> their own.  So we could take the shortcut of just storing them in a new
>> column in pg_attribute.

I agree with you.

>> But maybe that's shortsighted and we'll
>> eventually wish we'd done them as full-fledged separate objects.

I don't think so. If we try to implement non-blocking adding columns
with volatile defaults (and for instance update old rows in the
background), we can end up with the next situation:

CREATE TABLE a(i bigint PRIMARY KEY);
INSERT INTO a SELECT generate_series(1,100);

ALTER TABLE a ADD COLUMN b bigserial CHECK (b BETWEEN 1 AND 100);

For indexes (even unique) created concurrently similar troubles are
solved with a "not valid" mark, but what to do with a heap if we try
to do it in the background?

>> But on the third hand ... once one of these is in place, how could you
>> drop it separately from the column?  That would amount to a change in the
>> column's stored data, which is not what one would expect from dropping
>> a separate object.  So maybe it's senseless to think that these things
>> could ever be distinct objects.  But that definitely leads to the
>> conclusion that they're constants and nothing else.

> I cannot follow this reasoning.
> Let’s look past what PG does today:
> For each row (whether that’s necessary or not) we evaluate the expression,
> compute the value and
> store it in the rewritten table.
> We do not record dependencies on the “pedigree” of the value.
> It happened to originate from the DEFAULT expression provided with the ADD
> COLUMN,
> but that is not remembered anywhere.
> All we remember is the value - in each row.
> So the only change that is proposed here - when it comes right down to it -
> is to remember the value once only (IFF it is provably the same for each row)
> and thus avoid the need to rewrite the table.
> So I see no reason to impose any restriction other than “evaluated value is
> provably the same for every row”.

Tom says the same thing. The expression at the end should be a value
if it allows to avoid rewriting table.

> Regarding the location of storage.
> I did start of using pg_attrdef, but ran into some snags.
> My approach was to add the value as an extra column (rather than an extra
> row).
> That caused trouble since a SET DEFAULT operation is decomposed into a DROP
> and a SET and
> preserving the value across such operations did not come naturally.

I'm sorry for making you be confused. The best way is to use an extra
column in the pg_attribute to store serialized value.

> If we were to use extra rows instead that issue would be solved, assuming we
> add a “default kind” sort of column.
> It would dictate the storage format though which may be considered overkill
> for a a constant.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-06 Thread Vitaly Burovoy

On 10/6/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> Ough. I made a mistake about pg_attribute because I forgot about the
> pg_attrdef.
> If we do not merge these tables, the pg_attrdef is the best place to
> store evaluated expression as a constant the same way defaults are
> stored in adbin.

Oops. While I was writing the previous email, Tom explained necessity
of the pg_attrdef.
With that explanation it is obvious that I was wrong and a value for
pre-existing rows should be in a new column in the pg_attribute.
All the other thoughts from my previous email stand good.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-06 Thread Vitaly Burovoy

On 10/6/16, Serge Rielau <se...@rielau.com> wrote:
>> On Oct 6, 2016, at 9:01 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> BTW, it also occurs to me that there are going to be good implementation
>> reasons for restricting it to be a hard constant, not any sort of
>> expression.  We are likely to need to be able to insert the value in
>> low-level code where general expression evaluation is impractical.
>>
> Yes, the padding must happen primarily in the getAttr() routines.
> Clearly we do not want to evaluate an expression there.
> But what speaks against evaluating the expression before we store it?
> After all we seem to all agree that this only works if the expression
> computes to the same constant all the time.
>
> If we do not want to store an “untyped” datum straight in pg_attribute as a
> BYTEA (my current approach)

Ough. I made a mistake about pg_attribute because I forgot about the pg_attrdef.
If we do not merge these tables, the pg_attrdef is the best place to
store evaluated expression as a constant the same way defaults are
stored in adbin.

> we could store the pretty printed version of the constant

It is a wrong way. It ruins commands like "ALTER COLUMN ... TYPE ... USING"

> and evaluate that when we build the tuple descriptor.
> This happens when we load the relation into the relcache.
>
> Anyway, I’m jumping ahead and it’s perhaps best to let the code speak for
> itself once I have the WIP patch ready so we have something concrete to
> discuss

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-06 Thread Vitaly Burovoy

On 10/6/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Serge Rielau <se...@rielau.com> writes:
>>> On Oct 6, 2016, at 5:25 AM, Vitaly Burovoy <vitaly.buro...@gmail.com>
>>> wrote:
>>>> Which makes me think we should call this missing_value or absent_value

Be honest Simon Rigg's wrote that words.

>>>> so its clear that it is not a "default" it is the value we use for
>>>> rows that do not have any value stored for them.
>
>> I like Tom’s “creation default”. Another one could be “initial default”.
>> But that, too, can be misread.
>
> Something based on missing_value/absent_value could work for me too.
>
> If we name it something involving "default", that definitely increases
> the possibility for confusion with the regular user-settable default.
>
> Also worth thinking about here is that the regular default expression
> affects what will be put into future inserted rows, whereas this thing
> affects the interpretation of past rows.  So it's really quite a different
> animal.  That's kind of leading me away from calling it creation_default.
>
> BTW, it also occurs to me that there are going to be good implementation
> reasons for restricting it to be a hard constant, not any sort of
> expression.  We are likely to need to be able to insert the value in
> low-level code where general expression evaluation is impractical.

Yes, I mentioned that it should be evaluated and stored as a value
because user functions can be changed (besides the speed reason),
that's why I like the "value" in its name. The "default" is usually
identified with expressions, not values (which are particular cases of
expressions).

Serge mentioned the phrase "pre-existing rows", which makes me think
about something like "pre_existing_value"

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-06 Thread Vitaly Burovoy

On 10/6/16, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 6 October 2016 at 04:43, Serge Rielau <se...@rielau.com> wrote:
>>>> Or should I compose some sort of a design document?
>
> Having read this thread, I'm a little unclear as to what you're
> writing now, though there's definitely good ideas here.
>
> I think it would be beneficial to write up a single coherent
> description of this, including behaviour and a small sketch of
> implementation, just so everyone knows what this is. No design doc,
> but a summary.

At the moment I think it can also be a good idea to post the current
patch as a Proposal or a WIP to get initial feedback.

> It would be very useful to be able to do this...
> ALTER TABLE foo ADD last_updated_timestamp timestamp default
> current_timestamp
> so that it generates a constant value and stores that for all prior
> rows, but then generates a new value for future rows.

Yes, it works for stable "now()" but does not work for volatile
functions like "random()", "uuid_generate_v4()" or default for serial
columns. The only possible way I can see is to check an expression has
only "T_Const"s, static and stable functions. In such case the
expression can be evaluated and the result be saved as a value for
absented attributes of a tuple. In the other case save NULL there and
rewrite the table.

> Which makes me think we should call this missing_value or absent_value
> so its clear that it is not a "default" it is the value we use for
> rows that do not have any value stored for them.

It is definitely a default for a user, it is not a regular default internally.
I'm not a native speaker, "absent_value" can be mixed up with a NULL.
As for me the best phrase is "pre-add-column-default", but it is
impossible to use it as a column name. :-(
It is still an open question.

(I remember funny versions in a discussion[1] when people tried to
choose a name for a function reversed to pg_size_pretty...)

[1] 
https://www.postgresql.org/message-id/flat/cafj8prd-tgodknxdygecza4on01_urqprwf-8ldkse-6bdh...@mail.gmail.com

--
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Serge Rielau <se...@rielau.com> wrote:
>On Wed, Oct 5, 2016 at 4:19 PM, Vitaly Burovoy <vitaly.buro...@gmail.com> 
>wrote:
>> But what I discover for myself is that we have pg_attrdef separately
>> from the pg_attribute. Why?
>> Is it time to join them? For not presented defaults it would be only
>> one bit per row(if we avoid "adsrc" as it is recommended), but for a
>> separate table it is 11 columns with two indexes now...
>
> In terms of footprint we may be able to remove pg_attrdef.
> I would consider that orthogonal to the proposed feature though.

It was a question mostly to the community rather than to you.

> The internal representation of defaults in the tuple descriptor still needs 
> to be a map of sorts.
>
> To comment on Pantelis SQL Server Reference:
> Other vendors such as Oracle and DB2 also support this feature.
>
> The listed restriction made me loop back to Vitaly’s original serial example:
> ALTER TABLE t ADD COLUMN c2 serial;
> and rethink Tom’s struct restriction to constants.

I'm sorry, the correct example with "now" should be:
ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT
'now'::text::timestamptz;

I wanted to show that for non-immutable (even stable) expression as a
default one each time selected old tuples gives different result.
But your implementation evaluates it before saving as a
"pre-add-column" default and it breakes the current behavior in a
different way.

> In PG the proposed feature would also have to be limited to immutable(?)
> default expressions to comply with existing behavior, which matches SQL 
> Servers.
>
> My current patch does not restrict that and thusly falsely "fills in" the 
> same value for all rows.

If you change the current behavior (just making it "faster") you
should save the current behavior.
The best case is to determine whether it is possible to do a "fast"
change and fill the "pre-add-column" default parameter or leave the
current "long" behavior.

I assure it is often enough to add columns with a default value which
fills the current rows by non-static values.
One of them is adding a serial column (which is a macros for "CREATE
SEQUENCE+SET NULL+SET DEFAULT
nextval(just_created_sequence_relation)"), others are
"uuid_generate_vX(...)" and "random()"

On 10/5/16, Serge Rielau <se...@rielau.com> wrote:
> I want to point out as a minor "extension" that there is no need for the
> default to be immutable. It is merely required that the default is evaluate
> at time of ADD COLUMN and then we remember the actual value for the exist
> default, rather than the parsed expression as we do for the "current"
> default.

I don't think it will be accepted.

On 10/5/16, Serge Rielau <se...@rielau.com> wrote:
> Thanks.
> This would be my first contribution.
> I take it I would post a patch based on a recent PG 9.6 master for review?

Your patch must be based on a just "master" branch.
If you haven't seen wiki pages [1], [2] and [3], it is the time to do
it (and related ones).

> Or should I compose some sort of a design document?

Since Tom Lane, Andres Freund and other people agreed "it does work",
you may post a patch to a new thread and write a short (but clean
enough) description with a link to the current thread. Examples can be
seen by links from the CF[4].
Nevertheless it is important to honor all thoughts mentioned here
because if your patch breaks the current behavior (with no significant
reason) it is senseless (even if it matches the behavior of other
RDBMS) and will not be committed.

Of cource, feel free to ask.

[1] https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F
[2] https://wiki.postgresql.org/wiki/Developer_FAQ
[3] https://wiki.postgresql.org/wiki/Submitting_a_Patch
[4] https://commitfest.postgresql.org/11/

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I wrote:
>> Need a better name for the concept, since evidently this name isn't
>> conveying the idea.
>
> Maybe "creation default" would work better?  Point being it's the
> default value at the time of column creation.

Hmm... Personaly for me the original topic name is good enough.


But what I discover for myself is that we have pg_attrdef separately
from the pg_attribute. Why?
Is it time to join them? For not presented defaults it would be only
one bit per row(if we avoid "adsrc" as it is recommended), but for a
separate table it is 11 columns with two indexes now...
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Andres Freund <and...@anarazel.de> wrote:
> On 2016-10-05 15:44:56 -0700, Jeff Janes wrote:
>> On Wed, Oct 5, 2016 at 3:29 PM, Andres Freund <and...@anarazel.de> wrote:
>> > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
>> > INSERT id = 1;
>> > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1;
>> > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
>> > INSERT id = 2;
>> > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 2;
>> > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
>> > INSERT id = 3;
>> > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 3;
>> >
>> > The result here would be that there's three rows with a default value
>> > for foo that's the same as their id. None of them has that column
>> > present in the row.
>> >
>>
>> My understanding is that all of those would be materialized.
>
> But that'd require a table rewrite, as none of the above INSERTs were
> done when a default was in place.

Since they did not have the default value, that tuples are written
with actual TupleDesc.natts where att_isnull for "withdefault" column
is set (actually the column does not have default for inserted tuples
in your case).

> But each has a different "applicable" default value.

No, their values are constructed "from scratch", not fetched from a
heap, so "pre-alter-add-column" default is not applicable for them.

>> The only
>> default that isn't materialized is the one in effect in the same
>> statement
>> in which that column was added.  Since a column can only be added once,
>> the
>> default in effect at the time the column was added can never change, no
>> matter what you do to the default later on.
>
> DROP DEFAULT pretty much does that, because it allows multiple (set of)
> rows with no value (or a NULL) for a specific column, but with differing
> applicable default values.

DROP DEFAULT is for "post-alter-add-column" tuples, it does not
affects "pre-alter-add-column" ones.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 10/5/16, Andres Freund <and...@anarazel.de> wrote:
>> On 2016-10-05 15:23:05 -0700, Vitaly Burovoy wrote:
>>> On 10/5/16, Andres Freund <and...@anarazel.de> wrote:
>>> > On 2016-10-05 11:58:33 -0700, Serge Rielau wrote:
>>> >> Dear Hackers,
>>> >> I’m working on a patch that expands PG’s ability to add columns to a
>>> >> table
>>> >> without a table rewrite (i.e. at O(1) cost) from the
>>> >> nullable-without-default to a more general case.
>>> >
>>> > If I understand this proposal correctly, altering a column default
>>> > will
>>> > still have trigger a rewrite unless there's previous default?
>>>
>>> No, "a second “exist default"" was mentioned, i.e. it is an additional
>>> column in a system table (pg_attribute) as default column values of
>>> the "pre-alter" era. It solves changing of the default expression of
>>> the same column later.
>>
>> Don't think that actually solves the issue. The default might be unset
>> for a while, for example. Essentially you'd need to be able to associate
>> arbitrary number of default values with an arbitrary set of rows.
>>
>> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
>> INSERT id = 1;
>> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1;
>> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
>> INSERT id = 2;
>> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 2;
>> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
>> INSERT id = 3;
>> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 3;
>>
>> The result here would be that there's three rows with a default value
>> for foo that's the same as their id. None of them has that column
>> present in the row.
>
> I'm sorry, while I was writting "pre-alter" I meant
> "pre-alter-add-column" era (not "pre-alter-set-default"), all later
> default changes "current" default, whereas "pre-alter-add-column" adds
> value if current column number < TupleDesc.natts.
>
> All your DDL are in the "post-alter-add-column" era.

I'm so sorry, I was in a hurry. Of course,
- if current column number < TupleDesc.natts.
+ if current column number > TupleDesc.natts.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Andres Freund <and...@anarazel.de> wrote:
> On 2016-10-05 15:23:05 -0700, Vitaly Burovoy wrote:
>> On 10/5/16, Andres Freund <and...@anarazel.de> wrote:
>> > On 2016-10-05 11:58:33 -0700, Serge Rielau wrote:
>> >> Dear Hackers,
>> >> I’m working on a patch that expands PG’s ability to add columns to a
>> >> table
>> >> without a table rewrite (i.e. at O(1) cost) from the
>> >> nullable-without-default to a more general case.
>> >
>> > If I understand this proposal correctly, altering a column default will
>> > still have trigger a rewrite unless there's previous default?
>>
>> No, "a second “exist default"" was mentioned, i.e. it is an additional
>> column in a system table (pg_attribute) as default column values of
>> the "pre-alter" era. It solves changing of the default expression of
>> the same column later.
>
> Don't think that actually solves the issue. The default might be unset
> for a while, for example. Essentially you'd need to be able to associate
> arbitrary number of default values with an arbitrary set of rows.
>
> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
> INSERT id = 1;
> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1;
> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
> INSERT id = 2;
> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 2;
> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
> INSERT id = 3;
> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 3;
>
> The result here would be that there's three rows with a default value
> for foo that's the same as their id. None of them has that column
> present in the row.

I'm sorry, while I was writting "pre-alter" I meant
"pre-alter-add-column" era (not "pre-alter-set-default"), all later
default changes "current" default, whereas "pre-alter-add-column" adds
value if current column number < TupleDesc.natts.

All your DDL are in the "post-alter-add-column" era.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Andres Freund <and...@anarazel.de> wrote:
> On 2016-10-05 11:58:33 -0700, Serge Rielau wrote:
>> Dear Hackers,
>> I’m working on a patch that expands PG’s ability to add columns to a table
>> without a table rewrite (i.e. at O(1) cost) from the
>> nullable-without-default to a more general case.
>
> If I understand this proposal correctly, altering a column default will
> still have trigger a rewrite unless there's previous default?

No, "a second “exist default"" was mentioned, i.e. it is an additional
column in a system table (pg_attribute) as default column values of
the "pre-alter" era. It solves changing of the default expression of
the same column later.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs

2016-10-05 Thread Vitaly Burovoy

On 10/5/16, Serge Rielau <se...@rielau.com> wrote:
> Dear Hackers,
>
> I’m working on a patch that expands PG’s ability to add columns to a table
> without a table rewrite (i.e. at O(1) cost) from the
> nullable-without-default to a more general case. E.g.
>
> CREATE TABLE T(pk INT NOT NULL PRIMARY KEY);
> INSERT INTO T VALEUS (1), (2), (3);
> ALTER TABLE T ADD COLUMN c1 INTEGER NOT NULL DEFAULT 5;
> INSERT INTO T VALUES (4, DEFAULT);
> ALTER TABLE T ALTER COLUMN SET DEFAULT 6;
> INSERT INTO T VALUS (5, DEFAULT);
> SELECT * FROM T ORDER BY pk;
> =>
> (1, 5),
> (2, 5),
> (3, 5),
> (4, 5),
> (5, 6);
>
> Rows 1-3 have never been updated, yet they know that their values of c1 is
> 5.
>
> The requirement is driven by large tables for which add column takes too
> much time and/or produces too large a transaction for comfort.
>
> In simplified terms:
>
> * a second “exist default” is computed and stored in
> the catalogs at time of AT ADD COLUMN
>
> * The exist default is cached in the tuple descriptor (e.g in attrdef)
>
> * When one of the getAttr or copytuple related routines is invoked
> the exist default is filled in instead of simply NULL padding if the
> tuple is shorter the requested attribute number.
>
> Is there an interest in principle in the community for this functionality?

Wow! I think it would be great! It also solves huge vacuuming after
rewriting the table(s).
Just pay attention to corner cases like indexes, statistics and speed.

But I'd like to see solution for more important cases like:
CREATE TABLE t (pk INT NOT NULL PRIMARY KEY);
INSERT INTO t VALUES (1), (2), (3);
ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT 'now';
SELECT * FROM t ORDER BY pk;
ALTER TABLE t ADD COLUMN c2 serial;
SELECT * FROM t ORDER BY pk;
INSERT INTO t(pk) VALUES (4);
SELECT * FROM t ORDER BY pk;

P.S.: I really think it is a good idea, just some research is
necessary and covering corner cases...

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Small doc fix

2016-10-05 Thread Vitaly Burovoy

Hello, hackers,

I've just noticed an extra word in a sentence in the docs in the
"parallel.sgml".
It seems the sentence was constructed one way and changed later with
the extra word left.

Please, find the fix attached.

-- 
Best regards,
Vitaly Burovoy


pg-docs-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: ON UPDATE REMOVE foreign key action

2016-10-04 Thread Vitaly Burovoy

On 10/4/16, Kirill Berezin <ene...@exsul.net> wrote:
> Disclaimer: sorry, i dont understand, should i reply to each of you
> personally, or just answer to channel. Some feedbacks were sended in
> personal, and some include channel copy.

Usually discussions are in the list, therefore you should use "reply
to all" (see [1]).
Exception is when a sender notes "Off the list".

> Thanks for responses, you understand it correctly.
>
> When i said "anybody", i mean inclusive owner himself. For example cookie
> poisoning.
> There is no "another" session, technically. They similar to the server,
> they even can have same IP.
> Yes, we can't prevent it with CSRF cookies, but it is not the point of
> current conversation.
>
> I can make business logic outside table: make extra query.

Good decision. Your case needs exactly what you've just written.

> Im just dont like how it looks from perspective of encapsulation.
> Each table should describe itself, like object in OOP language.
> With SQL constructions or triggers/constraits.

SQL is not OOP. There is no "encapsulation".

> Second part of my use case is data cache.

Hmm. Usage of RDBMS as a cache with an overhead for Isolation and
Durability (from ACID)? Really?
As for me it is a bad idea for most cases.

> When user update
> version(generation), cache should be flushed. As easy example: imagine i am
> fetching currency value. And till end of the day, i am honor current
> course. (Or any other information, that has certain origin checkpoints).
> When i am updating origin state (current day, server version, ip address,
> neural network generation), i am have to invalidate all previous data.

It is a bad example. Companies working with currency exchange rates
always keep their values as historical data.

> Like i am calculating digits of the square root, of some number. The more i
> spend time, the closer my approx value to irrational result. But when
> original value has changed - all previous data does not make sense. I am
> flushing it and starting from digit 1.

Why do you "update" original value instead of deleting old one and
inserting new value?

> This is allegorical examples to my real-world cases. I may try imagine some
> hypothetical situations, when this functionality more welcomed. But, i am
> respect reasons why do not apply this proposal. If my update didn't shift
> the balance, its ok. on update trigger is not such painful.

All your cases (except the exchange rate one) can be done using two
queries: delete original row (which deletes other linked data "ON
DELETE CASCADE") and insert a new one. You don't even have to use
transactions!
If your business logic is so "OOP", you can use stored procedures, but
introducing new grammar specially for concrete task is a bad idea.

Of course at first sight there is a meaningless sequence "ON UPDATE
SET (NULL|DEFAULT)", but the meaning of SET NULL and SET DEFAULT for
both ON UPDATE and ON DELETE is using them for "unlinking" data from
the referenced one. It is similar to "NO ACTION" but explicitly change
them as they are no longer connected to the referenced row (by
referencing column list).

Also your proposal is not consistent: ON UPDATE REMOVE (DELETE?), but
ON DELETE - what? again remove/delete?

[1] https://wiki.postgresql.org/wiki/Mailing_Lists#Using_the_discussion_lists
-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: ON UPDATE REMOVE foreign key action

2016-10-03 Thread Vitaly Burovoy

On 10/3/16, Kirill Berezin <ene...@exsul.net> wrote:
> *One-line Summary:* On foreign key update we unable to remove all depended
> records. Currently we have "ON REMOVE CASCADE DELETE", but no "ON UPDATE
> CASCADE DELETE". We can only update field to NULL or DEFAULT.

I think there are three causes why we don't have it implemented.
The first one is that there is no such grammar in the SQL spec (your
version is also wrong: SQL spec has "ON DELETE CASCADE" as well as "ON
DELETE CASCADE" [or any other action instead of "CASCADE"]).

The second one is in almost all cases there is no reason to delete
rows because of updating referenced row. If these rows are still
connected, they should be updated, if not --- left as is ("NO ACTION")
or with reference link deleted ("SET NULL" or "DEFAULT").
These rows has data, that's why they are still in tables. They can be
deleted (by reference) if and only if "parent" or "linked" data (all
data, not just referenced key) is deleted.

> *Business Use-case:* Cache expiration on hash/version update. Revoke all
> access on account id update.

> In my case i met this situation: I am using access links to share user
> account. Account owner can give private link to somebody, and its session
> become mirrored. (Owner access to account granted).

And the third cause is avoiding of bad design. If you has to give
access to anyone and you know access will be revoked soon (or late),
it is wise to give private link with different identificator which can
be easily found and removed by a grantor id (your id).

> You cant imagine facebook desktop and mobile sessions.

Which, of course, have different session ids. You can revoke session
without renaming your own.

> It's just shortcut for
> entering credentials. Now i am implementing "revoke all but me". Its done
> simple, since each user is uuid indexed, i am just generate new uuid for
> current account. Old uuid become invalid to other sessions - since no
> record is found in base.
> I want to remove any pending access links, prevent bad guy restore access.
> I can possibly set linked account to NULL,

Why just don't delete them when grantor revokes access?

> and then clear record on
> expiration, but i feel that automatically removing on update event is more
> rational.

I personally don't see necessity to introduce new non-spec grammar.
If you think I has not understood you, send an example with schema ---
what you have now and how you expect it should be.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_hba_file_settings view patch

2016-10-03 Thread Vitaly Burovoy

On 10/2/16, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Mon, Oct 3, 2016 at 3:25 PM, Vitaly Burovoy <vitaly.buro...@gmail.com>
> wrote:
>> I guess for ability to use filtering like:
>>
>> SELECT * FROM pg_hba_rules WHERE options->>radiusserver LIKE
>> '%.example.com';
>>
>> I think it would be harder if options is an array of strings...
>
> With unnest() and a matching pattern, not that hard but..

I'm not a user of that feature, and I don't know how pg_hba files look
like in really big companies...

But for me filtering is more complicated than just a single comparison.
What about more complex filtering --- several radiusserver and a user(s):

WHERE
options->>radiusserver = ANY(ARRAY['a.example.com', 'g.example.com'])
AND
options->>radiusidentifier = ANY(ARRAY['ID_a', 'ID_b', 'ID_c',
'ID_d', 'ID_e'])  -- or even a subquery

Again, I don't know whether it will be widely used, but in case of
multiple param_name->param_value settings (column "options") I'd like
to see native key-value store rather than array of strings (according
to POLA).

I guess you're expecting "key=value" format as they are written in the
pg_hba file (and described in the doc), but sometimes they can be
parsed and output differs from exact pg_hba records (for instance look
at "ldapurl" parameter).

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_hba_file_settings view patch

2016-10-03 Thread Vitaly Burovoy

On 10/2/16, Michael Paquier <michael.paqu...@gmail.com> wrote:
> +   push_jsonb_string_key(, "map");
> +   push_jsonb_string_value(, hba->usermap);
> [...]
> +
> + options
> + jsonb
> + Configuration options set for authentication method
> +
> Why is it an advantage to use jsonb here instead of a simple array
> made of name=value? If they were nested I'd see a case for it but it
> seems to me that as presented this is just an overkill.

I guess for ability to use filtering like:

SELECT * FROM pg_hba_rules WHERE options->>radiusserver LIKE '%.example.com';

I think it would be harder if options is an array of strings...

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Question / requests.

2016-09-30 Thread Vitaly Burovoy

On 9/30/16, Francisco Olarte <fola...@peoplecall.com> wrote:
> Hello everyone.

Hello, Francisco!

> Also, although I feel confident in my coding I have zero knowledge of
> developing for postgres,

It is easy enough and all important steps are documented in the wiki.
Also some interesting things can be found in presentations from
hackers about how to hack PostgreSQL.

> and I am not confident in my git or testing habilities.
> I can develop it, as it is just modifying a single libpq
> client program and only in the easy part of the string lists and may
> be emitting a new error (as this can introduce a new failure mode of
> 'no tables to vacuum'), I can test it locally and produce a patch for
> that file, but I'm not confident on integrating it, making git patchs
> or going further, so I would like to know if doing that would be
> enough and then I can give the code to someone to review or integrate
> it.

Do your best and send a patch. No one is good enough at understanding
all the code base at once. There are lot of people who know different
parts of the code and who have ability to test patches in different
ways.

You can be sure you get a feedback, your code will not be merged to
the code base without deep review and independent testing.

Just be ready to improve your patch according to a feedback and be
ready that usually it takes several rounds of sending-review before
patch is committed.

Also you can follow a discussion from one of simple patches in a
commitfest to be familiar with the process.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Detect supported SET parameters when pg_restore is run

2016-09-27 Thread Vitaly Burovoy

On 9/27/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 9/27/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> (The other thing I'd want here is a --target-version option so that
>> you could get the same output alterations in pg_dump or pg_restore to
>> text.  Otherwise it's nigh undebuggable, and certainly much harder
>> to test than it needs to be.)
>
> I thought that way. I'm ready to introduce that parameter, but again,
> I see now it will influence only SET parameters. Does it worth it?

The only reason I have not implemented it was attempt to avoid users
being confused who could think that result of pg_dump (we need it
there for the plain text output) or pg_restore can be converted for
target version to be restored without new features (but now it is
wrong).

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Detect supported SET parameters when pg_restore is run

2016-09-27 Thread Vitaly Burovoy

On 9/27/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> On 9/27/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> I'm not exactly convinced that you did.  There's only one copy of
>>> Archive->remoteVersion, and you're overwriting it long before the
>>> dump process is over.
>
>> It does not seem that I'm "overwriting it long before the dump process
>> is over"...
>
> There's a lot that happens during RestoreArchive.  Even if none of it
> inspects remoteVersion today, I do not think that's a safe assumption to
> make going forward.

And... Khm... Note that even _now_ AHX->remoteVersion is set to a
database version pg_restore connects to... So all the code has it
during restoring process...

> The easiest counterexample is that this very bit of
> code you want to add does so.

The only change I've done is set remoteVersion to the maximum allowed
when output is the plain text format.

> I really do not want to get into a design
> that says "remoteVersion means the source server version until we reach
> RestoreArchive, and the target version afterwards".  That way madness lies.

It is only if you think about "remoteVersion" as
"sourceServerVersion", but even now it is not so.

Moreover RestoreArchive is a delimter only for pg_dump and only when
output is a plain text.
For other modes of the pg_dump RestoreArchive is not called at all.

> If we're going to try altering the emitted SQL based on target version,
> let's first create a separation between those concepts;

I've just found there is _archiveHandle.archiveRemoteVersion. Is it a
parameter you were searched for?
The pg_restore code does not use it (just as remoteVersion), but it
can do so if it is necessary.

> otherwise I will bet that we add more bugs than we remove.
>
> (The other thing I'd want here is a --target-version option so that
> you could get the same output alterations in pg_dump or pg_restore to
> text.  Otherwise it's nigh undebuggable, and certainly much harder
> to test than it needs to be.)

I thought that way. I'm ready to introduce that parameter, but again,
I see now it will influence only SET parameters. Does it worth it?

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Detect supported SET parameters when pg_restore is run

2016-09-27 Thread Vitaly Burovoy

On 9/27/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> On 9/27/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> The general policy has always been that pg_dump output is only expected
>>> to
>>> restore without errors into a server that's the same or newer version as
>>> pg_dump (regardless of the server version being dumped from).
>
>> Why can't I use it if pg_dump92/pg_restore92 have the same behavior as
>> pg_dump96/pg_restore96 except the SET block?
>
> That's a pretty large "if", and not one I'm prepared to commit to.
> Before you assert that it's true, you might want to reflect on the
> size of the diff between 9.2 and 9.6 pg_dump (hint: it's over 20K
> lines in -u format).
>
>>> Either that, or the patch is overwriting pg_dump's idea of what the
>>> source server version is at the start of the output phase, which is
>>> likely to break all kinds of stuff when dumping from an older server.)
>
>> I agree, that's why I left current behavior as is for the plain text
>> output.
>
> I'm not exactly convinced that you did.  There's only one copy of
> Archive->remoteVersion, and you're overwriting it long before the
> dump process is over.

I'm sorry, I'm not so good at knowing the code base but I see that my
patch changes Archive->remoteVersion in the "RestoreArchive" which is
called after all schema is fetched to pg_dump's structs and just
before output is written, moreover, there is a comment that it is a
final stage (9.2 has the same block of code):
...
/*
 * And finally we can do the actual output.
 *...
 */
if (plainText)
RestoreArchive(fout);

CloseArchive(fout);

exit_nicely(0);
}

It does not seem that I'm "overwriting it long before the dump process
is over"... Also pg_dump -v proves that changing remoteVersion happens
after all "pg_dump: finding *", "pg_dump: reading *" and "pg_dump:
saving *".

> That'd break anything that consulted it to
> find the source server's version after RestoreArchive starts.

I'm sorry again, could you or anyone else point me what part of the
code use Archive->remoteVersion after schema is read?
I set up break point at the line
AHX->remoteVersion = 99;
and ran pg_dump with plain text output to a file (via "-f" option).
When the line was reached I set up break points at all lines I could
find by a script:

egrep -n -r --include '*.c' 'remoteVersion [><]' src/bin/pg_dump/ |
awk -F: '{print "b "$1":"$2}'

(there were 217 of them) and continued debuging. All three fired break
points were expectedly in _doSetFixedOutputState (at the lines where
my patch introduced them) after which the program exited normally
without stopping.

Also I wonder that the process of "restoring" consults
Archive->remoteVersion because currently in the pg_restore to plain
text output remoteVersion is zero. It means restoring process would
output schema to the oldest supported version, but is not so.

> I could get behind a patch that split remoteVersion into sourceVersion
> and targetVersion and made sure that all the existing references to
> remoteVersion were updated to the correct one of those.  After that
> we could do something like what you have here in a less shaky fashion.
> As Robert noted, there'd probably still be a long way to go before it'd
> really work to use a newer pg_dump with an older target version, but at
> least we'd not be building on quicksand.

It means support of restoring from a newer version to an older one.
What's for? If new features are used it is impossible to restore
(port) them to an older version, if they are not, restoring process is
done (i guess) in 99% of cases.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Detect supported SET parameters when pg_restore is run

2016-09-27 Thread Vitaly Burovoy

On 9/27/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> On Mon, Sep 26, 2016 at 9:56 PM, Vitaly Burovoy
>> <vitaly.buro...@gmail.com> wrote:
>>> We do dump/restore schemas/data via custom/dir formats and we have to
>>> keep several client versions for 9.2, 9.4 and 9.5 versions on local
>>> workstations because after pg_restore95 connects to 9.2, it fails when
>>> it sets run-time parameters via SET:
>
>> I think our policy is that a newer pg_dump needs to work with an older
>> version of the database, but I don't think we make any similar
>> guarantee for other tools, such as pg_restore.  It's an interesting
>> question whether we should try a little harder to do that, but I
>> suspect it might require more changes than what you've got in this
>> patch

Well... I'm not inclined to insert support of restoring from a higher
major version to a lower one, because it can lead to security issues
(e.g. with RLS). But my observation is that all new features supported
by the "pg_dump" are "incremental", e.g. the last feature "parallel"
for pg_dump/pg_restore --- lack of "PARALLEL UNSAFE" (which is by
default) from 9.6 and lack of it from pre-9.6.

That behavior allows newer versions of pg_restore to use dumps from DB
of older versions because of lack of new features grammar. With the
patch I'm able to use pg_dump96/pg_restore96 for our database of 9.2
(dump from 9.2 and restore to 9.2).

> The general policy has always been that pg_dump output is only expected to
> restore without errors into a server that's the same or newer version as
> pg_dump (regardless of the server version being dumped from).  If you try
> to restore into an older server version, you may get some errors, which we
> try to minimize the scope of when possible.  But it will never be possible
> to promise none at all.  I think the correct advice here is simply "don't
> use pg_restore -e -1 when trying to restore into an older server version".

Why can't I use it if pg_dump92/pg_restore92 have the same behavior as
pg_dump96/pg_restore96 except the SET block?
The patch does not give guarantee of a restoration, it just avoids
setting unsupported parameters for pg_restore the same way as pg_dump
does.
The other issues are for solving by the user who wants to restore to a
DB of older version.

> Taking a step back, there are any number of ways that you might get
> errors during a pg_restore into a DB that's not set up exactly as pg_dump
> expected.  Missing roles or tablespaces, for example.

It does not depends on a pg_dump/pg_restore version and can be soved
by using command line options:
--no-tablespaces --no-owner --no-privileges

> Again, the dump output is constructed so that you can survive those problems
> and bull ahead ... but not with "-e -1".

I think "-e -1" was invented specially for it --- stop restoring if
something is going wrong. Wasn't it?

> I don't see a very good reason why
> older-server-version shouldn't be considered the same type of problem.
>
> The patch as given seems rather broken anyway --- won't it change text
> output from pg_dump as well as on-line output from pg_restore?

My opinion --- no (and I wrote it in the initial letter): because it
is impossible to know what version of a database is used for that
plain text output. Users who use output to a plain text are able to
use sed (or something else) to delete unwanted rows.

> (That is,
> it looks to me like the SETs emitted by pg_dump to text format would
> depend on the source server version, which they absolutely should not.
> Either that, or the patch is overwriting pg_dump's idea of what the
> source server version is at the start of the output phase, which is
> likely to break all kinds of stuff when dumping from an older server.)

I agree, that's why I left current behavior as is for the plain text output.

> It's possible that we could alleviate this specific symptom by arranging
> for the BEGIN caused by "-1" to come out after the initial SETs, but
> I'm not sure how complicated that would be or whether it's worth the
> trouble.

It leads to change ERRORs to WARNINGs but current behavior discussed
above is left the same.
Why don't just avoid SET parameters when we know for sure they are not
supported by the server to which pg_restore is connected?

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Detect supported SET parameters when pg_restore is run

2016-09-26 Thread Vitaly Burovoy

Hackers,

At work we use several major versions of PostgreSQL, and developers
use non-local clusters for developing and debugging.
We do dump/restore schemas/data via custom/dir formats and we have to
keep several client versions for 9.2, 9.4 and 9.5 versions on local
workstations because after pg_restore95 connects to 9.2, it fails when
it sets run-time parameters via SET:

vitaly@work 01:21:26 ~ $ pg_restore95 --host DEV_HOST_9_2 -d DBNAME
--data-only -e -1 -Fc arhcivefile
Password:
pg_restore95: [archiver (db)] Error while INITIALIZING:
pg_restore95: [archiver (db)] could not execute query: ERROR:
unrecognized configuration parameter "lock_timeout"
Command was: SET lock_timeout = 0;

Of course, it can be fixed avoiding "--single-transaction", but if
there is inconsistent schema (or stricter constraints) part of
schema/data is already changed/inserted and a lot of errors are
generated for the next pg_restore run.

The pd_dump has checks in "setup_connection" function to detect what
to send after connection is done for dumping, but there is no checks
in _doSetFixedOutputState for restoring. If there are checks it is
possible to use a single version pg_dump96/pg_restore96 to
dump/restore, for example 9.2->9.2 as well as 9.4->9.4 and so on.

The only trouble we have is in "SET" block and after some research I
discovered it is possible not to send unsupported SET options to the
database.

Please, find attached simple patch.

For restoring to stdout (or dumping to a plain SQL file) I left
current behavior: all options in the SET block are written.
Also I left "SET row_security = on;" if "enable_row_security" is set
to break restoring to a DB non-supported version.

-- 
Best regards,
Vitaly Burovoy


detect_supported_set_parameters_for_pgrestore.001.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2016-09-12 Thread Vitaly Burovoy

On 9/12/16, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> Thank you for this extensive testing.  I will work on getting the bugs
> fixed.  Just a couple of comments on some of your points:
>
> On 9/9/16 11:45 PM, Vitaly Burovoy wrote:
>> It compiles and passes "make check" tests, but fails with "make
>> check-world" at:
>> test foreign_data ... FAILED
>
> I do not see that.  You can you show the diffs?

I can't reproduce it, it is my fault, may be I did not clean build dir.

>> 1. The standard requires "... ALTER COLUMN ... SET GENERATED { ALWAYS
>> | BY DEFAULT }" (9075-2:2011 subcl 11.20), but the patch implements it
>> as "... ALTER COLUMN ... ADD GENERATED { ALWAYS | BY DEFAULT } AS
>> IDENTITY"
>
> SET and ADD are two different things.  The SET command just changes the
> parameters of the underlying sequence.

Well... As for me ADD is used when you can add more than one property
of the same kind to a relation (e.g. column or constraint), but SET is
used when you change something and it replaces previous state (e.g.
NOT NULL, DEFAULT, STORAGE, SCHEMA, TABLESPACE etc.)

You can't set ADD more than one IDENTITY to a column, so it should be "SET".

> This can be implemented later and doesn't seem so important now.

Hmm. Now you're passing params to CreateSeqStmt because they are the same.
Is it hard to pass them to AlterSeqStmt (if there is no SET GENERATED")?

> The ADD command is not in the standard, but I needed it for pg_dump, mainly.
> I will need to document this.

Firstly, why to introduce new grammar which differs from the standard
instead of follow the standard?
Secondly, I see no troubles to follow the standard:
src/bin/pg_dump/pg_dump.c:
- "ALTER COLUMN %s ADD GENERATED ",
+ "ALTER COLUMN %s SET GENERATED ",

src/backend/parser/gram.y:
- /* ALTER TABLE  ALTER [COLUMN]  ADD GENERATED ... AS
IDENTITY ... */
- | ALTER opt_column ColId ADD_P GENERATED generated_when AS
IDENTITY_P OptParenthesizedSeqOptList
- c->options = $9;
+ /* ALTER TABLE  ALTER [COLUMN]  SET GENERATED ... */
+ | ALTER opt_column ColId SET GENERATED generated_when OptSeqOptList
- c->options = $7;

I guess "ALTER opt_column ColId SET OptSeqOptList" is easy to be
implemented, after some research "ALTER opt_column ColId RESTART [WITH
...]" also can be added.

(and reflected in the docs)

>> 14. It would be fine if psql has support of new clauses.
>
> What do you mean by that?  Tab completion?

Yes, I'm about it. Or tab completion is usually developed later?

>> 16. I think it is a good idea to not raise exceptions for "SET
>> GENERATED/DROP IDENTITY" if a column has the same type of identity/not
>> an identity. To be consistent with "SET/DROP NOT NULL".
>
> These behaviors are per SQL standard.

Can you point to concrete rule(s) in the standard?

I could not find it in ISO/IEC 9075-2 subclauses 11.20 "" and 11.21 "".
Only subclause 4.15.11 "Identity columns" says "The columns of a base
table BT can optionally include not more than one identity column."
(which you don't follow).

For instance, subclause 11.42 , General
Rules p.1 says explicitly about exception.
Or (for columns): 11.4 , General Rules p.3: "The
 in the  SHALL NOT be equivalent to
the  of any other column of T."

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Several additional thoughts:
1. I think it is wise to add ability to set name of a sequence (as
PG's extension of the standard) to SET GENERATED or GENERATED in a
relation definition (something like CONSTRAINTs names), without it it
is hard to fix conflicts with other sequences (e.g. from serial pseudo
type) and manual changes of the sequence ("alter seq rename", "alter
seq set schema" etc.).
2. Is it useful to rename sequence linked with identity constraint
when table is renamed (similar way when sequence moves to another
schema following the linked table)?
3. You're setting OWNER to a sequence, but what about USAGE privilege
to roles have INSERT/UPDATE privileges to the table? For security
reasons table is owned by a role different from roles which are using
the table (to prevent changing its definition).

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-12 Thread Vitaly Burovoy

--+---+--+-+-+---
 db1  | fred  | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 db2  | bob   | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(2 rows)

postgres=# set role bob;
SET
postgres=> CREATE DATABASE ss TEMPLATE db  -- shows both
db1  db2
postgres=> CREATE DATABASE ss TEMPLATE db2;
ERROR:  permission denied to create database
postgres=>

So a check for the CREATEDB privilege should be done at the point
whether to show CREATE DATABASE or not.
But if a user has privileges, Tom's version works fine.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Vitaly Burovoy

On 9/11/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> On 9/11/16, Kevin Grittner <kgri...@gmail.com> wrote:
>>> I was able to find cases during test which were not handled
>>> correctly with either version, so I tweaked the query a little.
>
>> Hmm. Which one? Attempt to "SET ROLE "?
>> Unfortunately, I after reading your letter I realized that I missed a
>> case (it is not working even with your version):
>
> I wasn't aware that this patch was doing anything nontrivial ...
>
> After looking at it I think it's basically uninformed about how to test
> for ownership.  An explicit join against pg_roles is almost never the
> right way for SQL queries to do that.  Lose the join and write it more
> like this:
>
> +"SELECT pg_catalog.quote_ident(d.datname) "\
> +"  FROM pg_catalog.pg_database d "\
> +" WHERE substring(pg_catalog.quote_ident(d.datname),1,%d)='%s' "\
> +"   AND (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'))"
>
> See the information_schema views for precedent.
>
>   regards, tom lane

Wow! I have not pay enough attention to a description of "pg_has_role".
Your version works for all my tests. Thank you.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Vitaly Burovoy

On 9/11/16, Kevin Grittner <kgri...@gmail.com> wrote:
> On Sat, Sep 10, 2016 at 12:26 AM, Vitaly Burovoy
>> Mark it as "Ready for committer".
>>
>> P.S.: While I was reviewing I simplified SQL query: improved version
>> only 2 seqscans instead of 3 seqscans with an inner loop in an
>> original one.
>> Please find a file "tab-complete-create-database-improved.patch"
>> attached.
>
> I was able to find cases during test which were not handled
> correctly with either version, so I tweaked the query a little.

Hmm. Which one? Attempt to "SET ROLE "?
Unfortunately, I after reading your letter I realized that I missed a
case (it is not working even with your version):

postgres=# CREATE GROUP g1;
CREATE ROLE
postgres=# CREATE GROUP g2;
CREATE ROLE
postgres=# GRANT g2 TO g1;
GRANT ROLE
postgres=# CREATE USER u1 CREATEDB;
CREATE ROLE
postgres=# GRANT g1 TO u1;
GRANT ROLE
postgres=# CREATE DATABASE db_tpl;
CREATE DATABASE
postgres=# ALTER DATABASE db_tpl OWNER TO g2;
ALTER DATABASE
postgres=# SET ROLE g1;
SET
postgres=> CREATE DATABASE db1 TEMPLATE db   -- shows nothing

postgres=> CREATE DATABASE db1 TEMPLATE db^C -- shows nothing
postgres=> RESET ROLE;
RESET
postgres=# SET ROLE u1;
SET
postgres=> CREATE DATABASE db1 TEMPLATE db   -- shows nothing

postgres=> CREATE DATABASE db1 TEMPLATE db^C -- shows nothing
postgres=> CREATE DATABASE db1 TEMPLATE db_tpl;
CREATE DATABASE


It seems if a user has the CREATEDB privilege and he is a member of a
group (role) which owns a database, he can use the database as a
template. But to support it recursive queries should be used. Is it
useless or should be fixed?

> Also, for future reference, please try to use white-space that
> matches surrounding code -- it make scanning through code less
> "jarring".

I tried to. Should "FROM" be at the same row as sub-"SELECT" is
placed? Or there should be something else (I'm now only about
white-space formatting)?
Surrounding code looks similar enough for me... =(

> Thanks all for the patch and the reviews!

Thank you!

> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] sequence data type

2016-09-10 Thread Vitaly Burovoy

On 9/10/16, Jim Nasby <jim.na...@bluetreble.com> wrote:
> On 9/3/16 6:01 PM, Gavin Flower wrote:
>> I am curious as to the use cases for other possibilities.
>
> A hex or base64 type might be interesting, should those ever get created...
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX

Hex or base64 are not data types. They are just different
representation types of binary sequences.
Even for bigints these representations are done after writing numbers
as byte sequences.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-09 Thread Vitaly Burovoy

Hello Sehrope Sarkuni,

I have reviewed the patch.
It is very simple (only an SQL query in the "psql" application changed).

It applies at the top of master.
It implements completion database names ("") for commands like
"CREATE DATABASE ... TEMPLATE ".
According to the documentation since 9.2 till devel a database can be
used as a template if it has a "datistemplate" mark or by superusers
or by their owners.
Previous implementation checked only "datistemplate" mark.

Tested manually in versions 9.2 and 10devel, I hope it can be
back-patched to all supported versions.
No documentation needed.

Mark it as "Ready for committer".


P.S.: While I was reviewing I simplified SQL query: improved version
only 2 seqscans instead of 3 seqscans with an inner loop in an
original one.
Please find a file "tab-complete-create-database-improved.patch" attached.

-- 
Best regards,
Vitaly Burovoy


tab-complete-create-database-improved.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2016-09-09 Thread Vitaly Burovoy

implementation:

postgres=# CREATE TABLE itest10 (a int GENERATED BY DEFAULT AS
IDENTITY, b text);
CREATE TABLE
postgres=# INSERT INTO itest10 DEFAULT VALUES;
INSERT 0 1
postgres=# INSERT INTO itest10 OVERRIDING USER VALUE VALUES(1,2);
INSERT 0 1
postgres=# INSERT INTO itest10 OVERRIDING USER VALUE DEFAULT VALUES;
ERROR:  syntax error at or near "DEFAULT" at character 43


---
12. Dump/restore is broken for some cases:

postgres=# CREATE SEQUENCE itest1_a_seq;
CREATE SEQUENCE
postgres=# CREATE TABLE itest1 (a int generated by default as identity, b text);
CREATE TABLE
postgres=# DROP SEQUENCE itest1_a_seq;
DROP SEQUENCE
postgres=# CREATE DATABASE a;
CREATE DATABASE
postgres=# \q

comp ~ $ pg_dump postgres | psql a
SET
SET
SET
SET
SET
SET
SET
SET
COMMENT
CREATE EXTENSION
COMMENT
SET
SET
SET
CREATE TABLE
ALTER TABLE
ALTER TABLE
COPY 0
ERROR:  relation "itest1_a_seq1" does not exist
LINE 1: SELECT pg_catalog.setval('itest1_a_seq1', 2, true);


---
13. doc/src/sgml/ref/create_table.sgml (5th chunk) has "TODO". Why?

---
14. It would be fine if psql has support of new clauses.


===
Also several notes:

15. Initializing attidentity in most places is ' ' but makefuncs.c has
"n->identity = 0;". Is it correct?

---
16. I think it is a good idea to not raise exceptions for "SET
GENERATED/DROP IDENTITY" if a column has the same type of identity/not
an identity. To be consistent with "SET/DROP NOT NULL".


-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2016-09-07 Thread Vitaly Burovoy

Hello,

The first look at the patch:

On 8/30/16, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> Here is another attempt to implement identity columns.  This is a
> standard-conforming variant of PostgreSQL's serial columns.
>
> ...
>
> Some comments on the implementation, and where it differs from previous
> patches:
>
> - The new attidentity column stores whether a column is an identity
> column and when it is generated (always/by default).  I kept this
> independent from atthasdef mainly for he benefit of existing (client)
> code querying those catalogs, but I kept confusing myself by this, so
> I'm not sure about that choice.  Basically we need to store four
> distinct states (nothing, default, identity always, identity by default)
> somehow.

I don't have a string opinion which way is preferred. I think if the
community is not against it, it can be left as is.

> ...
> - I did not implement the restriction of one identity column per table.
> That didn't seem necessary.

I think it should be mentioned in docs' "Compatibility" part as a PG's
extension (similar to "Zero-column Tables").

> ...
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Questions:
1. Is your patch implements T174 feature? Should a corresponding line
be changed in src/backend/catalog/sql_features.txt?
2. Initializing attidentity in most places is ' ' but makefuncs.c has
"n->identity = 0;". Is it correct?
3. doc/src/sgml/ref/create_table.sgml (5th chunk) has "TODO". Why?
4. There is "ADD GENERATED", but the standard says it should be "SET
GENERATED" (ISO/IEC 9075-2 Subcl.11.20)
5. In ATExecAddIdentity: is it a good idea to check whether
"attTup->attidentity" is the same as the given in "(ADD) SET
GENERATED" and do nothing (except changing sequence's options) in
addition to strict checking for "unset" (" ")?
6. In ATExecDropIdentity: is it a good idea to do nothing if the
column is already not a identity (the same behavior as DROP NOT
NULL/DROP DEFAULT)?
7. Is there any reason to insert CREATE_TABLE_LIKE_IDENTITY before
CREATE_TABLE_LIKE_INDEXES, not at the end?

Why do you change catversion.h? It can lead conflict when other
patches influence it are committed...

I'll have a closer look soon.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why we lost Uber as a user

2016-07-28 Thread Vitaly Burovoy

On 7/28/16, Geoff Winkless <pgsqlad...@geoff.dj> wrote:
> On 27 July 2016 at 17:04, Bruce Momjian <br...@momjian.us> wrote:
>
>> Well, their big complaint about binary replication is that a bug can
>> spread from a master to all slaves, which doesn't happen with statement
>> level replication.
>
> 
> I'm not sure that that makes sense to me. If there's a database bug that
> occurs when you run a statement on the master, it seems there's a decent
> chance that that same bug is going to occur when you run the same statement
> on the slave.
>
> Obviously it depends on the type of bug and how identical the slave is, but
> statement-level replication certainly doesn't preclude such a bug from
> propagating.
>
> Geoff

Please, read the article first! The bug is about wrong visibility of
tuples after applying WAL at slaves.
For example, you can see two different records selecting from a table
by a primary key (moreover, their PKs are the same, but other columns
differ).

From the article (emphasizing is mine):
The following query illustrates how this bug would affect our users
table example:
SELECT * FROM users WHERE id = 4;
This query would return *TWO* records: ...

And it affected slaves, not master.
Slaves are for decreasing loading to master, if you run all queries
(even) RO at master, why would you (or someone) have so many slaves?

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it

2016-06-15 Thread Vitaly Burovoy

On 6/15/16, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Thu, Jun 16, 2016 at 12:10 PM, Vitaly Burovoy
> <vitaly.buro...@gmail.com> wrote:
>> In the initial letter[1] I posted a digest from the SQL-2011 standard
>> and a conclusion as a design of a new patch.
>> Now I have more free time and I'm hacking it that way. The new patch
>> is at the very early stage, full of WIPs and TODOs. I hope it'll be
>> ready to be shown in a month (may be two).
>
> I have just read both your patch and the one of Alvaro, but yours does
> not touch pg_constraint in any way. Isn't that unexpected?

The consensus was reached to use CHECK constraints instead of new type
of constrains.
Alvaro made attempt[1] to write PoC in 2012 but it failed to apply on
top of master (and after solving conflicts led to core dumps) in Jan
2016.

I just rebased Alvaro's one to understand how he wanted to solve issue
and to run tests and queries. After all I sent rebased working patch
for anyone who wants to read it and try it without core dumps.

I have not published my patch for NOT NULLs yet.

Alvaro, the correct path[2] in the second message[3] of the thread.
What's wrong in it (I got the source in the previous[1] thread)?

[1] 
https://www.postgresql.org/message-id/flat/1343682669-sup-2...@alvh.no-ip.org
[2] 
https://www.postgresql.org/message-id/attachment/41886/catalog-notnull-2-c477e84_cleaned.patch
[3] 
https://www.postgresql.org/message-id/CAKOSWNnXbOY4pEiwN9wePOx8J%2BB44yTj40BQ8RVo-eWkdiGDFQ%40mail.gmail.com
-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it

2016-06-15 Thread Vitaly Burovoy

On 6/15/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paqu...@gmail.com> writes:
>> To put it short, it should not be possible to drop a NOT NULL
>> constraint on a child relation when its parent table is using it, this
>> should only be possible from the parent. Attached is a patch handling
>> this problem by adding a new function in pg_inherits.c to fetch the
>> list of parent relations for a given relation OID, and did some
>> refactoring to stick with what is done when scanning child relations.
>
> This doesn't sound like the right approach; in particular, it won't really
> help for deciding whether to propagate a DROP NOT NULL on a parent rel to
> its children.  What we've discussed in the past is to store NOT NULL
> constraints in pg_constraint, much like CHECK constraints are already, and
> use use-count logic identical to the CHECK case to keep track of whether
> NOT NULL constraints are inherited or not.  My feeling is that we'd keep
> the pg_attribute.attnotnull field and continue to drive actual enforcement
> off that, but it would just reflect a summary of the pg_constraint state.
>
> IIRC, Alvaro posted a WIP patch for that awhile back.  Not sure what the
> current state is.
>
>   regards, tom lane

The last thread about NOT NULLs as constraints is accessible by the link[1].
I rebased[2] Alvaro's patch to the actual master at that time, but I
have not repeated it since then.

In the initial letter[1] I posted a digest from the SQL-2011 standard
and a conclusion as a design of a new patch.
Now I have more free time and I'm hacking it that way. The new patch
is at the very early stage, full of WIPs and TODOs. I hope it'll be
ready to be shown in a month (may be two).

But it already forbids dropping NOT NULLs if they were set as result
of inheritance.


[1] 
https://www.postgresql.org/message-id/flat/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/attachment/41886/catalog-notnull-2-c477e84_cleaned.patch

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 10.0

2016-05-13 Thread Vitaly Burovoy

On 5/13/16, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, May 13, 2016 at 2:49 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> So I think we should solve these problems at a stroke, and save ourselves
>> lots of breath in the future, by getting rid of the whole "major major"
>> idea and going over to a two-part version numbering scheme.  To be
>> specific:
>>
>> * This year's major release will be 9.6.0, with minor updates 9.6.1,
>> 9.6.2, etc.  It's too late to do otherwise for this release cycle.
>>
>> * Next year's major release will be 10.0, with minor updates 10.1,
>> 10.2, etc.
>>
>> * The year after, 11.0.  Etc cetera.
>>
>> No confusion, no surprises, no debate ever again about what the next
>> version number is.
>>
>> This is by no means a new idea, but I think its time has come.
>
> Man, I hate version number inflation.

+1

> I'm running Firefox 45.0.2, and I think that's crazy. It hit 1.0 when were
> at aversion 7.4! Granted, this wouldn't be that bad, but

+1
Big numbers in version don't show big (or even equal) changes. How to
compare 7.0…8.0 and 8.0…9.0 and 9.0…9.6(10.0?)?
Is difference between FireFox 10.0…20.0 the same as FireFox 20.0…30.0?

> I have always thought that burning through a first digit a few times a decade
> is much more sensible than doing it every year.
> We just have to remember to bump the first digit occasionally.

+1
Better once a decade. =)

> If we don't want to stick with the current practice of debating when
> to bump the same digit, then let's agree that 10.0 will follow 9.6 and
> after that we'll bump the first digit after X.4, as we did with 7.X
> and 8.X.

Why just don't agree that since each release has good features and we
can't release multiple backward-incompatible features at one release
there is no sense of bumping the first digit earlier than the second
one reaches the value "9"?

Pretend the current version is not 9.5, but 95 (PostgreSQL 95),
current beta not 9.6 but 96 etc, but just divided by 10.

In fact it is similar to Tom's offer but avoided proposed bump to 10.0
and left increasing by 0.1 instead of 1.0:
curnt   Tom's proposed
 9.5.X  9.5.X
 9.6.X  9.6.X
 9.7.X 10.X
 9.8.X 11.X
 9.9.X 12.X
10.0.X 13.X
10.1.X 14.X
10.2.X 15.X
...
10.6.X 19.X
10.7.X 20.X
10.8.X 21.X
10.9.X 22.X
11.0.X 23.X
11.1.X 24.X

Etc.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 10.0

2016-05-13 Thread Vitaly Burovoy

On 5/13/16, Bruce Momjian <br...@momjian.us> wrote:
> On Fri, May 13, 2016 at 11:05:23AM -0400, Robert Haas wrote:
>> The major arguments advanced in favor of 10.0 are:
>>
>> - There are a lot of exciting features in this release.

If I'm mot mistaken each release introduced exciting features.
In my opinion:
9.5 - kind of UPSERT & RLS;
9.4 - reduced locks for ALTER TABLE, concurrent matview updates and JSONB(!!!);
9.3 - autoupdatable views & LATERAL joins & event trigger;
9.2 - RANGE data types & index-only scans & planner improvements;
9.1 - FDW & extensions & true SERIALIZABLE isolation level & data
modification in CTE.

Each version is a candidate to be 10.0 by the same reasons as people
try to show now.
Since each version has quite big changes and you can't predict how
great changes will be in the next release why don't just increase the
second digit up to 9? As for me version 9.9 is beautiful enough (and
version 9.9.9 is even more beautiful). =)

>> - Even if you aren't super-excited by the features in this release,
>> PostgreSQL 9.6/10.0 is a world away from 10.0, and therefore it makes
>
> I think you meant "a world away from 9.0".
>
> Actually, I don't see the distance from 9.0 as a valid argument as 9.5
> was probably also a world away from 9.0.
>
> I prefer calling 9.7 as 10.0 because there will be near-zero-downtime
> major upgrades with pg_logical (?),

I dream there is time of zero-downtime when admins are instantiate new
version of PostgreSQL, make it slave, wait until it finishes
synchronization (prepare to multimaster), replace both of them as
multimaster online (without downtime), then reconnect clients to the
new instance, then leave new instance as master and shutdown old
version.

If it appears in the release after next one (which is expected be 10.1
or 10.0 or 9.8, i.e. after the next year), whether we update major
number again (11.0) or leave numeration as is (it will be
10.2/10.1/9.9) or it is the real reason to bump 9.8->10.0 (but in this
particular case I'd leave 9.9)?

> and parallelism will cover more cases.
> Built-in logical replication in 9.7 would be big too, and
> another reason to do 9.7 as 10.0.
>
> On the other hand, the _start_ of parallelism in 9.6 could be enough of
> a reason to call it 10.0, with the idea that the 10-series is
> increasingly parallel-aware.  You could argue that parallelism is a much
> bigger deal than near-zero-downtime upgrades.

I think each developer/DBA search for different benefit from each release.
E.g. parallelism is not so important for OLTP developers and
near-zero-downtime is mostly for system administrators/DBA.

> I think the fundamental issue is whether we want to lead the 10.0 branch
> with parallelism, or wait for an administrative change like
> near-zero-downtime major upgrades and built-in logical replication.
>
>   Bruce Momjian  <br...@momjian.us>    http://momjian.us
>   EnterpriseDB http://enterprisedb.com

I'd vote for 9.6 up to 9.9 then 10.0…

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Vitaly Burovoy

On 5/5/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Please review and comment before Monday, if you can.
>
>   regards, tom lane

1. "YUriy Zhuravlev" should be "Yury Zhuravlev"
Previously[1] he had the first version in his signature, but I guess
it was misconfiguring, now[2] hi has second version.

2. I'd add Dean Rasheed as co-author to "Add pg_size_bytes()" since
being a committer he made more than cosmetic changes[3] but he was
humble enough not to mention himself in the commit message.

[1] http://www.postgresql.org/message-id/2723429.ZaCixaFn1x@dinodell
[2] 
http://www.postgresql.org/message-id/dd701b62-008d-4048-882e-20df0e4b5...@postgrespro.ru
[3] 
http://www.postgresql.org/message-id/caezatcxhz5ggfrfcf9_yw5h6wdxr68qdfiwhvmgfr3ascnq...@mail.gmail.com
-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

2016-05-04 Thread Vitaly Burovoy

On 5/3/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.buro...@gmail.com> writes:
>> On 4/27/16, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>>> Point 2 is where things differ from what I remember; my (possibly
>>> flawed) understanding was that there's no difference between those
>>> things.  Many (maybe all) of the things from this point on are probably
>>> fallout from that one change.
>
>> It is just mentioning that CHECK constraints have influence on
>> nullability characteristic, but it differs from NNC.
>> NNC creates CHECK constraint, but not vice versa. You can create
>> several CHECK "col IS NOT NULL" constraints, but only one NNC (several
>> ones by inheritance only?). And DROP NOT NULL should drop only those
>> CHECK that is linked with NNC (and inherited), but no more (full
>> explanation is in my initial letter).
>
> This seems to me to be a most curious reading of the standard.
> SQL:2011 11.4  syntax rule 17a says
>
>If a  is specified that contains
>the  NOT NULL, then it is equivalent to the
>following :
>
>   CND CHECK ( C IS NOT NULL ) CA
>
> As a rule, when the SQL spec says "equivalent", they do not mean "it's
> sort of like this", they mean the effects are indistinguishable.  In
> particular, I see nothing whatsoever saying that you're not allowed to
> write more than one per column.

1. SQL:2011 4.13 <Columns, fields, and attributes>:

 — If C is a column of a base table, then an indication of whether it is
 defined as NOT NULL and, if so, the constraint name of the associated 
table
 constraint definition.
 NOTE 41 — This indication and the associated constraint name 
exist for
 definitional purposes only and are not exposed through the 
COLUMNS view
 in the Information Schema.

There is only "constraint name", not "constraint names".

2. SQL:2011 11.15   General Rule 1:

... If the column descriptor of C does not contain an indication that
C is defined as NOT NULL, then:

And there is no rule 2. I.e. if the column is already set as NOT NULL
you can't specify it as NOT NULL again.

3. SQL:2011 11.15   General Rule 1.d:

 The following  is executed without further
Access Rule checking:
 ALTER TABLE TN ADD CONSTRAINT IDCN CHECK ( CN IS NOT NULL )


> So I don't like the proposal to add an attnotnullid column to
> pg_attribute.

Why and where to place it?

> What we'd talked about earlier was converting attnotnull
> into, effectively, a hint flag saying that there's at least one NOT NULL
> constraint attached to the column.  That still seems like a good approach
> to me.

Ok. But not only NOT NULL constraint, but also non-deferrable PK,
CHECK, domains, may be the strictest FK.

> When we're actually ready to throw an error for a null value,
> we could root through the table's constraint list for a not-null
> constraint name to report.

attnotnullid is not for reporting, it is for DROP NOT NULL and
recreating "CREATE TABLE" statements via pg_dump.

>  It doesn't matter which one we select, because
> constraint application order has never been promised to be deterministic;
> and a few extra cycles at that point don't seem like a big problem to me.
>
>   regards, tom lane

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

2016-05-03 Thread Vitaly Burovoy

I'm sorry for the late answer.

On 4/27/16, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Vitaly Burovoy wrote:
>
> Hi,
>
>> But before starting working on it I had a look at the SQL-2011
>> standard (ISO/IEC 9075-2)[3] and found that:
>>
>> 1. A name for a "NOT NULL" constraint  can be given by a table
>> definition (subcl. 11.4, "Format"->"column constraint definition").
>> 2. The standard splits NNC and CHECK constraints (subcl. 11.4,
>> "Format"-> "column constraint")
>
> Point 2 is where things differ from what I remember; my (possibly
> flawed) understanding was that there's no difference between those
> things.  Many (maybe all) of the things from this point on are probably
> fallout from that one change.

It is just mentioning that CHECK constraints have influence on
nullability characteristic, but it differs from NNC.
NNC creates CHECK constraint, but not vice versa. You can create
several CHECK "col IS NOT NULL" constraints, but only one NNC (several
ones by inheritance only?). And DROP NOT NULL should drop only those
CHECK that is linked with NNC (and inherited), but no more (full
explanation is in my initial letter).

>> III. "pg_attribute" table should have an "attnotnullid oid" as an
>> indicator of "NOT NULL" (p.4) and points to a CHECK constraint; It is
>> in addition to a "Nullability characteristic" "attnotnull" (p.3).
>> IV. "pg_constraint" should have a column "connotnullkey int2[]" as a
>> "list of the nullable columns" which references to
>> "pg_attribute.attnum" for fast checking whether a column is still
>> nullable after deleting/updating constraints or not. Array is
>> necessary for cases like "CHECK ((col1 IS NOT NULL) AND (col2 IS NOT
>> NULL))" and for nondeferrable PKs.
>
> I think these points warrant some more consideration. I don't like the
> idea that pg_attribute and pg_constraint are both getting considerably
> bloated to support this.

Ok, I'm ready for a discussion.

Two additional columns are necessary: one for pointing to an
underlying CHECK constraint (or boolean column whether current CHECK
is NNC or not) and second for fast computation of "attnotnull" (which
means "nullable characteristic") and ability to skip check if
"attnotnull" is set but not triggered (I think it'll improve
performance for inherited tables).

I think placing the first column (attnotnullid) to pg_attribute is
better because you can't have more than one value in it.

The second is obviously should be placed in pg_constraint.

>> P.S.:
>> Since the SQL standard defines that "col NOT NULL" as an equivalent to
>> "CHECK (col IS NOT NULL)" (p.8) what to do with that behavior:
>>
>> postgres=# create type t as (x int);
>> CREATE TYPE
>> postgres=# SELECT v, v IS NOT NULL AS should_be_in_table FROM
>> (VALUES('(1)'::t),('()'),(NULL)) AS x(v);
>>   v  | should_be_in_table
>> -+
>>  (1) | t
>>  ()  | f
>>  | f
>> (3 rows)
>>
>> "attnotnull" in such case is stricter, like "CHECK (col IS DISTINCT FROM
>> NULL)".
>>
>> Should such values (with NULL in each attribute of a composite type)
>> violate NOT NULL constraints?
>
> I wonder if the standard has a concept of null composite values.  If
> not, then there is no difference between IS NOT NULL and IS DISTINCT
> FROM NULL, which explains why they define NNC in terms of the former.

Yes, it has. The PG's composite type is "Row types" (subcl.4.8) in the standard.

The standard also differentiates IS [NOT] NULL and IS [NOT] DISTINCT FROM:

>>> Subcl. 8.8 :
>>> ...
>>> 1) Let R be the  and let V be the value of R.
>>> 2) Case:
>>>  a) If V is the null value, then “R IS NULL” is True and
>>>   the value of “R IS NOT NULL” is False.
>>>  b) Otherwise:
>>>   i) The value of “R IS NULL” is
>>>Case:
>>>1) If the value of every field of V is the null value, then True.
>>>2) Otherwise, False.
>>> ...
>>>
>>> Subcl. 8.15 
>>> ...
>>> 1) Let V1 be the value of  and let V2 be the value 
>>> of .
>>> ...
>>>  b) If V1 is the null value and V2 is not the null value, or if V1 is not 
>>> the null value and V2 is the null
>>> value, then the result is True.
>>> ...

In subcl.8.8 "each column" is mentioned, in 8.15 if one of value is
the null value and the other is not then nothing more is checked and
True is returned.

> I think your email was too hard to read because of excessive density,
> which would explain the complete lack of response.

Hmm. I decided it was "silently approved". =)

> I haven't had the chance to work on this topic again, but I encourage you to,
> if you have the resources.

Thank you, I think I'll find a time for it no earlier than the summer.

> (TBH I haven't had the chance to study your proposed design in detail, 
> either).

I hope somebody find a time to study it before someone sends a proposal.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [PATH] Jsonb, insert a new value into an array at arbitrary position

2016-03-31 Thread Vitaly Burovoy

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

I have reviewed this patch.
It applies cleanly at the top of current master (3501f71), compiles silently 
and implements declared feature.

The documentation describes behavior of the function with pointing to a 
difference between inserting into JsonbObject and JsonbArray.

The code is clean and commented. Linked comment is changed too.

Regression tests cover possible use cases and edge cases.


Notes for a committer:
1. pg_proc.h has changed, so the CATALOG_VERSION_NO must also be changed.
2. Code comments and changes in the documentation need proof-reading by a native
English speaker, which the Author and I are not.


The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position

2016-03-31 Thread Vitaly Burovoy

On 3/30/16, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> On 31 March 2016 at 05:04, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
>> The documentation changes still has to be fixed.
> Thanks for help. Looks like I'm not so good at text formulation. Fixed.
Never mind. I'm also not so good at it. It is still should be
rewritten by a native English speaker, but it is better to give him
good source for it.

===
I'm almost ready to mark it as "Ready for committer".

Few notes again.
1.
> a current item was pushed to parse state after JB_PATH_INSERT_BEFORE

I paid attention that the function's name 'pushJsonbValue' looks as
working with stack (push/pop), but there is no function "pop*" neither
in jsonfuncs.c nor jsonb_util.c.
That's why I wrote "it seems the logic in the code is correct" - it is
logical to insert new value if "before", then current value, then new
value if "after".
But yes, following by executing path to the "pushState" function
anyone understands "JsonbParseState" is constructed as a stack, not a
queue. So additional comment is needed why "JB_PATH_INSERT_AFTER"
check is placed before inserting curent value and
JB_PATH_INSERT_BEFORE check.

The comment can be located just before "if (op_type &
JB_PATH_INSERT_AFTER)" (line 3986) and look similar to:

/*
 * JsonbParseState struct behaves as a stack -- see the "pushState" function,
 * so check for JB_PATH_INSERT_BEFORE/JB_PATH_INSERT_AFTER in a reverse order.
 */

2.
A comment for the function "setPath" is obsolete: "If newval is null"
check and "If create is true" name do not longer exist.
Since I answered so late last time I propose the next formulating to
avoid one (possible) round:

/*
 * Do most of the heavy work for jsonb_set/jsonb_insert
 *
 * If JB_PATH_DELETE bit is set in op_type, the element is to be removed.
 *
 * If any bit mentioned in JB_PATH_CREATE_OR_INSERT is set in op_type,
 * we create the new value if the key or array index does not exist.
 *
 * Bits JB_PATH_INSERT_BEFORE and JB_PATH_INSERT_AFTER in op_type
 * behave as JB_PATH_CREATE if new value is inserted in JsonbObject.
 *
 * All path elements before the last must already exist
 * whatever bits in op_type are set, or nothing is done.
 */

===
I hope that notes are final (additional check in formulating is appreciated).

P.S.: if it is not hard for you, please rebase the patch to the
current master to avoid offset in func.sgml.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)

2016-03-30 Thread Vitaly Burovoy

On 3/29/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Pushed with minor adjustments.
>
>   regards, tom lane
>

Thank you very much!

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position

2016-03-30 Thread Vitaly Burovoy

On 3/25/16, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> Here is a new version of path, I hope I didn't miss anything. Few notes:
>
>> 4.
>> or even create a new constant (there can be better name for it):
>> #define JB_PATH_CREATE_OR_INSERT (JB_PATH_INSERT_BEFORE |
>> JB_PATH_INSERT_AFTER | JB_PATH_CREATE)
>
> Good idea, thanks.

You're welcome.
===

I'm sorry for the late letter.

Unfortunately, it seems one more round is necessary.
The documentation changes still has to be fixed.

The main description points to a "target section designated by path is
a JSONB array". It is a copy-paste from jsonb_set, but here it has
wrong context.
The main idea in jsonb_set is replacing any object which is pointed
(designated) by "path" no matter where (array or object) it is
located.
But in the phrase for jsonb_insert above you want to point to an upper
object _where_ "section designated by path" is located.

I'd write something like "If target section designated by path is _IN_
a JSONB array, ...". The same thing with "a JSONB object".

Also I'd rewrite "will act like" as "behaves as".

Last time I missed the argument's name "insert_before_after". It is
better to replace it as just "insert_after". Also reflect that name in
the documentation properly.

To be consistent change the constant "JB_PATH_NOOP" as "0x"
instead of just "0x0".

Also the variable's name "bool before" is incorrect because it
contradicts with the SQL function's argument "after" (from the
documentation) or "insert_after" (proposed above) or tests (de-facto
behavior). Moreover it seems the logic in the code is correct, so I
have no idea why "before ? JB_PATH_INSERT_BEFORE :
JB_PATH_INSERT_AFTER" works.
I think either proper comment should be added or lack in the code must be found.
Anyway the variable's name must reflect the SQL argument's name.

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-23 Thread Vitaly Burovoy

On 2016-03-23, Oleg Bartunov <obartu...@gmail.com> wrote:
> On Wed, Mar 23, 2016 at 6:37 AM, Vitaly Burovoy <vitaly.buro...@gmail.com>
> wrote:
>
>> Hello, Hackers!
>>
>> While I was reviewed a patch with "json_insert" function I found a bug
>> which wasn't connected with the patch and reproduced at master.
>>
>> It claims about non-integer whereas input values are obvious integers
>> and in an allowed range.
>> More testing lead to understanding it appears when numbers length are
>> multiplier of 4:
>>
>> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }',
>> '"4"');
>> ERROR:  path element at the position 2 is not an integer
>>
>
> Hmm, I see in master
>
> select version();
>  version
> -
>  PostgreSQL 9.6devel on x86_64-apple-darwin15.4.0, compiled by Apple LLVM
> version 7.3.0 (clang-703.0.29), 64-bit
> (1 row)
>
> select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
>  jsonb_set
> 
>  {"a": [[], 1, 2, 3, "4"], "b": []}
> (1 row)

Yes, I can't reproduce it with "CFLAGS=-O2", but it is still
reproduced with "CFLAGS='-O0 -g3'".

postgres=# select version();
 version
--
 PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc (Gentoo
4.8.4 p1.4, pie-0.6.1) 4.8.4, 64-bit
(1 row)

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
ERROR:  path element at the position 2 is not an integer


It depends on memory after the string. In debug mode it always (most
of the time?) has a garbage (in my case the char '~' following by
'\x7f' multiple times) there.

I think it is just a question of complexity of reproducing in release,
not a question whether there is a bug or not.

All the other occurrences of strtol in the file have
TextDatumGetCString before, except the occurrence in the setPathArray
function. It seems its type is TEXT (which is not null-terminated),
not cstring.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-22 Thread Vitaly Burovoy

Hello, Hackers!

While I was reviewed a patch with "json_insert" function I found a bug
which wasn't connected with the patch and reproduced at master.

It claims about non-integer whereas input values are obvious integers
and in an allowed range.
More testing lead to understanding it appears when numbers length are
multiplier of 4:

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
ERROR:  path element at the position 2 is not an integer

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"b", 1000}', '"4"');
ERROR:  path element at the position 2 is not an integer

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", -999}', '"4"');
ERROR:  path element at the position 2 is not an integer

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a",
1000}', '"4"');
ERROR:  path element at the position 2 is not an integer

Close values are ok:
postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 999}', '"4"');
jsonb_set
-
 {"a": [["4"], 1, 2, 3]}
(1 row)

postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 1}', '"4"');
jsonb_set
-
 {"a": [["4"], 1, 2, 3]}
(1 row)


Research lead to setPathArray where a string which is got via
VARDATA_ANY but is passed to strtol which expects cstring.

In case of string number length is not a multiplier of 4 rest bytes
are padding by '\0', when length is a multiplier of 4 there is no
padding, just garbage after the last digit of the value.

Proposed patch in an attachment fixes it.

There is a magic number "20" as a length of an array for copying key
from a path before passing it to strtol. It is a maximal length of a
value which can be parsed by the function. I could not find a proper
constant for it. Also I found similar direct value in the code (e.g.
in numeric.c).

I've added a comment, I hope it is enough for it.


-- 
Best regards,
Vitaly Burovoy


fix_jsonb_set_path.0001.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position

2016-03-22 Thread Vitaly Burovoy

On 2016-03-16, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> Hi Vitaly, thanks for the review. I've attached a new version of path with
> improvements. Few notes:
>
>> 7. Why did you remove "skip"? It is a comment what "true" means...
>
> Actually, I thought that this comment was about skipping an element from
> jsonb in order to change/delete it,

As I got it, it is "skipNested", skipping iterating over nested
containers, not skipping an element.

> not about the last argument.  E.g. you can find several occurrences of
> `JsonbIteratorNext` with `true` as the last
> argument but without a "last argument is about skip" comment.

I think it is not a reason for deleting this comment. ;-)

> And there is a piece of code in the function `jsonb_delete` with a "skip
> element" commentary:
>
> ```
> /* skip corresponding value as well */
> if (r == WJB_KEY)
> JsonbIteratorNext(, , true);
> ```

The comment in your example is for the extra "JsonbIteratorNext" call
(the first one is in a "while" statement outside your example, but
over it in the code), not for the "skip" parameter in the
"JsonbIteratorNext" call here.
===

Notes for the jsonb_insert_v3.patch applied on the f6bd0da:

1a. Please, rebase your patch to the current master (it is easy to
resolve conflict, but still necessary).

1b. Now OID=3318 is "pg_stat_get_progress_info" (Hint: 3324 is free now).

2.
The documentation, func.sgml:
> Iftarget
Here must be a space after the "If" word.

3.
There is a little odd formulating me in the documentation part
(without formatting for better readability):
> If target section designated by path is a JSONB array, new_value will be 
> inserted before it, or after if after is true (defailt is false).

a. "new_value" is not inserted before "it" (a JSONB array), it is
inserted before target;
b. s/or after if/or after target if/;
c. s/defailt/default/

> If ... is a JSONB object, new_value will be added just like a regular key.

d. "new_value" is not a regular key: key is the final part of the "target";
e. "new_value" replaces target if it exists;
f. "new_value" is added if target is not exists as if jsonb_set is
called with create_missing=true.
g. "will" is about possibility. "be" is about an action.

4.
function setPathObject, block with "op_type = JB_PATH_CREATE"
(jsonfuncs.c, lines 3833..3840).
It seems it is not necessary now since you can easily check for all
three options like:
op_type & (JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER | JB_PATH_CREATE)

or even create a new constant (there can be better name for it):
#define JB_PATH_CREATE_OR_INSERT (JB_PATH_INSERT_BEFORE |
JB_PATH_INSERT_AFTER | JB_PATH_CREATE)

and use it like:
op_type & JB_PATH_CREATE_OR_INSERT

all occurrences of JB_PATH_CREATE in the function are already with the
"(level == path_len - 1)" condition, there is no additional check
needed.

also the new constant JB_PATH_CREATE_OR_INSERT can be used at lines 3969, 4025.

5.
> if (op_type != JB_PATH_DELETE)
It seems strange direct comparison among bitwise operators (lines 3987..3994)

You can leave it as is, but I'd change it (and similar one at the line
3868) to a bitwise operator:
if (!op_type & JB_PATH_DELETE)

6.
I'd like to see a test with big negative index as a reverse for big
positive index:
select jsonb_insert('{"a": [0,1,2]}', '{a, -10}', '"new_value"');

I know now it works as expected, it is just as a defense against
further changes that can be destructive for this special case.

7.
Please, return the "skip" comment.

8.
The documentation: add "jsonb_insert" to the note about importance of
existing intermediate keys. Try to reword it since the function
doesn't have a "create_missing" parameter support.
> All the items of the path parameter of jsonb_set must be present in the 
> target,
> ... in which case all but the last item must be present.

Currently I can't break the code, so I think it is close to the final state. ;-)

--
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-03-19 Thread Vitaly Burovoy

On 2016-03-16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> My feeling is we ought to preserve the old behavior here, which would
> involve making JULIAN_MAXYEAR_FOR_TIMESTAMPS format-dependent and
> adjusting the float values for the two derived constants; not much of a
> problem code-wise.  I think though that it would break quite a number of
> the proposed new regression tests for the float case.

I think I can be solved by adding new tests for new upper bound for
float values and creating a new version of expected file like
date_1.out, horology_1.out, timestamp_1.out and timestamptz.out (the
original files are for integer values; with prefix "_1" are for float
ones).

> TBH, I thought
> the number of added test cases was rather excessive anyway, so I wouldn't
> have a problem with just leaving out whichever ones don't pass with both
> build options.

The tests checks edge cases. In case of saving bigger interval of
allowed values for float values you'll remove checks when they should
fail. What's the left cases for?

On 2016-03-16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Actually, it seems a lot of the provided test cases fail for float
> timestamps anyway; there's an assumption that
>   294276-12-31 23:59:59.99
>   294277-01-01 00:00:00.00
> are distinct timestamps, which they are not in float mode.

I'm ready to change fractional part '.99' to '.5' (to avoid issues
of different implementation of floating point on different
architectures) to emphasize "almost reached the threshold".

On 2016-03-16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> AFAICS the only way that we can avoid a dump/reload hazard is to tighten
> up the allowed range of timestamps by at least one day, so that any
> timestamp that passes IS_VALID_TIMESTAMP() is guaranteed to print, in
> any timezone, with a contained date that the Julian-day routines can
> handle.  I'd be inclined to set the lower limit of timestamps as
> '4713-01-01 00:00 GMT BC' just to keep things simple.  (The upper limit
> can stay where it is.)
>
> While dates don't have this timezone rotation problem, the documentation
> says that they have the same lower-bound limit as timestamps, and there
> are places in the code that assume that too.  Is it okay to move their
> lower bound as well?

I think it is important (see initial letter) since it is out of
supported interval (according to the documentation) and don't work as
expected in some cases (see "to_char(date_trunc('week', '4714-12-28
BC'::date),'day')"). It seems it leads to change
IS_VALID_JULIAN[_FOR_STAMPS] as well to something like:

#define IS_VALID_JULIAN(y,m,d) \
((JULIAN_MINYEAR < (y) && (y) < JULIAN_MAXYEAR)
 ||(JULIAN_MINYEAR == (y) && (m) == 12 && (d) == 31)
 ||(JULIAN_MAXYEAR == (y) && (m) == 01 && (d) == 01))

It can be correct if checks for IS_VALID_DATE is added in date.c to
places marked as "IS_VALID_JULIAN is enough for checking..."
All other places are (I'm sure) covered by IS_VALID_DATE/IS_VALID_TIMESTAMP.

What about the question:
On 2016-02-24, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> Also I'm asking for a help because the query (in default TZ='GMT+1'):
> postgres=# SELECT '4714-11-24 00:00:00.00+00 BC'::timestamptz;
>
> in psql gives a result "4714-11-23 23:00:00-01 BC",
> but in a testing system gives "Sun Nov 23 23:00:00 4714 GMT BC"
> without TZ offset.

Why there is "GMT" instead of "GMT+01"? Is it bug? If so should it be
fixed in this patch or can be fixed later?

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-03-19 Thread Vitaly Burovoy

On 2016-03-15, Mark Dilger <hornschnor...@gmail.com> wrote:
>
>> On Mar 14, 2016, at 5:12 PM, Vitaly Burovoy <vitaly.buro...@gmail.com>
>> wrote:
>>
>> On 3/14/16, Mark Dilger <hornschnor...@gmail.com> wrote:
>>> The first thing I notice about this patch is that
>>> src/include/datatype/timestamp.h
>>> has some #defines that are brittle.  The #defines have comments
>>> explaining
>>> their logic, but I'd rather embed that in the #define directly:
>>>
>>> This:
>>>
>>> +#ifdef HAVE_INT64_TIMESTAMP
>>> +#define MIN_TIMESTAMP  INT64CONST(-2118134880)
>>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC */
>>> +#define MAX_TIMESTAMP  INT64CONST(92233713312)
>>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 *
>>> USECS_PER_SEC
>>> */
>>> +#else
>>> +#define MIN_TIMESTAMP  -211813488000.0
>>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 */
>>> +#define MAX_TIMESTAMP  9223371331200.0
>>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 */
>>> +#endif
>>>
>>> Could be written as:
>>>
>>> #ifdef HAVE_INT64_TIMESTAMP
>>> #define MIN_TIMESTAMP
>>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
>>> #define MAX_TIMESTAMP
>>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
>>> #else
>>> #define MIN_TIMESTAMP
>>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
>>> #define MAX_TIMESTAMP
>>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
>>> #endif
>>>
>>> I assume modern compilers would convert these to the same constants at
>>> compile-time,
>>
>> Firstly, Postgres is compiling not only by modern compilers.
>
> Do you have an example of a compiler that will not do this constant folding
> at compile time?

No, I'm not good at knowing features of all versions and all kings of
compilers, but I'm sure constants are better than expressions for big
values. =)

>>> rather than impose a runtime penalty.
>>
>> Secondary, It is hard to write it correctly obeying Postgres' coding
>> standard (e.g. 80-columns border) and making it clear: it is not so
>> visual difference between USECS_PER_DAY and SECS_PER_DAY in the
>> expressions above (but it is vivid in comments in the file).
>
> Hmm.  I think using USECS_PER_DAY is perfectly clear, but that is a
> personal
> opinion.  I don't see any way to argue if you don't see it that way.

I'm talking about perception of the constants when they a very similar
but they are not justified by a single column (where difference
_in_expressions_ are clear). There was a difference by a single char
("U") only which is not so obvious without deep looking on it (be
honest I'd missed it until started to write an answer).

>> or "Why is INT64CONST set for
>> MIN_TIMESTAMP, but not for MAX_TIMESTAMP?" (JULIAN_MAX4STAMPS is _not_
>> int64).
>
> I was only casting the zero to int64.  That doesn't seem necessary, so it
> can
> be removed.  Both MIN_TIMESTAMP and MAX_TIMESTAMP were defined
> in terms of USECS_PER_DAY, which itself is defined in terms of INT64CONST,
> so I believe they both work out to be an int64 constant.

I hope so. But in such cases I'd prefer to _begin_ calculations from
int64, not to _finish_ by it.
It is impossible to pass constants (like JULIAN_MAX4STAMPS) to
INT64CONST macros. Inserting "(int64)" makes rows larger by 7 chars...
;-)

>>> The #defines would be less brittle in
>>> the event, for example, that the postgres epoch were ever changed.
>>
>> I don't think it is real, and even in such case all constants are
>> collected together in the file and will be found and changed at once.
>
> I agree that they would be found at once.  I disagree that the example
> is not real, as I have changed the postgres epoch myself in some builds,
> to be able to use int32 timestamps on small devices.

Wow! I'm sorry, I didn't know about it.
But in such case (tighten to int32) there are more than two places
which should be changed. Two more (four with disabled
HAVE_INT64_TIMESTAMP) constants are not big deal with it.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-03-19 Thread Vitaly Burovoy

On 3/16/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> So I fixed that up and committed it, with a very short set of new
> regression test cases.  I seriously doubt that the other ones add
> enough value to be worth trying to make them work in both float- and
> int-timestamp cases; though if you want to submit a new patch to
> add more test cases we could consider it.  My feeling about it is that
> that kind of testing does nothing for errors of omission (ie functions
> that fail to range-check their results), which is the most likely sort
> of bug here, and so I doubt it's worth adding them to regression tests
> that many people will run many times a day for a long time to come.
>
>   regards, tom lane

Thank you very much! If you decide such kind of tests is not
necessary, it is enough for me.

Is there any reason to leave JULIAN_MINDAY and JULIAN_MAXDAY which are
not used now?
Also why JULIAN_MAXMONTH is set to "6" whereas
{DATE|TIMESTAMP}_END_JULIAN use "1" as month?

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)

2016-03-19 Thread Vitaly Burovoy

On 2016-03-15, David Steele <da...@pgmasters.net> wrote:
> On 3/4/16 2:56 PM, Vitaly Burovoy wrote:
>> On 3/4/16, Anastasia Lubennikova <a.lubennik...@postgrespro.ru> wrote:
>>
>>> I think that you should update documentation. At least description of
>>> epoch on this page:
>>> http://www.postgresql.org/docs/devel/static/functions-datetime.html
>>
>> Thank you very much for pointing where it is located (I saw only
>> "to_timestamp(TEXT, TEXT)").
>> I'll think how to update it.
>
> Vitaly, have you decided how to update this yet?

Yes, there are three versions:
* remove mentioning how to get timestamptz from UNIX stamp;
* leave a note how to get timestamptz and add a note that such
encapsulation existed prior to 9.6;
* replace to the proposed current behavior (without interval).

I decided to implement the third case (but a result there has a time
zone which can be different, so the result can be not exactly same for
a concrete user). If a committer decides to do somehow else, he is
free to choose one from the list above or to do something else.

>>> 3. (nitpicking) I don't sure about "4STAMPS" suffix. "4" is nice
>>> abbreviation, but it seems slightly confusing to me.
>>
>> It doesn't matter for me what it is called, it is short enough and
>> reflects a type on which it is applied.
>> What would the best name be for it?
>
> Anastasia, any suggestions for a better name, or just leave it as is?
>
> I'm not in favor of the "4", either.  I think I would prefer
> JULIAN_MAXYEAR_STAMP.

It turns out that Tom has changed almost one third of timestamp.h and
now the constant has a name "TIMESTAMP_END_JULIAN".

I've rebased the patch to the current master (5db5146) and changed it
according to the new timestamp.h.

Now it passes both versions (integer and float timestamps).
I deleted test for the upper boundary for integer timestamps, changed
a little comments.

I decided to delete hints about minimum and maximum allowed values
because no one type has such hint.

Please find attached a new version of the patch.

-- 
Best regards,
Vitaly Burovoy

to_timestamp_infs.v002.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-03-14 Thread Vitaly Burovoy

On 3/14/16, Anastasia Lubennikova <a.lubennik...@postgrespro.ru> wrote:
> 14.03.2016 16:23, David Steele:
>> On 2/25/16 4:44 PM, Vitaly Burovoy wrote:
>>
>>> Added to the commitfest 2016-03.
>>>
>>> [CF] https://commitfest.postgresql.org/9/540/
>>
>> This looks like a fairly straight-forward bug fix (the size of the
>> patch is deceptive because there a lot of new tests included).  It
>> applies cleanly.
>>
>> Anastasia, I see you have signed up to review.  Do you have an idea
>> when you will get the chance to do that?
>>
>> Thanks,
>
> I've read the patch thoroughly and haven't found any problems. I think
> that the patch is in a very good shape.
> It fixes a bug and has an excellent set of tests.
>
> There is an issue, mentioned in the thread above:
>
>>postgres=# select
>>postgres-#  to_char(date_trunc('week', '4713-01-01 BC'::date),'day')
>>postgres-# ,to_char(date_trunc('week', '4714-12-29 BC'::date),'day')
>>postgres-# ,to_char(date_trunc('week', '4714-12-28 BC'::date),'day');
>>  to_char  |  to_char  |  to_char
>>---+---+---
>> monday| monday| thursday
>>(1 row)
>
>>since 4714-12-28 BC and to the past detection when a week is starting
>>is broken (because it is boundary of isoyears -4713 and -4712).
>>Is it worth to break undocumented range or leave it as is?
>
> But I suppose that behavior of undocumented dates is not essential.

I'm sorry... What should I do with "Waiting on Author" state if you
don't have complaints?

> --
> Anastasia Lubennikova
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-03-14 Thread Vitaly Burovoy

On 3/14/16, Mark Dilger <hornschnor...@gmail.com> wrote:
> The first thing I notice about this patch is that
> src/include/datatype/timestamp.h
> has some #defines that are brittle.  The #defines have comments explaining
> their logic, but I'd rather embed that in the #define directly:
>
> This:
>
> +#ifdef HAVE_INT64_TIMESTAMP
> +#define MIN_TIMESTAMP  INT64CONST(-2118134880)
> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC */
> +#define MAX_TIMESTAMP  INT64CONST(92233713312)
> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC
> */
> +#else
> +#define MIN_TIMESTAMP  -211813488000.0
> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 */
> +#define MAX_TIMESTAMP  9223371331200.0
> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 */
> +#endif
>
> Could be written as:
>
> #ifdef HAVE_INT64_TIMESTAMP
> #define MIN_TIMESTAMP
> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
> #define MAX_TIMESTAMP
> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
> #else
> #define MIN_TIMESTAMP
> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
> #define MAX_TIMESTAMP
> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY)
> #endif
>
> I assume modern compilers would convert these to the same constants at
> compile-time,

Firstly, Postgres is compiling not only by modern compilers.

> rather than impose a runtime penalty.

Secondary, It is hard to write it correctly obeying Postgres' coding
standard (e.g. 80-columns border) and making it clear: it is not so
visual difference between USECS_PER_DAY and SECS_PER_DAY in the
expressions above (but it is vivid in comments in the file). Also a
questions raise "Why is INT64CONST(0) necessary in `#else' block"
(whereas `float' is necessary there) or "Why is INT64CONST set for
MIN_TIMESTAMP, but not for MAX_TIMESTAMP?" (JULIAN_MAX4STAMPS is _not_
int64).

The file is full of constants. For example, JULIAN_MAX and
SECS_PER_DAY are one of them.

I would leave it as is.

> The #defines would be less brittle in
> the event, for example, that the postgres epoch were ever changed.

I don't think it is real, and even in such case all constants are
collected together in the file and will be found and changed at once.

> Mark Dilger

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp

2016-03-11 Thread Vitaly Burovoy

On 3/11/16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> [ catches up with thread... ]
>
> Yes.  It would be more reasonable IMO for to_date to throw an error
> because this is bad input.  On the other hand, to_date mostly doesn't
> throw an error no matter how bad the input is.  I think that may have
> been intentional, although its habit of producing garbage output
> instead (and not, say, NULL) is certainly not very happy-making.
>
> It's a bit schizophrenic for this patch to be both adding ereport's
> for year zero (thereby breaking the no-failure-on-bad-input policy)
> *and* trying to produce sane output for arguably-insane input.
>
> I don't really see an argument why '0001-00-00' should be accepted
> but '-01-01' should throw an error, but that would be the result
> if we take this patch.

Well. In case of zero year it could return the first year instead of
an exception by the same way as "MM" and "DD" do it...

> And I quite agree with Robert that it's insane
> to consider '-2-06-01' as satisfying the format '-MM-DD'.  The
> fact that it even appears to do something related to a BC year is
> an implementation artifact, and not a very nice one.
>
> I would be in favor of a ground-up rewrite of to_date and friends, with
> some better-stated principles (in particular, a rationale why they even
> exist when date_in and friends usually do it better)

I think they exist because date_in can't convert something like
"IYYY-IDDD" (I wonder if date_in can do so) or to parse dates/stamps
independent from the "DateStyle" parameter.

> and crisper error
> detection.  But I'm not seeing the argument that hacking at the margins
> like this moves us forward on either point; what it does do is create
> another backward-compatibility hazard for any such rewrite.
>
> In short, I vote with Robert to reject this patch.

Accepted. Let's agree it is a case "garbage in, garbage out" and "an
implementation artifact".

> BTW, the context for the original report wasn't clear,

The context was to make "extract" and "to_date"/"to_timestamp" be
consistently reversible for "year"/"" for negative values (since
both of them support ones).

> but I wonder how
> much of the actual problem could be addressed by teaching make_date()
> and friends to accept negative year values as meaning BC.
>
>   regards, tom lane

Thank Thomas, Robert and Tom very much for an interesting (but short)
discussion.
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp

2016-03-11 Thread Vitaly Burovoy

On 3/11/16, Robert Haas <robertmh...@gmail.com> wrote:
> On Sun, Feb 28, 2016 at 9:38 PM, Vitaly Burovoy
> <vitaly.buro...@gmail.com> wrote:
>>> However, I'm not sure we ought to tinker with the behavior in this
>>> area.  If -MM-DD is going to accept things that are not of the
>>> format -MM-DD, and I'd argue that -1-06-01 is not in that format,
>>
>> It is not about format, it is about values.
>
> I disagree.  In a format like "-1-06-01", you want the first minus to
> indicate negation and the other two to be a separator.  That's not
> very far away from wanting the database to read your mind.

It is not my wish. The database does it just now:
postgres=# SELECT to_date('-1-06-01', '');
to_date
---
 0002-01-01 BC
(1 row)

>> Because it is inconvenient a little. If one value ("-2345") is passed,
>> another one ("2346 BC") is got. In the other case a programmer must
>> check for negative value, and if so change a sign and add "BC" to the
>> format. Moreover the programmer must keep in mind that it is not
>> enough to have usual date format "DD/MM/", because sometimes there
>> can be "BC" part.
>
> Yeah, well, that's life.  You can write an alternative function to
> construct dates that works the way you like, and that may well be a
> good idea.  But I think *this* change is not a good idea, and
> accordingly I vote we reject this patch.

My wish is to make the behavior be consistent.
Since there are two reverse functions ("extract" and "to_date"
["to_timestamp" in fact is the same]), I expect that is described as
"year" ("year"-"") means the same thing in both of them, the same
with pairs "isoyear"-"IYYY", "dow"-"DDD", "isodow"-"IDDD", etc.

Now "year" is _not_ the same as "" (but it cat be so according to
the documentation: there is no mentioning of any ISO standard),
whereas "isoyear" _is_ the same:
postgres=# SELECT y, to_date(y, ''),to_date(y, 'IYYY')IYYY
postgres-# FROM(VALUES('-1-06-01'))t(y);
y |   | iyyy
--+---+---
 -1-06-01 | 0002-01-01 BC | 0002-01-01 BC
(1 row)

and
postgres=# SELECT y, date_part('year', y)YYYY,date_part('isoyear', y)IYYY
postgres-# FROM(VALUES('0002-06-01 BC'::date))t(y);
   y   |  | iyyy
---+--+--
 0002-06-01 BC |   -2 |   -1
(1 row)

P.S.: proposed patch changes IYYY as well, but it is easy to fix it
and I'm ready to do it after finding a consensus.
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position

2016-03-07 Thread Vitaly Burovoy

On 2016-02-29 17:19+00, Dmitry Dolgov <9erthali...@gmail.com> wrote:
On 2016-02-24 19:37+00, Petr Jelinek <p...@2ndquadrant.com> wrote:
>> Also this patch needs documentation.
> I've added new version in attachments, thanks.

Hello! The first pass of a review is below.

1. Rename the "flag" variable to something more meaningful. (e.g.
"op_type" - "operation_type")

2. There is two extra spaces after the "_DELETE" word
+#define JB_PATH_DELETE 0x0002

3.
res = setPath(, path_elems, path_nulls, path_len, ,
- 0, newval, create);
+ 0, newval, create ? JB_PATH_CREATE : 0x0);

What is the magic constant "0x0"? If not "create", then what?
Introduce something like JB_PATH_NOOP = 0x0...

4. In the "setPathArray" the block:
if (newval != NULL)
"newval == NULL" is considered as "to be deleted" from the previous
convention and from the comments for the "setPath" function.
I think since you've introduced special constants one of which is
JB_PATH_DELETE (which is not used now) you should replace convention
from checking for a value to checking for a constant:
if (flag != JB_PATH_DELETE)
or even better:
if (!flag & JB_PATH_DELETE)

5. Change checking for equality (to bit constants) to bit operations:
(flag == JB_PATH_INSERT_BEFORE || flag == JB_PATH_INSERT_AFTER)
which can be replaced to:
(flag & (JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER))

also here:
(flag == JB_PATH_CREATE || flag == JB_PATH_INSERT_BEFORE || flag
== JB_PATH_INSERT_AFTER))
can be:
(flag & (JB_PATH_CREATE | JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER))

6. Pay attention to parenthesises to make the code looks like similar
one around:
+   if ((npairs == 0) && flag == JB_PATH_CREATE && (level == path_len - 1))
should be:
+   if ((npairs == 0) && (flag == JB_PATH_CREATE) && (level == path_len - 
1))

7. Why did you remove "skip"? It is a comment what "true" means...
-   r = JsonbIteratorNext(it, , true);/* skip 
*/
+   r = JsonbIteratorNext(it, , true);

8. After your changes some statements exceed 80-column threshold...
The same rules for the documentation.

9. And finally... it does not work as expected in case of:
postgres=# select jsonb_insert('{"a":[0,1,2,3]}', '{"a", 10}', '"4"');
jsonb_insert
-
 {"a": [0, 1, 2, 3]}
(1 row)

wheras jsonb_set works:
postgres=# select jsonb_set('{"a":[0,1,2,3]}', '{"a", 10}', '"4"');
jsonb_set
--
 {"a": [0, 1, 2, 3, "4"]}
(1 row)

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][PATCH] Supporting +-Infinity values by to_timestamp(float8)

2016-03-04 Thread Vitaly Burovoy

On 3/4/16, Anastasia Lubennikova <a.lubennik...@postgrespro.ru> wrote:
> 27.02.2016 09:57, Vitaly Burovoy:
>> Hello, Hackers!
>>
>> I worked on a patch[1] allows "EXTRACT(epoch FROM
>> +-Inf::timestamp[tz])" to return "+-Inf::float8".
>> There is an opposite function "to_timestamp(float8)" which now defined
>> as:
>> SELECT ('epoch'::timestamptz + $1 * '1 second'::interval)
>
> Hi,
> thank you for the patches.

Thank you for the review.

> Could you explain, whether they depend on each other?

Only logically. They reverse each other:
postgres=# SELECT v, to_timestamp(v), extract(epoch FROM to_timestamp(v)) FROM
postgres-#   unnest(ARRAY['+inf', '-inf', 0, 65536, 982384720.12]::float8[]) v;
  v   |   to_timestamp|  date_part
--+---+--
 Infinity | infinity  | Infinity
-Infinity | -infinity |-Infinity
0 | 1970-01-01 00:00:00+00|0
65536 | 1970-01-01 18:12:16+00|65536
 982384720.12 | 2001-02-17 04:38:40.12+00 | 982384720.12
(5 rows)

>> Since intervals do not support infinity values, it is impossible to do
>> something like:
>>
>> SELECT to_timestamp('infinity'::float8);
>>
>> ... which is not good.
>>
>> Supporting of such converting is in the TODO list[2] (by "converting
>> between infinity timestamp and float8").
>
> You mention intervals here, and TODO item definitely says about
> 'infinity' interval,

Yes, it is in the same block. But I wanted to point to the link
"converting between infinity timestamp and float8". There are two-way
conversion examples.

> while patch and all the following discussion concerns to timestamps.
> Is it a typo or I misunderstood something important?

It is just a reason why I rewrote it as an internal function.
I asked whether to just rewrite the function
"pg_catalog.to_timestamp(float8)" as an internal one or to add support
of infinite intervals. Tom Lane answered[5] "you should stay away from
infinite intervals".
So I left intervals as is.

> I assumed that following query will work, but it isn't. Could you
> clarify that?
> select to_timestamp('infinity'::interval);

It is not hard. There is no logical way to convert interval (e.g.
"5minutes") to a timestamp (or date).
There never was a function "to_timestamp(interval)" and never will be.
postgres=# select to_timestamp('5min'::interval);
ERROR:  function to_timestamp(interval) does not exist
LINE 1: select to_timestamp('1min'::interval);
   ^
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.

>> Proposed patch implements it.
>>
>> There is an other patch in the CF[3] 2016-03 implements checking of
>> timestamp[tz] for being in allowed range. Since it is wise to set
>> (fix) the upper boundary of timestamp[tz]s, I've included the file
>> "src/include/datatype/timestamp.h" from there to check that an input
>> value and a result are in the allowed range.
>>
>> There is no changes in a documentation because allowed range is the
>> same as officially supported[4] (i.e. until 294277 AD).
>
> I think that you should update documentation. At least description of
> epoch on this page:
> http://www.postgresql.org/docs/devel/static/functions-datetime.html

Thank you very much for pointing where it is located (I saw only
"to_timestamp(TEXT, TEXT)").
I'll think how to update it.

> More thoughts about the patch:
>
> 1. When I copy value from hints for min and max values (see examples
> below), it works fine for min, while max still leads to error.
> It comes from the check   "if (seconds >= epoch_ubound)". I wonder,
> whether you should change hint message?
>
> select to_timestamp(-210866803200.00);
>to_timestamp
> -
>   4714-11-24 02:30:17+02:30:17 BC
> (1 row)
>
>
> select to_timestamp(9224318016000.00);
> ERROR:  UNIX epoch out of range: "9224318016000.00"
> HINT:  Maximal UNIX epoch value is "9224318016000.00"

I agree, it is a little confusing. Do you (or anyone) know how to
construct a condensed phrase that it is an exclusive upper bound of an
allowed UNIX epoch range?

> 2. There is a comment about JULIAN_MAXYEAR inaccuracy in timestamp.h:
>
>   * IS_VALID_JULIAN checks the minimum date exactly, but is a bit sloppy
>   * about the maximum, since it's far enough out to not be especially
>   * interesting.

It is just about the accuracy to the day for a lower bound, and to the
year (not to a day) for an upper bound.

> M

Re: [HACKERS][PATCH] Supporting +-Infinity values by to_timestamp(float8)

2016-03-03 Thread Vitaly Burovoy

On 2/26/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> Proposed patch implements it.

I'm sorry, I forgot to leave a note for reviewers and committers:

This patch requires CATALOG_VERSION_NO be bumped.

Since pg_proc.h entry has changed, it is important to check and run
regress tests on a new cluster (as if CATALOG_VERSION_NO was bumped).

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb array-style subscription

2016-03-02 Thread Vitaly Burovoy

On 1/19/16, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Dmitry Dolgov wrote:
>
>> I've cleaned up the code, created a separate JsonbRef node (and there are
>> a
>> lot of small changes because of that), abandoned an idea of "deep
>> nesting"
>> of assignments (because it doesn't relate to jsonb subscription, is more
>> about the
>> "jsonb_set" function, and anyway it's not a good idea). It looks fine for
>> me, and I need a little guidance - is it ok to propose this feature for
>> commitfest 2016-03 for a review?
>
> Has this patch been proposed in some commitfest previously?  One of the
> less-commonly-invoked rules of commitfests is that you can't add patches
> that are too invasive to the last one -- so your last chance for 9.6 was
> 2016-01.  This is harsh to patch submitters, but it helps keep the size
> of the last commitfest down to a reasonable level; otherwise we are
> never able to finish it.

I'd like to be a reviewer for the patch. It does not look big and very invasive.

Is it a final decision or it has a chance? If something there hurts
committers, it can end up as "Rejected with feedback" (since the patch
is already in the CF[1])?

[1]https://commitfest.postgresql.org/9/485/
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp

2016-02-28 Thread Vitaly Burovoy

On 2/27/16, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Feb 23, 2016 at 6:23 AM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
>> This seems to be a messy topic.  The usage of "AD" and "BC" imply that
>> TO_DATE is using the anno domini system which doesn't have a year 0,
>> but in the DATE type perhaps we are using the ISO 8601 model[2] where
>> 1 BC is represented as , leading to the difference of one in all
>> years before 1 AD?
>
> Well, the way to figure that out, I think, is to look at the
> documentation.  I had a look at...
>
> http://www.postgresql.org/docs/9.5/static/functions-formatting.html
>
> ...which says...
>
>  year (4 or more digits)
> IYYY ISO 8601 week-numbering year (4 or more digits)
>
> I don't really understand ISO 8601, but if IYYY is giving us an ISO
> 8601 thing, then presumably  is not supposed to be giving us that.
>   The same page elsewhere refers to Gregorian dates, and other parts
> of the documentation seem to agree that's what we use.
>
> But having said that, this is kind of a weird situation.  We're
> talking about this:
>
> rhaas=# SELECT y || '-06-01', to_date (y || '-06-01', '-MM-DD')
> FROM (VALUES (2), (1), (0), (-1), (-2)) t(y);
>  ?column? |to_date
> --+---
>  2-06-01  | 0002-06-01
>  1-06-01  | 0001-06-01
>  0-06-01  | 0001-06-01 BC
>  -1-06-01 | 0002-06-01 BC
>  -2-06-01 | 0003-06-01 BC
> (5 rows)
>
> Now, I would be tempted to argue that passing to_date('-1-06-01',
> '-MM-DD') ought to do the same thing as to_date('pickle',
> '-MM-DD') i.e. throw an error.  There's all kinds of what seems to
> me to be shoddy error checking in this area:
>
> rhaas=# select to_date('-3', ':MM');
> to_date
> ---
>  0004-01-01 BC
> (1 row)
>
> It's pretty hard for me to swallow the argument that the input matches
> the provided format.
>
> However, I'm not sure we ought to tinker with the behavior in this
> area.  If -MM-DD is going to accept things that are not of the
> format -MM-DD, and I'd argue that -1-06-01 is not in that format,

It is not about format, it is about values.

> then I think it should probably keep doing the same things it's always
> done.  If you want to supply a BC date, why not do

Because it is inconvenient a little. If one value ("-2345") is passed,
another one ("2346 BC") is got. In the other case a programmer must
check for negative value, and if so change a sign and add "BC" to the
format. Moreover the programmer must keep in mind that it is not
enough to have usual date format "DD/MM/", because sometimes there
can be "BC" part.

> this:
>
> rhaas=# select to_date('0001-06-01 BC', '-MM-DD BC');
> to_date
> ---
>  0001-06-01 BC
> (1 row)

Also because of:

postgres=# SELECT EXTRACT(year FROM to_date('-3', ''));
 date_part
---
-4
(1 row)

Note that the documentation[1] says "Keep in mind there is no 0 AD".
More examples by the link[2].

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

[1]http://www.postgresql.org/docs/devel/static/functions-datetime.html
[2]http://www.postgresql.org/message-id/cakoswnn6kpfabmrshzyn8+2nhpyfvbdmphya5pqguny8lp2...@mail.gmail.com
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp

2016-02-26 Thread Vitaly Burovoy

On 2/26/16, Shulgin, Oleksandr <oleksandr.shul...@zalando.de> wrote:
> On Fri, Feb 26, 2016 at 3:24 PM, Ivan Kartyshov
> <i.kartys...@postgrespro.ru>
> wrote:
>
>> The following review has been posted through the commitfest application:
>
>> make installcheck-world:  tested, failed
>> Implements feature:   tested, failed
>> Spec compliant:   tested, failed
>> Documentation:tested, failed
>>
>> Applied this patch, it works well, make what it expected correctly, code
>> style is maintained. Regression tests passed OK. No documentation or
>> comments.
>>
>
> Why does it say "tested, failed" for all points above there? ;-)

I hope Ivan misunderstood that "tested" and "passed" are two different
options, not a single "tested, passed" ;-)

Unfortunately, it seems there should be a discussion about meaning of
"": it seems authors made it as ISO8601-compliant (but there are
still several bugs), but there is no proofs in the documentation[1].

On the one hand there is only "year" mentioned for "", "YYY",
etc., and "ISO 8601 ... year" is "week-numbering", i.e. "IYYY", "IYY",
etc. only for using with "IDDD", "ID" and "IW".

On the other hand "extract" has two different options: "year" and
"isoyear" and only the second one is ISO8601-compliant (moreover it is
week-numbering at the same time):
postgres=# SELECT y src, EXTRACT(year FROM y) AS year, EXTRACT(isoyear
FROM y) AS isoyear
postgres-# FROM unnest(ARRAY[
postgres(# '4713-01-01 BC','4714-12-31 BC','4714-12-29 BC','4714-12-28
BC']::date[]) y;
  src  | year  | isoyear
---+---+-
 4713-01-01 BC | -4713 |   -4712
 4714-12-31 BC | -4714 |   -4712
 4714-12-29 BC | -4714 |   -4712
 4714-12-28 BC | -4714 |   -4713
(4 rows)

So there is lack of consistency: something should be changed: either
"extract(year..." (and the documentation[2], but there is "Keep in
mind there is no 0 AD, ..." for a long time, so it is a change which
breaks compatibility; and the patch will be completely different), or
the patch (it has an influence on "IYYY", so it is obviously wrong).

Now (after the letter[3] from Thomas Munro) I know the patch is not
ready for committing even with minimal changes. But I'm waiting for a
discussion: what part should be changed?

I would change behavior of "to_date" and "to_timestamp" to match with
extract options "year"/"isoyear", but I think getting decision of the
community before modifying the patch is a better idea.

[1]http://www.postgresql.org/docs/devel/static/functions-formatting.html
[2]http://www.postgresql.org/docs/devel/static/functions-datetime.html
[3]http://www.postgresql.org/message-id/CAEepm=0aznvy+frtyni04imdw4tlgzaelljqmhmcjbre+ln...@mail.gmail.com
--
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-02-25 Thread Vitaly Burovoy

> 

Added to the commitfest 2016-03.

[CF] https://commitfest.postgresql.org/9/540/

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-02-24 Thread Vitaly Burovoy

On 2/24/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 2/2/16, Jim Nasby <jim.na...@bluetreble.com> wrote:
>> On 2/2/16 6:39 PM, Tom Lane wrote:
>>> I'm inclined to think that a good solution would be to create an
>>> artificial restriction to not accept years beyond, say, 10 AD.
>>> That would leave us with a lot of daylight to not have to worry
>>> about corner-case overflows in timestamp arithmetic.  I'm not sure
>>> though where we'd need to enforce such a restriction; certainly in
>>> timestamp[tz]_in, but where else?
>>
>> Probably some of the casts (I'd think at least timestamp->timestamptz).
>> Maybe timestamp[tz]_recv. Most of the time*pl* functions. :/
>
> Please find attached a patch checks boundaries of date/timestamp[tz].
> There are more functions: converting to/from timestamptz, truncating,
> constructing from date and time etc.
>
> I left the upper boundary as described[1] in the documentation
> (294276-12-31 AD), lower - "as is" (4714-11-24 BC).
> It is easy to change the lower boundary to 4713-01-01BC (as described
> in the documentation) and it seems necessary because it allows to
> simplify IS_VALID_JULIAN and IS_VALID_JULIAN4STAMPS and avoid the next
> behavior:
>
> postgres=# select
> postgres-#  to_char(date_trunc('week', '4713-01-01 BC'::date),'day')
> postgres-# ,to_char(date_trunc('week', '4714-12-29 BC'::date),'day')
> postgres-# ,to_char(date_trunc('week', '4714-12-28 BC'::date),'day');
>   to_char  |  to_char  |  to_char
> ---+---+---
>  monday| monday| thursday
> (1 row)
>
> since 4714-12-28 BC and to the past detection when a week is starting
> is broken (because it is boundary of isoyears -4713 and -4712).
> Is it worth to break undocumented range or leave it as is?
>
> There is one more flaw: checking for a correctness begins from date
> and if default TZ is not UTC, dump/restore of values of type
> timestamptz which are close to allowed boundaries can be broken (and
> such result can't be restored because date is not in allowed range):
> postgres=# SET TIME ZONE 'GMT+1';
> SET
> postgres=# COPY (SELECT '4714-11-24 00:00:00.00+00
> BC'::timestamptz) TO STDOUT;
> 4714-11-23 23:00:00-01 BC
>
> Also I'm asking for a help because the query (in default TZ='GMT+1'):
> postgres=# SELECT '4714-11-24 00:00:00.00+00 BC'::timestamptz;
>
> in psql gives a result "4714-11-23 23:00:00-01 BC",
> but in a testing system gives "Sun Nov 23 23:00:00 4714 GMT BC"
> without TZ offset.
>
>
> I don't see what can be added to the documentation with the applied patch.
>
> More testings, finding bugs, uncovered functions, advice, comment
> improvements are very appreciated.
>
> [1]http://www.postgresql.org/docs/devel/static/datatype-datetime.html

I'm sorry, gmail hasn't added a header "In-Reply-To" to the last email.
Previous thread is by the link[2].

http://www.postgresql.org/message-id/flat/cakoswnked3joe++pes49gmdnfr2ieu9a2jfcez6ew-+1c5_...@mail.gmail.com

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 >

1 - 100 of 141 matches

Mail list logo