[HACKERS] Anonymous code block with parameters

2014-09-16 Thread Kalyanov Dmitry
I'd like to propose support for IN and OUT parameters in 'DO' blocks.

Currently, anonymous code blocks (DO statements) can not receive or
return parameters.

I suggest:

1) Add a new clause to DO statement for specifying names, types,
directions and values of parameters:

DO code [LANGUAGE lang] [USING (arguments)]

where arguments has the same syntax as in
'CREATE FUNCTION name (arguments)'.

Example:

do $$ begin z := x || y; end; $$
language plpgsql
using
(
  x text = '1',
  in out y int4 = 123,
  out z text
);

2) Values for IN and IN OUT parameters are specified using syntax for
default values of function arguments.

3) If DO statement has at least one of OUT or IN OUT parameters then it
returns one tuple containing values of OUT and IN OUT parameters.
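
For the example above, the returned tuple would look roughly like this
(illustrative only; with the defaults given, z becomes '1' || 123, i.e.
'1123'):

  y  |  z
-----+------
 123 | 1123
(1 row)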

Do you think that this feature would be useful? I have a
proof-of-concept patch in progress that I intend to publish soon.





Re: [HACKERS] Triconsistent catalog declaration

2014-09-16 Thread Heikki Linnakangas

On 09/15/2014 08:56 PM, Robert Haas wrote:

On Mon, Sep 15, 2014 at 10:13 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

That makes for a bit awkward input and output from psql, when the values
used are 0, 1, 2, rather than ascii characters. But that's OK, because as
you said these functions are not callable from psql anyway, as they have
internal arguments.


Maybe we should change them to something a bit more understandable.


We can't change the return datatype to anything wider, or the values 
from 0, 1, 2, because those values have been chosen so that they are 
compatible with booleans. A boolean can be safely cast to a 
GinTernaryValue. I'm not sure if we make use of that anywhere ATM, but 
it's a useful property.



This requires a catalog change to fix. Are we still planning to do a
catversion bump for 9.4 because of the jsonb changes?


That was my understanding, although we seem to be proceeding at an
inexplicably glacial pace.


Ok, committed.

- Heikki





Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Pavel Stehule
Hi

2014-09-16 8:38 GMT+02:00 Kalyanov Dmitry kalyanov.dmi...@gmail.com:

 I'd like to propose support for IN and OUT parameters in 'DO' blocks.

 Currently, anonymous code blocks (DO statements) can not receive or
 return parameters.

 I suggest:

 1) Add a new clause to DO statement for specifying names, types,
 directions and values of parameters:

 DO code [LANGUAGE lang] [USING (arguments)]

 where arguments has the same syntax as in
 'CREATE FUNCTION name (arguments)'.

 Example:

 do $$ begin z := x || y; end; $$
 language plpgsql
 using
 (
   x text = '1',
   in out y int4 = 123,
   out z text
 );

 2) Values for IN and IN OUT parameters are specified using syntax for
 default values of function arguments.

 3) If DO statement has at least one of OUT or IN OUT parameters then it
 returns one tuple containing values of OUT and IN OUT parameters.

 Do you think that this feature would be useful? I have a
 proof-of-concept patch in progress that I intend to publish soon.


Briefly:

+1 for parametrization of the DO statement

-1 for OUT parameters in DO - when you need output, use a function. The
rules for returning output from statements are already messy, and I am
strongly against making this area more complex. Instead we could define
temporary functions or real server-side session variables.

Pavel









Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Alexander Korotkov
On Mon, Sep 15, 2014 at 11:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Peter Geoghegan p...@heroku.com writes:
  On Mon, Sep 15, 2014 at 8:28 AM, Alexander Korotkov
  aekorot...@gmail.com wrote:
  Rename such opclasses and make them not default.
  Create new default opclasses with bitwise comparison functions.
  Write recommendation to re-create indexes with default opclasses into
  documentation.

  I certainly think this should be fixed if at all possible, but I'm not
  sure about this plan. Can we really rename an opclass without
  consequence, including having that respected across pg_upgrade?

 No.  And we don't know how to change the default opclass without
 breaking things, either.  See previous discussions about how we
 might fix the totally-broken default gist opclass that btree_gist
 creates for the inet type [1].  The motivation for getting rid of that
 is *way* stronger than "it might be slow", but there's no apparent
 way to make something else be the default without creating havoc.


I've read the thread about the gist opclass for the inet type. But that case
is more difficult because the conflict is between a builtin opclass and a
contrib opclass. This case seems much simpler: we need to change a builtin
opclass to another builtin opclass, and a contrib opclass to another contrib
opclass. I realize that it's problematic to rename a builtin opclass due to
pg_upgrade. However, it still seems possible to create a new opclass and make
it the default.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:

I'd like to propose support for IN and OUT parameters in 'DO' blocks.

Currently, anonymous code blocks (DO statements) can not receive or
return parameters.

I suggest:

1) Add a new clause to DO statement for specifying names, types,
directions and values of parameters:

DO code [LANGUAGE lang] [USING (arguments)]

where arguments has the same syntax as in
'CREATE FUNCTION name (arguments)'.

Example:

do $$ begin z := x || y; end; $$
language plpgsql
using
(
   x text = '1',
   in out y int4 = 123,
   out z text
);

2) Values for IN and IN OUT parameters are specified using syntax for
default values of function arguments.

3) If DO statement has at least one of OUT or IN OUT parameters then it
returns one tuple containing values of OUT and IN OUT parameters.

Do you think that this feature would be useful? I have a
proof-of-concept patch in progress that I intend to publish soon.


There are two features here. One is to allow arguments to be passed to 
DO statements. The other is to allow a DO statement to return a result. 
Let's discuss them separately.


1) Passing arguments to a DO block can be a useful feature, because it 
allows you to pass parameters to the DO block without injecting them 
into the string, which helps to avoid SQL injection attacks.
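
For illustration (example invented, not from the thread or the patch), this
is the status quo being referred to: because DO only accepts a code string,
a parameter has to be spliced into that string, and quoting it correctly,
for example with format() and %L, is entirely the caller's responsibility:

-- Made-up sketch of parameter splicing today; the value travels in-band
-- with the code, so a quoting mistake becomes code injection.
DO $$
DECLARE
    v_user text := 'O''Brien';   -- the value we would like to pass in
BEGIN
    EXECUTE format(
        'DO $x$ BEGIN RAISE NOTICE ''hello, %%'', %L; END $x$',
        v_user);                 -- %L quotes the value into the inner code
END;
$$ LANGUAGE plpgsql;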


I don't like the syntax you propose though. It doesn't actually let you 
pass the parameters out-of-band, so I don't really see the point. I 
think this needs to work with PREPARE/EXECUTE, and the protocol-level 
prepare/execute mechanism. Ie. something like this:


PREPARE mydoblock (text, int4) AS DO $$ ... $$
EXECUTE mydoblock ('foo', 123);

2) Returning values from a DO block would also be handy. But I don't see 
why it should be restricted to OUT parameters. I'd suggest allowing a 
RETURNS clause, like in CREATE FUNCTION:


DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

or

DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2 int4);

- Heikki





Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Pavel Stehule
2014-09-16 9:10 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

 On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:

 I'd like to propose support for IN and OUT parameters in 'DO' blocks.

 Currently, anonymous code blocks (DO statements) can not receive or
 return parameters.

 I suggest:

 1) Add a new clause to DO statement for specifying names, types,
 directions and values of parameters:

 DO code [LANGUAGE lang] [USING (arguments)]

 where arguments has the same syntax as in
 'CREATE FUNCTION name (arguments)'.

 Example:

 do $$ begin z := x || y; end; $$
 language plpgsql
 using
 (
x text = '1',
in out y int4 = 123,
out z text
 );

 2) Values for IN and IN OUT parameters are specified using syntax for
 default values of function arguments.

 3) If DO statement has at least one of OUT or IN OUT parameters then it
 returns one tuple containing values of OUT and IN OUT parameters.

 Do you think that this feature would be useful? I have a
 proof-of-concept patch in progress that I intend to publish soon.


 There are two features here. One is to allow arguments to be passed to DO
 statements. The other is to allow a DO statement to return a result. Let's
 discuss them separately.

 1) Passing arguments to a DO block can be useful feature, because it
 allows you to pass parameters to the DO block without injecting them into
 the string, which helps to avoid SQL injection attacks.

 I don't like the syntax you propose though. It doesn't actually let you
 pass the parameters out-of-band, so I don't really see the point. I think
 this needs to work with PREPARE/EXECUTE, and the protocol-level
 prepare/execute mechanism. Ie. something like this:

 PREPARE mydoblock (text, int4) AS DO $$ ... $$
 EXECUTE mydoblock ('foo', 123);

 2) Returning values from a DO block would also be handy. But I don't see
 why it should be restricted to OUT parameters. I'd suggest allowing a
 RETURNS clause, like in CREATE FUNCTION:

 DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

 or

 DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2 int4);

 - Heikki


Why don't we introduce temporary functions instead?

Pavel








Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 10:15 AM, Pavel Stehule wrote:

2014-09-16 9:10 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:


On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:


I'd like to propose support for IN and OUT parameters in 'DO' blocks.

Currently, anonymous code blocks (DO statements) can not receive or
return parameters.

I suggest:

1) Add a new clause to DO statement for specifying names, types,
directions and values of parameters:

DO code [LANGUAGE lang] [USING (arguments)]

where arguments has the same syntax as in
'CREATE FUNCTION name (arguments)'.

Example:

do $$ begin z := x || y; end; $$
language plpgsql
using
(
x text = '1',
in out y int4 = 123,
out z text
);

2) Values for IN and IN OUT parameters are specified using syntax for
default values of function arguments.

3) If DO statement has at least one of OUT or IN OUT parameters then it
returns one tuple containing values of OUT and IN OUT parameters.

Do you think that this feature would be useful? I have a
proof-of-concept patch in progress that I intend to publish soon.



There are two features here. One is to allow arguments to be passed to DO
statements. The other is to allow a DO statement to return a result. Let's
discuss them separately.

1) Passing arguments to a DO block can be useful feature, because it
allows you to pass parameters to the DO block without injecting them into
the string, which helps to avoid SQL injection attacks.

I don't like the syntax you propose though. It doesn't actually let you
pass the parameters out-of-band, so I don't really see the point. I think
this needs to work with PREPARE/EXECUTE, and the protocol-level
prepare/execute mechanism. Ie. something like this:

PREPARE mydoblock (text, int4) AS DO $$ ... $$
EXECUTE mydoblock ('foo', 123);

2) Returning values from a DO block would also be handy. But I don't see
why it should be restricted to OUT parameters. I'd suggest allowing a
RETURNS clause, like in CREATE FUNCTION:

DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

or

DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2 int4);


Why we don't introduce a temporary functions instead?


You can already do that:

create function pg_temp.tempfunc(i int4) returns int4 as $$ begin end; 
$$ language plpgsql;


Compared to DO, you have to do extra steps to create the function, and 
drop it when you're done. And you can't use them in a hot standby, 
because it changes the catalogs. (although a better solution to that 
would be to make it work, as well as temporary tables, but that's a much 
bigger project).
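
To make the comparison concrete, here is a minimal sketch (function name and
body invented for the example) of those extra steps:

-- Step 1: create the function in the pg_temp schema.
create function pg_temp.add_one(i int4) returns int4 as $$
begin
    return i + 1;
end;
$$ language plpgsql;

-- Step 2: call it.
select pg_temp.add_one(41);

-- Step 3: drop it, or rely on the temporary schema being cleaned up at
-- session end.
drop function pg_temp.add_one(int4);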


Maybe we don't need any of this and you can just use a temporary function. 
But clearly someone thought that DO statements are useful in general, 
because we've had temporary functions for ages and we nevertheless added 
the DO statement.


- Heikki





Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Hannu Krosing
On 09/16/2014 09:15 AM, Pavel Stehule wrote:


 2014-09-16 9:10 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com
 mailto:hlinnakan...@vmware.com:

 On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:

 I'd like to propose support for IN and OUT parameters in 'DO'
 blocks.

 Currently, anonymous code blocks (DO statements) can not
 receive or
 return parameters.

 I suggest:

 1) Add a new clause to DO statement for specifying names, types,
 directions and values of parameters:

 DO code [LANGUAGE lang] [USING (arguments)]

 where arguments has the same syntax as in
 'CREATE FUNCTION name (arguments)'.

 Example:

 do $$ begin z := x || y; end; $$
 language plpgsql
 using
 (
x text = '1',
in out y int4 = 123,
out z text
 );

 2) Values for IN and IN OUT parameters are specified using
 syntax for
 default values of function arguments.

 3) If DO statement has at least one of OUT or IN OUT
 parameters then it
 returns one tuple containing values of OUT and IN OUT parameters.

 Do you think that this feature would be useful? I have a
 proof-of-concept patch in progress that I intend to publish soon.


 There are two features here. One is to allow arguments to be
 passed to DO statements. The other is to allow a DO statement to
 return a result. Let's discuss them separately.

 1) Passing arguments to a DO block can be useful feature, because
 it allows you to pass parameters to the DO block without injecting
 them into the string, which helps to avoid SQL injection attacks.

 I don't like the syntax you propose though. It doesn't actually
 let you pass the parameters out-of-band, so I don't really see the
 point. I think this needs to work with PREPARE/EXECUTE, and the
 protocol-level prepare/execute mechanism. Ie. something like this:

 PREPARE mydoblock (text, int4) AS DO $$ ... $$
 EXECUTE mydoblock ('foo', 123);

 2) Returning values from a DO block would also be handy. But I
 don't see why it should be restricted to OUT parameters. I'd
 suggest allowing a RETURNS clause, like in CREATE FUNCTION:

 DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

 or

 DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2 int4);

 - Heikki


 Why we don't introduce a temporary functions instead?

As I see it, the DO blocks _are_ temporary (or rather in-line)
functions, though quite restricted in not taking arguments and not
returning anything.


Do you have a better syntax for temporary / in-line functions?

What I would like to do is to make DO blocks equal to any other data
source, so you could do

WITH mydoblock(col1, col2) AS (DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE
(col1 text, col2 int4))
SELECT * FROM mydoblock;

or

SELECT *
FROM (DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2
int4)) mydoblock;

and for the parameter-taking version

SELECT (DO $$ ... $$ LANGUAGE plpgsql USING (user) RETURNS
int4)(username) AS usernum
FROM users;
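
For comparison, the closest thing that works today is a throwaway
set-returning function in pg_temp used as a row source (a sketch, with
invented names; not a substitute for the in-line form above):

create function pg_temp.mydoblock() returns table (col1 text, col2 int4) as $$
begin
    -- stand-in body; an in-line DO block would go here instead
    return query select 'foo'::text, 123;
end;
$$ language plpgsql;

select * from pg_temp.mydoblock();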


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Emre Hasegeli
 No.  And we don't know how to change the default opclass without
 breaking things, either.  See previous discussions about how we
 might fix the totally-broken default gist opclass that btree_gist
 creates for the inet type [1].  The motivation for getting rid of that
 is *way* stronger than "it might be slow", but there's no apparent
 way to make something else be the default without creating havoc.

The inet case was not the same.  We tried to replace the default opclass
in contrib with another one in core.  It did not work because
pg_dump --binary-upgrade dumps the objects of the extension which
cannot be restored when there is a default opclass for the same
data type.

Changing the default opclasses should work if we make
pg_dump --binary-upgrade dump the default opclasses with indexes
and exclusion constraints.  I think it makes sense to do so in
--binary-upgrade mode.  I can try to come up with a patch for this.

I cannot see a way to rename opclasses in core.  I think we can live
with default opclasses which are not named as type_ops.




Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Alexander Korotkov
On Tue, Sep 16, 2014 at 11:29 AM, Emre Hasegeli e...@hasegeli.com wrote:

 Changing the default opclasses should work if we make
 pg_dump --binary-upgrade dump the default opclasses with indexes
 and exclusion constraints.  I think it makes sense to do so in
 --binary-upgrade mode.  I can try to come with a patch for this.


Can you explain it a bit more detail? I didn't get it.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Pavel Stehule
2014-09-16 9:24 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

 On 09/16/2014 10:15 AM, Pavel Stehule wrote:

 2014-09-16 9:10 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

  On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:

  I'd like to propose support for IN and OUT parameters in 'DO' blocks.

 Currently, anonymous code blocks (DO statements) can not receive or
 return parameters.

 I suggest:

 1) Add a new clause to DO statement for specifying names, types,
 directions and values of parameters:

 DO code [LANGUAGE lang] [USING (arguments)]

 where arguments has the same syntax as in
 'CREATE FUNCTION name (arguments)'.

 Example:

 do $$ begin z := x || y; end; $$
 language plpgsql
 using
 (
 x text = '1',
 in out y int4 = 123,
 out z text
 );

 2) Values for IN and IN OUT parameters are specified using syntax for
 default values of function arguments.

 3) If DO statement has at least one of OUT or IN OUT parameters then it
 returns one tuple containing values of OUT and IN OUT parameters.

 Do you think that this feature would be useful? I have a
 proof-of-concept patch in progress that I intend to publish soon.


 There are two features here. One is to allow arguments to be passed to DO
 statements. The other is to allow a DO statement to return a result.
 Let's
 discuss them separately.

 1) Passing arguments to a DO block can be useful feature, because it
 allows you to pass parameters to the DO block without injecting them into
 the string, which helps to avoid SQL injection attacks.

 I don't like the syntax you propose though. It doesn't actually let you
 pass the parameters out-of-band, so I don't really see the point. I think
 this needs to work with PREPARE/EXECUTE, and the protocol-level
 prepare/execute mechanism. Ie. something like this:

 PREPARE mydoblock (text, int4) AS DO $$ ... $$
 EXECUTE mydoblock ('foo', 123);

 2) Returning values from a DO block would also be handy. But I don't see
 why it should be restricted to OUT parameters. I'd suggest allowing a
 RETURNS clause, like in CREATE FUNCTION:

 DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

 or

 DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2 int4);


 Why we don't introduce a temporary functions instead?


 You can already do that:

 create function pg_temp.tempfunc(i int4) returns int4 as $$ begin end; $$
 language plpgsql;


It looks much more like a workaround than a supported feature.



 Compared to DO, you have to do extra steps to create the function, and
 drop it when you're done. And you can't use them in a hot standby, because
 it changes the catalogs. (although a better solution to that would be to
 make it work, as well as temporary tables, but that's a much bigger
 project).

 Maybe we don't need any of this, you can just use temporary function. But
 clearly someone though that DO statements are useful in general, because
 we've had temporary functions for ages and we nevertheless added the DO
 statement.


I am afraid we would create slightly obscure syntax, with real cost and
without real benefit.

Any new useful syntax should be clean, simple, natural and shorter than
CREATE FUNCTION ...

and without risk of conflicts with ANSI SQL.

I prefer typed session variables, where there is no risk of SQL injection
or performance loss. The benefit of typed server-side variables would
reach a wide group of users.

Pavel




 - Heikki




Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 10:44 AM, Pavel Stehule wrote:

2014-09-16 9:24 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:


On 09/16/2014 10:15 AM, Pavel Stehule wrote:


Why we don't introduce a temporary functions instead?



You can already do that:

create function pg_temp.tempfunc(i int4) returns int4 as $$ begin end; $$
language plpgsql;


it looks much more like workaround than supported feature.


What do you mean? How would the temporary functions you suggest look like?


Compared to DO, you have to do extra steps to create the function, and
drop it when you're done. And you can't use them in a hot standby, because
it changes the catalogs. (although a better solution to that would be to
make it work, as well as temporary tables, but that's a much bigger
project).

Maybe we don't need any of this, you can just use temporary function. But
clearly someone though that DO statements are useful in general, because
we've had temporary functions for ages and we nevertheless added the DO
statement.


I afraid so we create little bit obscure syntaxes, without real effect and
real cost

Any new useful syntax should be clean, simple, natural and shorter than
create function ...


Sure. I think adding a RETURNS clause to the existing DO syntax would be 
all of those.



and without risks a conflicts with ANSI SQL


DO is not in the standard, so no risk of conflicts there.


I prefer a typed session variables, where is not risk of SQL injection or
some performance lost. The benefit of typed server side variables can be
for wide group of users.


I don't see how session variables would help here. Sure, you could 
return a value from the DO-block by stashing it to a session variable 
and reading it out afterwards, but that's awkward.


- Heikki





Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Craig Ringer
On 09/16/2014 03:15 PM, Pavel Stehule wrote:

 Why we don't introduce a temporary functions instead?

I think that'd be a lot cleaner and simpler. It's something I've
frequently wanted, and as Heikki points out it's already possible by
creating the function in pg_temp, there just isn't the syntax sugar for
CREATE TEMPORARY FUNCTION.

So why not just add CREATE TEMPORARY FUNCTION?


It means two steps:

CREATE TEMPORARY FUNCTION ... $$ $$;

SELECT my_temp_function(blah);

but I'm not personally convinced that a parameterised DO block is much
easier, and the idea just rings wrong to me.


I agree with Pavel that the natural way to parameterise DO blocks, down
the track, will be to allow them to get (and set?) SQL-typed session
variables. Of course, we'd need to support them first ;-)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Hannu Krosing
On 09/16/2014 09:44 AM, Pavel Stehule wrote:


 2014-09-16 9:24 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com
 mailto:hlinnakan...@vmware.com:

 On 09/16/2014 10:15 AM, Pavel Stehule wrote:

 2014-09-16 9:10 GMT+02:00 Heikki Linnakangas
 hlinnakan...@vmware.com mailto:hlinnakan...@vmware.com:

 On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:

 I'd like to propose support for IN and OUT parameters
 in 'DO' blocks.

 Currently, anonymous code blocks (DO statements) can
 not receive or
 return parameters.

 I suggest:

 1) Add a new clause to DO statement for specifying
 names, types,
 directions and values of parameters:

 DO code [LANGUAGE lang] [USING (arguments)]

 where arguments has the same syntax as in
 'CREATE FUNCTION name (arguments)'.

 Example:

 do $$ begin z := x || y; end; $$
 language plpgsql
 using
 (
 x text = '1',
 in out y int4 = 123,
 out z text
 );

 2) Values for IN and IN OUT parameters are specified
 using syntax for
 default values of function arguments.

 3) If DO statement has at least one of OUT or IN OUT
 parameters then it
 returns one tuple containing values of OUT and IN OUT
 parameters.

 Do you think that this feature would be useful? I have a
 proof-of-concept patch in progress that I intend to
 publish soon.


 There are two features here. One is to allow arguments to
 be passed to DO
 statements. The other is to allow a DO statement to return
 a result. Let's
 discuss them separately.

 1) Passing arguments to a DO block can be useful feature,
 because it
 allows you to pass parameters to the DO block without
 injecting them into
 the string, which helps to avoid SQL injection attacks.

 I don't like the syntax you propose though. It doesn't
 actually let you
 pass the parameters out-of-band, so I don't really see the
 point. I think
 this needs to work with PREPARE/EXECUTE, and the
 protocol-level
 prepare/execute mechanism. Ie. something like this:

 PREPARE mydoblock (text, int4) AS DO $$ ... $$
 EXECUTE mydoblock ('foo', 123);

 2) Returning values from a DO block would also be handy.
 But I don't see
 why it should be restricted to OUT parameters. I'd suggest
 allowing a
 RETURNS clause, like in CREATE FUNCTION:

 DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

 or

 DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text,
 col2 int4);


 Why we don't introduce a temporary functions instead?


 You can already do that:

 create function pg_temp.tempfunc(i int4) returns int4 as $$ begin
 end; $$ language plpgsql;


 it looks much more like workaround than supported feature.
a straightforward CREATE TEMPORARY FUNCTION implementation would do
exactly that.
  


 Compared to DO, you have to do extra steps to create the function,
 and drop it when you're done. And you can't use them in a hot
 standby, because it changes the catalogs. (although a better
 solution to that would be to make it work, as well as temporary
 tables, but that's a much bigger project).

 Maybe we don't need any of this, you can just use temporary
 function. But clearly someone though that DO statements are useful
 in general, because we've had temporary functions for ages and we
 nevertheless added the DO statement.


 I afraid so we create little bit obscure syntaxes, without real effect
 and real cost
I would agree with you if we had session-level temporary functions

But then we would still miss anonymous/in-line/on-the-spot functions

 Any new useful syntax should be clean, simple, natural and shorter
 than create function ...
This is not how SQL works, nor ADA nor pl/pgsql ;)

 and without risks a conflicts with ANSI SQL

 I prefer a typed session variables, where is not risk of SQL injection
 or some performance lost. The benefit of typed server side variables
 can be for wide group of users.
Agreed

But this would be a much bigger project, as Heikki already mentioned re.
temp things on replicas.

Especially if typed session variables could hold temporary functions:

DECLARE FUNCTION mytempfunction () ...


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, 

Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Pavel Stehule
2014-09-16 9:58 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

 On 09/16/2014 10:44 AM, Pavel Stehule wrote:

 2014-09-16 9:24 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

  On 09/16/2014 10:15 AM, Pavel Stehule wrote:

  Why we don't introduce a temporary functions instead?


 You can already do that:

 create function pg_temp.tempfunc(i int4) returns int4 as $$ begin end; $$
 language plpgsql;


 it looks much more like workaround than supported feature.


 What do you mean? How would the temporary functions you suggest look like?


CREATE TEMPORARY FUNCTION ...



  Compared to DO, you have to do extra steps to create the function, and
 drop it when you're done. And you can't use them in a hot standby,
 because
 it changes the catalogs. (although a better solution to that would be to
 make it work, as well as temporary tables, but that's a much bigger
 project).

 Maybe we don't need any of this, you can just use temporary function. But
 clearly someone though that DO statements are useful in general, because
 we've had temporary functions for ages and we nevertheless added the DO
 statement.

  I afraid so we create little bit obscure syntaxes, without real effect
 and
 real cost

 Any new useful syntax should be clean, simple, natural and shorter than
 create function ...


 Sure. I think adding a RETURNS clause to the existing DO syntax would be
 all of those.

  and without risks a conflicts with ANSI SQL


 DO is not in the standard, so no risk of conflicts there.


I had a WITH ... proposal in mind



  I prefer a typed session variables, where is not risk of SQL injection or
 some performance lost. The benefit of typed server side variables can be
 for wide group of users.


 I don't see how session variables would help here. Sure, you could
 return a value from the DO-block by stashing it to a session variable and
 reading it out afterwards, but that's awkward.


You could use global variables to inject values into the block.

I am not against some simple parametrization, but I would not want more
complex work on the DO statement. It is messy now, and I don't see much
benefit in this area.

Pavel



 - Heikki




Re: [HACKERS] LIMIT for UPDATE and DELETE

2014-09-16 Thread Etsuro Fujita

(2014/08/15 6:18), Rukh Meski wrote:

Based on the feedback on my previous patch, I've separated only the
LIMIT part into its own feature.  This version plays nicely with
inheritance.  The intended use is splitting up big UPDATEs and DELETEs
into batches more easily and efficiently.


IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is 
that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be 
allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be 
better for the patch to support OFFSET at this point.  No?


Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Pavel Stehule
2014-09-16 10:01 GMT+02:00 Hannu Krosing ha...@2ndquadrant.com:

  On 09/16/2014 09:44 AM, Pavel Stehule wrote:



 2014-09-16 9:24 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

  On 09/16/2014 10:15 AM, Pavel Stehule wrote:

  2014-09-16 9:10 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

  On 09/16/2014 09:38 AM, Kalyanov Dmitry wrote:

  I'd like to propose support for IN and OUT parameters in 'DO' blocks.

 Currently, anonymous code blocks (DO statements) can not receive or
 return parameters.

 I suggest:

 1) Add a new clause to DO statement for specifying names, types,
 directions and values of parameters:

 DO code [LANGUAGE lang] [USING (arguments)]

 where arguments has the same syntax as in
 'CREATE FUNCTION name (arguments)'.

 Example:

 do $$ begin z := x || y; end; $$
 language plpgsql
 using
 (
 x text = '1',
 in out y int4 = 123,
 out z text
 );

 2) Values for IN and IN OUT parameters are specified using syntax for
 default values of function arguments.

 3) If DO statement has at least one of OUT or IN OUT parameters then it
 returns one tuple containing values of OUT and IN OUT parameters.

 Do you think that this feature would be useful? I have a
 proof-of-concept patch in progress that I intend to publish soon.


 There are two features here. One is to allow arguments to be passed to
 DO
 statements. The other is to allow a DO statement to return a result.
 Let's
 discuss them separately.

 1) Passing arguments to a DO block can be useful feature, because it
 allows you to pass parameters to the DO block without injecting them
 into
 the string, which helps to avoid SQL injection attacks.

 I don't like the syntax you propose though. It doesn't actually let you
 pass the parameters out-of-band, so I don't really see the point. I
 think
 this needs to work with PREPARE/EXECUTE, and the protocol-level
 prepare/execute mechanism. Ie. something like this:

 PREPARE mydoblock (text, int4) AS DO $$ ... $$
 EXECUTE mydoblock ('foo', 123);

 2) Returning values from a DO block would also be handy. But I don't see
 why it should be restricted to OUT parameters. I'd suggest allowing a
 RETURNS clause, like in CREATE FUNCTION:

 DO $$ ... $$ LANGUAGE plpgsql RETURNS int4;

 or

 DO $$ ... $$ LANGUAGE plpgsql RETURNS TABLE (col1 text, col2 int4);


   Why we don't introduce a temporary functions instead?


 You can already do that:

 create function pg_temp.tempfunc(i int4) returns int4 as $$ begin end; $$
 language plpgsql;


  it looks much more like workaround than supported feature.

 a straightforward CREATE TEMPORARY FUNCTION implementation would do
 exactly that.




 Compared to DO, you have to do extra steps to create the function, and
 drop it when you're done. And you can't use them in a hot standby, because
 it changes the catalogs. (although a better solution to that would be to
 make it work, as well as temporary tables, but that's a much bigger
 project).

 Maybe we don't need any of this, you can just use temporary function. But
 clearly someone though that DO statements are useful in general, because
 we've had temporary functions for ages and we nevertheless added the DO
 statement.


  I afraid so we create little bit obscure syntaxes, without real effect
 and real cost

 I would agree with you if we had session-level temporary functions

 But then we would still miss anonymous/in-line/on-the-spot functions


  Any new useful syntax should be clean, simple, natural and shorter than
 create function ...

 This is not how SQL works, nor ADA nor pl/pgsql ;)


sure -- two languages are hard to maintain, hard to develop. Three ... my
God :)


  and without risks a conflicts with ANSI SQL

  I prefer a typed session variables, where is not risk of SQL injection
 or some performance lost. The benefit of typed server side variables can be
 for wide group of users.

 Agreed

 but this would be a much bigger project, as Heikki already mentioned re.
 temp things on replicas.


probably



 especially if typed session variables could hold temporary functions .

 DECLARE FUNCTION mytempfucntion () ...


Why not? Once somebody solves the work on dynamic planning and all the
issues related to stored plans. We still have issues where some changes
require cleaning up the session (disconnecting).

Regards

Pavel




 Cheers

 --
 Hannu Krosing
 PostgreSQL Consultant
 Performance, Scalability and High Availability
 2ndQuadrant Nordic OÜ




Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 10:57 AM, Craig Ringer wrote:

On 09/16/2014 03:15 PM, Pavel Stehule wrote:


Why we don't introduce a temporary functions instead?


I think that'd be a lot cleaner and simpler. It's something I've
frequently wanted, and as Hekki points out it's already possible by
creating the function in pg_temp, there just isn't the syntax sugar for
CREATE TEMPORARY FUNCTION.

So why not just add CREATE TEMPORARY FUNCTION?


Sure, why not.


It means two steps:

CREATE TEMPORARY FUNCTION ... $$ $$;

SELECT my_temp_function(blah);

but I'm not personally convinced that a parameterised DO block is much
easier, and the idea just rings wrong to me.


With the above, you'll have to remember to drop the function when you're 
done, or deal with the fact that the function might already exist. 
That's doable, of course, but with a DO statement you don't have to.



I agree with Pavel that the natural way to parameterise DO blocks, down
the track, will be to allow them to get (and set?) SQL-typed session
variables. Of course, we'd need to support them first ;-)


I responded to Pavel that using a session variable for a return value 
would be awkward, but using them as parameters would open a different 
can of worms. A session variable might change while the statement is 
run, so for anything but trivial DO blocks, a best practice would have 
to be to copy the session variable to a local variable as the first 
thing to do. For example, if you just use session variables arg1 and 
arg2, and you call a function that uses those same session variables for 
some other purposes, you will be surprised. Also, you'd have to remember 
to reset the session variables after use if there's any sensitive 
information in them, or you might leak them to surprising places. And if 
you forget to pass an argument, i.e. you forget to set a session 
variable that's used as an argument, the parser would not help you to 
catch your mistake but would merrily run the DO block with whatever the 
content of the argument happens to be.
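
To illustrate the awkwardness (a sketch using custom GUCs via
set_config()/current_setting() as a stand-in, since typed session variables
don't exist yet; the setting name is made up):

SELECT set_config('myapp.arg1', 'alice', false);  -- caller "passes" the argument

DO $$
DECLARE
    arg1 text := current_setting('myapp.arg1');   -- copy to a local at once,
BEGIN                                              -- or later reads may see a
    RAISE NOTICE 'running for %', arg1;            -- value clobbered elsewhere
END;
$$ LANGUAGE plpgsql;

-- Nothing resets myapp.arg1 afterwards, and if the caller forgets the
-- set_config the block silently runs with whatever value is left over.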


Using session variables for arguments would be anything but natural.

- Heikki





Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Emre Hasegeli
  Changing the default opclasses should work if we make
  pg_dump --binary-upgrade dump the default opclasses with indexes
  and exclusion constraints.  I think it makes sense to do so in
  --binary-upgrade mode.  I can try to come with a patch for this.
 
 Can you explain it a bit more detail? I didn't get it.

pg_upgrade uses pg_dump --binary-upgrade to dump the schema of
the old database.  Currently, it generates CREATE INDEX statements
without an explicit opclass when the opclass is the default.  We can
change pg_dump to emit the opclass even when it is the default
in --binary-upgrade mode.
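
To illustrate with a familiar btree example (table and index names invented;
the same idea applies to the GIN opclasses under discussion):

CREATE TABLE t (col text);

-- What pg_dump emits today when the index uses the default opclass:
CREATE INDEX t_col_implicit ON t (col);

-- What it would emit in --binary-upgrade mode under the proposal: the
-- opclass is written out even though it is the default, so a later change
-- of the default opclass cannot silently change the restored index.
CREATE INDEX t_col_explicit ON t (col text_ops);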




Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Pavel Stehule
2014-09-16 10:09 GMT+02:00 Heikki Linnakangas hlinnakan...@vmware.com:

 On 09/16/2014 10:57 AM, Craig Ringer wrote:

 On 09/16/2014 03:15 PM, Pavel Stehule wrote:

  Why we don't introduce a temporary functions instead?


 I think that'd be a lot cleaner and simpler. It's something I've
 frequently wanted, and as Hekki points out it's already possible by
 creating the function in pg_temp, there just isn't the syntax sugar for
 CREATE TEMPORARY FUNCTION.

 So why not just add CREATE TEMPORARY FUNCTION?


 Sure, why not.

  It means two steps:

 CREATE TEMPORARY FUNCTION ... $$ $$;

 SELECT my_temp_function(blah);

 but I'm not personally convinced that a parameterised DO block is much
 easier, and the idea just rings wrong to me.


 With the above, you'll have to remember to drop the function when you're
 done, or deal with the fact that the function might already exist. That's
 doable, of course, but with a DO statement you don't have to.

  I agree with Pavel that the natural way to parameterise DO blocks, down
 the track, will be to allow them to get (and set?) SQL-typed session
 variables. Of course, we'd need to support them first ;-)


 I responded to Pavel that using a session variable for a return value
 would be awkward, but using them as parameters would open a different can
 of worms. A session variable might change while the statement is run, so
 for anything but trivial DO blocks, a best practice would have to be to
 copy the session variable to a local variable as the first thing to do. For
 example, if you just use session variables arg1 and arg2, and you call a
 function that uses those same session variables for some other purposes,
 you will be surprised. Also, you'd have to remember to reset the session
 variables after use if there's any sensitive information in them, or you
 might leak them to surprising places. And if you forget to pass an
 argument, i.e. you forget to set a session variable that's used as an
 argument, the parser would not help you to catch your mistake but would
 merrily run the DO block with whatever the content of the argument happens
 to be.


Personally, I can't imagine writing more complex code as a DO block.



 Using session variables for arguments would be anything but natural.


 - Heikki




Re: [HACKERS] Anonymous code block with parameters

2014-09-16 Thread Andres Freund
Hi,

On 2014-09-16 10:24:52 +0300, Heikki Linnakangas wrote:
 On 09/16/2014 10:15 AM, Pavel Stehule wrote:
 Why we don't introduce a temporary functions instead?
 
 You can already do that:
 
 create function pg_temp.tempfunc(i int4) returns int4 as $$ begin end; $$
 language plpgsql;

It's quite the, probably undocumented, hack though. I think it's hard to
find as a user, because it's more or less happenstance that it works. I
think we should introduce TEMPORARY properly for functions, but that's a
separate patch.

 Compared to DO, you have to do extra steps to create the function, and drop
 it when you're done. And you can't use them in a hot standby, because it
 changes the catalogs. (although a better solution to that would be to make
 it work, as well as temporary tables, but that's a much bigger project).

It'd be neat, but I really don't see it happening.

 Maybe we don't need any of this, you can just use temporary function. But
 clearly someone though that DO statements are useful in general, because
 we've had temporary functions for ages and we nevertheless added the DO
 statement.

Doing a CREATE FUNCTION like that has a mighty amount of cost
associated. If you're not using the DO interactively, but
programmatically the amount of catalog and cache churn can be
problematic. So I'm in favor of adding parameters to DO.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)

2014-09-16 Thread Amit Kapila
On Sat, Sep 13, 2014 at 1:33 AM, Heikki Linnakangas hlinnakan...@vmware.com
wrote:
 On 09/12/2014 10:54 PM, Abhijit Menon-Sen wrote:
 At 2014-09-12 22:38:01 +0300, hlinnakan...@vmware.com wrote:
 We probably should consider switching to a faster CRC algorithm again,
 regardless of what we do with compression.

 As it happens, I'm already working on resurrecting a patch that Andres
 posted in 2010 to switch to zlib's faster CRC implementation.

 As it happens, I also wrote an implementation of Slice-by-4 the other day
:-).
 Haven't gotten around to post it, but here it is.

In case we use this implementation for everything that uses the
COMP_CRC32() macro, won't it cause problems for older-version
databases?  I created a database with HEAD code and then tried to
start the server after applying this patch; it gives the error below:
FATAL:  incorrect checksum in control file

In general, the idea sounds quite promising.  To see how it performs
on small to medium size data, I have used the attached test, which was
written by you (with some additional tests) during the performance
testing of the WAL reduction patch in 9.4.

Performance Data
--
Non-default settings

autovacuum = off
checkpoint_segments = 256
checkpoint_timeout = 20 min

HEAD -
testname | wal_generated | duration
-+---+--
 two short fields, no change | 583802008 | 11.6727559566498
 two short fields, no change | 580888024 | 11.8558299541473
 two short fields, no change | 580889680 | 11.5449349880219
 two short fields, one changed   | 620646400 | 11.6657111644745
 two short fields, one changed   | 620667904 | 11.6010649204254
 two short fields, one changed   | 622079320 | 11.6774570941925
 two short fields, both changed  | 620649656 | 12.0892491340637
 two short fields, both changed  | 620648360 | 12.1650269031525
 two short fields, both changed  | 620653952 | 12.2125108242035
 one short and one long field, no change | 329018192 | 4.74178600311279
 one short and one long field, no change | 329021664 | 4.71507883071899
 one short and one long field, no change | 330326496 | 4.84932994842529
 ten tiny fields, all changed| 701358488 |  14.236780166626
 ten tiny fields, all changed| 701355328 | 14.0777900218964
 ten tiny fields, all changed| 701358272 | 14.1000919342041
 hundred tiny fields, all changed| 315656568 | 6.99316620826721
 hundred tiny fields, all changed| 314875488 | 6.85715913772583
 hundred tiny fields, all changed| 315263768 | 6.94613790512085
 hundred tiny fields, half changed   | 314878360 | 6.89090895652771
 hundred tiny fields, half changed   | 314877216 | 7.05924606323242
 hundred tiny fields, half changed   | 314881816 | 6.93445992469788
 hundred tiny fields, half nulled| 236244136 | 6.43347096443176
 hundred tiny fields, half nulled| 236248104 | 6.30539107322693
 hundred tiny fields, half nulled| 236501040 | 6.33403086662292
 9 short and 1 long, short changed   | 262373616 | 4.24646091461182
 9 short and 1 long, short changed   | 262375136 | 4.49821400642395
 9 short and 1 long, short changed   | 262379840 | 4.38264393806458
(27 rows)


Patched -
testname | wal_generated | duration
-+---+--
 two short fields, no change | 580897400 | 10.6518769264221
 two short fields, no change | 581779816 | 10.7118690013885
 two short fields, no change | 581013224 | 10.8294110298157
 two short fields, one changed   | 620646264 | 10.8309078216553
 two short fields, one changed   | 620652872 | 10.8480410575867
 two short fields, one changed   | 620812376 | 10.9162290096283
 two short fields, both changed  | 620651792 | 10.9025599956512
 two short fields, both changed  | 620652304 | 10.7771129608154
 two short fields, both changed  | 620649960 | 11.0185468196869
 one short and one long field, no change | 329022000 | 3.88278198242188
 one short and one long field, no change | 329023656 | 4.01899003982544
 one short and one long field, no change | 329022992 | 3.91587209701538
 ten tiny fields, all changed| 701353296 | 12.7748699188232
 ten tiny fields, all changed| 701354848 |  12.761589050293
 ten tiny fields, all changed| 701356520 | 12.6703131198883
 hundred tiny fields, all changed| 314879424 | 6.25606894493103
 hundred tiny fields, all changed| 314878416 | 6.32905578613281
 hundred tiny fields, all changed| 314878464 | 

Re: [HACKERS] Patch to support SEMI and ANTI join removal

2014-09-16 Thread David Rowley
On Sat, Sep 13, 2014 at 1:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrowle...@gmail.com writes:
  On Fri, Sep 12, 2014 at 3:35 AM, Robert Haas robertmh...@gmail.com
 wrote:
  I haven't read the patch, but I think the question is why this needs
  to be different than what we do for left join removal.

  I discovered over here -
 
 http://www.postgresql.org/message-id/CAApHDvo5wCRk7uHBuMHJaDpbW-b_oGKQOuisCZzHC25=h3_...@mail.gmail.com
  during the early days of the semi and anti join removal code that the
  planner was trying to generate paths to the dead rel. I managed to track
  the problem down to eclass members still existing for the dead rel. I
 guess
  we must not have eclass members for outer rels? or we'd likely have seen
  some troubles with left join removals already.

 Mere existence of an eclass entry ought not cause paths to get built.
 It'd be worth looking a bit harder into what's happening there.


It took me a bit of time to reproduce this problem because I didn't record
the actual query where I hit the assert failure the first time. However, I
have now managed to recreate the problem by removing the code I had added
that removes eclass members for dead rels.

Using the attached patch, the failing test case is:

create table b (id int primary key);
create table a (id int primary key, b_id int references b);
create index on a (b_id); -- add index to create alternative path

explain select a.* from a inner join b on b.id=a.b_id;

What seems to be happening is that generate_implied_equalities_for_column
generates a RestrictInfo for the dead rel due to the eclass member still
existing. This new rinfo gets matched to the index
by match_clauses_to_index()

The code then later fails in get_loop_count:

TRAP: FailedAssertion(!(outer_rel->rows > 0), File:
src\backend\optimizer\path\indxpath.c, Line: 1861)

The call stack looks like:

 postgres.exe!get_loop_count(PlannerInfo * root, Bitmapset * outer_relids)
Line 1861 C
  postgres.exe!build_index_paths(PlannerInfo * root, RelOptInfo * rel,
IndexOptInfo * index, IndexClauseSet * clauses, char useful_predicate,
SaOpControl saop_control, ScanTypeControl scantype) Line 938 C
  postgres.exe!get_index_paths(PlannerInfo * root, RelOptInfo * rel,
IndexOptInfo * index, IndexClauseSet * clauses, List * * bitindexpaths)
Line 745 C
  postgres.exe!get_join_index_paths(PlannerInfo * root, RelOptInfo * rel,
IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet *
jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths, Bitmapset
* relids, List * * considered_relids) Line 672 C
  postgres.exe!consider_index_join_outer_rels(PlannerInfo * root,
RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset,
IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * *
bitindexpaths, List * indexjoinclauses, int considered_clauses, List * *
considered_relids) Line 585 C
  postgres.exe!consider_index_join_clauses(PlannerInfo * root, RelOptInfo *
rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet *
jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths) Line 485 C
  postgres.exe!create_index_paths(PlannerInfo * root, RelOptInfo * rel)
Line 308 C
  postgres.exe!set_plain_rel_pathlist(PlannerInfo * root, RelOptInfo * rel,
RangeTblEntry * rte) Line 403 C
  postgres.exe!set_rel_pathlist(PlannerInfo * root, RelOptInfo * rel,
unsigned int rti, RangeTblEntry * rte) Line 337 C
  postgres.exe!set_base_rel_pathlists(PlannerInfo * root) Line 223 C
  postgres.exe!make_one_rel(PlannerInfo * root, List * joinlist) Line 157 C
  postgres.exe!query_planner(PlannerInfo * root, List * tlist, void
(PlannerInfo *, void *) * qp_callback, void * qp_extra) Line 236 C
  postgres.exe!grouping_planner(PlannerInfo * root, double tuple_fraction)
Line 1289 C
  postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse,
PlannerInfo * parent_root, char hasRecursion, double tuple_fraction,
PlannerInfo * * subroot) Line 573 C
  postgres.exe!standard_planner(Query * parse, int cursorOptions,
ParamListInfoData * boundParams) Line 211 C
  postgres.exe!planner(Query * parse, int cursorOptions, ParamListInfoData
* boundParams) Line 139 C
  postgres.exe!pg_plan_query(Query * querytree, int cursorOptions,
ParamListInfoData * boundParams) Line 750 C
  postgres.exe!ExplainOneQuery(Query * query, IntoClause * into,
ExplainState * es, const char * queryString, ParamListInfoData * params)
Line 330 C
  postgres.exe!ExplainQuery(ExplainStmt * stmt, const char * queryString,
ParamListInfoData * params, _DestReceiver * dest) Line 231 C
  postgres.exe!standard_ProcessUtility(Node * parsetree, const char *
queryString, ProcessUtilityContext context, ParamListInfoData * params,
_DestReceiver * dest, char * completionTag) Line 647 C
  postgres.exe!ProcessUtility(Node * parsetree, const char * queryString,
ProcessUtilityContext context, ParamListInfoData * params, _DestReceiver *
dest, char * completionTag) Line 314 C
  

Re: [HACKERS] WAL format and API changes (9.5)

2014-09-16 Thread Andres Freund
On 2014-09-15 15:41:22 +0300, Heikki Linnakangas wrote:
 Here we go. I've split this again into two patches. The first patch is just
 refactoring the current code. It moves XLogInsert into a new file,
 xloginsert.c, and the definition of XLogRecord to new xlogrecord.h header
 file. As a result, there is a a lot of churn in the #includes in C files
 that generate WAL records, or contain redo routines.  The number of files
 that pull in xlog.h - directly or indirectly through other headers - is
 greatly reduced.
 
 The second patch contains the interesting changes.
 
 I wrote a little benchmark kit to performance test this. I'm trying to find
 out two things:
 
 1) How much CPU overhead do the new XLogBeginInsert and XLogRegister*
 functions add, compared to the current approach with XLogRecDatas.
 
 2) How much extra WAL is generated with the patch. This affects the CPU time
 spent in the tests, but it's also interesting to measure directly, because
 WAL size affects many things like WAL archiving, streaming replication etc.
 
 Attached is the test kit I'm using. To run the battery of tests, use psql
 -f run.sql. To answer the question of WAL volume, it runs a bunch of tests
 that exercise heap insert, update and delete, as well as b-tree and GIN
 insertions. To answer the second test, it runs a heap insertion test, with a
 tiny record size that's chosen so that it generates exactly the same amount
 of WAL after alignment with and without the patch. The test is repeated many
 times, and the median of runtimes is printed out.
 
 Here are the results, comparing unpatched and patched versions. First, the
 WAL sizes:
 
 A heap insertion records are 2 bytes larger with the patch. Due to
 alignment, that makes for a 0 or 8 byte difference in the record sizes.
 Other WAL records have a similar store; a few extra bytes but no big
 regressions. There are a few outliers above where it appears that the
 patched version takes less space. Not sure why that would be, probably just
 a glitch in the test, autovacuum kicked in or something.

I have to admit, that's already not a painless amount of overhead.

 Now, for the CPU overhead:
 
   description   | dur_us (orig) | dur_us (patched) |   %
 +---+--+
  heap insert 30 | 0.7752835 | 0.831883 | 107.30
 (1 row)
 
 So, the patched version runs 7.3 % slower. That's disappointing :-(.
 
 This are the result I got on my laptop today. Previously, the typical result
 I've gotten has been about 5%, so that's a bit high. Nevertheless, even a 5%
 slowdown is probably not acceptable.

Yes, I definitely think it's not.

 While I've trying to nail down where that difference comes from, I've seen a
 lot of strange phenomenon. At one point, the patched version was 10% slower,
 but I was able to bring the difference down to 5% if I added a certain
 function in xloginsert.c, but never called it. It was very repeatable at the
 time, I tried adding and removing it many times and always got the same
 result, but I don't see it with the current HEAD and patch version anymore.
 So I think 5% is pretty close to the margin of error that arises from
 different compiler optimizations, data/instruction cache effects etc.
 
 Looking at the 'perf' profile, The new function calls only amount to about
 2% of overhead, so I'm not sure where the slowdown is coming from. Here are
 explanations I've considered, but I haven't been able to prove any of them:

I'd suggest doing:
a) perf stat -vvv of both workloads. That will often tell you stuff already
b) Look at other events. Particularly stalled-cycles-frontend,
   stalled-cycles-backend, cache-misses


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)

2014-09-16 Thread Andres Freund
On 2014-09-16 15:43:06 +0530, Amit Kapila wrote:
 On Sat, Sep 13, 2014 at 1:33 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:
  On 09/12/2014 10:54 PM, Abhijit Menon-Sen wrote:
  At 2014-09-12 22:38:01 +0300, hlinnakan...@vmware.com wrote:
  We probably should consider switching to a faster CRC algorithm again,
  regardless of what we do with compression.
 
  As it happens, I'm already working on resurrecting a patch that Andres
  posted in 2010 to switch to zlib's faster CRC implementation.
 
  As it happens, I also wrote an implementation of Slice-by-4 the other day
 :-).
  Haven't gotten around to post it, but here it is.
 
  In case we are using the implementation for everything that uses the
  COMP_CRC32() macro, won't it cause problems for older-version
  databases?  I have created a database with HEAD code and then
  tried to start the server after applying this patch; it gives the error below:
 FATAL:  incorrect checksum in control file

That's indicative of a bug. This really shouldn't cause such problems -
at least my version was compatible with the current definition, and IIRC
Heikki's should be the same in theory. If I read it right.

 In general, the idea sounds quite promising.  To see how it performs
  on small to medium size data, I have used the attached test, which was
  written by you (with some additional tests) during the performance testing
  of the WAL reduction patch in 9.4.

Yes, we should really do this.

 The patched version gives better results in all cases
  (in the range of 10~15%); though this is not the perfect test, it gives
  a fair idea that the patch is quite promising.  I think to test
  the benefit of CRC calculation for full pages, we can have a
  checkpoint during each test (maybe after the insert).  Let me know
  what other kinds of tests you think are required to see the
  gain/loss from this patch.

I actually think we don't really need this. It's pretty evident that
slice-by-4 is a clear improvement.

  I think the main difference between this patch and what Andres
  developed some time back was the code for a manually unrolled loop
  doing 32 bytes at once, so once Andres or Abhijit posts an
 updated version, we can do some performance tests to see
 if there is any additional gain.

If Heikki's version works I see little need to use my/Abhijit's
patch. That version has part of it under the zlib license. If Heikki's
version is a 'clean room', then I'd say we go with it. It looks really
quite similar though... We can make minor changes like additional
 unrolling without problems later on.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 01:28 PM, Andres Freund wrote:

On 2014-09-16 15:43:06 +0530, Amit Kapila wrote:

On Sat, Sep 13, 2014 at 1:33 AM, Heikki Linnakangas hlinnakan...@vmware.com
wrote:

On 09/12/2014 10:54 PM, Abhijit Menon-Sen wrote:

At 2014-09-12 22:38:01 +0300, hlinnakan...@vmware.com wrote:

We probably should consider switching to a faster CRC algorithm again,
regardless of what we do with compression.


As it happens, I'm already working on resurrecting a patch that Andres
posted in 2010 to switch to zlib's faster CRC implementation.


As it happens, I also wrote an implementation of Slice-by-4 the other day

:-).

Haven't gotten around to post it, but here it is.


In case we are using the implementation for everything that uses the
COMP_CRC32() macro, won't it cause problems for older-version
databases?  I have created a database with HEAD code and then
tried to start the server after applying this patch; it gives the error below:
FATAL:  incorrect checksum in control file


That's indicative of a bug. This really shouldn't cause such problems -
at least my version was compatible with the current definition, and IIRC
Heikki's should be the same in theory. If I read it right.


In general, the idea sounds quite promising.  To see how it performs
on small to medium size data, I have used the attached test, which was
written by you (with some additional tests) during the performance testing
of the WAL reduction patch in 9.4.


Yes, we should really do this.


The patched version gives better results in all cases
(in the range of 10~15%); though this is not the perfect test, it gives
a fair idea that the patch is quite promising.  I think to test
the benefit of CRC calculation for full pages, we can have a
checkpoint during each test (maybe after the insert).  Let me know
what other kinds of tests you think are required to see the
gain/loss from this patch.


I actually think we don't really need this. It's pretty evident that
slice-by-4 is a clear improvement.


I think the main difference between this patch and what Andres
developed some time back was the code for a manually unrolled loop
doing 32 bytes at once, so once Andres or Abhijit posts an
updated version, we can do some performance tests to see
if there is any additional gain.


If Heikki's version works I see little need to use my/Abhijit's
patch. That version has part of it under the zlib license. If Heikki's
version is a 'clean room', then I'd say we go with it. It looks really
quite similar though... We can make minor changes like additional
unrolling without problems later on.


I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as 
reference - you can probably see the similarity. Any implementation is 
going to look more or less the same, though; there aren't that many ways 
to write the implementation.
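
For readers who haven't seen the technique, here is a minimal sketch of a
slicing-by-4 inner loop; the table layout, names and byte-order handling are
illustrative assumptions, not the code from either patch:

#include <stdint.h>
#include <stddef.h>

/* crc_table[0] is the classic byte-at-a-time lookup table; crc_table[1..3]
 * are derived from it so that four input bytes can be folded per iteration. */
extern const uint32_t crc_table[4][256];

uint32_t
crc32_slice_by_4(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len >= 4)
    {
        /* fold four input bytes into the running CRC (little-endian view) */
        crc ^= (uint32_t) p[0] |
               ((uint32_t) p[1] << 8) |
               ((uint32_t) p[2] << 16) |
               ((uint32_t) p[3] << 24);
        crc = crc_table[3][crc & 0xFF] ^
              crc_table[2][(crc >> 8) & 0xFF] ^
              crc_table[1][(crc >> 16) & 0xFF] ^
              crc_table[0][(crc >> 24) & 0xFF];
        p += 4;
        len -= 4;
    }

    /* the remaining 0-3 bytes go through the ordinary one-byte loop */
    while (len-- > 0)
        crc = crc_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);

    return crc;
}

The point is simply that four table lookups and one 32-bit XOR replace four
iterations of the byte-at-a-time loop, at the cost of three extra 256-entry
tables.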


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)

2014-09-16 Thread Andres Freund
On 2014-09-16 13:49:20 +0300, Heikki Linnakangas wrote:
 I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as
 reference - you can probably see the similarity. Any implementation is going
 to look more or less the same, though; there aren't that many ways to write
 the implementation.

True.

I think I see what's the problem causing Amit's test to fail. Amit, did
you use the powerpc machine?

Heikki, you swap bytes unconditionally - afaics that's wrong on big
endian systems. My patch had:

+ static inline uint32 swab32(const uint32 x);
+ static inline uint32 swab32(const uint32 x){
+   return ((x & (uint32)0x000000ffUL) << 24) |
+   ((x & (uint32)0x0000ff00UL) << 8) |
+   ((x & (uint32)0x00ff0000UL) >> 8) |
+   ((x & (uint32)0xff000000UL) >> 24);
+ }
+ 
+ #if defined __BIG_ENDIAN__
+ #define cpu_to_be32(x) x
+ #else
+ #define cpu_to_be32(x) swab32(x)
+ #endif


I guess yours needs something similar. I personally like the cpu_to_be*
naming - it imo makes it pretty clear what happens.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Minor improvement in lock.sgml

2014-09-16 Thread Etsuro Fujita
Here is a patch to slightly improve the reference page for the LOCK
command.  I think it'd be better for the isolation level to be in
capitals and wrapped in the literal tags.

Thanks,

Best regards,
Etsuro Fujita
diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index 26e424b..913afe7 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -51,9 +51,9 @@ LOCK [ TABLE ] [ ONLY ] <replaceable class="PARAMETER">name</replaceable> [ * ]
    restrictive lock mode possible. <command>LOCK TABLE</command>
    provides for cases when you might need more restrictive locking.
    For example, suppose an application runs a transaction at the
-   Read Committed isolation level and needs to ensure that data in a
-   table remains stable for the duration of the transaction. To
-   achieve this you could obtain <literal>SHARE</> lock mode over the
+   <literal>READ COMMITTED</> isolation level and needs to ensure that
+   data in a table remains stable for the duration of the transaction.
+   To achieve this you could obtain <literal>SHARE</> lock mode over the
    table before querying. This will prevent concurrent data changes
    and ensure subsequent reads of the table see a stable view of
    committed data, because <literal>SHARE</> lock mode conflicts with

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-09-16 Thread Kouhei Kaigai
 On Mon, Sep 15, 2014 at 8:38 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
   The only reason why I put separate hooks here is,
    create_custom_scan() needs to know the exact size of the CustomScan
    node (including private fields); however, it is helpful for
    extensions to have their callback invoked to initialize the node right after the
  common initialization stuff.
 
  Why does it need to know that?  I don't see that it's doing anything
  that requires knowing the size of that node, and if it is, I think it
  shouldn't be.  That should get delegated to the callback provided by
  the custom plan provider.
 
   Sorry, my explanation might be confusing. The create_custom_scan()
   does not need to know the exact size of the CustomScan (or its
   inheritance) because of the two separate hooks; one is at node
   allocation time, the other at the tail of the series of initialization.
   If we have only one hook here, we need a mechanism that informs
   create_custom_scan() of the exact size of the CustomScan node, including
   private fields managed by the provider, instead of the first hook at
   node allocation time. In this case, node allocation shall be processed
   by create_custom_scan() and it has to know the exact size of the node to be
  allocated.
 
  How do I implement the feature here? Is the combination of static node
   size and a callback at the tail simpler than the existing design
  that takes two individual hooks on create_custom_scan()?
 
 I still don't get it.  Right now, the logic in create_custom_scan(), which
 I think should really be create_custom_plan() or
 create_plan_from_custom_path(), basically looks like this:
 
 1. call hook function CreateCustomPlan
 2. compute values for tlist and clauses
  3. pass those values to hook function InitCustomScan()
  4. call copy_path_costsize
 
 What I think we should do is:
 
 1. compute values for tlist and clauses
  2. pass those values to hook function PlanCustomPath(), which will return
  a Plan
  3. call copy_path_costsize
 
 create_custom_scan() does not need to allocate the node.  You don't need
 the node to be allocated before computing tlist and clauses.
 
Thanks, I could get the point.
I'll revise the patch according to the suggestion above.
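
For illustration, the shape suggested above might end up looking roughly like
the following sketch; the helper names, the exact PlanCustomPath() signature
and the CustomPath field layout are assumptions, not the final API:

/*
 * Rough sketch only: build_scan_tlist(), extract_scan_clauses() and
 * copy_plan_costsize_from_path() are placeholder helpers, not existing
 * planner functions.
 */
static Plan *
create_plan_from_custom_path(PlannerInfo *root, CustomPath *best_path)
{
    RelOptInfo *rel = best_path->path.parent;
    List       *tlist;
    List       *clauses;
    Plan       *plan;

    /* 1. compute values for tlist and clauses */
    tlist = build_scan_tlist(root, rel);          /* placeholder helper */
    clauses = extract_scan_clauses(root, rel);    /* placeholder helper */

    /* 2. let the provider allocate and fill in its own Plan node */
    plan = best_path->methods->PlanCustomPath(root, rel, best_path,
                                              tlist, clauses);

    /* 3. copy costs and row estimates from the path to the plan */
    copy_plan_costsize_from_path(plan, &best_path->path);  /* placeholder */

    return plan;
}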


It seems to me we can also apply a similar manner to ExecInitCustomScan().
What the current implementation does is:
1. call CreateCustomScanState() to allocate a CustomScanState node
2. common initialization of the fields on CustomScanState, but not private
   fields.
3. call BeginCustomScan() to initialize the remaining stuff and begin execution.

If BeginCustomScan() is re-defined to accept values for common initialization
portions and to return a CustomScanState node, we may be able to eliminate
the CreateCustomScanState() hook.

Unlike the create_custom_scan() case, it takes more values for the common
initialization portions: the expression trees of the tlist and quals, the scan and result
tuple slots, the projection info and the relation handle. It may mess up the interface
specification.
In addition, BeginCustomScan() would have to belong to CustomScanMethods, not
CustomExecMethods. I'm uncertain whether that is a straightforward location.
(a whisper: It may not need to be separate tables. CustomScan always
populates CustomScanState, unlike the relationship between CustomPath and
CustomScan.)

What is your opinion on applying the above manner to ExecInitCustomScan()
as well?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL format and API changes (9.5)

2014-09-16 Thread Michael Paquier
On Mon, Sep 15, 2014 at 9:16 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 2) XLogCheckBufferNeedsBackup is not used. It can be removed.

Please ignore this comment, XLogCheckBufferNeedsBackup is used in heapam.c.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sequence Access Method WIP

2014-09-16 Thread Andres Freund
On 2014-09-15 01:38:52 +0200, Petr Jelinek wrote:
 - int64 minv, maxv, incby, bool is_cycled - these are basically options
 giving info about how the new numbers are allocated (I guess some
 implementations are not going to support all of those)
 - bool is_called - the current built-in sequence generator behaves
 differently based on it and I am not sure we can get over it (it could
 perhaps be done in back-end independently of AM?)

I think we might be able to get rid of is_called entirely. Or at least
get rid of it from the view of the AMs.

 Also makes me think that the seqam options interface should also be passed
 the minv/maxv/incby/is_cycled etc options for validation, not just the
 amoptions.

Sup.

BTW: Is 'is_cycled' a horrible name, or is that just me? Horribly easy
to confuse with the fact that a sequence has already wrapped around...

 And it should return:
 
 int64 value - the first value allocated.
 int nvalues - the number of values allocated.
 am_private - updated private data.
 
 
 There is also more needed than this, you need:
 - int64 value - first value allocated (value to be returned)
 - int64 nvalues - number of values allocated
 - int64 last - last cached value (used for cached/last_value)
 - int64 next - last logged value (used for wal logging)
 - am_private - updated private data, must be possible to return as null

 I personally don't like that we need all the nvalues, next and last as
 it makes the seqam a little bit too aware of the sequence logging internals
 in my opinion but I haven't found a way around it - it's impossible for
 backend to know how the AM will act around incby/maxv/minv/cycling so it
 can't really calculate these values by itself, unless ofcourse we fix the
 behavior and require seqams to behave predictably, but that somewhat breaks
 the whole idea of leaving the allocation to the seqam. Obviously it would
 also work to return list of allocated values and then backend could
 calculate the value, nvalues, last, next from that list by itself,
 but I am worried about performance of that approach.

Yea, it's far from pretty.

I'm not convinced that the AM ever needs to/should care about
caching. To me that's more like a generic behaviour. So it probably
should be abstracted away from the individual AMs.

I think the allocation routine might also need to be able to indicate
whether WAL logging is needed or not.

One thing I want attention to be paid to is that the interface should be
able to support 'gapless' sequences. I.e. where nextval() (and thus
alloc) needs to wait until the last caller to it finished. That very
well can be relevant from the locking *and* WAL logging perspective.
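
To make the shape of the interface being discussed concrete, a hypothetical
sketch follows; the type, field and function names are illustrative only,
nothing here is an agreed design:

/* Hypothetical sketch only; names and fields are illustrative. */
typedef struct SeqAmAllocResult
{
    int64   value;       /* first value allocated (returned by nextval) */
    int64   nvalues;     /* number of values allocated */
    int64   last;        /* last cached value (cached/last_value) */
    int64   next;        /* last logged value (drives WAL logging) */
    bool    need_wal;    /* whether this allocation must be WAL-logged */
    Datum   am_private;  /* updated AM-private data, may be null/empty */
} SeqAmAllocResult;

typedef SeqAmAllocResult (*seqam_alloc_fn) (Relation seqrel,
                                            int64 minv,
                                            int64 maxv,
                                            int64 incby,
                                            bool is_cycled,
                                            Datum am_private);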

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scaling shared buffer eviction

2014-09-16 Thread Amit Kapila
On Sun, Sep 14, 2014 at 12:23 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Fri, Sep 12, 2014 at 11:55 AM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  On Thu, Sep 11, 2014 at 4:31 PM, Andres Freund and...@2ndquadrant.com
 wrote:
   On 2014-09-10 12:17:34 +0530, Amit Kapila wrote:

 I will post the data with the latest patch separately (where I will focus
 on new cases discussed between Robert and Andres).


Performance Data with latest version of patch.
All the data shown below is the median of 3 runs; for the data of each
individual run, refer to the attached document
(perf_read_scalability_data_v9.ods)

 Performance Data for Read-only test
 -
 Configuration and Db Details
 IBM POWER-7 16 cores, 64 hardware threads
 RAM = 64GB
 Database Locale =C
 checkpoint_segments=256
 checkpoint_timeout=15min
 shared_buffers=8GB
 scale factor = 3000
 Client Count = number of concurrent sessions and threads (ex. -c 8 -j 8)
 Duration of each individual run = 5mins

All the data is in tps and taken using pgbench read-only load

   Client_Count/Patch_Ver       8       16       32       64      128
   HEAD                     58614   107370   140717   104357    65010
   sbe_v9                   62943   119064   172246   220174   220904


Observations
-
1. It scales well, as with previous versions of the patch, but
the performance seems slightly better in a few cases,
maybe because I have removed a statement (if check)
or two in bgreclaimer (those were done under a spinlock), or it
could just be run-to-run variation.

 (1) A read-only pgbench workload that is just a tiny bit larger than
 shared_buffers, say size of shared_buffers plus 0.01%.  Such workloads
 tend to stress buffer eviction heavily.

When the data is just a tiny bit larger than shared buffers, there is
actually no scalability problem even in HEAD, because I think
most of the requests will be satisfied from the existing buffer pool.
I have taken data for some loads where the database size is a
bit larger than shared buffers; it is as follows:

Scale Factor - 800
Shared_Buffers - 12286MB (Total db size is 12288MB)


   Client_Count/Patch_Ver      1       8       16       32       64      128
   HEAD                     8406   68712       13   198481   290340   289828
   sbe_v9                   8504   68546   131926   195789   289959   289021

Scale Factor - 800
Shared_Buffers - 12166MB (Total db size is 12288MB)

   Client_Count/Patch_Ver      1       8       16       32       64      128
   HEAD                     8428   68609   128092   196596   292066   293812
   sbe_v9                   8386   68546   126926   197126   289959   287621

Observations
-
In most cases performance with the patch is slightly lower compared
to HEAD; the difference is generally less than 1%, and in a case
or two close to 2%. I think the main reason for the slight difference is that
when the size of shared buffers is almost the same as the data size, the number
of buffers it needs from the clock sweep is very small. As an example, in the first
case (when the size of shared buffers is 12286MB), it actually needs at most
256 additional buffers (2MB) via the clock sweep, whereas bgreclaimer
will put 2000 (high water mark) additional buffers (0.5% of shared buffers
is greater than 2000) in the free list, so bgreclaimer does some extra work
when it is not required, and it also leads to the condition you mentioned
below (the freelist will contain buffers that have already been touched since
we added them).  Now for case 2 (12166MB), we need more
than 2000 additional buffers, but not too many, so it can also have a
similar effect.
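
(To make the arithmetic explicit, assuming the default 8 kB block size:
12288MB - 12286MB = 2MB, i.e. roughly 256 buffers that backends could ever
need from the clock sweep, while 0.5% of the 12286MB / 8kB = 1,572,608 shared
buffers is about 7,863, which gets capped at the 2000-buffer high water mark;
so bgreclaimer keeps roughly 2000 buffers on the freelist even though only a
few hundred of them will ever be consumed.)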

I think we have below options related to this observation
a. Some further tuning in bgreclaimer, so that instead of putting
buffers up to the high water mark in the freelist, it puts just 1/4th or
1/2 of the high water mark and then checks if the free list still contains
less than or equal to the low water mark; if yes it continues, and if not
it can wait (or maybe some other way).
b. Instead of waking bgreclaimer when the number of buffers falls
below the low water mark, wake it when the number of times backends
do the clock sweep crosses a certain threshold.
c. Expose the low and high water marks as config knobs, so that in some
rare cases users can use them for tuning.
d. Let's not do anything; if a user does such a configuration, he should
be educated to configure shared buffers in a better way, and/or the
performance hit doesn't seem to be justified to do any further
work.

Now if we do either of 'a' or 'b', then I think there is a chance
that the gain might not be the same for cases where users can
easily get a benefit from this patch, and there is a chance that
it degrades the performance in some other case.

 (2) A workload that maximizes the rate of concurrent buffer eviction
 relative to other tasks.  Read-only pgbench is not bad for this, but
 maybe somebody's got a better idea.

I think the first test of pgbench (scale_factor-3000;shared_buffers-8GB)
addresses this case.

 As I sort of mentioned in what I was writing for the bufmgr README,
 there are, more or less, three ways this can fall down, at least that
 I can see: (1) if the high water mark is too high, then we'll start
 finding buffers in the freelist 

Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)

2014-09-16 Thread Amit Kapila
On Tue, Sep 16, 2014 at 4:27 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-09-16 13:49:20 +0300, Heikki Linnakangas wrote:
  I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as
  reference - you can probably see the similarity. Any implementation is
going
  to look more or less the same, though; there aren't that many ways to
write
  the implementation.

 True.

 I think I see what's the problem causing Amit's test to fail. Amit, did
 you use the powerpc machine?

Yes.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Triggers with DO functionality

2014-09-16 Thread Thom Brown
On 17 February 2012 22:42, Jaime Casanova ja...@2ndquadrant.com wrote:

 On Fri, Feb 17, 2012 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Has anybody stopped to look at the SQL standard for this?  In-line
  trigger definitions are actually what they intend, IIRC.
 

 this is what i found there

  <trigger definition> ::=
    CREATE TRIGGER <trigger name> <trigger action time> <trigger event>
    ON <table name> [ REFERENCING <transition table or variable list> ]
    <triggered action>

  <triggered action> ::=
    [ FOR EACH { ROW | STATEMENT } ]
    [ WHEN <left paren> <search condition> <right paren> ]
    <triggered SQL statement>

  <triggered SQL statement> ::=
  <SQL procedure statement>
    | BEGIN ATOMIC { <SQL procedure statement> <semicolon> }... END


*slightly delayed response*

 So it looks like the standard doesn't complicate the proposal from what I
can tell.

Here's our current syntax:

CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event
[ OR ... ] }
ON table_name
[ FROM referenced_table_name ]
[ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY
DEFERRED } ]
[ FOR [ EACH ] { ROW | STATEMENT } ]
[ WHEN ( condition ) ]
EXECUTE PROCEDURE function_name ( arguments )

Here's an updated syntax as per the proposal:

CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event
[ OR ... ] }
ON table_name
[ FROM referenced_table_name ]
[ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY
DEFERRED } ]
[ FOR [ EACH ] { ROW | STATEMENT } ]
[ WHEN ( condition ) ]
{ EXECUTE PROCEDURE function_name ( arguments )
  | AS 'trigger function definition' [ LANGUAGE lang_name ]
[ SET configuration_parameter { TO value | = value | FROM CURRENT }
]
}

Example:

CREATE TRIGGER trg_my_trigger
  BEFORE INSERT ON customers
  FOR EACH ROW
  AS $$
BEGIN
  IF NEW.status IS NULL THEN
  ...
END;
  $$ LANGUAGE plpgsql SET search_path = shop;

All anonymous trigger functions would be implicitly volatile.  I imagine
that the function would need to be owned by the trigger, meaning the
function is dropped with the trigger.

So should this then just create a function named after the trigger, perhaps
with a leading underscore? (e.g. _trg_my_trigger)

I would expect that the only differences between this and a regular
trigger-function pair would be:

The function is auto-generated and named after the trigger.
The function has deptype of 'i' (DEPENDENCY_INTERNAL) so that it can only
be dropped as part of the trigger.
The function can't be the target of ALTER FUNCTION, or if it can, only a
relevant sub-set.
The function can't be the target of CREATE OR REPLACE FUNCTION.

And then there are event triggers, which could have the same functionality.

Thom


Re: [HACKERS] Support for N synchronous standby servers

2014-09-16 Thread Robert Haas
On Mon, Sep 15, 2014 at 3:00 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 At least a set of hooks has the merit to say: do what you like with
 your synchronous node policy.

Sure.  I dunno if people will find that terribly user-friendly, so we
might not want that to be the ONLY thing we offer.

But even if it is, it is certainly better than a poke in the eye with
a sharp stick.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2014-09-16 Thread Andres Freund
On 2014-09-16 13:15:59 +0100, Thom Brown wrote:
 On 17 February 2012 22:42, Jaime Casanova ja...@2ndquadrant.com wrote:
 
  On Fri, Feb 17, 2012 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  
   Has anybody stopped to look at the SQL standard for this?  In-line
   trigger definitions are actually what they intend, IIRC.
  
 
  this is what i found there
 
   <trigger definition> ::=
     CREATE TRIGGER <trigger name> <trigger action time> <trigger event>
     ON <table name> [ REFERENCING <transition table or variable list> ]
     <triggered action>
  
   <triggered action> ::=
     [ FOR EACH { ROW | STATEMENT } ]
     [ WHEN <left paren> <search condition> <right paren> ]
     <triggered SQL statement>
  
   <triggered SQL statement> ::=
   <SQL procedure statement>
     | BEGIN ATOMIC { <SQL procedure statement> <semicolon> }... END
 
 
 *slightly delayed response*
 
  So it looks like the standard doesn't complicate the proposal from what I
 can tell.
 
 Here's our current syntax:
 
 CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event
 [ OR ... ] }
 ON table_name
 [ FROM referenced_table_name ]
 [ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY
 DEFERRED } ]
 [ FOR [ EACH ] { ROW | STATEMENT } ]
 [ WHEN ( condition ) ]
 EXECUTE PROCEDURE function_name ( arguments )
 
 Here's an updated syntax as per the proposal:
 
 CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event
 [ OR ... ] }
 ON table_name
 [ FROM referenced_table_name ]
 [ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY
 DEFERRED } ]
 [ FOR [ EACH ] { ROW | STATEMENT } ]
 [ WHEN ( condition ) ]
 { EXECUTE PROCEDURE function_name ( arguments )
   | AS 'trigger function definition' [ LANGUAGE lang_name ]
 [ SET configuration_parameter { TO value | = value | FROM CURRENT }
 ]
 }

I'm unconvinced that that's sufficient. You already noticed that you
need to add SET here. What's with e.g. SECURITY DEFINER? What's with
AS 'obj_file', 'link_symbol' when you create a C function? I think this
really would need to incorporate a more fundamental subset of CREATE
FUNCTION functionality.

 All anonymous trigger functions would be implicitly volatile.  I imagine
 that the function would need to be owned by the trigger, meaning the
 function is dropped with the trigger.

Right, that's necessary.

 So should this then just create a function named after the trigger, perhaps
 with a leading underscore? (e.g. _trg_my_trigger)

Hm...

 I would expect that the only differences between this and a regular
 trigger-function pair would be:
 
 The function is auto-generated and named after the trigger.

ok.

 The function has deptype of 'i' (DEPENDENCY_INTERNAL) so that it can only
 be dropped as part of the trigger.

ok.

 The function can't be the target of ALTER FUNCTION, or if it can, only a
 relevant sub-set.

ok.

 The function can't be the target of CREATE OR REPLACE FUNCTION.

That *really* sucks. To the point of making the feature useless in my
eyes. That's really something frequently done.

 And then there are event triggers, which could have the same functionality.

I think the need is much less there. You'll hardly create as many event
triggers as you create triggers on relations. Doesn't seem worth the effort.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2014-09-16 Thread Thom Brown
On 16 September 2014 13:29, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-09-16 13:15:59 +0100, Thom Brown wrote:
  On 17 February 2012 22:42, Jaime Casanova ja...@2ndquadrant.com wrote:
 
   On Fri, Feb 17, 2012 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   
Has anybody stopped to look at the SQL standard for this?  In-line
trigger definitions are actually what they intend, IIRC.
   
  
   this is what i found there
  
    <trigger definition> ::=
  CREATE TRIGGER <trigger name> <trigger action time> <trigger event>
  ON <table name> [ REFERENCING <transition table or variable
  list> ]
  <triggered action>
   
    <triggered action> ::=
  [ FOR EACH { ROW | STATEMENT } ]
  [ WHEN <left paren> <search condition> <right paren> ]
  <triggered SQL statement>
   
    <triggered SQL statement> ::=
    <SQL procedure statement>
  | BEGIN ATOMIC { <SQL procedure statement> <semicolon> }... END
 
 
  *slightly delayed response*
 
   So it looks like the standard doesn't complicate the proposal from what
 I
  can tell.
 
  Here's our current syntax:
 
  CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } {
 event
  [ OR ... ] }
  ON table_name
  [ FROM referenced_table_name ]
  [ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY
  DEFERRED } ]
  [ FOR [ EACH ] { ROW | STATEMENT } ]
  [ WHEN ( condition ) ]
  EXECUTE PROCEDURE function_name ( arguments )
 
  Here's an updated syntax as per the proposal:
 
  CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } {
 event
  [ OR ... ] }
  ON table_name
  [ FROM referenced_table_name ]
  [ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY
  DEFERRED } ]
  [ FOR [ EACH ] { ROW | STATEMENT } ]
  [ WHEN ( condition ) ]
  { EXECUTE PROCEDURE function_name ( arguments )
| AS 'trigger function definition' [ LANGUAGE lang_name ]
  [ SET configuration_parameter { TO value | = value | FROM
 CURRENT }
  ]
  }

 I'm unconvinced that that's sufficient. You already noticed that you
 need to add SET here. What's with e.g. SECURITY DEFINER? What's with
 AS 'obj_file', 'link_symbol' when you create a C function? I think this
 really would need to incorporate a more fundamental subset of CREATE
 FUNCTION functionality.


Fair enough, although others have mentioned that SECURITY DEFINER is pretty
much redundant on trigger functions anyway.


  The function can't be the target of CREATE OR REPLACE FUNCTION.

 That *really* sucks. To the point of making the feature useless in my
 eyes. That's really something frequently done.


Why not CREATE OR REPLACE TRIGGER?  Wouldn't the function itself be an
internal matter rather than something for users to worry about?  If the
user needs to adjust it, they'd need to discover the name of the function
the trigger referred to, which may not be trivial.



  And then there are event triggers, which could have the same
 functionality.

  I think the need is much less there. You'll hardly create as many event
 triggers as you create triggers on relations. Doesn't seem worth the
 effort.


Agreed, but I thought I'd mention it regardless.

-- 
Thom


Re: [HACKERS] Triggers with DO functionality

2014-09-16 Thread Andres Freund
On 2014-09-16 13:42:22 +0100, Thom Brown wrote:
   The function can't be the target of CREATE OR REPLACE FUNCTION.
 
  That *really* sucks. To the point of making the feature useless in my
  eyes. That's really something frequently done.
 
 
 Why not CREATE OR REPLACE TRIGGER?  Wouldn't the function itself be an
 internal matter rather than something for users to worry about?  If the
 user needs to adjust it, they'd need to discover the name of the function
 the trigger referred to, which may not be trivial.

Because CREATE OR REPLACE trigger has to take a heavy relation level
lock?  Because we don't have it? Because it'll allow to change things
that you really don't want to change?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2014-09-16 Thread Thom Brown
On 16 September 2014 13:45, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-09-16 13:42:22 +0100, Thom Brown wrote:
The function can't be the target of CREATE OR REPLACE FUNCTION.
  
   That *really* sucks. To the point of making the feature useless in my
   eyes. That's really something frequently done.
  
 
  Why not CREATE OR REPLACE TRIGGER?  Wouldn't the function itself be an
  internal matter rather than something for users to worry about?  If the
  user needs to adjust it, they'd need to discover the name of the function
  the trigger referred to, which may not be trivial.

 Because CREATE OR REPLACE trigger has to take a heavy relation level
 lock?  Because we don't have it? Because it'll allow to change things
 that you really don't want to change?


Would CREATE OR REPLACE trigger need a heavy relation level lock if just
the anonymous function body were changing?

My concern is mainly about us on one hand saying "Look, we've removed the
need for trigger statements when creating triggers", then on the other
saying "But if you want to change anything, treat it as if we hadn't done
that, and you'll need to go find the function that we made in the
background."

I guess if we were to do that, we would just need to make it clear that
this is all syntactic sugar, and things like \d+ table output would show
the trigger calling the anonymous function rather than showing the body
of the function as part of the trigger.

-- 
Thom


Re: [HACKERS] Triggers with DO functionality

2014-09-16 Thread Andres Freund
On 2014-09-16 13:54:49 +0100, Thom Brown wrote:
 On 16 September 2014 13:45, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-09-16 13:42:22 +0100, Thom Brown wrote:
 The function can't be the target of CREATE OR REPLACE FUNCTION.
   
That *really* sucks. To the point of making the feature useless in my
eyes. That's really something frequently done.
   
  
   Why not CREATE OR REPLACE TRIGGER?  Wouldn't the function itself be an
   internal matter rather than something for users to worry about?  If the
   user needs to adjust it, they'd need to discover the name of the function
   the trigger referred to, which may not be trivial.
 
  Because CREATE OR REPLACE trigger has to take a heavy relation level
  lock?  Because we don't have it? Because it'll allow to change things
  that you really don't want to change?
 
 
 Would CREATE OR REPLACE trigger need a heavy relational level lock if just
 the anonymous function body were changing?

I think it's unlikely to change. At the very, very least it'd require a
ShareUpdateExclusive lock on the relation.

 My concern is mainly about us on one hand saying Look, we've removed the
 need for trigger statements when creating triggers, then on the other
 saying But if you want to change anything, treat it as if we hadn't done
 that, and you'll need to go find the function that we made in the
 background.

So what? The reason for changing stuff is that it requires superfluous
and annoying typing, right? That's much less the case when you just want
to replace the function's contents after the fact.

 I guess if we were to do that, we would just need to make it clear that
 this is all syntactic sugar, and things like \d+ table output would show
 the trigger calling the anonymous function rather than showing the body
 of the function as part of the trigger.

I think that should be the case anyway.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Robert Haas
On Mon, Sep 15, 2014 at 7:44 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Sep 15, 2014 at 4:05 PM, Josh Berkus j...@agliodbs.com wrote:
 Actually, having the keys all at the same level *is* relevant for the
 issue we're discussing.  If those 270 keys are organized in a tree, it's
 not the same as having them all on one level (and not as problematic).

 I believe Robert meant that the 270 keys are not at the top level, but
 are at some level (in other words, some object has 270 pairs). That is
 equivalent to having them at the top level for the purposes of this
 discussion.

Yes, that's exactly what I meant.

 FWIW, I am slightly concerned about weighing use cases around very
 large JSON documents too heavily. Having enormous jsonb documents just
 isn't going to work out that well, but neither will equivalent designs
 in popular document database systems for similar reasons. For example,
 the maximum BSON document size supported by MongoDB is 16 megabytes,
 and that seems to be something that their users don't care too much
 about. Having 270 pairs in an object isn't unreasonable, but it isn't
 going to be all that common either.

The JSON documents in this case were not particularly large.  These
objects were  100kB; they just had a lot of keys.   I'm a little
baffled by the apparent theme that people think that (object size) /
(# of keys) will tend to be large.  Maybe there will be some instances
where that's the case, but it's not what I'd expect.  I would expect
people to use JSON to serialize structured data in situations where
normalizing would be unwieldly.

For example, pick your favorite Facebook or Smartphone game - Plants
vs. Zombies, Farmville, Candy Crush Saga, whatever.  Or even a
traditional board game like chess.  Think about what the game state
looks like as an abstract object.  Almost without exception, you've
got some kind of game board with a bunch of squares and then you have
a bunch of pieces (plants, crops, candies, pawns) that are positioned
on those squares.  Now you want to store this in a database.  You're
certainly not going to have a table column per square, and EAV would
be stupid, so what's left?  You could use an array, but an array of
strings might not be descriptive enough; for a square in Farmville,
for example, you might need to know the type of crop, and whether it
was fertilized with special magic fertilizer, and when it's going to
be ready to harvest, and when it'll wither if not harvested.  So a
JSON is a pretty natural structure: an array of arrays of objects.  If
you have a 30x30 farm, you'll have 900 keys.  If you have a 50x50
farm, which probably means you're spending real money to buy imaginary
plants, you'll have 2500 keys.

(For the record, I have no actual knowledge of how any of these games
are implemented under the hood.  I'm just speculating on how I would
have done it.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collation-aware comparisons in GIN opclasses

2014-09-16 Thread Alexander Korotkov
On Tue, Sep 16, 2014 at 12:14 PM, Emre Hasegeli e...@hasegeli.com wrote:

   Changing the default opclasses should work if we make
   pg_dump --binary-upgrade dump the default opclasses with indexes
   and exclusion constraints.  I think it makes sense to do so in
    --binary-upgrade mode.  I can try to come up with a patch for this.
 
  Can you explain it a bit more detail? I didn't get it.

 pg_upgrade uses pg_dump --binary-upgrade to dump the schema of
 the old database.  Now, it generates CREATE INDEX statements without
 explicit opclass if opclass is the default.  We can change pg_dump
 to generate the statements with opclass even if opclass is the default
 in --binary-upgrade mode.


Thanks, I get it. I checked the pg_dump implementation. It appears to be not as
easy as it could be. pg_dump doesn't form the index definition by itself; it
calls the pg_get_indexdef function. This function has no option to dump the names
of default opclasses. Since we can't change the behaviour of old postgres
versions, we would have to make pg_dump form the index definition by itself.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] LIMIT for UPDATE and DELETE

2014-09-16 Thread Kevin Grittner
Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote:

 (2014/08/15 6:18), Rukh Meski wrote:
 Based on the feedback on my previous patch, I've separated only the
 LIMIT part into its own feature.  This version plays nicely with
 inheritance.  The intended use is splitting up big UPDATEs and DELETEs
 into batches more easily and efficiently.

 IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is
 that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be
 allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be
 better for the patch to support OFFSET at this point.  No?

Without ORDER BY you really would have no idea *which* rows the
OFFSET would be skipping.  Even more dangerously, you might *think*
you do, and get a surprise when you see the results (if, for
example, a seqscan starts at a point other than the start of the
heap, due to a concurrent seqscan from an unrelated query).  It
might be better not to provide an illusion of a degree of control
you don't have, especially for UPDATE and DELETE operations.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Josh Berkus
On 09/16/2014 06:31 AM, Robert Haas wrote:
 On Mon, Sep 15, 2014 at 7:44 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Sep 15, 2014 at 4:05 PM, Josh Berkus j...@agliodbs.com wrote:
 Actually, having the keys all at the same level *is* relevant for the
 issue we're discussing.  If those 270 keys are organized in a tree, it's
 not the same as having them all on one level (and not as problematic).

 I believe Robert meant that the 270 keys are not at the top level, but
 are at some level (in other words, some object has 270 pairs). That is
 equivalent to having them at the top level for the purposes of this
 discussion.
 
 Yes, that's exactly what I meant.
 
 FWIW, I am slightly concerned about weighing use cases around very
 large JSON documents too heavily. Having enormous jsonb documents just
 isn't going to work out that well, but neither will equivalent designs
 in popular document database systems for similar reasons. For example,
 the maximum BSON document size supported by MongoDB is 16 megabytes,
 and that seems to be something that their users don't care too much
 about. Having 270 pairs in an object isn't unreasonable, but it isn't
 going to be all that common either.

Well, I can only judge from the use cases I personally have, none of
which involve more than 100 keys at any level for most rows.  So far
I've seen some people argue hypothetical use cases involving hundreds of
keys per level, but nobody who *actually* has such a use case.  Also,
note that we currently don't know where the last value extraction
becomes a performance problem at this stage, except that it's somewhere
between 200 and 100,000.  Also, we don't have a test which shows the
hybrid approach (Heikki's patch) performing better with 1000's of keys.

Basically, if someone is going to make a serious case for Heikki's
hybrid approach over the simpler lengths-only approach, then please post
some test data showing the benefit ASAP, since I can't demonstrate it.
Otherwise, let's get beta 3 out the door so we can get the 9.4 release
train moving again.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scaling shared buffer eviction

2014-09-16 Thread Robert Haas
On Tue, Sep 16, 2014 at 8:18 AM, Amit Kapila amit.kapil...@gmail.com
wrote:

 In most cases performance with patch is slightly less as compare
 to HEAD and the difference is generally less than 1% and in a case
 or 2 close to 2%. I think the main reason for slight difference is that
 when the size of shared buffers is almost same as data size, the number
 of buffers it needs from clock sweep are very less, as an example in first
 case (when size of shared buffers is 12286MB), it actually needs at most
 256 additional buffers (2MB) via clock sweep, where as bgreclaimer
 will put 2000 (high water mark) additional buffers (0.5% of shared buffers
 is greater than 2000 ) in free list, so bgreclaimer does some extra work
 when it is not required and it also leads to condition you mentioned
 down (freelist will contain buffers that have already been touched since
 we added them).  Now for case 2 (12166MB), we need buffers more
 than 2000 additional buffers, but not too many, so it can also have
 similar effect.


So there are two suboptimal things that can happen and they pull in
opposite directions.  I think you should instrument the server how often
each is happening.  #1 is that we can pop a buffer from the freelist and
find that it's been touched.  That means we wasted the effort of putting it
on the freelist in the first place.  #2 is that we can want to pop a buffer
from the freelist and find it empty and thus be forced to run the clock
sweep ourselves.   If we're having problem #1, we could improve things by
reducing the water marks.  If we're having problem #2, we could improve
things by increasing the water marks.  If we're having both problems, then
I dunno.  But let's get some numbers on the frequency of these specific
things, rather than just overall tps numbers.
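
A rough sketch of counting those two events; the counter names, the helper
shape and where the counts would be incremented are assumptions, nothing here
is from the patch:

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical counters; how they are reported (log line, stats view, ...)
 * is left open. */
static uint64_t freelist_popped_but_touched = 0;   /* case #1 */
static uint64_t freelist_empty_clock_sweeps = 0;   /* case #2 */

/* Illustrative shape only: 'from_freelist' stands for "the freelist had a
 * buffer for us", 'touched' for "that buffer's refcount or usage count was
 * already nonzero when we popped it". */
static void
count_freelist_event(bool from_freelist, bool touched)
{
    if (from_freelist && touched)
        freelist_popped_but_touched++;   /* bgreclaimer's work was wasted */
    else if (!from_freelist)
        freelist_empty_clock_sweeps++;   /* backend had to run the clock sweep */
}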


 I think we have below options related to this observation
 a. Some further tuning in bgreclaimer, so that instead of putting
 the buffers up to high water mark in freelist, it puts just 1/4th or
 1/2 of high water mark and then check if the free list still contains
 lesser than equal to low water mark, if yes it continues and if not
 then it can wait (or may be some other way).


That sounds suspiciously like just reducing the high water mark.


 b. Instead of waking bgreclaimer when the number of buffers fall
 below low water mark, wake when the number of times backends
 does clock sweep crosses certain threshold


That doesn't sound helpful.


 c. Give low and high water mark as config knobs, so that in some
 rare cases users can use them to do tuning.


Yuck.


 d. Lets not do anything as if user does such a configuration, he should
 be educated to configure shared buffers in a better way and or the
 performance hit doesn't seem to be justified to do any further
 work.


At least worth entertaining.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Robert Haas
On Tue, Sep 16, 2014 at 12:47 PM, Josh Berkus j...@agliodbs.com wrote:
 On 09/16/2014 06:31 AM, Robert Haas wrote:
 On Mon, Sep 15, 2014 at 7:44 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Sep 15, 2014 at 4:05 PM, Josh Berkus j...@agliodbs.com wrote:
 Actually, having the keys all at the same level *is* relevant for the
 issue we're discussing.  If those 270 keys are organized in a tree, it's
 not the same as having them all on one level (and not as problematic).

 I believe Robert meant that the 270 keys are not at the top level, but
 are at some level (in other words, some object has 270 pairs). That is
 equivalent to having them at the top level for the purposes of this
 discussion.

 Yes, that's exactly what I meant.

 FWIW, I am slightly concerned about weighing use cases around very
 large JSON documents too heavily. Having enormous jsonb documents just
 isn't going to work out that well, but neither will equivalent designs
 in popular document database systems for similar reasons. For example,
 the maximum BSON document size supported by MongoDB is 16 megabytes,
 and that seems to be something that their users don't care too much
 about. Having 270 pairs in an object isn't unreasonable, but it isn't
 going to be all that common either.

 Well, I can only judge from the use cases I personally have, none of
 which involve more than 100 keys at any level for most rows.  So far
 I've seen some people argue hypothetical use cases involving hundreds of
 keys per level, but nobody who *actually* has such a use case.

I already told you that I did, and that it was the one and only app I
had written for JSONB.

 Also,
 note that we currently don't know where the last value extraction
 becomes a performance problem at this stage, except that it's somewhere
 between 200 and 100,000.  Also, we don't have a test which shows the
 hybrid approach (Heikki's patch) performing better with 1000's of keys.

Fair point.

 Basically, if someone is going to make a serious case for Heikki's
 hybrid approach over the simpler lengths-only approach, then please post
 some test data showing the benefit ASAP, since I can't demonstrate it.
 Otherwise, let's get beta 3 out the door so we can get the 9.4 release
 train moving again.

I don't personally care about this enough to spend more time on it.  I
told you my extremely limited experience because it seems to
contradict your broader experience.  If you don't care, you don't
care.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LIMIT for UPDATE and DELETE

2014-09-16 Thread Robert Haas
On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner kgri...@ymail.com wrote:
 Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote:
 (2014/08/15 6:18), Rukh Meski wrote:
 Based on the feedback on my previous patch, I've separated only the
 LIMIT part into its own feature.  This version plays nicely with
 inheritance.  The intended use is splitting up big UPDATEs and DELETEs
 into batches more easily and efficiently.

 IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is
 that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be
 allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be
 better for the patch to support OFFSET at this point.  No?

 Without ORDER BY you really would have no idea *which* rows the
 OFFSET would be skipping.  Even more dangerously, you might *think*
 you do, and get a surprise when you see the results (if, for
 example, a seqscan starts at a point other than the start of the
 heap, due to a concurrent seqscan from an unrelated query).  It
 might be better not to provide an illusion of a degree of control
 you don't have, especially for UPDATE and DELETE operations.

Fair point, but I'd lean toward including it.  I think we all agree
the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
meaning.  If we don't include it now, we'll just end up adding it
later.  It makes for fewer patches, and fewer changes for users, if we
do it all at once.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Josh Berkus
On 09/16/2014 09:54 AM, Robert Haas wrote:
 On Tue, Sep 16, 2014 at 12:47 PM, Josh Berkus j...@agliodbs.com wrote:
 On 09/16/2014 06:31 AM, Robert Haas wrote:
 On Mon, Sep 15, 2014 at 7:44 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Sep 15, 2014 at 4:05 PM, Josh Berkus j...@agliodbs.com wrote:
 Actually, having the keys all at the same level *is* relevant for the
 issue we're discussing.  If those 270 keys are organized in a tree, it's
 not the same as having them all on one level (and not as problematic).

 I believe Robert meant that the 270 keys are not at the top level, but
 are at some level (in other words, some object has 270 pairs). That is
 equivalent to having them at the top level for the purposes of this
 discussion.

 Yes, that's exactly what I meant.

 FWIW, I am slightly concerned about weighing use cases around very
 large JSON documents too heavily. Having enormous jsonb documents just
 isn't going to work out that well, but neither will equivalent designs
 in popular document database systems for similar reasons. For example,
 the maximum BSON document size supported by MongoDB is 16 megabytes,
 and that seems to be something that their users don't care too much
 about. Having 270 pairs in an object isn't unreasonable, but it isn't
 going to be all that common either.

 Well, I can only judge from the use cases I personally have, none of
 which involve more than 100 keys at any level for most rows.  So far
 I've seen some people argue hypothetical use cases involving hundreds of
 keys per level, but nobody who *actually* has such a use case.
 
  I already told you that I did, and that it was the one and only app I
 had written for JSONB.

Ah, ok, I thought yours was a test case.  Did you check how it performed
on the two patches at all?  My tests with 185 keys didn't show any
difference, including for a last key case.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Selectivity estimation for inet operators

2014-09-16 Thread Brightwell, Adam

 New version with semi join estimation function attached.


I have performed the following initial review:

- Patch format. -- submitted as unified, but not sure it makes it any
easier to read than context format.
- Apply to current master (77e65bf).  -- success (though I do get a
"Stripping trailing CRs from patch" notification)
- check-world -- success
- Whitespace - were the whitespace changes in pg_operator.h necessary?

As for implementation, I'll leave that to those with a better understanding
of the purpose/expectations of the modified functions.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


Re: [HACKERS] jsonb contains behaviour weirdness

2014-09-16 Thread Josh Berkus
On 09/15/2014 11:12 AM, Tom Lane wrote:
 Or are you proposing that JSONARRAY @> JSONARRAY should work differently
 from ARRAY @> ARRAY?
 
 And no.  It's a bug that jsonb array containment works differently from
 regular array containment.  We understand the source of the bug, ie a
 mistaken optimization.  I don't see why there's much need for discussion
 about anything except whether removing the optimization altogether
 (as Peter proposed) is the best fix, or whether we want to retain
 some weaker form of it.

Right, so I was just saying that after we fix this behavior, the
behavior of JSONARRAY @> JSONARRAY should be commented somewhere because
that comparison may not work the way users who are not long-time
postgres users expect.  Heck, I've personally done very little ARRAY @>
ARRAY myself in 12 years of using PostgreSQL arrays; I had to test it to
verify the current behavior.

Not sure exactly where this note should go, mind you.
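
For the archives, a minimal illustration of the semantics in question, assuming
the fix makes jsonb array containment match regular array containment
(set-like, ignoring element order and duplicates):

-- regular array containment ignores order and duplicates
SELECT ARRAY[1,2] @> ARRAY[2,1,1];           -- true
-- jsonb array containment should behave the same once the count-based
-- optimization is removed
SELECT '[1,2]'::jsonb @> '[2,1,1]'::jsonb;   -- true after the fix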

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replicating DROP commands across servers

2014-09-16 Thread Brightwell, Adam

  I think there's been some changes to this patch since july, care to
  resend a new version?

 Sure, here it is.

 The only difference with the previous version is that it now also
 supports column defaults.  This was found to be a problem when you drop
 a sequence that some column default depends on -- for example a column
 declared SERIAL, or a sequence marked with ALTER SEQUENCE OWNED BY.  The
 new code is able to drop both the sequence and the default value
 (leaving, of course, the rest of the column intact.)  This required
 adding support for such objects in get_object_address.


I have given this patch the following review:

- Apply to current master (77e65bf).  -- success
- check-world. --success
- multiple FIXME statements still exist -- are there plans to fix these
items? Can the duplicated code be extracted to a static function?

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 07:47 PM, Josh Berkus wrote:

On 09/16/2014 06:31 AM, Robert Haas wrote:

On Mon, Sep 15, 2014 at 7:44 PM, Peter Geoghegan p...@heroku.com wrote:

On Mon, Sep 15, 2014 at 4:05 PM, Josh Berkus j...@agliodbs.com wrote:

Actually, having the keys all at the same level *is* relevant for the
issue we're discussing.  If those 270 keys are organized in a tree, it's
not the same as having them all on one level (and not as problematic).


I believe Robert meant that the 270 keys are not at the top level, but
are at some level (in other words, some object has 270 pairs). That is
equivalent to having them at the top level for the purposes of this
discussion.


Yes, that's exactly what I meant.


FWIW, I am slightly concerned about weighing use cases around very
large JSON documents too heavily. Having enormous jsonb documents just
isn't going to work out that well, but neither will equivalent designs
in popular document database systems for similar reasons. For example,
the maximum BSON document size supported by MongoDB is 16 megabytes,
and that seems to be something that their users don't care too much
about. Having 270 pairs in an object isn't unreasonable, but it isn't
going to be all that common either.


Well, I can only judge from the use cases I personally have, none of
which involve more than 100 keys at any level for most rows.  So far
I've seen some people argue hypothetical use cases involving hundreds of
keys per level, but nobody who *actually* has such a use case.  Also,
note that we currently don't know where the last value extraction
becomes a performance problem at this stage, except that it's somewhere
between 200 and 100,000.  Also, we don't have a test which shows the
hybrid approach (Heikki's patch) performing better with 1000's of keys.

Basically, if someone is going to make a serious case for Heikki's
hybrid approach over the simpler lengths-only approach, then please post
some test data showing the benefit ASAP, since I can't demonstrate it.
Otherwise, let's get beta 3 out the door so we can get the 9.4 release
train moving again.


Are you looking for someone with a real life scenario, or just synthetic 
test case? The latter is easy to do.


See attached test program. It's basically the same I posted earlier. 
Here are the results from my laptop with Tom's jsonb-lengths-merged.patch:


postgres=# select * from testtimes ;
 elem | duration_ms
--+-
   11 |0.289508
   12 |0.288122
   13 |0.290558
   14 |0.287889
   15 |0.286303
   17 |0.290415
   19 |0.289829
   21 |0.289783
   23 |0.287104
   25 |0.289834
   28 |0.290735
   31 |0.291844
   34 |0.293454
   37 |0.293866
   41 |0.291217
   45 |0.289243
   50 |0.290385
   55 |0.292085
   61 |0.290892
   67 |0.292335
   74 |0.292561
   81 |0.291416
   89 |0.295714
   98 | 0.29844
  108 |0.297421
  119 |0.299471
  131 |0.299877
  144 |0.301604
  158 |0.303365
  174 |0.304203
  191 |0.303596
  210 |0.306526
  231 |0.304189
  254 |0.307782
  279 |0.307372
  307 |0.306873
  338 |0.310471
  372 |  0.3151
  409 |0.320354
  450 | 0.32038
  495 |0.322127
  545 |0.323256
  600 |0.330419
  660 |0.334226
  726 |0.336951
  799 | 0.34108
  879 |0.347746
  967 |0.354275
 1064 |0.356696
 1170 |0.366906
 1287 |0.375352
 1416 |0.392952
 1558 |0.392907
 1714 |0.402157
 1885 |0.412384
 2074 |0.425958
 2281 |0.435415
 2509 | 0.45301
 2760 |0.469983
 3036 |0.487329
 3340 |0.505505
 3674 |0.530412
 4041 |0.552585
 4445 |0.581815
 4890 |0.610509
 5379 |0.642885
 5917 |0.680395
 6509 |0.713849
 7160 |0.757561
 7876 |0.805225
 8664 |0.856142
 9530 |0.913255
(72 rows)

That's up to 9530 elements - it's pretty easy to extrapolate from there 
to higher counts, it's O(n).
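
For readers without the attachment, a minimal sketch of this kind of synthetic
test (object shape and names are invented; the attached script probes many key
positions rather than just the last one):

-- one jsonb object with 10000 keys; extracting a late key is what the
-- timings above measure
CREATE TABLE jsonb_probe AS
SELECT ('{' || string_agg('"k' || i || '": ' || i, ',') || '}')::jsonb AS doc
FROM generate_series(1, 10000) AS i;

EXPLAIN ANALYZE SELECT doc -> 'k10000' FROM jsonb_probe;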


With unpatched git master, the runtime is flat, regardless of which 
element is queried, at about 0.29 s. With 
jsonb-with-offsets-and-lengths-2.patch, there's no difference that I 
could measure.


The difference starts to be meaningful at around 500 entries. In 
practice, I doubt anyone's going to notice until you start talking about 
tens of thousands of entries.


I'll leave it up to the jury to decide if we care or not. It seems like 
a fairly unusual use case, where you push around large enough arrays or 
objects to notice. Then again, I'm sure *someone* will do it. People do 
strange things, and they find ways to abuse the features that the 
original developers didn't think of.


- Heikki



jsonb-lengths.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Need guidance to startup

2014-09-16 Thread Tapan Halani
Hello everyone, I am new to the PostgreSQL project. I had prior experience with
sql+, with the Oracle 11g database server. Kindly help me grasp more about the
project or direct me in the right direction.

Thank you,
Tapan Halani


Re: [HACKERS] Support for N synchronous standby servers

2014-09-16 Thread Michael Paquier
On Tue, Sep 16, 2014 at 5:25 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Sep 15, 2014 at 3:00 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 At least a set of hooks has the merit to say: do what you like with
 your synchronous node policy.

 Sure.  I dunno if people will find that terribly user-friendly, so we
 might not want that to be the ONLY thing we offer.
Well, a user-friendly interface is actually the reason why a simple GUC
integer was used in the first series of patches present on this thread
to mark as synchronous the N nodes with the lowest priority. I could not
come up with anything simpler. Hence, what about doing the following:
- A patch refactoring code for pg_stat_get_wal_senders and
SyncRepReleaseWaiters as there is in either case duplicated code in
this area to select the synchronous node as the one connected with
lowest priority
- A patch defining the hooks necessary, I suspect that two of them are
necessary as mentioned upthread.
- A patch for a contrib module implementing an example of simple
policy. It can be a fancy thing with a custom language or even a more
simple thing.
Thoughts? Patch 1 refactoring the code is a win in all cases.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Collations and Replication; Next Steps

2014-09-16 Thread Matthew Kelly
Hello,

Last month, I brought up the following issue to the general mailing list about 
how running streaming replication between machines running different versions 
of glibc can cause corrupt indexes.
http://www.postgresql.org/message-id/ba6132ed-1f6b-4a0b-ac22-81278f5ab...@tripadvisor.com

In the month following, we have done further investigation here at TripAdvisor 
and have found that the scope of this issue is far more troubling than initially 
thought.  Hackers seems like the appropriate place to present this update because 
it will certainly motivate some discussion about the approach to collation 
support going forward.

After the initial episode, we thought it was necessary to find the true scope 
of the problem.  We developed a quick smoke test to evaluate the integrity of 
the indexes on a given machine.  We understood that the test was not 
exhaustive, but it would catch most instances of corrupt indexes given 
TripAdvisor's normal database usage pattern.  The source code with 
documentation about how it works is available at 
(https://github.com/mkellycs/postgres_index_integrity_check) for those 
interested.

What we found with this simple check was simply frightening.  In every single 
streaming replica cluster where one or more machines had been commissioned at a 
different time, that member was found to be corrupt.  When hardware upgrades of 
the master had been accomplished with a streaming replication, the new master 
was also found to have similar issues.  The following numbers are only as small 
as they are because our adoption of streaming replication has barely just 
begun.  So far we have found:

  *   8 internal production databases, and 2 live site database servers 
effected.
  *   Up to 3771 rows out of place in a single index (more correctly: 3771 
times a row was smaller than the row before it when sorted in ascending order, 
the actual number of incorrectly placed rows is probably much higher)
  *   On the worst offender, there were 12806 rows out of place across 26 
indexes
  *   On average roughly 15% of indexes containing text keys on tables larger 
than 100MB were found to exhibit this issue
  *   In at least one case, rebuilding a unique index on a master revealed that 
the database had allowed 100+ primary key violations.

It sounds like we as a community knew that these issues were theoretically 
possible, but I now have empirical evidence demonstrating the prevalence of 
this issue on our corpus of international data.  Instances of this issue showed 
up in indexes of member usernames, location/property names, and even Facebook 
URLs.  I encourage other sufficiently large operations that index 
internationalized text to run similar tests; it's highly likely they have 
similar latent issues that they just have not detected yet.

Here is the simple reality.  Collation-based indexes, streaming replication, 
and multiple versions of glibc/OS cannot coexist in a sufficiently large 
operation without causing corrupt indexes.  The current options are to collate 
all of your indexes in "C", or to ensure that all of your machines run exactly 
the same OS version.
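
For illustration, the first option looks roughly like this (table and column
names are invented):

CREATE INDEX users_username_c_idx ON users (username COLLATE "C");

-- ordered scans have to request the same collation to use that index
SELECT username FROM users ORDER BY username COLLATE "C";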

The first and immediate TODO is to patch the documentation to add warnings 
regarding this issue.  I can propose a doc patch explaining the issue, if no 
one has any objections.

The second and far more challenging problem is how do we fix this issue?  As of 
our last discussion, Peter Geoghegan revived the proposal of using ICU as an 
alternative.  
(http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=cyf1xyzjwn-1cvo6kzeawmkbxts3jphtj...@mail.gmail.com)
  I do not feel qualified to compare the value of this library to other 
options, but I am certainly willing to help with the patch process once a 
direction has been selected.

I will be at Postgres Open in Chicago this week, and I will be more than 
willing to further discuss the details of what we have found.

Regards,
Matt Kelly


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Claudio Freire
On Tue, Sep 16, 2014 at 3:12 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 I'll leave it up to the jury to decide if we care or not. It seems like a
 fairly unusual use case, where you push around large enough arrays or
 objects to notice. Then again, I'm sure *someone* will do it. People do
 strange things, and they find ways to abuse the features that the original
 developers didn't think of.

Again, it's not abusing the feature, it's using it. Jsonb is
supposed to be fast for this.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Robert Haas
On Tue, Sep 16, 2014 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote:
 Well, I can only judge from the use cases I personally have, none of
 which involve more than 100 keys at any level for most rows.  So far
 I've seen some people argue hypothetical use cases involving hundreds of
 keys per level, but nobody who *actually* has such a use case.

 I already told you that I did, and that it was the one and only app I
 had written for JSONB.

 Ah, ok, I thought yours was a test case.  Did you check how it performed
 on the two patches at all?  My tests with 185 keys didn't show any
 difference, including for a last key case.

No, I didn't test it.   But I think Heikki's test results pretty much
tell us everything there is to see here.  This isn't really that
complicated; I've read a few papers on index compression over the
years and they seem to often use techniques that have the same general
flavor as what Heikki did here, adding complexity in the data format
to gain other advantages.  So I don't think we should be put off.

Basically, I think that if we make a decision to use Tom's patch
rather than Heikki's patch, we're deciding that the initial decision,
by the folks who wrote the original jsonb code, to make array access
less than O(n) was misguided.  While that could be true, I'd prefer to
bet that those folks knew what they were doing.  The only reason
we're even considering changing it is that the array of lengths
doesn't compress well, and we've got an approach that fixes that
problem while preserving the advantages of fast lookup.  We should
have a darn fine reason to say no to that approach, and "it didn't
benefit my particular use case" is not it.

In practice, I'm not very surprised that the impact doesn't seem too
bad when you're running SQL queries from the client.  There's so much
other overhead, for de-TOASTing and client communication and even just
planner and executor costs, that this gets lost in the noise.  But
think about a PL/pgsql procedure, say, where somebody might loop over
all of the elements in an array.  If those operations go from O(1) to
O(n), then the loop goes from O(n) to O(n^2).  I will bet you a
beverage of your choice that somebody will find that behavior within a
year of release and be dismayed by it.
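
As a concrete, if synthetic, sketch of that pattern (nothing here beyond the
documented jsonb operators and functions):

DO $$
DECLARE
    arr   jsonb := (SELECT jsonb_agg(i) FROM generate_series(1, 5000) AS i);
    n     int   := jsonb_array_length(arr);
    total numeric := 0;
BEGIN
    FOR i IN 0 .. n - 1 LOOP
        -- each arr ->> i lookup is O(1) with offsets but O(i) with lengths
        -- only, so the whole loop degrades from O(n) to O(n^2)
        total := total + (arr ->> i)::numeric;
    END LOOP;
    RAISE NOTICE 'sum = %', total;
END;
$$;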

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Josh Berkus
Heikki, Robert:

On 09/16/2014 11:12 AM, Heikki Linnakangas wrote:
 Are you looking for someone with a real life scenario, or just synthetic
 test case? The latter is easy to do.
 
 See attached test program. It's basically the same I posted earlier.
 Here are the results from my laptop with Tom's jsonb-lengths-merged.patch:

Thanks for that!

 postgres=# select * from testtimes ;
  elem | duration_ms
 --+-
  3674 |0.530412
  4041 |0.552585
  4445 |0.581815

This looks like the level at which the difference gets to be really
noticeable.  Note that this is completely swamped by the difference
between compressed vs. uncompressed though.

 With unpatched git master, the runtime is flat, regardless of which
 element is queried, at about 0.29 s. With
 jsonb-with-offsets-and-lengths-2.patch, there's no difference that I
 could measure.

OK, thanks.

 The difference starts to be meaningful at around 500 entries. In
 practice, I doubt anyone's going to notice until you start talking about
 tens of thousands of entries.
 
 I'll leave it up to the jury to decide if we care or not. It seems like
 a fairly unusual use case, where you push around large enough arrays or
 objects to notice. Then again, I'm sure *someone* will do it. People do
 strange things, and they find ways to abuse the features that the
 original developers didn't think of.

Right, but the question is whether it's worth having more complex code
and data structure in order to support what certainly *seems* to be a
fairly obscure use-case, that is more than 4000 keys at the same level.
 And it's not like it stops working or becomes completely unresponsive
at that level; it's just double the response time.

On 09/16/2014 12:20 PM, Robert Haas wrote:
 Basically, I think that if we make a decision to use Tom's patch
 rather than Heikki's patch, we're deciding that the initial decision,
 by the folks who wrote the original jsonb code, to make array access
 less than O(n) was misguided.  While that could be true, I'd prefer to
 bet that those folks knew what they were doing.  The only way reason
 we're even considering changing it is that the array of lengths
 doesn't compress well, and we've got an approach that fixes that
 problem while preserving the advantages of fast lookup.  We should
 have a darn fine reason to say no to that approach, and it didn't
 benefit my particular use case is not it.

Do you feel that way *as a code maintainer*?  That is, if you ended up
maintaining the JSONB code, would you still feel that it's worth the
extra complexity?  Because that will be the main cost here.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Robert Haas
On Tue, Sep 16, 2014 at 3:24 PM, Josh Berkus j...@agliodbs.com wrote:
 Do you feel that way *as a code maintainer*?  That is, if you ended up
 maintaining the JSONB code, would you still feel that it's worth the
 extra complexity?  Because that will be the main cost here.

I feel that Heikki doesn't have a reputation for writing or committing
unmaintainable code.

I haven't reviewed the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Petr Jelinek

On 16/09/14 21:20, Robert Haas wrote:

In practice, I'm not very surprised that the impact doesn't seem too
bad when you're running SQL queries from the client.  There's so much
other overhead, for de-TOASTing and client communication and even just
planner and executor costs, that this gets lost in the noise.  But
think about a PL/pgsql procedure, say, where somebody might loop over
all of the elements in array.  If those operations go from O(1) to
O(n), then the loop goes from O(n) to O(n^2).  I will bet you a
beverage of your choice that somebody will find that behavior within a
year of release and be dismayed by it.



As somebody who did see server melt (quite literally that time 
unfortunately) thanks to the CPU overhead of operations on varlena 
arrays +1 (in fact +many).


Especially if we are trying to promote the json improvements in 9.4 as 
a "best of both worlds" kind of thing.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] New to PostGre SQL asking about write-ahead-log (WAL)

2014-09-16 Thread Mingzhe Li
Hi,

I am new to PostgreSQL and want to ask some questions. In my database
class, we plan to work on a class project based on WAL. Since we are not
familiar with WAL, we don't know what's a good start point. Could anyone
point me to any documentation mentioning about WAL in PostgreSQL? It will
be very helpful if anyone could point me to the file which includes the
core of WAL implementation. Any advice along this line is very welcome.

Thanks,
Mingzhe


[HACKERS] printing table in asciidoc with psql

2014-09-16 Thread Szymon Guz
Hi,
I've been working a little bit on a patch for printing tables in asciidoc
with psql.

It's not finished yet; I'm not sure if there is any sense in supporting
border types etc. The code is not cleaned up so far, but any remarks on the
style not playing well with the normal Postgres style of code are welcome.

The code just works, with extended and normal modes, with table columns
made of funny characters, and with alignment of data in table cells. I was
trying to implement it similarly to the html export function; however,
escaping of the strings was much easier, as the normal html-way
substitution is not easy to implement in asciidoc.

I'd like to ask you for any advice on this code.

thanks,
Szymon
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 2227db4..ae6b106 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2247,6 +2247,9 @@ _align2string(enum printFormat in)
 		case PRINT_TROFF_MS:
 			return "troff-ms";
 			break;
+		case PRINT_ASCIIDOC:
+			return "asciidoc";
+			break;
 	}
 	return "unknown";
 }
@@ -2316,9 +2319,11 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			popt->topt.format = PRINT_LATEX_LONGTABLE;
 		else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
 			popt->topt.format = PRINT_TROFF_MS;
+		else if (pg_strncasecmp("asciidoc", value, vallen) == 0)
+			popt->topt.format = PRINT_ASCIIDOC;
 		else
 		{
-			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms\n");
+			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms, asciidoc\n");
 			return false;
 		}
 
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 3b3c3b7..236c8f3 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -2475,6 +2475,180 @@ print_troff_ms_vertical(const printTableContent *cont, FILE *fout)
 	}
 }
 
+/*************************/
+/* ASCIIDOC              */
+/*************************/
+
+static void
+print_asciidoc_text(const printTableContent *cont, FILE *fout)
+{
+	bool		opt_tuples_only = cont->opt->tuples_only;
+	unsigned int i;
+	const char *const * ptr;
+
+	if (cancel_pressed)
+		return;
+
+	if (cont->opt->start_table)
+	{
+		/* print title */
+		if (!opt_tuples_only && cont->title)
+		{
+			fputs(".", fout);
+			fputs(cont->title, fout);
+			fputs("\n", fout);
+		}
+
+		/* print table [] header definition */
+		fprintf(fout, "[options=\"header\",cols=\"");
+		for (i = 0; i < cont->ncolumns; i++)
+		{
+			if (i != 0)
+				fputs(",", fout);
+			fprintf(fout, "%s", cont->aligns[(i) % cont->ncolumns] == 'r' ? "literal" : "literal");
+		}
+		fprintf(fout, "\"]\n");
+		fprintf(fout, "|\n");
+
+		/* print headers */
+		if (!opt_tuples_only)
+		{
+			for (ptr = cont->headers; *ptr; ptr++)
+			{
+				fputs("^| +++", fout);
+				fputs(*ptr, fout);
+				fputs("+++ ", fout);
+			}
+			fputs("\n", fout);
+		}
+	}
+
+	/* print cells */
+	for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
+	{
+		if (i % cont->ncolumns == 0)
+		{
+			if (cancel_pressed)
+				break;
+		}
+
+		fprintf(fout, "| ");
+
+		if ((*ptr)[strspn(*ptr, " \t")] == '\0')
+			fputs(" ", fout);
+		else
+			fputs(*ptr, fout);
+
+		fputs(" ", fout);
+
+		if ((i + 1) % cont->ncolumns == 0)
+			fputs("\n", fout);
+	}
+
+	fprintf(fout, "|\n");
+
+	if (cont->opt->stop_table)
+	{
+		printTableFooter *footers = footers_with_default(cont);
+
+		/* print footers */
+		if (!opt_tuples_only && footers != NULL && !cancel_pressed)
+		{
+			printTableFooter *f;
+
+			fputs("\n\n", fout);
+			for (f = footers; f; f = f->next)
+			{
+				fputs(f->data, fout);
+				fputs("\n", fout);
+			}
+			fputs("\n", fout);
+		}
+	}
+}
+
+// TODO add support for cont->opt->border
+// TODO add support for additional options
+
+static void
+print_asciidoc_vertical(const printTableContent *cont, FILE *fout)
+{
+	bool		opt_tuples_only = cont->opt->tuples_only;
+	unsigned short opt_border = cont->opt->border;
+	const char *opt_table_attr = cont->opt->tableAttr;
+	unsigned long record = cont->opt->prior_records + 1;
+	unsigned int i;
+	const char *const * ptr;
+
+	if (cancel_pressed)
+		return;
+
+	if (cont->opt->start_table)
+	{
+		/* print title */
+		if (!opt_tuples_only && cont->title)
+		{
+			fputs(".", fout);
+			fputs(cont->title, fout);
+			fputs("\n", fout);
+		}
+
+		/* print table [] header definition */
+		fprintf(fout, "[cols=\"h,literal\"]\n");
+		fprintf(fout, "|\n");
+	}
+
+	/* print records */
+	for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
+	{
+		if (i % cont->ncolumns == 0)
+		{
+			if (cancel_pressed)
+				break;
+			if (!opt_tuples_only)
+				fprintf(fout,
+						"2+^| Record %lu\n",
+						record++);
+			else
+				fputs("2| \n", fout);
+		}
+
+		fputs("|+++", fout);
+		fputs(cont->headers[i % cont->ncolumns], fout);
+		fputs("+++", fout);
+
+		fprintf(fout, " %s|", cont->aligns[i % cont->ncolumns] == 'r' ? "" : "");
+		/* is string only whitespace? */
+		if ((*ptr)[strspn(*ptr, " \t")] == '\0')
+			fputs(" ", fout);
+		else
+			fputs(*ptr, fout);
+
+		fputs("\n", 

Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 10:37 PM, Robert Haas wrote:

On Tue, Sep 16, 2014 at 3:24 PM, Josh Berkus j...@agliodbs.com wrote:

Do you feel that way *as a code maintainer*?  That is, if you ended up
maintaining the JSONB code, would you still feel that it's worth the
extra complexity?  Because that will be the main cost here.


I feel that Heikki doesn't have a reputation for writing or committing
unmaintainable code.

I haven't reviewed the patch.


The patch I posted was not pretty, but I'm sure it could be refined to 
something sensible.


There are many possible variations of the basic scheme of storing mostly 
lengths, but an offset for every N elements. I replaced the length with 
offset on some element and used a flag bit to indicate which it is. 
Perhaps a simpler approach would be to store lengths, but also store a 
separate smaller array of offsets, after the lengths array.


I can write a patch if we want to go that way.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New to PostGre SQL asking about write-ahead-log (WAL)

2014-09-16 Thread Heikki Linnakangas

On 09/16/2014 10:50 PM, Mingzhe Li wrote:

Hi,

I am new to PostgreSQL and want to ask some questions. In my database
class, we plan to work on a class project based on WAL. Since we are not
familiar with WAL, we don't know what's a good start point. Could anyone
point me to any documentation mentioning about WAL in PostgreSQL?


The user manual gives a high-level overview, see 
http://www.postgresql.org/docs/devel/static/wal.html. You can also find 
some presentation slides if you google.



It will be very helpful if anyone could point me to the file which
includes the core of WAL implementation.


src/backend/access/transam/xlog.c. The WAL record format is best 
described by the comments around XLogRecord, in src/include/access/xlog.h.
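
A few SQL-level functions are also handy while experimenting (9.4-era names):

-- current WAL write and insert positions
SELECT pg_current_xlog_location(), pg_current_xlog_insert_location();

-- name of the WAL segment file containing the current position
SELECT pg_xlogfile_name(pg_current_xlog_location());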


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Arthur Silva
On Tue, Sep 16, 2014 at 4:20 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Sep 16, 2014 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote:
  Well, I can only judge from the use cases I personally have, none of
  which involve more than 100 keys at any level for most rows.  So far
  I've seen some people argue hypothetical use cases involving hundreds of
  keys per level, but nobody who *actually* has such a use case.
 
  I already told you that I did, and that it was the one and only app I
  had written for JSONB.
 
  Ah, ok, I thought yours was a test case.  Did you check how it performed
  on the two patches at all?  My tests with 185 keys didn't show any
  difference, including for a last key case.

 No, I didn't test it.   But I think Heikki's test results pretty much
 tell us everything there is to see here.  This isn't really that
 complicated; I've read a few papers on index compression over the
 years and they seem to often use techniques that have the same general
 flavor as what Heikki did here, adding complexity in the data format
 to gain other advantages.  So I don't think we should be put off.


I second this reasoning. Even though I ran a couple of very realistic test
cases that support all-lengths, I do feel that the hybrid approach would be
better as it covers all bases. To put things in perspective, Tom's latest
patch isn't much simpler either.

Since it would still be a breaking change we should consider changing the
layout to key-key-key-value-value-value as it seems to pay off.


 Basically, I think that if we make a decision to use Tom's patch
 rather than Heikki's patch, we're deciding that the initial decision,
 by the folks who wrote the original jsonb code, to make array access
 less than O(n) was misguided.  While that could be true, I'd prefer to
 bet that those folks knew what they were doing.  The only reason
 we're even considering changing it is that the array of lengths
 doesn't compress well, and we've got an approach that fixes that
 problem while preserving the advantages of fast lookup.  We should
 have a darn fine reason to say no to that approach, and "it didn't
 benefit my particular use case" is not it.

 In practice, I'm not very surprised that the impact doesn't seem too
 bad when you're running SQL queries from the client.  There's so much
 other overhead, for de-TOASTing and client communication and even just
 planner and executor costs, that this gets lost in the noise.  But
 think about a PL/pgsql procedure, say, where somebody might loop over
 all of the elements in array.  If those operations go from O(1) to
 O(n), then the loop goes from O(n) to O(n^2).  I will bet you a
 beverage of your choice that somebody will find that behavior within a
 year of release and be dismayed by it.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-16 Thread Robert Haas
On Mon, Sep 15, 2014 at 7:21 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Sep 15, 2014 at 11:25 AM, Peter Geoghegan p...@heroku.com wrote:
 OK, I'll draft a patch for that today, including similar alterations
 to varstr_cmp() for the benefit of Windows and so on.

 I attach a much simpler patch, that only adds an opportunistic
 memcmp() == 0 before a possible strcoll().  Both
 bttextfastcmp_locale() and varstr_cmp() have the optimization added,
 since there is no point in leaving anyone out for this part.

Even though our testing seems to indicate that the memcmp() is
basically free, I think it would be good to make the effort to avoid
doing memcmp() and then strcoll() and then strncmp().  Seems like it
shouldn't be too hard.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-16 Thread Peter Geoghegan
On Tue, Sep 16, 2014 at 1:45 PM, Robert Haas robertmh...@gmail.com wrote:
 Even though our testing seems to indicate that the memcmp() is
 basically free, I think it would be good to make the effort to avoid
 doing memcmp() and then strcoll() and then strncmp().  Seems like it
 shouldn't be too hard.

Really? The tie-breaker for the benefit of locales like hu_HU uses
strcmp(), not memcmp(). It operates on the now-terminated copies of
strings. There is no reason to think that the strings must be the same
size for that strcmp(). I'd rather only do the new opportunistic
memcmp() == 0 thing when len1 == len2. And I wouldn't like to have
to also figure out that it's safe to use the earlier result, because
as it happens len1 == len2, or any other such trickery.

The bug fix that added the strcmp() tie-breaker was committed in 2005.
PostgreSQL had locale support for something like 8 years prior, and it
took that long for us to notice the problem. I would suggest that
makes the case for doing anything else pretty marginal. In the bug
report at the time, len1 != len2 anyway.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sequence Access Method WIP

2014-09-16 Thread Petr Jelinek

On 16/09/14 14:17, Andres Freund wrote:

On 2014-09-15 01:38:52 +0200, Petr Jelinek wrote:


There is also more needed than this, you need:
- int64 value - first value allocated (value to be returned)
- int64 nvalues - number of values allocated
- int64 last - last cached value (used for cached/last_value)
- int64 next - last logged value (used for wal logging)
- am_private - updated private data, must be possible to return as null



I personally don't like that we need all the nvalues, next and last as
it makes the seqam a little bit too aware of the sequence logging internals
in my opinion but I haven't found a way around it - it's impossible for
backend to know how the AM will act around incby/maxv/minv/cycling so it
can't really calculate these values by itself, unless of course we fix the
behavior and require seqams to behave predictably, but that somewhat breaks
the whole idea of leaving the allocation to the seqam. Obviously it would
also work to return list of allocated values and then backend could
calculate the value, nvalues, last, next from that list by itself,
but I am worried about performance of that approach.


Yea, it's far from pretty.

I'm not convinced that the AM ever needs to/should care about
caching. To me that's more like a generic behaviour. So it probably
should be abstracted away from the individual AMs.

I think the allocation routine might also need to be able to indicate
whether WAL logging is needed or not.


Well that means we probably want to return the first allocated value, the last 
allocated value and then some boolean that tells the backend whether to WAL-log 
the sequence or not (the number of values allocated does not really seem to 
be important unless I am missing something).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-16 Thread Peter Eisentraut
On 9/16/14 12:06 PM, Matthew Kelly wrote:
 The second and far more challenging problem is how do we fix this issue?
  As of our last discussion, Peter Geoghegan revived the proposal of
 using ICU as an alternative.
  
 (http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=cyf1xyzjwn-1cvo6kzeawmkbxts3jphtj...@mail.gmail.com)
  I do not feel qualified to compare the value of this library to other
 options, but I am certainly willing to help with the patch process once
 a direction has been selected.

It seems to me that this is a more general problem that can affect any
data type that relies on anything external.  For example, you could
probably create a case where indexes are corrupted if you have two
different time zone databases.  Or what if you use PostGIS and one of
the libraries it uses has different rounding behaviors in different
versions?

Even in the absence of such external dependencies, there will still be
problems like this if you don't upgrade all nodes participating in
replication at the same time.

Clearly, this is worth documenting, but I don't think we can completely
prevent the problem.  There has been talk of a built-in index integrity
checking tool.  That would be quite useful.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL 2

2014-09-16 Thread Álvaro Hernández Tortosa


On 04/09/14 18:02, Craig Ringer wrote:

On 09/04/2014 06:48 AM, Joshua D. Drake wrote:

On 09/03/2014 11:48 AM, Robert Haas wrote:


Anyway, to get back around to the topic of PL/SQL compatibility
specifically, if you care about that issue, pick one thing that exists
in PL/SQL but not in PL/pgsql and try to do something about it.  Maybe
it'll be something that EnterpiseDB has already done something about,
in which case, if your patch gets committed, Advanced Server will lose
a bit of distinction as compared with PostgreSQL.  Or maybe it'll be
something that EnterpriseDB hasn't done anything about, and then
everybody comes out strictly ahead.  What I think you shouldn't do
(although you're free to ignore me) is continue thinking of Oracle
compatibility as one monolithic thing, because it isn't, or to pursue
of a course of trying to get the PostgreSQL community to slavishly
follow Oracle, because I think you'll fail, and even if you succeed I
don't think the results will actually be positive for PostgreSQL.

Well put Robert.

Indeed, especially with reference to the size and scope of Oracle. Its
XML library alone is huge.

At best it's reasonable to hope for compatibility with a limited subset
of PL/SQL - and really, we're a good way there already, with most of
what's missing being down to missing core server features or things
PostgreSQL just does differently.

True Oracle compatibility (for procedures) pretty much requires an
embedded JVM with a rich class library. Since PL/Java seems to be dying
a slow death by neglect and disinterest I don't think it's likely anyone
would be tackling compatibility with the embedded JVM features anytime soon.

There are a few things I would like to see, like secure session
variables in PL/PgSQL. Mostly, though, I think talk of Oracle
compatibility seems to be something that comes up before the speaker
has really understood what that would mean, and the sheer scope of the
endeavour.

It's not going from 50% compatible to 80% compatible, it's going from 5%
compatible to 7% compatible. The most used 5% maybe, but still...



Getting that 5% of what is most used would be a great gain. Maybe 
the speaker is misled about the size of the endeavour, but quite sure 
about what the market needs ;)


Cheers,

Álvaro




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL 2

2014-09-16 Thread Álvaro Hernández Tortosa


On 03/09/14 20:48, Robert Haas wrote:

On Tue, Sep 2, 2014 at 5:47 PM, Álvaro Hernández Tortosa a...@nosys.es wrote:

 Yeah, we differ there. I think having an Oracle compatibility layer in
PostgreSQL would be the-next-big-thing we could have. Oracle is has orders
of magnitude bigger user base than postgres has; and having the ability to
attract them would bring us many many more users which, in turn, would
benefit us all very significantly.

 It would be my #1 priority to do in postgres (but yes, I know -guess-
how hard and what resources that would require). But dreaming is free :)

There are a number of reasons why this isn't really practical.

First, Oracle compatibility isn't one feature.  The compatibility
layer that exists in EnterpriseDB's Advanced Server product consists
of many different changes to many different parts of the system.  A
few of those changes are simple syntax compatibility, where we do the
exact same thing PostgreSQL does but with different syntax, but a lot
of them are functional enhancements.  Even within SPL, there's a whole
bunch of different changes to a whole bunch of different areas, and
most of those are functional enhancements rather than just tweaking
syntax.  So, if you tried to implement a new, Oracle-compatible PL,
you'd find that you don't have one or a small number of changes to
make, but a long series of features ranging from small to very large.
You'd also find that adding a new PL, without changing any other parts
of the server, only bridges a small part of the compatibility gap.


Hi Robert, thanks for the insights here. Understood that the compatibility 
layer is not a single thing, and that it's surely a very long and 
involved task to build such compatibility parts.


However, I don't see anything bad in having one or some parts of 
it. For example, having a pl that is similar -maybe only in syntax- is a 
good thing.  Sure, there are surely lots of things that can't be done 
simply, tons of functions not available and so on, but that alone would 
mean Oracle users would feel more comfortable and would find their 
current code easier to port. That would already be a lot.




Second, if you did manage to develop something which was significantly
more compatible with Oracle than PostgreSQL or PL/pgsql is today,
you'd probably find that the community wouldn't accept it.  It's
almost misleading to think of Oracle as a database; it's an enormous
software ecosystem with facilities for doing just about everything
under the sun, and many of those things more than one way.  For
example, in 9.4, EnterpriseDB will be releasing a UTL_HTTP package
that contains many of the same interfaces that are present in Oracle.
The interface decisions made by Oracle Corporation are reasonable in
view of their architecture, but I am quite sure that this community
would not want, for example, to return long text values as SETOF
VARCHAR(2000) rather than TEXT, just because Oracle does that.  And
rightly so: I wouldn't want PostgreSQL to follow any other product
that slavishly whether I worked at EnterpriseDB or not.  This kind of
thing crops up over and over again, and it only works to say that
PostgreSQL should choose the Oracle behavior in every case if you
believe that the primary mission of PostgreSQL should be to copy
Oracle, and I don't.  I also don't think it's a bad thing that
Advanced Server makes different decisions than PostgreSQL in some
cases.  A further problem is that, in this particular case, you'd
probably hear the argument from PostgreSQL hackers that they really
don't want to be burdened with maintaining an HTTP client in the core
server when the same thing could be done from an extension, and that's
a valid argument, too.  It is also valid for EnterpriseDB to make a
different decision for itself, based on business priorities.


I wouldn't follow those routes just to achieve perfect 
compatibility. I agree, and I'd never push for those. In the light of 
all these things, I'd never expect perfect, scientific compatibility, 
but rather a best-effort, well-documented approach.




Now, none of that is to say that we wouldn't do well to give a little
more thought to Oracle compatibility than we do.  We've either made or
narrowly avoided a number of decisions over the years which introduced
- or threatened to introduce - minor, pointless incompatibilities with
other database products, Oracle included.  That really doesn't benefit
anyone.  To take another example, I've been complaining about the fact
that PostgreSQL 8.3+ requires far more typecasts in stored procedures
than any other database I'm aware of for years, probably since before
I joined EnterpriseDB.  And I still think we're kidding ourselves to
think that we've got that right when nobody else is doing something
similar.  I don't think the community should reverse that decision to
benefit EnterpriseDB, or to be compatible with Oracle: I think the
community should reverse that decision because it's stupid, and the

Re: [HACKERS] Collations and Replication; Next Steps

2014-09-16 Thread Peter Geoghegan
On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 Clearly, this is worth documenting, but I don't think we can completely
 prevent the problem.  There has been talk of a built-in index integrity
 checking tool.  That would be quite useful.

We could at least use the GNU facility for versioning collations where
available, LC_IDENTIFICATION [1]. By not versioning collations, we are
going against the express advice of the Unicode consortium (they also
advise to do a strcmp() tie-breaker, something that I think we
independently discovered in 2005, because of a bug report - this is
what I like to call the Hungarian issue. They know what our
constraints are.). I recognize it's a tricky problem, because of our
historic dependence on OS collations, but I think we should definitely
do something. That said, I'm not volunteering for the task, because I
don't have time. While I'm not sure of what the long term solution
should be, it *is not* okay that we don't version collations. I think
that even the best possible B-Tree check tool is a not a solution.

[1] 
http://www.postgresql.org/message-id/CAEYLb_UTMgM2V_pP7qnuKZYmTYXoym-zNYVbwoU79=tup8h...@mail.gmail.com
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-16 Thread Peter Geoghegan
On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 It seems to me that this is a more general problem that can affect any
 data type that relies on anything external.  For example, you could
 probably create a case where indexes are corrupted if you have two
 different time zone databases.  Or what if you use PostGIS and one of
 the libraries it uses has different rounding behaviors in different
 versions?

The timezone case you highlight here seems quite distinct from what
Matthew is talking about, because in point of fact the on-disk
representation is merely *interpreted* with reference to the timezone
database. So, you could have an inconsistency between standbys
concerning what the time was in a particular timezone at a particular
timestamp value as reported by the timestamptz output function, but
both standbys would be correct on their own terms, which isn't too
bad. You still cannot have a situation where on a single standby, a
value isn't returned by an index scan that patently exists in the
table on the same standby (i.e. index corruption); the timezone isn't
actually stored (just an offset from a special Postgres epoch). As for
the PostGIS example, I think that they'd know better than to change
the behavior of an established opclass B-Tree support function 1. If
people that author opclasses don't read the documentation on how to do
so correctly, what chance do regular DBAs have? Should they make sure
that operator classes are authored correctly in each and every
instance? Surely not.

Even if I was wrong about all of this, we should treat text as a
special case, a case worth making every effort for.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Любен Каравелов

- Quote from Robert Haas (robertmh...@gmail.com), on 16.09.2014 at 22:20 -
 
 In practice, I'm not very surprised that the impact doesn't seem too
 bad when you're running SQL queries from the client.  There's so much
 other overhead, for de-TOASTing and client communication and even just
 planner and executor costs, that this gets lost in the noise.  But
 think about a PL/pgsql procedure, say, where somebody might loop over
 all of the elements in array.  If those operations go from O(1) to
 O(n), then the loop goes from O(n) to O(n^2).  I will bet you a
 beverage of your choice that somebody will find that behavior within a
 year of release and be dismayed by it.
 


Hi,

I can imagine a situation exactly like that. We could use a jsonb object to
represent sparse vectors in the database where the key is the dimension
and the value is the value. So they could easily grow to thousands of
dimensions. Once you have that in the database it is easy to go and
write some simple numeric computations on these vectors, let's say you
want a dot product of 2 sparse vectors. If the random access inside one
vector is going to O(n^2) then the dot product computation will be going
to O(n^2*m^2), so not pretty.
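
For what it's worth, a sketch of such a dot product at the SQL level (the
function name is invented; note the one jsonb lookup per key of the first
vector, which is exactly the access being discussed):

CREATE FUNCTION sparse_dot(a jsonb, b jsonb) RETURNS numeric
LANGUAGE sql AS $$
    SELECT coalesce(sum((ea.value)::numeric * (b ->> ea.key)::numeric), 0)
    FROM jsonb_each_text(a) AS ea
    WHERE b ? ea.key;
$$;

SELECT sparse_dot('{"1": 0.5, "7": 2}', '{"7": 3, "9": 1}');  -- 6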

I am not saying that the DB is the right place to do this type of
computation, but it is sometimes convenient to have it also in the DB.

Regards,
luben



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need guidance to startup

2014-09-16 Thread Craig Ringer
On 09/17/2014 02:16 AM, Tapan Halani wrote:
 Hello everyone, I am new to the PostgreSQL project. I had prior experience
 with sql+, with the Oracle 11g database server. Kindly help me grasp more
 about the project or direct me in the right direction.

I've replied off-list to direct the poster to some resources and more
appropriate mailing lists.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New to PostGre SQL asking about write-ahead-log (WAL)

2014-09-16 Thread Craig Ringer
On 09/17/2014 03:50 AM, Mingzhe Li wrote:
 Hi,
 
 I am new to PostgreSQL and want to ask some questions. In my database
 class, we plan to work on a class project based on WAL. Since we are not
 familiar with WAL, we don't know what's a good start point. Could anyone
 point me to any documentation mentioning about WAL in PostgreSQL? It
 will be very helpful if anyone could point me to the file which includes
 the core of WAL implementation. Any advice along this line is very welcome. 

This appears to be a re-post of
http://stackoverflow.com/q/25876287/398670 to -hackers.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-16 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 09/16/2014 10:37 PM, Robert Haas wrote:
 On Tue, Sep 16, 2014 at 3:24 PM, Josh Berkus j...@agliodbs.com wrote:
 Do you feel that way *as a code maintainer*?  That is, if you ended up
 maintaining the JSONB code, would you still feel that it's worth the
 extra complexity?  Because that will be the main cost here.

 I feel that Heikki doesn't have a reputation for writing or committing
 unmaintainable code.
 I haven't reviewed the patch.

 The patch I posted was not pretty, but I'm sure it could be refined to 
 something sensible.

We're somewhat comparing apples and oranges here, in that I pushed my
approach to something that I think is of committable quality (and which,
not incidentally, fixes some existing bugs that we'd need to fix in any
case); while Heikki's patch was just proof-of-concept.  It would be worth
pushing Heikki's patch to committable quality so that we had a more
complete understanding of just what the complexity difference really is.

 There are many possible variations of the basic scheme of storing mostly 
 lengths, but an offset for every N elements. I replaced the length with 
 offset on some element and used a flag bit to indicate which it is. 

Aside from the complexity issue, a demerit of Heikki's solution is that it
eats up a flag bit that we may well wish we had back later.  On the other
hand, there's definitely something to be said for not breaking
pg_upgrade-ability of 9.4beta databases.

 Perhaps a simpler approach would be to store lengths, but also store a 
 separate smaller array of offsets, after the lengths array.

That way would also give up on-disk compatibility, and I'm not sure it's
any simpler in practice than your existing solution.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers