from:"Joel Jacobson"

Re: [HACKERS] \set AUTOROLLBACK ON

2017-06-28 Thread Joel Jacobson

On Mon, Jun 26, 2017 at 9:35 PM, David G. Johnston
 wrote:
> We already have SET TRANSACTION READ ONLY.

But in my use-case I am OK with the query doing write operations,
since sometimes you need to test something in prod (that cannot be
tested easily locally) but you want to ROLLBACK the query as quickly
as possible to avoid locking things longer than necessary. Currently
I'm just manually appending "; rollback;" to the query to make sure
its rollbacked.

The best thing I can do today is to map some button on the keyboard to
automatically type "; rollback; [enter]" instead of hitting [enter] to
fire-off the query, but it would be nice if it was built-in psql so
you could never commit something by mistake unless you explicitly exit
the AUTOROLLBACK mode.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optional message to user when terminating/cancelling backend

2017-06-26 Thread Joel Jacobson

+1

On Tue, Jun 20, 2017 at 8:54 PM, Alvaro Herrera
 wrote:
> Unless you have a lot of users running psql manually, I don't see how
> this is actually very useful or actionable.  What would the user do with
> the information?  Hopefully your users already trust that you'd keep the
> downtime to the minimum possible.

I think this feature would be useful for PgTerminator
(https://github.com/trustly/pgterminator)
a tool which automatically kills unprotected processes that could
potentially be the reason why
>X number of protected important processes have been waiting for >Y seconds.

When I'm guilty of locking this in the production DB and get killed by
PgTerminator,
it would be nice to know the reason, e.g. that it was PgTerminator
that killed me
and what process I was blocking.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] \set AUTOROLLBACK ON

2017-06-26 Thread Joel Jacobson

Hi hackers,

A colleague of mine wondered if there is a way to always run
everything you type into psql in a db txn and automatically rollback
it as soon as it finish.
I couldn't think of any way to do so, but thought it would be a nice
feature and probably quite easy to add to psql, so I thought I should
suggest it here.

The typical use-case is you are doing something in production that you
just want to
a) test if some query works like expected and then rollback
or,
b) read-only queries that should not commit any changes anyway, so
here the rollback would just be an extra layer of security, since your
SELECT might call volatile functions that are actually not read-only

Thoughts?

/Joel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] case_preservation_and_insensitivity = on

2017-02-24 Thread Joel Jacobson

On Thu, Feb 23, 2017 at 8:04 AM, Robert Haas  wrote:
>
> It doesn't sound like a good solution to me, because there can be SQL
> code inside stored procedures that clients never see.

In our code base, we use CamelCase in all PL/pgSQL functions, both for
columns and variables,
e.g. SELECT UserID INTO _UserID FROM Users WHERE Username = 'foo';

Here, it's not a problem that the column name is e.g. "userid",
since the case-insensitive feature makes it work.

What type of case problem do you foresee for stored procedures?

I've only experienced the case-folding to be a problem outside of SPs,
since the casing *is* preserved in the PL/pgSQL source code
(since it's stored as-is, without any modifications).

What *would* be a problem though, is if in a future PL/pgSQL 3,
a PL/pgSQL query like,
SELECT UserID FROM Users WHERE Username = 'foo';
would automatically export the column "UserID" to the current scope as
a PL/pgSQL 3 variable named "userid",
since then you would actually want the value of the userid column to
be exported to a variable named "UserID".

Such a feature would be nice, since a very common code-pattern in
PL/pgSQL is to just have lots of meaningless identical lists of
columns and then an identical list of variables with the same names as
the columns.
When the list is short, it's not a problem, but when selecting lots of
columns, it gets ugly.

What I usually end up with is to align the columns and variables on
two rows, e.g.:

SELECT  SomeCol,  OtherCol,   FooCol,   BarCol,   MyCol,   ExtraCol,   LastCol
INTO _SomeCol, _OtherCol, _FooCol, _BarCol, _MyCol, _ExtraCol, _LastCol
FROM Foo
WHERE Bar = 'Baz';

This is to avoid typos that are then visually easy to spot, thanks to
all chars being aligned.

Imagine if, thanks to case-preservation, if you should simply do:
SELECT  SomeCol,  OtherCol,   FooCol,   BarCol,   MyCol,   ExtraCol,   LastCol
FROM Foo
WHERE Bar = 'Baz';

And all the columns would be exported to the variables SomeCol,
OtherCol,   FooCol,   BarCol,   MyCol,   ExtraCol,   LastCol,
instead of somecol, othercol, foocol, barcol, mycol, extracol, lastcol;

This would be a huge win in avoiding unnecessary code repetition.

Then of course, if you want a column Foo to instead be exported to
Bar, then you simply do "SELECT Foo AS Bar".

Thoughts?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GRANT EXECUTE ON FUNCTION foo() TO bar();

2017-02-22 Thread Joel Jacobson

On Wed, Feb 22, 2017 at 2:18 PM, Tom Lane  wrote:
> I think this is really *not* a good idea.  The entire permissions model
> is built around granting permissions to roles, by other roles.

My bad. I shouldn't have proposed the idea on how to achieve/implement the idea.

I should instead just have presented the idea without suggesting to
use the permissions model.

Do you think it's a bad idea in general? Or is it just the idea of
using the permissions model for the purpose that is a bad idea?

If it's a good idea apart from that, then maybe we can figure out some other
more feasible way to control what functions can call what other functions?

> It's not that hard, if you have needs like this, to make an owning role
> for each such function.  You might end up with a lot of single-purpose
> roles, but they could be grouped under one or a few group roles for most
> purposes beyond the individual tailored grants.

I think that approach is not very user-friendly, but maybe it can be
made more convenient if adding syntactic sugar to allow doing it all
in a single command?

Or maybe there is some other way to implement it without the permissions model.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GRANT EXECUTE ON FUNCTION foo() TO bar();

2017-02-22 Thread Joel Jacobson

On Wed, Feb 22, 2017 at 9:07 AM, Pavel Stehule  wrote:
> Usage of X functions can be locked in schema.

I think that's also a good idea. Both are useful I think. They solve
two different use-cases.

If there are multiple callers of a private function within a schema,
it would be useful if you could just declare it PRIVATE,
to prevent any top-level usage of the function,
and to prevent any other function than functions in the same schema
from calling it.

This would be similar to how e.g. a private method in a Java class can
be called by any other method in the same class.
I think that's also a useful idea, but a different use-case.

This would be useful if you have lots of callers of a function,
and it would of course be tedious if you had to explicitly GRANT EXECUTE
for each function that you want should be allowed to call the function.
In that case, a PRIVATE declaration of the function would be better.

But if it is important a function is absolutely not called by any
other function than a a single very specific function,
then it would be better with a specific GRANT EXECUTE ON FUNCTION
foo() to bar() in the function definition file for bar(),
which would make it clear to a developer looking at the bar() source
code that the function is only supposed to be called by foo().

But like I said, I also like your PRIVATE idea. I think most functions
in my schemas would actually be PRIVATE, and only a few would be
PUBLIC, since you usually have more internal functions in a schema,
that are not supposed to be called outside of the schema and doesn't
even make sense outside of the schema.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] GRANT EXECUTE ON FUNCTION foo() TO bar();

2017-02-21 Thread Joel Jacobson

Hi Hackers,

Currently, it's only possible to grant/revoke execute on functions to roles.

I think it would be useful in many situations, both for documentation purposes,
but also for increased security, to in a precise way control what
other function(s)
are allowed to execute a specific function.

This would be useful for functions that are not supposed to be used
manually by any human or any other function(s) than the few places
where the function makes sense to use.

Thoughts?

/Joel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] case_preservation_and_insensitivity = on

2017-02-20 Thread Joel Jacobson

On Mon, Feb 20, 2017 at 1:45 AM, Tom Lane  wrote:
> The versions of autocommit that have actually stood the test of time were
> implemented on the client side (in psql and JDBC, and I think ODBC as
> well), where the scope of affected code was lots smaller.  I wonder
> whether there's any hope of providing something useful for case-folding
> in that way.  psql's lexer is already smart enough that you could teach it
> rules like "smash any unquoted identifier to lower case" (probably it
> would fold keywords too, but that seems OK).  That's probably not much
> help for custom applications, which aren't likely to be going through
> psql scripts; but the fact that such behavior is in reach at all on the
> client side seems encouraging.

This sounds like a really good solution to me,
since there is actually nothing missing on the PostgreSQL server-side,
it's merely a matter of inconvenience on the client-side.

As long as the definitions of the database objects when stored
in the git repo can be written without the double-quotes,
i.e. CREATE TABLE Users (
instead of
CREATE TABLE "Users" (

where the object would be created as "Users" with capital "U",
then I see no problem.

Most people probably use psql to initiate a db instance of their
project locally,
so if psql would have a --preserve-case option, that would solve the
problem of creating new objects.
Or maybe --no-case-folding is a better name for the option.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] case_preservation_and_insensitivity = on

2017-02-20 Thread Joel Jacobson

On Mon, Feb 20, 2017 at 2:40 AM, Jim Nasby  wrote:
> Even if the project decided that "Users" and users is stupid and that we
> should deprecate it, I think the odds of also deciding to tell existing
> users to re-write their apps are zero.

But if the feature can't be turned on without also enforcing lowercase
uniqueness,
then the described problem situation will never happen.
Any existing projects who want to use the new feature but can't due to
conflicting names,
will simply just have to live without it, just like they already do.

> So no matter how this is designed, there has to be some way for existing
> users to be able to continue relying on "Users" and users being different.

There is a way, simply don't switch on the feature,
and "Users" and "users" will continue to be different.

> What would work is an initdb option that controls this: when ignoring case
> for uniqueness is disabled, your new column would simply be left as NULL.
> With some extra effort you could probably allow changing that on a running
> database as well, just not with something as easy to change as a GUC.

initdb option sounds good to me, just like you specify e.g.  --encoding.

Also, I think the --lowercase-uniqueness feature would be useful by
itself even without the --case-preserving feature,
since that might be a good way to enforce a good design of new databases,
as a mix of "users" and "Users" is probably considered ugly by many
system designers.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] case_preservation_and_insensitivity = on

2017-02-19 Thread Joel Jacobson

On Sun, Feb 19, 2017 at 5:58 PM, Robert Haas  wrote:
>> When case preservation by default is on, then simply enforce
>> UNIQUE(LOWER(object_name)), to prevent ambiguity.
>
> That (1) breaks backward compatibility, because people might have
> objects with names identical except for case in existing databases and

Yes, but if since the target for this new feature is probably new projects
(who need this feature to even consider PostgreSQL as a database,
like the original author of the other thread where they were using SQLAnywhere),
then maybe that's an acceptable tradeoff, since all the existing
database who try
to use the feature will just get an error if they try to switch on the feature
e.g. "error: objects exist with identical lowercase names".

> (2) requires an expression index on a system catalog, which is not
> supported.  You could work around problem #2 with enough work, I
> guess, by storing two copies of the name of "name" column, one with
> the original casing and a second that has been downcased for indexing
> purposes.

Yes, storing both the unique lowercase name together with the OriginalCase
was also my idea on how to implement it.

> I don't really understand what the GUC does in this scenario.

Changing the GUC (if you run into problems with tools) would simply
just change what names are returned from pg for all the object names,
i.e. it would then return the lowercase names instead of the OriginalCase
names, to make the tools happy.
This means any pg user who enabled the CasePreserving feature when
creating the database, would not have to redesign their entire database
if they later run into problems with tools, but can simply just switch off
the GUC.

> But once you've already
> decided to have a hard-and-fast rule that the names must be unique
> after lower-casing, there's no obvious benefit to rejecting queries
> that mention the same name with different case.

Exactly, that trade-off is necessary, otherwise such queries would be ambiguous.

I think a good general philosophy for the PostgreSQL project would be to
try to look at how to meed the needs for new users of new projects
in a way that don't impair things for existing users,
by accepting the new users might have to live with some trade-offs
for their new feature to be possible to implement,
such as in this case that the trade-off is to not be able to create
objects of different casing with the same lowercase names,
a tradeoff that I personally think would not be a problem for most projects,
since it seems unlikely you would both have a "users" table and a
"Users" table in the same database.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] case_preservation_and_insensitivity = on

2017-02-16 Thread Joel Jacobson

On Thu, Feb 16, 2017 at 6:53 AM, Tom Lane  wrote:
> Have you read any of our innumerable previous discussions about this?

No, sorry, didn't see them, thanks for sharing the links.

> The short answer is that nobody can see a way to modify the identifier
> case-folding rules that isn't going to add more pain than it subtracts.
> And much of the added pain will be felt by people who aren't getting
> any benefit, who will therefore be vociferously against the whole thing.

I've read the discussion and have an idea:

When case preservation by default is on, then simply enforce
UNIQUE(LOWER(object_name)), to prevent ambiguity.

If all objects lowercase names are unique, but the casing is
preserved, then a user who later on suffers from problems with
external tools that work poorly with non-lowercase object names, could
then simply switch back to lowercase object names by changing the GUC.

OTOH, if not enforcing lowercase uniqueness, there would be a risk two
objects with different casing would have conflicting lowercase names,
and then the user who later runs into problems with some external tool
would have a serious problem, since switching to lowercase would not
be an option.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] case_preservation_and_insensitivity = on

2017-02-15 Thread Joel Jacobson

Hi hackers,

"A system that is not case-preserving is necessarily case-insensitive,
but it is possible and common for a system to be case-insensitive, yet
case-preserving" [1]

Imagine if you could turn on a GUC that would turn PostgreSQL into
such a system,
where the case would be preserved by default for all created objects,
without having to use double-quotes around all identifiers,
and while still being able to refer to such created objects case-insensitively,
without having to use double-quotes around all identifiers.

Today, you have to sacrifice the nice case-insensitivity feature if
you wish to preserve case information, which is a shame.

This would make a huge difference in terms of usability when using
PostgreSQL together with external systems where casing is important.

Today, you have to maintain silly look-up tables to translate between
PostgreSQL's internal lowercase objects,
and the outside world's e.g. CamelCase names for the corresponding objects,
or you have to sacrifice the nice case-insensitivity feature.

Case Preservation + Case Insensitivity = A good combination

Thoughts?

[1] https://en.wikipedia.org/wiki/Case_preservation


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea on how to simplify comparing two sets

2017-02-07 Thread Joel Jacobson

On Tue, Feb 7, 2017 at 6:28 PM, David Fetter  wrote:
> This could be shortened further to the following if we ever implement
> DISTINCT for window functions, which might involve implementing
> DISTINCT via hashing more generally, which means hashable
> types...whee!
>
> SELECT array_agg(DISTINCT (a IS NOT NULL)::int + (b IS NOT NULL)::int) OVER 
> () = '{2}'
> FROM a FULL JOIN b ON ...

That's still a lot more syntax than just adding "IS NOT DISTINCT FROM"
in between the sets.

I just thought the general approach at looking for ways to express new
things in SQL,
without introducing new keywords, but instead rely on existing keywords but
that currently are syntax errors when used in some semantic way,
is an interesting approach to allow extending the SQL syntax without
breaking backwards compatibility.

Are there any historical examples of when this approach has been used
to make progress in PostgreSQL?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea on how to simplify comparing two sets

2017-02-07 Thread Joel Jacobson

On Tue, Feb 7, 2017 at 4:58 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Joel Jacobson <j...@trustly.com> writes:
>> Currently there is no simple way to check if two sets are equal.
>
> Uh ... maybe check whether SELECT set1 EXCEPT SELECT set2
> and SELECT set2 EXCEPT SELECT set1 are both empty?

Yes, that's one way, but it's ugly as you have to repeat yourself and
write both sets two times.
Not an issue for small queries, but if you have two big queries stored
in a .sql file,
you would have to modify both places for each query and always make
sure they are identical.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea on how to simplify comparing two sets

2017-02-07 Thread Joel Jacobson

But that's already a valid statement, so there is no ambiguity:

SELECT TRUE WHERE FALSE
IS NOT DISTINCT FROM
(SELECT TRUE);
 bool
--
(0 rows)


If you want to compare the set (SELECT TRUE WHERE FALSE) with the set
(SELECT TRUE) then just add parenthesis:
(SELECT TRUE WHERE FALSE)
IS NOT DISTINCT FROM
(SELECT TRUE);
ERROR:  syntax error at or near "IS"
LINE 2: IS NOT DISTINCT FROM
^

Which is currently invalid syntax.




On Tue, Feb 7, 2017 at 4:40 PM, Anders Granlund
<anders.granl...@trustly.com> wrote:
> What about this ambiguity?
>
> SELECT TRUE
> WHERE FALSE
> IS NOT DISTINCT FROM
> (SELECT TRUE)
>
> On Tue, Feb 7, 2017 at 4:13 PM, Joel Jacobson <j...@trustly.com> wrote:
>>
>>  Hi hackers,
>>
>> Currently there is no simple way to check if two sets are equal.
>>
>> Looks like no RDBMS in the world has a simple command for it.
>>
>> You have to do something like:
>>
>> WITH
>> T1 AS (SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 1),
>> T2 AS (SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 1)
>> SELECT
>> GREATEST(
>> (SELECT COUNT(*) FROM T1),
>> (SELECT COUNT(*) FROM T2)
>> )
>> =
>> (SELECT COUNT(*) FROM (
>> SELECT * FROM T1
>> INTERSECT ALL
>> SELECT * FROM T2
>> ) AS X)
>> INTO _Identical;
>>
>> or,
>>
>> SELECT 'Missmatch!' WHERE EXISTS (
>> SELECT * FROM Foo
>> FULL JOIN Bar ON (Foo.FooID = Bar.BarID AND
>>  Foo IS NOT DISTINCT FROM Bar)
>> WHERE TRUE
>> AND ( Foo.FooID BETWEEN 1 AND 1 AND
>>   Bar.BarID BETWEEN 1 AND 1)
>> AND ( Foo.FooID IS NULL OR
>>   Bar.BarID IS NULL);
>>
>> Introducing new SQL keywords is of course not an option,
>> since it would possibly break backwards compatibility.
>>
>> So here is an idea that doesn't break backwards compatibility:
>>
>> Let's give a meaning for the existing IS DISTINCT and IS NOT DISTINCT,
>> that is currently a syntax error when used between two sets.
>>
>> SELECT 1 IS DISTINCT FROM SELECT 1;
>> ERROR:  syntax error at or near "SELECT"
>> LINE 1: SELECT 1 IS DISTINCT FROM SELECT 1;
>>
>> The example above could be written as:
>>
>> _Identical := (
>> SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 1
>> IS NOT DISTINCT FROM
>> SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 1
>> );
>>
>> Which would set _Identical to TRUE if the two sets are equal,
>> and FALSE otherwise.
>>
>> Since it's currently a syntax error, there is no risk for changed
>> behaviour for any existing executable queries.
>>
>> Thoughts?
>>
>> /Joel
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
>



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Idea on how to simplify comparing two sets

2017-02-07 Thread Joel Jacobson

 Hi hackers,

Currently there is no simple way to check if two sets are equal.

Looks like no RDBMS in the world has a simple command for it.

You have to do something like:

WITH
T1 AS (SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 1),
T2 AS (SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 1)
SELECT
GREATEST(
(SELECT COUNT(*) FROM T1),
(SELECT COUNT(*) FROM T2)
)
=
(SELECT COUNT(*) FROM (
SELECT * FROM T1
INTERSECT ALL
SELECT * FROM T2
) AS X)
INTO _Identical;

or,

SELECT 'Missmatch!' WHERE EXISTS (
SELECT * FROM Foo
FULL JOIN Bar ON (Foo.FooID = Bar.BarID AND
 Foo IS NOT DISTINCT FROM Bar)
WHERE TRUE
AND ( Foo.FooID BETWEEN 1 AND 1 AND
  Bar.BarID BETWEEN 1 AND 1)
AND ( Foo.FooID IS NULL OR
  Bar.BarID IS NULL);

Introducing new SQL keywords is of course not an option,
since it would possibly break backwards compatibility.

So here is an idea that doesn't break backwards compatibility:

Let's give a meaning for the existing IS DISTINCT and IS NOT DISTINCT,
that is currently a syntax error when used between two sets.

SELECT 1 IS DISTINCT FROM SELECT 1;
ERROR:  syntax error at or near "SELECT"
LINE 1: SELECT 1 IS DISTINCT FROM SELECT 1;

The example above could be written as:

_Identical := (
SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 1
IS NOT DISTINCT FROM
SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 1
);

Which would set _Identical to TRUE if the two sets are equal,
and FALSE otherwise.

Since it's currently a syntax error, there is no risk for changed
behaviour for any existing executable queries.

Thoughts?

/Joel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RustgreSQL

2017-01-09 Thread Joel Jacobson

On Mon, Jan 9, 2017 at 6:03 PM, Craig Ringer  wrote:
> Immutable functions can and do use functionality from all over the
> server. They just don't depend on user-visible mutable _state_
> elsewhere in the server.

OK, fair point, but at least the functionality *could* be written without using
existing C functions, since its only the input that determine what
output will be returned. The dependencies used by the immutable
functions can also be ported, function by function, until there are
no dependencies.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RustgreSQL

2017-01-09 Thread Joel Jacobson

On Mon, Jan 9, 2017 at 3:22 PM, Jim Nasby  wrote:
> I do wonder if there are parts of the codebase that would be much better
> suited to a language other than C, and could reasonably be ported.
> Especially if that could be done in such a way that the net result is still
> C code so we're not adding dependencies to non developers (similar to
> bison).
>
> Extensions are a step in that direction, but they're ultimately not core
> Postgres (which is a different issue).

I think this is a great idea!

That way the amount of C code could be reduced over time,
while safely extending the official version with new functionality on
the surface,
without increasing the amount of C code.

One idea of an area that would be most useful from a user-perspective
is probably all pg_catalog functions that are immutable.
They should be able to be written without expertise of the PostgreSQL internals,
since they only depend on the input parameters to produce the output.

And that also means is should be easier to write them in a different
language than C,
because they don't need access to any PostgreSQL internal data structures,
or make use of existing C functions.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RustgreSQL

2017-01-08 Thread Joel Jacobson

Thank you Craig for explaining the current C state of the project,
very interesting to learn about.

On Sun, Jan 8, 2017 at 4:19 AM, Craig Ringer
 wrote:
> If a low pain seamless conversation/adaptation like that were possible I'd 
> have to wonder what Rust could actually offer us over C since it clearly has 
> the same scope for issues if such an intermixture is possible. Kind of a 
> catch-22.

Not necessarily. If you write non-idiomatic Rust and preserve as much
of the code style as possible from the original C code, and just focus
on getting ride of all usage of unsafe, then it will be more easily
understandable by existing C developers than if strictly writing
idiomatic Rust code.

> You'd have a lot more chance writing extensions in Rust though. If you can 
> make it play OK with the exception handling in postgres and our memory 
> management, at least. If you really wanted to push this forward... start 
> there. Show how great it is.

Funny you mention, that's actually exactly how these thoughts started
in my head. I realized I would need to write a C function to do some
computations that needed to be fast, but since C is dangerous I don't
know the language well enough I thought "I wish there was a safe and
fast system language you could
write database functions in!", and that's how I first found Rust and
then later this project:
https://github.com/thehydroimpulse/postgres-extension.rs
Which allows doing exactly that, writing extensions in Rust.

> You don't want to to learn C so dozens/hundreds of people need to learn Rust. 
> What the?

Oh, I don't think you seriously think I meant to suggest others should
learn Rust just because I don't want to learn C.
I don't want to learn the complicated details of C, that's true.
But that has nothing to do why others would need to learn Rust. They
don't, unless the majority of the project would also want to move to
Rust, and that has of course nothing to do with me.
I'm just asking possibly stupid questions and having possibly stupid
theories, trying to understand why such a project would be possible or
not.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] RustgreSQL

2017-01-08 Thread Joel Jacobson

Hi all,

Is anyone working on porting PostgreSQL to Rust?

Corrode looks a bit limited for the task, but maybe it can be a start.
It doesn't support goto or switch, but maybe the gotos patterns are not too
complicated.

My motivation is primarily I don't want to learn all the over-complicated
details of C,
but at the same time I would like to be productive in a safe system
language,
a category in which Rust seems to be alone.

Porting PostgreSQL to Rust would be a multi-year project,
and it could only be done if the process could be fully automated,
by supporting all the coding patterns used by the project,
otherwise a Rust-port would quickly fall behind the master branch.
But if all git commits could be automatically converted to Rust,
then the RustgreSQL project could pull all commits from upstream
until all development has switched over to Rust among all developers.

Is this completely unrealistic or is it carved in stone PostgreSQL will
always be a C project forever and ever?

Re: [HACKERS] merging some features from plpgsql2 project

2017-01-08 Thread Joel Jacobson

On Sat, Jan 7, 2017 at 8:56 PM, Pavel Stehule  wrote:
>
> Jim, Marko, Joel - is there a place, features where we can find a partial 
> agreement? If it is, then we can move our view there.

I have decided I definitively want a new language, and I'm willing to
pay for it.

Hopefully the community will join forces and contribute with ideas and
code, but with or without you or the rest of the community, plpgsql2
is going to happen.
Call it pltrustly or plpgsql2, I don't care. I just care about ending
my suffering from being forced writing plpgsql every day. It sucks,
and I'm going to end it.

I'm just too fed up with the annoyances of plpgsql. I cannot care less
about _hypothetical_ incompatibility problems,
I think your arguments "this is like Perl6 or Python3" are delusional.
You can easily intermix plpgsql and plpgsql2 in the same
"application", something you cannot do with Perl6 or Python3. So
please stop using that as an argument.

If anyone has an application where the hypothetical incompatibility
problems would be a problem, then just continue to use plpgsql.

And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.

It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.

I'm even willing to suggest it might be a good idea to first try out
PL/pgSQL2 at Trustly, and after a year of usage, report back to the
community of our findings on how well it worked out for us, to allow
all others to learn from our mistakes during our first year of using
the new language. That way less people and companies will have to
suffer when we discover what we got wrong in what we thought would
work out well for us.

During the same trial period maybe your company Pavel and others can
try out their ideas of a PL/pgSQL2 and implement it, see how it works
out for you, and then report back to the community on your findings
from production environments.

That way we can avoid all these hypothetical discussions on what will
be good or bad without having any empirical evidence at hand.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2017-01-07 Thread Joel Jacobson

On Sat, Jan 7, 2017 at 3:25 AM, Greg Stark  wrote:
> What users need to know is in aggregate how much of the time the
> database is spending working on their queries is going into different
> states.

This is a separate feature idea, but I think it's really valuable as well.

Maybe something similar to pg_stat_user_functions?
But instead grouping by wait_event_type, wait_event, and showing
accumulated count and sum of waiting time since last stat reset, just
like the other pg_stat_* views?

Maybe something like this?

\d pg_stat_waiting
 View "pg_catalog.pg_stat_waiting"
   Column|   Type   | Modifiers
-+--+---
 wait_event_type | name |
 wait_event  | name |
 waiting_counter | bigint   |
 waiting_time| double precision |

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] merging some features from plpgsql2 project

2017-01-06 Thread Joel Jacobson

On Thu, Jan 5, 2017 at 7:03 AM, Robert Haas  wrote:
>
> I think it would be a good idea to lock all the people who really care
> about PL/pgsql in a room until they agree on what changes should be
> made for the next version of the language.  If they don't agree
> quickly enough, we can resort to the techniques described in
> https://en.wikipedia.org/wiki/Papal_election,_1268%E2%80%9371

I think that's a very good idea, and I'm happy to be locked into such a room.

I think such a discussion will be very fruitful,
given the others in the room have also
already decided they want a new language
and are there to discuss "the next version of the language",
instead of debating why they don't think we need a new language.

It would also be good if those people could bring laptops
with all their plpgsql code bases, to check if any of
the proposed possibly non-backwards compatible
syntax proposals would break nothing, just a few functions,
or a lot of functions in their code bases.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2017-01-05 Thread Joel Jacobson

On Thu, Jan 5, 2017 at 4:59 PM, Bruce Momjian  wrote:
> Agreed.  No need in adding overhead for short-lived locks because the
> milli-second values are going to be meaningless to users. I would be
> happy if we could find some weasel value for non-heavyweight locks.

To avoid a NULL value for waiting_start, and thanks to non-heavyweight
locks don't exceed order-of-milliseconds, I think it would be
acceptable to just return now() whenever something wants to know
waiting_start i.e. when something selects from pg_stat_activity.

The exact value would only be within orders-of-milliseconds away from
now() anyway, so one can argue it's not that important, as long as the
documentation is clear on that point.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-25 Thread Joel Jacobson

On Sun, Dec 25, 2016 at 8:01 PM, Andres Freund <and...@anarazel.de> wrote:
> On December 25, 2016 1:21:43 AM GMT+01:00, Joel Jacobson <j...@trustly.com> 
> wrote:
>
>>Is it really a typical real-life scenario that processes can be
>>waiting extremely often for extremely short periods of time,
>>where the timing overhead would be significant?
>
> Yes. Consider WAL insertion, procarray or other similar contended locks.

Ah, I see, then I understand it has to be blazingly fast.

Maybe a good tradeoff then would be to let "wait_start" represent the
very first time the txn started waiting?
That way gettimeofday() would only be called once per txn, and the
value would be remembered, but not exposed when the txn is not
waiting.
If the txn is waiting/not waiting multiple times during it's
life-time, the same "wait_start" value would be exposed when it's
waiting, and NULL when it's not. Sounds good?

As long as the documentation is clear on "wait_start" meaning when the
first wait started in the txn, I think that's useful enough to improve
the situation, as one could then ask a query like "select all
processes that have possibly been waiting for at least 5 seconds",
which you cannot do today.

The best you can do today is ask a query like "select all processes
that are waiting and have been running for at least 5 seconds", but
during those 5 seconds they have been running, they might only have
been waiting for the very last few milliseconds, which might not be a
problem at all. If instead knowing they were waiting 5 seconds ago,
and are still waiting, but might have had periods in between where
they were not waiting, I would say that is close enough to what I as a
user want to know, and can use that information for automatic
decision-making on e.g. if I want to terminate other blocking
processes.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-24 Thread Joel Jacobson

> This is not an easy problem.  See our most recent discussion at
> https://www.postgresql.org/message-id/flat/31856.1400021891%40sss.pgh.pa.us

Thanks for the small test program.
I tested it on my MacBook Pro and gettimeofday() was way faster than time().

The clock_gettime() used by the patch clock_gettime_1.patch in the
mailing thread from Hari Babu apparently doesn't work on OS X.

Instead, I tested the OS X specific mach_absolute_time() which was the fastest:

gcc -Wall -O2 -o time-timing-calls -DUSE_MACH_ABSOLUTE_TIME time-timing-calls.c
time ./time-timing-calls
real 0m16.806s
user 0m16.781s
sys 0m0.012s

gcc -Wall -O2 -o time-timing-calls -DUSE_GETTIMEOFDAY time-timing-calls.c
time ./time-timing-calls
real 0m35.466s
user 0m35.062s
sys 0m0.393s

Code:

#ifdef __MACH__
#include 
#endif
#ifdef USE_MACH_ABSOLUTE_TIME
uint64_t tv;
tv = mach_absolute_time();
#endif

> I'm prepared to consider an argument that wait timing might have weaker
> requirements than EXPLAIN ANALYZE (which certainly needs to measure short
> durations) but you didn't actually make that argument.

I can see why timing overload is a problem in EXPLAIN ANALYZE and at
other places,
and that would of course be a great thing to fix.

However, I'm not sure I fully understand how it can be that much of a
problem in pgstat_report_wait_start()?

As far as I can tell from reading the source code, it only appears
pgstat_report_wait_start() is only entered when a process is waiting?

Is it not likely the time spent waiting will vastly exceed the amount
of extra time for the gettimeofday() call?

Is it really a typical real-life scenario that processes can be
waiting extremely often for extremely short periods of time,
where the timing overhead would be significant?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-23 Thread Joel Jacobson

Attached is a patch implementing the seconds-resolution wait_start, but
presented as a timestamptz to the user, just like the other *_start fields:

commit c001e5c537e36d2683a7e55c7c8bfcc154de4c9d
Author: Joel Jacobson <j...@trustly.com>
Date:   Sat Dec 24 13:20:09 2016 +0700

Add OUT parameter "wait_start" timestamptz to pg_stat_get_activity()
and pg_catalog.pg_stat_activity

This is set to the timestamptz with seconds resolution
when the process started waiting, and reset to NULL
when it's not waiting any longer.

This is useful if you want to know not only what the process is
waiting for, but also how long it has been waiting,
which can be useful in situations when it might be
normal for different users/applications to wait for some amount of time,
but abnormal if they are waiting longer than some threshold.

When such a threshold is exceeded, monitoring applications
could then alert the user or possibly cancel/terminate
the blocking processes.

Without information on how long time processes have been waiting,
the monitoring applications would have no other option than
to cancel/terminate a process as soon as something is waiting,
or keep track of how long time processes have been waiting
by polling and keeping track on a per process basis,
which is less user-friendly than if PostgreSQL would provide
the information directly to the user.

 src/backend/catalog/system_views.sql | 3 ++-
 src/backend/postmaster/pgstat.c  | 1 +
 src/backend/storage/lmgr/proc.c  | 1 +
 src/backend/utils/adt/pgstatfuncs.c  | 7 ++-
 src/include/catalog/pg_proc.h| 2 +-
 src/include/pgstat.h | 6 +-
 src/include/storage/proc.h   | 3 +++
 7 files changed, 19 insertions(+), 4 deletions(-)

On Sat, Dec 24, 2016 at 12:32 PM, Joel Jacobson <j...@trustly.com> wrote:

> On Sat, Dec 24, 2016 at 9:56 AM, Joel Jacobson <j...@trustly.com> wrote:
> >> The difficulty with that is it'd require a gettimeofday() call for
> >> every wait start.  Even on platforms where those are relatively cheap,
>
> I just realized how this can be optimized.
> We only need to set wait_start for every new waiting period,
> not for every wait start, i.e. not for every call to
> pgstat_report_wait_start():
>
> Example:
>
> In pgstat_report_wait_start():
> if (proc->wait_start == 0)
>proc->wait_start = (pg_time_t) time(NULL);
>
> And then in pgstat_report_wait_end():
> proc->wait_start = 0;
>
> This means we only need to call time() or gettimeofday() once per
> waiting period.
>
>
> --
> Joel Jacobson
>

-- 
Joel Jacobson

Mobile: +46703603801
*Trustly.com <http://trustly.com/> | Newsroom
<http://www.mynewsdesk.com/trustly_en> | LinkedIn
<https://www.linkedin.com/company/trustly-group-ab> | **Twitter
<https://twitter.com/Trustly>*

* <https://trustly.com/>*

0001-pg_stat_get_activity_wait_start.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-23 Thread Joel Jacobson

On Sat, Dec 24, 2016 at 9:56 AM, Joel Jacobson <j...@trustly.com> wrote:
>> The difficulty with that is it'd require a gettimeofday() call for
>> every wait start.  Even on platforms where those are relatively cheap,

I just realized how this can be optimized.
We only need to set wait_start for every new waiting period,
not for every wait start, i.e. not for every call to pgstat_report_wait_start():

Example:

In pgstat_report_wait_start():
if (proc->wait_start == 0)
   proc->wait_start = (pg_time_t) time(NULL);

And then in pgstat_report_wait_end():
proc->wait_start = 0;

This means we only need to call time() or gettimeofday() once per
waiting period.

-- 
Joel Jacobson

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-23 Thread Joel Jacobson

On Sat, Dec 24, 2016 at 9:00 AM, Tom Lane  wrote:
>> I would like to propose adding a fourth such column, "waiting_start",
>> which would tell how long time a backend has been waiting.
>
> The difficulty with that is it'd require a gettimeofday() call for
> every wait start.  Even on platforms where those are relatively cheap,
> the overhead would be nasty --- and on some platforms, it'd be
> astonishingly bad.  We sweated quite a lot to get the overhead of
> pg_stat_activity wait monitoring down to the point where it would be
> tolerable for non-heavyweight locks, but I'm afraid this would push
> it back into the not-tolerable range.

I don't think we need the microsecond resolution provided by
gettimeofday() via GetCurrentTimestamp()
It would be enough to know which second the waiting started, so we
could use time().

gettimeofday() takes 42 cycles.
time() only takes 3 cycles. [1]

[1] http://stackoverflow.com/questions/6498972/faster-equivalent-of-gettimeofday


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-23 Thread Joel Jacobson

Actually, "wait_start" is a better name to match the others
("wait_event_type" and "wait_event").


On Sat, Dec 24, 2016 at 8:20 AM, Joel Jacobson <j...@trustly.com> wrote:
> Hi hackers,
>
> We already have xact_start, query_start and backend_start
> to get the timestamptz for when different things happened.
>
> I would like to propose adding a fourth such column, "waiting_start",
> which would tell how long time a backend has been waiting.
>
> The column would be NULL when waiting=FALSE.
>
> While it's trivial to write a script that just polls pg_stat_activity
> every second and keeps tack of when a backend started
> waiting by just checking for any new waiting=TRUE rows,
> it would be more convenient to just get the information from
> pg_stat_activity directly.
>
> The use-case would be e.g. monitoring tools
> when you want to know how long time queries are waiting.
>
> --
> Joel Jacobson



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_stat_activity.waiting_start

2016-12-23 Thread Joel Jacobson

Hi hackers,

We already have xact_start, query_start and backend_start
to get the timestamptz for when different things happened.

I would like to propose adding a fourth such column, "waiting_start",
which would tell how long time a backend has been waiting.

The column would be NULL when waiting=FALSE.

While it's trivial to write a script that just polls pg_stat_activity
every second and keeps tack of when a backend started
waiting by just checking for any new waiting=TRUE rows,
it would be more convenient to just get the information from
pg_stat_activity directly.

The use-case would be e.g. monitoring tools
when you want to know how long time queries are waiting.

-- 
Joel Jacobson


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

2016-12-21 Thread Joel Jacobson

Attached is the function SET_NOT_NULL(_Schema name, _Table name, _Column
name) which does the following:

1. LOCK TABLE %I.%I IN ACCESS EXCLUSIVE MODE
just like the normal DDL commands would do

2. SELECT EXISTS (SELECT 1 FROM %I.%I WHERE %I IS NULL)
which is fast if there is an index on the column

3. UPDATE pg_catalog.pg_attribute SET attnotnull = TRUE
WHERE attrelid = %L::oid
AND   attname  = %L

Pragmatically, would this be a safe approach?


On Wed, Dec 21, 2016 at 6:53 PM, Joel Jacobson <j...@trustly.com> wrote:

> If you are fully confident you have no NULL values,
> e.g. if you have all your logics in db functions and you validate all
> INSERTs to a table won't pass any NULL values,
> and you have checked all the rows in a table are NOT NULL for the column,
> would it be completely crazy to just set pg_attribute.attnotnull to
> TRUE for the column?
>
> Is anything else happening "under the hood" than just locking all rows
> and verifying there are no NULL rows, and then setting attnotnull to
> TRUE?
>
>
> On Wed, Dec 21, 2016 at 6:37 PM, Craig Ringer <cr...@2ndquadrant.com>
> wrote:
> > On 21 December 2016 at 19:01, Joel Jacobson <j...@trustly.com> wrote:
> >
> >> Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
> >> feature to improve concurrency,
> >> we would be very interested in also sponsoring this feature, as it
> >> would mean a great lot to us.
> >> I don't know if this is the right forum trying to find someone/some
> >> company to sign up for the task,
> >> please let me know if I should mail to some other list. Thanks.
> >
> > You'll probably get mail off list.
> >
> > For what it's worth, there's a bit of a complexity here. PostgreSQL
> > doesn't model NOT NULL as a true CONSTRAINT. Instead it's a column
> > attribute. I suspect we would need to change that in order to allow a
> > NOT VALID NOT NULL constraint to be created.
> >
> > That's at least partly why the docs say that "option NOT VALID [...]
> > is currently only allowed for foreign key and CHECK constraints".
> >
> > Note that "[VALIDATE] acquires only a SHARE UPDATE EXCLUSIVE lock on
> > the table being altered" so it's already suitable for what you need.
> > The challenge is making it possible to create a NOT VALID constraint
> > for NOT NULL.
> >
> > --
> >  Craig Ringer   http://www.2ndQuadrant.com/
> >  PostgreSQL Development, 24x7 Support, Training & Services
>
>
>
> --
> Joel Jacobson
>
> Mobile: +46703603801
> Trustly.com | Newsroom | LinkedIn | Twitter
>



-- 
Joel Jacobson

Mobile: +46703603801
*Trustly.com <http://trustly.com/> | Newsroom
<http://www.mynewsdesk.com/trustly_en> | LinkedIn
<https://www.linkedin.com/company/trustly-group-ab> | **Twitter
<https://twitter.com/Trustly>*


* <https://trustly.com/>*


set_not_null.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

2016-12-21 Thread Joel Jacobson

If you are fully confident you have no NULL values,
e.g. if you have all your logics in db functions and you validate all
INSERTs to a table won't pass any NULL values,
and you have checked all the rows in a table are NOT NULL for the column,
would it be completely crazy to just set pg_attribute.attnotnull to
TRUE for the column?

Is anything else happening "under the hood" than just locking all rows
and verifying there are no NULL rows, and then setting attnotnull to
TRUE?


On Wed, Dec 21, 2016 at 6:37 PM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 21 December 2016 at 19:01, Joel Jacobson <j...@trustly.com> wrote:
>
>> Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
>> feature to improve concurrency,
>> we would be very interested in also sponsoring this feature, as it
>> would mean a great lot to us.
>> I don't know if this is the right forum trying to find someone/some
>> company to sign up for the task,
>> please let me know if I should mail to some other list. Thanks.
>
> You'll probably get mail off list.
>
> For what it's worth, there's a bit of a complexity here. PostgreSQL
> doesn't model NOT NULL as a true CONSTRAINT. Instead it's a column
> attribute. I suspect we would need to change that in order to allow a
> NOT VALID NOT NULL constraint to be created.
>
> That's at least partly why the docs say that "option NOT VALID [...]
> is currently only allowed for foreign key and CHECK constraints".
>
> Note that "[VALIDATE] acquires only a SHARE UPDATE EXCLUSIVE lock on
> the table being altered" so it's already suitable for what you need.
> The challenge is making it possible to create a NOT VALID constraint
> for NOT NULL.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

2016-12-21 Thread Joel Jacobson

On Wed, Dec 21, 2016 at 4:24 PM, Craig Ringer <cr...@2ndquadrant.com> wrote:
>> Is anyone working on fixing this for PostgreSQL 10?
>
> Not as far as I know.
>
> IMO this and other similar cases should all be handled the same way:
> create the constraint NOT VALID, then VALIDATE it while holding a weak
> lock that only blocks concurrent schema changes.

Sounds like a good approach.

Similar to what we (Trustly) did when we sponsored the FOR KEY LOCK
feature to improve concurrency,
we would be very interested in also sponsoring this feature, as it
would mean a great lot to us.
I don't know if this is the right forum trying to find someone/some
company to sign up for the task,
please let me know if I should mail to some other list. Thanks.

Joel Jacobson
Trustly

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] SET NOT NULL [NOT VALID / CONCURRENTLY]?

2016-12-21 Thread Joel Jacobson

Hi hackers,

I would be good if it would be possible to quickly set NOT NULL for an
existing column in a table
that have no rows where the column IS NULL and where there is a full
index on the column
allowing the logics to quickly understand there are no NULL values,
and just have to take a quick
lock on the table to prevent any modifications during the short time
when the NOT NULL
is set for the column.

Currently if you want to set NOT NULL for a column in a huge table
that's not doable without blocking all writes to the table for quite
some time.
Setting NOT NULL for a 100 million row table took 28 seconds locally
on my machine.

Is anyone working on fixing this for PostgreSQL 10?

Joel Jacobson
Trustly


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Typo in doc/src/sgml/catalogs.sgml

2016-12-11 Thread Joel Jacobson

Hi,

I found a minor typo at
https://www.postgresql.org/docs/9.6/static/catalog-pg-am.html.

pg_catalog.pg_am. amhandler is of type "oid" according to the
documentation, but it's actually of type "regproc" in reality.

Compare with e.g. pg_aggregate where the columns of type "regproc" is
both "regproc" in the documentation and in reality.

To fix:

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 29738b0..93deda4 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -566,7 +566,7 @@

  
   amhandler
-  oid
+  regproc
   pg_proc.oid
   
OID of a handler function that is responsible for supplying information

Best regards,

Joel Jacobson
Trustly


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autonomous transactions

2016-09-02 Thread Joel Jacobson

On Thu, Sep 1, 2016 at 8:38 PM, Andres Freund  wrote:
>
> I'm not convinced this makes much sense. All FDWs, dblink etc. already
> allow you do stuff outside of a transaction.

You as a DBA can prevent FDWs from being used and dblink is an
extension that you don't have to install.
So if preventing side-effects is necessary in your application, that
can be achieved by simply not installing dblink and preventing FDWs.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autonomous transactions

2016-09-01 Thread Joel Jacobson

On Wed, Aug 31, 2016 at 6:22 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 31 August 2016 at 14:09, Joel Jacobson <j...@trustly.com> wrote:
>> My idea on how to deal with this would be to mark the function to be
>> "AUTONOMOUS" similar to how a function is marked to be "PARALLEL
>> SAFE",
>> and to throw an error if a caller that has blocked autonomous
>> transactions tries to call a function that is marked to be autonomous.
>>
>> That way none of the code that needs to be audited would ever get executed.
>
> Not sure I see why you would want to turn off execution for only some 
> functions.
>
> What happens if your function calls some other function with
> side-effects?

I'm not sure I understand your questions. All volatile functions modifying data
have side-effects. What I meant was if they are allowed to commit it
even if the caller doesn't want to.

However, I'll try to clarify the two scenarios I envision:

1. If a function is declared AUTONOMOUS and it gets called,
then that means nothing in the txn has blocked autonomous yet
and the function and any other function will be able to do autonomous txns
from that here on, so if some function would try to block autonomous that
would throw an error.

2. If a function has blocked autonomous and something later on
tries to call a function declared AUTONOMOUS then that would throw an error.

Basically, we start with a NULL state where autonomous is neither blocked
or explicitly allowed. Whatever happens first decides if autonomous transactions
will explicitly be blocked or allowed during the txn.

So we can go from NULL -> AUTONOMOUS ALLOWED
or NULL -> AUTONOMOUS BLOCKED,
but that's the only two state transitions possible.

Once set, it cannot be changed.

If nothing in an application cares about autonomous transactions,
they don't have to do anything special, they don't need to modify any
of their code.

But if it for some reason is important to block autonomous transactions
because the application is written in a way where it is expected
a RAISE EXCEPTION always rollbacks everything,
then the author of such an application (e.g. me) can just block
autonomous transactions
and continue to live happily ever after without having to dream nightmares about
developers misusing the feature, and only use it when appropriate.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autonomous transactions

2016-09-01 Thread Joel Jacobson

On Thu, Sep 1, 2016 at 12:12 AM, Vik Fearing  wrote:
> Part of what people want this for is to audit what people *try* to do.
> We can already audit what they've actually done.
>
> With your solution, we still wouldn't know when an unauthorized attempt
> to do something happened.

The unauthorized attempt to execute the function will still be logged
to the PostgreSQL log file
since it would throw an error, just like trying to connect with e.g.
an invalid username would be logged to the log files.
I think that's enough for that use-case, since it's arguably not an
application layer problem,
since no part of the code was ever executed.

But if someone tries to execute a function where one of the input params
is a password and the function raises an exception if the password
is incorrect and wants to log the unauthorized attempt, then that
would be a good example of when you could use and would need to use
autonomous transactions to log the invalid password attempt.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autonomous transactions

2016-08-31 Thread Joel Jacobson

On Wed, Aug 31, 2016 at 6:41 AM, Jaime Casanova
<jaime.casan...@2ndquadrant.com> wrote:
>
> On 30 August 2016 at 23:10, Joel Jacobson <j...@trustly.com> wrote:
> >
> > There should be a way to within the session and/or txn permanently
> > block autonomous transactions.
> >
>
> This will defeat one of the use cases of autonomous transactions: auditing

My idea on how to deal with this would be to mark the function to be
"AUTONOMOUS" similar to how a function is marked to be "PARALLEL
SAFE",
and to throw an error if a caller that has blocked autonomous
transactions tries to call a function that is marked to be autonomous.

That way none of the code that needs to be audited would ever get executed.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autonomous transactions

2016-08-30 Thread Joel Jacobson

I would love to see autonomous transactions in core.

I just have one major concern, but thankfully it's easily addressed.

There should be a way to within the session and/or txn permanently
block autonomous transactions.

This is important if you as a caller function want to be sure none of
the work made by anything called down the stack gets committed.
That is, if you as a caller decide to rollback, e.g. by raising an
exception, and you want to be sure *everything* gets rollbacked,
including all work by functions you've called.

If the caller can't control this, then the author of the caller
function would need to inspect the source code of all function being
called, to be sure there are no code using autonomous transactions.

Coding conventions, rules and discipline are all good and will help
against misuse of the feature, but some day someone will make a
mistake and wrongly use the autonomous transaction and cause unwanted
unknown side-effect I as a caller function didn't expect or know
about.

Once you have blocked autonomous transactions in a session or txn,
then any function called must not be able to unblock it (in the
session or txn), otherwise it defeats the purpose.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-12 Thread Joel Jacobson

On Sun, Mar 13, 2016 at 12:40 AM, Tom Lane  wrote:
> In short: we've already been over this territory, at length,
> and I am not excited by people trying to bikeshed it again
> after the fact, especially when no new arguments are being
> presented.  Can we call the discussion closed, please?

Closed, at least from my side.

I'm grateful to have learned at least a bit more about when it's OK
to sacrifice backwards-compatibility.

Sorry for spamming this thread on that topic,
I'll instead wade through the archives to see what more I can learn
to hopefully become less confused.

Thanks.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-11 Thread Joel Jacobson

On Sat, Mar 12, 2016 at 6:31 AM, Jim Nasby  wrote:
> On 3/10/16 8:36 PM, Robert Haas wrote:
>>
>> 1. We make it true only for heavyweight lock waits, and false for
>> other kinds of waits.  That's pretty strange.
>> 2. We make it true for all kinds of waits that we now know how to
>> report.  That still breaks compatibility.
>
>
> I would absolutely vote for 2 here. You could even argue that it's a bug
> fix, since those were waits we technically should have been indicating.
>
> The only way I can see #2 breaking anything is if you're using waiting=true
> to determine whether you look at pg_locks and your code will blow up if you
> get no rows back, but that seems like a pretty limited use case to me
> (Hello, LEFT JOIN).
>
> Dropping the column entirely though would break tons of things.

That was a very good point I hadn't thought about. In that case, +1 to keep it.

I also came to think of the renaming of pg_stat_activity.procpid ->
pg_stat_activity.pid,
which was renamed because "backwards compatibility was broken anyway"
(due to current_query -> query).

I wouldn't rule out there were users who didn't use the
"current_query" column but *did* use "procpid",
who wouldn't have been affected if the procpid column hadn't been
renamed as well.

Apparently in this case, it was OK to break even more things than
necessary, since another things was already broken.

I don't have any opinion whether doing so was a bad or good decision,
it all depends on what the project's objectives are.
Unfortunately, it feels like the case-by-case basis decision model
lead to conflicting arguments, if they would be compared side-by-side.

I really think the project need a policy here on how, why and when
it's OK to break backwards-compatibility.
Such a policy won't cover all cases of course, but if a consensus can
be made on the basic principles,
then discussion can be focused on the details and cases not covered by
the basic principles,
instead of "end up hashing it out on a case-by-case basis".

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-11 Thread Joel Jacobson

On Sat, Mar 12, 2016 at 5:09 AM, Pavel Stehule  wrote:
>> What we need is more input on proposed changes from other companies
>> who are also heavy users of PL/pgSQL.
>>
>> Only then can we move forward. It's like Robert is saying, there is a
>> risk for bikeshedding here,
>> we must widen our perspectives and get better understanding for how
>> other heavy users are using PL/pgSQL.
>
>
> I disagree with this opinion - this is community sw, not commercial. We can
> do nothing if we don't find a agreement.

I disagree with your disagreement.

The users are what matters, and many of them are of course not on this
list (since this is a list for hackers), so we need to reach out to
the users, and those are companies/websites/nonprofits/governments.
So discussing proposed changes on this list will take us absolutely
nowhere, without further input from actual heavy users.
Once we do have input from the heavy users, then and only then can we
continue discussing things on this list, but before then it's kind of
pointless, because we don't know what the most commonly proposed
changes are, not you, not me. The risk of bikeshedding is just too
big, like Robert pointed out.

>> Pavel, do you know of any such companies?
> Probably the biggest company with pretty large code of PL/pgSQL was Skype,
> but I have not any info about current state.

True! I had almost forgotten about them after Microsoft acquired them.
Let's hope they are still on PostgreSQL. I'll check it out, thanks.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-11 Thread Joel Jacobson

On Sat, Mar 12, 2016 at 4:48 AM, Joel Jacobson <j...@trustly.com> wrote:
> neither you nor me have nothing to add.

Correction: neither you nor me have anything to add.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-11 Thread Joel Jacobson

On Sat, Mar 12, 2016 at 4:41 AM, Pavel Stehule  wrote:
> I afraid so you try to look on your use case as global/generic issue. The
> PL/SQL, ADA. PL/pgSQL are verbose languages, and too shortcuts does the
> languages dirty. In this point we have different opinion.
>
> I proposed some enhanced PLpgSQL API with a possibility to create a
> extension that can enforce your requested behave. The implementation can not
> be hard, and it can coverage some special/individual requests well.

I'm not at all interested to discuss any of the proposed changes that
have already been proposed,
because we have already had lengthy discussions on them, and I doubt
neither you nor me have nothing to add.

What we need is more input on proposed changes from other companies
who are also heavy users of PL/pgSQL.

Only then can we move forward. It's like Robert is saying, there is a
risk for bikeshedding here,
we must widen our perspectives and get better understanding for how
other heavy users are using PL/pgSQL.

Pavel, do you know of any such companies?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-11 Thread Joel Jacobson

On Sat, Mar 12, 2016 at 4:08 AM, Robert Haas  wrote:
>
> I don't think my experience in this area is as deep as you seem to
> think.  I can tell you that most of the requests EnterpriseDB gets for
> PL/pgsql enhancements involve wanting it to be more like Oracle's
> PL/SQL, which of course has very little overlap with the stuff that
> you're interested in.

Do you know who could possibly be more experienced
with companies who are heavy users of PL/pgSQL in the community?

and/or,

Do you know of any companies who officially are heavy users of PL/pgSQL?

The only other company I can think of is Zalado, but of course there
are many more,
I just wish I knew their names, because I want to compile a wish list with
proposed changes from as many companies who are heavy users of
PL/pgSQL as possible.

That's the only way to push this forward. As you say, we need a
consensus and input
from a broad range of heavy users, not just from people on this list
with feelings
and opinions who might not actually be heavy users themselves.

Of course almost everybody on this list uses PL/pgSQL from time to
time or even daily,
but it's a completely different thing to write an entire backend
system in the language,
it's first then when you start to become really excited of e.g. not
having to type
at least 30 characters of text every time you do an UPDATE/INSERT
to be sure you modified exactly one row.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-11 Thread Joel Jacobson

On Fri, Mar 11, 2016 at 11:14 AM, Robert Haas  wrote:
> I'm not direly opposed to most of what's on that page,
> but I'm not excited about most of it, either.

May I ask, what improvements of PL/pgSQL would you personally be most
excited about,
if you or someone else would have unlimited resources to hack on it?

> I bet if we canvassed 10 different companies that made heavy use of PL/pgsql 
> they'd all have
> a list of proposed changes like that, and I bet some of them would
> conflict with each other, and I bet if we did all that stuff the
> average PL/pgsql user's life would not be much better, but the manual
> would be much longer.

You as a professional PostgreSQL consultant obviously have a lot of more
contact than me with other companies who make heavy use of PL/pgSQL.

I'm assuming your bet on these proposed changes in conflict you talk about
are based on things you've picked up IRL from companies you've been
working with.

What would you say are the top most commonly proposed changes
from companies that make heavy use of PL/pgSQL, and which of those are
in conflict?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-10 Thread Joel Jacobson

On Fri, Mar 11, 2016 at 9:36 AM, Robert Haas  wrote:
> Well, this was discussed.  If we keep the Boolean "waiting" column, then 
> either:

Oh, sorry for missing out on that discussion.

> 1. We make it true only for heavyweight lock waits, and false for
> other kinds of waits.  That's pretty strange.
> 2. We make it true for all kinds of waits that we now know how to
> report.  That still breaks compatibility.

Why not 3: We make it true for exactly the same type of situations as
in previous versions. Or is it not possible to do so for some reason?

> I do understand that changing this is backward-incompatible and a lot
> of people are going to have to update their monitoring tools.  But I
> think that's the best alternative.  If we choose option #1, we're
> going to be saddled with a weird backward-compatibility column
> forever, and ten years from now we'll be explaining that even if
> waiting = false you might still be waiting depending on the value of
> some other column.  If we choose option #2, it won't be
> backward-compatible and some people's queries will still break, just
> less obviously.  Neither of those things seems very appealing.

I understand it's necessary to break backward-compatibility if the
it's not possible to return the same boolean value for the "waiting"
column in exactly the same situations.
Actually, even if it would be possible, I agree with you it's better
to force people to learn how to improve their tools by using the new
features.

Off topic, but related to the backward-compatibility subject:

Is there any written policy/wiki/thread/document on the topic "When
breaking backward-compatibility is acceptable"?

It would be helpful to get a better understand of this, as some ideas
on how to improve things can quickly be ruled out or ruled in
depending on what is acceptable or not.
For instance, there are some minor but annoying flaws in PL/pgSQL that
I would love to get fixed,
but the main arguments against doing so have been that it might break
some users' code somewhere,
even though doing so would probably be a good thing as the user could
have a bug in the code.
See: https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014)

I think one general rule should be "Breaking backward-compatibility is
acceptable if the new major pg-version throws an error in a situation
where the old major pg-version would conceal a bug or allow misuse of
a feature".
Trying to select the now removed "waiting" column throws an error.
Good! That lead me as a user here to figure out why I can't and
shouldn't use it. :)
Trying to e.g. select a different number of columns into a different
number of variables in a PL/pgSQL function doesn't throw an error.
Bad. :(
Here I would argue it's better to throw an error, just like when
trying to select from "waiting". It will hopefully save the day for
some users out there who can't find the bug in their complicated
PL/pgSQL application with millions of lines of code.

Sorry if this was completely off-topic, maybe I should start a new
thread or read some old thread in the archives on
backward-compatibility instead.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-10 Thread Joel Jacobson

This is an excellent feature, thanks!
But can we please keep the old boolean waiting column?
I see no reason to break backward-compatibility. Or maybe I'm missing something.

I just had to commit this to make our system run locally on 9.6:

commit 2e189f85fa56724bec5c5cab2fcf0d2f3a4ce22a
Author: Joel Jacobson <j...@trustly.com>
Date:   Fri Mar 11 08:19:52 2016 +0700

Make Have_Queries_Waiting() work with both <9.6 and >=9.6.

Apparently pg_stat_activity.waiting was removed by this commit:
  commit 53be0b1add7064ca5db3cd884302dfc3268d884e
  Author: Robert Haas <rh...@postgresql.org>
  Date:   Thu Mar 10 12:44:09 2016 -0500

  Provide much better wait information in pg_stat_activity.

This forces us to do some ugly version checking to know which column to use.
I for one can think it would have been better to keep the old
boolean column,
which is not entirely useless as sometimes you just want to know
if something is
waiting and don't care about the details, then it's convenient to
have a boolean column
instead of having to write "wait_event IS NOT NULL".

Let's hope they will add back our dear waiting column so we can avoid this
ugly hack before upgrading to 9.6.

diff --git a/public/FUNCTIONS/have_queries_waiting.sql
b/public/FUNCTIONS/have_queries_waiting.sql
index d83e7c8..b54caf5 100644
--- a/public/FUNCTIONS/have_queries_waiting.sql
+++ b/public/FUNCTIONS/have_queries_waiting.sql
@@ -3,9 +3,16 @@ SET search_path TO 'public', pg_catalog;
 CREATE OR REPLACE FUNCTION have_queries_waiting() RETURNS boolean
 SECURITY DEFINER
 SET search_path TO public, pg_temp
-LANGUAGE sql
+LANGUAGE plpgsql
 AS $$
-SELECT EXISTS (SELECT 1 FROM pg_stat_activity WHERE waiting)
+DECLARE
+BEGIN
+IF version() ~ '^PostgreSQL 9\.[1-5]' THEN
+RETURN EXISTS (SELECT 1 FROM pg_stat_activity WHERE waiting);
+ELSE
+RETURN EXISTS (SELECT 1 FROM pg_stat_activity WHERE wait_event IS
NOT NULL);
+END IF;
+END;
 $$;

On Fri, Mar 11, 2016 at 6:17 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> Or ... maybe this is intentional behavior?  Now that I think about it,
>> doesn't each backend cache this info the first time its transaction
>> reads the data?
>
> Your view of pg_stat_activity is supposed to hold still within a
> transaction, yes.  Otherwise it'd be really painful to do any complicated
> joins.  I think there may be a function to explicitly flush the cache,
> if you really need to see intratransaction changes.
>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Disabling an index temporarily

2016-03-09 Thread Joel Jacobson

On Mon, Dec 14, 2015 at 10:27 AM, Tom Lane  wrote:
> Corey Huinker  writes:
>> So, I'd propose we following syntax:
>> ALTER INDEX foo SET DISABLED
>> -- does the SET indisvalid = false shown earlier.
>
> This is exactly *not* what Tatsuo-san was after, though; he was asking
> for a session-local disable, which I would think would be by far the more
> common use-case.  It's hard for me to see much of a reason to disable an
> index globally while still paying all the cost to maintain it.  Seems to
> me the typical work flow would be more like "disable index in a test
> session, try all your queries and see how well they work, if you conclude
> you don't need the index then drop it".  Or perhaps you could imagine that
> you want the index selected for use only in certain specific sessions ...
> but the above doesn't cater for that use-case either.
>
> Certainly, there's opportunities to improve the flexibility of the
> index-disable specifications in the plug-in Oleg and Teodor did.  But
> I think that that is the right basic approach: some sort of SET command,
> not anything that alters the catalogs.  We already have lots of
> infrastructure that could handle desires like having specific values
> active in only some sessions.

I searched for "indisvalid" and this thread came up.

I need this exact same thing as Tatsuo-san; a way to session-local
disable index(es),
so that plpgsql functions can avoid certain indexes when they are
created/planned.

How would one go about to implement such a SET command, without
altering the catalog?

I noticed the RelationReloadIndexInfo() which appears to be doing a
light-weight update of index changes,
including "relation->rd_index->indisvalid = index->indisvalid".

Or maybe one could call  index_set_state_flags(indexId,
INDEX_DROP_CLEAR_VALID) before the function is compiled/planned,
and then reset it using index_set_state_flags(indexId,
INDEX_CREATE_SET_VALID) after it has been compiled/planned?

If someone could give me guidance on where to start I would be grateful.
Even if I don't succeed implementing this, it's at least fun and
interesting to dig into the postgres source code to learn things.

Thanks

Joel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Joel Jacobson

On Wed, Mar 9, 2016 at 1:25 AM, Shulgin, Oleksandr
 wrote:
> Thank you for spending your time to run these :-)

n/p, it took like 30 seconds :-)

> I don't want to be asking for too much here, but is there a chance you could
> try the effects of the proposed patch on an offline copy of your database?

Yes, I think that should be possible.

> Do you envision or maybe have experienced problems with query plans
> referring to the columns that are near the top of the above hist_ratio
> report?  In other words: what are the practical implications for you with
> the values being duplicated rather badly throughout the histogram like in
> the example you shown?

I don't know much about the internals of query planner,
I just read the "57.1. Row Estimation Examples" to get a basic understanding.

If I understand it correctly, if the histogram_bounds contains a lot
of duplicated values,
then the row estimation will be inaccurate, which in turn will trick
the query planner
into a sub-optimal plan?

We've had some problems lately with the query planner, or actually we've always
had them but never noticed them nor cared about them, but now during peak times
we've had short periods where we haven't been able to fully cope up
with the traffic.

I tracked down the most self_time-consuming functions and quickly saw
how to optimize them.
Many of them where on the form:
SELECT .. FROM SomeBigTable WHERE Col1 = [some dynamic value] AND Col2
= [some constant value] AND Col3 = [some other constant value]
The number of rows matching the WHERE clause were very tiny, perfect
match for a partial index:
CREATE INDEX .. ON SomeBigTable USING btree (Col1) WHERE Col2 = [some
constant value] AND Col3 = [some other constant value];

Even though the new partial index matched the query perfectly, the
query planner didn't want to use it. Instead it continued to use some
other sub-optimal index.

The only way to force it to use the correct index was to use the
"+0"-trick which I recently learned from one of my colleagues:
SELECT .. FROM SomeBigTable WHERE Col1 = [some dynamic value] AND
Col2+0 = [some constant value] AND Col3+0 = [some other constant
value]
CREATE INDEX .. ON SomeBigTable USING btree (Col1) WHERE Col2+0 =
[some constant value] AND Col3+0 = [some other constant value];

By adding +0 to the columns, the query planner will as I understand it
be extremely motivated to use the correct index, as otherwise it would
have to do a seq scan on the entire big table, which would be very
costly.

I'm glad the trick worked, now the system is fast again.

We're still on 9.1, so maybe these problems will go away once we upgrade to 9.5.

I don't know if these problems I described can be fixed by your patch,
but I wanted to share this story since I know our systems (Trustly's
and Zalando's) are quite similar in design,
so maybe you have experienced something similar.

(Side note: My biggest wish would be some way to specify explicitly on
a per top-level function level a list of indexes the query planner is
allowed to consider or is NOT allowed to consider.)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-08 Thread Joel Jacobson

Hi Alex,

Thanks for excellent research.

I've ran your queries against Trustly's production database and I can
confirm your findings, the results are similar:

WITH ...
SELECT count(1),
   min(hist_ratio)::real,
   avg(hist_ratio)::real,
   max(hist_ratio)::real,
   stddev(hist_ratio)::real
  FROM stats2
 WHERE histogram_bounds IS NOT NULL;

-[ RECORD 1 ]
count  | 2814
min| 0.193548
avg| 0.927357
max| 1
stddev | 0.164134


WHERE distinct_hist < num_hist
-[ RECORD 1 ]
count  | 624
min| 0.193548
avg| 0.672407
max| 0.990099
stddev | 0.194901


WITH ..
SELECT schemaname ||'.'|| tablename ||'.'|| attname || (CASE inherited
WHEN TRUE THEN ' (inherited)' ELSE '' END) AS columnname,
   n_distinct, null_frac,
   num_mcv, most_common_vals, most_common_freqs,
   mcv_frac, (mcv_frac / (1 - null_frac))::real AS nonnull_mcv_frac,
   distinct_hist, num_hist, hist_ratio,
   histogram_bounds
  FROM stats2
 ORDER BY hist_ratio
 LIMIT 1;

 -[ RECORD 1 
]-+-
columnname| public.x.y
n_distinct| 103
null_frac | 0
num_mcv   | 10
most_common_vals  | {0,1,2,3,4,5,6,7,8,9}
most_common_freqs |
{0.4765,0.141733,0.1073,0.0830667,0.0559667,0.037,0.0251,0.0188,0.0141,0.0113667}
mcv_frac  | 0.971267
nonnull_mcv_frac  | 0.971267
distinct_hist | 18
num_hist  | 93
hist_ratio| 0.193548387096774
histogram_bounds  |
{10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,12,12,12,12,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,13,13,14,14,14,14,14,15,15,15,15,16,16,16,16,21,23,5074,5437,5830,6049,6496,7046,7784,14629,21285}



On Mon, Jan 18, 2016 at 4:46 PM, Shulgin, Oleksandr
<oleksandr.shul...@zalando.de> wrote:
> On Wed, Dec 2, 2015 at 10:20 AM, Shulgin, Oleksandr
> <oleksandr.shul...@zalando.de> wrote:
>>
>> On Tue, Dec 1, 2015 at 7:00 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>>
>>> "Shulgin, Oleksandr" <oleksandr.shul...@zalando.de> writes:
>>> > This post summarizes a few weeks of research of ANALYZE statistics
>>> > distribution on one of our bigger production databases with some
>>> > real-world
>>> > data and proposes a patch to rectify some of the oddities observed.
>>>
>>> Please add this to the 2016-01 commitfest ...
>>
>>
>> Added: https://commitfest.postgresql.org/8/434/
>
>
> It would be great if some folks could find a moment to run the queries I was
> showing on their data to confirm (or refute) my findings, or to contribute
> to the picture in general.
>
> As I was saying, the queries were designed in such a way that even
> unprivileged user can run them (the results will be limited to the stats
> data available to that user, obviously; and for custom-tailored statstarget
> one still needs superuser to join the pg_statistic table directly).  Also,
> on the scale of ~30k attribute statistics records, the queries take only a
> few seconds to finish.
>
> Cheers!
> --
> Alex
>



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Releasing in September

2016-01-20 Thread Joel Jacobson

On Wed, Jan 20, 2016 at 5:29 PM, Andres Freund  wrote:
> I think one thing we should work on, is being absolutely religious about
> requiring, say, 2 reviews for every nontrivial contribution.  We
> currently seem to have a significantly increased submission rate, and at
> the same time the number of reviews per patch has gone down
> substantially.  I think the "honor" system has failed in that regard.

Good point, totally agree.

So the project should try to think of new ideas on how to incentivise reviewing?

I have three ideas so far:

1. The way I see it, the honor system is based on being mentioned
in the commit message and the release notes.

Authors are always mentioned in the release notes,
but I see reviewers are mostly only mentioned in the commit messages.

Maybe more skilled developers would think it's cool to do reviewing
if they were "paid" by also being mentioned in the release notes?

At least some skilled people would probably be motivated by it,
in addition to the good feeling of doing something just because it's
fun or important.

2. Currently "Needs review" can mean a patch is in any of the phases
(Submission -> Usability -> Feature test -> Performance review ->
Coding review -> Architecture review -> Review review).
If you as a reviewer don't even know if the patch will be accepted and
committed by a committer,
there is a risk review-time will be wasted until the patch has been
accepted by the community.

I suggest we inject a new intermediate optional step "Needs Code
Review" to mean the feature has been discussed on pghackers
and there is a consensus among committers that they at least agree the
feature is something desired,
and that someone has at least reviewed the patch applies and the
feature works like expected.
And maybe renaming the "Needs review" step to "Needs Usability Review"
(or some other word to capture the Submission -> Usability -> Feature
test -> Performance review, phase before the Code review phase).

Then you as a reviewer would be confident the feature will get
accepted as long as the code is correct,
and the time you spent on code-reviewing will not be wasted.

3. For those who are partly motivated by money,
maybe this idea would work too to some extent:

Since Trustly's Bug Country program [1] is aimed at motivating people
to find bugs in HEAD hasn't produced a single report so far,
maybe we should extend it to also cover reviews of commits
marked as "Ready For Committer" that shed light on a problem
that could have caused data corruption,
preventing a bad commit from being committed.

If we suddenly get hundreds of new reviewers who find hundreds of bugs
in uncommitted code, then maybe we have to change the terms again,
but I doubt neither of those will happen.

If people think this is a good idea, I can discuss internally if we can change
the conditions of the bug country program accordingly.

[1] https://trustly.com/se/postgresql-bug-bounty/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

2016-01-07 Thread Joel Jacobson

epending on feedback/interest (but see 
> comments in the attached patch for some of those subjects).  For now I didn't 
> want to clog up the intertubes with too large a wall of text.
>
>
> === PROOF-OF-CONCEPT ===
>
> Please see the POC patch attached.  It adds two new GUCs.  After setting up 
> one or more hot standbys as per usual, simply add "causal_reads_timeout = 4s" 
> to the primary's postgresql.conf and restart.  Now, you can set "causal_reads 
> = on" in some/all sessions to get guaranteed causal consistency.  Expected 
> behaviour: the causal reads guarantee is maintained at all times, even when 
> you overwhelm, kill, crash, disconnect, restart, pause, add and remove 
> standbys, and the primary drops them from the set it waits for in a timely 
> fashion.  You can monitor the system with the replay_lag and 
> causal_reads_status in pg_stat_replication and some state transition LOG 
> messages on the primary.  (The patch also supports "synchronous_commit = 
> apply", but it's not clear how useful that is in practice, as already 
> discussed.)
>
> Lastly, a few notes about how this feature related to some other work:
>
> The current version of this patch has causal_reads as a feature separate from 
> synchronous_commit, from a user's point of view.  The thinking behind this is 
> that load balancing and data loss avoidance are separate concerns: 
> synchronous_commit deals with the latter, and causal_reads with the former.  
> That said, existing SyncRep machinery is obviously used (specifically  
> SyncRep queues, with a small modification, as a way to wait for apply 
> messages to arrive from standbys).  (An earlier prototype had causal reads as 
> a new level for synchronous_commit and associated states as new walsender 
> states above 'streaming'.  When contemplating how to combine this proposal 
> with the multiple-synchronous-standby patch, some colleagues and I came 
> around to the view that the concerns are separate.  The reason for wanting to 
> configure complicated quorum definitions is to control data loss risks and 
> has nothing to do with load balancing requirements, so we thought the 
> features should probably be separate.)
>
> The multiple-synchronous-servers patch[3] could be applied or not 
> independently of this feature as a result of that separation, as it doesn't 
> use synchronous_standby_names or indeed any kind of statically defined quorum.
>
> The standby WAL writer patch[4] would significantly improve walreceiver 
> performance and smoothness which would work very well with this proposal.
>
> Please let me know what you think!
>
> Thanks,
>
>
> [1] 
> http://www.postgresql.org/message-id/flat/CAEepm=1fqkivl4v-otphwsgw4af9hcogimrcw-ybtjipx9g...@mail.gmail.com
>
> [2] From http://queue.acm.org/detail.cfm?id=1466448
>
> "Causal consistency. If process A has communicated to process B that it has 
> updated a data item, a subsequent access by process B will return the updated 
> value, and a write is guaranteed to supersede the earlier write. Access by 
> process C that has no causal relationship to process A is subject to the 
> normal eventual consistency rules.
>
> Read-your-writes consistency. This is an important model where process A, 
> after it has updated a data item, always accesses the updated value and will 
> never see an older value. This is a special case of the causal consistency 
> model."
>
> [3] 
> http://www.postgresql.org/message-id/flat/caog9aphycpmtypaawfd3_v7svokbnecfivmrc1axhb40zbs...@mail.gmail.com
>
> [4] 
> http://www.postgresql.org/message-id/flat/CA+U5nMJifauXvVbx=v3UbYbHO3Jw2rdT4haL6CCooEDM5=4...@mail.gmail.com
>
> --
> Thomas Munro
> http://www.enterprisedb.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remaining 9.5 open items

2015-12-02 Thread Joel Jacobson

On Mon, Nov 30, 2015 at 8:43 PM, Tom Lane  wrote:
> * Finish multixact truncation rework
>
> We're not seriously going to push something this large into 9.5 at this
> point, are we?

I don't know all the details here, so my apologies if any of this is
incorrect/stupid/misinformed.

Given all the quite serious data corruption issues related to
multixact, wouldn't it be motivated to wait releasing 9.5.0 until this
much needed multixct truncation rework has been finished?

This is of course not an issue for users already on >=9.3, since the
problems started in 9.3 and those already on 9.3 and 9.4 already face
the risk.

But users who will consider upgrading from previous versions, will
have to ask themselves if all the new features in 9.5 are worth the
extra risk of data corruption due to the multixact issues.

We at Trustly are still running 9.1 and have been waiting for the
multixact problems to be fixed, which is why we didn't upgrade to 9.4,
and now when I read this I feel really sad we probably have to wait
for 9.6, unless we can accept some increase risk of data corruption.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remaining 9.5 open items

2015-12-02 Thread Joel Jacobson

On Wed, Dec 2, 2015 at 12:19 PM, Andres Freund  wrote:
> The significant changes are in 9.5.

Will multixact truncations be WAL logged in 9.5?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remaining 9.5 open items

2015-12-02 Thread Joel Jacobson

On Wed, Dec 2, 2015 at 12:36 PM, Andres Freund <and...@anarazel.de> wrote:
>
> On 2015-12-02 12:25:37 +0100, Joel Jacobson wrote:
> > On Wed, Dec 2, 2015 at 12:19 PM, Andres Freund <and...@anarazel.de> wrote:
> > > The significant changes are in 9.5.
> >
> > Will multixact truncations be WAL logged in 9.5?
>
> Yes.
>
> C.f. the release notes

Excellent! :-) Many many thanks for your efforts! By far the most
important change in 9.5. So much looking forward to upgrade.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rework the way multixact truncations work

2015-09-29 Thread Joel Jacobson

On Tue, Sep 22, 2015 at 3:20 PM, Andres Freund  wrote:
> What I've tested is the following:
> * continous burning of multis, both triggered via members and offsets
> * a standby keeping up when the primary is old
> * a standby keeping up when the primary is new
> * basebackups made while a new primary is under load
> * verified that we properly PANIC when a truncation record is replayed
>   in an old standby.

Are these test scripts available somewhere?
I understand they might be undocumented and perhaps tricky to set it all up,
but I would be very interested in them anyway,
think you could push them somewhere?

Thanks a lot for working on this!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Autonomous Transaction is back

2015-07-28 Thread Joel Jacobson

On Tue, Jul 28, 2015 at 12:56 AM, Josh Berkus j...@agliodbs.com wrote:

 Ah, ok.  The goal of the project is that the writer of X() *cannot*
 prevent Y() from writing its data (B1) and committing it.


 One of the primary use cases for ATX is audit triggers.  If a function
 writer could override ATX and prevent the audit triggers from
 committing, then that use case would be violated.

 Can you explain what use case you have where simply telling the staff
 if you use ATX without clearing it, you'll be fired is not sufficient?
  Possibly there's something we failed to account for in the unconference
 discussion.


I fully understand and agree why you want to prevent X() from letting Y()
commit, if the use-case is e.g. auditing.

I'll try to explain where I'm coming from by providing a bit of background
and context.

One of the greatest strengths with writing an entire application using only
sql and plpgsql functions,
is you don't have to worry about side effects when calling functions, since
you are always in total control to
rollback all your writes at your stack-depth and deeper down the stack, and
the caller can likewise be certain
that if the function it called threw an exception, all of its work is
rollbacked.

However, if e.g. making use of plperlu functions, it's possible those
functions might have written files to disk
or made network connections to the outside world,
i.e. it's possible they have caused side-effects that naturally cannot be
rollbacked by postgres.

Currently, in our codebase at Trustly, we have already quite a few plperlu
functions.
In each one of them, we need to think carefully about side-effects,
it's usually fine since most of them are immutable.

But if we had to worry about all plpgsql functions being able to write
things without consent from the caller,
then we would have a completely different situation with increased
complexity and risk for failures.

if you use ATX without clearing it, you'll be fired
We trust each other so we don't have that kind of problem.
But still, even when trusting your own and others code, I would say quite
often you make use of a function
for which there are small details you have forgotten about or never knew in
the first place,
and if that little detail would be something written in an ATX, that could
be a problem
if you the caller wouldn't want whatever was written to be written.

Don't get me wrong, ATX is something I would absoltely love, since then you
could
for instance in function doing password validation, update the
FailedLoginAttempts column
in an ATX and then still raise an exception to rollback the operation and
return an error to the client.

However, the need for ATXs is at least for us a special need you won't need
in most functions,
and since the risk and complexity increases with it, I would prefer if it
can be enaled/disabled
by default globally and explicitly enabled/disabled per function.

If the global default is disabled or if it's disabled for a specific
function, like for X() in your example,
and if it's enabled for Y(), then X() tries to call Y() you should get an
error even before Y() is executed.

That way we can still do auditing, since X() couldn't execute Y() since it
was declared as AT,
and that would be the same thing as if X() wouldn't have the line of code
in it that executs Y(),
something which X() is in power of as if X() calls Y() or not, is
ultimately X()'s decision.

If we declare entire functions as AT, then we only have to check before
executing the function
if AT is allowed or not in the context, determined by the global default or
if the caller function is defined AT or NOT AT.

Use cases:

1. X[AT] - Y[AT]
OK, since caller X() is declared AT i.e. allows AT in itself and in callees.

2. X[AT] - Y[NOT AT]
OK, since caller X() is declared AT i.e. allows AT in itself and in callees,
and since Y() is NOT AT, i.e. not making use of AT and not allowing it in
callees
that is not in violation with anything.

3: X[NOT AT] - Y[AT]
Invalid, since caller X() is declared NOT AT, i.e. disallows AT in itself
and in callees,
and since Y() is declared AT it cannot be executed since it's declared AT.

4: X[NOT AT] - Y[NOT AT]
OK, since caller X() is declared NOT AT, i.e. disallows AT in itself and in
callees,
and since Y() is also declared NOT AT, it can be executed.

Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Joel Jacobson

On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com wrote:

 On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
  Batch Jobs: large data-manipulation tasks which need to be broken up
  into segments, with each segment committing separately.  Example:
  updating 1 million records in batches of 1000.

 Autonomous transactions are not a good fit for this case; stored
 procedures are a better way to go for any scenario where you don't
 want be be in a snapshot (for example, suppose you want to change
 isolation level on the fly).


Hm, you mean we need real stored procedures in PostgreSQL and not just
functions?

If not, I think it would be sufficient to add Autonomous Transaction
support to the type of functions we already have in pg to allow writing a
batch job function which would commit after X numbers of modified rows,
instead of having to write a script in an external language such as Perl to
call the function in a while-loop and commit in between each function call.

However, we should also add a way for the caller to protect against an
Autonomous Transaction in a function called by the caller. Imagine if
you're the author of function X() and within X() make use of some other
function Y() which has been written by some other author, and within your
function X(), it's very important either all of your work or none at all
gets committed, then you need to make sure none of the changes you made
before calling Y() gets committed, and thus we need a way to prevent Y()
from starting and committing an Autonomous Transaction, otherwise we would
increase the risk and complexity of working with functions and plpgsql in
PostgreSQL as you would then need to be sure none of the functions you are
using within a function will start and commit an ATX.

Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Joel Jacobson

On Mon, Jul 27, 2015 at 11:49 PM, Josh Berkus j...@agliodbs.com wrote:

 Ah, you're missing how commits in ATX are expected to work.  Let me
 illustrate:

 X (
Data write A1
call Y(
 Start ATX
 Data write B1
 Commit ATX
)
Data write A2
Exception
 )

 In this workflow, B1 would be committed and persistent. Neither A1 nor
 A2 would be committed, or visible to other users.  Depending on what
 implementation we end up with, A1 might not even be visible to Y().

 So that solves your use case without any need to block ATXs in called
 functions.  However, it leads to some interesting cases involving
 self-deadlocks; see the original post on this thread.


I don't follow. In your example above, if I'm X(), how do I ensure Y()
won't have committed anyting at all when I later at Exception decide to
rollback everything from Data write A1 to Data write A2 including any
writes made by Y() (in the example Data write B1)?

I understand the Exception will take care of rollbacking my (X's) writes,
but that's not sufficient if you want to make sure you rollback
*everything*, including any writes made by functions you call.

Right now, when writing a function, if you raise an exception, you can be
sure all writes you have made will be rollbacked, but your caller function
might caught the exception and decide to carry on and commit work made
before your function was called, but at least you can be confident your
writes won't be committed as long as you don't caught the exception you
raised in your own function. If I understand it correctly, that would
change with the addition of Autonomous Transaction, unless given a way to
prevent a function you call from starting and commiting a Autonomous
Transaction. Wrong? If so, then please show how to prevent Y() from
commiting the Data write B1 in your example, I don't get it.

Re: [HACKERS] pg_stat_*_columns?

2015-07-06 Thread Joel Jacobson

On Mon, Jun 29, 2015 at 11:14 PM, Jim Nasby jim.na...@bluetreble.com
wrote:

 What might be interesting is setting things up so the collector simply
 inserted into history tables every X seconds and then had a separate
 process to prune that data. The big problem with that is I see no way for
 that to easily allow access to real-time data (which is certainly necessary
 sometimes)


I think the idea sounds promising. If near real-time data is required, we
could just update once every second, which should be often enough for
everybody.

Each backend process could then simply INSERT the stats for each txn that
committed/rollbacked into an UNLOGGED table, and then the collector would
do one single UPDATE of the collector stats based on the aggregate of the
rows inserted since the previous update a second ago and then delete the
processed rows (naturally in one operation, using DELETE FROM .. RETURNING
*).

That way we could get rid of the legacy communication protocol between the
backends and the collector and instead rely on unlogged tables for the
submission of data from the backends to the collector.

INSERTing 100 000 rows to an unlogged table takes 70 ms on my laptop, so
should be fast enough to handle the 10s of thousands of updates per second
we need to handle.

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Joel Jacobson

On Tue, Jun 16, 2015 at 4:47 AM, Jim Nasby jim.na...@bluetreble.com wrote:

 Magnus idea was to first optimize the collector to make it less of a
 problem to collect more data. Sounds like a good thing to do, but maybe
 more data in it wouldn't be a problem as long as you don't read too
 often from it?


 The stats collector is a known problem under certain circumstances, so
 improving it would probably be a good thing. The first thing that comes to
 mind is splitting it into more files.


Is there any chance the project would accept a patch which adds the
pg_stat_*_columns-feature without first optimizing the collector? I guess
it primarily depends on how much of the new code that would need to be
rewritten, if the collector is optimized/rewritten in the future?

(I would be interested in sponsoring the work, if anyone is interested.)

Re: [HACKERS] pg_stat_*_columns?

2015-06-08 Thread Joel Jacobson

So I've heard from Magnus Hagander today IRL at our Stockholm PostgreSQL
User Group meeting where we discussed this idea. He told me the overhead in
the statistics collector is mainly when reading from it, not that much when
writing to it.
Magnus idea was to first optimize the collector to make it less of a
problem to collect more data. Sounds like a good thing to do, but maybe
more data in it wouldn't be a problem as long as you don't read too often
from it?

On Mon 8 Jun 2015 at 18:48 Robert Haas robertmh...@gmail.com wrote:

 On Fri, Jun 5, 2015 at 7:51 AM, Joel Jacobson j...@trustly.com wrote:
  Would others find it useful to see per column statistics about accesses
 to
  specific columns?

 A probably serious and maybe not entirely obvious problem with this is
 that it would increase the amount of statistical information we keep
 by a pretty large multiple.  How many columns do you have in a typical
 table?  20-30?  That's a lot of data for a statistics collector that,
 regrettably, is already overloaded.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

[HACKERS] pg_stat_*_columns?

2015-06-05 Thread Joel Jacobson

Would others find it useful to see per column statistics about accesses to
specific columns?

Two possible use-cases: (maybe there are more?)

1. I think it would be helpful for DBAs to better understand their own
system.
Finding unused *tables* is today easy thanks to pg_stat_*_tables, but
knowing if something is accessing a *column* or not is not easy.
In my case all our database access is via sprocs, so I can just grep the
source code for the column name to see if something is using it, but most
DBAs probably don't have that luxury.

2. It could also be useful for audit trailing, if you want to know what a
query did, i.e. what tables/columns were accessed in the txn.

Here is an idea of a very simple audit trailing system that would probably
fulfill my own needs:

Imagine if we had pg_stat_xact_user_columns and for each committed txn, do
an insert to an unlogged table with the same structure as
pg_stat_xact_user_columns
with the addition of session_user and timestamp for the txn.

I would much rather have audit trailing in a nice table than in a text
file. Maybe a foreign data wrapper could be used to ship the audit trail
data to some other external append-only pg-database, if the purpose of the
audit trailing is to prevent an evil DBA from doing evil things. But for
others it might be sufficient to do audit trailing to the same database,
for convenience purposes.

In summary:

Some users might only be interested in the statistics and mostly use
pg_stat_user_columns.
Other others might also be interested in what happened in a specific txn
and use pg_stat_xact_user_columns.
Yet some other users might be interested in audit trailing and want to log
pg_stat_xact_user_columns for each txn. Probably very expensive performance
wise, but might make sense if you have extremely sensitive data and audit
trailing is more important than performance.

Thoughts?

Re: [HACKERS] pg_xlog - pg_xjournal?

2015-06-02 Thread Joel Jacobson

On Tue, Jun 2, 2015 at 6:35 AM, Michael Nolan htf...@gmail.com wrote:
 Why not take a simpler approach and create a zero length file in directories
 that should not be fiddled with by non-experts using a file name something
 like DO.NOT.DELETE.THESE.FILES?

Then the smart sysadmin will say but I didn't delete any *files*, I
just deleted a log *directory* :-)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [PATCH] Fix documentation bug in how to calculate the quasi-unique pg_log session_id

2015-06-02 Thread Joel Jacobson

Fix documentation bug in how to calculate the quasi-unique pg_log session_id

While the code is truncating the backend_start time, the query example in
the documentation is rouding.
Fix by doing the same thing in the documentation as in, i.e. truncating the
backend_start.

elog.c:
snprintf(strfbuf, sizeof(strfbuf) - 1, %lx.%x,
(long) (MyStartTime), MyProcPid);

Example:

2015-06-02 16:50:29.045
CEST,joel,gluepay,49262,93.158.127.42:41093,556d7a0a.c06e,5062,idle
in transaction,2015-06-02 11:40:26
CEST,17/19970778,0,LOG,0,statement: select * from foo where bar =
'baz';exec_simple_query, postgres.c:876,psql

select backend_start, procpid, to_hex(EXTRACT(EPOCH FROM
backend_start)::integer) || '.' || to_hex(procpid) as invalid_session_id
from pg_stat_activity_log where procpid = 49262;
 backend_start | procpid | invalid_session_id
---+-+
 2015-06-02 11:40:26.516373+02 |   49262 | 556d7a0b.c06e

select backend_start, procpid, to_hex(trunc(EXTRACT(EPOCH FROM
backend_start))::integer) || '.' || to_hex(procpid) as valid_session_id
from pg_stat_activity_log where procpid = 49262;
 backend_start | procpid | valid_session_id
---+-+--
 2015-06-02 11:40:26.516373+02 |   49262 | 556d7a0a.c06e

---

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5549b7d..1da7dfb 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4707,7 +4707,7 @@ local0.*/var/log/postgresql
  of printing those items.  For example, to generate the session
  identifier from literalpg_stat_activity/, use this query:
 programlisting
-SELECT to_hex(EXTRACT(EPOCH FROM backend_start)::integer) || '.' ||
+SELECT to_hex(trunc(EXTRACT(EPOCH FROM backend_start))::integer) || '.' ||
to_hex(pid)
 FROM pg_stat_activity;
 /programlisting

Re: [HACKERS] pg_xlog - pg_xjournal?

2015-06-01 Thread Joel Jacobson

On Mon, Jun 1, 2015 at 10:17 PM, Josh Berkus j...@agliodbs.com wrote:
 If we symlink pg_xlog, then it will still trip up anyone who does rm
 -rf *log*/* or deletes files directly from inside the directory, both
 of which I've seen.  Deleting the directory itself is comparatively rare
 in my experience.  So for this do to any good, we'd have to plan to
 (eventally, at least) get rid of the symlinks.  Do links work the same
 on Windows, btw?

Hm, I don't agree the symlink version wouldn't do *any* good.

I think it's a good step since it solves the rm -rf pg_xlog problem,
and it solves the problem if a sysadmin uses /usr/bin/du to
find large directories suitable for deletion, as it won't show
symlinks.

 Also ... if we were to rename it, it should be pg_wal or pg_xact.
 Please let's not add yet another term for the WAL.

+1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_xlog - pg_xjournal?

2015-05-31 Thread Joel Jacobson

On Sun, May 31, 2015 at 7:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hm.  I think the impact on third-party backup tools would be rather bad,
 but there's a simple modification of the idea that might fix that:
 just always create pg_xlog as a symlink to pg_xjournal during initdb.
 Anybody who blindly removes pg_xlog won't have done anything
 irreversible.  We could deprecate pg_xlog and stop creating the symlink
 after a few releases, once third-party tools have had a reasonable
 amount of time to adjust.

I like the solution. Simple and effective.
+1

 In the end though, this is a lot of thrashing for a problem that
 only comes up rarely ...

It happens often enough for the problem to be the first mentioned
use-case of pg_resetxlog at Stack Overflow:

pg_resetxlog is a tool of last resort for getting your database
running again after:
1. You deleted files you shouldn't have from pg_xlog;

(http://stackoverflow.com/questions/12897429/what-does-pg-resetxlog-do-and-how-does-it-work)

Preventing failure in the case of faults is of course one of the
primary objectives of any database.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_xlog - pg_xjournal?

2015-05-31 Thread Joel Jacobson

While anyone who is familiar with postgres would never do something as
stupid as to delete pg_xlog,
according to Google, there appears to be a fair amount of end-users out
there having made the irrevocable mistake of deleting the precious
directory,
a decision made on the assumption that since it has *log* in the name so
it must be unimportant (http://stackoverflow
.com/questions/12897429/what-does-pg-resetxlog-do-and-how-does-it-work).

If we could turn back time, would we have picked pg_xlog as the most
optimal name for this important directory, or would we have come up with a
more user-friendly name?

Personally, I have never had any problems with pg_xlog, but I realize there
are quite a few unlucky new users who end up in trouble.

My suggestion is to use pg_xjournal instead of pg_xlog when new users
create a new data directory using initdb, and allow for both directories to
exist (exclusive or, i.e. either one or the other, but not both). That way
we don't complicate the life for any existing users, all their tools will
continue to work who rely on pg_xlog to be named pg_xlog, but only force
new users to do a bit of googling when they can't use whatever tool that
can't find pg_xlog. When they find out it's an important directory, they
can simply create a symlink and their old not yet updated tool will work
again.

Thoughts?

Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-04-27 Thread Joel Jacobson

Looks good Pavel!

May I just suggest you add the default case
to src/test/regress/sql/plpgsql.sql
and src/test/regress/expected/plpgsql.out, to make it easier for the
reviewer to compare the difference between what happens in the default
case, when not using the raise-syntax and not using the GUCs?

Suggested addition to the beginning of src/test/regress/sql/plpgsql.sql:
+do $$
+begin
+  raise notice 'hello';
+end;
+$$;
+
+do $$
+begin
+  raise exception 'hello';
+end;
+$$;

Many thanks for this patch! I will pray to the PL/pgSQL God it will be
accepted. :)

Best regards,

Joel


On Sun, Apr 26, 2015 at 9:19 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hi

 I reduced this patch, little bit cleaned - now it is based on plpgsql GUC
 display_context_min_messages - like client_min_messages, log_min_messages.

 Documentation added.

 Regards

 Pavel

 2015-04-25 22:23 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi

 2015-04-24 19:16 GMT+02:00 Joel Jacobson j...@trustly.com:

 On Fri, Apr 24, 2015 at 6:07 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Example:
 
  context_messages = -warning, -error, +notice
 
 
  I prefer your first proposal - and there is a precedent for plpgsql -
  plpgsql_extra_checks
 
  It is clean for anybody. +-identifiers looks like horrible httpd
 config. :)

 I have to agree on that :) Just thought this is the best we can do if
 we want to reduce the number of GUCs to a minimum.


 I played with some prototype and I am thinking so we need only one GUC

 plpgsql.display_context_messages = 'none'; -- compatible with current
 plpgsql.display_context_messages = 'all';
 plpgsql.display_context_messages = 'exception, log'; -- what I prefer

 I implemented [ (WITH|WITHOUT) CONTEXT ] clause for RAISE statement

 RAISE NOTICE WITH CONTEXT 'some message';
 RAISE NOTICE WITH CONTEXT USING message = 'some message';
 RAISE EXCEPTION WITHOUT CONTEXT 'other message';

 The patch is very small with full functionality (without documentation) -
 I am thinking so it can work. This patch is back compatible - and allow to
 change default behave simply.

 plpgsql.display_context_messages can be simplified to some like
 plpgsql.display_context_min_messages

 What do you think about it?

 Regards

 Pavel

Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-04-27 Thread Joel Jacobson

On Mon, Apr 27, 2015 at 6:14 PM, Marko Tiikkaja ma...@joh.to wrote:

 That sounds weird.  log_min_messages are the messages sent to the log;
 client_min_messages are sent to the client.  context_min_messages are not
 sent to a context, whatever that would mean.


Good point.

I think it can't be any clearer than the proposed
plpgsql.display_context_min_messages

Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-04-25 Thread Joel Jacobson

+1


On Sat, Apr 25, 2015 at 10:23 PM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hi

 2015-04-24 19:16 GMT+02:00 Joel Jacobson j...@trustly.com:

 On Fri, Apr 24, 2015 at 6:07 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Example:
 
  context_messages = -warning, -error, +notice
 
 
  I prefer your first proposal - and there is a precedent for plpgsql -
  plpgsql_extra_checks
 
  It is clean for anybody. +-identifiers looks like horrible httpd
 config. :)

 I have to agree on that :) Just thought this is the best we can do if
 we want to reduce the number of GUCs to a minimum.


 I played with some prototype and I am thinking so we need only one GUC

 plpgsql.display_context_messages = 'none'; -- compatible with current
 plpgsql.display_context_messages = 'all';
 plpgsql.display_context_messages = 'exception, log'; -- what I prefer

 I implemented [ (WITH|WITHOUT) CONTEXT ] clause for RAISE statement

 RAISE NOTICE WITH CONTEXT 'some message';
 RAISE NOTICE WITH CONTEXT USING message = 'some message';
 RAISE EXCEPTION WITHOUT CONTEXT 'other message';

 The patch is very small with full functionality (without documentation) -
 I am thinking so it can work. This patch is back compatible - and allow to
 change default behave simply.

 plpgsql.display_context_messages can be simplified to some like
 plpgsql.display_context_min_messages

 What do you think about it?

 Regards

 Pavel

Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-04-24 Thread Joel Jacobson

Entering the discussion because this is a huge pain for me in my daily
work as well.

This is not a reply to any specific post in this thread, but my first
message in the thread.

I see a great value in providing both a GUC and a new RAISE syntax.
The different benefits of the two are maybe obvious, but perhaps worth
pointing out:
GUC: Good because you don't have to change any existing code.
RAISE syntax: Good because you can control exactly what message should
be emitted or not be emitted at that line of code.

I think preserving backwards compatibility is very important.
Not changing the default is not a problem for me, as long as it can be
overridden.

Whatever the default behaviour is, I think the need expressed by all
users in this thread boils down to any of these two sentences:

I want CONTEXT to be (DISPLAYED|SUPPRESSED) for (ALL|ONLY THIS LINE)
RAISE (NOTICE|WARNING|ERROR)
OR
I don't want to change the default current behaviour of CONTEXT

So we basically need a boolean setting value, where:
NULL means the default behaviour
TRUE means DISPLAY CONTEXT
FALSE means SUPPRESS CONTEXT

And the (ALL|ONLY THIS) part translates into using,
* a GUC to change behaviour for ALL lines of code,
* or using the RAISE syntax to change the behaviour of ONLY THIS line of code.

And then we have the different message levels, for which CONTEXT is
sometimes desirable in some situations:
* The RAISE syntax allows controlling any message level in a natural
way, as the message level is part of the syntax.
* Allowing the same control using GUC would mean the message level
would need to be part of the GUC key name, which means either add
multiple GUCs, one for each message level, or only allow controlling
the most important one and ignore the possibly need to control the
other message levels.

If it would be possible to somehow combine multiple message levels in
the same GUC, that would solve the latter problem.

We already have comma separated values for many GUCs, so maybe we
could use that approach here as well.

It looks like adding these two GUCs would meet the demands of all users:

suppress_context_messages (enum)
display_context_messages (enum)

This would allow doing something crazy as:

suppress_context_messages = warning,error
display_context_messages = notice


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-04-24 Thread Joel Jacobson

On Fri, Apr 24, 2015 at 1:16 PM, Robert Haas robertmh...@gmail.com wrote:
 This would allow doing something crazy as:

 suppress_context_messages = warning,error
 display_context_messages = notice

 This is a very flexible proposal, but it's a tremendous amount of
 machinery for what's really a very minor issue.  If we added two GUCs
 for every comparably important issue, we'd have about 40,000 of them.

I agree. The one-dimensional GUC syntax is not well suited for
multi-dimensional config settings. And that's a good thing mostly I
think. It would be a nightmare if the config file values could in JSON
format, it's good they are simple.

But I'm thinking maybe we could improve the config file syntax for the
general case when you have multiple things you want to control, in
this case the message levels, and for each such thing, you want to
turn something on/off, in this case the CONTEXT. Maybe we could simply
use plus + and minus - to mean on and off?

Example:

context_messages = -warning, -error, +notice


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-04-24 Thread Joel Jacobson

On Fri, Apr 24, 2015 at 6:07 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Example:

 context_messages = -warning, -error, +notice


 I prefer your first proposal - and there is a precedent for plpgsql -
 plpgsql_extra_checks

 It is clean for anybody. +-identifiers looks like horrible httpd config. :)

I have to agree on that :) Just thought this is the best we can do if
we want to reduce the number of GUCs to a minimum.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-06 Thread Joel Jacobson

On Sat, Sep 6, 2014 at 6:59 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 People can prepare a simple functions like you did:

 ...

 CREATE OR REPLACE FUNCTION user_list ()
 RETURNS SETOF id AS $$
 BEGIN
   RETURN QUERY SELECT id FROM user WHERE .. some = $1
 END;
 $$ LANGUAGE plpgsql;

 CREATE OR REPLACE FUNCTION update_user(int)
 RETURNS void AS $$
 BEGIN
   UPDATE user SET .. WHERE id = $1
 END;
 $$ LANGUAGE;

 And then  use it in mass operations:

 BEGIN
   FOR company IN SELECT * FROM company_list()
   LOOP
 FOR id IN SELECT * FROM user_list(company)
 LOOP
   update_user(id);
 END LOOP;

 Or use it in application same style.

 It is safe .. sure, and I accept it. But It is terrible slow.

The above is horrible and ugly. That's not how I write code.
Only for top-level functions, i.e. API-functions, is it motivated to
encapsulate even simple queries like that, but *never* in other
PL-functions, as that doesn't fulfil any purpose, putting simple
queries inside functions only make it less obvious what the code does
where you have a function call instead of a SQL-query.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] plpgsql defensive mode

2014-09-06 Thread Joel Jacobson

On Sat, Sep 6, 2014 at 7:51 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hi

 There was a long discussion about future of PLpgSQL.

 I accept so Joel, Marko has good ideas based on probably strong experience
 from their domain. I can't accept their implementation and proposals as
 default for PLpgSQL now and in future. They try to mix wine and vodka
 concepts, and it has no good ends.

 I understand to their proposal as restrictive subset of PLpgSQL.
 restrictive subset is not good name. We can implement some features
 without impact on syntax as block on function level. Marko likes defensive
 programming, so we can use a name defensive_mode

 In this mode .. all DML statements should to return EXACTLY ONE row with
 exception CURSORs and FOR LOOP cycle where more rows is expected. But in
 this case we can raise a exception NODATA if there is no row.

 In this mode late IO casting will be disabled. We can disallow implicit
 casting too.

 We can talk what plpgsql warnings in this mode will be critical.

 This mode can be enabled for function by option

 #option defensive

 or for all plpgsql functions by GUC

 SET plpgsql.defensive = on

 In this moment I don't see a necessity to change or enhance syntax.

 I have no plan to use it as default, but anybody can do it simply by change
 one GUC in Postgres configuration. Defensive mode (strict defensive mode) is
 good strategy, but it is not for all.

+1 -- this would mean my original proposal would be possible, i.e. no
syntax change at all. But to solve this the proper way, and avoid a
long list of options/settings, it would be really nice being able to
define a new language, like pltrustly, which sets the mix of
settings which are relevant for us, where this would be a setting
which is apparently not desirable for everybody.

I also hope all the other things listed in the wiki* are possible to
fix in PL/pgSQL 2 (or even better in PL/pgSQL, if possible).
Pavel, do you have any input on the other items on the wiki? Most of
them are problems which really ought to raise errors.

*) https://wiki.postgresql.org/wiki/Improving_PL/PgSQL_(September_2014)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] plpgsql defensive mode

2014-09-06 Thread Joel Jacobson

 On 6 sep 2014, at 16:32, Marko Tiikkaja ma...@joh.to wrote:

 How do you run queries which affect more than one row in this mode? Because 
 that's crucial as well.  We want something we can run 100% of our code on, 
 but with a slightly more convenient syntax than PL/PgSQL provides right when 
 coding defensively in the cases where exactly one row should be affected.

If we would have ORDER BY also for UPDATE/DELETE then one could just
order by something arbitrary to express multiple or zero rows are OK,
even if not being interested in the order.



 .marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK

2014-09-05 Thread Joel Jacobson

 On 3 sep 2014, at 16:20, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Sep 1, 2014 at 5:08 AM, Joel Jacobson j...@trustly.com wrote:
 On Sat, Jul 26, 2014 at 8:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Basically my point is that this just seems like inventing another way to
 do what one can already do with RAISE, and it doesn't have much redeeming
 social value to justify the cognitive load of inventing another construct.

 The main difference is with RAISE EXCEPTION 'OK'; you cannot know if
 it was *your* line of code which throw the 'OK'-exception or if it
 came from some other function which was called in the block of code.

 The real problem here is that if you're using PL/pgsql exceptions for
 control-flow reasons, you are taking a huge performance hit for that
 notational convenience.  I do agree that the syntax of PL/pgsql is
 clunky and maybe we should fix that anyway, but I honestly can't
 imagine too many people actually wanting to do this once they realize
 what it does to the run time of their procedure (and in some cases,
 the XID-consumption rate of their database).

Exceptions in plpgsql is indeed an exception itself :-)

There are a few use cases when they are crucial though, I would say I
use this code pattern in 0.1% of all functions, but still, when I need
this, it gets ugly.

Glad to hear you might consider the idea of fixing this.


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On Wed, Sep 3, 2014 at 11:19 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
 SELECT[1]   - select exactly one row, anything else raises error
 SELECT[0:1]   - select zero or one rows, anything else raises error
 SELECT[1:] - select one or more rows

 plain SELECT is equivalent to SELECT[0:]

 same syntax could be used for enforcing sane affected row counts
 for INSERT and DELETE

+1 for a new fresh creative idea! My mind was stuck in a looking for
keywords state.

The only suggestion I think is slightly better is the STRICT UPDATE,
but only if I'm right assuming the one row use-case is much more
common than the zero or one rows and one or more rows use-cases.

If all use-cases are equally important to support in a nice way,
then the suggested syntax is brilliant, as it supports all of them.

Bonus points for the extremely condensed syntax.

 2. Substitute for EXECUTE with string manipulation
 

 using backticks `` for value/command substitution in SQL as an alternative
 to EXECUTE string

+1 for being backwards compatible, but I have no strong opinion.

 3. A way to tell pl/pggsql not to cache plans fro normal queries
 ---

 OR we could do it in SQL-ish way using like this:

 SELECT
 ...
 USING FRESH PLAN;

+1 for clean syntax


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On Thu, Sep 4, 2014 at 9:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 we have totally different opinion what is good

Can you elaborate on that?

Your ASSERT CHECK ROWCOUNT = 1; is lengthly, which is why I don't like it.
Imagine if having to type
my $var === 'foo';
instead of
my $var = 'foo';
on every single line of could where you want to assign a variable,
that would just be ridiculous.

If you have a typical CRUD application and decide to do *all* data
operations via PL functions,
which is a design pattern advocated by many*, then you will end up
with a lot of very simple
short PL functions, to do things like update_worker_status(),
set_notification_response(), etc,
in which you always pass something which is a primary key in some
table, and want to update
exactly one row. Having to type 27 extra characters for every single
line of code, instead of the
suggested 3 extra characters, is a big difference, for anyone who
designs a CRUD application
which relies on the usage of PL functions.

For me, it would be useful to understand if you are developing CRUD
applications,
or if your main usage for PL/pgSQL functions are other things?

If the latter, then maybe that could explain why you don't feel strongly about
simplifying and condensing the syntax for the most common use-case of them all.

*) but there are probably equally who prefer to handle business logics
outside the database


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On Thu, Sep 4, 2014 at 10:42 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I am strong in opinion so PLpgSQL is targeted primary for implementation
 business logic in server side. CRUD  is only one from possible use cases -
 and without any special importance to others.

Just curious, what kind of business logic do you write, where one row
is not more special to you than two rows or five rows?

Think about for a second what PL functions are able to return.

Compare,
RETURNS RECORD
vs
RETURNS SETOF RECORD

When you return from a function, you get exactly 1 row, unless
you explicitly use the SETOF RECORD syntax, where 0...n rows are possible.

To add to that, we have the STRICT keyword, which also recognize the
fact 1 row is special.

So, we already *have* special handling for the 1 row case in many areas.

I cannot see how you can fail to agree it would be a good thing to
make it simple also for UPDATE/DELETE/INSERT.



On Thu, Sep 4, 2014 at 10:42 AM, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-09-04 10:06 GMT+02:00 Joel Jacobson j...@trustly.com:

 On Thu, Sep 4, 2014 at 9:39 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  we have totally different opinion what is good

 Can you elaborate on that?


 I would to elaborate on enhancing plpgsql - but my primary target is
 readability without necessity of special special statements, types.

 I am strong against to create some shortcuts for relative too special use
 case.



 Your ASSERT CHECK ROWCOUNT = 1; is lengthly, which is why I don't like
 it.
 Imagine if having to type
 my $var === 'foo';
 instead of
 my $var = 'foo';
 on every single line of could where you want to assign a variable,
 that would just be ridiculous.

 If you have a typical CRUD application and decide to do *all* data
 operations via PL functions,
 which is a design pattern advocated by many*, then you will end up
 with a lot of very simple
 short PL functions, to do things like update_worker_status(),
 set_notification_response(), etc,
 in which you always pass something which is a primary key in some
 table, and want to update
 exactly one row. Having to type 27 extra characters for every single
 line of code, instead of the
 suggested 3 extra characters, is a big difference, for anyone who
 designs a CRUD application
 which relies on the usage of PL functions.


 Is not better to design special PL for this usage? I understand to your
 motivation, but it is not acceptable for me in plpgsql.

 Ten years ago, we had to solve similar problem - and we designed
 metalanguage that was translated to plpgsql.



 For me, it would be useful to understand if you are developing CRUD
 applications,
 or if your main usage for PL/pgSQL functions are other things?


 I am strong in opinion so PLpgSQL is targeted primary for implementation
 business logic in server side. CRUD  is only one from possible use cases -
 and without any special importance to others.



 If the latter, then maybe that could explain why you don't feel strongly
 about
 simplifying and condensing the syntax for the most common use-case of them
 all.


 I don't agree so what you propose, it is common use case. And I don't think
 so it can be used in synergy with current design



 *) but there are probably equally who prefer to handle business logics
 outside the database


 It is maybe main difference between me and you. Usually I don't write CRUD
 applications, and I am not sure if plpgsql is good for CRUD.

 Mainly I would not to optimize plpgsql primary for CRUD.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On Thu, Sep 4, 2014 at 11:07 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 it is different semantic - returns composite or set of composites  --- it is
 not row or rows

The point was, RETURNS returns 1 while RETURNS SETOF returns 0 .. n.

 Actually BL is usually processed oriented, so PL functions coverages changes
 in data, and for queries you use SELECT

OK, so you SELECT directly from tables?
And in the PLs you change a lot of rows in the same txn?

 Returning SET from function is less often - and usually it is not in
 preferred patterns because you can very simple block a optimizer.

Not if you do all access, also SELECT via PLs, then you might want to returns
lists of things based on some input.

But that's a different topic. What I wanted to examplify is the fact
we *already*
have a lot of syntax which handles the 1 row case in a special way.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On 4 sep 2014, at 11:42, Pavel Stehule pavel.steh...@gmail.com wrote:

2014-09-04 11:22 GMT+02:00 Joel Jacobson j...@trustly.com:

 The point was, RETURNS returns 1 while RETURNS SETOF returns 0 .. n.


no RETURNS return VALUE (it is not a row) .. and in combination with
SELECT - value will be a row. RETURNS SETOF returns rows


I intentionally excluded the data type of what is returned.
1 VALUE vs 0...n VALUES
Do you still fail to see the point 1 VALUE is special in the context of
what a function returns?

Re: [HACKERS] PL/pgSQL 2

2014-09-04 Thread Joel Jacobson

 On 4 sep 2014, at 15:09, Shaun Thomas stho...@optionshouse.com wrote:

 On 09/01/2014 04:04 AM, Joel Jacobson wrote:

 + Make UPDATE/INSERT/DELETE throw error if they didnt' modify exactly 1
 row, as that's the most common use-case, and provide alternative syntax
 to modify multiple or zero rows.

 What? No. The whole point of SQL is that it's set-based and can modify 
 multiple rows at once. Hobbling it specifically for functions seems 
 fundamentally flawed. Especially for what we purport to be a core PostgreSQL 
 language.

I've dropped that suggestion and is instead in favour of a keyword
like STRICT, ONE ROW, SINGLETON or [1] like suggested by others. Any
keyword or syntax will do fine, but I would prefer STRICT.


 + Change all warnings into errors

 I... what? I could see coming up with a better exception handling mechanism 
 for escalating messages. But you're talking about taking a core element of 
 PostgreSQL (warnings) and simply ripping them out so plpgsql2 loses even that 
 small functionality.

You misunderstood, I meant plpgsql warnings, that you currently can
turn into errors by setting things in the config file. Such as
shadowing of variables.


 I'm sure you've put a lot of thought into this, but you're not the only 
 person using plpgsql or any, however ambitious, potential replacement.

 --
 Shaun Thomas
 OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
 312-676-8870
 stho...@optionshouse.com

 __

 See http://www.peak6.com/email_disclaimer/ for terms and conditions related 
 to this email


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On 4 sep 2014, at 15:32, Pavel Stehule pavel.steh...@gmail.com wrote:




2014-09-04 15:24 GMT+02:00 Jan Wieck j...@wi3ck.info:

 On 09/04/2014 01:14 AM, Pavel Stehule wrote:

 2014-09-03 23:19 GMT+02:00 Hannu Krosing ha...@2ndquadrant.com
 A more SQL-ish way of doing the same could probably be called COMMAND
 CONSTRAINTS
 and look something like this

 SELECT
 ...
 CHECK (ROWCOUNT BETWEEN 0 AND 1);


 It is very near to my proposed ASSERT


 Only if the ASSERT syntax would become part of the original statement, it
 is supposed to check. In Hannu's command constraint example above, the
 statement that causes the error, and thus will be logged and become
 identified by the error message, is the actual SELECT (or other DML
 statement).


this is valid argument.

On second hand, I proposed a ASSERT that was not based on expressions only.
There is not a technical issue to write assert with knowledge of related
statement.



 I think I like the COMMAND CONSTRAINT the best so far.


I not, because when it will not be part of SQL, than parser in plpgsql will
be more complex. You have to inject SELECT, UPDATE, INSERT, DELETE


This is what I suspected. You are against the best syntax because they are
more complex to implement. I think that's coming into the discussion from
the wrong direction. First agree on the best syntax, then worry about the
implementation.

I also understand the syntax changes will mean a lot of trouble for your
plpgsql_check_function() project, but that cannot hold us back, we must aim
for the best possible syntax with plpgsql2.
Your work with plpgsql_check_function() btw saved me hundreds of hours of
work, when we upgraded from 8.4 a few years ago, many thanks Pavel!


Pavel




 Regards,
 Jan

 --
 Jan Wieck
 Senior Software Engineer
 http://slony.info

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

 On 4 sep 2014, at 16:45, Hannu Krosing ha...@2ndquadrant.com wrote:

 When looking from the other end of the problem, we are
 using SELECT/INSERT/UPDATE/DELETE *SET statements* in pl/pgsql
 when we really want scalars.

 My understanding is that one main drivers of starting this thread
 was wanting also guaranteed SCALAR versions of these.

 And wanting them in a way that is easy to use.

+1

Thank you! I have been trying to explain this in multiple cryptic ways
but failed. You just nailed it! That's *exactly* what I mean!

Thanks for clarifying!



 Cheers


 --
 Hannu Krosing
 PostgreSQL Consultant
 Performance, Scalability and High Availability
 2ndQuadrant Nordic OÜ


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

 On 4 sep 2014, at 17:18, Pavel Stehule pavel.steh...@gmail.com wrote:

 You just need a ISAM API for Postgres, That is all.

Now you are being ironic, and I would prefer to keep the discussion on
a serious level. You know that's not applicable in my case, you know
what I do for work and what kind of system we already have.

I *love* plpgsql and our development method. I just want it to get
slightly more convenient and secure.

When you suggest ISAM, that's like saying demolish your house and
build a new one when all I want is to make small but important
changes to what I already do as a professional on a daily basis.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgcrypto: PGP signatures

2014-09-04 Thread Joel Jacobson

Marko, et al,

This is a review of the pgcrypto PGP signatures patch:
http://www.postgresql.org/message-id/53edbcf0.9070...@joh.to

There hasn't been any discussion, at least that I've been able to find.

Contents  Purpose
==
This patch add functions to create, verify and extract infromation
from OpenPGP signatures. Previously pgcrypto only peformed
PGP encrypt/decrypt, not sign/verify. This is a painful limitation
since a very common use-case for OpenPGP is the signature-part,
where two parties want to verify messages originate from each other,
and not only encrypt the messages.

Included in the patch are updated regression test cases and documentation.

Initial Run
===
The patch applies cleanly to HEAD after changing a single line in the patch:
 ! Giving this function a secret key will produce an error.
---
 ! Giving this function a secret key will produce a error.
This grammar fix was already fixed in 05258761bf12a64befc9caec1947b254cdeb74c5,
and therefore caused the conflict.

The 144 regression tests all pass successfully against the new patch.

Conclusion
==
Since I'm using these functions in the BankAPI project,
https://github.com/trustly/bankapi, I have tested them
by actually using them in production, in addition to the provided
regression tests, which is a good sign they are working not just
in theory.

+1 for committer review after the changes suggested by Jeff Janes and
Thomas Munro.


On Fri, Aug 15, 2014 at 9:55 AM, Marko Tiikkaja ma...@joh.to wrote:
 Hi,


 On 8/7/14 12:15 PM, I wrote:

 Here's v2 of the patch.  I've changed the info-extracting code to not
 look for signatures beyond the data, which also meant that it had to
 parse one-pass signatures (which it didn't do before).  This matches the
 behaviour of the main decryption code.


 Here's the latest version where I've added the option to extract the
 creation time from the signatures.



 .marko


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 1.2

2014-09-04 Thread Joel Jacobson

On Thu, Sep 4, 2014 at 5:51 PM, Robert Haas robertmh...@gmail.com wrote:
 When you suggest ISAM, that's like saying demolish your house and
 build a new one when all I want is to make small but important
 changes to what I already do as a professional on a daily basis.

 Go right ahead: this is an open source project, after all, and with an
 extremely permissive license to boot.  You can modify your copy of
 PL/pgsql, or clone it and make PL/joelsql and then change whatever you
 like.  Optionally, you could then publish that on PGXN for others to
 use and contribute to.

 On the other hand, if what you want is for other people to make
 changes to the official versions of PostgreSQL that are supported and
 maintained by the community, then that's a different thing altogether.
 It entails two challenges: first, to persuade the community that those
 changes will be good for everyone, not just you; and second,
 convincing them that they (rather than you) should be the ones to do
 the work.  So far I'd say you're losing the first argument, and I
 expect you'll lose the second one, too (barring a financial
 transaction, of course).

 I'm not trying to brush you off here - I understand your concerns, and
 they're not stupid.  But, like most of the people who have commented,
 I don't agree that your proposals would be an improvement for the
 majority of people.  There are several ways to deal with that, but if
 your goal is to get those changes made in the PostgreSQL community
 then you have to acknowledge the competing concerns to be just as
 valid as your own and come up with a proposal everyone can live with.

If my company would write code in PL/joelsql, I think I would have a hard
time through any technical due diligence in the future. :-)

The main reason why I'm so eager of finding a support from you,
the majority of other readers on this list, is of course because I think
we as a group can come up with a much better solution to the problem
than what I could on my own. And for me it's better if we can agree on
*something* which improves my and others life to *some* extent,
rather than to just sitting here silent waiting another 16 years for
PL/pgSQL 2 to
develop itself.

I can certainly live with a more SQLish syntax than the one I had in mind.

I'm less concerned about the verbosity of the language, if I wanted a condensed
language I should have opted for some other language in the first place,
so that's not my problem.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgcrypto: PGP armor headers

2014-09-04 Thread Joel Jacobson

Marko, et al,

This is a review of the pgcrypto PGP Armor Headers patch:
http://www.postgresql.org/message-id/53edcae8.20...@joh.to

Contents  Purpose
==
This patch add functions to create and extract OpenPGP Armor Headers.
from OpenPGP messages.

Included in the patch are updated regression test cases and documentation.

Initial Run
===
The patch applies cleanly to HEAD.

The 144 regression tests all pass successfully against the new patch.

Conclusion
==
Since I'm using these functions in the BankAPI project,
https://github.com/trustly/bankapi, I have tested them
by actually using them in production, in addition to the provided
regression tests, which is a good sign they are working not just
in theory.

+1 for committer review.

On Fri, Aug 15, 2014 at 10:55 AM, Marko Tiikkaja ma...@joh.to wrote:
 Hi,


 On 8/8/14 3:18 PM, I wrote:

 Currently there's no way to generate or extract armor headers from the
 PGP armored format in pgcrypto.  I've written a patch to add the
 support.


 Latest version of the patch here, having fixed some small coding issues.


 .marko


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 2

2014-09-03 Thread Joel Jacobson

On Wed, Sep 3, 2014 at 7:54 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I am not against to improve a PL/pgSQL. And I repeat, what can be done and
 can be done early:

 a) ASSERT clause -- with some other modification to allow better static
 analyze of DML statements, and enforces checks in runtime.

 b) #option or PRAGMA clause with GUC with function scope that enforce check
 on processed rows after any DML statement

 c) maybe introduction automatic variable ROW_COUNT as shortcut for GET
 DIAGNOSTICS rc = ROW_COUNT

 If you need more, and some users would more, then it job for new language
 really.

You fail to illustrate *why* it's a job for a new language.
All improvements suggested above are possible with plpgsql, and *should*
be improved in plpgsql, that I agree with.

But the 100% backwards-compatibiity ambition puts hard limits on
what's possible,
and if we can accept (100%-X) backwards compatibility where X is a small number,
then so much more ideas are possible, and that's why plpgsql2 is a good idea.

Hopefully, most of the plpgsql2 changes can be turned on/off in
plpgsql with PRAGMA clause with GUC,
but will be more messy than a good decent default behaviour.

I'm in favour of Tom's idea. To merely make the plpgsql2 language a
way of explicitly saying you want
a specific exact combination of features/beaviour/settings which we
can implemented in plpgsql's existing codebase.

Since it was about 100 posts since Tom's post, maybe it's worth
repeating for those who missed it:

 What I would think about is

c) plpgsql and plpgsql2 are the same code base, with a small number
of places that act differently depending on the language version.

We could alternatively get the result by inventing a bunch of pragma
declarations, or some similar notation, that control the behavioral
changes one-at-a-time.  That might even be worth doing anyway, in
case somebody likes some of the ideas and others not so much.  But
I'd see the language version as a convenient shorthand for enabling a
specified collection of pretty-localized incompatible behavior changes.
If they're not pretty localized, there's going to be a barrier to
uptake, very comparable to the python3 analogy mentioned upthread.

regards, tom lane

I fully agree on this approach. It's maintainable and it will be
useful from day 1.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 2

2014-09-03 Thread Joel Jacobson

On Wed, Sep 3, 2014 at 10:07 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 When you use name plpgsql2 you say, so plpgsql2 is successor plpgsql. It is
 very hard to accept it.  So any other name is not problem for me - like
 plpgsql-safe-subset or something else

plpgsql2 *is* the successor of plpgsql, that's why it has a 2 in the name.
Anything which is very different from plpgsql should instead get a
different name.

For all new users, having a convenient shorthand (plpgsql2) for
enabling what ever the
project think is the best-practice collection of pragmas, is a simple
and efficient way
of helping new users to get the best possible behaviour of the
language, when starting
from scratch.
It also simplifies communication among developers, when they talk
about code written in plpgsql2,
they will all eventually know what they mean, instead of having to
describe what collection of pragmas
they use in their code. That also simplifies code examples, but most
importantly, one does not have
to declare all the pragmas for each function, or worry about the
pragmas in the config file will ever change.

Once we have agreed upon plpgsql2, then it will be a constant, and
never break compatibility,
and that's a good thing. Then we can all write new code according to
the updated specs and
hopefully we will not need a plpgsql3 until year 2030.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgcrypto: PGP signatures

2014-09-03 Thread Joel Jacobson

On Wed, Aug 6, 2014 at 2:46 PM, Marko Tiikkaja ma...@joh.to wrote:
 Hi hackers,

 Attached is a patch to add support for PGP signatures in encrypted messages
 into pgcrypto.

I noticed Heikki wanted to check if there is any interested for the
patches in the current commitfest.

Yes, our company Trustly are very interested in the two PGP additions
to pgcrypto.

We currently use these patches in production in a separate database,
but if they would be part of standard postgres, we wouldn't need to
run the application using the functionality in a separate database
server, which would simplify things a lot.

Without these patches, there is no way to deal with PGP signatures.
Since signatures is a crucial component of OpenPGP, the existing
encryption/decryption features are useful, but not nearly as useful as
if you also have the capabilities to generate and verify PGP
signatures.

We use the PGP functionality in a system called BankAPI, which is open
source and available here: https://github.com/trustly/bankapi

Also, in the documentation, it has already been acknowledged the lack
of signing is a current limitation:
F.25.3.9. Limitations of PGP Code
No support for signing. That also means that it is not checked whether
the encryption subkey belongs to the master key.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 2

2014-09-03 Thread Joel Jacobson

On Wed, Sep 3, 2014 at 3:17 PM, Joshua D. Drake j...@commandprompt.com wrote:
 Well, I don't think PostgreSQL needs its own PL. I mean we already have
 several (what other database has pl/javascript or pl/python?)

PostgreSQL already *have* it's own PL, it's called PL/pgSQL.

 Besides, the idea of this community trying to build its own programming
 language... oh lord ;)

I would agree it's too much of a challenge to invent a brand new
programming language,
I agree that's unrealistic, that's why I'm opting to do as much as
possible in the existing
language, and carefully think about what non-compatible important
changes we simply
cannot make to PL/pgSQL, as they by definition would break
compatibility (which we all
agree is not acceptable), but that *would* be possible with PL/pgSQL 2.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 2

2014-09-02 Thread Joel Jacobson

On Tue, Sep 2, 2014 at 5:46 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 My point is that weeks can be spent just arguing about whether you
 should have a variable-delimiter ($variable) or not, how syntax should
 look, etc. Imagine how long it'd take to get a new language syntax
 agreed upon?

I would guess about a year.

 You jumped in to say that you thought that:

   EXECUTE format(SELECT %I FROM %I WHERE $1, col, tbl) USING val;

 was is exactly why we need a new language and that All the clumsy
 stuff we cannot fix in plpgsql, can easily be fixed in plpgsql2, with
 the most beautiful syntax we can come up with. But you haven't said HOW
 you propose to fix this one case.

 Show me. How do you want this to look? The user requirement is Execute
 a SELECT against a table whose name is provided at runtime, selecting a
 column or set of columns whose names are provided at runtime, with
 literals substituted as placement parameters.

 The above is ugly. Fine, not arguing. Show me what you want instead.


 You're happy to say how much you dislike PL/PgSQL, but I haven't seen a
 concrete proposal on how you want something new to look. That would be a
 useful and constructive start, as we could then examine, point-by-point,
 how/if those needs can be met in PL/PgSQL. If they can't then you'd have
 a more convincing argument for a new version than PL/PgSQL sucks.

I've *never* said PL/pgSQL sucks.
I *love* PL/pgSQL, seriously.
I write code for many hours a day in the language.
I don't even want to change much.
My wishlist consists mostly of things which makes the language more secure.
Currently it's a pain to verify your data operations do exactly what
you requested.
I would guess most novice developers don't understand this, and by
mistake write insecure code.
I don't want any OO. I don't want PL/SQL or PL/PSM. I'm a happy camper
with PL/pgSQL.

That said, *if* we now have a one-shot opportunity of possibly
breaking a bit of compatibility for a minority of current code,
motivated by the introduction of new important features not possible
without plpgsql2, *then* let's make the best of that opportunity.

I don't find myself selecting from a table which table name I don't
know the name when writing the code,
so I'm not pariticulary interested in prodiving a syntax for that use case,
but I'm not against the feature if others need it, even if it would
possibly increase the lines of code of existing plpgsql code which
needs to be modified to remain compatible by X %.

Given the needed diff between plpgsql and plpgsql2 for the changes I'm
mostly interested in would probably be quite small,
I'm in favour of Tom's suggestion of:
c) plpgsql and plpgsql2 are the same code base, with a small number
of places that act differently depending on the language version.

That fits perfectly for my needs, as I don't want to change much.

But even if we find we want to make larger mostly-compatible changes,
maybe that also can be implemented using the same approach.

For me, the most important is to not break *most* of existing plpgsql
code, but it's OK to break *some*.
And when breaking it, it should be trivial to rewrite it to become compatible.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 2

2014-09-02 Thread Joel Jacobson

On Tue, Sep 2, 2014 at 8:26 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 09/02/2014 09:06 AM, Joel Jacobson wrote:
 For me, the most important is to not break *most* of existing plpgsql
 code, but it's OK to break *some*.
 And when breaking it, it should be trivial to rewrite it to become
 compatible.


 I think the next step would be to list all the things you don't like with
 current PL/pgSQL, and write down how you would want them to work if you were
 starting with a clean slate. Let's see how wide the consensus is that the
 new syntax/behavior is better than what we have now. We can then start
 thinking how to best adapt them to the current PL/pgSQL syntax and codebase.
 Maybe with pragmas, or new commands, or deprecating the old behavior; the
 best approach depends on the details, and how widely desired the new
 behavior is, so we need to see that first.

 I'd suggest collecting the ideas on a wiki page, and once you have some
 concrete set of features and syntax there, start a new thread to discuss
 them. Others will probably have other features they want, like the simpler
 DROP TABLE ? thing.

Excellent idea, I'm on it!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgSQL 2

2014-09-02 Thread Joel Jacobson

On Tue, Sep 2, 2014 at 11:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 What we can do better?

 1. we can implement a conditional RAISE

 DELETE FROM tab WHERE xx = somevar;
 GET DIAGNOSTICS  rc = ROW_COUNT;
 RAISE EXCEPTION 'some' WHEN rc  0;

 It is relatively natural and we use similar construct in CONTINUE statement.

 2. What can be next? We can implement some idiom (shortcut) for GET
 DIAGNOSTICS

 DELETE FROM tab WHERE xx = somevar;
 RAISE EXCEPTION 'some' WHEN AFFECTED_ROW_COUNT  1;

 3. What next? Maybe some notations -

 -- ** ensure_exact_one_row
 DELETE FROM tab WHERE xx = somevar;

 But default will be same as in plain SQL.

All three suggestions are either too verbose, ugly or hackish.
I write too much code every day in PL/pgSQL to find any other solution
than the cleanest and simplest to be acceptable.
I reckon there are those who mostly use the language to create
aggregated reports or to run some kind of batch jobs.
But I use it almost exlusively for OLTP, and then you most often
update a single row, and if 0 or 1 rows are affected, it's an error.
Therefore, I wish the syntax for the most common use case to be as
clean as possible, and there is nothing cleaner than plain UPDATE.

Also, when showing a beginner the power of PL/pgSQL, it cannot be
acceptable to have to write two rows to do something as simple as an
update. All the suggestions above range between 2-3 rows (for DELETE,
but I guess the syntax would be the same for UPDATE).

For an in-depth discussion on this subject, please see
http://joelonsql.com/2013/05/03/plpgsql-1-annoyance/

I have no good ideas though on what the syntax would look like to
allow zero rows or multiple rows for an UPDATE though.

It's much harder to come up with things to *add* to a syntax than what
obvious ugliness you want to *remove*.

If I had to guess though, I would think something in the end of the
UPDATE command like a new keyword, could work. It wouldn't mess up the
syntax too much, and wouldn't require an extra line of code.

I strongly feel we should give a plain UPDATE without any extra lines
of code or special syntax a default behaviour, which is different from
accept any number of affected rows.
My definitive vote is to throw an error if not exactly 1 row was
affected, and to provide a nice syntax to allow the other use cases.
Right now it's the other way around, we never throw an error, and
*always* have to check how many rows were affected. That means we
*always* get both more lines of code and also uglier code in our
applications, than we would if we optimized for the most common use
case.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 3 >

1 - 100 of 262 matches

Mail list logo