Re: Chapter 43.8. "Transaction Management" fails to state two critical restrictions

2019-09-30 Thread Bryn Llewellyn
Hello, Bruce.  (We met at the OUGN conference last March on the Oslo-Kiel 
ferry.) Thanks for your reply.

1. About AUTOCOMMIT

It’s very hard to get a clear account of what AUTOCOMMIT really is. So consider 
the example from the docs section that I cited, and run these psql commands at 
its prompt:

DROP TABLE IF EXISTS test1;
CREATE TABLE test1(a int);

CREATE OR REPLACE PROCEDURE transaction_test1()
LANGUAGE plpgsql
AS $$
BEGIN
FOR i IN 0..9 LOOP
INSERT INTO test1 (a) VALUES (i);
IF i % 2 = 0 THEN
COMMIT;
ELSE
ROLLBACK;
END IF;
END LOOP;
END
$$;

Now invoke it like this:

\set AUTOCOMMIT on
CALL transaction_test1();

And see what it produced like this:

SELECT a FROM test1 ORDER BY a; 

The result is what I expect (even values of “a” from 0 through 8). Now re-test 
like this

\set AUTOCOMMIT on
DELETE FROM test1;

\set AUTOCOMMIT off
CALL transaction_test1();

It causes this error:

ERROR:  invalid transaction termination
CONTEXT:  PL/pgSQL function transaction_test1() line 6 at COMMIT

And (of course), following the ROLLBACK that you must do before any new SQL 
works, table test1 is left empty. You get the identical outcome (as I’d expect) 
if you execute this:

START TRANSACTION;

between setting AUTOCOMMIT to off and calling the proc.

I work at YugaByte (www.yugabyte.com) and I filed this issue on 1-Aug-2019:

https://github.com/yugabyte/yugabyte-db/issues/1957

You can see from what I wrote how confused I was. (My earlier life was at 
Oracle Corp where the corresponding notions are simple and intuitive.) Then I 
sent the URL to the pgsql-general list asking for comments. I got all sorts of 
replies—mainly trying to tell me that my aim (ultimately to write a PL/pgSQL 
proc to encapsulate the retry loop that you need when you execute a txn at the 
serializable level) was silly. (The point there is that a serialization error 
sometimes occurs first on attempting to commit the txn.) There were some 
exchanges on Twitter. And eventually Peter Eisentraut tweeted to say that the 
AUTOCOMMIT requirement was probably an implementation restriction.

2. About committing in a block statement’s executable section when the block 
statement has an exception section

Simply modify the proc from the docs thus:

CREATE OR REPLACE PROCEDURE transaction_test1()
LANGUAGE plpgsql
AS $$
BEGIN
FOR i IN 0..9 LOOP
IF i = 9 THEN
  RAISE EXCEPTION 'My bad' USING errcode = '9';
END IF;

INSERT INTO test1 (a) VALUES (i);
IF i % 2 = 0 THEN
COMMIT;
ELSE
ROLLBACK;
END IF;
END LOOP;
EXCEPTION
  WHEN OTHERS THEN
RAISE INFO 'sqlerrm:  %', sqlerrm;
RAISE INFO 'sqlstate: %', sqlstate;
END
$$;

With proper semantics, it ought to commit the even values of “a” from 0 through 
8, then report “My bad” using RAISE INFO, and then end silently.

And invoke it (as we now know we must) like this:

\set AUTOCOMMIT on
CALL transaction_test1();

It actually reports this:

INFO:  sqlerrm:  cannot commit while a subtransaction is active
INFO:  sqlstate: 2D000

In other words, executing “commit” in the circumstances that I complained about 
causes the error that my OTHERS handler reports.

This undocumented behavior means that stored procedures in PostgreSQL simply 
cannot, in general, be used for their intended purpose. I just filed this issue 
(on 29-Sep-2019):

https://github.com/yugabyte/yugabyte-db/issues/2464 


I indent as write to the pgsql-hackers list, tell them this URL, and ask for 
comments.

Obviously, I’ll be delighted to be shown that my pessimistic analysis is 
faulty. My aim is simply to implement my use cases.

3. Back to AUTOCOMMIT in general

It seems to me that the notion is a conflation of what could have been a pure 
client-side mode with a server-side mode. Apparently, different client-side 
drivers do different things when AUTOCOMMIT is off (like silently issuing a 
BEGIN or START TRANSACTION on your behalf before submitting your intended SQL 
statement). But something must also be going on server side—else the execution 
of a stored proc could not know whether the client code did its own BEGIN or if 
this was done implicitly. I’ve been told that the server distinguishes between 
a so-called explicit txn and and implicit txn. I’d be happy if someone would 
explain this clearly to me.

On 30-Sep-2019, at 07:24, Bruce Momjian  wrote:

On Mon, Sep 30, 2019 at 04:11:47AM +, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/11/plpgsql-transactions.html
> Description:
> 
> This chapter fails to state:
> 
> (1) If a PL/pgSQL procedure issues "commit" then it must be called with
> AUTOCOMMIT set to On. This is counter-intuitive. You'd expect the
> opposite.

You mean psql autocommit mode?

> (2) If a PL/pgSQL procedure has a 

Re: Proposal to improve the content in subsection 8.16.6. "Composite Type Input and Output Syntax"

2020-04-06 Thread Bryn Llewellyn
Tom Lane  wrote:...It's already an example, so I don't see this as an improvementThe proposed additional text is flat-out wrong.What actually happens here is that text between quotes is consideredquoted (so that, for example, commas within it are not field separators),but that does not exclude there being other unquoted text within thesame field value.			regards, tom laneThanks, Tom. Your reply was very helpful. It prompted me to read, re-read, and then re-re-read 8.16.6 several times. I did notice that neither of “varchar” nor “char” are found in the whole of section 8.16. But I suppose that it’s obvious to the reader that “text” is to be taken as a term of art, denoting “character data types” and not as the name of one of these data types.I ran some fairly exhaustive empirical tests. Not to be mysterious, I created two “.sql” files. The zip is attached.record_literal.sql--This uses "type rt as (v text)” and "table t(k serial primary key, literal text, r rt)”.It inserts a series of text values into column “literal” as is and into column “r” typecast to “rt”.I aimed to end up with specified target values shown by “select '>'||(r).v||’<‘ from t”.Here are two example findings,First: for this specified target:>hello hello "you" 

Re: COPY statement: no list of the allowed values for "format_name"

PG Doc comments form wrote:

Look at the account of the COPY statement, here:

https://www.postgresql.org/docs/11/sql-copy.html 


and see this:
«
[ [ WITH ] ( option [, ...] ) ]
»

It goes on to say this:

«
where option can be one of:

   FORMAT format_name
»

But there's no list of the allowed values for "format_name". The same holds for 
other options like, for example, "encoding_name”.

David G. Johnston  wrote:

Keep reading…

"""
FORMAT
Selects the data format to be read or written: text, csv (Comma Separated 
Values), or binary. The default is text.
"""

I suppose including the syntax term "format_type" might help, but the all-caps 
FORMAT is definitive.


Thanks for the quick reply, David. I do think that it’s reasonable to expect to 
search in the page rather than to have to read every word from top to bottom in 
the class of use cases that my example indicates. FORMAT is the syntax keyword; 
and format_name denotes the value. I think that this present sentence:

«
Selects the data format to be read or written: text, csv (Comma Separated 
Values), or binary. The default is text.
»

would be improved if it were spelled more explicitly:

«
Selects the data format to be read or written. The allowed values for 
format_name are text, csv (Comma Separated Values), or binary. The default is 
text.
»

Please consider making this change (and the convention that it implies for 
comparable cases).

Where is the syntax "<-08>+08" documented—comes from "select current_setting('timezone')"

Summary: Where in the PG docs can I find a self-contained expository essay that 
explains everything that I need to know in order to understand what's going on 
in the examples that I show below, and to allow me always confidently to 
predict the results?

Notice that "show datestyle" gives me the answer "ISO, MDY". It seems to be the 
default in my env. (I'm in San Francisco.) I never did anything to influence it.

Before the actual tests:

deallocate all;
prepare q as select to_char('2020-01-01 12:00:00'::timestamptz, 'hh24:mi:ss 
TZH:TZM') as t;

Here’s some examples that produce the strangely formatted answer:

set timezone = -8; -- no quotes
show timezone;

or:

set timezone = '-8'; -- notice the quotes
show timezone;

or:

set timezone = interval '-8 hours';
show timezone;

or even:

set timezone = '<-08>+08';
show timezone;

When "show timezone" answers with "<-08>+08", this:

execute q;

gives the answer "20:00:00 -08:00" as I'd expect from the more transparent 
spelling of the "set" statements that I did.

Here's a "stress test":

set timezone = '+08';
show timezone;

I get an upper-cased rendition of what I said: "+08". Even now, "execute 
q" gets the answer that I showed above with "-08:00" for the "TZH:TZM" part.

So it looks like "< ... >" inside the quotes before the number reverses the 
positive/negative sense in which the number is taken and serves as some kind of 
comment.

Finally, what looks sensible and tempting:

set timezone = '-08:00';
show timezone;

giving the answer "-08:00" from "show" brings "04:00:00 +08:00" from "execute 
q". In other words, maximally counter-intuitive.

My search skills aren't up to finding anything that explains what's going on 
here in the PG doc. I find stuff (who knows if I can trust it) on stack 
overflow and similar like this:

« POSIX has positive signs west of Greenwich, but many people expect positive 
signs east of Greenwich. »

Is there a setting that I can do to make "set timezone = '-08:00'" treat it as 
a request to set the negative value that I said?



Re: Where is the syntax "<-08>+08" documented—comes from "select current_setting('timezone')"

> david.g.johns...@gmail.com wrote:
> 
> On Thu, May 13, 2021 at 12:38 PM Bryn Llewellyn  <mailto:b...@yugabyte.com>> wrote:
> « POSIX has positive signs west of Greenwich, but many people expect positive 
> signs east of Greenwich. »
> 
> Is there a setting that I can do to make "set timezone = '-08:00'" treat it 
> as a request to set the negative value that I said?
> 
> That ultimately comes from the appendix.
> 
> https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html 
> <https://www.google.com/url?q=https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html=gmail-imap=162154136000=AOvVaw0IG_klyIecU9loqoCt3Ck9>
> 
> As noted here:
> 
> https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES
>  
> <https://www.google.com/url?q=https://www.postgresql.org/docs/current/datatype-datetime.html%23DATATYPE-TIMEZONES=gmail-imap=162154136000=AOvVaw3btBYIuR5Wht7xmxQUsm8A>
> 
> There are three ways to specify TimeZone in PostgreSQL.  You are using the 
> third option and thus are dealing with the caveats noted in the third entry's 
> description and described in detail in the appendix linked to above.
> 
> I suggest restricting your use to one of the other two, recommended, options. 
>  If you cannot avoid using the third option you must learn and abide by its 
> rules.

Thanks for the quick reply, David. I'd already worked out that the 
overwhelmingly common case for setting a time zone calls for using its _name_ 
because that's the key to the DST rules and therefore correct answers when you 
view an extant timestamptz value. For example, it would seem to be crazy when 
viewing an extant timestamptz value in San Francisco that denotes a date some 
time in June to ask to see it at an offset of -8 hours.

Reading between your lines, the answer to my question « Where is the syntax 
"<-08>+08" documented? » is NOWHERE! The "B.5. POSIX Time Zone Specifications" 
page that you pointed me to has not a single example of a "set timezone" 
statement that does in in the POSIX style. And not a single example of what 
"show timezone" gives after doing a POZIX style setting. (There's no hit for 
the left- or right-chevron on the page.)

I'll conclude that there's no usable doc on this because the recommendation 
that your second URL leads to is "Don't ever do anything that needs you to 
understand that counter-intuitive mess."

As I see it, even using a time zone abbreviation like "PDT" is unhelpful (as 
that second URL shows).

I'm presently documenting the date-time story for YugabyteDB's "as is" exposure 
of the PostgreSQL SQL processing layer. I'll simply recommend always to use a 
full name (from pg_timezone_names.name) as the argument of "set timezone" and 
mention that anything else is dangerous and generally not useful.




Re: typo in doc for "Miscellaneous Coding Conventions"

t...@sss.pgh.pa.us wrote:

> PG Doc comments form  writes:
> 
>> small fix in description at [1] as follows
> 
>> -If that were not done interrupted code that's currently inspecting errno
>> might see the wrong value.
>> +If that was not done interrupted code that's currently inspecting errno
>> might see the wrong value.
> 
> The existing text is perfectly good English; your change
> makes it less so.  I'm afraid it's been too many years since
> high school English for me to remember the exact grammatical
> term for this, but "were not" is typical usage when stating
> a contrary-to-fact hypothetical.

(1) “IF bla bla…  THEN bla bla…”

It might be more words than the bare minimum. But it helps the user separate 
out the proposition and the consequence.

(2) This is the dreaded curse of the passive voice (“mistakes were made”). 
There are many cases where an active formulation is nicer. Anyway, you can 
sidestep lots of the conundrums, like the alternatives here pose, by standing 
back and finding a different way to make the point.

> If the implementation (or you) didn’t do X, then Y bad thing could happen.

Where are the legal values for LC_TIME listed?

It’s easy to guess values for, say, countries in Europe:

This:

create function to_char_demo()
  returns table(z text)
  language plpgsql
as $body$
declare
  -- Counted from midnight 1-Jan-1970 UTC.
  secs   constant double precision not null := 94996756799.456789;
  t  constant timestampnot null := to_timestamp(-secs) at time zone 
'UTC';
  fmt_1  constant text not null := 'TMDay / TMMonth';
  fmt_2  constant text not null := 'TMDy dd-TMMon- 
hh24:mi:ss.us BC';
begin
  set lc_time = 'en_US';
  z := to_char(t, fmt_1);   return next;
  z := to_char(t, fmt_2);   return next;
  z := '';  return next;

  set lc_time = 'it_IT';
  z := to_char(t, fmt_1);   return next;
  z := to_char(t, fmt_2);   return next;
  z := '';  return next;

  set lc_time = 'fi_FI';
  z := to_char(t, fmt_1);   return next;
  z := to_char(t, fmt_2);   return next;
end;
$body$;

select z from to_char_demo();

…brings this result:

 Monday / September
 Mon 03-Sep-1042 12:00:00.543216 BC
 
 Lunedì / Settembre
 Lun 03-Set-1042 12:00:00.543216 BC
 
 Maanantai / Syyskuu
 Ma 03-Syy-1042 12:00:00.543216 BC

But what do I use for, say, Simplified Chinese? Nothing that I guess works. 
And, unlike is the case with “set IntervalStyle”, a bad value doesn’t bring a 
hint (or a doc ref) that gives the LoV.

The obvious search (LC_TIME in the search box of the PG doc for the current 
version) gets no useful hits. Nor does this:

https://www.google.com/search?client=safari=en=LC_TIME+site:https://www.postgresql.org/docs/13/=UTF-8=UTF-8
 




Re: Where are the legal values for LC_TIME listed?

t...@sss.pgh.pa.us wrote:

> Bryn wrote:
> 
>> It’s easy to guess values for, say, countries in Europe:
> 
> On Unix-ish systems, "locale -a" should provide the set of available values.  
> We don't attempt to document this because it's so installation-dependent.
> 
>> But what do I use for, say, Simplified Chinese?
> 
> Maybe you don't have a suitable locale installed.
> 
>> The obvious search (LC_TIME in the search box of the PG doc for the current 
>> version) gets no useful hits.
> 
> The main entry for lc_time in
> 
> 19.11.2. Locale and Formatting: 
> https://www.postgresql.org/docs/current/runtime-config-client.html#RUNTIME-CONFIG-CLIENT-FORMAT
>  
> 
> says “Acceptable values are system-dependent; see Section 23.1 for more 
> information", and if you follow that link, you'll read
> 
>What locales are available on your system under what names depends on what 
> was provided by the operating system vendor and what was
>installed. On most Unix systems, the command `locale -a` will provide a 
> list of available locales.
> 
> Not sure what more we could say.

Thanks for the quick reply, Tom. `locale -a` showed that I do have a suitable 
locale installed. When I add an extra paragraph to my code that starts with 
`set lc_time = 'zh_CN’;`, I get this:

 星期一 / 九月
 一 03- 9-1042 12:00:00.543216 BC

What looks like an m-dash is actually the Chinese character for “one” as in 
“星期一” (lit. “week-one” which is the convention they adopted when they adopted 
the Western Calendar). Had I picked Friday (“星期五”) the output would have been 
nicer.

About “ Not sure what more we could say”, no… I don’t suppose there’s a 
cost-effective next step. In an ideal world, you’d have O/S-dependent code that 
reads the output of `locale -a`, sanitizes it to get the legal arguments for 
`set lc_time`, and presents it as a relation.

My problem is that doc search only gets me so far. Then I have to read each 
whole page from top to bottom to find the nuggets—in this case “ Acceptable 
values are system-dependent”.