Re: Content for talk on Postgres Type System at PostgresConf

2024-03-02 Thread grimy . outshine830
On Fri, Mar 01, 2024 at 03:25:35PM -0500, Tom Lane wrote:

> No, what he showed was correct.  I'm talking about a different facet
> of the problem:

> ...

> Even if that took account of the exchange rate, it'd not be great.
> But it doesn't; it's just the same digits reinterpreted with a new
> currency sign and possibly a different number of fractional digits.
> This might be sort of tolerable if your database only ever deals in
> one currency, but even then you'd likely want to lock down what that
> currency is.  Making it be controlled by a user-set GUC was probably
> not a great idea.

Yes, I get it now, thanks. Not useful, I agree.

-- 
Ian




Re: Content for talk on Postgres Type System at PostgresConf

2024-03-02 Thread Laurenz Albe
On Fri, 2024-03-01 at 08:14 -0800, Adrian Klaver wrote:

> [pointers to the documentation]

I know.  I was not searching for help with PostgreSQL,
I was trying to point out the surprising behavior of
"character" as a suggestion for the talk.

Yours,
Laurenz Albe




Re: Content for talk on Postgres Type System at PostgresConf

2024-03-01 Thread grimy . outshine830
On Thu, Feb 29, 2024 at 05:51:11PM -0500, Tom Lane wrote:

> >> money is a fixed-point decimal value, the number of decimal
> >> places is locale determined. I’m not aware of any particular
> >> problems with that

> > You forget about the currency symbol dynamic. Like with time zones
> > the local session provides the context, not the stored data.

> Yeah.  The fact that the same stored value might look like 10.00
> euros to one session and 1000 yen to another one is pretty
> catastrophic.  The other nasty thing about money is that it's an
> int64 so it can't represent more than 2^63 pennies (for whatever a
> "penny" is).  Now, that's still a Frickin Lot Of Money in any
> non-hyperinflated currency, but it's the sort of restriction that
> banks don't like to hear of.

Lame excuse first: I have never used the money type, probably because
I overheard a word like "catastrophic" in my early development :-)

But, doesn't what Tom says above contradict Adrian's example session?
Either the stored value is re-interpreted according to the locale
context, or it isn't. IMO it would be *more* catastrophic if it wasn't,
as it looks from Adrian's example.

-- 
Ian




Re: Content for talk on Postgres Type System at PostgresConf

2024-03-01 Thread Adrian Klaver

On 3/1/24 01:18, Laurenz Albe wrote:

On Thu, 2024-02-29 at 13:38 -0800, Guyren Howe wrote:

what are the misconceptions, or where might I find them for  myself?


In addition to what was already said:


My current understanding:
  * character is fixed-length, blank-padded. Not sure when you’d
want that, but it seems clear. Is the name just confusing?


I find the semantics confusing:

   test=> SELECT 'a'::character(10);
  bpchar
   
a
   (1 row)

Ok, it is 10 characters long.

   test=> SELECT length('a'::character(10));
length
   
 1
   (1 row)

Or is it?


https://www.postgresql.org/docs/current/datatype-character.html

"Values of type character are physically padded with spaces to the 
specified width n, and are stored and displayed that way. However, 
trailing spaces are treated as semantically insignificant and 
disregarded when comparing two values of type character. In collations 
where whitespace is significant, this behavior can produce unexpected 
results; for example SELECT 'a '::CHAR(2) collate "C" < E'a\n'::CHAR(2) 
returns true, even though C locale would consider a space to be greater 
than a newline. Trailing spaces are removed when converting a character 
value to one of the other string types. Note that trailing spaces are 
semantically significant in character varying and text values, and when 
using pattern matching, that is LIKE and regular expressions."




   test=> SELECT 'a'::character(10) || 'b'::character(10);
?column?
   ══
ab
   (1 row)

And why is the result not 20 characters long, with spaces between "a" and "b"?


SELECT pg_typeof('a'::character(10) || 'b'::character(10));
 pg_typeof
---
 text

This is covered by  "Trailing spaces are removed when converting a 
character value to one of the other string types.".


Though that still leaves you with:

SELECT pg_typeof(('a'::character(10) || 'b'::character(10))::char(20));
 pg_typeof
---
 character

SELECT ('a'::character(10) || 'b'::character(10))::char(20);
bpchar
--
 ab





Best avoid "character".


  * timestamptz is just converted to a timestamp in UTC. Folks might
imagine that it stores the time zone but it doesn’t.


Yes, and I find that lots of people are confused by that.

You could talk about the interaction with the "timezone" parameter, and
that it is not so much a timestamp with time zone, but an "absolute timestamp",
and in combination with "timestamp" a great way to let the database handle
the difficult task of time zone conversion for you.

Yours,
Laurenz Albe





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Content for talk on Postgres Type System at PostgresConf

2024-03-01 Thread Laurenz Albe
On Thu, 2024-02-29 at 13:38 -0800, Guyren Howe wrote:
> what are the misconceptions, or where might I find them for  myself?

In addition to what was already said:

> My current understanding:
>  * character is fixed-length, blank-padded. Not sure when you’d
>want that, but it seems clear. Is the name just confusing?

I find the semantics confusing:

  test=> SELECT 'a'::character(10);
 bpchar   
  
   a 
  (1 row)

Ok, it is 10 characters long.

  test=> SELECT length('a'::character(10));
   length 
  
1
  (1 row)

Or is it?

  test=> SELECT 'a'::character(10) || 'b'::character(10);
   ?column? 
  ══
   ab
  (1 row)

And why is the result not 20 characters long, with spaces between "a" and "b"?

Best avoid "character".

>  * timestamptz is just converted to a timestamp in UTC. Folks might
>imagine that it stores the time zone but it doesn’t.

Yes, and I find that lots of people are confused by that.

You could talk about the interaction with the "timezone" parameter, and
that it is not so much a timestamp with time zone, but an "absolute timestamp",
and in combination with "timestamp" a great way to let the database handle
the difficult task of time zone conversion for you.

Yours,
Laurenz Albe





Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Paul Jungwirth

On 2/29/24 14:47, Guyren Howe wrote:
What is worth saying about custom range types? What even would be a good use case? A range of 
strings or ip address ranges, something like that, I suppose?


I wrote a blog post and an extension for an inetrange type. It's from 2016 so it might need some 
updates, but you are welcome to use anything you like for your talk:


https://illuminatedcomputing.com/posts/2016/06/inet-range/

Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Guyren Howe
On 29 Feb 2024, at 14:51, Tom Lane  wrote:
> 
>>> - time with time zone *does* store the time zone, but this isn’t
>>> actually useful and should be avoided (I’m not entirely sure why and the
>>> docs only gesture at the problems without stating them, IIRC)
> 
>> No it doesn't store the time zone.  Nor do the docs say they do.  And
>> clearly point out the issue that evaluating a time zone without both date
>> and time inputs is basically pointless.
> 
> timetz *does* store a time zone, in the sense of storing a numeric
> offset from UTC (i.e., "so many minutes east or west of Greenwich").
> The problem is that in most real-world applications your notion of
> "time zone" probably includes annual DST changes, which timetz can't
> represent.  I don't say the type is completely useless, but its
> usefulness is a lot less than you might guess.

The closest I can come to this in the docs is:

"The appropriate time zone offset is recorded in the time with time zone value 
and is output as stored; it is not adjusted to the active time zone.”

I expect to be submitting some documentation updates as part of this project, 
fwiw.

Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Tom Lane
"David G. Johnston"  writes:
> On Thu, Feb 29, 2024 at 2:38 PM Guyren Howe  wrote:
>> My current understanding:
>> 
>> - character is fixed-length, blank-padded. Not sure when you’d want
>> that, but it seems clear. Is the name just confusing?

> Character is not stored fixed length, it is stored variable width but it
> just so happens that the variable length is the same for all rows.

It's nastier than that: the restriction is that the length in
*characters* be the same in all rows.  In a variable-width encoding
like UTF8 the length in bytes can vary.  This means that the
optimization that CHARACTER was meant to enable (ie "the physical
width of this column is fixed, so you don't have to store a length
indication") doesn't work for us, making it just about totally
useless.

BTW, the SQL spec words the constraint the same way, making me wonder
if any modern RDBMS can use this type in the way it was designed.

>> - time with time zone *does* store the time zone, but this isn’t
>> actually useful and should be avoided (I’m not entirely sure why and the
>> docs only gesture at the problems without stating them, IIRC)

> No it doesn't store the time zone.  Nor do the docs say they do.  And
> clearly point out the issue that evaluating a time zone without both date
> and time inputs is basically pointless.

timetz *does* store a time zone, in the sense of storing a numeric
offset from UTC (i.e., "so many minutes east or west of Greenwich").
The problem is that in most real-world applications your notion of
"time zone" probably includes annual DST changes, which timetz can't
represent.  I don't say the type is completely useless, but its
usefulness is a lot less than you might guess.

>> money is a fixed-point decimal value, the number of decimal places is
>> locale determined. I’m not aware of any particular problems with that

> You forget about the currency symbol dynamic. Like with time zones the
> local session provides the context, not the stored data.

Yeah.  The fact that the same stored value might look like 10.00 euros
to one session and 1000 yen to another one is pretty catastrophic.
The other nasty thing about money is that it's an int64 so it can't
represent more than 2^63 pennies (for whatever a "penny" is).  Now,
that's still a Frickin Lot Of Money in any non-hyperinflated currency,
but it's the sort of restriction that banks don't like to hear of.

regards, tom lane




Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Guyren Howe
On 28 Feb 2024, at 17:08, Guyren Howe  wrote:
> 
> I am to talk about Postgres’s type system at PGConf:
> 
> https://postgresconf.org/conferences/2024/program/proposals/postgres-s-type-system
> 
> I picked the issue because I think it’s poorly understood, greatly 
> under-discussed, and an excellent way to empower postgres users.
> 
> I am reasonably conversant with the issue. I’m not looking for others to 
> write the talk for me, but in order to make the best talk I can, I’m asking:
> 
> What would you want to see in such a talk?
> 
> I’m planning on covering:
> 
> - The built-in types that are underused and their advantages (eg inet)
> - domains
> - such things as details of arrays including multidimensional arrays
> - user-defined types, their relationship to tables, and generally how to use 
> them
> 
> I would spend most of the time discussing ways to make effective use of 
> types. Some examples:
> - defining functions of rows so the table can be used kind of like a set of 
> objects (including the dot notation for invoking functions)
> - using UDFs to make code clearer eg if you have an idiosyncratic functional 
> index, define it using a function of the row, so it’s easy to get right when 
> querying
> - using UDFs as a kind of better domain. eg differentiating imperial from 
> metric units by requiring an explicit constructor, not just accepting any old 
> number
> 
> I would mention enumerated types, although I’m inclined to advise that their 
> inflexibility (eg can’t delete or rearrange them) means that a related table 
> is probably better (I’d be delighted to be proved wrong).
> 
> Custom Range Types are an interesting feature, but I can’t think of a good 
> use case. Any good examples?
> 
> Automatic casting is a feature I’m aware of, but I’d be interested in any 
> cool use cases.
> 
> 
> Anything I’m missing? Any existing good discussions of the subject?

Any gotchas about user defined types or arrays? I was only planning on 
discussing simple user defined types, without going into writing the optional 
features in C.

What is worth saying about custom range types? What even would be a good use 
case? A range of strings or ip address ranges, something like that, I suppose?

Is there a good case for enumerations versus a fk to a table? I guess if you’re 
sure you won’t need to change the values in the enumeration? Days of the week, 
that sort of thing.

Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread David G. Johnston
On Thu, Feb 29, 2024 at 2:38 PM Guyren Howe  wrote:

> In case I’m not correct on the issues with these types, myself, what are
> the misconceptions, or where might I find them for  myself?
>
> My current understanding:
>
>- character is fixed-length, blank-padded. Not sure when you’d want
>that, but it seems clear. Is the name just confusing?
>
> Character is not stored fixed length, it is stored variable width but it
just so happens that the variable length is the same for all rows.



>
>- timestamptz is just converted to a timestamp in UTC. Folks might
>imagine that it stores the time zone but it doesn’t.
>
> You are missing the point then whenever converting a stored timestamptz to
textual representation the action is stable because it involves looking up
the session time zone setting.  The same goes in reverse unless the textual
representation includes the time zone specification to use.

And yes, people do believe that whatever timezone was in effect when the
data was recorded is saved when indeed it is not; such information would
need to be captured separately.

>
>- time with time zone *does* store the time zone, but this isn’t
>actually useful and should be avoided (I’m not entirely sure why and the
>docs only gesture at the problems without stating them, IIRC)
>
> No it doesn't store the time zone.  Nor do the docs say they do.  And
clearly point out the issue that evaluating a time zone without both date
and time inputs is basically pointless.

>  money is a fixed-point decimal value, the number of decimal places is
locale determined. I’m not aware of any particular problems with that

You forget about the currency symbol dynamic. Like with time zones the
local session provides the context, not the stored data.

David J.


Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Adrian Klaver


On 2/29/24 1:38 PM, Guyren Howe wrote:
In case I’m not correct on the issues with these types, myself, what 
are the misconceptions, or where might I find them for  myself?


My current understanding:

  * character is fixed-length, blank-padded. Not sure when you’d want
that, but it seems clear. Is the name just confusing?
  * timestamptz is just converted to a timestamp in UTC. Folks might
imagine that it stores the time zone but it doesn’t.

The issue I see is people getting confused when it come back rotated to 
whatever TimeZone is set to, which can change from client to client.


  * time with time zone *does* store the time zone, but this isn’t
actually useful and should be avoided (I’m not entirely sure why
and the docs only gesture at the problems without stating them, IIRC)
  * money is a fixed-point decimal value, the number of decimal places
is locale determined. I’m not aware of any particular problems
with that



test=# show lc_monetary ;
 lc_monetary
-
 en_US.UTF8
(1 row)

test=# select 1000::money;
   money
---
 $1,000.00
(1 row)

test=# set lc_monetary = 'en_GB.utf8';
SET
test=# select 1000::money;
   money
---
 £1,000.00


On 29 Feb 2024 at 01:11 -0800, Laurenz Albe 
, wrote:

On Wed, 2024-02-28 at 17:08 -0800, Guyren Howe wrote:

I am to talk about Postgres’s type system at PGConf:

https://postgresconf.org/conferences/2024/program/proposals/postgres-s-type-system

I picked the issue because I think it’s poorly understood, greatly
under-discussed, and an excellent way to empower postgres users.

What would you want to see in such a talk?


It might be good to explain how "timestamp with time zone" works.
That's often confusing for beginners, because it is different from
other databases and arguably deviates from the SQL standard.

Perhaps it would be good to warn people about using data types like
"character", "time with time zone" and "money".

Yours,
Laurenz Albe


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Guyren Howe
In case I’m not correct on the issues with these types, myself, what are the 
misconceptions, or where might I find them for  myself?

My current understanding:

• character is fixed-length, blank-padded. Not sure when you’d want that, but 
it seems clear. Is the name just confusing?
• timestamptz is just converted to a timestamp in UTC. Folks might imagine that 
it stores the time zone but it doesn’t.
• time with time zone *does* store the time zone, but this isn’t actually 
useful and should be avoided (I’m not entirely sure why and the docs only 
gesture at the problems without stating them, IIRC)
• money is a fixed-point decimal value, the number of decimal places is locale 
determined. I’m not aware of any particular problems with that

On 29 Feb 2024 at 01:11 -0800, Laurenz Albe , wrote:
> On Wed, 2024-02-28 at 17:08 -0800, Guyren Howe wrote:
> > I am to talk about Postgres’s type system at PGConf:
> >
> > https://postgresconf.org/conferences/2024/program/proposals/postgres-s-type-system
> >
> > I picked the issue because I think it’s poorly understood, greatly
> > under-discussed, and an excellent way to empower postgres users.
> >
> > What would you want to see in such a talk?
>
> It might be good to explain how "timestamp with time zone" works.
> That's often confusing for beginners, because it is different from
> other databases and arguably deviates from the SQL standard.
>
> Perhaps it would be good to warn people about using data types like
> "character", "time with time zone" and "money".
>
> Yours,
> Laurenz Albe


Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread grimy . outshine830
On Thu, Feb 29, 2024 at 10:11:03AM +0100, Laurenz Albe wrote:

> It might be good to explain how "timestamp with time zone" works.
> That's often confusing for beginners, because it is different from
> other databases and arguably deviates from the SQL standard.

The most confusing part is the name :-P

It is natural for a beginner -- and I am not that far from being a
beginner -- to assume a data type with this name must store a time
zone somehow.

-- 
Ian




Re: Content for talk on Postgres Type System at PostgresConf

2024-02-29 Thread Laurenz Albe
On Wed, 2024-02-28 at 17:08 -0800, Guyren Howe wrote:
> I am to talk about Postgres’s type system at PGConf:
> 
> https://postgresconf.org/conferences/2024/program/proposals/postgres-s-type-system
> 
> I picked the issue because I think it’s poorly understood, greatly
> under-discussed, and an excellent way to empower postgres users.
> 
> What would you want to see in such a talk?

It might be good to explain how "timestamp with time zone" works.
That's often confusing for beginners, because it is different from
other databases and arguably deviates from the SQL standard.

Perhaps it would be good to warn people about using data types like
"character", "time with time zone" and "money".

Yours,
Laurenz Albe




Content for talk on Postgres Type System at PostgresConf

2024-02-28 Thread Guyren Howe
I am to talk about Postgres’s type system at PGConf:

https://postgresconf.org/conferences/2024/program/proposals/postgres-s-type-system

I picked the issue because I think it’s poorly understood, greatly 
under-discussed, and an excellent way to empower postgres users.

I am reasonably conversant with the issue. I’m not looking for others to write 
the talk for me, but in order to make the best talk I can, I’m asking:

What would you want to see in such a talk?

I’m planning on covering:

- The built-in types that are underused and their advantages (eg inet)
- domains
- such things as details of arrays including multidimensional arrays
- user-defined types, their relationship to tables, and generally how to use 
them

I would spend most of the time discussing ways to make effective use of types. 
Some examples:
- defining functions of rows so the table can be used kind of like a set of 
objects (including the dot notation for invoking functions)
- using UDFs to make code clearer eg if you have an idiosyncratic functional 
index, define it using a function of the row, so it’s easy to get right when 
querying
- using UDFs as a kind of better domain. eg differentiating imperial from 
metric units by requiring an explicit constructor, not just accepting any old 
number

I would mention enumerated types, although I’m inclined to advise that their 
inflexibility (eg can’t delete or rearrange them) means that a related table is 
probably better (I’d be delighted to be proved wrong).

Custom Range Types are an interesting feature, but I can’t think of a good use 
case. Any good examples?

Automatic casting is a feature I’m aware of, but I’d be interested in any cool 
use cases.


Anything I’m missing? Any existing good discussions of the subject?