> On 08-Apr-2021, at 10:24, Bruce Momjian <br...@momjian.us> wrote:
> 
> On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
>> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
>> Well, bug or not, we are not going to change back branches for this, and
>> if you want a larger discussion, it will have to wait for PG 15.
>> 
>>>> https://www.google.com/url?q=https://www.postgresql.org/docs/current/datatype-datetime.html%23DATATYPE-INTERVAL-INPUT&source=gmail-imap&ust=1618507489000000&usg=AOvVaw2h2TNbK7O41zsDn8HfD88C
>>>> « …field values can have fractional parts; for example '1.5 week' or 
>>>> '01:02:03.45'. Such input is converted to the appropriate number of 
>>>> months, days, and seconds for storage. When this would result in a 
>>>> fractional number of months or days, the fraction is added to the 
>>>> lower-order fields using the conversion factors 1 month = 30 days and 1 
>>>> day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only 
>>>> seconds will ever be shown as fractional on output. »
>> 
>> I see that.  What is not clear here is how far we flow down.  I was
>> looking at adding documentation or regression tests for that, but was
>> unsure.  I adjusted the docs slightly in the attached patch.
> 
> Here is an updated patch, which will be for PG 15.  It updates the
> documentation to state:
> 
>       The fractional parts are used to compute appropriate values for the next
>       lower-order internal fields (months, days, seconds).
> 
> It removes the flow from fractional months/weeks to
> hours-minutes-seconds, and adds missing rounding for fractional
> computations.

Thank you Bruce. I look forward to documenting this new algorithm for 
YugabyteDB. The algorithm implements the transformation from this:

[
  yy_in numeric,
  mo_in numeric,
  dd_in numeric,
  hh_in numeric,
  mi_in numeric,
  ss_in numeric
]

to this:

[
  mo_internal_representation int,
  dd_internal_representation int,
  ss_internal_representation numeric(1000,6)
]

I am convinced that a prose account of the algorithm, by itself, is not the 
best way to tell the reader the rules that the algorithm implements. Rather, 
psuedocode is needed. I mentioned before that, better still, is actual 
executable PL/pgSQL code. (I can expect readers to be fluent in PL/pgSQL.) 
Given this executable simulation, an informal prose sketch of what it does will 
definitely add value.

May I ask you to fill in the body of this stub by translating the C that you 
have in hand?

create type internal_representation_t as(
  mo_internal_representation int,
  dd_internal_representation int,
  ss_internal_representation numeric(1000,6));

create function internal_representation(
  yy_in numeric default 0,
  mo_in numeric default 0,
  dd_in numeric default 0,
  hh_in numeric default 0,
  mi_in numeric default 0,
  ss_in numeric default 0)
  returns internal_representation_t
  language plpgsql
as $body$
declare
  mo_internal_representation  int     not null := 0;
  dd_internal_representation  int     not null := 0;
  ss_internal_representation  numeric not null := 0.0;

  ok constant boolean :=
    (yy_in is not null) and
    (mo_in is not null) and
    (dd_in is not null) and
    (hh_in is not null) and
    (mi_in is not null) and
    (ss_in is not null);
begin
  assert ok, 'No actual argument, when provided, may be null';

  -- The algorithm.

  return (mo_internal_representation, dd_internal_representation, 
ss_internal_representation)::internal_representation_t;
end;
$body$;

By the way, I believe that a user might well decide always to supply all the 
fields in a "from text to interval" typecast, except for the seconds, as 
integral values. This, after all, is what the "make_interval()" function 
enforces. But, because the typecast approach allows non-integral values, the 
reference documentation must explain the rules unambiguously so that the reader 
can predict the outcome of any ad hoc test that they might try.

It's a huge pity that the three values of the internal representation cannot be 
observed directly using SQL because each behaves with different semantics when 
an interval value is added to a timestamptz value. However, as a second best 
(and knowing the algorithm above), a user can create interval values where only 
one of the three fields is populated and test their understanding of the 
semantic rules that way.

Reply via email to