Re: Calendar support in localization

2021-04-11 Thread Thomas Munro
On Wed, Mar 17, 2021 at 8:20 AM Thomas Munro  wrote:
> *I mean, we can discuss the different "timelines" like UT, UTC, TAI
> etc, but that's getting into the weeds, the usual timeline for
> computer software outside specialist scientific purposes is UTC
> without leap seconds.

(Erm, rereading this thread, I meant to write "time scales" there.)




Re: Calendar support in localization

2021-03-31 Thread Surafel Temesgen
On Tue, Mar 30, 2021 at 11:16 AM Daniel Verite 
wrote:

>
> The conversions from julian dates are not necessarily hard, but the
> I/O functions means having localized names for all days, months, eras
> of all calendars in all supported languages. If you're thinking of
> implementing this from scratch (without the ICU dependency), where
> would these names come from? OTOH if we're using ICU, then why
> bother reinventing the julian-to-calendars conversions that ICU
> already does?
>
>
i donno why  but  currently we are using our own function for
converting (see j2date and date2j) maybe it's written before ICU but i
think ICU helps in adding other calendar support easly. Regarding  I/O
functions postgresql hard coded days and months names on array and just
parse and string compare, if it is not on the list then error(see
datetime.c) and it will be the same for other calendar but i think we don't
need all  that if we use ICU

regards
Surafel


Re: Calendar support in localization

2021-03-30 Thread Daniel Verite
Surafel Temesgen wrote:

> > About intervals, if there were locale-aware functions like
> >  add_interval(timestamptz, interval [, locale]) returns timestamptz
> > or
> >  sub_timestamp(timestamptz, timestamptz [,locale]) returns interval
> > that would use ICU to compute the results according to the locale,
> > wouldn't it be good enough?
> >
> >
> Yes it can be enough for now but there are patches proposed to support the
> system and application time period which are in SQL standard 

To clarify, these function signatures are not meant to oppose
a core vs extension implementation, nor an ICU vs non-ICU
implementation. They're meant to illustrate the case of using
specific functions instead of adding specific data types.
AFAIU, adding data types come from the idea that since
(non-gregorian-date + interval) doesn't have the same result as
(gregorian-date + interval), we could use a different type for
non-gregorian-date and so a different "+" operator, maybe
even a specific interval type.

For the case of temporal tables, I'm not quite familiar with the
feature, but I notice that the patch says:

+When system versioning is specified two columns are added which
+record the start timestamp and end timestamp of each row verson.
+The data type of these columns will be TIMESTAMP WITH TIME ZONE.

The user doesn't get to choose the data type, so if we'd require to
use specific data types for non-gregorian calendars, that would
seemingly complicate things for this feature. This is consistent
with the remark upthread that the SQL standard assumes the
gregorian calendar.


> what it takes to support calendar locally is input/output function
> and a converter from and to julian calendar and that may not be that
> much hard since most of the world calendar is based on julian or
> gregorian calendar[0]

The conversions from julian dates are not necessarily hard, but the
I/O functions means having localized names for all days, months, eras
of all calendars in all supported languages. If you're thinking of
implementing this from scratch (without the ICU dependency), where
would these names come from? OTOH if we're using ICU, then why
bother reinventing the julian-to-calendars conversions that ICU
already does?


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite




Re: Calendar support in localization

2021-03-29 Thread Surafel Temesgen
Hi Daniel,

On Fri, Mar 26, 2021 at 8:51 PM Daniel Verite 
wrote:

> Thomas Munro wrote:
>
> > Right, so if this is done by trying to extend Daniel Verite's icu_ext
> > extension (link given earlier) and Robert's idea of a fast-castable
> > type, I suppose you might want now()::icu_date + '1 month'::internal
> > to advance you by one Ethiopic month if you have done SET
> > icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'.
>
> I've pushed a calendar branch on icu_ext [1] with preliminary support
> for non-gregorian calendars through ICU, so far with only format and parse
> of timetamptz.
>

Thanks


>
>
> I understand that adding months or years with some of the non-gregorian
> calendars should lead to different points in time than with the gregorian
> calendar.
>
> For instance with the ethiopic calendar, the query above displays today as
> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013,
> while the correct result is probably 17/miyazya/2013 (?)
>
>
yes it should be 17/miyazya/2013 (?)


> I'm not sure at this point that there should be a new set of
> data/interval/timestamp types though, especially if considering
> the integration in core.
>
> About intervals, if there were locale-aware functions like
>  add_interval(timestamptz, interval [, locale]) returns timestamptz
> or
>  sub_timestamp(timestamptz, timestamptz [,locale]) returns interval
> that would use ICU to compute the results according to the locale,
> wouldn't it be good enough?
>
>
Yes it can be enough for now but there are patches proposed to support the
system and application time period which are in SQL standard and if we have
that feature the calendar has to be in core and It doesn't appear hard for
me to support the calendar locally. Postgresql itself does't store
Gregorian date it stores julian date(which is more accurate than gregorian
calendar) and almost all of function and operator is done using julian date
converted to second(TimestampTz) so what it takes to support calendar
locally is input/output function and a converter from and to julian
calendar and that may not be that much hard since most of the world
calendar is based on julian or gregorian calendar[0]. Thought?

0.https://en.wikipedia.org/wiki/List_of_calendars

regards
Surafel


Re: Calendar support in localization

2021-03-29 Thread Matthias van de Meent
On Mon, 29 Mar 2021 at 14:33, Daniel Verite  wrote:
>
> Matthias van de Meent wrote:
>
> > The results for the Japanese locale should be "0003 Reiwa" instead of
> > "0033 Heisei", as the era changed in 2019. ICU releases have since
> > implemented this and other corrections; this specific change was
> > implemented in the batched release of ICU versions on 2019-04-12.
>
> Right. I've run this test on an Ubuntu 18.04 desktop which comes with
> libicu60 . The current version for my system is 60.2-3ubuntu3.1.
> Ubuntu maintainers did not pick up the change of the new Japanese era.
> As a guess, it's because it's not a security fix.
> This contrasts with the baseline maintainers, who did an
> exceptional effort to backpatch this down to ICU 53
> (exceptional in the sense that they don't do that for bugfixes).
>
> >> For instance with the ethiopic calendar, the query above displays today as
> >> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013,
> >> while the correct result is probably 17/miyazya/2013 (?)
>
> > Seeing the results for Japanese locale, you might want to update your
> > ICU library, which could fix this potential inconsistency.
>
> I agree it's always best to have the latest ICU version, but in the
> context of Postgres, we have to work with the versions that are
> typically installed on users systems. People who have pre-2019
> versions will simply be stuck with the previous Japanese era.
>
> Anyway, for the specific problem that the interval datatype cannot be
> used seamlessly across all calendars, it's essentially about how days
> are mapped into calendars, and it's unrelated to ICU updates AFAIU.

Ah, yes, I only glanced over the supplied query and misunderstood it
due to not taking enough time. I understood it as 'use icu locale info
to add 1 month to the current date', which would use ICU knowledge
about months in the locale and would be consistent with the question
mark, instead of 'use icu to interpret the result of adding one
non-icu-locale-dependent month to the current non-icu-locale-dependent
date'. If it were the former, my response would have made more sense,
but it doesn't in this case. So, sorry for the noise.

> About intervals, if there were locale-aware functions like
>  add_interval(timestamptz, interval [, locale]) returns timestamptz
> or
>  sub_timestamp(timestamptz, timestamptz [,locale]) returns interval
> that would use ICU to compute the results according to the locale,
> wouldn't it be good enough?

I agree, that should fix the issues at hand / grant a workable path
for locale-aware timestamp manipulation.


With regards,

Matthias van de Meent




Re: Calendar support in localization

2021-03-29 Thread Daniel Verite
Matthias van de Meent wrote:

> The results for the Japanese locale should be "0003 Reiwa" instead of
> "0033 Heisei", as the era changed in 2019. ICU releases have since
> implemented this and other corrections; this specific change was
> implemented in the batched release of ICU versions on 2019-04-12.

Right. I've run this test on an Ubuntu 18.04 desktop which comes with
libicu60 . The current version for my system is 60.2-3ubuntu3.1.
Ubuntu maintainers did not pick up the change of the new Japanese era.
As a guess, it's because it's not a security fix.
This contrasts with the baseline maintainers, who did an
exceptional effort to backpatch this down to ICU 53
(exceptional in the sense that they don't do that for bugfixes).

>> For instance with the ethiopic calendar, the query above displays today as
>> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013,
>> while the correct result is probably 17/miyazya/2013 (?)

> Seeing the results for Japanese locale, you might want to update your
> ICU library, which could fix this potential inconsistency.

I agree it's always best to have the latest ICU version, but in the
context of Postgres, we have to work with the versions that are
typically installed on users systems. People who have pre-2019
versions will simply be stuck with the previous Japanese era.

Anyway, for the specific problem that the interval datatype cannot be
used seamlessly across all calendars, it's essentially about how days
are mapped into calendars, and it's unrelated to ICU updates AFAIU.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite




Re: Calendar support in localization

2021-03-29 Thread Matthias van de Meent
On Fri, 26 Mar 2021 at 18:51, Daniel Verite  wrote:
> [...]
> -[ RECORD 2 ]--+---
> cal| japanese
> now_str| 26/mars/0033 Heisei 18:22:07.566 UTC+1
> now_parsed | 2021-03-26 18:22:07.566+01
> plus_1m| 26/avril/0033 Heisei 18:22:07.566 UTC+2
> plus_1m_parsed | 2021-04-26 18:22:07.566+02
> -[ RECORD 3 ]--+---
> [...]
> -[ RECORD 12 ]-+---
> cal| ethiopic
> now_str| 17/mägabit/2013 après l’Incarnation 18:22:07.566 UTC+1
> now_parsed | 2021-03-26 18:22:07.566+01
> plus_1m| 18/miyazya/2013 après l’Incarnation 18:22:07.566 UTC+2
> plus_1m_parsed | 2021-04-26 18:22:07.566+02
> -[ RECORD 13 ]-+---
> cal| ethiopic-amete-alem
> now_str| 17/mägabit/7513 ERA0 18:22:07.566 UTC+1
> now_parsed | 2021-03-26 18:22:07.566+01
> plus_1m| 18/miyazya/7513 ERA0 18:22:07.566 UTC+2
> plus_1m_parsed | 2021-04-26 18:22:07.566+02
> [...]
> I understand that adding months or years with some of the non-gregorian
> calendars should lead to different points in time than with the gregorian
> calendar.
>
> For instance with the ethiopic calendar, the query above displays today as
> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013,
> while the correct result is probably 17/miyazya/2013 (?)

Seeing the results for Japanese locale, you might want to update your
ICU library, which could fix this potential inconsistency.

The results for the Japanese locale should be "0003 Reiwa" instead of
"0033 Heisei", as the era changed in 2019. ICU releases have since
implemented this and other corrections; this specific change was
implemented in the batched release of ICU versions on 2019-04-12.


With regards,

Matthias van de Meent




Re: Calendar support in localization

2021-03-29 Thread Thomas Munro
On Sat, Mar 27, 2021 at 6:51 AM Daniel Verite  wrote:
> now_str| 17/mägabit/2013 après l’Incarnation 18:22:07.566 UTC+1

Very nice!

> For instance with the ethiopic calendar, the query above displays today as
> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013,
> while the correct result is probably 17/miyazya/2013 (?)
>
>
> I'm not sure at this point that there should be a new set of
> data/interval/timestamp types though, especially if considering
> the integration in core.
>
> About intervals, if there were locale-aware functions like
>  add_interval(timestamptz, interval [, locale]) returns timestamptz
> or
>  sub_timestamp(timestamptz, timestamptz [,locale]) returns interval
> that would use ICU to compute the results according to the locale,
> wouldn't it be good enough?

+1, I'd probably do that next if I were hacking on this...




Re: Calendar support in localization

2021-03-26 Thread Daniel Verite
Thomas Munro wrote:

> Right, so if this is done by trying to extend Daniel Verite's icu_ext
> extension (link given earlier) and Robert's idea of a fast-castable
> type, I suppose you might want now()::icu_date + '1 month'::internal
> to advance you by one Ethiopic month if you have done SET
> icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'.

I've pushed a calendar branch on icu_ext [1] with preliminary support
for non-gregorian calendars through ICU, so far with only format and parse
of timetamptz.
The ICU locale drives both the localization of field names (language) and the
choice of calendar.

It looks like this:

\set fmt 'dd//  HH:mm:ss.SSS zz'

 WITH list(cal) AS ( values 
('gregorian'),
('japanese'),
('buddhist'),
('roc'),
('persian'),
('islamic-civil'),
('islamic'),
('hebrew'),
('chinese'),
('indian'),
('coptic'),
('ethiopic'),
('ethiopic-amete-alem'),
('iso8601'),
('dangi')
),
fmt AS (
 select
  cal,
  icu_format_date(now(),  :'fmt', 'fr@calendar='||cal) as now_str,
  icu_format_date(now()+'1 month'::interval,  :'fmt', 'fr@calendar='||cal) as
plus_1m
  from list
)
SELECT
 cal,
 now_str,
 icu_parse_date(now_str, :'fmt', 'fr@calendar='||cal) as now_parsed,
 plus_1m,
 icu_parse_date(plus_1m, :'fmt', 'fr@calendar='||cal) as plus_1m_parsed
FROM fmt;


-[ RECORD 1 ]--+---
cal| gregorian
now_str| 26/mars/2021 après Jésus-Christ 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 26/avril/2021 après Jésus-Christ 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 2 ]--+---
cal| japanese
now_str| 26/mars/0033 Heisei 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 26/avril/0033 Heisei 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 3 ]--+---
cal| buddhist
now_str| 26/mars/2564 ère bouddhique 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 26/avril/2564 ère bouddhique 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 4 ]--+---
cal| roc
now_str| 26/mars/0110 RdC 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 26/avril/0110 RdC 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 5 ]--+---
cal| persian
now_str| 06/farvardin/1400 Anno Persico 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 06/ordibehešt/1400 Anno Persico 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 6 ]--+---
cal| islamic-civil
now_str| 12/chaabane/1442 ère de l’Hégire 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 14/ramadan/1442 ère de l’Hégire 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 7 ]--+---
cal| islamic
now_str| 13/chaabane/1442 ère de l’Hégire 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 14/ramadan/1442 ère de l’Hégire 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 8 ]--+---
cal| hebrew
now_str| 13/nissan/5781 Anno Mundi 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 14/iyar/5781 Anno Mundi 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 9 ]--+---
cal| chinese
now_str| 14/èryuè/0038 78 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 15/sānyuè/0038 78 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 10 ]-+---
cal| indian
now_str| 05/chaitra/1943 ère Saka 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 06/vaishākh/1943 ère Saka 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 11 ]-+---
cal| coptic
now_str| 17/barmahât/1737 après Dioclétien 18:22:07.566 UTC+1
now_parsed | 2021-03-26 18:22:07.566+01
plus_1m| 18/barmoudah/1737 après Dioclétien 18:22:07.566 UTC+2
plus_1m_parsed | 2021-04-26 18:22:07.566+02
-[ RECORD 12 ]-+---
cal| ethiopic
now_str| 17/mägabit/2013 après l’Incarnation 18:22:07.566 UTC+1
now_parsed | 2021-03-26 

Re: Calendar support in localization

2021-03-18 Thread Surafel Temesgen
On Wed, Mar 17, 2021 at 3:39 PM Thomas Munro  wrote:

> On Thu, Mar 18, 2021 at 3:48 AM Tom Lane  wrote:
>
> Right, so if this is done by trying to extend Daniel Verite's icu_ext
> extension (link given earlier) and Robert's idea of a fast-castable
> type, I suppose you might want now()::icu_date + '1 month'::internal
> to advance you by one Ethiopic month if you have done SET
> icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'.  Or if using my
> first idea of just sticking with the core types, perhaps you'd have to
> replace stuff via search path... I admit that sounds rather error
> prone and fragile (I was thinking mainly of different functions, not
> operators).  Either way, I suppose there'd also be more explicit
> functions for various operations including ones that take an extra
> argument if you want to use an explicit locale instead of relying on
> the ICU_LC_TIME setting.  I dunno.
>
>
As you know internally timestamptz data type does't existe instead it
stored as integer kind and we depend on operating system and external
library for our date data type support so i think that put as on the
position for not be the first one to implement timestamptz data type thing
and i don't know who give as the casting for free?

regards
Surafel


Re: Calendar support in localization

2021-03-17 Thread Vik Fearing
On 3/17/21 3:48 PM, Tom Lane wrote:
> Also, the SQL spec says in so many words
> that the SQL-defined datetime types follow the Gregorian calendar.

We already don't follow the SQL spec for timestamps (and I, for one,
think we are better for it) so I don't think we should worry about that.
-- 
Vik Fearing




Re: Calendar support in localization

2021-03-17 Thread Thomas Munro
On Thu, Mar 18, 2021 at 3:48 AM Tom Lane  wrote:
> Robert Haas  writes:
> > It's not very obvious how to scale this kind of approach to a wide
> > variety of calendar types, and as Thomas says, it would much cooler to
> > be able to handle all of the ones that ICU knows how to support rather
> > than just one. But, the problem I see with using timestamptz is that
> > it's not so obvious how to get a different output format ... unless, I
> > guess, we can cram it into DateStyle. And it's also much less obvious
> > how you get the other functions and operators to do what you want, if
> > it's different.
>
> Yeah, I'm afraid that it probably is different.  The most obvious
> example is in operations involving type interval:
> select now() + '1 month'::interval;
> That should almost certainly give a different answer when using a
> different calendar --- indeed the units of interest might not even
> be the same.  (Do all human calendars use the concept of months?)

Right, so if this is done by trying to extend Daniel Verite's icu_ext
extension (link given earlier) and Robert's idea of a fast-castable
type, I suppose you might want now()::icu_date + '1 month'::internal
to advance you by one Ethiopic month if you have done SET
icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'.  Or if using my
first idea of just sticking with the core types, perhaps you'd have to
replace stuff via search path... I admit that sounds rather error
prone and fragile (I was thinking mainly of different functions, not
operators).  Either way, I suppose there'd also be more explicit
functions for various operations including ones that take an extra
argument if you want to use an explicit locale instead of relying on
the ICU_LC_TIME setting.  I dunno.

As for whether all calendars have months, it looks like ICU's model
has just the familiar looking standardised fields; whether some of
them make no sense in some calendars, I don't know, but it has stuff
like x.get(field, ), x.set(field, ), x.add(field, amount,
) and if it fails for some field on your particular calendar, or
for some value (you can't set a Gregorian date's month to 13
(apparently we call this month "undecember", hah), but you can for a
Hebrew or Ethiopic one) I suppose we'd just report the error?

> I don't feel like DateStyle is chartered to affect the behavior
> of datetime operators; it's understood as tweaking the I/O behavior
> only.  There might be more of a case for letting LC_TIME choose
> this behavior, but I bet the relevant standards only contemplate

About LC_TIME... I suppose in one possible future we eventually use
ICU for more core stuff, and someone proposes to merge hypothetical
icu_date etc types into the core date etc types, and then LC_TIME
controls that.  But then you might have a version of the problem that
Peter E ran into in attempts so far to use ICU collations as the
default: if you put ICU's funky extensible locale names into the
LC_XXX environment variables, then your libc will see it too, and
might get upset, since PostgreSQL uses the en.  I suspect that ICU
will understand typical libc locale names, but common libcs won't
understand ICU's highly customisable syntax, but I haven't looked into
it.  If that's generally true, then perhaps the solution to both
problems is a kind of partial separation:  regular LC_XXX, and then
also ICU_LC_XXX which defaults to the same value but can be changed to
access more advanced stuff, and is used only for interacting with ICU.

> Gregorian calendars.  Also, the SQL spec says in so many words
> that the SQL-defined datetime types follow the Gregorian calendar.

:-(




Re: Calendar support in localization

2021-03-17 Thread Tom Lane
Robert Haas  writes:
> It's not very obvious how to scale this kind of approach to a wide
> variety of calendar types, and as Thomas says, it would much cooler to
> be able to handle all of the ones that ICU knows how to support rather
> than just one. But, the problem I see with using timestamptz is that
> it's not so obvious how to get a different output format ... unless, I
> guess, we can cram it into DateStyle. And it's also much less obvious
> how you get the other functions and operators to do what you want, if
> it's different.

Yeah, I'm afraid that it probably is different.  The most obvious
example is in operations involving type interval:
select now() + '1 month'::interval;
That should almost certainly give a different answer when using a
different calendar --- indeed the units of interest might not even
be the same.  (Do all human calendars use the concept of months?)

I don't feel like DateStyle is chartered to affect the behavior
of datetime operators; it's understood as tweaking the I/O behavior
only.  There might be more of a case for letting LC_TIME choose
this behavior, but I bet the relevant standards only contemplate
Gregorian calendars.  Also, the SQL spec says in so many words
that the SQL-defined datetime types follow the Gregorian calendar.

So on the whole, new datatypes and operators seem like the way
to go.  I concur that if ICU has solved this problem, piggybacking
on it seems more promising than writing our own code.

regards, tom lane




Re: Calendar support in localization

2021-03-17 Thread Robert Haas
On Wed, Mar 17, 2021 at 9:54 AM Surafel Temesgen  wrote:
> As you mention above whatever the calendar type is we ended up storing an 
> integer that  represent the date so rather than re-implementing every 
> function and operation for every calendar we can use existing Gerigorian 
> implementation as a base implementation and if new calendar want to perform 
> same function or operation it translate to Gregorian one and use the existing 
> function and operation and translate to back to working calendar. In this 
> approach the only function we want for supporting a new calendar is a 
> translator from the new calendar to Gregorian one and from Gerigorian 
> calendar to the new calendar and may be input/ output function. What do you 
> think of this implementation?

I'm not sure what the best way of tackling this problem is, but I
wanted to mention another possible approach: instead of actually using
the timestamptz data type, have another data type that is
binary-compatible with timestamptz - that is, it's the same thing on
disk, so you can cast between the two data types for free. Then have
separate input/output functions for it, separate operators and
functions and so forth.

It's not very obvious how to scale this kind of approach to a wide
variety of calendar types, and as Thomas says, it would much cooler to
be able to handle all of the ones that ICU knows how to support rather
than just one. But, the problem I see with using timestamptz is that
it's not so obvious how to get a different output format ... unless, I
guess, we can cram it into DateStyle. And it's also much less obvious
how you get the other functions and operators to do what you want, if
it's different.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Calendar support in localization

2021-03-17 Thread Surafel Temesgen
On Tue, Mar 16, 2021 at 12:20 PM Thomas Munro 
wrote:

> On Wed, Mar 17, 2021 at 6:31 AM Surafel Temesgen 
> wrote:
> > Ethiopice calendar have 13 months so it can not be stored as date and
> timestamp type and you approach seems more complicated and i suggest to
> have this feature on the purpose of PostgreSQL popularities too not only
> for my need
>
> I know, but the DATE and TIMESTAMPTZ datatypes don't intrinsically
> know anything about months or other calendar concepts.  Internally,
> they are just a single number that counts the number of days or
> seconds since an arbitrary epoch time.  We are all in agreement about
> how many times the Earth has rotated since then*.  The calendar
> concepts such as "day", "month", "year", whether Gregorian, Ethiopic,
> Islamic, ... are all derivable from those numbers, if you know the
> rules.
>
>
Okay


>
>
> > I think you suggesting this by expecting the implementation is difficult
> but it's not that much difficult once you fully understand Gregorian
> calendar and the calendar you work on
>
> Yeah, I am sure it's all just a bunch of simple integer maths.  But
> I'm talking about things like software architecture, maintainability,
> cohesion, and getting maximum impact for the work we do.
>
> I may be missing some key detail though: why do you think it should be
> a different type?  The two reasons I can think of are: (1) the
> slightly tricky detail that the date apparently changes at 1:00am
> (which I don't think is a show stopper for this approach, I could
> elaborate), (2) you may want dates to be formatted on the screen with
> the Ethiopic calendar in common software like psql and GUI clients,
> which may be easier to arrange with different types, but that seems to
> be a cosmetic thing that could eventually be handled with tighter
> locale integration with ICU.  In the early stages you'd access
> calendar logic though special functions with names like
> icu_format_date(), or whatever.
>
>
As you mention above whatever the calendar type is we ended up storing an
integer that  represent the date so rather than re-implementing every
function and operation for every calendar we can use existing Gerigorian
implementation as a base implementation and if new calendar want to perform
same function or operation it translate to Gregorian one and use the
existing function and operation and translate to back to working calendar.
In this approach the only function we want for supporting a new calendar is
a translator from the new calendar to Gregorian one and from Gerigorian
calendar to the new calendar and may be input/ output function. What do you
think of this implementation?

regards
Surafel


Re: Calendar support in localization

2021-03-16 Thread Thomas Munro
On Wed, Mar 17, 2021 at 6:31 AM Surafel Temesgen  wrote:
> Ethiopice calendar have 13 months so it can not be stored as date and 
> timestamp type and you approach seems more complicated and i suggest to have 
> this feature on the purpose of PostgreSQL popularities too not only for my 
> need

I know, but the DATE and TIMESTAMPTZ datatypes don't intrinsically
know anything about months or other calendar concepts.  Internally,
they are just a single number that counts the number of days or
seconds since an arbitrary epoch time.  We are all in agreement about
how many times the Earth has rotated since then*.  The calendar
concepts such as "day", "month", "year", whether Gregorian, Ethiopic,
Islamic, ... are all derivable from those numbers, if you know the
rules.

So I think you should seriously consider using the same types.

> Each  calendar-aware date arithmetic is different so solving one calendar 
> problem didn't help on other calendar

They have a *lot* in common though.  They have similar "fields" (day,
month, year etc), based on the Earth, moon, sun etc, so it is possible
to use a common abstraction to interact with them.  I haven't studied
it too closely, but it looks like ICU can give you a "Calendar" object
for a given Locale (which you create from a string like
"am_ET@calendar=traditional") and timezone ("Africa/Addis_Ababa").
Then you can set the object's time to X seconds since an epoch, based
on UTC seconds without leap seconds -- which is exactly like our
TIMESTAMPTZ's internal value -- and then you can query it to get
fields like month etc.  Or do the opposite, or use formatting and
parsing routines etc.  Internally, ICU has a C++ class for each
calendar with a name like EthiopicCalendar, IslamicCalendar etc which
encapsulates all the logic, but it's not necessary to use them
directly: we could just look them up with names via the C API and then
treat them all the same.

> I think you suggesting this by expecting the implementation is difficult but 
> it's not that much difficult once you fully understand Gregorian calendar and 
> the calendar you work on

Yeah, I am sure it's all just a bunch of simple integer maths.  But
I'm talking about things like software architecture, maintainability,
cohesion, and getting maximum impact for the work we do.

I may be missing some key detail though: why do you think it should be
a different type?  The two reasons I can think of are: (1) the
slightly tricky detail that the date apparently changes at 1:00am
(which I don't think is a show stopper for this approach, I could
elaborate), (2) you may want dates to be formatted on the screen with
the Ethiopic calendar in common software like psql and GUI clients,
which may be easier to arrange with different types, but that seems to
be a cosmetic thing that could eventually be handled with tighter
locale integration with ICU.  In the early stages you'd access
calendar logic though special functions with names like
icu_format_date(), or whatever.

Maybe I'm totally wrong about all of this, but this is the first way
I'd probably try to tackle this problem, and I suspect it has the
highest chance of eventually being included in core PostgreSQL.

*I mean, we can discuss the different "timelines" like UT, UTC, TAI
etc, but that's getting into the weeds, the usual timeline for
computer software outside specialist scientific purposes is UTC
without leap seconds.




Re: Calendar support in localization

2021-03-16 Thread Surafel Temesgen
Hi Thomas

On Mon, Mar 15, 2021 at 2:58 PM Thomas Munro  wrote:

>
> One key question here is whether you need a different date type or
> just different operations (functions, operators etc) on the existing
> types.
>
>
I am thinking of having a converter to a specific calendar after each
operation and function for display or storage. It works on
Ethiopice calendar and i expect it will work on other calendar too


> > I cc Thomas Munro and Vik because they have interest on this area
>
> Last time it came up[1], I got as far as wondering if the best way
> would be to write a set of ICU-based calendar functions.  Would it be
> enough for your needs to have Ethiopic calendar-aware date arithmetic
> (add, subtract a month etc), date part extraction (get the current
> Ethiopic day/month/year of a date), display and parsing, and have all
> of these as functions that you have to call explicitly, but have them
> take the standard built-in date and timestamp types, so that your
> tables would store regular date and timestamp values?  If not, what
> else do you need?
>
>
Ethiopice calendar have 13 months so it can not be stored as date and
timestamp type and you approach seems more complicated and i suggest to
have this feature on the purpose of PostgreSQL popularities too not only
for my need


> ICU is very well maintained and widely used software, and PostgreSQL
> already depends on it optionally, and that's enabled in all common
> distributions.  In other words, maybe all the logic you want exists
> already in your process's memory, we just have to figure out how to
> reach it from SQL.  Another reason to use ICU is that we can solve
> this problem once and then it'll work for many other calendars.
>
>
Each  calendar-aware date arithmetic is different so solving one calendar
problem didn't help on other calendar


> > Please don't suggests to fork from PostgreSQL just for this feature
>
> I would start with an extension, and I'd try to do a small set of
> simple functions, to let me write things like:
>
>   icu_format(now(), 'fr_FR@calendar=buddhist') to get a Buddhist
> calendar with French words
>
>   icu_date_part('year', current_date, 'am_ET@calendar=traditional') to
> get the current year in the Ethiopic calendar (2013 apparently)
>
> Well, the first one probably also needs a format string too, actual
> details to be worked out by reading the ICU manual...
>

I think you suggesting this by expecting the implementation is difficult
but it's not that much difficult once you fully understand Gregorian
calendar and the calendar you work on


>
> Maybe instead of making a new extension, I might try to start from
> https://github.com/dverite/icu_ext and see if it makes sense to extend
> it to cover calendars.
>
> Maybe one day ICU will become a hard dependency of PostgreSQL and
> someone will propose all that stuff into core, and then maybe we could
> start to think about the possibility of tighter integration with the
> built-in date/time functions (and LC_TIME setting?  seems complicated,
> see also problems with setting LC_COLLATE/datcollate to an ICU
> collation name, but I digress and that's a far off problem).  I would
> also study the SQL standard and maybe DB2 (highly subjective comment:
> at a wild guess, the most likely commercial RDBMS to have done a good
> job of this if anyone has) to see if they contemplate non-Gregorian
> calendars, to get some feel for whether that would eventually be a
> possibility to conform with whatever the standard says.
>
> In summary, getting something of very high quality by using a widely
> used open source library that we already use seems like a better plan
> than trying to write and maintain our own specialist knowledge about
> individual calendars.  If there's something you need that can't be
> done with its APIs working on top of our regular date and timestamp
> types, could you elaborate?
>
> [1]
> https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BybW0LJuLtj3yAUsbOw3DrzK00pGk8JyfpCREzi_LSsg%40mail.gmail.com#393d827f1be589d0ad6ca6b016905e80


I don't know how you see this but for me the feature deserves a specialist
and it is not that much difficult to have one because i guess every majore
calendar have english documentation

regards
Surafel


Re: Calendar support in localization

2021-03-15 Thread Thomas Munro
Hi Surafel,

On Tue, Mar 16, 2021 at 3:48 AM Surafel Temesgen  wrote:
> My country(Ethiopia) is one of the nations that uses different kind of 
> calendar than what PostgreSQL have so we are deprived from the benefit of 
> data datatype. We just uses String to store date that limits our application 
> quality greatly. The lag became even worst once application and system time 
> support is available and it seems to me it is not fair to suggest to add 
> other date data type kind and implementation for just different calendar that 
> even not minor user group. Having calendar support to localization will be 
> very very very very exciting feature for none Gregorian calendar user group 
> and make so loved. As far as i can see the difficult thing is understanding 
> different calendar. I can prepare a patch for Ethiopian calendar once we have 
> consensus.

One key question here is whether you need a different date type or
just different operations (functions, operators etc) on the existing
types.

> I cc Thomas Munro and Vik because they have interest on this area

Last time it came up[1], I got as far as wondering if the best way
would be to write a set of ICU-based calendar functions.  Would it be
enough for your needs to have Ethiopic calendar-aware date arithmetic
(add, subtract a month etc), date part extraction (get the current
Ethiopic day/month/year of a date), display and parsing, and have all
of these as functions that you have to call explicitly, but have them
take the standard built-in date and timestamp types, so that your
tables would store regular date and timestamp values?  If not, what
else do you need?

ICU is very well maintained and widely used software, and PostgreSQL
already depends on it optionally, and that's enabled in all common
distributions.  In other words, maybe all the logic you want exists
already in your process's memory, we just have to figure out how to
reach it from SQL.  Another reason to use ICU is that we can solve
this problem once and then it'll work for many other calendars.

> Please don't suggests to fork from PostgreSQL just for this feature

I would start with an extension, and I'd try to do a small set of
simple functions, to let me write things like:

  icu_format(now(), 'fr_FR@calendar=buddhist') to get a Buddhist
calendar with French words

  icu_date_part('year', current_date, 'am_ET@calendar=traditional') to
get the current year in the Ethiopic calendar (2013 apparently)

Well, the first one probably also needs a format string too, actual
details to be worked out by reading the ICU manual...

Maybe instead of making a new extension, I might try to start from
https://github.com/dverite/icu_ext and see if it makes sense to extend
it to cover calendars.

Maybe one day ICU will become a hard dependency of PostgreSQL and
someone will propose all that stuff into core, and then maybe we could
start to think about the possibility of tighter integration with the
built-in date/time functions (and LC_TIME setting?  seems complicated,
see also problems with setting LC_COLLATE/datcollate to an ICU
collation name, but I digress and that's a far off problem).  I would
also study the SQL standard and maybe DB2 (highly subjective comment:
at a wild guess, the most likely commercial RDBMS to have done a good
job of this if anyone has) to see if they contemplate non-Gregorian
calendars, to get some feel for whether that would eventually be a
possibility to conform with whatever the standard says.

In summary, getting something of very high quality by using a widely
used open source library that we already use seems like a better plan
than trying to write and maintain our own specialist knowledge about
individual calendars.  If there's something you need that can't be
done with its APIs working on top of our regular date and timestamp
types, could you elaborate?

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BybW0LJuLtj3yAUsbOw3DrzK00pGk8JyfpCREzi_LSsg%40mail.gmail.com#393d827f1be589d0ad6ca6b016905e80