Re: Calendar support in localization
On Wed, Mar 17, 2021 at 8:20 AM Thomas Munro wrote: > *I mean, we can discuss the different "timelines" like UT, UTC, TAI > etc, but that's getting into the weeds, the usual timeline for > computer software outside specialist scientific purposes is UTC > without leap seconds. (Erm, rereading this thread, I meant to write "time scales" there.)
Re: Calendar support in localization
On Tue, Mar 30, 2021 at 11:16 AM Daniel Verite wrote: > > The conversions from julian dates are not necessarily hard, but the > I/O functions means having localized names for all days, months, eras > of all calendars in all supported languages. If you're thinking of > implementing this from scratch (without the ICU dependency), where > would these names come from? OTOH if we're using ICU, then why > bother reinventing the julian-to-calendars conversions that ICU > already does? > > i donno why but currently we are using our own function for converting (see j2date and date2j) maybe it's written before ICU but i think ICU helps in adding other calendar support easly. Regarding I/O functions postgresql hard coded days and months names on array and just parse and string compare, if it is not on the list then error(see datetime.c) and it will be the same for other calendar but i think we don't need all that if we use ICU regards Surafel
Re: Calendar support in localization
Surafel Temesgen wrote: > > About intervals, if there were locale-aware functions like > > add_interval(timestamptz, interval [, locale]) returns timestamptz > > or > > sub_timestamp(timestamptz, timestamptz [,locale]) returns interval > > that would use ICU to compute the results according to the locale, > > wouldn't it be good enough? > > > > > Yes it can be enough for now but there are patches proposed to support the > system and application time period which are in SQL standard To clarify, these function signatures are not meant to oppose a core vs extension implementation, nor an ICU vs non-ICU implementation. They're meant to illustrate the case of using specific functions instead of adding specific data types. AFAIU, adding data types come from the idea that since (non-gregorian-date + interval) doesn't have the same result as (gregorian-date + interval), we could use a different type for non-gregorian-date and so a different "+" operator, maybe even a specific interval type. For the case of temporal tables, I'm not quite familiar with the feature, but I notice that the patch says: +When system versioning is specified two columns are added which +record the start timestamp and end timestamp of each row verson. +The data type of these columns will be TIMESTAMP WITH TIME ZONE. The user doesn't get to choose the data type, so if we'd require to use specific data types for non-gregorian calendars, that would seemingly complicate things for this feature. This is consistent with the remark upthread that the SQL standard assumes the gregorian calendar. > what it takes to support calendar locally is input/output function > and a converter from and to julian calendar and that may not be that > much hard since most of the world calendar is based on julian or > gregorian calendar[0] The conversions from julian dates are not necessarily hard, but the I/O functions means having localized names for all days, months, eras of all calendars in all supported languages. If you're thinking of implementing this from scratch (without the ICU dependency), where would these names come from? OTOH if we're using ICU, then why bother reinventing the julian-to-calendars conversions that ICU already does? Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite
Re: Calendar support in localization
Hi Daniel, On Fri, Mar 26, 2021 at 8:51 PM Daniel Verite wrote: > Thomas Munro wrote: > > > Right, so if this is done by trying to extend Daniel Verite's icu_ext > > extension (link given earlier) and Robert's idea of a fast-castable > > type, I suppose you might want now()::icu_date + '1 month'::internal > > to advance you by one Ethiopic month if you have done SET > > icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'. > > I've pushed a calendar branch on icu_ext [1] with preliminary support > for non-gregorian calendars through ICU, so far with only format and parse > of timetamptz. > Thanks > > > I understand that adding months or years with some of the non-gregorian > calendars should lead to different points in time than with the gregorian > calendar. > > For instance with the ethiopic calendar, the query above displays today as > 17/mägabit/2013 and 1 month from now as 18/miyazya/2013, > while the correct result is probably 17/miyazya/2013 (?) > > yes it should be 17/miyazya/2013 (?) > I'm not sure at this point that there should be a new set of > data/interval/timestamp types though, especially if considering > the integration in core. > > About intervals, if there were locale-aware functions like > add_interval(timestamptz, interval [, locale]) returns timestamptz > or > sub_timestamp(timestamptz, timestamptz [,locale]) returns interval > that would use ICU to compute the results according to the locale, > wouldn't it be good enough? > > Yes it can be enough for now but there are patches proposed to support the system and application time period which are in SQL standard and if we have that feature the calendar has to be in core and It doesn't appear hard for me to support the calendar locally. Postgresql itself does't store Gregorian date it stores julian date(which is more accurate than gregorian calendar) and almost all of function and operator is done using julian date converted to second(TimestampTz) so what it takes to support calendar locally is input/output function and a converter from and to julian calendar and that may not be that much hard since most of the world calendar is based on julian or gregorian calendar[0]. Thought? 0.https://en.wikipedia.org/wiki/List_of_calendars regards Surafel
Re: Calendar support in localization
On Mon, 29 Mar 2021 at 14:33, Daniel Verite wrote: > > Matthias van de Meent wrote: > > > The results for the Japanese locale should be "0003 Reiwa" instead of > > "0033 Heisei", as the era changed in 2019. ICU releases have since > > implemented this and other corrections; this specific change was > > implemented in the batched release of ICU versions on 2019-04-12. > > Right. I've run this test on an Ubuntu 18.04 desktop which comes with > libicu60 . The current version for my system is 60.2-3ubuntu3.1. > Ubuntu maintainers did not pick up the change of the new Japanese era. > As a guess, it's because it's not a security fix. > This contrasts with the baseline maintainers, who did an > exceptional effort to backpatch this down to ICU 53 > (exceptional in the sense that they don't do that for bugfixes). > > >> For instance with the ethiopic calendar, the query above displays today as > >> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013, > >> while the correct result is probably 17/miyazya/2013 (?) > > > Seeing the results for Japanese locale, you might want to update your > > ICU library, which could fix this potential inconsistency. > > I agree it's always best to have the latest ICU version, but in the > context of Postgres, we have to work with the versions that are > typically installed on users systems. People who have pre-2019 > versions will simply be stuck with the previous Japanese era. > > Anyway, for the specific problem that the interval datatype cannot be > used seamlessly across all calendars, it's essentially about how days > are mapped into calendars, and it's unrelated to ICU updates AFAIU. Ah, yes, I only glanced over the supplied query and misunderstood it due to not taking enough time. I understood it as 'use icu locale info to add 1 month to the current date', which would use ICU knowledge about months in the locale and would be consistent with the question mark, instead of 'use icu to interpret the result of adding one non-icu-locale-dependent month to the current non-icu-locale-dependent date'. If it were the former, my response would have made more sense, but it doesn't in this case. So, sorry for the noise. > About intervals, if there were locale-aware functions like > add_interval(timestamptz, interval [, locale]) returns timestamptz > or > sub_timestamp(timestamptz, timestamptz [,locale]) returns interval > that would use ICU to compute the results according to the locale, > wouldn't it be good enough? I agree, that should fix the issues at hand / grant a workable path for locale-aware timestamp manipulation. With regards, Matthias van de Meent
Re: Calendar support in localization
Matthias van de Meent wrote: > The results for the Japanese locale should be "0003 Reiwa" instead of > "0033 Heisei", as the era changed in 2019. ICU releases have since > implemented this and other corrections; this specific change was > implemented in the batched release of ICU versions on 2019-04-12. Right. I've run this test on an Ubuntu 18.04 desktop which comes with libicu60 . The current version for my system is 60.2-3ubuntu3.1. Ubuntu maintainers did not pick up the change of the new Japanese era. As a guess, it's because it's not a security fix. This contrasts with the baseline maintainers, who did an exceptional effort to backpatch this down to ICU 53 (exceptional in the sense that they don't do that for bugfixes). >> For instance with the ethiopic calendar, the query above displays today as >> 17/mägabit/2013 and 1 month from now as 18/miyazya/2013, >> while the correct result is probably 17/miyazya/2013 (?) > Seeing the results for Japanese locale, you might want to update your > ICU library, which could fix this potential inconsistency. I agree it's always best to have the latest ICU version, but in the context of Postgres, we have to work with the versions that are typically installed on users systems. People who have pre-2019 versions will simply be stuck with the previous Japanese era. Anyway, for the specific problem that the interval datatype cannot be used seamlessly across all calendars, it's essentially about how days are mapped into calendars, and it's unrelated to ICU updates AFAIU. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite
Re: Calendar support in localization
On Fri, 26 Mar 2021 at 18:51, Daniel Verite wrote: > [...] > -[ RECORD 2 ]--+--- > cal| japanese > now_str| 26/mars/0033 Heisei 18:22:07.566 UTC+1 > now_parsed | 2021-03-26 18:22:07.566+01 > plus_1m| 26/avril/0033 Heisei 18:22:07.566 UTC+2 > plus_1m_parsed | 2021-04-26 18:22:07.566+02 > -[ RECORD 3 ]--+--- > [...] > -[ RECORD 12 ]-+--- > cal| ethiopic > now_str| 17/mägabit/2013 après l’Incarnation 18:22:07.566 UTC+1 > now_parsed | 2021-03-26 18:22:07.566+01 > plus_1m| 18/miyazya/2013 après l’Incarnation 18:22:07.566 UTC+2 > plus_1m_parsed | 2021-04-26 18:22:07.566+02 > -[ RECORD 13 ]-+--- > cal| ethiopic-amete-alem > now_str| 17/mägabit/7513 ERA0 18:22:07.566 UTC+1 > now_parsed | 2021-03-26 18:22:07.566+01 > plus_1m| 18/miyazya/7513 ERA0 18:22:07.566 UTC+2 > plus_1m_parsed | 2021-04-26 18:22:07.566+02 > [...] > I understand that adding months or years with some of the non-gregorian > calendars should lead to different points in time than with the gregorian > calendar. > > For instance with the ethiopic calendar, the query above displays today as > 17/mägabit/2013 and 1 month from now as 18/miyazya/2013, > while the correct result is probably 17/miyazya/2013 (?) Seeing the results for Japanese locale, you might want to update your ICU library, which could fix this potential inconsistency. The results for the Japanese locale should be "0003 Reiwa" instead of "0033 Heisei", as the era changed in 2019. ICU releases have since implemented this and other corrections; this specific change was implemented in the batched release of ICU versions on 2019-04-12. With regards, Matthias van de Meent
Re: Calendar support in localization
On Sat, Mar 27, 2021 at 6:51 AM Daniel Verite wrote: > now_str| 17/mägabit/2013 après l’Incarnation 18:22:07.566 UTC+1 Very nice! > For instance with the ethiopic calendar, the query above displays today as > 17/mägabit/2013 and 1 month from now as 18/miyazya/2013, > while the correct result is probably 17/miyazya/2013 (?) > > > I'm not sure at this point that there should be a new set of > data/interval/timestamp types though, especially if considering > the integration in core. > > About intervals, if there were locale-aware functions like > add_interval(timestamptz, interval [, locale]) returns timestamptz > or > sub_timestamp(timestamptz, timestamptz [,locale]) returns interval > that would use ICU to compute the results according to the locale, > wouldn't it be good enough? +1, I'd probably do that next if I were hacking on this...
Re: Calendar support in localization
Thomas Munro wrote: > Right, so if this is done by trying to extend Daniel Verite's icu_ext > extension (link given earlier) and Robert's idea of a fast-castable > type, I suppose you might want now()::icu_date + '1 month'::internal > to advance you by one Ethiopic month if you have done SET > icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'. I've pushed a calendar branch on icu_ext [1] with preliminary support for non-gregorian calendars through ICU, so far with only format and parse of timetamptz. The ICU locale drives both the localization of field names (language) and the choice of calendar. It looks like this: \set fmt 'dd// HH:mm:ss.SSS zz' WITH list(cal) AS ( values ('gregorian'), ('japanese'), ('buddhist'), ('roc'), ('persian'), ('islamic-civil'), ('islamic'), ('hebrew'), ('chinese'), ('indian'), ('coptic'), ('ethiopic'), ('ethiopic-amete-alem'), ('iso8601'), ('dangi') ), fmt AS ( select cal, icu_format_date(now(), :'fmt', 'fr@calendar='||cal) as now_str, icu_format_date(now()+'1 month'::interval, :'fmt', 'fr@calendar='||cal) as plus_1m from list ) SELECT cal, now_str, icu_parse_date(now_str, :'fmt', 'fr@calendar='||cal) as now_parsed, plus_1m, icu_parse_date(plus_1m, :'fmt', 'fr@calendar='||cal) as plus_1m_parsed FROM fmt; -[ RECORD 1 ]--+--- cal| gregorian now_str| 26/mars/2021 après Jésus-Christ 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 26/avril/2021 après Jésus-Christ 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 2 ]--+--- cal| japanese now_str| 26/mars/0033 Heisei 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 26/avril/0033 Heisei 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 3 ]--+--- cal| buddhist now_str| 26/mars/2564 ère bouddhique 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 26/avril/2564 ère bouddhique 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 4 ]--+--- cal| roc now_str| 26/mars/0110 RdC 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 26/avril/0110 RdC 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 5 ]--+--- cal| persian now_str| 06/farvardin/1400 Anno Persico 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 06/ordibehešt/1400 Anno Persico 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 6 ]--+--- cal| islamic-civil now_str| 12/chaabane/1442 ère de l’Hégire 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 14/ramadan/1442 ère de l’Hégire 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 7 ]--+--- cal| islamic now_str| 13/chaabane/1442 ère de l’Hégire 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 14/ramadan/1442 ère de l’Hégire 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 8 ]--+--- cal| hebrew now_str| 13/nissan/5781 Anno Mundi 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 14/iyar/5781 Anno Mundi 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 9 ]--+--- cal| chinese now_str| 14/èryuè/0038 78 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 15/sānyuè/0038 78 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 10 ]-+--- cal| indian now_str| 05/chaitra/1943 ère Saka 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 06/vaishākh/1943 ère Saka 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 11 ]-+--- cal| coptic now_str| 17/barmahât/1737 après Dioclétien 18:22:07.566 UTC+1 now_parsed | 2021-03-26 18:22:07.566+01 plus_1m| 18/barmoudah/1737 après Dioclétien 18:22:07.566 UTC+2 plus_1m_parsed | 2021-04-26 18:22:07.566+02 -[ RECORD 12 ]-+--- cal| ethiopic now_str| 17/mägabit/2013 après l’Incarnation 18:22:07.566 UTC+1 now_parsed | 2021-03-26
Re: Calendar support in localization
On Wed, Mar 17, 2021 at 3:39 PM Thomas Munro wrote: > On Thu, Mar 18, 2021 at 3:48 AM Tom Lane wrote: > > Right, so if this is done by trying to extend Daniel Verite's icu_ext > extension (link given earlier) and Robert's idea of a fast-castable > type, I suppose you might want now()::icu_date + '1 month'::internal > to advance you by one Ethiopic month if you have done SET > icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'. Or if using my > first idea of just sticking with the core types, perhaps you'd have to > replace stuff via search path... I admit that sounds rather error > prone and fragile (I was thinking mainly of different functions, not > operators). Either way, I suppose there'd also be more explicit > functions for various operations including ones that take an extra > argument if you want to use an explicit locale instead of relying on > the ICU_LC_TIME setting. I dunno. > > As you know internally timestamptz data type does't existe instead it stored as integer kind and we depend on operating system and external library for our date data type support so i think that put as on the position for not be the first one to implement timestamptz data type thing and i don't know who give as the casting for free? regards Surafel
Re: Calendar support in localization
On 3/17/21 3:48 PM, Tom Lane wrote: > Also, the SQL spec says in so many words > that the SQL-defined datetime types follow the Gregorian calendar. We already don't follow the SQL spec for timestamps (and I, for one, think we are better for it) so I don't think we should worry about that. -- Vik Fearing
Re: Calendar support in localization
On Thu, Mar 18, 2021 at 3:48 AM Tom Lane wrote: > Robert Haas writes: > > It's not very obvious how to scale this kind of approach to a wide > > variety of calendar types, and as Thomas says, it would much cooler to > > be able to handle all of the ones that ICU knows how to support rather > > than just one. But, the problem I see with using timestamptz is that > > it's not so obvious how to get a different output format ... unless, I > > guess, we can cram it into DateStyle. And it's also much less obvious > > how you get the other functions and operators to do what you want, if > > it's different. > > Yeah, I'm afraid that it probably is different. The most obvious > example is in operations involving type interval: > select now() + '1 month'::interval; > That should almost certainly give a different answer when using a > different calendar --- indeed the units of interest might not even > be the same. (Do all human calendars use the concept of months?) Right, so if this is done by trying to extend Daniel Verite's icu_ext extension (link given earlier) and Robert's idea of a fast-castable type, I suppose you might want now()::icu_date + '1 month'::internal to advance you by one Ethiopic month if you have done SET icu_ext.ICU_LC_TIME = 'am_ET@calendar=traditional'. Or if using my first idea of just sticking with the core types, perhaps you'd have to replace stuff via search path... I admit that sounds rather error prone and fragile (I was thinking mainly of different functions, not operators). Either way, I suppose there'd also be more explicit functions for various operations including ones that take an extra argument if you want to use an explicit locale instead of relying on the ICU_LC_TIME setting. I dunno. As for whether all calendars have months, it looks like ICU's model has just the familiar looking standardised fields; whether some of them make no sense in some calendars, I don't know, but it has stuff like x.get(field, ), x.set(field, ), x.add(field, amount, ) and if it fails for some field on your particular calendar, or for some value (you can't set a Gregorian date's month to 13 (apparently we call this month "undecember", hah), but you can for a Hebrew or Ethiopic one) I suppose we'd just report the error? > I don't feel like DateStyle is chartered to affect the behavior > of datetime operators; it's understood as tweaking the I/O behavior > only. There might be more of a case for letting LC_TIME choose > this behavior, but I bet the relevant standards only contemplate About LC_TIME... I suppose in one possible future we eventually use ICU for more core stuff, and someone proposes to merge hypothetical icu_date etc types into the core date etc types, and then LC_TIME controls that. But then you might have a version of the problem that Peter E ran into in attempts so far to use ICU collations as the default: if you put ICU's funky extensible locale names into the LC_XXX environment variables, then your libc will see it too, and might get upset, since PostgreSQL uses the en. I suspect that ICU will understand typical libc locale names, but common libcs won't understand ICU's highly customisable syntax, but I haven't looked into it. If that's generally true, then perhaps the solution to both problems is a kind of partial separation: regular LC_XXX, and then also ICU_LC_XXX which defaults to the same value but can be changed to access more advanced stuff, and is used only for interacting with ICU. > Gregorian calendars. Also, the SQL spec says in so many words > that the SQL-defined datetime types follow the Gregorian calendar. :-(
Re: Calendar support in localization
Robert Haas writes: > It's not very obvious how to scale this kind of approach to a wide > variety of calendar types, and as Thomas says, it would much cooler to > be able to handle all of the ones that ICU knows how to support rather > than just one. But, the problem I see with using timestamptz is that > it's not so obvious how to get a different output format ... unless, I > guess, we can cram it into DateStyle. And it's also much less obvious > how you get the other functions and operators to do what you want, if > it's different. Yeah, I'm afraid that it probably is different. The most obvious example is in operations involving type interval: select now() + '1 month'::interval; That should almost certainly give a different answer when using a different calendar --- indeed the units of interest might not even be the same. (Do all human calendars use the concept of months?) I don't feel like DateStyle is chartered to affect the behavior of datetime operators; it's understood as tweaking the I/O behavior only. There might be more of a case for letting LC_TIME choose this behavior, but I bet the relevant standards only contemplate Gregorian calendars. Also, the SQL spec says in so many words that the SQL-defined datetime types follow the Gregorian calendar. So on the whole, new datatypes and operators seem like the way to go. I concur that if ICU has solved this problem, piggybacking on it seems more promising than writing our own code. regards, tom lane
Re: Calendar support in localization
On Wed, Mar 17, 2021 at 9:54 AM Surafel Temesgen wrote: > As you mention above whatever the calendar type is we ended up storing an > integer that represent the date so rather than re-implementing every > function and operation for every calendar we can use existing Gerigorian > implementation as a base implementation and if new calendar want to perform > same function or operation it translate to Gregorian one and use the existing > function and operation and translate to back to working calendar. In this > approach the only function we want for supporting a new calendar is a > translator from the new calendar to Gregorian one and from Gerigorian > calendar to the new calendar and may be input/ output function. What do you > think of this implementation? I'm not sure what the best way of tackling this problem is, but I wanted to mention another possible approach: instead of actually using the timestamptz data type, have another data type that is binary-compatible with timestamptz - that is, it's the same thing on disk, so you can cast between the two data types for free. Then have separate input/output functions for it, separate operators and functions and so forth. It's not very obvious how to scale this kind of approach to a wide variety of calendar types, and as Thomas says, it would much cooler to be able to handle all of the ones that ICU knows how to support rather than just one. But, the problem I see with using timestamptz is that it's not so obvious how to get a different output format ... unless, I guess, we can cram it into DateStyle. And it's also much less obvious how you get the other functions and operators to do what you want, if it's different. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Calendar support in localization
On Tue, Mar 16, 2021 at 12:20 PM Thomas Munro wrote: > On Wed, Mar 17, 2021 at 6:31 AM Surafel Temesgen > wrote: > > Ethiopice calendar have 13 months so it can not be stored as date and > timestamp type and you approach seems more complicated and i suggest to > have this feature on the purpose of PostgreSQL popularities too not only > for my need > > I know, but the DATE and TIMESTAMPTZ datatypes don't intrinsically > know anything about months or other calendar concepts. Internally, > they are just a single number that counts the number of days or > seconds since an arbitrary epoch time. We are all in agreement about > how many times the Earth has rotated since then*. The calendar > concepts such as "day", "month", "year", whether Gregorian, Ethiopic, > Islamic, ... are all derivable from those numbers, if you know the > rules. > > Okay > > > > I think you suggesting this by expecting the implementation is difficult > but it's not that much difficult once you fully understand Gregorian > calendar and the calendar you work on > > Yeah, I am sure it's all just a bunch of simple integer maths. But > I'm talking about things like software architecture, maintainability, > cohesion, and getting maximum impact for the work we do. > > I may be missing some key detail though: why do you think it should be > a different type? The two reasons I can think of are: (1) the > slightly tricky detail that the date apparently changes at 1:00am > (which I don't think is a show stopper for this approach, I could > elaborate), (2) you may want dates to be formatted on the screen with > the Ethiopic calendar in common software like psql and GUI clients, > which may be easier to arrange with different types, but that seems to > be a cosmetic thing that could eventually be handled with tighter > locale integration with ICU. In the early stages you'd access > calendar logic though special functions with names like > icu_format_date(), or whatever. > > As you mention above whatever the calendar type is we ended up storing an integer that represent the date so rather than re-implementing every function and operation for every calendar we can use existing Gerigorian implementation as a base implementation and if new calendar want to perform same function or operation it translate to Gregorian one and use the existing function and operation and translate to back to working calendar. In this approach the only function we want for supporting a new calendar is a translator from the new calendar to Gregorian one and from Gerigorian calendar to the new calendar and may be input/ output function. What do you think of this implementation? regards Surafel
Re: Calendar support in localization
On Wed, Mar 17, 2021 at 6:31 AM Surafel Temesgen wrote: > Ethiopice calendar have 13 months so it can not be stored as date and > timestamp type and you approach seems more complicated and i suggest to have > this feature on the purpose of PostgreSQL popularities too not only for my > need I know, but the DATE and TIMESTAMPTZ datatypes don't intrinsically know anything about months or other calendar concepts. Internally, they are just a single number that counts the number of days or seconds since an arbitrary epoch time. We are all in agreement about how many times the Earth has rotated since then*. The calendar concepts such as "day", "month", "year", whether Gregorian, Ethiopic, Islamic, ... are all derivable from those numbers, if you know the rules. So I think you should seriously consider using the same types. > Each calendar-aware date arithmetic is different so solving one calendar > problem didn't help on other calendar They have a *lot* in common though. They have similar "fields" (day, month, year etc), based on the Earth, moon, sun etc, so it is possible to use a common abstraction to interact with them. I haven't studied it too closely, but it looks like ICU can give you a "Calendar" object for a given Locale (which you create from a string like "am_ET@calendar=traditional") and timezone ("Africa/Addis_Ababa"). Then you can set the object's time to X seconds since an epoch, based on UTC seconds without leap seconds -- which is exactly like our TIMESTAMPTZ's internal value -- and then you can query it to get fields like month etc. Or do the opposite, or use formatting and parsing routines etc. Internally, ICU has a C++ class for each calendar with a name like EthiopicCalendar, IslamicCalendar etc which encapsulates all the logic, but it's not necessary to use them directly: we could just look them up with names via the C API and then treat them all the same. > I think you suggesting this by expecting the implementation is difficult but > it's not that much difficult once you fully understand Gregorian calendar and > the calendar you work on Yeah, I am sure it's all just a bunch of simple integer maths. But I'm talking about things like software architecture, maintainability, cohesion, and getting maximum impact for the work we do. I may be missing some key detail though: why do you think it should be a different type? The two reasons I can think of are: (1) the slightly tricky detail that the date apparently changes at 1:00am (which I don't think is a show stopper for this approach, I could elaborate), (2) you may want dates to be formatted on the screen with the Ethiopic calendar in common software like psql and GUI clients, which may be easier to arrange with different types, but that seems to be a cosmetic thing that could eventually be handled with tighter locale integration with ICU. In the early stages you'd access calendar logic though special functions with names like icu_format_date(), or whatever. Maybe I'm totally wrong about all of this, but this is the first way I'd probably try to tackle this problem, and I suspect it has the highest chance of eventually being included in core PostgreSQL. *I mean, we can discuss the different "timelines" like UT, UTC, TAI etc, but that's getting into the weeds, the usual timeline for computer software outside specialist scientific purposes is UTC without leap seconds.
Re: Calendar support in localization
Hi Thomas On Mon, Mar 15, 2021 at 2:58 PM Thomas Munro wrote: > > One key question here is whether you need a different date type or > just different operations (functions, operators etc) on the existing > types. > > I am thinking of having a converter to a specific calendar after each operation and function for display or storage. It works on Ethiopice calendar and i expect it will work on other calendar too > > I cc Thomas Munro and Vik because they have interest on this area > > Last time it came up[1], I got as far as wondering if the best way > would be to write a set of ICU-based calendar functions. Would it be > enough for your needs to have Ethiopic calendar-aware date arithmetic > (add, subtract a month etc), date part extraction (get the current > Ethiopic day/month/year of a date), display and parsing, and have all > of these as functions that you have to call explicitly, but have them > take the standard built-in date and timestamp types, so that your > tables would store regular date and timestamp values? If not, what > else do you need? > > Ethiopice calendar have 13 months so it can not be stored as date and timestamp type and you approach seems more complicated and i suggest to have this feature on the purpose of PostgreSQL popularities too not only for my need > ICU is very well maintained and widely used software, and PostgreSQL > already depends on it optionally, and that's enabled in all common > distributions. In other words, maybe all the logic you want exists > already in your process's memory, we just have to figure out how to > reach it from SQL. Another reason to use ICU is that we can solve > this problem once and then it'll work for many other calendars. > > Each calendar-aware date arithmetic is different so solving one calendar problem didn't help on other calendar > > Please don't suggests to fork from PostgreSQL just for this feature > > I would start with an extension, and I'd try to do a small set of > simple functions, to let me write things like: > > icu_format(now(), 'fr_FR@calendar=buddhist') to get a Buddhist > calendar with French words > > icu_date_part('year', current_date, 'am_ET@calendar=traditional') to > get the current year in the Ethiopic calendar (2013 apparently) > > Well, the first one probably also needs a format string too, actual > details to be worked out by reading the ICU manual... > I think you suggesting this by expecting the implementation is difficult but it's not that much difficult once you fully understand Gregorian calendar and the calendar you work on > > Maybe instead of making a new extension, I might try to start from > https://github.com/dverite/icu_ext and see if it makes sense to extend > it to cover calendars. > > Maybe one day ICU will become a hard dependency of PostgreSQL and > someone will propose all that stuff into core, and then maybe we could > start to think about the possibility of tighter integration with the > built-in date/time functions (and LC_TIME setting? seems complicated, > see also problems with setting LC_COLLATE/datcollate to an ICU > collation name, but I digress and that's a far off problem). I would > also study the SQL standard and maybe DB2 (highly subjective comment: > at a wild guess, the most likely commercial RDBMS to have done a good > job of this if anyone has) to see if they contemplate non-Gregorian > calendars, to get some feel for whether that would eventually be a > possibility to conform with whatever the standard says. > > In summary, getting something of very high quality by using a widely > used open source library that we already use seems like a better plan > than trying to write and maintain our own specialist knowledge about > individual calendars. If there's something you need that can't be > done with its APIs working on top of our regular date and timestamp > types, could you elaborate? > > [1] > https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BybW0LJuLtj3yAUsbOw3DrzK00pGk8JyfpCREzi_LSsg%40mail.gmail.com#393d827f1be589d0ad6ca6b016905e80 I don't know how you see this but for me the feature deserves a specialist and it is not that much difficult to have one because i guess every majore calendar have english documentation regards Surafel
Re: Calendar support in localization
Hi Surafel, On Tue, Mar 16, 2021 at 3:48 AM Surafel Temesgen wrote: > My country(Ethiopia) is one of the nations that uses different kind of > calendar than what PostgreSQL have so we are deprived from the benefit of > data datatype. We just uses String to store date that limits our application > quality greatly. The lag became even worst once application and system time > support is available and it seems to me it is not fair to suggest to add > other date data type kind and implementation for just different calendar that > even not minor user group. Having calendar support to localization will be > very very very very exciting feature for none Gregorian calendar user group > and make so loved. As far as i can see the difficult thing is understanding > different calendar. I can prepare a patch for Ethiopian calendar once we have > consensus. One key question here is whether you need a different date type or just different operations (functions, operators etc) on the existing types. > I cc Thomas Munro and Vik because they have interest on this area Last time it came up[1], I got as far as wondering if the best way would be to write a set of ICU-based calendar functions. Would it be enough for your needs to have Ethiopic calendar-aware date arithmetic (add, subtract a month etc), date part extraction (get the current Ethiopic day/month/year of a date), display and parsing, and have all of these as functions that you have to call explicitly, but have them take the standard built-in date and timestamp types, so that your tables would store regular date and timestamp values? If not, what else do you need? ICU is very well maintained and widely used software, and PostgreSQL already depends on it optionally, and that's enabled in all common distributions. In other words, maybe all the logic you want exists already in your process's memory, we just have to figure out how to reach it from SQL. Another reason to use ICU is that we can solve this problem once and then it'll work for many other calendars. > Please don't suggests to fork from PostgreSQL just for this feature I would start with an extension, and I'd try to do a small set of simple functions, to let me write things like: icu_format(now(), 'fr_FR@calendar=buddhist') to get a Buddhist calendar with French words icu_date_part('year', current_date, 'am_ET@calendar=traditional') to get the current year in the Ethiopic calendar (2013 apparently) Well, the first one probably also needs a format string too, actual details to be worked out by reading the ICU manual... Maybe instead of making a new extension, I might try to start from https://github.com/dverite/icu_ext and see if it makes sense to extend it to cover calendars. Maybe one day ICU will become a hard dependency of PostgreSQL and someone will propose all that stuff into core, and then maybe we could start to think about the possibility of tighter integration with the built-in date/time functions (and LC_TIME setting? seems complicated, see also problems with setting LC_COLLATE/datcollate to an ICU collation name, but I digress and that's a far off problem). I would also study the SQL standard and maybe DB2 (highly subjective comment: at a wild guess, the most likely commercial RDBMS to have done a good job of this if anyone has) to see if they contemplate non-Gregorian calendars, to get some feel for whether that would eventually be a possibility to conform with whatever the standard says. In summary, getting something of very high quality by using a widely used open source library that we already use seems like a better plan than trying to write and maintain our own specialist knowledge about individual calendars. If there's something you need that can't be done with its APIs working on top of our regular date and timestamp types, could you elaborate? [1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BybW0LJuLtj3yAUsbOw3DrzK00pGk8JyfpCREzi_LSsg%40mail.gmail.com#393d827f1be589d0ad6ca6b016905e80