Bug#954112: tzdata: Add ICU tzdata files

2021-01-04 Thread Aurelien Jarno
Hi,

On 2021-01-04 15:21, Dimitri John Ledkov wrote:
> On Sun, Jan 3, 2021 at 3:02 PM Aurelien Jarno  wrote:
> >
> > control: tag -1 +moreinfo
> >
> > Hi,
> >
> > On 2020-10-19 21:02, Aurelien Jarno wrote:
> > > Hi,
> > >
> > > On 2020-10-19 14:56, Dimitri John Ledkov wrote:
> > > > On Mon, 16 Mar 2020 23:09:58 + Dimitri John Ledkov 
> > > >  wrote:
> > > > > Package: tzdata
> > > > > Version: 2019c-3
> > > > > Severity: normal
> > > > >
> > > > > Dear Maintainer,
> > > > >
> > > > > This adds ICU timezone datafiles from icu-data repository.
> > > > >
> > > > > The source .txt data files are sources for the binary .res files,
> > > > > which are compiled at build time. Shipping this enabled to update
> > > > > timezone database files at runtime for icu, by rebuilding icu by
> > > > > setting `U_TIMEZONE_FILES_DIR` build-time config option, or at runtime
> > > > > with environment variable `ICU_TIMEZONE_FILES_DIR`. This will resolve
> > > > > a long standing bug that tzdata inside icu is never updated, and thus
> > > > > apps that use icu to access tzdata are always out of date (i.e. php).
> > > > >
> > > > > Note that the .txt files do duplicate tzdata data files a bit. As they
> > > > > are generated with a Java app by ICU upstream which merges tzdata
> > > > > files as input together with https://github.com/unicode-org/cldr xmls
> > > > > overrides. Maybe in the future, I will provide a more complete /
> > > > > reproducible process to rebuild icu input .txt files from the tzdata
> > > > > files directly with the xml overlays "from complete scratch".
> > > > >
> > > > > However, at least .res files generated are reproducible and match
> > > > > checksums of the prebuild .res files distributed in the icu-data
> > > > > repository.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Dimitri.
> > > >
> > > > Hi, Is this going to be reviewed / considered for inclusion?
> > > >
> > > > icu package in Debian now compiles with such a definition too, and is
> > > > actively trying to lookup updated tzdata from that location.
> >
> > I got a look at that patch, and I fail to see why it should be part of
> > the tzdata source package:
> > - it doesn't use any files from the tzdata sources
> > - the unicode-org github repository is not updated synchronously with
> >   tzdata, and even lagging by a few versions (currently it only has
> >   2020d instead of 2020f). This would prevent use to ship new tzdata
> >   versions until the unicode-org repository is updated.
> >
> > In that regard it would be better to just ship and independent
> > tzdata-icu source package instead.
> >
> 
> In theory it should be possible to rebuild the "unicode-org" files
> from tzdata sources. But it's quite convoluted and I haven't managed
> to do that yet. And it would need icu tools & java as build deps,
> last time I looked. If this were done, the rationale for doing that
> in the tzdata source package would have at least some merit. But
> that's not here yet.

Ok. The day it is available, we might want to have the tzdata-icu
package built by the src:tzdata package, although the experience we had
with tzdata-java was not that great.

> However, my rationale for including this in the tzdata source
> package was mostly about ensuring that this data ends up in the tzdata
> binary package. That way, systems that have the tzdata package
> installed have the right timezones for libicu as well as for all the
> existing ways of consuming tzdata.

I don't understand why you necessarily want to have the timezone data
for libicu installed on systems. libicu is not an essential package and
is not part of a standard chroot created by debootstrap.

> If this is done as a separate package, I'm not sure how to add
> dependencies for it correctly. Would it be OK for the tzdata binary
> package to "Depends: tzdata-icu"?

I am not sure we will want that. As I said, not everybody wants
tzdata-icu to be installed on their system.

> I don't want to make libicuXX Depend on it, as
> many minimal container workloads may have libicu but don't currently
> install or pull in tzdata, which they consider large. And
> yet, if tzdata is pulled in, I would want to have tzdata-icu
> available. And hence, if tzdata will always depend on tzdata-icu, it
> might be easier to do it as a single source/binary combo, rather than
> two source/binary packages with one depending on the other.

I am not sure I really understand your reasoning here. To me, if you
want to have tzdata-icu installed in most cases but not always, you need
to have libicuXX Recommend tzdata-icu. Most systems with libicuXX
will then have it, yet it would still be possible to have a system
without it.

> W.r.t. being out of sync, the delay in updates is usually small. And
> I would be OK shipping a slightly lagging tzdata-icu w.r.t. tzdata. As any
> update is better than none, as libicu itself has a very stale builtin
> tzdata-icu in stable series (whichever libicu version got shipped with
> upstream and ended up in testing at 

Bug#954112: tzdata: Add ICU tzdata files

2021-01-04 Thread Dimitri John Ledkov
On Sun, Jan 3, 2021 at 3:02 PM Aurelien Jarno  wrote:
>
> control: tag -1 +moreinfo
>
> Hi,
>
> On 2020-10-19 21:02, Aurelien Jarno wrote:
> > Hi,
> >
> > On 2020-10-19 14:56, Dimitri John Ledkov wrote:
> > > On Mon, 16 Mar 2020 23:09:58 + Dimitri John Ledkov  
> > > wrote:
> > > > Package: tzdata
> > > > Version: 2019c-3
> > > > Severity: normal
> > > >
> > > > Dear Maintainer,
> > > >
> > > > This adds ICU timezone datafiles from icu-data repository.
> > > >
> > > > The source .txt data files are sources for the binary .res files,
> > > > which are compiled at build time. Shipping this enabled to update
> > > > timezone database files at runtime for icu, by rebuilding icu by
> > > > setting `U_TIMEZONE_FILES_DIR` build-time config option, or at runtime
> > > > with environment variable `ICU_TIMEZONE_FILES_DIR`. This will resolve
> > > > a long standing bug that tzdata inside icu is never updated, and thus
> > > > apps that use icu to access tzdata are always out of date (i.e. php).
> > > >
> > > > Note that the .txt files do duplicate tzdata data files a bit. As they
> > > > are generated with a Java app by ICU upstream which merges tzdata
> > > > files as input together with https://github.com/unicode-org/cldr xmls
> > > > overrides. Maybe in the future, I will provide a more complete /
> > > > reproducible process to rebuild icu input .txt files from the tzdata
> > > > files directly with the xml overlays "from complete scratch".
> > > >
> > > > However, at least .res files generated are reproducible and match
> > > > checksums of the prebuild .res files distributed in the icu-data
> > > > repository.
> > > >
> > > > Regards,
> > > >
> > > > Dimitri.
> > >
> > > Hi, Is this going to be reviewed / considered for inclusion?
> > >
> > > icu package in Debian now compiles with such a definition too, and is
> > > actively trying to lookup updated tzdata from that location.
>
> I got a look at that patch, and I fail to see why it should be part of
> the tzdata source package:
> - it doesn't use any files from the tzdata sources
> - the unicode-org github repository is not updated synchronously with
>   tzdata, and even lagging by a few versions (currently it only has
>   2020d instead of 2020f). This would prevent use to ship new tzdata
>   versions until the unicode-org repository is updated.
>
> In that regard it would be better to just ship and independent
> tzdata-icu source package instead.
>

In theory it should be possible to rebuild the "unicode-org" files
from tzdata sources. But it's quite convoluted and I haven't managed
to do that yet. And it would need icu tools & java as build deps,
last time I looked. If this were done, the rationale for doing that
in the tzdata source package would have at least some merit. But
that's not here yet.
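
If such a rebuild were attempted, one way to check it against upstream
would be the checksum comparison the original report describes for the
.res files. A minimal sketch, assuming two local directories (one with
locally built files, one with a checkout of the prebuilt icu-data
files); the function names and the choice of SHA-256 are illustrative
assumptions, not something the patch is known to use:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_res_files(built_dir, reference_dir):
    """List .res files from reference_dir that are missing from
    built_dir or whose contents differ from the built copy."""
    mismatches = []
    for reference in sorted(Path(reference_dir).glob("*.res")):
        built = Path(built_dir) / reference.name
        if not built.is_file() or sha256_of(built) != sha256_of(reference):
            mismatches.append(reference.name)
    return mismatches
```

An empty list would mean every reference .res file was reproduced
byte for byte.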

However, my rationale for including this in the tzdata source
package was mostly about ensuring that this data ends up in the tzdata
binary package. That way, systems that have the tzdata package
installed have the right timezones for libicu as well as for all the
existing ways of consuming tzdata.

If this is done as a separate package, I'm not sure how to add
dependencies for it correctly. Would it be OK for the tzdata binary
package to "Depends: tzdata-icu"? I don't want to make libicuXX Depend
on it, as many minimal container workloads may have libicu but don't
currently install or pull in tzdata, which they consider large. And
yet, if tzdata is pulled in, I would want to have tzdata-icu
available. And hence, if tzdata will always depend on tzdata-icu, it
might be easier to do it as a single source/binary combo, rather than
two source/binary packages with one depending on the other.

W.r.t. being out of sync, the delay in updates is usually small. And
I would be OK shipping a slightly lagging tzdata-icu w.r.t. tzdata, as
any update is better than none: libicu itself has a very stale built-in
tzdata-icu in the stable series (whichever libicu version got shipped
upstream and ended up in testing at freeze, so sometimes years out of
date).

Let me know what you think is the best course for me to invest effort in.

-- 
Regards,

Dimitri.



Processed: Re: Bug#954112: tzdata: Add ICU tzdata files

2021-01-03 Thread Debian Bug Tracking System
Processing control commands:

> tag -1 +moreinfo
Bug #954112 [tzdata] tzdata: Add ICU tzdata files
Added tag(s) moreinfo.

-- 
954112: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=954112
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#954112: tzdata: Add ICU tzdata files

2021-01-03 Thread Aurelien Jarno
control: tag -1 +moreinfo

Hi,

On 2020-10-19 21:02, Aurelien Jarno wrote:
> Hi,
> 
> On 2020-10-19 14:56, Dimitri John Ledkov wrote:
> > On Mon, 16 Mar 2020 23:09:58 + Dimitri John Ledkov  
> > wrote:
> > > Package: tzdata
> > > Version: 2019c-3
> > > Severity: normal
> > >
> > > Dear Maintainer,
> > >
> > > This adds ICU timezone datafiles from icu-data repository.
> > >
> > > The source .txt data files are sources for the binary .res files,
> > > which are compiled at build time. Shipping this enabled to update
> > > timezone database files at runtime for icu, by rebuilding icu by
> > > setting `U_TIMEZONE_FILES_DIR` build-time config option, or at runtime
> > > with environment variable `ICU_TIMEZONE_FILES_DIR`. This will resolve
> > > a long standing bug that tzdata inside icu is never updated, and thus
> > > apps that use icu to access tzdata are always out of date (i.e. php).
> > >
> > > Note that the .txt files do duplicate tzdata data files a bit. As they
> > > are generated with a Java app by ICU upstream which merges tzdata
> > > files as input together with https://github.com/unicode-org/cldr xmls
> > > overrides. Maybe in the future, I will provide a more complete /
> > > reproducible process to rebuild icu input .txt files from the tzdata
> > > files directly with the xml overlays "from complete scratch".
> > >
> > > However, at least .res files generated are reproducible and match
> > > checksums of the prebuild .res files distributed in the icu-data
> > > repository.
> > >
> > > Regards,
> > >
> > > Dimitri.
> > 
> > Hi, Is this going to be reviewed / considered for inclusion?
> > 
> > icu package in Debian now compiles with such a definition too, and is
> > actively trying to lookup updated tzdata from that location.

I had a look at that patch, and I fail to see why it should be part of
the tzdata source package:
- it doesn't use any files from the tzdata sources
- the unicode-org github repository is not updated synchronously with
  tzdata, and is even lagging by a few versions (currently it only has
  2020d instead of 2020f). This would prevent us from shipping new
  tzdata versions until the unicode-org repository is updated.
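
The lag in the second point could be measured mechanically. A minimal
sketch, assuming tzdata release names are a four-digit year followed by
one lowercase letter (as with 2020d and 2020f); the function names are
made up for illustration:

```python
import re


def parse_tzdata_version(version):
    """Split a release name such as '2020d' into (year, letter index)."""
    match = re.fullmatch(r"(\d{4})([a-z])", version)
    if match is None:
        raise ValueError(f"unrecognised tzdata version: {version!r}")
    return int(match.group(1)), ord(match.group(2)) - ord("a")


def is_behind(icu_data_version, tzdata_version):
    """True if the icu-data release is older than the tzdata release."""
    return parse_tzdata_version(icu_data_version) < parse_tzdata_version(tzdata_version)


def releases_behind(icu_data_version, tzdata_version):
    """Number of releases of lag when both names are from the same year.

    Returns None across years, because the number of releases made in
    the earlier year cannot be derived from the names alone.
    """
    icu_year, icu_index = parse_tzdata_version(icu_data_version)
    tz_year, tz_index = parse_tzdata_version(tzdata_version)
    if icu_year != tz_year:
        return None
    return max(tz_index - icu_index, 0)
```

With the versions mentioned here, releases_behind("2020d", "2020f")
gives 2.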

In that regard it would be better to just ship an independent
tzdata-icu source package instead.

Regards,
Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Bug#954112: tzdata: Add ICU tzdata files

2020-10-19 Thread Aurelien Jarno
Hi,

On 2020-10-19 14:56, Dimitri John Ledkov wrote:
> On Mon, 16 Mar 2020 23:09:58 + Dimitri John Ledkov  
> wrote:
> > Package: tzdata
> > Version: 2019c-3
> > Severity: normal
> >
> > Dear Maintainer,
> >
> > This adds ICU timezone datafiles from icu-data repository.
> >
> > The source .txt data files are sources for the binary .res files,
> > which are compiled at build time. Shipping this enabled to update
> > timezone database files at runtime for icu, by rebuilding icu by
> > setting `U_TIMEZONE_FILES_DIR` build-time config option, or at runtime
> > with environment variable `ICU_TIMEZONE_FILES_DIR`. This will resolve
> > a long standing bug that tzdata inside icu is never updated, and thus
> > apps that use icu to access tzdata are always out of date (i.e. php).
> >
> > Note that the .txt files do duplicate tzdata data files a bit. As they
> > are generated with a Java app by ICU upstream which merges tzdata
> > files as input together with https://github.com/unicode-org/cldr xmls
> > overrides. Maybe in the future, I will provide a more complete /
> > reproducible process to rebuild icu input .txt files from the tzdata
> > files directly with the xml overlays "from complete scratch".
> >
> > However, at least .res files generated are reproducible and match
> > checksums of the prebuild .res files distributed in the icu-data
> > repository.
> >
> > Regards,
> >
> > Dimitri.
> 
> Hi, Is this going to be reviewed / considered for inclusion?
> 
> icu package in Debian now compiles with such a definition too, and is
> actively trying to lookup updated tzdata from that location.

For some reason this bug never went to the mailing list, so I am
discovering it just now. I'll try to have a look at it in the coming
weeks.

Regards,
Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Bug#954112: tzdata: Add ICU tzdata files

2020-10-19 Thread Dimitri John Ledkov
On Mon, 16 Mar 2020 23:09:58 + Dimitri John Ledkov  wrote:
> Package: tzdata
> Version: 2019c-3
> Severity: normal
>
> Dear Maintainer,
>
> This adds ICU timezone datafiles from icu-data repository.
>
> The source .txt data files are sources for the binary .res files,
> which are compiled at build time. Shipping this makes it possible to
> update the timezone database files for icu, either by rebuilding icu
> with the `U_TIMEZONE_FILES_DIR` build-time config option set, or at
> runtime with the environment variable `ICU_TIMEZONE_FILES_DIR`. This
> will resolve a long-standing bug that tzdata inside icu is never
> updated, and thus apps that use icu to access tzdata (e.g. php) are
> always out of date.
>
> Note that the .txt files do duplicate tzdata data files a bit, as they
> are generated with a Java app by ICU upstream, which merges tzdata
> files as input together with https://github.com/unicode-org/cldr XML
> overrides. Maybe in the future I will provide a more complete /
> reproducible process to rebuild the icu input .txt files from the
> tzdata files directly with the XML overlays "from complete scratch".
>
> However, at least the generated .res files are reproducible and match
> the checksums of the prebuilt .res files distributed in the icu-data
> repository.
>
> Regards,
>
> Dimitri.

Hi, is this going to be reviewed / considered for inclusion?

The icu package in Debian now compiles with `U_TIMEZONE_FILES_DIR`
defined too, and actively tries to look up updated tzdata from that
location.
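
For the runtime side, a minimal sketch of starting a program with the
`ICU_TIMEZONE_FILES_DIR` override from the report. The directory
/usr/share/zoneinfo-icu is a hypothetical placeholder (the path Debian's
icu is built to search is not stated here), and the helper names are
illustrative only:

```python
import os
import subprocess

# Hypothetical location of the compiled .res files; adjust to wherever
# the timezone resources are actually installed.
ICU_TZ_DIR = "/usr/share/zoneinfo-icu"


def env_with_icu_tzdata(tz_dir, base_env=None):
    """Return a copy of the environment with ICU_TIMEZONE_FILES_DIR set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ICU_TIMEZONE_FILES_DIR"] = tz_dir
    return env


def run_with_icu_tzdata(argv, tz_dir=ICU_TZ_DIR):
    """Run argv with the override in place and return its stdout."""
    result = subprocess.run(
        argv,
        env=env_with_icu_tzdata(tz_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

This only sets the variable for the child process; whether the data is
picked up depends on the ICU build used by that program.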

-- 
Regards,

Dimitri.