On Thu, 20 Aug 2020 09:35:40 -0700, Charles Mills wrote:

>I wonder if it might make sense to go UTF-32 even to disk, but compress the
>data.
>
>I wonder how well standard compression schemes work with UTF-32? Are they too
>octet-oriented to work optimally?
>
A non-scientific sample:

    1995 $ ls -l ~ | wc
          24     213    1403
    1996 $ ls -l ~ | gzip | wc
           1       9     441
    1997 $ ls -l ~ | iconv -f UTF-8 -t UTF-32 | wc
          24     213    5616
    1998 $ ls -l ~ | iconv -f UTF-8 -t UTF-32 | gzip | wc
           0       9     679
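For anyone who wants to repeat the comparison without a shell, here is a rough Python sketch of the same measurement. The function name compare_sizes and the sample directory line are made up for illustration; it assumes Python's "utf-32" codec, which prepends a 4-byte BOM (that would also account for 5616 = 4 * 1403 + 4 in the iconv output above, if the listing was pure ASCII). Exact byte counts will differ from the gzip command, so treat it as no more scientific than the sample.

```python
import gzip


def compare_sizes(text: str) -> dict:
    """Return raw and gzip-compressed byte counts for UTF-8 and UTF-32."""
    utf8 = text.encode("utf-8")
    # The plain "utf-32" codec writes a 4-byte BOM followed by 4 bytes
    # per code point, in the platform's native byte order.
    utf32 = text.encode("utf-32")
    return {
        "utf8_raw": len(utf8),
        "utf8_gzip": len(gzip.compress(utf8)),
        "utf32_raw": len(utf32),
        "utf32_gzip": len(gzip.compress(utf32)),
    }


# Hypothetical stand-in for `ls -l ~` output; any text will do.
sample = "drwxr-xr-x  2 user staff  4096 Aug 20 09:35 Documents\n" * 20

if __name__ == "__main__":
    for name, size in compare_sizes(sample).items():
        print(f"{name:>10}: {size}")
```

Feeding it real directory listings or other text should show whether the gap seen above (441 vs. 679 bytes after gzip) holds up on other data.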
>I wonder if one might write an LZW implementation that assumed 32-bit
>characters.

-- gil