Alexander Graf <ag...@suse.de> wrote:

> On 26.08.18 21:00, Heinrich Schuchardt wrote:
>> On 08/26/2018 08:22 PM, Alexander Graf wrote:
>>>
>>>
>>> On 11.08.18 17:28, Heinrich Schuchardt wrote:
>>>> This patch provides a define to initialize a table that maps lower to
>>>> capital letters for Unicode code point 0x0000 - 0xffff.
>>>>
>>>> Signed-off-by: Heinrich Schuchardt <xypron.g...@gmx.de>
>>>> ---
>>>> MAINTAINERS | 1 +
>>>> include/capitalization.h | 1909 ++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 1910 insertions(+)
>>>> create mode 100644 include/capitalization.h
>>>>
>>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>>> index a324139471..0a543309f2 100644
>>>> --- a/MAINTAINERS
>>>> +++ b/MAINTAINERS
>>>> @@ -368,6 +368,7 @@ F: doc/DocBook/efi.tmpl
>>>> F: doc/README.uefi
>>>> F: doc/README.iscsi
>>>> F: Documentation/efi.rst
>>>> +F: include/capitalization.h
>>>> F: include/efi*
>>>> F: include/pe.h
>>>> F: include/asm-generic/pe.h
>>>> diff --git a/include/capitalization.h b/include/capitalization.h
>>>> new file mode 100644
>>>> index 0000000000..50d5108f98
>>>> --- /dev/null
>>>> +++ b/include/capitalization.h
>>>> @@ -0,0 +1,1909 @@
>>>> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
>>>> +/*
>>>> + * Correspondence table for small and capital Unicode letters in the range of
>>>> + * 0x0000 - 0xffff based on http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
>>>> + */
>>>> +
>>>> +struct capitalization_table {
>>>> + u16 upper;
>>>> + u16 lower;
>>>> +};
>>>> +
>>>> +#define UNICODE_CAPITALIZATION_TABLE { \
>>>
>>> Ugh, that is a *lot* of data. How much does the binary size grow with
>>> the table compiled in?
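
For context, a lookup against a table of the quoted shape could look roughly like the sketch below. It is only an illustration, not code from the patch: the three entries are hand-picked samples, the names `sample_table` and `sample_toupper` are hypothetical, the `u16` typedef stands in for U-Boot's own type, and the sketch assumes the table is sorted by `lower`, which the patch excerpt does not show.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16; /* stand-in for U-Boot's u16 (assumption) */

/* Same layout as the struct in the quoted patch. */
struct capitalization_table {
	u16 upper;
	u16 lower;
};

/*
 * Three hand-picked sample pairs, sorted by .lower.
 * This is NOT the table from the patch, only illustrative data.
 */
static const struct capitalization_table sample_table[] = {
	{ 0x0041, 0x0061 }, /* LATIN A / a */
	{ 0x00C4, 0x00E4 }, /* LATIN A WITH DIAERESIS, capital / small */
	{ 0x0410, 0x0430 }, /* CYRILLIC A / a */
};

/*
 * Hypothetical helper: binary search for the capital form of c.
 * Code points without an entry are returned unchanged.
 */
static u16 sample_toupper(u16 c)
{
	size_t lo = 0;
	size_t hi = sizeof(sample_table) / sizeof(sample_table[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (sample_table[mid].lower == c)
			return sample_table[mid].upper;
		if (sample_table[mid].lower < c)
			lo = mid + 1;
		else
			hi = mid;
	}
	return c;
}
```

With a sorted table the search stays logarithmic in the number of entries, so a lookup of this kind would cost little at run time; the open question in the thread is the size of the data, not the speed.
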
That data is also in glibc. I don’t know whether you use glibc though ...

>>> Is there any slightly more sophisticated pattern in the table maybe that
>>> we could just express as code? Would that turn out smaller maybe?
>>
>> This is 3792 bytes of data. Unicode capitalization is quite random in
>> arranging lower and upper letters.
>>
>> We could resort to zlib or gzip. But these libraries are not built by
>> default.
>
> Yeah, and that only adds to more overhead.
>
>> Most urgently we will need the capitalization table for generating and
>> checking short FAT filenames, so we could create a configuration switch
>> that would reduce this table to codepage 437 or codepage 1250 letters
>> depending on the chosen native character set.
>
> I think that's a great idea. There probably is a lot of overlap even
> between the two, so maybe just make it a config option for "non-latin
> upper/lower case conversion".
>
>> In EDK2 I only found code for codepage 1250.
>
> Yeah, I'd be surprised if people really needed more. In fact, how about
> you just default the config option to =n by default?
>
>
> Alex
>

-- 
📧 Mike FABIAN <mike.fab...@gmx.de>
Lack of sleep is the enemy of good work.
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot