Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-30 Thread Alexander Graf


On 30.08.18 04:51, Simon Glass wrote:
> Hi,
> 
> On 27 August 2018 at 02:37, Alexander Graf  wrote:
>>
>>
>>> On 27.08.2018 at 10:30, Mike FABIAN wrote:
>>>
>>> Alexander Graf wrote:
>>>
> On 26.08.18 21:00, Heinrich Schuchardt wrote:
>> On 08/26/2018 08:22 PM, Alexander Graf wrote:
>>
>>
>>> On 11.08.18 17:28, Heinrich Schuchardt wrote:
>>> This patch provides a define to initialize a table that maps lower to
>>> capital letters for Unicode code point 0x - 0x.
>>>
>>> Signed-off-by: Heinrich Schuchardt 
>>> ---
>>> MAINTAINERS  |1 +
>>> include/capitalization.h | 1909 ++
>>> 2 files changed, 1910 insertions(+)
>>> create mode 100644 include/capitalization.h
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index a324139471..0a543309f2 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -368,6 +368,7 @@ F:doc/DocBook/efi.tmpl
>>> F:doc/README.uefi
>>> F:doc/README.iscsi
>>> F:Documentation/efi.rst
>>> +F:include/capitalization.h
>>> F:include/efi*
>>> F:include/pe.h
>>> F:include/asm-generic/pe.h
>>> diff --git a/include/capitalization.h b/include/capitalization.h
>>> new file mode 100644
>>> index 00..50d5108f98
>>> --- /dev/null
>>> +++ b/include/capitalization.h
>>> @@ -0,0 +1,1909 @@
>>> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
>>> +/*
>>> + * Correspondence table for small and capital Unicode letters in the 
>>> range of
>>> + * 0x - 0x based on 
>>> http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
>>> + */
>>> +
>>> +struct capitalization_table {
>>> +u16 upper;
>>> +u16 lower;
>>> +};
>>> +
>>> +#define UNICODE_CAPITALIZATION_TABLE { \
>>
>> Ugh, that is a *lot* of data. How much does the binary size grow with
>> the table compiled in?
>>>
>>> That data is also in glibc. I don’t know whether you use glibc though
>>> ...
>>
>> U-Boot is a standalone OS, so to speak; we do not use glibc (except for
>> sandbox, but that's a special target).
>>
>> The main problem is that people sometimes cram U-Boot into very tight
>> spaces, like small flash chips or even on-chip RAM. So we have to be very
>> cautious about space requirements.
> 
> Indeed.
> 
> Most archs use a private libc, so they are not built with one that supports
> this table.
> 
> Wouldn't it be better to put this in a C file as a const struct?
> 
> If it is big that's OK, we just need to add a CONFIG option for it. If
> we already have EFI enabled, perhaps it doesn't matter, since it is
> pretty big.


When EFI support started, it was about 10 kB. I assume it has grown since, so
maybe we're at 15 kB by now. Adding another 4 kB for the upper/lower
translation is *massive* by comparison.

EDK2 only supports upper/lower conversion for English anyway, so we're
no worse off if we don't support all the fancy Unicode conversions.


Alex
___
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot


Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-29 Thread Simon Glass
Hi,

On 27 August 2018 at 02:37, Alexander Graf  wrote:
>
>
>> On 27.08.2018 at 10:30, Mike FABIAN wrote:
>>
>> Alexander Graf wrote:
>>
>>> On 26.08.18 21:00, Heinrich Schuchardt wrote:
> On 08/26/2018 08:22 PM, Alexander Graf wrote:
>
>
>> On 11.08.18 17:28, Heinrich Schuchardt wrote:
>> This patch provides a define to initialize a table that maps lower to
>> capital letters for Unicode code point 0x - 0x.
>>
>> Signed-off-by: Heinrich Schuchardt 
>> ---
>> MAINTAINERS  |1 +
>> include/capitalization.h | 1909 ++
>> 2 files changed, 1910 insertions(+)
>> create mode 100644 include/capitalization.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index a324139471..0a543309f2 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -368,6 +368,7 @@ F:doc/DocBook/efi.tmpl
>> F:doc/README.uefi
>> F:doc/README.iscsi
>> F:Documentation/efi.rst
>> +F:include/capitalization.h
>> F:include/efi*
>> F:include/pe.h
>> F:include/asm-generic/pe.h
>> diff --git a/include/capitalization.h b/include/capitalization.h
>> new file mode 100644
>> index 00..50d5108f98
>> --- /dev/null
>> +++ b/include/capitalization.h
>> @@ -0,0 +1,1909 @@
>> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
>> +/*
>> + * Correspondence table for small and capital Unicode letters in the 
>> range of
>> + * 0x - 0x based on 
>> http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
>> + */
>> +
>> +struct capitalization_table {
>> +u16 upper;
>> +u16 lower;
>> +};
>> +
>> +#define UNICODE_CAPITALIZATION_TABLE { \
>
> Ugh, that is a *lot* of data. How much does the binary size grow with
> the table compiled in?
>>
>> That data is also in glibc. I don’t know whether you use glibc though
>> ...
>
> U-Boot is a standalone OS, so to speak; we do not use glibc (except for sandbox,
> but that's a special target).
>
> The main problem is that people sometimes cram U-Boot into very tight
> spaces, like small flash chips or even on-chip RAM. So we have to be very
> cautious about space requirements.

Indeed.

Most archs use a private libc, so they are not built with one that supports
this table.

Wouldn't it be better to put this in a C file as a const struct?

If it is big that's OK, we just need to add a CONFIG option for it. If
we already have EFI enabled, perhaps it doesn't matter, since it is
pretty big.
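
A minimal sketch of the const-struct variant suggested above, under stated assumptions: the names cap_table and cap_toupper are hypothetical, the u16 typedef merely stands in for U-Boot's own type so the sketch compiles on its own, and only the three Armenian pairs visible in the quoted patch excerpt are listed. The rest of the table is omitted.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;	/* stand-in for U-Boot's u16 */

struct capitalization_table {
	u16 upper;
	u16 lower;
};

/*
 * Defined once in a C file, so the data is emitted into read-only data
 * a single time instead of wherever a header macro is expanded.
 * Only the entries shown in the patch excerpt are listed here.
 */
static const struct capitalization_table cap_table[] = {
	{ 0x0531, 0x0561 },	/* ARMENIAN LETTER AYB */
	{ 0x0532, 0x0562 },	/* ARMENIAN LETTER BEN */
	{ 0x053E, 0x056E },	/* ARMENIAN LETTER CA */
};

/* Hypothetical helper: return the capital letter for c, or c itself. */
u16 cap_toupper(u16 c)
{
	size_t i;

	for (i = 0; i < sizeof(cap_table) / sizeof(cap_table[0]); i++) {
		if (cap_table[i].lower == c)
			return cap_table[i].upper;
	}
	return c;	/* not in the table: leave unchanged */
}
```

Each entry takes 4 bytes, so the 3792 bytes mentioned elsewhere in this thread correspond to 948 entries. A linear scan is probably acceptable for occasional filename conversion; whether the full table is ordered well enough for a binary search is not visible from the excerpt.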

Regards,
Simon


Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-27 Thread Mike FABIAN
Alexander Graf wrote:

> On 26.08.18 21:00, Heinrich Schuchardt wrote:
>> On 08/26/2018 08:22 PM, Alexander Graf wrote:
>>>
>>>
>>> On 11.08.18 17:28, Heinrich Schuchardt wrote:
>>>> This patch provides a define to initialize a table that maps lower to
>>>> capital letters for Unicode code point 0x - 0x.
>>>>
>>>> Signed-off-by: Heinrich Schuchardt 
>>>> ---
>>>>  MAINTAINERS  |1 +
>>>>  include/capitalization.h | 1909 ++
>>>>  2 files changed, 1910 insertions(+)
>>>>  create mode 100644 include/capitalization.h
>>>>
>>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>>> index a324139471..0a543309f2 100644
>>>> --- a/MAINTAINERS
>>>> +++ b/MAINTAINERS
>>>> @@ -368,6 +368,7 @@ F: doc/DocBook/efi.tmpl
>>>>  F:doc/README.uefi
>>>>  F:doc/README.iscsi
>>>>  F:Documentation/efi.rst
>>>> +F:include/capitalization.h
>>>>  F:include/efi*
>>>>  F:include/pe.h
>>>>  F:include/asm-generic/pe.h
>>>> diff --git a/include/capitalization.h b/include/capitalization.h
>>>> new file mode 100644
>>>> index 00..50d5108f98
>>>> --- /dev/null
>>>> +++ b/include/capitalization.h
>>>> @@ -0,0 +1,1909 @@
>>>> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
>>>> +/*
>>>> + * Correspondence table for small and capital Unicode letters in the range of
>>>> + * 0x - 0x based on http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
>>>> + */
>>>> +
>>>> +struct capitalization_table {
>>>> +  u16 upper;
>>>> +  u16 lower;
>>>> +};
>>>> +
>>>> +#define UNICODE_CAPITALIZATION_TABLE { \
>>>
>>> Ugh, that is a *lot* of data. How much does the binary size grow with
>>> the table compiled in?

That data is also in glibc. I don’t know whether you use glibc though
...

>>> Is there any slightly more sophisticated pattern in the table maybe that
>>> we could just express as code? Would that turn out smaller maybe?
>> 
>> This is 3792 bytes of data. Unicode capitalization is quite random in
>> arranging lower and upper letters.
>> 
>> We could resort to zlib or gzip. But these libraries are not built by
>> default.
>
> Yeah, and that only adds more overhead.
>
>> Most urgently we will need the capitalization table for generating and
>> checking short FAT filenames, so we could create a configuration switch
>> that would reduce this table to codepage 437 or codepage 1250 letters
>> depending on the chosen native character set.
>
> I think that's a great idea. There probably is a lot of overlap even
> between the two, so maybe just make it a config option for "non-latin
> upper/lower case conversion".
>
>> In EDK2 I only found code for codepage 1250.
>
> Yeah, I'd be surprised if people really needed more. In fact, how about
> you just default the config option to =n?
>
>
> Alex
>

-- 
 Mike FABIAN   
Lack of sleep is the enemy of good work.


Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-27 Thread Alexander Graf


> On 27.08.2018 at 10:30, Mike FABIAN wrote:
> 
> Alexander Graf wrote:
> 
>>> On 26.08.18 21:00, Heinrich Schuchardt wrote:
>>>> On 08/26/2018 08:22 PM, Alexander Graf wrote:
 
 
> On 11.08.18 17:28, Heinrich Schuchardt wrote:
> This patch provides a define to initialize a table that maps lower to
> capital letters for Unicode code point 0x - 0x.
> 
> Signed-off-by: Heinrich Schuchardt 
> ---
> MAINTAINERS  |1 +
> include/capitalization.h | 1909 ++
> 2 files changed, 1910 insertions(+)
> create mode 100644 include/capitalization.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a324139471..0a543309f2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -368,6 +368,7 @@ F:doc/DocBook/efi.tmpl
> F:doc/README.uefi
> F:doc/README.iscsi
> F:Documentation/efi.rst
> +F:include/capitalization.h
> F:include/efi*
> F:include/pe.h
> F:include/asm-generic/pe.h
> diff --git a/include/capitalization.h b/include/capitalization.h
> new file mode 100644
> index 00..50d5108f98
> --- /dev/null
> +++ b/include/capitalization.h
> @@ -0,0 +1,1909 @@
> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
> +/*
> + * Correspondence table for small and capital Unicode letters in the 
> range of
> + * 0x - 0x based on 
> http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
> + */
> +
> +struct capitalization_table {
> +u16 upper;
> +u16 lower;
> +};
> +
> +#define UNICODE_CAPITALIZATION_TABLE { \
 
>>>> Ugh, that is a *lot* of data. How much does the binary size grow with
>>>> the table compiled in?
> 
> That data is also in glibc. I don’t know whether you use glibc though
> ...

U-Boot is a standalone OS, so to speak; we do not use glibc (except for sandbox,
but that's a special target).

The main problem is that people sometimes cram U-Boot into very tight spaces,
like small flash chips or even on-chip RAM. So we have to be very cautious about
space requirements.

Alex




Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-26 Thread Alexander Graf


On 26.08.18 21:00, Heinrich Schuchardt wrote:
> On 08/26/2018 08:22 PM, Alexander Graf wrote:
>>
>>
>> On 11.08.18 17:28, Heinrich Schuchardt wrote:
>>> This patch provides a define to initialize a table that maps lower to
>>> capital letters for Unicode code point 0x - 0x.
>>>
>>> Signed-off-by: Heinrich Schuchardt 
>>> ---
>>>  MAINTAINERS  |1 +
>>>  include/capitalization.h | 1909 ++
>>>  2 files changed, 1910 insertions(+)
>>>  create mode 100644 include/capitalization.h
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index a324139471..0a543309f2 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -368,6 +368,7 @@ F:  doc/DocBook/efi.tmpl
>>>  F: doc/README.uefi
>>>  F: doc/README.iscsi
>>>  F: Documentation/efi.rst
>>> +F: include/capitalization.h
>>>  F: include/efi*
>>>  F: include/pe.h
>>>  F: include/asm-generic/pe.h
>>> diff --git a/include/capitalization.h b/include/capitalization.h
>>> new file mode 100644
>>> index 00..50d5108f98
>>> --- /dev/null
>>> +++ b/include/capitalization.h
>>> @@ -0,0 +1,1909 @@
>>> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
>>> +/*
>>> + * Correspondence table for small and capital Unicode letters in the range 
>>> of
>>> + * 0x - 0x based on 
>>> http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
>>> + */
>>> +
>>> +struct capitalization_table {
>>> +   u16 upper;
>>> +   u16 lower;
>>> +};
>>> +
>>> +#define UNICODE_CAPITALIZATION_TABLE { \
>>
>> Ugh, that is a *lot* of data. How much does the binary size grow with
>> the table compiled in?
>>
>> Is there any slightly more sophisticated pattern in the table maybe that
>> we could just express as code? Would that turn out smaller maybe?
> 
> This is 3792 bytes of data. Unicode capitalization is quite random in
> arranging lower and upper letters.
> 
> We could resort to zlib or gzip. But these libraries are not built by
> default.

Yeah, and that only adds more overhead.

> Most urgently we will need the capitalization table for generating and
> checking short FAT filenames, so we could create a configuration switch
> that would reduce this table to codepage 437 or codepage 1250 letters
> depending on the chosen native character set.

I think that's a great idea. There probably is a lot of overlap even
between the two, so maybe just make it a config option for "non-latin
upper/lower case conversion".

> In EDK2 I only found code for codepage 1250.

Yeah, I'd be surprised if people really needed more. In fact, how about
you just default the config option to =n?


Alex


Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-26 Thread Heinrich Schuchardt
On 08/26/2018 08:22 PM, Alexander Graf wrote:
> 
> 
> On 11.08.18 17:28, Heinrich Schuchardt wrote:
>> This patch provides a define to initialize a table that maps lower to
>> capital letters for Unicode code point 0x - 0x.
>>
>> Signed-off-by: Heinrich Schuchardt 
>> ---
>>  MAINTAINERS  |1 +
>>  include/capitalization.h | 1909 ++
>>  2 files changed, 1910 insertions(+)
>>  create mode 100644 include/capitalization.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index a324139471..0a543309f2 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -368,6 +368,7 @@ F:   doc/DocBook/efi.tmpl
>>  F:  doc/README.uefi
>>  F:  doc/README.iscsi
>>  F:  Documentation/efi.rst
>> +F:  include/capitalization.h
>>  F:  include/efi*
>>  F:  include/pe.h
>>  F:  include/asm-generic/pe.h
>> diff --git a/include/capitalization.h b/include/capitalization.h
>> new file mode 100644
>> index 00..50d5108f98
>> --- /dev/null
>> +++ b/include/capitalization.h
>> @@ -0,0 +1,1909 @@
>> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
>> +/*
>> + * Correspondence table for small and capital Unicode letters in the range 
>> of
>> + * 0x - 0x based on 
>> http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
>> + */
>> +
>> +struct capitalization_table {
>> +u16 upper;
>> +u16 lower;
>> +};
>> +
>> +#define UNICODE_CAPITALIZATION_TABLE { \
> 
> Ugh, that is a *lot* of data. How much does the binary size grow with
> the table compiled in?
> 
> Is there any slightly more sophisticated pattern in the table maybe that
> we could just express as code? Would that turn out smaller maybe?

This is 3792 bytes of data. Unicode capitalization is quite random in
arranging lower and upper letters.

We could resort to zlib or gzip. But these libraries are not built by
default.

Most urgently we will need the capitalization table for generating and
checking short FAT filenames, so we could create a configuration switch
that would reduce this table to codepage 437 or codepage 1250 letters
depending on the chosen native character set.
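
The configuration switch described above could look roughly like the following sketch. CONFIG_UNICODE_CAPITALIZATION_FULL is a hypothetical option name, fat_toupper is a hypothetical helper, and the reduced table lists only a few letter pairs that exist in codepage 437, for illustration; it is not a complete codepage table. The u16 typedef stands in for U-Boot's own type.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;	/* stand-in for U-Boot's u16 */

#ifdef CONFIG_UNICODE_CAPITALIZATION_FULL	/* hypothetical option name */
/* Full table: struct and UNICODE_CAPITALIZATION_TABLE come from the patch. */
#include <capitalization.h>
#define CAPITALIZATION_TABLE UNICODE_CAPITALIZATION_TABLE
#else
struct capitalization_table {
	u16 upper;
	u16 lower;
};

/* Reduced table: illustrative subset of codepage 437 letter pairs. */
#define CAPITALIZATION_TABLE { \
	{ 0x00C4, 0x00E4 }, /* A WITH DIAERESIS */ \
	{ 0x00C5, 0x00E5 }, /* A WITH RING ABOVE */ \
	{ 0x00C6, 0x00E6 }, /* AE */ \
	{ 0x00C7, 0x00E7 }, /* C WITH CEDILLA */ \
	{ 0x00C9, 0x00E9 }, /* E WITH ACUTE */ \
	{ 0x00D1, 0x00F1 }, /* N WITH TILDE */ \
	{ 0x00D6, 0x00F6 }, /* O WITH DIAERESIS */ \
	{ 0x00DC, 0x00FC }, /* U WITH DIAERESIS */ \
}
#endif

static const struct capitalization_table cap_table[] = CAPITALIZATION_TABLE;

/* Hypothetical helper for short FAT names: capitalize one UTF-16 unit. */
u16 fat_toupper(u16 c)
{
	size_t i;

	/* ASCII letters need no table at all. */
	if (c >= 'a' && c <= 'z')
		return c - 0x20;

	for (i = 0; i < sizeof(cap_table) / sizeof(cap_table[0]); i++) {
		if (cap_table[i].lower == c)
			return cap_table[i].upper;
	}
	return c;	/* not covered by the selected table */
}
```

With the option left unset, the reduced table above costs 32 bytes instead of 3792, at the price of leaving letters outside the chosen codepage unconverted.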

In EDK2 I only found code for codepage 1250.

Best regards

Heinrich

> 
> 
> Alex
> 
>> +{ 0x0531, /* ARMENIAN CAPITAL LETTER AYB */ \
>> +  0x0561, /* ARMENIAN SMALL LETTER AYB */ }, \
>> +{ 0x0532, /* ARMENIAN CAPITAL LETTER BEN */ \
>> +  0x0562, /* ARMENIAN SMALL LETTER BEN */ }, \
>> +{ 0x053E, /* ARMENIAN CAPITAL LETTER CA */ \
>> +  0x056E, /* ARMENIAN SMALL LETTER CA */ }, \
>> +{ 0x0549, /* ARMENIAN CAPITAL LETTER CHA */ \
> 
> [...]
> 



Re: [U-Boot] [PATCH 11/15] efi_loader: capitalization table

2018-08-26 Thread Alexander Graf


On 11.08.18 17:28, Heinrich Schuchardt wrote:
> This patch provides a define to initialize a table that maps lower to
> capital letters for Unicode code point 0x - 0x.
> 
> Signed-off-by: Heinrich Schuchardt 
> ---
>  MAINTAINERS  |1 +
>  include/capitalization.h | 1909 ++
>  2 files changed, 1910 insertions(+)
>  create mode 100644 include/capitalization.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a324139471..0a543309f2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -368,6 +368,7 @@ F:doc/DocBook/efi.tmpl
>  F:   doc/README.uefi
>  F:   doc/README.iscsi
>  F:   Documentation/efi.rst
> +F:   include/capitalization.h
>  F:   include/efi*
>  F:   include/pe.h
>  F:   include/asm-generic/pe.h
> diff --git a/include/capitalization.h b/include/capitalization.h
> new file mode 100644
> index 00..50d5108f98
> --- /dev/null
> +++ b/include/capitalization.h
> @@ -0,0 +1,1909 @@
> +/* SPDX-License-Identifier: Unicode-DFS-2016 */
> +/*
> + * Correspondence table for small and capital Unicode letters in the range of
> + * 0x - 0x based on 
> http://www.unicode.org/Public/UCA/11.0.0/allkeys.txt
> + */
> +
> +struct capitalization_table {
> + u16 upper;
> + u16 lower;
> +};
> +
> +#define UNICODE_CAPITALIZATION_TABLE { \

Ugh, that is a *lot* of data. How much does the binary size grow with
the table compiled in?

Is there maybe a slightly more sophisticated pattern in the table that
we could just express as code? Would that turn out smaller?
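
To make the question concrete, here is a rough sketch of what "expressing the pattern as code" could mean: contiguous runs of small letters whose capitals sit at a fixed offset are stored as one range entry each. The names case_range and range_toupper are hypothetical, the u16 typedef stands in for U-Boot's type, and the four runs are an illustrative subset, not a replacement for the table in the patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;	/* stand-in for U-Boot's u16 */

/* One contiguous run of small letters with capitals at a fixed offset. */
struct case_range {
	u16 first_lower;	/* first small letter in the run */
	u16 last_lower;		/* last small letter in the run */
	u16 delta;		/* lower minus upper, same for the whole run */
};

/* Illustrative subset only; not the full table from the patch. */
static const struct case_range case_ranges[] = {
	{ 0x0061, 0x007A, 0x20 },	/* LATIN SMALL LETTER A..Z */
	{ 0x03B1, 0x03C1, 0x20 },	/* GREEK SMALL LETTER ALPHA..RHO */
	{ 0x0430, 0x044F, 0x20 },	/* CYRILLIC SMALL LETTER A..YA */
	{ 0x0561, 0x0586, 0x30 },	/* ARMENIAN SMALL LETTER AYB..FEH */
};

/* Return the capital letter for c if a run covers it, else c itself. */
u16 range_toupper(u16 c)
{
	size_t i;

	for (i = 0; i < sizeof(case_ranges) / sizeof(case_ranges[0]); i++) {
		if (c >= case_ranges[i].first_lower &&
		    c <= case_ranges[i].last_lower)
			return c - case_ranges[i].delta;
	}
	return c;
}
```

Each range costs 6 bytes here against 4 bytes per letter pair in the table, so it only pays off for runs of at least two letters. How much of the full table falls into such runs cannot be judged from the excerpt.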


Alex

> + { 0x0531, /* ARMENIAN CAPITAL LETTER AYB */ \
> +   0x0561, /* ARMENIAN SMALL LETTER AYB */ }, \
> + { 0x0532, /* ARMENIAN CAPITAL LETTER BEN */ \
> +   0x0562, /* ARMENIAN SMALL LETTER BEN */ }, \
> + { 0x053E, /* ARMENIAN CAPITAL LETTER CA */ \
> +   0x056E, /* ARMENIAN SMALL LETTER CA */ }, \
> + { 0x0549, /* ARMENIAN CAPITAL LETTER CHA */ \

[...]