Re: [Freedos-devel] FreeDOS code page Unicode compatibility

Eric Auer via Freedos-devel Sun, 02 Jul 2023 16:29:22 -0700

Hi!

If you’re interested in the whole forint sign issue, read my Unicode
proposal and its follow-up:

https://www.unicode.org/L2/L2023/23060r-forint-sign.pdf
https://www.unicode.org/L2/L2023/ ... -forint-sign-follow-up.pdf


The way in which the follow-up quotes me leaves out quite a few
details relevant to this issue. However, those who wonder about the
follow-up may just as well be interested in what we have discussed
off-list, so I quote myself below :-p I think we should investigate
how much support for mappings between Unicode and codepages each of
those 30+ apps contains. I was surprised that there are that many
CANDIDATES in our distro, but a candidate is not the same as an app
which we KNOW to contain sufficiently comprehensive support to make
it relevant for the Unicode Forint sign topic.

Because of this, please support my OTHER thread on the question:

2023-06-23 on Freedos-user:

"Unicode and codepages in apps already bundled with FreeDOS?"

My first attempt on the list of apps, off-list to Vacek, 2023-06-17:

Hi Vacek,

thanks for the CP3845 details!

Wilhelm Spiegl has replied to me since, indicating that HTMLHELP does not
use mapping tables to convert *to*, only *from* Unicode and a forint sign
is also unlikely to occur in help text.


For clarification, I then followed up with some explanations:

[...] You can use HTMLHELP as a generic minimal HTML viewer.

But as it only converts FROM Unicode, one would have to agree
on a Unicode value for Forint to let it display "Unicode Forint"
in codepages which include the Forint sign. I guess you can
also use HTMLHELP for non-Unicode HTML. In that case, it will
simply have no opinion about characters, so you could use it
to view CP3845 HTML simply by letting DOS use the CP3845 font.

Other projects may have stronger cases to include the mapping, so thank you
for the list, I’ll check whether any of these projects fit the scope of my
project and contact the maintainers.


I think it would be good in general to have a list of apps
which says for each whether Unicode and/or codepage conversions
are supported, not supported or of unknown status. That list
could start with the 30+ apps mentioned in my mail. Then you
could check the docs and sources to reduce the number of
unknowns. The next step could be to ask about the rest on the
FreeDOS mailing list, and only for those where no mailing list
reader knows the answer would you have to ask the maintainers 😄

Of course I do not know whether you plan to spend much time on
this. But as said, it would be good to know how far along our
distro is when it comes to Unicode support and related codepage
tricks.

Would you allow me to cite relevant parts of your letter in a document relayed
to the people in charge of my Unicode proposal? It definitely suggests
there are numerous applications ‘bridging’ FreeDOS and Unicode, which
supports my stance that a 1:1 mapping of this DOS-derived character ligature
is needed and that substitution with a regular letter sequence “Ft”, as
suggested by the Unicode Consortium, is not an option for these purposes.

I think the existence of projects like those listed would already be enough
impetus for the Unicode Consortium to move forward. If and when a stable
code point assignment is made (the next UTC meeting is in July), I’ll
notify any interested maintainers.

Thanks for your help,
Vacek


I do not think that my mail will help you in any way with the
Unicode people at this moment: I simply went through the list
of the apps in our distro and, to my own surprise, found more
than 30 for which I BELIEVE that there is a CHANCE that they
include Unicode support or at least some type of translation
between codepages.

You will first have to find out how many of those REALLY have
the features we hope they have.

In the end, you could argue along the lines of this thought:

With DOS, including the actively supported free open source
distro FreeDOS, people have been using fonts with a Forint sign
for decades. In DOS, this required selecting an 8-bit mapping
(code page) which includes the sign, so people had to decide
whether they wanted to work with a font with Hungarian Forint
and other Hungarian characters or with a font which includes
other accented characters common in other languages. Unicode
aims to solve the problem of having to switch fonts for text
in different languages. It makes it possible to use one font
which supports many languages at the same time. However, this
also means that all different characters have to be assigned
to globally unique Unicode code points. At the moment, there
is the unusual situation that a widespread and very old font
mapping supports a currency symbol which is still unknown to
the very newest version of Unicode: the Forint sign. So it
is not possible to exchange text including that sign in new,
Unicode based file formats. Only old 8-bit formats which have
been in use since DOS became popular can carry it, which is
good for modern-day DOS users but bad for Unicode users.

FreeDOS has started to include apps with Unicode support, but
as long as there is no standard Unicode code point for the
Forint sign, only preliminary support based on a temporary
Unicode assignment could be added to individual apps. With
an official mapping, more apps, both for DOS and other, more
widespread operating systems, could support the Forint sign.

In FreeDOS, the following apps and drivers either work with
their own Unicode fonts or include support for mappings between
Unicode and codepage based fonts commonly used in DOS: ...

And there you would have to add a list of examples. I think
in MinEd, you would load CP3845 and when editing Unicode
text, it would show characters existing in CP3845 correctly
and other characters would be replaced by placeholder chars.
But first, it would have to KNOW things about THAT code page.
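
The placeholder behaviour I describe for MinEd can be sketched in a
few lines of Python. This is only an illustration and not MinEd's
actual code: Python ships no CP3845 codec, so the standard cp852
codec (another DOS code page which covers the Hungarian letters) is
used as a stand-in, and the function name is made up for this sketch.

```python
def show_in_codepage(text: str, codepage: str = "cp852",
                     placeholder: str = "?") -> str:
    """Return text as a display limited to one code page could show it.

    Characters which the code page can encode are kept as they are,
    all other characters are replaced by the placeholder character.
    """
    shown = []
    for ch in text:
        try:
            ch.encode(codepage)      # raises if the code page lacks ch
            shown.append(ch)
        except UnicodeEncodeError:
            shown.append(placeholder)
    return "".join(shown)


# The euro sign is not part of cp852, the Hungarian letters are.
print(show_in_codepage("ő ű á: 100 €"))
```

The important point is the lookup itself: without a table for THAT
code page, the app cannot decide which characters to keep.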

In Blocek, support would depend on whether somebody ADDS a
Forint sign to the FONT bundled with Blocek. In both cases,
the problem will be that Forint has no OFFICIAL code point
in Unicode yet. So if MinEd and Blocek would AGREE on some
temporary Unicode code point, you could present those as proof
of concept that giving Forint a well-defined code point will
make things better for users both in the Unicode world AND
in the code page world.
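
As a rough sketch of what "agreeing on a temporary code point" could
mean in practice, here is a two-way mapping in Python. All concrete
values are assumptions made only for this example: U+E000 is simply
a code point from the Unicode Private Use Area and not a proposed
assignment, the byte value 0xFE is made up and NOT taken from the
real CP3845 table, and cp852 again stands in for CP3845 because
Python has no codec for the latter.

```python
BASE = "cp852"          # stand-in codec, Python ships no CP3845 codec
TEMP_FORINT = "\ue000"  # hypothetical Private Use Area code point
FORINT_BYTE = 0xFE      # made-up byte position, NOT from CP3845

# The character which BASE normally stores at FORINT_BYTE is displaced.
DISPLACED = bytes([FORINT_BYTE]).decode(BASE)


def decode_with_forint(data: bytes) -> str:
    """Code page bytes -> Unicode text, using the temporary code point."""
    return "".join(
        TEMP_FORINT if b == FORINT_BYTE else bytes([b]).decode(BASE)
        for b in data
    )


def encode_with_forint(text: str) -> bytes:
    """Unicode text -> code page bytes, inverse of decode_with_forint."""
    out = bytearray()
    for ch in text:
        if ch == TEMP_FORINT:
            out.append(FORINT_BYTE)
        elif ch == DISPLACED:
            # Its byte now means Forint, so it has no encoding any more.
            raise ValueError("character displaced by the Forint sign")
        else:
            out += ch.encode(BASE)
    return bytes(out)


price = b"100 " + bytes([FORINT_BYTE])
as_unicode = decode_with_forint(price)
assert encode_with_forint(as_unicode) == price   # round trip works
```

Two apps only interoperate if both use the same two constants, which
is exactly why an OFFICIAL code point would be so much better than a
private agreement.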

Also note that other operating systems come with tools such
as recode, which I like to use in Linux, to transform text
files between Unicode UTF-... and other encodings, for
example code page based ones. If I want to use the tool
for text including Forint signs, I must use a version of the
recode tool which 1. knows about CP3845 and 2. agrees with
other software about the Unicode code point for Forint sign.
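
The first of those two conditions can be illustrated with Python's
codec registry, which plays a role similar to the tables inside
recode. This is a sketch and not how recode itself works; the two
function names are made up, and cp852 is used for the conversion
because it is a DOS code page which Python does know.

```python
import codecs


def knows_codepage(name: str) -> bool:
    """Report whether this Python has a mapping table for the name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def recode_bytes(data: bytes, source: str, target: str) -> bytes:
    """Convert data from one encoding to another, going via Unicode."""
    for name in (source, target):
        if not knows_codepage(name):
            raise LookupError(f"no mapping table for {name}")
    return data.decode(source).encode(target)


dos_text = "Árvíztűrő tükörfúrógép".encode("cp852")
utf8_text = recode_bytes(dos_text, "cp852", "utf-8")
print(utf8_text.decode("utf-8"))
print(knows_codepage("cp3845"))
```

In a stock Python installation I would expect the last line to
report that cp3845 is unknown, so a conversion involving it fails
before the question of the Forint code point even comes up.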

Obviously, if we end up with 10 or more instead of just
2 DOS apps which COULD support Unicode-with-Forint as soon
as people agree on a code point for it, it will be FAR more
convincing 😄 And instead of saying they COULD, you might
even find some maintainers who can help you with a proof of
concept by REALLY adding support, using some self-assigned
unofficial code point for Forint to show the look and feel.

Regards, Eric


Thinking about how to find out which of those 30+ apps actually
have how much support for Unicode and codepage mappings made me
ask exactly that question on freedos-user a week later, which
started my thread mentioned above. Thanks to everybody who can
shed some light on the support level of those 30 apps. I think
it would be cool if we find out that a 2-digit number of *DOS*
apps are already aware of those things, in spite of the *first*
version of Unicode appearing around the time when the *last*
version of classic MS DOS did.

Thanks for your thoughts, everybody :-)

Regards, Eric




_______________________________________________
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel
