Re: [Freedos-devel] GitLab - problem with codepages

2022-08-02 Thread Jerome Shidel
Hi, 

> On Aug 2, 2022, at 7:15 PM, Andrew Bird via Freedos-devel wrote:
> 
> Hi Aitor,
> 
> On Wed, 3 Aug 2022 00:37:29 +0200
> Aitor Santamaría wrote:
> 
>> Hi,
>> 
>> I am trying to adapt to work with GIT from the GitLab repository, and am
>> committing a couple of changes on FD-KEYB that I had ready.
>> 
>> However, I have noticed a problem with the codepages. I try to avoid
>> non-ASCII characters, but occasionally use those that are common to most
>> codepages.
>> What I see is that neither the character stored (which the web omits) nor
>> the character committed (a strange character) is correct, which obviously
>> didn't happen when I simply committed a ZIP file with everything:
>> 
>> [image: image.png]
>> 
>> I can live with them and try and take all characters from ASCII, but I am
>> just worried if the sources at GIT would all be affected by this apparent
>> codepage problem.
>> 
>> Maybe someone more expert on GIT than myself can add something on this.
>> 
>> Aitor
> 
> The thing is that git stores file contents as raw bytes and does no
> codepage conversion of its own, while the tools that display diffs
> (including the web interfaces) generally assume UTF-8. So unless a text
> file has a matching entry in the .gitattributes file, it's treated as
> UTF-8 (or as binary if it can't be interpreted as UTF-8). Consequently,
> how a diff of a codepage-encoded file gets rendered is anybody's guess.
> You could set .gitattributes to match each codepage, which might make
> sense for NLS files etc., as I have done for some projects such as find
> (you'll see that the diff is rendered properly):
> https://github.com/FDOS/find/commit/f763ce94f837e15c2e865e8fc333f8ca8396d427
> Ideally, though, we'd keep all master translation files in UTF-8 and
> translate to specific codepages at build time.
> If the problematic file is a source code file, I'd tend to encode it in
> UTF-8, as any special chars are likely to be in comments and so have no
> bearing on the output.
> 
> Hope it helps, Andrew

There are some things I’ve noticed regarding DOS codepage files while doing
the FD-NLS project on GitHub and the Archive on GitLab.

Basically, some of the codepage translation files show incorrect characters
when viewed through the web interface. However, checking out (or cloning)
the project back to a local machine shows the files were not changed and
the codepage was preserved.
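
To illustrate why the web view and the checked-out file can disagree, here is a minimal Python sketch. The sample bytes are made up for the example, and the use of U+FFFD for invalid bytes is Python's behaviour, not necessarily what GitLab or GitHub display. The point is only that the stored bytes stay the same while the decoder chosen by the viewer changes what you see.

```python
# The same stored bytes, decoded three different ways.
# Hypothetical sample: bytes as a DOS editor might have saved them.
raw = b"caf\x82 \x9b"

as_cp437 = raw.decode("cp437")                   # DOS US codepage
as_cp850 = raw.decode("cp850")                   # DOS Western European codepage
as_utf8 = raw.decode("utf-8", errors="replace")  # what a UTF-8 viewer attempts

for label, text in (("cp437", as_cp437), ("cp850", as_cp850), ("utf-8", as_utf8)):
    print(f"{label:6} {text!r}")

# Decoding never altered the bytes themselves; encoding back restores them.
assert as_cp437.encode("cp437") == raw
```

Bytes 0x82 and 0x9B are not valid UTF-8 on their own, so a UTF-8 viewer has to drop or substitute them, while a local DOS tool using the right codepage shows the intended characters.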

The biggest problem I’ve found is modern editors mangling codepages or 
incorrect codepages being used for a given language. In part that is why I 
started the FD-NLS Desktop App. 

Overall, it is probably a good idea to include a UTF-8 version of any 
translated text along with the codepage versions.

Jerome

___
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel


Re: [Freedos-devel] GitLab - problem with codepages

2022-08-02 Thread Andrew Bird via Freedos-devel
Hi Aitor,

On Wed, 3 Aug 2022 00:37:29 +0200
Aitor Santamaría wrote:

> Hi,
> 
> I am trying to adapt to work with GIT from the GitLab repository, and am
> committing a couple of changes on FD-KEYB that I had ready.
> 
> However, I have noticed a problem with the codepages. I try to avoid
> non-ASCII characters, but occasionally use those that are common to most
> codepages.
> What I see is that neither the character stored (which the web omits) nor
> the character committed (a strange character) is correct, which obviously
> didn't happen when I simply committed a ZIP file with everything:
> 
> [image: image.png]
> 
> I can live with them and try and take all characters from ASCII, but I am
> just worried if the sources at GIT would all be affected by this apparent
> codepage problem.
> 
> Maybe someone more expert on GIT than myself can add something on this.
> 
> Aitor

The thing is that git stores file contents as raw bytes and does no
codepage conversion of its own, while the tools that display diffs
(including the web interfaces) generally assume UTF-8. So unless a text
file has a matching entry in the .gitattributes file, it's treated as
UTF-8 (or as binary if it can't be interpreted as UTF-8). Consequently,
how a diff of a codepage-encoded file gets rendered is anybody's guess.
You could set .gitattributes to match each codepage, which might make
sense for NLS files etc., as I have done for some projects such as find
(you'll see that the diff is rendered properly):
https://github.com/FDOS/find/commit/f763ce94f837e15c2e865e8fc333f8ca8396d427
Ideally, though, we'd keep all master translation files in UTF-8 and
translate to specific codepages at build time.
If the problematic file is a source code file, I'd tend to encode it in
UTF-8, as any special chars are likely to be in comments and so have no
bearing on the output.
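
As a rough sketch of the "UTF-8 master, codepage at build time" idea, something like the following Python could run as a build step. The function names and the file paths passed to them are hypothetical and are not part of any existing FreeDOS build script; the codepage names are Python's own codec names (for example "cp437", "cp850").

```python
def encode_for_codepage(text: str, codepage: str) -> bytes:
    """Encode text for one DOS codepage, failing on unmappable characters."""
    # errors="strict" raises UnicodeEncodeError instead of silently writing "?",
    # so a translation that does not fit the codepage breaks the build early.
    return text.encode(codepage, errors="strict")


def build_translation(master_path: str, out_path: str, codepage: str) -> None:
    """Read a UTF-8 master file and write a codepage-encoded copy."""
    # newline="" keeps the line endings of the master file exactly as they are.
    with open(master_path, encoding="utf-8", newline="") as src:
        text = src.read()
    with open(out_path, "wb") as dst:
        dst.write(encode_for_codepage(text, codepage))


if __name__ == "__main__":
    print(encode_for_codepage("café", "cp850"))
```

Failing loudly on an unmappable character is a design choice for the sketch: it catches the "incorrect codepage for a given language" case at build time rather than on the user's screen.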

Hope it helps, Andrew




[Freedos-devel] GitLab - problem with codepages

2022-08-02 Thread Aitor Santamaría
Hi,

I am trying to adapt to work with GIT from the GitLab repository, and am
committing a couple of changes on FD-KEYB that I had ready.

However, I have noticed a problem with the codepages. I try to avoid
non-ASCII characters, but occasionally use those that are common to most
codepages.
What I see is that neither the character stored (which the web omits) nor
the character committed (a strange character) is correct, which obviously
didn't happen when I simply committed a ZIP file with everything:

[image: image.png]

I can live with them and try and take all characters from ASCII, but I am
just worried if the sources at GIT would all be affected by this apparent
codepage problem.
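
One way to see how much of a source tree is exposed to this is to list every byte outside the ASCII range before committing. The following Python is only a sketch: the function name and the sample bytes are invented for the example and do not come from FD-KEYB.

```python
def find_non_ascii(data: bytes):
    """Return (line_number, column, byte_value) for every byte above 0x7F."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


# Hypothetical two-line source file with one codepage-dependent byte.
sample = b"; KEYB layout\r\n; caf\x82\r\n"

for line_no, col, byte in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: byte 0x{byte:02X}")
```

A file for which this reports nothing is plain ASCII and reads the same under UTF-8 and under the DOS codepages, so it is unaffected by the display problem.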

Maybe someone more expert on GIT than myself can add something on this.

Aitor