Not really surprising.

Copying and pasting into vim is a no-go because the distributors of vim decided,
when they coded up a rip-off of the actual 'vi' command, to add in UTF-8
support, even though the entire command-line terminal environment that vim is
used in is really an ASCII environment, NOT a UTF-8 environment.

It's important to understand that vim IS NOT vi.  This is a common
misperception among newcomers to vi (which, in my personal opinion, is the
greatest text editor ever invented).

There are two efforts out there that are as close as possible to the REAL vi code:

https://ex-vi.sourceforge.net/

https://github.com/n-t-roff/heirloom-ex-vi

I had intended for you to copy and paste into the GUI text editor that comes
with Linux, since you were copying and pasting from a web page. I had not
assumed you were running the command-line version of a web browser :-)   As you
discovered, notepadqq also supports the UTF-8 stuff, but it at least understands
when it writes out a text file that "text" means ASCII.

There's an interesting discussion of the conversion problems here with some 
suggestions you could use at the command line:

https://unix.stackexchange.com/questions/171832/converting-a-utf-8-file-to-ascii-best-effort

One commenter recommended this program:

https://manpages.ubuntu.com/manpages/jammy/man1/konwert.1.html
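
For anyone who would rather script the best-effort conversion than install another tool, here is a minimal Python sketch of the same idea. It is not taken from the linked discussion or from konwert. The function name to_ascii, the PUNCTUATION table, and the "?" placeholder are all my own assumptions for illustration, and the table only covers a few characters that commonly arrive with text pasted from a browser.

```python
import unicodedata

# Hypothetical replacement table (an assumption, extend as needed) for
# punctuation that commonly shows up in text pasted from a web page.
PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",    # curly single quotes
    "\u201c": '"', "\u201d": '"',    # curly double quotes
    "\u2013": "-", "\u2014": "--",   # en dash, em dash
    "\u2026": "...",                 # horizontal ellipsis
    "\u00a0": " ",                   # no-break space
}

def to_ascii(text, placeholder="?"):
    """Best-effort conversion of a Unicode string to pure ASCII."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Already ASCII, keep as-is.
            out.append(ch)
        elif ch in PUNCTUATION:
            # Known punctuation gets a hand-picked ASCII stand-in.
            out.append(PUNCTUATION[ch])
        else:
            # Decompose accented letters (e.g. e + combining accent) and
            # keep only the ASCII part; fall back to the placeholder.
            decomposed = unicodedata.normalize("NFKD", ch)
            stripped = "".join(c for c in decomposed if ord(c) < 128)
            out.append(stripped if stripped else placeholder)
    return "".join(out)
```

Characters with no reasonable ASCII stand-in come out as the placeholder, so they remain visible in the result and can be fixed by hand rather than silently disappearing.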

I know that this is going to sound terribly privileged and nationalistic but 
the fact is that the UNIX operating system was invented in the United States 
not in any other country, and the simple reality is that every other country 
has had the same access to electronics knowledge and scientific information 
since the invention of the vacuum tube - but every other government and culture 
on the face of the Earth pretty much didn't value any of that "tech stuff" 
until AFTER us Americans invented it.  And NOW, they all want a piece of the 
action.  Well OK maybe if they all had valued open information, the free 
exchange of ideas, scientific advancement, much more than they valued 
dictatorial socio-religious crap used to tell people what to do and how to live 
and who to screw, then MAYBE they would have gotten to the digital age FIRST 
and then maybe us Americans would have to learn Chinese if we wanted to write 
software.  (there's a reason the Americans using stone knives and bearskins 
made it to the moon and back and the Chinese today even though they manufacture 
tech that would knock 1969 NASA tech into a cocked hat - still haven't made it 
there) Get my drift, here?

UTF-8 was tacked on to UNIX as a way of accommodating the rest of the world who 
frankly couldn't give a tinker's damn about the digital age - until we 
Americans started kicking their butts with it.  So it's NEVER going to be 
completely fully integrated into the Linux experience the way ASCII is.  If
you, Randall, are stuck having to deal with that interface of American
computing to rest-of-the-world computing, you're always going to have to deal
with this fundamental mismatch.

What I find most interesting in all of this is that the tech types in the REST 
of the world fully accept this - THEY are NOT in general the ones complaining 
about the second-class citizen status of UTF-8.  They know that they came 
second, they know they came in second because the majority of people in their 
culture don't value freedom of choice, and all that other stuff needed for 
scientific advancement, and they accept that their native languages play second 
fiddle to ASCII.  They type "rm" and "ls" and all the other ASCII commands in 
UNIX/Linux without complaint, and they generally don't have a problem spending 
time on this conversion stuff...it's us Americans who are mostly bitching and 
complaining about it...not realizing that we won the digital war, here.... 
(hell, even Linus Torvalds gave up his Finnish citizenship and became a US 
citizen, that really ought to tell you something)

Ted


-----Original Message-----
From: PLUG <[email protected]> On Behalf Of American Citizen
Sent: Thursday, December 25, 2025 1:33 PM
To: [email protected]
Subject: Re: [PLUG] Ascii versus UTF-8 woes

Ted:

I am using vim, but when I attempt to write the UTF-8 file which I saved from 
the internet browser cut and paste command, into ascii format, vim fails with a 
curious error

vim command:

:write ++enc=ASCII my_ascii_file.txt

I get the following error:

"my_ascii_file.txt" E513: Write error, conversion failed (make 'fenc' 
empty to override)
WARNING: Original file may be lost or damaged don't quit the editor until the 
file is successfully written!
Press ENTER or type command to continue

And trying to internally set the values of encoding and file encoding seems to 
work

:set encoding=ascii

:set fileencoding=ascii

except when you double check the encoding, it stays at utf-8

but the fileencoding appears to be changed to the new value=ascii

But then when you attempt to overwrite the file or write to a new file, vim 
throws errors again

"new_file.txt" E513: Write error, conversion failed (make 'fenc' empty to 
override)
WARNING: Original file may be lost or damaged don't quit the editor until the 
file is successfully written!
Press ENTER or type command to continue

So I am unable to get linux vim version 9.1.83 to work to change the encoding.

I had to actually use notepadqq to paste the browser text and then set the 
encoding to ascii and this seems to work.

I suppose you could pipe the file and let tr strip off the non-ascii characters 
??? But this means going back in and manually comparing the two files, to see 
how to fix the omitted characters (if possible)
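
Rather than stripping the characters and then comparing the two files by hand, one option is to list where the non-ASCII characters are before removing anything. The following is only a sketch under my own assumptions: the function name find_non_ascii is hypothetical, and it expects the file contents to have been read already as a decoded string.

```python
def find_non_ascii(text):
    """Return (line, column, character, code point) for each non-ASCII character."""
    hits = []
    # Split on "\n" only, so that unusual Unicode line separators are
    # reported as hits instead of being swallowed as line breaks.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col_no, ch, f"U+{ord(ch):04X}"))
    return hits
```

Called on the text of the pasted file (read with encoding="utf-8"), it returns one entry per offending character, giving the line and column to visit in the editor.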

TexStudio crashed mysteriously when I turned off its internal file scanning so 
I had to set the option again.

Supposedly there is some tex sty code which allows UTF-8 to be used in a tex 
file. And yes, my editor setting under TexStudio IS UTF-8.

I already have used up at least an hour of time on this problem as iconv 
doesn't really change a pure ascii file into a UTF-8 file and vim was failing 
me.
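
A likely explanation for the iconv result is that every pure-ASCII file is already a valid UTF-8 file, byte for byte, so there is nothing for the conversion to change. The sketch below illustrates that with two hypothetical helper functions (ascii_to_utf8 and needs_more_than_ascii, names of my own choosing); it does not call iconv itself.

```python
def ascii_to_utf8(data: bytes) -> bytes:
    """Decode ASCII bytes and re-encode them as UTF-8."""
    return data.decode("ascii").encode("utf-8")

def needs_more_than_ascii(text: str) -> bool:
    """Report whether the text contains anything ASCII cannot represent."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False

# Pure ASCII input: the "converted" bytes are identical to the input.
tex_source = b"\\section{Intro}\n"
assert ascii_to_utf8(tex_source) == tex_source

# One accented letter is enough to make ASCII encoding impossible,
# while UTF-8 stores that letter as a two-byte sequence.
assert needs_more_than_ascii("caf\u00e9")
assert "caf\u00e9".encode("utf-8") == b"caf\xc3\xa9"
```

On this reading, the file only becomes distinguishable from ASCII once a non-ASCII character is actually saved into it.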

Randall

On 12/25/25 11:28, Ted Mittelstaedt wrote:
> Open the regular textedit, paste into there, save, open the saved file 
> in TexStudio
>
> Ted
>
> -----Original Message-----
> From: PLUG <[email protected]> On Behalf Of American Citizen
> Sent: Wednesday, December 24, 2025 7:40 PM
> To: Portland Linux/Unix Group <[email protected]>
> Subject: [PLUG] Ascii versus UTF-8 woes
>
> Hi:
>
> I have a set of tex files which are in pure ascii format. Unfortunately when 
> I copy material from the internet (Mozilla Firefox browser) it is in UTF-8 
> format, not ascii. This appears to be standard behavior for the internet 
> browsers.
>
> When I paste the material into the tex document (using TexStudio) the 
> paste goes okay. It only blows up when I try to save the newer file. The
> UTF-8 characters cannot be saved in ascii format and for some bizarre reason 
> TexStudio won't change the encoding to UTF-8 even though I have the option 
> set that the editor is working with the UTF-8 character set.
>
> iconv won't work either. I run
> "iconv -f ASCII -t UTF-8 input_file -o output_file" and the file remains ascii.
>
> Does anyone have an idea of how I can get TexStudio to wake up and change the 
> file encoding on the current ascii file to UTF-8?
>
> I cannot get iconv to change the ascii file to UTF-8, so I am stuck between 
> the devil and the deep blue sea.
>
> Randall
>
>
>
