One "editor" worth mentioning in this context is MS Word, used with its
"Plain Text" file format.
This may sound unbelievable, but let me explain.

Whenever someone asks how to convert from one encoding to another
but is not willing to download (or learn how to use) a new tool or
editor,
I suggest using MS Word.
It helps to activate the (little-known) "Confirm conversion at Open"
option.

The conversion dialog is available both when opening and when saving a file,
and it offers all installed code pages,
including some (but not all) UTF encodings.
When saving, the line break style (such as CR/LF) can be chosen as well.
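
For readers who do want a scripted equivalent, the same two choices (target
encoding and line break style) can be sketched in Python. This is only a
minimal illustration, not what Word does internally; the function name
convert_bytes and the sample bytes are made up for the example.

```python
def convert_bytes(data: bytes, src_encoding: str, dst_encoding: str,
                  newline: str = "\r\n") -> bytes:
    """Re-encode text and apply one line break style (hypothetical helper)."""
    # Decode with the encoding the file is assumed to be in.
    text = data.decode(src_encoding)
    # Normalize every existing line break (CR/LF, lone CR) to LF first ...
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # ... then apply the line break style chosen for saving.
    if newline != "\n":
        text = text.replace("\n", newline)
    return text.encode(dst_encoding)


if __name__ == "__main__":
    # Sample input: Windows-1252 bytes with mixed line breaks.
    source = b"caf\xe9\r\nna\xefve\n"
    print(convert_bytes(source, "cp1252", "utf-8", newline="\r\n"))
```

Working on bytes rather than file names keeps the sketch independent of any
particular file; wrapping it with open(..., "rb") and open(..., "wb") is
straightforward.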

When opening a file, the preview is quite useful for guessing the unknown
encoding of a text file.
When saving, the preview highlights characters that would be lost if the chosen
encoding were actually used.
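
Both kinds of preview can be approximated in a few lines of Python. The sketch
below is an assumption about how one might reproduce the idea, not a
description of Word's dialog; preview_decodings and lost_characters are
hypothetical names.

```python
def preview_decodings(data: bytes, candidates, length: int = 60) -> dict:
    """Decode the same bytes with several candidate encodings for comparison."""
    previews = {}
    for encoding in candidates:
        try:
            previews[encoding] = data.decode(encoding)[:length]
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding at all.
            previews[encoding] = None
    return previews


def lost_characters(text: str, encoding: str) -> list:
    """Return (index, character) pairs that the target encoding cannot store."""
    lost = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            lost.append((index, char))
    return lost


if __name__ == "__main__":
    raw = "Grüße".encode("utf-8")
    for name, preview in preview_decodings(raw, ["utf-8", "latin-1", "ascii"]).items():
        print(name, repr(preview))
    print(lost_characters("Grüße €", "latin-1"))
```

A preview of None rules an encoding out; among the remaining candidates a
human still has to judge which decoding looks like sensible text.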

Limitations: when writing UTF-8, a BOM is always added. Some
encodings have
confusing names, such as "GB2312" (which actually means GBK) vs. "GB2312-80"
(which is the real GB2312-1980).
Files may have to be renamed to a *.txt suffix to force the conversion dialog
to appear.
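
If the forced BOM or the GB2312/GBK naming causes trouble, a short Python
check can help after the fact. This is a sketch under the assumption that
Python's codec names "gb2312" and "gbk" denote the strict GB2312-80 repertoire
and its GBK superset respectively; the helper names and the sample character
are chosen only for illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the bytes start with the UTF-8 byte order mark EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM, if there is one."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def can_encode(text: str, encoding: str) -> bool:
    """True if every character of text exists in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    saved = "abc".encode("utf-8-sig")  # "utf-8-sig" writes a BOM, "utf-8" does not
    print(has_utf8_bom(saved), strip_utf8_bom(saved))
    # A traditional character: part of GBK, but not of GB2312-80.
    print(can_encode("龍", "gbk"), can_encode("龍", "gb2312"))
```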


As an additional hint, I want to mention a tool named "WinMerge", which is
quite useful
for comparing text files by content. The encoding can be chosen separately for
each of the two files being compared, and
the line break styles may differ, too.
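
The idea of such a content-based comparison can be sketched in Python with the
standard difflib module. This only illustrates the principle and is not how
WinMerge is implemented; content_diff and the sample data are hypothetical.

```python
import difflib


def content_diff(data_a: bytes, encoding_a: str,
                 data_b: bytes, encoding_b: str) -> list:
    """Compare two byte strings as text, ignoring encoding and line break style."""
    def to_lines(data: bytes, encoding: str) -> list:
        text = data.decode(encoding)
        # Treat CR/LF, lone CR and LF as the same line break.
        text = text.replace("\r\n", "\n").replace("\r", "\n")
        return text.split("\n")

    lines_a = to_lines(data_a, encoding_a)
    lines_b = to_lines(data_b, encoding_b)
    # An empty result means the two texts have identical content.
    return list(difflib.unified_diff(lines_a, lines_b, "a", "b", lineterm=""))


if __name__ == "__main__":
    windows_file = "Grüße\r\nWelt\r\n".encode("cp1252")
    unix_file = "Grüße\nWelt\n".encode("utf-8")
    print(content_diff(windows_file, "cp1252", unix_file, "utf-8"))
```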

Albrecht

________________________________
From: [email protected] [mailto:[email protected]] On Behalf 
Of Stephan Stiller
Sent: Thursday, October 04, 2012 6:59 AM
To: [email protected]
Subject: texteditors that can process and save in different encodings

Dear all,

In your experience, what are the best (plaintext) text editors or word
processors for Linux / Mac OS X / Windows that have the ability to save in many
different encodings?

This question is more specific than asking which editors have the best
knowledge of conversion tables for codepages (incl. their different versions),
which I'm interested in as well. There are a number of programs that appear to
be able to read many different encodings – though I prefer the type that 
actually tells me about where format errors are when a file is loaded. Then, 
many editors that claim to be able to read all those encodings cannot display 
them; as for that, I don't care about font choice and the aesthetics of 
display, as I'm only interested in plaintext.

Some things I have seen that are no good:

 *   the editor not telling me about the encoding and line breaks it has 
detected and not letting me choose
 *   the editor displaying a BOM in hex mode even if there is none (a version 
of UltraEdit I worked with at some point)

Stephan
