On 3/29/2017 1:05 PM, The Tick wrote:
On 3/29/2017 2:36 PM, Richard Hipp wrote:
Most of the world is using UTF-8 now.
I'm wondering how that can be for programming language source files.
I managed to put the "bom" in front of a one-line tcl script:
puts "This is a copyright symbol: ©."
where the '©' was previously converted to utf-8 by fossil.
gvim now reads the file and renders the utf-8 '©' as a '©'
notepad displays the file and renders the utf-8 '©' as '©'
If a BOM encoded in UTF-8 is present, that unambiguously marks the text
as UTF-8. But, as you note, that is not always compatible with other uses
of the file. As UTF-8 was designed to be highly compatible with ASCII,
including the BOM is not usually recommended unless it is required for
other reasons.
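
To make the BOM point concrete, here is a minimal Python sketch (my own illustration, not anything fossil or Tcl does internally). It checks for the three-byte UTF-8 BOM, strips it, and then shows what the same bytes look like when decoded with an 8-bit codepage instead. The helper name strip_utf8_bom and the use of cp1252 as a stand-in for a typical Windows codepage are assumptions made for the example.

```python
import codecs


def strip_utf8_bom(data):
    """Return (payload, had_bom) for bytes that may start with a UTF-8 BOM."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):], True
    return data, False


# A one-line script saved as UTF-8 with a BOM in front (bytes EF BB BF).
script = codecs.BOM_UTF8 + 'puts "This is a copyright symbol: ©."\n'.encode("utf-8")

payload, had_bom = strip_utf8_bom(script)
print(had_bom)                  # True
print(payload.decode("utf-8"))  # the script text, without the BOM

# Assumption: cp1252 stands in for the system's 8-bit codepage.
# Decoded that way, the BOM survives as three visible characters
# attached to the first word of the file.
first_word = script.decode("cp1252").split()[0]
print(repr(first_word))
```

The last line suggests why a tool that does not expect a BOM can end up seeing three extra characters glued to the front of the first word in the file.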
VIM seems to default out of the box to Latin1 encoding which is more
consistent with Windows. (More correctly, it defaults to an encoding
consistent with the current Locale, which on Windows is usually Latin1
or another 8-bit codepage. Windows does support a UTF-8 codepage (aka
65001) but I've never seen that set as the system default.)
You can (probably) change it to support UTF-8, but it seems to make that
task as difficult as possible for a novice to the weird and subtle world
of file encoding issues. My copy of VIM 8 on Win 10 Pro correctly reads
UCS-2 (16-bit Unicode) files with BOMs, but proudly converts them to
Latin1 for display and editing, which would clearly be a bad idea if they
included characters from outside the coverage of the Latin1
codepage. Copyright and a number of other non-ASCII but otherwise
ordinary symbols are included in Latin1 and work as expected.
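
As a rough illustration of both halves of that (a sketch in Python, not a description of what VIM does internally): the copyright sign exists in Latin1, but its UTF-8 form is two bytes, so reading UTF-8 bytes as Latin1 produces an extra character. Characters outside Latin1 cannot be converted at all. The helper name fits_latin1 is made up for the example.

```python
# The copyright sign is U+00A9. In UTF-8 it is the two bytes C2 A9.
copyright_utf8 = "©".encode("utf-8")
print(copyright_utf8)  # b'\xc2\xa9'

# Reading those two bytes as Latin1 gives two characters instead of one.
misread = copyright_utf8.decode("latin-1")
print(misread)  # Â©


def fits_latin1(text):
    """Return True if every character in text exists in Latin1 (ISO-8859-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin1("©"))  # True: the copyright sign is in Latin1
print(fits_latin1("λ"))  # False: Greek lambda is outside Latin1
```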
From my reading of the help file mbyte.txt, especially Section 10 Using
UTF-8, you want to :set encoding=utf-8 before reading the file. Your
.vimrc might be a good place to do that. Another option is to
use a modeline in your .tcl file that tells vim to assume UTF-8.
Something like
# vim: set enc=utf-8 fenc=utf-8
"near" the top or bottom of the file should do the trick.
The other huge caveat is that you also need to have fonts configured
that cover enough Unicode Codepoints to be useful to you. I believe VIM
defaults to "fixedsys" on Windows which is not a Unicode font. You will
want to change to Lucida Console at least, if not to something even more
programmer-friendly such as Hack[1], Source Code Pro[2], or DejaVu Sans
Mono[3] with good Unicode coverage and other features useful to coding
without eyestrain.
[1]: http://sourcefoundry.org/hack/
[2]: http://www.adobe.com/products/type/fonts-by-adobe.html
[3]: http://dejavu-fonts.org/
You may also want your console windows to understand UTF-8. If you have
the console set to use an appropriate font (I personally use Hack for
both my consoles and my editors), then all you need to do is run CHCP 65001
at the CMD prompt to switch to the UTF-8 codepage.
>but<
$ /c/Program\ Files/tcl/bin/tclsh u.tcl
invalid command name "puts"
while executing
"puts "This is a copyright symbol: ©.""
(file "k.tcl" line 1)
While adding the option "-encoding utf-8" to the tclsh command line
makes it work, it does not work when I double-click on the .tcl file
as I have no way to set any sort of encoding option -- unless I have
to make a windows shortcut for each and every .tcl file that I want to
run and put the -encoding there.
There is a way to do this automatically. Windows uses registry keys to
associate a file extension with a logical file type, and a file type
with the command to "open" it. The assoc and ftype commands provide a
simpler interface to viewing and setting the needed registry keys. For
ActiveTcl, I have:
C:...>assoc .tcl
.tcl=ActiveTclScript
C:...>ftype ActiveTclScript
ActiveTclScript="C:\Programs\Tcl\bin\wish86.exe" "%1" %*
You can change the definition of ActiveTclScript to include -encoding utf-8.
Note that this would make your installation less consistent with those of
other users, and is thus likely not the first choice for addressing this
issue.
So, how can one use a program source file encoded in utf-8?
--
Ross Berteig
Cheshire Engineering Corp. http://www.CheshireEng.com/
+1 626 303 1602