On Sat, Aug 28, 2010 at 4:16 PM, Tony Mechelynck <[email protected]> wrote:
>> From my understanding, 'fileencoding' is the encoding Vim is supposed
>> to use to read/write the file. So, it does make sense that we should
>> use this instead of just 'encoding' for the charset of the generated
>> html. Does anyone know why TOhtml has used 'encoding' instead? I have
>> not touched the charset detection code yet, other than to move it from
>> the 2html.vim file into the autoload/tohtml.vim file.
>
> You got it right, and it does indeed make sense.
> One possibility is that anything can be represented in UTF-8, including text
> not yet saved from the latest edit of the file, and possibly incompatible
> with the 'fileencoding' - such text is of course in error, and will cause an
> error if one tries to save it.
>

Ok, I think I'll make the edit, then.

Your response gives me an idea to fix something else that's been bothering me. Currently, if Vim cannot determine the correct charset to use, it defaults to not including one at all. I think I'll have it default both the charset and the file encoding to UTF-8 if neither the 'fileencoding' nor the 'encoding' option gives a valid charset. The user should still be able to leave out the charset manually and set the encoding manually if desired.

Here's what I'm thinking in more detail:

For one buffer:

1. If the user specified a charset, try to determine 'fileencoding' from
   the charset. If this fails, warn the user that they will need to set
   'fileencoding' manually.
2. If no charset is specified, try to determine a charset from the
   'fileencoding' option. If successful, use the same 'fileencoding' and
   the associated charset in the generated buffer.
3. If no charset could be determined from 'fileencoding', try again with
   'encoding'. If successful, set 'fileencoding' to blank in the new html
   buffer and use the charset from the 'encoding' option.
4. If no charset could be determined from either 'encoding' or
   'fileencoding', default to UTF-8 and warn the user.

Multiple buffers in diff mode will be handled similarly, except that we will determine the charset as above for ALL buffers. If they differ, set 'fileencoding' to blank and use the charset from 'encoding' (or UTF-8 if no charset can be determined from 'encoding').

What do you think? Or maybe this is too complicated and I should just use 'encoding' as is done currently?

-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
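
To make the proposal above concrete, here is a rough sketch of the fallback order in Python. It is not Vim script and not the actual tohtml.vim code. The mapping table is a small made-up subset, the function names are hypothetical, and returning a blank 'fileencoding' when step 1 fails is my assumption about what "the user sets it manually" would look like.

```python
# Hypothetical, illustrative subset of a Vim-encoding -> HTML-charset table.
# The real table used by TOhtml is not reproduced here.
ENCODING_TO_CHARSET = {
    "utf-8": "UTF-8",
    "latin1": "ISO-8859-1",
    "cp1252": "windows-1252",
    "euc-jp": "EUC-JP",
}
# Reverse lookup, keyed by lower-cased charset name.
CHARSET_TO_ENCODING = {v.lower(): k for k, v in ENCODING_TO_CHARSET.items()}


def lookup_charset(enc):
    """Return the charset for a Vim encoding name, or None if unknown/empty."""
    return ENCODING_TO_CHARSET.get(enc.lower()) if enc else None


def choose_charset(user_charset, fileencoding, encoding):
    """Single buffer: return (charset, fileencoding for the html buffer, warnings)."""
    warnings = []
    # Step 1: an explicit charset wins; try to derive 'fileencoding' from it.
    if user_charset:
        fenc = CHARSET_TO_ENCODING.get(user_charset.lower())
        if fenc is None:
            warnings.append("cannot derive 'fileencoding' from charset; set it manually")
            fenc = ""  # assumption: leave blank for the user to fill in
        return user_charset, fenc, warnings
    # Step 2: derive the charset from 'fileencoding' and keep that 'fileencoding'.
    charset = lookup_charset(fileencoding)
    if charset:
        return charset, fileencoding, warnings
    # Step 3: fall back to 'encoding' and blank out 'fileencoding'.
    charset = lookup_charset(encoding)
    if charset:
        return charset, "", warnings
    # Step 4: nothing usable; default to UTF-8 and warn.
    warnings.append("no charset found from 'fileencoding' or 'encoding'; using UTF-8")
    return "UTF-8", "utf-8", warnings


def choose_charset_for_diff(fileencodings, encoding, user_charset=None):
    """Diff mode: apply the single-buffer rule to every buffer, then reconcile."""
    results = [choose_charset(user_charset, fenc, encoding) for fenc in fileencodings]
    if len({r[0] for r in results}) == 1:
        return results[0]  # all buffers agree on one charset
    # Buffers disagree: blank 'fileencoding', use 'encoding' (or UTF-8).
    charset = lookup_charset(encoding)
    if charset:
        return charset, "", []
    return "UTF-8", "", ["no charset found from 'encoding'; using UTF-8"]
```

For example, `choose_charset(None, "latin1", "utf-8")` takes the step 2 path and keeps 'fileencoding' as `latin1`, while two diffed buffers with 'fileencoding' values of `latin1` and `utf-8` fall through to the charset derived from 'encoding' with a blank 'fileencoding'.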
