On Sun, Jul 12, 2020 at 20:04:31 +0200, Benno Schulenberg wrote:
> On 12-07-2020 at 16:26, Nils König wrote:
> > FWIW I would have expected leading BOM/NoBOM to be an option when
> > saving with ^O (like DOS/Mac-Format) and by default keep status quo.
>
> No-no-no, horrible!  The user ought not to be aware of the presence
> of a BOM.

Looking in the save dialogue was just my first intuition; other ways to
handle this are fine too – or even better. Though I don't understand how
this would make a user more aware of a BOM in a file; maybe I described
it poorly. Basically, I imagined this just like:

> If nano were to handle a BOM properly, it must remove a BOM whenever
> a file is read, and add it back when it is written.  But that would
> make it impossible to delete an unwanted BOM with a simple backspace.
> Then the user would need to fall back to a tool like dos2unix.

But without the need for an external tool to remove a BOM; instead, this
could be toggled in the save dialog. (And only a BOM at the very
beginning of the file would be treated specially.)

> Software that accepts UTF-8 ought not to require a BOM.

> Nano is a simple editor, a Unix editor.  It is meant for editing emails,
> configuration files, shell scripts, and other plain text files.  There
> are never any BOMs there.  And now, because some people want to use nano
> to edit files with a silly required format, nano must adapt and treat a
> BOM as a sacred trio of bytes?

It is definitely not my intention to tell you what to do with nano.
Sorry if I failed to make this clear before. But I think that nano's
current handling of BOMs is not ideal, both for keeping and for removing
a BOM, and I hope that my suggestions may help with that. If you decide
that it is not needed, this is fine too. Now that I know about it, I can
take care on the few occasions when I edit a BOMed file in nano.

> I've contemplated adding the attached patch, but then the user
> could still backspace over the BOM or cut the line unawares.
>
> diff --git a/src/nano.c b/src/nano.c
> index 8e8b9952..db213857 100644
> --- a/src/nano.c
> +++ b/src/nano.c
> @@ -1649,6 +1649,8 @@ void process_a_keystroke(void)
>  #endif
>  }
>  
> +#define byte(n)  (unsigned char)openfile->current->data[openfile->current_x + n]
> +
>  int main(int argc, char **argv)
>  {
>  	int stdin_flags, optchr;
> @@ -2489,6 +2491,13 @@ int main(int argc, char **argv)
>  		lastmessage = VACUUM;
>  		as_an_at = TRUE;
>  
> +#if defined(ENABLE_UTF8) && !defined(NANO_TINY)
> +		/* Tell the user when the cursor sits on a BOM. */
> +		if (byte(0) == 0xEF && byte(1) == 0xBB && byte(2) == 0xBF) {
> +			statusline(NOTICE, _("Byte Order Mark"));
> +			beep();
> +		}
> +#endif
>  		/* Refresh just the cursor position or the entire edit window. */
>  		if (!refresh_needed) {
>  			place_the_cursor();

This seems useful: it makes it both harder to remove a BOM accidentally
and easier to remove it intentionally. Though, as you mentioned, it is
still possible to remove it accidentally in some circumstances.

Nils
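
For illustration only, the strip-on-read / re-add-on-write handling
discussed above, with the BOM kept or dropped by a save-time toggle,
could be sketched roughly as below. This is not nano's code: the helper
names (has_leading_bom, strip_bom_on_read, prepend_bom_on_write) and the
write_bom flag are hypothetical and only stand in for whatever a save
dialog option would set. Only a BOM at the very start of the buffer is
treated specially.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The UTF-8 encoding of U+FEFF: the same three bytes the patch tests for. */
static const unsigned char UTF8_BOM[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper: true when the buffer begins with a UTF-8 BOM. */
static bool has_leading_bom(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	return len >= 3 && memcmp(buf, UTF8_BOM, 3) == 0;
}

/* Hypothetical read-side helper: record whether a BOM was present and
 * return how many leading bytes the caller should skip (3 or 0). */
static size_t strip_bom_on_read(const unsigned char *buf, size_t len,
		bool *had_bom)
{
	assert(had_bom != NULL);
	*had_bom = has_leading_bom(buf, len);
	return *had_bom ? 3 : 0;
}

/* Hypothetical write-side helper: copy the text into out, preceded by a
 * BOM when write_bom is set (the assumed save-dialog toggle).  Returns
 * the number of bytes stored, or 0 when out is too small (or when there
 * is nothing to store). */
static size_t prepend_bom_on_write(unsigned char *out, size_t outsize,
		const unsigned char *text, size_t len, bool write_bom)
{
	size_t offset = write_bom ? 3 : 0;

	assert(out != NULL && text != NULL);
	if (outsize < offset + len)
		return 0;
	if (write_bom)
		memcpy(out, UTF8_BOM, 3);
	memcpy(out + offset, text, len);
	return offset + len;
}
```

In this sketch the default would be to pass the had_bom value from the
read side back in as write_bom, which keeps the status quo; flipping the
flag at save time removes or adds the BOM without an external tool.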