On Sun, Jul 12, 2020 at 20:04:31 +0200, Benno Schulenberg wrote:
> Op 12-07-2020 om 16:26 schreef Nils König:
> > FWIW I would have expected leading BOM/NoBOM to be an option when saving
> > with ^O (like DOS/Mac-Format) and by default keep status quo.
> 
> No-no-no, horrible!  The user ought not to be aware of the presence
> of a BOM.

Looking in the save dialogue was just my first intuition; other ways to
handle this are fine too – or even better. Though I don't understand how this
would make a user more aware of a BOM in a file; maybe I described it
poorly. Basically, I imagined this just like:

> If nano were to handle a BOM properly, it must remove a BOM whenever
> a file is read, and add it back when it is written.  But that would
> make it impossible to delete an unwanted BOM with a simple backspace.
> Then the user would need to fall back to a tool like dos2unix.

But without the need for an external tool to remove a BOM; instead, this could
be toggled in the save dialog. (And only a BOM at the very beginning would be
treated specially.)
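
To make the idea concrete, here is a rough sketch in C of the strip-on-read,
restore-on-write behaviour. None of this is nano code: the helper names
(has_leading_bom, skip_leading_bom, emit_with_bom) and the want_bom flag are
made up for illustration, and a real implementation would have to hook into
nano's own file reading and writing routines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The UTF-8 encoding of U+FEFF, the byte order mark. */
static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper: true when the buffer starts with a UTF-8 BOM. */
static bool has_leading_bom(const unsigned char *buf, size_t len)
{
    return len >= 3 && memcmp(buf, utf8_bom, 3) == 0;
}

/* Hypothetical read step: return a pointer to the text after a leading
 * BOM (if any), shrink *len accordingly, and record in *had_bom whether
 * one was present so that it can be restored when saving. */
static const unsigned char *skip_leading_bom(const unsigned char *buf,
                                             size_t *len, bool *had_bom)
{
    *had_bom = has_leading_bom(buf, *len);
    if (*had_bom) {
        *len -= 3;
        return buf + 3;
    }
    return buf;
}

/* Hypothetical write step: copy the text to out, prefixed with a BOM
 * only when want_bom is set (the toggle in the save dialog).  The out
 * buffer must have room for len + 3 bytes.  Returns the bytes written. */
static size_t emit_with_bom(unsigned char *out, const unsigned char *text,
                            size_t len, bool want_bom)
{
    size_t n = 0;

    if (want_bom) {
        memcpy(out, utf8_bom, 3);
        n = 3;
    }
    memcpy(out + n, text, len);
    return n + len;
}
```

With want_bom defaulting to the value of had_bom, saving keeps the status quo,
and flipping the toggle adds or drops the BOM without an external tool.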

  
>           Software that accepts UTF-8 ought not to require a BOM.
> Nano is a simple editor, a Unix editor.  It is meant for editing emails,
> configuration files, shell scripts, and other plain text files.  There
> are never any BOMs there.  And now, because some people want to use nano
> to edit files with a silly required format, nano must adapt and treat a
> BOM as a sacred trio of bytes?

It is definitely not my intention to tell you what to do with nano. Sorry if
I failed to make this clear before.

But I think that nano's current handling of BOMs is not ideal – both 
for keeping and removing a BOM – and hope that my suggestions may help with 
that. If you decide that it is not needed, this is fine too; now that I know
about it, I can take care on the few occasions I edit a BOMed file in nano.


> I've contemplated adding the attached patch, but then the user
> could still backspace over the BOM or cut the line unawares.
> 
> diff --git a/src/nano.c b/src/nano.c
> index 8e8b9952..db213857 100644
> --- a/src/nano.c
> +++ b/src/nano.c
> @@ -1649,6 +1649,8 @@ void process_a_keystroke(void)
>  #endif
>  }
>  
> +#define byte(n)  (unsigned char)openfile->current->data[openfile->current_x + n]
> +
>  int main(int argc, char **argv)
>  {
>       int stdin_flags, optchr;
> @@ -2489,6 +2491,13 @@ int main(int argc, char **argv)
>               lastmessage = VACUUM;
>               as_an_at = TRUE;
>  
> +#if defined(ENABLE_UTF8) && !defined(NANO_TINY)
> +             /* Tell the user when the cursor sits on a BOM. */
> +             if (byte(0) == 0xEF && byte(1) == 0xBB && byte(2) == 0xBF) {
> +                     statusline(NOTICE, _("Byte Order Mark"));
> +                     beep();
> +             }
> +#endif
>               /* Refresh just the cursor position or the entire edit window. */
>               if (!refresh_needed) {
>                       place_the_cursor();

This seems useful and makes it both harder to accidentally remove a BOM and
easier to intentionally remove it. Though, as you mentioned, it's still possible
to accidentally remove it in some circumstances.
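
For reference, the three-byte test from the patch can be tried outside nano
with a small stand-alone function. This is only a sketch: bom_at is a made-up
name, and it assumes the line is a NUL-terminated string and that x is not
past its end, which is what makes reading the second and third byte safe.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the byte(n) test in the patch: true when
 * the three bytes starting at offset x of line are EF BB BF.  Assumes
 * line is NUL-terminated and x <= strlen(line); because && stops at
 * the first mismatch, the terminating NUL is never read past. */
static bool bom_at(const char *line, size_t x)
{
    const unsigned char *p = (const unsigned char *)line + x;

    return p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF;
}
```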


Nils
