Epilogue (CLOSING THREAD)

2023-10-12 Thread Walter Alejandro Iglesias
Since I'm the OP of this thread, I feel entitled to officially
close it.

Those interested in my proposal will find my latest version of the
patch in this new thread:

  https://marc.info/?t=16961663101=1=2

For my part, I have already absorbed and moved past what was said in
this thread.  As my patches show, I listened to and took into account
every correction and suggestion (even though many, if not most,
regardless of the intention behind them, worked more as a boycott than
as a contribution.)


-- 
Walter



Re: mail(1) MIME support [PATCH]

2023-10-11 Thread Walter Alejandro Iglesias
More changes:

 - I replaced my overly long Message-ID with the one produced by the
   function smtpd(8) uses (generate_uid()).  Now the Message-ID
   generated by mail(1) has the same form as the one generated by
   smtpd(8).

 - Added a conditional to skip adding the Message-ID if the machine
   doesn't have a hostname yet.

Also some cleanup and reordering:

 - Moved the included headers to def.h
 - Moved function declarations to extern.h
 - Moved the isutf8() and generate_uid() functions to util.c
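
The two Message-ID changes listed above could be sketched roughly as
follows.  This is not the code from the patch: format_msgid() is a
hypothetical helper, the 64-bit id merely stands in for whatever
generate_uid() returns, and the hex formatting is an assumption rather
than smtpd(8)'s actual encoding.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the patch: build a Message-ID
 * header from a caller-supplied unique id and the host name.
 * Returns 0 on success, -1 if the header should be skipped
 * (no host name yet) or does not fit in buf.
 */
int
format_msgid(char *buf, size_t len, uint64_t id, const char *host)
{
	int n;

	if (host == NULL || host[0] == '\0')
		return -1;	/* no hostname: omit the header */

	n = snprintf(buf, len, "Message-ID: <%016llx@%s>\n",
	    (unsigned long long)id, host);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output would have been truncated */
	return 0;
}
```

A caller would fetch the host name with gethostname(3) and simply not
print the header when format_msgid() returns -1.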


Index: cmd3.c
===
RCS file: /cvs/src/usr.bin/mail/cmd3.c,v
retrieving revision 1.30
diff -u -p -r1.30 cmd3.c
--- cmd3.c  8 Mar 2023 04:43:11 -   1.30
+++ cmd3.c  11 Oct 2023 06:39:20 -
@@ -238,6 +238,7 @@ _respond(int *msgvec)
head.h_cc = np;
} else
head.h_cc = NULL;
+   head.h_msgid = hfield("message-id", mp);
head.h_bcc = NULL;
head.h_smopts = NULL;
mail1(&head, 1);
@@ -617,6 +618,7 @@ _Respond(int *msgvec)
if ((head.h_subject = hfield("subject", mp)) == NULL)
head.h_subject = hfield("subj", mp);
head.h_subject = reedit(head.h_subject);
+   head.h_msgid = hfield("message-id", mp);
head.h_from = NULL;
head.h_cc = NULL;
head.h_bcc = NULL;
Index: collect.c
===
RCS file: /cvs/src/usr.bin/mail/collect.c,v
retrieving revision 1.34
diff -u -p -r1.34 collect.c
--- collect.c   17 Jan 2014 18:42:30 -  1.34
+++ collect.c   11 Oct 2023 06:39:20 -
@@ -87,7 +87,7 @@ collect(struct header *hp, int printhead
 * refrain from printing a newline after
 * the headers (since some people mind).
 */
-   t = GTO|GSUBJECT|GCC|GNL;
+   t = GTO|GSUBJECT|GMID|GCC|GNL;
getsub = 0;
if (hp->h_subject == NULL && value("interactive") != NULL &&
(value("ask") != NULL || value("asksub") != NULL))
@@ -208,7 +208,7 @@ cont:
/*
 * Grab a bunch of headers.
 */
-   grabh(hp, GTO|GSUBJECT|GCC|GBCC);
+   grabh(hp, GTO|GSUBJECT|GMID|GCC|GBCC);
goto cont;
case 't':
/*
@@ -328,7 +328,7 @@ cont:
 */
rewind(collf);
puts("---\nMessage contains:");
-   puthead(hp, stdout, GTO|GSUBJECT|GCC|GBCC|GNL);
+   puthead(hp, stdout, GTO|GSUBJECT|GMID|GCC|GBCC|GNL);
while ((t = getc(collf)) != EOF)
(void)putchar(t);
goto cont;
Index: def.h
===
RCS file: /cvs/src/usr.bin/mail/def.h,v
retrieving revision 1.17
diff -u -p -r1.17 def.h
--- def.h   28 Jan 2022 06:18:41 -  1.17
+++ def.h   11 Oct 2023 06:39:20 -
@@ -53,6 +53,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include "pathnames.h"
 
@@ -156,14 +158,15 @@ struct headline {
 
 #define GTO      1   /* Grab To: line */
 #define GSUBJECT 2   /* Likewise, Subject: line */
-#define GCC      4   /* And the Cc: line */
-#define GBCC     8   /* And also the Bcc: line */
-#define GMASK    (GTO|GSUBJECT|GCC|GBCC)
+#define GMID     4   /* Message-ID: line */
+#define GCC      8   /* And the Cc: line */
+#define GBCC     16  /* And also the Bcc: line */
+#define GMASK    (GTO|GSUBJECT|GMID|GCC|GBCC)
                      /* Mask of places from whence */
 
-#define GNL      16  /* Print blank line after */
-#define GDEL     32  /* Entity removed from list */
-#define GCOMMA   64  /* detract puts in commas */
+#define GNL      32  /* Print blank line after */
+#define GDEL     64  /* Entity removed from list */
+#define GCOMMA   128 /* detract puts in commas */
 
 /*
  * Structure used to pass about the current
@@ -173,6 +176,7 @@ struct header {
struct name *h_to;  /* Dynamic "To:" string */
char *h_from;   /* User-specified "From:" string */
char *h_subject;/* Subject string */
+   char *h_msgid;  /* Message-ID string */
struct name *h_cc;  /* Carbon copies string */
struct name *h_bcc; /* Blind carbon copies */
struct name *h_smopts;  /* Sendmail options */
Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep

Re: mail(1) MIME support [PATCH]

2023-10-10 Thread Walter Alejandro Iglesias
Added random number to Message-ID to get more unique string.


Index: cmd3.c
===
RCS file: /cvs/src/usr.bin/mail/cmd3.c,v
retrieving revision 1.30
diff -u -p -r1.30 cmd3.c
--- cmd3.c  8 Mar 2023 04:43:11 -   1.30
+++ cmd3.c  10 Oct 2023 16:58:19 -
@@ -238,6 +238,7 @@ _respond(int *msgvec)
head.h_cc = np;
} else
head.h_cc = NULL;
+   head.h_msgid = hfield("message-id", mp);
head.h_bcc = NULL;
head.h_smopts = NULL;
mail1(&head, 1);
@@ -617,6 +618,7 @@ _Respond(int *msgvec)
if ((head.h_subject = hfield("subject", mp)) == NULL)
head.h_subject = hfield("subj", mp);
head.h_subject = reedit(head.h_subject);
+   head.h_msgid = hfield("message-id", mp);
head.h_from = NULL;
head.h_cc = NULL;
head.h_bcc = NULL;
Index: collect.c
===
RCS file: /cvs/src/usr.bin/mail/collect.c,v
retrieving revision 1.34
diff -u -p -r1.34 collect.c
--- collect.c   17 Jan 2014 18:42:30 -  1.34
+++ collect.c   10 Oct 2023 16:58:19 -
@@ -87,7 +87,7 @@ collect(struct header *hp, int printhead
 * refrain from printing a newline after
 * the headers (since some people mind).
 */
-   t = GTO|GSUBJECT|GCC|GNL;
+   t = GTO|GSUBJECT|GMID|GCC|GNL;
getsub = 0;
if (hp->h_subject == NULL && value("interactive") != NULL &&
(value("ask") != NULL || value("asksub") != NULL))
@@ -208,7 +208,7 @@ cont:
/*
 * Grab a bunch of headers.
 */
-   grabh(hp, GTO|GSUBJECT|GCC|GBCC);
+   grabh(hp, GTO|GSUBJECT|GMID|GCC|GBCC);
goto cont;
case 't':
/*
@@ -328,7 +328,7 @@ cont:
 */
rewind(collf);
puts("---\nMessage contains:");
-   puthead(hp, stdout, GTO|GSUBJECT|GCC|GBCC|GNL);
+   puthead(hp, stdout, GTO|GSUBJECT|GMID|GCC|GBCC|GNL);
while ((t = getc(collf)) != EOF)
(void)putchar(t);
goto cont;
Index: def.h
===
RCS file: /cvs/src/usr.bin/mail/def.h,v
retrieving revision 1.17
diff -u -p -r1.17 def.h
--- def.h   28 Jan 2022 06:18:41 -  1.17
+++ def.h   10 Oct 2023 16:58:19 -
@@ -156,14 +156,15 @@ struct headline {
 
 #define GTO      1   /* Grab To: line */
 #define GSUBJECT 2   /* Likewise, Subject: line */
-#define GCC      4   /* And the Cc: line */
-#define GBCC     8   /* And also the Bcc: line */
-#define GMASK    (GTO|GSUBJECT|GCC|GBCC)
+#define GMID     4   /* Message-ID: line */
+#define GCC      8   /* And the Cc: line */
+#define GBCC     16  /* And also the Bcc: line */
+#define GMASK    (GTO|GSUBJECT|GMID|GCC|GBCC)
                      /* Mask of places from whence */
 
-#define GNL      16  /* Print blank line after */
-#define GDEL     32  /* Entity removed from list */
-#define GCOMMA   64  /* detract puts in commas */
+#define GNL      32  /* Print blank line after */
+#define GDEL     64  /* Entity removed from list */
+#define GCOMMA   128 /* detract puts in commas */
 
 /*
  * Structure used to pass about the current
@@ -173,6 +174,7 @@ struct header {
struct name *h_to;  /* Dynamic "To:" string */
char *h_from;   /* User-specified "From:" string */
char *h_subject;/* Subject string */
+   char *h_msgid;  /* Message-ID string */
struct name *h_cc;  /* Carbon copies string */
struct name *h_bcc; /* Blind carbon copies */
struct name *h_smopts;  /* Sendmail options */
Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57 -  1.29
+++ extern.h    10 Oct 2023 16:58:19 -
@@ -163,7 +163,7 @@ void load(char *);
 struct var *
 lookup(char *);
 int mail(struct name *, struct name *, struct name *, struct name *,
-  char *, char *);
+  char *, char *, char *);
 void mail1(struct header *, int);
 void makemessage(FILE *, int);
 void mark(int);
Index: main.c
===
RCS file: 

Re: mail(1) MIME support [PATCH]

2023-10-02 Thread Walter Alejandro Iglesias
Avoid printing some headers to stdout when replying from the interface
(when you type r or R), especially Content-Transfer-Encoding and
Content-Type, since the values shown before sending are not the ones
that will be used once the body has been processed at send time.


Index: cmd3.c
===
RCS file: /cvs/src/usr.bin/mail/cmd3.c,v
retrieving revision 1.30
diff -u -p -r1.30 cmd3.c
--- cmd3.c  8 Mar 2023 04:43:11 -   1.30
+++ cmd3.c  2 Oct 2023 16:02:02 -
@@ -238,6 +238,7 @@ _respond(int *msgvec)
head.h_cc = np;
} else
head.h_cc = NULL;
+   head.h_msgid = hfield("message-id", mp);
head.h_bcc = NULL;
head.h_smopts = NULL;
mail1(&head, 1);
@@ -617,6 +618,7 @@ _Respond(int *msgvec)
if ((head.h_subject = hfield("subject", mp)) == NULL)
head.h_subject = hfield("subj", mp);
head.h_subject = reedit(head.h_subject);
+   head.h_msgid = hfield("message-id", mp);
head.h_from = NULL;
head.h_cc = NULL;
head.h_bcc = NULL;
Index: collect.c
===
RCS file: /cvs/src/usr.bin/mail/collect.c,v
retrieving revision 1.34
diff -u -p -r1.34 collect.c
--- collect.c   17 Jan 2014 18:42:30 -  1.34
+++ collect.c   2 Oct 2023 16:02:02 -
@@ -87,7 +87,7 @@ collect(struct header *hp, int printhead
 * refrain from printing a newline after
 * the headers (since some people mind).
 */
-   t = GTO|GSUBJECT|GCC|GNL;
+   t = GTO|GSUBJECT|GMID|GCC|GNL;
getsub = 0;
if (hp->h_subject == NULL && value("interactive") != NULL &&
(value("ask") != NULL || value("asksub") != NULL))
@@ -208,7 +208,7 @@ cont:
/*
 * Grab a bunch of headers.
 */
-   grabh(hp, GTO|GSUBJECT|GCC|GBCC);
+   grabh(hp, GTO|GSUBJECT|GMID|GCC|GBCC);
goto cont;
case 't':
/*
@@ -328,7 +328,7 @@ cont:
 */
rewind(collf);
puts("---\nMessage contains:");
-   puthead(hp, stdout, GTO|GSUBJECT|GCC|GBCC|GNL);
+   puthead(hp, stdout, GTO|GSUBJECT|GMID|GCC|GBCC|GNL);
while ((t = getc(collf)) != EOF)
(void)putchar(t);
goto cont;
Index: def.h
===
RCS file: /cvs/src/usr.bin/mail/def.h,v
retrieving revision 1.17
diff -u -p -r1.17 def.h
--- def.h   28 Jan 2022 06:18:41 -  1.17
+++ def.h   2 Oct 2023 16:02:02 -
@@ -156,14 +156,15 @@ struct headline {
 
 #define GTO      1   /* Grab To: line */
 #define GSUBJECT 2   /* Likewise, Subject: line */
-#define GCC      4   /* And the Cc: line */
-#define GBCC     8   /* And also the Bcc: line */
-#define GMASK    (GTO|GSUBJECT|GCC|GBCC)
+#define GMID     4   /* Message-ID: line */
+#define GCC      8   /* And the Cc: line */
+#define GBCC     16  /* And also the Bcc: line */
+#define GMASK    (GTO|GSUBJECT|GMID|GCC|GBCC)
                      /* Mask of places from whence */
 
-#define GNL      16  /* Print blank line after */
-#define GDEL     32  /* Entity removed from list */
-#define GCOMMA   64  /* detract puts in commas */
+#define GNL      32  /* Print blank line after */
+#define GDEL     64  /* Entity removed from list */
+#define GCOMMA   128 /* detract puts in commas */
 
 /*
  * Structure used to pass about the current
@@ -173,6 +174,7 @@ struct header {
struct name *h_to;  /* Dynamic "To:" string */
char *h_from;   /* User-specified "From:" string */
char *h_subject;/* Subject string */
+   char *h_msgid;  /* Message-ID string */
struct name *h_cc;  /* Carbon copies string */
struct name *h_bcc; /* Blind carbon copies */
struct name *h_smopts;  /* Sendmail options */
Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57 -  1.29
+++ extern.h    2 Oct 2023 16:02:02 -
@@ -163,7 +163,7 @@ void load(char *);
 struct var *
 lookup(char *);
 int mail(struct name *, struct name *, struct name *, struct name *,
-  char *, char *);
+  char *, char *, char *);
 void

Re: mail(1) MIME support [PATCH]

2023-10-02 Thread Walter Alejandro Iglesias
On Sun Oct  1 15:19:12 2023, Walter wrote
> I decided to add another header, "Message-ID".  Sending mails from my
> patched mail(1) I realized that it's convenient for the MUA itself to
> add the Message-ID (the one in this message was generated by my patch).
> If you leave this to the MTA, your MUA will save the local copy without
> that header; then, if you later wish to read your mail with a
> thread-capable MUA (e.g. Mutt), those messages won't be in the right
> place.
>

To avoid breaking threads, one last detail was needed.  Now mail(1) adds
an In-Reply-To: header.
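
The idea can be sketched as below.  In the patch the parent's
Message-ID is captured with hfield("message-id", mp) and kept in
h_msgid; how it is then printed is not visible in this excerpt, so
put_in_reply_to() is a hypothetical helper shown only to illustrate the
mechanism.

```c
#include <stdio.h>

/*
 * Hypothetical helper: when replying, echo the parent's Message-ID
 * back as In-Reply-To so that threading MUAs can link the two.
 * parent_msgid is the raw value of the parent's Message-ID field,
 * e.g. "<abc@example.org>"; NULL or empty means "not a reply".
 * Returns the number of headers written (0 or 1).
 */
int
put_in_reply_to(FILE *fo, const char *parent_msgid)
{
	if (parent_msgid == NULL || parent_msgid[0] == '\0')
		return 0;
	fprintf(fo, "In-Reply-To: %s\n", parent_msgid);
	return 1;
}
```

For a new message that is not a reply the stored value is empty, so
nothing is emitted.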


Index: cmd3.c
===
RCS file: /cvs/src/usr.bin/mail/cmd3.c,v
retrieving revision 1.30
diff -u -p -r1.30 cmd3.c
--- cmd3.c  8 Mar 2023 04:43:11 -   1.30
+++ cmd3.c  2 Oct 2023 12:59:49 -
@@ -238,6 +238,7 @@ _respond(int *msgvec)
head.h_cc = np;
} else
head.h_cc = NULL;
+   head.h_msgid = hfield("message-id", mp);
head.h_bcc = NULL;
head.h_smopts = NULL;
mail1(&head, 1);
@@ -617,6 +618,7 @@ _Respond(int *msgvec)
if ((head.h_subject = hfield("subject", mp)) == NULL)
head.h_subject = hfield("subj", mp);
head.h_subject = reedit(head.h_subject);
+   head.h_msgid = hfield("message-id", mp);
head.h_from = NULL;
head.h_cc = NULL;
head.h_bcc = NULL;
Index: collect.c
===
RCS file: /cvs/src/usr.bin/mail/collect.c,v
retrieving revision 1.34
diff -u -p -r1.34 collect.c
--- collect.c   17 Jan 2014 18:42:30 -  1.34
+++ collect.c   2 Oct 2023 12:59:49 -
@@ -87,7 +87,7 @@ collect(struct header *hp, int printhead
 * refrain from printing a newline after
 * the headers (since some people mind).
 */
-   t = GTO|GSUBJECT|GCC|GNL;
+   t = GTO|GSUBJECT|GMID|GCC|GNL;
getsub = 0;
if (hp->h_subject == NULL && value("interactive") != NULL &&
(value("ask") != NULL || value("asksub") != NULL))
@@ -208,7 +208,7 @@ cont:
/*
 * Grab a bunch of headers.
 */
-   grabh(hp, GTO|GSUBJECT|GCC|GBCC);
+   grabh(hp, GTO|GSUBJECT|GMID|GCC|GBCC);
goto cont;
case 't':
/*
@@ -328,7 +328,7 @@ cont:
 */
rewind(collf);
puts("---\nMessage contains:");
-   puthead(hp, stdout, GTO|GSUBJECT|GCC|GBCC|GNL);
+   puthead(hp, stdout, GTO|GSUBJECT|GMID|GCC|GBCC|GNL);
while ((t = getc(collf)) != EOF)
(void)putchar(t);
goto cont;
Index: def.h
===
RCS file: /cvs/src/usr.bin/mail/def.h,v
retrieving revision 1.17
diff -u -p -r1.17 def.h
--- def.h   28 Jan 2022 06:18:41 -  1.17
+++ def.h   2 Oct 2023 12:59:49 -
@@ -156,14 +156,15 @@ struct headline {
 
 #define GTO      1   /* Grab To: line */
 #define GSUBJECT 2   /* Likewise, Subject: line */
-#define GCC      4   /* And the Cc: line */
-#define GBCC     8   /* And also the Bcc: line */
-#define GMASK    (GTO|GSUBJECT|GCC|GBCC)
+#define GMID     4   /* Message-ID: line */
+#define GCC      8   /* And the Cc: line */
+#define GBCC     16  /* And also the Bcc: line */
+#define GMASK    (GTO|GSUBJECT|GMID|GCC|GBCC)
                      /* Mask of places from whence */
 
-#define GNL      16  /* Print blank line after */
-#define GDEL     32  /* Entity removed from list */
-#define GCOMMA   64  /* detract puts in commas */
+#define GNL      32  /* Print blank line after */
+#define GDEL     64  /* Entity removed from list */
+#define GCOMMA   128 /* detract puts in commas */
 
 /*
  * Structure used to pass about the current
@@ -173,6 +174,7 @@ struct header {
struct name *h_to;  /* Dynamic "To:" string */
char *h_from;   /* User-specified "From:" string */
char *h_subject;/* Subject string */
+   char *h_msgid;  /* Message-ID string */
struct name *h_cc;  /* Carbon copies string */
struct name *h_bcc; /* Blind carbon copies */
struct name *h_smopts;  /* Sendmail options */
Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57

mail(1) MIME support [PATCH]

2023-10-01 Thread Walter Alejandro Iglesias
Opening a new clean thread.

First of all, I'm sending this message from my patched mail(1); take a
look at the headers. :-)

I tried everything each of you suggested, which is the only way to know
for sure what works and what doesn't.  If you decide that this
modification is too much for an application as simple as mail, that's
your decision; my work could still be useful if you change your mind in
the future.  In any case, it was a useful learning exercise for me.

I renamed my function to "isutf8" to make the return values more logical
and easier to understand.  I abused Otto Moerbeek's kindness by asking
him questions about malloc to the point that he sent me a vacation
email.  I'll be banned from his email server for at least a couple of
years.  Sorry Otto! :-)  He helped by showing me more readable ways of
using realloc; the realloc loop in the code is his.  Like Omar Polo,
Otto also advised me not to use variable length arrays (I also did some
research on that.)  He also told me that the fsize() function in mail(1)
is likely to fail when reading from stdin, and that the file can grow
while it is being read.  So I removed the "int len" from my function's
arguments and fed mbstowcs() with the size actually read (i.e. "i").
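
A minimal sketch of the approach described above, not the code from the
patch: slurp() and classify() are hypothetical names, and the growth
step of 128 bytes is arbitrary.  The buffer grows while reading, so
neither a variable length array nor a size obtained up front is needed,
and the byte count actually read is what gets compared against the
result of mbstowcs(3).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read the rest of fp into a NUL-terminated heap buffer that grows
 * in fixed steps.  Stores the number of bytes read in *lenp.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *
slurp(FILE *fp, size_t *lenp)
{
	size_t cap = 128, len = 0;
	char *buf, *tmp;
	int c;

	if ((buf = malloc(cap)) == NULL)
		return NULL;
	while ((c = getc(fp)) != EOF) {
		if (len + 1 >= cap) {	/* keep room for the NUL */
			if ((tmp = realloc(buf, cap + 128)) == NULL) {
				free(buf);
				return NULL;
			}
			buf = tmp;
			cap += 128;
		}
		buf[len++] = (char)c;
	}
	buf[len] = '\0';
	*lenp = len;
	return buf;
}

/*
 * Classify a body under the current LC_CTYPE locale:
 * 0 = one character per byte (plain ASCII in a UTF-8 locale),
 * 1 = valid multibyte text, 2 = invalid or incomplete sequence.
 */
int
classify(const char *s, size_t len)
{
	size_t n = mbstowcs(NULL, s, 0);

	if (n == (size_t)-1)
		return 2;
	return n == len ? 0 : 1;
}
```

As in the patch, the caller would first select a UTF-8 locale with
setlocale(LC_CTYPE, "en_US.UTF-8"); without that, the result depends on
whatever locale happens to be active.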

Otto also showed me the "DEBUG=-g make" trick to get useful addr2line
reports with kdump.  According to these tests, my patch doesn't seem to
add leaks:

 Start dump mail ***
M=8 I=1 F=1 U=1 J=2 R=0 X=0 C=-935363880 cache=0 G=4096
Leak report:
              f     sum    #    avg
            0x0   46464  327    142 addr2line -e . 0x0
  0x34f10a4b153   20480    1  20480 addr2line -e /usr/lib/libc.so.97.1 0x4d153
  0x34f10a961dc   55910    1  55910 addr2line -e /usr/lib/libc.so.97.1 0x981dc
  0x34f10a96470  410576   25  16423 addr2line -e /usr/lib/libc.so.97.1 0x98470
  0x34f10abb562   22376    1  22376 addr2line -e /usr/lib/libc.so.97.1 0xbd562

 End dump mail ***

$ addr2line -e /usr/lib/libc.so.97.1 0x4d153
/usr/src/lib/libc/stdio/makebuf.c:62
$ addr2line -e /usr/lib/libc.so.97.1 0x981dc
/usr/src/lib/libc/locale/rune.c:258
$ addr2line -e /usr/lib/libc.so.97.1 0x98470
/usr/src/lib/libc/locale/rune.c:137
$ addr2line -e /usr/lib/libc.so.97.1 0xbd562
/usr/src/lib/libc/time/localtime.c:1121


I decided to add another header, "Message-ID".  Sending mails from my
patched mail(1) I realized that it's convenient for the MUA itself to
add the Message-ID (the one in this message was generated by my patch).
If you leave this to the MTA, your MUA will save the local copy without
that header; then, if you later wish to read your mail with a
thread-capable MUA (e.g. Mutt), those messages won't be in the right
place.


   Summary of what OpenBSD mail(1) does with this patch
   

  The string used as Message-ID is equivalent to this shell command:

 $ echo "Message-ID: <$(date +%Y%m%d.%H%M%S@$(hostname))>"

  When the body is all ASCII, mail(1) adds these headers:

MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

  When valid UTF-8 is detected in the body, mail(1) adds these headers:

MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

  When invalid UTF-8 is found in the body, it adds only the
  Message-ID.
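
The rules in this summary can be sketched as follows.  The enum and
function names are illustrative and do not come from the patch; the
header text and the date format are the ones listed above.  The caller
supplies the broken-down time and the host name, which in mail(1) would
come from localtime(3) and gethostname(3).

```c
#include <stdio.h>
#include <time.h>

/* Body classification; the names are illustrative, not from the patch. */
enum body_kind { BODY_ASCII, BODY_UTF8, BODY_INVALID };

/* MIME headers for a body kind; an empty string when none are added. */
const char *
mime_headers(enum body_kind kind)
{
	switch (kind) {
	case BODY_ASCII:
		return "MIME-Version: 1.0\n"
		    "Content-Type: text/plain; charset=us-ascii\n"
		    "Content-Transfer-Encoding: 7bit\n";
	case BODY_UTF8:
		return "MIME-Version: 1.0\n"
		    "Content-Type: text/plain; charset=utf-8\n"
		    "Content-Transfer-Encoding: 8bit\n";
	default:
		return "";
	}
}

/*
 * Date-based Message-ID in the form of the shell command above.
 * Returns 0 on success, -1 if the header does not fit in buf.
 */
int
date_msgid(char *buf, size_t len, const struct tm *tm, const char *host)
{
	char stamp[32];
	int n;

	if (strftime(stamp, sizeof(stamp), "%Y%m%d.%H%M%S", tm) == 0)
		return -1;
	n = snprintf(buf, len, "Message-ID: <%s@%s>\n", stamp, host);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```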



Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  1 Oct 2023 07:47:30 -
@@ -32,7 +32,10 @@
 
 #include "rcv.h"
 #include "extern.h"
+#include "locale.h"
 
+static int utf8body;
+static int isutf8(FILE *s);/* UTF-8 check  */
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
 /*
@@ -341,6 +344,14 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* UTF-8 check */
+   setlocale(LC_CTYPE, "en_US.UTF-8");
+   if (fsize(mtf) != 0) {
+   utf8body = isutf8(mtf);
+   rewind(mtf);
+   }
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -516,19 +527,42 @@ puthead(struct header *hp, FILE *fo, int
 {
int gotcha;
char *from;
+   time_t t = time(NULL);
+   struct tm tm = *localtime(&t);
+   char hostname[1024];
+   gethostname(hostname, 1023);
 
gotcha = 0;
from = hp->h_from ? hp->h_from : value("from");
if (from != NULL)
-   fprintf(fo, "From: %s\n", from), gotcha++;
+   fprintf(fo, "From: %s\n", from),
+   gotcha++;
if (hp->h_to != NULL && w & GTO)
-   fmt("To:", hp->h_to, fo, w), gotcha++;
+   fmt("To:", hp->h_to, fo, w),
+   gotcha++;
   

Re: Send international text with mail(1) - proposal and patches

2023-09-25 Thread Walter Alejandro Iglesias
On Mon, 25 Sep 2023 19:00:15 +0200, Hiltjo Posthuma wrote:
> On Mon, Sep 25, 2023 at 03:13:03PM +0200, Walter Alejandro Iglesias wrote:
> > When this new version detects invalid UTF-8 in the body, it saves a
> > dead.letter, prints the following message, and exits.
> > 
> >   $ mail -s hello user < invalid_utf8.txt
> >   Invalid or incomplete multibyte or wide character
> >   . . . message not sent.
> > 
> > 
> > 
> > Index: send.c
> > ===
> > RCS file: /cvs/src/usr.bin/mail/send.c,v
> > retrieving revision 1.26
> > diff -u -p -r1.26 send.c
> > --- send.c  8 Mar 2023 04:43:11 -   1.26
> > +++ send.c  25 Sep 2023 13:07:17 -
> > @@ -32,6 +32,11 @@
> >  
> >  #include "rcv.h"
> >  #include "extern.h"
> > +#include "locale.h"
> > +
> > +/* To check charset of the message and add the appropiate MIME headers  */
> > +static char nutf8;
> > +static int not_utf8(FILE *s, int len);
> >  
> >  static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
> >  
> > @@ -341,6 +346,17 @@ mail1(struct header *hp, int printheader
> > else
> > puts("Null message body; hope that's ok");
> > }
> > +
> > +   /* Check non valid UTF-8 characters in the message */
> > +   nutf8 = not_utf8(mtf, fsize(mtf));
> > +   rewind(mtf);
> > +   if (nutf8 > 1) {
> > +   savedeadletter(mtf);
> > +   puts("Invalid or incomplete multibyte or wide character");
> > +   fputs(". . . message not sent.\n", stderr);
> > +   exit(1);
> > +   }
> > +
> > /*
> >  * Now, take the user names from the combined
> >  * to and cc lists and do all the alias
> > @@ -520,15 +536,30 @@ puthead(struct header *hp, FILE *fo, int
> > gotcha = 0;
> > from = hp->h_from ? hp->h_from : value("from");
> > if (from != NULL)
> > -   fprintf(fo, "From: %s\n", from), gotcha++;
> > +   fprintf(fo, "From: %s\n", from),
> > +   gotcha++;
> > if (hp->h_to != NULL && w & GTO)
> > -   fmt("To:", hp->h_to, fo, w), gotcha++;
> > +   fmt("To:", hp->h_to, fo, w),
> > +   gotcha++;
> > if (hp->h_subject != NULL && w & GSUBJECT)
> > -   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
> > +   fprintf(fo, "Subject: %s\n", hp->h_subject),
> > +   gotcha++;
> > +   if (nutf8 == 0)
> > +   fprintf(fo, "MIME-Version: 1.0\n"
> > +   "Content-Type: text/plain; charset=us-ascii\n"
> > +   "Content-Transfer-Encoding: 7bit\n"),
> > +   gotcha++;
> > +   else if (nutf8 == 1)
> > +   fprintf(fo, "MIME-Version: 1.0\n"
> > +   "Content-Type: text/plain; charset=utf-8\n"
> > +   "Content-Transfer-Encoding: 8bit\n"),
> > +   gotcha++;
> > if (hp->h_cc != NULL && w & GCC)
> > -   fmt("Cc:", hp->h_cc, fo, w), gotcha++;
> > +   fmt("Cc:", hp->h_cc, fo, w),
> > +   gotcha++;
> > if (hp->h_bcc != NULL && w & GBCC)
> > -   fmt("Bcc:", hp->h_bcc, fo, w), gotcha++;
> > +   fmt("Bcc:", hp->h_bcc, fo, w),
> > +   gotcha++;
> > if (gotcha && w & GNL)
> > (void)putc('\n', fo);
> > return(0);
> > @@ -609,4 +640,44 @@ sendint(int s)
> >  {
> >  
> > sendsignal = s;
> > +}
> > +
> > +/* Search non valid UTF-8 characters in the message */
> > +static int
> > +not_utf8(FILE *message, int len)
> > +{
>
> Nitpick: I would call `message` maybe `fp` or something here.
>
> > +   int c, count, n, ulen;
> > +   size_t i, resize;
> > +   size_t jump = 100;
> > +   unsigned char *s = NULL;
> > +
> > +   setlocale(LC_CTYPE, "en_US.UTF-8");
> > +
>
> Should setlocale() be restored later on?
>
> > +   if (s == NULL && (s = malloc(jump)) == NULL)
> > +   err(1, NULL);
>
> The check if `s` is NULL seems unnecessary here.
>
> > +
> > +   i = count = 0;
> > +   while ((c = getc(messa

Re: Send international text with mail(1) - proposal and patches

2023-09-25 Thread Walter Alejandro Iglesias
When this new version detects invalid UTF-8 in the body, it saves a
dead.letter, prints the following message, and exits.

  $ mail -s hello user < invalid_utf8.txt
  Invalid or incomplete multibyte or wide character
  . . . message not sent.



Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  25 Sep 2023 13:07:17 -
@@ -32,6 +32,11 @@
 
 #include "rcv.h"
 #include "extern.h"
+#include "locale.h"
+
+/* To check charset of the message and add the appropiate MIME headers  */
+static char nutf8;
+static int not_utf8(FILE *s, int len);
 
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
@@ -341,6 +346,17 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check non valid UTF-8 characters in the message */
+   nutf8 = not_utf8(mtf, fsize(mtf));
+   rewind(mtf);
+   if (nutf8 > 1) {
+   savedeadletter(mtf);
+   puts("Invalid or incomplete multibyte or wide character");
+   fputs(". . . message not sent.\n", stderr);
+   exit(1);
+   }
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -520,15 +536,30 @@ puthead(struct header *hp, FILE *fo, int
gotcha = 0;
from = hp->h_from ? hp->h_from : value("from");
if (from != NULL)
-   fprintf(fo, "From: %s\n", from), gotcha++;
+   fprintf(fo, "From: %s\n", from),
+   gotcha++;
if (hp->h_to != NULL && w & GTO)
-   fmt("To:", hp->h_to, fo, w), gotcha++;
+   fmt("To:", hp->h_to, fo, w),
+   gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
-   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   fprintf(fo, "Subject: %s\n", hp->h_subject),
+   gotcha++;
+   if (nutf8 == 0)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"),
+   gotcha++;
+   else if (nutf8 == 1)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"),
+   gotcha++;
if (hp->h_cc != NULL && w & GCC)
-   fmt("Cc:", hp->h_cc, fo, w), gotcha++;
+   fmt("Cc:", hp->h_cc, fo, w),
+   gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
-   fmt("Bcc:", hp->h_bcc, fo, w), gotcha++;
+   fmt("Bcc:", hp->h_bcc, fo, w),
+   gotcha++;
if (gotcha && w & GNL)
(void)putc('\n', fo);
return(0);
@@ -609,4 +640,44 @@ sendint(int s)
 {
 
sendsignal = s;
+}
+
+/* Search non valid UTF-8 characters in the message */
+static int
+not_utf8(FILE *message, int len)
+{
+   int c, count, n, ulen;
+   size_t i, resize;
+   size_t jump = 100;
+   unsigned char *s = NULL;
+
+   setlocale(LC_CTYPE, "en_US.UTF-8");
+
+   if (s == NULL && (s = malloc(jump)) == NULL)
+   err(1, NULL);
+
+   i = count = 0;
+   while ((c = getc(message)) != EOF) {
+   if (s == NULL || count == jump) {
+   if ((s = realloc(s, i + jump + 1)) == NULL)
+   err(1, NULL);
+   count = 0;
+   }
+   s[i++] = c;
+   count++;
+   }
+
+   s[i] = '\0';
+
+   ulen = mbstowcs(NULL, s, 0);
+
+   if (ulen == len)
+   n = 0;
+   else if (ulen < 0)
+   n = 2; 
+   else if (ulen < len)
+   n = 1;
+   
+   free(s);
+   return n;
 }


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-24 Thread Walter Alejandro Iglesias
On Sun, Sep 24, 2023 at 11:12:10AM -0300, Crystal Kolipe wrote:
> On Sun, Sep 24, 2023 at 12:37:08PM +0200, Walter Alejandro Iglesias wrote:
> > +static int
> > +not_utf8(FILE *message, int len)
> > +{
> > +   int n, ulen;
> > +   unsigned char s[len];
> 
> Please re-read Omar's advice about large unbounded arrays.

Better?


Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  24 Sep 2023 14:54:25 -
@@ -32,6 +32,11 @@
 
 #include "rcv.h"
 #include "extern.h"
+#include "locale.h"
+
+/* To check charset of the message and add the appropiate MIME headers  */
+static char nutf8;
+static int not_utf8(FILE *s, int len);
 
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
@@ -341,6 +346,11 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check non valid UTF-8 characters in the message */
+   nutf8 = not_utf8(mtf, fsize(mtf));
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -520,15 +530,30 @@ puthead(struct header *hp, FILE *fo, int
gotcha = 0;
from = hp->h_from ? hp->h_from : value("from");
if (from != NULL)
-   fprintf(fo, "From: %s\n", from), gotcha++;
+   fprintf(fo, "From: %s\n", from),
+   gotcha++;
if (hp->h_to != NULL && w & GTO)
-   fmt("To:", hp->h_to, fo, w), gotcha++;
+   fmt("To:", hp->h_to, fo, w),
+   gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
-   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   fprintf(fo, "Subject: %s\n", hp->h_subject),
+   gotcha++;
+   if (nutf8 == 0)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"),
+   gotcha++;
+   else if (nutf8 == 1)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"),
+   gotcha++;
if (hp->h_cc != NULL && w & GCC)
-   fmt("Cc:", hp->h_cc, fo, w), gotcha++;
+   fmt("Cc:", hp->h_cc, fo, w),
+   gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
-   fmt("Bcc:", hp->h_bcc, fo, w), gotcha++;
+   fmt("Bcc:", hp->h_bcc, fo, w),
+   gotcha++;
if (gotcha && w & GNL)
(void)putc('\n', fo);
return(0);
@@ -609,4 +634,44 @@ sendint(int s)
 {
 
sendsignal = s;
+}
+
+/* Search non valid UTF-8 characters in the message */
+static int
+not_utf8(FILE *message, int len)
+{
+   int c, count, n, ulen;
+   size_t i, resize;
+   size_t jump = 100;
+   unsigned char *s = NULL;
+
+   setlocale(LC_CTYPE, "en_US.UTF-8");
+
+   if (s == NULL && (s = malloc(jump)) == NULL)
+   err(1, NULL);
+
+   i = count = 0;
+   while ((c = getc(message)) != EOF) {
+   if (s == NULL || count == jump) {
+   if ((s = realloc(s, i + jump + 1)) == NULL)
+   err(1, NULL);
+   count = 0;
+   }
+   s[i++] = c;
+   count++;
+   }
+
+   s[i] = '\0';
+
+   ulen = mbstowcs(NULL, s, 0);
+
+   if (ulen == len)
+   n = 0;
+   else if (ulen < 0)
+   n = 2; 
+   else if (ulen < len)
+   n = 1;
+   
+   free(s);
+   return n;
 }


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-24 Thread Walter Alejandro Iglesias
Hi Stefan,

Do you like this?


Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  24 Sep 2023 10:33:11 -
@@ -32,6 +32,11 @@
 
 #include "rcv.h"
 #include "extern.h"
+#include "locale.h"
+
+/* To check charset of the message and add the appropiate MIME headers  */
+static char nutf8;
+static int not_utf8(FILE *s, int len);
 
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
@@ -341,6 +346,11 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check non valid UTF-8 characters in the message */
+   nutf8 = not_utf8(mtf, fsize(mtf));
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -520,15 +530,30 @@ puthead(struct header *hp, FILE *fo, int
gotcha = 0;
from = hp->h_from ? hp->h_from : value("from");
if (from != NULL)
-   fprintf(fo, "From: %s\n", from), gotcha++;
+   fprintf(fo, "From: %s\n", from),
+   gotcha++;
if (hp->h_to != NULL && w & GTO)
-   fmt("To:", hp->h_to, fo, w), gotcha++;
+   fmt("To:", hp->h_to, fo, w),
+   gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
-   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   fprintf(fo, "Subject: %s\n", hp->h_subject),
+   gotcha++;
+   if (nutf8 == 0)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"),
+   gotcha++;
+   else if (nutf8 == 1)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"),
+   gotcha++;
if (hp->h_cc != NULL && w & GCC)
-   fmt("Cc:", hp->h_cc, fo, w), gotcha++;
+   fmt("Cc:", hp->h_cc, fo, w),
+   gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
-   fmt("Bcc:", hp->h_bcc, fo, w), gotcha++;
+   fmt("Bcc:", hp->h_bcc, fo, w),
+   gotcha++;
if (gotcha && w & GNL)
(void)putc('\n', fo);
return(0);
@@ -609,4 +634,25 @@ sendint(int s)
 {
 
sendsignal = s;
+}
+
+/* Search non valid UTF-8 characters in the message */
+static int
+not_utf8(FILE *message, int len)
+{
+   int n, ulen;
+   unsigned char s[len];
+   setlocale(LC_CTYPE, "en_US.UTF-8");
+
+   fread(s, sizeof(char), len, message);
+   ulen = mbstowcs(NULL, s, 0);
+
+   if (ulen == len)
+   n = 0;
+   else if (ulen < 0)
+   n = 2; 
+   else if (ulen < len)
+   n = 1;
+   
+   return n;
 }
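
For readers following along, the three outcomes this patch distinguishes
(plain ASCII, valid UTF-8, invalid input) can be sketched outside of
mail(1) roughly as below.  This is only an illustration under stated
assumptions: use_utf8_locale() and classify_text() are hypothetical names,
the locale names tried are guesses about what a given system provides, and
the input has to be a NUL-terminated string because mbstowcs(3) stops at
the first NUL byte.  The error return is (size_t)-1, so it is tested
before any length comparison.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: select a UTF-8 LC_CTYPE; return 1 on success. */
static int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "en_US.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "C.UTF-8") != NULL;
}

/*
 * Hypothetical helper: classify a NUL-terminated string.
 * Only meaningful after use_utf8_locale() has succeeded.
 * Returns 0 for plain ASCII, 1 for valid UTF-8 with non-ASCII
 * characters, 2 for an invalid or incomplete byte sequence.
 */
static int
classify_text(const char *s)
{
	size_t nchars;

	nchars = mbstowcs(NULL, s, 0);
	if (nchars == (size_t)-1)
		return 2;		/* errno is EILSEQ here */
	if (nchars == strlen(s))
		return 0;		/* one byte per character */
	return 1;			/* fewer characters than bytes */
}
```

The comparison against strlen(3) relies on the fact that in UTF-8 every
non-ASCII character occupies at least two bytes.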


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-24 Thread Walter Alejandro Iglesias
On Sun, Sep 24, 2023 at 09:47:41AM +0200, Stefan Sperling wrote:
> In the UTF-8 locale I can trigger an error message with your program
> by sending the latin1 code for a-acute to stdin. I suppose your test
> command didn't actually send latin1 to stdin for some reason?
> 
>   $ perl -e 'printf "\xe1rbol\n"' | ./a.out
>   error: Illegal byte sequence
> 

Right, I can trigger the error with your command, and also by typing the
characters directly in wscons (my keyboard is Spanish).  What I was doing
in those commands was copying and pasting the latin characters with my
mouse.  The strange thing is that xterm still showed me the (?) glyphs.
Besides, I ran a test using the mbstowcs function in my mail patch, and
it didn't work.  I'll try again.


Thanks Stefan!


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-23 Thread Walter Alejandro Iglesias
Hi Ingo,

On Thu, Sep 21, 2023 at 03:04:24PM +0200, Ingo Schwarze wrote:
> In general, the tool for checking the validity of UTF-8 strings
> is a simple loop around mblen(3) if you want to report the precise
> positions of errors found, or simply mbstowcs(3) with a NULL pwcs
> argument if you are content with a one-bit "valid" or "invalid" answer.

According to mbstowcs(3):

RETURN VALUES
  mbstowcs() returns:

  0 or positive
The value returned is the number of elements stored in the array
pointed to by pwcs, except for a terminating null wide character
(if any).  If pwcs is not null and the value returned is equal
to n, the wide-character string pointed to by pwcs is not null
terminated.  If pwcs is a null pointer, the value returned is
the number of elements to contain the whole string converted,
except for a terminating null wide character.

  (size_t)-1  The array indirectly pointed to by s contains a byte
  sequence forming invalid character.  In this case,
  mbstowcs() sets errno to indicate the error.

ERRORS
 mbstowcs() may cause an error in the following cases:

 [EILSEQ]  s points to the string containing invalid or
   incomplete multibyte character.


To understand what mbstowcs(3) does I wrote the little test.c program
pasted at the bottom.  In the following examples [a] is a UTF-8 a-acute
and (a) an ISO Latin a-acute.

Using setlocale(LC_CTYPE, "en_US.UTF-8");

  $ cc -g -Wall test.c
  $ echo -n arbol | a.out
  ulen: 5
  $ echo -n [a]rbol | a.out
  ulen: 5
  $ echo -n (a)rbol | a.out
  ulen: 5

Using setlocale(LC_CTYPE, "C");

  $ cc -g -Wall test.c
  $ echo -n arbol | a.out
  ulen: 5
  $ echo -n [a]rbol | a.out
  ulen: 6
  $ echo -n (a)rbol | a.out
  ulen: 7

And no error message in any case.  I don't understand in which way those
return values let me know that the third string is invalid UTF-8.  Am I
doing something wrong?


test.c

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>

int
main()
{

int c, i;
size_t ulen;
char s[100];

i = 0;
while ((c = getchar()) != EOF)
s[i++] = c;

s[i] = '\0';

setlocale(LC_CTYPE, "en_US.UTF-8");
//setlocale(LC_CTYPE, "C");

if ((ulen = mbstowcs(NULL, s, 0)) == (size_t)-1)
perror("error");

printf("ulen: %zu\n", ulen);

return 0;
}
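
The other approach mentioned in the quoted text, a loop around mblen(3)
that reports where the first error sits, could look roughly like the
sketch below.  The names use_utf8_locale() and first_invalid_offset() are
hypothetical, the locale names are assumptions about the system, and the
result only says something about UTF-8 once a UTF-8 LC_CTYPE has actually
been selected.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: select a UTF-8 LC_CTYPE; return 1 on success. */
static int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "en_US.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "C.UTF-8") != NULL;
}

/*
 * Hypothetical helper: return the byte offset of the first invalid or
 * incomplete multibyte character in the NUL-terminated string s, or -1
 * if the whole string is valid in the current locale.
 */
static long
first_invalid_offset(const char *s)
{
	size_t len = strlen(s), off = 0;
	int n;

	(void)mblen(NULL, 0);		/* reset the conversion state */
	while (off < len) {
		n = mblen(s + off, len - off);
		if (n <= 0)		/* -1: invalid or incomplete */
			return (long)off;
		off += (size_t)n;
	}
	return -1;
}
```

With a UTF-8 locale in effect, a lone latin1 a-acute (0xe1) followed by
ASCII text is reported at the offset of the 0xe1 byte.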

-- 
Walter



Re: Send international text with mail(1) proposal and patches]

2023-09-23 Thread Walter Alejandro Iglesias
> On Thu, Sep 21, 2023 at 02:12:50PM +0200, Stefan Sperling wrote:
> > Your implementation lacks proper bounds checking. It accesses
> > s[i + 3] based purely on the contents of the input string, without
> > checking whether len < i + 3. Entering the while (i != len) loop with

You surely meant "len > i + 3" (greater than).  The patch below is wrong.

I know it doesn't matter anymore but I'm still clarifying so that no one
wastes time trying the patch.
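
To make the direction of the comparison concrete: reading s[i + k] is only
safe when i + k < len.  A minimal sketch of that bounds check alone, with
hypothetical helper names and no validation of the continuation bytes, is
shown here.  It is not meant as a replacement for the libc interfaces
recommended elsewhere in the thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes announced by a UTF-8 lead byte,
 * or 0 if the byte cannot start a sequence.
 */
static size_t
utf8_seq_len(unsigned char lead)
{
	if (lead <= 0x7f)
		return 1;
	if (lead >= 0xc2 && lead <= 0xdf)
		return 2;
	if (lead >= 0xe0 && lead <= 0xef)
		return 3;
	if (lead >= 0xf0 && lead <= 0xf4)
		return 4;
	return 0;
}

/*
 * Hypothetical helper: return 1 if the sequence starting at s[i] lies
 * completely inside s[0] .. s[len - 1], 0 otherwise.
 */
static int
seq_in_bounds(const unsigned char *s, size_t len, size_t i)
{
	size_t need;

	if (i >= len)
		return 0;
	need = utf8_seq_len(s[i]);
	/* need <= len - i is the same as i + need - 1 < len */
	return need != 0 && need <= len - i;
}
```

Only when this check passes may the bytes s[i + 1] up to s[i + need - 1]
be inspected.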

> 
> 
> 
> Index: send.c
> ===
> RCS file: /cvs/src/usr.bin/mail/send.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 send.c
> --- send.c8 Mar 2023 04:43:11 -   1.26
> +++ send.c21 Sep 2023 14:16:08 -
> @@ -33,6 +33,10 @@
>  #include "rcv.h"
>  #include "extern.h"
>  
> +/* To check charset of the message and add the appropriate MIME headers  */
> +static char nutf8;
> +static int not_utf8(FILE *s, int len);
> +
>  static volatile sig_atomic_t sendsignal; /* Interrupted by a signal? */
>  
>  /*
> @@ -341,6 +345,11 @@ mail1(struct header *hp, int printheader
>   else
>   puts("Null message body; hope that's ok");
>   }
> +
> + /* Check non valid UTF-8 characters in the message */
> + nutf8 = not_utf8(mtf, fsize(mtf));
> + rewind(mtf);
> +
>   /*
>* Now, take the user names from the combined
>* to and cc lists and do all the alias
> @@ -525,6 +534,14 @@ puthead(struct header *hp, FILE *fo, int
>   fmt("To:", hp->h_to, fo, w), gotcha++;
>   if (hp->h_subject != NULL && w & GSUBJECT)
>   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
> + if (nutf8 == 0)
> + fprintf(fo, "MIME-Version: 1.0\n"
> + "Content-Type: text/plain; charset=us-ascii\n"
> + "Content-Transfer-Encoding: 7bit\n"), gotcha++;
> + else if (nutf8 == 1)
> + fprintf(fo, "MIME-Version: 1.0\n"
> + "Content-Type: text/plain; charset=utf-8\n"
> + "Content-Transfer-Encoding: 8bit\n"), gotcha++;
>   if (hp->h_cc != NULL && w & GCC)
>   fmt("Cc:", hp->h_cc, fo, w), gotcha++;
>   if (hp->h_bcc != NULL && w & GBCC)
> @@ -609,4 +626,60 @@ sendint(int s)
>  {
>  
>   sendsignal = s;
> +}
> +
> +/* Search non valid UTF-8 characters in the message */
> +static int
> +not_utf8(FILE *message, int len)
> +{
> + int i, n, nonascii;
> + char c;
> + unsigned char s[len + 1];
> +
> + i = 0;
> +while ((c = getc(message)) != EOF)
> + s[i++] = c;
> +
> + s[i] = '\0';
> +
> + i = n = nonascii = 0;
> + while (i != len)
> + if (s[i] <= 0x7f) {
> + i++;
> + /* Two bytes case */
> + } else if (len < i + 1 && s[i] >= 0xc2 && s[i] < 0xe0 &&
> + s[i + 1] >= 0x80 && s[i + 1] <= 0xbf) {
> + i += 2;
> + nonascii++;
> + /* Special three bytes case */
> + } else if ((len < i + 2 && s[i] == 0xe0 &&
> + s[i + 1] >= 0xa0 && s[i + 1] <= 0xbf &&
> + s[i + 2] >= 0x80 && s[i + 2] <= 0xbf) ||
> + /* Three bytes case */
> + (len < i + 2 && s[i] > 0xe0 && s[i] < 0xf0 &&
> + s[i + 1] >= 0x80 && s[i + 1] <= 0xbf &&
> + s[i + 2] >= 0x80 && s[i + 2] <= 0xbf)) {
> + i += 3;
> + nonascii++;
> + /* Special four bytes case */
> + } else if ((len < i + 3 && s[i] == 0xf0 &&
> + s[i + 1] >= 0x90 && s[i + 1] <= 0xbf &&
> + s[i + 2] >= 0x80 && s[i + 2] <= 0xbf &&
> + s[i + 3] >= 0x80 && s[i + 3] <= 0xbf) ||
> + /* Four bytes case */
> + (len < i + 3 && s[i] > 0xf0 &&
> + s[i + 1] >= 0x80 && s[i + 1] <= 0xbf &&
> + s[i + 2] >= 0x80 && s[i + 2] <= 0xbf &&
> + s[i + 3] >= 0x80 && s[i + 3] <= 0xbf)) {
> + i += 4;
> + nonascii++;
> + } else {
> + n = i + 1;
> + break;
> + }
> +
> + if (nonascii)
> + n++;
> +
> + return n;
>  }
> 
> 
> -- 
> Walter

-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-21 Thread Walter Alejandro Iglesias
On Fri, Sep 22, 2023 at 06:57:24AM +0200, Walter Alejandro Iglesias wrote:
> Below, a version without a UTF-8 parser.  I added an ASCII check for the
> subject.  The day will come when wscons supports UTF-8, right?  In the
> meantime, just by being careful not to type iso-latin characters while
> using mail on wscons this version does its job.

The last version caused a core dump when sending without a subject.  I
fixed that by adding a check in the conditional:

while (hp->h_subject != NULL && hp->h_subject[i] != '\0') {
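
The same guard can be seen in isolation in the sketch below, a standalone
helper (has_non_ascii() is a hypothetical name, not part of mail(1)) that
treats a missing string as "nothing to flag" and compares bytes as
unsigned values so that the result does not depend on whether char is
signed.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if s contains a byte above 0x7f,
 * 0 otherwise.  A NULL pointer (no subject given) counts as ASCII.
 */
static int
has_non_ascii(const char *s)
{
	const unsigned char *p;

	if (s == NULL)
		return 0;
	for (p = (const unsigned char *)s; *p != '\0'; p++)
		if (*p > 0x7f)
			return 1;
	return 0;
}
```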



Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  22 Sep 2023 05:47:37 -
@@ -33,6 +33,10 @@
 #include "rcv.h"
 #include "extern.h"
 
+/* This will be used to add MIME headers */
+static char noascii_subject;
+static char noascii_body;
+
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
 /*
@@ -341,6 +345,22 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check for non ascii characters in the subject */
+   int i, ch;
+   i = 0;
+   while (hp->h_subject != NULL && hp->h_subject[i] != '\0') {
+   if (!isascii(hp->h_subject[i]))
+   noascii_subject = 1;
+   i++;
+   }
+
+   /* Check for non ascii characters in the body */
+   while ((ch = getc(mtf)) != EOF)
+   if (!isascii(ch))
+   noascii_body = 1;
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -524,7 +544,18 @@ puthead(struct header *hp, FILE *fo, int
if (hp->h_to != NULL && w & GTO)
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
-   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   fprintf(fo, "Subject: %s\n", hp->h_subject),
+   gotcha++;
+   if (noascii_subject || noascii_body)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"),
+   gotcha++;
+   else
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"),
+   gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
@@ -607,6 +638,5 @@ savemail(char *name, FILE *fi)
 void
 sendint(int s)
 {
-
sendsignal = s;
 }

-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-21 Thread Walter Alejandro Iglesias
Hi Ingo,

On Thu, Sep 21, 2023 at 03:04:24PM +0200, Ingo Schwarze wrote:
> As Stefan says, adding a hand-written UTF-8 parser to mail(1) is
> clearly not acceptable.

Below, a version without a UTF-8 parser.  I added an ASCII check for the
subject.  The day will come when wscons supports UTF-8, right?  In the
meantime, just by being careful not to type iso-latin characters while
using mail on wscons this version does its job.

> 
> Yours,
>   Ingo


Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  22 Sep 2023 03:54:37 -
@@ -33,6 +33,10 @@
 #include "rcv.h"
 #include "extern.h"
 
+/* This will be used to add MIME headers */
+static char noascii_subject;
+static char noascii_body;
+
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
 /*
@@ -341,6 +345,22 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check for non ascii characters in the subject */
+   int i, ch;
+   i = 0;
+   while (hp->h_subject[i] != '\0') {
+   if (!isascii(hp->h_subject[i]))
+   noascii_subject = 1;
+   i++;
+   }
+
+   /* Check for non ascii characters in the body */
+   while ((ch = getc(mtf)) != EOF)
+   if (!isascii(ch))
+   noascii_body = 1;
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -524,7 +544,18 @@ puthead(struct header *hp, FILE *fo, int
if (hp->h_to != NULL && w & GTO)
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
-   fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   fprintf(fo, "Subject: %s\n", hp->h_subject),
+   gotcha++;
+   if (noascii_subject || noascii_body)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"),
+   gotcha++;
+   else
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"),
+   gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
@@ -607,6 +638,5 @@ savemail(char *name, FILE *fi)
 void
 sendint(int s)
 {
-
sendsignal = s;
 }


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-21 Thread Walter Alejandro Iglesias
On Thu, Sep 21, 2023 at 02:12:50PM +0200, Stefan Sperling wrote:
> On Thu, Sep 21, 2023 at 01:25:01PM +0200, Walter Alejandro Iglesias wrote:
> > I corrected many of the things you pointed me, but not all.  The
> > function I use to check utf8 is mine, I use it in a pair of little
> > programs which I've *hardly* checked for memory leaks.  I know my
> > function looks BIG :-), but I know for sure that it does the job.
> 
> We already have code in libc that does this, see the function
> _citrus_utf8_ctype_mbrtowc in lib/libc/citrus/citrus_utf8.c.
> Please use the libc interface if at all possible, it is best to
> have just one place to fix when a UTF-8 parser bug is found.
> 
> There is also utf8_isvalid() in tmux utf8.c though you would
> have to trim tmux UTF-8 code down for your narrow use case.
> 
> Your implementation lacks proper bounds checking. It accesses
> s[i + 3] based purely on the contents of the input string, without
> checking whether len < i + 3. Entering the while (i != len) loop with
> i == len-1 and a specially crafted input string can be problematic.

Hey Stefan,

I'll give up for now.  Another day I'll investigate and try to understand
what you're asking me, because so far I fail to see how what you propose
could facilitate maintenance or reduce bugs.

Notice that you saw the issue in my code (bounds checking) at first
glance; that's because my code is neither too complicated (citrus) nor
too elegant (tmux), hence by far easier to read, understand and debug.
Among other things it deals with utf-8 without using wchar.h.

I'm sorry you don't like it.  Anyways, in case someone else can do
something with it, here is my last version with the boundary check.



Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  21 Sep 2023 14:16:08 -
@@ -33,6 +33,10 @@
 #include "rcv.h"
 #include "extern.h"
 
+/* To check charset of the message and add the appropriate MIME headers  */
+static char nutf8;
+static int not_utf8(FILE *s, int len);
+
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
 /*
@@ -341,6 +345,11 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check non valid UTF-8 characters in the message */
+   nutf8 = not_utf8(mtf, fsize(mtf));
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -525,6 +534,14 @@ puthead(struct header *hp, FILE *fo, int
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   if (nutf8 == 0)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"), gotcha++;
+   else if (nutf8 == 1)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"), gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
@@ -609,4 +626,60 @@ sendint(int s)
 {
 
sendsignal = s;
+}
+
+/* Search non valid UTF-8 characters in the message */
+static int
+not_utf8(FILE *message, int len)
+{
+   int i, n, nonascii;
+   char c;
+   unsigned char s[len + 1];
+
+   i = 0;
+while ((c = getc(message)) != EOF)
+   s[i++] = c;
+
+   s[i] = '\0';
+
+   i = n = nonascii = 0;
+   while (i != len)
+   if (s[i] <= 0x7f) {
+   i++;
+   /* Two bytes case */
+   } else if (len < i + 1 && s[i] >= 0xc2 && s[i] < 0xe0 &&
+   s[i + 1] >= 0x80 && s[i + 1] <= 0xbf) {
+   i += 2;
+   nonascii++;
+   /* Special three bytes case */
+   } else if ((len < i + 2 && s[i] == 0xe0 &&
+   s[i + 1] >= 0xa0 && s[i + 1] <= 0xbf &&
+   s[i + 2] >= 0x80 && s[i + 2] <= 0xbf) ||
+   /* Three bytes case */
+   (len < i + 2 && s[i] > 0xe0 && s[i] < 0xf0 &&
+   s[i + 1] >= 0x80 && s[i + 1] <= 0xbf &&
+  

Re: Send international text with mail(1) - proposal and patches

2023-09-21 Thread Walter Alejandro Iglesias
On Thu, Sep 21, 2023 at 11:26:11AM +0200, Omar Polo wrote:
> On 2023/09/21 10:55:47 +0200, Walter Alejandro Iglesias wrote:
> > On Wed, Sep 20, 2023 at 08:36:23PM +0200, Walter Alejandro Iglesias wrote:
> > > On Wed, Sep 20, 2023 at 07:44:12PM +0200, Walter Alejandro Iglesias wrote:
> > > > And this new idea simplifies all to this:
> > > 
> > > In case anyone else is worried.  Crystal Kolipe already pointed out
> > > to me that better UTF-8 checking is needed, I know, I'll get to
> > > that tomorrow.
> > 
> > The following version checks for invalid UTF-8 characters.  I could
> > make it fail in this case and write a dead.letter, but I imagine that
> > those who really use mail(1) mostly do so in a tty console where, at
> > least with a non-US keyboard, it is too easy to type an invalid UTF-8
> > character.  Such a feature would be more of a hassle than a help, so I
> > chose to simply skip adding any MIME header in this case (which is how
> > it has been used until now, and no one complained :-)).  If you prefer
> > the other behavior let me know.
> > 
> > 
> > Index: send.c
> > ===
> > RCS file: /cvs/src/usr.bin/mail/send.c,v
> > retrieving revision 1.26
> > diff -u -p -r1.26 send.c
> > --- send.c  8 Mar 2023 04:43:11 -   1.26
> > +++ send.c  21 Sep 2023 08:40:11 -
> > @@ -33,6 +33,15 @@
> >  #include "rcv.h"
> >  #include "extern.h"
> >  
> > +/*
> > + * Variables and functions declared here will be useful to check the
> > + * character set of the message to add the appropriate MIME headers.
> > + */
> > +static char nascii = 0;
> > +static char nutf8 = 0;
> 
> There's no need to explicitly zero static (or global) variables.
> 
> > +static int not_ascii(struct __sFILE *s);
> > +static int not_utf8(struct __sFILE *s, int len);
> 
> I'd use FILE * instead of struct __sFILE
> 
> >  static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
> >  
> >  /*
> > @@ -341,6 +350,15 @@ mail1(struct header *hp, int printheader
> > else
> > puts("Null message body; hope that's ok");
> > }
> > +
> > +   /* Check for non ASCII characters in the message */
> > +   nascii = not_ascii(mtf);
> > +   rewind(mtf);
> > +
> > +   /* Check for non valid UTF-8 characters in the message */
> > +   nutf8 = not_utf8(mtf, fsize(mtf));
> > +   rewind(mtf);
> 
> assuming that we care for these two checks, why not do everything in
> a single pass?
> 
> Do we really need the two checks?
> 
> > /*
> >  * Now, take the user names from the combined
> >  * to and cc lists and do all the alias
> > @@ -525,6 +543,14 @@ puthead(struct header *hp, FILE *fo, int
> > fmt("To:", hp->h_to, fo, w), gotcha++;
> > if (hp->h_subject != NULL && w & GSUBJECT)
> > fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
> > +   if (!nascii)
> > +   fprintf(fo, "MIME-Version: 1.0\n"
> > +   "Content-Type: text/plain; charset=us-ascii\n"
> > +   "Content-Transfer-Encoding: 7bit\n"), gotcha++;
> 
> +1 for splitting the string in multiple lines, this is an improvement
> over previous versions, but please
> 
>  - use four spaces of indentation for continuation lines
> 
>  - although existing code uses ", gotcha++" I'd split that in a
>separate line for clarity.
> 
> > +   else if (nutf8 == 0)
> > +   fprintf(fo, "MIME-Version: 1.0\n"
> > +   "Content-Type: text/plain; charset=utf-8\n"
> > +   "Content-Transfer-Encoding: 8bit\n"), gotcha++;
> > if (hp->h_cc != NULL && w & GCC)
> > fmt("Cc:", hp->h_cc, fo, w), gotcha++;
> > if (hp->h_bcc != NULL && w & GBCC)
> > @@ -609,4 +635,67 @@ sendint(int s)
> >  {
> >  
> > sendsignal = s;
> > +}
> > +
> > +/* Search non ASCII characters in the message */
> > +static int
> > +not_ascii(struct __sFILE *s)
> > +{
> > +   int ch, n;
> > +   n = 0;
> > +while ((ch = getc(s)) != EOF)
> 
> There are some spacing issues, both here and below.
> 
> > +if (ch > 0x7f)
> > +   n = 1;
> > +
> > +   re

Re: Send international text with mail(1) - proposal and patches

2023-09-21 Thread Walter Alejandro Iglesias
On Wed, Sep 20, 2023 at 08:36:23PM +0200, Walter Alejandro Iglesias wrote:
> On Wed, Sep 20, 2023 at 07:44:12PM +0200, Walter Alejandro Iglesias wrote:
> > And this new idea simplifies all to this:
> 
> In case anyone else is worried.  Crystal Kolipe already pointed out to
> me that better UTF-8 checking is needed, I know, I'll get to that
> tomorrow.

The following version checks for invalid UTF-8 characters.  I could
make it fail in this case and write a dead.letter, but I imagine that
those who really use mail(1) mostly do so in a tty console where, at
least with a non-US keyboard, it is too easy to type an invalid UTF-8
character.  Such a feature would be more of a hassle than a help, so I
chose to simply skip adding any MIME header in this case (which is how
it has been used until now, and no one complained :-)).  If you prefer
the other behavior let me know.
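
One detail of scanning the message body with getc(3) is worth spelling
out: the return value should be kept in an int, otherwise a 0xff byte
cannot be told apart from EOF.  The sketch below shows a scan that stops
at the first non-ASCII byte and rewinds the stream afterwards.  Both
function names are hypothetical, and string_has_non_ascii() exists only to
try the scan on a string through a temporary file.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return 1 if the stream contains a byte above
 * 0x7f, 0 otherwise.  The stream is rewound before returning.
 */
static int
stream_has_non_ascii(FILE *fp)
{
	int ch, found = 0;	/* int keeps 0xff distinct from EOF */

	rewind(fp);
	while ((ch = getc(fp)) != EOF)
		if (ch > 0x7f) {
			found = 1;
			break;
		}
	rewind(fp);
	return found;
}

/* Hypothetical driver: run the scan over the bytes of a string. */
static int
string_has_non_ascii(const char *text)
{
	FILE *fp;
	int r;

	if ((fp = tmpfile()) == NULL)
		return -1;
	fputs(text, fp);
	r = stream_has_non_ascii(fp);
	fclose(fp);
	return r;
}
```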


Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  21 Sep 2023 08:40:11 -
@@ -33,6 +33,15 @@
 #include "rcv.h"
 #include "extern.h"
 
+/*
+ * Variables and functions declared here will be useful to check the
+ * character set of the message to add the appropriate MIME headers.
+ */
+static char nascii = 0;
+static char nutf8 = 0;
+static int not_ascii(struct __sFILE *s);
+static int not_utf8(struct __sFILE *s, int len);
+
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
 /*
@@ -341,6 +350,15 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+
+   /* Check for non ASCII characters in the message */
+   nascii = not_ascii(mtf);
+   rewind(mtf);
+
+   /* Check for non valid UTF-8 characters in the message */
+   nutf8 = not_utf8(mtf, fsize(mtf));
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -525,6 +543,14 @@ puthead(struct header *hp, FILE *fo, int
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   if (!nascii)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=us-ascii\n"
+   "Content-Transfer-Encoding: 7bit\n"), gotcha++;
+   else if (nutf8 == 0)
+   fprintf(fo, "MIME-Version: 1.0\n"
+   "Content-Type: text/plain; charset=utf-8\n"
+   "Content-Transfer-Encoding: 8bit\n"), gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)
@@ -609,4 +635,67 @@ sendint(int s)
 {
 
sendsignal = s;
+}
+
+/* Search non ASCII characters in the message */
+static int
+not_ascii(struct __sFILE *s)
+{
+   int ch, n;
+   n = 0;
+while ((ch = getc(s)) != EOF)
+if (ch > 0x7f)
+   n = 1;
+
+   return n;
+}
+
+/* Search non valid UTF-8 characters in the message */
+static int
+not_utf8(struct __sFILE *message, int len)
+{
+   int i, nou8;
+   char c;
+   unsigned char s[len + 1];
+
+   i = 0;
+while ((c = getc(message)) != EOF)
+   s[i++] = c;
+
+   s[i] = '\0';
+
+   i = nou8 = 0;
+   while (i != len)
+   if (s[i] <= 0x7f)
+   ++i;
+   /* Two bytes case */
+   else if (s[i] >= 0xc2 && s[i] < 0xe0 &&
+   s[i + 1] >= 0x80 && s[i + 1] <= 0xbf)
+   i += 2;
+   /* Special three bytes case */
+   else if ((s[i] == 0xe0 &&
+   s[i + 1] >= 0xa0 && s[i + 1] <= 0xbf &&
+   s[i + 2] >= 0x80 && s[i + 2] <= 0xbf) ||
+   /* Three bytes case */
+   (s[i] > 0xe0 && s[i] < 0xf0 &&
+   s[i + 1] >= 0x80 && s[i + 1] <= 0xbf &&
+   s[i + 2] >= 0x80 && s[i + 2] <= 0xbf))
+   i += 3;
+   /* Special four bytes case */
+   else if ((s[i] == 0xf0 &&
+   s[i + 1] >= 0x90 && s[i + 1] <= 0xbf &&
+   s[i + 2] >= 0x80 && s[i + 2] <= 0xbf &&
+   s[i + 3] >= 0x80 && s[i + 3] &

Re: Send international text with mail(1) - proposal and patches

2023-09-20 Thread Walter Alejandro Iglesias
On Wed, Sep 20, 2023 at 07:44:12PM +0200, Walter Alejandro Iglesias wrote:
> And this new idea simplifies all to this:

In case anyone else is worried.  Crystal Kolipe already pointed out to
me that better UTF-8 checking is needed, I know, I'll get to that
tomorrow.



Re: Send international text with mail(1) - proposal and patches

2023-09-20 Thread Walter Alejandro Iglesias
On Wed, Sep 20, 2023 at 06:13:10PM +0200, Walter Alejandro Iglesias wrote:
> Now I was investigating exactly that :-) (like Mutt also does): to make
> mail(1) automatically set the appropriate MIME headers when it detects
> any utf8 characters in the body text.  So, you don't like this idea?
> 

And this new idea simplifies all to this:


Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  20 Sep 2023 17:40:22 -
@@ -33,6 +33,8 @@
 #include "rcv.h"
 #include "extern.h"
 
+char utf8 = 0;
+
 static volatile sig_atomic_t sendsignal;   /* Interrupted by a signal? */
 
 /*
@@ -341,6 +343,13 @@ mail1(struct header *hp, int printheader
else
puts("Null message body; hope that's ok");
}
+   /* Check for non ascii characters */
+   int ch;
+while ((ch = getc(mtf)) != EOF)
+if (ch > 0x7f)
+   utf8 = 1;
+   rewind(mtf);
+
/*
 * Now, take the user names from the combined
 * to and cc lists and do all the alias
@@ -525,6 +534,10 @@ puthead(struct header *hp, FILE *fo, int
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   if (utf8)
+   fprintf(fo, "MIME-Version: 1.0\nContent-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n"), gotcha++;
+   else
+   fprintf(fo, "MIME-Version: 1.0\nContent-Type: text/plain; charset=us-ascii\nContent-Transfer-Encoding: 7bit\n"), gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-20 Thread Walter Alejandro Iglesias
On Wed, Sep 20, 2023 at 05:30:08PM +0200, Ingo Schwarze wrote:
> Hi,
> 
> i checked the following points:
> 
>  * Even though RFC 2049 section 2 bullet point 1 only *requires*
>MIME-conformant MUAs to always write the header "MIME-Version:
>1.0" - and mail(1) is most certainly not MIME-conformant - RFC 2049
>section 2 bullet point 8 explicitly *recommends* that even non-MIME
>MUAs always set appropriate MIME headers.  RFC 2046 section 4.1.2
>paragraph 8 also "strongly" recommends the explicit inclusion of a
>"charset" parameter even for us-ascii.
> 
>Consequently, i believe that when sending a message in US-ASCII,
>mail(1) should include these headers:
> 
>MIME-Version: 1.0
>Content-Transfer-Encoding: 7bit
>Content-Type: text/plain; charset=us-ascii

I already thought about adding this, it's what Mutt does by default, but
I thought: Ingo is going to scold me for complicating things. :-)

> 
>  * Adding a "Content-Transfer-Encoding: ..." header is indeed required
>for sending UTF-8 messages, see  RFC 2049 section 2 bullet point 2.
>"8bit" is one of the valid values that MUAs must support for
>receiving messages by default.
>Using it seems sane because it is most likely to work with receiving
>MUAs that are not MIME-conformant, like our mail(1) itself.
>I think nowadays, that's a bigger concern than MTAs that are not
>8-bit clean, in particular when maintaining a low-level program
>like our mail(1).
>Consequently, i think using 8bit is indeed better for our mail(1)
>than quoted-printable or base64.

Well, this also saves you the conversion, especially with the subject,
which is tricky.

> 
>  * Adding "Content-Type: text/plain; charset=utf-8" is required by
>RFC 2049 section 2 bullet point 4 (for the simplest kind of UTF-8
>encoded messages).
> 
>  * The Content-Disposition: header is defined in RFC 2183, clearly
>optional, and not useful in single-part messages.  Consequently,
>mail(1) should not write it.

Yeah, I read that, that's why I didn't add that header.
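
The header sets discussed in the points above can be summarized in a small
table-like function.  This is a sketch only: the enum, its values and
mime_headers() are hypothetical names, and returning NULL for undecodable
input is just one possible policy (send without MIME headers), not
something the RFCs prescribe.

```c
#include <stddef.h>

/* Hypothetical classification of a message body. */
enum body_kind { BODY_ASCII, BODY_UTF8, BODY_INVALID };

/*
 * Hypothetical helper: return the MIME header block for a body of the
 * given kind, or NULL if no MIME headers should be written.
 */
static const char *
mime_headers(enum body_kind kind)
{
	switch (kind) {
	case BODY_ASCII:
		return "MIME-Version: 1.0\n"
		    "Content-Type: text/plain; charset=us-ascii\n"
		    "Content-Transfer-Encoding: 7bit\n";
	case BODY_UTF8:
		return "MIME-Version: 1.0\n"
		    "Content-Type: text/plain; charset=utf-8\n"
		    "Content-Transfer-Encoding: 8bit\n";
	default:
		return NULL;
	}
}
```

No Content-Disposition header appears in either case, in line with the
last point.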


> 
> So apart from writing the headers for us-ascii, i think you are
> almost there.
> 
> Given that the charset cannot be inferred from the environment
> and that setting it per-system or per-user in a configuration file
> is also inadequate - it shouldn't be uncommon for users to sometimes
> send US-ASCII and sometimes UTF-8 mail - i think that a new option
> is indeed needed.
> 
> Regarding the naming of the option, compatibility with POSIX
>   https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mailx.html
> is paramount, which kills the tentative idea to use -u for "UTF-8"
> because -u already means "user".
> 
> Compatibility with other mailx(1) implementations is also a
> consideration.  See, for example,
>   https://linux.die.net/man/1/mail
> and -m is indeed among the very few options still available over there.
> I would document it focussing on a "multibyte character encoding"
> mnemonic.  The "mime" mnemonic feels far too broad because MIME can
> be used for lots of other purposes besides specifying a character
> encoding.
> 
> The -m option is also free here:
>   https://man.freebsd.org/cgi/man.cgi?query=mail(1)
>   https://man.netbsd.org/mail.1
>   https://docs.oracle.com/cd/E88353_01/html/E37839/mailx-1.html
>   https://www.ibm.com/docs/en/aix/7.3?topic=m-mail-command-1
> None of those appears to support command line selection of the
> character set for sending mail, so i don't see any immediate
> logic clashes either.
> 
> The -m option does clash with this one:
>   https://www.sdaoden.eu/code-nail.html
> But i think dismissing Steffen Daode Nurpmeso as a lunatic is obviously
> the way to go.  Try to listen to that person and you will never get
> anything done.
> 
> The mailx(1) documented on die.net appears to be the Heirloom one.
> It does not have an option to select sending US-ASCII or UTF-8.
> Instead, it has a "sendcharsets" configuration variable.  That's
> clearly overengineering, but even when hardcoding the equivalent of
> 
>   sendcharsets=utf-8
> 
> which is also the default, that's nasty because it silently switches to
> UTF-8 as soon as a non-ASCII character appears in the input.  I think
> at least in interactive mode, explicit confirmation from the user would
> be required to send UTF-8, instead writing dead.letter if the user
> rejects the request, such that they can clean up the file and try again.
> 
> That would certainly be more complicated than requiring an option
> up front, not only from the implementation perspective, but arguably
> also from the user perspective.  So unless other developers think this
> should be fully automatic with confirmation rather than controlled
> by an option, i suggest staying with Walter's idea of using an option.

Now I was investigating exactly that :-) (like Mutt also does): to make
mail(1) automatically set the appropriate MIME headers when it detects
any utf8 characters 

Re: Send international text with mail(1) - proposal and patches

2023-09-20 Thread Walter Alejandro Iglesias
On Wed, Sep 20, 2023 at 10:30:31AM +, Klemens Nanni wrote:
> Except for mandoc(1) and other manuals where "utf8" is a literal keyword,
> our manuals consistently use upper-case UTF-8 for what is an abbreviation,
> so this should do as well.
> 
> >  .It Fl n
> >  Inhibits reading
> >  .Pa /etc/mail.rc
> 
> You forgot SYNOPSIS:
>   $ man -h mail
>   mail [-dEIinv] [-b list] [-c list] [-r from-addr] [-s subject] to-addr ...
>   mail [-dEIiNnv] -f [file]
>   mail [-dEIiNnv] [-u user]
> 
> Otherwise looks sane.
> 

Thank you!


Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57 -  1.29
+++ extern.h    20 Sep 2023 10:44:41 -
@@ -261,3 +261,4 @@ int  writeback(FILE *);
 extern char *__progname;
 extern char *tmpdir;
 extern const struct cmd *com; /* command we are running */
+extern char mime; /* Add MIME headers */
Index: mail.1
===
RCS file: /cvs/src/usr.bin/mail/mail.1,v
retrieving revision 1.83
diff -u -p -r1.83 mail.1
--- mail.1  31 Mar 2022 17:27:25 -  1.83
+++ mail.1  20 Sep 2023 10:44:41 -
@@ -40,7 +40,7 @@
 .Sh SYNOPSIS
 .Nm mail
 .Bk -words
-.Op Fl dEIinv
+.Op Fl dEIimnv
 .Op Fl b Ar list
 .Op Fl c Ar list
 .Op Fl r Ar from-addr
@@ -106,6 +106,8 @@ on noisy phone lines.
 .It Fl N
 Inhibits initial display of message headers
 when reading mail or editing a mail folder.
+.It Fl m
+Add MIME headers to send UTF-8 encoded messages.
 .It Fl n
 Inhibits reading
 .Pa /etc/mail.rc
Index: main.c
===
RCS file: /cvs/src/usr.bin/mail/main.c,v
retrieving revision 1.35
diff -u -p -r1.35 main.c
--- main.c  26 Jan 2021 18:21:47 -  1.35
+++ main.c  20 Sep 2023 10:44:41 -
@@ -79,6 +79,8 @@ int   realscreenheight;   /* the real scree
 int     uflag;     /* Are we in -u mode? */
 sigset_t intset;   /* Signal set that is just SIGINT */
 
+char mime = 0; /* Add MIME headers */
+
 /*
  * The pointers for the string allocation routines,
  * there are NSPACE independent areas.
@@ -136,7 +138,7 @@ main(int argc, char **argv)
smopts = NULL;
fromaddr = NULL;
subject = NULL;
-   while ((i = getopt(argc, argv, "EINb:c:dfinr:s:u:v")) != -1) {
+   while ((i = getopt(argc, argv, "EINb:c:dfimnr:s:u:v")) != -1) {
switch (i) {
case 'u':
/*
@@ -171,6 +173,10 @@ main(int argc, char **argv)
 */
subject = optarg;
break;
+   case 'm':
+   /* Add MIME headers */
+   mime = 1;
+   break;
case 'f':
/*
 * User is specifying file to "edit" with Mail,
@@ -337,7 +343,7 @@ __dead void
 usage(void)
 {
 
-   fprintf(stderr, "usage: %s [-dEIinv] [-b list] [-c list] "
+   fprintf(stderr, "usage: %s [-dEIimnv] [-b list] [-c list] "
"[-r from-addr] [-s subject] to-addr ...\n", __progname);
fprintf(stderr, "   %s [-dEIiNnv] -f [file]\n", __progname);
fprintf(stderr, "   %s [-dEIiNnv] [-u user]\n", __progname);
Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  20 Sep 2023 10:44:41 -
@@ -525,6 +525,8 @@ puthead(struct header *hp, FILE *fo, int
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   if (mime)
+   fprintf(fo, "MIME-Version: 1.0\nContent-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n"), gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-20 Thread Walter Alejandro Iglesias
Hi Ingo,

I did what you suggested: I investigated a bit, and you were right
that the MIME-Version header is necessary.  This new set of patches
adds the following headers (hardcoded, as you suggested):

  MIME-Version: 1.0
  Content-Type: text/plain; charset=utf-8
  Content-Transfer-Encoding: 8bit

I changed the code as little as possible, adding just a '-m' option:

  $ mail -m -s Hello d...@ext.net < body_message_in_utf8
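
For readers who want to see the header block in isolation, here is a
minimal sketch in C of what the '-m' option is meant to emit.  The
helper name put_mime_headers() is hypothetical; the diff below does the
same thing inline in puthead() with a single fprintf() call.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the posted diff.
 * Writes the three hardcoded headers to fo.
 * Returns 0 on success, -1 on a write error.
 */
int
put_mime_headers(FILE *fo)
{
	if (fputs("MIME-Version: 1.0\n", fo) == EOF ||
	    fputs("Content-Type: text/plain; charset=utf-8\n", fo) == EOF ||
	    fputs("Content-Transfer-Encoding: 8bit\n", fo) == EOF)
		return -1;
	return 0;
}
```

Splitting the output into one call per header is only a readability
choice for this sketch; the headers written are the same three listed
above.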


Although, to tell the truth, I'm not really convinced that this change
is worth it.  Feel free to ignore it.



Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57 -  1.29
+++ extern.h    20 Sep 2023 09:55:06 -
@@ -261,3 +261,4 @@ int  writeback(FILE *);
 extern char *__progname;
 extern char *tmpdir;
 extern const struct cmd *com; /* command we are running */
+extern char mime; /* Add MIME headers */
Index: mail.1
===
RCS file: /cvs/src/usr.bin/mail/mail.1,v
retrieving revision 1.83
diff -u -p -r1.83 mail.1
--- mail.1  31 Mar 2022 17:27:25 -  1.83
+++ mail.1  20 Sep 2023 09:55:06 -
@@ -106,6 +106,8 @@ on noisy phone lines.
 .It Fl N
 Inhibits initial display of message headers
 when reading mail or editing a mail folder.
+.It Fl m
+Add MIME headers to send utf-8 encoded messages.
 .It Fl n
 Inhibits reading
 .Pa /etc/mail.rc
Index: main.c
===
RCS file: /cvs/src/usr.bin/mail/main.c,v
retrieving revision 1.35
diff -u -p -r1.35 main.c
--- main.c  26 Jan 2021 18:21:47 -  1.35
+++ main.c  20 Sep 2023 09:55:06 -
@@ -79,6 +79,8 @@ int   realscreenheight;   /* the real scree
 int     uflag;     /* Are we in -u mode? */
 sigset_t intset;   /* Signal set that is just SIGINT */
 
+char mime = 0; /* Add MIME headers */
+
 /*
  * The pointers for the string allocation routines,
  * there are NSPACE independent areas.
@@ -136,7 +138,7 @@ main(int argc, char **argv)
smopts = NULL;
fromaddr = NULL;
subject = NULL;
-   while ((i = getopt(argc, argv, "EINb:c:dfinr:s:u:v")) != -1) {
+   while ((i = getopt(argc, argv, "EINb:c:dfimnr:s:u:v")) != -1) {
switch (i) {
case 'u':
/*
@@ -171,6 +173,10 @@ main(int argc, char **argv)
 */
subject = optarg;
break;
+   case 'm':
+   /* Add MIME headers */
+   mime = 1;
+   break;
case 'f':
/*
 * User is specifying file to "edit" with Mail,
@@ -337,7 +343,7 @@ __dead void
 usage(void)
 {
 
-   fprintf(stderr, "usage: %s [-dEIinv] [-b list] [-c list] "
+   fprintf(stderr, "usage: %s [-dEIimnv] [-b list] [-c list] "
"[-r from-addr] [-s subject] to-addr ...\n", __progname);
fprintf(stderr, "   %s [-dEIiNnv] -f [file]\n", __progname);
fprintf(stderr, "   %s [-dEIiNnv] [-u user]\n", __progname);
Index: send.c
===
RCS file: /cvs/src/usr.bin/mail/send.c,v
retrieving revision 1.26
diff -u -p -r1.26 send.c
--- send.c  8 Mar 2023 04:43:11 -   1.26
+++ send.c  20 Sep 2023 09:55:06 -
@@ -525,6 +525,8 @@ puthead(struct header *hp, FILE *fo, int
fmt("To:", hp->h_to, fo, w), gotcha++;
if (hp->h_subject != NULL && w & GSUBJECT)
fprintf(fo, "Subject: %s\n", hp->h_subject), gotcha++;
+   if (mime)
+   fprintf(fo, "MIME-Version: 1.0\nContent-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n"), gotcha++;
if (hp->h_cc != NULL && w & GCC)
fmt("Cc:", hp->h_cc, fo, w), gotcha++;
if (hp->h_bcc != NULL && w & GBCC)



-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-19 Thread Walter Alejandro Iglesias
On Tue, Sep 19, 2023 at 05:48:01PM +0200, Ingo Schwarze wrote:
> Hi Walter,
> 
> i did not look closely at the patch yet, and i did not dig for standards
> documents, which one should almost certainly do before committing such
> a patch unless one knows all the relevant standards by heart (which i
> do not), so i'm not saying this must be done differently, but instead
> i am merely asking questions.

Today I came from having a biopsy of a tumor that appeared in my leg in
"February" of this year and thanks to the bureaucracy and the fact that
nowadays nobody takes anything seriously, I still don't know if the
tumor is malignant.  The apathy and irresponsibility of the people
(especially here in Spain) is such that I am thinking of buying a
scalpel and operating on the tumor myself.  I explain this because, you
can't imagine, dear Ingo, how happy it would make me if at least 10% of
the people in this world were half as responsible as you are. :-)

> 
> 1. Are you really sure that a header like
>  MIME-Version: 1.0
>is not needed when you add Content-*: headers?
> 
> 2. Are you really sure that a header like
>  Content-Disposition: inline
>is not needed?

Thanks for the info. :-)

> 
> 3. What is the reason for not simply hardcoding
>  Content-Transfer-Encoding: 8bit
>when sending UTF-8 mail?

Yeah, I thought about it.

>Are there really still MTAs that choke on that in 2023?
>quoted-printable is definitely a PITA no matter the context,
>so in my book, if it can be avoided, avoiding it would be a plus.

I always try to choose what, from my ignorance, I suspect will cause the
fewest problems.  In this case I take into account that once a message
goes out to the Internet, its health no longer depends only on what *my
system* supports; out there it has to survive different environments.
Many people still send messages in iso-latin and use MSWin, which still
doesn't use UTF-8: I send a UTF-8 encoded message and I get the response
in iso-latin.  So, from my ignorance, I feel that ASCII has a better
chance of surviving.

> 
> 4. What's the motivation for the -y flag taking an argument
>and not simply hardcoding "text/plain;charset=utf-8"?

I also thought about that.

>OpenBSD does not support any other charset and does not plan to
>change that in the future.
>I hope your next patch isn't going to be support for text/html.  =:-S

Believe me, I try to do everything in my life in the simplest way, as
long as others allow me.  But as with everything, you have to be careful
not to overdo it.  For example, in the case that concerns us here, if you
notice that every time you need some job done you have to install and use
the bloated version of the tools, you should ask yourself if you haven't
gone too far with your simplifications.  I'm more in favor of the
traditional "Keep it simple..." and "If it ain't broke..." rather than
"simplifying".  Simplifying is dangerous: amputating a leg simplifies
your body as a system but not your life.


> 
> 5. What's the motivation for supporting -y without -e
>and for supporting -e without -y ?

Right, that's an inconsistency.

> 
> In general, we want as few options as possible and as little
> configurabity as possible.  If there is a sane use case for something -
> in this case, sending UTF-8 mail - *one* option is possibly warranted.
> But adding more than one option would need a very robust justification,
> and so would adding an option that takes an argument.
> 
> Note that mail(1) is not mail/swaks.  Its purpose is reading and
> sending mail in a *simple* way, not low-level testing or protocol
> debugging.
> 
> I'll postpone code review and testing, maybe you can simplify this
> first?

Well, as on many other occasions, your intention is to kindly educate
me; this time you're making me notice that by publishing "sketches"
instead of finished work I'm wasting the developers' time.  Thanks Ingo!
What saddens me is that I'm too old to hope that one day I will win your
approval in something. :-)

> 
> Yours,
>   Ingo


-- 
Walter



Re: Send international text with mail(1) - proposal and patches

2023-09-19 Thread Walter Alejandro Iglesias
I'd forgotten that adding a "charset" specification to the Content-Type
header is also needed.  In the *new* set of patches below, besides
correcting some other errors, I added a '-y' option to specify the UTF-8
character set:

  $ mail -s Hello -e quoted-printable -y "text/plain;charset=utf-8" \
recipi...@example.com < message.txt


Index: collect.c
===
RCS file: /cvs/src/usr.bin/mail/collect.c,v
retrieving revision 1.34
diff -u -p -r1.34 collect.c
--- collect.c   17 Jan 2014 18:42:30 -  1.34
+++ collect.c   19 Sep 2023 13:30:14 -
@@ -87,7 +87,7 @@ collect(struct header *hp, int printhead
 * refrain from printing a newline after
 * the headers (since some people mind).
 */
-   t = GTO|GSUBJECT|GCC|GNL;
+   t = GTO|GSUBJECT|GENCODING|GTYPE|GCC|GNL;
getsub = 0;
if (hp->h_subject == NULL && value("interactive") != NULL &&
(value("ask") != NULL || value("asksub") != NULL))
@@ -208,7 +208,7 @@ cont:
/*
 * Grab a bunch of headers.
 */
-   grabh(hp, GTO|GSUBJECT|GCC|GBCC);
+   grabh(hp, GTO|GSUBJECT|GENCODING|GTYPE|GCC|GBCC);
goto cont;
case 't':
/*
@@ -328,7 +328,7 @@ cont:
 */
rewind(collf);
puts("---\nMessage contains:");
-   puthead(hp, stdout, GTO|GSUBJECT|GCC|GBCC|GNL);
+   puthead(hp, stdout, GTO|GSUBJECT|GENCODING|GTYPE|GCC|GBCC|GNL);
while ((t = getc(collf)) != EOF)
(void)putchar(t);
goto cont;
Index: def.h
===
RCS file: /cvs/src/usr.bin/mail/def.h,v
retrieving revision 1.17
diff -u -p -r1.17 def.h
--- def.h   28 Jan 2022 06:18:41 -  1.17
+++ def.h   19 Sep 2023 13:30:14 -
@@ -158,12 +158,14 @@ struct headline {
 #define GSUBJECT 2  /* Likewise, Subject: line */
 #define GCC 4   /* And the Cc: line */
 #define GBCC 8   /* And also the Bcc: line */
-#define GMASK   (GTO|GSUBJECT|GCC|GBCC)
+#define GMASK   (GTO|GSUBJECT|GENCODING|GTYPE|GCC|GBCC)
/* Mask of places from whence */
 
 #define GNL 16  /* Print blank line after */
 #define GDEL 32  /* Entity removed from list */
 #define GCOMMA  64  /* detract puts in commas */
+#define GENCODING 128   /* Content-Transfer-Encoding: line */
+#define GTYPE   256 /* Content-Type: line */
 
 /*
  * Structure used to pass about the current
@@ -173,6 +175,8 @@ struct header {
struct name *h_to;  /* Dynamic "To:" string */
char *h_from;   /* User-specified "From:" string */
char *h_subject;/* Subject string */
+   char *h_encoding;   /* Content-Transfer-Encoding string */
+   char *h_type;   /* Content-Type string */
struct name *h_cc;  /* Carbon copies string */
struct name *h_bcc; /* Blind carbon copies */
struct name *h_smopts;  /* Sendmail options */
Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57 -  1.29
+++ extern.h    19 Sep 2023 13:30:14 -
@@ -163,7 +163,7 @@ void load(char *);
 struct var *
 lookup(char *);
 int mail(struct name *, struct name *, struct name *, struct name *,
-  char *, char *);
+  char *, char *, char *, char *);
 void    mail1(struct header *, int);
 void    makemessage(FILE *, int);
 void    mark(int);
Index: mail.1
===
RCS file: /cvs/src/usr.bin/mail/mail.1,v
retrieving revision 1.83
diff -u -p -r1.83 mail.1
--- mail.1  31 Mar 2022 17:27:25 -  1.83
+++ mail.1  19 Sep 2023 13:30:15 -
@@ -45,6 +45,8 @@
 .Op Fl c Ar list
 .Op Fl r Ar from-addr
 .Op Fl s Ar subject
+.Op Fl e Ar transfer-encoding
+.Op Fl y Ar content-type
 .Ar to-addr ...
 .Ek
 .Nm mail
@@ -77,6 +79,8 @@ Causes
 .Nm mail
 to output all sorts of information useful for debugging
 .Nm mail .
+.It Fl e Ar encoding
+Add a Content-Transfer-Encoding header on command line.
 .It Fl E
 Don't send messages with an empty body.
 .It Fl f
@@ -133,6 +137,8 @@ except that locking is done.
 Verbose mode.
 The details of
 delivery are displayed on the user's terminal.
+.It Fl y Ar content-type
+Add a Content-Type header on command line.
 .El
 

Send international text with mail(1) - proposal and patches

2023-09-19 Thread Walter Alejandro Iglesias
Hello everyone,

Years ago I made a suggestion here about this.  Now that I am a little
less ignorant of C, I dared to write the patches below; they add a '-e'
option to mail(1) to pass a Content-Transfer-Encoding on the command
line.  First you convert the text in "message.txt" to quoted-printable
or base64 (a simple shell script would do the job), and then you can
send international text like this:

 $ mail -s Hello -e quoted-printable recipi...@example.com < message.txt
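
To make the conversion step concrete, here is a minimal sketch in C of
a quoted-printable encoder.  It is not part of the patches, and the
name qp_encode() is hypothetical.  It assumes the input uses '\n' as
line terminator and leaves the conversion to CRLF to the mail
transport.

```c
#include <stdio.h>

#define QP_MAXLINE 76	/* longest encoded line, not counting the newline */

/*
 * Hypothetical helper, not part of the posted patches.
 * Writes the quoted-printable form of buf[0..len-1] to fo.
 */
void
qp_encode(const unsigned char *buf, size_t len, FILE *fo)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t i, col = 0;

	for (i = 0; i < len; i++) {
		unsigned char c = buf[i];
		int literal;

		if (c == '\n') {
			/* Hard line breaks pass through unchanged. */
			fputc('\n', fo);
			col = 0;
			continue;
		}
		/*
		 * Printable ASCII other than '=' is copied as is.  Space
		 * and tab are copied too, unless they end a line or the
		 * input, where they must be encoded to survive transport.
		 */
		if (c == ' ' || c == '\t')
			literal = (i + 1 < len && buf[i + 1] != '\n');
		else
			literal = (c >= 33 && c <= 126 && c != '=');

		/* Soft line break: leave room for the trailing '='. */
		if (col + (literal ? 1 : 3) > QP_MAXLINE - 1) {
			fputs("=\n", fo);
			col = 0;
		}
		if (literal) {
			fputc(c, fo);
			col++;
		} else {
			fputc('=', fo);
			fputc(hex[c >> 4], fo);
			fputc(hex[c & 0x0F], fo);
			col += 3;
		}
	}
}
```

Every non-ASCII byte costs three output characters, which is why text
that is mostly non-ASCII is usually better served by base64 or by
sending 8bit directly.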


Index: collect.c
===
RCS file: /cvs/src/usr.bin/mail/collect.c,v
retrieving revision 1.34
diff -u -p -r1.34 collect.c
--- collect.c   17 Jan 2014 18:42:30 -  1.34
+++ collect.c   19 Sep 2023 11:40:29 -
@@ -87,7 +87,7 @@ collect(struct header *hp, int printhead
 * refrain from printing a newline after
 * the headers (since some people mind).
 */
-   t = GTO|GSUBJECT|GCC|GNL;
+   t = GTO|GSUBJECT|GENCODING|GCC|GNL;
getsub = 0;
if (hp->h_subject == NULL && value("interactive") != NULL &&
(value("ask") != NULL || value("asksub") != NULL))
Index: def.h
===
RCS file: /cvs/src/usr.bin/mail/def.h,v
retrieving revision 1.17
diff -u -p -r1.17 def.h
--- def.h   28 Jan 2022 06:18:41 -  1.17
+++ def.h   19 Sep 2023 11:40:29 -
@@ -158,12 +158,13 @@ struct headline {
 #define GSUBJECT 2  /* Likewise, Subject: line */
 #define GCC 4   /* And the Cc: line */
 #define GBCC 8   /* And also the Bcc: line */
-#define GMASK   (GTO|GSUBJECT|GCC|GBCC)
+#define GMASK   (GTO|GSUBJECT|GENCODING|GCC|GBCC)
/* Mask of places from whence */
 
 #define GNL 16  /* Print blank line after */
 #define GDEL 32  /* Entity removed from list */
 #define GCOMMA  64  /* detract puts in commas */
+#define GENCODING 128   /* Content-Transfer-Encoding: line */
 
 /*
  * Structure used to pass about the current
@@ -173,6 +174,7 @@ struct header {
struct name *h_to;  /* Dynamic "To:" string */
char *h_from;   /* User-specified "From:" string */
char *h_subject;/* Subject string */
+   char *h_encoding;   /* Content-Transfer-Encoding string */
struct name *h_cc;  /* Carbon copies string */
struct name *h_bcc; /* Blind carbon copies */
struct name *h_smopts;  /* Sendmail options */
Index: extern.h
===
RCS file: /cvs/src/usr.bin/mail/extern.h,v
retrieving revision 1.29
diff -u -p -r1.29 extern.h
--- extern.h    16 Sep 2018 02:38:57 -  1.29
+++ extern.h    19 Sep 2023 11:40:29 -
@@ -163,7 +163,7 @@ void load(char *);
 struct var *
 lookup(char *);
 int mail(struct name *, struct name *, struct name *, struct name *,
-  char *, char *);
+  char *, char *, char *);
 void    mail1(struct header *, int);
 void    makemessage(FILE *, int);
 void    mark(int);
Index: mail.1
===
RCS file: /cvs/src/usr.bin/mail/mail.1,v
retrieving revision 1.83
diff -u -p -r1.83 mail.1
--- mail.1  31 Mar 2022 17:27:25 -  1.83
+++ mail.1  19 Sep 2023 11:40:29 -
@@ -45,6 +45,7 @@
 .Op Fl c Ar list
 .Op Fl r Ar from-addr
 .Op Fl s Ar subject
+.Op Fl e Ar encoding
 .Ar to-addr ...
 .Ek
 .Nm mail
@@ -77,6 +78,8 @@ Causes
 .Nm mail
 to output all sorts of information useful for debugging
 .Nm mail .
+.It Fl e Ar encoding
+Specify a Content-Transfer-Encoding on command line.
 .It Fl E
 Don't send messages with an empty body.
 .It Fl f
Index: main.c
===
RCS file: /cvs/src/usr.bin/mail/main.c,v
retrieving revision 1.35
diff -u -p -r1.35 main.c
--- main.c  26 Jan 2021 18:21:47 -  1.35
+++ main.c  19 Sep 2023 11:40:29 -
@@ -103,6 +103,7 @@ main(int argc, char **argv)
struct name *to, *cc, *bcc, *smopts;
char *fromaddr;
char *subject;
+   char *encoding;
char *ef;
char nosrc = 0;
char *rc;
@@ -136,7 +137,8 @@ main(int argc, char **argv)
smopts = NULL;
fromaddr = NULL;
subject = NULL;
-   while ((i = getopt(argc, argv, "EINb:c:dfinr:s:u:v")) != -1) {
+   encoding = NULL;
+   while ((i = getopt(argc, argv, "EINb:c:de:finr:s:u:v")) != -1) {
switch (i) {
case 'u':
/*
@@ -158,6 +160,10 @@ main(int argc, char **argv)
case 'd':
debug++;
break;
+   case 'e':
+   /* Set 

Re: Reminder of bug in vi and nvi including tested diff

2023-09-07 Thread Walter Alejandro Iglesias
Hi Raf,

On Thu, Sep 07, 2023 at 12:08:00PM +0100, Raf Czlonka wrote:
> On Thu, Sep 07, 2023 at 08:04:43AM BST, Walter Alejandro Iglesias wrote:
> > Dear OpenBSD developers,
> > 
> > On Aug 2 I reported this bug:
> > 
> >   https://marc.info/?l=openbsd-bugs=169100763926909=2
> > 
> > After fiddling around I found a solution that works for both vi base and
> > nvi from ports:
> > 
> >   https://marc.info/?l=openbsd-bugs=16926218514=2
> > 
> > Since nobody answered me in bugs@ I sent a message to ports@ and Cc:
> > Anthony J. Bentley who told me to contact Zhihao Yuan, nvi developer
> > upstream.  I don't use github, I don't have a github account, luckily
> > after searching the web I found an email address of Zhihao.  He
> > understood the issue and answered me with what seems to be a more
> > consistent patch:
> > 
> >   https://marc.info/?l=openbsd-bugs=169277277928008=2
> > 
> > Which, needless to say, also works for both vi in base and nvi from
> > ports.  Below I paste a cvs version of Zhihao's patch for use with vi
> > in base.
> > 
> > So it only remains for some OpenBSD developer here to take a look.  It's
> > not going to take up much of your time; everything has already been
> > chewed up :-).
> 
> Hi Walter,
> 
> This isn't related to the bug per se but it might be useful bit of
> information regardless.
> 
> The nvi port is actually nvi2[0], which is based on the original
> (read Keith Bostic's) nvi, where, in turn, the base vi(1) comes
> from.
> 
> The original nvi is still maintained[1] by Sven Verdoolaege[2] so
> you might want to give him a shout, too.
> 
> [0] https://github.com/lichray/nvi2
> [1] https://repo.or.cz/nvi.git
> [2] https://sites.google.com/a/bostic.com/keithbostic/the-berkeley-vi-editor-home-page

I didn't know that, thanks.  I'd like to send the patch to the
maintainer of this version too, but after downloading a snapshot a
README.1st in the tarball says:

Please do not contact the original authors about bugs you
find in this version. Contact skimo...@kotnet.org instead.
[...] 
New versions will be made available on 
http://www.kotnet.org/~skimo/nvi
[...]

Sven Verdoolaege


It's not clear whom I should contact; the http link is broken, so
the email address is probably broken too.

And the info in the tarball about how to build it is also incomplete.

Perhaps this project is a bit abandoned?


> 
> Regards,
> 
> Raf


-- 
Walter



Reminder of bug in vi and nvi including tested diff

2023-09-07 Thread Walter Alejandro Iglesias
Dear OpenBSD developers,

On Aug 2 I reported this bug:

  https://marc.info/?l=openbsd-bugs=169100763926909=2

After fiddling around I found a solution that works for both vi base and
nvi from ports:

  https://marc.info/?l=openbsd-bugs=16926218514=2

Since nobody answered me in bugs@ I sent a message to ports@ and Cc:
Anthony J. Bentley who told me to contact Zhihao Yuan, nvi developer
upstream.  I don't use github, I don't have a github account, luckily
after searching the web I found an email address of Zhihao.  He
understood the issue and answered me with what seems to be a more
consistent patch:

  https://marc.info/?l=openbsd-bugs=169277277928008=2

Which, needless to say, also works for both vi in base and nvi from
ports.  Below I paste a cvs version of Zhihao's patch for use with vi
in base.

So it only remains for some OpenBSD developer here to take a look.  It's
not going to take up much of your time; everything has already been
chewed up :-).


Zhihao's diff translated to cvs:

Index: vi/v_paragraph.c
===
RCS file: /cvs/src/usr.bin/vi/vi/v_paragraph.c,v
retrieving revision 1.9
diff -u -p -r1.9 v_paragraph.c
--- vi/v_paragraph.c    18 Apr 2017 01:45:35 -  1.9
+++ vi/v_paragraph.c    23 Aug 2023 10:18:39 -
@@ -41,15 +41,20 @@
if (p[0] == '\014') {   \
if (!--cnt) \
goto found; \
+   if (pstate == P_INTEXT && !--cnt)   \
+   goto found; \
continue;   \
}   \
if (p[0] != '.' || len < 2) \
continue;   \
for (lp = VIP(sp)->ps; *lp != '\0'; lp += 2)\
if (lp[0] == p[1] &&\
-   ((lp[1] == ' ' && len == 2) || lp[1] == p[2]) &&\
-   !--cnt) \
-   goto found; \
+   (lp[1] == ' ' && len == 2 || lp[1] == p[2])) {  \
+   if (!--cnt) \
+   goto found; \
+   if (pstate == P_INTEXT && !--cnt)   \
+   goto found; \
+   }   \
 }
 
 /*


-- 
Walter



Re: cwm: add fvwm and tvm as default wm entries

2023-05-16 Thread Walter Alejandro Iglesias
I'm not an OpenBSD developer but, allow me to share my opinion about
this, please.

On May 15 2023, Okan Demirmen wrote:
> On Mon 2023.05.15 at 10:41 +0200, Matthieu Herrb wrote:
> > Last year I mentioned that I think we should retire twm. It's really
> > too old and missing support for the modern window managers hints.

For full ICCCM/EWMH support you already have that modern, updated
and maintained version of fvwm in base :-).

> > 
> > People still using it ...

I started using unix-like systems in 2006; I guess that *decades* before
that twm already fell short for daily desktop use.  But twm has always
been there as part of the default X installation on any unix-like system.
Perhaps because its simplicity is precisely what makes twm useful as a
rescue option?  This is, at least, the use I've made of twm throughout
all these years.

As a side note.  Even being so simple, unlike the other two WMs in base,
twm supports utf-8.  This is now in part dysfunctional since you removed
half of the bitmap fixed "miscellaneous" fonts, which also affects fvwm2
and fvwm3 ports.

> > 
> > Otherwise ok to add this and fix the other WM menus for other window
> > managers (those parts of the configs are already local changes in
> > Xenocara)
> 
> I might argue the opposite, to remove cwm from fvwm and twm restart menus, if
> this inconsistency is a real concern. The entries in fvwm/twm are in the
> (shipped) example config files, whereas below it is, well, there for good
> with no user choice. Heck, how often do people even use this restart wm to
> another WM outside of playing around? Most window managers handle restarts
> differently (regardless of what ICCCM/EWMH says) and even then, crossing
> window managers like this introduces inconsistencies. It's fine for playing
> around I suppose, but is it really a demanded "workflow"?
> 
> > > 
> > > PS:  fvwm and twm menus have more programs we don't ship, e.g. "wm2", and
> > >   twm dies when failing to execute them (fvwm and cwm keeps running);
> > >   do we want to keep those default-broken entries around?
> 
> I'd support removing them.

In my experience, restarting from one window manager to another has *never*
worked fine.


-- 
Walter



Re: libX11 patch for X*IfEvent() API issues, take #2

2022-11-14 Thread Walter Alejandro Iglesias
Hi Matthieu,

On Nov 11 2022, Walter Alejandro Iglesias wrote:
> On Nov 11 2022, Matthieu Herrb wrote:
> > On Fri, Nov 11, 2022 at 03:17:21PM +0100, Walter Alejandro Iglesias wrote:
> > > On Nov 11 2022, Matthieu Herrb wrote:
> > > > Hi,
> > > > 
> > > > So the patch provided by Adam Jackson upstream is completely buggy.
> > > > 
> > > > - the logic to set up the new locking function was busted
> > > > - I've observed cases where even the XGetIfEvent() function gets
> > > >   re-entered. So the flag does in fact need to be a counter.
> > > > 
> > > > New patch which works for me with fvwm2...
> > > > 
> > > > Testing is welcome...
> > > 
> > > Unfortunately I still can reproduce the bug.
> > > 
> > > This patch also makes firefox crash, (I'm not sure if it's because of
> > > firefox or cwm).  After reinstalling xenocara from the snapshots
> > > packages I could run firefox again.  
> > > 
> > > fvwm is giving a lot of problems lately, there's also this drm bug I
> > > reported: 
> > > 
> > >   https://marc.info/?l=openbsd-bugs=166332904632232=2
> > > 
> > > Now FvwmIcons is crashing on fvwm3...  I'm giving up with fvwm, I'm more
> > > comfortable with cwm.
> > 
> > Ok, I've figured out the cwm + firefox issue. There's also a
> > XInternalLock that needs to be neutered in the X*IfEvent() callbacks.
> > 
> > New diff, on which I've left my debug printfs. If you get other
> > crashes it may be interesting to look in .xsession-errors...
> 
> This patch solved the firefox cwm issue.  Firefox shows this when
> it starts: 
> 
>   ~$ firefox
>   XXX _XLockDisplay 1
>   XXX _XIfEventInternalLockDisplay 1
>   XXX _XIfEventUnlockDisplay 0
> 
> fvwm2/3 still crashing.

Today, after upgrading to the latest snapshot, I can no longer reproduce
this fvwm bug.  I don't know why.

> 
> 
> 
> > 
> > Index: Makefile.bsd-wrapper
> > ===
> > RCS file: /cvs/OpenBSD/xenocara/lib/libX11/Makefile.bsd-wrapper,v
> > retrieving revision 1.29
> > diff -u -p -u -r1.29 Makefile.bsd-wrapper
> > --- Makefile.bsd-wrapper    3 Sep 2022 06:55:25 -   1.29
> > +++ Makefile.bsd-wrapper    11 Nov 2022 15:13:59 -
> > @@ -14,7 +14,6 @@ SHARED_LIBS=  X11 18.0 X11_xcb 2.0
> >  
> >  CONFIGURE_ARGS= --enable-tcp-transport --enable-unix-transport --enable-ipv6 \
> > --disable-composecache \
> > -   --disable-thread-safety-constructor \
> > --without-xmlto --without-fop --without-xsltproc
> >  
> >  .include 
> > Index: include/X11/Xlibint.h
> > ===
> > RCS file: /cvs/OpenBSD/xenocara/lib/libX11/include/X11/Xlibint.h,v
> > retrieving revision 1.15
> > diff -u -p -u -r1.15 Xlibint.h
> > --- include/X11/Xlibint.h   21 Feb 2022 08:01:24 -  1.15
> > +++ include/X11/Xlibint.h   11 Nov 2022 15:14:00 -
> > @@ -207,6 +207,7 @@ struct _XDisplay
> >  
> > XIOErrorExitHandler exit_handler;
> > void *exit_handler_data;
> > +CARD32 in_ifevent;
> >  };
> >  
> >  #define XAllocIDs(dpy,ids,n) (*(dpy)->idlist_alloc)(dpy,ids,n)
> > Index: src/ChkIfEv.c
> > ===
> > RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/ChkIfEv.c,v
> > retrieving revision 1.4
> > diff -u -p -u -r1.4 ChkIfEv.c
> > --- src/ChkIfEv.c   30 May 2011 19:19:38 -  1.4
> > +++ src/ChkIfEv.c   11 Nov 2022 15:14:00 -
> > @@ -49,6 +49,7 @@ Bool XCheckIfEvent (
> > unsigned long qe_serial = 0;
> > int n;  /* time through count */
> >  
> > +dpy->in_ifevent++;
> >  LockDisplay(dpy);
> > prev = NULL;
> > for (n = 3; --n >= 0;) {
> > @@ -60,6 +61,7 @@ Bool XCheckIfEvent (
> > *event = qelt->event;
> > _XDeq(dpy, prev, qelt);
> > _XStoreEventCookie(dpy, event);
> > +dpy->in_ifevent = False;
> > UnlockDisplay(dpy);
> > return True;
> > }
> > @@ -78,6 +80,7 @@ Bool XCheckIfEvent (
> > /* another thread has snatched this event */
> > prev = NULL;
> > }
> > +dpy->in_ifevent--;
> > UnlockDisplay(dpy);
> > r

Re: libX11 patch for X*IfEvent() API issues, take #2

2022-11-11 Thread Walter Alejandro Iglesias
On Nov 11 2022, Matthieu Herrb wrote:
> On Fri, Nov 11, 2022 at 03:17:21PM +0100, Walter Alejandro Iglesias wrote:
> > On Nov 11 2022, Matthieu Herrb wrote:
> > > Hi,
> > > 
> > > So the patch provided by Adam Jackson upstream is completely buggy.
> > > 
> > > - the logic to set up the new locking function was busted
> > > - I've observed cases where even the XGetIfEvent() function gets
> > >   re-entered. So the flag does in fact need to be a counter.
> > > 
> > > New patch which works for me with fvwm2...
> > > 
> > > Testing is welcome...
> > 
> > Unfortunately I still can reproduce the bug.
> > 
> > This patch also makes firefox crash, (I'm not sure if it's because of
> > firefox or cwm).  After reinstalling xenocara from the snapshots
> > packages I could run firefox again.  
> > 
> > fvwm is giving a lot of problems lately, there's also this drm bug I
> > reported: 
> > 
> >   https://marc.info/?l=openbsd-bugs=166332904632232=2
> > 
> > Now FvwmIcons is crashing on fvwm3...  I'm giving up with fvwm, I'm more
> > comfortable with cwm.
> 
> Ok, I've figured out the cwm + firefox issue. There's also a
> XInternalLock that needs to be neutered in the X*IfEvent() callbacks.
> 
> New diff, on which I've left my debug printfs. If you get other
> crashes it may be interesting to look in .xsession-errors...

This patch solved the firefox cwm issue.  Firefox shows this when
it starts: 

  ~$ firefox
  XXX _XLockDisplay 1
  XXX _XIfEventInternalLockDisplay 1
  XXX _XIfEventUnlockDisplay 0

fvwm2/3 still crashing.



> 
> Index: Makefile.bsd-wrapper
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/Makefile.bsd-wrapper,v
> retrieving revision 1.29
> diff -u -p -u -r1.29 Makefile.bsd-wrapper
> --- Makefile.bsd-wrapper  3 Sep 2022 06:55:25 -   1.29
> +++ Makefile.bsd-wrapper  11 Nov 2022 15:13:59 -
> @@ -14,7 +14,6 @@ SHARED_LIBS=X11 18.0 X11_xcb 2.0
>  
>  CONFIGURE_ARGS= --enable-tcp-transport --enable-unix-transport --enable-ipv6 \
>   --disable-composecache \
> - --disable-thread-safety-constructor \
>   --without-xmlto --without-fop --without-xsltproc
>  
>  .include 
> Index: include/X11/Xlibint.h
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/include/X11/Xlibint.h,v
> retrieving revision 1.15
> diff -u -p -u -r1.15 Xlibint.h
> --- include/X11/Xlibint.h 21 Feb 2022 08:01:24 -  1.15
> +++ include/X11/Xlibint.h 11 Nov 2022 15:14:00 -
> @@ -207,6 +207,7 @@ struct _XDisplay
>  
>   XIOErrorExitHandler exit_handler;
>   void *exit_handler_data;
> +CARD32 in_ifevent;
>  };
>  
>  #define XAllocIDs(dpy,ids,n) (*(dpy)->idlist_alloc)(dpy,ids,n)
> Index: src/ChkIfEv.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/ChkIfEv.c,v
> retrieving revision 1.4
> diff -u -p -u -r1.4 ChkIfEv.c
> --- src/ChkIfEv.c 30 May 2011 19:19:38 -  1.4
> +++ src/ChkIfEv.c 11 Nov 2022 15:14:00 -
> @@ -49,6 +49,7 @@ Bool XCheckIfEvent (
>   unsigned long qe_serial = 0;
>   int n;  /* time through count */
>  
> +dpy->in_ifevent++;
>  LockDisplay(dpy);
>   prev = NULL;
>   for (n = 3; --n >= 0;) {
> @@ -60,6 +61,7 @@ Bool XCheckIfEvent (
>   *event = qelt->event;
>   _XDeq(dpy, prev, qelt);
>   _XStoreEventCookie(dpy, event);
> +dpy->in_ifevent = False;
>   UnlockDisplay(dpy);
>   return True;
>   }
> @@ -78,6 +80,7 @@ Bool XCheckIfEvent (
>   /* another thread has snatched this event */
>   prev = NULL;
>   }
> +dpy->in_ifevent--;
>   UnlockDisplay(dpy);
>   return False;
>  }
> Index: src/IfEvent.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/IfEvent.c,v
> retrieving revision 1.4
> diff -u -p -u -r1.4 IfEvent.c
> --- src/IfEvent.c 30 May 2011 19:19:38 -  1.4
> +++ src/IfEvent.c 11 Nov 2022 15:14:00 -
> @@ -48,6 +48,7 @@ XIfEvent (
>   register _XQEvent *qelt, *prev;
>   unsigned long qe_serial = 0;
>  
> +dpy->in_ifevent++;
>  LockDisplay(dpy);
>   prev = NULL;
>   while (1) {
> @@ -59,6 +60,7 @@ XIfEvent (
>   *event = qelt->event;

Re: libX11 patch for X*IfEvent() API issues, take #2

2022-11-11 Thread Walter Alejandro Iglesias
On Nov 11 2022, Matthieu Herrb wrote:
> Hi,
> 
> So the patch provided by Adam Jackson upstream is completely buggy.
> 
> - the logic to set up the new locking function was busted
> - I've observed cases where even the XGetIfEvent() function gets
>   re-entered. So the flag does in fact need to be a counter.
> 
> New patch which works for me with fvwm2...
> 
> Testing is welcome...
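
A minimal sketch of why a boolean flag is not enough once the function can
be re-entered.  This is not libX11 code: struct display and every function
name below are hypothetical stand-ins, assumed only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the display structure; not the real _XDisplay. */
struct display {
	bool in_cb_flag;	/* boolean variant */
	unsigned int in_cb_count;	/* counter variant */
};

/* Boolean variant: leaving any call clears the flag for every caller. */
void flag_enter(struct display *d) { d->in_cb_flag = true; }
void flag_leave(struct display *d) { d->in_cb_flag = false; }

/* Counter variant: each call only undoes its own increment. */
void count_enter(struct display *d) { d->in_cb_count++; }
void count_leave(struct display *d) { d->in_cb_count--; }

/*
 * Simulate an outer call whose callback re-enters once, and report
 * whether the "inside a callback" state is still set after the nested
 * call has returned but while the outer call is still running.
 */
bool
flag_survives_nested_call(void)
{
	struct display d = { false, 0 };

	flag_enter(&d);		/* outer call */
	flag_enter(&d);		/* nested call from the callback */
	flag_leave(&d);		/* nested call returns */
	return d.in_cb_flag;
}

bool
count_survives_nested_call(void)
{
	struct display d = { false, 0 };

	count_enter(&d);	/* outer call */
	count_enter(&d);	/* nested call from the callback */
	count_leave(&d);	/* nested call returns */
	return d.in_cb_count > 0;
}
```

With one level of nesting the flag is already cleared while the outer call
is still running, whereas the counter stays above zero until the outermost
call returns.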

Unfortunately I can still reproduce the bug.

This patch also makes firefox crash (I'm not sure whether it's because of
firefox or cwm).  After reinstalling xenocara from the snapshot
packages I could run firefox again.

fvwm has been giving a lot of problems lately; there's also this drm bug I
reported:

  https://marc.info/?l=openbsd-bugs=166332904632232=2

Now FvwmIcons is crashing on fvwm3...  I'm giving up with fvwm, I'm more
comfortable with cwm.


> 
> Index: Makefile.bsd-wrapper
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/Makefile.bsd-wrapper,v
> retrieving revision 1.29
> diff -u -p -u -r1.29 Makefile.bsd-wrapper
> --- Makefile.bsd-wrapper  3 Sep 2022 06:55:25 -   1.29
> +++ Makefile.bsd-wrapper  11 Nov 2022 12:10:16 -
> @@ -14,7 +14,6 @@ SHARED_LIBS=X11 18.0 X11_xcb 2.0
>  
>  CONFIGURE_ARGS= --enable-tcp-transport --enable-unix-transport --enable-ipv6 
> \
>   --disable-composecache \
> - --disable-thread-safety-constructor \
>   --without-xmlto --without-fop --without-xsltproc
>  
>  .include 
> Index: include/X11/Xlibint.h
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/include/X11/Xlibint.h,v
> retrieving revision 1.15
> diff -u -p -u -r1.15 Xlibint.h
> --- include/X11/Xlibint.h 21 Feb 2022 08:01:24 -  1.15
> +++ include/X11/Xlibint.h 11 Nov 2022 12:10:17 -
> @@ -207,6 +207,7 @@ struct _XDisplay
>  
>   XIOErrorExitHandler exit_handler;
>   void *exit_handler_data;
> +CARD32 in_ifevent;
>  };
>  
>  #define XAllocIDs(dpy,ids,n) (*(dpy)->idlist_alloc)(dpy,ids,n)
> Index: src/ChkIfEv.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/ChkIfEv.c,v
> retrieving revision 1.4
> diff -u -p -u -r1.4 ChkIfEv.c
> --- src/ChkIfEv.c 30 May 2011 19:19:38 -  1.4
> +++ src/ChkIfEv.c 11 Nov 2022 12:10:17 -
> @@ -49,6 +49,7 @@ Bool XCheckIfEvent (
>   unsigned long qe_serial = 0;
>   int n;  /* time through count */
>  
> +dpy->in_ifevent++;
>  LockDisplay(dpy);
>   prev = NULL;
>   for (n = 3; --n >= 0;) {
> @@ -60,6 +61,7 @@ Bool XCheckIfEvent (
>   *event = qelt->event;
>   _XDeq(dpy, prev, qelt);
>   _XStoreEventCookie(dpy, event);
> +dpy->in_ifevent = False;
>   UnlockDisplay(dpy);
>   return True;
>   }
> @@ -78,6 +80,7 @@ Bool XCheckIfEvent (
>   /* another thread has snatched this event */
>   prev = NULL;
>   }
> +dpy->in_ifevent--;
>   UnlockDisplay(dpy);
>   return False;
>  }
> Index: src/IfEvent.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/IfEvent.c,v
> retrieving revision 1.4
> diff -u -p -u -r1.4 IfEvent.c
> --- src/IfEvent.c 30 May 2011 19:19:38 -  1.4
> +++ src/IfEvent.c 11 Nov 2022 12:10:17 -
> @@ -48,6 +48,7 @@ XIfEvent (
>   register _XQEvent *qelt, *prev;
>   unsigned long qe_serial = 0;
>  
> +dpy->in_ifevent++;
>  LockDisplay(dpy);
>   prev = NULL;
>   while (1) {
> @@ -59,6 +60,7 @@ XIfEvent (
>   *event = qelt->event;
>   _XDeq(dpy, prev, qelt);
>   _XStoreEventCookie(dpy, event);
> +dpy->in_ifevent--;
>   UnlockDisplay(dpy);
>   return 0;
>   }
> Index: src/OpenDis.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/OpenDis.c,v
> retrieving revision 1.12
> diff -u -p -u -r1.12 OpenDis.c
> --- src/OpenDis.c 28 Nov 2020 14:39:48 -  1.12
> +++ src/OpenDis.c 11 Nov 2022 12:10:17 -
> @@ -189,6 +189,7 @@ XOpenDisplay (
>   dpy->xcmisc_opcode  = 0;
>   dpy->xkb_info   = NULL;
>   dpy->exit_handler_data  = NULL;
> +dpy->in_ifevent = 0;
>  
>  /*
>   * Setup other information in this display structure.
> Index: src/PeekIfEv.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/PeekIfEv.c,v
> retrieving revision 1.3
> diff -u -p -u -r1.3 PeekIfEv.c
> --- src/PeekIfEv.c30 May 2011 19:19:38 -  1.3
> +++ src/PeekIfEv.c11 Nov 2022 12:10:17 -
> @@ -49,6 +49,7 @@ XPeekIfEvent (

Re: libX11 patch for X*IfEvent() API issues

2022-11-02 Thread Walter Alejandro Iglesias
Hello Matthieu,

On Nov 01 2022, Matthieu Herrb wrote:
> Hi,
> 
> here's a libX11 patch that needs some wide testing, especially from
> people who have experienced issues with various applications (fvwm[23]
> from ports, Motif applications with Drag'n'Drop,..) with the upgrade
> to libX11 1.8.1, before I added the --disable-thread-safety-constructor
> option to the build.
> 
> This patch allows the callback functions from X*IfEvent() class of
> Xlib functions to re-enter libX11, by making the
> XLockDisplay()/XUnlockDisplay() functions void while running the
> callbacks.
> 
> I've tested fvwm2 with this patch and it seems to fix it (but since
> the issue was never easy to reproduce I'm not 100% confident).
> 
> Please apply and report

For me fvwm2 and fvwm3 crash again.

The bug is the one I explained to you in a private message:

> I'm able to make fvwm2 and fvwm3 abort, in this case with a clean X
> exit, by including any xapp with the -iconic option in the InitFunction or
> the StartFunction:
> 
>  AddToFunc InitFunction I Exec exec xconsole -iconic
> 
> If I just remove the "-iconic" option it doesn't happen.  Besides,
> iconifying to FvwmIconMan doesn't do the trick, it's necessary to NOT
> have the following option enabled (as is the case in latest default
> fvwm2 config):
> 
>  #Style * !Icon
> 
> 
> So, unless it happens only to me, as always, you should be able to reproduce it
> just by copying the default fvwm2 config to your home directory and adding
> three lines.
> 
> $ cp -r /usr/local/share/fvwm/default-config ~/.fvwm
> $ echo "AddToFunc InitFunction I Exec exec xconsole -iconic" >> ~/.fvwm/config
> $ echo "Style * Icon" >> ~/.fvwm/config



   ***

> 
> cd /usr/xenocara/lib/libX11
> patch -p0 -E < /path/to/this/patch
> doas make -f Makefile.bsd-wrapper obj
> doas make -f Makefile.bsd-wrapper build
> 
> Then restart X with the applications that were crashing last August.
> 
> Index: Makefile.bsd-wrapper
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/Makefile.bsd-wrapper,v
> retrieving revision 1.29
> diff -u -p -u -r1.29 Makefile.bsd-wrapper
> --- Makefile.bsd-wrapper  3 Sep 2022 06:55:25 -   1.29
> +++ Makefile.bsd-wrapper  1 Nov 2022 09:21:34 -
> @@ -14,7 +14,6 @@ SHARED_LIBS=X11 18.0 X11_xcb 2.0
>  
>  CONFIGURE_ARGS= --enable-tcp-transport --enable-unix-transport --enable-ipv6 
> \
>   --disable-composecache \
> - --disable-thread-safety-constructor \
>   --without-xmlto --without-fop --without-xsltproc
>  
>  .include 
> Index: include/X11/Xlibint.h
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/include/X11/Xlibint.h,v
> retrieving revision 1.15
> diff -u -p -u -r1.15 Xlibint.h
> --- include/X11/Xlibint.h 21 Feb 2022 08:01:24 -  1.15
> +++ include/X11/Xlibint.h 1 Nov 2022 09:21:34 -
> @@ -207,6 +207,7 @@ struct _XDisplay
>  
>   XIOErrorExitHandler exit_handler;
>   void *exit_handler_data;
> +Bool in_ifevent;
>  };
>  
>  #define XAllocIDs(dpy,ids,n) (*(dpy)->idlist_alloc)(dpy,ids,n)
> Index: src/ChkIfEv.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/ChkIfEv.c,v
> retrieving revision 1.4
> diff -u -p -u -r1.4 ChkIfEv.c
> --- src/ChkIfEv.c 30 May 2011 19:19:38 -  1.4
> +++ src/ChkIfEv.c 1 Nov 2022 09:21:35 -
> @@ -50,6 +50,7 @@ Bool XCheckIfEvent (
>   int n;  /* time through count */
>  
>  LockDisplay(dpy);
> +dpy->in_ifevent = True;
>   prev = NULL;
>   for (n = 3; --n >= 0;) {
>   for (qelt = prev ? prev->next : dpy->head;
> @@ -60,6 +61,7 @@ Bool XCheckIfEvent (
>   *event = qelt->event;
>   _XDeq(dpy, prev, qelt);
>   _XStoreEventCookie(dpy, event);
> +dpy->in_ifevent = False;
>   UnlockDisplay(dpy);
>   return True;
>   }
> @@ -78,6 +80,7 @@ Bool XCheckIfEvent (
>   /* another thread has snatched this event */
>   prev = NULL;
>   }
> +dpy->in_ifevent = False;
>   UnlockDisplay(dpy);
>   return False;
>  }
> Index: src/IfEvent.c
> ===
> RCS file: /cvs/OpenBSD/xenocara/lib/libX11/src/IfEvent.c,v
> retrieving revision 1.4
> diff -u -p -u -r1.4 IfEvent.c
> --- src/IfEvent.c 30 May 2011 19:19:38 -  1.4
> +++ src/IfEvent.c 1 Nov 2022 09:21:35 -
> @@ -49,6 +49,7 @@ XIfEvent (
>   unsigned long qe_serial = 0;
>  
>  LockDisplay(dpy);
> +dpy->in_ifevent = True;
>   prev = NULL;
>   while (1) {
>   for (qelt = prev ? prev->next : dpy->head;
> @@ -59,6 +60,7 @@ XIfEvent (
>   *event = qelt->event;
>   _XDeq(dpy, prev, qelt);
>   

Re: cwm: do not overlap menu entries

2022-10-14 Thread Walter Alejandro Iglesias
Hello Okan,

On Oct 13 2022, Okan Demirmen wrote:
> **incomplete** but i think the right direction to use ascent+descent,
> however i've missed something, so take this with a sea full of salt (and
> yes, i'm still alive...). the menu rect is too big (by a factor of
> entries i think) now and messes with other calculations dealing with ptr
> selections/movement; i just need find the other assumptions made with this
> + 1 stuff and if i used the right surgical hammer.

I just tried your patch and the issue disappeared.

Thank you!


> 
> Index: menu.c
> ===
> RCS file: /home/open/cvs/xenocara/app/cwm/menu.c,v
> retrieving revision 1.109
> diff -u -p -r1.109 menu.c
> --- menu.c27 Feb 2020 14:56:39 -  1.109
> +++ menu.c14 Oct 2022 01:40:30 -
> @@ -355,7 +355,7 @@ menu_draw(struct menu_ctx *mc, struct me
>   XftTextExtentsUtf8(X_Dpy, sc->xftfont,
>   (const FcChar8*)mc->dispstr, strlen(mc->dispstr), &extents);
>   mc->geom.w = extents.xOff;
> - mc->geom.h = sc->xftfont->height + 1;
> + mc->geom.h = sc->xftfont->ascent + sc->xftfont->descent;
>   mc->num = 1;
>  
>   TAILQ_FOREACH(mi, resultq, resultentry) {
> @@ -364,7 +364,7 @@ menu_draw(struct menu_ctx *mc, struct me
>   (const FcChar8*)mi->print,
>   MIN(strlen(mi->print), MENU_MAXENTRY), &extents);
>   mc->geom.w = MAX(mc->geom.w, extents.xOff);
> - mc->geom.h += sc->xftfont->height + 1;
> + mc->geom.h += sc->xftfont->ascent + sc->xftfont->descent;
>   mc->num++;
>   }
>  
> @@ -403,7 +403,7 @@ menu_draw(struct menu_ctx *mc, struct me
>   (const FcChar8*)mc->dispstr, strlen(mc->dispstr));
>  
>   TAILQ_FOREACH(mi, resultq, resultentry) {
> - int y = n * (sc->xftfont->height + 1) + sc->xftfont->ascent + 1;
> + int y = n * sc->xftfont->height + sc->xftfont->ascent;
>  
>   /* Stop drawing when menu doesn't fit inside the screen. */
>   if (mc->geom.y + y > area.h)
> @@ -435,12 +435,12 @@ menu_draw_entry(struct menu_ctx *mc, str
>  
>   color = (active) ? CWM_COLOR_MENU_FG : CWM_COLOR_MENU_BG;
>   XftDrawRect(mc->xftdraw, &sc->xftcolor[color], 0,
> - (sc->xftfont->height + 1) * entry, mc->geom.w,
> - (sc->xftfont->height + 1) + sc->xftfont->descent);
> + sc->xftfont->height * entry, mc->geom.w,
> + sc->xftfont->ascent + sc->xftfont->descent);
>   color = (active) ? CWM_COLOR_MENU_FONT_SEL : CWM_COLOR_MENU_FONT;
>   XftDrawStringUtf8(mc->xftdraw,
>   &sc->xftcolor[color], sc->xftfont,
> - 0, (sc->xftfont->height + 1) * entry + sc->xftfont->ascent + 1,
> + 0, sc->xftfont->height * entry + sc->xftfont->ascent,
>   (const FcChar8*)mi->print, strlen(mi->print));
>  }
>  
> @@ -487,11 +487,11 @@ menu_calc_entry(struct menu_ctx *mc, int
>   struct screen_ctx   *sc = mc->sc;
>   int  entry;
>  
> - entry = y / (sc->xftfont->height + 1);
> + entry = y / (sc->xftfont->ascent + sc->xftfont->descent);
>  
>   /* in bounds? */
>   if (x < 0 || x > mc->geom.w || y < 0 ||
> - y > (sc->xftfont->height + 1) * mc->num ||
> + y > (sc->xftfont->ascent + sc->xftfont->descent) * mc->num ||
>   entry < 0 || entry >= mc->num)
>   entry = -1;
>  
> 
> 



Re: cwm: do not overlap menu entries

2022-10-13 Thread Walter Alejandro Iglesias
On Oct 13 2022, Walter Alejandro Iglesias wrote:
> On Oct 13 2022, Klemens Nanni wrote:
> > On Thu, Oct 13, 2022 at 08:28:50AM -0400, Okan Demirmen wrote:
> > > And I keep missing it! I can't reproduce this - can you share the font
> > > you're using maybe?
> > 
> > Whatever is the default, I never fiddled with fonts in X, no xorg.conf,
> > `cwm -c/dev/null' shows the glitch for me on a ThinkPad X230.
> > 
> > 
> 
> In my case I can reproduce it with TrueType fonts (eg DejaVu Monospace),

Not with all TrueType fonts and sizes.  It depends on how each font
distributes vertical space (especially below).  Here are some fonts and
sizes with which I can reproduce the issue:

 DejaVu Monospace 15 or 18
 FreeMono 15 or 18
 Liberation Mono all sizes

Liberation Mono is the most affected because it is the least vertically
centered.
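
The overlap can be sketched with plain integer arithmetic.  This assumes,
as the quoted menu.c diff in this thread suggests, that each row is drawn
at a fixed stride while its highlight rectangle may be taller than that
stride.  The function names and the sample numbers are hypothetical, not
taken from cwm or from any real font.

```c
/*
 * Number of pixels by which a row's rectangle spills into the next row
 * when rows are placed every `stride` pixels but drawn `rect_h` high.
 */
int
row_overlap(int stride, int rect_h)
{
	return rect_h > stride ? rect_h - stride : 0;
}

/*
 * Map a pointer y coordinate to a menu entry index, using the same row
 * height that was used for drawing.  Returns -1 when out of bounds.
 */
int
entry_at(int y, int row_h, int num)
{
	if (row_h <= 0 || y < 0 || y >= row_h * num)
		return -1;
	return y / row_h;
}
```

For example, with ascent 12, descent 4 and height 17, a stride of
height + 1 = 18 combined with a rectangle of (height + 1) + descent = 22
spills 4 pixels into the next row, while using ascent + descent = 16 for
both gives no overlap.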


-- 
Walter



Re: cwm: do not overlap menu entries

2022-10-13 Thread Walter Alejandro Iglesias
On Oct 13 2022, Klemens Nanni wrote:
> On Thu, Oct 13, 2022 at 08:28:50AM -0400, Okan Demirmen wrote:
> > And I keep missing it! I can't reproduce this - can you share the font
> > you're using maybe?
> 
> Whatever is the default, I never fiddled with fonts in X, no xorg.conf,
> `cwm -c/dev/null' shows the glitch for me on a ThinkPad X230.
> 
> 

In my case I can reproduce it with TrueType fonts (eg DejaVu Monospace),
with bitmap fonts (eg Fixed) it doesn't happen.  I observe the overlapping
in any menu, not only the windows one.

Months ago I tried to fix this issue by playing with the "height + 1"
values, but since all the cwm patches I've been sending lately were ignored I
lost interest.  By the way, I'm glad Okan is still alive.  :-)


-- 
Walter



cwm(1): another bug in cycling windows [PATCH]

2019-05-31 Thread Walter Alejandro Iglesias
If you kill a client and the cursor happens to land on the root
window, no window has focus.  If you then Alt-Tab to cycle, the window
that gets the pointer is not the last focused one but the one before
it.

Index: client.c
===
RCS file: /cvs/xenocara/app/cwm/client.c,v
retrieving revision 1.255
diff -u -p -r1.255 client.c
--- client.c7 Mar 2019 14:28:17 -   1.255
+++ client.c1 May 2019 13:04:04 -
@@ -722,7 +722,16 @@ client_cycle(struct screen_ctx *sc, int 
newcc->ptr.x = newcc->geom.w / 2;
newcc->ptr.y = newcc->geom.h / 2;
}
-   client_ptrwarp(newcc);
+
+   /* When no client is active warp pointer to last active */
+   if (oldcc->flags & (CLIENT_ACTIVE))
+   client_ptrwarp(newcc);
+   else if (oldcc->flags & (CLIENT_SKIP_CYCLE))
+   client_ptrwarp(newcc);
+   else {
+   client_raise(oldcc);
+   client_ptrwarp(oldcc);
+   }
 }
 
 static struct client_ctx *



Re: manpage text width

2018-03-31 Thread Walter Alejandro Iglesias
Hi Ingo,

In article <20180329235743.ge59...@athene.usta.de> Ingo Schwarze wrote:
> > Can I do anything to fix this?
> 
> Yes.
> 

I've always wondered why groff did that nonsense with man pages.  I
remember discussing this same issue in groff mailing lists years ago
(with E. Raymond).

In my opinion, thank you Ingo or whoever had the good idea of not doing
the same with mandoc.


Walter



Re: Proposal for sshd_config(5) man page

2017-10-13 Thread Walter Alejandro Iglesias
In article <20171013160142.gb48...@symphytum.spacehopper.org> Stuart Henderson
wrote:
> I had an OK from Ingo for mine, but I prefer your version. OK with me!
> 

One more (funny) thing.

After reading the man page, besides reading some info on the OpenSSH
site, I googled a bit about the issue.  In some Ubuntu forum someone was
asking what the entry "PermitRootLogin without-password" in his
sshd_config file really meant.  He thought he was allowing root to log
in with a blank password.

I mean, don't fully trust the effectiveness of self-explanatory
keywords. :-)



Re: Proposal for sshd_config(5) man page

2017-10-13 Thread Walter Alejandro Iglesias
In article <20171013145400.GA82524@harkle> Jason McIntyre <j...@kerhand.co.uk> 
wrote:
> On Fri, Oct 13, 2017 at 02:01:17PM +0100, Stuart Henderson wrote:
> > On 2017/10/13 12:57, Walter Alejandro Iglesias wrote:
> > > In sshd_config(5), to avoid confusion with PermitRootLogin options.
> > > 
> > > Original:
> > > 
> > >   If this option is set to *prohibit-password* or *without-password*,
> > >   password and keyboard-interactive authentication are disabled for
> > >   root.
> > > 
> > > Proposed:
> > > 
> > >   If this option is set to *prohibit-password* (renamed from
> > >   *without-password* to avoid ambiguity, both valid) only non
> > >   keyboard-interactive authentication (public-key, hostbased and GSSAPI)
> > >   is allowed for root.
> > 
> > How about a briefer alternative that points people towards the
> > more self-explanatory option keyword?
> > 
> > Index: sshd_config.5
> > ===
> > RCS file: /cvs/src/usr.bin/ssh/sshd_config.5,v
> > retrieving revision 1.254
> > diff -u -p -r1.254 sshd_config.5
> > --- sshd_config.5 9 Oct 2017 20:12:51 -   1.254
> > +++ sshd_config.5 13 Oct 2017 12:59:14 -
> > @@ -1198,10 +1198,11 @@ The default is
> >  .Cm prohibit-password .
> >  .Pp
> >  If this option is set to
> > -.Cm prohibit-password
> > -or
> > -.Cm without-password ,
> > +.Cm prohibit-password ,
> >  password and keyboard-interactive authentication are disabled for root.
> > +.Cm without-password
> > +is a deprecated alias for
> > +.Cm prohibit-password .
> >  .Pp
> >  If this option is set to
> >  .Cm forced-commands-only ,
> > 
> 
> i agree that we should not try to list all the other types that are
> valid, since it means one more thing to remember when things change.
> and means adding more text.
> 
> i'm fine with your diff, but couldn't resist having a stab myself:

The first paragraph is the most important.  I like this version.


> 
> Index: sshd_config.5
> ===
> RCS file: /cvs/src/usr.bin/ssh/sshd_config.5,v
> retrieving revision 1.254
> diff -u -r1.254 sshd_config.5
> --- sshd_config.5   9 Oct 2017 20:12:51 -   1.254
> +++ sshd_config.5   13 Oct 2017 14:52:03 -
> @@ -1190,7 +1190,6 @@
>  The argument must be
>  .Cm yes ,
>  .Cm prohibit-password ,
> -.Cm without-password ,
>  .Cm forced-commands-only ,
>  or
>  .Cm no .
> @@ -1199,8 +1198,8 @@
>  .Pp
>  If this option is set to
>  .Cm prohibit-password
> -or
> -.Cm without-password ,
> +(or its deprecated alias,
> +.Cm without-password ) ,
>  password and keyboard-interactive authentication are disabled for root.
>  .Pp
>  If this option is set to
> 
> 



Re: Proposal for sshd_config(5) man page

2017-10-13 Thread Walter Alejandro Iglesias
Hi Stuart,

On Fri, Oct 13, 2017 at 02:01:17PM +0100, Stuart Henderson wrote:
> How about a briefer alternative that points people towards the
> more self-explanatory option keyword?

Or even better, modify the first paragraph to make it clear from the
very start that there are *four* arguments, not five (if my English fails
let me know, please):

To change this:

   PermitRootLogin
Specifies whether root can log in using ssh(1).  The argument
must be yes, prohibit-password, without-password,
forced-commands-only, or no.  The default is prohibit-password.

for this:

   PermitRootLogin
Specifies whether root can log in using ssh(1).  The argument
must be yes, prohibit-password (late without-password),
forced-commands-only, or no.  The default is prohibit-password.


I still think some redundancy in the second paragraph is welcome to
leave the reader no doubt about what exactly each option allows and
prohibits.  Without that clarification, when you get to the third
paragraph:

   If this option is set to forced-commands-only, root login with public
   key authentication will be allowed, but only if the command option...

you may wonder if prohibit-password allows public key authentication.
At least that's what happened to me. :-)


New version:


--- sshd_config.5.orig  Fri Oct 13 16:23:06 2017
+++ sshd_config.5   Fri Oct 13 16:20:34 2017
@@ -1189,8 +1189,8 @@
 .Xr ssh 1 .
 The argument must be
 .Cm yes ,
-.Cm prohibit-password ,
-.Cm without-password ,
+.Cm prohibit-password
+.Pq late without-password ,
 .Cm forced-commands-only ,
 or
 .Cm no .
@@ -1199,9 +1199,8 @@
 .Pp
 If this option is set to
 .Cm prohibit-password
-or
-.Cm without-password ,
-password and keyboard-interactive authentication are disabled for root.
+(without-password is still valid) only non keyboard-interactive
+authentication (public-key, hostbased and GSSAPI) is allowed for root.
 .Pp
 If this option is set to
 .Cm forced-commands-only ,




Proposal for sshd_config(5) man page

2017-10-13 Thread Walter Alejandro Iglesias
In sshd_config(5), to avoid confusion with PermitRootLogin options.

Original:

  If this option is set to *prohibit-password* or *without-password*,
  password and keyboard-interactive authentication are disabled for
  root.

Proposed:

  If this option is set to *prohibit-password* (renamed from
  *without-password* to avoid ambiguity, both valid) only non
  keyboard-interactive authentication (public-key, hostbased and GSSAPI)
  is allowed for root.


--- sshd_config.5.orig  Mon Oct  9 22:12:51 2017
+++ sshd_config.5   Fri Oct 13 12:38:13 2017
@@ -1199,9 +1199,10 @@
 .Pp
 If this option is set to
 .Cm prohibit-password
-or
-.Cm without-password ,
-password and keyboard-interactive authentication are disabled for root.
+(renamed from
+.Cm without-password
+to avoid ambiguity, both valid) only non keyboard-interactive authentication
+(public-key, hostbased and GSSAPI) is allowed for root.
 .Pp
 If this option is set to
 .Cm forced-commands-only ,


 ***

A related question.  About these messages (/var/log/authlog):

 ... error: maximum authentication attempts exceeded for root ...

 ... error: maximum authentication attempts exceeded for invalid user admin ...

Is there any reason why the connection isn't just terminated after
confirming the user is root or invalid?



Re: ksh(1): don't output invalid UTF-8 characters

2017-06-05 Thread Walter Alejandro Iglesias
On Mon, Jun 05, 2017 at 10:46:49PM +0200, Ingo Schwarze wrote:
> Hi Walter,
> 
> Walter Alejandro Iglesias wrote on Mon, Jun 05, 2017 at 09:21:31PM +0200:
> > On Mon, Jun 05, 2017 at 06:06:34PM +0200, Ingo Schwarze wrote:
> >> Walter Alejandro Iglesias wrote on Mon, Jun 05, 2017 at 04:50:21PM +0200:
> 
> > But this time I don't think you need a capture of the sequence.
> 
> Well *you* obviously know what problem you have, but *I* don't
> understand it, so please don't tell me that i understand when i
> don't, and don't tell me that i don't need the details i need.  In
> general, when reporting bugs, do not guess which information might
> or might not be needed, but provide what people trying to fix it
> ask for.

First, you're the one who is assuming wrong.  I don't contact free software
developers to ask them to solve *my* problems.

Second, I didn't provide you with more details now than the first time.  The
information I provided was enough to understand and to reproduce the
bug.

Finally, I'm doing this for free too.  If you're not happy with the way I
explain myself, feel free to ignore me.


Greetings.




Re: ksh(1): don't output invalid UTF-8 characters

2017-06-05 Thread Walter Alejandro Iglesias
In article <20170605192131.ga60...@server.roquesor.com> you wrote:
> 
>Encodings using more bytes than required are invalid.  In particular,
>11000000 and 11000001 are not valid start bytes, the byte after
>11100000 must be at least 10100000, and the byte after 11110000 must
>be at least 10010000.
> 

Someone off list explained to me that this is true just for the exact
11100000 byte.  I'd assumed the man page was referring to any 1110
sequence.  Now I understand.
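
For reference, the rule can be written as a small check.  This is a sketch
only: the function names are made up here, and it covers just the lower
bounds quoted above (it does not deal with the upper limits that also
apply after the 0xed and 0xf4 start bytes).

```c
#include <stdbool.h>

/*
 * Smallest continuation byte allowed right after a given start byte.
 * Only the exact start bytes 0xe0 and 0xf0 raise the lower bound;
 * every other start byte accepts the whole 0x80..0xbf range.
 */
unsigned char
min_second_byte(unsigned char lead)
{
	if (lead == 0xe0)
		return 0xa0;	/* below this, 3 bytes would be overlong */
	if (lead == 0xf0)
		return 0x90;	/* below this, 4 bytes would be overlong */
	return 0x80;
}

/* Check the second byte of a sequence against the bound for its lead. */
bool
second_byte_ok(unsigned char lead, unsigned char second)
{
	return second >= min_second_byte(lead) && second <= 0xbf;
}
```

So 0xe2 0x82 (the start of the Euro sign) passes, while 0xe0 0x82 does
not.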



Re: ksh(1): don't output invalid UTF-8 characters

2017-06-05 Thread Walter Alejandro Iglesias
On Mon, Jun 05, 2017 at 06:06:34PM +0200, Ingo Schwarze wrote:
> Hi Walter,
> 
> Walter Alejandro Iglesias wrote on Mon, Jun 05, 2017 at 04:50:21PM +0200:
> 
> > report (I'm on chapter 2 of K&R :-)).  I hope with time I'll learn how
> > to do it.
> 
> IIRC, you said you saw some undesirable behaviour with ksh input.
> 
> I assume you have a sequence of key presses on your keyboard that
> demonstrate the undesirable behaviour.  To capture the sequence,
>

I will *study* all the indications you gave me.  But this time
I don't think you need a capture of the sequence.  Just use *any*
latin-1 character whose hex value is smaller than \xc0.

To make the test easier for you, in xterm after setting "setxkbmap de":

  AltGr + Shift + 1

prints the opening exclamation mark (\xa1) we also use in Spanish.
In the console or a C xterm, type that mixed in among random ascii characters,
then move the cursor from the first to the last column passing over that
character.  Assuming you're running current, see what happens.
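
Why \xa1 is a problem for a UTF-8 aware line editor can be seen from the
byte ranges alone.  The sketch below is not how ksh decides anything (that
code is not shown here); the enum and the function name are hypothetical.

```c
enum byte_kind {
	BYTE_ASCII,		/* 0x00..0x7f */
	BYTE_CONTINUATION,	/* 0x80..0xbf, never starts a character */
	BYTE_LEAD,		/* 0xc2..0xf4, starts a multibyte character */
	BYTE_INVALID		/* 0xc0, 0xc1, 0xf5..0xff */
};

/* Classify a single byte by the role it can play in UTF-8. */
enum byte_kind
classify(unsigned char b)
{
	if (b < 0x80)
		return BYTE_ASCII;
	if (b < 0xc0)
		return BYTE_CONTINUATION;
	if (b < 0xc2 || b > 0xf4)
		return BYTE_INVALID;
	return BYTE_LEAD;
}
```

A lone latin-1 byte below \xc0, such as \xa1, lands in the continuation
range, so code that treats it as part of the preceding character will
miscount cursor positions.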


Anyway, to be honest, these bugs don't hurt, you can live with them.
What I'm trying to say with these reports is that I'm not truly convinced
utf8 support in the console is a good idea.

Another test you can do, this time in a utf-8 xterm: if you activate the
bell and go with the cursor to the end of the line it'll beep.  Now type
some utf-8 character at the end and do the same, it won't beep, because
the cursor is on the first byte of the utf-8 character, *it can't reach
the real end of the line*.  Nobody will die because of this issue or the
other one above.  My point is utf8 will always be a mess.  KEN, DO YOU HEAR
ME?, IT WAS YOUR OWN CHILD, KEN! :-)

I wonder how plan9 handles utf8.


[...]

>
> 
> For testing, go to the regress directory:
> 
>$ cd /usr/src/regress/bin/ksh
>$ cvs up -dP
>$ cd edit
>$ make obj
>$ make cleandir
>$ make regress
>$ ./obj/edit < input.txt | hexdump -C
>   00000000  24 20 78 79 08 c3 a9 79  08 0a                    |$ xy...y..|
>   0000000a


I've been wondering how to work with this.  Thanks!


[...]

> > By the way, something the last paragraph of the new utf8(7) man page
> > isn't clear enough (I mentioned this to tedu@).
> 
> Which paragraph exactly, and what is unclear?  Maybe we can fix it
> quickly.

As I told you, the _last_ one:

   Encodings using more bytes than required are invalid.  In particular,
   11000000 and 11000001 are not valid start bytes, the byte after
   11100000 must be at least 10100000, and the byte after 11110000 must
   be at least 10010000.

I don't understand the 'at least' assumptions.  Some examples in which
the byte after 1110 is *smaller* than 10100000:

Euro sign:
11100010 10000010 10101100

Em dash:
11100010 10000000 10010100

Double quotes:
11100010 10000000 10011100
11100010 10000000 10011101

You can find examples where the byte after 1110 is *greater* than
10100000 here:

http://www.utf8-chartable.de/
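
The bit patterns listed above can be reproduced with a small encoder.  A
sketch, assuming code points are passed as unsigned long; utf8_encode is a
made-up name, not a libc function.

```c
#include <stddef.h>

/*
 * Encode the code point cp as UTF-8 into out[].
 * Returns the number of bytes written, or 0 if cp is a surrogate or
 * lies beyond U+10FFFF and therefore cannot be encoded.
 */
size_t
utf8_encode(unsigned long cp, unsigned char out[4])
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xc0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3f));
		return 2;
	}
	if (cp >= 0xd800 && cp <= 0xdfff)
		return 0;
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xe0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
		out[2] = (unsigned char)(0x80 | (cp & 0x3f));
		return 3;
	}
	if (cp <= 0x10ffff) {
		out[0] = (unsigned char)(0xf0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3f));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
		out[3] = (unsigned char)(0x80 | (cp & 0x3f));
		return 4;
	}
	return 0;
}
```

For the Euro sign, U+20AC, this yields the three bytes e2 82 ac, that is
11100010 10000010 10101100.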



Thank you for your advice; I'll study your whole message carefully.


> 
> Yours,
>   Ingo



Re: ksh(1): don't output invalid UTF-8 characters

2017-06-05 Thread Walter Alejandro Iglesias
Just to apologize to developers here,

I'm still not skilled enough to make a proper patch or a clear bug
report (I'm on chapter 2 of K&R :-)).  I hope with time I'll learn how
to do it.  I came to the ksh utf8 discussion because I've been playing
with a mail mime encoder just to learn C, and recognizing valid utf-8
was the first challenge I encountered.

The code pasted below is what I've got so far for recognizing valid utf-8.
I'm showing it to make my point: I realize it isn't easy, and with my
poor C I'm not able to figure out how you can do such a test byte by byte
while the user is typing at the command line.  (Don't bother explaining
how to me, I know this is not the place to take C lessons.)

By the way, something in the last paragraph of the new utf8(7) man page
isn't clear enough (I mentioned this to tedu@).

Thanks to all of you for your work.  Now I know how hard it is.


#include <stdio.h>

#define ASCII	0x7f
#define YES	1
#define NO	0

int
main(void)
{
	int c, ch, wd, ln, col, isutf8;

	/* ch: position inside the current character, wd: its expected width */
	ch = wd = ln = col = 1;
	isutf8 = YES;

	while ((c = getchar()) != EOF) {
		if (c > ASCII) {
			/* bad start byte, or bad byte inside a character */
			if ((ch == 1 && (c < 0xc2 || c > 0xf7)) ||
			    ((ch > 1 && ch <= 4) &&
			    ch <= wd && (c < 0x80 || c > 0xbf)))
				isutf8 = NO;
			/* 110..... */
			else if (ch == 1 && c >= 0xc2 && c <= 0xdf) {
				wd = 2;
				++ch;
			/* 1110.... */
			} else if (ch == 1 && c >= 0xe0 && c <= 0xef) {
				wd = 3;
				++ch;
			/* 11110... */
			} else if (ch == 1 && c >= 0xf0 && c <= 0xf7) {
				wd = 4;
				++ch;
			/* last continuation byte: character complete */
			} else if (ch > 1 && ch <= 4 &&
			    ch == wd && c >= 0x80 && c <= 0xbf)
				ch = 1;
			/* continuation byte, more to come */
			else if (ch > 1 && ch <= 4 &&
			    ch < wd && c >= 0x80 && c <= 0xbf)
				++ch;
			else
				++ch;
		/* ascii byte in the middle of a multibyte character */
		} else if (ch > 1 && ch <= 4 && ch <= wd)
			isutf8 = NO;
		else
			ch = 1;

		if (isutf8 == NO) {
			printf("Invalid utf-8 character");
			printf(" at line %d col %d.\n", ln, col);
			return 1;
		}
		if (c == '\n') {
			col = 1;
			++ln;
		} else
			++col;
	}

	return 0;
}



[w...@roquesor.com: Re: ksh(1): vi mode UTF-8 bug]

2017-06-04 Thread Walter Alejandro Iglesias

I realized the issue I describe in the message below (sent to Ingo in
private) happens in the tty console (no X11) regardless of what you
set in LC_CTYPE.  The problem comes when you have a non-English
keyboard; you can easily type by accident some non-ascii character
smaller than 0xc0, and then your line is lost.  So, I think it *is* a bug.

Vi editing mode users can work around the problem using vi-show8.

As far as I can understand there isn't an easy, solid way to know if a
non-ascii character is utf-8, so (with all the respect others' work deserves,
and please correct me if I'm wrong) each time you fix some utf8 input
issue there is a big chance you're breaking the non-utf8 non-ascii
input. :-)  To preserve a safe ascii implementation you should consider
keeping utf8 hacks aside.  You could do the same as you do with nvi,
diverting the effort developers are now putting into these hacks into
implementing some utf-8 version of ksh as a package for those who think
they need utf-8 support in the console (don't count me among them).


- Forwarded message from Walter Alejandro Iglesias <w...@roquesor.com> -

Date: Fri, 2 Jun 2017 14:47:55 +0200
From: Walter Alejandro Iglesias <w...@roquesor.com>
To: Ingo Schwarze <schwa...@usta.de>
Subject: Re: ksh(1): vi mode UTF-8 bug
User-Agent: Mutt/1.8.2hg (2017-04-18)

Hi Ingo,

On Mon, May 29, 2017 at 07:28:37PM +0200, Ingo Schwarze wrote:
> Hi Walter,
> 
> Walter Alejandro Iglesias wrote on Mon, May 29, 2017 at 06:44:40PM +0200:
> 
> > Are those wide char versions of C functions consistent enough to write
> > a separate implementation to be loaded when LC_CTYPE is set to utf-8?
> 
> Sure, you can rewrite the complete shell to use wchar_t * rather
> than char *, and if you do that, you can use the new code to handle
> ASCII as well, no need to have two copies.  But that would be a
> huge effort, even more error-prone than the small, careful adjustments
> we are doing now, and would have a number of additional downsides;
> among others, losing the ability to handle arbitrary bytes, while
> in UTF-8 mode.
> 
> For an editor, going wchar_t might be better because having substantial
> amounts of UTF-8 in user input is a common case in some files that
> people edit.
> 
> For a shell, editing strings that contain non-ASCII is not the main
> purpose.  Sure, it is nice if the command line is able to handle
> strings containing an occasional UTF-8 character.  But the main
> purpose of the shell remains to safely input and execute Unix-style
> command lines, where non-ASCII characters are a non-essential addition
> at best.
> 
> Yours,
>   Ingo
> 
> 
> For more details, see
> https://www.openbsd.org/papers/eurobsdcon2016-utf8.pdf


There is an issue I ignore since when it is present (regression?).  I
suppose it's caused by the way you test if non ascii characteres are
utf8.  It happens with both vi and emacs editing modes.

With LC_CTYPE=C, if you type non-ASCII characters smaller than 0xc0 and
move the cursor over them, you'll see that the cursor treats each of
them as a wide character.  As a result characters get overwritten,
commands such as x, r or s get confused, and recalling the line from the
history file gets messed up too.
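
To illustrate what I mean, here is a minimal sketch.  It is not the ksh
code; is_cont_byte() and cursor_left() are names I made up, and I'm
only assuming that the editing code steps over bytes of the form
10xxxxxx (0x80 to 0xbf) as if they always belonged to the previous
character:

```c
#include <stddef.h>

/*
 * Hypothetical helper: true for bytes of the form 10xxxxxx (0x80..0xbf).
 * In valid UTF-8 such bytes only occur as continuation bytes.
 */
static int
is_cont_byte(unsigned char c)
{
	return (c & 0xc0) == 0x80;
}

/*
 * Move the cursor one "character" to the left, treating every run of
 * continuation bytes as part of the byte that precedes it.  This is
 * only correct when buf really contains UTF-8.
 */
static size_t
cursor_left(const char *buf, size_t pos)
{
	if (pos == 0)
		return 0;
	pos--;
	while (pos > 0 && is_cont_byte((unsigned char)buf[pos]))
		pos--;
	return pos;
}
```

With the UTF-8 string "a\xc3\xa9" "b" and the cursor on the 'b',
cursor_left() lands on the 0xc3 lead byte, as expected.  With the
single-byte string "a\xa9" "b" (what you would type under LC_CTYPE=C
with an 8-bit keyboard layout), the same call skips the 0xa9 byte and
lands on the 'a', which matches the behavior I'm seeing.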

Given that OpenBSD formally supports only UTF-8, are you aware of this
issue and do you accept it as part of the deal for handling UTF-8, or
may I report it as a bug?


----- End forwarded message -----



Re: ksh(1): vi mode UTF-8 bug

2017-05-29 Thread Walter Alejandro Iglesias
On Mon, May 29, 2017 at 07:28:37PM +0200, Ingo Schwarze wrote:
> Hi Walter,
> 
> Walter Alejandro Iglesias wrote on Mon, May 29, 2017 at 06:44:40PM +0200:
> 
> > Are those wide char versions of C functions consistent enough to write
> > a separate implementation to be loaded when LC_CTYPE is set to utf-8?
> 
> Sure, you can rewrite the complete shell to use wchar_t * rather
> than char *, and if you do that, you can use the new code to handle
> ASCII as well, no need to have two copies.  But that would be a
> huge effort, even more error-prone than the small, careful adjustments
> we are doing now, and would have a number of additional downsides;
> among others, losing the ability to handle arbitrary bytes, while
> in UTF-8 mode.
> 
> For an editor, going wchar_t might be better because having substantial
> amounts of UTF-8 in user input is a common case in some files that
> people edit.
> 
> For a shell, editing strings that contain non-ASCII is not the main
> purpose.  Sure, it is nice if the command line is able to handle
> strings containing an occasional UTF-8 character.  But the main
> purpose of the shell remains to safely input and execute Unix-style
> command lines, where non-ASCII characters are a non-essential addition
> at best.

I totally agree with you, and that's exactly why I value that you're
preserving the ASCII version, not only of ksh but even of the editor.
I mostly use vi and keep nvi from packages at hand just for when I want
to send mail to family or edit my web site.

Thanks for your kind explanation.


> 
> Yours,
>   Ingo
> 
> 
> For more details, see
> https://www.openbsd.org/papers/eurobsdcon2016-utf8.pdf



Re: ksh(1): vi mode UTF-8 bug

2017-05-29 Thread Walter Alejandro Iglesias
On Mon, May 29, 2017 at 05:59:34PM +0200, Ingo Schwarze wrote:
> So handling multi-byte "r" should probably be treated as a separate
> issue.
> 

I'm just a beginner with C, so what I'm about to say is purely
intuitive.

As far as I can understand, you're trying to adapt code that works with
ASCII to handle UTF-8, which requires *guessing* how to deal with the
next character the user will type at any time.

Are those wide char versions of C functions consistent enough to write a
separate implementation to be loaded when LC_CTYPE is set to utf-8?
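
To make the question concrete, this is roughly the kind of thing I have
in mind.  It's only a sketch: count_chars() is a name I made up, and the
only library call it relies on is the standard mbrtowc(3), which decodes
according to whatever LC_CTYPE the program selected beforehand with
setlocale(3):

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Count characters (not bytes) in s according to the current LC_CTYPE.
 * Returns (size_t)-1 if s contains a sequence that is invalid or
 * truncated in that locale.
 */
static size_t
count_chars(const char *s)
{
	mbstate_t st;
	size_t len, n = 0, left = strlen(s);

	memset(&st, 0, sizeof(st));
	while (left > 0) {
		len = mbrtowc(NULL, s, left, &st);
		if (len == (size_t)-1 || len == (size_t)-2)
			return (size_t)-1;	/* invalid or incomplete */
		if (len == 0)
			break;			/* embedded NUL, stop */
		s += len;
		left -= len;
		n++;
	}
	return n;
}
```

The error return is the part I'm unsure about: a shell would
presumably still have to do something sensible with bytes that are not
valid in the current locale.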

If I'm talking nonsense just ignore me. :)



Re: ksh(1): vi mode UTF-8 bug

2017-05-28 Thread Walter Alejandro Iglesias
In article <20170528160659.GA46003@lol.local> you wrote:
> Hi Walter,
> 
> Thanks for the report, please try out the diff below.
> 

The diff works as you explained.  It correctly replaces UTF-8
characters with ASCII ones.


Thank you.




[w...@roquesor.com: Re: ksh(1): vi mode UTF-8 bug]

2017-05-28 Thread Walter Alejandro Iglesias

I forgot to Cc this here.


----- Forwarded message from Walter Alejandro Iglesias <w...@roquesor.com> -----

From: Walter Alejandro Iglesias <w...@roquesor.com>
To: Anton Lindqvist <anton.lindqv...@gmail.com>
Subject: Re: ksh(1): vi mode UTF-8 bug
In-Reply-To: <20170519121136.GA16623@lol.local>
X-Newsgroups: gmane.os.openbsd.tech
User-Agent: tin/2.4.1-20161224 ("Daill") (UNIX) (OpenBSD/6.1 (amd64))
Status: RO
Content-Length: 703
Lines: 25

Hi Anton,

In article <20170519121136.GA16623@lol.local> you wrote:
> Hi,
> Another UTF-8 related bug reported by tb@. How to re-produce:
> 
> 1. Enable vi mode:
> 
>$ set -o vi
> 
> 2. Input the following characters: ??a
> 
> 3. Press escape and then x twice.
> 
> 4. An invalid UTF-8 character is displayed.
> 
> Similar to one of my previous diffs, looks like the column counter is
> wrong. The diff below fixes the problem and includes a regression test.
> I'm not running vi mode myself so further testing would be appreciated.
> 

There is still a similar issue when you try to "replace" a utf-8
character (in command mode press 'r' to replace a single character or
'R' to replace a string).
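
As a rough sketch of what I'd expect 'r' to do (this is not the ksh
code; char_len() and replace_char() are hypothetical names, and no
validation of the UTF-8 is attempted), the whole multi-byte character
under the cursor has to go, not just its first byte:

```c
#include <stddef.h>
#include <string.h>

/*
 * Length in bytes of the UTF-8 character starting at buf[pos]: the
 * lead byte plus any continuation bytes (10xxxxxx) that follow it.
 * Assumes pos < len.
 */
static size_t
char_len(const char *buf, size_t len, size_t pos)
{
	size_t n = 1;

	while (pos + n < len &&
	    ((unsigned char)buf[pos + n] & 0xc0) == 0x80)
		n++;
	return n;
}

/*
 * Replace the character at pos with the single byte c, the way vi's
 * "r" would.  buf must be NUL-terminated with strlen(buf) == len.
 * Returns the new length.
 */
static size_t
replace_char(char *buf, size_t len, size_t pos, char c)
{
	size_t n = char_len(buf, len, pos);

	buf[pos] = c;
	/* Close the gap left by the extra bytes, NUL included. */
	memmove(buf + pos + 1, buf + pos + n, len - (pos + n) + 1);
	return len - (n - 1);
}
```

If only the first byte is overwritten, the leftover continuation byte
stays in the line, which would explain the invalid character that shows
up afterwards.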


----- End forwarded message -----



I should elaborate a bit what this diff does

2017-04-10 Thread Walter Alejandro Iglesias
I think I should have explained in more detail what this patch does.

With window managers that implement a window list menu, like fvwm, after
cycling with Alt+Tab both the previously focused and the newly focused
window end up in front (currently cwm leaves the previously focused
window covered by others).  This is especially convenient when one of
the windows is maximized.

To imitate this useful behavior, this diff makes cwm raise the
previously focused window after the Alt key is released, right before
focusing the newly chosen window and warping the pointer to it.
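
The effect on the stacking order can be modeled with a toy example.
This is not cwm code: the array, NWIN, raise_window() and finish_cycle()
are all made up for illustration, and windows are plain integer ids:

```c
#include <stddef.h>
#include <string.h>

#define NWIN 4

/*
 * Hypothetical stacking order: stack[0] is the bottom window and
 * stack[NWIN - 1] is the top one.
 */
static void
raise_window(int stack[NWIN], int id)
{
	size_t i;

	for (i = 0; i < NWIN && stack[i] != id; i++)
		continue;
	if (i == NWIN)
		return;		/* unknown window, nothing to do */
	/* Shift everything above the window down one slot ... */
	memmove(&stack[i], &stack[i + 1],
	    (NWIN - 1 - i) * sizeof(stack[0]));
	/* ... and put the window on top. */
	stack[NWIN - 1] = id;
}

/*
 * What the diff does once the modifier is released: raise the
 * previously focused window first, then the newly chosen one.
 */
static void
finish_cycle(int stack[NWIN], int prev, int next)
{
	raise_window(stack, prev);
	raise_window(stack, next);
}
```

Whatever happened to the stacking order while cycling, after
finish_cycle() the new window is on top and the previous one sits
directly below it.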

It's not my idea, I took it from WindowMaker.




I finally learned how to make the diff :-)

2017-04-10 Thread Walter Alejandro Iglesias

I finally learned how to make the diff using cvs :-)


Index: client.c
===
RCS file: /cvs/xenocara/app/cwm/client.c,v
retrieving revision 1.234
diff -u -p -r1.234 client.c
--- client.c	6 Feb 2017 18:10:28 -   1.234
+++ client.c	10 Apr 2017 14:16:31 -
@@ -645,9 +645,11 @@ match:
 void
 client_cycle(struct screen_ctx *sc, int flags)
 {
-	struct client_ctx	*newcc, *oldcc;
+	struct client_ctx	*newcc, *oldcc, *prevcc;
 	int			 again = 1;
 
+	prevcc = TAILQ_FIRST(&sc->clientq);
+
 	/* For X apps that ignore events. */
 	XGrabKeyboard(X_Dpy, sc->rootwin, True,
 	    GrabModeAsync, GrabModeAsync, CurrentTime);
@@ -686,6 +688,7 @@ client_cycle(struct screen_ctx *sc, int 
 	/* reset when cycling mod is released. XXX I hate this hack */
 	sc->cycling = 1;
 	client_ptrsave(oldcc);
+	client_raise(prevcc);
 	client_raise(newcc);
 	if (!client_inbound(newcc, newcc->ptr.x, newcc->ptr.y)) {
 		newcc->ptr.x = newcc->geom.w / 2;



Alt-Tab cycling improvement for cwm

2017-04-10 Thread Walter Alejandro Iglesias

Raise previous window after cycling on cwm.


--- client.c.orig	Mon Apr 10 14:24:11 2017
+++ client.c	Mon Apr 10 14:29:14 2017
@@ -645,9 +645,11 @@ match:
 void
 client_cycle(struct screen_ctx *sc, int flags)
 {
-	struct client_ctx	*newcc, *oldcc;
+	struct client_ctx	*newcc, *oldcc, *prevcc;
 	int			 again = 1;
 
+	prevcc = TAILQ_FIRST(&sc->clientq);
+
 	/* For X apps that ignore events. */
 	XGrabKeyboard(X_Dpy, sc->rootwin, True,
 	    GrabModeAsync, GrabModeAsync, CurrentTime);
@@ -686,6 +688,7 @@ client_cycle(struct screen_ctx *sc, int flags)
 	/* reset when cycling mod is released. XXX I hate this hack */
 	sc->cycling = 1;
 	client_ptrsave(oldcc);
+	client_raise(prevcc);
 	client_raise(newcc);
 	if (!client_inbound(newcc, newcc->ptr.x, newcc->ptr.y)) {
 		newcc->ptr.x = newcc->geom.w / 2;





Something that could be important to document

2017-03-21 Thread Walter Alejandro Iglesias
I don't know whether what I read some time ago about how to correctly
request a certificate from a CA for use with the original sendmail is
still important and applicable (currently I use OpenSMTPD).

As far as I understood, you must use your FQDN as the principal name in
your certificate.  That's why I use 'server.roquesor.com' (my machine
name) instead of just the domain name 'roquesor.com' as the principal
name.

I don't know whether the Let's Encrypt people, since they designed their
certificates mostly for web sites, didn't care about documenting this
detail, or whether it's not important anymore.

In case this is still important, it could perhaps be useful to mention
it in the man page (or in the FAQ).





Re: acme-client doc diff

2017-03-19 Thread Walter Alejandro Iglesias
Nick wrote:

> I found that the current man pages and example file for acme-client
> are confusing and leave one with an imperfect certificate setup, with
> the intermediate certs missing.  Doesn't generate an error on
> OpenBSD, but does on some other OSs.

Right.  It took me some time to realize I had to use fullchain.pem
instead of chain.pem with smtpd and pop3d.


Thanks!




Re: Xorg stipple

2017-02-26 Thread Walter Alejandro Iglesias
Thanks to the stipple pattern I noticed this:

https://marc.info/?l=openbsd-x11&m=146506160500583&w=2

I don't know whether it's relevant or useful for discovering some bug,
but I'd still like to know what that phantom shadow is that appears at
the bottom right of my screen. :-)