strftime(3) has a clear distinction between "%e" and "%d":

        %d    is replaced by the day of the month as a decimal number (01-31).

        %e    is replaced by the day of month as a decimal number (1-31); single
              digits are preceded by a blank.

Whereas strptime(3), being the converse, treats them equally:

        %d    the day of month [1,31]; leading zeros are permitted but not
              required.

        %e    the same as %d.

This makes it impossible to convert back and forth unless one uses the
additional conversion specifier "%n" to parse "any whitespace" before
single digit numbers.


I came across this thanks to a nicely detailed bug report about
mail/isync's `CopyArrivalDate' feature, which uses strptime(3).  Quote:

        The IMAP protocol specifies a date format beginning with a
        day-of-month space-padded to two characters. The %d specifier in
        glibc's strptime(3) will consume a space-padded day of month,
        but OpenBSD's %d only accepts leading zeroes.

That is, formats used in the protocol and/or program such as

        " 4-Mar-2018 16:49:25 -0500" ("%d-%b-%Y %H:%M:%S ")

work with glibc, but not ours, resulting in failures when mbsync tries
to propagate the arrival time together with the messages.


I've checked various implementations with regard to differences between
both conversion specifiers:

glibc           %d same as %e[0], both allow multiple leading spaces[1]
musl            %d same as %e[2], no spaces allowed[3]
Illumos         %d same as %e[4], both allow one leading space
FreeBSD         %d other than %e[5], %e allows one leading space[6]

glibc and Illumos sparsely document them with
"The day of the month (1-31)".  musl appearently has no manual and
FreeBSD's strptime(3) refers to strftime(3); there they make the same
distinction we do, but the strptime behaviour in question here can still
only be implied from that:

        %d    is replaced by the day of the month as a decimal number (01-31).

        %e    is replaced by the day of the month as a decimal number (1-31);
              single digits are preceded by a blank.

So Using "%n%d" works in all implementations and looks like a portable
way to address this, but is not exactly the same as _any_ number of
spaces is matched.

Making "%e" in strptime accept blanks added through "%e" in strftime
seems more reasonable to me.  Diff below does that, effectively behaving
like FreeBSD, but also tries to document it explicitly.

Tested on amd64 with the following snippet:

$ cat test.c
#include <time.h>
#include <stdio.h>
int main() {
        struct tm datetime;
        char *fmts[] = { "1", "02", " 3", " 04", "  5", "006" };
        int i;
        for (i = 0; i < sizeof(fmts)/sizeof(fmts[0]); ++i) {
                if (strptime(fmts[i], "%e", &datetime) != NULL)
                        printf("'%s':\t%d\n", fmts[i], datetime.tm_mday);
                else
                        printf("'%s':\tinvalid\n", fmts[i]);
        }
        return 0;
}
$ cc test.c && ./a.out
'1':    1
'02':   2
' 3':   3
' 04':  4
'  5':  invalid
'006':  invalid

regress/lib/libc/time/strptime passes, but is only uses "%d", not "%e",
In case we go this way, I'll happily add more tests.

Feedback is greatly appreciated on this.  Specifically, I'm unsure about
potential downsides or pitfalls this can bring, although other
implementations seem to do fine.

Could changing the existing behaviour of "%e" break certain use cases?
Are "%d" and "%e" are deliberately kept the same? If so, why?

0: https://git.launchpad.net/glibc/tree/time/strptime_l.c#n74
1: https://git.launchpad.net/glibc/tree/time/strptime_l.c#n518

2: https://git.musl-libc.org/cgit/musl/tree/src/time/strptime.c#n53
3: https://git.musl-libc.org/cgit/musl/tree/src/time/strptime.c#n162

4: 
https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/locale/strptime.c#L331

5: 
https://svnweb.freebsd.org/base/head/lib/libc/stdtime/strptime.c?revision=340106&view=markup#l401
6: 
https://svnweb.freebsd.org/base/head/lib/libc/stdtime/strptime.c?revision=340106&view=markup#l420

Index: lib/libc/time/strptime.3
===================================================================
RCS file: /cvs/src/lib/libc/time/strptime.3,v
retrieving revision 1.26
diff -u -p -r1.26 strptime.3
--- lib/libc/time/strptime.3    21 Jan 2019 21:35:58 -0000      1.26
+++ lib/libc/time/strptime.3    17 Feb 2019 21:49:23 -0000
@@ -96,8 +96,8 @@ leading zeros are permitted but not requ
 .It Cm \&%D
 the date as %m/%d/%y.
 .It Cm \&%e
-the same as
-.Cm \&%d .
+the day of month [1,31];
+leading spaces or zeros are permitted but not required.
 .It Cm \&%F
 the date as %Y-%m-%d
 (the
Index: lib/libc/time/strptime.c
===================================================================
RCS file: /cvs/src/lib/libc/time/strptime.c,v
retrieving revision 1.24
diff -u -p -r1.24 strptime.c
--- lib/libc/time/strptime.c    22 Jan 2019 11:09:03 -0000      1.24
+++ lib/libc/time/strptime.c    17 Feb 2019 20:26:23 -0000
@@ -254,8 +254,11 @@ literal:
                        century = i * 100;
                        break;
 
-               case 'd':       /* The day of month. */
-               case 'e':
+               case 'e':       /* The day of month. */
+                       if (isspace(*bp))
+                               bp++;
+                       /* FALLTHROUGH */
+               case 'd':
                        _LEGAL_ALT(_ALT_O);
                        if (!(_conv_num(&bp, &tm->tm_mday, 1, 31)))
                                return (NULL);

Reply via email to