Re: new module suggestion: fprintftime-check

2019-01-05 Thread Florian Weimer
* Bruno Haible:

> Florian Weimer wrote:
>> The standards do not provide a way to report errors for malformed format
>> strings.  I think the current behavior is acceptable, all things
>> considered.
>
> OK, then I'm fine with Assaf's approach to create a new, separate function
> that does only the syntax checking.

How do you plan to keep this aligned with the glibc implementation?

Thanks,
Florian



Re: new module suggestion: fprintftime-check

2019-01-05 Thread Bruno Haible
Florian Weimer wrote:
> The standards do not provide a way to report errors for malformed format
> strings.  I think the current behavior is acceptable, all things
> considered.

OK, then I'm fine with Assaf's approach to create a new, separate function
that does only the syntax checking.

Bruno




Re: new module suggestion: fprintftime-check

2019-01-02 Thread Florian Weimer
* Bruno Haible:

> [CCing Florian Weimer.
> Florian, the thread started at
> https://lists.gnu.org/archive/html/bug-gnulib/2018-12/msg00149.html ]
>
> Assaf Gordon wrote:
>> The comment even says:
>>/* Unknown format; output the format, including the '%',
>>   since this is most likely the right thing to do if a
>>   multibyte string has been misparsed.  */
>> 
>> This has been the case since 1996 when strftime.c was imported from libc
>> (gnulib commit afabd949).
>> 
>> I suspect that changing this behavior would be a disruptive
>> backwards-incompatible change (but other opinions are welcomed).
>
> The "security" and "robustness" aspects of software have gained importance
> over the last 22 years, also in domain of glibc.
>
> Florian, Assaf discovered that glibc processing of time format strings
> (strftime) operates according to the garbage-in - garbage-out principle,
> that is, an invalid format string does not get reported to the caller
> but instead produces output that is "most likely the right thing".

Historically, some Lua scripts have relied on strftime not crashing, but
I think this awas fixed on the Lua side a couple of years ago.

The standards do not provide a way to report errors for malformed format
strings.  I think the current behavior is acceptable, all things
considered.

Thanks,
Florian



Re: new module suggestion: fprintftime-check

2018-12-29 Thread Ben Pfaff
On Sat, Dec 29, 2018 at 09:22:17AM -0800, Paul Eggert wrote:
> (Using ptrdiff_t is part of my campaign to prefer ptrdiff_t to size_t. While
> we're at it, let's change the other size_t args to ptrdiff_t, but I
> digress)

Have you said anything about this campaign elsewhere?  I'd like to hear
more.



Re: new module suggestion: fprintftime-check

2018-12-29 Thread Paul Eggert

I suspect that changing this behavior would be a disruptive
backwards-incompatible change (but other opinions are welcomed).


I wouldn't mind a change as long as it changes the API enough so that compilers 
complain if we don't also update the calling code. For example, nstrftime could 
take an additional ptrdiff_t * argument that (if not null) is set to the offset 
of the first offending % or encoding error, or to -1 if the format is OK. (We 
should report encoding errors as well as bad % formats.)


(Using ptrdiff_t is part of my campaign to prefer ptrdiff_t to size_t. While 
we're at it, let's change the other size_t args to ptrdiff_t, but I digress)



nstrtime's __strftime_internal() is a complicated beast, I don't
claim to fully understand it.
But the function contains this comment:
    /* POSIX.1 requires that local time zone information be used as
  though strftime called tzset.  */
https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/nstrftime.c#n540


This comment is for the POSIX API where there was no timezone_t argument. It 
doesn't apply to the Gnulib API, because of the "undef HAVE_TZSET" earlier on. 
The only reason for the tzset_called stuff is the faint hope that we can merge 
all this stuff into glibc someday.




Re: new module suggestion: fprintftime-check

2018-12-29 Thread Assaf Gordon

On 2018-12-28 11:08 p.m., Bruno Haible wrote:

[CCing Florian Weimer.
Florian, the thread started at
https://lists.gnu.org/archive/html/bug-gnulib/2018-12/msg00149.html ]

Assaf Gordon wrote:

The comment even says:
/* Unknown format; output the format, including the '%',
   since this is most likely the right thing to do if a
   multibyte string has been misparsed.  */

This has been the case since 1996 when strftime.c was imported from libc
(gnulib commit afabd949).

I suspect that changing this behavior would be a disruptive
backwards-incompatible change (but other opinions are welcomed).


The "security" and "robustness" aspects of software have gained importance
over the last 22 years, also in domain of glibc.

Florian, Assaf discovered that glibc processing of time format strings
(strftime) operates according to the garbage-in - garbage-out principle,
that is, an invalid format string does not get reported to the caller
but instead produces output that is "most likely the right thing".

Is this still considered the adequate processing, from a glibc point of
view?



For reference, this is about ./time/strftime_l.c lines 1414-1428:

https://sourceware.org/git/?p=glibc.git;a=blob;f=time/strftime_l.c;h=c71f9f47a9525046b59a89c005de22a304367d4d;hb=HEAD#l1414


Also, POSIX says:
"If a conversion specification does not correspond to any of the above, 
the behavior is undefined."

http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html

Looking at the "bigger picture",
I'll just say my goal is to provide a helpful warning in date(1),
not to change any APIs...


regards,
 - assaf



Re: new module suggestion: fprintftime-check

2018-12-28 Thread Bruno Haible
[CCing Florian Weimer.
Florian, the thread started at
https://lists.gnu.org/archive/html/bug-gnulib/2018-12/msg00149.html ]

Assaf Gordon wrote:
> The comment even says:
>/* Unknown format; output the format, including the '%',
>   since this is most likely the right thing to do if a
>   multibyte string has been misparsed.  */
> 
> This has been the case since 1996 when strftime.c was imported from libc
> (gnulib commit afabd949).
> 
> I suspect that changing this behavior would be a disruptive
> backwards-incompatible change (but other opinions are welcomed).

The "security" and "robustness" aspects of software have gained importance
over the last 22 years, also in domain of glibc.

Florian, Assaf discovered that glibc processing of time format strings
(strftime) operates according to the garbage-in - garbage-out principle,
that is, an invalid format string does not get reported to the caller
but instead produces output that is "most likely the right thing".

Is this still considered the adequate processing, from a glibc point of
view?

Bruno




Re: new module suggestion: fprintftime-check

2018-12-28 Thread Assaf Gordon

Hello Bruno,

On 2018-12-28 9:34 a.m., Bruno Haible wrote:

This function enables syntax-check of the format string.


First question: Should this syntax-check be integrated into the
nstrftime() and fprintftime() functions? These functions are gnulib
inventions, therefore they could be extended to return
   - an error indicator (maybe EINVAL?),
   - a detailed error message, if you like,
   - a pointer to the wrong directive in the format string, if you like.
Or is it better to have a function that does only the syntax check,
and document that nstrftime() and fprintftime() should only be called
after the syntax check has been done?
(I have no preference. Just asking.)


Both nstrtime and fprintftime silently ignore bad formats:
https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/nstrftime.c#n1478

The comment even says:
  /* Unknown format; output the format, including the '%',
 since this is most likely the right thing to do if a
 multibyte string has been misparsed.  */

This has been the case since 1996 when strftime.c was imported from libc
(gnulib commit afabd949).

I suspect that changing this behavior would be a disruptive
backwards-incompatible change (but other opinions are welcomed).



I'd like to suggest the following new module: fprintftime-check.


For a module that does only the syntax check, I would suggest a
different name:
   - The prefix 'fprint' indicates output to a 'FILE *'.
   - The prefix 'str' indicates output to a 'char *' buffer.
Why not call it 'ftime-check' or 'check-ftime'?


Good idea. I'll change it.



tzalloc() may fail and return NULL. (And why would parsing the format
string be dependent on a particular time zone at all?)


nstrtime's __strftime_internal() is a complicated beast, I don't
claim to fully understand it.
But the function contains this comment:
   /* POSIX.1 requires that local time zone information be used as
 though strftime called tzset.  */
https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/nstrftime.c#n540

And there is even an output "bool * tzset_called" parameter, to
let the caller know if it has been called or not - so I assume it is
critical to correct behavior of both nstrtime and fprintftime.
I will try to see if it can be disabled for the syntax-check code.

-assaf




Re: new module suggestion: fprintftime-check

2018-12-28 Thread Bruno Haible
Hi Assaf,

> This function enables syntax-check of the format string.

First question: Should this syntax-check be integrated into the
nstrftime() and fprintftime() functions? These functions are gnulib
inventions, therefore they could be extended to return
  - an error indicator (maybe EINVAL?),
  - a detailed error message, if you like,
  - a pointer to the wrong directive in the format string, if you like.
Or is it better to have a function that does only the syntax check,
and document that nstrftime() and fprintftime() should only be called
after the syntax check has been done?
(I have no preference. Just asking.)

> I'd like to suggest the following new module: fprintftime-check.

For a module that does only the syntax check, I would suggest a
different name:
  - The prefix 'fprint' indicates output to a 'FILE *'.
  - The prefix 'str' indicates output to a 'char *' buffer.
Why not call it 'ftime-check' or 'check-ftime'?

> It uses the same infrastructure as fprintftime
> (i.e. #include "nstrtime,c")

That's a good idea, otherwise the syntax check and the actual use of
the format string might too easily get out-of-sync.

> This patch is only a rough draft

tzalloc() may fail and return NULL. (And why would parsing the format
string be dependent on a particular time zone at all?)

Bruno