On Thu, 6 Jun 2024 21:21:23 GMT, David Holmes <dhol...@openjdk.org> wrote:

>> ### Summary
>> This change ensures we don't get undefined behavior when 
>> calling[`isspace`](https://pubs.opengroup.org/onlinepubs/007904975/functions/isspace.html).
>>   `isspace` accepts an `int` argument that "the application shall ensure is 
>> a character representable as an unsigned char or equal to the value of the 
>> macro EOF.". 
>> 
>> Previously, there was no checking of the values passed to `isspace`. I've 
>> replaced direct calls with a new wrapper `os::is_space` that performs a 
>> range check and prevents the possibility of undefined behavior from 
>> happening. For instances outside of Hotspot, I've added casts to `unsigned 
>> char`. 
>> 
>> **Testing**
>> - Added a new test in `test/hotspot/gtest/runtime/test_os.cpp` to check 
>> `os::is_space` is working correctly.
>> - tier1
>
> Sorry I think this is well-intentioned but unnecessary in nearly all cases. 
> If you pass a char there is no potential problem. Only passing an actual int 
> could be a problem.

@dholmes-ora 

> Sorry I think this is well-intentioned but unnecessary in nearly all cases. 
> If you pass a char there is no potential problem. Only passing an actual int 
> could be a problem.

Note that the issue was motivated by an Oracle engineer complaining about me 
using isspace on a char. That caused me to look up its behavior. Recently, we 
seem intent on eliminating UB, so why not.

That said, I agree that we probably don't need the wrapper. And casting to int 
feels awkward. I propose
- input from trusted sources, e.g. proc fs, don't need casting
- input from other sources should be casted to unsigned char (see 
recommendation here: https://en.cppreference.com/w/cpp/string/byte/isspace "To 
use these functions safely with plain chars (or signed chars), the argument 
should first be converted to unsigned char")

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19567#issuecomment-2154146793

Reply via email to