On Mon, Oct 03, 2022 at 11:28:49PM +0200, [email protected] wrote:
>     The expression
>     
>             GRAPHEME_STATE state = 0;
>     
>     admittedly looks a little fishy, given you don't really know what happens
>     behind the scenes unless you look in the header,

Another possibility is wrapping the integer inside a struct:

        typedef struct { unsigned internal_state; } GRAPHEME_STATE;

the benefit of this is that the type GRAPHEME_STATE clearly states the
purpose, whereas a `uint_least16_t` doesn't.

Wrapping an enum into a struct is also a common trick to get stronger
type-checking from the compiler; I don't think it matters in this case
though, since the state is always passed via pointer.

>  and I want all of the semantics to be crystal clear to the end-user.

Other way of looking at it is that the state is an internal thing so the
user shouldn't be concerned about what's going on behind the scene.

> +static inline void
> +state_serialize(const struct character_break_state *in, uint_least16_t *out)
> +{
> +     *out = (uint_least16_t)(in->prop & UINT8_C(0xFF))                   | 
> /* first 8 bits */
> +            (uint_least16_t)(((uint_least16_t)(in->prop_set))     <<  8) | 
> /* 9th bit */
> +            (uint_least16_t)(((uint_least16_t)(in->gb11_flag))    <<  9) | 
> /* 10th bit */
> +            (uint_least16_t)(((uint_least16_t)(in->gb12_13_flag)) << 10);  
> /* 11th bit */
> +}
> +
> +static inline void
> +state_deserialize(uint_least16_t in, struct character_break_state *out)
> +{
> +     out->prop         = in & UINT8_C(0xFF);
> +     out->prop_set     = in & (((uint_least16_t)(1)) <<  8);
> +     out->gb11_flag    = in & (((uint_least16_t)(1)) <<  9);
> +     out->gb12_13_flag = in & (((uint_least16_t)(1)) << 10);
> +}

The `(uint_least16_t)1` casts don't really do much since `int` is
guaranteed to be 16bits anyways. But if you want to be explicit, you can
still use `UINT16_C(1)`, which is shorter thus less noisy, instead of
casting:

-       out->prop_set     = in & (((uint_least16_t)(1)) <<  8);
+       out->prop_set     = in & (UINT16_C(1) << 8);

I'd also return by value in these 2 functions. Expressions like these
are more clear compared to out pointers:

        *s = state_deserialize(&state);
        state = state_serialize(*s);

- NRK

Reply via email to