>@@ -605,14 +605,21 @@ c_strlen (tree src, int only_value) > > /* Set MAXELTS to sizeof (SRC) / sizeof (*SRC) - 1, the maximum possible > length of SRC. Prefer TYPE_SIZE() to TREE_STRING_LENGTH() if possible >- in case the latter is less than the size of the array. */ >- HOST_WIDE_INT maxelts = TREE_STRING_LENGTH (src); >+ in case the latter is less than the size of the array, such as when >+ SRC refers to a short string literal used to initialize a large array. >+ In that case, the elements of the array after the terminating NUL are >+ all NUL. */ >+ HOST_WIDE_INT strelts = TREE_STRING_LENGTH (src); >+ strelts = strelts / eltsize - 1; >+ >+ HOST_WIDE_INT maxelts = strelts; > tree type = TREE_TYPE (src); > if (tree size = TYPE_SIZE_UNIT (type)) > if (tree_fits_shwi_p (size))
If you already touch this area, please make at least the tree_fits_ and tree_to_ consistent. >- maxelts = tree_to_uhwi (size); >- >- maxelts = maxelts / eltsize - 1; >+ { >+ maxelts = tree_to_uhwi (size); >+ maxelts = maxelts / eltsize - 1; >+ } > > /* PTR can point to the byte representation of any string type, including > char* and wchar_t*. */ >@@ -620,10 +627,12 @@ c_strlen (tree src, int only_value) > > if (byteoff && TREE_CODE (byteoff) != INTEGER_CST) > { Please don't go into this if block, when eltsize != 1, the folding as it stands is incorrect for eltsize != 1. >- /* If the string has an internal zero byte (e.g., "foo\0bar"), we can't >- compute the offset to the following null if we don't know where to >+ /* If the string has an internal NUL character followed by any >+ non-NUL characters (e.g., "foo\0bar"), we can't compute >+ the offset to the following NUL if we don't know where to > start searching for it. */ >- if (string_length (ptr, eltsize, maxelts) < maxelts) >+ unsigned len = string_length (ptr, eltsize, strelts); >+ if (len < strelts) > { >> I really don't see why we need to support wide characters especially >> when there is no reasonable test coverage, and no usable wstrlen >> builtin first. > >I'm not sure what special support are you talking about. >The string_constant function handles all kinds of strings >and the change above doesn't affect that. The purpose of >the check above is to conservatively fail for constant >arrays with excess initializers such as in the comment: > > const char a[4] = "abc\000\000"; Yes, fine, but as I said, conservative is not folding stuff that is too complicated, for instance for the variable index case _together_ with wide character strings. Out of curiosity: how often have you seen existing code to use strlen(&a[i]) and similar? I know emacs (pr86528), but how common is that code do you have numbers? Bernd.