Geoff Clare <[email protected]> wrote:
|Steffen Nurpmeso <[email protected]> wrote, on 04 Aug 2016:
|> Don Cragun <[email protected]> wrote:
|>|But, the point is that an application using iconv() to con\
|>|vert text doesn't need to know whether or not the target \
|>|codeset has locking shift states. It should always make a \
|>|final call similar to:
|>|    ret = size_t iconv(cd, NULL, 0, outbuf, outbytesleft);
|>|with outbytesleft large enough to handle any string that \
|>|might be required to terminate the output buffer with whatever \
|>|string is needed to get back to the initial shift state \
|>|for the target codeset. (This will be a no-op for many \
|>|target codesets, but it won't hurt an application to make \
|>|this final call before calling:
|>|    ret = iconv_close(cd);
|>|to terminate the conversion even if the target codeset does \
|>|not need to add any text to get back to an initial shift state.)
|>
|> That is exactly my point, but which i don't see fullfilled with
|> the accepted sentence
|>
|>   It is the responsibility of the application to ensure that, if the
|>   output codeset has a locking-shift encoding, the output buffer is
|>   returned to its initial shift state when conversion is completed.
|
|In your first mail you suggest removing the "if the output codeset has
|a locking-shift encoding" condition from the accepted text. This would
|then imply that there are always shift states involved, which would be
|worse.
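
For illustration, here is a minimal sketch of the "always make a final call" pattern Don describes. The helper name convert_with_reset() and its interface are hypothetical, invented only for this example. The codeset names passed to iconv_open() are implementation-defined, and I assume the POSIX prototype in which the buffer and length arguments are passed by pointer (so the call quoted above is shorthand, not literal C).

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert the NUL-terminated string `in` from codeset
 * `from` to codeset `to`, writing at most `outsize` bytes into `out`.
 * Returns the number of bytes written, or (size_t)-1 on error.
 */
size_t convert_with_reset(const char *to, const char *from,
                          const char *in, char *out, size_t outsize)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inbuf = (char *)in;          /* POSIX iconv() takes char ** */
    size_t inbytesleft = strlen(in);
    char *outbuf = out;
    size_t outbytesleft = outsize;

    /* Convert the actual input. */
    size_t ret = iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft);

    /*
     * Final call with a null input buffer: asks iconv() to append whatever
     * byte sequence returns the output to the initial shift state. For a
     * target codeset without shift states this writes nothing.
     */
    if (ret != (size_t)-1)
        ret = iconv(cd, NULL, NULL, &outbuf, &outbytesleft);

    int saved_errno = errno;           /* keep the errno set by iconv() */
    iconv_close(cd);
    errno = saved_errno;

    return ret == (size_t)-1 ? (size_t)-1 : outsize - outbytesleft;
}
```

The caller does not need to know whether the target codeset is stateful. It only has to leave enough room in the output buffer for a possible reset sequence, as would be needed for an escape-sequence based encoding such as ISO-2022-JP where the implementation supports it.
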

My original hope was that the term "stateful" would go away to make
room for possible future Unicode enhancements, meaning that iconv
becomes capable of performing better conversion to those character
sets / languages where things change depending upon grapheme boundary
detection. For this to happen, it is actually necessary to allow
iconv to wait for more input, so that it can truly decide whether the
input seen last should be converted to "language term A" or "term B".

Unfortunately I don't have a real-language example at hand, but want
to at least point to the "From:" email signature currently used by
the head of the Unicode Consortium, Mark Davis, who added to his
well-known coffee cup symbol a special Unicode variation selector
which combines with the coffee cup code point before it and provides
hints (for example "present this in colour if possible") to capable
renderers. Therefore I think it would be good to agree that "the
concept as such exists, and a future POSIX standard should lead in
the right direction regarding this as early as possible".

|How about if we add another sentence:
|
|   Since the standard does not provide a way to query whether a codeset
|   has a locking-shift encoding, it is recommended that applications
|   always call iconv() in this way before calling iconv_close().

I would really appreciate seeing the final part of this sentence in
the standard.

Have a nice weekend.

--steffen
