On Tue, Nov 05, 2024 at 01:18:59PM +0000, A bughunter via Unicode wrote:
> On Tuesday, November 5th, 2024 at 08:59, Arthur Rosendahl via Unicode
> <[email protected]> wrote:
>> I think that’s what the OP means. He has a UTF-8-encoded string
>> which he wants to map to a sequence of code points. That’s my guess
>> anyway.
>
> This is pretty much the reverse of what I have asked for: reverse engineering
> an UTF-8 string in order to re-create the sourcecode I have asked for. You
> shouldn't have to reverse engineer.

Do you then mean that you have a sequence of Unicode code points that
you want to convert to UTF-8? In that case, the C code by Oren is what
you need (for a single code point).

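For illustration, here is a minimal sketch of what such a single-code-point
encoder can look like. It is not Oren's code, which is not reproduced in
this message; the function name utf8_encode and its return convention are
assumptions made for the example. It also shows that UTF-8 is a fixed
bit-packing rule rather than a lookup table, so there is no per-version
"encoding map" to fetch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode scalar value as UTF-8.
 * Writes 1 to 4 bytes into out and returns the number written,
 * or 0 if cp is a surrogate (U+D800..U+DFFF) or above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)     /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode code space */
}
```

With this sketch, U+20AC (EURO SIGN) comes out as the three bytes E2 82 AC,
and U+0041 as the single byte 41.
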
> If you do not have sourcecode for UTF-8 then it is more than likely the
> standard does sit on the sidelines disconnected from whatever is being called
> Unicode and UTF-8 actually software in use. You shouldn't have to reverse
> engineer the software to contrast it against the Unicode standard it purports
> to have been.

It sounds like you’re confusing a standard and its implementation.

> Originating Question
>
> Where to get the sourcecode of relevent (version) UTF-8?: in order to
> checksum text against the specific encoding map (codepage).

If you’re going to keep repeating that, you should be aware that it is
incomprehensible, even without the typo and the weird wording.

Arthur