Re: [fpc-pascal] Parse unicode scalar

Hairy Pixels via fpc-pascal Mon, 03 Jul 2023 21:17:56 -0700


> On Jul 4, 2023, at 9:58 AM, Nikolay Nikolov via fpc-pascal 
> <fpc-pascal@lists.freepascal.org> wrote:
> 
> You need to understand all these terms and know exactly what you need to do. 
> E.g. are you dealing with keyboard input, are you dealing with the low level 
> parts of text display, are you searching for something in the text, are you 
> just passing strings around and letting the GUI deal with it? These are all 
> different use cases, and they require a careful understanding of which 
> Unicode thing you need to iterate over.


Thanks for trying to help, but this is more complicated than I thought, and I 
don't have the patience for a deep dive right now :)

Unicode is complicated under the hood, but we should have some libraries to 
help, right? I mean, the user thinks of these things as "characters", be it "A" 
or the Unicode symbol 👍, so we should be able to operate on that basis as well. 
Something like an iterator that returns the character (wide char) and its byte 
offset, for reading or writing, would be a nice place to start.

I have a parser/tokenizer I want to update, so I'm trying to find tokens by 
advancing one character at a time. That's why I need to know which character 
comes next in the file, and probably its byte offset as well, so it can be 
referenced later.
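
A minimal sketch of that kind of iterator in Free Pascal, assuming the input 
is UTF-8 held in a RawByteString. NextCodePoint is a hypothetical helper, not 
an RTL routine. It returns code points as Cardinal rather than WideChar because 
👍 (U+1F44D) lies outside the Basic Multilingual Plane and does not fit in a 
single 16-bit WideChar. It does not reject overlong encodings or surrogates, 
and it does not group combining sequences into user-perceived characters.

```pascal
program Utf8Scan;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL). Decodes the UTF-8 sequence that
// starts at the 1-based byte position Offset in S. Returns False at end of
// input. Malformed input yields U+FFFD with Len = 1 so the caller can resync.
function NextCodePoint(const S: RawByteString; Offset: SizeInt;
  out CodePoint: Cardinal; out Len: Integer): Boolean;
var
  B: Byte;
  I: Integer;
begin
  Result := False;
  CodePoint := $FFFD;
  Len := 1;
  if (Offset < 1) or (Offset > Length(S)) then
    Exit;
  Result := True;

  // The lead byte determines the sequence length and the initial payload bits.
  B := Ord(S[Offset]);
  if B < $80 then
  begin
    CodePoint := B;
    Exit;
  end
  else if (B and $E0) = $C0 then
  begin
    Len := 2;
    CodePoint := B and $1F;
  end
  else if (B and $F0) = $E0 then
  begin
    Len := 3;
    CodePoint := B and $0F;
  end
  else if (B and $F8) = $F0 then
  begin
    Len := 4;
    CodePoint := B and $07;
  end
  else
    Exit; // stray continuation byte or invalid lead byte

  // Truncated sequence at the end of the input.
  if Offset + Len - 1 > Length(S) then
  begin
    CodePoint := $FFFD;
    Len := 1;
    Exit;
  end;

  // Each continuation byte must look like 10xxxxxx and adds six payload bits.
  for I := 1 to Len - 1 do
  begin
    B := Ord(S[Offset + I]);
    if (B and $C0) <> $80 then
    begin
      CodePoint := $FFFD;
      Len := 1;
      Exit;
    end;
    CodePoint := (CodePoint shl 6) or (B and $3F);
  end;
end;

var
  Source: RawByteString;
  Offset: SizeInt;
  CP: Cardinal;
  Len: Integer;
begin
  // "A", U+1F44D (thumbs up), "b", written as raw UTF-8 bytes.
  Source := 'A'#$F0#$9F#$91#$8D'b';
  Offset := 1;
  while NextCodePoint(Source, Offset, CP, Len) do
  begin
    // Report a 0-based byte offset, which a tokenizer can store per token.
    WriteLn('U+', HexStr(LongInt(CP), 6), ' at byte offset ', Offset - 1,
      ' (', Len, ' bytes)');
    Inc(Offset, Len);
  end;
end.
```

A tokenizer built on this would compare CP against the code points it cares 
about and record Offset - 1 as the token's start position.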


Regards,
Ryan Joseph
