Re: std.uni, std.ascii, std.encoding, std.utf ugh!

learner via Digitalmars-d-learn Wed, 06 May 2020 04:11:43 -0700

On Tuesday, 5 May 2020 at 19:24:41 UTC, WebFreak001 wrote:

On Tuesday, 5 May 2020 at 18:41:50 UTC, learner wrote:
Good morning,
Trying to do this:

```
bool foo(string s) nothrow { return s.all!isDigit; }
```

I realised that the conversion from char to dchar could throw.
I need to validate and operate over ascii strings and utf8 strings, possibly in separate functions, what's the best way to transition between:
```
immutable(ubyte)[] -> validate utf8 -> string -> nothrow usage -> isDigit etc
immutable(ubyte)[] -> validate ascii -> AsciiString? -> nothrow usage -> isDigit etc
string -> validate ascii -> AsciiString? -> nothrow usage -> isDigit etc
```

Thank you


Thank you WebFreak,

if you want nothrow operations on the sequence of characters (bytes) of the strings, use `str.representation` to get `immutable(ubyte)[]` and work on that. This is useful for example for doing indexOf (countUntil), startsWith, endsWith, etc. Make sure at least one of your inputs is validated though to avoid potentially handling or cutting off unfinished code points. I think this is the best way to go if you want to do simple things.
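A minimal sketch of that suggestion, assuming both inputs were validated earlier. The helper names `hasPrefix` and `byteIndexOf` are made up for illustration; only `representation`, `startsWith` and `countUntil` come from Phobos.

```d
import std.algorithm.searching : countUntil, startsWith;
import std.string : representation;

// Hypothetical helper: true if `s` begins with `prefix`, compared byte by byte.
bool hasPrefix(string s, string prefix) nothrow
{
    // representation reinterprets the string as immutable(ubyte)[],
    // so no UTF decoding (and no UTFException) is involved.
    return s.representation.startsWith(prefix.representation);
}

// Hypothetical helper: byte offset of `needle` in `s`, or -1 if absent.
ptrdiff_t byteIndexOf(string s, string needle) nothrow
{
    return s.representation.countUntil(needle.representation);
}

unittest
{
    assert(hasPrefix("héllo", "hé"));
    assert(byteIndexOf("héllo", "llo") == 3); // 'é' occupies two bytes
    assert(byteIndexOf("héllo", "xyz") == -1);
}
```

Note that the offsets are byte offsets, not character counts, which is why validated input matters before slicing with them.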

What I really want is a way to validate an immutable(ubyte)[] sequence for UTF8 or ASCII, and from that point forward, apply functions like isDigit in nothrow functions.
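One way this could look, as a sketch rather than a definitive answer: validate once at the I/O boundary (which may throw), then iterate code units afterwards so nothing gets decoded. The function names `toValidatedUtf8`, `isAllAscii` and `allDigits` are hypothetical; `validate`, `byCodeUnit`, `all` and `isDigit` are the Phobos pieces.

```d
import std.algorithm.searching : all;
import std.ascii : isDigit;
import std.utf : byCodeUnit, validate;

// Hypothetical boundary function: throws UTFException on malformed input.
string toValidatedUtf8(immutable(ubyte)[] bytes)
{
    auto s = cast(string) bytes; // reinterpret only, nothing is copied
    validate(s);
    return s;
}

// Hypothetical ASCII check: every byte must be below 0x80.
bool isAllAscii(const(ubyte)[] bytes) nothrow
{
    return bytes.all!(b => b < 0x80);
}

// Iterating code units skips auto-decoding, so this can be nothrow.
bool allDigits(string s) nothrow
{
    return s.byCodeUnit.all!isDigit;
}

unittest
{
    import std.exception : assertThrown;
    import std.utf : UTFException;

    immutable(ubyte)[] good = [0x31, 0x32, 0x33]; // "123"
    immutable(ubyte)[] bad = [0x31, 0xFF];        // 0xFF never appears in UTF-8

    assert(allDigits(toValidatedUtf8(good)));
    assert(isAllAscii(good) && !isAllAscii(bad));
    assertThrown!UTFException(toValidatedUtf8(bad));
}
```

Checking code units is enough for `std.ascii.isDigit`, because every byte of a multi-byte UTF-8 sequence is 0x80 or above and so can never be mistaken for '0'..'9'.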

If your algorithm is sufficiently complex that you would like to still decode but not crash, you can also manually call .decode with UseReplacementDchar.yes to make it emit \uFFFD for invalid characters.


I will simply reject invalid UTF8 input; it's coming from I/O

To get the best of both worlds, use `.byUTF!dchar` which gives you an input range to iterate over and defaults to using replacement dchar. You can then call the various algorithm & array functions on it.
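A small sketch of what that might look like in practice. The helper names `allDigitsDecoded` and `hasReplacementChar` are invented for the example; `byUTF`, `all`, `any` and `isDigit` are from Phobos.

```d
import std.algorithm.searching : all, any;
import std.ascii : isDigit;
import std.utf : byUTF;

// byUTF!dchar decodes lazily and yields U+FFFD for malformed sequences
// instead of throwing, so the function can be nothrow.
bool allDigitsDecoded(string s) nothrow
{
    return s.byUTF!dchar.all!isDigit;
}

// Hypothetical helper: reports whether decoding produced U+FFFD.
// A literal U+FFFD already present in the input is reported as well.
bool hasReplacementChar(string s) nothrow
{
    return s.byUTF!dchar.any!(c => c == '\uFFFD');
}

unittest
{
    assert(allDigitsDecoded("2020"));
    assert(!allDigitsDecoded("20x0"));

    immutable(ubyte)[] raw = [0x31, 0xFF]; // '1' followed by an invalid byte
    auto broken = cast(string) raw;
    assert(hasReplacementChar(broken));
    assert(!allDigitsDecoded(broken));
}
```

The range is lazy, so nothing is allocated; the cost is that malformed input is silently replaced rather than rejected, which may not be what you want if invalid input should be an error.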


Can you explain better?

Unless you are working with different encodings than UTF-8 (like doing file or network operations) you shouldn't be needing std.encoding.


I'm expecting UTF8 and ASCII encoding from I/O

Thank you!
