Re: Reading text (I mean "real" text...)

Denis via Digitalmars-d-learn Sat, 20 Jun 2020 00:35:48 -0700

Digging into this a bit further --

POSIX defines a "print" class, which I believe is an exact fit. The Unicode spec doesn't define this class, which I presume is why D's std.uni library also omits it. But there is an isprint() function in libc, which I should be able to use (POSIX here). This function refers to the system locale, so it isn't limited to ASCII characters (unlike std.ascii:isPrintable).
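A minimal sketch of such a check from D, under some assumptions: isprint() itself only accepts values that fit in an unsigned char (or EOF), so for decoded code points the wide-character counterpart iswprint() is used here instead. The helper name isPrintableInLocale is made up for illustration, and the code assumes a POSIX system where C's wchar_t is 32 bits wide. Results for non-ASCII characters depend on the locale in effect.

```d
import core.stdc.locale : LC_CTYPE, setlocale;
import core.stdc.wctype : iswprint;
import std.stdio : writefln;

/// Hypothetical helper: true if `c` is in the current locale's "print" class.
/// Assumes a POSIX system where C's wchar_t is 32 bits wide, so a dchar
/// can be passed to iswprint() unchanged.
bool isPrintableInLocale(dchar c)
{
    return iswprint(c) != 0;
}

void main()
{
    // Adopt the locale from the environment; until this call the "C" locale is active.
    setlocale(LC_CTYPE, "");

    // Report the classification of a few sample code points.
    foreach (dchar c; "a \t\u00E9"d)
        writefln("U+%04X printable: %s", cast(uint) c, isPrintableInLocale(c));
}
```

Without the setlocale() call the program stays in the "C" locale, where only ASCII classification is guaranteed.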


So that's one down, two to go:

  Loop until newline or EOF
   (1) Read bytes or character             } Possibly
   (2) Decode UTF-8, exception if invalid  } together
   (3) Call isprint(), exception if not printable
  Return line

(This simplified outline obviously doesn't show how to deal with the complications arising from using buffers, handling codepoints that straddle the end of the buffer, etc.)
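For what it's worth, the outline can be sketched with Phobos alone if the buffering is left to File.readln. This is only one possible reading of the three steps, not a settled design: checkedLine is a hypothetical helper, iswprint() stands in for isprint() because the input is decoded code points, and the code assumes a POSIX system with a 32-bit wchar_t.

```d
import core.stdc.locale : LC_CTYPE, setlocale;
import core.stdc.wctype : iswprint;
import std.exception : enforce;
import std.stdio : stdin, writeln;
import std.string : chomp;
import std.utf : validate;

/// Hypothetical helper: steps (2) and (3) for one line that was already read.
/// Throws std.utf.UTFException on malformed UTF-8 and Exception on a
/// code point outside the locale's "print" class.
string checkedLine(string raw)
{
    validate(raw);                 // (2) reject malformed UTF-8
    string text = raw.chomp;       // drop the trailing line terminator
    foreach (dchar c; text)        // foreach decodes UTF-8 into code points
        enforce(iswprint(c) != 0, "non-printable character in input"); // (3)
    return text;
}

void main()
{
    setlocale(LC_CTYPE, "");       // use the environment's locale, not "C"

    string raw;
    // (1) readln returns the line including '\n'; an empty result means EOF.
    while ((raw = stdin.readln()).length != 0)
        writeln(checkedLine(raw));
}
```

Note that a tab is not in the "print" class, so this sketch rejects lines containing tabs.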

Where I'm still stuck is the read or read-and-auto-decode: this is where the waters get really muddy for me. Three different techniques for reading characters are suggested in this thread (iopipe, ranges, rawRead):
https://forum.dlang.org/thread/cgteipqqfxejngtpg...@forum.dlang.org
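To make the rawRead route a little more concrete, here is a rough sketch of the buffer-boundary problem mentioned earlier: after each read, any incomplete UTF-8 sequence at the end of the buffer is held back and carried over to the next read. The helper completePrefixLength and the 4096-byte buffer size are assumptions made for illustration, and the loop simply validates and echoes the data rather than splitting it into lines.

```d
import std.stdio : stdin, write;
import std.utf : validate;

/// Hypothetical helper: length of the longest prefix of `buf` that does not
/// end in the middle of a UTF-8 sequence. It does not validate the data.
size_t completePrefixLength(const(ubyte)[] buf)
{
    // Look back over at most 4 bytes for the lead byte of the last sequence.
    foreach (back; 1 .. 5)
    {
        if (back > buf.length) break;
        immutable ubyte b = buf[$ - back];
        if ((b & 0xC0) == 0x80) continue;          // continuation byte
        immutable size_t need = (b & 0xE0) == 0xC0 ? 2
                              : (b & 0xF0) == 0xE0 ? 3
                              : (b & 0xF8) == 0xF0 ? 4 : 1;
        // Hold the sequence back only if its tail has not arrived yet.
        return need > back ? buf.length - back : buf.length;
    }
    return buf.length;
}

void main()
{
    ubyte[4096] storage;
    size_t carried = 0;                            // bytes held over from last read
    for (;;)
    {
        auto got = stdin.rawRead(storage[carried .. $]);
        if (got.length == 0) break;                // EOF (carried bytes, if any, are a truncated sequence)
        auto data = storage[0 .. carried + got.length];
        immutable n = completePrefixLength(data);

        auto text = cast(const(char)[]) data[0 .. n];
        validate(text);                            // throws UTFException if malformed
        write(text);

        // Move the incomplete tail to the front for the next read.
        carried = data.length - n;
        foreach (i; 0 .. carried)
            storage[i] = data[n + i];
    }
}
```

By comparison, readln and byLine hide this bookkeeping, at the cost of less control over how the bytes are read.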

I'd like to stick with standard D or C libraries initially, so that rules out iopipe for now. What would really help is some details about what one read technique does particularly well vs. another. And is there a technique that seems more suited to this use case than the rest?


Thanks again
