On 9/10/18 1:45 AM, Chris wrote:

After a while your code will be cluttered with absurd stuff like this: `.byCodeUnit`, `.byGrapheme`, `.array` etc. Because of my experience with `splitter` et al., I tried to create my own parser to have better control over every step.

I considered that, but I'm still trying to make this buffer reference thing work. Phobos just needs to be fixed. This is actually not as hopeless as I once thought. But what needs to happen is that all of Phobos's algorithms need to be tested with `byCodeUnit` et al.
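To make the clutter concrete, here is a minimal sketch of how the same string looks through the three views mentioned above. The function names are made up for illustration; only `byCodeUnit`, `byGrapheme` and `walkLength` are Phobos.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// Hypothetical helpers: each one counts the elements Phobos iterates
// over for a different view of the same UTF-8 string.

// Raw UTF-8 code units (no decoding).
size_t countCodeUnits(string s) { return s.byCodeUnit.walkLength; }

// Code points: what a plain string yields as a range (auto-decoding).
size_t countCodePoints(string s) { return s.walkLength; }

// Graphemes: user-perceived characters.
size_t countGraphemes(string s) { return s.byGrapheme.walkLength; }

unittest
{
    // 'e' followed by U+0301 (combining acute accent).
    string s = "cafe\u0301";
    assert(countCodeUnits(s) == 6);
    assert(countCodePoints(s) == 5);
    assert(countGraphemes(s) == 4);
}
```

An algorithm that is only ever tested with plain strings exercises just the middle case, which is why the other two views need their own tests.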

After a few *minutes* of testing things I ran into this bug [1] that didn't get fixed till early 2018. I never started to write my own step-by-step parser. I'm glad I didn't.

It actually was fixed accidentally in 2017 in this PR: https://github.com/dlang/druntime/pull/1952. The bug was closed in 2018 when someone noticed the code no longer failed.

Essentially, the whole string switch algorithm was replaced with a complete rewrite that takes a better approach. This is a great example of why we should be moving more of the compiler magic into the library -- it's just easier to write and understand there.

I wish people would begin to realize that string handling is a basic necessity and that the correct handling of strings is of utmost importance. Please keep us updated on how things work out (or not) for you.

Absolutely, D needs to have great support for string parsing and manipulation. The potential is awesome.

I will keep it up. What I'm trying to fix is this: using std.algorithm to extract pieces from a buffer, and then using the positions of those pieces in the buffer to determine things (i.e. parsing), is really difficult without some stupid requirements like pointer math.
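For illustration, this is roughly the kind of pointer math I mean. It is only a sketch: `offsetIn` and `fieldOffsets` are names invented for this example, not Phobos functions, and the trick assumes each piece really is a slice of the original buffer.

```d
import std.algorithm.iteration : splitter;

// Hypothetical helper: offset of `piece` inside `buffer`.
// Assumes `piece` is a slice of `buffer`; not @safe, since it
// subtracts raw pointers.
size_t offsetIn(const(char)[] piece, const(char)[] buffer)
{
    assert(piece.ptr >= buffer.ptr
        && piece.ptr + piece.length <= buffer.ptr + buffer.length,
        "piece is not a slice of buffer");
    return cast(size_t)(piece.ptr - buffer.ptr);
}

// Hypothetical helper: start offset of every field that splitter yields.
size_t[] fieldOffsets(string buffer, char sep)
{
    size_t[] result;
    foreach (piece; buffer.splitter(sep))
        result ~= piece.offsetIn(buffer);
    return result;
}

unittest
{
    assert(fieldOffsets("a,bb,ccc", ',') == [size_t(0), 2, 5]);
}
```

It works for arrays, but once the pieces come out of a wrapper range that does not hand back slices of the buffer, there is no pointer to subtract and the position is lost.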

[Please, nobody answer my post pointing out that a) we don't understand Unicode and b) that it's an insult to the Universe to draw attention to flaws that keep pestering us on an almost daily basis - without trying to fix them ourselves stante pede. As is clear from Steve's efforts, the Universe doesn't seem to care.]

I don't characterize it as the universe not caring. Phobos has a legacy problem with string handling, and it needs to somehow be addressed -- either by painfully extracting the problem, or painfully working around it. I don't think anyone here thinks there isn't a problem or that it's insulting to bring it up. But anything that needs to be done is painful either way, which is why it's not happening very fast.

-Steve
