string-ish range/stream from curl ubyte[] chunks?

Vlad via Digitalmars-d-learn Fri, 16 May 2014 14:00:31 -0700

Hello D programmers,

I am toying with writing my own HTML parser as a pet project, and I strive to have a range API for the tokenizer and the parser output itself.

However, it occurs to me that in real-life browsers the advantage of this type of 'streaming' parsing would come from also treating the string that serves as input to the tokenizer as a 'stream'/'range'.

While D's *string types do work as ranges, what I want to write is a 'ChunkDecoder' range that would take curl 'byChunk' output and make it consumable by the tokenizer.

Now, the problem: string itself has ElementType!string == dchar. Consuming a string a dchar at a time looks like a wasteful operation if e.g. your string is UTF-8 or UTF-16.
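
To illustrate the point, here is a minimal sketch of the difference between iterating a string through the range primitives (decoded to dchar) and looking at its raw code units. The helper name countBoth is made up for this example; ElementType, walkLength and std.string.representation are the only library pieces it relies on.

```d
import std.range : ElementType, walkLength;
import std.string : representation;

// Range primitives auto-decode narrow strings: the element type is dchar,
// even though the underlying storage is UTF-8 code units.
static assert(is(ElementType!string == dchar));

// representation() exposes the same memory as raw code units, no decoding.
static assert(is(typeof("".representation) == immutable(ubyte)[]));

/// Hypothetical helper: returns [code points, code units] for `s`.
/// The first count walks the string with decoding, the second does not.
size_t[2] countBoth(string s)
{
    return [s.walkLength, s.representation.length];
}

void main()
{
    // U+00E9 is one code point but two UTF-8 code units.
    auto counts = countBoth("caf\u00E9");
    assert(counts[0] == 4); // code points, found by decoding
    assert(counts[1] == 5); // code units, a plain array length
}
```

Searching for ASCII delimiters such as '<' or '>' on the code-unit view should not need any decoding, which is the kind of access I would like the ChunkDecoder to allow.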

So, naturally, I would like to use indexOf() - instead of countUntil() - and opSlice (without opDollar?) on my ChunkDecoder (forward) range.

Q: Is anything like this already in use somewhere in the standard library or a project you know?

Q2: Or do you have any pointers for what the smallest API would be for a string-like range class?


And bonus:

Q3: any uses of such a string-ish range in other standard library methods that you can think of and could be contributed to? e.g. suppose this doesn't exist and I / we come up with a proposal of minimal API to consume a string from left to right.


Thanks for your time and your suggestions!
