On 8/8/18 4:13 PM, Walter Bright wrote:
> On 8/6/2018 6:57 AM, Steven Schveighoffer wrote:
>> But I'm not sure if the performance is going to be the same, since now it will likely FORCE autodecoding on the algorithms that have specialized versions to AVOID autodecoding (I think).
>
> Autodecoding is expensive which is why the algorithms defeat it. Nearly none actually need it.
>
> You can get decoding if needed by using .byDchar or .by!dchar (forgot which it was).

There are byCodePoint and byCodeUnit; byCodePoint forces auto-decoding.
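
For readers following along, here is a minimal sketch of the difference, assuming a Phobos build in which narrow strings still auto-decode. The helper names codePoints and codeUnits are made up for illustration; walkLength and std.utf.byCodeUnit are the Phobos calls being shown.

```d
import std.range : ElementType, walkLength;
import std.utf : byCodeUnit;

// As a range, a plain string yields decoded dchar elements.
static assert(is(ElementType!string == dchar));

/// Hypothetical helper: counts code points by walking the auto-decoded range.
size_t codePoints(string s)
{
    return s.walkLength;
}

/// Hypothetical helper: counts code units; byCodeUnit gives O(1) length
/// and random access without any decoding.
size_t codeUnits(string s)
{
    return s.byCodeUnit.length;
}

void main()
{
    // 'é' occupies two UTF-8 code units, so the two counts differ.
    assert(codePoints("héllo") == 5);
    assert(codeUnits("héllo") == 6);
}
```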

The problem is, I want to use this wrapper just like it was a string in all respects (including the performance gains had by ignoring auto-decoding).

Not trying to give too much away about the library I'm writing, but the problem I'm trying to solve is parsing out tokens from a buffer. I want to delineate the whole, as well as the parts, but it's difficult to get back to the original buffer once you split and slice up the buffer using Phobos functions.

Consider that you are searching for something in a buffer. Phobos provides all you need to narrow down your range to the thing you are looking for. But it doesn't give you a way to figure out where you are in the whole buffer.

Up till now, I've done it by weird length math, but it gets tiring (see for instance: https://github.com/schveiguy/fastaq/blob/master/source/fasta/fasta.d#L125). I just want to know where the darned thing I've narrowed down is in the original range!
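
As an illustration of the kind of length math meant here (a sketch only, not the code at the link above): when the haystack is a plain string, find returns the tail of the buffer, so the position of the match can be recovered from the two lengths. The function names tokenStart and offsetIn are hypothetical, and offsetIn assumes its second argument really is a slice of the first.

```d
import std.algorithm.searching : find;

/// Hypothetical helper: offset of the first match of `token` in `buf`.
/// Returns buf.length when there is no match, because find then
/// returns an empty range.
size_t tokenStart(string buf, string token)
{
    auto rest = buf.find(token); // slice from the match to the end of buf
    return buf.length - rest.length;
}

/// Hypothetical helper: offset of `part` within `whole` via pointer math.
/// Only meaningful if `part` is a slice of `whole`.
size_t offsetIn(const(char)[] whole, const(char)[] part)
{
    assert(part.ptr >= whole.ptr
        && part.ptr + part.length <= whole.ptr + whole.length);
    return cast(size_t)(part.ptr - whole.ptr);
}

void main()
{
    string buf = ">seq1\nACGT";
    assert(tokenStart(buf, "ACGT") == 6);
    assert(offsetIn(buf, buf[6 .. $]) == 6);
}
```

Both tricks only work while the result is still a slice of the original array; once a lazy range adapter sits in between, neither the lengths nor the pointers are available.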

So I thought this wrapper would let you use things the way you always do, except that at any point you can extract a piece of information (a buffer reference) that shows where you are in the original buffer. That part is quite easy to do; the problem is getting the wrapper to be a drop-in replacement for the original type.

Here's where I'm struggling: a string provides indexing, slicing, length, etc., but Phobos ignores those when it treats a string as a range, so I can't make a new type that behaves the same way. Not only that, but I'm finding the specializations of algorithms only work on the type "string", and nothing else.
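
The mismatch can be seen in the range traits. The sketch below is an assumption-laden illustration, not the library being described: BufRef is a hypothetical wrapper holding a slice plus its offset, and the static asserts reflect how std.range.primitives and std.traits classify strings in an auto-decoding Phobos.

```d
import std.range.primitives : hasLength, hasSlicing, isRandomAccessRange;
import std.traits : isSomeString;

/// Hypothetical wrapper: a string slice plus its offset in the original buffer.
struct BufRef
{
    string data;
    size_t pos; // offset of data[0] in the original buffer

    @property bool empty() const { return data.length == 0; }
    @property char front() const { return data[0]; }
    @property char back() const { return data[$ - 1]; }
    void popFront() { data = data[1 .. $]; ++pos; }
    void popBack() { data = data[0 .. $ - 1]; }
    @property BufRef save() { return this; }
    @property size_t length() const { return data.length; }
    alias opDollar = length;
    char opIndex(size_t i) const { return data[i]; }
    // Slicing keeps track of where the new piece starts.
    BufRef opSlice(size_t a, size_t b) { return BufRef(data[a .. b], pos + a); }
}

// A string has .length and indexing, but the range traits deny both.
static assert(!hasLength!string);
static assert(!isRandomAccessRange!string);

// The wrapper is a full random-access range with slicing...
static assert(isRandomAccessRange!BufRef && hasSlicing!BufRef);

// ...but overloads constrained on isSomeString will never select it.
static assert(isSomeString!string && !isSomeString!BufRef);

void main()
{
    auto whole = BufRef("hello world", 0);
    auto word = whole[6 .. 11];
    assert(word.pos == 6 && word.length == 5 && word.front == 'w');
}
```

So the wrapper cannot imitate a string in both directions at once: as a range it looks like a random-access range of char, which a string is not, and as a type it is not a string, so string-only fast paths pass it by.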

I'll try using byCodeUnit and see how it fares.

-Steve
