On 10/13/2010 04:48 PM, Shin Fujishiro wrote:
No.  Uh... Let's forget about ranges for the next two paragraphs.

Consider decoding a base64 unit (4 chars) by hand.  You may want to
(a) pull four chars from a source, and decode them into three bytes.
Then, (b) you'll push the decoded bytes to a destination.  Done.

Conversion works naturally if (a) data can be pulled from a source and
(b) converted data can be pushed to a destination.  If either of them
can't be achieved, we need an extra cache to pool dangling data.  The
"control" I wrote meant these two points.

Then, can ranges support pull and push?  No, unfortunately.  They are
restricted to pull *or* push semantics.  Decorator input ranges can pull
data from source ranges, but can't push converted data to anywhere.
Output ranges have similar inconveniences.

So, ranges are not best for conversion drivers IMO.  Ranges are at
their best when used as sources and destinations.  We may support
decorator ranges, but they should not be the main API.

Hey, I'm not dissing ranges nor your implementation. :-)  I'm just
afraid of people making everything ranges in the first place!

This ties into our earlier discussion about streams, and the ongoing discussion on the newsgroup.

Shin, there's no single interface that satisfies all streaming needs. Some streams produce data at a variable rate. For those, this is best:

void read(ref ubyte[] data);

So the client has to pull data in chunks of unpredictable length and deal with it. Other streams need to hold internal buffers that are not under the user's control. For those, a straight range interface exposing ubyte[] is enough:

@property ubyte[] front();

Still other streams work best with a user-supplied buffer whose size is also decided by the user:

size_t read(ubyte[] buffer);
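
As a rough illustration of that last style, here is a minimal sketch of a stream that fills a caller-supplied buffer, plus a client loop that drains it. MemoryStream and drain are hypothetical names made up for this example, and "read returns 0 at end of data" is an assumed convention, not something fixed by the signature above.

```d
import std.algorithm.comparison : min;

// Hypothetical in-memory stream; only the read signature comes from the text.
struct MemoryStream
{
    const(ubyte)[] remaining; // data not yet handed to the client

    // Fills as much of the caller's buffer as possible and returns the
    // number of bytes written. Assumed convention: 0 means end of data.
    size_t read(ubyte[] buffer)
    {
        immutable n = min(buffer.length, remaining.length);
        buffer[0 .. n] = remaining[0 .. n];
        remaining = remaining[n .. $];
        return n;
    }
}

// Hypothetical client loop: the user picks the buffer size, the stream
// never allocates, and the total number of bytes seen is returned.
size_t drain(S)(ref S stream, size_t bufferSize)
{
    auto buffer = new ubyte[bufferSize];
    size_t total;
    for (;;)
    {
        immutable n = stream.read(buffer);
        if (n == 0)
            break;
        total += n; // a real client would process buffer[0 .. n] here
    }
    return total;
}
```

The point of the design is that buffer ownership and size stay entirely with the caller, which is exactly what the range-style front() above cannot offer.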

Now let's talk about decorator streams, which must read from one stream and write to another, in particular M:N decorators that consume and produce data at different rates. Depending on whether M > N or M < N _and_ on which of the APIs above is in use, the M:N decorator would have to do its own buffering. I don't think there's a simple way out that satisfies everyone.
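
To make the buffering concrete, here is a minimal sketch of a 4:3 decorator for Base64 decoding that accepts chunks of arbitrary length. ChunkDecoder and sextet are hypothetical names, the sink is assumed to be anything with a put(ubyte) method, and the standard alphabet with '=' padding is assumed. This is not the implementation under review.

```d
// Hypothetical 4:3 decorator: consumes 4 chars per unit, emits up to 3 bytes.
struct ChunkDecoder(Sink)
{
    Sink* sink;      // destination, assumed to offer put(ubyte)
    char[4] pending; // dangling chars carried over between chunks
    size_t count;    // how many entries of pending are in use

    // Accepts a chunk of any length; chunk boundaries need not fall on
    // 4-char unit boundaries, which is why the cache above is needed.
    void put(const(char)[] chunk)
    {
        foreach (c; chunk)
        {
            pending[count++] = c;
            if (count == 4)
            {
                flushUnit();
                count = 0;
            }
        }
    }

    // Decodes one complete unit and pushes the bytes to the sink.
    private void flushUnit()
    {
        uint bits;
        size_t pad;
        foreach (c; pending)
        {
            bits <<= 6;
            if (c == '=')
                ++pad;
            else
                bits |= sextet(c);
        }
        sink.put(cast(ubyte)(bits >> 16));
        if (pad < 2)
            sink.put(cast(ubyte)(bits >> 8));
        if (pad < 1)
            sink.put(cast(ubyte) bits);
    }
}

// Maps one Base64 character (standard alphabet assumed) to its 6-bit value.
private uint sextet(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    throw new Exception("invalid Base64 character");
}
```

With a std.array appender as the sink, feeding "TW", then "FuTQ", then "==" yields the same bytes as feeding "TWFuTQ==" at once; the cost is the four-char cache inside the decorator.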

About the ongoing discussion about Base64: I do see a few problems with the current interface, although none of them is major.

1. The template parameters '!' and '/' are not justified. They should be runtime parameters. Rule of thumb: use generic code when you stand to profit.

2. This function:

size_t encode(Range)(in ubyte[] source, Range range);

has one issue: it forces the input to be an array, although it could work with any input range of ubyte that has a length. Suggestion:

size_t encode(R1, R2)(R1 source, R2 target);

Constrain the template in whatever way keeps the implementation efficient. Ideally you should have roughly the same performance with a ubyte[] as before.

3. The same applies to decode. This is actually more important because you might want to decode streams of dchar. This is how many streams will come through, even though they are technically ASCII.
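
For point 2, one possible way to constrain the suggested signature is sketched below. The name encodeSketch, the exact constraints, the use of auto ref for the target, and the hard-coded standard alphabet with '=' padding are all assumptions made for illustration. The real code would likely want a specialized fast path for ubyte[].

```d
import std.range.primitives : ElementType, empty, front, hasLength,
    isInputRange, isOutputRange, popFront, put;

// Hypothetical sketch: encodes any input range of ubyte with a length
// into any output range of char. Returns the number of chars written.
size_t encodeSketch(R1, R2)(R1 source, auto ref R2 target)
if (isInputRange!R1 && hasLength!R1 && is(ElementType!R1 : ubyte)
    && isOutputRange!(R2, char))
{
    static immutable alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    size_t written;
    while (!source.empty)
    {
        // Pull up to three bytes; missing ones stay zero.
        ubyte[3] unit;
        size_t n;
        for (; n < 3 && !source.empty; ++n)
        {
            unit[n] = source.front;
            source.popFront();
        }

        // Push four chars, padding with '=' when the unit was short.
        immutable uint bits = (unit[0] << 16) | (unit[1] << 8) | unit[2];
        put(target, alphabet[(bits >> 18) & 0x3F]);
        put(target, alphabet[(bits >> 12) & 0x3F]);
        put(target, n > 1 ? alphabet[(bits >> 6) & 0x3F] : '=');
        put(target, n > 2 ? alphabet[bits & 0x3F] : '=');
        written += 4;
    }
    return written;
}
```

The hasLength constraint is not used by this naive loop; it is kept because a tuned version could use the length to size the output up front.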

I'm not saying we should use ranges everywhere, but if it doesn't really cost anything, accepting a range is better than an array.
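
As an illustration of point 3, a decode that accepts any input range whose elements convert to dchar might look like the sketch below. decodeSketch and sextetOf are hypothetical names; the standard alphabet, '=' padding, and throwing on malformed input are assumptions, not the behavior of the code under review.

```d
import std.range.primitives : ElementType, empty, front, isInputRange,
    isOutputRange, popFront, put;

// Hypothetical sketch: decodes any input range of char-like elements
// (string, dstring, ubyte[], ...) into any output range of ubyte.
// Returns the number of bytes written.
size_t decodeSketch(R1, R2)(R1 source, auto ref R2 target)
if (isInputRange!R1 && is(ElementType!R1 : dchar)
    && isOutputRange!(R2, ubyte))
{
    size_t written;
    while (!source.empty)
    {
        uint bits;
        size_t pad;
        // Pull one 4-char unit from the source.
        foreach (i; 0 .. 4)
        {
            if (source.empty)
                throw new Exception("truncated Base64 unit");
            immutable dchar c = source.front;
            source.popFront();
            bits <<= 6;
            if (c == '=')
                ++pad;
            else
                bits |= sextetOf(c);
        }
        // Push up to three bytes to the target.
        put(target, cast(ubyte)(bits >> 16));
        ++written;
        if (pad < 2) { put(target, cast(ubyte)(bits >> 8)); ++written; }
        if (pad < 1) { put(target, cast(ubyte) bits); ++written; }
    }
    return written;
}

// Maps one Base64 character (standard alphabet assumed) to its 6-bit value.
uint sextetOf(dchar c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    throw new Exception("invalid Base64 character");
}
```

The same template instance pattern serves a string, a dstring, or an array of bytes, so callers with dchar streams need no conversion step of their own.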

Regarding Daniel's approach with char/byte level ranges through and through, in an ideal world I'd agree. But I fear that the implementation would not be as efficient. (I suggest you benchmark it against Masahiro's.) Also, practically, more often than not I'll want to work one chunk at a time, not one byte/char at a time.


Andrei
_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos
