On Tue, 28 Dec 2010 00:02:29 -0700, Andrei Alexandrescu wrote:

I've put together over the past days an embryonic streaming interface. It separates transport from formatting, input from output, and buffered from unbuffered operation.

http://erdani.com/d/phobos/std_stream2.html

There are a number of questions interspersed. It would be great to start a discussion using that design as a baseline. Please voice any related thoughts - thanks!


Andrei

Here are my initial thoughts and responses to the questions. Now to go read everyone else's.

Re: TransportBase
Q1: Internally, I think it is a good idea for a transport to support lazy opening, but I'm not sure the benefit of exposing this to user code is worth the hassle. If open is supported, I don't think it should take any parameters.
Q2: If seek isn't considered universal, having an isSeekable property and a rewind method might be beneficial. But while I know of transports where seeking might be slow, I'm not sure which ones wouldn't support it at all, or would only support rewind.
Q3: Yes to seek + tell, and to getting rid of seekFromXXX.
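
To make the seek + tell point concrete, here is a rough sketch of how the relative seeks fall out of the two primitives. The interface and the names SeekableTransport, seekRelative and MemoryTransport are made up for illustration; they are not taken from the std_stream2 page.

```d
import std.exception : enforce;

// Hypothetical interface; the names are illustrative, not the proposed API.
interface SeekableTransport
{
    @property bool isSeekable();
    ulong tell();               // current absolute position
    void seek(ulong position);  // absolute seek
}

// A seekFromCurrent-style call reduces to tell followed by seek.
void seekRelative(SeekableTransport t, long offset)
{
    enforce(t.isSeekable, "transport does not support seeking");
    immutable target = cast(long) t.tell() + offset;
    enforce(target >= 0, "seek before start of stream");
    t.seek(cast(ulong) target);
}

// rewind is just an absolute seek to the start.
void rewind(SeekableTransport t)
{
    t.seek(0);
}

// In-memory stand-in, used only to exercise the sketch.
class MemoryTransport : SeekableTransport
{
    private ulong pos;
    @property bool isSeekable() { return true; }
    ulong tell() { return pos; }
    void seek(ulong position) { pos = position; }
}

void main()
{
    auto t = new MemoryTransport;
    t.seek(10);
    seekRelative(t, -4);
    assert(t.tell() == 6);
    rewind(t);
    assert(t.tell() == 0);
}
```

A transport that can only rewind could return false from isSeekable and still offer rewind, which is the split Q2 is asking about.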

Re: UnbufferedInputTransport
Q1: I think that read should be allowed to return less than the buffer's length, but since the transport should know the most efficient way to block on an input, I don't think returning a zero-length array is valid.
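
As an illustration of what that contract means for callers, here is a sketch of a read loop that copes with short reads. The UnbufferedInput interface, the empty property used to signal end of stream, and the ChunkySource test double are all assumptions of mine, not part of the proposal.

```d
import std.algorithm : min;

// Hypothetical unbuffered source; not the std_stream2 API.
interface UnbufferedInput
{
    @property bool empty();  // assumed end-of-stream signal
    // Blocks until at least one byte is available, then returns the filled
    // prefix of buffer. The result may be shorter than buffer, but is never
    // zero-length while !empty.
    ubyte[] read(ubyte[] buffer);
}

// Fill buffer completely unless the stream ends first.
ubyte[] readFully(UnbufferedInput input, ubyte[] buffer)
{
    size_t filled = 0;
    while (filled < buffer.length && !input.empty)
    {
        auto chunk = input.read(buffer[filled .. $]);
        assert(chunk.length > 0, "read must block rather than return nothing");
        filled += chunk.length;
    }
    return buffer[0 .. filled];
}

// Test double that hands out at most step bytes per call.
class ChunkySource : UnbufferedInput
{
    private const(ubyte)[] data;
    private size_t step;

    this(const(ubyte)[] data, size_t step)
    {
        this.data = data;
        this.step = step;
    }

    @property bool empty() { return data.length == 0; }

    ubyte[] read(ubyte[] buffer)
    {
        immutable n = min(step, data.length, buffer.length);
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return buffer[0 .. n];
    }
}

void main()
{
    ubyte[] payload = [1, 2, 3, 4, 5, 6, 7];
    ubyte[] firstExpected = [1, 2, 3, 4, 5];
    ubyte[] secondExpected = [6, 7];

    auto src = new ChunkySource(payload, 3);
    auto buf = new ubyte[5];
    assert(readFully(src, buf) == firstExpected);
    assert(readFully(src, buf) == secondExpected);
    assert(src.empty);
}
```

If a zero-length result were legal, the loop above would have to guess between spinning and sleeping, which is exactly the decision the transport is better placed to make.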

Re: BufferedInputTransport
Q1: I think it's valid for the front of a buffered input to be empty: an empty front simply means that popFront should be called. popFront should be required to fill at least some of front (see UnbufferedInputTransport Q1).

Q2: Semantically, 'advance' feels to me like popFront: I want to advance my input and I'm intending to work with it. The seek routines, on the other hand, feel more like indexing: I want to do something with that index, but I do not necessarily need everything in between. In particular, I'd expect long seeks to reduce the front array to zero elements, while I'd expect advance to enlarge the internal buffer if necessary.
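
Here is a toy model of the distinction I have in mind, over an in-memory array instead of a real transport. The BufferedInput struct and the exact behaviour of advance are my own reading of the semantics, not what the proposal specifies: popFront and advance leave a usable front behind, while seek leaves it empty until the next popFront.

```d
import std.algorithm : min;

// Hypothetical buffered input; source stands in for the real transport.
struct BufferedInput
{
    private const(ubyte)[] source;
    private size_t pos;     // absolute position of front[0]
    private size_t window;  // number of bytes currently exposed as front
    private size_t chunk;   // preferred refill size

    this(const(ubyte)[] source, size_t chunk)
    {
        this.source = source;
        this.chunk = chunk;
    }

    // Exhausted only when nothing is left to expose.
    @property bool empty() const { return pos == source.length; }

    // May legitimately be zero-length, e.g. before the first popFront.
    @property const(ubyte)[] front() const
    {
        return source[pos .. pos + window];
    }

    // Discard the current front and fill at least some of the next one.
    void popFront()
    {
        pos += window;
        window = min(chunk, source.length - pos);
    }

    // Assumed reading of advance: skip n bytes, then refill so front is usable.
    void advance(size_t n)
    {
        pos = min(pos + n, source.length);
        window = min(chunk, source.length - pos);
    }

    // seek behaves like indexing: reposition only, leaving front empty.
    void seek(size_t target)
    {
        pos = min(target, source.length);
        window = 0;
    }
}

void main()
{
    ubyte[] data = [10, 20, 30, 40, 50, 60];
    auto input = BufferedInput(data, 4);

    assert(input.front.length == 0 && !input.empty);  // empty front is valid
    input.popFront();
    assert(input.front.length == 4);

    input.advance(1);
    assert(input.front[0] == 20);

    input.seek(5);
    assert(input.front.length == 0);  // nothing buffered after a seek
    input.popFront();
    assert(input.front.length == 1 && input.front[0] == 60);

    input.popFront();
    assert(input.empty);
}
```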

Re: Formatter
Q1: I don't think formatters should be responsible for buffering, but certain formats require rather extensive buffering that can't be provided by the current buffered transport classes (BSON comes to mind). My initial impression is that seek, etc. should be able to handle these use cases, but adding a buffer hint setter/getter might be a good idea. The idea is that if the formatter knows that it will come back to this part of the stream, it can set a hint, so the buffer can make a more intelligent choice of when/where to flush internally.
Q2: putln only makes sense in terms of text-based streams, plus it adds a large number of methods to implement. So I'm a bit on the fence about it. I think writefln would be a better solution to a similar problem.
Q3: The issue I see with a reflection-based solution is that the runtime reflection system should respect the visibility of the member: i.e. private variables shouldn't be accessible. But to do effective serialization, private members are generally required. As for the more technical aspects, combining __traits(derivedMembers,T) and BaseClassesTuple!T can determine which types override toString, etc.
Q4: Reading/writing the same sub-object is an internal matter, in my opinion. The really important aspect is handling slices, etc. nicely for formats that support cyclic graphs. For that, the only thing missing is put(void*) to handle pointers (I think).
Q5: I think handling AAs with hooks is the best option with this design, though I only see a need for start and end. The major issue is that reading should be done as a tuple, which basically breaks the interface idiom. Alternatively, callbacks could be used to set read's mode: i.e. readKeyMode, readValueMode & putKeyMode, putValueMode.
Q6: Well, toString and cast(int/double/etc.) should go a long way towards covering most of the printf specifiers.
Q7: Yes, writefln should probably be supported for text-based transports.
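
For the technical half of Q3, this is roughly the combination I mean. The helper names declares and toStringOwner are invented for this sketch, and it uses std.meta.AliasSeq, which is how the type list would be spelled in current Phobos.

```d
import std.meta : AliasSeq;
import std.traits : BaseClassesTuple;

// True if T itself declares a member called name (inherited members excluded).
bool declares(T)(string name)
{
    string[] names = [__traits(derivedMembers, T)];
    foreach (n; names)
        if (n == name)
            return true;
    return false;
}

// Name of the most derived class in T's hierarchy that declares toString.
string toStringOwner(T)()
{
    // T first, then its bases in order up to Object.
    foreach (C; AliasSeq!(T, BaseClassesTuple!T))
        if (declares!C("toString"))
            return C.stringof;
    return null;
}

class Plain {}

class Named
{
    override string toString() { return "Named"; }
}

class Child : Named
{
    int x;
}

void main()
{
    // Everything here is resolved at compile time.
    static assert(!declares!Plain("toString"));
    static assert(declares!Named("toString"));
    static assert(toStringOwner!Child() == "Named");
    static assert(toStringOwner!Plain() == "Object");
}
```

A formatter could use a check like this to decide between calling a user-supplied toString and falling back to member-by-member output. It does nothing about the visibility problem, which is the harder part.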

Re: Unformatter
Q1: Implementations should be free (and indeed encouraged) to minimize allocations by returning a reusable buffer for arrays. So the stream should be responsible for inferring the size of an array.
Q2: See Formatter Q3.
Q3: See Formatter Q5.


Other Formatter/Unformatter thoughts:
For objects, several formats also require additional meta information (e.g. a unique string id, member offset position, etc.), while others don't.
