Re: redefining "put" and "OutputRange"

Dmitry Olshansky Fri, 30 Aug 2013 08:21:07 -0700

30-Aug-2013 18:38, monarch_dodra wrote:


Which is all good and well, but seeing this:

    static if (is (R == T function(const( char)[]), T) ||
               is (R == T delegate(const( char)[]), T))
        enum isSink = 1;
    else static if (is (R == T function(const(wchar)[]), T) ||
                    is (R == T delegate(const(wchar)[]), T))
        enum isSink = 2;
    else static if (is (R == T function(const(dchar)[]), T) ||
                    is (R == T delegate(const(dchar)[]), T))
        enum isSink = 4;
    else
        enum isSink = 0;

Doesn't inspire confidence - it's special casing on (w|d)char arrays,
again... Let's hopefully stop spreading this plague everywhere,
especially under the banner of generality.


It's a special case for sinks, yes. I'm not a fan of this, but I think
it is the *single* case we can trust. (More on this below)


No thanks. Full functionality outweighs trusted but crippled.

The real reason I'm starting this thread is I believe the current way
"put" works leads to a *MASSIVE*, *HORRIFYING* issue. I dare not say it:
Escaping references to local stack variables (!!!).


It is a dangerous primitive. It's not a good idea to wrap everything
in safety bags and specialize a single case - arrays, and not even
an appender of (w|d)char[].

Instead it's once again a case where the primitive needs a better
high-level contract, one inexpressible in simple terms such as those
@safe-ty provides.

The rule is: an OutputRange must not hold references to any slices given.
And this is trivially true for many of the current ranges.
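
A minimal sketch of a sink that honours this rule, assuming a
hypothetical CopyingSink type (not part of Phobos). It copies the
contents of each chunk before returning, so it does not matter whether
the chunk was a real slice or a temporary one-element slice that put
built around a single char:

```d
import std.range : put;

// Hypothetical sink for illustration only; not a Phobos type.
struct CopyingSink
{
    char[] data;

    // Copies the chunk's contents before returning, so the sink keeps
    // no reference to memory owned by the caller.
    void put(const(char)[] chunk)
    {
        data ~= chunk;
    }
}

// Feeds the sink a whole slice, then a single char.
char[] demo()
{
    CopyingSink sink;
    put(sink, "ab");
    put(sink, 'c');
    return sink.data;
}

void main()
{
    assert(demo() == "abc");
}
```

A sink that stored `chunk` itself instead of appending its contents
would break the rule, and that is the case where the escaping reference
could bite.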


OutputRange really just means that put(r, e) resolves one way or
another. And it also fundamentally depends on what you consider the
"element type".

You put too much faith in the source code alone. Not every assumption is
written in the source (though it probably should be).

For example, int[][] is an output range for the element int[]. It makes
a copy of said element (int[]), but it certainly *won't* copy the
contents of that slice.

The main reason for an output range is to absorb data one by one or in
chunks (= slices). In that sense int[][] is a bad output range.

I do not really care for a formalism that defines what is an element type
here.
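
To make the int[][] case concrete, here is a small sketch (the function
name is made up for illustration). It assumes put on an array assigns
the element to the front slot and advances, which is how std.range
handles arrays as far as I know:

```d
import std.range : put;

// Returns true if the stored element still aliases the caller's slice.
bool storedElementAliasesInput()
{
    int[] element = [1, 2, 3];
    int[][] storage = new int[][](1);  // room for one int[] element
    int[][] output = storage;          // put advances `output`, so keep `storage`

    put(output, element);              // stores the slice itself, not its contents
    element[0] = 42;                   // mutate through the original slice

    return storage[0][0] == 42;
}

void main()
{
    assert(storedElementAliasesInput());
}
```

The element (the slice) is copied, the ints behind it are shared.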

I'd like to make a proposition: "put" needs to be changed to *not*
accept putting an E into something that accepts E[]. There is simply *no
way* to do this safely, and without allocating (both of which, IMO, are
non-negotiable).


Just relax and step back for a moment. The bug in question is
painfully easy to blow up, so the chances of it being HORRIBLE are quite
low (it's a loud bug). Safety is cool, but I expect that output ranges
are designed with the idea of copying something somewhere or absorbing or
accumulating.


I'd agree, if output ranges were actually "designed".


And they were.

Right now, the
basic definition is that an "OutputRange" collects "Elements". "put"
extends the supported "Elements".

The truth is that the format sink "(const(char)[]){}" is the *only*
OutputRange that collects "Elements" but whose signature is one that
accepts a slice. This "flaws" the slice/element notion.


Because it was lacking in performance the most.

If format sinks were defined as "(char){}" to begin with, then
everything would work fine (and *does*),

And would be slowly crawling into oblivion. That said, std.stdio is slow
even w/o put-ing char by char (+ a char is not complete, and thus would
require buffering on the other side of the fence).


but this is not the case today,

and that is the *only* reason I made an exception for them.


Chances are you missed ubyte/ubyte[] of std.digest.

For objects that define put/opCall, it is not very complicated to
have two different signatures for "put(E[])"/"opCall(E[])" *and*
"put(E)"/"opCall(E)". This makes it explicit what is and isn't accepted.

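A rough sketch of what such a pair of overloads could look like, using a
hypothetical TwoWaySink (illustration only, not an existing type). The
counters are there just to show which overload each call reaches:

```d
import std.range : put;

// Hypothetical sink with explicit overloads; illustration only.
struct TwoWaySink
{
    char[] data;
    size_t singles;  // calls that took one element
    size_t chunks;   // calls that took a slice

    void put(char c)          { data ~= c; ++singles; }
    void put(const(char)[] s) { data ~= s; ++chunks; }
}

// Puts one element and one slice, then returns the sink for inspection.
TwoWaySink demo()
{
    TwoWaySink sink;
    put(sink, 'a');   // resolves to put(char)
    put(sink, "bc");  // resolves to put(const(char)[])
    return sink;
}

void main()
{
    auto s = demo();
    assert(s.data == "abc");
    assert(s.singles == 1 && s.chunks == 1);
}
```

With both overloads present, a single element never has to be wrapped
in a temporary slice.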

And that will subtly break some genuinely fine code...


It would "explicitly" break code


... and that is bad ...

that may (or may *not*) be fine.


The point is that if it wasn't fine, it wouldn't survive a day in the wild.

Luckily enough, the problem never existed with input ranges: "int[][]"
never accepted "int", so there is no problem there.


This is it - a confusion between an output range of int[]s that accepts
them one by one and an output range of ints that accepts them in chunks.


I think the problem is "put" overstepping its boundaries. If
"r.put(someSlice)" compiles, "put" has no reason to think that R
actually owns the elements in the slice.

It should, and this is where we differ, I guess. I can't think of a useful
output range that stores away aliases to slices it takes.


--
Dmitry Olshansky
