On Tue, 05 Sep 2017 09:11:19 -0700, allber...@gmail.com wrote:
> On Tue, Sep 5, 2017 at 5:40 AM, jn...@jnthn.net via RT <
> perl6-bugs-follo...@perl.org> wrote:
> 
> > Failing to close output handles has been clearly documented (and yes,
> > documented well before the recent buffering change) as something that can
> > cause data loss. Default output buffering just makes it quite a lot more
> > likely to show up.
> >
> > While there will be some ecosystem fallout like this, unfortunately I
> > don't think it's avoidable. If we whip out the patch that turns output
> > buffering on by default for non-TTYs for this release, then when will we
> > include it? The longer we leave it, the more painful it will be, because
> > more code will be written that is careless with handles.
> >
> > I don't think "leave it off by default" is a good option either, otherwise
> > we get to spend the next decade hearing "Perl 6 I/O is slow" because it'd
> > be one of the only languages that doesn't buffer its output without an
> > explicit flag being passed to enable that (which nearly nobody doing quick
> > benchmarks will know to use).
> >
> 
> Are we missing something to flush/close handles at exit? Leaving it to a GC
> that may not finalize before exit is not really an option.
> 
To recap the IRC discussion yesterday: no, we haven't had this so far (except 
for stdout/stderr), and have gotten away with it thanks to the lack of output 
buffering. At present, we can choose between:

1) Start keeping a list of open files, and at exit close them (flushing is 
already part of closing). This can be done at the Perl 6 level, in the same 
place we make sure to run END blocks.

2) Leave unclosed handles eligible for GC, and close them if/when they get 
GC'd.

Today we are doing #2. We could switch to doing #1. We can't currently do both, 
because the moment we start keeping a list of open handles, they can no longer 
be GC'd, and so #2 can never happen.

My initial inclination was to preserve behavior #2, though others have pointed 
out that behavior #1 is more useful for debugging: it ensures that log files, 
for example, will still be written in the event of a crash. Furthermore, a 
program relying on behavior #2 could already run out of handles today anyway 
if it were less lucky with GC timing. This is a fair argument, and the 
automatic close at exit might be softer on the ecosystem too (though it would 
have done nothing for the Text::CSV case, the original subject of this ticket, 
because it wrote a file, didn't close it, then separately opened it for 
reading).
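The Text::CSV failure mode is easy to reproduce in any language with buffered output. A Python sketch of the same write-without-close-then-read pattern (the file path is invented for illustration):

```python
import os
import tempfile

# Write-without-close followed by a separate open-for-read: with buffered
# output, the written data may still be sitting in the writer's in-process
# buffer when the reader opens the file, so the reader sees an empty (or
# truncated) file.
path = os.path.join(tempfile.mkdtemp(), "data.csv")

writer = open(path, "w")        # buffered output handle
writer.write("a,b,c\n1,2,3\n")  # small write: stays in the buffer

print(repr(open(path).read()))  # '' -- nothing has reached the file yet

writer.close()                  # close() flushes the buffer
print(repr(open(path).read()))  # now the full contents appear
```

With unbuffered output (the old default being discussed), the first read would have seen the data, which is why this pattern only started failing after the buffering change.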

There's probably enough consensus to switch to option #1, and lizmat++ 
mentioned maybe looking into a patch to do that.

In the longer run, we can have both, but that depends on implementing weak 
references. In terms of backend support, the JVM does have them, and there 
seems to be an npm package [1] exposing V8 weak refs, so a Node.js backend 
could support them too. I'm OK with adding them to MoarVM in the future, but 
both doing that and exposing weak references at the Perl 6 level would be a 
non-small, and certainly non-trivial, task.
