So a couple of comments about David's thoughts on modularity and versioning.

First: at a high level, I agree with him that these are important issues.
Where we may disagree is on how many of these problems should be solved at
the language level. The issue that concerns me is that different platforms
have different solutions, and an effective language needs to work with all
of them. Which is rather a nuisance. Beyond that, we don't yet know a
really good solution. It would be a shame to commit ourselves only to have
a better solution emerge that we cannot take advantage of. Kinda like Java
and 16-bit Unicode, for example.

My take for BitC:

   - Arriving at an effective module system is in scope
   - Dealing with module versioning is *not* in scope - or at least not yet.


On Fri, Apr 13, 2012 at 8:59 AM, David Jeske <[email protected]> wrote:

> Solutions with GC pauses will not be used because pauses are unacceptable
> for a leveraged software stack. Out of process solutions will not be used
> because they are too slow.
>

David, I'm sorry, but you are simply wrong on *both* of these points. What
you say concerning pauseless GC is true for large, monolithic applications.
It's worth noting that DLLs/shlibs exacerbate the problem of application
scale. It is *not* true for more rationally sized applications. The pause
times for such applications are negligible. The more pressing issue is the
need for a hand-over-hand collector. The requirement for hand-over-hand
tends to drive us back into concurrent collector technology, so the end
result is the same, but we should be careful to remain clear about which
requirements are being driven from which use cases and why.

What you are arguing amounts to "we need a pauseless collector because we
are too stupid to write our applications in manageable form". Which is,
sadly, true.

Out-of-process solutions are indeed horrendously slow on Windows. This is
*not* generally true on other systems.

> Refcounting+AzulC4-style no-stop cycle finding is entirely acceptable.
>

No, it isn't. Refcounting collectors aren't coalescing collectors, and that
turns out to be important.


> The reason pauses are unacceptable in our leveraged software stack is in
> some sense an economical one. In software technology businesses, software
> engineering costs generally don't scale with profits.
>

I agree with the scaling statement, but it isn't clear how it relates to
the economics statement.

> --> Which means in very-popular or very-profitable software there is no
> reason to compromise user-experience for coding-efficiency. <--
>

That's just nonsense. The reason to avoid compromising user experience is
entirely profit-driven. The user experience must be maintained. If that
requires improving coding efficiency, so be it.


> This economic basis is why pausing-GC-systems are only viable for the top
> of our stack.... Lower-volume solutions such as scripting, business
> specific customization, low-volume apps, low-volume webservers, and
> top-of-stack programs where authors don't realize the pauses are a problem
> until later (like webservers).
>

Well, it's a theory. Unfortunately it's based on a bad assumption.
Scripting languages are no longer used for small or low-volume
applications. Some of the applications being written in these languages are
surprisingly large. That's the main motivation for Dart.


> High-volume software must be pause free because customers would choose a
> non-pausing alternative otherwise.
>

From this we must conclude that all browsers necessarily implement C or C++
as a scripting language. Hmm. Let's check... No. Afraid not.

David: you are engaged in drum-beating, and you are allowing the intensity
of your frustration to run away with you. Nobody here is debating the
importance of pause-free collectors. There is considerable reason to doubt
that the situation, as seen by users, is as dire as you describe. My
suggestion is to step off of the soap box and try to return to a more
constructively framed discussion that will let us actually solve something
here.

I *have* argued that other management techniques are sometimes appropriate.
In particular I've argued for regions. I would also argue for something
like ehCache where appropriate. There is no reason to imagine that a single
memory management regime exists that is appropriate for all scenarios. That
has never been true in the manual world, and I think it's fairly clear that
it's not true in the GC world.

Are these alternatives a replacement for pauseless GC? Certainly not. But
they are fine holding actions until the hardware and OS support required
for a good pauseless GC becomes mainstream.

>> It is interesting that C4 does incur significant pauses for small (<4G)
>> heaps (obviously for the current thread)
>>
>
> By my read of the paper, this wasn't because of a small heap, this was
> because of the ratio of used-space to free-space. (i.e. when there is
> memory pressure, there is a chance that allocation catches up to
> reclamation, and then there is a pause. Note that malloc/free could
> deadlock in such a situation because of fragmentation and lack of
> compaction)
>

Yes. The underlying problem here is the inability to flip page permissions
fast enough. The corresponding collector on Azul hardware doesn't have the
same issues. The Linux page permission change operation has *horrible*
performance, and the UNIX mapping API really isn't suited to the needs of
VMs. It wouldn't work very well on most microkernels either. There are some
alternative TLB update schemes that would go a long way here.

> If it's so difficult, then why has such a simple system (the C shlib)
> solved this problem so well and for so long? I think the "difficult" part
> about building a better system is avoiding the classic v2.0 problem of
> throwing out the critical requirements of v1.0.
>

I think you are referring to ELF here. Since I was on the extended ELF
team, I can actually answer that one.

ELF succeeded mainly because it was driven by (a) clearly defined,
real-world requirements, (b) the world's deepest base of experience with
live upgrade, which came from 5ESS, (c) a *very* smart systems engineer as
a designer, and (d) a willingness to change the linkage model in a slightly
incompatible way in order to do a better job.

I believe the path to this is zero-pause GC, combined with at least
> CIL-level support for structs/value-types, JIT, and possibly a constrained
> forward-compatible-upgradable polymorphic interface boundary for modules.
>

Curious: are you willing to work on this actively, or are you just venting?

A more constructive question: can you explain what you mean by "constrained
forward-compatible-upgradable polymorphic interface boundaries"?


> The technique Azul is using could have been written a decade ago. Nobody
> is working on this stuff...
>

You know, that's pretty disrespectful to people on both sides of C4. The
*fact* is that the Azul collector was only discoverable on their custom
hardware. They worked out how to back-port it to Linux after the fact, and
only with significant mods to the Linux kernel. Even then, the result gets
into trouble under sustained allocation (as the benchmarks show).

Even a cursory read of the last 30 years of GC literature will tell you
that *lots* of people pushed in this direction, and that they ran into
barriers they could not find ways around.

To say that Azul's work was "obvious", or that it could have been done a
decade ago, is both wrong and disrespectful.


Jonathan
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
