Hi Mark,
I agree that the whole point of generators is performance.  When combined
with lazy sequences (srfi-127), it covers many typical use cases of streams
much more efficiently.
I've written Gauche version with performance in mind, but it depends on a
few non-standard features and can not be used as-is.  Besides, if we do go
for performance, we need benchmark suites as well.  I'm interested in
adopting Gauche version as the reference implementation but don't know when
I have time.  If you have some code to improve the current reference
implementation and can merge it to the repo, we can proceed gradually, I
guess.

--shiro


On Fri, Jan 22, 2021 at 2:59 PM Mark H Weaver <m...@netris.org> wrote:

> Hi John,
>
> John Cowan <co...@ccil.org> writes:
>
> > Mark: I'm interested to know if you have a sketch of ideas for a more
> > efficient implementation of SRFI 121/158.  You say it requires specific
> > knowledge of Guile internals, but are you willing to sketch how you might
> > do it?  Thanks.
>
> Here are a few suggestions off the top of my head, looking only at the
> latest SRFI-121 reference implementation:
>
> * In 'gcombine', 'generator-fold', 'generator-for-each', and possibly
>   also 'generator-unfold', it would be helpful to use 'case-lambda' to
>   provide specialized code for cases where the 'dotted' tail in the
>   argument list consists of 1 or 2 arguments.  When a procedure with a
>   dotted tail is called, it forces allocation of a new list, and last I
>   checked Guile does not include optimizations to avoid that allocation
>   where possible.  Worse, the general case requires calling 'map' and
>   allocating a new list every time a new item is requested.  It's a
>   great help to avoid these expenses in the most common cases.  For
>   example, see the definitions of 'map', 'for-each', and 'fold' in
>   Guile:
>
>
> https://git.savannah.gnu.org/cgit/guile.git/tree/module/ice-9/boot-9.scm?id=75b0db1a286f936a90683973efc2315a07c03b21#n214
>
> https://git.savannah.gnu.org/cgit/guile.git/tree/module/srfi/srfi-1.scm?id=75b0db1a286f936a90683973efc2315a07c03b21#n451
>
> * Avoid using 'define-values' in internal lexical contexts in Guile for
>   now.  Hopefully some day Guile's implementation of 'define-values'
>   will be more efficient, but for now it's implemented as a simple macro
>   that expands into code that mutates variables, which prevents several
>   other optimizations that could otherwise be done by Guile's compiler.
>   In particular, in 'gcombine', better use 'call-with-values' or
>   'receive'.
>
> * Several procedures are defined in terms of more general higher-order
>   procedures, or create intermediate lists/generators unnecessarily, for
>   the sake of making the code simpler.  In most contexts I would applaud
>   this practice, but there will be a significant price to pay in
>   efficiency.  For example, 'generator->reverse-list' with 1 argument is
>   implemented in terms of 'generator-fold', and with 2 arguments by
>   creating an intermediate generator using 'gtake'.
>
> * To make matters worse: 'gtake', as well as 'make-for-each-generator'
>   and 'make-unfold-generator', are defined in terms of coroutine
>   generators.  It's good to have coroutine generators available, but
>   they are quite expensive, and best avoided where efficiency is
>   desirable.
>
> * 'generator->vector' is defined using 'generator->list' and
>   'list->vector'.  That's bad enough, but digging deeper we see that
>   'generator->list' is implemented by reversing the result of
>   'generator->reverse-list', which as I've already pointed out is
>   defined using 'generator-fold', which currently involves calling 'map'
>   and 'append' and allocating temporary lists for each item generated.
>
> In general, it's helpful to go through the mental exercise of tracing an
> evaluation of each of these procedures, and more importantly to trace
> calls to the produced generators, to get some idea of the expense
> involved.
>
> This is just a sampling of the problems I noticed from a quick skim of
> the code, by no means comprehensive.
>
> I acknowledge that following the suggestions above will make the code
> much larger, uglier, and more difficult to understand.  I would not
> recommend such practices except in cases where efficiency is paramount.
>
> In this case, I think efficiency *is* paramount.  My rationale is that,
> like hash tables, generators inevitably force code that uses them to be
> written in an imperative style, and therefore they are best avoided in
> my opinion, *except* in cases where efficiency is paramount.
>
> To avoid being forced to write code in an imperative style, I suggest
> that in most cases streams are preferable to generators.
>
>       Regards,
>         Mark
>

Reply via email to