Re: Realtime Clojure program

Ken Wesson Wed, 16 Feb 2011 07:38:13 -0800

On Wed, Feb 16, 2011 at 3:36 AM, Marko Topolnik
<marko.topol...@gmail.com> wrote:
>> HOFs and
>> lazy seqs add a bit more expense. Try deliberately throwing an
>> exception in lazy-seq and then (first (map this (map that ...
>> (my-exception-throwing-lazy-seq)))) and see how deep the stack trace
>> nests; the method call overheads do add up when there's many of them!
>
> Tell me about it... these stacktraces are my daily bread. Once we had
> a misterious StackOverflowError in the middle of our non-recursive
> code. The culprit turned out to be concat, iteratively applied to an
> open-ended amount of vectors.


A situation where you have little choice but to apply doall.

I don't suppose there's any chance a future version will change nested
map/concats on single seqs (at least) to somehow flatten the call
stack? For example,

(map foo (map bar x))

could emit the same thing as

(map (comp foo bar) x)

by doing something conceptually like this:

(if (and
      (= (count seqs) 1)
      (is-map-derived-seq? (first seqs)))
  (let [f2 seqs] (take-apart-map (first seqs))]
    (apply map (comp f f2) seqs)))
  (original-map-code-goes-here))

This moves the problem down into comp, where something similar can be
done with suitable compiler support:

(let [fs (loop [fs (seq fs) out []]
           (if fs
             (let [f (first fs)]
               (if (is-comp-derived-fn? f)
                 (recur (concat (take-apart-comp f) (next fs)) out)
                 (recur (next fs) (conj out f))))
             out))]
  (make-comp-derived-fn fs))

That would be the *entirety* of comp; that last function has to be a
primitive that for two functions outputs bytecode equivalent to #(f2
(f1 args)), for three #(f3 (f2 (f1 args))), and so on. For one
argument it should just return the argument without wrapping it in
anything and for zero return clojure.core/identity.

With those, any amount of nested maps would result in a bounded number
of stack frames between seq realizer and the fn supplied to the
innermost map.

The scheme can be expanded to concat (and therefore to mapcat) easily
enough, for instance by making concat produce a Concat object that
implements ISeq and is internally just a list of its component seqs; I
could actually write this as a deftype in ten minutes and have it
tested and fairly likely bug-free in twenty. The concat function would
then construct one and use a loop very similar to the one above to
flatten nested Concats into the parent.

In fact, this suggests the above loop be factored out into a separate
selectively-flatten function, something like

(defn selectively-flatten

  "Produces a lazy sequence of elements of coll, except that elements
   for which flatten-elt? returns true are recursively expanded into
   subsequences by subseq:

   (defn selectively-flatten

  "Produces a lazy sequence of elements of coll, except that elements
   for which flatten-elt? returns true are recursively expanded into
   subsequences by subseq:

   (selectively-flatten vector? identity [1 2 [3 4 5] 6 '(7 8)])

   will return '(1 2 3 4 5 6 (7 8))."

  [flatten-elt? subseq coll]
  (let [step (fn step [colls]
               (lazy-seq
                 (if colls
                   (let [c (first colls)]
                     (if c
                       (let [e (first c)]
                         (if (flatten-elt? e)
                           (step
                             (cons (seq (subseq e))
                             (cons (next c) (next colls))))
                           (cons e
                             (step (cons (next c) (next colls))))))
                       (step (next colls)))))))]
    (seq (step [(seq coll)]))))

which the comp replacement can call as (selectively-flatten
is-comp-derived-fn? take-apart-comp fs) and the concat replacement as
(selectively-flatten #(instance? Concat %) take-apart-concat colls).

(This last function has been tested moderately, including on the
example in its docstring, and seems to work. Its output is very
similar to that of tree-seq but it only emits the leaf nodes. I
suppose (remove branch? (tree-seq branch? children root)) might do as
an implementation or a replacement though it ought to be a touch less
efficient due to additional nested function calls.)

> When a new version of Clojure comes out, I routinely invest enough
> time to read the new API docs cover-to-cover, searching for new
> interesting stuff. The problem is, some important changes don't happen
> through new API, but through subtle improvements in the behavior of
> existing API. For example, as of 1.2 we have something Rich named
> "fine-grained locals clearing" -- a killer feature for lazy seqs, but
> not mentioned anywhere in the docs and not easily googlable, unless
> you already heard the phrase.

That's not even an API change -- it's a compiler change. An improved
API behavior would be making nth work on sorted-sets and sorted-maps.
;)

> The feature you mention -- functions
> that accept primitive arguments -- seems to fall in this class. I
> continually worry that my code might be jumping through nonexistent
> hoops that I only imagine due to my incomplete understanding of
> Clojure.
>
> I think it is important to clearly document all such features, even
> more so than the API, where the source code is the ultimate, easily
> reachable documentation.

Compiler/language features that aren't clojure.core API features, you
mean? The reference pages, like Sequences and Java Interop and Special
Forms, document some of this. Recent Changes ought to document changes
of this sort from older versions, but would currently documents the
changes between 1.1 and 1.2 if it did. Instead it seems to document
changes to the *web site*. Perhaps rename that Recent Site Changes and
make Recent Changes do this, and Upcoming Changes document the changes
between the current stable version and the development branch? In that
case the improved locals clearing would be in Recent Changes, the
changes to primitive math/functions that take primitives would be in
Upcoming Changes (and could be something of a moving target, at least
until the spec for 1.3's exact feature set settled down), and the
current content of Recent Changes would be in Recent Site Changes.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Realtime Clojure program

Reply via email to