Anyone got an implementation of a mutable StringBuilder-like object? I 
could use it in Rebellion's implementation of `into-string` which currently 
isn't quadratic, but it definitely has the allocation problem.

On Sunday, June 27, 2021 at 11:36:14 AM UTC-7 bogdan wrote:

> While I think the complexity piece is important, I feel like it's worth
> pointing out just how much more expensive the allocation -- and its
> ramifications, like the resulting GC pressure and CPU cache misses -- is
> than one might think. Here's a quadratic version of the code that
> avoids allocations:
>
> #lang racket
>
> (require racket/flonum)
>
> (define (xy->string x y)
> (string-append
> (~r x #:precision 1) ","
> (~r y #:precision 1)))
>
> (define len 0)
> (define dst (make-string 5000000))
> (define (string-append! b)
> (string-copy! dst 0 dst 0 len) ;; intentionally performing pointless work 
> to be n^2
> (string-copy! dst len b)
> (set! len (+ len (string-length b))))
>
> (define (xy-vectors->string x-vec y-vec)
> (for ((x (in-flvector x-vec))
> (y (in-flvector y-vec)))
> (string-append! (xy->string x y))
> (string-append! " ")))
>
> (time
> (let ([x (make-flvector 100000)]
> [y (make-flvector 100000)])
> (xy-vectors->string x y)))
>
> On my machine (running Racket CS 8.1), this is faster than the
> `call-with-output-string` version I posted earlier by about 50ms.
>
>
> Jens Axel Søgaard writes:
>
> > Den søn. 27. jun. 2021 kl. 18.58 skrev Alessandro Motta <
> amot...@gmail.com
> >>:
> >
> >> I also like Jens' code for its pure functional elegance. I'm surprised
> >> that building up a long list of short strings before joining them is so
> >> much faster. After all, isn't the final `string-join` essentially doing
> >> what the original code snipped did? (Clearly not! I will have a look at
> >> the implementation.)
> >>
> >
> > The following is the same answer as Robby's, but fills in some details.
> > I wrote this mainly because I couldn't find a web-page with the 
> explanation.
> >
> > Let's say we have two strings a and b of lengths m and n respectively.
> >
> > How much work must (string-append a b) do?
> >
> > Conceptually the following steps need to be done:
> > 1. Allocate a new string c of length m+n.
> > 2. Copy the string a into c.
> > 3. Copy the string b into c.
> >
> > This means that the time (work) needed to append the strings a and b are
> > proportional to m+n.
> >
> > How much work is done here?
> >
> > (string-append "x" (string-append "y" (string-append "z" "")))
> >
> > First (string-append "z" "w") takes time 1+0=1 with the result "z".
> > Then (string-append "y" (string-append "z" "")) or (string-append "y" 
> "z")
> > takes time 1+1=2 with the result "yz".
> > Then (string-append "x" (string-append "y" (string-append "z" ""))) or
> > (string-append "x" "yz") takes time 1+2 = 3 with result "xyz".
> > In total we have used time 1+2+3 =6.
> >
> > We see that if we 3 times use string-append to add strings of length 1,
> > then we use time 1+2+3=6.
> > In general, if we n times use string-append to add strings of length 1,
> > then we use time 1+2+3+...+n.
> > The last sum is more or less n^2. See [1].
> >
> > The same thing happens in your loop, where you repeatedly use 
> string-append
> > to append a small string.
> > The length of the small string is no longer 1, but the same happens - and
> > the time used by the
> > loop is proportional to n^2.
> >
> >
> > Now suppose we have a list of the strings to append
> > (list "x" "y" "z")
> > and we need to append them. How much work is there now?
> >
> > Well, first we can calculate the length of the result 1+1+1 = 3.
> > Then we allocate a new string of length 3. Then each individual
> > string is copied and that takes time 1+1+1=3.
> >
> > Here, each string is only copied once. For comparison in the loop version
> > the string z is copied multiple times.
> >
> >
> > But wait - why does a loop that uses cons multiple times to build up
> > a list not have the same problem?
> >
> > Because in (cons x xs) the existing list xs isn't copied.
> > The steps are
> > 1. a new pair with two cells (the car and the cdr) are allocated
> > 2. the contents of the car is set to x
> > 3. the contents of the cdr is set to xs.
> >
> > This always takes the same amount of time, no matter how long the list xs
> > is.
> > This means that the time used by cons is constant (that is, proportional 
> to
> > 1).
> >
> > Note: This phenomenon that using string-append in a loop is not a Racket
> > only problem.
> > It is a common pitfall in many languages.
> >
> >
> >
> > [1] Remember the famous story of Gauss as a child that calculated
> > 1+2+...1000 ?
> > https://www.youtube.com/watch?v=Dd81F6-Ar_0
> >
> >
> > /Jens Axel - https://racket-stories.com
>

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/61691331-226f-4708-bd2a-42e5c6e5c2bcn%40googlegroups.com.

Reply via email to