I've implemented Okasaki's purely functional, random-access lists in Typed Racket. They perform well there. I thought I'd see how they would do crossing the contract barrier, so I ported my benchmarks to Racket.

Here's what I get doing `list-ref', passing each index of 1-to-10-element lists 10000 times:

  list-ref
  cpu time: 20 real time: 15 gc time: 0
  cpu time: 10 real time: 14 gc time: 0
  cpu time: 10 real time: 14 gc time: 0
  cpu time: 20 real time: 14 gc time: 0
  cpu time: 10 real time: 14 gc time: 0

In Typed Racket, `ralist-ref' keeps up pretty well on the same benchmark:

  ralist-ref
  cpu time: 50 real time: 47 gc time: 0
  cpu time: 50 real time: 49 gc time: 0
  cpu time: 50 real time: 47 gc time: 0
  cpu time: 40 real time: 48 gc time: 0
  cpu time: 50 real time: 47 gc time: 0

Unfortunately, it's awfully slow when used on the other side of the contract barrier:

  ralist-ref
  cpu time: 25280 real time: 25280 gc time: 560
  cpu time: 25150 real time: 25151 gc time: 490
  cpu time: 25500 real time: 25502 gc time: 880
  cpu time: 25110 real time: 25111 gc time: 560
  cpu time: 25130 real time: 25127 gc time: 220

This was run from the command line. It's 500x slower, which is much worse than anything I've seen from arrays.

I think this is the problem: the RAList type has a nested structure that has to be checked every time an instance crosses the barrier. Not only that, but checking the structure is O(n) in the size of the instance, which turns `ralist-ref' from O(log(n)) to O(n).

The following are the types of the values used to generate those numbers. Notice that they're all first-order.

  (struct: (A) leaf ([value : A]))
  (struct: (A) node ([value : A] [left : (Tree A)] [right : (Tree A)]))

  (define-type (Tree A) (U (leaf A) (node A)))

  (struct: (A) root ([size : Positive-Index] [tree : (Tree A)]))
  (struct: (A) RAList ([size : Index] [roots : (Listof (root A))])

  (: list->ralist (All (A) ((Listof A) -> (RAList A))))
  (: ralist-ref (All (A) ((RAList A) Integer -> A)))

Example: The instance created by (list->ralist '(1 2 3 4)) is

  (RAList 4 (list (root 1 (leaf 1))
                  (root 3 (node 2 (leaf 3) (leaf 4)))))

Because everything about this first-order data structure is utterly typical, the best performance advice I can currently give for mixed untyped/typed programs is this:

  Don't let anything but small instances of built-in types cross from
  typed to untyped code, or vice-versa.

Feel free to make exceptions for functions that do a lot of computation with their values.

**********

This is sad, because Typed Racket is better than Racket for coding up efficient data structures. It *makes a lot of sense* to write all the constructors, accessors, and basic operations in Typed Racket, because it optimizes the heck out of them. If only moving the values were cheap.

Why isn't it cheap? There's no reason for untyped Racket code to know anything about the internal workings of an RAList. That's the essence of data abstraction. I'm not even providing the default constructor. It shouldn't be possible (without significant hackery that revokes a user's right to safety) to create an ill-typed RAList, because they can only be created in Typed Racket code.

What if RAList could be exported as a polymorphic, opaque type?

It seems the contract system could then exploit the fact that an RAList is always well-typed. What would it take? Tagging structs? A special chaperone? User-definable contracts for user struct types?

If RAList were a polymorphic, opaque type, how could we put a contract on the following function, which instantiates it?

  (: ralist-sum ((RAList Real) -> Real))
  (define (ralist-sum lst)
    (apply + (ralist->list lst)))

Could exhaustively checking the structure of a polymorphic type be put off until such functions are called?

Would it be impossible for mutable struct types to be opaque? Would having a type variable in positive position cause problems?

I really don't want to write "Performance Warning: Using random-access lists in untyped Racket is currently 500x slower than using them in Typed Racket" at the top of the documentation for `data/ralist'.

Neil ⊥
_________________________
 Racket Developers list:
 http://lists.racket-lang.org/dev

Reply via email to