Re: [Haskell-cafe] Reference for technique wanted
On 4/11/2010, at 10:56 PM, Claus Reinke wrote:

  Even in your SML code, the boring old plain lists seemed to be
  list_flatten, which uses difference lists in disguise, and won your
  little test, right? Using Haskell notation:

    flat (LEAF x)    ys = x : ys
    flat (FORK(a,b)) ys = flat a (flat b ys)

Measured time: 0.902 seconds (2**22 leaves), 2.716 seconds (2**23 leaves).

    flat (LEAF x)    = \ys -> x : ys
    flat (FORK(a,b)) = \ys -> flat a (flat b ys)

Measured time: 0.980 seconds (2**22 leaves), 7.999 seconds (2**23 leaves).

    flat (LEAF x)    = (x :)
    flat (FORK(a,b)) = flat a . flat b

Measured time: 10.189 seconds (2**22 leaves), 163.184 seconds (2**23 leaves).

In all cases, the final result was a list, not a function. I was actually expecting the first and second versions to be the same code. You have shown very clearly how someone trying to write the first version in a point-free style would have rediscovered the list-as-function hack. I haven't the faintest idea what SML is doing with the third version, but clearly it shouldn't. I find your report that GHC doesn't do as well with the third version as the first two somewhat reassuring. The difference makes me wonder whether there might be performance consequences of the point-free style in more cases.

  As in Prolog, it is often better not to make the structure explicit,

If that's a reference to DCGs, they *hide* the difference pair, but the data structure is still *there*.

  The practical consequence of this is that calling both techniques by
  the same name is going to mislead people familiar with one kind of
  language when they use the other: logic programmers will wrongly
  expect dlists to be practically useful in functional languages,
  functional programmers will expect them to be impractical in logic
  programming languages.

  I do tend to expect a little more insight from Haskellers, but
  perhaps you're right.

Yes, but I don't believe the original poster (whose messages you have not seen) *is* a Haskeller.
  I'd still call both patterns difference lists,

I have been shouting about how bad a name "difference list" is in Prolog for about 20 years. The problem is that it suggests that you should think about list differences *as a kind of list*, which in turn suggests passing around both ends in a single data structure like your Xs0\Xs. But doing so turns over a lot of extra storage and negates much of the performance advantage of list differences. It really is much more useful in Prolog to think of list differences as a kind of *relationship*, because then you are able to think "hey, why just two ends? Why can't I pass around *several* references into the same list?" Here's a grammar for a**n b**n c**n:

    abc(N, ABC) :-
        abc(N, ABC, BC, BC, C, C).

    abc(0, ABC, ABC, BC, BC, []).
    abc(s(N), [a|ABC1], ABC, [b|BC1], BC, [c|C1]) :-
        abc(N, ABC1, ABC, BC1, BC, C1).

Example:

    ?- abc(N, ABC).
    N = 0, ABC = [] ;
    N = s(0), ABC = [a, b, c] ;
    N = s(s(0)), ABC = [a, a, b, b, c, c] ;
    N = s(s(s(0))), ABC = [a, a, a, b, b, b, c, c, c] ;
    N = s(s(s(s(0)))), ABC = [a, a, a, a, b, b, b, b, c|...] ;
    ...
    ?- abc(N, [a,a,a,a,b,b,b,b,c,c,c,c]).
    N = s(s(s(s(0)))).
    ?- abc(N, [a,a,a,a,b,b,b,c,c,c,c]).
    false.
    ?- abc(N, [a,a,a,a,b,b,b,b,c,c,c,c,c]).
    false.

You eventually realise that in Prolog there is no difference between list differences and accumulator passing. If you want to call accumulator passing and higher order functions by the same name, feel free, just don't expect me to regard it as illuminating.

  because it can be useful to know about the connections,

I've been using Prolog since 1979 and Haskell since, ooh, Haskell 1.3, but I've never found it useful to know about the connection, and I plan to forget it as soon as I can.

  but if you could suggest text for the DList documentation that warns
  of the differences, and of performance pitfalls, I'm sure the package
  author would be happy to add such improvements.
I've just been marking a software engineering exam in which I asked the students to comment, with supporting detail, on

    "If it hasn't been tested, it doesn't work."
    "If it hasn't been ported, it isn't portable."

Let me add one other:

    "If it hasn't been measured, it's slower."

Before suggesting changes to any package, it might be worth asking the question "can we find any examples where this KIND of transformation DOES pay off?" (compared with other more elementary approaches).

  If we ever get per-package wikis for hackage, you could add such
  comments yourself.

Claus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Reference for technique wanted
On 5 November 2010 06:40, Richard O'Keefe o...@cs.otago.ac.nz wrote:

  On 4/11/2010, at 10:56 PM, Claus Reinke wrote:

    Even in your SML code, the boring old plain lists seemed to be
    list_flatten, which uses difference lists in disguise, and won your
    little test, right? Using Haskell notation:

      flat (LEAF x)    ys = x : ys
      flat (FORK(a,b)) ys = flat a (flat b ys)

  Measured time: 0.902 seconds (2**22 leaves), 2.716 seconds (2**23 leaves).

      flat (LEAF x)    = \ys -> x : ys
      flat (FORK(a,b)) = \ys -> flat a (flat b ys)

  Measured time: 0.980 seconds (2**22 leaves), 7.999 seconds (2**23 leaves).

      flat (LEAF x)    = (x :)
      flat (FORK(a,b)) = flat a . flat b

  Measured time: 10.189 seconds (2**22 leaves), 163.184 seconds (2**23 leaves).

  In all cases, the final result was a list, not a function. I was
  actually expecting the first and second versions to be the same code.

[SNIP]

Hello Richard

I'm not entirely convinced that your experiment is a case against Hughes lists. The flattening of the bin-tree can use exactly _cons_, as it can go to the right-bottom leaf and then work backwards with the accumulator, cons-ing a single element at each step. I'd expect a plain list to be better for this, as I expect a constructor to be more efficient than a function. Figuratively speaking, I'd contend that the bin-tree (aka a join-list) has already taken the weight of the concatenation. To show a Hughes list as efficient or inefficient, a test would need to compare a plain list and a Hughes list doing the concatenation themselves - the common exemplar being string building via the ShowS type.

Best wishes

Stephen
Re: [Haskell-cafe] Reference for technique wanted
  I haven't the faintest idea what SML is doing with the third version,
  but clearly it shouldn't.

Those numbers are worrying, not just because of the third version - should doubling the tree size have such a large effect?

  I find your report that GHC doesn't do as well with the third version
  as the first two somewhat reassuring.

Well, I reported it as a performance bug ;-) Also because GHC 6.12.3 had the second version as fast as the first, while GHC 7.1.x had the second version as slow as the third. You can find my code in the ticket:

    http://hackage.haskell.org/trac/ghc/ticket/4474

  The difference makes me wonder whether there might be performance
  consequences of the point-free style in more cases.

It would be good to know what is going on, as those code transformations are not unusual, and I like to be aware of any major performance trade-offs involved.

  I'd still call both patterns difference lists,

  I have been shouting about how bad a name "difference list" is in
  Prolog for about 20 years.

Ah, when you put it like this, I agree completely. The name focusses on the wrong part of the idea, or rather on the junk in one particular, explanatory representation of the idea.

  It really is much more useful in Prolog to think of list differences
  as a kind of *relationship*, because then you are able to think "hey,
  why just two ends? Why can't I pass around *several* references into
  the same list?"

Names have power - if you want to get rid of one as well established as "difference lists", you'll have to replace it with another. How about "incomplete data structures"? That is already used in connection with more general applications of logic variables, and it focusses on half of the interesting bit of the idea (the other part being how one completes the structures).
Then logic programmers can think of logic variables, functional programmers can think of parameterised structures, operational semanticists can think of contexts with hole filling, denotational semanticists can think of partially defined structures, and so on...

Claus
Re: [Haskell-cafe] Reference for technique wanted
  The bottom line is that
  - in logic programming languages, building a list by working on a
    pair of arguments representing a segment of the list is the NORMAL
    way to build a list; it's as fast as it gets, and the list is
    inspectable during construction.

modulo usage patterns: e.g., mostly linear use.

  - at least in SML, the (fn ys => xs @ ys) [Haskell: \ys -> xs ++ ys]
    approach is stunningly slow, so slow as to make even naive use of
    append more attractive for most uses; it is not the normal way to
    build lists, and for good reason; you can do nothing with a list so
    represented except compose and apply.

Even in your SML code, the boring old plain lists seemed to be list_flatten, which uses difference lists in disguise, and won your little test, right? Using Haskell notation:

    flat (LEAF x)    ys = x : ys
    flat (FORK(a,b)) ys = flat a (flat b ys)

    --

    flat (LEAF x)    = \ys -> x : ys
    flat (FORK(a,b)) = \ys -> flat a (flat b ys)

    --

    flat (LEAF x)    = (x :)
    flat (FORK(a,b)) = flat a . flat b

As in Prolog, it is often better not to make the structure explicit, though I am slightly disappointed that GHC's optimizer doesn't give the same performance for the two versions (when summing a flattened strict tree in Haskell, I get roughly a factor of 2 between naive list append, explicit diff lists, as in the lower version of flat, and hidden diff lists, as in your list_flatten, the latter being fastest). Of course, one has to be careful about usage patterns and efficiency considerations. For instance, head and tail with Haskell diff lists are not O(n), as you mentioned, because of non-strictness. And building up lots of nested operations under a lambda means that many of those operations are not likely to be shared, but repeated every time the lambda is applied (such as converting back to plain lists), so one has to be careful about not doing that too often. Etc.
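For readers who want to try the comparison themselves, here is a minimal self-contained Haskell sketch of the three flat variants discussed above (the Tree type, the full helper, and the numbered names are mine, added for illustration); all three compute the same list, and only the compiled code differs:

```haskell
data Tree = Leaf Int | Fork Tree Tree

-- Version 1: explicit accumulator argument
flat1 :: Tree -> [Int] -> [Int]
flat1 (Leaf x)   ys = x : ys
flat1 (Fork a b) ys = flat1 a (flat1 b ys)

-- Version 2: the same, with the accumulator bound by an explicit lambda
flat2 :: Tree -> [Int] -> [Int]
flat2 (Leaf x)   = \ys -> x : ys
flat2 (Fork a b) = \ys -> flat2 a (flat2 b ys)

-- Version 3: fully point-free, treating list tails as functions
flat3 :: Tree -> [Int] -> [Int]
flat3 (Leaf x)   = (x :)
flat3 (Fork a b) = flat3 a . flat3 b

-- Full binary tree of the given depth (2^n leaves), for benchmarking.
full :: Int -> Tree
full 0 = Leaf 0
full n = let t = full (n - 1) in Fork t t

main :: IO ()
main = do
  let t = full 10
  print (flat1 t [] == flat2 t [] && flat2 t [] == flat3 t [])
```

Timing the three at depth 22 or more (with optimisation on) is what exposes the differences reported in this thread.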
  The practical consequence of this is that calling both techniques by
  the same name is going to mislead people familiar with one kind of
  language when they use the other: logic programmers will wrongly
  expect dlists to be practically useful in functional languages,
  functional programmers will expect them to be impractical in logic
  programming languages.

I do tend to expect a little more insight from Haskellers, but perhaps you're right. Having a pre-canned library instead of writing out the near-trivial code patterns oneself, one might be surprised when expected miracles don't happen (diffarrays were one such case for me ;-). I'd still call both patterns difference lists, because it can be useful to know about the connections, but if you could suggest text for the DList documentation that warns of the differences, and of performance pitfalls, I'm sure the package author would be happy to add such improvements. If we ever get per-package wikis for hackage, you could add such comments yourself.

Claus
Re: [Haskell-cafe] Reference for technique wanted
  The characteristics of the logical variable are as follows. An
  incomplete data structure (ie. containing free variables) may be
  returned as a procedure's output. The free variables can later be
  filled in by other procedures, giving the effect of implicit
  assignments to a data structure (cf. Lisp's rplaca, rplacd).

  There they are *explaining* things to Lisp programmers; not giving
  the origin of an idea.

If you want to read it that way, it still means that they and their readers were sufficiently aware of the connections that it made sense to explain things this way. I was trying to point out the general context in which Prolog techniques developed, but my impressions are undoubtedly biased by the way I was introduced to Prolog in the late 1980s. If you have an undisputed reference to the original invention of difference lists, where the author(s) explicitly deny any connection to Lisp, I'd be interested.

  Also, I thought that Prolog had two origins - one in grammars, the
  other in logic as a programming language.

  See http://en.wikipedia.org/wiki/Definite_clause_grammar
  This was specifically the focus of Alain Colmerauer. You may be
  thinking of Cordell Green's 'The use of theorem-proving techniques in
  question-answering systems'.

No, I haven't read that, yet (I've found the later 'Application of Theorem Proving to Problem Solving' online, but not this one). I was thinking of the later theme of 'Predicate Logic as a Programming Language', 'Algorithm = Logic + Control', etc. (here exemplified by titles of Kowalski's papers), but Kowalski points to Green's paper as 'perhaps the first zenith of logic in AI' in his 'The Early Years of Logic Programming' (Kowalski's papers online at: http://www.doc.ic.ac.uk/~rak/), so perhaps that was the start of this theme.

Claus
Re: [Haskell-cafe] Reference for technique wanted
On 3/11/2010, at 9:50 PM, Claus Reinke wrote:

The bottom line is that

- in logic programming languages, building a list by working on a pair of arguments representing a segment of the list is the NORMAL way to build a list; it's as fast as it gets, and the list is inspectable during construction.

- at least in SML, the (fn ys => xs @ ys) [Haskell: \ys -> xs ++ ys] approach is stunningly slow, so slow as to make even naive use of append more attractive for most uses; it is not the normal way to build lists, and for good reason; you can do nothing with a list so represented except compose and apply.

The practical consequence of this is that calling both techniques by the same name is going to mislead people familiar with one kind of language when they use the other: logic programmers will wrongly expect dlists to be practically useful in functional languages, functional programmers will expect them to be impractical in logic programming languages.
Re: [Haskell-cafe] Reference for technique wanted
On 2 November 2010 04:18, Richard O'Keefe o...@cs.otago.ac.nz wrote:

[SNIP]

      SML/NJ   MLton
       0.899   0.244  boring old plain list
       5.481   1.244  build a raum using raums
       8.186   1.380  build a raum then convert it to a list
      12.581   4.449  build a dlist using dlists
      16.209   5.096  build a dlist then convert it to a list

  Times were measured on an Intel Core 2 Duo Mac running Mac OS X
  10.6.4, SML/NJ v110.70 [built: Wed Jun 17 16:24:00 2009], MLton
  MLTONVERSION (built Mon Jun 15 11:10:01 CDT 2009 on
  fenrir.uchicago.edu). Building a list using dlists is 16 (SML/NJ) or
  20 (MLton) times slower than using a plain list. Caveat: because
  raums handle reverse in O(1) time as well as concatenation, for a
  fair comparison I made dlists do the same.

Och, adding reverse or even head and tail to a DList / Hughes list seems out of spirit with the idea - build as a Hughes list (enjoying cheap concat), convert to a list, and manipulate thereafter. I know the version on Hackage has head, tail, fmap etc.; their existence is one of the reasons I avoid it and roll my own. Interestingly, what was the test doing for the boring old plain list to do so well? More than just cons, I hope.

Best wishes

Stephen
Re: [Haskell-cafe] Reference for technique wanted
  Interesting discussion. I still think it is the same idea, namely to
  represent not-yet-known list tails by variables, embedded into two
  different kinds of languages.

    \rest -> start ++ rest
    [start|rest] \ rest      % '\' is an infix constructor

  Savvy Prolog programmers wouldn't *DREAM* of using an infix
  constructor here.

Well, you got me there, I'm not a savvy Prolog programmer anymore, so I took the '\' for difference lists straight from Sterling & Shapiro, who employ it for clarity over efficiency.

    http://books.google.de/books?id=w-XjuvpOrjMC&pg=PA284&lpg=PA283&ots=4WF3_NFXOt&dq=difference+lists+Prolog

Since my argument was about common origin of ideas vs differences in representation/implementation, I chose representations that I considered suggestive of the ideas.

  The differences arise from the different handling of variables and
  scoping in those languages:
  - functional languages: explicit, local scopes; variables are bound
    by injecting values from outside the scope (applying the binding
    construct to values); scoped expressions can be copied, allowing
    multiple instantiations of variables
  - logic languages: implicit, global scopes; variables are bound by
    finding possible values inside the scope

  Eh? The scope of *identifiers* is strictly LOCAL in Prolog. Logic
  *variables* don't *have* scope.

I was thinking of abstract declarative semantics, where variables (even logic ones) are replaced by substitution. If you make the quantifiers explicit, the identifiers become alpha-convertible, so their names are irrelevant, and the scope is given explicitly, as the part of the program enclosed by the quantifiers. But since variables can be passed around in Prolog, one needs to move the quantifiers out of the way, upwards (called "scope extrusion" in process calculi, which have the same issue).
You end up with no upper bound for the quantifiers - in that sense, logic variables have a global scope, because the quantifiers must enclose all parts of the program to which the variables have been passed.

  To close the circle: I understand that difference lists in Prolog
  might have been a declarative translation of in-place updating in
  Lisp lists :-)

  It seems unlikely. Prolog was born as a grammar formalism
  (metamorphosis grammars) and the idea of a non-terminal as a relation
  between a (starting point, end point) pair owes nothing to Lisp.

I suspect you've researched the history of Prolog in more detail than I have, so I'll just remind you that Prolog wouldn't have been successful without efficient implementations (including structure sharing), and that both implementers and early users tended to be aware of implementation aspects and of Lisp - they wanted to leave behind the impure aspects of Lisp, but they wanted their implementations of 'Predicate Logic as a Programming Language' to be as efficient as Lisp implementations. One example reference:

    Warren, Pereira, Pereira; 1977
    Prolog - The Language and its Implementation compared with Lisp
    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.7097

On page 111, we find: "The characteristics of the logical variable are as follows. An incomplete data structure (ie. containing free variables) may be returned as a procedure's output. The free variables can later be filled in by other procedures, giving the effect of implicit assignments to a data structure (cf. Lisp's rplaca, rplacd)."
This mindset - wanting to program declaratively, but being aware of implementation issues that might help efficiency, and being aware of how competing languages do it - has persisted till today, and continues to fuel advances (e.g., the way that logic programming could fill in incomplete structures without traversing them again prompted the development of circular programming techniques in functional languages - recursion and non-strictness allow variables to be bound by the result of a computation that already distributes those variables over a structure). Also, I thought that Prolog had two origins - one in grammars, the other in logic as a programming language.

Claus
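The circular-programming idea gestured at here is usually illustrated by Bird's repmin, which rebuilds a tree with every leaf replaced by the global minimum in a single traversal: laziness lets the not-yet-computed minimum be distributed over the structure before it is known, much as a logic variable would be filled in later. A minimal sketch (the Tree type and names are mine):

```haskell
data Tree = Leaf Int | Fork Tree Tree
  deriving (Eq, Show)

-- Replace every leaf by the tree's minimum in ONE pass: the minimum m
-- produced by 'go' is fed back into the same call as the value to put
-- at the leaves, before it has been computed (lazy evaluation).
repmin :: Tree -> Tree
repmin t = t'
  where
    (m, t') = go t m
    go (Leaf x)   r = (x, Leaf r)
    go (Fork a b) r = (min ma mb, Fork a' b')
      where (ma, a') = go a r
            (mb, b') = go b r

main :: IO ()
main = print (repmin (Fork (Leaf 3) (Fork (Leaf 1) (Leaf 2))))
-- prints Fork (Leaf 1) (Fork (Leaf 1) (Leaf 1))
```

The knot-tying binding `(m, t') = go t m` is exactly the "incomplete structure completed later" pattern the paragraph describes, transposed to a non-strict functional language.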
Re: [Haskell-cafe] Reference for technique wanted
On 2/11/2010, at 10:08 PM, Stephen Tetley wrote:

  Och, adding reverse or even head and tail to a DList / Hughes list
  seems out of spirit with the idea - build as a Hughes list (enjoying
  cheap concat), convert to a list, and manipulate thereafter. I know
  the version on Hackage has head, tail, fmap etc.; their existence is
  one of the reasons I avoid it and roll my own. Interestingly, what
  was the test doing for the boring old plain list to do so well? More
  than just cons, I hope.

My message included the code, and yes, it was just cons. But that's the point of the "The Concatenate Vanishes" paper: a straightforward source-to-source transformation that doesn't so much make concatenation fast as eliminate it completely. For the case of a tree with 2**25 leaves, we have

     0.215 seconds  building a list using cons
     5.144 seconds  using a dlist then converting (fancy dlists)
     5.512 seconds  using a dlist then converting (plain dlists)
    17.764 seconds  building a list using append

(using MLton again). Plain dlists used

    fun l2dl xs ys = xs @ ys    (* list to dlist *)
    fun dl2l d = d []           (* dlist to list *)
    fun dlap e d = e o d        (* dlist append *)

which is precisely the roll-your-own concatenation-only approach. I must admit that I was surprised to find the fancy version that handles reverse in O(1) time as well as concatenation was faster than the plain version; it just goes to show that rolling your own simplified code DOESN'T always pay off. Flattening this tree using append involved 25 layers of appends, each allocating 2**24 cons cells. If dlists can only beat _that_ by a factor of 3.2-3.5, they don't seem worth bothering about; even if you refuse to rewrite your source code to avoid concatenation, a data structure does better than a function. The data structure is more versatile too. Plain dlists give you O(1) concatenate and O(n) conversion to list. Fancy ones give you O(1) reverse as well. There they stop.
With a data structure, it's easy to get O(1) length and even easier to do an O(1) null test, and you can do folding without first copying. Of course, using Haskell and GHC the relative costs would be different, but GHC knows enough about lists that the payoff from using a data structure the compiler knows how to optimise might outweigh other concerns.
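For comparison, the same roll-your-own, concatenation-only representation is only a few lines in Haskell (a sketch; the names mirror the SML l2dl/dl2l/dlap above rather than the Data.DList API):

```haskell
-- A difference list is just a function from the eventual tail
-- to the finished list.
type DList a = [a] -> [a]

fromList :: [a] -> DList a               -- cf. l2dl
fromList xs = (xs ++)

toList :: DList a -> [a]                 -- cf. dl2l
toList d = d []

append :: DList a -> DList a -> DList a  -- cf. dlap: append is composition
append = (.)

main :: IO ()
main = print (toList (fromList [1,2] `append` fromList [3,4 :: Int]))
-- prints [1,2,3,4]
```

As the surrounding discussion stresses, this representation supports nothing beyond compose and apply; anything else must go through toList first.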
Re: [Haskell-cafe] Reference for technique wanted
On 2/11/2010, at 10:44 PM, Claus Reinke wrote:

  I suspect you've researched the history of Prolog in more detail than
  I have, so I'll just remind you that Prolog wouldn't have been
  successful without efficient implementations (including structure
  sharing),

Structure sharing came from theorem-proving work at Edinburgh. Later, *more* efficient implementations abandoned it.

  and that both implementers and early users tended to be aware of
  implementation aspects and of Lisp - they wanted to leave behind the
  impure aspects of Lisp, but they wanted their implementations of
  'Predicate Logic as a Programming Language' to be as efficient as
  Lisp implementations.

At Edinburgh, the rival wasn't Lisp but Pop-2; specifically the WonderPop implementation on the DEC-10. Of possible interest to people on this list is that Pop-2 handled input as lazy lists. As Robin Popplestone wrote:

  While I (at least) did not understand the issue of lazy evaluation,
  still less know how to implement it in general, we hoisted aboard
  Landin's preaching about streams at least to the extent of providing
  what we called dynamic lists - any lists in POP-2 could in effect be
  lazily extended in the tl (or CDR) direction. This was handy for
  writing the compiler: the compiler proper operated on a token-list
  which it could parse functionally.

If you want to trace list differences back to something other than grammars, the idea of lists gradually materialising on the right (rather than _changing_ once built) was well known to everyone in the AI department at Edinburgh because of this.

  One example reference:

    Warren, Pereira, Pereira; 1977
    Prolog - The Language and its Implementation compared with Lisp
    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.7097

  On page 111, we find: "The characteristics of the logical variable
  are as follows. An incomplete data structure (ie. containing free
  variables) may be returned as a procedure's output.
  The free variables can later be filled in by other procedures, giving
  the effect of implicit assignments to a data structure (cf. Lisp's
  rplaca, rplacd)."

There they are *explaining* things to Lisp programmers, not giving the origin of an idea.

  Also, I thought that Prolog had two origins - one in grammars, the
  other in logic as a programming language.

See http://en.wikipedia.org/wiki/Definite_clause_grammar
This was specifically the focus of Alain Colmerauer. You may be thinking of Cordell Green's 'The use of theorem-proving techniques in question-answering systems'.

Claus
Re: [Haskell-cafe] Reference for technique wanted
On 31 October 2010 23:05, Gregory Collins g...@gregorycollins.net wrote:

  They're called difference lists:

Andy Gill and Graham Hutton's first worker/wrapper paper calls them Hughes lists. This seems more apt to me than "difference lists", as difference lists (in the Haskell formulation at least*) don't seem to have any connection to a notion of difference.

Best wishes

Stephen

* It seems from other commentators on the thread that Haskell Hughes lists are actually different from Prolog difference lists anyway.
Re: [Haskell-cafe] Reference for technique wanted
  To simplify, the difference in persistence between the two
  representations is enough to consider them very different, as it
  makes a dramatic difference in interface.

Interesting discussion. I still think it is the same idea, namely to represent not-yet-known list tails by variables, embedded into two different kinds of languages.

    \rest -> start ++ rest
    [start|rest] \ rest      % '\' is an infix constructor

The differences arise from the different handling of variables and scoping in those languages:

- functional languages: explicit, local scopes; variables are bound by injecting values from outside the scope (applying the binding construct to values); scoped expressions can be copied, allowing multiple instantiations of variables

- logic languages: implicit, global scopes; variables are bound by finding possible values inside the scope (unifying bound variables); outside of non-determinism / sets-of-solutions handling, only non-scoped terms can be copied, allowing single instantiation only

If current functional language implementations had not abandoned support for reducing under binders, one could float out the binder for a difference list, and get limitations closer to those of logic languages (though binding to values would then become much more unwieldy). If logic languages allowed local binders in terms, one could copy difference lists more easily (though substitution and unification would then involve binders). So, yes, the realization of the idea is different, as are the language frameworks, and the junk in the representations, but the idea is the same.
Btw, functional language implementations not reducing under binders also implies that functional difference list operations are not shared to the same extent as logic difference list operations are - copying a closure copies un-evaluated expressions, to be re-evaluated every time the closure is opened with different variable bindings, so the functional representation is not as efficient as the logical one, in most implementations. I can't confirm the reference, but several authors point to this for an early description:

    [CT77] Clark, K.L.; Tärnlund, S.Å.: A First Order Theory of Data
    and Programs. In: Inf. Proc. (B. Gilchrist, ed.), North Holland,
    pp. 939-944, 1977.

To close the circle: I understand that difference lists in Prolog might have been a declarative translation of in-place updating in Lisp lists :-)

Claus
Re: [Haskell-cafe] Reference for technique wanted
On 1/11/2010, at 10:37 PM, Claus Reinke wrote:

  Interesting discussion. I still think it is the same idea, namely to
  represent not-yet-known list tails by variables, embedded into two
  different kinds of languages.

    \rest -> start ++ rest
    [start|rest] \ rest      % '\' is an infix constructor

Savvy Prolog programmers wouldn't *DREAM* of using an infix constructor here. The art of doing list differences well in Prolog is to think of a list difference as a RELATIONSHIP between TWO terms, not as a single data structure. (The same is true for the queue example I mentioned. Keeping the length, the front, and the back as separate arguments gives usefully better performance.)

  The differences arise from the different handling of variables and
  scoping in those languages:
  - functional languages: explicit, local scopes; variables are bound
    by injecting values from outside the scope (applying the binding
    construct to values); scoped expressions can be copied, allowing
    multiple instantiations of variables
  - logic languages: implicit, global scopes; variables are bound by
    finding possible values inside the scope

Eh? The scope of *identifiers* is strictly LOCAL in Prolog. Logic *variables* don't *have* scope.

  So, yes, the realization of the idea is different, as are the
  language frameworks, and the junk in the representations, but the
  idea is the same.

In an extremely vague and not obviously useful sense. This is rather like saying that association lists and hash tables are both implementations of the finite map idea, so we might as well call them by the same name. The algorithmic and observability properties of the two approaches are very different. Since there already is a DList implementation for Haskell, I decided to write one for SML. It was very frustrating, because while everything is possible, almost nothing is easy. Many of the functions degenerated into

    fun foo x ... = fromList (List.foo (toList x) ...)

and this included head and tail.
Afterwards, looking at Data.DList to find out what clever trick I had missed, I was glumly pleased to find out there wasn't one: head and tail are O(size of list) in Data.DList too. The bottom line for anything that's supposed to make concatenation fast is that it ought to be able to make singleton and ++ fast. So I tried a test case.

    datatype tree = LEAF of int | FORK of (tree * tree);

For SML/NJ I used a full binary tree of depth 22; for MLton I used a full binary tree of depth 25.

      SML/NJ   MLton
       0.899   0.244  boring old plain list
       5.481   1.244  build a raum using raums
       8.186   1.380  build a raum then convert it to a list
      12.581   4.449  build a dlist using dlists
      16.209   5.096  build a dlist then convert it to a list

Times were measured on an Intel Core 2 Duo Mac running Mac OS X 10.6.4, SML/NJ v110.70 [built: Wed Jun 17 16:24:00 2009], MLton MLTONVERSION (built Mon Jun 15 11:10:01 CDT 2009 on fenrir.uchicago.edu). Building a list using dlists is 16 (SML/NJ) or 20 (MLton) times slower than using a plain list. Caveat: because raums handle reverse in O(1) time as well as concatenation, for a fair comparison I made dlists do the same, so

    type 'x dlist = bool -> 'x list -> 'x list

    fun fromList xs flag tail =
        if flag then xs @ tail else List.revAppend (xs, tail)

    fun list_flatten t =
        let fun flat (LEAF x)    ys = x :: ys
              | flat (FORK(a,b)) ys = flat a (flat b ys)
         in flat t [] end

    fun raum_flat (LEAF x)    = Raum.singleton x
      | raum_flat (FORK(a,b)) = Raum.cat (raum_flat a) (raum_flat b)

    fun raum_flatten t = Raum.toList (raum_flat t)

    fun dlist_flat (LEAF x)    = DList.singleton x
      | dlist_flat (FORK(a,b)) = DList.cat (dlist_flat a) (dlist_flat b)

    fun dlist_flatten t = DList.toList (dlist_flat t)

  To close the circle: I understand that difference lists in Prolog
  might have been a declarative translation of in-place updating in
  Lisp lists :-)

It seems unlikely.
Prolog was born as a grammar formalism (metamorphosis grammars) and the idea of a non-terminal as a relation between a (starting point, end point) pair owes nothing to Lisp. I mean, do you call parser combinators in Haskell a declarative translation of in-place updating in Lisp lists? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Reference for technique wanted
There's a long-known technique in functional languages where

    [x1,...,xn]  =  \tail -> x1 : ... : xn : tail
    xs ++ ys     =  f . g
    xs           =  f []

A correspondent mentioned to me that he couldn't find a reference to the idea (which I gather he had independently rediscovered). I know I've read about it somewhere. Can anyone provide a reference?
Re: [Haskell-cafe] Reference for technique wanted
Richard O'Keefe o...@cs.otago.ac.nz writes:

    There's a long-known technique in functional languages where

        [x1,...,xn]  =  \tail -> x1 : ... : xn : tail
        xs ++ ys     =  f . g
        xs           =  f []

    A correspondent mentioned to me that he couldn't find a reference to
    the idea (which I gather he had independently rediscovered). I know
    I've read about it somewhere. Can anyone provide a reference?

They're called difference lists: http://hackage.haskell.org/packages/archive/dlist/latest/doc/html/Data-DList.html They allow O(1) snoc and append. I use them all the time! G -- Gregory Collins g...@gregorycollins.net
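The representation behind that package can be sketched in a few lines of Haskell. This is a hand-rolled illustration, not the actual Data.DList source; the library wraps essentially this function type in a newtype:

```haskell
-- A list is represented as "the function that prepends it".
type DList a = [a] -> [a]

empty :: DList a
empty = id

singleton :: a -> DList a
singleton x = (x :)

-- O(1) append: just function composition.
append :: DList a -> DList a -> DList a
append = (.)

-- O(1) snoc: extend the far end without walking the list.
snoc :: DList a -> a -> DList a
snoc f x = f . (x :)

fromList :: [a] -> DList a
fromList = (++)

-- Conversion back to an ordinary list: apply to the empty tail.
toList :: DList a -> [a]
toList f = f []
```

The point is that `append` and `snoc` do no list traversal at all; the cost is deferred to the single `toList` at the end.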
Re: [Haskell-cafe] Reference for technique wanted
Well, you can get "A Novel Representation of Lists and Its Application to the Function 'Reverse'" by John Hughes online, published in 1986, which is referenced by Wadler's 1987 "The Concatenate Vanishes" and references Richard Bird's 1984 paper "Transformational programming and the paragraph problem", though I'd be quite surprised if that was the first place the representation appeared in print.

On Sun, Oct 31, 2010 at 6:51 PM, Richard O'Keefe o...@cs.otago.ac.nz wrote:

    There's a long-known technique in functional languages where

        [x1,...,xn]  =  \tail -> x1 : ... : xn : tail
        xs ++ ys     =  f . g
        xs           =  f []

    A correspondent mentioned to me that he couldn't find a reference to
    the idea (which I gather he had independently rediscovered). I know
    I've read about it somewhere. Can anyone provide a reference?
Re: [Haskell-cafe] Reference for technique wanted
On 1/11/2010, at 12:10 PM, Derek Elkins wrote:

    Well, you can get "A Novel Representation of Lists and Its
    Application to the Function 'Reverse'" by John Hughes online,
    published in 1986, which is referenced by Wadler's 1987 "The
    Concatenate Vanishes" and references Richard Bird's 1984 paper
    "Transformational programming and the paragraph problem", though I'd
    be quite surprised if that was the first place the representation
    appeared in print.

Thank you. I'd seen Wadler's paper and forgotten it.
Re: [Haskell-cafe] Reference for technique wanted
On 1/11/2010, at 12:05 PM, Gregory Collins wrote:

    They're called difference lists:
    http://hackage.haskell.org/packages/archive/dlist/latest/doc/html/Data-DList.html

As a matter of fact the original context was precisely difference lists in logic programming. Thank you.
Re: [Haskell-cafe] Reference for technique wanted
On Sun, Oct 31, 2010 at 7:27 PM, Richard O'Keefe o...@cs.otago.ac.nz wrote:

    On 1/11/2010, at 12:05 PM, Gregory Collins wrote:

        They're called difference lists:

    As a matter of fact the original context was precisely difference
    lists in logic programming.
    http://hackage.haskell.org/packages/archive/dlist/latest/doc/html/Data-DList.html

Difference lists in logic programming are almost the opposite of functional representations of lists, and I find the move to try to nominally connect them worse than useless.
Re: [Haskell-cafe] Reference for technique wanted
On 10/31/10 7:10 PM, Derek Elkins wrote:

    Well, you can get "A Novel Representation of Lists and Its
    Application to the Function 'Reverse'" by John Hughes online,
    published in 1986, which is referenced by Wadler's 1987 "The
    Concatenate Vanishes" and references Richard Bird's 1984 paper
    "Transformational programming and the paragraph problem", though I'd
    be quite surprised if that was the first place the representation
    appeared in print.

Barring the "worse than useless" appellation, the technique has been around in logic programming (and classic Lisp, IIRC) for a few decades longer. I've always heard it referred to as part of the folklore of logic/functional programming, though, so I'm not sure of any earlier print references off-hand. (Though I find it curious that you think the logic version is so different...)

-- Live well, ~wren
Re: [Haskell-cafe] Reference for technique wanted
On Sun, Oct 31, 2010 at 9:02 PM, wren ng thornton w...@freegeek.org wrote:

    Barring the "worse than useless" appellation, the technique has been
    around in logic programming (and classic Lisp, IIRC) for a few
    decades longer. I've always heard it referred to as part of the
    folklore of logic/functional programming, though, so I'm not sure of
    any earlier print references off-hand.

I agree that difference lists have been around quite a bit longer, but they are something different.

    (Though I find it curious that you think the logic version is so
    different...)

I'm curious as to how you find them similar. Beyond both of them being ways to get fast appends in a declarative language, they have no similarities. To begin, Prolog is a first-order language, so it clearly can't represent functional lists. Meanwhile, difference lists rely on, at least, single-assignment variables, which Haskell does not have as a language feature, so Haskell can't represent difference lists outside of a monad. The use of logic variables requires a linear use of a difference list within a branch of non-deterministic computation, i.e. difference lists are not persistent. Functional lists clearly are. As a simple example, if xs is a functional list, I can return a pair (xs . ys, xs . zs); if I tried to do that with difference lists, I would be unifying ys and zs. If I -really- wanted to do that with difference lists, I would have to use a difference-list copy predicate to do it. Functional lists are an opaque data structure.
If I want to know what the head of a functional list is, I have to first turn it into a real list and then take the head. With difference lists, I can just look at the head, and this is cheap and easy.

Both representations have junk, though I'm inclined to say the functional representation has quite a bit more. At any rate, the junk is rather different. The junk of the functional representation is any [a] -> [a] function that can't be put into the form (xs ++) for some list xs; for example, reverse. Difference lists are pairs of lists where the latter is a suffix of the former. The junk in the typical representation, i.e. just pairs of lists, is the pairs that don't meet that criterion.

The idea behind difference lists is to represent the list xs as the pair (xs ++ ys, ys), i.e. xs ++ ys - ys = xs is where the name "difference list" comes from. One way of motivating the functional representation is that it is nothing more than the natural embedding of the list monoid into its monoid of endofunctions; for every set X, X -> X is a monoid, and for every monoid (M,*,1), curry (*) is a monoid homomorphism from M to (M -> M). I'm unsure how to apply either of these views to the other representation. In fact, difference lists are nothing more than the normal imperative way of handling appends quickly for singly-linked lists, with NULL replaced by an unbound variable.

To simplify, the difference in persistence between the two representations is enough to consider them very different, as it makes a dramatic difference in interface.
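The persistence point above can be made concrete in Haskell. A small sketch, with names xs, ys, zs echoing the (xs . ys, xs . zs) example; nothing here comes from a library:

```haskell
-- One functional list xs shared by two different extensions.
-- A Prolog difference list could not do this without an explicit
-- copy: returning both pairs would unify the two tails.
type FList a = [a] -> [a]

xs, ys, zs :: FList Int
xs = \t -> 1 : 2 : t
ys = \t -> 3 : t
zs = \t -> 4 : t

-- Both components reuse xs; neither use disturbs the other.
both :: (FList Int, FList Int)
both = (xs . ys, xs . zs)
```

Applying each component to [] recovers two independent lists, [1,2,3] and [1,2,4], which is exactly the persistence that the single-use logic-variable version lacks.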
Re: [Haskell-cafe] Reference for technique wanted
On 1/11/2010, at 2:02 PM, wren ng thornton wrote:

    Barring the "worse than useless" appellation, the technique has been
    around in logic programming (and classic Lisp, IIRC) for a few
    decades longer. I've always heard it referred to as part of the
    folklore of logic/functional programming, though, so I'm not sure of
    any earlier print references off-hand. (Though I find it curious
    that you think the logic version is so different...)

The logic programming version uses a SINGLE data structure for lists and differences, so that

 + converting from the difference representation to the plain
   representation involves no more than binding a single (logic) variable
 + does not involve ANY overheads compared with building a list directly
 + can easily be traversed while still under construction, as in the

       queue(s(...s(z)...), [X1,...,Xn|Hole], Hole)

   representation for a queue, which allows O(1) worst case enqueue
   and dequeue:

       enqueue(X, queue(N,List,[X|Hole]), queue(s(N),List,Hole)).
       dequeue(X, queue(s(N),[X|List],Hole), queue(N,List,Hole)).

   (Fernando Pereira taught me this one.)
 - can only be used once, after which the hole has *gone*.

The closure version uses TWO data structures, so that

 - converting from the difference representation to the plain list
   representation involves building a whole new data structure at O(n)
   time and space cost
 - requires building closures which have no other reason to exist, which
   may retain references to data that would otherwise be dead
 - cannot be traversed while under construction
 + can be used as many times as you want

I don't see the closure technique used in logic programming at all.
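There is no direct pure-Haskell analogue of the hole-based queue above, since it depends on binding a logic variable in place. For contrast, a sketch of what a purely functional language typically uses instead: the classic two-list queue, a different and well-known technique that gives amortized rather than worst-case O(1) operations:

```haskell
-- Classic persistent queue: a front list plus a reversed back list.
-- enqueue is O(1); dequeue is amortized O(1) (occasionally it pays
-- O(n) to reverse the back list), unlike the worst-case O(1) of the
-- Prolog hole-based queue.
data Queue a = Queue [a] [a]

emptyQ :: Queue a
emptyQ = Queue [] []

enqueue :: a -> Queue a -> Queue a
enqueue x (Queue front back) = Queue front (x : back)

dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue [] [])       = Nothing
dequeue (Queue [] back)     = dequeue (Queue (reverse back) [])
dequeue (Queue (x:f) back)  = Just (x, Queue f back)
```

Unlike the hole-based version, this queue is persistent: an old queue value remains usable after enqueue or dequeue, which is the same trade-off discussed for lists above.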
Re: [Haskell-cafe] Reference for technique wanted
On 10/31/10 10:26 PM, Richard O'Keefe wrote:

    On 1/11/2010, at 2:02 PM, wren ng thornton wrote:

        (Though I find it curious that you think the logic version is so
        different...)

    The logic programming version uses a SINGLE data structure for lists
    and differences, so that
     + converting from the difference representation to the plain
       representation involves no more than binding a single (logic)
       variable
     + does not involve ANY overheads compared with building a list
       directly
     + can easily be traversed while still under construction,
     - can only be used once, after which the hole has *gone*.

Sure, but all of these come from the differences in how functional languages and logic languages define their notions of variable. In logic programming you're allowed to play with open terms, so the ability to move around difference lists before they're complete is the same as being able to manipulate any other open term. Whereas in functional languages you (typically) are not allowed to have open terms, and terms with variables in them must be guarded by lambdas which prohibit you looking underneath them.

One benefit of lambdas is that they lift the names of unknown substructure up to the top of an expression; the difflist(xs,hole) structure is just there to give that same lifting of names, since otherwise we wouldn't have any way of knowing that xs is unground nor how to make it (more) ground[1].

The various other differences regarding linear usage just come down to the fact that difflist(xs,hole) is a closure, not a function. But there's nothing to prevent functionalizing it, if the logic language provides a mechanism for generating fresh variables.

When I look at difference lists in functional languages I see them in the same light: as a mechanism for manipulating open terms. The details of what manipulations are allowed is different (mostly because of function vs closure), but that's to be expected given the paradigm differences.
But then I take the algorithmic idea of using open terms for fast concatenation as primary and the implementation details of what open terms are as secondary.

[1] Assuming a pure logic language, which Prolog is not.

-- Live well, ~wren