RE: How to force evaluation entirely?
Simon PJ says:

> Did you try "seq"? x `seq` y should evaluate x to WHNF before
> returning y. If x is a pair you may need to say seqPair x `seq` y,
> where seqPair (a,b) = a `seq` b, in order to force the components.
>
> Simon

There's an easier way to force structures hyperstrictly. To force x to
be evaluated to normal form before computing y, write

    (x==x) `seq` y

This depends on all the types occurring in x being Eq types, and also
on the implementation of == being hyperstrict when its result is True.
This holds for all derived instances, and for most programmer-defined
ones too. After all, if x==x holds for any non-total x, then x==y must
hold for some pair of different values x and y, which we normally try
to avoid!

I sometimes write

    if x==x then y else error "I am the pope!"

but the seq form is nicer!

John Hughes

| -----Original Message-----
| From: Michael Marte [mailto:[EMAIL PROTECTED]]
|
| I am trying to process a huge bunch of large XML files in order
| to extract some data. For each XML file, a small summary (6 integers)
| is created which is kept until writing an HTML page displaying the
| results.
|
| The ghc-compiled program behaves as expected: It opens one XML file
| after the other but does not read a lot. After some 50 files, it
| bails out due to lack of heap storage.
|
| To overcome the problem, I tried to force the program to compute
| summaries immediately after reading the corresponding XML file. I
| tried some eager application ($!), some irrefutable patterns, and
| some strictness flags, but I did not succeed.
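Both techniques from the thread can be sketched in a few lines. This is a minimal illustration, not Michael's actual XML program: `seqPair` is Simon's helper, `forceEq` is John's (x==x) trick, and the names and example values are assumptions for demonstration.

```haskell
-- Simon PJ's helper: evaluating  seqPair p `seq` y  forces both
-- components of the pair p to WHNF before y is returned.
seqPair :: (a, b) -> b
seqPair (a, b) = a `seq` b

-- John Hughes's trick: comparing x with itself forces x to normal
-- form, provided (==) at x's type is hyperstrict when it returns True
-- (as all derived Eq instances are).
forceEq :: Eq a => a -> b -> b
forceEq x y = (x == x) `seq` y

main :: IO ()
main = do
  -- hypothetical "summary" standing in for the 6 integers per XML file
  let summary = (1 + 2, 3 * 4) :: (Int, Int)
  seqPair summary `seq` putStrLn "pair components forced"
  forceEq (Just [1 .. 5 :: Int]) (putStrLn "structure forced to normal form")
```

Note that plain `x `seq` y` only forces the outermost constructor of x; the (x==x) form walks the whole structure, which is what the space-leak situation above needs.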
Re: combinator parsers and XSLT
Lennart Augustsson [EMAIL PROTECTED] wrote,

> "Manuel M. T. Chakravarty" wrote:
> > Currently, most Haskell systems don't support unicode anyway (I
> > think, hbc is the only exception), so I guess this is not a pressing
> > issue. As soon as we have unicode support and there is a need for
> > lexers handling unicode input, I am willing to extend the lexer
> > library to gracefully handle the cases that you outlined.
>
> I'm sorry, but I must object (strongly) to this attitude. It's this
> kind of reasoning that stops Unicode from becoming widespread.

I am tempted to agree with you. I am just a lazy bastard, that's the
problem.

> Soon the GHC people (or whoever :) will say "Well, why should we
> support Unicode, there's all this software out there that breaks down
> with it." and we're in a vicious circle.

Hmmm, in this particular case nothing breaks down. The lexer
combinators themselves never internally use the assumption that a char
is 8 bits (I may be lazy, but I still prefer clean code). Only when you
explicitly use them to build a scanner that does scan unicode files
(and is aware of it) might you run into space efficiency problems.

> Strong hint to various people: Haskell has had Unicode for a long time
> now. I think that before you start implementing various extensions to
> Haskell, perhaps you should implement what the standard says should be
> there. Implementing Unicode isn't that hard, just a few days' work.

You might be pleased to hear that - if I am not mistaken - Qrczak is
working on Unicode support for ghc.

Strongly opposing Anglo-Saxon language imperialism :-)

Manuel
Re: combinator parsers and XSLT
Lars Henrik Mathiesen [EMAIL PROTECTED] wrote,

> > From: "Manuel M. T. Chakravarty" [EMAIL PROTECTED]
> > Date: Tue, 26 Sep 2000 15:11:23 +1100
> >
> > For 16-bit character ranges, it would be necessary to directly store
> > negated character sets (such as [^abc]). From what he told me,
> > Doaitse Swierstra is working on a lexer that is using explicit
> > ranges, but I am not sure whether he also has negated ranges.
>
> People with experience from other Unicode-enabled environments will
> expect support for character classes like letter or digit --- which in
> Unicode are not simple single ranges, but widely scattered over the
> map. (Just look at Latin-1, where you have to use [A-Za-zÀ-ÖØ-öø-ÿ]
> because two arithmetic operators snuck into the accented character
> range. (Blame the French)).
>
> Such support will also allow your parser to work with the next, bigger
> version of Unicode, since the parser library should just inherit the
> character class support from the Haskell runtime, which should in turn
> get it from the OS. The OS people are already doing the work to get
> the necessary tables and routines compressed into a few kilobytes.

Hmm, this seems like a shortcoming in the Haskell spec. We have all
these isAlpha, isDigit, etc. functions, but I can't get at a list of,
say, all characters for which isAlpha is true.

> Also, Unicode isn't 16-bit any more, it's more like 20.1 bits --- the
> range is hex 0 to 10FFFF. Although the official character assignments
> will stay below hex 2FFFF or so, your code may have to work on systems
> with private character assignments in the hex F0000+ range.

Ok, I didn't really mean that the mentioned extension will rely on
Unicode being 16 bits. This is only a size where you don't really want
to build an exhaustive transition table anymore.

Manuel
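One workaround for the shortcoming Manuel mentions is to enumerate a code-point range and filter it with the predicate. A small sketch (the restriction to 0..255 is just an illustrative Latin-1 bound, and `latin1Letters` is a made-up name):

```haskell
import Data.Char (chr, isAlpha)

-- Recover "all characters for which isAlpha is true" by brute force:
-- enumerate the code points of interest and filter with the predicate.
latin1Letters :: [Char]
latin1Letters = filter isAlpha (map chr [0 .. 255])

main :: IO ()
main = do
  print ('\192' `elem` latin1Letters)  -- '\192' = 'À', a letter
  print ('\215' `elem` latin1Letters)  -- '\215' = multiplication sign, not a letter
```

This also shows Lars's point about the scattered classes: the result is not one contiguous range, because the two arithmetic operators (hex D7 and F7) punch holes in the accented block.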
Re: How to force evaluation entirely?
"Ch. A. Herrmann" wrote:

> Hi,
>
> John> There's an easier way to force structures hyperstrictly. To
> John> force x to be evaluated to normal form before computing y,
> John> write (x==x) `seq` y
>
> I'm heavily confused here. What happens, if (a) an optimizer replaces
> (x==x) by True?

If an optimizer did that, it would be severely broken in several ways.

First, there is absolutely no guarantee that the (==) operator defines
anything that is a reflexive relation. E.g., I can (for a particular
type) define it to always return False if I like.

Second, even if (==) was defined to be reflexive, it's highly likely
that `x==x' would behave differently than `True'. The former probably
diverges if x is bottom, whereas the latter doesn't.

> If the optimizer is not permitted to do that, its power appears to be
> limited severely.

Who said Haskell was easy to optimize?

-- Lennart
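Lennart's first point is easy to make concrete. Here is a deliberately pathological (and hypothetical) type whose (==) always returns False, so rewriting x==x to True would change the program's result:

```haskell
-- A legal but non-reflexive Eq instance: nothing in the Eq class
-- obliges (==) to be a reflexive relation.
newtype Never = Never Int

instance Eq Never where
  _ == _ = False

main :: IO ()
main = do
  let x = Never 1
  print (x == x)  -- prints False, so x==x is NOT interchangeable with True
```

His second point is the one John's trick relies on: even for derived, reflexive instances, `x==x` diverges when x is bottom, while the literal `True` does not, so the two expressions are not semantically equal.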
Re: How to force evaluation entirely?
John Hughes [EMAIL PROTECTED] writes:

> As far as the power of the optimizer is concerned, my guess is
> programmers very rarely write x==x (unless they MEAN to force x!), so
> the loss of optimization doesn't matter. Of course, in principle, an
> optimizer *could* replace x==x by x`seq`True (if x is known to be of
> base type), and the x`seq` might well be removed by later
> transformations (if x can be shown to be defined, something compilers
> do analyses to discover). Who knows, maybe this happens in the innards
> of ghc...

Or the compiler could internally create its own HyperStrict class and
replace x==x by x`hyperSeq`True, if all the Eq instances involved in
the type of x are known to be reflexive (which is the case if they were
all automatically derived). :-)

Carl Witty
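Carl's hypothetical HyperStrict class might be sketched as follows. The class and instance names are illustrative, not an actual GHC feature (the later deepseq library's NFData/rnf plays essentially this role):

```haskell
-- Sketch of a HyperStrict class: hyperSeq forces its first argument
-- all the way to normal form before returning the second.
class HyperStrict a where
  hyperSeq :: a -> b -> b

instance HyperStrict Int where
  hyperSeq x y = x `seq` y

instance (HyperStrict a, HyperStrict b) => HyperStrict (a, b) where
  hyperSeq (a, b) y = a `hyperSeq` (b `hyperSeq` y)

instance HyperStrict a => HyperStrict [a] where
  hyperSeq []       y = y
  hyperSeq (x : xs) y = x `hyperSeq` (xs `hyperSeq` y)

main :: IO ()
main = (1 :: Int, [2, 3 :: Int]) `hyperSeq` putStrLn "forced to normal form"
```

Unlike the (x==x) trick, this needs no Eq constraint, only an instance per type constructor, which is why a compiler (or a deriving mechanism) could generate it mechanically.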