RE: How to force evaluation entirely?

2000-09-26 Thread John Hughes


Simon PJ says:

Did you try "seq"?  
x `seq` y
should evaluate x to WHNF before returning y.  If x is a pair
you may need to say

seqPair x `seq` y

where
seqPair (a,b) = a `seq` b

in order to force the components.
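A minimal sketch of this suggestion (type signatures added; `forceThen` is a hypothetical helper name, not from the original mail):

```haskell
-- seqPair forces both components of a pair to WHNF: the pattern
-- match forces the pair itself, `seq` forces a, and the outer seq
-- on the result forces b.
seqPair :: (a, b) -> b
seqPair (a, b) = a `seq` b

-- Hypothetical wrapper: evaluate both components of x before y.
forceThen :: (a, b) -> c -> c
forceThen x y = seqPair x `seq` y

main :: IO ()
main = putStrLn (forceThen (1 + 1 :: Int, 2 + 2 :: Int) "done")
```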

Simon

There's an easier way to force structures hyperstrictly. To force x to be
evaluated to normal form before computing y, write
(x==x) `seq` y
This depends on all the types occurring in x being Eq types, and also on 
the implementation of == being hyperstrict when its result is true. This holds
for all derived instances, and for most programmer-defined ones too. After
all, if x==x returned True for some non-total x, then x==y would have to
return True for some pair of distinct values x and y, which we normally try
to avoid!

I sometimes write
if x==x then y else error "I am the pope!"
but the seq form is nicer!
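A hedged illustration of the (x==x) `seq` y idiom (the `forceNF` name is an editorial stand-in): comparing x to itself drives evaluation of the whole structure, so a bottom buried deep inside x surfaces before y is returned.

```haskell
import Control.Exception (SomeException, evaluate, try)

forceNF :: Eq a => a -> b -> b
forceNF x y = (x == x) `seq` y

main :: IO ()
main = do
  -- Fully defined structure: the comparison succeeds, y is returned.
  putStrLn (forceNF [1, 2, 3 :: Int] "ok")
  -- Bottom inside the structure: plain `seq` would only force the
  -- outermost constructor and miss it; (x==x) `seq` does not.
  r <- try (evaluate (forceNF [1, undefined, 3 :: Int] "ok"))
         :: IO (Either SomeException String)
  putStrLn (either (const "bottom detected") id r)
```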

John Hughes

| -Original Message-
| From: Michael Marte [mailto:[EMAIL PROTECTED]]
| 
| 
| 
| I am trying to process a huge bunch of large XML files in order
| to extract some data. For each XML file, a small summary (6 integers)
| is created which is kept until writing a HTML page displaying the
| results.
| 
| The ghc-compiled program behaves as expected: it opens one XML file
| after the other but does not read a lot. After some 50 files, it
| bails out due to lack of heap storage.
| 
| To overcome the problem, I tried to force the program to compute
| summaries immediately after reading the corresponding XML file. I
| tried eager application ($!), irrefutable patterns, and strictness
| flags, but I did not succeed.
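A hedged sketch of the kind of fix being sought here. `Summary`, `summarize`, and `processFiles` are hypothetical stand-ins for the poster's actual code; the point is forcing each summary to normal form as soon as it is computed, so the unread file contents are not retained.

```haskell
-- Six integers per file, as described in the mail.
type Summary = (Int, Int, Int, Int, Int, Int)

-- Force every field of a summary to WHNF (Int fields, so this is
-- normal form), then return the summary itself.
forceSummary :: Summary -> Summary
forceSummary s@(a, b, c, d, e, f) =
  a `seq` b `seq` c `seq` d `seq` e `seq` f `seq` s

-- Placeholder for the real per-file analysis.
summarize :: String -> Summary
summarize contents =
  let n = length contents
  in (n, n, n, n, n, n)

-- Force each summary before moving on to the next file, so only the
-- six integers (not the file contents) stay live on the heap.
processFiles :: [String] -> [Summary] -> [Summary]
processFiles []     acc = reverse acc
processFiles (f:fs) acc =
  let s = forceSummary (summarize f)
  in s `seq` processFiles fs (s : acc)

main :: IO ()
main = print (processFiles ["abc", "de", "f"] [])
```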




Re: combinator parsers and XSLT

2000-09-26 Thread Manuel M. T. Chakravarty

Lennart Augustsson [EMAIL PROTECTED] wrote,

 "Manuel M. T. Chakravarty" wrote:
 
  Currently, most Haskell systems don't support unicode anyway
  (I think, hbc is the only exception), so I guess this is not
  a pressing issue.  As soon as we have Unicode support and
  there is a need for lexers handling unicode input, I am
  willing to extend the lexer library to gracefully handle the
  cases that you outlined.
 I'm sorry, but I object (strongly) to this attitude.
 It's this kind of reasoning that stops Unicode from becoming
 widespread.

I am tempted to agree with you.  I am just a lazy bastard,
that's the problem.

 Soon the GHC people (or whoever :) will say "Well, why should we
 support Unicode, there's all this software out there that breaks down
 with it." and we're in a vicious circle.

Hmmm, in this particular case nothing breaks down.  The
lexer combinators themselves never internally use the
assumption that a char is 8 bits (I may be lazy, but I still
prefer clean code).  Only when you explicitly use them to
build a scanner that scans Unicode files (and is aware
of it) might you run into space-efficiency problems.

 Strong hint to various people:
 Haskell has had Unicode for a long time now.  I think that before
 you start implementing various extensions to Haskell, perhaps you
 should implement what the standard says should be there.
 Implementing Unicode isn't that hard, just a few days' work.

You might be pleased to hear that - if I am not mistaken -
Qrczak is working on Unicode support for GHC.

 Strongly opposing Anglo-Saxon language imperialism

:-)

Manuel




Re: combinator parsers and XSLT

2000-09-26 Thread Manuel M. T. Chakravarty

Lars Henrik Mathiesen [EMAIL PROTECTED] wrote,

  From: "Manuel M. T. Chakravarty" [EMAIL PROTECTED]
  Date: Tue, 26 Sep 2000 15:11:23 +1100
 
  For 16bit character ranges, it would be necessary to
  directly store negated character sets (such as [^abc]).
  From what he told me, Doitse Swierstra is working on a lexer
  that is using explicit ranges, but I am not sure whether he
  also has negated ranges.
 
 People with experience from other Unicode-enabled environments will
 expect support for character classes like letter or digit --- which in
 Unicode are not simple single ranges, but widely scattered over the
 map. (Just look at Latin-1, where you have to use [A-Za-zÀ-ÖØ-öø-ÿ]
 because two arithmetic operators snuck into the accented character
 range. (Blame the French)).
 
 Such support will also allow your parser to work with the next, bigger
 version of Unicode, since the parser library should just inherit the
 character class support from the Haskell runtime, which should in turn
 get it from the OS. The OS people are already doing the work to get
 the necessary tables and routines compressed into a few kilobytes.

Hmm, this seems like a shortcoming in the Haskell spec.  We
have all these isAlpha, isDigit, etc. functions, but I can't
get at a list of, say, all characters for which isAlpha is
true.
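The complaint can be made concrete: with the standard Data.Char predicates, the only way to recover a character class as a set is brute-force enumeration and filtering (shown here over the 8-bit range; over the full Unicode code space this is exactly the inefficiency being discussed).

```haskell
import Data.Char (chr, isAlpha)

-- Enumerate the whole (here: Latin-1) code space and keep the
-- characters the predicate accepts -- there is no direct way to ask
-- the library for this set.
alphaChars :: [Char]
alphaChars = filter isAlpha [chr 0 .. chr 255]

main :: IO ()
main = putStrLn (take 26 alphaChars)
```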

 Also, Unicode isn't 16-bit any more, it's more like 20.1 bits --- the
 range is hex 0 to 10FFFF. Although the official character assignments
 will stay below hex 20000 or so, your code may have to work on systems
 with private character assignments in the hex 100000+ range.

OK, I didn't mean that the mentioned extension will rely on
Unicode being 16 bits.  It is just the size at which you no
longer really want to build an exhaustive transition table.

Manuel




Re: How to force evaluation entirely?

2000-09-26 Thread Lennart Augustsson

"Ch. A. Herrmann" wrote:

 Hi,

 John There's an easier way to force structures hyperstrictly. To
 John force x to be evaluated to normal form before computing y,
 John write (x==x) `seq` y

 I'm heavily confused here.

 What happens, if

(a) an optimizer replaces (x==x) by True?

If an optimizer did that it would be severely broken in several ways.
First, there is absolutely no guarantee that the (==) operator defines
a reflexive relation.  E.g., I can (for a particular type) define
it to always return False if I like.
Second, even if (==) were defined to be reflexive, it's highly likely that
`x==x' would behave differently from `True'.  The former probably
diverges if x is bottom, whereas the latter doesn't.
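The first point is easy to witness (the `Weird` type is an editorial example, not from the mail): nothing forces a programmer-defined (==) to be reflexive, so rewriting x==x to True would change the program's meaning.

```haskell
newtype Weird = Weird Int

-- A perfectly legal Eq instance that is never reflexive.
instance Eq Weird where
  _ == _ = False

main :: IO ()
main = print (Weird 3 == Weird 3)  -- False, not True
```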



 If the optimizer is not permitted to do that,
 its power appears to be limited severely.

Who said Haskell was easy to optimize?

-- Lennart






Re: How to force evaluation entirely?

2000-09-26 Thread Carl R. Witty

John Hughes [EMAIL PROTECTED] writes:

 As far as the power of the optimizer is concerned, my guess is programmers
 very rarely write x==x (unless they MEAN to force x!), so the loss of
 optimization doesn't matter. Of course, in principle, an optimizer *could*
 replace x==x by x `seq` True (if x is known to be of base type), and the
 x `seq` might well be removed by later transformations (if x can be shown to be
 defined, something compilers do analyses to discover). Who knows, maybe this
 happens in the innards of ghc...

Or the compiler could internally create its own HyperStrict class and
replace x==x by x `hyperSeq` True, if all the Eq instances involved in
the type of x are known to be reflexive (which is the case if they
were all automatically derived). :-)
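The class Carl imagines might look like the sketch below (the names `HyperStrict` and `hyperSeq` are his hypotheticals; the same idea was later standardised as the NFData class in the deepseq library):

```haskell
-- Instances recursively force a value all the way to normal form
-- before returning the second argument.
class HyperStrict a where
  hyperSeq :: a -> b -> b

instance HyperStrict Int where
  hyperSeq = seq  -- WHNF is already normal form for Int

instance HyperStrict a => HyperStrict [a] where
  hyperSeq []     y = y
  hyperSeq (x:xs) y = x `hyperSeq` (xs `hyperSeq` y)

instance (HyperStrict a, HyperStrict b) => HyperStrict (a, b) where
  hyperSeq (a, b) y = a `hyperSeq` (b `hyperSeq` y)

main :: IO ()
main = ([1, 2] :: [Int], 3 :: Int) `hyperSeq` putStrLn "forced"
```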

Carl Witty