Re: Picky details about Unicode (was RE: Haskell 98 Report possible errors, part one)
Mon, 23 Jul 2001 11:23:30 -0700, Mark P Jones [EMAIL PROTECTED] writes:

> I guess the intention here is that:
>
>   symbol -> ascSymbol | uniSymbol_<special | _ | : | " | '>

Right.

> In fact, since all the characters in ascSymbol are either punctuation or symbols in Unicode, the inclusion of ascSymbol is redundant, and a better specification might be:
>
>   symbol -> uniSymbol_<special | _ | : | " | '>

It would still be nice to explicitly list ASCII symbols, so one doesn't need to look at Unicode specs to use ASCII-only source.

There are two places where character predicates are used in Haskell: program source and module Char. I'm sure that we all agree that they should be consistent with each other.

Some predicates in module Char are wrong, i.e. I don't agree with their meaning. For example, isSpace is restricted to ISO-8859-1, and caseless letters are considered uppercase.

It's not clear what good definitions are, or even what set of predicates is useful, because there is no single official source with an unambiguous and complete set of predicates. There are Unicode character categories, Unicode property lists, and implementations of C character predicates, all with different data. I guess the Java specs have something to say here too.

I have an implemented proposal of improved Char predicates in QForeign http://sf.net/projects/qforeign/. Definitions are based on both Unicode character categories and PropList.txt from Unicode.

--
Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
SYGNATURA ZASTĘPCZA QRCZAK
___
Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
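To make the idea of category-based predicates concrete, here is a minimal sketch. It assumes a modern GHC, whose Data.Char exports generalCategory (the Haskell 98 module Char has no such function). The names isUpperCat, isLowerCat, isCaselessLetter and isSpaceCat are hypothetical and are not QForeign's actual definitions, which also consult PropList.txt.

```haskell
module Main where

import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical predicate: only Lu and Lt count as uppercase, so caseless
-- letters (Lo) are no longer lumped in with uppercase.
isUpperCat :: Char -> Bool
isUpperCat c = generalCategory c `elem` [UppercaseLetter, TitlecaseLetter]

-- Hypothetical predicate: exactly the Ll category.
isLowerCat :: Char -> Bool
isLowerCat c = generalCategory c == LowercaseLetter

-- Hypothetical predicate: letters that have no case at all (Lo).
isCaselessLetter :: Char -> Bool
isCaselessLetter c = generalCategory c == OtherLetter

-- Hypothetical predicate: the ASCII control whitespace characters plus any
-- Unicode space separator (Zs), rather than a Latin-1-only test.
isSpaceCat :: Char -> Bool
isSpaceCat c = c `elem` "\t\n\r\f\v" || generalCategory c == Space

main :: IO ()
main = do
  print (isUpperCat 'A', isUpperCat '\x05D0')  -- U+05D0 HEBREW LETTER ALEF is Lo
  print (isCaselessLetter '\x05D0')
  print (isSpaceCat ' ', isSpaceCat '\x2003')  -- U+2003 EM SPACE is Zs
```

Whether this is the right split is exactly the open question in the message: property lists and C-style predicates would give different answers for some characters.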
Re: Haskell 98 Report possible errors, part one
From: Dylan Thurston [EMAIL PROTECTED]
Date: Mon, 23 Jul 2001 19:57:54 -0400

> On Mon, Jul 23, 2001 at 06:30:30AM -0700, Simon Peyton-Jones wrote:

Someone else, quoted by Simon, attribution elided by Dylan, wrote:

> > | 2.2. Identifiers can use small and large Unicode letters.
> > | What about caseless scripts where letters are neither small
> > | nor large? The description of module Char says: For the
> > | purposes of Haskell, any alphabetic character which is not
> > | lower case is treated as upper case (Unicode actually has
> > | three cases: upper, lower and title). This suggests that the
> > | only anomaly is that titlecase letters are considered
> > | uppercase. But what is actually specified is that caseless
> > | scripts can be used to write constructor names, but not to
> > | variable names. I don't know how to solve this.
> >
> > I am woefully ignorant of Unicode, and I have no idea what to do about this one. I therefore propose to do nothing on the grounds that I might easily make matters worse.
>
> In this case, what about requiring identifiers to start with an upper or lower case alphabetic character?

I'm not sure that makes things better. It just makes it impossible to have identifiers in caseless scripts (some of which are alphabetic). And whether you choose your upper or lower case alphabetic character from Latin, Greek, Coptic, Cyrillic, Armenian, Georgian, or Deseret, it will probably look silly in front of a variable name spelled in Hangul.

What would make sense to me is to define that caseless letters (Unicode class Lo) behave as lowercase, and to choose some easily visible, culturally neutral, symbol as the official 'conid marker'. Since the problem only arises on Unicode-capable systems, there should be plenty of those to choose from, even outside Latin-1.

To fix Haskell 98, the least intrusive way might be to allow only classes Ll, Lt, and Lu in identifiers, with Lt (titlecase) and Lu counting as uppercase --- it looks like that may actually have been the intention. And then add a note explaining that caseless scripts can't be used because they weren't considered initially.

Lars Mathiesen (U of Copenhagen CS Dep) [EMAIL PROTECTED] (Humour NOT marked)
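The two rules proposed in this message can be compared side by side with a small sketch. It assumes a modern GHC with Data.Char.generalCategory. The type StartKind and the functions leastIntrusive and caselessAsLower are hypothetical names for illustration only; the 'conid marker' symbol and the special treatment of the underscore are left out.

```haskell
module Main where

import Data.Char (GeneralCategory (..), generalCategory)

-- How a character would be treated at the start of an identifier.
data StartKind = VarStart | ConStart | Rejected
  deriving (Eq, Show)

-- The "least intrusive" fix: only Ll, Lt and Lu are allowed, with Lt and
-- Lu counting as uppercase. Caseless letters are rejected.
leastIntrusive :: Char -> StartKind
leastIntrusive c = case generalCategory c of
  LowercaseLetter -> VarStart
  UppercaseLetter -> ConStart
  TitlecaseLetter -> ConStart
  _               -> Rejected

-- The alternative: caseless letters (Lo) behave as lowercase.
caselessAsLower :: Char -> StartKind
caselessAsLower c = case generalCategory c of
  OtherLetter -> VarStart
  _           -> leastIntrusive c

main :: IO ()
main = mapM_ report "aA\x01C5\xD55C"  -- Ll, Lu, Lt (U+01C5), Lo (Hangul U+D55C)
  where
    report c = print (c, leastIntrusive c, caselessAsLower c)
```

The only character on which the two rules disagree here is the Hangul syllable, which the first rule rejects and the second accepts as the start of a variable name.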
RE: Haskell 98 Report possible errors, part one
> 3. A precedence table says that case (rightwards) has higher precedence than operators and right associativity. If it's meaningful to talk about precedence of such syntactic constructs as case at all, it should probably be told to have a lower precedence, so case x+1 of ... is valid as case (x+1) of. At least I don't see a difference between case (rightwards) and if (rightwards). I'm not sure if it makes sense to explain parsing of case in terms of precedence.

Interesting. The table seems to say that case (rightwards) has a higher precedence than infix operators, so that e.g.

  case x of p -> x + y

would parse as

  (case x of p -> x) + y

which is in conflict with the longest parse rule. I have no idea why case (rightwards) is given a different precedence, and the inclusion of 'case alternative' in the list is confusing.

> 4.3.1. A class declaration with no where part [...] The instance declaration must be given explicitly with no where part.
>
> Actually the where part may be present but empty, with the same meaning as no where part.
>
> Generally I'm not sure that having a layout rule which says that {} is inserted when the next indentation level is about to start and the new indent is smaller than the outer one is necessary; in all useful cases the keyword which triggered the layout could be omitted, and writing let x = case x of foo -> ... should be either an error or it should be allowed to have the next indent smaller than the previous one - it's not useful to let it mean let {x = case x of {}} foo -> ... and in case one really wants to have empty alts in case, he can write {} explicitly.

I agree, but this isn't really a bug so there's no need to change the report. Besides, GHC is the only compiler which actually implements the layout rule as specified :-)

Cheers,
Simon
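The difference between the two parses discussed above can be observed directly. In this sketch the function names longest, grouped and scrutinee are made up for illustration; the pattern variable deliberately shadows the outer y so that the two readings give different results.

```haskell
module Main where

-- The body of a case alternative extends as far to the right as possible,
-- so both occurrences of y refer to the pattern variable bound to x.
longest :: Int -> Int -> Int
longest x y = case x of y -> y + y

-- The other reading, forced with explicit parentheses: the second y is
-- the function argument.
grouped :: Int -> Int -> Int
grouped x y = (case x of y -> y) + y

-- The scrutinee may be an operator expression without parentheses,
-- i.e. case x + 1 of ... means case (x + 1) of ...
scrutinee :: Int -> String
scrutinee x = case x + 1 of
  1 -> "one"
  _ -> "other"

main :: IO ()
main = do
  print (longest 1 10)    -- prints 2
  print (grouped 1 10)    -- prints 11
  putStrLn (scrutinee 0)  -- prints one
```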
RE: Haskell 98 Report possible errors, part one
Marcin,

Thanks for your careful read. Many of your suggestions I will implement. I'll send separate email about any others.

[Haskell mailing list folk: I hope you'll forgive email about the minutiae of the Haskell Report. But I don't want to let changes, or even clarifications, go by without giving you all a chance to yell. It is amazing how the act of saying "here's the final version" has an uncanny ability to stimulate new, and entirely well-founded, feedback. I propose to continue this process, though. I continue to make strenuous efforts to change the report only (a) to clarify what is obscure, (b) to fix grievous errors or inconsistencies.]

Simon
RE: Haskell 98 Report possible errors, part one
Folks,

Marcin is right about this. It is inconsistent as it stands. I propose to delete the sentence "The Prelude module is always available as a qualified import..." in the first para of 5.6.1. The situation will then be:

  if you don't import Prelude explicitly, you implicitly get import Prelude
  if you do import Prelude explicitly, you get no implicit imports

Nice and simple.

Simon

| 5.6.1. an implicit `import qualified Prelude' is part of
| every module and names prefixed by `Prelude.' can always be
| used to refer to entities in the Prelude. So what happens in
| the following?
|
| module Test (null) where
| import Prelude hiding (null)
| null :: Int
| null = 0
|
| module Test2 where
| import Test as Prelude
| import Prelude hiding (null)
| x :: Int
| x = Prelude.null
|
| ghc allows that, it doesn't seem to implement the qualified
| part of the implicit Prelude import. The report is
| contradictory: adding `import qualified Prelude' makes
| Prelude.null ambiguous, and thus names prefixed by `Prelude.'
| can't always be used to refer to entities in the Prelude.
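The second rule ("if you do import Prelude explicitly, you get no implicit imports") can be seen in a single module. This is only a sketch of the behaviour as I understand it in current GHC; the local definition of null mirrors module Test from the quoted example.

```haskell
module Main where

-- An explicit import of Prelude replaces the implicit one, so the hiding
-- clause really does remove Prelude's null from scope.
import Prelude hiding (null)

-- A local null, as in module Test from the quoted example.
null :: Int
null = 0

main :: IO ()
main = do
  -- The unqualified name refers to the local definition, with no clash.
  print (null + 1)
  -- Names that were not hidden remain usable with the Prelude. prefix,
  -- because an ordinary import brings qualified names into scope as well.
  print (Prelude.length "ab")
```

Under the sentence proposed for deletion, Prelude.null would additionally have been in scope through the implicit qualified import, which is what makes the two-module example in the quoted text contradictory.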
RE: Haskell 98 Report possible errors, part one
| 3. A precedence table says that case (rightwards) has higher
| precedence than operators and right associativity. If it's
| meaningful to talk about precedence of such syntactic
| constructs as case at all, it should probably be told to have
| a lower precedence, so case x+1 of ... is valid as case
| (x+1) of At least I don't see a difference between
| case (rightwards) and if (rightwards). I'm not sure if it
| makes sense to explain parsing of case in terms of precedence.

I can't make head or tail of what Table 1 (Section 3, beginning) is trying to say. It claims to be an aid to understanding the grammar, but it seems downright confusing to me.

Proposal: remove Table 1 and its associated paragraph.

Does anyone like it?

Simon
RE: Haskell 98 Report possible errors, part one
| 2.2. Identifiers can use small and large Unicode letters.
| What about caseless scripts where letters are neither small
| nor large? The description of module Char says: For the
| purposes of Haskell, any alphabetic character which is not
| lower case is treated as upper case (Unicode actually has
| three cases: upper, lower and title). This suggests that the
| only anomaly is that titlecase letters are considered
| uppercase. But what is actually specified is that caseless
| scripts can be used to write constructor names, but not to
| variable names. I don't know how to solve this.

I am woefully ignorant of Unicode, and I have no idea what to do about this one. I therefore propose to do nothing on the grounds that I might easily make matters worse.

Simon
Re: Haskell 98 Report possible errors, part one
Unfortunately both the old and the new situation are not so nice. Neither allows a simple translation of Haskell into the Haskell kernel, e.g. you cannot translate [1..] into Prelude.enumFrom 1, because the latter may be ambiguous.

The following remark at the beginning of Section 3 is misleading:

  Free variables and constructors used in these translations refer to entities defined by the Prelude. To avoid clutter, we use True instead of Prelude.True or map instead of Prelude.map. (Prelude.True is a qualified name as described in Section 5.3.)

It implicitly suggests that a simple translation is possible.

Unfortunately I don't see any simple way to regain a simple translation. Hence I just suggest changing the remark at the beginning of Section 3: say that all free variables and constructors refer to entities defined by the Prelude, and warn that full qualification is in general not sufficient to achieve this (because the entity may not be imported, and because of import .. as).

Ciao,
Olaf

> Marcin is right about this. It is inconsistent as it stands. I propose to delete the sentence "The Prelude module is always available as a qualified import..." in the first para of 5.6.1. The situation will then be:
>
>   if you don't import Prelude explicitly, you implicitly get import Prelude
>   if you do import Prelude explicitly, you get no implicit imports
>
> Nice and simple
>
> | 5.6.1. an implicit `import qualified Prelude' is part of
> | every module and names prefixed by `Prelude.' can always be
> | used to refer to entities in the Prelude. So what happens in
> | the following?
> |
> | module Test (null) where
> | import Prelude hiding (null)
> | null :: Int
> | null = 0
> |
> | module Test2 where
> | import Test as Prelude
> | import Prelude hiding (null)
> | x :: Int
> | x = Prelude.null
> |
> | ghc allows that, it doesn't seem to implement the qualified
> | part of the implicit Prelude import. The report is
> | contradictory: adding `import qualified Prelude' makes
> | Prelude.null ambiguous, and thus names prefixed by `Prelude.'
> | can't always be used to refer to entities in the Prelude.

--
OLAF CHITIL, Dept. of Computer Science, University of York, York YO10 5DD, UK.
URL: http://www.cs.york.ac.uk/~olaf/
Tel: +44 1904 434756; Fax: +44 1904 432767
Re: Haskell 98 Report possible errors, part one
Mon, 23 Jul 2001 15:11:32 +0100, Olaf Chitil [EMAIL PROTECTED] writes:

> Both don't allow a simple translation of Haskell into the Haskell kernel, e.g. you cannot translate [1..] into Prelude.enumFrom 1, because the latter may be ambiguous.

That's why I was proposing that importing another module as Prelude should be the way to change the meaning of builtin syntax in ghc, instead of -fno-implicit-prelude combined with importing some names to be available unqualified.

It's not a change in the report, which doesn't support changing the meaning of builtin syntax and should only be clarified to say that entities refer to the standard Prelude. But as an extension some builtin syntax in ghc might be defined as textual expansion to Prelude-qualified names, so they may come either from the original Prelude or from a replacement.

To complete that extension it should be legal to self-import:

  module MyPrelude where
  import MyPrelude as Prelude
  import Prelude as P

so that 5 used in this very module expands to Prelude.fromIntegral 5, i.e. MyPrelude.fromIntegral 5.

--
Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
SYGNATURA ZASTĘPCZA QRCZAK
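For readers on a current GHC, a related (but not identical) mechanism is the RebindableSyntax extension, which makes builtin syntax use whatever names are in scope instead of Prelude-qualified names. The sketch below is an assumption-laden illustration, not the import-as-Prelude scheme proposed in the message. Note also that the Haskell report defines an integer literal via fromInteger rather than fromIntegral, so that is the name rebound here. The replacement fromInteger is a deliberately odd, hypothetical definition chosen only to make the rebinding visible.

```haskell
{-# LANGUAGE RebindableSyntax #-}
module Main where

-- RebindableSyntax switches off the implicit Prelude import, so it is
-- imported explicitly, minus the name being replaced.
import Prelude hiding (fromInteger)
import qualified Prelude as P

-- Hypothetical replacement: negates every integer literal. It avoids
-- literals in its own body, since those would call it recursively.
fromInteger :: Integer -> Int
fromInteger n = P.fromInteger (P.negate n)

-- The literal below is desugared to a call of the fromInteger in scope,
-- i.e. the replacement above rather than the Prelude's.
five :: Int
five = 5

main :: IO ()
main = print five  -- prints -5
```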
RE: Haskell 98 Report possible errors, part one
| Unfortunately both the old and the new situation are not so
| nice. Both don't allow a simple translation of Haskell into
| the Haskell kernel, e.g. you cannot translate [1..] into
| Prelude.enumFrom 1, because the latter may be ambiguous.
|
| The following remark at the beginning of Section 3 is misleading:
|
| Free variables and constructors used in these translations
| refer to entities defined by the Prelude. To avoid clutter,
| we use True instead of Prelude.True or map instead of
| Prelude.map. (Prelude.True is a qualified name as described
| in Section 5.3.)

The report is vainly trying to say that, regardless of what is lexically in scope, the builtin syntax refers to Prelude entities. Perhaps I should reword the offending paragraph to say:

  Free variables and constructors used in these translations refer to entities defined by the Prelude, regardless of what variables or constructors are actually in scope. For example, concatMap used in the translation of list comprehensions (Section 3.11) means the concatMap defined by the Prelude, regardless of whether or not concatMap or Prelude.concatMap are in scope.

Would that be better?

Simon
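The proposed wording can be checked with a small experiment: hide the Prelude's concatMap, define a deliberately wrong one locally, and see that a list comprehension is unaffected. The local concatMap and the name doubled are made up for this sketch.

```haskell
module Main where

import Prelude hiding (concatMap)

-- A deliberately wrong local concatMap that always returns the empty list.
concatMap :: (a -> [b]) -> [a] -> [b]
concatMap _ _ = []

-- The comprehension keeps its standard meaning, whatever concatMap
-- happens to be in scope.
doubled :: [Int]
doubled = [x * 2 | x <- [1, 2, 3]]

main :: IO ()
main = do
  print doubled                   -- prints [2,4,6]
  print (concatMap (: []) "abc")  -- prints "" (the local definition)
```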
Re: Haskell 98 Report possible errors, part one
> The report is vainly trying to say that, regardless of what is lexically in scope, the builtin syntax refers to Prelude entities. Perhaps I should reword the offending paragraph to say:
>
>   Free variables and constructors used in these translations refer to entities defined by the Prelude, regardless of what variables or constructors are actually in scope. For example, concatMap used in the translation of list comprehensions (Section 3.11) means the concatMap defined by the Prelude, regardless of whether or not concatMap or Prelude.concatMap are in scope.
>
> Would that be better?

You can probably delete the first ", regardless ..." clause for better readability. Maybe you should add "and refer unambiguously to the Prelude" to the end of the last sentence? Or does the report state somewhere that being in scope includes not being ambiguous? Unfortunately, as far as I see, the report does not even explain what it means for an identifier to be ambiguous.

--
OLAF CHITIL, Dept. of Computer Science, University of York, York YO10 5DD, UK.
URL: http://www.cs.york.ac.uk/~olaf/
Tel: +44 1904 434756; Fax: +44 1904 432767
Picky details about Unicode (was RE: Haskell 98 Report possible errors, part one)
| 2.2. Identifiers can use small and large Unicode letters ...

If we're picking on the report's handling of Unicode, here's another minor quibble to add to the list. In describing the lexical syntax of operator symbols, the report uses:

  varsym    -> (symbol {symbol | :})_<reservedop>
  symbol    -> ascSymbol | uniSymbol
  uniSymbol -> any Unicode symbol or punctuation

The last line seems to include more characters than I'd expect. Specifically:

  ()[]{}  are punctuation (Unicode type Pe, Ps)
  `       is a symbol, modifier (Unicode type Sk)
  ':;,    are punctuation, other (Unicode type Po)
  _       is punctuation, connector (Unicode type Pc)

And, so, if I read the report correctly, I should be able to define :-) as a consym and `div`, [], and hello as varsyms! (Not to mention some altogether more bizarre choices!)

I guess the intention here is that:

  symbol -> ascSymbol | uniSymbol_<special | _ | : | " | '>

In fact, since all the characters in ascSymbol are either punctuation or symbols in Unicode, the inclusion of ascSymbol is redundant, and a better specification might be:

  symbol -> uniSymbol_<special | _ | : | " | '>

All the best,
Mark

P.S. A caveat: I'm not a Unicode expert! Perhaps Marcin can advise ...
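Both claims in this message, that the listed characters fall under "symbol or punctuation" and that every ascSymbol character does too, can be checked mechanically. The sketch assumes a modern GHC, whose Data.Char provides generalCategory, isSymbol and isPunctuation. The ascSymbol string is the Haskell 98 list as I recall it, so treat it as an assumption and check it against the report.

```haskell
module Main where

import Data.Char (generalCategory, isPunctuation, isSymbol)

-- The ascSymbol characters of the Haskell 98 lexical syntax (assumed list).
ascSymbol :: String
ascSymbol = "!#$%&*+./<=>?@\\^|-~"

-- The characters singled out in the message as unexpectedly included.
suspects :: String
suspects = "()[]{}`':;,_"

-- The report's uniSymbol as literally worded: any Unicode symbol or
-- punctuation character.
isUniSymbolAsWorded :: Char -> Bool
isUniSymbolAsWorded c = isSymbol c || isPunctuation c

main :: IO ()
main = do
  -- Is every ascSymbol character already covered by uniSymbol?
  print (all isUniSymbolAsWorded ascSymbol)
  -- Which general category does each suspect character have?
  mapM_ (\c -> print (c, generalCategory c)) suspects
```

If the first line prints True, ascSymbol is indeed redundant in the grammar, as the message argues; the per-character output shows why brackets, the backquote and the underscore need to be excluded explicitly.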
Re: Haskell 98 Report possible errors, part one
On Mon, Jul 23, 2001 at 06:30:30AM -0700, Simon Peyton-Jones wrote:

> | 2.2. Identifiers can use small and large Unicode letters.
> | What about caseless scripts where letters are neither small
> | nor large? The description of module Char says: For the
> | purposes of Haskell, any alphabetic character which is not
> | lower case is treated as upper case (Unicode actually has
> | three cases: upper, lower and title). This suggests that the
> | only anomaly is that titlecase letters are considered
> | uppercase. But what is actually specified is that caseless
> | scripts can be used to write constructor names, but not to
> | variable names. I don't know how to solve this.
>
> I am woefully ignorant of Unicode, and I have no idea what to do about this one. I therefore propose to do nothing on the grounds that I might easily make matters worse.

In general, the situation with claimed support of Unicode in Haskell is pretty shaky; probably much more needs to be said in many places. I will continue to be dubious until some compiler actually provides reasonable support.

In this case, what about requiring identifiers to start with an upper or lower case alphabetic character?

--Dylan