unicode in source [was Re: lexer puzzle]
hello, Frank Atanassow wrote: On vrijdag, sep 26, 2003, at 09:16 Europe/Amsterdam, John Meacham wrote: On Fri, Sep 26, 2003 at 08:59:12AM +0200, Ketil Z. Malde wrote: I think there is a problem with too much overloaded syntax. Perhaps it is time to put non-ASCII characters to good use? For instance, function composition could use the degree sign: and leave the . for module qualification. why not the actual functional composition operator: or we could also make good use ofand all the other fun mathematical operators. This is very readable, but not very writable. Until I get a keyboard with a key, I would prefer to restrict the syntax to ASCII/whatever and simply make it the editor's responsibility to display source code using special characters. actually it is not that hard to write unicode characters. after all there are no capital letter keys on the keyboard either, but we can write them. i use vim and at least with it it quite easy to teach the editor to recognize certain sequences of characters and replace them with the appropriate mathematical symbol. a similar thing happens when you write Isabelle theories with x-symbol, and it really works quite nicely. i think it would be nice to allow programmers to write unicode for what is now done with sequences of ascii chars (not replacing the current symbols, but adding additional syntax). bye iavor ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe -- == | Iavor S. Diatchki, Ph.D. student | | Department of Computer Science and Engineering | | School of OGI at OHSU | | http://www.cse.ogi.edu/~diatchki | == ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
Brandon Michael Moore [EMAIL PROTECTED] writes: Or was that supposed to be composition of a constructor with a function, A . f? Function composition, and higher order functions in general are likely to confuse an imperative programmer, but I think there isn't much syntax can do there. I think there is a problem with too much overloaded syntax. Perhaps it is time to put non-ASCII characters to good use? For instance, function composition could use the degree sign: ° and leave the . for module qualification. Template Haskell could use double-angle quotation marks: « » and the section sign: § and avoid clashing with list comprehensions and the function application operator. Implicit parameters could use an inverted question mark: ¿ And so on, just look for places where the semantics depend on spaces in the right (or wrong) place. -kzm -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
On Fri, Sep 26, 2003 at 08:59:12AM +0200, Ketil Z. Malde wrote: Brandon Michael Moore [EMAIL PROTECTED] writes: Or was that supposed to be composition of a constructor with a function, A . f? Function composition, and higher order functions in general are likely to confuse an imperative programmer, but I think there isn't much syntax can do there. I think there is a problem with too much overloaded syntax. Perhaps it is time to put non-ASCII characters to good use? For instance, function composition could use the degree sign: and leave the . for module qualification. why not the actual functional composition operator: or we could also make good use ofand all the other fun mathematical operators. Template Haskell could use double-angle quotation marks: and the section sign: and avoid clashing with list comprehensions and the function application operator. Implicit parameters could use an inverted question mark: And so on, just look for places where the semantics depend on spaces in the right (or wrong) place. I would love to be able to use unicode to make my programs more readable. just as an alternate syntax for awkward ascii constructs. and as operator, function names when they make sense. this could probably be done with a preprocessor, but wolud be easier in the compiler to work out the layout rule and handle language extensions and whatnot. John -- --- John Meacham - California Institute of Technology, Alum. - [EMAIL PROTECTED] --- ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
Hi. I'm really new to Haskell, just learning it, and I must say I'm pretty overwhelmed by the large variety of constructs. (=, -, \ to name a few) Would that be \ as in TREX row variable polymorphism? Just remember most operators are just library functions. It's only =, -, =, -, :: that are really part of the language, and {,},; for grouping. Did I miss any? Yes, you missed \, which is used for function abstraction: (\x - x*x) 3 And ( , ) for tuples. And I don't think - is part of the language - it only appears in the type syntax, not term syntax. If you allow it, you have to allow * as well. --KW 8-) -- Keith Wansbrough [EMAIL PROTECTED] http://www.cl.cam.ac.uk/users/kw217/ University of Cambridge Computer Laboratory. ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
Keith Wansbrough [EMAIL PROTECTED] writes: And I don't think - is part of the language - it only appears in the type syntax, not term syntax. If you allow it, you have to allow * as well. Errm, you just gave an example of - in the term syntax... (\x - x*x) 3 Regards, Malcolm ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
Keith Wansbrough [EMAIL PROTECTED] writes: And I don't think - is part of the language - it only appears in the type syntax, not term syntax. If you allow it, you have to allow * as well. Errm, you just gave an example of - in the term syntax... (\x - x*x) 3 Guilty... sorry! :-( --KW 8-) -- Keith Wansbrough [EMAIL PROTECTED] http://www.cl.cam.ac.uk/users/kw217/ University of Cambridge Computer Laboratory. ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
On vrijdag, sep 26, 2003, at 09:16 Europe/Amsterdam, John Meacham wrote: On Fri, Sep 26, 2003 at 08:59:12AM +0200, Ketil Z. Malde wrote: I think there is a problem with too much overloaded syntax. Perhaps it is time to put non-ASCII characters to good use? For instance, function composition could use the degree sign: and leave the . for module qualification. why not the actual functional composition operator: or we could also make good use ofand all the other fun mathematical operators. This is very readable, but not very writable. Until I get a keyboard with a key, I would prefer to restrict the syntax to ASCII/whatever and simply make it the editor's responsibility to display source code using special characters. ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
A... should be split into A.. and . I found a compromise: let's make it a lexing error! :-) At least that agrees with what some Haskell compilers implement. No current Haskell compiler/interpreter agrees with what the report seems to say, that is that A... should be lexed as the two tokens A.. and ., and similarly, A.where should be lexed as A.wher followed by e. Hi. I'm really new to Haskell, just learning it, and I must say I'm pretty overwhelmed by the large variety of constructs. (=, -, \ to name a few) But I'm just writing this to let you guys know (surely you know this already) that anyone from a C/C++/Java/Delphi background is going to completely misunderstand the meaning of A.anything in Haskell... it's completely nonintuitive to people with my background. I kinda like dot notation because it ties together the symbols visually, for instance myrec.myfield is more of a unit than myrec myfield. It stays together better when surrounded by other code, and would result in fewer parenthesis necessary. Haskell to me seems to be a great language with a syntax problem, and a bad case of too many ways to do the same thing; thus every programmer does things their own way and it's difficult to grasp the language by looking at various programs, since they're all so very different. As a small example, there's 'let' vs. 'where'. Maybe a bit of pruning would be in order. That said, I still think it looks more promising than any other language I've looked at that actually is being actively used and maintained and has a decent installed base and good cross platform support. So I will learn it. I just wish the transition was easier and that it took less time to learn. ;) Sean ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Hi, But I'm just writing this to let you guys know (surely you know this already) that anyone from a C/C++/Java/Delphi background is going to completely misunderstand the meaning of A.anything in Haskell... it's completely nonintuitive to people with my background. Surely this is no worse than misunderstanding '=', as in: f n = n + 1 is it? I'd say of all the hurdles going from C++-esque to Haskell, the A.foo is one of the least troubling (I could be wrong and would like to know if I am). Haskell to me seems to be a great language with a syntax problem, and a bad case of too many ways to do the same thing; thus every programmer does I've always thought it the opposite :). Let vs. where can be somewhat confusing, and it is largely a matter of style, but they're not completely interchangable, esp. in the presence of, say guards, ie.: f x | x 0 = foo x | x 0 = foo (-x) where foo y = ... could not be done with a let. My 2 cents... - Hal ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Hi, Haskell to me seems to be a great language with a syntax problem, and a bad case of too many ways to do the same thing; thus every programmer does things their own way and it's difficult to grasp the language by looking at examples. int fact(int x){int t=1; while(x) t*=x--;} int fact(int x){int t=1; for(int i=1;i=x;i++) t=t*i;} int fact(int x){if(x=0) return 1; else return x * fact(x-1);} int fact(int x){return aux(x,1);} int aux(int x, int t){ return x ? t : aux(x-1,t*x);} int fact(int x){int t=1; while(x (t*=x--)); return t;} #define TRUE 1 #define FALSE 0 int fact(int x) { if((x==0)==TRUE) return 1; else { t = fact(x-1); return(t*x); } } x = x + 1; x += 1; x++; #include stdio.h main(t,_,a) char *a; { return!0t?t3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)): 1,t_?main(t+1,_,a):3,main(-94,-27+t,a)t==2?_13? main(2,_+1,%s %d %d\n):9:16:t0?t-72?main(_,t, @n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\ ;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \ q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \ ){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \ iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \ ;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \ }'+}##(!!/) :t-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1) :0t?main(2,2,%s):*a=='/'||main(0,main(-61,*a, !ek;dc [EMAIL PROTECTED]'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry),a+1); } I'm not being facetious here, but deadly serious. Your average language that is worth anything allows programmers to program in more than one way. If the language truely constrained everyone to produce a given program looking exactly the same way, then it is probably a nightmare to program in. Speaking as someone who comes from a C/C++/Java/Scheme/Assembler/Algebra background, I don't find Haskell syntax to be any more annoying than any other typical syntax. And I did not find any particular clash with `.' It's like the the game of Go, the rules (syntax) you learn in one sitting, but the strategy can take a life time. Any programmer should be able to pick up a new syntax in a week. (not that you want to do that every week, because you want to get down to playing the game). Regards, Bruce. Institute for Information and Mathematical Sciences Massey University at Albany, email: [EMAIL PROTECTED] web: http://www.massey.ac.nz/~bimills ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Note I've replied to haskell-cafe. This post is a bit chatty and low on solid answers. On Thu, 25 Sep 2003, Sean L. Palmer wrote: A... should be split into A.. and . I found a compromise: let's make it a lexing error! :-) At least that agrees with what some Haskell compilers implement. No current Haskell compiler/interpreter agrees with what the report seems to say, that is that A... should be lexed as the two tokens A.. and ., and similarly, A.where should be lexed as A.wher followed by e. Hi. I'm really new to Haskell, just learning it, and I must say I'm pretty overwhelmed by the large variety of constructs. (=, -, \ to name a few) Would that be \ as in TREX row variable polymorphism? Just remember most operators are just library functions. It's only =, -, =, -, :: that are really part of the language, and {,},; for grouping. Did I miss any? But I'm just writing this to let you guys know (surely you know this already) that anyone from a C/C++/Java/Delphi background is going to completely misunderstand the meaning of A.anything in Haskell... it's completely nonintuitive to people with my background. I kinda like dot notation because it ties together the symbols visually, for instance myrec.myfield is more of a unit than myrec myfield. It stays together better when surrounded by other code, and would result in fewer parenthesis necessary. A Python programmer would understand instantly: Python uses exactly the same syntax for module access, though Python modules are usually in lowercase. It also seems to be very much in the spirit of access a member of this object of an OO language. Or was that supposed to be composition of a constructor with a function, A . f? Function composition, and higher order functions in general are likely to confuse an imperative programmer, but I think there isn't much syntax can do there. Or are you talking about the field access syntax? Maybe the problem is that dot has two to five different meanings, function composition, naming module members, building hierarchial module names, being a decimal point, and making elipses, and is commonly used for yet another purpose in OO languages. Haskell to me seems to be a great language with a syntax problem, and a bad case of too many ways to do the same thing; thus every programmer does things their own way and it's difficult to grasp the language by looking at various programs, since they're all so very different. As a small example, there's 'let' vs. 'where'. Maybe a bit of pruning would be in order. Do you mean the syntax is bad in places? Haskell is the cleanest language I know of, but I'm sure it has some grungy bits. I've had problems with unary minus (can't slice binary minus), and precedence of with irrefuatble patterns and type ascription. I would be happy for any confusing syntax to be improved. Any good ideas? Syntax change is a possibility: do notation is a relatively recent addition, and arrow syntax is in the works. I think you might instead mean the syntax cuts down our market share because it isn't like common (C derived) languages. I don't think Haskell could borrow any more syntax from C without actually making the language worse. It's a problem, but not with the syntax. If someone is so solidly into a C++/Java/OO mindset that the syntax would be a problem, the semantics would probably be even more of a problem. I would suggest Python if Haskell was too much of a jump for someone. It's still OO, but it encourages more flexible and interesting programs, and you don't have to live in a Java type system. Plus, it has more libraries, bindings, and PR, so it's easier to get permission to use it in a company. If someone is used to Python's layout rule and lack of type signatures, and gets their head around some of the fun you can have dynamically picking which members of an object to access, assigning to __dict__ and so on, then Haskell should be much less of a jump in syntax, and less imposing in semantics. That said, I still think it looks more promising than any other language I've looked at that actually is being actively used and maintained and has a decent installed base and good cross platform support. So I will learn it. I just wish the transition was easier and that it took less time to learn. ;) Sean I learned Haskell from the gentle introduction. It seemed gentle enough to me but others disagree, so I'm probably not the best for advice for the raw beginner. If you are interested in learning about monads though, Jeff Newbern's monad tutorial seems accessible and as complete as anything this side of Phil Wadler's paper. I hope learning Haskell goes well. Brandon ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: lexer puzzle
Marcin 'Qrczak' Kowalczyk wrote: A... should be split into A.. and . I found a compromise: let's make it a lexing error! :-) At least that agrees with what some Haskell compilers implement. No current Haskell compiler/interpreter agrees with what the report seems to say, that is that A... should be lexed as the two tokens A.. and ., and similarly, A.where should be lexed as A.wher followed by e. It seems that the source of the problem is the use of the (nonstandard) difference operator r1r2 in the specification of the lexical syntax in the Haskell report [1]. It was presumably fairly innocent and easy to understand originally, but I guess that nobody understood its consequences when qualified names were introduced. Anyway, this should teach us not to use homemade pseudo formal notation when defining a new language, but to stick to well-established, well-understood, tool-supported formalisms... For the Programatica Haskell front-end, I have now switched to a regular expression compiler that has direct support for the difference operator, so hopefully, our implementation agrees with what the report specifies. (This is not necessarily a good thing, though, since it makes our front-end different from all other Haskell implementations :-) For what it is worth, I tested A..., A.where and A.-- in the main Haskell implementations and in the Programatica Haskell front-end. The input was two modules A and B: 1 module A where 2 3 wher = id 4 e = id 1 module B where 2 import A 3 4 x = (A.where) 5 y = x Here is the result: GHC: B.hs:4: parse error on input `where' HBC: B.hs, line 4, Bad qualified name on input:eof Hugs: ERROR B.hs:4 - Undefined qualified variable A.where NHC98: Identifier A.where used at 4:6 is not defined. PFE: ok If line 4 in module B is replaced with x = (A...): GHC: B.hs:4: Variable not in scope: `A...' HBC: B.hs, line 4, Bad qualified name on input:eof Hugs: ERROR B.hs:4 - Undefined qualified variable A... NHC98: Identifier A... used at 4:6 is not defined. PFE: B.hs:4,13, before ): syntax error (A.. is lexed as A.. .) If line 4 in module B is replaced with x = (A.--) GHC: B.hs:4: Variable not in scope: `A.--' HBC: B.hs, line 5, syntax error on input:= (treats -- as the start of a comment) Hugs: ERROR B.hs:4 - Undefined qualified variable A.-- NHC98: Identifier A.-- used at 4:6 is not defined. PFE: B.hs:5,1, before : syntax error (A.-- is lexed as A.- -) I used the following versions GHC 6.0.1 HBC 0..5b Hugs 98 November 2002 NHC98 1.16 PFE 030912 -- Thomas H [1] http://www.haskell.org/onlinereport/syntax-iso.html ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Thus, the only possible lexical interpretation is the one you first suggested, namely a constructor A followed by a three-dot operator A... should be split into A.. and . I found a compromise: let's make it a lexing error! :-) -- __( Marcin Kowalczyk \__/ [EMAIL PROTECTED] ^^ http://qrnik.knm.org.pl/~qrczak/ ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes: A... A (constructor), then ... (operator). This is how I understand Haskell 98 lexing rules. Argh, I was wrong. It's A.. (qualified operator), then . (operator). You are forgetting about the maximal munch rule. An operator cannot appear directly next to another operator without some whitespace to separate them. For instance A.+. is an operator called (+.) from module A, not an operator called + followed by compose. But, although A could be the three-dot operator ... from the module A, it is not possible to have A... interpreted as a two-dot operator, because .. is reserved as sugar for enumeration sequences, and so is explicitly excluded from the varsym production. Thus, the only possible lexical interpretation is the one you first suggested, namely a constructor A followed by a three-dot operator Regards, Malcolm ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: lexer puzzle
Iavor Diatchki writes: what do people think should be the tokens produced by a haskell lexer when applied to the following input: A... This has been discussed before (a while back, admittedly). See: http://www.mail-archive.com/[EMAIL PROTECTED]/msg04054.html GHC (still) gets this wrong. It's a documented bug though. Cheers, Simon ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
I agree with Marcin, A... should be split into A.. and . As I read the (on-line) report the maximal munch rule says that you should read the longest lexeme. It does not say that two operators have to be separated by whitespace. Because A... is not a lexeme, the longest lexeme you can read from A... is A.. (qualified dot-operator). Arthur On maandag, sep 15, 2003, at 12:11 Europe/Amsterdam, Malcolm Wallace wrote: Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes: Argh, I was wrong. It's A.. (qualified operator), then . (operator). You are forgetting about the maximal munch rule. An operator cannot appear directly next to another operator without some whitespace to separate them. For instance A.+. is an operator called (+.) from module A, not an operator called + followed by compose. But, although A could be the three-dot operator ... from the module A, it is not possible to have A... interpreted as a two-dot operator, because .. is reserved as sugar for enumeration sequences, and so is explicitly excluded from the varsym production. Thus, the only possible lexical interpretation is the one you first suggested, namely a constructor A followed by a three-dot operator Regards, Malcolm ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: lexer puzzle
I agree with Marcin, A... should be split into A.. and . As I read the (on-line) report the maximal munch rule says that you should read the longest lexeme. It does not say that two operators have to be separated by whitespace. Because A... is not a lexeme, the longest lexeme you can read from A... is A.. (qualified dot-operator). Wow! I hadn't noticed that before. This means that for example, 'M.where ' must be interpreted as the two tokens 'M.wher' 'e' This is bound to keep the Obfuscated Haskell programmers happy :-) Cheers, Simon ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Dnia nie 14. wrzenia 2003 01:04, Derek Elkins napisa: A... A (constructor), then ... (operator). This is how I understand Haskell 98 lexing rules. My first thought was that it should produce, A.. ., as in (.) (A..), but obviously that would be wrong as A.. must be a function and therefore to be passed to (.) it would need to be (A..). Argh, I was wrong. It's A.. (qualified operator), then . (operator). So it's syntax error recognized during parsing. I take this to mean the (..) function from the Prelude module. .. is a reserved operator, used for ranges. -- __( Marcin Kowalczyk \__/ [EMAIL PROTECTED] ^^ http://qrnik.knm.org.pl/~qrczak/ ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
Dnia pi 12. wrzenia 2003 20:31, Iavor Diatchki napisa: what do people think should be the tokens produced by a haskell lexer when applied to the following input: A... A (constructor), then ... (operator). This is how I understand Haskell 98 lexing rules. -- __( Marcin Kowalczyk \__/ [EMAIL PROTECTED] ^^ http://qrnik.knm.org.pl/~qrczak/ ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: lexer puzzle
On Sun, 14 Sep 2003 00:30:40 +0200 Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] wrote: Dnia pi_ 12. wrze_nia 2003 20:31, Iavor Diatchki napisa_: what do people think should be the tokens produced by a haskell lexer when applied to the following input: A... A (constructor), then ... (operator). This is how I understand Haskell 98 lexing rules. My first thought was that it should produce, A.. ., as in (.) (A..), but obviously that would be wrong as A.. must be a function and therefore to be passed to (.) it would need to be (A..). So with a little more thought, I seem to agree with GHC despite it being non-sensical. GHC produces the following given: 5 Prelude... 6 Variable not in scope: `Prelude...' I take this to mean the (..) function from the Prelude module. This is, as far as I'm concerned, syntactically correct, despite the fact that there isn't any way to make a (..) function* as .. is syntax (A... being different syntax). * Well... actually I'm pretty certain you could with TH ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell