Re: When do named subs bind to their variables? (Re: Questionable scope of state variables ([perl #113930] Lexical subs))
Father Chrysostomos via RT perlbug-comm...@perl.org wrote on Sat, 07 Jul 2012 17:44:46 PDT:

I’m forwarding this to the Perl 6 language list, to see if I can find an answer there. I do have an answer from Damian, which I will enclose below, and a Rakudo result for you.

[This conversation is about how lexical subs should be implemented in Perl 5. What Perl 6 does may help in determining how to iron out the edge cases.]

[...]

This question might be more appropriate: In this example, which @a does the bar subroutine see (in Perl 6)?

    sub foo {
        my @a = (1,2,3);
        my sub bar { say @a };
        @a := [4,5,6];
        bar();
    }

The answer to your immediate question is that if you call foo(), it prints out 456 under Rakudo.

Following is Damian's answer to my question, shared with permission.

--tom

From: Damian Conway dam...@conway.org
To: Tom Christiansen tchr...@perl.com
CC: Larry Wall la...@wall.org
Date: Sun, 08 Jul 2012 07:17:19 +1000
Delivery-Date: Sat, 07 Jul 2012 15:19:09
Subject: Re: my subs and state vars
In-Reply-To: 22255.1341691089@chthon

It looks like perl5 may be close to having my subs, but a puzzle has emerged about how in some circumstances to treat state variables within those. [I'm pretty sure that perl6 has thought this through thoroughly, but [I] am personally unfamiliar with the outcome of said contemplations.] I bet you aren't, though. Any ideas or clues?

The right thing to do (and what Rakudo actually does) is to treat lexical subs as lexically scoped *instances* of the specified sub within the current surrounding block.

That is: a lexical sub is like a my var, in that you get a new one each time the surrounding block is executed, rather than like an our variable, where you get a new lexically scoped alias to the same package-scoped variable.
By that reasoning, state vars inside a my sub must belong to each instance of the sub, just as state vars inside anonymous subs belong to each instance of the anonymous sub.

Another way of thinking about what Perl 6 does is that:

    my sub foo { whatever() }

is just syntactic sugar for:

    my &foo := sub { whatever() }

That is: create a lexically scoped Code object and alias it at run-time to an anonymous subroutine. So the rules for state variables inside lexical subs *must* be the same as the rules for state variables inside anonymous subs, since they're actually just two ways of creating the same thing.

With this approach, in Perl 6 it's easy to specify exactly what you want:

    sub recount_from ($n) {
        my sub counter {
            state $count = $n;   # Each instance of counter has its own count
            say $count--;
            die if $count == 0;
        }
        while prompt "recount $n" {
            counter;
        }
    }

vs:

    sub first_count_down_from ($n) {
        state $count = $n;       # All instances of counter share a common count
        my sub counter {
            say $count--;
            die if $count == 0;
        }
        while prompt "first count $n" {
            counter;
        }
    }

Feel free to forward the above to anyone who might find it useful.

Damian
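For the Perl 5 side of the question, the anonymous-sub behaviour that Damian uses as the model can be checked directly. The following is only a minimal sketch, assuming perl 5.10 or later for state; make_counter is a hypothetical helper, not code from the thread. Each call returns a fresh closure over $n, and each closure carries its own state variable.

```perl
use strict;
use warnings;
use feature qw(say state);

# make_counter() is a hypothetical helper, not code from the thread.
sub make_counter {
    my ($n) = @_;
    return sub {
        state $count = $n;    # initialized once per closure instance
        return $count--;
    };
}

my $first  = make_counter(3);
my $second = make_counter(10);

say $first->();     # 3
say $first->();     # 2
say $second->();    # 10: the second closure has its own $count
say $first->();     # 1
```

If lexical named subs follow the same rule, a my sub declared inside make_counter would count independently per call in just the same way.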
Re: When do named subs bind to their variables? (Re: Questionable scope of state variables ([perl #113930] Lexical subs))
Father Chrysostomos via RT perlbug-comm...@perl.org wrote on Sat, 07 Jul 2012 18:54:15 PDT: Thank you. So the bar sub seems to be closing over the name @a (the container/variable slot/pad entry/whatever), rather than the actual array itself. Since I don't have it installed, could you tell me what this does? All three of those say the same thing: 123 456 --tom
Re: Underscores v Hyphens (Was: [perl6/specs] a7cfe0: [S32] backtraces overhaul)
Darren Duncan dar...@darrenduncan.net wrote on Wed, 24 Aug 2011 11:18:20 PDT:

Smylers wrote: Could we have underscores and hyphens mean the same thing? That is, Perl 6 always interprets illo-figut and illo_figut as being the same identifier (both for its own identifiers and those minted in programs), with programmers able to use either separator on a whim?

I oppose this. Underscores and hyphens should remain distinct.

That would seem to be the most human-friendly approach.

I disagree. More human friendly is if it looks different in any way then it is different. (I am not also saying that same-looking things are equal, given Unicode's redundancy.)

Your mentioning of Unicode is poignant. In Unicode properties, you are not supposed to have to worry about these things. For example, from UTS#18:

    Note: Because it is recommended that the property syntax be lenient as to spaces, casing, hyphens and underbars, any of the following should be equivalent: \p{Lu}, \p{lu}, \p{uppercase letter}, \p{Uppercase Letter}, \p{Uppercase_Letter}, and \p{uppercaseletter}

Similarly, since this applies to property names as well as to property values, these are all the same:

    \p{GC=Lu}
    \p{gc=Lu}
    \p{General Category=Lu}
    \p{General_Category=Lu}
    \p{general_category=Lu}
    \p{general-category=Lu}
    \p{GENERAL-CATEGORY=Lu}
    \p{generalcategory=Lu}
    \p{GENERALCATEGORY=Lu}

I'll let you permute the RHS on your own. :)

However, I use the opposite of that sort of loose matching of identifiers in my own code. For example, when I make a named character alias, I always use lowercase so that it looks different from an official one.
    use charnames ":full", ":alias" => {
        e_acute  => "LATIN SMALL LETTER E WITH ACUTE",
        ae       => "LATIN SMALL LETTER AE",
        smcap_ae => "LATIN LETTER SMALL CAPITAL AE",   # this is a lowercase letter
        AE       => "LATIN CAPITAL LETTER AE",
        oe       => "LATIN SMALL LIGATURE OE",
        smcap_oe => "LATIN LETTER SMALL CAPITAL OE",   # this is a lowercase letter
        OE       => "LATIN CAPITAL LIGATURE OE",
    };

I don't make E_ACUTE and eacute also work there. However, there is a new :loose that does do that, but I suspect I shan't use it, since I use both ae and AE differently in existing code.

--tom
UCA and NFC/NFD issues in pattern matching
I have two points. First, this excerpt from Synopsis 5:

    The :m (or :ignoremark) modifier scopes exactly like :ignorecase except that it ignores marks (accents and such) instead of case. It is equivalent to taking each grapheme (in both target and pattern), converting both to NFD (maximally decomposed) and then comparing the two base characters (Unicode non-mark characters) while ignoring any trailing mark characters. The mark characters are ignored only for the purpose of determining the truth of the assertion; the actual text matched includes all ignored characters, including any that follow the final base character.

    The :mm (or :samemark) variant may be used on a substitution to change the substituted string to the same mark/accent pattern as the matched string. Mark info is carried across on a character by character basis. If the right string is longer than the left one, the remaining characters are substituted without any modification. (Note that NFD/NFC distinctions are usually immaterial, since Perl encapsulates that in grapheme mode.) Under :sigspace the preceding rules are applied word by word.

In perl5, one must manually run two matches on all data.

First: I notice that ignoring marks (and such) and ignoring case are both effects of the Unicode Collation Algorithm, at different strengths. What about simply allowing folks to specify which of the four (or more, I guess) levels of UCA equivalence/folding they want?

Second: I'm not altogether reassured by the parenned bit about NFD/NFC being immaterial. That's because I've been pretty annoyed lately in perl5 with having to manually run *everything* through a double match every time, and I can't avoid it by prenormalizing. I'm just hoping that perl6 will handle this better.
It's usually like this:

    NFD($data) =~ $pattern || NFC($data) =~ $pattern

Or if you know your data is NFD:

    $data =~ $pattern || NFC($data) =~ $pattern

Or if you know your data is NFC:

    NFD($data) =~ $pattern || $data =~ $pattern

That's because even if your data is in a known state with respect to normalization, if your pattern admits both NFD and NFC forms, which it would if read in from a file etc, then you have to run them both.

For example, suppose you read a pattern whose characters are specified indirectly/symbolically:

    $pattern = q(\xE9);        # LATIN SMALL LETTER E WITH ACUTE

or

    $pattern = q(e\x{301});    # e + COMBINING ACUTE ACCENT

It would be ok if those were literal characters, because you could just NFD the patterns and be done. But they're not. So in order for $data =~ $pattern to work properly with both, you really have to do a guaranteed double-convert/match each time.

This is rather unfortunate, to put it mildly. What you really want is a pattern compile flag that imposes canonical matching, and does this correctly even when faced with named characters, etc.

My read of S05 suggests that this will not be an issue.

I do wonder what happens when you want to match just the combining part. Does that fail in grapheme mode? It shouldn't: you *can* have standalones. But then we're back to partial matches in the middle of things, which is something that plagues us with full Unicode case-folding. This is the "\N{LATIN SMALL LIGATURE FFI}" =~ /(f)(f)/i problem, amongst others. Seems that you are going to get into the same dilemma if you allow matching partial graphemes in grapheme mode. Hm.

--tom
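The double match described above can be sketched in Perl 5 as follows. This is only an illustration, assuming the core Unicode::Normalize module; matches_either_form is a hypothetical helper name, and the two pattern strings stand in for patterns read from a file, where the escapes are interpreted only when the regex is compiled.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper: true if $pattern matches either normalization of $data.
sub matches_either_form {
    my ($data, $pattern) = @_;
    return (NFD($data) =~ $pattern || NFC($data) =~ $pattern) ? 1 : 0;
}

# Symbolic patterns, as if read from a file (single-quoted, so the
# backslash escapes reach the regex compiler untouched).
my $precomposed = q(r\xE9sum\xE9);
my $decomposed  = q(re\x{301}sume\x{301});

my $data = "r\x{E9}sum\x{E9}";    # text that happens to be in NFC

for my $source ($precomposed, $decomposed) {
    my $re = qr/\A$source\z/;
    printf "%-24s single match: %-3s  double match: %s\n",
        $source,
        ($data =~ $re ? "yes" : "no"),
        (matches_either_form($data, $re) ? "yes" : "no");
}
```

The single match succeeds only when the pattern happens to be written in the same form as the data; the double match succeeds for both spellings, at the cost of normalizing and matching twice.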
Perl6 regexes and UTS#18
Has anybody specifically looked at how Perl6 regexes might map to the various requirements of UTS#18, Unicode Regular Expressions?

    http://unicode.org/reports/tr18/

I ask because to my inexperienced eye, quite a few perl6isms are *much* better at this than what obtains in perl5, and so I wondered whether this was by conscious intent and design. Is/Was it?

I'm also curious whether there are active plans to address the tr18 requirements in perl6 regexes. It would be a wonderful feather in perl6's cap to be able to legitimately claim Level 2 or even Level 3 compliance, since besides perl5, only ICU right now manages even Level 1, with everybody else *very* far behind.

TR18 specifies three levels of support (Basic, Extended, and Tailored), with each having specific, reasonably well-defined requirements:

=Level 1: Basic Unicode Support
    RL1.1   Hex Notation
    RL1.2   Properties
    RL1.2a  Compatibility Properties
    RL1.3   Subtraction and Intersection
    RL1.4   Simple Word Boundaries
    RL1.5   Simple Loose Matches
    RL1.6   Line Boundaries
    RL1.7   Supplementary Code Points

=Level 2: Extended Unicode Support
    RL2.1   Canonical Equivalents
    RL2.2   Default Grapheme Clusters
    RL2.3   Default Word Boundaries
    RL2.4   Default Loose Matches
    RL2.5   Name Properties
    RL2.6   Wildcard Properties

=Level 3: Tailored Unicode Support
    RL3.1   Tailored Punctuation
    RL3.2   Tailored Grapheme Clusters
    RL3.3   Tailored Word Boundaries
    RL3.4   Tailored Loose Matches
    RL3.5   Tailored Ranges
    RL3.6   Context Matching
    RL3.7   Incremental Matches
  ( RL3.8   Unicode Set Sharing )
    RL3.9   Possible Match Sets
    RL3.10  Folded Matching
    RL3.11  Submatchers

thanks,

--tom
Re: Unicode Categories
Patrick wrote at 12:15pm CST on Wednesday, 10 November 2010:

Sorry if this is the wrong forum. I was wondering if there was a way to specify unicode categories http://www.fileformat.info/info/unicode/category/index.htm in a regular expression (and hence a grammar), or if there would be any consideration for adding support for that (requiring some kind of special syntax).

Unicode categories are done using assertion syntax with "is" followed by the category name. Thus <isLu> (uppercase letter), <isNd> (decimal digit), <isZs> (space separator), etc. This even works in Rakudo today:

    $ ./perl6
    > say 'abcdEFG' ~~ / <isLu> /
    E

They can also be combined, as in <+isLu+isLt> (uppercase+titlecase). The relevant section of the spec is in Synopsis 5; search for "Unicode properties are always available with a prefix". Hope this helps!

Actually, that quote from the Synopsis raises more questions than it answers. Below I've annotated the three output groups with (letters):

    % uniprops -a A
    U+0041 ‹A› \N{ LATIN CAPITAL LETTER A }:

    (A) \w \pL \p{LC} \p{L_} \p{L} \p{Lu}

    (B) AHex ASCII_Hex_Digit All Any Alnum Alpha Alphabetic ASCII Assigned
        Cased Cased_Letter LC Changes_When_Casefolded CWCF
        Changes_When_Casemapped CWCM Changes_When_Lowercased CWL
        Changes_When_NFKC_Casefolded CWKCF Lu L Gr_Base Grapheme_Base Graph
        GrBase Hex XDigit Hex_Digit ID_Continue IDC ID_Start IDS Letter L_
        Latin Latn Uppercase_Letter PerlWord PosixAlnum PosixAlpha
        PosixGraph PosixPrint PosixUpper Print Upper Uppercase Word
        XID_Continue XIDC XID_Start XIDS

    (C) Age:1.1 Block=Basic_Latin Bidi_Class:L Bidi_Class=Left_To_Right
        Bidi_Class:Left_To_Right Bc=L Block:ASCII Block:Basic_Latin
        Blk=ASCII Canonical_Combining_Class:0
        Canonical_Combining_Class=Not_Reordered
        Canonical_Combining_Class:Not_Reordered Ccc=NR
        Canonical_Combining_Class:NR Decomposition_Type:None Dt=None
        East_Asian_Width:Na East_Asian_Width=Narrow East_Asian_Width:Narrow
        Ea=Na Grapheme_Cluster_Break:Other GCB=XX Grapheme_Cluster_Break:XX
        Grapheme_Cluster_Break=Other Hangul_Syllable_Type:NA
        Hangul_Syllable_Type=Not_Applicable
        Hangul_Syllable_Type:Not_Applicable Hst=NA
        Joining_Group:No_Joining_Group Jg=NoJoiningGroup
        Joining_Type:Non_Joining Jt=U Joining_Type:U
        Joining_Type=Non_Joining Script=Latin Line_Break:AL
        Line_Break=Alphabetic Line_Break:Alphabetic Lb=AL Numeric_Type:None
        Nt=None Numeric_Value:NaN Nv=NaN Present_In:1.1 Age=1.1 In=1.1
        Present_In:2.0 In=2.0 Present_In:2.1 In=2.1 Present_In:3.0 In=3.0
        Present_In:3.1 In=3.1 Present_In:3.2 In=3.2 Present_In:4.0 In=4.0
        Present_In:4.1 In=4.1 Present_In:5.0 In=5.0 Present_In:5.1 In=5.1
        Present_In:5.2 In=5.2 Script:Latin Sc=Latn Script:Latn
        Sentence_Break:UP Sentence_Break=Upper Sentence_Break:Upper SB=UP
        Word_Break:ALetter WB=LE Word_Break:LE Word_Break=ALetter

What that means is that the B properties are properties from the *General* category. They may all be referred to as \p{X} or \p{IsX}, \p{General_Category=X} or \p{General_Category:X}, and \p{GC=X} or \p{GC:X}.

I have a feeling that your synopsis quote is referring only to type B properties. It is not talking about type C properties, which must also be accounted for.

--tom
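To make the difference between the single-form and the two-part spellings concrete, here is a small Perl 5 sketch that tries several of them against the letter A. It assumes a perl recent enough to accept the compound \p{Name=Value} and \p{Name:Value} forms; the particular list of spellings and the helper name matching_forms are chosen for illustration, not taken from the uniprops tool.

```perl
use strict;
use warnings;

# A handful of spellings drawn from the listing above: single forms
# first, then two-part Property=Value / Property:Value forms.
my @forms = (
    '\p{Lu}',
    '\p{IsLu}',
    '\p{Uppercase_Letter}',
    '\p{General_Category=Lu}',
    '\p{GC:Lu}',
    '\p{Script=Latin}',
    '\p{Block=Basic_Latin}',
);

# Hypothetical helper: which of the spellings match a given character?
sub matching_forms {
    my ($char) = @_;
    return grep { $char =~ /$_/ } @forms;
}

for my $form (@forms) {
    printf "%-26s %s\n", $form, ("A" =~ /$form/ ? "matches A" : "does not match A");
}
```

Script and Block can only be reached through the two-part syntax (or a dedicated prefix), which is the point about the type C properties needing to be accounted for separately.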
Re: Unicode Categories
Patrick wrote:
: * Almost. E.g. <isL> would be nice to have as well.
:
: Those exist also:
:
:     $ ./perl6
:     > say 'abCD34' ~~ / <isL> /
:     a
:     > say 'abCD34' ~~ / <isN> /
:     3
:

They may exist, but I'm not certain it's a good idea to encourage the Is_XXX approach on *anything* except Script=XXX properties. They certainly don't work on everything, you know.

Also, I can't for the life of me see why one would ever write isL when Letter is so much more obvious; similarly, for isN over Number. Just because you can do so, doesn't mean you necessarily should.

http://unicode.org/reports/tr18/#Categories

    The recommended names for UCD properties and property values are in PropertyAliases.txt [Prop] and PropertyValueAliases.txt [PropValue]. There are both abbreviated names and longer, more descriptive names. It is strongly recommended that both names be recognized, and that loose matching of property names be used, whereby the case distinctions, whitespace, hyphens, and underbar are ignored.

Furthermore, be aware that the Number property is *NOT* the same as the Decimal_Number property. In perl5, if one wants [0-9], then one expresses it exactly that way, since that's a lot shorter than writing (?=\p{ASCII})\p{Nd}, where Nd can also be Decimal_Number.

Again, please note that Number is far broader than even Decimal_Number, which is itself almost certainly broader than you're thinking.

Here's a trio of little programs specifically designed to help scout out Unicode characters and their properties. They work best on 5.12+, but should be ok on 5.10, too.

--tom

[Attachment: unitrio.tar.gz, application/tar]
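The gap between [0-9], Decimal_Number and Number is easy to see on a few sample code points. The sketch below is only an illustration in Perl 5; the four sample characters and the classify helper are assumptions picked for the demonstration, not part of the attached programs.

```perl
use strict;
use warnings;
use charnames ();

# Hypothetical helper: report which of the three tests a code point passes.
sub classify {
    my ($cp) = @_;
    my $ch = chr $cp;
    return {
        ascii_digit    => ($ch =~ /\A[0-9]\z/ ? 1 : 0),
        decimal_number => ($ch =~ /\A\p{Nd}\z/ ? 1 : 0),
        number         => ($ch =~ /\A\p{N}\z/  ? 1 : 0),
    };
}

# DIGIT SEVEN, ARABIC-INDIC DIGIT THREE, VULGAR FRACTION ONE HALF,
# ROMAN NUMERAL EIGHT
for my $cp (0x0037, 0x0663, 0x00BD, 0x2167) {
    my $c = classify($cp);
    printf "U+%04X %-26s [0-9]=%d  Nd=%d  N=%d\n",
        $cp, charnames::viacode($cp),
        @{$c}{qw(ascii_digit decimal_number number)};
}
```

Only the first sample passes all three tests; the Arabic-Indic digit is a Decimal_Number but not [0-9], and the fraction and the Roman numeral are Numbers without being Decimal_Numbers.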
Perl6 and accents
Exegesis 5 @ http://dev.perl.org/perl6/doc/design/exe/E05.html reads:

    # Perl 6
    / < <alpha> - [A-Za-z] >+ /    # All alphabetics except A-Z or a-z
                                   # (i.e. the accented alphabetics)

    [Update: Would now need to be <+alpha - [A..Za..z]> to avoid ambiguity with "Texas quotes", and because we want to reserve whitespace as the first character inside the angles for other uses.]

    Explicit character classes were deliberately made a little less convenient in Perl 6, because they're generally a bad idea in a Unicode world. For example, the [A-Za-z] character class in the above examples won't even match standard alphabetic Latin-1 characters like 'Ã', 'é', 'ø', let alone alphabetic characters from code-sets such as Cyrillic, Hiragana, Ogham, Cherokee, or Klingon.

First off, that "i.e. the accented alphabetics" phrasing is quite incorrect! Code like /[^\P{Alpha}A-Za-z]/ matches not just things like

    00C1 LATIN CAPITAL LETTER A WITH ACUTE
    00C7 LATIN CAPITAL LETTER C WITH CEDILLA
    00C8 LATIN CAPITAL LETTER E WITH GRAVE
    00E5 LATIN SMALL LETTER A WITH RING ABOVE
    00F1 LATIN SMALL LETTER N WITH TILDE

but also of course:

    00AA FEMININE ORDINAL INDICATOR
    00B5 MICRO SIGN
    00BA MASCULINE ORDINAL INDICATOR
    00C6 LATIN CAPITAL LETTER AE
    00D0 LATIN CAPITAL LETTER ETH
    00DE LATIN CAPITAL LETTER THORN
    00DF LATIN SMALL LETTER SHARP S
    00E6 LATIN SMALL LETTER AE
    00F0 LATIN SMALL LETTER ETH
    01A6 LATIN LETTER YR
    01BA LATIN SMALL LETTER EZH WITH TAIL
    01BC LATIN CAPITAL LETTER TONE FIVE
    01BF LATIN LETTER WYNN
    02C7 CARON
    0391 GREEK CAPITAL LETTER ALPHA
    0410 CYRILLIC CAPITAL LETTER A

and many, many more.

I'm also disappointed to see perl6 spreading the notion that "accent" is somehow a valid synonym for

    diacritical marking
    diacritic marking
    diacritic mark

It's not. Accent is not a synonym for any of those. Not all marks are accents, and not all accents are marks.

I believe what is meant by accent is NFD($char) =~ /\pM/. Fine: then say "with diacritics", not "with accents".
Also, there are many combining characters that aren't accents by any stretch of the term, such as 20E3 COMBINING ENCLOSING KEYCAP, to name just one. Only three code points have official names that include ACCENT, and even these are dubious.

Finally, I note also that people use the Alpha property too loosely. Note the caron and such above. One probably wants the LC property instead.

--tom

    use charnames ();
    use Unicode::Normalize;

    for $cp ( 1 .. 0x ) {
        $orig  = chr($cp);
        $canon = NFD($orig);    # NFKD gives diff results
        ## if ($orig =~ /[^\P{Alpha}A-Za-z]/) {
        if ($orig =~ /\p{LC}/ && $canon !~ /^[A-Za-z]/) {
            printf("%c %04X %s\n", $cp, $cp, charnames::viacode($cp));
        }
    }
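The "with diacritics" test mentioned above, NFD($char) =~ /\pM/, can be tried on a few Latin-1 letters to see that it separates é and ñ from æ, ø and ß. This is a minimal sketch assuming the core Unicode::Normalize module; has_mark is a hypothetical helper name.

```perl
use strict;
use warnings;
use charnames ();
use Unicode::Normalize qw(NFD);

# Hypothetical helper: does the canonical decomposition contain a mark?
sub has_mark {
    my ($char) = @_;
    return NFD($char) =~ /\p{M}/ ? 1 : 0;
}

# e-acute, n-tilde, ae, o-with-stroke, sharp s
for my $cp (0x00E9, 0x00F1, 0x00E6, 0x00F8, 0x00DF) {
    printf "U+%04X %-36s %s\n",
        $cp, charnames::viacode($cp),
        has_mark(chr $cp) ? "base + combining mark" : "no combining mark";
}
```

All five are alphabetic and outside [A-Za-z], yet only the first two decompose into a base letter plus a mark, which is why "everything in Alpha except A-Z and a-z" is a poor definition of "accented".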
Re: Amazing Perl 6
· Quoth Larry: ˸ So let’s not make the mistake of thinking something ˸ longer is always less confusing or more official. ⋮ I already have too much problem with people thinking the ⋮ efficiency of a perl construct is related to its length. So you’re saying the Law of Parsimony has its uses… a̲n̲d̲ abuses? ☻ --tom -- ENTIA NON · SVNT M V L T I P L I C A N D A PRÆTER N̳E̳C̳E̳S̳S̳I̳T̳A̳T̳E̳M̳
Re: Files, Directories, Resources, Operating Systems
In-Reply-To: Message from Mark Overmeer [EMAIL PROTECTED] of Thu, 27 Nov 2008 08:23:50 +0100.

* Tom Christiansen ([EMAIL PROTECTED]) [081126 23:55]:

On Wed, 26 Nov 2008 11:18:01 PST.--or, for backwards compatibility, at 7:18:01 p.m. hora Romae on a.d. VI Kal. Dec. MMDCCLXI AUC, Larry Wall [EMAIL PROTECTED] wrote:

SUMMARY: I've been looking into this sort of thing lately (see p5p), and there may not even *be* **a** right answer. The reasons why take us into an area we've traditionally avoided.

What a long message...

It *was*? That was approaching a medium in my epistolary (and RFC) world, the one unrelated to PostIt notes. I can therefore see you've never been FMTEYEWTK'd, and thus also to all outward appearances, we've not made each other's acquaintance. I'm tchrist; pleased to meet you.

Read the http://www.unicode.org/reports/tr10/ treatise, as I have repeatedly done, and you will quickly reassess your length calls. This is not necessarily a good thing. Neal Stephenson can do the same, and of far lesser utility.

--tom
Re: Files, Directories, Resources, Operating Systems
In-Reply-To: Message from Darren Duncan [EMAIL PROTECTED] of Wed, 26 Nov 2008 19:34:09 PST.

Tom Christiansen wrote: I believe database folks have been doing the same with character data, but I'm not up-to-date on the DB world, so maybe we have some metainfo about the locale to draw on there. Tim?

AFAIK, modern databases are all strongly typed at least to the point that the values you store in and fetch from them are each explicitly character data or binary data or numbers or what-have-you; and so, when you are dealing with a DBMS in terms of character data, it is explicitly specified somewhere (either locally for the data or globally/hardcoded for the DBMS) that each value of character data belongs to a particular character repertoire and text encoding, and so the DBMS knows what encoding etc the character data is in, or at least it treats it consistently based on what the user said it was when it input the data.

Oh, good then. That's what I'd heard was happening, but wasn't sure since I've steered clear of such beasties since before it was true.

I wish our filesystems worked that way. But Andrew said something to me last week about Ken and Dennis writing quite pointedly that while you *could* use the f/s as a database, that you *shouldn't*. I didn't know the reference he was thinking of, so just nodded pensively (=thoughtfully).

There is ABSOLUTELY NO WAY I've found to tell whether these utf-8 strings should test equal, and when, nor how to order them, without knowing the locale:

    RESUME
    Resume
    resume
    Resum\x{e9}
    r\x{E9}sum\x{E9}
    r\x{E9}sume\x{301}
    Re\x{301}sume\x{301}

Case insensitively, in Spanish they should be identical in all regards. In French, they should be identical but for ties, in which case you work your way right to left on the diacriticals.

This leads me to talk about my main point about sensitivity etc.
I believe that the most important issues here, those having to do with identity, can be discussed and solved without unduly worrying about matters of collation;

It's funny you should say that, as I could nearly swear that I just showed that identity cannot be determined in the examples above without knowing about locales. To wit, while all of those sort somewhat differently, even case-insensitively, no matter whether you're thinking of a French or a Spanish ordering (and what is English's, anyway?), you have a more fundamental = vs != scenario which is entirely locale-dependent.

If I can make a RESUME file, ought I be able to make a distinct r\x{E9}sum\x{E9} or re\x{301}sume\x{301} file in a case-ignorant filesystem? There is no good answer, because we might think it reasonable to

    lc(strip_marks($old_fn)) eq lc(strip_marks($new_fn))

The problem is that what is or is not a mark varies by locale:

* Castilian doesn't think ~ is a mark; Portuguese does, and so if you strip marks, you in Castilian count as the same two letters that it deems distinct, but in Portuguese, you incur no lasting harm.

* Catalan doesn't think ¸ is a mark; French does, and so if you strip marks, you in Catalan count as the same two letters that it deems distinct, but in French or Portuguese, you incur no lasting harm.

* Modern English (usually) decomposes æ into a+e, but OE/AS and Icelandic do not.

* Moreover, Icelandic deems é and e to be completely different letters altogether. If you strip marks, you count as the same letters that that language does not. Similarly with ö, which is at the end of their alphabet (like ø in some), and nowhere near o or ó. BTW, those are three separate letters, not variants.

* And in OE/AS you could have a long mark on an asc (say "ash" for the atomic *letter* æ). If split into a and e and stripped of marks, it wouldn't make any sense at all.

Case in point: Ælene Frisch, whom many of you doubtless know, insists her name be spelt as I have written it.
She does not want Aelene Frisch, for she considers her forename to have 5 letters in it, not 6. But Unicode doesn't give us a title case version of that (did AS?), suggesting it is a ligature, not a digraph.

But if we have a file called ÆLENE, may we assume it the same in a case-insensitive sense to both aelene and ælene? I can only go on code-points, because I don't want to deal with ß and SS and Ss. Case-folding file systems are just begging for trouble, and I just don't know what to do. Think of the 3 Greek sigmata.

identity is a lot more important than collation, as well as a precondition for collation, and collation is a lot more difficult and can be put off.

I agree with everything save "and can be put off". I would like you to be right. I should truly wish to be mistaken. And I don't know what we have for prior (cough) art.

respect to dealing with a file system, generally
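The locale-blind comparison lc(strip_marks($old_fn)) eq lc(strip_marks($new_fn)) discussed above can be sketched as follows. The thread does not define strip_marks, so the version here (canonically decompose, then delete every combining mark) and the naive_key helper are assumptions made purely to show what such a comparison would conflate.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Assumed implementation: canonically decompose, then drop all marks.
sub strip_marks {
    my ($string) = @_;
    (my $stripped = NFD($string)) =~ s/\p{M}+//g;
    return $stripped;
}

# Hypothetical comparison key for a case-ignorant, mark-ignorant filesystem.
sub naive_key {
    my ($name) = @_;
    return lc strip_marks($name);
}

my @names = (
    "RESUME", "Resume", "resume", "Resum\x{E9}",
    "r\x{E9}sum\x{E9}", "r\x{E9}sume\x{301}", "Re\x{301}sume\x{301}",
);

my %seen;
$seen{ naive_key($_) }++ for @names;
printf "%d names collapse to %d key(s)\n", scalar @names, scalar keys %seen;

# The Castilian objection: n-tilde and n are distinct letters there,
# yet the naive key treats them as one.
printf "a\\x{F1}o and ano compare %s\n",
    naive_key("a\x{F1}o") eq naive_key("ano") ? "equal" : "different";
```

Every spelling of "resume" lands on one key, which may be what a Spanish or French user wants, but the same rule also merges words that a given language considers different, which is the identity problem in miniature.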
Re: Smooth numeric upgrades?
On Mon, 06 Oct 2008 at wee small hour of 02:20:22 EDT you, Michael G Schwern [EMAIL PROTECTED], wrote:

Darren Duncan wrote: [2] Num should have an optional limit on the number of decimal places it remembers, like NUMERIC in SQL, but that's a simple truncation.

I disagree. Any numeric operations that would return an irrational number in the general case, such as sqrt() and sin(), and the user desires the result to be truncated to an exact rational number rather than as a symbolic number, then those operators should have an extra argument that specifies rounding, eg to an exact multiple of 1/1000.

That seems like scattering a lot of redundant extra arguments around. The nice thing about doing it as part of the type is you just specify it once. But instead of truncating data in the type, maybe what I want is to leave the full accuracy inside and instead override string/numification to display only 2 decimal places.

This is currently something of an annoyance with Math::Complex. It needs a way of specifying epsilon. If you ask for both sqrt()s of 4, you get

    (2, -2+2.44929359829471e-16i)

in Cartesian, but in Polar:

    ( [2,0], [2,pi] )

Is the problem that it's working in Polar and the conversion to Cartesian is off by a wee bit? I would really like to get Cartesian answers of (2, -2), not that -2e-16i silliness.

If you ask for both roots of -4, you get

    Cartesian: ( 1.22464679914735e-16+2i, -3.67394039744206e-16-2i )
    Polar:     ( [2,pi/2], [2,-1pi/2] )

But I'd like a Cartesian return of (2i, -2i). And a Polar return of ([2,pi/2],[2,-pi/2]).
It's worse still with the 10 roots of 2**10:

The 10 roots of 1024 are:

    CRTSN:  1: 2                                       POLAR:  1: [2,0]
    CRTSN:  2: 1.61803398874989+1.17557050458495i      POLAR:  2: [2,pi/5]
    CRTSN:  3: 0.618033988749895+1.90211303259031i     POLAR:  3: [2,2pi/5]
    CRTSN:  4: -0.618033988749895+1.90211303259031i    POLAR:  4: [2,3pi/5]
    CRTSN:  5: -1.61803398874989+1.17557050458495i     POLAR:  5: [2,4pi/5]
    CRTSN:  6: -2+2.44929359829471e-16i                POLAR:  6: [2,pi]
    CRTSN:  7: -1.61803398874989-1.17557050458495i     POLAR:  7: [2,-4pi/5]
    CRTSN:  8: -0.618033988749895-1.90211303259031i    POLAR:  8: [2,-3pi/5]
    CRTSN:  9: 0.618033988749894-1.90211303259031i     POLAR:  9: [2,-2pi/5]
    CRTSN: 10: 1.61803398874989-1.17557050458495i      POLAR: 10: [2,-1pi/5]

The 10 roots of -1024 are:

    CRTSN:  1: 1.90211303259031+0.618033988749895i     POLAR:  1: [2,0.314159265358979]
    CRTSN:  2: 1.17557050458495+1.61803398874989i      POLAR:  2: [2,0.942477796076938]
    CRTSN:  3: 1.22464679914735e-16+2i                 POLAR:  3: [2,pi/2]
    CRTSN:  4: -1.17557050458495+1.61803398874989i     POLAR:  4: [2,2.19911485751286]
    CRTSN:  5: -1.90211303259031+0.618033988749895i    POLAR:  5: [2,2.82743338823081]
    CRTSN:  6: -1.90211303259031-0.618033988749895i    POLAR:  6: [2,-2.82743338823081]
    CRTSN:  7: -1.17557050458495-1.61803398874989i     POLAR:  7: [2,-2.19911485751286]
    CRTSN:  8: -3.67394039744206e-16-2i                POLAR:  8: [2,-1pi/2]
    CRTSN:  9: 1.17557050458495-1.6180339887499i       POLAR:  9: [2,-0.942477796076938]
    CRTSN: 10: 1.90211303259031-0.618033988749895i     POLAR: 10: [2,-0.31415926535898]

Note, a generic numeric rounding operator would also take the exact multiple of argument rather than a number of digits argument, except when that operator is simply rounding to an integer, in which case no such argument is applicable.

Note, for extra determinism and flexibility, any operation rounding/truncating to a rational would also take an optional argument specifying the rounding method, eg so users can choose between the likes of half-up, to-even, to-zero, etc.
Then Perl can easily copy any semantics a user desires, including when code is ported from other languages and wants to maintain exact semantics.

Yes, this is very important for currency operations.

Now, as I see it, if Num has any purpose apart from Rat, it would be like a whatever numeric type or effectively a union of the Int|Rat|that-symbolic-number-type|etc types, for people that just want to accept numbers from somewhere and don't care about the exact semantics. The actual underlying type used in any given situation would determine the exact semantics. So Int and Rat would be exact and unlimited precision, and maybe Symbolic or IRat or something would be the symbolic number type, also with exact precision components.

That sounds right. It's the whatever can conceivably be called a number type.

I think you might be surprised by what some people conceive of by numbers. :-(

--tom

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Math::Complex;

    my $STYLE = "NORMAL";
    # my $STYLE = "HACKED";

    unless (@ARGV) {
        die "usage: $0 number rootcount\n";
    }

    my ($number, $rootcount) = @ARGV;
    $number = cplx($number);

    die "$0: $number poor for rooting\n" if !$number;
    die "$0: $rootcount should be
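Until Math::Complex grows an epsilon of its own, the residue can be removed by hand after the fact. The following is a rough sketch only: the 1e-12 tolerance and the helper names chop_tiny and tidy are assumptions for illustration, not part of Math::Complex, and zeroing small components is a different thing from the exact or symbolic arithmetic discussed in the thread.

```perl
use strict;
use warnings;
use Math::Complex qw(cplx root Re Im);

my $EPSILON = 1e-12;    # assumed tolerance; choose to suit the data

# Replace a component that is within $EPSILON of zero by an exact zero.
sub chop_tiny {
    my ($x) = @_;
    return abs($x) < $EPSILON ? 0 : $x;
}

# Rebuild a complex number in Cartesian form with tiny parts removed.
sub tidy {
    my ($z) = @_;
    return cplx(chop_tiny(Re($z)), chop_tiny(Im($z)));
}

for my $n (4, -4) {
    my @raw  = root(cplx($n), 2);    # both square roots
    my @tidy = map { tidy($_) } @raw;
    print "roots of $n, raw:  @raw\n";
    print "roots of $n, tidy: @tidy\n";
}
```

This only hides noise in components that ought to be zero; values such as 1.6180339887499 in the tenth roots are untouched, so it is a display-level patch rather than a fix for the underlying conversion.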
Re: Smooth numeric upgrades?
In-Reply-To: Message from Nicholas Clark [EMAIL PROTECTED] of Sun, 05 Oct 2008 22:13:14 BST.

Studiously ignoring that request to nail down promotion and demotion, I'm going to jump straight to implementation, and ask:

If one has floating point in the mix [and however much one uses rationals, and has the parser store all decimal string constants as rationals, floating point enters the mix as soon as someone wants to use transcendental functions such as sin(), exp() or sqrt()], I can't see how any implementation that wants to preserve infinite precision for as long as possible is going to work, apart from storing every value as a thunk that holds the sequence of operations that were used to compute the value, and defer calculation for as long as possible. (And possibly as a sop to efficiency, cache the floating point outcome of evaluating the thunk if it gets called)

Nicholas Clark

My dear Nicholas,

You mentioned sin(), exp(), and sqrt() as being transcendental functions, but this is not true! Perhaps you meant something more in the way of their being--um, irrational, but far be it from me to risk using so loaded a word in reference to anyone's hypothetical intentions or rationale! :-)

While all transcendentals are indeed irrationals, the opposite relationship does *not* apply. It's all really rather simple, provided you look at it as a brief, binary decision-tree.

= All reals are also one of either rational or irrational:

    + Rational numbers are those expressible as the RATIO of I/J, where I is any integer and J any non-zero integer.

    - Irrationals are all other reals *EXCEPT* the rationals.

= All irrationals are also one of either algebraic or transcendental:

    + Algebraic numbers are solutions to polynomial equations of a single variable and integer coefficients. When you solve for x in the polynomial equation 3*x**2 - 15 == 0, you get an algebraic number.

    - Transcendentals are all other irrationals *EXCEPT* the algebraics.
Thinking of the sine function and its inverse, I notice that sin(pi/2) == 1 and asin(1) is pi/2. Pi is *the* most famous of transcendental numbers, and sin() is a transcendental function.

Thinking of the exponential function and its inverse, I notice that exp(1) == e and log(e) == 1. And e, Euler's number, is likely the #2 most famous transcendental, and exp() is a transcendental function.

However, we come now to a problem. If you solved the simple equation I presented above as one whose solution was by definition *not* a transcendental but rather an algebraic number, you may have noticed that solution is 5**(1/2), better known as sqrt(5). So that makes sqrt(5) an algebraic number, and sqrt() is an algebraic function, which means therefore that it is *not* a transcendental one. Q.E.D. :-)

Ok, I was teasing a little. But I'd now like to politely and sincerely inquire into your assertion that floating point need inevitably enter the picture just to determine sin(x), exp(x), or sqrt(x).

Your last one, sqrt(), isn't hard at all. Though I no longer recall the algorithm, there exists one for solving square roots by hand that is only a little more complicated than that of solving long division by hand. Like division, it is an iterative process, somewhat tedious but quite well-defined and easily implemented even on pen and paper. Perhaps that has something to do with sqrt() being an algebraic function. :-) j/k

As for the two transcendental functions, this does ask for more work. But it's not as though we do not understand them, nor how to derive them at need from first principles! They aren't magic black-box functions with secret look-up tables that when poked with a given input, return some arbitrary answer. We *know* how to *do* these!

Sure, many and probably most solutions, at least for the transcendentals, do involve power series, and usually Taylor Series.
But this only means that you get to joyfully sum up an infinite sequence of figures receding into infinity (And beyond! quoth Buzz), but where each figure in said series tends to be a reasonably simple and straightforward computation. For example, each term in the Taylor Series for exp(x) is simply x**N / N!, and the final answer the sum of all such terms for N going from 0 to infinity. Its series is therefore

      x**0 / 0!      # er: that's just 1, of course :-)
    + x**1 / 1!
    + x**2 / 2!
    + x**3 / 3!
    + x**4 / 4!
    + + + + + + + + + + + ad infinitum.

For sin(x), it's a bit harder, but not much: the series is a convergent one of alternating sign, running N from 0 to infinity and producing a series that looks like this:

      (x**1 / 1!)    # er: that's just x, of course :-)
    - (x**3 / 3!)
    + (x**5 / 5!)
    - (x**7 / 7!)
    + (x**9 / 9!)
    - + - + - + - + - + ad infinitum.

Each term in the sin(x) series is still a comparatively easy one, reading much better on paper than on the computer with
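[A small illustration of the exp(x) series above: the sketch below sums a finite number of terms x**N / N! using exact rationals, so that floating point only appears when explicitly requested. It assumes the core Math::BigRat module; the sub name exp_partial_sum and the choice of 10 terms are made up for the example and are not from the original message.]

```perl
use strict;
use warnings;
use Math::BigRat;

# Sum the first $terms terms of the series x**N / N!, keeping every
# intermediate value as an exact rational.
# (exp_partial_sum is a hypothetical helper name.)
sub exp_partial_sum {
    my ($x, $terms) = @_;
    my $sum  = Math::BigRat->new(0);
    my $term = Math::BigRat->new(1);        # x**0 / 0!
    for my $n (0 .. $terms - 1) {
        $sum  += $term;
        $term  = $term * $x / ($n + 1);     # next term: x**(n+1) / (n+1)!
    }
    return $sum;
}

my $approx = exp_partial_sum(Math::BigRat->new(1), 10);
print "exact partial sum: $approx\n";
printf "as a float:        %.10f\n", $approx->numify;
```

Each extra term only tightens the approximation; no finite number of terms gives e itself, which is the practical catch with this approach.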
Re: Smooth numeric upgrades?
In-Reply-To: Message from Michael G Schwern [EMAIL PROTECTED] of Sat, 04 Oct 2008 02:06:18 EDT. [EMAIL PROTECTED] Larry Wall wrote: The status of numeric upgrades in Perl 6 is fine. It's rakudo that doesn't do so well. :) As another datapoint: $ pugs -e 'say 2**40' 1099511627776 $ pugs -e 'say 2**50' 1125899906842624 $ pugs -e 'say 2**1100' 1358298529049385849277351428359266778603493846931744549748519669727813SNIP That's good [1] to hear, thanks. I don't think of Int as a type that automatically upgrades. I think of it as an arbitrarily large integer that the implementation can in some cases choose to optimize to a smaller or faster representation, Oh don't worry, I do. I just got so flustered when I saw Rakudo do the same thing that Perl 5 does I was worried this got lost somewhere along the line. [1] We need a polite way to say less bad. ! ah fab yay good cool ayup d'oh! helps tasty yummy smooth cheers better yippee! soothes pleases niftier relieves mediates inspires mollifies mitigates clarifies my mistake oh, right! great! delightful not to worry oh be joyful! 
'tain't so bad calms my qualms less sub-optimal cheers my spirit soothes my nerves dispells my doubts heartens my resolve cushions the cudgel dismisses my dismay drives out the dread comforts me to learn inspires me with hope restores my confidence gladdens my good humor brightens my rainy day alleviates my concerns mollifies my misgivings alleviates my confusion puts down the false alarm perks/plucks up my courage pacifies my preoccupations banishes my paranoia-demons felicitates my facilitation facilitates my felicitation assuages my misapprehensions shows I was worrying too much encourages me; is encouraging warms the cockles of my heart offers hope for a better world trounces my tetchy trepidations dispells my misplaced anxieties sure puts a spiffier shine on it makes molehills out of mountains eases up on my nerves a fair bit not nearly so gnarly as I'd feared 'tis not too late to seek a newer world way better than I'd half-begun to suspect patches the potholes in my crumbling wetware softens the imagined blow that wasn't even there to start with serenades such sweet sonnets as to nullify nervous nellies' natterings
Re: Allowing '-' in identifiers: what's the motivation?
I'm still somewhat ambivalent about this, myself. My previous experience with hyphens in identifiers is chiefly in languages that don't generally have algebraic expressions, e.g. LISP, XML, so it will take some getting used to in Perl. But at least in Perl's case the subtraction conflict is mitigated by the fact that many subtraction expressions will involve sigils; $x-$y can't possibly be a single identifier.

People use nonadic functions (nonary operators? where non = 0, not 9) without parens, and get themselves into trouble for it.

    % perl -E 'say time-time'
    0
    % perl -E 'say time-do{sleep 3; time}'
    -3
    % perl -E 'say time +5'
    1218475824
    % perl -E 'say time -5'
    1218475817
    % perl -E 'say time(-5)'
    syntax error at -e line 1, near "(-"
    Execution of -e aborted due to compilation errors.
    Exit 19

--tom
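[The same parsing behaviour can be reproduced deterministically with user-defined subs. This is only a sketch: the sub names nullary and listop are invented for illustration. An empty prototype makes a sub parse the way time does, while an unprototyped sub is a list operator that swallows what follows.]

```perl
use strict;
use warnings;

# Hypothetical subs, named only for this example.
sub nullary() { 100 }          # empty prototype: takes no arguments, like time
sub listop    { scalar @_ }    # no prototype: an ordinary list operator

my $minus = nullary -5;        # parsed as nullary() - 5
my $plus  = nullary +5;        # parsed as nullary() + 5
my $count = listop -5;         # parsed as listop(-5): one argument is passed

print "nullary -5 gives $minus\n";
print "nullary +5 gives $plus\n";
print "listop -5 received $count argument(s)\n";

# Explicit parens say what is meant regardless of how the sub was declared.
my $explicit = nullary() - 5;
print "nullary() - 5 gives $explicit\n";
```

The reader of the calling code cannot tell the two cases apart without looking up the declaration, which is the trouble described above.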
Re: Exegesis 7: Fill Justification
On Tue, Mar 02, 2004 at 10:01:11AM +1100, Damian Conway wrote: : That's a *very* interesting idea. What do people think? I think anyone who does full justification without proportional spacing and hyphenation is severely lacking in empathy for the reader. Ragged right is much easier on the eyes--speaking as someone who had their seventh eye operation today. At least aesthetically, yes, it sure does look better ragged. I do wonder why that is, though. Could it be that the unevenness of the inserted fixed-width spacing looks rough? Or is maybe because with long lines, one's eye might get lost, being slower to tell one line from the next? That's certainly a reason for have shorter columns. In a message of mine to p5p of 4-Nov-2003 [EMAIL PROTECTED], I showed (but did not mention) how this sort of can be done without inserting any spurious spaces whatsoever, even in a long paragraph: Well, no. Mark answered so quickly after I did, and covered so much of it so succinctly, that I backed off again. It seems to me that he and I have both for a long time yearned for a perliotut; I don't believe either of us has ever fleshed out more than an outline, though. IO is a subject that's not always easy to figure out how to get the best handle on (ENOPUN). For one thing, it's steeped in Unix lore and tradition, and it requires either knowing or else teaching quite a bit of C programming that would otherwise be completely irrelevant to Perl. For example, when you see someone lseek zero bytes from the current position in Perl, you know they're remembering the ANSI C requirement of a seek falling between switching from reading to writing or vice versa. As always, you're subject to all the silly bugs in your libc runtime system and in your kernel; for example, we tried to have all buffers flushed before a fork() to avoid duplicate output in the child by calling fflush(0) from C, the intent being to flush data still there in stdio buffers. 
Unfortunately, on some platforms, you'll accidentally toss not just pending output, but also pending input. Thus the case where read on STDIN was called with 2 against asdf\n, you'd still have the df\n yet to read get completely trounced. This is incorrect behaviour, at least as far as the goal of flushing pending output buffers before forking. Sadly, there really are a zillion little things like this, and these are just the exceptions, not the core functionality that you'd like to teach people for learning IO. Blocking and buffering are tricky; did you remember that the output commands can also block? Think about sending something down a pipe where the reader on the other end is slow or busy. That's why with select you also have a slot for output handles you want to know whether are ready for IO. It just goes on and on. It would be easier to hand out copies of Stevens than to write perliotut, but that's too embarrassing and annoying. However, I fear this isn't really readily automated; sorry to interrupt. :-) --tom PS: Ok, maybe one *could* do it, but that would still require a whole lot of PhD-ish NLP work, and surely Damian's too engaged now for the diversion.
Re: == vs. eq
When you write: (1..Inf) equal (0..Inf) I'd like Perl to consider that false rather than having a blank look on its face for a long time. The price of that consideration would be to give the Mathematicians blank looks on *their* faces for a very long time instead. Certainly, they'll be quick to tell you there are just as many whole numbers as naturals. So they won't know what you mean by equal up there. The Engineers might also be perplexed if you make that false, but for rather different reasons. I think that you will have to define your equal operator in a way contrary to their respective expectations, because both have ways of thinking that could quite reasonably lead them to expect the answer to the expression above to come out to be false, not true. Practically speaking, I'm not sure how--even whether--you *could* define it. One is tempted to attempt something like saying that operator equal is true of a *lazy*, *infinite* list if and only if elements 0..N of both lists were each themselves equal, and then only blessedly finite values of N. But where then do you stop? If N is 1, then (1..Inf) and (1..Inf:2) are ok. If N is 2, meaning to check both the first pair from each list and also the second one, they aren't. However, if N is true, (1..Inf) and (1, 2..Inf:2) are certainly ok. In fact, that definition seems trivial to break. Given a known N steps of comparison, the lazy lists (1..Inf) and (1..N-1, (N..Inf:2)) would both test equal in the first N positions and differ in position N+1. Therefore, we can always break any operator that tests the first N positions of both lazy lists, and thus that definition would be wrong. The reason Mr Engineer might expect false would be if they thought you were eventually testing against Inf. Due to his experience in numerical programming, he sees NaN and Inf having certain behaviors that no pure mathematician would even countenance. 
On a system whose system nummifier knows things like Inf and NaN already, you see this happening even today. Witness real Perl:

    % perl -e 'printf "%g\n", NaN'
    nan
    % perl -e 'printf "%g\n", 1 + NaN'
    nan
    % perl -e 'printf "%g\n", 42 * NaN'
    nan
    % perl -e 'printf "%g\n", Nan == NaN'
    0
    % perl -e 'printf "%g\n", 1+Nan == NaN'
    0
    % perl -le 'printf "%g\n", Inf'
    inf
    % perl -le 'printf "%g\n", 1+Inf'
    inf
    % perl -le 'printf "%g\n", 2+Inf'
    inf
    % perl -e 'printf "%g\n", Inf == Inf'
    1
    % perl -e 'printf "%g\n", Inf == -Inf'
    0
    % perl -e 'printf "%g\n", 1+Inf == Inf'
    1
    % perl -e 'printf "%g\n", Inf + Inf'
    inf
    % perl -e 'printf "%g\n", Inf * Inf'
    inf
    % perl -e 'printf "%g\n", Inf / Inf'
    nan

=begin ASIDE

Yes, it's platform dependent what you'll get:

    mach1% perl -le 'printf "On $^O, NaN == NaN is %g\n", Nan == NaN'
    On openbsd, NaN == NaN is 1
    mach2% perl -le 'printf "On $^O, NaN == NaN is %g\n", Nan == NaN'
    On linux, NaN == NaN is 0

I believe that's because the libc on openbsd doesn't nummify string NaN to any special IEEE float, whereas the redhate one did. I am truly hoping that on Perl6, comparing apples with greed will mean that you're testing NaN with itself, that testing NaN and *anything* including another Nan with == will get you into trouble no matter what your platform, and that that trouble will be the same irrespective of platform.

=end ASIDE

In other words, if you treat Inf as any particular number (which Mr Mathematician stridently yet somewhat ineffectually reminds you that you are *not* allowed to do!), then you may get peculiar results. Heck, you could probably then get Mr Engineer to agree that the lazy lists (1..Inf) and (0..Inf) are the same in the *last* N positions for all values of N, and since you could just select N to be equal (ahem) to the length (ahem**Inf) of your list, they must be equal. :-)

Mr Mathematician, purist that he is, has of course long ago thrown up his hands in disgust, contempt, or both, and stormed out of the room.
To him, most of those Perl examples given above were utter nonsense: how can you say 1+Inf? It bothers him to talk about Inf+1, and 1..Inf will be problematic, too, since to say 1..Inf is also to say there must exist some number whose successor is Inf. And of course, there isn't. Which is why Inf is not a valid operand for numerical questions in Mr Mathematician's platonically purist world of ideas. But practical Mr Engineer has defined his own Inf in which you can do limited otherwise apparently numerical operations, because it was *practical* for him to do so. He had work to do, and needed some new rules. While Mr Mathematician won't put up with comparing numbers and infinities, he's quite comfortable with comparing *infinities* themselves. He's comfortable with infinite sets, and he's comfortable with infinite series, too, which is what these lists seem to be. I'm not sure that his experience with infinite series will help us much here, because you see, those
Re: == vs. eq
You can define it very easily: two lists are equal if the ith element of one list is equal to the ith element of the other list, for all valid indices i.

The problem is that you've slipped subtly from a well-known creature, like 1..10, a finite set of ten distinct integers, to quite a different sort of beast entirely, 1..Inf, which while notationally similar to the first, does not share some very fundamental properties. For example, it no longer has an integral membership count, that is, a length. This is problematic if one is not quite careful.

As for whether you can *evaluate* this test in bounded time, that depends. Computers are incapable of storing truly infinite lists, so the lists will have finite internal representations which you can compare.

Is it possible that finite internal representations will differ in internal representation yet produce identical series? It seems to me that some meta-analysis would be required if this is possible. If it is not possible, then that means that every distinct series has a distinct internal representation. Certainly this is not true lexically: I can use many variant lexical representations to produce the same infinite series. For example:

    (1 .. X, X+1 .. Inf)
    (1 .. Y, Y+1 .. Inf)

Those define identical lists, for any natural numbers X and Y, even as compile-time constants. However, save for the special case of X==Y, I do not expect their internal representation to be the same.

As for two dynamically generated infinite lists (which you can't easily compare, for example if they're based on external input)... it will either return false in finite time, or spend infinite time on determining they're indeed equal.

I suppose you could classify

    (1 .. X, X+1 .. Inf)
    (1 .. Y, Y+1 .. Inf)

as dynamically generated infinite lists, but again, given constants X and Y, they really needn't be. Even a run-time thing like the list (1 .. $X) shouldn't actually need to spend infinite time on determining its equivalence to (1 .. $Y).
However, there are more interesting possibilities: generic iterator functions that you repeatedly call and which produce successors that aren't generally recognizable. Remember the old flipflop, as in

    if ( 1 .. /^$/ ) { }
    if ( /foo/ .. /bar/ ) { }
    if ( f() .. g() ) { }

You could, I think, have an infinite list that was really some fancy interface to a dynamic iterator of some sort. I know it's interesting, but whether this would be sufficiently useful to justify its complexity is rather less obvious. But if you did have such a list, where stepping down it implicitly called some sort of ->ITER method or whatnot, then for those I could see the intractability of finite evaluation, since it's perfectly conceivable that it wouldn't terminate. Another pitfall is non-reproducibility; think about readline() as an iterator on a stream object whose underlying file descriptor is not seekable. But I'm not sure that any of the sorts of lists we've been talking about have to have that problem.

But I don't know whether we can be clever enough to step around infinite evaluation through some sort of higher-level analysis. A clever compiler could move things around. Maybe it could change

    for ($i = 1; $i <= 10_000; $i++)

into

    for $i ( 1 .. 10_000 )

and then perhaps take advantage of that construct's lazy evaluation. Given general purpose lazy evaluation, you could start doing things like thinking of

    for ($i = 1; $i <= fn(); $i++)
    for $i ( 1 .. fn() )

and making instead a list or array whose members are

    ( 1 .. fn() )

However, do you evaluate fn() only once or repeatedly? Hm. If it were repeatedly, then I do see what you mean by dynamically generated infinite lists.

In other words, if you treat Inf as any particular number (which Mr Mathematician stridently yet somewhat ineffectually reminds you that you are *not* allowed to do!), then you may get peculiar results.

There is no problem with doing that, as long as you define what you want it to do.
Well, sure, you could let Inf = MAXINT + 1 for example, and then define things as you want them to act, but that doesn't mean that this resulting Inf is either what people think of as a Number nor what they think of as Infinity. See below on IEEE, which found it very useful to do something of the like.

Remember, most of mathematics is just an invention of humans :)

I believe we are indeed trying to define what we want it to do, no? So sure, you can create a new infinite set by conjoining some new elements to an existing one. That's what all the numeric sets are, pretty much. Do be careful that the result has consistent properties, though.

(crap about testing first/last N elements) testing the first/last N elements is not the same as testing the whole list for all N :)

Mr Mathematician, purist that he is, has of course long ago thrown up his hands in disgust, contempt, or both, and stormed out of the room
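[The point about testing only the first N elements can be made concrete. The sketch below models lazy lists as closures and compares a fixed-length prefix. All of the names (count_from, prefix_then, first_n_equal) are invented for the example, and stepped ranges such as (N..Inf:2) are imitated by an explicit step argument rather than by any real Perl syntax.]

```perl
use strict;
use warnings;

# A lazy list is modelled as a closure that returns the next element.
sub count_from {
    my ($start, $step) = @_;
    my $next = $start;
    return sub { my $value = $next; $next += $step; return $value };
}

# Yield the elements of a finite prefix first, then defer to $rest.
sub prefix_then {
    my ($prefix, $rest) = @_;
    my @head = @$prefix;
    return sub { return @head ? shift @head : $rest->() };
}

# True if the two lazy lists agree in their first $n positions.
sub first_n_equal {
    my ($left, $right, $n) = @_;
    for (1 .. $n) {
        return 0 unless $left->() == $right->();
    }
    return 1;
}

my $n = 5;

# (1..Inf) against (1..N-1, N..Inf:2): identical for N positions...
my $agree_n = first_n_equal(
    count_from(1, 1), prefix_then([1 .. $n - 1], count_from($n, 2)), $n);

# ...but different at position N+1.
my $agree_n_plus_1 = first_n_equal(
    count_from(1, 1), prefix_then([1 .. $n - 1], count_from($n, 2)), $n + 1);

print "first $n positions agree: $agree_n\n";
print "first ", $n + 1, " positions agree: $agree_n_plus_1\n";
```

Whatever N the comparison settles on, a pair of lists can be built that passes at N and fails at N+1, so a bounded prefix test can only ever prove inequality.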
Re: == vs. eq
Unless I'm very wrong, there are more whole numbers than natural numbers. An induction should prove that there are twice as many.

We're probably having a language and/or terminology collision. By natural numbers, I mean the positive integers. By whole numbers, I mean the natural numbers plus the number zero. Since both sets have infinite members, each has just as many members as the other has. It just *looks* like the whole numbers have one more. But they don't, you know, because Inf+1 == Inf, as IEEE shows us in their seminal treatise on How to Lie With Computers under IEEE Floating Point.

It's not really relevant to figuring out how to evaluate equality testing on unbounded lists in Perl, but I think that your inductive proof would lead you to conclude the opposite of what you're thinking. You can pick a first member of both sets. Then you can pick a second member of both sets. Then a third, then a fourth, and so on and so forth for all cardinal numbers. Even though your list of pairings one from each set itself stretches to infinity (not that that means it actually stops somewhere, of course, as though infinity were a place; I mean it just stretches ever upwards without bound), then I think induction will convince you that in the resulting pair-list, there are no missed members from either set.

So we are comfortable saying that there are just as many of one as the other; well, *I* am comfortable saying that, at least, and I hope you are, too. :-) It's initially a bit disturbing, though, when you realize that this necessarily leads to saying there are just as many multiples of two as there are of, oh, eight. Maybe that's why Cantor died mad. :-)

--tom
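[A quick way to watch the IEEE behaviour without depending on how a given libc nummifies the strings "Inf" and "NaN" is to compute the values arithmetically. This is a sketch that assumes a platform with IEEE 754 doubles; the variable names are arbitrary.]

```perl
use strict;
use warnings;

# Assumes IEEE 754 doubles: the overflow below yields +Inf, and
# Inf - Inf yields NaN. Other floating-point formats may differ.
my $inf = 9**9**9;
my $nan = $inf - $inf;

my %check = (
    'Inf + 1 == Inf' => ($inf + 1 == $inf) ? 1 : 0,
    'Inf == -Inf'    => ($inf == -$inf)    ? 1 : 0,
    'NaN == NaN'     => ($nan == $nan)     ? 1 : 0,
    'NaN != NaN'     => ($nan != $nan)     ? 1 : 0,
);

printf "%-16s is %d\n", $_, $check{$_} for sort keys %check;
```

The engineer's Inf absorbs the +1, which is exactly the sleight of hand being teased above; it says nothing about how many members a set has.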
Re: == vs. eq
The IEEE-float-style infinities are quite sufficient for most purposes.

One thing I agree is that writing 1..Inf is a *bit* sloppy since the range operator n..m normally produces the numbers i for which n <= i <= m while n..Inf gives n <= i < Inf, but I can live with it.

I could sure save myself a lot of typing by reading ahead to message N+1 before answering message N. :-)

--tom

PS: For all N. :-):-)
Re: Barewords and subscripts
Maybe there will be a Perl 6 rule forcing the keys to be quoted, but it won't be because of the no barewords rule.

If there were such a rule, I presume you'd also apply it to the LHS of =>?

There is another way to resolve the ambiguity of foo meaning either foo or foo() depending on current subroutine visibility. This would also extend then to the issue of $hash{foo} meaning either $hash{foo()} or $hash{foo}. Just use parens.

Oh, I know, I know. I can already hear the mass reaction now: Oh, horrors! cry the masses from every timezone. But let's think about it anyway.

Perl's historical optionality of explicit parentheses to delimit a function's argument list is, like its similar optionality of explicit quotation marks, a source of ambiguity. And while ambiguity can be a source of flexibility, expressibility, and convenience, it can also have a darker side that would be better relegated to obfuscated programming contests than to production-calibre code.

In my experience, many programmers would prefer that all functions (perhaps restricted to only those of no arguments to appease hysterical cetaceans?) mandatorily take (even empty) parens. Thus, shift() in lieu of shift, no matter whether it's as a hash subscript or the left-hand operand of the comma arrow, or whether it's floating around free, outside of any such autoquoting construct.

Since this matter has now been mentioned, I would like to suggest that there lurk other related and perhaps even more important ramifications to the current optionality of parentheses than the one concerning strings. Witness:

    % perl -MO=Deparse,-p -e 'push @foo, reverse +1, -2'
    push(@foo, reverse(1, -2));
    % perl -MO=Deparse,-p -e 'push @foo, rand +1, -2'
    push(@foo, rand(1), -2);
    % perl -MO=Deparse,-p -e 'push @foo, time +1, -2'
    push(@foo, (time + 1), -2);
    [ Gr. That should read time(). ]
    % perl -MO=Deparse,-p -e 'push @foo, fred +1, -2'
    push(@foo, ('fred' + 1), -2);

Do you see what I'm talking about?
The reader unfamiliar with the particular context coercion templates of the functions used in code like

    use SpangleFrob;
    frob @foo, spangle +1, -2;

can have no earthly idea how that will even *parse*. This situation seems at best, unfortunate.

I'm sure that if it were somehow possible to require proper placement of all those parens, even with something like the hypothetical and wholly optional use strict 'parens', that this would raise the hackles of many a current Perl programmer. But perhaps this owes more to the fact that those folks do not have to explain or justify this particular--well, let's be charitable and merely call it an issue--to those whom it befuddles or annoys than it owes to any legitimate convenience or desirable functionality.

When you get to see non-wizards repeatedly stumble on these ambiguities on a regular basis, this whole situation can quickly become a source of frustration, embarrassment, or both. Whether this scenario inspires apologetics or apoplectics is not consistently predictable.

However, if one were simply *able* to write something like

    use SpangleFrob;
    use strict 'parens';    # subsumed within a blanket use strict
    frob(@foo, spangle(1, -2));
    frob(@foo, spangle(1), -2);
    frob(@foo, spangle() + 1, -2);

then, without even inflicting grievous harm on compile-time checking of arguments, one could at least always readily discern which arguments went where--which hardly seems an undesirable goal, now does it?

Nevertheless, even that wouldn't help in being able to know whether that's really meaning

    frob(@foo,
    frob( \@foo,
    frob( scalar @foo,

It all would depend upon the existence of coercion templates such as frob(@...), frob(\@...), and frob($...). Sadly, there's no B::Deparse switch to tell you under which scenario you're operating, but that's all probably best left for a semi-separate discussion (if at all).
The devil's advocate might suggest that not knowing which of the three treatments of @foo silently occurred in the frobbing function call--which they with some credibility assert a desirable goal--goes hand in glove with not knowing whether spangle is here acting as a list-op, as a un(ary)-op, or as a non(e)-op. But considering that such devils need no help in their advocacy, I shan't bother to do so myself. :-) --tom
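[The three readings of spangle +1, -2 can be observed directly by declaring three subs that differ only in prototype. This is a sketch: SpangleFrob is hypothetical in the message, and the three spangle_* subs and this frob are stand-ins invented here so the result is visible.]

```perl
use strict;
use warnings;

# frob simply reports the arguments it received, comma-separated.
sub frob { return join ',', @_ }

# Same call syntax, three different parses, chosen by prototype alone.
sub spangle_list     { return 'L(' . join(' ', @_) . ')' }   # list-op
sub spangle_unary($) { return "U($_[0])" }                   # un(ary)-op
sub spangle_none()   { return 10 }                           # non(e)-op

my $as_list  = frob('foo', spangle_list  +1, -2);   # spangle_list(1, -2)
my $as_unary = frob('foo', spangle_unary +1, -2);   # spangle_unary(1), -2
my $as_none  = frob('foo', spangle_none  +1, -2);   # spangle_none() + 1, -2

print "$_\n" for $as_list, $as_unary, $as_none;
```

Writing the parens out, as in spangle_unary(1), -2, removes the need to look up the prototype before reading the call.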
Re: Perl 5's non-greedy matching can be TOO greedy!
More generally, it seems to me that you're hung up on the description of "*?" as "shortest possible match". That's an ambiguous

Yup, that's a bit confusing. It's really "start matching as soon as possible, and stop matching as soon as possible". (The usual greedy one is, of course, "keep matching as long as possible".) The initial invariant part, "start as soon as possible", is the de facto and de jure (at least POSIX 1003.2, but probably also Single Unix) definition, and therefore rather non-negotiable.

It's like people who write /^.*fred/ instead of /.*fred/. They are forgetting something critical: where the Engine starts the search.

--tom
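[A small demonstration of "start as soon as possible, stop as soon as possible", using made-up subject strings. The non-greedy match below is not the shortest substring that could match; it is the one that begins at the leftmost possible position.]

```perl
use strict;
use warnings;

my $subject = 'aaabab';

# Non-greedy: begins at the first 'a', then stops at the first 'b' it can.
my ($minimal) = $subject =~ /(a.*?b)/;

# Greedy: begins at the same place, but keeps going as long as possible.
my ($greedy) = $subject =~ /(a.*b)/;

# The leading ^ adds nothing here: the engine tries position 0 first
# in both cases, and .* can already reach the last "fred" from there.
my ($anchored)   = 'xfred yfred' =~ /^(.*fred)/;
my ($unanchored) = 'xfred yfred' =~ /(.*fred)/;

print "non-greedy: $minimal\n";
print "greedy:     $greedy\n";
print "same match with or without ^\n" if $anchored eq $unanchored;
```

A truly "shortest possible" match in that subject would be "ab", which the engine never reports because an earlier starting position succeeds first.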
Re: Perl 5's non-greedy matching can be TOO greedy!
Have you thought it through NOW, on a purely semantic level (in isolation from implementation issues and historical precedent), I've said it before, and I'll say it again: you keep using the word "semantic", but I do not think you know what that word means. --tom
Re: RFC 357 (v1) Perl should use XML for documentation instead of POD
POD, presumably. Or maybe son-of-POD; it would be nice to have better support for tables and lists.

We did this for the camel. Which, I remind the world, was written in pod.

--tom
Re: RFC 357 (v1) Perl should use XML for documentation instead of POD
No-one ever did suggest adding « and » to the list of matched delimiters that q() etc support, did they? :-) I did. Does Unicode define bracket pairings for character sets? ducks $ grep ^Prop /usr/local/lib/perl5/5.6.0/unicode/Props.txt does not seem very helpful, but this may not be much of a proof. --tom
Re: RFC 357 (v1) Perl should use XML for documentation instead of POD
- Done right, it could be easier to write and maintain

Strongly disagree.

- Why make people learn pod, when everyone's learning XML?

Because it is simple. It is supposed to be simple. It is not supposed to do what you want to do. In fact, it is supposed to NOT DO what you want to do.

- Pod can be translated into XML and vice versa

Then do that.

- Standard elements could be defined and utilized with the same or greater ease than pod for build and configuration.

    <pod>
    <Name>Module::Name</Name>
    <Version>0.01</Version>
    <Synopsis>short description</Synopsis>
    <Description>
    <name=head1>long description</name>
    <section>
    <name=head2>heading</name>
    <list type="ordered" symbol="1">
    <item>foo</item>
    </list>
    Type in some text here...
    </section>
    </Description>
    <Author>Eliott P. Squibb</Author>
    <Maintainer>Joe Blogg</Author>
    <Bugs>none</Bugs>
    <Copyright>Distributed under same terms as Perl</Copyright>
    <section>
    <name>define your own section</name>
    blab here
    </section>
    </pod>

That is an excellent description of why THIS IS COMPLETE MADNESS.

--TOM
Re: RFC 325 (v1) POD and comments handling in perl
It really is not feasible to relax the pod requirement that pod directives begin with an equals to allow them to begin with a pound sign as well, for to do so would expose an untold number of programs to unpredictable effects. I also don't really see any advantage. And yes, I'm sure I'm days behind. I have no choice.

Visit our website at http://www.ubswarburg.com This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission. If verification is required please request a hard-copy version. This message is provided for informational purposes and should not be construed as a solicitation or offer to buy or sell any securities or related financial instruments.
Term::ReadKey inclusion
It is unreasonably complicated to do single-character input in a portable fashion. We should therefore include the Term::ReadKey module in the standard distribution.
Re: RFC 308 (v1) Ban Perl hooks into regexes
I consider recursive regexps very useful:

    $a = qr{ (?> [^()]+ ) | \( (??{ $a }) \) };

Yes, they're "useful", but darned tricky sometimes, and in ways other than simple regex-related stuff. For example, consider what happens if you do

    my $regex = qr{ (?> [^()]+ ) | \( (??{ $regex }) \) };

That doesn't work due to differing scopings on either side of the assignment. And clearly a non-regex approach could be more legible for recursive parsing.

--tom
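[One common way around the scoping trap, offered as a sketch rather than as anything the thread prescribes: declare the lexical first and assign to it in a separate statement, so that the name inside the (??{ }) block refers to the same variable. The /x flag and the test strings are additions made here for the example.]

```perl
use strict;
use warnings;

# Declaring first puts $regex in scope for the code block below.
# With "my $regex = qr{...}" the inner $regex would not yet be declared.
my $regex;
$regex = qr{ (?> [^()]+ ) | \( (??{ $regex }) \) }x;

# Try a few strings against the pattern, anchored at both ends.
my %matched;
for my $candidate ('abc', '(abc)', '((abc))', '((abc)', 'abc)') {
    $matched{$candidate} = $candidate =~ /\A$regex\z/ ? 1 : 0;
    print "$candidate: ", ($matched{$candidate} ? 'ok' : 'no match'), "\n";
}
```

Note that the variable now holds a pattern that refers to itself, which is a reference cycle; in a long-running program that may be worth breaking by hand.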
Re: RFC 143 (v2) Case ignoring eq and cmp operators
This RFC still has silly language that discounts what has been said before. 1) It calls uc($a) eq uc($b) "ugly", despite their being completely intuitive and legible to even the uninitiated. 2) It then proposes "eq/i" without the least blush, despite how incredibly ugly and non-intuitive and, if I may, syntactically perverse such a notion is.

--tom
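[For reference, the existing idiom in full, as a minimal sketch with made-up data: fold both operands with uc() and then use the ordinary eq and cmp operators.]

```perl
use strict;
use warnings;

my ($left, $right) = ('Camel', 'CAMEL');

# Case-ignoring equality: fold both sides, then compare as usual.
my $same = uc($left) eq uc($right) ? 1 : 0;

# Case-ignoring ordering works the same way with cmp.
my @sorted = sort { uc($a) cmp uc($b) } qw(pear Apple banana);

print "same ignoring case: $same\n";
print "sorted ignoring case: @sorted\n";
```

No new operator or modifier is needed; the folding is visible at the point of comparison.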
my and local
As we sneak under the wire here, I'm hoping someone has posted an RFC that alters the meaning of my/local. It's very hard to explain as is. my is fine, but local should be changed to something like "temporary" (yes, that is supposed to be annoying to type) or "dynamic".
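[Why "temporary" or "dynamic" describes local better than its current name: local gives a package variable a temporary value that called code can see, whereas my creates a new lexical that called code cannot see. A minimal sketch, with all names invented for the example:]

```perl
use strict;
use warnings;

our $mode = 'global';

# Always reads the package variable $mode.
sub current_mode { return $mode }

sub with_local {
    local $mode = 'dynamic';     # temporary value, visible to callees
    return current_mode();
}

sub with_my {
    my $mode = 'lexical';        # brand-new variable, invisible to callees
    return current_mode();
}

my @seen = (with_local(), with_my(), current_mode());
print "with_local sees: $seen[0]\n";
print "with_my sees:    $seen[1]\n";
print "afterwards:      $seen[2]\n";
```

The old value comes back automatically when with_local returns, which is the "temporary" part.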
Re: RFC 277 (v1) Eliminate unquoted barewords from Perl entirely
Try thinking of it this way: it's only a bareword if it would make use strict whinge at you. Thus, the constructs you cited are all non-uses of barewords, such as in use Foo or require Foo or Foo => 1, or even $x{Foo}. And I have proposed (nonRFC) that Foo->bar() also be not a bareword. Yes, I know strict doesn't carp about it, but that could be Foo().
Murdering @ISA considered cruel and unusual
I strongly agree with the opinion that we should try and get away from special variables and switches in favor of functions and pragmas. Witness 'use base' instead of '@ISA', 'use warnings', and so on. Huh? Why??? Perl's use of @ISA is beautiful. It's an example of code reuse, because we don't need no stinking syntax! use base is, or can be, pretty silly -- think pseudohashes, just for one. The general sentiment you espouse obviously has a line beyond which you don't intend to cross. The question is where that line lies. --tom, who knows that it's hard to read his mail, but it's even harder to write it
Re: RFC 277 (v1) Eliminate unquoted barewords from Perl entirely
So what's left? print STDERR "Foo"; We have a proposal to turn STDERR into $STDERR, and it looks likely it'll go through.

It is? I certainly hope not. It makes as much sense to do that as to force a dollar sign on subroutines. sub $foo { ... } or sub 'foo' { ... } Heck, maybe everyone should be forced to write *foo = sub { ... };

$time = time; print; If use strict 'subs' is in effect you're guaranteed these are subroutine calls, or compile-time errors. If it isn't you get a nice little warning. Perhaps the stringification should be removed entirely, and the syntax always be a subroutine call.

Eek, that's what I want to kill. I want you to HAVE to write that as $time = time(); with the parens. The lack of parens is the root of MANY an evil in perl. --tom
Re: RFC 48 (v4) Replace localtime() and gmtime() with date() and utcdate()
Certainly numbers should never be "zero-padded"!
Re: RFC 263 (v1) Add null() keyword and fundamental data type
This is screaming mad. I will become perl6's greatest detractor and anti-campaigner if this nullcrap happens. And I will never shut up about it, either. Mark my words.
Re: Perl6Storm: Intent to RFC #0101
./sun4-solaris/POSIX.pm:sub isatty {
./sun4-solaris/B/Deparse.pm:sub is_scope {
./sun4-solaris/B/Deparse.pm:sub is_state {
./sun4-solaris/B/Deparse.pm:sub is_miniwhile { # check for one-line loop (`foo() while $y--')
./sun4-solaris/B/Deparse.pm:sub is_scalar {
./sun4-solaris/B/Deparse.pm:sub is_subscriptable {
./CGI.pm:sub isindex {
./CPAN.pm:sub is_reachable {
./CPAN.pm:sub isa_perl {
./Pod/Select.pm:sub is_selected {
./ExtUtils/Embed.pm:sub is_cmd { $0 eq '-e' }
./ExtUtils/Embed.pm:sub is_perl_object {
Re: Perl6Storm: Intent to RFC #0101
You suggested: file($file, 'w'); # is it writeable? That's really insane. The goal was to produce code that's legible. That is hardly better. It's much worse than is_writable or writable or whatnot. Just use -w if that's what you want. --tom
Re: RFC 259 (v2) Builtins : Make use of hashref context for garrulous builtins
grep -l Class::Struct */*.pm

Class/Struct.pm
File/stat.pm
Net/hostent.pm
Net/netent.pm
Net/protoent.pm
Net/servent.pm
Time/gmtime.pm
Time/localtime.pm
Time/tm.pm
User/grent.pm
User/pwent.pm

Please check those out for precedent and practice.
Re: RFC 290 Remove -X
One doesn't remove useful and intuitive syntax just because Mr Bill never put it into MS-BASIC! I merely passingly suggested that there be a use English style alias for these. They are, however, wholly natural to millions of people, and should not be harassed. (NB: 10 million Linux weenies alone) Still, 'twould be nice to have -rw and -rx and stuff, too. :-) BTW, -s(FH)/2 is still wickedly broken.
Better Security support (was: RFC 290 (v1) Remove -X)
The -wd syntax (writeable directory) is nicer than file($file, "wd"). But anyway, there's hardly anything wrong with -w -d. Don't understand the complaint. One thing I would really like to see is better security support. Look at the Camel-III's security chapter, File::Temp, and the is_safe stuff I've done lately.
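A rough sketch of what legible wrappers over the existing file tests could look like today. The helper names (is_directory, is_writable, is_writable_dir) are hypothetical; only -d, -w, and the special "_" filehandle are Perl's own. The combined test checks -d first and then reuses the same stat buffer through "_".

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper names, for illustration only.
sub is_directory {
    my ($path) = @_;
    my $ok = -d $path;
    return $ok ? 1 : 0;
}

sub is_writable {
    my ($path) = @_;
    my $ok = -w $path;
    return $ok ? 1 : 0;
}

# "Writable directory": -d does the stat, "-w _" reuses its result.
sub is_writable_dir {
    my ($path) = @_;
    my $ok = -d $path && -w _;
    return $ok ? 1 : 0;
}

my $dir = tempdir(CLEANUP => 1);     # scratch directory, removed at exit
print "dir ok\n"     if is_writable_dir($dir);
print "missing no\n" if !is_writable_dir("$dir/no-such-entry");
```

None of this needs new syntax; it is only a naming layer over what -X already does.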
Re: RFC 303 (v1) Keep C<use less>, but make it work.
Don't change "use less" to "use optimize". We don't need to ruin the cuteness.
RFC 307 (v1) PRAYER - what gets said when you C<bless> something
Goodness, no, don't call it "PRAYER". The blessing is one of corporate approval, not ecclesiastical deprecationem. Please don't piss people off.
Re: RFC 263 (v1) Add null() keyword and fundamental data type
No, in that wonderfully consistent Perl documentation, it's "undef" not SNIP is only used to refer to (as you pointed out in another post) the null string the null character the null list Those use null as an adjective. This RFC proposes an addition to Perl tSNIP the null This uses null as a noun, and it has a different meaning than undef. A null is a null byte, or a null character. Period. You are completely out of your mind if you expect to co-opt an extant term for this screwed up notion of yours. I place my faith in Larry not to fuck up the language with your insanity. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
In Perl, this is the null string: "" In Perl, this is the null character: "\0" In Perl, this is the null list: () In RFC 263, this is the null: null That's a different word for a different concept. No conflict, if you learn the way the RFC speaks. Wrong. Just plain wrong. It's a shame you don't like it, but this is the way we speak. What's this we and you business? I'm a perl user too. Who can't speak. If you wish to make sense of the documentation, you must learn its language. The documentation isn't all that consistent about everything, either. Perhaps you, personally, are more so, and if so, perhaps you should help rewrite the documentation to make it as perfectly consistent as yourself. Thank you very much, but I just did that. It's called Camel-3. I allowed that you might want to call it the null string, and I'm allowed to read "null string" and think "empty string", and I'm just as right as you are. You must not have a cohesive argument to make, if you resort to insults in an attempt to make points. You haven't heard insults. Here are insults: you are a stupid idiot. And I am incredibly glad that within hours, I'm about to spend three solid weeks away from such a fucked up blathering fool. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
By your "reasoning", we can just add infinitely more things that take twice a few pages to explain. You took that to an illogical extreme conclusion. Clearly you can't add everything to the language. However, it is clear by the set of currently submitted RFCs that more people think suggesting additions to Perl is a better use of their time than suggesting subtractions. Bullshit. If it takes several pages just to introduce your new, fucked-up notion of "false", you've done something wrong. If it then again takes several pages just to introduce how your new, fucked-up notion of "false" is different from the existing ones, you've done something wrong. Guess what? You've done something wrong. Perl is already too hard. So make it easier. Where are your RFCs to remove things? They're right here in my edit buffer. I will simply explain them to Larry directly. You won't even get the chance to waste my time. Fortunately, I have every reason to believe that Larry will reject your idiotic notion of false, which grew out of a cancerous complexity in an obscure niche of programming and has no business burdening users with its incredibly lame-ass naming and confusing behavior. --tom
Re: Beefier prototypes (was Re: Multiple for loop variables)
Could the prototype people please report whether Tim Bunce's issues with prototypes have been intentionally/adequately addressed? I'm not a prototype person (in fact RFC 128 makes it a hanging offence to use that confusing word in connection with parameter lists! ;-) Could someone please recapitulate Tim's issues? The long story is here: http://www.perl.com/pub/language/misc/bunce.html The short story includes details that involve how to permit sub fn($$$) to work with fn(@foo) where @foo==3, which won't be known till runtime. --tom
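To make the fn(@foo) case concrete, here is a small sketch of how Perl 5 behaves today, as far as I can tell. The sub name fn and the array @foo come from the message above; the eval-string is only there so the compile-time failure can be caught and shown.

```perl
use strict;
use warnings;

sub fn($$$) { return join '-', @_ }   # prototype: exactly three scalars

my @foo = (1, 2, 3);

# The ($$$) prototype puts @foo in scalar context, so the compiler sees a
# single argument and rejects the call before anything runs.
my $ok  = eval 'fn(@foo); 1';
my $err = $@;

print "fn(\@foo) compiled: ", ($ok ? "yes" : "no"), "\n";
print fn($foo[0], $foo[1], $foo[2]), "\n";   # spelled out, accepted
print &fn(@foo), "\n";                       # & form bypasses the prototype
```

The element count of @foo is only known at run time, while the prototype check happens at compile time, and that mismatch is the heart of the complaint.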
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Russ: About the only piece of code of mine that this would affect are places where I use ++ on an undef value, and that's not a bad thing to avoid as a matter of style anyway (usually I'm just setting a flag and = 1 would work just as well; either that, or it's easy enough to explicitly initialize the counter to 0).

Philip: Depends. While it is possible to initialise counters in the canonical "have I seen this before" situation, it's more convenient the way it is at the moment: $seen{$word}++; looks, to me, nicer than $seen{$word} = (exists $seen{$word}) ? 1 : $seen{$word} + 1; er, flip that. or if(defined($seen{$word})) { $seen{$word}++ } else { $seen{$word} = 1 } or similar.

In general, if you can get away with a simpler expression, it's better. For example, if ($foo && is_whatnot($foo)) is inferior to if ($foo) just as if (!$foo && !is_whatnot($foo)) is inferior to unless ($foo) "Inferior by what metric?" you ask? Complexity.

Larry wrote (in Camel-3) that ...the autoincrement will never warn that you're using undefined values, because autoincrement is an accepted way to define undefined values. ^^^ So I think you're safe there. He also wrote:

The C<||> can be used to set defaults despite its origins as a Boolean operator, since Perl returns the first true value. Perl programmers manifest a cavalier attitude with respect to truth, since the line above would break if, for instance, you tried to specify a quantity of 0. But as long as you never want to set either C<$quality> or C<$quantity> to a false value, the idiom works great. There's no point in getting all superstitious and throwing in calls to C<defined> and C<exists> all over the place. You just have to understand what it's doing. As long as it won't accidentally be false, you're fine.

Simple true and simple false are best if your goal is simplicity. Sometimes you need more than that. So you write functions. Or, if you're into the quirks of using strange magic of occasionally dubious charm, then through operationally overloaded objects. --tom
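A short sketch of the two idioms under discussion: counting with ++ on a not-yet-existing hash entry, and defaulting with ||. The %opt hash and the default of 10 are invented for the example.

```perl
use strict;
use warnings;

# Counting: ++ on a missing entry starts from undef without any warning.
my %seen;
$seen{$_}++ for qw(a b a c a);

# Defaulting: || picks the first TRUE value, so a legitimate 0 is lost.
my %opt = (quantity => 0);
my $with_or      = $opt{quantity} || 10;
my $with_defined = defined $opt{quantity} ? $opt{quantity} : 10;

print "a seen $seen{a} times\n";                        # a seen 3 times
print "||: $with_or, defined: $with_defined\n";         # ||: 10, defined: 0
```

The || form is fine as long as a false value can never be a meaningful setting; the defined test is the thing to reach for only when 0 or "" must survive.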
Re: RFC 263 (v1) Add null() keyword and fundamental data type
How can you convince anyone if you say you would not use it? For any feature enhancement to perl, unless there is a strong case for how it makes the language easier and better, it is just not going to happen. It's not as though Tim Bunce has been hollering for this, which is a bad sign. --tom
PERL6STORM - tchrist's brainstorm list for perl6
t people don't get surprised.

=item perl6storm #0053

Make <DIRH> call readdir() just as <FILEH> calls readline().

=item perl6storm #0054

Add dup() and dup2() style stuff to give legible ways of handling FH and =FH.

=item perl6storm #0055

Make it clean and easy to push input and output stream filters as alternatives to forkopen. For example:

    push_filter STDOUT, { s/^/READY/ }

is like calling program with

    prog | perl -pe 's/^/READY/'

or fancy forkopen tricks. Allow ways to look at filters on stack.

=item perl6storm #0056

Remove the complicated "extended" regex features (?...) Or rewrite them in a non-regex spec-y way. (?{...}) and (??{...}) come to mind.

=item perl6storm #0057

ADD MORE TOOLS. Code devel and analysis tools. Maybe PPT, too. BTW: I can't make "perlman fred" be accessible as "perl -man fred" but I want to. That's because the an.pm pragma won't be loaded without a real -e or script. So "fred" must exist as a path, for all values of "fred". That's annoying.

=item perl6storm #0060

formats and html don't mix nicely (not WYSIWYG). Neither do any hidden chars, like ESC-blah. Can we do more for generating simple clean transparent xml?

=item perl6storm #0061

Make CPAN.pm not hate me.

=item perl6storm #0062

Can there be an anal mode that detects anything that might xU (raise exception as unimplemented on some plats?)

=item perl6storm #0063

Core the portopath manippers. Too slow. Fix their stupid names: catfile sucks. it doesn't `cat file`.

=item perl6storm #0064

Do something about microsoft's CRLF abomination.

=item perl6storm #0065

Make indirect objectable built-ins overloadable/overrideable/inheritable by object type.

=item perl6storm #0066

Allow next/last/redo in do{} per C.

=item perl6storm #0067

Where's ferror()? Can we raise exceptions on them?

    use io_errors; # wrong name

or maybe STDOUT->raise_on_error

=item perl6storm #0070

Make tacit fclose stdout detect failure.

    END { close STDOUT || die "close STDOUT: $!" }

But was there some horrible gotcha with this?

=item perl6storm #0071

How do you prototype split? print?

=item perl6storm #0072

"Fix" the $ prototype. No coerce on @ or %. Fix for lists. fn($$) should permit fn(foo()) to mean fn((foo())[0,1]), which is damned annoying to write. likewise fn(@foo[0,1]), which freaks.

=item perl6storm #0073

kill bareword strings entirely.

=item perl6storm #0074

make all the built-ins take perl style interfaces, not C ones. eg: notice how larry changed openlog from C to Perl. having to write "O_blah | O_blah" hurts.

=item perl6storm #0075

Make a way for regex switches not to be single lettered.

    re_match( EXPR, REGEX, FLAGS )

    $gotit = re_match($line = readline(), qr/^foo.*bar/, REG_ICASE | REG_NEWLINE)

But now we're back to ugly O_ or'ing. See regcomp(3).

=item perl6storm #0076

Allow ASCII characters to be specified symbolically. Too retro? chr(NUL), chr(SOH), chr(STX). Or are those already chars, and one would do ord(SOH) instead? Module would suffice.

=item perl6storm #0077

make open(FH, "|cmd|") just work -- call open2 etc.

=item perl6storm #0100

add python and java sections to perltrap.

=item perl6storm #0101

Just like the "use english" pragma (the modern not-yet-written version of "use English" module), make something for legible fileops.

    is_readable(file) is really -r(file)

note that these are hard to write now due to -s(FH)/2 style parsing bugs and prototype issues on handles vs paths.

=item perl6storm #0102

Make "my sub" work. Make nested subs work.

=item perl6storm #0103

Finally implement the less pragma.

    use less 'memory';

etc. Right now, you can say silly things.

    use less 'sillyiness';

What about use more? Or is that just no less

    use less 'magic';
    no more 'magic';

=item perl6storm #0104

Look at the deep magic seen in some of the examples in Camel-3's OO and tie chapters and in perltootc. Consider what to canonize into a simpler-to-get-at mechanism, just as plum engendered much in perl5.

=item perl6storm #0105

Learn to count in decimal.

=back

=head1 BUGS

None. These are features.

=head1 AUTHOR

Tom Christiansen
Re: PERL6STORM - tchrist's brainstorm list for perl6
=item perl6storm #0106

Safe "signals"! (not syssigs, really)
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Now, that's not accurate either. "NUL" is simply a normalized form of "null", because all the ASCII special characters have three upper-case letter names. There is no doubt that the ASCII guys meant "null" by this. All other matters aside, kindly consider this simple one: If ever you thought homophones were bad, imagine then how to rely upon nothing more than mere case distinction, an ancillary artifact of our system of writing, in two distinct terms whose usages are not radically different from each other but whose meanings most certainly are, is an endeavour virtually guaranteed to be frequently misheard and thus misconstrued when those terms are used in spoken discourse--as they inevitably shall be. --tom
Re: \z vs \Z vs $
"TC" == Tom Christiansen [EMAIL PROTECTED] writes:

Could you explain what the problem is?

TC /$/ does not only match at the end of the string.
TC It also matches one character fewer. This makes
TC code like $path =~ /etc$/ "wrong".

Sorry, I'm missing it.

I know. On your "longest match", you are committing the classic error of thinking greed more important than eagerness. It's not.

This is unrelated to /m. Go back and read all the insanities we (mostly gbacon and yours truly) went through to fix the 5.6 release's modules. People coded them *WRONG*. Wrong means incorrect behaviour. Sometimes this even leads to security foo.

BOTTOM LINE: You cannot use /foo$/ to say "does the string end in `foo'?". You can't do that. You can't even use /s to fix it. It doesn't fix it. This is an annoying gotcha. Larry once said that he wished he had made \Z do what \z now does. One would like $ to (be able to) mean "ONLY AT END OF STRING".

--tom

EXAMPLE 1:

--- /usr/local/lib/perl5/5.00554/File/Basename.pm Mon Jan 4 13:00:53 1999
+++ /usr/local/lib/perl5/5.6.0/File/Basename.pm Sun Mar 12 22:24:29 2000
@@ -37,10 +37,10 @@
 "VMS", "MSDOS", "MacOS", "AmigaOS" or "MSWin32", the file specification
 syntax of that operating system is used in future calls to
 fileparse(), basename(), and dirname(). If it contains none of
-these substrings, UNIX syntax is used. This pattern matching is
+these substrings, Unix syntax is used. This pattern matching is
 case-insensitive. If you've selected VMS syntax, and the file specification
 you pass to one of these routines contains a "/",
-they assume you are using UNIX emulation and apply the UNIX syntax
+they assume you are using Unix emulation and apply the Unix syntax
 rules instead, for that function call only.
 
 If the argument passed to it contains one of the substrings "VMS",
@@ -73,7 +73,7 @@
 =head1 EXAMPLES
 
-Using UNIX file syntax:
+Using Unix file syntax:
 
 ($base,$path,$type) = fileparse('/virgil/aeneid/draft.book7',
 '\.book\d+');
@@ -102,7 +102,7 @@
 The basename() routine returns the first element of the list produced
 by calling fileparse() with the same arguments, except that it always
 quotes metacharacters in the given suffixes. It is provided for
-programmer compatibility with the UNIX shell command basename(1).
+programmer compatibility with the Unix shell command basename(1).
 
 =item C<dirname>
 
@@ -111,8 +111,8 @@
 second element of the list produced by calling fileparse() with the same
 input file specification. (Under VMS, if there is no directory information
 in the input file specification, then the current default device and
-directory are returned.) When using UNIX or MSDOS syntax, the return
-value conforms to the behavior of the UNIX shell command dirname(1). This
+directory are returned.) When using Unix or MSDOS syntax, the return
+value conforms to the behavior of the Unix shell command dirname(1). This
 is usually the same as the behavior of fileparse(), but differs in some
 cases. For example, for the input file specification F<lib/>, fileparse()
 considers the directory name to be F<lib/>, while dirname() considers the
@@ -124,12 +124,22 @@
 ## use strict;
-use re 'taint';
+# A bit of juggling to insure that C<use re 'taint';> always works, since
+# File::Basename is used during the Perl build, when the re extension may
+# not be available.
+BEGIN {
+ unless (eval { require re; })
+{ eval ' sub re::import { $^H |= 0x0010; } ' }
+ import re 'taint';
+}
+
+
+use 5.005_64;
+our(@ISA, @EXPORT, $VERSION, $Fileparse_fstype, $Fileparse_igncase);
 require Exporter;
 @ISA = qw(Exporter);
 @EXPORT = qw(fileparse fileparse_set_fstype basename dirname);
-use vars qw($VERSION $Fileparse_fstype $Fileparse_igncase);
 $VERSION = "2.6";
 
@@ -162,23 +172,23 @@
 if ($fstype =~ /^VMS/i) {
 if ($fullname =~ m#/#) { $fstype = '' } # We're doing Unix emulation
 else {
- ($dirpath,$basename) = ($fullname =~ /^(.*[:\]])?(.*)/);
+ ($dirpath,$basename) = ($fullname =~ /^(.*[:\]])?(.*)/s);
 $dirpath ||= ''; # should always be defined
 }
 }
 if ($fstype =~ /^MS(DOS|Win32)/i) {
-($dirpath,$basename) = ($fullname =~ /^((?:.*[:\\\/])?)(.*)/);
-$dirpath .= '.\\' unless $dirpath =~ /[\\\/]$/;
+($dirpath,$basename) = ($fullname =~ /^((?:.*[:\\\/])?)(.*)/s);
+$dirpath .= '.\\' unless $dirpath =~ /[\\\/]\z/;
 }
- elsif ($fstype =~ /^MacOS/i) {
-($dirpath,$basename) = ($fullname =~ /^(.*:)?(.*)/);
+ elsif ($fstype =~ /^MacOS/si) {
+($dirpath,$basename) = ($fullname =~ /^(.*:)?(.*)/s);
 }
 elsif ($fstype =~ /^AmigaOS/i) {
-($dirpath,$basename) = ($fullname =~ /(.*[:\/])?(.*)/);
+($dirpath,$basename) = ($fullname =~ /(.*[:\/])?(.*)/s);
 $dirpath = './' unless $dirpath;
 }
 e
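The gotcha behind those \z fixes fits in a few lines. This is only a sketch; the test string "/etc\n" is made up, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $path = "/etc\n";    # note the trailing newline

my $dollar   = $path =~ /etc$/  ? 1 : 0;   # 1: $ tolerates one final newline
my $dollar_s = $path =~ /etc$/s ? 1 : 0;   # 1: /s changes ".", not "$"
my $big_z    = $path =~ /etc\Z/ ? 1 : 0;   # 1: \Z behaves like $ here
my $small_z  = $path =~ /etc\z/ ? 1 : 0;   # 0: \z means the very end, only

print "\$: $dollar  \$ with /s: $dollar_s  \\Z: $big_z  \\z: $small_z\n";
```

Only \z answers the question "does the string end in etc?"; the other three all accept a string that ends in a newline instead.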
Re: \z vs \Z vs $
That was my second thought. I kinda like it, because //s would have two effects: + let . match a newline too (current) + let /$/ NOT accept a trailing newline (new) Don't forget /s's other meaning. --tom
Re: RFC 212 (v1) Make length(@array) work
What I said was: making length(@array) "work" would be catering to novice people *coming from C*. We shouldn't. Not that much. In Perl, a string is not an array. I'm pretty sure it's not just the people coming from C who expect this. This all points to the bug^H^H^Hdubious feature which is the sub($) context template as applied to named arrays and hashes. Requiring an explicit conversion would help a lot. Or so it seems. --tom
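What length(@array) does in Perl 5 right now, as a quick sketch: length() wants a scalar, so the array is evaluated in scalar context (its element count) and the length of that number's string form comes back. Newer perls may also emit a warning for this; the array contents here are arbitrary.

```perl
use strict;
use warnings;

my @array = (1 .. 10);

my $count = scalar @array;     # 10: the number of elements
my $len   = length(@array);    # 2: the length of the string "10"

print "count=$count length=$len\n";
```

The answer is neither an error nor the element count, which is the sort of silent conversion the ($) context template makes possible.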
Re: RFC 153 (v2) New pragma 'autoload' to load functions and modules on-demand
This will make programs highly nonportable. You can't easily know what modules they really need. --tom
Re: RFC 12 (v2) variable usage warnings
And what about $$x? Dang, are we back to this incredible confusion about what it is to be defined in Perl? undef $a; That is now UNINITIALIZED. So is this: $a = undef; You have initialized it to undef. There is no reasonable difference. Solution: Remove all references from the language to defined and undef. People just aren't smart enough to understand them. Change defined() to read has_a_valid_initialized_scalar_value(). Change undef() to "operator_to_uninitialize_a_variable". Tough luck on the chumps who can't type well. They pay for their brothers' idiocy. repeat until blue: INITIALIZED == DEFINED UNINITIALIZED == UNDEFINED --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Nathan Wiger wrote: ...a "use tristate" pragma which obeys blocks bka "lexically scoped". If I'm not mistaken, pragmas *are* lexically scoped. They *can* be. They needn't be. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
The semantics for NULL is different, read the SQL standard. Perl has no business contaminating itself with SQL. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Unlike undef, which gets assigned to uninitialized variables, NULL is only used by choice. So you only need deal with NULL when there is the possibility that it needs to be handled in some special way, and might exist as a value in the expression being handled. This can be done without being in the language. Return a ref to a blessed object whose stringification or numification method raises an exception. The novice need not use NULL until he is an expert, or is dealing with databases. As an expert, it is not hard to understand the difference, and if dealing with databases, there is a definite need to understand the difference. I completely disbelieve. Changing the fundamental nature of what a VALUE is in Perl is hardly something you can hide. The amount of pain people seem to go through already understanding this stupid spectre out of database hell is sufficient to run in terror. --tom
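A rough sketch of the out-of-language approach suggested above: a blessed object whose stringification and numification raise an exception. The package name My::Null and the error messages are hypothetical, chosen only for illustration; the overload pragma itself is core Perl.

```perl
use strict;
use warnings;

{
    # Hypothetical class: any attempt to use the object as a
    # string or as a number throws instead of yielding a value.
    package My::Null;
    use overload
        '""' => sub { die "NULL used as a string\n" },
        '0+' => sub { die "NULL used as a number\n" };

    sub new {
        my ($class) = @_;
        return bless({}, $class);
    }
}

my $null = My::Null->new;

# Interpolation goes through the '""' handler and dies.
my $string_error = eval { my $s = "value: $null"; 1 } ? '' : $@;

# Arithmetic goes through the '0+' handler and dies.
my $number_error = eval { my $n = $null + 1; 1 } ? '' : $@;

print "string use: $string_error";
print "number use: $number_error";
```

Code that never handles such an object pays nothing for it, which is the point of keeping it in a module.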
Re: RFC 263 (v1) Add null() keyword and fundamental data type
no strict; $a = undef; $b = null; Perl already has a null string: "". --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Perl has *one* out-of-band value. It doesn't need more. That doesn't mean that perhaps some rare sorts of programming might not benefit from fancy weirdnesses. That's what modules are for. You don't need to complicate the general language to get what you want. Don't make others pay for your problems. 1) all otherwise uninitialized variables are set to undef Wrong. You cannot say that an aggregate is undef. Scalar variables--not all variables, just scalar variables alone--hold the uninitialized value, henceforth known as the anti-initialized value, if they were last initialized to the anti-initialized value, or if they haven't been initialized at all--in which case, I suppose, you *might* choose to call it _a_n_t_e-initialized instead of anti-initialized, but then you'll get people wanting to split those up again. 2) under "use strict", use of undef in expressions is diagnosed with a warning Wrong. You are thinking, perhaps, of `use warnings', not `use strict'. In particular, use warnings qw'uninitialized'; 3) undef is coerced to 0 in numeric expressions, false in boolean expressions, and the empty string in string expressions. I'm not happy with your use of "coerce". There's no mutation. It simply *is* those things. It's not quite kosher to claim that undef gets "coerced" to false in Boolean expressions. The anti-initialized value *is* a false value. The only false number is 0, and therefore the anti-initialized numeric value is 0. Yes, we have two false strings--lamentably--but since we need a canonical one (eg the result of 1 == 2), we choose "". You also forgot this: 4) The anti-initialized value is autovivified to a true value when that value is (legally) used lvaluably. Notice also this: % perl -le 'use warnings; $a = 1 == 2; print $a->[1] ? "good" : "bad"' bad % perl -le 'use strict; $a = 1 == 2; print $a->[1] ? "good" : "bad"' Can't use string ("") as an ARRAY ref while "strict refs" in use at -e line 1. Exit 255 --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
$a = null; $b = ($a == 42); print defined($b)? "defined" : "not defined"; would print "not defined", maybe? In a sane world of real (non-oo-sneaky) perl, the "==" operator returns only 1 or "". Both are defined. --tom
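A short check of that claim in Perl 5, using plain numeric operands (no overloading); the variable names are only for illustration:

```perl
use strict;
use warnings;

my $false = (1 == 2);   # canonical false: the null string
my $true  = (1 == 1);   # canonical true: 1

my $false_is_defined = defined($false) ? 1 : 0;
my $false_is_empty   = ($false eq "")  ? 1 : 0;
my $false_is_zero    = ($false == 0)   ? 1 : 0;  # no "isn't numeric" warning

print "false defined: $false_is_defined, empty: $false_is_empty, ",
      "zero: $false_is_zero, true: $true\n";
```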
Re: RFC 263 (v1) Add null() keyword and fundamental data type
It only takes a few pages, and a few truth tables to explain NULL. It should only take a few pages and a few examples, to explain the difference between undef and null. Ah, so the cost of this is twice a few pages of explanation, plus truth tables and examples? Are you mad? I can think of no better proof that this is the Wrong Thing than your very own words. Thank you. ---tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
* Tom Christiansen ([EMAIL PROTECTED]) [21 Sep 2000 05:49]: no strict; $a = undef; $b = null; Perl already has a null string: "". Looks more like a string of no length than a null string. Well, it's not. That's a null string. You're thinking of "\0", a true value in Perl. Here are the canonical definitions: NULL STRING: A string containing no characters, not to be confused with a string containing a null character, which has a positive length. NULL CHARACTER: A character with the ASCII value of zero. It's used by C and some Unix syscalls to terminate strings, but Perl allows strings to contain a null. NULL LIST: A list value with zero elements, represented in Perl by (). --tom
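The three definitions can be checked directly; this is only a sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $null_string = "";      # no characters at all
my $null_char   = "\0";    # one character whose value is zero
my @null_list   = ();      # zero elements

my $string_length = length $null_string;      # 0
my $char_length   = length $null_char;        # 1
my $string_truth  = $null_string ? 1 : 0;     # false
my $char_truth    = $null_char   ? 1 : 0;     # true: it is neither "" nor "0"
my $list_count    = scalar @null_list;        # 0

print "null string: length $string_length, truth $string_truth\n";
print "null char:   length $char_length, truth $char_truth\n";
print "null list:   $list_count elements\n";
```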
Re: RFC 85 (v2) All perl generated errors should have a unique identifier
"TC" == Tom Christiansen [EMAIL PROTECTED] writes: Currently many programs handle error returns by examining the text of the error returned in $@. This makes changes in the text of the error message, an issue for the backwards compatibility police. TC eval { fn() }; TC if ($@ == EYOURWHATHURTS) { } TC sub fn { die "blindlesnot" } I don't understand what you are trying to say. I'm saying that you can't know what to check for, because you don't know who generated the exception. Can you use your fancy constants? And what is "core"? Compiler? Interpreter? Utilities? Pragmata? Modules? Citing IBM as a reference is enough to drive a lot of us away screaming. Try errno.h or sysexits.h Notice how much nicer this is. Few values, but usable in varied places. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
That's not much different than the cost of undef, so I fear it proves nothing, universally. YOU OVERQUOTEDsen wrote: YOU OVERQUOTEDkes a few pages, and a few truth tables to explain NULL. YOU OVERQUOTEDonly take a few pages and a few examples, to explain the YOU OVERQUOTED between undef and null. YOU OVERQUOTED YOU OVERQUOTEDcost of this is twice a few pages of explanation, plus truth YOU OVERQUOTEDexamples? Are you mad? YOU OVERQUOTED YOU OVERQUOTED of no better proof that this is the Wrong Thing than YOU OVERQUOTEDwn words. Thank you. YOU OVERQUOTED YOU OVERQUOTED YOU OVERQUOTED YOU OVERQUOTED YOU OVERQUOTED YOU OVERQUOTED YOU OVERQUOTEDe on the right track, YOU OVERQUOTEDn over if you just sit there. YOU OVERQUOTED -- Will Rogers YOU OVERQUOTED YOU OVERQUOTEDFree Internet Access and Email__ YOU OVERQUOTED.netzero.net/download/index.html By your "reasoning", we can just add infinitely more things that take twice a few pages to explain. Perl is already too hard. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
For example, assuming this code: $name = undef; print "Hello world!" if ($name eq undef); So don't do that. Use C<defined $name> if you want to ask that question. That's why I want to change the names of these things. The current situation invites errors such as seen previously. Actually, one almost wants a warning on "=undef", too. Well, some uses. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Tom Christiansen wrote: no strict; $a = undef; $b = null; Perl already has a null string: "". That's an empty string. In any case, if you really want to call it a null string, that's fine, just a little more likely to be misinterpreted. In Perl, this is the null string:"" In Perl, this is the null character: "\0" In Perl, this is the null list: () It's a shame you don't like it, but this is the way we speak. If you wish to make sense of the documentation, you must learn its language. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
I'm not happy with your use of "coerce". There's no mutation. It simply *is* those things. Fine. So, in particular, it _isn't_ null. Of course it's null. That's why it has length zero. Stop speaking SQL at me. I'm speaking Perl. 4) The anti-initialized value is autovivified to a true value when that value is (legally) used lvaluably. If, by "true value" in the above, you mean a value other than undef whicSNIP interpreted as boolean false, then I think I understand what you said. SNIP enough to have said it, which is why I used coerce. No, I mean this: undef $a; @$a = (); if ($a) { . } # always true It's the lvaluable deref that autoinitializes. --tom
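The same behaviour under strict, as a minimal sketch; a lexical $ref stands in for the global $a of the example:

```perl
use strict;
use warnings;

my $ref;                                  # starts out undefined
my $defined_before = defined($ref) ? 1 : 0;

@$ref = ();                               # lvalue dereference autovivifies an array ref

my $kind  = ref $ref;                     # "ARRAY"
my $truth = $ref ? 1 : 0;                 # a reference is always true
my $count = scalar @$ref;                 # the new array is still empty

print "before: $defined_before, now a $kind ref, ",
      "truth $truth, $count elements\n";
```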
Re: RFC 12 (v2) variable usage warnings
But that doesn't even matter that much here; I'm saying that if the compiler can definitely determine that you are using an uninitialized variable, it should warn. ... $x is a global. The compiler cannot detect all possible assignments to or accesses of globals, so it never warns about them. If you inserted my $x at the top of that code, it would most likely produce the "possible use" warning. Or not; this is a simple enough case that it might be able to infer the right answer. I am certainly not saying that the "possible use" warning should be enabled by default. But please, argue over that one separately from the others. It's the most likely to annoy. Or: foo(); print $x; Generate a warning, or not? Which one? Remember, foo() may initialize $x. Same thing. If $x is lexical, it gives a definite warning. If $x is a global, it says nothing. You're right; I need to point this out in the RFC. Careful: sub ouch { my $x; my $fn = sub { $x++ }; register($fn); print $x; } --tom
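A runnable variant of the ouch() example, assuming for illustration that register() invokes its callback immediately (the original does not say what register() does). It shows why a compile-time check cannot conclude that $x is still uninitialized at the point of use:

```perl
use strict;
use warnings;

# Hypothetical register(): here it simply calls the callback at once.
# A real one might store it and call it later, or never.
sub register {
    my ($callback) = @_;
    $callback->();
    return;
}

sub ouch {
    my $x;                       # never assigned directly in this scope
    my $fn = sub { $x++ };       # but the closure can modify it
    register($fn);
    return $x;                   # already 1 if register() ran $fn
}

my $seen = ouch();
print "x after register: $seen\n";
```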
Re: RFC 12 (v3) variable usage warnings
Which is silly, because you shouldn't have to say '$x = $x = 3' when you mean '$x = 3'. Just because there's a real reason behind it doesn't make it any less silly. I'd like to see where this can happen. Sounds like someone forgot to declare something: our $x; $x = 2; --tom
Re: RFC 12 (v2) variable usage warnings
Anything else? Any opinion on whether eval "" should do what it does now, and be invisible for the purposes of this analysis; or if it should be assumed to instead both use and initialize all visible variables? The former produces more spurious warnings, the latter misses many errors. You have to assume eval STRING can do anything. --tom
Re: RFC 12 (v3) variable usage warnings
It happens when I don't bother to declare something. My company has several dozen machines with an 'our'-less perl, and 'use vars qw($x)' is a pain. As is $My::Package::Name::x. Far, far easier to fix behavioral problems than to hack Perl. --tom
Re: RFC 85 (v2) All perl generated errors should have a unique identifier
Ok, so you want message catalogues, and not solely on Perl but anything in the distribution. You should say that. --tom
Re: RFC 12 (v2) variable usage warnings
Tom Christiansen wrote: Anything else? Any opinion on whether eval "" should do what it does now, and be invisible for the purposes of this analysis; or if it should be assumed to instead both use and initialize all visible variables? The former produces more spurious warnings, the latter misses many errors. You have to assume eval STRING can do anything. --tom "have to"? Perl5 doesn't. You mean "perl". % perl -we '$x = 3; $v = "x"; eval "\$$v++"' Name "main::x" used only once: possible typo at -e line 1. Non sequitur. And no, I don't have time.
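The one-liner above, spelled out under strict as a sketch. The variable name reaches the code only at run time, inside the string, so no static analysis of the surrounding program can see that $x is both used and modified:

```perl
use strict;
use warnings;

our $x = 3;
my $name = "x";

# The evaluated string is '$x++'; the compiler of the enclosing
# file never sees $x mentioned there.
eval "\$$name++";
die $@ if $@;

print "x is now $x\n";
```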
Re: RFC 12 (v3) variable usage warnings
Tom Christiansen wrote: It happens when I don't bother to declare something. My company has several dozen machines with an 'our'-less perl, and 'use vars qw($x)' is a pain. As is $My::Package::Name::x. Far, far easier to fix behavioral problems than to hack Perl. --tom Not sure what you mean, since this RFC _adds_ a warning in this case. In fact, with the proposed change, my trick to avoid punishment for my misbehavior would no longer work. The point is that if $x = 3; elicits a warning... that you should declare the variable properly, of course. --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
But I see code in the XML modules that check defined (@array) They're buggy and wrong. --tom
Re: RFC 12 (v2) variable usage warnings
Have a nice day. And thanks for all the fish.
\z vs \Z vs $
What can be done to make $ work "better", so we don't have to make people use /foo\z/ to mean /foo$/? They'll keep writing the $ for things that probably oughtn't abide optional newlines. Remember that /$/ really means /(?=\n?\z)/. And likewise with \Z. --tom
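A small sketch of the difference, using only core Perl; the sample string is an assumption chosen for illustration:

```perl
use strict;
use warnings;

# A string that ends in a single trailing newline.
my $line = "foo\n";

my $dollar   = $line =~ /foo$/         ? 1 : 0;  # $ tolerates one trailing newline
my $big_z    = $line =~ /foo\Z/        ? 1 : 0;  # so does \Z
my $little_z = $line =~ /foo\z/        ? 1 : 0;  # \z demands the true end of string
my $spelled  = $line =~ /foo(?=\n?\z)/ ? 1 : 0;  # what $ means without /m

print "\$: $dollar  \\Z: $big_z  \\z: $little_z  (?=\\n?\\z): $spelled\n";
```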
Re: RFC 259 (v1) Builtins : Make use of hashref context for garrulous builtins
It's hard to remember the sequence of values that the following builtins return: stat/lstat caller localtime/gmtime get* and though it's easy to look them up, it's a pain to look them up Every Single Time. Moreover, code like this is far from self-documenting: use File::stat; if ((stat $filename)[7] 1000) {...} if ((lstat $filename)[10] time()-1000) {...} use Time::localtime; if ((localtime(time))[3] 5) {...} use User::pwent; if ($usage (getpwent)[4]) {...} use Net::hostent; @host{qw(name aliases addrtype length addrs)} = gethostbyname $name; Don't have one for that. warn "Problem at " . join(":", @{[caller(0)]}[3,1,2]) . "\n"; It is proposed that, when one of these subroutines is called in the new HASHREF context (RFC 21), it should return a reference to a hash of values, with standardized keys. For example: Which is what the modules listed above do. And more. --tom
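For comparison, a brief sketch of the by-name interface those core modules already provide. The temporary file and its ten-byte content are assumptions made so that the example is self-contained:

```perl
use strict;
use warnings;
use File::stat;               # overrides stat() to return an object
use File::Temp qw(tempfile);
use Time::localtime;          # overrides localtime() likewise

# Create a scratch file of known size.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "x" x 10;
close $fh or die "close: $!";

my $st       = stat($filename) or die "stat: $!";
my $by_name  = $st->size;                        # self-documenting
my $by_index = (CORE::stat($filename))[7];       # the positional form

my $mday = localtime()->mday;                    # instead of (localtime)[3]

print "size by name: $by_name, by index: $by_index, day of month: $mday\n";
```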
Re: RFC 258 (v1) Distinguish packed binary data from printable strings
On 19 Sep 2000, Perl6 RFC Librarian wrote: Distinguish packed binary data from printable strings What defines a "printable" string? What if I'm working in an environment that can "print" bytes that yours can't? Specifically I'm wondering how this proposal handles Unicode. Perl should fly far and fast from starting down the bumpy road where that data is strongly typed in the mythical and deceptive text-vs-binary sense, for that path is one littered with the frustrations of a legion of programmers stretching from the plodding mainframe operating systems of our grandfathers to the toybox legacies of our less fortunate brethren of today. Heed the wisdom of the Unix and the song of the C: Let your data be simply data, homogenously clean. To do otherwise is to suffer unending inanities and combinatoric misconnects of noncongruent types. --tom
Re: RFC 255 (v2) Fix iteration of nested hashes
This RFC proposes that the internal cursor iterated by the C<each> function be stored in the pad of the block containing the C<each>, rather than being stored within the hash being iterated. Then how do you specify which iterator is to be reset when you wish to do that? Currently, you do this by specifying the hash. If the iterator is no longer affiliated with the hash, but the opcode node, then what are you going to do? --tom
Re: RFC 258 (v1) Distinguish packed binary data from printable strings
Perhaps what you're truly looking for is a generalized tainting mechanism. --tom
Re: RFC 255 (v2) Fix iteration of nested hashes
Just to note: in version 2 of the RFC, it's associated with the pad of the block in which the C<each> appears. then what are you going to do? The short answer is that there is no "manual" reset of iterators. I am concerned about that. sub fn(\%) { my $href = shift; while (my($k,$v) = each %$href) { return if something's funny; } } Now, imagine you call fn(%foo); fn(%bar); and there's a premature exit. Isn't the second fn() going to not only be at the wrong spot, but still worse, at the wrong hash? Or do you plan for all block exits to clear all their iterators? What happens then in this code: for my $hr (\(%foo, %bar, %glarch)) { push @first_keys, scalar each %$hr; } There's no block exit there. --tom
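For reference, a sketch of how Perl 5 behaves today, where the iterator lives in the hash and is reset by naming the hash. The hash contents are illustrative only:

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

# Take one pair and then abandon the iteration early.
my ($first_key) = each %h;

# A later loop over the same hash resumes where the first left off.
my $remaining = 0;
while (my ($k, $v) = each %h) { $remaining++ }

# Start another iteration, then reset it manually by naming the hash.
my ($again) = each %h;
keys %h;                      # void-context keys resets the iterator

my $full = 0;
while (my ($k, $v) = each %h) { $full++ }

print "after early exit: $remaining pairs left; after reset: $full pairs\n";
```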
Re: RFC 258 (v1) Distinguish packed binary data from printable strings
Tim Conrow wrote: Tom Christiansen wrote: Perhaps what you're truly looking for is a generalized tainting mechanism. Sounds cool, but I have only the vaguest idea what you (may) mean. Pointers? RFCs? Examples? Hints? Sorry for the clutter, but I didn't want to come off too clueless. I know what tainting is, I just don't know what you mean by generalized tainting. If it's been discussed before I'd love to see a pointer to the thread. You want to have more properties that work like tainting does: a per-SV attribute that is enabled or disabled by particular sorts of expressions, sometimes dependent upon the previous presence or absence of that property, other times, not so. --tom
Re: RFC 76 (v2) Builtin: reduce
$sum = reduce {$_[0]+$_[1]} 0, @numbers || die "Chaos!!"; Note with the || that way, it'll die immediately if @numbers is empty, even before destroying the universe. Yes, but why are you passing the size of the array in there? --tom
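The precedence problem can be seen with the reduce() from List::Util, used here as a stand-in for the proposed builtin (it takes $a and $b rather than @_); the numbers are illustrative:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @numbers = (10, 20, 30);

# || binds tighter than the comma, so this is reduce {...} 0, (@numbers || die ...).
# @numbers is evaluated in scalar context and contributes its size.
my $buggy = reduce { $a + $b } 0, @numbers || die "Chaos!!\n";

# Without the ||, the array is flattened into the list as intended.
my $sum = reduce { $a + $b } 0, @numbers;

# With an empty array the die fires while the argument list is being built.
my @empty;
my $died = eval { reduce { $a + $b } 0, @empty || die "Chaos!!\n"; 1 } ? 0 : 1;

print "buggy: $buggy, intended: $sum, died on empty: $died\n";
```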
Re: RFC 76 (v2) Builtin: reduce
Why not just check @numbers? --tom
Re: RFC 76 (v2) Builtin: reduce
Following Glenn's lead, I'm in the process of RFC'ing a new null() keyword and value As though one were not already drowning in a surfeit of subtly dissimilar false values. --tom
Re: RFC 76 (v2) Builtin: reduce
Ummm...Maybe I'm missing something, but how does reduce() know the difference between $sum = reduce ^_+^_, 0, @values; unshift @values, 0; $sum = reduce ^_+^_, @values; You know, I really find it much more legible to consistently write these sorts of thing with braces around their code block, just as @x = map { $_ * 3 }, 4, 5; is infinitely better than @x = map $_ * 3, 4, 5; --tom
Re: RFC 12 (v2) variable usage warnings
The warning for the use of an unassigned variable should be "use of uninitialized variable C<$x>". The problem with that idea, now as before, is that this check happens where Perl is looking at a value, not a variable. Even were it possible to arduously modify Perl to handle explicitly named simple variables, there's much more to consider. if ( fx() == fy() ) { } For one. --tom
Re: RFC 85 (v2) All perl generated errors should have a unique identifier
Currently many programs handle error returns by examining the text of the error returned in $@. This makes changes in the text of the error message, an issue for the backwards compatibility police. eval { fn() }; if ($@ == EYOURWHATHURTS) { } sub fn { die "blindlesnot" } --tom
Re: RFC 263 (v1) Add null() keyword and fundamental data type
Currently, Perl has the concept of C<undef>, which means that a value is not defined. One thing it lacks, however, is the concept of C<null>, which means that a value is known to be unknown or not applicable. These are two separate concepts. No, they aren't. --tom
Re: RFC - Interpolation of method calls
I doubt anyone's arguing that they're not function calls. What I find "surprising" is that Perl doesn't DWIM here. It doesn't encourage data encapsulation or try to make it easy: my $weather = new Schwern::Example; print "Today's weather will be $weather->{temp} degrees and sunny."; print "And tomorrow we'll be expecting ", $weather->forecast; You are wicked and wrong to have broken inside and peeked at the implementation and then relied upon it. If method calls interpolated, this would be easier. Instead, it encourages you to provide direct hash access to your data since it's much easier to use that way. I find myself wanting to say: print "Thanks, $cgi->param('name') for your order!"; print "It matched" if /$config->get_expression/; Oh joy: now Perl has nested quotes. I *hate* nested quotes. They're terrible. See the shell for how icky this is. Rather than: print "Thanks, " . $cgi->param('name') . " for your order"; What's the big deal? How does it hurt you to do that? And why are you catting it instead of simply passing a list? --tom
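A sketch of what Perl 5 actually does with each form. The class My::Weather and its data are hypothetical, and join() stands in for passing a list to print so that the result can be inspected:

```perl
use strict;
use warnings;

{
    # Hypothetical class, for illustration only.
    package My::Weather;
    sub new      { my ($class) = @_; return bless({ temp => 72 }, $class) }
    sub forecast { return "rain" }
}

my $weather = My::Weather->new;

# Hash dereference interpolates (and exposes the implementation).
my $peeked = "temp is $weather->{temp}";

# A method call does not: only $weather interpolates, "->forecast" is literal text.
my $not_called = "forecast is $weather->forecast";

# Passing a list keeps the call outside the quotes.
my $as_list = join '', "forecast is ", $weather->forecast;

print "$peeked\n$not_called\n$as_list\n";
```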
Re: RFC - Interpolation of method calls
As Nate pointed out: print "$hash->{'f'.'oo'}" already works fine and the world spins on. That is no argument for promoting illegibility. --tom
Re: RFC 252 (v1) Interpolation of subroutines
Subroutines calls should interpolate in double-quoted strings and similar contexts. print "Sunset today is at sunset($date)"; interpolates to: print 'Sunset today is at '.sunset($date); Huh? And what if it's a built-in? What if it's not quite a built-in, but an import? What if you don't *know* whether it's a built-in? I cannot but wonder what kind of childhood abuse leads programmers to expect that double quotes shouldn't count for squat anymore. If you don't like 'em this much, you should quit using them. --tom
Re: RFC 252 (v1) Interpolation of subroutines
Surely the next request will be to make anything that works outside of quotes work inside of them, completely erasing the useful visual distinction. Why should operators, after all, be any different from functions? print "I have Fooey->fright($n) frobbles.\n"; print "I have snaggle($n) frobbles.\n"; print "I have abs($n) frobbles.\n"; print "I have $x+$y frobbles.\n"; What's the use of quotes these days anyway? --tom
Re: Beefier prototypes (was Re: Multiple for loop variables)
[This somewhat elderly draft was found lying about an edit buffer, but I do not believe it was ever sent yet.] Now, the possibility to either pass individual scalars to a sub, or an array, (or several arrays, or a mixture of arrays and scalars) and Perl treating them as equivalent, that is pretty much the most important feature of Perl. IMO. Perl would not be Perl without it. Well, "most important" is an interestingly strong way of phrasing it. But how to deal with variadicity in an intuitive fashion is hard. You seem to have to sacrifice compile-time knowledge, or else programmer-convenience. Tim Bunce had some ideas on this once. I still almost always end up first using no protos and then employing extensive run-time comparisons, such as this sequence might illustrate: confess "need args" unless @_; confess "need even args" unless @_ % 2 == 0; confess "keys mustn't be refs" if grep { ref } @_[map { 2*$_ } 0 .. int($#_/2)]; confess "values must be hashrefs" if grep { reftype($_) ne 'HASH' } @_[map { 1+2*$_ } 0 .. int($#_/2)]; confess "values must be Frobulants" if grep { !$_->isa("Frobulant") } @_[map { 1+2*$_ } 0 .. int($#_/2)]; I should like to see the context coercer née prototype that satisfies criteria such as these. Yes, I cannot imagine that Damian doesn't already have a syntax for such :-) but what about compile-time versus run-time issues? Could the prototype people please report whether Tim Bunce's issues with prototypes have been intentionally/adequately addressed? --tom
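The run-time checks described above might be packaged as follows. This is only a sketch: the sub name check_pairs and the minimal Frobulant class are hypothetical, and blessed() is added so that unblessed values fail cleanly rather than dying inside isa().

```perl
use strict;
use warnings;
use Carp qw(confess);
use Scalar::Util qw(blessed reftype);

{
    # Hypothetical class: a blessed hash, so it passes the hashref check too.
    package Frobulant;
    sub new { my ($class) = @_; return bless({}, $class) }
}

# Expects a flat list of key => value pairs.
sub check_pairs {
    confess "need args"      unless @_;
    confess "need even args" unless @_ % 2 == 0;

    my @keys   = @_[ map {     2 * $_ } 0 .. int($#_ / 2) ];  # even positions
    my @values = @_[ map { 1 + 2 * $_ } 0 .. int($#_ / 2) ];  # odd positions

    confess "keys mustn't be refs"
        if grep { ref($_) } @keys;
    confess "values must be hashrefs"
        if grep { (reftype($_) // '') ne 'HASH' } @values;
    confess "values must be Frobulants"
        if grep { !(blessed($_) && $_->isa('Frobulant')) } @values;

    return scalar @keys;
}

my $pairs = check_pairs(a => Frobulant->new, b => Frobulant->new);
print "accepted $pairs pairs\n";
```

None of these conditions can be verified at compile time from a Perl 5 prototype, which is the compile-time versus run-time tension the message raises.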
Re: RFC 244 (v1) Method calls should not suffer from the action on a distance
foo->bar($baz, $coon) should be made synonymous with foo->bar $baz, $coon I can see no ambiguity in this call, but it not always works with Perl5. Arrow invocation does not a listop make. Only indirect object invocation style does that. print STDOUT $foo, $bar, $glarch; is a list op. STDOUT->print $foo, $bar, $glarch; is not, and, in fact, is a syntax error. You *must* use parens for the arrow invocation's arguments. You *may* use them with I/O style. --tom
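Both legal forms side by side, as a sketch that writes to an in-memory filehandle instead of STDOUT so the result can be inspected:

```perl
use strict;
use warnings;
use IO::Handle;     # makes method calls on filehandles available on older perls

my $buffer = '';
open my $out, '>', \$buffer or die "open: $!";

print $out "a", "b";       # indirect-object style: a list operator, parens optional
$out->print("c", "d");     # arrow style: the parentheses are required

close $out or die "close: $!";

print "wrote: $buffer\n";
```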
Re: $a in @b (RFC 199)
From: Tom Christiansen [mailto:[EMAIL PROTECTED]] From: Jarkko Hietaniemi I find this urge to push exceptions everywhere quite sad. Rather. Languages that have forgotten or dismissed error returns, turning instead to exceptions for everything in an effort to make the code "safer", tend in fact to produce code that is tedious and annoying. There seems to be some general consensus that some people would like to be able to short-circuit functions like grep. Do you see no need for the code block equivalent of C<next>/C<last>/C<redo>? What, you mean like Loop controls don't work in an C<if> or C<unless>, either, since those aren't loops. But you can always introduce an extra set of braces to give yourself a bare block, which I<does> count as a loop. if (/pattern/) {{ last if /alpha/; last if /beta/; last if /gamma/; # do something here only if still in if() }} --tom
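The double-brace idiom as a runnable sketch; the sub name handled() and the test strings are illustrative only:

```perl
use strict;
use warnings;

# Returns 1 if the string reached the end of the if() body, 0 otherwise.
sub handled {
    my ($text) = @_;
    my $reached_end = 0;

    if ($text =~ /pattern/) {{          # inner braces form a bare block
        last if $text =~ /alpha/;       # last leaves the bare block only
        last if $text =~ /beta/;
        last if $text =~ /gamma/;
        $reached_end = 1;               # runs only if no last fired
    }}

    return $reached_end;
}

my $plain   = handled("pattern only");
my $alpha   = handled("pattern with alpha");
my $nomatch = handled("nothing here");

print "plain: $plain, alpha: $alpha, no match: $nomatch\n";
```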