> From: "Guy Steele" <guy.ste...@oracle.com>
> To: "John Rose" <john.r.r...@oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts@openjdk.java.net>
> Sent: Tuesday, February 27, 2018 22:12:14
> Subject: Re: Raw string literals and Unicode escapes

>> On Feb 27, 2018, at 4:20 PM, John Rose <john.r.r...@oracle.com> wrote:
>> On Feb 27, 2018, at 11:48 AM, Brian Goetz <brian.go...@oracle.com> wrote:
>>>> So after this length, instead of having the probability to see a
>>>> character to be virtually 1, you have the opposite effect, because
>>>> programming languages (a human construct) are very regular in the set
>>>> of chars they use. So you do not need a repetition of a character to
>>>> avoid a statistical effect that does not occur. Being able to choose
>>>> the escape character is enough.
>>> The problem is not that it's enough, it's that it is too much. Having
>>> nine ways to say the same thing is too many; having infinitely many
>>> (e.g., nonces) is worse. Having used the "pick your delimiter" approach
>>> taken by Perl, I find that you are *still* often bitten by the inability
>>> to find a good delimiter for embedding a snippet of a program written in
>>> a language similar to the outer language. And it surely makes code less
>>> readable, because many more things can be interpreted as quotes.
>> My experience tracks with Brian's. That's why I think the random string
>> model is more robust than some vague hope that languages won't overlap.
>> Yes, random strings are an outlier, but less so than you'd think. A
>> typical compression ratio for code is 5x, which means that if you replace
>> "random string of length 10" with "random code snippet of length 50" you
>> get the same analytic results. In order to exclude a close-quote, you
>> need an additional constraint, which in practical terms results in folks
>> having to grub around inside their raw strings looking for accidental
>> quotes.
> Which leads us to the following theoretical result: the ```` mechanism
> does not require you to grub around in the interior of the string AT ALL
> if you don’t want to. All you need to know is the length. If the length
> of the raw string is n, and it does not begin or end with ` (a necessary
> check in any case), then using n-1 backquote characters before and after
> will always do the job.
>
> In practice, many programmers (and programs) will be willing to do a quick
> search to see whether “```” or failing that “````” happens to be absent
> from the raw string. :-)

Ok, I'm clearly in the minority here; the repetition pattern wins.

Rémi
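
For reference, the "quick search" Guy describes can be sketched in a few lines of Java. This is only an illustration, not part of any proposal: the class and method names are hypothetical, and it assumes a delimiter of k backquotes is safe whenever no run of k backquotes occurs anywhere in the body. Under that assumption the shortest safe delimiter is one backquote longer than the longest run in the body, which for a body of length n >= 2 that neither begins nor ends with a backquote never exceeds the n-1 bound above.

```java
// Hypothetical helper, for illustration only.
final class BacktickDelimiter {
    private BacktickDelimiter() {}

    /** Length of the longest run of consecutive backquotes in body (0 if none). */
    static int longestBacktickRun(String body) {
        int longest = 0;
        int current = 0;
        for (int i = 0; i < body.length(); i++) {
            if (body.charAt(i) == '`') {
                current++;
                longest = Math.max(longest, current);
            } else {
                current = 0;
            }
        }
        return longest;
    }

    /** Smallest k such that a run of k backquotes does not occur in body. */
    static int delimiterLength(String body) {
        return longestBacktickRun(body) + 1;
    }

    /**
     * Wraps body in the chosen delimiter. Assumption: a body that begins or
     * ends with a backquote cannot be delimited this way, so it is rejected.
     */
    static String quote(String body) {
        if (body.startsWith("`") || body.endsWith("`")) {
            throw new IllegalArgumentException(
                    "body must not begin or end with a backquote");
        }
        String fence = "`".repeat(delimiterLength(body)); // String.repeat: Java 11+
        return fence + body + fence;
    }
}
```

With this sketch, a body whose longest backquote run has length 2 gets a three-backquote delimiter, without inspecting anything else in the body.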