On Fri, Sep 29, 2017 at 2:04 PM, ToddAndMargo <toddandma...@zoho.com> wrote:
> $ perl6 -e 'my $x="abc"; $x ~= "def"; say $x;'
> abcdef
>
> Perfect! Thank you!
>
> I am slowly getting away from my Modula 2 "Array of Characters" days.
>
> Question: Is there a pretty way like the above to do a prepend?
>

No, sorry. The downside of the string representation is that it's easy to string-append but hard to string-prepend without having to move stuff around in memory. (The future may have string-aggregate types, commonly known as "ropes", which would make this easier.)

As for "array of characters", this quickly turns into a problem, and languages that take the 'array of characters' approach are still struggling with it. Quick: how many 'characters' is "ň"?

Turns out the answer is: one grapheme OR two Unicode codepoints (Latin letter lowercase n, combining caron) OR 3 bytes in UTF-8 encoding (2 of them for the combining caron). (Among others. Java sees two UTF-16 code units, for 4 bytes. C wants a trailing NUL character, also adding a 4th byte.) And which one is the correct way to look at it depends on what you are doing with it.

(This may not matter much to you if you only ever deal with basic Latin-1 like the US uses. But for the past several days I've had to deal with the names of sports teams from various European countries, with things like ň and ğ and ş in them, and that's ignoring the names in Cyrillic or Hebrew characters. The U.S. is not the whole world. And it's helpful when the language doesn't force me to jump through weird hoops to deal with them.)

A string can't simultaneously be three different lists. So it ends up being one thing, and we provide ways to decompose it into the various other forms. But if you are just thinking of strings of text, we don't make you think about that; we provide specific string operations instead of making you figure out which way to decompose it and add the new part and recompose it.
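
To make the points above concrete, here is a minimal Perl 6 sketch. It is an illustration only: the variable names are made up for this example, and it assumes a Rakudo build that provides the standard `Str` methods `chars`, `NFD` and `encode`. Prepending is written as an ordinary concatenation assigned back to the variable. The counts other than the grapheme count are left without expected values, because Rakudo normalizes strings internally and the codepoint and byte figures depend on which form you ask for, so they may not match the decomposed figures quoted above.

```perl6
# Prepend by building a new string with the ~ concatenation operator.
my $x = "abc";
$x = "def" ~ $x;
say $x;                       # defabc

# LATIN SMALL LETTER N followed by COMBINING CARON (U+030C).
my $s = "n\x[030C]";

say $s.chars;                 # 1: Str operations count graphemes
say $s.NFD.codes;             # codepoints in the decomposed (NFD) view
say $s.encode('utf8').bytes;  # bytes in the UTF-8 encoding of the string
```

The design choice this illustrates is that the string itself stays a single thing, and each other view (codepoints, bytes) is something you ask for explicitly.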

The string operations operate at grapheme level, because when you are thinking of it as text, that's usually what you intend: you see one 'character' (grapheme) there, not the two codepoints or the 3 bytes or whatever. But they have to be clever, because what if you are appending another combining character?

-- 
brandon s allbery kf8nh
sine nomine associates
allber...@gmail.com
ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net