Thank you, Joe, for taking the time to write such a cogent reply. And thanks to Tobias and everyone else who has taken the time to reply to my questions.
On Sat, Oct 10, 2020 at 3:38 PM Joseph Brenner <doom...@gmail.com> wrote:

> William Michels wrote:
>
> >I actually wondered where the different programming paradigms
> >would be delineated
>
> I think the present topic has to do more with the
> strong/weak/gradual typing debates-- here Raku is doing an
> automatic type conversion that a "strong-typing" fanatic
> would sneer at.  Though, the way this behavior is implemented
> is via the "(almost) everything is an object" philosophy
> that Tobias was describing.
>
> William Michels wrote:
> > split() performs an implied join of Array elements
>
> Yes, that's right, because split is a *string* method, it's as
> though you called .Str on the array first.

Playing devil's advocate here, I don't *know* that split() is a string
method. For all I know, there could be a form of split() which splits
on bytes, or graphemes, or nulls. I can, for example, split on nulls:
split("\0"). Does that properly qualify as a string method? If I call
split("\0").WHAT, I'm told I've produced a (Seq), not a string. So when
do I *know* that Raku's split() is confined to string substrates?
(Substrate is a biochemistry term but I'm sure you get the .gist).

What about awk, which can apparently split on bytes? Does Raku behave
like awk? Or Golang, which has a split function that acts on bytes?
Does Raku behave like Golang? What about Python, which (apparently)
can't split on bytes without a prior .decode() step? Is Raku more
capable than Python?

Or maybe Raku's split() is analogous to Elixir's .slice/x, where a data
structure can be "split" into 2 or 3 or x Int parts. Does Raku's
split() act like Elixir's .slice? And if Raku's split() acts like
Elixir's .slice, who's to say it doesn't act element-wise on each
element of an array?
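One way to probe the element-wise question empirically is to compare splitting the Array as a whole against splitting each element separately. This is only a sketch, not a statement about how Rakudo implements split() internally: @monsters mirrors the array used elsewhere in this thread, and @whole and @each are names made up for the illustration.

```raku
# Sketch only: @whole and @each are hypothetical names for this comparison.
my @monsters = < blob kong mothera fingfangfoom >;

# Call .split on the Array as a whole.
my @whole = @monsters.split("o");

# Call .split on each element separately, keeping one List per element.
my @each = @monsters.map({ .split("o").List });

say @whole.elems;   # 6
say @each.elems;    # 4

# Does splitting the Array give the same pieces as splitting its .Str form?
say @whole.join("|") eq @monsters.Str.split("o").join("|");   # True

# The container that split() hands back.
say @monsters.split("o").WHAT;   # (Seq)
```

If split() acted element-wise, the first count would equal the number of array elements and no piece could span two of them. The six pieces, one of which is "b k", suggest the Array is stringified first and then split as one string.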
https://unix.stackexchange.com/questions/583716/awk-split-each-byte-to-their-own-file-while-specifying-created-filename-to-std
https://golang.org/pkg/bytes/
https://stackoverflow.com/questions/13857856/split-byte-string-into-lines
https://hexdocs.pm/elixir/1.0.5/String.html

> What would you *want* to happen when someone treats an array as a
> string?
>
>    my @monsters = < blob kong mothera fingfangfoom >;
>    put @monsters;        # blob kong mothera fingfangfoom
>    say "{ @monsters }";  # blob kong mothera fingfangfoom

Again, I don't know anything about Raku's storage model. Does Raku
efficiently pack arrays like PDL (Perl Data Language)? Even if it
doesn't, surely the double-quotes and commas are screen-sugar,
decorating print calls but otherwise not stored with the object. I
might imagine that each element of a Raku array is separated by an
ASCII unit separator: US (ASCII_31). So when I call Raku's split(), it
applies split to each element of the array, and leaves the US unit
separator (ASCII_31) untouched:

https://www.unicode.org/charts/PDF/U0000.pdf
https://www.cisco.com/c/en/us/td/docs/ios/12_4/cfg_fund/command/reference/cfnapph.html#
https://www.lammertbies.nl/comm/info/ascii-characters
https://ronaldduncan.wordpress.com/2009/10/31/text-file-formats-ascii-delimited-text-not-csv-or-tab-delimited-text/

> What Raku does is a DWIM move: it joins the array on spaces when
> you use the array as a string.
> So these do the same things:
>
>    my $s1 = @monsters.Str;
>    my $s2 = @monsters.join(" ");
>    dd( $s1 );  # Str $s1 = "blob kong mothera fingfangfoom"
>    dd( $s2 );  # Str $s2 = "blob kong mothera fingfangfoom"
>
> You need to use .join explicitly if you want different behavior:
>
>    my $s3 = @monsters.join(", ");
>    dd( $s3 );  # Str $s3 = "blob, kong, mothera, fingfangfoom"
>
> All three of these do the same things:
>
>    my @r1 = @monsters.split("a");
>    my @r2 = @monsters.Str.split("a");
>    my @r3 = @monsters.join(" ").split("a");
>
> They each result in an array like:
>
>    ["blob kong mother", " fingf", "ngfoom"]

Yes, but you've overlooked the obvious test, which is: what happens
when I try to split on a character that DOES NOT exist in the array?
In that case I end up destroying my array and creating a single-element
string. So a failed split actually joins (code/results below from the
Raku REPL):

    > say @monsters.Str.split("Z").elems;
    1
    > dd @monsters.Str.split("Z");
    ("blob kong mothera fingfangfoom",).Seq
    Nil

> In this example of yours:
>
>    my @a = "My Bloody Valentine", "Sonic Youth";
>
> When you call split on @a, it joins on spaces first (and probably
> inadvertently, throws away the division between 3 elements),
> then the actual split operation results in 5 elements:
>
>    @a.split(" ").raku.say;
>    # ("My", "Bloody", "Valentine", "Sonic", "Youth").Seq

Yes, the call above appears to 'flatten' the array, which isn't what I
desired. [ Had I desired to flatten the array, I would probably take a
wild guess and call flat(). ] But I guess that really isn't the end of
it: if you try to do the same call on a comma-separated list using
split(",") you get two different elements joined together,
"Valentine Sonic".
Meaning that whitespace-separated strings (combined with splitting on
whitespace) are effectively special-cased relative to
non-whitespace-separated strings (combined with splitting on
non-whitespace):

    > dd @a
    Array @a = ["My Bloody Valentine", "Sonic Youth"]
    Nil
    > dd @a.split(" ")
    ("My", "Bloody", "Valentine", "Sonic", "Youth").Seq
    Nil
    > dd @a.split(" ").elems
    5
    Nil
    > my @b = "My,Bloody,Valentine", "Sonic,Youth";
    [My,Bloody,Valentine Sonic,Youth]
    > dd @b.split(",")
    ("My", "Bloody", "Valentine Sonic", "Youth").Seq
    Nil
    > dd @b.split(",").elems
    4
    Nil

If I try to write code that takes in arbitrary text and splits on
arbitrary characters, I'll have to write two subs at the very least--one
to handle splitting on whitespace, and the other splitting on
non-whitespace.

> You might play with an explicit join to see what it does:
>
>    my @r;
>    @r = @a.join("|").split(" ");
>    dd( @r ); # Array @r = ["My", "Bloody", "Valentine|Sonic", "Youth"]
>
> Myself, I think I'd be inclined to loop over the elements, e.g. with map:
>
>    @r = @a.map({ [ .split(" ") ] });
>    dd(@r);
>    # Array @r = [["My", "Bloody", "Valentine"], ["Sonic", "Youth"]]
>
> That's an array of arrays: two top-level elements,
> each split into 3 and 2 words respectively
>
> Note: split does stringification because it's intended to be run
> on strings, or things that can become strings-- map doesn't do
> this because it's intended to be run on things like Arrays.  This
> probably is "specced" though not down on the level you're
> probably thinking of: it's not in the "split" documentation, for
> example, because it's not really specific to that method.
>
> You *could* argue that if you call a string method on an array,
> that's simply a mistake and it should error out.  I think that's
> what the "strong-typing" crowd would say-- they would point out you
> might have realized faster what was going on in that case.

Yes, the faster I can dispel my confusion the better.

Thanks, Bill.

PS.
Thanks also to Bruce Gray, who contributed greatly to this conversation
during our weekly Raku Meetup (online, Zoom).

> On 10/10/20, William Michels <w...@caa.columbia.edu> wrote:
> > On Tue, Oct 6, 2020 at 1:59 PM Tobias Boege <t...@taboege.de> wrote:
>