There are some unmentioned issues that may eventually trip you up with this approach, for example if you try to apply these routines to the text of Finnegans Wake.
To hint at those issues, here's an approach that takes you directly to the final result:

   ex1=: <'This is Skip''s test. Testing one, two, three. Count 3, 2, 1.'
   DELIM=:'.?!'
   toss=:a.#~1-(a.e.DELIM,":i.10)+.(tolower~:toupper) a.

   separateclean=:3 :0
  a:-.~(e.&DELIM <@deb;._2 tolower) '.',~(;y) -. toss
)

   separateclean ex1
┌──────────────────┬─────────────────────┬───────────┐
│this is skips test│testing one two three│count 3 2 1│
└──────────────────┴─────────────────────┴───────────┘

And here's a longer approach that takes you there in two steps, where the result of the first step will be the same length as the result of the second step:

   separatedirty=:3 :0
  (;:'.')-.~(e.&DELIM <@deb;.2 ]) '.',~;y
)

   clean=: tolower@-.&(toss,DELIM) L:0

   separatedirty ex1
┌────────────────────┬────────────────────────┬──────────────┐
│This is Skip's test.│Testing one, two, three.│Count 3, 2, 1.│
└────────────────────┴────────────────────────┴──────────────┘
   clean separatedirty ex1
┌──────────────────┬─────────────────────┬───────────┐
│this is skips test│testing one two three│count 3 2 1│
└──────────────────┴─────────────────────┴───────────┘

But with ill-conditioned text (Finnegans Wake being an example of that), I expect cases where separateclean gives a different result from clean@separatedirty.

But that's what makes text fun...

-- 
Raul

On Wed, Apr 4, 2018 at 12:02 PM, Skip Cave <s...@caveconsulting.com> wrote:
> I have the following boxed data:
>
>    ex1=. <'This is Skip''s test. Testing one, two, three. Count 3, 2, 1.'
>
>    ex1
> ┌────────────────────────────────────────────────────────────┐
> │This is Skip's test. Testing one, two, three. Count 3, 2, 1.│
> └────────────────────────────────────────────────────────────┘
>
> I want to build a verb that will separate this boxed text data into
> sentences.
>
>    ex2=. (<'This is Skip''s test.'),(<'Testing one, two, three.'),(<'Count 3, 2, 1.')
>
>    ex2
> ┌────────────────────┬────────────────────────┬──────────────┐
> │This is Skip's test.│Testing one, two, three.│Count 3, 2, 1.│
> └────────────────────┴────────────────────────┴──────────────┘
>
> I also want to get rid of all punctuation and caps:
>
>    ex3=. (<'this is skips test'),(<'testing one two three'),(<'count 3 2 1')
>
>    ex3
> ┌──────────────────┬─────────────────────┬───────────┐
> │this is skips test│testing one two three│count 3 2 1│
> └──────────────────┴─────────────────────┴───────────┘
>
> What is a reasonable J verb to do this separation and cleanup?
>
> Skip
>
> Cave Consulting LLC
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
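
To make the expected divergence between separateclean and clean@separatedirty concrete, here is a small sketch. The definitions are copied from the reply above; the input ex4 is a made-up example (not from the thread) with two adjacent delimiters and a quotation mark sitting right after a sentence break. Tracing the definitions by hand suggests separateclean should give three boxes, while clean@separatedirty should give four: an empty box left over from the lone '!', plus a leading blank in ' then left' because deb runs before the quotation mark is removed.

```j
NB. Definitions copied from the reply above.
DELIM=: '.?!'
toss=: a. #~ 1 - (a. e. DELIM,":i.10) +. (tolower ~: toupper) a.

separateclean=: 3 :0
  a: -.~ (e.&DELIM <@deb;._2 tolower) '.',~ (;y) -. toss
)

separatedirty=: 3 :0
  (;:'.') -.~ (e.&DELIM <@deb;.2 ]) '.',~ ;y
)

clean=: tolower@-.&(toss,DELIM) L:0

NB. ex4 is a hypothetical input chosen to stress the two pipelines:
NB. '?!' are adjacent delimiters, and '"' follows a sentence-ending '.'.
ex4=: <'Wait?! He said "no." Then left.'

echo separateclean ex4                           NB. one-step result
echo clean separatedirty ex4                     NB. two-step result
echo (separateclean -: clean@separatedirty) ex4  NB. 1 if they match, 0 if not
```

If the two results do differ as expected, one possible patch is to apply a:-.~ and deb L:0 after clean, though whether that is the right fix depends on what you want from text like this.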