getTagsContents=: getTagContents~S:0 1 <\~&2

textfile1 getTagsContents 'tag1s';'tag1e';'tag2s';'tag2e'

--
Raul

On Sun, Nov 13, 2011 at 5:17 PM, Skip Cave <[email protected]> wrote:
> A few days ago I was trying to develop a verb that would pull out all the
> strings between two specific tag strings in a text array. Here was the
> array I presented:
>
> textfile=: 0 : 0
> some stuff
> some more stuff
> stuff start string good stuff that I want to keep end string other stuff
> more stuff
> lots of stuff, more stuff, start string more good stuff that I need end
> string
> string stuff
> bad stuff stuff I don't care about start string even more stuff I want end
> end string strange stuff
> the end
> )
>
> Ric Sherlock provided a nice answer:
>
> getTagContents=: dyad define
>   'starttag endtag'=. x
>   (endtag&taketo)&.>@(starttag&E. <@((#starttag)&}.);.1 ]) y
> )
>
> Now I have a bit more complex issue, but it runs along the same lines.
>
> Here's the new text file:
>
> textfile1=: 0 : 0
> some stuff
> some more stuff
> stuff tag1s good stuff that I want to keep tag1e other stuff
> more stuff
> lots of stuff, more stuff, tag2s more good stuff that I need tag2e
> string stuff
> bad stuff stuff I don't care about tag1s even more stuff I want tag1e
> strange stuff
> more and more stuff
> stuff tag1s stuff to keep tag1e
> different stuff and new stuff
> bad and unusual stuff tag2s really really good stuff tag2e bad stuff
> more unneeded stuff
> the end
> )
>
> So there are four types of tags in this text - 'tag1s', 'tag1e', 'tag2s',
> and 'tag2e'.
>
> I want to create a two-column boxed array. The text between the tags
> 'tag1s' and 'tag1e' should be placed in the first (boxed) column, and the
> text between the tags 'tag2s' and 'tag2e' placed in the second column.
>
> It will be assumed that the first tag pair (tag1s, tag1e) will always be
> followed by the second tag pair (tag2s, tag2e). If either tag pair is
> missing, there should be an empty box in its place.
>
> So, the result of the function applied to the above text string would be:
>
>    somearg getMultiTagContents textstring1
>
> ┌───────────────────────────────┬───────────────────────────┐
> │good stuff that I want to keep │more good stuff that I need│
> ├───────────────────────────────┼───────────────────────────┤
> │even more stuff I want         │                           │
> ├───────────────────────────────┼───────────────────────────┤
> │stuff to keep                  │really really good stuff   │
> └───────────────────────────────┴───────────────────────────┘
>
> 'somearg' would be the list of tag pairs that bracket the required strings,
> in the specific order they will appear in the text.
>
> And as a final touch, can this function be generalized for N tag pairs
> creating an X-by-N boxed array?
> This always assumes that the tag pairs will be in a specific defined
> sequence, and if a pair is missing, it will be represented by an empty box.
>
> Skip
> --
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
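
For the row-aligned X-by-N table described in the quoted message, one possible approach is to cut the text into records at each occurrence of the first start tag and then look for every tag pair inside each record. The following is only a sketch under stated assumptions, not a tested or definitive solution: the names tagBetween and getTagsTable are hypothetical, the tags are passed on the left as in the quoted request, every record is assumed to begin with the first start tag (so a missing first pair is not handled), and taketo, takeafter and dltb are assumed to be available from the standard library's string utilities.

```j
NB. Sample text from the quoted message.
textfile1=: 0 : 0
some stuff
some more stuff
stuff tag1s good stuff that I want to keep tag1e other stuff
more stuff
lots of stuff, more stuff, tag2s more good stuff that I need tag2e
string stuff
bad stuff stuff I don't care about tag1s even more stuff I want tag1e
strange stuff
more and more stuff
stuff tag1s stuff to keep tag1e
different stuff and new stuff
bad and unusual stuff tag2s really really good stuff tag2e bad stuff
more unneeded stuff
the end
)

NB. tagBetween (hypothetical helper)
NB. x: a start;end pair of boxed tags   y: text
NB. Result: the text between the first start tag and the following end tag,
NB. with leading and trailing blanks removed; empty if the start tag is absent.
tagBetween=: dyad define
  'starttag endtag'=. x
  dltb endtag taketo starttag takeafter y
)

NB. getTagsTable (hypothetical name)
NB. x: flat list of boxed tags (start1;end1;start2;end2;...)   y: text
NB. Result: boxed table, one row per record, one column per tag pair.
NB. Assumption: each record starts at an occurrence of the first start tag.
getTagsTable=: dyad define
  pairs=. _2 ]\ x                    NB. N rows of start,end
  recs=. ((> {. x) E. y) <;.1 y      NB. cut y at each first start tag
  pairs&(<@tagBetween"1 _)@> recs    NB. every pair against every record
)

tags=: 'tag1s';'tag1e';'tag2s';'tag2e'
tags getTagsTable textfile1
```

If I have traced this correctly, the last line gives a 3-by-2 boxed table with an empty box in the second column of the middle row, as in the table above. Because the pairs are formed with _2 ]\ the same verb should accept any even-length list of tags, giving N columns.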
