A few days ago I was trying to develop a verb that would pull out all the
strings between two specific tag strings in a text array. Here was the
array I presented:

textfile=: 0 : 0
some stuff
some more stuff
stuff  start string  good stuff that I want to keep end string other stuff
more stuff
lots of stuff, more stuff, start string more good stuff that I need end string
string stuff
bad stuff stuff I don't care about start string even more stuff I want end
end string strange stuff
the end
)

Ric Sherlock provided a nice answer:

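getTagContents=: dyad define
 NB. x: starttag ; endtag (boxed pair)   y: text to search
 'starttag endtag'=. x
 NB. cut y at each starttag, drop the tag itself, then keep text up to endtag
 (endtag&taketo)&.>@(starttag&E. <@((#starttag)&}.);.1 ]) y
)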

Now I have a bit more complex issue, but it runs along the same lines.

Here's the new text file:

textfile1=: 0 : 0
some stuff
some more stuff
stuff  tag1s good stuff that I want to keep tag1e other stuff
more stuff
lots of stuff, more stuff, tag2s more good stuff that I need tag2e
string stuff
bad stuff stuff I don't care about tag1s even more stuff I want tag1e
strange stuff
more and more stuff
stuff tag1s stuff to keep tag1e
different stuff and new stuff
bad and unusual stuff tag2s really really good stuff tag2e bad stuff
more unneeded stuff
the end
)

So there are four types of tags in this text - 'tag1s', 'tag1e', 'tag2s',
and 'tag2e'.

I want to create a two-column boxed array. The text between the tags
'tag1s' and 'tag1e' should be placed in the first (boxed) column, and the
text between the tags 'tag2s' and 'tag2e' placed in the second column.

The tag pairs always appear in a fixed order: the first pair (tag1s, tag1e)
comes before the second pair (tag2s, tag2e). If either pair is missing,
there should be an empty box in its place.

So, the result of the function applied to the above text would be:

somearg getMultiTagContents textfile1

┌───────────────────────────────┬───────────────────────────┐
│good stuff that I want to keep │more good stuff that I need│
├───────────────────────────────┼───────────────────────────┤
│even more stuff I want         │                           │
├───────────────────────────────┼───────────────────────────┤
│stuff to keep                  │really really good stuff   │
└───────────────────────────────┴───────────────────────────┘

'somearg' would be the list of tag pairs that bracket the required strings,
in the specific order they will appear in the text.

And as a final touch, can this function be generalized to N tag pairs,
creating an X-by-N boxed array (one row per pass through the tag sequence,
one column per tag pair)? This again assumes that the tag pairs always
appear in a specific defined sequence, and that a missing pair is
represented by an empty box.
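
To pin down the behaviour I'm after, here is a rough reference model written
in Python rather than J. It is only a sketch of the specification, not a
proposed J solution. The name get_multi_tag_contents, the rule that a new row
starts whenever a pair turns up that is not later in the sequence than the
previous one, the trimming of surrounding blanks, and the handling of an
unterminated tag are all my own assumptions.

```python
def get_multi_tag_contents(tag_pairs, text):
    """Return a list of rows, one column per (start, end) pair in tag_pairs.

    Assumes every tag is a non-empty string and that pairs occur in the
    order given by tag_pairs; a missing pair leaves an empty string.
    """
    n = len(tag_pairs)
    rows = []
    last_col = n  # sentinel: forces a new row on the first match
    pos = 0
    while True:
        # Locate the next occurrence of every start tag at or after pos.
        hits = [(text.find(start, pos), col)
                for col, (start, _end) in enumerate(tag_pairs)]
        hits = [(at, col) for at, col in hits if at != -1]
        if not hits:
            break
        at, col = min(hits)  # the earliest start tag wins
        start, end = tag_pairs[col]
        begin = at + len(start)
        stop = text.find(end, begin)
        if stop == -1:
            stop = len(text)  # assumption: unterminated tag runs to end of text
        # Assumption: a pair that is not later in the sequence than the
        # previous one opens a new row.
        if col <= last_col:
            rows.append([""] * n)
        rows[-1][col] = text[begin:stop].strip()  # assumption: trim blanks
        last_col = col
        pos = stop + len(end)
    return rows


if __name__ == "__main__":
    # Only the tagged lines of textfile1, shortened for the demo.
    sample = (
        "stuff  tag1s good stuff that I want to keep tag1e other stuff\n"
        "lots of stuff, more stuff, tag2s more good stuff that I need tag2e\n"
        "bad stuff tag1s even more stuff I want tag1e\n"
        "stuff tag1s stuff to keep tag1e\n"
        "unusual stuff tag2s really really good stuff tag2e bad stuff\n"
    )
    pairs = [("tag1s", "tag1e"), ("tag2s", "tag2e")]
    for row in get_multi_tag_contents(pairs, sample):
        print(row)
    # ['good stuff that I want to keep', 'more good stuff that I need']
    # ['even more stuff I want', '']
    # ['stuff to keep', 'really really good stuff']
```

Because the row break is decided purely by position in the tag sequence, the
same sketch covers N pairs, and a missing first pair simply leaves the first
column of that row empty.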

Skip
