Sorry about that, I should stop posting so carelessly:

   getTagsContents=: getTagContents~S:0 1 <\~&_2
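
The only change from the earlier definition is the sign of the infix length. As a minimal illustration (the name `tags` is assumed here just for the demonstration), a left argument of 2 makes `<\` box every overlapping window of two items, which would account for the extra middle row, while _2 boxes non-overlapping pairs:

```j
   tags =: 'tag1s';'tag1e';'tag2s';'tag2e'   NB. assumed name, for this demo only
   # 2 <\ tags     NB. overlapping windows: (tag1s,tag1e) (tag1e,tag2s) (tag2s,tag2e)
3
   # _2 <\ tags    NB. non-overlapping pairs: (tag1s,tag1e) (tag2s,tag2e)
2
```

The counts 3 and 2 correspond to the number of rows each version hands on to getTagContents.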

--
Raul

On Mon, Nov 14, 2011 at 12:12 PM, Skip Cave <[email protected]> wrote:
> Raul,
>
> Thanks for the help. However, there is still something missing in your
> function:
>
>    textfile1
> some stuff
> some more stuff
> stuff tag1s good stuff that I want to keep tag1e other stuff
> more stuff
> lots of stuff, more stuff, tag2s more good stuff that I need tag2e
> string stuff
> bad stuff stuff I don't care about tag1s even more stuff I want tag1e
> strange stuff
> more and more stuff
> stuff tag1s stuff to keep tag1e
> different stuff and new stuff
> bad and unusual stuff tag2s really really good stuff tag2e bad stuff
> more unneeded stuff
> the end
>
>    getTagsContents=: getTagContents~S:0 1 <\~&2
>
>
>    ftxt =: textfile1 getTagsContents 'tag1s';'tag1e';'tag2s';'tag2e'
>    $ftxt
> 3 3
>
> Hmmmmmm... Shape should be 2 3
>
>    ftxt
> ┌───────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────┐
> │ good stuff that I want to keep                    │ even more stuff I want                                      │ stuff to keep                                       │
> ├───────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────┤
> │ other stuff more stuff lots of stuff, more stuff, │ strange stuff more and more stuff stuff tag1s stuff to keep │ different stuff and new stuff bad and unusual stuff │
> ├───────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────┤
> │ more good stuff that I need                       │ really really good stuff                                    │                                                     │
> └───────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘
>
> I'm not sure where the middle row came from, but there should only be two
> rows, as there are only two tag pairs.
> And, the rows in this output should
> actually be columns, in the final result. The correct result should have
> one column per tag pair type.
>
> The result would be closer to what I need, if we remove the middle row, and
> transpose the result:
>
>    ftxt1 =: |: 1 0 1 # ftxt
>
>    $ 1 0 1 # ftxt1
> 2 3
>
> This has the right shape. Two columns, one for each tag pair type.
>
>    ftxt1
> ┌────────────────────────────────┬─────────────────────────────┐
> │ good stuff that I want to keep │ more good stuff that I need │
> ├────────────────────────────────┼─────────────────────────────┤
> │ even more stuff I want         │ really really good stuff    │
> ├────────────────────────────────┼─────────────────────────────┤
> │ stuff to keep                  │                             │
> └────────────────────────────────┴─────────────────────────────┘
>
> Now there is only one other problem. The second column is out of sequence.
> The assumption is that the tagged strings will always be in groups, with
> the tag1 string followed by the tag2 string, then the tag1 string again,
> then tag2, etc. If a tag pair is missing from this 1,2,1,2,1,2 sequence,
> the missing string should be indicated by an empty box in the sequence.
>
> In the textfile1 data, the empty box needs to be in the second row,
> instead of the third row, to keep the order of tagged strings in the same
> sequence as in the original text. In my application, the order of
> appearance of the tagged strings is critical. The missing tag2 string
> follows the 'even more stuff I want' string, not the 'stuff to keep'
> string.
>
> A ravel of ftxt1 should provide a boxed list of all of the tagged strings,
> in the order that they appear in the text, with empty boxes representing
> strings missing from the 1,2,1,2 sequence.
>
> So close, and yet so far....
>
> Skip
>
>
> On Mon, Nov 14, 2011 at 6:37 AM, Raul Miller <[email protected]> wrote:
>
>>    getTagsContents=: getTagContents~S:0 1 <\~&2
>>    textfile1 getTagsContents 'tag1s';'tag1e';'tag2s';'tag2e'
>>
>> --
>> Raul
>>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
