OK, well, since you have something working from Arie, I do not think I
have much to add.

That said, given my current understanding of your requirements, I think I
would write the extractor something like this:

advN=:2 :0  NB. adverb that takes N noun left arguments
  if.L.n   do. d=. (<m),D [ 'N D M'=. n
    if.N-1 do. advN ((N-1);d;M)
    else.      d 1 :M end.
  else. advN (n;m;0 :0) end.
)

getFirst=: -@#@[ {. I. {./. ]

taggedEvents=: '' advN 3
  'tags lineEnd eventStart'=. m
  lines=. I. lineEnd E. y
  events=. I. eventStart E. y
  r=. i.(#events),0
  for_TAG. tags do. tag=. >TAG
    start=. events getFirst (#tag)+I. tag E. y
    end=. start getFirst lines
    start=. end getFirst start
    data=. start <@:{&y@(+i.)"0 end-start
    r=. r,. data (<: events I. start)} ($events) $ a:
  end.
)
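
For reference, here is a small illustration of what getFirst does, using
made-up positions (these numbers are not from your log).  It groups the
right argument by the interval of the left argument that each value falls
in, keeps the first value of each group, and then takes the last #x of
those.  Here 20 and 25 both fall between 10 and 50, so only 20 is kept:

   10 50 90 getFirst 20 25 60 100
20 60 100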

ww1t=: 1!:1 <'ww1t.txt'

start=: 'start{'
linend=: CR,LF,TAB
tags=: 'STATUS';'RESULT[0]';'CONFIDENCE[0]';'UTTERANCE_FILENAME'
V=: tags linend start taggedEvents

With these definitions, the extracted content can be produced with:
   V ww1t
or
   tags linend start taggedEvents ww1t
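
If you also want event numbers in the output, one possible sketch is to
stitch a column of boxed numbers onto the front of the result.  This
assumes the result has one row per event, and the names offset, r and
numbered are hypothetical (they are not part of the definitions above):

   offset=: 1000          NB. hypothetical starting number for this log set
   r=: V ww1t
   numbered=: (<"0 offset + i. # r) ,. r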

Note that I have assumed that the first event will be preceded by some text
which does not contain any tags.  You will get an error if this assumption
is violated.  If you need to process files that do not contain a preamble,
you should add one to the text (adding a space in front should work fine).
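
For example, a minimal sketch of that workaround:

   V ' ',ww1t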

-- 
Raul

On Fri, Nov 25, 2011 at 4:36 PM, Skip Cave <[email protected]> wrote:

> Raul,
>
> I want every event in the text logs to generate a boxed row in the output,
> even if none of the requested parameters are in the event. Every event will
> start with the 'start{'  text string, and end with the '}end' text string.
> I want to uniquely number each event (or boxed row) in the output. Since I
> have several large log file sets to analyze, I will need to be able to
> offset the event numbers in a specific output by a constant, so that every
> event across all output sets has a unique event number.
>
> Skip
>
> On Thu, Nov 24, 2011 at 6:00 AM, Raul Miller <[email protected]>
> wrote:
>
> > Ok, this suggests a completely different design.
> >
> > That said, when you say "number each row in the output" do you mean event
> > number or do you mean line number?  I agree that event number is implicit.
> >
> > --
> > Raul
> >
> > On Wed, Nov 23, 2011 at 11:47 PM, Skip Cave <[email protected]>
> > wrote:
> >
> > > Raul,
> > >
> > > I'm using tags3 for your function:
> > >
> > >   tags3
> > > ┌──────┬───┬─────────┬───┬───────────────────────────┬───┐
> > > │STATUS│   │RESULT[0]│   │CONFIDENCE[0]             =│   │
> > > └──────┴───┴─────────┴───┴───────────────────────────┴───┘
> > >
> > > The empty boxes actually carry the CR, LF, TAB character string defining
> > > the closing tag for each parameter.
> > >
> > > tags6 is the tag string for *Arie's* function:
> > >  tags6
> > > ┌──────┬──────┬─────────┬─────────────┬──────────────────┐
> > > │start{│STATUS│RESULT[0]│CONFIDENCE[0]│UTTERANCE_FILENAME│
> > > └──────┴──────┴─────────┴─────────────┴──────────────────┘
> > >
> > > Arie uses the first element of his tag string to define the string that
> > > starts each event. In our case that is the 'start{' string which
> > > identifies the start of each event. Arie assumes that every line ends in
> > > CR, LF, TAB. So he doesn't need to have the closing tag specified for
> > > each parameter.
> > >
> > > The more I look over the data, the more I think that the function should
> > > capture EVERY event in the log. Every event starts with 'start{' and ends
> > > with '}end', so it is easy to spot all the events. If a specific event has
> > > NO matching parameter tags in it, then the output will have a row of empty
> > > boxes in it. The number of boxed columns in the output will be the number
> > > of parameters asked for in the tag list. The number of boxed rows in the
> > > output will be the total number of events in the whole text log file.
> > >
> > > I think I also need to number each row in the output. However that is one
> > > thing I CAN do myself.
> > >
> > > Skip
> > >
> > > On Wed, Nov 23, 2011 at 8:44 PM, Raul Miller <[email protected]>
> > > wrote:
> > >
> > > > There are other reasons why mine might stop (like missing end tags).
> > > >
> > > > What definition are you using for tags6?
> > > >
> > > > --
> > > > Raul
> > > >
> > > >
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
> > >
> >
>
>
>
> --
> Skip Cave
> Cave Consulting LLC
> Phone: 214-460-4861
>